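Implementation steps
1. Enter any keyword and open the browser's developer tools: the translation is fetched with an AJAX POST request
The Request URL is easy to spot: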
http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule
Its response contains the JSON data we need:
{
  "translateResult": [
    [{ "tgt": "baidu", "src": "百度" }]
  ],
  "errorCode": 0,
  "type": "zh-CHS2en",
  "smartResult": {
    "entries": ["", "Baidu\r\n"],
    "type": 1
  }
}
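As a quick illustration, the response shown above can be decoded with Python's standard json module. This is only a minimal sketch that uses the captured payload as a literal string; the variable names are chosen for illustration.

```python
import json

# Sample response copied from the capture above
raw = r'''{"translateResult": [[{"tgt": "baidu", "src": "百度"}]], "errorCode": 0, "type": "zh-CHS2en", "smartResult": {"entries": ["", "Baidu\r\n"], "type": 1}}'''

result = json.loads(raw)

# errorCode 0 signals success; the translated text is at translateResult[0][0]["tgt"]
if result.get("errorCode") == 0:
    translated = result["translateResult"][0][0]["tgt"]
else:
    translated = None

print(translated)  # baidu
```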
2. Find the Form Data parameters sent with the request
i: 百度
from: AUTO
to: AUTO
smartresult: dict
client: fanyideskweb
salt: 1535649141531
sign: 9204f873edc8ee3df27e5f097b973de5
doctype: json
version: 2.1
keyfrom: fanyi.web
action: FY_BY_CLICKBUTTION
typoResult: false
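Sending the request several more times and comparing the parameters shows that three values change:
i: obviously the word to be translated
salt: looks like a timestamp
sign: looks like an MD5 hash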
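The comparison described above can also be done programmatically. The sketch below assumes two captured form payloads have been copied into dictionaries; `changed_fields` is a hypothetical helper, and the values in the second capture are placeholders, not real captured data.

```python
def changed_fields(first, second):
    """Return the sorted names of fields whose values differ between two captures."""
    all_names = first.keys() | second.keys()
    return sorted(name for name in all_names if first.get(name) != second.get(name))


# First capture: a subset of the form data shown above
capture_a = {
    "i": "百度",
    "client": "fanyideskweb",
    "salt": "1535649141531",
    "sign": "9204f873edc8ee3df27e5f097b973de5",
    "version": "2.1",
}

# Second capture: placeholder values, for illustration only
capture_b = dict(capture_a, i="紫色", salt="1535649149999", sign="0" * 32)

print(changed_fields(capture_a, capture_b))  # ['i', 'salt', 'sign']
```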
3. Inspect the Request Headers
Accept: application/json, text/javascript, */*; q=0.01
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
Connection: keep-alive
Content-Length: 218
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Cookie: OUTFOX_SEARCH_USER_ID_NCOO=1715177970.4171937; OUTFOX_SEARCH_USER_ID=1390060209@59.111.179.144; ___rl__test__cookies=1535649371369
Host: fanyi.youdao.com
Origin: http://fanyi.youdao.com
Referer: http://fanyi.youdao.com/
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36
X-Requested-With: XMLHttpRequest
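To reuse these headers from Python, it can help to turn the copied text into a dictionary. The sketch below assumes the headers were pasted one per line in "Name: value" form; `parse_header_block` is a hypothetical helper, not part of any library.

```python
def parse_header_block(raw):
    """Convert 'Name: value' lines copied from the developer tools into a dict."""
    headers = {}
    for line in raw.strip().splitlines():
        # Split on the first colon only, so URLs in values stay intact
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return headers


raw_headers = """
Host: fanyi.youdao.com
Origin: http://fanyi.youdao.com
Referer: http://fanyi.youdao.com/
X-Requested-With: XMLHttpRequest
"""

parsed = parse_header_block(raw_headers)
print(parsed["Origin"])  # http://fanyi.youdao.com
```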
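Across several requests, two values change:
Content-Length: the length of the request body; later testing shows the request still succeeds without it
___rl__test__cookies in Cookie: looks like a timestamp
Removing the cookies makes the server return 50, so cookies have to be sent. Sending further requests shows that the required cookies are: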
OUTFOX_SEARCH_USER_ID_NCOO=1715177970.4171937; OUTFOX_SEARCH_USER_ID=1390060209@59.111.179.144; ___rl__test__cookies=1535649371369
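A cookie string in this form can be split into name/value pairs before it is handed to an HTTP client. This is a minimal sketch; `parse_cookie_string` is a hypothetical helper that assumes pairs are separated by semicolons and that names contain no "=".

```python
def parse_cookie_string(cookie_string):
    """Split a 'name=value; name=value' Cookie header into a dict."""
    cookies = {}
    for pair in cookie_string.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:
            cookies[name] = value
    return cookies


cookie_header = (
    "OUTFOX_SEARCH_USER_ID_NCOO=1715177970.4171937; "
    "OUTFOX_SEARCH_USER_ID=1390060209@59.111.179.144; "
    "___rl__test__cookies=1535649371369"
)

cookies = parse_cookie_string(cookie_header)
print(cookies["___rl__test__cookies"])  # 1535649371369
```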
4. In the debugger, open the file fanyi.min.js and search for salt. This turns up the following code, which largely matches the parameters submitted in the POST
var t = e.i,
    i = "" + ((new Date).getTime() + parseInt(10 * Math.random(), 10)),
    o = n.md5("fanyideskweb" + t + i + "ebSeFb%=XZ%T[KZ)c(sy!");
r && r.abort(),
r = n.ajax({
    type: "POST",
    contentType: "application/x-www-form-urlencoded; charset=UTF-8",
    url: "/bbk/translate_m.do",
    data: {
        i: e.i,
        client: "fanyideskweb",
        salt: i,
        sign: o,
        tgt: e.tgt,
        from: e.from,
        to: e.to,
        doctype: "json",
        version: "3.0",
        cache: !0
    },
    dataType: "json",
    success: function(t) {
        t && 0 == t.errorCode ? e.success && e.success(t) : e.error && e.error(t)
    },
    error: function(e) {}
})
Matching this against the variables found in step 2, the relevant excerpt is:
var t = e.i,
    i = "" + ((new Date).getTime() + parseInt(10 * Math.random(), 10)),
    o = n.md5("fanyideskweb" + t + i + "ebSeFb%=XZ%T[KZ)c(sy!");
i: e.i,
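Now it is clear that two parameters have to be computed:
salt: i, which can be calculated directly
sign: o, which needs t and i
t == e.i, which is the form field i (the word we want to translate); the local variable i is the salt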
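The sign calculation from the JavaScript excerpt can be sketched in Python as follows. The two constants are taken from the fanyi.min.js snippet; `make_sign` is a hypothetical helper name, and the sketch assumes that `n.md5` hashes the UTF-8 bytes of the string and returns lowercase hex.

```python
import hashlib

CLIENT = "fanyideskweb"
SECRET = "ebSeFb%=XZ%T[KZ)c(sy!"  # constant string found in fanyi.min.js


def make_sign(word, salt):
    """MD5 of client + word + salt + secret, as in the JavaScript excerpt.

    Assumption: the site's md5 function works on UTF-8 bytes.
    """
    raw = CLIENT + word + salt + SECRET
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


sign = make_sign("百度", "1535649141531")
print(len(sign))  # 32
```

The salt is passed in as a string because the JavaScript builds it with `"" + (...)` before concatenating.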
Run a quick test in the browser console
>> (new Date).getTime() // current timestamp
1535650639011
>> Math.random() // random number between 0 and 1
0.6568699792902069
// current timestamp + a random integer from 0 to 9
>> ((new Date).getTime() + parseInt(10 * Math.random(), 10))
1535650592194
>> "1535650592194".length
13
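That settles most of it, but a Python timestamp is in seconds: 10 integer digits plus a fractional part,
so a conversion is needed
time_span = int(time.time() * 1000)  # 10-digit seconds -> 13-digit milliseconds
salt = str(time_span + random.randint(0, 9))  # add a random integer 0-9 (parseInt(10 * Math.random(), 10) never yields 10), then convert to a string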
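The conversion can be wrapped in a small function so that it is easy to check. This is a sketch only: `make_salt` and its optional `now` parameter are hypothetical names added for testability. Because `parseInt(10 * Math.random(), 10)` produces 0 through 9, the sketch uses `random.randint(0, 9)`.

```python
import random
import time


def make_salt(now=None):
    """Return a 13-digit millisecond timestamp plus a random 0-9 offset, as a string."""
    seconds = time.time() if now is None else now
    millis = int(seconds * 1000)  # seconds (float) -> milliseconds (int)
    return str(millis + random.randint(0, 9))


salt = make_salt(1535650592.0)
print(len(salt))  # 13
```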
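Substitute these values and compute the MD5 to get sign
Final debugging shows:
the random part of salt can be left out
Content-Length can be left out
To stay close to the original rules, though, I kept both
Final code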
import hashlib
import random
import time

import requests


def youdao_fanyi(key):
    url = "http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule"

    # Port of the JavaScript parameter calculation (the key step)
    time_span = int(time.time() * 1000)
    salt = str(time_span + random.randint(0, 9))  # the JavaScript adds 0-9
    s = "fanyideskweb" + key + salt + "ebSeFb%=XZ%T[KZ)c(sy!"
    sign = hashlib.md5(s.encode("utf-8")).hexdigest()

    # Form parameters submitted with the POST
    data = {
        "i": key,
        "from": "AUTO",
        "to": "AUTO",
        "smartresult": "dict",
        "client": "fanyideskweb",
        "salt": salt,
        "sign": sign,
        "doctype": "json",
        "version": "2.1",
        "keyfrom": "fanyi.web",
        "action": "FY_BY_CLICKBUTTION",
        "typoResult": "false",
    }

    # Required headers and cookies
    headers = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Accept-Encoding": "gzip, deflate",
        "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
        "Connection": "keep-alive",
        # Not the real body length; step 3 found this header to be optional
        "Content-Length": str(len(key)),
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "Host": "fanyi.youdao.com",
        "Origin": "http://fanyi.youdao.com",
        "Referer": "http://fanyi.youdao.com/",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36",
        "X-Requested-With": "XMLHttpRequest",
    }
    cookies = {
        "OUTFOX_SEARCH_USER_ID_NCOO": "937253968.9247279",
        "OUTFOX_SEARCH_USER_ID": "-850496506@10.168.8.76",
        "___rl__test__cookies": str(time_span),
    }

    # Send the request
    response = requests.post(url, data=data, headers=headers, cookies=cookies, timeout=10)
    if response.status_code == 200:
        result = response.json()
    else:
        result = {}

    # On success return the translation; otherwise fall back to the input
    if result.get("errorCode") == 0:
        return result.get("translateResult")[0][0].get("tgt")
    else:
        return key


if __name__ == '__main__':
    key = "紫色"
    print(youdao_fanyi(key))  # purple