Encoding Chinese characters in URLs
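GBK (GB2312) encoding: one Chinese character becomes two %xx groups, i.e. %xx%xx
UTF-8 encoding: one common Chinese character becomes three %xx groups, i.e. %xx%xx%xx (rare characters outside the Basic Multilingual Plane take four)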
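The byte counts above can be checked directly. This is a minimal sketch, assuming Python 3 and the standard `urllib.parse.quote`; the helper name `percent_groups` is made up for illustration.

```python
from urllib.parse import quote


def percent_groups(ch: str, encoding: str) -> str:
    """Percent-encode a single character under the given encoding.

    safe="" means nothing is left unescaped except the always-safe
    characters (letters, digits, "_", ".", "-", "~").
    """
    return quote(ch, safe="", encoding=encoding)


for ch in "中国":
    gbk_form = percent_groups(ch, "gbk")
    utf8_form = percent_groups(ch, "utf-8")
    # Each "%" introduces one encoded byte, so counting "%" counts bytes.
    print(ch, gbk_form, gbk_form.count("%"), utf8_form, utf8_form.count("%"))
# 中 %D6%D0 2 %E4%B8%AD 3
# 国 %B9%FA 2 %E5%9B%BD 3
```

Each %xx group is one byte of the encoded character, which is why the same character has a different length in the two encodings.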
You can use Baidu to try URL encoding and decoding (default gbk). The query below carries 中国 percent-encoded as UTF-8:
https://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD
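A search URL like the one above can also be built and read back in code. This is a rough sketch, assuming Python 3 and the standard `urllib.parse` functions `urlencode`, `urlsplit` and `parse_qs`. The helper names `build_search_url` and `read_keyword` are hypothetical, and whether the site accepts a gbk-encoded query is not checked here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(keyword: str, encoding: str = "utf-8") -> str:
    """Build the search URL with the keyword percent-encoded."""
    base = "https://www.baidu.com/s"
    return base + "?" + urlencode({"wd": keyword}, encoding=encoding)


def read_keyword(url: str, encoding: str = "utf-8") -> str:
    """Extract and decode the wd parameter from a search URL."""
    query = urlsplit(url).query
    return parse_qs(query, encoding=encoding)["wd"][0]


utf8_url = build_search_url("中国")
gbk_url = build_search_url("中国", encoding="gbk")
print(utf8_url)  # https://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD
print(gbk_url)   # https://www.baidu.com/s?wd=%D6%D0%B9%FA
print(read_keyword(utf8_url))                # 中国
print(read_keyword(gbk_url, encoding="gbk"))  # 中国
```

Using `urlencode` on the parameters alone avoids having to list safe characters for the rest of the URL.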
Python 3 encoding and decoding example
# -*- coding: utf-8 -*-
# @File : urldecode_demo.py
# @Date : 2018-05-11

# quote and unquote live in urllib.parse
from urllib.parse import quote, unquote

# Encoding
url1 = "https://www.baidu.com/s?wd=中国"

# UTF-8 encoding, with an explicit set of safe characters left unescaped
ret1 = quote(url1, safe=";/?:@&=+$,", encoding="utf-8")
print(ret1)
# https://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD

# gbk encoding (default safe="/" only, so ":", "?" and "=" are escaped too)
ret2 = quote(url1, encoding="gbk")
print(ret2)
# https%3A//www.baidu.com/s%3Fwd%3D%D6%D0%B9%FA

# Decoding
url3 = "https://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD"
ret3 = unquote(url3, encoding="utf-8")
print(ret3)
# https://www.baidu.com/s?wd=中国
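When it is not known whether a link was encoded as UTF-8 or gbk, one possible approach is to decode the percent groups to raw bytes and try each encoding in turn. This is only a heuristic sketch, assuming Python 3. The helper name `decode_percent` is made up for illustration, and some gbk byte sequences are also valid UTF-8, so the result can be wrong for certain inputs.

```python
from urllib.parse import unquote_to_bytes


def decode_percent(text: str, encodings=("utf-8", "gbk")) -> str:
    """Decode %xx groups, trying each candidate encoding in order.

    UTF-8 is tried first because it is strict: most gbk byte
    sequences are not valid UTF-8 and raise UnicodeDecodeError.
    """
    raw = unquote_to_bytes(text)
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode with any of: " + ", ".join(encodings))


print(decode_percent("https://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD"))
# https://www.baidu.com/s?wd=中国
print(decode_percent("https://www.baidu.com/s?wd=%D6%D0%B9%FA"))
# https://www.baidu.com/s?wd=中国
```

Passing the wrong encoding straight to `unquote` does not raise by default, because its `errors` argument defaults to "replace". Decoding the raw bytes strictly is what makes the fallback possible.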