I'm working through the book Python for Informatics. Chapter 12 uses a socket to download an image from the web, but the example fails with: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 246: invalid start byte
The code runs fine under Python 2.7, but under Python 3.5 it fails on this line:
picture = picture + bytes.decode(data)
The full code is below. Any help fixing it would be much appreciated, thanks!
import socket
import time

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('www.py4inf.com', 80))
s = 'GET http://www.py4inf.com/cover.jpg HTTP/1.0\n\n'
mysock.send(str.encode(s))

count = 0
picture = ''
while True:
    data = mysock.recv(5120)
    if (len(data) < 1): break
    #time.sleep(0.25)
    count = count + len(data)
    print(len(data), count)
    picture = picture + bytes.decode(data)

mysock.close()

# Look for the end of header
pos = picture.find('\r\n\r\n')
print('Header length', pos)
print(picture[:pos])

# Skip past the header and save the picture data
picture = picture[pos+4:]
fhand = open('stuff.jpg', 'wb')
fhand.write(picture)
fhand.close()
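You first need to understand the format of an HTTP response:

<status-line>
<headers>
<blank-line>
[<response-body>]

For this request, the response should look like this:

HTTP/1.1 200 OK
Date: Tue, 23 Feb 2016 09:32:03 GMT
Server: Apache
Last-Modified: Fri, 04 Dec 2015 19:05:04 GMT
ETag: "b294001f-111a9-526172f5b7cc9"
Accept-Ranges: bytes
Content-Length: 70057
Connection: close
Content-Type: image/jpeg

(image data)

In Python 3, what the socket returns is bytes, not str. So when you look for the end of the header, search for a bytes pattern: pos = picture.find('\r\n\r\n'.encode('utf-8')) (the literal b'\r\n\r\n' is equivalent).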
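To make the response layout above concrete, here is a minimal sketch that splits a raw response into status line, headers, and body while keeping the body as bytes. The function name split_http_response and the sample data are made up for illustration; they are not from the book.

```python
def split_http_response(raw: bytes):
    """Split a raw HTTP response into (status_line, headers, body).

    Hypothetical helper for illustration only.
    """
    # The blank line (CRLF CRLF) separates the header block from the body.
    head, sep, body = raw.partition(b'\r\n\r\n')
    if not sep:
        raise ValueError('no blank line found; header is incomplete')

    # Only the header block is text. iso-8859-1 maps every byte to a
    # character, so this decode cannot raise UnicodeDecodeError.
    lines = head.decode('iso-8859-1').split('\r\n')
    status_line = lines[0]

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()

    # The body is returned untouched, still as bytes.
    return status_line, headers, body


# Made-up sample response: a short header followed by 4 bytes of "image" data.
sample = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: image/jpeg\r\n'
          b'Content-Length: 4\r\n'
          b'\r\n'
          b'\xff\xd8\xff\xe0')

status_line, headers, body = split_http_response(sample)
print(status_line)
print(headers['content-type'], headers['content-length'], len(body))
```

Comparing the Content-Length header with len(body) is a cheap way to check that the whole image arrived.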
The image itself is binary data, so don't decode it at all. Just write the bytes straight to the file.
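A small sketch of why decoding fails. JPEG data starts with the bytes 0xff 0xd8, and 0xff is never valid in UTF-8, which is probably the "byte 0xff" the traceback complains about. The sample bytes are made up, and io.BytesIO stands in for a file opened with 'wb' so nothing is written to disk.

```python
import io

# First bytes of a typical JPEG file (illustrative sample, not the real cover.jpg).
jpeg_start = b'\xff\xd8\xff\xe0'

# Decoding binary image data as UTF-8 fails on the very first byte.
try:
    jpeg_start.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError as err:
    decoded_ok = False
    print('decode failed:', err)

# A binary file object accepts bytes directly, no decoding needed.
fake_file = io.BytesIO()  # stand-in for open('stuff.jpg', 'wb')
written = fake_file.write(jpeg_start)
print('bytes written:', written)

# It rejects str, so the picture data has to stay as bytes all the way through.
try:
    fake_file.write('not bytes')
    str_write_failed = False
except TypeError as err:
    str_write_failed = True
    print('writing str failed:', err)
```

This also explains why the Python 2.7 version worked: there, str is already a byte string, so nothing was ever decoded.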
The modified code looks like this:
import socket
import time

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('www.py4inf.com', 80))
s = 'GET http://www.py4inf.com/cover.jpg HTTP/1.0\n\n'
mysock.send(s.encode('utf-8'))

count = 0
picture = b''  # accumulate raw bytes, not str
while True:
    data = mysock.recv(5120)
    if (len(data) < 1): break
    #time.sleep(0.25)
    count = count + len(data)
    print(len(data), count)
    picture = picture + data  # no decode: the image part is binary

mysock.close()

# Look for the end of header
pos = picture.find('\r\n\r\n'.encode('utf-8'))
header = picture[:pos]
print('Header length', pos)
print(header.decode('utf-8'))  # only the header is text

# Skip past the header and save the picture data
picture = picture[pos+4:]
fhand = open('stuff.jpg', 'wb')
fhand.write(picture)
fhand.close()
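As an optional variation, the receive loop can collect chunks in a list and join them once at the end, which avoids rebuilding the bytes object on every iteration. This is only a sketch: recv_all is a made-up helper name, and socket.socketpair() is used as a local stand-in for the real server so the example runs without a network connection.

```python
import socket


def recv_all(sock, bufsize=5120):
    """Read from sock until the peer closes; return everything as bytes.

    Hypothetical helper for illustration only.
    """
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:  # empty bytes means the other side closed the connection
            break
        chunks.append(data)
    return b''.join(chunks)


# Local stand-in for the server: a connected pair of sockets.
server_side, client_side = socket.socketpair()

# Made-up response: a minimal header plus some fake binary "image" bytes.
payload = b'HTTP/1.0 200 OK\r\n\r\n' + b'\xff\xd8' * 500
server_side.sendall(payload)
server_side.close()

received = recv_all(client_side)
client_side.close()

pos = received.find(b'\r\n\r\n')
print('Header length', pos)
print('Body length', len(received) - (pos + 4))
```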