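1. Fixing "string indices must be integers" when parsing JSON in Python
The data scraped from the page came back looking like a dictionary, but I could not access its keys, and the console showed:
TypeError: string indices must be integers
This means the value inside [] must be an integer, which is how a string or list is indexed, not a dictionary. So I checked the type of the data and got: <class 'str'>
So although the data looks like a dictionary, it is actually a string. It has to be parsed from JSON text into Python objects, and the function for that is json.loads(text).
That solved the problem!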
import json

# content holds the JSON text returned by the request
content = json.loads(content)
print(type(content))
prod_price_list = content["list"]
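As a minimal sketch of the same fix on self-contained data: the JSON text in raw and the helper parse_price_list are made up for illustration, since the real page returns its own fields. The helper parses only when it is handed a string, so it also works if the data has already been decoded.

```python
import json

# Hypothetical response body; the real page returns its own fields.
raw = '{"list": [{"name": "apple", "price": 3.5}, {"name": "pear", "price": 2.0}]}'


def parse_price_list(text):
    """Parse JSON text (if still a string) and return the value under "list"."""
    data = json.loads(text) if isinstance(text, str) else text
    return data["list"]


prices = parse_price_list(raw)
print(type(raw).__name__, "->", type(prices).__name__)
```

Indexing raw["list"] directly is what raises the TypeError, because raw is still a str at that point.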
2. lxml: etree.parse raises "lxml.etree.XMLSyntaxError: Specification mandates value for attribute mask, line 1, column 580" when parsing an HTML file
Source code:
from lxml import etree

html = etree.parse(source='baidu.html')
result = html.xpath('/html/head')
print(result)
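Running it raises the XMLSyntaxError quoted in the heading.
Roughly, the message means:
The XML specification requires every attribute to have a value, but the file contains an attribute named mask that has none. etree.parse uses the strict XML parser by default, so the parser has to be specified explicitly. This case parses an HTML file, so parser=etree.HTMLParser() is the right choice. The improved code: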
from lxml import etree

html = etree.parse(source='baidu.html', parser=etree.HTMLParser())
result = html.xpath('/html/head')
print(result)
Result: the query now runs without the error and finds the head element.
3. DeprecationWarning: The explicit passing of coroutine objects to asyncio.wait() is deprecated since Python 3.8, and scheduled for removal in Python 3.11. await asyncio.wait(task)
Source code:
# Recommended way to write coroutines
# Import Python's asynchronous library
import asyncio
import time


# Asynchronous functions
async def func1():
    print("I love the sky num1")
    await asyncio.sleep(3)  # suspends only this coroutine; the event loop is free to run the others
    print("I really love the sky num1")


async def func2():
    print("I love the sky num2")
    await asyncio.sleep(2)  # suspends only this coroutine; the event loop is free to run the others
    print("I really love the sky num2")


async def func3():
    print("I love the sky num3")
    await asyncio.sleep(4)  # suspends only this coroutine; the event loop is free to run the others
    print("I really love the sky num3")


async def main():
    f1 = func1()
    f2 = func2()
    f3 = func3()
    task = [f1, f2, f3]
    await asyncio.wait(task)


if __name__ == "__main__":
    t1 = time.time()
    # Run the asynchronous program
    asyncio.run(main())
    t2 = time.time()
    print(t2 - t1)
Running this reports the DeprecationWarning quoted in the heading.
Solution: change
task = [f1, f2, f3]
to
task = [asyncio.create_task(f1), asyncio.create_task(f2), asyncio.create_task(f3)]
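The change can be sketched as a complete program. The coroutine work and its short delays are stand-ins for func1 to func3, chosen here so the example finishes quickly; only asyncio.create_task and asyncio.wait come from the fix itself.

```python
import asyncio
import time


async def work(name, delay):
    # asyncio.sleep suspends only this coroutine, so the three tasks overlap.
    await asyncio.sleep(delay)
    return name


async def main():
    # Wrap each coroutine in a Task before handing it to asyncio.wait().
    tasks = [
        asyncio.create_task(work("num1", 0.3)),
        asyncio.create_task(work("num2", 0.2)),
        asyncio.create_task(work("num3", 0.4)),
    ]
    done, pending = await asyncio.wait(tasks)
    return sorted(t.result() for t in done), len(pending)


if __name__ == "__main__":
    start = time.time()
    names, left = asyncio.run(main())
    print(names, left, round(time.time() - start, 1))
```

Because the tasks run concurrently, the total time should be close to the longest delay rather than the sum of all three.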
4. AttributeError: module 'aiohttp' has no attribute 'ClientSession'
Today, creating a ClientSession object with the aiohttp module suddenly raised an error.
The original code was:
async with aiohttp.ClientSession() as session:
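The traceback pointed at this line and reported the AttributeError above.
I later found that the cause was the name of my own file. A script named after the library (for example aiohttp.py, an assumed name here) shadows the installed package, so import aiohttp loads the local file, which has no ClientSession. Renaming the file fixes it.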
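One way to confirm this kind of name clash is to ask Python where it would load a module from. This is only a sketch: the helper module_origin is a name made up for illustration, and json stands in for aiohttp so that it runs without third-party packages.

```python
import importlib.util


def module_origin(name):
    """Return the path a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# json stands in for aiohttp here; in the real case, call module_origin("aiohttp").
print(module_origin("json"))
```

If the printed path points into the project folder instead of site-packages, a local file is shadowing the library.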
5. asyncio.run(main()) fails with RuntimeError: Event loop is closed
asyncio.run(main())
Error:
RuntimeError: Event loop is closed
Solution:
Replacing the code above with the following fixed it:
# asyncio.run(main())
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
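On recent Python versions, calling asyncio.get_event_loop() when no loop is current is deprecated, so a variant that creates the loop explicitly may be preferable. The sketch below is an assumption rather than part of the original fix: run_on_new_loop is a hypothetical helper, and main is a placeholder for the real crawler coroutine.

```python
import asyncio


async def main():
    # Placeholder for the real crawler coroutine.
    await asyncio.sleep(0.01)
    return "done"


def run_on_new_loop(coro):
    """Run coro to completion on a freshly created event loop."""
    loop = asyncio.new_event_loop()
    # The loop is deliberately left open, mirroring the workaround above.
    return loop.run_until_complete(coro)


if __name__ == "__main__":
    print(run_on_new_loop(main()))
```

Like the workaround above, this never closes the loop, which is the difference from asyncio.run().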
6. SyntaxError: Non-UTF-8 code starting with '\xe6' in file F:\HuaDaBoSi\scratch\practice\yixinli.py on line 10, but no encoding declared;
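This error means the source file contains bytes that are not valid UTF-8 and does not declare another encoding. One common remedy is to re-save the file as UTF-8; another is to declare the real encoding in a comment on the first line, such as # -*- coding: gbk -*-. The sketch below assumes the file was saved as GBK, which is a guess about the editor's settings, and resave_as_utf8 is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def resave_as_utf8(path, source_encoding="gbk"):
    """Rewrite the file at path as UTF-8, assuming it is currently in source_encoding."""
    p = Path(path)
    text = p.read_bytes().decode(source_encoding)
    p.write_bytes(text.encode("utf-8"))
    return text


# Demo on a throwaway file written in GBK.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "demo.py"
    script.write_bytes('print("我")'.encode("gbk"))
    resave_as_utf8(script)
    converted = script.read_bytes().decode("utf-8")

print(converted)
```

Reading and writing bytes keeps line endings untouched, so only the encoding changes.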
7. Python crawler error: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 422-425: ordinal not in range(256)
While crawling, the cookie I had set turned out to be the problem, and the system raised UnicodeEncodeError: 'latin-1' codec can't encode characters in position 422-425: ordinal not in range(256)
'cookie':'log_session_id=eyJpdiI6IjdYc0lpaDl2elwvWGRXdXRNVTgrZzdBPT0iLCJ2YWx1ZSI6IkVvMWErRm9CT0FTQXlnc0FGUThjRjRXNE9RUWZZSzUyQTdmanFvMWVlNjA9IiwibWFjIjoiMTQyZGE1Zjc0MzMwMTZkOWQxZWQyZWU0MmIzOTJhN2U2NGU5YzcxYTQxOWM5ODNhMWUyMTJlNTdiYjQ2MDVlNCJ9; sajssdk_2015_cross_new_user=1; sensorsdata2015jssdkcross={"distinct_id":"188db94a57517-0201f76e7a10b72-2343360-1024000-188db94a576151","first_id":"","props":{"$latest_traffic_source_type":"直接流量","$latest_search_keyword":"未取到值_直接打开","$latest_referrer":""},"identities":"eyIkaWRlbnRpdHlfY29va2llX2lkIjoiMTg4ZGI5NGE1NzUxNy0wMjAxZjc2ZTdhMTBiNzItMjM0MzM2MC0xMDI0MDAwLTE4OGRiOTRhNTc2MTUxIn0=","history_login_id":{"name":"","value":""},"$device_id":"188db94a57517-0201f76e7a10b72-2343360-1024000-188db94a576151"}; Hm_lvt_d64469e9d7bdbf03af6f074dffe7f9b5=1687311132; acw_tc=0bde432216873260530556762e65a1731973108fc627fd334745b40a3a8a30; acw_sc__v2=64928d9ff03d3c85ab9bc913ab684ed49d5fea51; laravel_session=eyJpdiI6IkZjNUJEQzk1SnhPMFwvV0hvMUk1XC9Sdz09IiwidmFsdWUiOiJKa3V3eVR2UkRtVCtcL3ZIc1RsNk5ZVVFDSjNvU2lNRFl0YjVqeW1udStvRkZVQXJEK3F1Zk5FUjZ3YUxPMSt6cFc2bzhnMTRSTnZlb1h5Y0srXC9iUkl3PT0iLCJtYWMiOiI2ZTI4Yjg0YmY2NzhlZDJiYTFkZmVmM2Q0ZjA2MGViNTY2Mjc1MjFjOTM0OWE3YTlhNmY3MDE2YmFiYzdhNDMzIn0=; zg_did={"did": "188db94a5b4170-0aa171818df2e8-2343360-fa000-188db94a5b5173"}; zg_18f5038ab49c4ae4918641ae36d67496={"sid": 1687326114825,"updated": 1687326114831,"info": 1687311132093,"superProperty": "{}","platform": "{}","utm": "{}","referrerDomain": "www.xinli001.com","zs": 0,"sc": 0,"firstScreen": 1687326114825}; Hm_lpvt_d64469e9d7bdbf03af6f074dffe7f9b5=1687326115',
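The cause is that the cookie value I copied contains Chinese characters. As the error message shows, header values are encoded as latin-1, which cannot represent them, so the cookie has to be encoded explicitly when it is set in headers.
Solution: append .encode('utf-8') to the copied cookie value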
'cookie':'log_session_id=eyJpdiI6IjdYc0lpaDl2elwvWGRXdXRNVTgrZzdBPT0iLCJ2YWx1ZSI6IkVvMWErRm9CT0FTQXlnc0FGUThjRjRXNE9RUWZZSzUyQTdmanFvMWVlNjA9IiwibWFjIjoiMTQyZGE1Zjc0MzMwMTZkOWQxZWQyZWU0MmIzOTJhN2U2NGU5YzcxYTQxOWM5ODNhMWUyMTJlNTdiYjQ2MDVlNCJ9; sajssdk_2015_cross_new_user=1; sensorsdata2015jssdkcross={"distinct_id":"188db94a57517-0201f76e7a10b72-2343360-1024000-188db94a576151","first_id":"","props":{"$latest_traffic_source_type":"直接流量","$latest_search_keyword":"未取到值_直接打开","$latest_referrer":""},"identities":"eyIkaWRlbnRpdHlfY29va2llX2lkIjoiMTg4ZGI5NGE1NzUxNy0wMjAxZjc2ZTdhMTBiNzItMjM0MzM2MC0xMDI0MDAwLTE4OGRiOTRhNTc2MTUxIn0=","history_login_id":{"name":"","value":""},"$device_id":"188db94a57517-0201f76e7a10b72-2343360-1024000-188db94a576151"}; Hm_lvt_d64469e9d7bdbf03af6f074dffe7f9b5=1687311132; acw_tc=0bde432216873260530556762e65a1731973108fc627fd334745b40a3a8a30; acw_sc__v2=64928d9ff03d3c85ab9bc913ab684ed49d5fea51; laravel_session=eyJpdiI6IkZjNUJEQzk1SnhPMFwvV0hvMUk1XC9Sdz09IiwidmFsdWUiOiJKa3V3eVR2UkRtVCtcL3ZIc1RsNk5ZVVFDSjNvU2lNRFl0YjVqeW1udStvRkZVQXJEK3F1Zk5FUjZ3YUxPMSt6cFc2bzhnMTRSTnZlb1h5Y0srXC9iUkl3PT0iLCJtYWMiOiI2ZTI4Yjg0YmY2NzhlZDJiYTFkZmVmM2Q0ZjA2MGViNTY2Mjc1MjFjOTM0OWE3YTlhNmY3MDE2YmFiYzdhNDMzIn0=; zg_did={"did": "188db94a5b4170-0aa171818df2e8-2343360-fa000-188db94a5b5173"}; zg_18f5038ab49c4ae4918641ae36d67496={"sid": 1687326114825,"updated": 1687326114831,"info": 1687311132093,"superProperty": "{}","platform": "{}","utm": "{}","referrerDomain": "www.xinli001.com","zs": 0,"sc": 0,"firstScreen": 1687326114825}; Hm_lpvt_d64469e9d7bdbf03af6f074dffe7f9b5=1687326115'.encode('utf-8'),
Solved!
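The failure and the fix can be reproduced without sending a request. The cookie below is a shortened, hypothetical value that only mirrors the Chinese text found in the real one.

```python
# Hypothetical shortened cookie containing the same kind of Chinese text.
cookie = 'props={"$latest_traffic_source_type":"直接流量"}; sajssdk_2015_cross_new_user=1'

# latin-1 covers only code points below 256, so Chinese characters cannot be encoded.
try:
    cookie.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Encoding to UTF-8 up front hands the header to the HTTP layer as bytes.
headers = {"cookie": cookie.encode("utf-8")}
print(latin1_ok, type(headers["cookie"]).__name__)
```

The header value is then bytes rather than str, so no latin-1 conversion is attempted on it.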