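Today's scraping practice: grab the books on Dangdang's 30-day top-rated list. The exact page address is the one requested in the code below.
Plain requests plus bs4 is enough to get it done. Here is the code: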
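import requests
import bs4

# Open the output file once instead of reopening it for every title
with open('book.txt', 'a+', encoding='utf-8') as f:
    for i in range(1, 26):  # request pages 1 through 25
        url = 'http://bang.dangdang.com/books/fivestars/01.00.00.00.00.00-recent30-0-0-1-' + str(i)
        response = requests.get(url)
        # Name the parser explicitly so bs4 does not have to guess one
        html = bs4.BeautifulSoup(response.text, 'html.parser')
        # Each book title is the link inside div.name of a list item
        kk = html.select('ul > li > div.name > a')
        for ii in kk:
            # print(ii.text)
            f.write(ii.text + '\n')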
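If you want to check the extraction logic without hitting the site, here is a rough sketch that uses only the standard library. Treat it as an illustration, not as the method used above: `page_urls`, `TitleParser`, `extract_titles` and the `SAMPLE` markup are all names and data made up for this example, and the parser only approximates the `ul > li > div.name > a` selector (it looks for any link inside a `div` with class `name` and does not handle nested `div`s or check the `ul > li` ancestry).

```python
from html.parser import HTMLParser

# Same URL prefix as in the script above; the page number is appended to it.
BASE_URL = "http://bang.dangdang.com/books/fivestars/01.00.00.00.00.00-recent30-0-0-1-"


def page_urls(first=1, last=25):
    """Return the list-page URLs for pages first..last (inclusive)."""
    return [BASE_URL + str(n) for n in range(first, last + 1)]


class TitleParser(HTMLParser):
    """Collect the text of <a> elements found inside <div class="name">."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_name_div = False
        self._in_link = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "name" in classes:
            self._in_name_div = True
        elif tag == "a" and self._in_name_div:
            self._in_link = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_link:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.titles.append("".join(self._buffer).strip())
            self._in_link = False
        elif tag == "div" and self._in_name_div:
            # Simplification: assumes div.name contains no nested <div>
            self._in_name_div = False


def extract_titles(html):
    """Parse an HTML string and return the titles found in it."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


# Hypothetical markup shaped like the selector implies; not copied from the site.
SAMPLE = """
<ul>
  <li><div class="pic"><a href="#">cover</a></div>
      <div class="name"><a href="#">Book One</a></div></li>
  <li><div class="name"><a href="#">Book Two</a></div></li>
</ul>
"""

if __name__ == "__main__":
    print(page_urls(1, 2))
    print(extract_titles(SAMPLE))
```

Keeping URL building and parsing in separate functions means each can be tried on its own, and only the fetching step needs a network connection.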
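A screenshot of the finished result:
It's worth pulling out a scraper for practice every now and then so the skills don't get rusty 😄. Bye!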