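This post was written by a good friend of mine, a real dalao. Today he will show us how to use Python to build a little tool for winning a girl's heart. Read it to the end and you'll be on your way to the top of the world, girl of your dreams included.
I know you want to see the result first.
Of course, this is only the test version; the real one is far more impressive. Still, as a straight guy, getting this test message from another guy felt a little weird.
Alright, over to the dalao.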
By 吉柏言
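Summer vacation is here, and once again you have to spend two months apart from your boyfriend or girlfriend! Two whole months without seeing each other. Do you know how much I miss you? I want to know whether it's raining in your city, whether you took an umbrella, what you look like, what your name is... ahem. As a single dog, this editor remains emotionally stable.
No worries. You can't meet, but you can still care from the cloud: check the weather where she is every day, send a caring message as her boyfriend, and attach a picture that says you love her. But a die-hard PUBG player has to chase that chicken dinner every night, so what happens when you wake up and forget to check in? Why do by hand what a machine can do for us? Time for some Python!
Today's code does exactly that: a scheduled daily check-in that sends warm wishes to your dear girlfriend. (Single-dog editor remains emotionally stable.)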
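Environment setup
First, import the libraries we need. requests and bs4 are third-party packages (installable with pip install requests beautifulsoup4); the rest ship with Python: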
```python
import requests
from bs4 import BeautifulSoup
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.header import Header
import smtplib
import os
```
We use requests + bs4 to scrape the day's weather and the pictures we need, the email and smtplib libraries to send the mail, and os to work with files along the way.
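Let's get started
First we scrape the weather and the pictures. I chose the National Meteorological Center site (nmc.cn) for the weather and, for the pictures, the Douban album of a Douban user named “狼魄乾坤”. Even thieves have a code of honor, so before scraping we check each site's robots protocol.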
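Good: when I checked, the National Meteorological Center had no robots protocol and Douban placed no restrictions on albums, so we can go ahead and scrape.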
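The robots check can also be done in code rather than by eye. The sketch below uses the standard library's urllib.robotparser. The robots.txt text, the example.com URLs, and the helper name is_allowed are made up for illustration so that the example runs offline; for a real site you would fetch its robots.txt first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only so the example runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/photos/album/1/"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/private/page.html"))
```

The first call prints True and the second prints False, because only paths under /private/ are disallowed in the sample rules.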
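Go to the site and look up the city she lives in. Having no girlfriend, I'll use my own city as the example.
http://www.nmc.cn/publish/forecast/AHB/wuhan.html . Looking at this address, the naming rule for cities is the letter A plus the province abbreviation, such as HB for 湖北 (Hubei), followed by the city's pinyin. For the provinces whose abbreviations are ambiguous, I've listed them in a comment in the code below: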
```python
def main():
    # print("河北HE 内蒙古NM 陕西SN 黑龙江HL 河南HA")
    # province = input("input the province,the big alpha for short:")
    # city = input("input the city you wanna search plz:")
    province = "HB"
    city = "wuhan"
    url = "http://www.nmc.cn/publish/forecast/A" + province + "/" + city + ".html"
    html = getHTMLText(url)
    url = "https://www.douban.com/photos/album/157693223/"
    image = getHTMLText(url)
```
Please ignore my terrible English.
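The URL rule described above (the letter A, the province abbreviation, then the city's pinyin) can be wrapped in a small helper. This is only a sketch: the name build_forecast_url and the lookup table are my own, and the table holds just the abbreviations the article mentions.

```python
# Abbreviations taken from the article: HB for 湖北, plus the ambiguous
# ones listed in the commented-out print() call in main().
PROVINCE_CODES = {
    "湖北": "HB",
    "河北": "HE",
    "内蒙古": "NM",
    "陕西": "SN",
    "黑龙江": "HL",
    "河南": "HA",
}


def build_forecast_url(province, city_pinyin):
    """Build a forecast page URL from a province name (or code) and city pinyin."""
    # Unknown names are assumed to already be two-letter codes such as "HB".
    code = PROVINCE_CODES.get(province, province).upper()
    return "http://www.nmc.cn/publish/forecast/A{}/{}.html".format(
        code, city_pinyin.lower()
    )


print(build_forecast_url("湖北", "wuhan"))
# prints http://www.nmc.cn/publish/forecast/AHB/wuhan.html
```

Passing the code directly, as in build_forecast_url("HB", "wuhan"), gives the same address.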
getHTMLText(url) is a helper we define ourselves. It fetches the page source and returns the whole source as a single string:
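```python
def getHTMLText(url):
    """Fetch a page and return its source as text, or "" on failure."""
    try:
        r = requests.get(url, timeout=10)
        # Call the method: it raises HTTPError for 4xx/5xx responses
        r.raise_for_status()
        # Let requests guess the encoding from the body to avoid garbled text
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""
```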
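We send the request with requests.get(url) and check the response with raise_for_status(), which raises an exception when the status code signals an error (4xx or 5xx); if nothing is raised, for example on a 200, the fetch succeeded. We then replace encoding with apparent_encoding, which lets the program detect the encoding from the content itself so that the returned text is not garbled. If nothing went wrong, the freshly fetched weather page is returned and stored in the variable html. The album page is fetched the same way and stored in the variable image.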
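```python
    imagLink = []
    # The original listing never filled this list; parse the album page here
    parserHTMLPicture(image, imagLink)
    whetherInfo = parserHTMLWeather(html)
    name = 1
    for link in imagLink:
        print(link)
    for link in imagLink:
        downloadPicture(link, name)
        name += 1
```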
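Back in main(), we declare a list, imagLink, to hold the address of every picture in the album, while whetherInfo stores the parsed weather information. We print each address first to confirm it is what we expect, because the album's source contains both Douban's own large-image addresses and the picture addresses, and the picture addresses are the ones we need. Once that checks out, we download the pictures one by one.
First, the parserHTMLWeather function, which parses the weather information: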
```python
def parserHTMLWeather(html):
    """Parse the forecast page into a dict: "place", "AnnoceTime", and one entry per day."""
    try:
        dirt = {}
        soup = BeautifulSoup(html, "html.parser")
        place = soup.find(name="head").find("title")
        dirt["place"] = str(place.string).split("-")[0]
        AnnoceTime = soup.find(name='div', attrs={"class": "btitle"}).find("span")
        dirt["AnnoceTime"] = str(AnnoceTime.string)
        Everyday = AnnoceTime.find_parent().find_next_sibling().find_all(name="div", class_="detail")
        for eachday in Everyday:
            info = eachday.find(name="div", class_="day")
            thisDay = {}
            # Each field lives in a <div> whose class matches the field name
            for field in ("date", "week", "wdesc", "temp", "direct", "wind"):
                thisDay[field] = str(info.find(name="div", class_=field).string)
            # Keyed by the raw date text, exactly as it appears on the page
            dirt[thisDay["date"]] = thisDay

        return dirt
    except Exception:
        # A missing tag means the page layout is not what we expected
        return {}
```
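We first declare dirt as a dictionary, then parse the html with BeautifulSoup. The resulting soup object offers find and find_all, which we use to locate the tags that hold the information we need. To work out which tag holds which piece of information, open the page source in the browser and press Ctrl + F to search it. Each day's values go into a thisDay dictionary, the seven daily thisDay dictionaries are added to dirt, and dirt is returned. The raw strings are cleaned up with split, by cutting them apart and picking out the pieces we need. Regular expressions would work too, at the cost of importing the re library.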
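As a sketch of the regular-expression alternative just mentioned, the snippet below pulls temperatures out of a text fragment with re. The function name extract_temperatures and the sample strings are illustrative only; the real text on the forecast page may be laid out differently.

```python
import re

# A number (optionally negative or decimal) followed by the ℃ sign
TEMP_PATTERN = re.compile(r"(-?\d+(?:\.\d+)?)\s*℃")


def extract_temperatures(text):
    """Return every temperature found in text as a list of floats."""
    return [float(value) for value in TEMP_PATTERN.findall(text)]


print(extract_temperatures("\n 多云 31℃ 南风 微风 "))  # [31.0]
print(extract_temperatures("晴"))                       # []
```

Compared with split, the pattern tolerates surrounding whitespace and simply returns an empty list when no temperature is present.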
Next, parsing the pictures:
```python
def parserHTMLPicture(imag, imagLink):
    """Append the address of every picture on the album page to imagLink."""
    try:
        soup = BeautifulSoup(imag, "html.parser")
        # next_url = soup.find(name = 'link',rel = 'next')['href']
        # next_page = getHTMLText(next_url)
        imagAddress = soup.find(name='div', class_='photolst clearfix').find_all(name='img')
        for image in imagAddress:
            imagLink.append(image['src'])

        return imagLink
    except Exception:
        return []
```
For the pictures, all we need is to collect their addresses into the imagLink list. We then walk through that list and download each picture:
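```python
def downloadPicture(url, name):
    root = 'C:\\Users\\10990\\Pictures\\'  # put your own save path here
    path = root + str(name) + '.jpg'
    try:
        # Create the folder (and any missing parents) if it does not exist
        os.makedirs(root, exist_ok=True)
        if not os.path.exists(path):
            r = requests.get(url, timeout=10)
            r.raise_for_status()
            # The with block closes the file; no explicit close() is needed
            with open(path, 'wb') as f:
                f.write(r.content)
            print("File saved")
        else:
            print("File already exists")
    except (requests.RequestException, OSError):
        print("Download failed")
```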
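Before downloading, check whether the target folder exists and create it if it doesn't. Also check whether the picture has already been downloaded, and skip it if the file is there. I name the files with consecutive numbers, which makes them easier to refer to later.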
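The two checks described here (create the folder if needed, skip files that already exist) can be tried without touching the network. In this sketch the helper name save_if_missing is my own, and the bytes are stand-in data rather than a real picture.

```python
import os
import tempfile


def save_if_missing(root, name, data):
    """Write data to <root>/<name>.jpg unless it exists; return True if written."""
    os.makedirs(root, exist_ok=True)  # also creates missing parent folders
    path = os.path.join(root, "{}.jpg".format(name))
    if os.path.exists(path):
        return False  # already downloaded earlier, so skip it
    with open(path, "wb") as f:  # the with block closes the file for us
        f.write(data)
    return True


# Demo in a temporary folder so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "Pictures")
    print(save_if_missing(target, 1, b"stand-in image bytes"))  # True: written
    print(save_if_missing(target, 1, b"stand-in image bytes"))  # False: skipped
```

Using os.path.join instead of string concatenation also avoids depending on a trailing backslash in the folder path.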
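Next we create a txt file that stores the name (a number) of the picture to send this time. Note that this file must sit in the same folder as the .py file.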
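One possible way to handle this counter file is sketched below: read the number, fall back to a starting value if the file is missing or empty, and store the next number. The helper name next_picture_number and the fallback behaviour are assumptions of mine, not part of the original script.

```python
import os
import tempfile


def next_picture_number(counter_path, start=1):
    """Return the picture number to use now and store the following one."""
    try:
        with open(counter_path, "r", encoding="utf-8") as f:
            current = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        current = start  # missing, empty, or corrupt file: start over
    with open(counter_path, "w", encoding="utf-8") as f:
        f.write(str(current + 1))
    return current


# Demo in a temporary folder: two consecutive "days"
with tempfile.TemporaryDirectory() as tmp:
    counter = os.path.join(tmp, "pictureName.txt")
    print(next_picture_number(counter))  # 1
    print(next_picture_number(counter))  # 2
```

Parsing with int() rather than eval() means a stray edit to the file can at worst reset the counter, not run code.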
Back in main():
```python
    with open('pictureName.txt', 'r') as f:
        # int() is safer than eval() for reading a number from a file
        name = int(f.read().strip())
    with open('pictureName.txt', 'w') as f:
        newName = str(name + 1)
        f.write(newName)
    msgRoot = makeMessage(whetherInfo, name)
    sendMsg(msgRoot)
```
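Here we read the current picture number into name, then write name plus one back to the file, so that each day's email carries a new picture. After that, all that remains is to pack the weather information, whatever we want to say that day, and the picture into an email object and send it.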
```python
def makeMessage(dirt, image):
    # Compose the text message
    print(dirt)
    # Find today's entry; the date keys may carry whitespace from the page
    today = next(day for key, day in dirt.items()
                 if isinstance(day, dict) and key.strip() == "今天")
    message = dirt["place"] + ' 今天 '
    for item in ('wdesc', 'temp', 'direct', 'wind'):  # a tuple keeps a fixed order
        message += today[item].strip('\n ') + " "
    for temp in message.split(" "):
        if "℃" in temp:
            degrees = float(temp.split("℃")[0])  # float() instead of eval() on scraped text
            if degrees > 25:
                message += "今天很热,尽量别出门啦"  # "It's hot today, try to stay in"
            elif degrees < 12:
                message += "今天很冷,注意保暖"  # "It's cold today, keep warm"
    if "雨" in message:  # rain is in the forecast
        message += " 出门的话记得带伞"  # "Take an umbrella if you go out"
    print(message)

    # Build the email object
    msgRoot = MIMEMultipart('related')
    msgRoot['From'] = Header("我是发信人", "utf-8")  # placeholder: "I am the sender"
    msgRoot['To'] = Header('我是收信人', 'utf-8')  # placeholder: "I am the recipient"
    subject = '赴戍登程口占示家人'
    msgRoot['Subject'] = Header(subject, 'utf-8')

    msgAlternative = MIMEMultipart('alternative')
    msgRoot.attach(msgAlternative)

    mail_msg = '''
    <p> 力微任重久神疲,再竭衰庸定不支。
    苟利国家生死以,岂因祸福避趋之?
    谪居正是君恩厚,养拙刚于戍卒宜。
    戏与山妻谈故事,试吟断送老头皮。
    </p>
    <p>''' + message + '''</p>
    <p><img src = "cid:image1"></p>
'''
    msgAlternative.attach(MIMEText(mail_msg, 'html', 'utf-8'))

    # Path of today's picture
    catalog = 'C:\\Users\\10990\\Pictures\\' + str(image) + ".jpg"
    with open(catalog, 'rb') as fp:
        msgImage = MIMEImage(fp.read())

    # Set the picture's Content-ID so the HTML body can reference it
    msgImage.add_header('Content-ID', '<image1>')
    msgRoot.attach(msgImage)
    return msgRoot


def sendMsg(message):
    mail_host = "smtp.qq.com"  # SMTP server to use
    mail_user = "*******"  # user name and password
    mail_pass = "********"
    sender = '********'  # sender address
    # Recipients: this is a list, so you could send to several people,
    # though I'd advise against getting carried away
    receivers = ['*******']
    try:
        smtpObj = smtplib.SMTP()
        smtpObj.connect(mail_host)
        smtpObj.ehlo()
        smtpObj.starttls()
        smtpObj.login(mail_user, mail_pass)
        smtpObj.sendmail(sender, receivers, message.as_string())
        print("Email sent")
        smtpObj.quit()
    except smtplib.SMTPException:
        print("Error: could not send the email")
```
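For comparison, here is a sketch of the same HTML-plus-inline-picture layout built with email.message.EmailMessage, the newer API in Python's standard library. This is an alternative to the MIMEMultipart approach above, not the author's method; build_message, the addresses, and the image bytes are placeholders invented for the example.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, text, image_bytes):
    """Build an email with a plain-text fallback, an HTML body, and an inline image."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(text)  # plain-text version for clients without HTML
    html = '<p>{}</p><p><img src="cid:image1"></p>'.format(text)
    msg.add_alternative(html, subtype="html")
    # Attach the picture to the HTML part so that cid:image1 resolves to it
    html_part = msg.get_payload()[1]
    html_part.add_related(image_bytes, maintype="image", subtype="jpeg",
                          cid="<image1>")
    return msg


# Placeholder values for illustration only
demo = build_message("me@example.com", "you@example.com", "Daily weather",
                     "Wuhan today: cloudy 31℃", b"stand-in image bytes")
print(demo.get_content_type())  # multipart/alternative
```

An EmailMessage built this way can be handed to smtplib's send_message() method in place of sendmail() with as_string().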
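The email part is mostly code you can find online. You can of course go a step further: scrape all sorts of sweet talk from the web, parse it the same way, and add it to the email object. To avoid being force-fed dog food, I'll leave that for you to explore (in truth, I'm just lazy). One thing to note: we only scraped a single page of pictures. Once the number in the txt file exceeds 18, that is, after 18 days of automatic sending, the pictures on the second page have to be scraped as well, so you may want to add code that moves on to the next page automatically.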
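The bookkeeping behind "after 18 days, move to the second page" is plain integer arithmetic. The sketch below assumes 18 pictures per album page, as stated above, and the name locate_picture is my own. How a page index becomes a Douban URL is not covered in the article (the commented-out lines in parserHTMLPicture hint at following the page's rel="next" link), so that step is left out.

```python
PHOTOS_PER_PAGE = 18  # per the article: one album page holds 18 pictures


def locate_picture(day_number, per_page=PHOTOS_PER_PAGE):
    """Map a 1-based day counter to (page_index, position_on_page), both 0-based."""
    if day_number < 1:
        raise ValueError("day_number starts at 1")
    return divmod(day_number - 1, per_page)


print(locate_picture(1))   # (0, 0): first picture on the first page
print(locate_picture(18))  # (0, 17): last picture on the first page
print(locate_picture(19))  # (1, 0): first picture on the second page
```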
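I also recommend running the program on a server with a scheduled job, so that it sends at a fixed time every day. That way she receives your cloud care every single day.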
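If you would rather keep the timing inside Python than rely on the server's own scheduler, one simple approach is to compute how long to sleep until the next send time. This is a minimal sketch; the helper seconds_until and the 07:30 send time are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def seconds_until(hour, minute, now):
    """Seconds from now until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return (target - now).total_seconds()


print(seconds_until(7, 30, datetime(2024, 1, 1, 7, 0)))  # 1800.0
print(seconds_until(7, 30, datetime(2024, 1, 1, 8, 0)))  # 84600.0
```

A long-running script could then loop over time.sleep(seconds_until(7, 30, datetime.now())) followed by main(). A system scheduler such as cron is usually the sturdier choice, since it survives restarts.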
Remember to have your girlfriend add you to her whitelist; otherwise the emails you send are likely to land in her spam folder.
Complete code:
```python
import requests
from bs4 import BeautifulSoup
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.header import Header
import smtplib
import os


def getHTMLText(url):
    """Fetch a page and return its source as text, or "" on failure."""
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()  # raises HTTPError for 4xx/5xx responses
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""


def parserHTMLWeather(html):
    """Parse the forecast page into a dict: "place", "AnnoceTime", and one entry per day."""
    try:
        dirt = {}
        soup = BeautifulSoup(html, "html.parser")
        place = soup.find(name="head").find("title")
        dirt["place"] = str(place.string).split("-")[0]
        AnnoceTime = soup.find(name='div', attrs={"class": "btitle"}).find("span")
        dirt["AnnoceTime"] = str(AnnoceTime.string)
        Everyday = AnnoceTime.find_parent().find_next_sibling().find_all(name="div", class_="detail")
        for eachday in Everyday:
            info = eachday.find(name="div", class_="day")
            thisDay = {}
            # Each field lives in a <div> whose class matches the field name
            for field in ("date", "week", "wdesc", "temp", "direct", "wind"):
                thisDay[field] = str(info.find(name="div", class_=field).string)
            dirt[thisDay["date"]] = thisDay

        return dirt
    except Exception:
        # A missing tag means the page layout is not what we expected
        return {}


def parserHTMLPicture(imag, imagLink):
    """Append the address of every picture on the album page to imagLink."""
    try:
        soup = BeautifulSoup(imag, "html.parser")
        imagAddress = soup.find(name='div', class_='photolst clearfix').find_all(name='img')
        for image in imagAddress:
            imagLink.append(image['src'])

        return imagLink
    except Exception:
        return []


def downloadPicture(url, name):
    root = 'C:\\Users\\10990\\Pictures\\'  # put your own save path here
    path = root + str(name) + '.jpg'
    try:
        os.makedirs(root, exist_ok=True)
        if not os.path.exists(path):
            r = requests.get(url, timeout=10)
            r.raise_for_status()
            with open(path, 'wb') as f:
                f.write(r.content)
            print("File saved")
        else:
            print("File already exists")
    except (requests.RequestException, OSError):
        print("Download failed")


def makeMessage(dirt, image):
    # Compose the text message
    print(dirt)
    # Find today's entry; the date keys may carry whitespace from the page
    today = next(day for key, day in dirt.items()
                 if isinstance(day, dict) and key.strip() == "今天")
    message = dirt["place"] + ' 今天 '
    for item in ('wdesc', 'temp', 'direct', 'wind'):  # a tuple keeps a fixed order
        message += today[item].strip('\n ') + " "
    for temp in message.split(" "):
        if "℃" in temp:
            degrees = float(temp.split("℃")[0])  # float() instead of eval() on scraped text
            if degrees > 25:
                message += "今天很热,尽量别出门啦"  # "It's hot today, try to stay in"
            elif degrees < 12:
                message += "今天很冷,注意保暖"  # "It's cold today, keep warm"
    if "雨" in message:  # rain is in the forecast
        message += " 出门的话记得带伞"  # "Take an umbrella if you go out"
    print(message)

    # Build the email object
    msgRoot = MIMEMultipart('related')
    msgRoot['From'] = Header("我是发信人", "utf-8")  # placeholder: "I am the sender"
    msgRoot['To'] = Header('我是收信人', 'utf-8')  # placeholder: "I am the recipient"
    subject = '赴戍登程口占示家人'
    msgRoot['Subject'] = Header(subject, 'utf-8')

    msgAlternative = MIMEMultipart('alternative')
    msgRoot.attach(msgAlternative)

    mail_msg = '''
    <p> 力微任重久神疲,再竭衰庸定不支。
    苟利国家生死以,岂因祸福避趋之?
    谪居正是君恩厚,养拙刚于戍卒宜。
    戏与山妻谈故事,试吟断送老头皮。
    </p>
    <p>''' + message + '''</p>
    <p><img src = "cid:image1"></p>
'''
    msgAlternative.attach(MIMEText(mail_msg, 'html', 'utf-8'))

    # Path of today's picture
    catalog = 'C:\\Users\\10990\\Pictures\\' + str(image) + ".jpg"
    with open(catalog, 'rb') as fp:
        msgImage = MIMEImage(fp.read())

    # Set the picture's Content-ID so the HTML body can reference it
    msgImage.add_header('Content-ID', '<image1>')
    msgRoot.attach(msgImage)
    return msgRoot


def sendMsg(message):
    mail_host = "smtp.qq.com"  # SMTP server to use
    mail_user = "*******"  # user name and password
    mail_pass = "********"
    sender = '********'  # sender address
    # Recipients: this is a list, so you could send to several people,
    # though I'd advise against getting carried away
    receivers = ['*******']
    try:
        smtpObj = smtplib.SMTP()
        smtpObj.connect(mail_host)
        smtpObj.ehlo()
        smtpObj.starttls()
        smtpObj.login(mail_user, mail_pass)
        smtpObj.sendmail(sender, receivers, message.as_string())
        print("Email sent")
        smtpObj.quit()
    except smtplib.SMTPException:
        print("Error: could not send the email")


def main():
    # print("河北HE 内蒙古NM 陕西SN 黑龙江HL 河南HA")
    # province = input("input the province,the big alpha for short:")
    # city = input("input the city you wanna search plz:")
    province = "HB"
    city = "wuhan"
    url = "http://www.nmc.cn/publish/forecast/A" + province + "/" + city + ".html"
    html = getHTMLText(url)
    url = "https://www.douban.com/photos/album/157693223/"
    image = getHTMLText(url)
    imagLink = []
    # The original listing never filled this list; parse the album page here
    parserHTMLPicture(image, imagLink)
    whetherInfo = parserHTMLWeather(html)
    name = 1
    for link in imagLink:
        print(link)
    for link in imagLink:
        downloadPicture(link, name)
        name += 1
    with open('pictureName.txt', 'r') as f:
        name = int(f.read().strip())
    with open('pictureName.txt', 'w') as f:
        newName = str(name + 1)
        f.write(newName)
    msgRoot = makeMessage(whetherInfo, name)
    sendMsg(msgRoot)


if __name__ == "__main__":
    main()
```