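I've been feeling pretty down lately and decided to change jobs, but I ran into a problem: I sent out countless résumés, yet hardly anyone called me for an interview. To send résumés more efficiently, or rather to lower the cost of sending each one, I started writing a bulk résumé sender.
The design works like this: the user (that is, me) specifies in a resource file which recruitment sites to search for job postings, and each site gets its own Sender. Page content differs from site to site, so each one needs its own regular expressions for matching. Every Sender must implement the ISender interface, which defines methods for searching job postings, filtering out companies you don't want to apply to, and so on. The tool is quite powerful. So far I have finished the Sender for 51Job; Senders for Zhaopin (智联招聘) and ChinaHR (中华英才网) will follow shortly. Designing this tool really drove home for me the benefits of programming to an interface. Below is the overall design diagram.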
Below is the interface for retrieving job postings.
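The interface listing is not reproduced here. As a rough sketch only: SenderDispatch (shown further down) calls `searchResultByKeyWord` through an `ISender` reference, so that method at least belongs to the interface. Everything else in the block below (`RecordingSender`, `DispatchSketch`, the `example.invalid` URL) is hypothetical and exists only to show why the dispatcher can stay unaware of which site it is talking to.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Minimal form of the contract; the real ISender may declare more methods.
interface ISender {
    void searchResultByKeyWord(String url);
}

// Hypothetical Sender that only records the URLs it was asked to search.
class RecordingSender implements ISender {
    final List<String> searchedUrls = new ArrayList<>();

    @Override
    public void searchResultByKeyWord(String url) {
        searchedUrls.add(url);
    }
}

// Hypothetical dispatcher: it depends on ISender only, never on a concrete site.
class DispatchSketch {
    private final Map<String, ISender> senders = new LinkedHashMap<>();
    private final Map<String, String> searchUrls = new LinkedHashMap<>();

    void register(String siteName, String searchUrl, ISender sender) {
        String key = siteName.toLowerCase(Locale.ROOT);
        senders.put(key, sender);
        searchUrls.put(key, searchUrl);
    }

    // Takes the comma-separated "website" value and returns how many sites were dispatched.
    int run(String websiteConfig) {
        int dispatched = 0;
        for (String name : websiteConfig.split(",")) {
            String key = name.trim().toLowerCase(Locale.ROOT);
            ISender sender = senders.get(key);
            if (sender != null) {
                sender.searchResultByKeyWord(searchUrls.get(key));
                dispatched++;
            }
        }
        return dispatched;
    }

    public static void main(String[] args) {
        RecordingSender fake = new RecordingSender();
        DispatchSketch dispatch = new DispatchSketch();
        dispatch.register("51job", "http://example.invalid/search", fake);
        System.out.println(dispatch.run("51JOB, zhaopin"));
        System.out.println(fake.searchedUrls);
    }
}
```

Supporting another site then comes down to writing one more class that implements `ISender` and registering it; the dispatch loop does not change.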
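Below is the resource file. Looking at it, don't you get the feeling that this bulk sender is going to be powerful?
If you don't, well, there's nothing I can do about that... heh.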
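Let me go through the resource file in detail. First comes keyword: the bulk sender (just "Sender" from here on) searches with the keyword the user supplies. website names the site to search (currently only 51JobSender exists). Next is city, which may contain several cities. After that comes exclud, the exclusion list: companies you do not want to send a résumé to (like me, you surely don't want to apply to your current employer). Last comes the email-related information. One interesting detail is mailsubject, the subject line of the e-mail: it can be written the way I wrote it above, "JAVAEE四年应聘:XXX职位" ("JavaEE, four years' experience, applying for: position XXX"), where the "position XXX" part is filled in automatically from the search results.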
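The resource file listing is not reproduced here, so the following is only a guess at its shape. The element names come from the XPath expressions used in the code further down (`ROOT/keyword`, `ROOT/website`, `ROOT/city/value`, `ROOT/exclud/value`, `ROOT/emaildetail/...`); every value is a made-up placeholder. The post reads the file with dom4j; this sketch swaps in the JDK's built-in XPath support so that it runs without extra libraries.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SysConfigSketch {
    // Hypothetical sample: element names follow the post's code, values are placeholders.
    static final String SAMPLE_XML =
          "<ROOT>"
        + "<keyword>java</keyword>"
        + "<website>51job</website>"
        + "<city><value>Beijing</value><value>Shanghai</value></city>"
        + "<exclud><value>ExampleCorp</value></exclud>"
        + "<emaildetail>"
        + "<emailusername>someone@example.com</emailusername>"
        + "<emailpassword>change-me</emailpassword>"
        + "<mailsubject>JavaEE, four years, applying for</mailsubject>"
        + "</emaildetail>"
        + "</ROOT>";

    private final String xml;

    SysConfigSketch(String xml) {
        this.xml = xml;
    }

    // Returns the text of the first node matching the path.
    String text(String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(path, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Cannot evaluate " + path, e);
        }
    }

    // Returns the text of every node matching the path, in document order.
    List<String> values(String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    path, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Cannot evaluate " + path, e);
        }
    }

    public static void main(String[] args) {
        SysConfigSketch config = new SysConfigSketch(SAMPLE_XML);
        System.out.println("keyword:  " + config.text("/ROOT/keyword"));
        System.out.println("cities:   " + config.values("/ROOT/city/value"));
        System.out.println("excluded: " + config.values("/ROOT/exclud/value"));
        // The job title found on the detail page is appended to the configured subject.
        System.out.println(config.text("/ROOT/emaildetail/mailsubject") + " : " + "Java Developer");
    }
}
```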
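There are two more classes in this Sender worth introducing. The first is the City class.
Each site expects its search data in a different form. 51Job maps every city to a code (Zhaopin, by contrast, submits the city name directly). The user (me) doesn't know these codes, only the city names, so the Sender maintains the lookup table itself.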
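The City class itself is not reproduced here. A minimal sketch of the idea, assuming a simple name-to-code map: the name `CITY_LIST_51JOB` matches the constant used by Job51Senderimpl further down, but the codes are unverified placeholders and `toJobArea` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CitySketch {
    // Placeholder codes for illustration only; not checked against 51Job.
    static final Map<String, String> CITY_LIST_51JOB = new LinkedHashMap<>();
    static {
        CITY_LIST_51JOB.put("Beijing", "0100");
        CITY_LIST_51JOB.put("Shanghai", "0200");
    }

    // Turns the configured city names into the comma-separated code list for the query.
    static String toJobArea(List<String> cityNames) {
        List<String> codes = new ArrayList<>();
        for (String name : cityNames) {
            String code = CITY_LIST_51JOB.get(name.trim());
            if (code == null) {
                throw new IllegalArgumentException("Unknown city: " + name);
            }
            codes.add(code);
        }
        return String.join(",", codes);
    }

    public static void main(String[] args) {
        System.out.println(toJobArea(Arrays.asList("Beijing", "Shanghai")));
    }
}
```

Failing fast on an unknown city is a choice made for this sketch; with a plain `Map.get`, a misspelled city name would otherwise end up in the query string as the literal text "null".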
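The other class is Email, which wraps the e-mail information.
A word about this class: its MAILHOSTNAME_MAPPING attribute stores the outgoing mail server address for each mail provider. Since mail is sent according to the user's own mailbox address, the Sender manages this mapping as well. ^_^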
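The Email class is not reproduced here either. The lookup it describes might be sketched as below, assuming the mapping is keyed by the domain part of the sender's address. `MAILHOSTNAME_MAPPING` is the name used in the post; `hostNameFor` and the two entries are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class MailHostSketch {
    // Illustrative entries only: mailbox domain -> outgoing (SMTP) server.
    static final Map<String, String> MAILHOSTNAME_MAPPING = new HashMap<>();
    static {
        MAILHOSTNAME_MAPPING.put("163.com", "smtp.163.com");
        MAILHOSTNAME_MAPPING.put("gmail.com", "smtp.gmail.com");
    }

    // Picks the outgoing server from the domain part of the sender's address.
    static String hostNameFor(String emailAddress) {
        int at = emailAddress.lastIndexOf('@');
        if (at < 0 || at == emailAddress.length() - 1) {
            throw new IllegalArgumentException("Not an e-mail address: " + emailAddress);
        }
        String domain = emailAddress.substring(at + 1).trim().toLowerCase(Locale.ROOT);
        String host = MAILHOSTNAME_MAPPING.get(domain);
        if (host == null) {
            throw new IllegalArgumentException("No mail server known for: " + domain);
        }
        return host;
    }

    public static void main(String[] args) {
        System.out.println(hostNameFor("someone@163.com"));
    }
}
```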
Below is the program's entry point.
/*
 * SunWell Software R&D Center, all rights reserved, 2009
 * Copyright(C) 2009 SunWell Software Develop Center. All rights reserved.
 */
package com.sunwell.base;
import com.sunwell.base.core.SenderDispatch;

public class Main {
    public static void main(String[] args) throws Exception {
        SenderDispatch sender = new SenderDispatch();
        sender.run();
    }
}
Next, our SenderDispatch.
/*
 * SunWell Software R&D Center, all rights reserved, 2009
 * Copyright(C) 2009 SunWell Software Develop Center. All rights reserved.
 */
package com.sunwell.base.core;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.MalformedURLException;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
import org.apache.log4j.Logger;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import com.sunwell.base.util.HttpConnection;
public class SenderDispatch {

    protected final Logger logger = Logger.getLogger(SenderDispatch.class);
    private Document document;
    private ISender sender;

    /** HTTP client shared by all Senders. */
    public static final HttpConnection HTTP = new HttpConnection();
    static {
        HTTP.setConnectTimeout(50000);
        HTTP.setReadTimeout(50000);
    }

    /**
     * Loads sysconfig.xml from the root of the classpath.
     *
     * @throws FileNotFoundException
     * @throws IOException
     * @throws DocumentException
     */
    public SenderDispatch() throws FileNotFoundException, IOException, DocumentException {
        File configFile = new File(getClass().getClassLoader().getResource("").getPath() + "sysconfig.xml");
        String xml = IOUtils.toString(FileUtils.openInputStream(configFile));
        document = DocumentHelper.parseText(xml);
    }

    /**
     * <pre>
     * Author:  王涛
     * Created: 2009-6-23
     * Description:
     * For each site the user listed (comma-separated), creates the matching
     * Sender and starts its search.
     * </pre>
     * @throws DocumentException
     * @throws IOException
     * @throws FileNotFoundException
     */
    private void buildWebSite(String webSite) throws FileNotFoundException, IOException, DocumentException {
        for (String ws : webSite.split(",")) {
            if ("51job".equals(ws) || "51JOB".equals(ws)) {
                sender = new Job51Senderimpl();
                sender.searchResultByKeyWord("http://search.51cto.com/jobsearch/search_result.php/");
            }
            else if ("智联招聘".equals(ws) || "智联".equals(ws) || "zhaopin".equals(ws)) {
                // Zhaopin is not implemented yet.
                //this.urls.add(new URL("http://search.zhaopin.com/jobs/request.asp/"));
            }
        }
    }

    /**
     * <pre>
     * Author:  王涛
     * Created: 2009-6-24
     * Description:
     * Entry point of the feature.
     * </pre>
     * @throws Exception
     */
    public void run() throws Exception {
        try {
            this.buildWebSite(document.selectSingleNode("ROOT/website").getText());
        } catch (MalformedURLException e) {
            e.printStackTrace();
            logger.error("Error while building the site address; cause [the site address in the resource file may be wrong]");
        }
    }
}
The code is fairly self-explanatory. If it is still hard to follow, try copying it into Eclipse and reading it with Eclipse's syntax highlighting; that may help a lot.
/*
 * SunWell Software R&D Center, all rights reserved, 2009
 * Copyright(C) 2009 SunWell Software Develop Center. All rights reserved.
 */
package com.sunwell.base.core;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URLEncoder;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
import org.apache.log4j.Logger;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import com.sunwell.base.mail.Email;
import com.sunwell.base.mail.MailSender;
import com.sunwell.base.util.City;
public class Job51Senderimpl implements ISender {

    protected final Logger logger = Logger.getLogger(Job51Senderimpl.class);
    private Document document;

    public Job51Senderimpl() throws IOException, DocumentException {
        File configFile = new File(getClass().getClassLoader().getResource("").getPath() + "sysconfig.xml");
        String xml = IOUtils.toString(FileUtils.openInputStream(configFile));
        document = DocumentHelper.parseText(xml);
    }

    /**
     * <pre>
     * Author:  王涛
     * Created: 2009-6-26
     * Description:
     * Searches for jobs by keyword and city.
     * </pre>
     * @param url
     */
    @SuppressWarnings({"unchecked"})
    public void searchResultByKeyWord(String url) {
        List<Element> citys = document.selectNodes("ROOT/city/value");
        this.logger.debug(((Element) citys.get(0)).getText());
        // Translate each configured city name into its 51Job code.
        String jobarea = "";
        for (Element e : citys) {
            jobarea += City.CITY_LIST_51JOB.get(e.getText()) + ",";
        }
        String html = null;
        try {
            // Drop the trailing comma from jobarea before encoding it.
            html = getHTMLByUrl(url + "?fromJs=1" +
                    "&jobarea=" + URLEncoder.encode(jobarea.substring(0, jobarea.length() - 1), "UTF-8") +
                    "&funtype=0000&industrytype=00" +
                    "&keyword=" + URLEncoder.encode(document.selectSingleNode("//ROOT/keyword").getText(), "UTF-8") +
                    "&keywordtype=2&lang=c&stype=1&postchannel=0000&fromType=1");
            //this.logger.debug(html);
            this.getJobCorpInfoByHTML(html);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            logger.error("Error while submitting the search; cause [the site's search address may have changed or be wrong]");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * <pre>
     * Author:  王涛
     * Created: 2009-6-26
     * Description:
     * Core method. Extracts the company names and job links, filters out the
     * companies excluded in the resource file, and, once an e-mail address has
     * been found, sends mail to it.
     * After the current page of jobs is done, looks for a next-page link and,
     * if there is one, calls itself recursively.
     * </pre>
     * @param html
     * @throws Exception
     */
    @SuppressWarnings("unchecked")
    public void getJobCorpInfoByHTML(String html) throws Exception {
        String theHtml = html;
        Matcher matcherJobURL = Pattern.compile("<a href=\"(.+?)\" onclick=\"zzSearch.acStatRecJob").matcher(theHtml);
        Matcher matcherCorpName = Pattern.compile("<a href=\".+?\" class=coname target=\"_blank\" >(.+?)</a></td>").matcher(theHtml);
        List<Element> excludes = (List<Element>) document.selectNodes("ROOT/exclud/value");
        while (matcherJobURL.find() && matcherCorpName.find()) {
            // Check the company against every excluded name first, so each job is
            // mailed at most once and only when no excluded name matches.
            String matchedExclude = null;
            for (Element exclud : excludes) {
                if (matcherCorpName.group(1).indexOf(exclud.getText()) != -1) {
                    matchedExclude = exclud.getText();
                    break;
                }
            }
            if (matchedExclude != null) {
                logger.info("Filtered out a company whose name contains " + matchedExclude);
                continue;
            }
            //logger.info("About to open the job detail page: " + matcherJobURL.group(1));
            try {
                // Open the job detail page and extract the e-mail information.
                Email email = buildEmailDetailByHTML(getHTMLByUrl(matcherJobURL.group(1)));
                if (email != null) {
                    MailSender.sendHTML(email); // send the mail
                    logger.info("Mail sent to: " + email.getSendTo());
                }
            } catch (Exception e) {
                logger.error("Error while opening the job detail page");
                e.printStackTrace();
            }
        }
        String nextUrl = getNextUrlsByHtml(theHtml);
        if (nextUrl != null) {
            this.getJobCorpInfoByHTML(SenderDispatch.HTTP.doGet(nextUrl).toString("GB2312"));
        }
    }

    /**
     * <pre>
     * Author:  王涛
     * Created: 2009-6-26
     * Description:
     * Finds the e-mail address contained in the HTML.
     * </pre>
     * @param html
     * @return the e-mail to send, or null if no job title or address was found
     */
    public Email buildEmailDetailByHTML(String html) {
        //logger.info("Extracting the job title and e-mail address");
        Email emailDetail = null;
        Matcher jobName = Pattern.compile("<td class=\"sr_bt\" colspan=\"2\">(.+?)</td>").matcher(html);
        Matcher mailAddress = Pattern.compile("<a href=\"mailto:.+?\" class=\"orange\">(.+?)</a>").matcher(html);
        if (jobName.find() && mailAddress.find()) {
            emailDetail = new Email(document.selectSingleNode("ROOT/emaildetail/emailusername").getText(),
                    document.selectSingleNode("ROOT/emaildetail/emailpassword").getText());
            emailDetail.setSendTo(mailAddress.group(1));
            //emailDetail.setSendTo("w_t_888@163.com");
            emailDetail.setSubject(document.selectSingleNode("ROOT/emaildetail/mailsubject").getText() + " : " + jobName.group(1));
        }
        return emailDetail;
    }

    /**
     * <pre>
     * Author:  王涛
     * Created: 2009-6-26
     * Description:
     * Gets the link to the next page.
     * </pre>
     * @param html
     * @return null if no next-page link was matched
     */
    public String getNextUrlsByHtml(String html) {
        //logger.debug(html);
        String url = "";
        Matcher matcher = Pattern.compile("</td><td><a href=\"(.+?)\" .+?").matcher(html);
        if (!matcher.find()) // find() must succeed before group() can be read
            return null;
        else {
            url = matcher.group(1);
        }
        logger.debug("Next page link: " + url);
        return url;
    }

    /**
     * <pre>
     * Author:  王涛
     * Created: 2009-6-24
     * Description:
     * Fetches the content of a page.
     * </pre>
     * @param url
     */
    public String getHTMLByUrl(String url) {
        String html = null;
        try {
            html = SenderDispatch.HTTP.doGet(url).toString("GB2312");
            //logger.debug(html);
        }
        catch (IOException e) {
            e.printStackTrace();
            logger.error("Error while creating the connection; cause [possibly a network error or the remote server is unreachable]");
        } catch (Exception e) {
            e.printStackTrace();
            logger.error("Error while trying to connect; cause [the address may be wrong]");
        }
        return html;
    }
}
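The core loop in getJobCorpInfoByHTML walks two matchers in lockstep (job link and company name) and drops any company that matches the exclusion list. The self-contained sketch below illustrates that idea on made-up markup: the `class="job"` and `class="corp"` attributes, the sample HTML, and the names `JobFilterSketch`, `isExcluded` and `jobUrlsToContact` are all assumptions for illustration and are not 51Job's real page structure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JobFilterSketch {
    // Hypothetical markup; the real site's patterns are the ones in Job51Senderimpl.
    private static final Pattern JOB_URL = Pattern.compile("<a class=\"job\" href=\"(.+?)\">");
    private static final Pattern CORP_NAME = Pattern.compile("<a class=\"corp\">(.+?)</a>");

    // A company is excluded as soon as its name contains any excluded fragment.
    static boolean isExcluded(String corpName, List<String> excluded) {
        for (String fragment : excluded) {
            if (corpName.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    // Advances both matchers together so the i-th link is paired with the i-th company.
    static List<String> jobUrlsToContact(String html, List<String> excluded) {
        Matcher jobUrl = JOB_URL.matcher(html);
        Matcher corpName = CORP_NAME.matcher(html);
        List<String> urls = new ArrayList<>();
        while (jobUrl.find() && corpName.find()) {
            if (!isExcluded(corpName.group(1), excluded)) {
                urls.add(jobUrl.group(1));
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String html =
              "<tr><a class=\"job\" href=\"/job/1\">Dev</a><a class=\"corp\">Alpha Ltd</a></tr>"
            + "<tr><a class=\"job\" href=\"/job/2\">Dev</a><a class=\"corp\">Beta Inc</a></tr>";
        // Only the Alpha Ltd posting survives the filter.
        System.out.println(jobUrlsToContact(html, Arrays.asList("Beta", "Gamma")));
    }
}
```

Pairing two independent matchers relies on every result row containing exactly one job link and one company link in the same order; if a row ever lacks one of them, the pairs drift apart, so matching each row as a whole would be a sturdier variant.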
Below is the MailSender code.
/*
 * SunWell Software R&D Center, all rights reserved, 2009
 * Copyright(C) 2009 SunWell Software Develop Center. All rights reserved.
 */
package com.sunwell.base.mail;
import org.apache.commons.mail.HtmlEmail;
public class MailSender {

    public static void sendHTML(Email email) {
        try {
            HtmlEmail semail = new HtmlEmail();
            semail.setHostName(email.getHostName());
            semail.addTo(email.getSendTo());
            semail.setAuthentication(email.getEmailUserName(), email.getEmailPassword());
            semail.setFrom(email.getEmailUserName(), email.getEmailUserName());
            semail.setSubject(email.getSubject());
            // Prepend my own logo to the message body.
            String imgSrc = "http://www36.babidou.com/pic/2008/10/6/perfectnini/mailsender.gif";
            String ad = "<center><a href='http://tonyaction.blog.51cto.com/'><img src='" + imgSrc + "'></a></center>";
            // Set the charset and body before sending; send() builds the MIME message itself.
            semail.setCharset("UTF-8");
            semail.setHtmlMsg(ad + email.getContent());
            semail.send();
        } catch (Exception ex) {
            System.out.println("Failed to send mail: " + ex.getMessage());
            ex.printStackTrace();
        }
    }
}
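This article is reposted from tony_action's 51CTO blog. Original link: http://blog.51cto.com/tonyaction/170755. If you wish to repost it, please contact the original author yourself.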