@黄亿华 你好,想跟你请教个问题:
在eclipse里搭建的环境,就是把webmagic源码放到工程里,然后运行OschinaBlogPageProcesser类,就报错,如下:
13-12-27 14:45:45,352 INFO us.codecraft.webmagic.Spider(Spider.java:288) ## Spider my.oschina.net started!
13-12-27 14:45:45,354 INFO us.codecraft.webmagic.downloader.HttpClientDownloader(HttpClientDownloader.java:99) ## downloading page http://my.oschina.net/flashsword/blog
13-12-27 14:45:50,458 WARN us.codecraft.webmagic.downloader.HttpClientDownloader(HttpClientDownloader.java:131) ## download page http://my.oschina.net/flashsword/blog error
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:136)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:152)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:270)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:161)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.http.impl.conn.CPoolProxy.invoke(CPoolProxy.java:138)
at $Proxy0.receiveResponseHeader(Unknown Source)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:253)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:194)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
at us.codecraft.webmagic.downloader.HttpClientDownloader.download(HttpClientDownloader.java:117)
at us.codecraft.webmagic.Spider.processRequest(Spider.java:369)
at us.codecraft.webmagic.Spider$1.run(Spider.java:304)
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
at us.codecraft.webmagic.Spider.run(Spider.java:300)
at us.codecraft.webmagic.samples.OschinaBlogPageProcesser.main(OschinaBlogPageProcesser.java:33)
贴你写的爬虫类
问题解决了,简直无语,myeclipse里搭建环境就报这个错,换成eclipse就没问题了
版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。