Renren (人人网) login page: http://www.renren.com/
The login shown here does not handle CAPTCHA verification.
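First, analyze how the login works.
There are two ways to do this.
1) Inspect the page source in the Elements panel
Clicking the login button submits the form to http://www.renren.com/PLogin.do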
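2) Inspect the network requests in the Network panel
Request URL: http://www.renren.com/ajaxLogin/login?1=1&uniqueTimestamp=2017110237292
Form data:
email: account user name
icode: CAPTCHA code, may be empty
origURL: http://www.renren.com/home
domain: renren.com
key_id: 1
captcha_type: web_login
password: the password; the typed password must be encrypted before it is sent
rkey: used in the password processing
f: unknown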
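To make the captured form data more concrete, here is a minimal sketch of how such fields could be serialized into an application/x-www-form-urlencoded request body using only the JDK. The class name LoginFormSketch and all placeholder values are assumptions for illustration. The password encryption and the way rkey is obtained are not covered here, since the capture above does not reveal them.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns form fields into a URL-encoded request body.
final class LoginFormSketch {

    // Joins the fields as key=value pairs separated by '&', in insertion order.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return body.toString();
    }

    // Percent-encodes one key or value as UTF-8.
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Field names come from the captured request; values are placeholders.
        Map<String, String> form = new LinkedHashMap<String, String>();
        form.put("email", "user@example.com");
        form.put("icode", "");
        form.put("origURL", "http://www.renren.com/home");
        form.put("domain", "renren.com");
        form.put("key_id", "1");
        form.put("captcha_type", "web_login");
        form.put("password", "ENCRYPTED_PASSWORD_PLACEHOLDER");
        form.put("rkey", "RKEY_PLACEHOLDER");
        form.put("f", "");
        System.out.println(encode(form));
    }
}
```

A LinkedHashMap keeps the fields in the order they were added, which makes the generated body easy to compare with the one shown in the Network panel.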
The code below takes the first approach and posts directly to the submit target found in the Elements panel.
package 人人网模拟登录;

import org.apache.http.Header;
import org.apache.http.NameValuePair;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Renren {
    public static void main(String[] args) throws Exception {
        String userName = " "; // fill in the account name
        String passWord = " "; // fill in the password

        // Login form fields
        List<NameValuePair> dlbd = new ArrayList<NameValuePair>();
        dlbd.add(new BasicNameValuePair("domain", "renren.com"));
        dlbd.add(new BasicNameValuePair("isplogin", "true"));
        dlbd.add(new BasicNameValuePair("submit", "登录"));
        dlbd.add(new BasicNameValuePair("email", userName));
        dlbd.add(new BasicNameValuePair("password", passWord));

        HttpPost httpPost = new HttpPost("http://www.renren.com/PLogin.do");
        // Encode as UTF-8 so the non-ASCII "submit" value is sent intact
        httpPost.setEntity(new UrlEncodedFormEntity(dlbd, StandardCharsets.UTF_8));

        // try-with-resources closes the client and the response when done
        try (CloseableHttpClient closeableHttpClient = HttpClients.createDefault()) {
            String header;
            // POST request: submit the login form
            try (CloseableHttpResponse closeableHttpResponse = closeableHttpClient.execute(httpPost)) {
                // Read the redirect target from the response headers
                Header locationHeader = closeableHttpResponse.getFirstHeader("Location");
                if (locationHeader == null) {
                    throw new IllegalStateException(
                            "No Location header, status: " + closeableHttpResponse.getStatusLine());
                }
                header = locationHeader.getValue();
            }
            // GET request: fetch the redirect target with the same client, which keeps the cookies
            HttpGet httpGet = new HttpGet(header);
            ResponseHandler<String> responseHandler = new BasicResponseHandler();
            String responseBody = closeableHttpClient.execute(httpGet, responseHandler);
            System.out.println(responseBody);
        }
    }
}
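The listing passes the Location header value straight to HttpGet, which works only when the server returns an absolute URL. As a hedged sketch, a relative Location can be resolved against the request URL with java.net.URI. The class name RedirectSketch and the sample value "/home" are assumptions for illustration, not values observed from the site.

```java
import java.net.URI;

// Hypothetical helper: resolves a Location header against the request URL.
final class RedirectSketch {

    // Returns an absolute URI whether the header value is absolute or relative.
    static URI resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location.trim());
    }

    public static void main(String[] args) {
        String loginUrl = "http://www.renren.com/PLogin.do";
        // A relative header value (assumed example) is resolved against the login URL.
        System.out.println(resolveLocation(loginUrl, "/home"));
        // An absolute header value is returned unchanged.
        System.out.println(resolveLocation(loginUrl, "http://www.renren.com/home"));
    }
}
```

The resolved URI can be passed to the HttpGet constructor that accepts a URI, so the rest of the flow stays the same.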
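The login succeeds.
If too many login attempts have previously failed in the browser, the simulated login may also be asked for a CAPTCHA. Because this example assumes that no CAPTCHA is required, the login may fail in that case. One possible fix is to clear the local cookies.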