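In modern website architecture, scalability is no longer an optional quality attribute; it can decide whether a site survives. Any site of meaningful size therefore avoids running a single web server that internet clients talk to directly. For security and scale-out reasons, a reverse proxy (RP) server, a load balancer (LB), or a combination of the two usually sits between the web servers and the users. The benefits are well documented elsewhere, so they are not repeated here.
Consider an example that uses an RP and an LB together.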
Here is the flow of the requests and responses:
- The client connects through the firewall to the reverse proxy in the DMZ and sends it the request.
- The reverse proxy validates the request, analyzes it to choose the right farm, then forwards it through the firewall to the load balancer in the LAN.
- The load balancer chooses a server in the farm and forwards the request to it.
- The server processes the request, then answers to the load balancer.
- The load balancer forwards the response to the reverse proxy.
- The reverse proxy forwards the response to the client.
And of course, the more load balancers and reverse proxies you chain together, the more times the source IP is rewritten along the way.
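To make the effect concrete, here is a minimal Python sketch of the chain described above. The hop names and IP addresses are made up for illustration, and the sketch assumes every intermediate device works as a full proxy (it opens its own connection to the next hop). Under that assumption, each hop sees only the address of its immediate predecessor.

```python
# Hypothetical addresses for each node in the chain (illustrative only).
CHAIN = [
    ("client", "203.0.113.7"),
    ("reverse-proxy", "192.0.2.10"),
    ("load-balancer", "10.0.0.5"),
    ("web-server", "10.0.1.20"),
]


def observed_sources(chain):
    """Return, for each receiving hop, the source IP it sees on the connection.

    Each proxying hop opens a new connection to the next hop, so the
    receiver only ever sees the address of its immediate predecessor.
    """
    seen = {}
    for (_, sender_ip), (receiver_name, _) in zip(chain, chain[1:]):
        seen[receiver_name] = sender_ip
    return seen


if __name__ == "__main__":
    for hop, source_ip in observed_sources(CHAIN).items():
        print(f"{hop} sees source IP {source_ip}")
```

Only the reverse proxy ever sees the client's real address; by the time the request reaches the web server, the source is the load balancer.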
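So how can the backend recover the original IP? The source IP is very useful to backend applications, for example to locate the user's region and provide localized services, or to run a risk assessment on the client IP (is it a known malicious address?).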
A fairly simple approach is to add an extra HTTP header, X-Forwarded-For (XFF), the de facto standard header for identifying the originating IP address of a client connecting to a web server through an HTTP proxy.
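The conventional XFF handling can be sketched in a few lines of Python. The function names `append_xff` and `original_client_ip` are hypothetical, headers are modelled as a plain dict, and the IPs are illustrative. Each proxy appends the address of the peer it received the request from, and the backend reads the leftmost entry. Note that a client can send its own X-Forwarded-For value, so the header should only be trusted when it is set or sanitized by your own proxies.

```python
def append_xff(headers, peer_ip):
    """Return a copy of headers with peer_ip appended to X-Forwarded-For.

    This models what a proxy does before forwarding: it records the address
    of the peer it received the request from.
    """
    updated = dict(headers)
    existing = updated.get("X-Forwarded-For")
    updated["X-Forwarded-For"] = f"{existing}, {peer_ip}" if existing else peer_ip
    return updated


def original_client_ip(headers, remote_addr):
    """Return the leftmost X-Forwarded-For entry, or remote_addr if absent."""
    xff = headers.get("X-Forwarded-For", "")
    entries = [part.strip() for part in xff.split(",") if part.strip()]
    return entries[0] if entries else remote_addr


if __name__ == "__main__":
    # The reverse proxy receives the request from the client.
    headers = append_xff({}, "203.0.113.7")
    # The load balancer receives it from the reverse proxy.
    headers = append_xff(headers, "192.0.2.10")
    print(headers["X-Forwarded-For"])
    print(original_client_ip(headers, remote_addr="10.0.0.5"))
```

The fallback to `remote_addr` covers requests that reach the server without passing through any proxy.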
You can of course define your own header following your company's naming convention; at eBay, for example, it is X-eBay-Client-IP.
eBay's internal description:
Operations started introducing Layer 7 load balancers (NetScaler) on site. From application server point of view requests arrive with the source IP address of the load balancer. Many eBay applications need to know the source IP address of the client. To support this requirement we introduce a standard for eBay HTTP header X-eBay-Client-IP. Netscaler and other similar devices will set this header and applications will know to use it.
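On the application side, reading such a header might look like the sketch below. It is not eBay's actual implementation: the function name `resolve_client_ip`, the fallback order (custom header, then X-Forwarded-For, then the connection's peer address), and the validation step are all assumptions made for illustration.

```python
import ipaddress

# Header names in order of preference. X-eBay-Client-IP is the header named
# in the text; the fallback order itself is an assumption of this sketch.
CLIENT_IP_HEADERS = ("X-eBay-Client-IP", "X-Forwarded-For")


def resolve_client_ip(headers, remote_addr):
    """Pick the client IP from the first usable header, else remote_addr."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    normalised = {name.lower(): value for name, value in headers.items()}
    for name in CLIENT_IP_HEADERS:
        value = normalised.get(name.lower(), "")
        # X-Forwarded-For may hold a comma-separated list; take the first entry.
        candidate = value.split(",")[0].strip()
        if not candidate:
            continue
        try:
            return str(ipaddress.ip_address(candidate))
        except ValueError:
            continue  # Malformed value: try the next header.
    return remote_addr
```

For example, `resolve_client_ip({"X-eBay-Client-IP": "203.0.113.7"}, "10.0.0.5")` returns the header value, while a request with neither header falls back to `"10.0.0.5"`, the load balancer's address as seen on the connection.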
References:
http://blog.exceliance.fr/2012/06/05/preserve-source-ip-address-despite-reverse-proxies/
http://kb.radware.com/questions/2559/How+to+allow+an+application+to+see+the+client+real+source+IP%3F