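A cache server is a dedicated server that stores (in memory or on disk) the web pages, images, files, and other content that users request. It lets users get what they want as fast as possible and greatly reduces the amount of data the origin side has to send over the network;
A cache server is usually also a proxy server; to the user, both the cache server and the proxy server are invisible;
About 90% of the commercial CDN companies in China use squid, e.g. Wangsu (网宿), ChinaCache (蓝讯), and Dnion (帝联); sina uses ATS;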
www.squid-cache.org
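Improving the cache hit ratio:
Set the expires and Cache-Control caching headers in httpd or nginx;
Separate dynamic from static content: static content goes through the CDN, and dynamic services stay independent and bypass squid;
Put memcached in front of mysql;
Fix 4xx and 5xx error pages and dead links;
Cache hit: the user's http request is answered directly from the cache server. The hit ratio is the share of all client http requests that are hits; a typical web cache hit ratio is between 30% and 60%. A related metric is the byte hit ratio, which measures the volume of data (in bytes) served from the cache;
Cache miss: the user's http request cannot be answered from the cache server. There are many possible causes:
(1) When the cache server has only just received a new resource, the first request from the first user is a miss. Fix: warm up or prefetch. After the backend generates the data, push it to the front-end cache servers in one go, or request it internally first; a script can do this;
(2) The storage is full or the object itself has expired, so the cache server evicts cached objects to make room for new ones. Fix: add memory or disk, set longer expiry times, raise the cache parameters (for example a maximum cached object size of 2M), and cache hot objects; also cache by resource type, splitting images and video onto different servers and routing requests to the different groups with acls and regular expressions;
(3) The requested resource cannot be served from the cache: the origin server tells the cache server how to handle the response, for example that the data must not be cached, or may only be reused for a limited time;
Cache validation: to avoid returning stale data, the cache server has to validate a cached object with the origin server regularly before reusing it. If the origin server says squid's copy is still valid, the data is sent out; otherwise squid updates its cached copy and forwards it to the user. When a user updates data in the DB or on the storage server, the application can proactively call an interface that purges the cached object;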
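Notes:
For a cache server, data consistency is a hard problem, especially with memcached;
After resources are pushed, a CDN generally takes 5-15 min to take effect;
When images are on the CDN, modifying an image (for example retouching it in Photoshop) requires a push; other operations need no update;
For a site redesign, push the related resources to the CDN, and rename the js, css, and similar files before pushing;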
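squid is a high-performance proxy and cache server that supports ftp, gopher, and http. Unlike typical proxy cache software, squid handles all client requests in a single, non-modular, I/O-driven process;
squid caches data in memory or on disk, caches DNS lookup results as well, and supports ssl and acl. Because it uses ICP (Internet Cache Protocol, a lightweight protocol through which the nodes of a squid cluster talk to each other), squid can form hierarchical proxy arrays and save as much bandwidth as possible;
Note: Gopher is a well-known Internet information retrieval system. It organizes files on the Internet into indexes and takes the user from one place on the Internet to another, using hierarchical menus and files to discover and retrieve information from a very large catalog.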
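Uses of squid:
(1) reverse proxy: placed in front of the web server to cache its data, so that requested content is returned directly from the cache server. This improves the user experience and takes load off the backend web server, DB server, and storage server;
(2) proxy: forward proxy, either regular or transparent. It sits at a key egress point of a corporate network or in front of a shared network and caches data for internal users (web content, DNS lookups, other search data, and so on). The proxy server requests the origin site on the user's behalf and returns the result to the intranet user; content that is already cached is served from inside the LAN without crossing the internet, which is faster and saves bandwidth. Hosts in the LAN go online through the proxy: only the proxy host needs internet access, its location is flexible, and users set the proxy host's ip and port in their browser;
Note: a transparent proxy combined with iptables can act as the office network gateway. Placed at a key point in the network, it controls what internal users can access and filters traffic and content, improving overall network security. proxy + gw + content filtering + traffic control makes a complete Internet access solution; squid and the firewall can run on one host or on separate hosts
Note: squid is mainly used on Unix-like systems. It has a long history and a complete feature set, supports ftp, http, and https well, and supports ipv6 from version 3.1. Mainstream CDNs in the industry build their cache servers on customized versions of squid
Note: haproxy is a dedicated proxy, while squid does both caching and proxying. Typically haproxy does the proxying (dynamic, static, LB) and squid caches, with static and dynamic content split apart. A common site architecture is: dynamic content --> static generation --> CDN; many CDNs now also support dynamic acceleration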
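As a site grows, it tends to use different load-balancing technologies at different stages:
Stage 1: single-point load balancing with Nginx or HAProxy. The deployment has just outgrown the single-server, single-database model and needs some load balancing, but it is still small, has no dedicated operations team, and needs no large-scale rollout. Nginx or HAProxy is the first choice here: both are quick to learn, easy to configure, and work at layer 7 over HTTP.
Stage 2: as the service grows further, a single Nginx is no longer enough, and LVS or a commercial F5 becomes the first choice, with Nginx running as a node behind LVS or F5. The choice between LVS and F5 depends on company size, staff skills, and budget. In practice staffing usually lags behind business growth at this stage, so buying a commercial load balancer is the usual route.
Stage 3: the service is now a mainstream product, the company is better known, and its engineers have grown in both skill and number. For product-specific customization and for cost reasons, open-source LVS becomes the first choice and the mainstream solution.
The resulting, fairly ideal layout is: F5/LVS <-> HAProxy <-> Squid/Varnish <-> AppServer
Note: each squid node caches only part of the content; the complete data set exists only across all nodes together, which is what makes it possible to cache a large volume of data. Requests are distributed with url-hash or consistent hashing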
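The url-hash idea in the note above can be sketched in a few lines of POSIX shell. This is only an illustration under stated assumptions: `pick_node` and the node names are hypothetical, and `cksum` (a CRC) stands in for whatever hash the load balancer in front of squid really uses.

```shell
# Sketch: map a URL to one of N cache nodes with a plain URL hash.
# pick_node and the node names are hypothetical; at least one node is required.
pick_node() {
    local url crc skip
    url=$1
    shift
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    crc=$(printf '%s' "$url" | cksum | cut -d' ' -f1)
    # Index of the chosen node among the remaining arguments.
    skip=$(( crc % $# ))
    while [ "$skip" -gt 0 ]; do
        shift
        skip=$(( skip - 1 ))
    done
    echo "$1"
}

# The same URL always lands on the same node, so each node holds its own slice.
pick_node "http://www.example.com/a.jpg" squid1 squid2 squid3
```

With plain modulo hashing, adding or removing a node remaps most URLs to a different node, which is the problem consistent hashing is meant to reduce.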
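JD.com (京东) site architecture:
Hardware:
Memory (important: too little memory seriously hurts performance, because as many objects as possible are cached in memory to give the best user experience and the fastest responses);
Disk (important: SSD is best, then SAS. More disk space means more cached objects and a higher hit ratio; for example, each taobao hot-object storage server uses one 80G ssd plus a 500G sata disk. Use raid; several disk paths can also be configured for caching);
cpu (the faster the better, but the cpu is not the key factor for cache server performance);
Note: memory and disk are linked. As a rule of thumb, each G of disk space needs 32M of memory, so about 512M of memory supports a 16G disk cache. Memory requirements also depend on: the size of cached objects, the cpu arch (32bit or 64bit), the number of concurrent users, and any special features in use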
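The rule of thumb above is simple arithmetic; a minimal sketch follows. The function names are hypothetical, and the 32M-per-G ratio is the one quoted in the note, not a value read from squid.

```shell
# Rule of thumb from the note above: 32M of memory per 1G of disk cache.
# Both function names are made up for this example.
mem_mb_for_disk_gb() {
    echo $(( $1 * 32 ))
}

disk_gb_for_mem_mb() {
    echo $(( $1 / 32 ))
}

mem_mb_for_disk_gb 16     # prints 512
disk_gb_for_mem_mb 512    # prints 16
```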
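Note: tuning the Linux OS: install a minimal system; tune the kernel in /etc/sysctl.conf; manage the services started at boot; disable iptables and SELinux; change the ssh port and forbid root login; raise the file descriptor limit; synchronize the clock on a schedule; clean the mail spool directory on a schedule; configure sudo for privilege management; configure a yum mirror inside China; hide the server version and the kernel version; lock critical files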
Preparation:
squid server (eth0: 10.96.20.113; eth1: 192.168.10.113)
web server (192.168.10.118)
Package: squid-3.5.20.tar.gz
[root@master ~]# uname -rm
2.6.32-431.el6.x86_64 x86_64
[root@master ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
Raise the file descriptor limit:
[root@master ~]# ulimit -n
1024
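[root@master ~]# ulimit -Hn 20480 #(-H sets the hard limit. linux defaults to 1024 file descriptors, which cover the files and sockets opened by each user's processes. The limit strongly affects performance: when descriptors run out, squid cannot accept new client connections, which causes denial of service; squid logs a warning when it notices the shortage)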
[root@master ~]# vim /etc/security/limits.conf
* - nofile 20480
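Widen the ephemeral port range:
[root@master ~]# cat /proc/sys/net/ipv4/ip_local_port_range #(the default range is 32768-61000 on rhel and 1024-5000 on FreeBSD. Ephemeral ports are the local ports that the TCP/IP stack assigns to outgoing connections. On a very busy proxy server a shortage of them seriously hurts performance, because a port cannot be reused while its closed tcp connection is still in the TIME_WAIT state)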
32768 61000
[root@master ~]# vim /etc/sysctl.conf
net.ipv4.ip_local_port_range = 1025 65535
[root@master ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
error: "net.bridge.bridge-nf-call-arptables" is an unknown key
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv4.ip_local_port_range = 1025 65535
[root@master ~]# cat /proc/sys/net/ipv4/ip_local_port_range
1025 65535
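To see what the change buys, the number of usable ephemeral ports can be computed from the two values in ip_local_port_range. A minimal sketch; `port_count` is a hypothetical helper, not a system command.

```shell
# Count the ports in a "low high" range, the format used by
# /proc/sys/net/ipv4/ip_local_port_range. port_count is a made-up name.
port_count() {
    set -- $1              # split "low high" on whitespace into $1 and $2
    echo $(( $2 - $1 + 1 ))
}

port_count "32768 61000"   # rhel default: prints 28233
port_count "1025 65535"    # after the change: prints 64511
```

On a live system the current value could be passed in as `port_count "$(cat /proc/sys/net/ipv4/ip_local_port_range)"`.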
Synchronize the clock:
[root@master ~]# ntpdate 202.108.6.95
30 Aug 02:45:25 ntpdate[53292]: adjust timeserver 202.108.6.95 offset -0.013377 sec
[root@master ~]# date
Tue Aug 30 02:45:28 PDT 2016
[root@master ~]# crontab -e
*/5 * * * * /usr/sbin/ntpdate cn.ntp.org.cn &> /dev/null
[root@master ~]# crontab -l
*/5 * * * * /usr/sbin/ntpdate cn.ntp.org.cn &> /dev/null
[root@master ~]# tar xf squid-3.5.20.tar.gz
[root@master ~]# cd squid-3.5.20
[root@master squid-3.5.20]# ./configure --prefix=/usr/local/squid --sysconfdir=/etc --enable-gnuregex --enable-icmp --enable-snmp --enable-default-err-language="Simplify_Chinese" --enable-kill-parent-hack --enable-cache-digests --enable-underscore --enable-poll --enable-async-io=240 --enable-arp-acl --enable-delay-pools --enable-follow-x-forwarded-for --with-large-files --with-default-user=squid
[root@master squid-3.5.20]# make
[root@master squid-3.5.20]# make install
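Notes:
--enable-gnuregex: support GNU regular expressions
--enable-icmp: support icmp
--enable-snmp: support snmp; this lets MRTG monitor the server's traffic status over SNMP
--enable-default-err-language="Simplify_Chinese": show error pages in Simplified Chinese
--enable-kill-parent-hack: when squid is shut down, kill its parent process as well
--enable-cache-digests: speed up cache lookups when handling requests
--enable-underscore: allow underscores in URLs (by default squid treats a URL containing an underscore as invalid and denies access)
--enable-poll: use the poll() function
--enable-async-io=240: asynchronous I/O, to improve storage performance
--enable-arp-acl: allow rules to match clients directly by MAC address, which protects against clients spoofing their IP
--enable-delay-pools: enable squid delay pools, the mechanism squid uses for traffic shaping or bandwidth limiting. A pool is made up of many client IP addresses; when requests from those clients are cache misses, their responses may be deliberately delayed
--enable-follow-x-forwarded-for: when a request has been forwarded by other proxy servers, find the direct or indirect client IP address from the X-Forwarded-For http header
--with-large-files: enable large file support
--with-default-user=squid: set the default user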
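[root@master squid-3.5.20]# useradd -s /sbin/nologin -M squid #(if --with-default-user=squid is not given at build time, squid runs as the nobody user; the matching squid.conf directives are cache_effective_user and cache_effective_group)
[root@master squid-3.5.20]# ls /usr/local/squid #(bin/squidclient: a simple http client for testing, which can also send management requests to a running squid process; bin/RunCache: a shell script that starts squid and restarts it automatically if it dies; bin/RunAccel: like RunCache, with one extra command-line argument that tells squid where to listen for http requests; libexec/: helper programs that other programs can start; libexec/cachemgr.cgi: the CGI interface to squid's management functions, to use it either point the web server at this file or copy it into the web server's cgi-bin/; libexec/unlinkd: deletes files from the cache directory; libexec/diskd: built with --enable-storeio=diskd; libexec/pinger: built with --enable-icmp; sbin/squid: the main program; var/: frequently changing and unimportant files that need no regular backup; var/logs/: the default location of squid's logs, access.log, cache.log, and store.log; var/cache/: the default cache directory, set with cache_dir in the main configuration file)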
bin libexec sbin share var
[root@master squid-3.5.20]# cd
[root@master ~]# vim /etc/squid.conf #(squid.conf.documented is the fully documented configuration reference, 7935 lines)
squid.conf squid.conf.default squid.conf.documented
[root@master ~]# rm -f /etc/squid.conf
[root@master ~]# egrep -v "^#|^$" /etc/squid.conf.default >> /etc/squid.conf
[root@master ~]# vim /etc/squid.conf
cache_effective_user squid #(the user and group that squid runs as; this must not be root, or squid will not run)
cache_effective_group squid
visible_hostname squid #(the visible hostname, followed by the administrator's email address)
cache_mgr jowin@163.com
http_port 3128 #(the default port. Syntax: http_port port [mode] [options]; the port can be given as port, IP:PORT, or hostname:port, and the directive can be repeated. When squid serves as a cache server, change the port to 80. Common mode and option: accel, Accelerator / reverse proxy mode; vport, Virtual host port support. Using the http_port number instead of the port passed on Host: headers.)
cache_dir ufs /usr/local/squid/var/cache 1024 16 256 #(syntax: cache_dir Type Directory-Name Fs-specific-data [options]; Type is one of ufs, aufs, diskd, rock, default ufs, selectable at build time with --enable-storeio="list of modules"; cache_dir ufs Directory-Name Mbytes L1 L2 [options]; 'Mbytes' is the amount of disk space (MB) to use under this directory. The default is 100MB. Change this to suit your configuration. Do NOT put the size of your disk drive here. Instead, if you want Squid to use the entire disk drive, subtract 20% and use that value.; 'L1' is the number of first-level subdirectories which will be created under the 'Directory'. The default is 16.; 'L2' is the number of second-level subdirectories which will be created under each first-level directory. The default is 256.; squid creates the given number of L1 directories under cache_dir and the given number of L2 directories under each L1 directory, and cache objects are stored under L2. squid hashes the URL of the requested page and stores the resulting cache file in one of the L2 directories. After startup squid builds an in-memory hash table that records how the cache files are laid out on disk)
access_log /usr/local/squid/var/logs/access.log squid #(records the key information about each http transaction; the file is line-based, one line per client request, with fields such as the client IP or hostname, the requested URL, and the response size)
cache_log /usr/local/squid/var/logs/cache.log #(records squid's configuration information, performance warnings, and serious errors; cache_log can also go to syslog by adding local4.warning /var/log/squid.log or local4.notice @192.168.1.2 to /etc/rsyslog.conf)
cache_store_log /usr/local/squid/var/logs/store.log #(records squid's decisions to store or delete cache objects, for both the memory cache and the disk cache; it can be set to cache_store_log /dev/null)
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl localnet src fc00::/7 # RFC 4193 local private network range
acl localnet src fe80::/10 # RFC 4291 link-local (directly plugged) machines
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost manager
http_access deny manager
http_access allow localnet
http_access allow localhost
http_access deny all
coredump_dir /usr/local/squid/var/cache/squid
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
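The cache_dir annotation above gives two sizing facts: leave 20% of the disk unused, and squid creates L1 x L2 second-level directories. A minimal sketch of that arithmetic; both function names are hypothetical, and the 1280 MB partition is only an example value.

```shell
# cache_dir sizing per the annotation above. Function names are made up.
cache_dir_mb() {
    # Usable Mbytes value for a dedicated disk of $1 MB: subtract 20%.
    echo $(( $1 * 80 / 100 ))
}

cache_subdirs() {
    # Total number of second-level directories for L1=$1 and L2=$2.
    echo $(( $1 * $2 ))
}

cache_dir_mb 1280        # prints 1024
cache_subdirs 16 256     # prints 4096
```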
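Notes on acl (IP, regex, port, method, proto, and so on):
When squid matches an acl element it uses OR logic: if any one of the values defined for that acl matches, the acl matches;
http_access rules are applied in order and the first rule that matches decides; the list should end with http_access deny all;
squid's default configuration denies every client request. Before anyone can use the proxy, acl and http_access rules must be added to squid.conf to tell squid which addresses may send http requests. Both an acl line and an http_access line are needed; the order of acl definitions does not matter, but the order of http_access lines does
acl aclname acltype argument ...
acl aclname acltype "file" ...
-i (case-insensitive)
+i (case-sensitive)
-n (Disable lookups and address type conversions.)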
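The matching rules in the notes above (every acl on an http_access line must match, and the first matching line decides) can be mimicked with a small shell function. This is a toy model for reasoning about rule order, not squid's implementation; `decide` is a hypothetical helper and the caller states which acls match the request.

```shell
# decide MATCHED < rules
#   MATCHED: space-separated names of the acls that match this request.
#   rules (stdin): one "allow|deny acl [acl ...]" per line, like http_access.
# Prints the action of the first rule whose acls all hold, or "no-match".
decide() {
    local matched action conds cond ok
    matched=" $1 "
    while read -r action conds; do
        [ -n "$action" ] || continue
        ok=1
        for cond in $conds; do
            case "$cond" in
                '!'*)   # negated acl: the rule needs this acl NOT to match
                    case "$matched" in *" ${cond#?} "*) ok=0 ;; esac ;;
                *)      # plain acl: the rule needs this acl to match
                    case "$matched" in *" $cond "*) ;; *) ok=0 ;; esac ;;
            esac
        done
        if [ "$ok" -eq 1 ]; then
            echo "$action"
            return 0
        fi
    done
    echo "no-match"
}

rules='deny !Safe_ports
deny CONNECT !SSL_ports
allow localnet
deny all'

printf '%s\n' "$rules" | decide "Safe_ports localnet all"   # prints allow
printf '%s\n' "$rules" | decide "localnet all"              # prints deny
```

In the second call the request never reaches `allow localnet`, because `deny !Safe_ports` matches first.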
IP acl definitions:
# acl aclname src ip-address/mask ... # clients IP address [fast]
# acl aclname src addr1-addr2/mask ... # range of addresses [fast]
# acl aclname dst [-n] ip-address/mask ... # URL host's IP address [slow]
# acl aclname localip ip-address/mask ... # IP address the client connected to [fast]
For example: acl WorkStations src 10.0.0.0/16 #(matches source addresses in the 10.0.0.0/16 network)
tcp port acl definitions:
# acl aclname port 80 70 21 0-1024 ... # destination TCP port [fast], ranges are allowed
# acl aclname localport 3128 ... # TCP port the client connected to [fast]; NP: for interception mode this is usually '80'
For example:
acl Foo port 80
acl Bar port 1-1024
acl Http_ports port 80 8000 8080 #(the three ports are ORed together)
which is equivalent to
acl Http_ports port 80
acl Http_ports port 8000
acl Http_ports port 8080
Regex acl definitions (url_regex matches against the whole URL, the domain and the uri after it; urlpath_regex ignores the domain and matches only the uri path after it):
# acl aclname url_regex [-i] ^http:// ... # regex matching on whole URL [fast]
# acl aclname urlpath_regex [-i] \.gif$ ... # regex matching on URL path [fast]
For example:
acl SEX url_regex -i ^http://.*sex.*
acl FOO url_regex -i ^http://www
acl FTPMP3 url_regex -i ^ftp://.*\.mp3$ #(in .mp3$ the unescaped dot matches any single character; in \.mp3$ the escaped dot is a literal dot, so it matches the file extension)
---------------------------------
Block adult sites
acl SEX urlpath_regex sex
http_access deny SEX
or
acl SEX url_regex -i ^http://.*sex.*$
http_access deny SEX
-----------------------------------
acl CGI urlpath_regex ^/cgi-bin
method acl definitions:
# acl aclname method GET POST ... # HTTP request method [fast] (methods include GET, POST, PUT, CONNECT, and PURGE. CONNECT tunnels another kind of request through the http proxy; be especially careful with the CONNECT method and the remote server's port, and restrict CONNECT to https on 443 or nntps on 563. PURGE is a squid-specific request method that lets an administrator force the deletion of a cache object; squid denies PURGE requests by default unless an acl enables them, and normally only localhost is allowed to PURGE)
For example:
acl Uploads method PUT POST
-------------------------------
acl CONNECT method CONNECT
acl SSL_PORTS port 443 563
http_access allow CONNECT SSL_PORTS
http_access deny CONNECT
--------------------------------
Allow only the defined Localhost to use the defined Purge method
acl Purge method PURGE
acl Localhost src 127.0.0.1
http_access allow Purge Localhost
http_access deny Purge
proto acl definitions:
# acl aclname proto HTTP FTP ... # request protocol [fast] (protocols: http, https (i.e. HTTP/TLS), ftp, gopher, urn, whois, cache_object; cache_object is specific to squid and is used to access squid's cache manager interface)
For example:
Deny all FTP requests
acl FTP proto FTP
http_access deny FTP
---------------------------------------
Allow local management
acl Manager proto cache_object
acl Localhost src 127.0.0.1 192.168.1.1
http_access allow Manager Localhost
http_access deny Manager
Examples:
Limit the maximum number of connections from one client IP
acl OverConnLimit maxconn 16
http_access deny OverConnLimit
-----------------------------------
Block hotlinking from tianya and pass it on to baidu
acl tianya referer_regex -i tianya
http_access deny tianya
deny_info http://www.baidu.com/logs.gif tianya
-----------------------------------
Prevent abuse as an open http proxy by setting the IP address that may be accessed
acl myip dst 192.168.1.1
http_access deny !myip
------------------------------------
Keep the baidu spider from crawling the server to death
acl AntiBaidu req_header User-Agent Baiduspider
http_access deny AntiBaidu
----------------------------------
Proxy only port 80
acl Safe_port port 80
http_access deny !Safe_port
http_access allow all
------------------------------------
Block BT file downloads
acl BT urlpath_regex -i \.torrent$
http_access deny BT
-------------------------------------
Count page views more accurately
acl url_no_log urlpath_regex \.gif \.jpg \.swf \.GIF \.JPG \.SWF \.js \.css F5BigIP
acl method_no_log method PURGE HEAD
access_log /usr/local/squid/var/logs/access.log combined !url_no_log !method_no_log
[root@master ~]# vim /etc/profile.d/squid.sh
export PATH=$PATH:/usr/local/squid/sbin:/usr/local/squid/bin
[root@master ~]# source !$
source /etc/profile.d/squid.sh
[root@master ~]# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/java/jdk1.8.0_51/bin:/usr/java/jdk1.8.0_51/jre/bin:/usr/local/mysql/bin:/root/bin:/usr/local/squid/sbin:/usr/local/squid/bin
[root@master ~]# squid -h #(squid -k parse checks the syntax; squid -z initializes the cache; squid -k reconfigure reloads the configuration file; squid -k rotate rotates the logs; squid -k shutdown stops the service)
Usage: squid [-cdhvzCFNRVYX] [-n name] [-s | -l facility] [-f config-file] [-[au] port] [-k signal]
       -a port   Specify HTTP port number (default: 3128).
       -d level  Write debugging to stderr also.
       -f file   Use given config-file instead of
                 /etc/squid.conf
       -h        Print help message.
       -k reconfigure|rotate|shutdown|restart|interrupt|kill|debug|check|parse
                 Parse configuration file, then send signal to
                 running copy (except -k parse) and exit.
       -n name   Specify service name to use for service operations
                 default is: squid.
       -s | -l facility
                 Enable logging to syslog.
       -u port   Specify ICP port number (default: 3130), disable with 0.
       -v        Print version.
       -z        Create missing swap directories and then exit.
       -C        Do not catch fatal signals.
       -D        OBSOLETE. Scheduled for removal.
       -F        Don't serve any requests until store is rebuilt.
       -N        No daemon mode.
       -R        Do not set REUSEADDR on port.
       -S        Double-check swap during rebuild.
       -X        Force full debugging.
       -Y        Only return UDP_HIT or UDP_MISS_NOFETCH during fast reload.
[root@master ~]# chown -R squid /usr/local/squid/var
[root@master ~]# squid -k parse #(check the syntax)
……
[root@master ~]# squid -z #(initialize the cache)
[root@master ~]# 2016/08/31 03:17:16 kid1| Set Current Directory to /usr/local/squid/var/cache/squid
2016/08/31 03:17:16 kid1| Creating missing swap directories
2016/08/31 03:17:16 kid1| /usr/local/squid/var/cache exists
2016/08/31 03:17:16 kid1| Making directories in /usr/local/squid/var/cache/00
2016/08/31 03:17:16 kid1| Making directories in /usr/local/squid/var/cache/01
2016/08/31 03:17:16 kid1| Making directories in /usr/local/squid/var/cache/02
2016/08/31 03:17:16 kid1| Making directories in /usr/local/squid/var/cache/03
2016/08/31 03:17:16 kid1| Making directories in /usr/local/squid/var/cache/04
2016/08/31 03:17:16 kid1| Making directories in /usr/local/squid/var/cache/05
2016/08/31 03:17:16 kid1| Making directories in /usr/local/squid/var/cache/06
2016/08/31 03:17:16 kid1| Making directories in /usr/local/squid/var/cache/07
2016/08/31 03:17:17 kid1| Making directories in /usr/local/squid/var/cache/08
2016/08/31 03:17:17 kid1| Making directories in /usr/local/squid/var/cache/09
2016/08/31 03:17:17 kid1| Making directories in /usr/local/squid/var/cache/0A
2016/08/31 03:17:17 kid1| Making directories in /usr/local/squid/var/cache/0B
2016/08/31 03:17:17 kid1| Making directories in /usr/local/squid/var/cache/0C
2016/08/31 03:17:17 kid1| Making directories in /usr/local/squid/var/cache/0D
2016/08/31 03:17:17 kid1| Making directories in /usr/local/squid/var/cache/0E
2016/08/31 03:17:17 kid1| Making directories in /usr/local/squid/var/cache/0F
[root@master ~]# squid -N -d1 #(start squid in the foreground)
2016/08/31 19:37:44| Set Current Directory to /usr/local/squid/var/cache/squid
2016/08/31 19:37:44| Starting Squid Cache version 3.5.20 for x86_64-pc-linux-gnu...
2016/08/31 19:37:44| Service Name: squid
2016/08/31 19:37:44| Process ID 114408
2016/08/31 19:37:44| Process Roles: master worker
……
2016/08/31 19:37:46| storeLateRelease: released 0 objects
2016/08/31 19:39:31| recv: (111) Connection refused
2016/08/31 19:39:31| Closing Pinger socket on FD 19
2016/08/31 19:42:30| Preparing for shutdown after 667 requests
2016/08/31 19:42:30| Waiting 30 seconds for active connections to finish
2016/08/31 19:42:30| Closing HTTP port [::]:3128
[root@master ~]# netstat -tnulp | grep squid
tcp 0 0 :::3128 :::* LISTEN 113742/(squid-1)
udp 0 0 0.0.0.0:53277 0.0.0.0:* 113742/(squid-1)
udp 0 0 :::21226 :::* 113742/(squid-1)
[root@master ~]# squid -k shutdown
[root@master ~]# squid -k shutdown
[root@master ~]# ps aux | grep squid
root 114453 0.0 0.1 103252 828 pts/3 S+ 19:42 0:00 grep squid
[root@master ~]# squid -s #(start the service as a daemon)
[root@master ~]# ps aux | grep squid
root 114455 0.0 0.4 53368 2400 ? Ss 19:44 0:00 squid -s
squid 114457 0.7 2.7 68184 13300 ? S 19:44 0:00 (squid-1) -s
squid 114458 0.0 0.2 20288 980 ? S 19:44 0:00 (unlinkd)
root 114461 0.0 0.1 103252 828 pts/3 S+ 19:44 0:00 grep squid
With the configuration above squid runs as a regular (forward) proxy by default. On the physical Windows host, set the browser's proxy to the squid host's ip and port to go online through squid:
[root@master ~]# tail -f /usr/local/squid/var/logs/access.log
1472697697.297 60435 10.96.20.89 TCP_MISS_ABORTED/000 0 GET http://pki.google.com/GIAG2.crl - HIER_DIRECT/74.125.23.139 -
1472697697.297 60460 10.96.20.89 TCP_MISS_ABORTED/000 0 GET http://pki.google.com/GIAG2.crl - HIER_DIRECT/74.125.23.139 -
1472697727.162 65968 10.96.20.89 TCP_TUNNEL/200 8557 CONNECT iecvlist.microsoft.com:443 - HIER_DIRECT/68.232.45.201 -
1472697944.465 15 10.96.20.89 TCP_MISS/200 798 GET http://miserupdate.aliyun.com/data/2.4.1.6/brfversion.xml - HIER_DIRECT/222.73.134.40 text/xml
1472697944.471 5 10.96.20.89 TCP_MISS/200 12724 GET
……
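By default squid does not rotate access.log by date; everything stays in one file. For easier management, a script can rotate access.log with a date stamp:
[root@master ~]# vim squid_rotate.sh
--------------------script start---------------
#!/bin/bash
# Rename the current access.log with today's date, then have squid reopen its logs.
cd /usr/local/squid/var/logs/ || exit 1
[ -f access.log ] && mv access.log access_$(date +%F).log
/usr/local/squid/sbin/squid -k rotate
----------------------script end-------------------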
[root@master ~]# crontab -e
*/5 * * * * /usr/sbin/ntpdate cn.ntp.org.cn &> /dev/null
59 23 * * * bash /root/squid_rotate.sh &> /dev/null
[root@master ~]# service crond restart
Stopping crond: [ OK ]
Starting crond: [ OK ]
Configure squid's web management interface:
[root@master ~]# rpm -qa httpd
httpd-2.2.15-29.el6_4.x86_64
[root@master ~]# ll /usr/local/squid/libexec/cachemgr.cgi
-rwxr-xr-x. 1 root root 429249 Aug 31 00:02 /usr/local/squid/libexec/cachemgr.cgi
[root@master ~]# vim /etc/httpd/conf/httpd.conf
ScriptAlias /squid "/usr/local/squid/libexec/cachemgr.cgi"
order deny,allow
Deny from all
Allow from all
[root@master ~]# vim /etc/squid.conf #(Usage: cachemgr_passwd password action action ...)
cachemgr_passwd jowin config
[root@master ~]# vim /etc/squid.conf
[root@master ~]# service httpd start
Starting httpd: httpd: Could not reliably determine the server's fully qualified domain name, using 10.96.20.113 for ServerName
[ OK ]
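Transparent proxy with iptables + squid (configured on the gateway host, which needs at least 2 NICs; it gives control over employees' Internet use, faster browsing, and, in earlier days, savings on bandwidth cost):
--enable-linux-netfilter Enable Transparent Proxy support for Linux (Netfilter)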
[root@master ~]# ifconfig | egrep -A 1 "eth0|eth1"
eth0 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:AC
inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0
--
eth1 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:B6
inet addr:192.168.10.113 Bcast:192.168.10.255 Mask:255.255.255.0
[root@master ~]# vim /etc/sysctl.conf
net.ipv4.ip_forward = 1
[root@master ~]# sysctl -p
[root@master ~]# cat /proc/sys/net/ipv4/ip_forward
1
[root@master ~]# vim /etc/squid.conf #(change or add the following; http_port 3129 intercept is required, and the clock on the squid host must be kept in sync with real time)
#http_port 3128
http_port 3129 intercept #(since version 3.1 a transparent proxy can no longer be configured as http_port 3128 transparent; use http_port 3129 intercept, and keep 3128 for the regular proxy)
cache_mem 64 MB #(the default is cache_mem 256 MB; it must not be larger than the disk space set in cache_dir, otherwise squid reports an error at startup; 'cache_mem' specifies the ideal amount of memory to be used for: In-Transit objects, Hot Objects, Negative-Cached objects)
cache_swap_low 90 #(default 90, The low-water mark for AUFS/UFS/diskd cache object eviction by the cache_replacement_policy algorithm.)
cache_swap_high 95 #(default 95, The high-water mark for AUFS/UFS/diskd cache object eviction by the cache_replacement_policy algorithm.)
maximum_object_size 8192 KB #(Set the default value for max-size parameter on any cache_dir. The value is specified in bytes, and the default is 4 MB.)
minimum_object_size 0 KB #(Objects smaller than this size will NOT be saved on disk. The value is specified in bytes, and the default is 0 KB, which means all responses can be stored. Default: no limit)
maximum_object_size_in_memory 4096 KB #(default 512 KB, Objects greater than this size will not be attempted to kept in the memory cache. This should be set high enough to keep objects accessed frequently in memory to improve performance whilst low enough to keep larger objects from hoarding cache_mem.)
#emulate_httpd_log on #(obsolete in 3.5, Replace this with an access_log directive using the format 'common' or 'combined'. Default: none)
memory_replacement_policy lru #(the default is lru, least recently used; cache_replacement_policy accepts the same values: lru: Squid's original list based LRU policy; heap GDSF: Greedy-Dual Size Frequency; heap LFUDA: Least Frequently Used with Dynamic Aging; heap LRU: LRU policy implemented using a heap)
[root@master ~]# squid -k parse
[root@master ~]# iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-ports 3129
[root@master ~]# iptables -t nat -A POSTROUTING -s 192.168.10.0/24 -j SNAT --to-source 10.96.20.113
[root@master ~]# iptables -t nat -L -n
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
REDIRECT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 redir ports 3129
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
SNAT all -- 192.168.10.0/24 0.0.0.0/0 to:10.96.20.113
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Configuration on the intranet host 192.168.10.118 (its gateway must be eth1 of the squid host). Test by browsing with #elinks www.baidu.com while watching squid's access.log
[root@master ~]# tail -f /usr/local/squid/var/logs/access.log
1472719549.725 9 192.168.10.118 TCP_MISS/200 7822 GET http://www.baidu.com/baidu.html? - ORIGINAL_DST/115.239.210.27 text/html
1472719556.301 24 192.168.10.118 TCP_MISS/200 100873 GET http://www.baidu.com/ - ORIGINAL_DST/115.239.210.27 text/html
1472719557.303 0 192.168.10.118 TCP_MEM_HIT/200 7828 GET http://www.baidu.com/baidu.html? - HIER_NONE/- text/html
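Example (reverse proxy, cache server; it improves the user experience and is deployed once the backend web server, storage, or DB can no longer keep up):
As a reverse proxy squid normally caches only cacheable data such as html pages, js, css, and pictures; CGI scripts and dynamic pages such as asp, jsp, and php are not cached by default. squid decides what to cache from the http headers returned by the web server. Four headers matter most:
Last-Modified (tells the reverse proxy when the page was last modified);
Expires (tells the reverse proxy when the page should be removed from the cache);
Cache-Control (tells the reverse proxy whether the page may be cached. Common directives: no-store (the intermediate cache server must not store the object and simply forwards the header to the user); no-cache (the cache server may keep a local copy but must not serve it to users before revalidating it with the origin); must-revalidate (strict mode: by default a caching proxy may serve a stale object to improve performance; with this directive the stale object is not returned and, if revalidation fails, the client gets 504 Gateway Timeout); max-age (how long the object may be served to users after the cache server has fetched it); s-maxage (same as max-age, but applies only to shared caches));
Pragma (carries implementation-specific directives; the common one is Pragma: no-cache, kept for http/1.0 compatibility. In principle it applies only to http requests and works like Cache-Control: no-cache);
Note: the order of precedence is Cache-Control, Expires, refresh_pattern, Last-Modified; once an earlier one applies, the later ones are effectively ignored. With Etag squid sends a request header to the origin server to validate, while with Last-Modified it does not validate with the origin server by default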
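How a shared cache might read the Cache-Control directives listed above can be sketched as a small classifier. This is a simplified illustration, not squid's logic: `classify_cc` and its output labels are hypothetical, and it looks only at no-store, no-cache, s-maxage, and max-age.

```shell
# classify_cc "<Cache-Control value>"
# Prints what a shared cache may do with the response (labels are made up).
classify_cc() {
    local cc ttl
    # Normalize: lower-case, no spaces, so directives are comma-separated.
    cc=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d ' ')
    case ",$cc," in
        *,no-store,*) echo "do-not-store"; return 0 ;;
    esac
    case ",$cc," in
        *,no-cache,*) echo "store-but-revalidate"; return 0 ;;
    esac
    # For a shared cache s-maxage takes precedence over max-age.
    ttl=$(printf '%s\n' "$cc" | tr ',' '\n' | sed -n 's/^s-maxage=\([0-9][0-9]*\)$/\1/p' | head -n 1)
    if [ -z "$ttl" ]; then
        ttl=$(printf '%s\n' "$cc" | tr ',' '\n' | sed -n 's/^max-age=\([0-9][0-9]*\)$/\1/p' | head -n 1)
    fi
    if [ -n "$ttl" ]; then
        echo "fresh-for-${ttl}s"
    else
        # Nothing decisive here: Expires, refresh_pattern, Last-Modified apply next.
        echo "no-directive"
    fi
}

classify_cc "public, max-age=600, s-maxage=120"   # prints fresh-for-120s
classify_cc "max-age=60, no-cache"                # prints store-but-revalidate
```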
[root@master ~]# ifconfig | egrep -A 1 "eth0|eth1"
eth0 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:AC
inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0
--
eth1 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:B6
inet addr:192.168.10.113 Bcast:192.168.10.255 Mask:255.255.255.0
[root@master ~]# cat /proc/sys/net/ipv4/ip_forward
1
[root@master ~]# curl 192.168.10.118 #(httpd is deployed on the backend host 192.168.10.118)
[root@master ~]# vim /etc/hosts
192.168.10.118
[root@master ~]# vim /etc/squid.conf #(change or add the following)
http_port 80 accel vhost vport
cache_peer parent 80 0 no-query no-digest max-conn=32 originserver #(cache_peer hostname type http-port icp-port [options]; type is either 'parent', 'sibling', or 'multicast'.; an icp-port of 0 means Set to 0 if the peer does not support ICP or HTCP.; no-query is an icp option, Disable ICP queries to this neighbor.; max-conn=N is a general option, Limit the number of concurrent connections the Squid may open to this peer, including already opened idle and standby connections. There is no peer-specific connection limit by default.; originserver and no-digest are accelerator/reverse-proxy options; originserver, Causes this parent to be contacted as an origin server. Meant to be used in accelerator setups when the peer is a web server.; no-digest, Disable request of cache digests.)
refresh_pattern -i \.jpg$ 30 50% 4320 reload-into-ims #(usage: refresh_pattern [-i] regex min percent max [options]; 'Min' is the time (in minutes) an object without an explicit expiry time should be considered fresh. The recommended value is 0, any higher values may cause dynamic applications to be erroneously cached unless the application designer has taken the appropriate actions.; 'Percent' is a percentage of the objects age (time since last modification age) an object without explicit expiry time will be considered fresh.; 'Max' is an upper limit on how long objects without an explicit expiry time will be considered fresh.; reload-into-ims changes a client no-cache or ``reload'' request for a cached entry into a conditional request using If-Modified-Since and/or If-None-Match headers, provided the cached entry has a Last-Modified and/or a strong ETag header. Doing this VIOLATES the HTTP standard. Enabling this feature could make you liable for problems which it causes.; other options: override-expire, override-lastmod, ignore-reload, ignore-no-store, ignore-must-revalidate, ignore-private, ignore-auth, max-stale=NN, refresh-ims, store-stale. How freshness is decided: a response that has been in the cache for no more than the 30 min minimum is not expired; 4320 min is the upper limit, and a response that has been in the cache longer than that must be refreshed; between the two limits squid applies its LM-factor algorithm, taking the ratio of the response age to the resource age, and the response must be refreshed if the ratio exceeds 50%. resource age = time the object entered the cache - the object's Last-Modified; response age = current time - time the object entered the cache; LM-factor = response age / resource age)
refresh_pattern -i \.png$ 30 50% 4320 reload-into-ims
refresh_pattern -i \.gif$ 30 50% 4320 reload-into-ims
hosts_file /etc/hosts
request_header_max_size 128 KB #(This specifies the maximum size for HTTP headers in a request. Request headers are usually relatively small (about 512 bytes). Placing a limit on the request header size will catch certain bugs (for example with persistent connections) and possibly buffer-overflow or denial-of-service attacks. Default: request_header_max_size 64 KB)
ipcache_size 1024 #(Maximum number of DNS IP cache entries. Default: ipcache_size 1024)
ipcache_low 90
ipcache_high 95
offline_mode on #(Enable this option and Squid will never try to validate cached objects. Default: offline_mode off; in offline mode, if the backend web server goes down, pages that squid has cached can still be served)
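The freshness rule described in the refresh_pattern annotation above (minimum, maximum, and the LM-factor in between) can be written out as a small function. This is a sketch that follows that description, not squid's source code; `is_fresh` is a hypothetical name, times are epoch seconds, and min and max are in minutes as in refresh_pattern.

```shell
# is_fresh NOW CACHED_AT LAST_MODIFIED MIN_MINUTES PERCENT MAX_MINUTES
# Prints "fresh" or "stale" for an object with no explicit expiry time.
is_fresh() {
    local now cached_at last_modified min percent max
    local response_age resource_age lm
    now=$1; cached_at=$2; last_modified=$3; min=$4; percent=$5; max=$6
    response_age=$(( now - cached_at ))            # time spent in the cache
    resource_age=$(( cached_at - last_modified ))  # age when it was cached
    # Older than the upper limit: must be refreshed.
    if [ "$response_age" -gt $(( max * 60 )) ]; then echo stale; return 0; fi
    # Not older than the lower limit: still fresh.
    if [ "$response_age" -le $(( min * 60 )) ]; then echo fresh; return 0; fi
    # No usable Last-Modified: nothing to base the LM-factor on.
    if [ "$resource_age" -le 0 ]; then echo stale; return 0; fi
    # LM-factor = response age / resource age, as an integer percentage.
    lm=$(( response_age * 100 / resource_age ))
    if [ "$lm" -le "$percent" ]; then echo fresh; else echo stale; fi
}

# refresh_pattern ... 30 50% 4320: cached for 10000 s, 40000 s old when cached.
is_fresh 100000 90000 50000 30 50 4320   # LM-factor 25%: prints fresh
is_fresh 100000 90000 80000 30 50 4320   # LM-factor 100%: prints stale
```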
[root@master ~]# squid -k parse
……
[root@master ~]# squid -s
[root@master ~]# ps aux | grep squid | grep -v grep
root 10663 0.0 0.4 53372 2404 ? Ss 01:00 0:00 squid -s
squid 10665 0.1 2.7 68264 13568 ? S 01:00 0:00 (squid-1) -s
squid 10666 0.0 0.2 20288 984 ? S 01:00 0:00 (unlinkd)
Test from the Windows host
After another refresh the status changes to TCP_MEM_HIT
[root@master ~]# tail -f /usr/local/squid/var/logs/access.log
1472803272.383 24 10.96.20.89 TCP_MISS/404 567 GET http://10.96.20.113/favicon.ico - FIRSTUP_PARENT/192.168.10.118 text/html
1472803274.220 149 10.96.20.89 TCP_MISS/200 378 GET http://10.96.20.113/ - FIRSTUP_PARENT/192.168.10.118 text/html
1472803328.186 6 10.96.20.89 TCP_MEM_HIT/200 385 GET http://10.96.20.113/ - HIER_NONE/- text/html
1472803334.596 0 10.96.20.89 TCP_MEM_HIT/200 386 GET http://10.96.20.113/ - HIER_NONE/- text/html
Upload an image to /var/www/html/ on 192.168.10.118, then open it from Windows
After another refresh the status changes from TCP_MISS to TCP_REFRESH_UNMODIFIED
[root@master ~]# tail -f /usr/local/squid/var/logs/access.log
1472803645.760 6 10.96.20.89 TCP_MISS/200 43843 GET http://10.96.20.113/V901.jpg - FIRSTUP_PARENT/192.168.10.118 image/jpeg
1472803655.700 3 10.96.20.89 TCP_REFRESH_UNMODIFIED/200 43850 GET http://10.96.20.113/V901.jpg - FIRSTUP_PARENT/192.168.10.118 image/jpeg
1472803663.428 3 10.96.20.89 TCP_REFRESH_UNMODIFIED/200 43850 GET http://10.96.20.113/V901.jpg - FIRSTUP_PARENT/192.168.10.118 image/jpeg
Stop the backend web server and, with offline_mode on, check whether the page can still be reached
[root@master ~]# ssh root@192.168.10.118 'service httpd stop'
root@192.168.10.118's password:
Stopping httpd: [ OK ]
Note: by default /usr/local/squid/var/logs/access.log uses the logformat named squid, which begins %ts.%03tu %6tr %>a %Ss/%03>Hs. Values of %Ss include: TCP_MISS, TCP_HIT, TCP_DENIED, TCP_REDIRECT, TCP_MEM_HIT, TCP_REFRESH_HIT, TCP_REFRESH_MISS, TCP_IMS_HIT, TCP_SWAPFAIL_MISS, TCP_NEGATIVE_HIT, TCP_OFFLINE_HIT (offline_mode on), TCP_CLIENT_REFRESH_MISS (ctrl+F5), TCP_REF_FAIL_HIT
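With the %Ss codes above, the request hit ratio and byte hit ratio can be estimated straight from access.log. A minimal sketch, assuming the native squid log format shown in this article (field 4 is code/status, field 5 is the response size); `hit_ratios` is a hypothetical helper, and counting every code that contains HIT as a hit is a simplification.

```shell
# hit_ratios [FILE...]: request and byte hit ratio from a native-format
# access.log; reads stdin when no file is given.
hit_ratios() {
    awk '
        {
            total++
            bytes += $5
            split($4, code, "/")        # "TCP_MEM_HIT/200" -> "TCP_MEM_HIT"
            if (code[1] ~ /HIT/) { hits++; hit_bytes += $5 }
        }
        END {
            if (total == 0 || bytes == 0) { print "no data"; exit }
            printf "requests=%d hit_ratio=%.1f%% byte_hit_ratio=%.1f%%\n",
                   total, 100 * hits / total, 100 * hit_bytes / bytes
        }' "$@"
}
```

Typical use would be `hit_ratios /usr/local/squid/var/logs/access.log`; squid's own figures are available from mgr:info.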
ExpiresActive settings in httpd:
[root@localhost ~]# vim /etc/httpd/conf/httpd.conf
LoadModule expires_module modules/mod_expires.so
ExpiresActive on
ExpiresDefault "access plus 12 months"
ExpiresByType text/html "access plus 12 months"
ExpiresByType text/css "access plus 12 months"
ExpiresByType image/gif "access plus 12 months"
ExpiresByType image/jpeg "access plus 12 months"
ExpiresByType image/jpg "access plus 12 months"
ExpiresByType image/png "access plus 12 months"
ExpiresByType application/x-javascript "access plus 12 months"
ExpiresByType video/x-flv "access plus 12 months"
[root@localhost ~]# httpd -t -f /etc/httpd/conf/httpd.conf
httpd: Could not reliably determine the server's fully qualified domain name, using localhost.localdomain for ServerName
Syntax OK
[root@master ~]# curl -I http://10.96.20.113 #(test on the squid host)
HTTP/1.1 200 OK
Date: Fri, 02 Sep 2016 08:00:32 GMT
Server: Apache/2.2.15 (Red Hat)
Last-Modified: Fri, 02 Sep 2016 06:45:36 GMT
ETag: "c28da-d-53b80ad8a25f8"
Accept-Ranges: bytes
Content-Length: 13
Content-Type: text/html; charset=UTF-8
Age: 237090
Warning: 113 squid (squid/3.5.20) This cache hit is still fresh and more than 1 day old
Warning: 110 squid/3.5.20 "Response is stale"
X-Cache: HIT from squid
X-Cache-Lookup: HIT from squid:80
Via: 1.1 squid (squid/3.5.20)
Connection: keep-alive
[root@master ~]# which squidclient
/usr/local/squid/bin/squidclient
[root@master ~]# squidclient -h
Version: 3.5.20
Usage: squidclient [Basic Options] [HTTP Options]
    -s | --quiet       Silent. Do not print response message to stdout.
    -v | --verbose     Verbose debugging. Repeat (-vv) to increase output level.
                       Levels:
                         1 - Print outgoing request message to stderr.
                         2 - Print action trace to stderr.
    --help             Display this help text.
Connection Settings
    -h | --host host   Send message to server on 'host'. Default is localhost.
    -l | --local host  Specify a local IP address to bind to. Default is none.
    -p | --port port   Port number on server to contact. Default is 3128.
    -T timeout         Timeout in seconds for read/write operations
Ping Mode
    --ping [options]   Enable ping mode.
    options:
    -g count           Ping iteration count (default, loop until interrupted).
    -I interval        Ping interval in seconds (default 1 second).
HTTP Options:
    -a           Do NOT include Accept: header.
    -A           User-Agent: header. Use "" to omit.
    -H 'string'  Extra headers to send. Use '\n' for new lines.
    -i IMS       If-Modified-Since time (in Epoch seconds).
    -j hosthdr   Host header content
    -k           Keep the connection active. Default is to do only one request then close.
    -m method    Request method, default is GET.
    -n           Proxy Negotiate(Kerberos) authentication
    -N           WWW Negotiate(Kerberos) authentication
    -P file      Send content from the named file as request payload
    -r           Force cache to reload URL
    -t count     Trace count cache-hops
    -u user      Proxy authentication username
    -U user      WWW authentication username
    -V version   HTTP Version. Use '-' for HTTP/0.9 omitted case
    -w password  Proxy authentication password
    -W password  WWW authentication password
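[root@master ~]# squidclient -h localhost -p 80 mgr:info #(mgr:info shows squid's runtime status; mgr:objects lists what squid has cached, printing every object held in memory and on disk, each identified by a key; mgr:mem shows squid's memory usage; mgr:diskd shows squid's disk usage; mgr:storedir shows information about squid's cache storage directories; mgr:forward shows squid's forwarding statistics)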
HTTP/1.1 200 OK
Server: squid/3.5.20
Mime-Version: 1.0
Date: Mon, 05 Sep 2016 02:00:48 GMT
Content-Type: text/plain;charset=utf-8
Expires: Mon, 05 Sep 2016 02:00:48 GMT
Last-Modified: Mon, 05 Sep 2016 02:00:48 GMT
X-Cache: MISS from squid
X-Cache-Lookup: MISS from squid:80
Via: 1.1 squid (squid/3.5.20)
Connection: close
Squid Object Cache: Version 3.5.20
Build Info:
Service Name: squid
Start Time: Mon, 05 Sep 2016 01:51:43 GMT
Current Time: Mon, 05 Sep 2016 02:00:48 GMT
Connection information for squid:
    Number of clients accessing cache: 2
    Number of HTTP requests received: 3
    Number of ICP messages received: 0
    Number of ICP messages sent: 0
    Number of queued ICP replies: 0
    Number of HTCP messages received: 0
    Number of HTCP messages sent: 0
    Request failure ratio: 0.00
    Average HTTP requests per minute since start: 0.3
    Average ICP messages per minute since start: 0.0
    Select loop called: 782 times, 698.097 ms avg
Cache information for squid:
    Hits as % of all requests: 5min: 0.0%, 60min: 66.7%
    Hits as % of bytes sent: 5min: 100.0%, 60min: 100.0%
    Memory hits as % of hit requests: 5min: 0.0%, 60min: 50.0%
    Disk hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
    Storage Swap size: 18860 KB
    Storage Swap capacity: 1.8% used, 98.2% free
    Storage Mem size: 224 KB
    Storage Mem capacity: 0.3% used, 99.7% free
    Mean Object Size: 50.16 KB
    Requests given to unlinkd: 0
……
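#squidclient -m purge -u squid -g squid http:// (manually purge a cached object; the http:// page is the problem URL reported by the business team. This fixes out-of-sync caches where different regions see different pages)
#squidclient -m purge -p 80 http://10.96.20.113/msn.jpg (purge the specified object)
#squidclient -r http://10.96.20.113/msn.jpg (force cache to reload; if ignore-reload is set in refresh_pattern, -r has no effect)
squid cannot purge a group of objects in one request, but awk can be combined with squidclient to do it (for example: #awk '{print $7}' /usr/local/squid/var/logs/access.log | grep www.example.com | xargs -n 1 squidclient -m purge)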
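The awk one-liner above matches the domain anywhere in the URL and purges the same URL once per log line. A slightly stricter variant, as a sketch: `purge_candidates` is a hypothetical helper that only prints the unique URLs whose host part equals the given host, so the list can be reviewed before it is piped to squidclient.

```shell
# purge_candidates HOST < access.log
# Prints each distinct URL (field 7 of the native log format) served from HOST.
purge_candidates() {
    awk -v host="$1" '
        $7 ~ /^https?:\/\// {
            split($7, part, "/")          # part[3] is host[:port]
            h = part[3]
            sub(/:[0-9]+$/, "", h)        # drop an explicit port
            if (h == host) print $7
        }' | sort -u
}
```

After checking the output, the purge itself would be the same as in the text: `purge_candidates www.example.com < /usr/local/squid/var/logs/access.log | xargs -n 1 squidclient -m purge`.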
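#echo '' > /usr/local/squid/var/cache/swap.state (purge all objects. This command does not delete the files from disk; it only makes squid believe its cache is empty. While squid runs, the new files it adds to the cache may overwrite old ones. If disk usage exceeds the configured size, delete the old files before restarting squid, as follows)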
#squid -k shutdown
#squid -k shutdown
#cd /usr/local/squid/var/
#mv cache oldcache
#mkdir cache
#chown squid:squid cache
#squid -z
#squid -s
#rm -rf oldcache &
squid CDN cluster:
http_port ip:port vhost vport
icp_port 3130
cache_peer ip sibling 80 3130 name=cache1
cache_peer 1.2.3.4 sibling 80 3130 name=cache2 #(serverA)
cache_peer 5.6.7.8 sibling 80 3130 name=cache3 #(serverB)
This article is reposted from chaijowin's 51CTO blog. Original link: http://blog.51cto.com/jowin/1846348. To repost it, please contact the original author