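Nginx has two main logging directives: log_format, which defines the log format, and access_log, which sets the log file's path, format, and buffer size. See ngx_http_log_module for details. Logging is usually configured in nginx's configuration file (/usr/local/nginx/conf/nginx.conf).
A typical nginx log entry looks like this: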
42.57.99.126 - - [02/Oct/2018:20:40:22 +0800] "GET /favicon.ico HTTP/1.1" 404 564 "-" "Mozilla/5.0 (Linux; U; Android 8.0.0; zh-cn; MI 6 Build/OPR1.170623.027) AppleWebKit/537.36 (KHTML, like Gecko)Version/4.0 Chrome/37.0.0.0 MQQBrowser/7.8 Mobile Safari/537.36"
log_format accepts many optional variables that describe server activity. The main format in the default configuration file is:
log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_forwarded_for"';
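To record more detail, define a custom log_format. The variables it accepts and their meanings are listed in the nginx documentation (ngx_http_log_module and the embedded variables of the core HTTP module).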
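As a rough illustration of how the variables in the main format map onto a log entry, the sketch below splits one line into named fields with a regular expression. The names LOG_PATTERN and parse_line are hypothetical, and the pattern assumes the main format shown above, with the trailing $http_x_forwarded_for field treated as optional because the sample entry does not include it.

```python
import re

# Hypothetical pattern for the "main" format shown above.
# Group names mirror the nginx variable names for readability.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" '
    r'"(?P<http_user_agent>[^"]*)"'
    r'(?: "(?P<http_x_forwarded_for>[^"]*)")?'  # optional last field
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


sample = ('42.57.99.126 - - [02/Oct/2018:20:40:22 +0800] '
          '"GET /favicon.ico HTTP/1.1" 404 564 "-" "Mozilla/5.0"')
fields = parse_line(sample)
print(fields["remote_addr"], fields["status"], fields["body_bytes_sent"])
```

All values come back as strings, so numeric fields such as body_bytes_sent need an int() conversion before they are summed.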
Task: find the 10 IPs with the most requests in the nginx access.log.
1. awk
awk '{a[$1]++} END{for(i in a) print a[i], i}' access.log | sort -nr | head -n 10
2. Python script
#!/usr/bin/python
# coding=utf8
log_file = "data/access.log"
ip = {}  # maps each IP to its request count
with open(log_file) as f:
    for line in f:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        ip_attr = fields[0]
        # add one if the IP is already in the dict, otherwise start at 1
        ip[ip_attr] = ip.get(ip_attr, 0) + 1
# sort by request count, highest first, and keep the top 10
s = sorted(ip.items(), key=lambda x: x[1], reverse=True)
for addr, count in s[:10]:
    print("%s %d" % (addr, count))
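The same count can be written more compactly with collections.Counter from the standard library, whose most_common method does the sorting and truncation. This is only an alternative sketch, not a replacement for the script above; the function name top_ips and the sample lines are made up for illustration.

```python
from collections import Counter


def top_ips(lines, n=10):
    """Return the n most frequent first fields as (ip, count) pairs."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


# Made-up sample lines; only the first field matters here.
sample = [
    '10.0.0.1 - - [02/Oct/2018:20:40:22 +0800] "GET / HTTP/1.1" 200 10',
    '10.0.0.2 - - [02/Oct/2018:20:40:23 +0800] "GET / HTTP/1.1" 200 10',
    '10.0.0.1 - - [02/Oct/2018:20:40:24 +0800] "GET /a HTTP/1.1" 200 10',
]
for addr, hits in top_ips(sample, 2):
    print(addr, hits)
```

Because top_ips only iterates over its argument, an open file object can be passed directly, which reads the log line by line instead of loading it all into memory.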
3. Traffic statistics
#!/usr/bin/python
# coding=utf8
log_file = "/usr/local/nginx/logs/access.log"
ip = {}    # key: IP address, value: number of requests from that IP
flow = {}  # key: IP address, value: total bytes sent to that IP
total = 0  # total bytes across all requests
with open(log_file) as f:
    for line in f:
        fields = line.split()
        ip_attr = fields[0]
        # field 9 is $body_bytes_sent in the log format shown above
        size = int(fields[9])
        total += size
        if ip_attr in ip:
            # repeated IP: add one to its count and accumulate its traffic
            ip[ip_attr] += 1
            flow[ip_attr] += size
        else:
            # first occurrence: initialise the count and the traffic
            ip[ip_attr] = 1
            flow[ip_attr] = size
print(ip)
print(flow)
print(total / 1024.0 / 1024.0)  # total traffic in MB
Count the requests per IP in the log:
cat access.log | awk '{ips[$1]+=1} END{for(ip in ips) print ip,ips[ip]}'
Count the requests per IP between 3 and 6 o'clock (the pattern matches hours 03 through 06 in 2021):
grep "2021:0[3-6]" img.log | awk '{ips[$1]+=1} END{for(ip in ips) print ips[ip],ip}' | sort -nr
Count the requests per IP between 3 and 6 o'clock, keeping only IPs with at least 200 requests:
grep '2021:0[3-6]' banma_access.log | awk '{ips[$1]+=1} END{for(ip in ips) if(ips[ip]>=200) print ips[ip],ip}' | sort -nr
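For comparison, here is a minimal Python sketch of the same hour-window filter with a threshold. The function name busy_ips and its parameters are hypothetical. Unlike the grep pattern, it reads the hour from the bracketed $time_local field rather than matching text anywhere in the line, and it does not restrict the year.

```python
from collections import Counter


def busy_ips(lines, start_hour=3, end_hour=6, min_hits=200):
    """Return (ip, hits) pairs for IPs with at least min_hits requests
    whose hour falls within start_hour..end_hour inclusive."""
    counts = Counter()
    for line in lines:
        try:
            ip = line.split()[0]
            # time_local looks like 02/Oct/2018:20:40:22 +0800
            stamp = line.split("[", 1)[1].split("]", 1)[0]
            hour = int(stamp.split(":")[1])
        except (IndexError, ValueError):
            continue  # skip lines that do not look like log entries
        if start_hour <= hour <= end_hour:
            counts[ip] += 1
    # most_common() yields pairs sorted by count, highest first
    return [(ip, hits) for ip, hits in counts.most_common() if hits >= min_hits]


# Made-up sample lines, with a low threshold so the demo prints something.
sample = [
    '1.1.1.1 - - [01/Jan/2021:03:10:00 +0800] "GET / HTTP/1.1" 200 1',
    '1.1.1.1 - - [01/Jan/2021:06:59:59 +0800] "GET / HTTP/1.1" 200 1',
    '2.2.2.2 - - [01/Jan/2021:07:00:00 +0800] "GET / HTTP/1.1" 200 1',
]
print(busy_ips(sample, min_hits=2))
```

With the default min_hits of 200 this mirrors the threshold in the awk command above; the sample call lowers it only so the three-line demo returns a result.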