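Linux Study Notes: A Detailed Guide to awk

Overview:

I. Basic usage

awk is a report-generation tool: it reads a file line by line, splits each line into fields, formats the fields, and prints the result.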

[Linux85] #awk -h
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:          GNU long options:
        -f progfile             --file=progfile
        -F fs                   --field-separator=fs     # field separator
        -v var=val              --assign=var=val
        -m[fr] val

awk [options] 'script' FILE ...
awk [options] '/pattern/{action}' FILE ...
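
The -f and -v options from the usage text can be tried with a small sketch like the one below. The file name greet.awk and the variable name prefix are made up for illustration; they do not come from the original notes.

```shell
# Hypothetical file and variable names, used only for illustration
tmpdir=$(mktemp -d)

# The awk program lives in its own file and is loaded with -f
cat > "$tmpdir/greet.awk" <<'EOF'
{ print prefix, $1 }
EOF

# -v assigns the awk variable "prefix" before the program starts
greeting=$(printf 'alice\nbob\n' | awk -v prefix="hello" -f "$tmpdir/greet.awk")
printf '%s\n' "$greeting"

rm -rf "$tmpdir"
```

Keeping the program in a file avoids shell quoting problems once a script grows beyond one line.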


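Four kinds of separator: a record separator and a field separator, each for input and for output.

Record (line) separator: end of line (newline) by default

Field separator: whitespace by default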


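Patterns

Address range: /pattern1/,/pattern2/
/pattern/ can be negated with !
expression
Expression operators: >, >=, <, <=, ==, !=, ~
BEGIN{} runs once before any input is read
END{} runs once after all input has been processed, before awk exits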

[Linux85] #awk '/^soul/{print $0}' /etc/passwd /etc/shadow /etc/group
soul:x:501:501::/home/soul:/bin/bash
soul:!!:16166:0:99999:7:::
soul:x:501:
[Linux85] #

# Users whose UID is greater than or equal to 500
[Linux85] #awk -F : '$3>=500{print $1}' /etc/passwd
nfsnobody
gentoo
soul
[Linux85] #

BEGIN: run an action before any input is processed
[Linux85] #awk -F : 'BEGIN{print "UserName\n***********"}$3>=500{print $1}' /etc/passwd
UserName
***********
nfsnobody
gentoo
soul
[Linux85] #

awk built-in variables:

NF   number of fields in the current input record
FS   field separator used when reading input (default: whitespace)
RS   record separator used when reading input (default: newline)
OFS  output field separator (default: a single space)
ORS  output record separator (default: newline)

[Linux85] #awk -F : '/^soul/{print $1,$7}' /etc/passwd
soul /bin/bash

[Linux85] #awk 'BEGIN{FS=":"}/^soul/{print $1,$7}' /etc/passwd
soul /bin/bash

[Linux85] #awk 'BEGIN{FS=":";OFS=":"}/^soul/{print $1,$7}' /etc/passwd
soul:/bin/bash
[Linux85] #
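
The examples above set FS and OFS only. As a minimal sketch, assuming made-up input that uses ";" between records and "," between fields, all four separators can be changed together:

```shell
# Sample data is invented for illustration: ";" ends a record, "," separates fields
joined=$(printf 'a,b;c,d' | awk 'BEGIN { RS = ";"; FS = ","; OFS = "-"; ORS = " | " } { print $1, $2 }')
printf '%s\n' "$joined"
```

Each record is read up to RS and split on FS; print then joins the items with OFS and ends each record with ORS.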

[Linux85] #awk '!/^$|^#/{print $1}' /etc/sysctl.conf
net.ipv4.ip_forward
net.ipv4.conf.default.rp_filter
net.ipv4.conf.default.accept_source_route
kernel.sysrq
kernel.core_uses_pid
net.ipv4.tcp_syncookies
net.bridge.bridge-nf-call-ip6tables
net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-arptables
kernel.msgmnb
kernel.msgmax
kernel.shmmax
kernel.shmall
[Linux85] #

[Linux85] #ifconfig | awk '/inet addr/{print $2}' | awk -F : '!/127/{print $2}'
172.16.251.85
[Linux85] #

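II. Advanced awk usage

1. Output with print: print item1, item2, ...

  • Items are separated by commas in the statement; on output they are joined with the output field separator (OFS, a single space by default).

  • An item can be a string, a number, a field of the current record (such as $1), a variable, or any awk expression; numbers are converted to strings before being printed.

  • The item list can be omitted, in which case print is equivalent to print $0; to print an empty line, use print "".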
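
The three rules above can be checked with a short sketch; the single input line is invented for illustration:

```shell
lines=$(printf 'root:x:0\n' | awk -F: '{
    print $1, $3        # comma: items are joined by OFS (a space by default)
    print $1 $3         # no comma: plain string concatenation
    print ""            # an empty line
    print               # no items: same as print $0
}')
printf '%s\n' "$lines"
```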


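2. Output with printf: printf format, item1, item2, ...

  • Unlike print, printf requires a format string.

  • The format specifies how each of the following items is printed.

  • printf does not append a newline automatically; add \n to the format when one is needed.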


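Each format specifier starts with % and ends with a conversion character:

%c        a single character (a numeric argument is treated as a character code)
%d | %i   decimal integer
%e | %E   number in scientific notation
%f        floating-point number
%g | %G   number in scientific or floating-point notation, whichever is shorter
%s        string
%u        unsigned integer
%%        a literal %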
[Linux85] #awk 'BEGIN{num1=20;num2=30; printf "%d %d\n",num1,num2}'
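20 30
[Linux85] #
# printf prints the format string with each specifier replaced by the corresponding item, so specifiers and items must match one to one.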
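
As a rough illustration of several specifiers in one call (the values are arbitrary):

```shell
# One item per specifier, in order; %% takes no item
formatted=$(awk 'BEGIN { printf "%c|%d|%s|%.2f|%e|%%\n", 65, 42.9, "awk", 3.14159, 1234.5 }')
printf '%s\n' "$formatted"
```

Here %c turns the number 65 into the character A, and %d drops the fractional part of 42.9.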


Modifiers

N    field width
-    left-justify
+    always show the sign of a numeric value
[Linux85] #awk -F: '{printf "%-14s %s\n",$1,$NF}' /etc/passwd
root           /bin/bash
bin            /sbin/nologin
daemon         /sbin/nologin
adm            /sbin/nologin
lp             /sbin/nologin
sync           /bin/sync
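
The width, - and + modifiers can be compared side by side; brackets are added here only to make the padding visible:

```shell
padded=$(awk 'BEGIN { printf "[%5d][%-5d][%+d][%+d][%-6s][%6s]\n", 42, 42, 42, -7, "awk", "awk" }')
printf '%s\n' "$padded"
```

Without - the value is right-justified inside the field width; with - it is left-justified.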


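3. Built-in data variables

NR        number of input records processed so far; with several input files, the count runs on across all of them
NF        number of fields in the current record
FNR       like NR, but counted per file: the record number within the current input file
ARGV      array holding the command-line arguments; for awk '{print $0}' a.txt b.txt, ARGV[0] is awk and ARGV[1] is a.txt
ARGC      number of command-line arguments (the number of elements in ARGV)
FILENAME  name of the file currently being processed
ENVIRON   associative array of the environment variables and their values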
[Linux85] #awk '{print NR,$0}' 1.txt
1 one line
2 two line
3 three line
4 four line
5 five line
[Linux85] #awk '{print NR,$0}' 2.txt
1 six line
2 seven line
3 eight line
4 nine line
5 ten line
[Linux85] #awk '{print NR,$0}' 1.txt 2.txt
1 one line
2 two line
3 three line
4 four line
5 five line
6 six line
7 seven line
8 eight line
9 nine line
10 ten line
[Linux85] #
[Linux85] #awk '{print FNR,$0}' 1.txt 2.txt
1 one line
2 two line
3 three line
4 four line
5 five line
1 six line
2 seven line
3 eight line
4 nine line
5 ten line
[Linux85] #
[Linux85] #awk -F: '/root/{print $1,"is a user in",ARGV[1]}' /etc/passwd
root is a user in /etc/passwd
operator is a user in /etc/passwd
[Linux85] #
[Linux85]#awk  'BEGIN{print ARGC}'  /etc/passwd /etc/group /etc/shadow
4
[Linux85]#
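ARGC is 4 here because ARGV[0] (awk itself) is counted together with the three file names; the program text 'BEGIN{print ARGC}' is not stored in ARGV.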
[Linux85] #awk '{print $0,"in",  FILENAME}' 1.txt 2.txt
one line in 1.txt
two line in 1.txt
three line in 1.txt
four line in 1.txt
five line in 1.txt
six line in 2.txt
seven line in 2.txt
eight line in 2.txt
nine line in 2.txt
ten line in 2.txt
[Linux85] #
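
ENVIRON is the one variable in the list without an example. A minimal sketch, assuming a made-up environment variable DEMO_USER and two throwaway files, prints it next to FILENAME, NR, FNR and NF:

```shell
# DEMO_USER, a.txt and b.txt are hypothetical names used only for this sketch
tmpdir=$(mktemp -d)
printf 'one\ntwo\n' > "$tmpdir/a.txt"
printf 'three\n'    > "$tmpdir/b.txt"

report=$(cd "$tmpdir" && DEMO_USER=soul awk '{ print ENVIRON["DEMO_USER"], FILENAME, NR, FNR, NF }' a.txt b.txt)
printf '%s\n' "$report"

rm -rf "$tmpdir"
```

NR keeps counting into the second file, while FNR starts again at 1.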


4. Output redirection

print items > output-file

print items >> output-file

print items | command


Special file names:

  • /dev/stdin: standard input

  • /dev/stdout: standard output

  • /dev/stderr: standard error

  • /dev/fd/N: a specific file descriptor; for example, /dev/stdin is the same as /dev/fd/0
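
The three redirection forms might be combined as in the sketch below. The sample data and the file names (fruit.txt, big.txt and so on) are invented for illustration. Inside awk the target is a string expression, so a literal file name or command must be quoted.

```shell
tmpdir=$(mktemp -d)
printf 'pear 3\napple 5\nfig 1\n' > "$tmpdir/fruit.txt"

awk -v big="$tmpdir/big.txt" '
    $2 >= 3 { print $1 > big }        # write matching names to a file
            { print $1 | "sort" }     # send every name through sort
    END     { close("sort"); print "done" > "/dev/stderr" }
' "$tmpdir/fruit.txt" > "$tmpdir/sorted.txt" 2> "$tmpdir/err.txt"

big_names=$(cat "$tmpdir/big.txt")
sorted_names=$(cat "$tmpdir/sorted.txt")
err_text=$(cat "$tmpdir/err.txt")
printf '%s\n' "$big_names" "$sorted_names" "$err_text"

rm -rf "$tmpdir"
```

Within one awk run, > truncates the file only when it is first opened; later print statements to the same file append to it.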


5. awk operators

Arithmetic operators:
-x      negation
+x      unary plus; converts x to a number
x^y     exponentiation
x**y    exponentiation
x*y     multiplication
x/y     division
x+y     addition
x-y     subtraction
x%y     remainder

Assignment operators:
=  +=  -=  *=  /=  %=  ^=  **=  ++  --

Comparison operators:
x < y                True if x is less than y.
x <= y               True if x is less than or equal to y.
x > y                True if x is greater than y.
x >= y               True if x is greater than or equal to y.
x == y               True if x is equal to y.
x != y               True if x is not equal to y.
x ~ y                True if the string x matches the regexp denoted by y.
x !~ y               True if the string x does not match the regexp denoted by y.
subscript in array   True if the array array has an element with the subscript subscript.

In awk, any nonzero number or non-empty string is true; zero and the empty string are false.


Conditional expression:

selector ? if-true-exp : if-false-exp
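
A few of these operators in action, as a small sketch with arbitrary values:

```shell
results=$(awk 'BEGIN {
    x = 7
    print x % 2, 2 ^ 10                  # remainder and exponentiation
    x += 3; x++                          # assignment operators: x is now 11
    print x
    hit  = ("soul" ~ /^so/)              # a regexp match yields 1 or 0
    miss = ("soul" !~ /^so/)
    print hit, miss
    print (x > 10 ? "big" : "small")     # conditional expression
}')
printf '%s\n' "$results"
```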


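6. Patterns and common pattern types

Program structure:

awk 'program' input-file1 input-file2 ...

program:

  • pattern { action }

  • pattern { action }

  • ....


Common patterns:

Regexp       a regular expression, written /regular expression/
Expression   an expression; it matches when its value is nonzero or a non-empty string, for example $1 ~ /foo/ or $1 == "soul". The operators ~ (matches) and !~ (does not match) test a string against a regexp.
Ranges       a range of records, written pat1,pat2
BEGIN/END    special patterns that run once before any input is read and once after all input has been read
Empty        the empty pattern matches every input line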
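
The pattern types can be seen together in one program. This is only a sketch over invented input; the START and END markers are part of the sample data, not awk keywords.

```shell
matched=$(printf 'intro\nSTART\nalpha 5\nbeta 12\nEND\noutro\n' | awk '
    BEGIN              { print "begin" }         # runs before any input
    /START/,/END/      { print "range:", $0 }    # range pattern
    NF == 2 && $2 > 10 { print "expr:", $1 }     # expression pattern
    END                { print "total:", NR }    # runs after all input
')
printf '%s\n' "$matched"
```

Every rule is tested against every record, so a line such as "beta 12" triggers both the range rule and the expression rule.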


Common actions

  • Expressions

  • Control statements

  • Compound statements

  • Input statements

  • Output statements


7. Control statements

  • if-else

   Syntax: if (condition) {then-body} [else {else-body}]

[Linux85] #awk -F : 'BEGIN{OFS=":"}{if ($3==0) {print $1,"Administrator";} else {print $1,"Common User"}}' /etc/passwd
root:Administrator
bin:Common User
daemon:Common User
adm:Common User
lp:Common User
sync:Common User
shutdown:Common User
[Linux85] #awk -F: '{if ($1=="root") printf "%-15s: %s\n",$1,"Admin";else printf "%-15s: %s\n",$1,"Common User"}' /etc/passwd
root           : Admin
bin            : Common User
daemon         : Common User
adm            : Common User
lp             : Common User
sync           : Common User
shutdown       : Common User
halt           : Common User
mail           : Common User
uucp           : Common User
operator       : Common User
games          : Common User
gopher         : Common User
ftp            : Common User
nobody         : Common User
dbus           : Common User
usbmuxd        : Common User
[Linux85] #awk -F: -v sum=0 '{if ($3>=500) sum++}END{print sum}' /etc/passwd
3
[Linux85] #
# Counts the users whose UID is at least 500
  • while

   Syntax: while (condition) {statement1; statement2; ...}

[Linux85] #awk -F : '{i=1;while (i<=3) {print $i;i++}}' /etc/passwd
root
x
0
bin
x
1
# Prints the first three fields of each line of /etc/passwd
[Linux85] #awk -F: '{i=1;while (i<=NF) { if (length($i)>=4) {print $i}; i++ }}' /etc/passwd
root
root
/root
/bin/bash
/bin
/sbin/nologin
  • do-while: the loop body runs at least once, whether or not the condition holds

   Syntax: do {statement1; statement2; ...} while (condition)

[Linux85] #awk -F: '{i=1;do {print $i;i++}while(i<=3)}' /etc/passwd
root
x
0
bin
x
1
daemon
x
2
[Linux85] #awk -F: '{i=4;do {print $i;i--}while(i>4)}' /etc/passwd
0
1
2
4
7
0
0
0
12
  • for

   Syntax: for (initialization; condition; increment) {statement1; statement2; ...}

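[Linux85] #awk -F: '{for(i=1;i<=3;i++) if (i<3){printf "%s:",$i} print $i}' /etc/passwd
root:x:0
bin:x:1
daemon:x:2
adm:x:4
lp:x:7
sync:x:0
shutdown:x:0
# Note: print $i is outside the loop body, so it runs after the loop has ended, when i is 4; the last column is therefore field 4 (the GID), not field 3.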
  • for loop over the elements of an array

   Syntax: for (i in array) {statement1; statement2; ...}

[Linux85] #awk -F: '$NF!~/^$/{BASH[$NF]++}END{for(A in BASH){printf "%15s:%i\n",A,BASH[A]}}' /etc/passwd
 /sbin/shutdown:1
       /bin/csh:1
      /bin/bash:2
  /sbin/nologin:29
     /sbin/halt:1
      /bin/sync:1
[Linux85] #
# Counts how many times each value of the last field (the login shell) occurs
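  • case

    Syntax: switch (expression) { case VALUE or /REGEXP/: statement1; statement2; ... default: statement1; ...}

    The switch statement is a gawk extension and is not available in every awk implementation.

  • break and continue

  • next

    Stops processing the current line immediately and moves on to the next line.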

[Linux85] #awk -F: '{if($3%2==0) next;print $1,$3}' /etc/passwd
bin 1
adm 3
sync 5
halt 7
operator 11
gopher 13
nobody 99
dbus 81
usbmuxd 113
vcsa 69
rtkit 499
abrt 173
postfix 89
rpcuser 29
pulse 497
soul 501
[Linux85] #
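
break and continue are listed above without an example. A minimal sketch over made-up numbers: each line is summed field by field, negative values are skipped, and the loop stops at the first zero.

```shell
sums=$(printf '3 -1 4 0 5\n2 2 0 9\n' | awk '{
    sum = 0
    for (i = 1; i <= NF; i++) {
        if ($i < 0) continue    # skip negative values, keep looping
        if ($i == 0) break      # leave the loop at the first zero
        sum += $i
    }
    print sum
}')
printf '%s\n' "$sums"
```

Unlike next, which abandons the whole record, break and continue only affect the innermost loop.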


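8. Arrays

array[index-expression]

  • index-expression can be any string. Note that referencing an element that does not exist yet makes awk create it and initialize it to the empty string; to test whether an element exists, use the form index in array instead.

  • To iterate over every element of an array, use this special form:

   for (var in array) { statement1; ... }

   Here var takes the value of each array subscript, not of the element.


To delete an element from an array: delete array[index]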
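
The difference between testing with in and referencing an element can be shown with a short sketch; the array name seen and its keys are arbitrary:

```shell
membership=$(awk 'BEGIN {
    seen["a"] = 1
    before = ("b" in seen)      # 0: the in test does not create the element
    tmp = seen["b"]             # a plain reference creates seen["b"] as ""
    after = ("b" in seen)       # 1: the element now exists
    delete seen["b"]
    deleted = ("b" in seen)     # 0 again after delete
    print before, after, deleted
}')
printf '%s\n' "$membership"
```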

[Linux85] #netstat -ant | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
LISTEN 10
[Linux85] #


9. awk built-in functions

  • split(string, array [, fieldsep [, seps ] ])

    Splits string into pieces using fieldsep as the separator and stores the pieces in the array named array; the array subscripts start at 1.

[Linux85] #df -lh | awk '!/^File/{split($5,percent,"%");if(percent[1]>=10){print $1}}'
/dev/sda1
/dev/mapper/vg0-usr
[Linux85] #
# Shows the filesystems whose usage is at least 10%
  • length([string]): returns the number of characters in string

[Linux85] #awk -F: '{for(i=1;i<=NF;i++) { if (length($i)>=4) {print $i}}}' /etc/passwd
root
root
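/root
/bin/bash
/bin
/sbin/nologin
daemon
daemon
/sbin
/sbin/nologin
  • substr(string, start [, length ])

    Returns the substring of string that begins at position start and is length characters long; positions are counted from 1.

  • system(command): runs the system command and returns its exit status to awk

  • systime(): returns the current system time

  • tolower(s): converts all letters in s to lowercase

  • toupper(s): converts all letters in s to uppercase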
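
Most of these functions can be tried in a single BEGIN block. The strings below are arbitrary, and systime() is left out because, as far as I know, it is a gawk extension that other awk implementations may lack.

```shell
func_out=$(awk 'BEGIN {
    s = "Hello, awk"
    print length(s)                     # number of characters
    print substr(s, 1, 5)               # five characters starting at position 1
    print toupper(s)
    print tolower(s)
    n = split("a:b:c", parts, ":")      # split returns the number of pieces
    print n, parts[1], parts[3]
    status = system("true")             # exit status of the command
    print status
}')
printf '%s\n' "$func_out"
```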


10. User-defined functions

User-defined functions are declared with the function keyword, in this form:


function F_NAME([variable])
{
    statements
}
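
A minimal sketch of the form above, using a hypothetical function named max that does not appear in the original notes:

```shell
largest=$(printf '3\n9\n4\n' | awk '
    # max is a made-up helper: it returns the larger of its two arguments
    function max(a, b) { return (a > b) ? a : b }
    NR == 1 { m = $1; next }        # start from the first value
            { m = max(m, $1) }      # keep the larger value seen so far
    END     { print m }
')
printf '%s\n' "$largest"
```

The function definition sits at the top level of the program, next to the pattern-action rules rather than inside one of them.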



example:

# Counts, for each client IP, the number of connections currently in the ESTABLISHED state
[Linux85] #netstat -tn | awk '/ESTABLISHED\>/{split($5,ip,":");num[ip[1]]++}END{for (i in num) printf "%s %d\n", i, num[i]}'
172.16.254.28 2
[Linux85] #


# Counts the processes in each state, as reported by ps aux
[Linux85] #ps aux | awk '!/^USER/{state[$8]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
S< 2
S<sl 1
Ss 18
SN 1
69
Ss+ 6
Ssl 2
R+ 1
S+ 2
Sl 2
S<s 1
[Linux85] #


# Counts the processes owned by each user, as reported by ps aux
[Linux85] #ps aux | awk '!/^USER/{state[$1]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
rpc 1
dbus 1
68 2
postfix 2
rpcuser 1
root 96
gentoo 2
[Linux85] #


# Shows the PID and command of each process whose VSZ (virtual memory size) is greater than 10000, as reported by ps aux
[Linux85] #ps aux | awk '!/USER/{if($5>10000) print $2,$11}'
1 /sbin/init
397 /sbin/udevd
1184 auditd
1209 /sbin/rsyslogd
1251 rpcbind
1282 dbus-daemon
1292 NetworkManager
1297 /usr/sbin/modem-manager
1311 rpc.statd
1344 cupsd
1354 /usr/sbin/wpa_supplicant
1392 hald


This article is reposted from Mr_陈's 51CTO blog. Original link: http://blog.51cto.com/chenpipi/1391178. To republish it, please contact the original author.