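A regular expression metacharacter is a character that does not stand for itself. Metacharacters have the special power to control the search pattern in some way (for example, to look for the pattern only at the beginning or end of a line, or only on lines that begin with an uppercase or lowercase letter). A metacharacter loses its special meaning when it is preceded by a backslash (\). For example, the dot metacharacter (.) stands for any single character except a newline, but with a backslash in front of it, it is reduced to an ordinary dot or period.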
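
The effect of escaping the dot can be seen in a small sketch. The sample strings and variable names below are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data: one string with a real period, one without.
my @names = ('a.b', 'axb');

# Unescaped dot: matches any single character except a newline.
my @any = grep { /^a.b$/ } @names;

# Escaped dot: matches only a literal period.
my @literal = grep { /^a\.b$/ } @names;

print "unescaped: @any\n";      # unescaped: a.b axb
print "escaped:   @literal\n";  # escaped:   a.b
```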


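A backslash in front of a metacharacter turns off its special meaning; a backslash in front of a digit or letter in a regular expression, on the other hand, gives that character a different meaning. Perl provides shorthand forms for some metacharacters, called metasymbols, which represent classes of characters. For example, [0-9] represents a digit in the range 0 to 9, and for ASCII text \d means the same thing. The only difference is that [0-9] uses the bracket metacharacters, whereas \d is a metasymbol.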
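
A minimal sketch of that equivalence, using a made-up ASCII sample string. Note that on Unicode text \d can also match digits outside 0-9 unless the /a modifier is used, so the two forms are only interchangeable for ASCII input:

```perl
use strict;
use warnings;

# Hypothetical sample text containing two runs of ASCII digits.
my $text = 'Room 42, floor 7';

# In list context, a /g match returns every match in the string.
my @with_class  = $text =~ /[0-9]+/g;  # bracketed character class
my @with_symbol = $text =~ /\d+/g;     # metasymbol shorthand

print "@with_class\n";   # 42 7
print "@with_symbol\n";  # 42 7
```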


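     | Metacharacter                | What it matches                                                        |
     |------------------------------+------------------------------------------------------------------------|
     | Single characters and digits |                                                                        |
     |------------------------------+------------------------------------------------------------------------|
     | .                            | Any single character except a newline                                  |
     | [a-z0-9]                     | Any single character in the set                                        |
     | [^a-z0-9]                    | Any single character not in the set                                    |
     | \d                           | A single digit                                                         |
     | \D                           | A non-digit character; same as [^0-9]                                  |
     | \w                           | An alphanumeric (word) character, including _                          |
     | \W                           | A non-alphanumeric (non-word) character                                |
     |------------------------------+------------------------------------------------------------------------|
     | Whitespace characters        |                                                                        |
     |------------------------------+------------------------------------------------------------------------|
     | \s                           | A whitespace character, such as a space, tab, or newline               |
     | \S                           | A non-whitespace character                                             |
     | \n                           | A newline                                                              |
     | \r                           | A carriage return                                                      |
     | \t                           | A tab                                                                  |
     | \f                           | A form feed                                                            |
     | \b                           | A backspace (only inside [])                                           |
     | \0                           | A null character                                                       |
     |------------------------------+------------------------------------------------------------------------|
     | Anchors                      |                                                                        |
     |------------------------------+------------------------------------------------------------------------|
     | \b                           | A word boundary (when not inside [])                                   |
     | \B                           | A non-word boundary                                                    |
     | ^                            | The beginning of a line                                                |
     | $                            | The end of a line                                                      |
     | \A                           | The beginning of the string                                            |
     | \Z                           | The end of the string, or just before a final newline                  |
     | \z                           | Only the very end of the string                                        |
     | \G                           | The position where the previous m//g left off                          |
     |------------------------------+------------------------------------------------------------------------|
     | Repetition                   |                                                                        |
     |------------------------------+------------------------------------------------------------------------|
     | x?                           | 0 or 1 occurrence of x                                                 |
     | x*                           | 0 or more occurrences of x                                             |
     | x+                           | 1 or more occurrences of x                                             |
     | (xyz)+                       | 1 or more occurrences of the pattern xyz                               |
     | x{m,n}                       | At least m and at most n occurrences of x                              |
     |------------------------------+------------------------------------------------------------------------|
     | Alternation                  |                                                                        |
     |------------------------------+------------------------------------------------------------------------|
     | (was|were|will)              | One of was, were, or will                                              |
     |------------------------------+------------------------------------------------------------------------|
     | Remembered (captured) groups |                                                                        |
     |------------------------------+------------------------------------------------------------------------|
     | (string)                     | Captures the match for backreferencing                                 |
     | \1 or $1                     | The text captured by the first set of parentheses                      |
     | \2 or $2                     | The text captured by the second set of parentheses                     |
     | \3 or $3                     | The text captured by the third set of parentheses                      |
     |------------------------------+------------------------------------------------------------------------|
     | Miscellaneous                |                                                                        |
     |------------------------------+------------------------------------------------------------------------|
     | \12                          | The character with that octal value, up to \377                        |
     | \x41                         | The character with that hexadecimal value (at most two hex digits)     |
     | \cX                          | A control character; e.g., \cC is <Ctrl>-C and \cV is <Ctrl>-V         |
     | \e                           | The ASCII ESC (escape) character, not the backslash                    |
     | \E                           | Marks the end of a case change or quoting started by \U, \L, or \Q     |
     | \l                           | Lowercases the next character only                                     |
     | \L                           | Lowercases characters until the end of the string or \E                |
     | \N                           | A named character, such as \N{greek:Beta}                              |
     | \p{PROPERTY}                 | Any character with the named property, such as \p{IsAlpha}             |
     | \P{PROPERTY}                 | Any character without the named property                               |
     | \Q                           | Quotes (escapes) metacharacters up to \E                               |
     | \u                           | Uppercases the next character only                                     |
     | \U                           | Uppercases characters until the end of the string or \E                |
     | \x{NUMBER}                   | The Unicode character whose code point is NUMBER, given in hexadecimal |
     | \X                           | A Unicode "combining character sequence" string                        |
     | \[                           | That metacharacter, literally (here, [)                                |
     | \\                           | A backslash                                                            |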
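
A few of the table entries can be combined in a short sketch. The sample strings and variable names are invented for illustration, and only a handful of the metacharacters are exercised:

```perl
use strict;
use warnings;

# Anchors plus {m,n}: keep strings made of 3 to 5 digits and nothing else.
my @codes = grep { /\A\d{3,5}\z/ } ('12', '123', '12345', '123456');
print "@codes\n";    # 123 12345

# Capture group and backreference: \1 inside the pattern, $1 after the match.
my $doubled = '';
if ('this is is a test' =~ /\b(\w+) \1\b/) {
    $doubled = $1;   # the word that appears twice in a row
}
print "$doubled\n";  # is

# \Q...\E: treat the interpolated dot as a literal character.
my $needle = 'a.b';
my @hits   = 'a.b axb' =~ /\Q$needle\E/g;
print scalar(@hits), "\n";  # 1
```

Using \A and \z instead of ^ and $ keeps the first check strict even when the input ends with a newline.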