Regular Expressions on Linux in Detail (Basic and Extended Regular Expressions, with Command Examples)

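Preface

Regular expressions are widely used. Most programming languages support them, and they are extremely useful on Linux as well.

With regular expressions you can efficiently filter out the text you need and then finish the task with a tool or language that supports them.

In this post we use grep/egrep to apply regular expressions. Tools such as sed would work too, but sed depends heavily on regular expressions, so this post comes first to prepare for the later sed post. Readers who need both may want to read the two together.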

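Types of regular expressions

Regular expressions are implemented by a regular expression engine, the underlying software that interprets regular expression patterns and uses them to match text.

On Linux, the two commonly used engines are:

- the POSIX Basic Regular Expression (BRE) engine

- the POSIX Extended Regular Expression (ERE) engine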
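
To make the difference between the two engines concrete, here is a minimal sketch, assuming a POSIX-style grep such as GNU grep. The three sample lines and the variable names are made up for illustration.

```shell
# Hypothetical sample input, one case per line.
sample=$(printf 'ab\naab\na+b\n')

# BRE (plain grep): an interval needs escaped braces, and a bare + is a literal plus sign.
bre_interval=$(printf '%s\n' "$sample" | grep 'a\{2\}b')
bre_plus=$(printf '%s\n' "$sample" | grep 'a+b')

# ERE (grep -E): braces and + are operators without any backslash.
ere_interval=$(printf '%s\n' "$sample" | grep -E 'a{2}b')
ere_plus=$(printf '%s\n' "$sample" | grep -E 'a+b')

printf 'BRE interval: %s\n' "$bre_interval"
printf 'BRE plus:     %s\n' "$bre_plus"
printf 'ERE interval: %s\n' "$ere_interval"
printf 'ERE plus:     %s\n' "$ere_plus"
```

The same text a+b is a literal string under BRE and "one or more a, then b" under ERE, which is why it matters which engine a command uses.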

Basic usage of basic regular expressions

Preparing the sample text

[root@service99 ~]# mkdir /opt/regular
[root@service99 ~]# cd /opt/regular
[root@service99 regular]# pwd
/opt/regular
[root@service99 regular]# cp /etc/passwd temp_passwd

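Plain text

A plain-text pattern matches the corresponding word exactly. Note that regular expression patterns are strictly case-sensitive.

# grep --color highlights the matched text, which makes the result easier to see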
[root@service99 regular]# grep --color "root" temp_passwd 
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

A pattern does not have to be a complete word: wherever the defined text appears in the data stream, the regular expression matches it.

[root@service99 regular]# ifconfig eth1 | grep --color "add"
eth1   Link encap:Ethernet HWaddr 54:52:01:01:99:02 
     inet addr:192.168.2.99 Bcast:192.168.2.255 Mask:255.255.255.0
     inet6 addr: fe80::5652:1ff:fe01:9902/64 Scope:Link

Nor is a pattern limited to a single word; the text string may also contain spaces and digits.

[root@service99 regular]# echo "This is line number 1" | grep --color "ber 1"
This is line number 1
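
The points above can be sketched in a few lines, assuming a grep that supports the -o option (GNU and BSD grep do; it is not required by POSIX). The sample sentence and variable names are made up for illustration.

```shell
line='Root and root are different words to grep'

# Case-sensitive (the default): only the lowercase occurrence is reported.
sensitive=$(printf '%s\n' "$line" | grep -o 'root')

# -i ignores case, so both occurrences are reported, one per output line.
insensitive=$(printf '%s\n' "$line" | grep -o -i 'root')

# A pattern may be a fragment that spans a space rather than a whole word.
fragment=$(printf '%s\n' "$line" | grep -o 'nd ro')

printf '%s\n' "$sensitive" "$insensitive" "$fragment"
```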

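Special characters

There is one thing to watch out for when using text strings in a regular expression pattern.

A few characters are exceptions: regular expressions give them a special meaning, so using them as ordinary text may not produce the result you expect.

The special characters recognized by regular expressions are:

.*[]^${}+?|()

Of these, {}+?|() act as operators only in extended regular expressions when written unescaped; in basic regular expressions they are literal when unescaped.

To use one of these special characters as an ordinary text character, you must escape it: put a special character in front of it that tells the regular expression engine to interpret the next character as ordinary text.

The special character that does this is the backslash (\).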

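[root@service99 regular]# echo "This cat is $4.99"  # double quotes do not protect $, so the shell expands $4 (the fourth positional parameter); it is unset here, so it expands to nothing
This cat is .99
[root@service99 regular]# echo "This cat is \$4.99"  # escape $ with "\"
This cat is $4.99
[root@service99 regular]# echo 'This cat is \$4.99'  # single quotes protect $ and also keep the backslash literally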
This cat is \$4.99
[root@service99 regular]# echo 'This cat is $4.99' 
This cat is $4.99
[root@service99 regular]# cat price.txt 
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
[root@service99 regular]# grep --color '\\' price.txt 
This is "\".
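
As a small sketch of escaping inside the pattern itself, the following compares an unescaped pattern, an escaped one, and grep -F, which treats the whole pattern as a fixed string. The sample lines and variable names are assumptions made for illustration.

```shell
prices=$(printf 'This price is $4.99\n$5.00\n4x99\n')

# Unescaped: the dot matches any character, so 4x99 matches as well.
loose=$(printf '%s\n' "$prices" | grep '4.99')

# Escaped: \. is a literal dot and \$ a literal dollar sign.
strict=$(printf '%s\n' "$prices" | grep '\$4\.99')

# grep -F takes the pattern as a fixed string, so nothing needs escaping.
fixed=$(printf '%s\n' "$prices" | grep -F '$4.99')

printf 'loose:\n%s\nstrict:\n%s\nfixed:\n%s\n' "$loose" "$strict" "$fixed"
```

Single quotes keep the shell from touching $ and \, so only the regular expression engine interprets them.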

Anchors

Anchoring at the start

The caret (^) anchors a pattern to the beginning of a text line in the data stream.

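[root@service99 regular]# grep --color '^h' price.txt  # lines that start with the letter h
hello,world!
[root@service99 regular]# grep --color '^$' price.txt  # no output: here $ is the end-of-line anchor, so ^$ matches only empty lines, and the file has none
[root@service99 regular]# grep --color '^\$' price.txt  # lines that start with a $ sign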
$5.00
[root@service99 regular]# echo "This is ^ test. " >> price.txt 
[root@service99 regular]# cat price.txt 
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color '^' price.txt  # on its own, ^ matches the start of every line, so everything is printed
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color '\^' price.txt  # a literal ^ at the very start of the pattern must be escaped
This is ^ test. 
[root@service99 regular]# grep --color 'is ^' price.txt  # when ^ is not the first character of a basic-regex pattern it is literal and needs no escaping
This is ^ test.
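
Here is a minimal sketch of the caret's two roles in a basic regular expression, using two made-up lines. Under grep -E the caret is generally treated as an anchor wherever it appears, so the unescaped mid-pattern form is best kept to plain grep.

```shell
text=$(printf 'This is ^ test.\nIs this a test?\n')

# Anchored: only lines that begin with "This".
at_start=$(printf '%s\n' "$text" | grep '^This')

# Unanchored: "this" may appear anywhere in the line.
anywhere=$(printf '%s\n' "$text" | grep 'this')

# In a BRE, a caret that is not the first character is an ordinary character.
literal=$(printf '%s\n' "$text" | grep 'is ^')

# Escaping makes the caret literal regardless of its position.
escaped=$(printf '%s\n' "$text" | grep '\^')

printf '%s\n' "$at_start" "$anywhere" "$literal" "$escaped"
```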

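Anchoring at the end

The dollar sign ($) is the end-of-line anchor: placing it after a text pattern means the line must end with that pattern.

[root@service99 regular]# grep --color '\.$' price.txt  # "." is also special in regular expressions and must be escaped; see the section on the dot character
This is "\".
[root@service99 regular]# grep --color '\. $' price.txt  # this line was appended with a trailing space, so the pattern needs one as well; be careful, because a space counts as a character in a regular expression
This is ^ test. 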
[root@service99 regular]# grep --color '0$' price.txt 
$5.00
[root@service99 regular]# grep --color '9$' price.txt 
This price is $4.99
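
The trailing-space pitfall shown above can be turned into a quick check. This is only a sketch with made-up sample lines and variable names.

```shell
lines=$(printf 'ends with a dot.\nends with a space. \nno dot at all\n')

# Lines whose very last character is a literal dot.
dot_last=$(printf '%s\n' "$lines" | grep '\.$')

# Lines whose very last character is a space, which is easy to miss by eye.
space_last=$(printf '%s\n' "$lines" | grep ' $')

printf '[%s]\n' "$dot_last" "$space_last"
```

Printing the results inside brackets makes the invisible trailing space visible.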

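Combining anchors

The most common combination is "^$", which matches an empty line.

It is often combined with "^#", because # starts a comment in most Linux configuration files.

Together they print only the effective settings of a file: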

[root@service99 regular]# cat -n /etc/vsftpd/vsftpd.conf | wc -l
121
[root@service99 regular]# grep -vE '^#|^$' /etc/vsftpd/vsftpd.conf  # -v inverts the match; -E enables extended regular expressions, and "|" is an extended-regex operator covered in the extended section
anonymous_enable=YES
local_enable=YES
write_enable=YES
local_umask=022
anon_upload_enable=YES
anon_mkdir_write_enable=YES
anon_other_write_enable=YES
anon_umask=022
dirmessage_enable=YES
xferlog_enable=YES
connect_from_port_20=YES
xferlog_std_format=YES
listen=YES
pam_service_name=vsftpd
userlist_enable=YES
tcp_wrappers=YES
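
The pattern above misses comments that are indented and lines that contain only spaces or tabs. A slightly more tolerant variant, sketched here against a made-up temporary file rather than the real vsftpd.conf, allows optional leading whitespace:

```shell
# Hypothetical config file created only for this demonstration.
conf=$(mktemp)
printf '# comment\n\nlisten=YES\n   # indented comment\n\t\nlocal_enable=YES\n' > "$conf"

# The pattern from the article: drops only lines that start with # or are empty.
basic=$(grep -vE '^#|^$' "$conf")

# Variant: optional leading whitespace, then either # or the end of the line.
robust=$(grep -vE '^[[:space:]]*(#|$)' "$conf")

rm -f "$conf"
printf 'basic:\n%s\nrobust:\n%s\n' "$basic" "$robust"
```

Whether the stricter or the more tolerant form is right depends on how the configuration file in question treats indented comments.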

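Repetition counts

\{n,m\}  the preceding character occurs between n and m times

\{n,\}  the preceding character occurs at least n times

\{n\}  the preceding character occurs exactly n times

In basic regular expressions (plain grep) the braces must be escaped as shown; with grep -E or egrep they are written as {n,m}, {n,} and {n}.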

[root@service99 regular]# grep --color "12345\{0,1\}" price.txt 
1234556
[root@service99 regular]# grep --color "12345\{0,2\}" price.txt 
1234556
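
Because grep prints the whole line, both commands above show the same output. A sketch with the -o option (available in GNU and BSD grep, an assumption about the environment) shows which part of the line each interval actually matched; the variable names are made up.

```shell
# The interval applies only to the preceding character, here the 5.
zero_or_one=$(echo 1234556 | grep -o '12345\{0,1\}')
up_to_two=$(echo 1234556 | grep -o '12345\{0,2\}')

# At least three 5s are required, but the line has only two in a row.
if echo 1234556 | grep -q '12345\{3,\}'; then
    three_plus=match
else
    three_plus=nomatch
fi

printf '%s\n' "$zero_or_one" "$up_to_two" "$three_plus"
```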

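The dot character

The dot matches any single character except a newline, but it must match a character: if there is no character at the dot's position, the match fails.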

[root@service99 regular]# grep --color ".s" price.txt 
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color ".or" price.txt 
hello,world!
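
A short sketch of the rule that the dot must consume exactly one character, again assuming a grep with -o; the inputs are made up.

```shell
# "or" alone has nothing in front of the o for the dot to match.
if printf 'or\n' | grep -q '.or'; then
    bare=match
else
    bare=nomatch
fi

# Any single character satisfies the dot, including a space.
with_letter=$(printf 'world\n' | grep -o '.or')
with_space=$(printf 'a or b\n' | grep -o '.or')

printf '[%s] [%s] [%s]\n' "$bare" "$with_letter" "$with_space"
```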

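Character classes

A character class defines a set of characters that can match at one position in a text pattern. If any character in the class appears at that position in the data stream, the pattern matches.
To define a character class, enclose all the characters that belong to it in square brackets, then use the whole class in the pattern just like any other wildcard.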

[root@service99 regular]# grep --color "[abcdsxyz]" price.txt 
This price is $4.99
hello,world!
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "[sxyz]" price.txt 
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "[abcd]" price.txt 
This price is $4.99
hello,world!
[root@service99 regular]# grep --color "Th[ais]" price.txt  # the character right after Th must be one of a, i, s
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep -i --color "th[ais]" price.txt  # -i ignores case
This price is $4.99
This is "\".
This is ^ test.

If you are not sure about the case of a character, use a pattern like this:

[root@service99 regular]# echo "Yes" | grep --color "[yY]es"  # the order of the characters inside [] does not matter
Yes
[root@service99 regular]# echo "yes" | grep --color "[Yy]es"
yes

A single expression can use more than one character class:

[root@service99 regular]# echo "Yes/no" | grep "[Yy][Ee]"
Yes/no
[root@service99 regular]# echo "Yes/no" | grep "[Yy].*[Nn]"  # the use of * is covered in the asterisk section
Yes/no

Character classes work for digits as well:

[root@service99 regular]# echo "My phone number is 123456987" | grep --color "is [1234]"
My phone number is 123456987
[root@service99 regular]# echo "This is Phone1" | grep --color "e[1234]"
This is Phone1
[root@service99 regular]# echo "This is Phone1" | grep --color "[1]"
This is Phone1

Another very common use of character classes is matching words that may be misspelled:

[root@service99 regular]# echo "regular" | grep --color "r[ea]g[ua]l[ao]"
regular
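
Combining several character classes with both anchors gives a simple input check. The helper name is_yes below is hypothetical and exists only for this sketch.

```shell
# is_yes (hypothetical helper): succeeds when its argument is exactly
# "yes" written in any mix of upper and lower case.
is_yes() {
    printf '%s\n' "$1" | grep -q '^[Yy][Ee][Ss]$'
}

for answer in yes Yes YES yesterday no; do
    if is_yes "$answer"; then
        echo "$answer: accepted"
    else
        echo "$answer: rejected"
    fi
done
```

The anchors matter here: without ^ and $, "yesterday" would also be accepted because it contains "yes".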

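Negated character classes

To find characters that are not in a character class, put a caret (^) at the start of the class.

Even when negated, the character class must still match one character.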

[root@service99 regular]# cat price.txt 
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
this is ^ test. 
cat
car
[root@service99 regular]# sed -n '/[^t]his/p' price.txt 
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "[^t]his" price.txt 
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "ca[tr]" price.txt 
cat
car
[root@service99 regular]# grep --color "ca[^r]" price.txt 
cat
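
The rule that a negated class must still match one character is easy to overlook, so here is a small sketch. The helper name check is made up for illustration.

```shell
# check (hypothetical helper): reports whether its argument matches ca[^r].
check() {
    if printf '%s\n' "$1" | grep -q 'ca[^r]'; then
        echo match
    else
        echo nomatch
    fi
}

check 'cat'   # t is a character other than r
check 'car'   # r is excluded by the class
check 'ca'    # nothing follows ca, so the class has no character to match
check 'ca '   # a trailing space is a character other than r
```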

Using ranges

When the characters you need to match are numerous and follow a regular order, you can use a range:

[root@service99 regular]# cat price.txt 
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
this is ^ test. 
cat
car
1234556
911
11806
[root@service99 regular]# egrep --color '[a-z]' price.txt 
This price is $4.99
hello,world!
This is "\".
this is ^ test. 
cat
car
[root@service99 regular]# egrep --color '[A-Z]' price.txt 
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "[0-9]" price.txt 
This price is $4.99
$5.00
1234556
911
11806

[root@service99 regular]# sed -n '/^[^a-Z]/p' price.txt 
$5.00
#$#$
1234556
911
11806
[root@service99 regular]# grep --color "^[^a-Z]" price.txt 
$5.00
#$#$
1234556
911
11806
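[root@service99 regular]# echo $LANG  # [a-Z] depends on the collation order of the LANG locale, so check its value; if you change it, make sure the new value is a valid locale. [a-zA-Z] or [[:alpha:]] is the portable way to write this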
zh_CN.UTF-8 
[root@service99 regular]# LANG=en_US.UTF-8
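
Since [a-Z] only works under some locales, a locale-independent sketch of the same filter may be useful. It pins the locale to C for the grep call and uses either two explicit ranges or a POSIX class; the sample data and variable names are made up.

```shell
data=$(printf 'abc\nXYZ\n123\n$5.00\n')

# Two explicit ranges behave the same in every locale once LC_ALL=C is set.
by_range=$(printf '%s\n' "$data" | LC_ALL=C grep '^[^a-zA-Z]')

# The POSIX class [:alpha:] avoids ranges altogether.
by_class=$(printf '%s\n' "$data" | LC_ALL=C grep '^[^[:alpha:]]')

printf 'range:\n%s\nclass:\n%s\n' "$by_range" "$by_class"
```

In the C locale, a range such as [a-Z] is typically rejected as an invalid range, because a sorts after Z there.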

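Special character classes

These match specific types of characters.

[[:blank:]] space and tab characters

[[:cntrl:]] control characters

[[:graph:]] visible characters (printable characters other than the space)

[[:space:]] all whitespace characters

[[:print:]] printable characters (including the space)

[[:xdigit:]] hexadecimal digits

[[:punct:]] all punctuation characters

[[:lower:]] lowercase letters

[[:upper:]] uppercase letters

[[:alpha:]] uppercase and lowercase letters

[[:digit:]] digits

[[:alnum:]] digits and uppercase and lowercase letters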
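
A few of these classes in use, as a minimal sketch over made-up sample lines. Note that the class name, including its own brackets and colons, goes inside the outer brackets of a bracket expression.

```shell
sample=$(printf 'abc\nABC 123\nhello, world!\n  \n')

# Lines that contain at least one digit.
with_digit=$(printf '%s\n' "$sample" | grep '[[:digit:]]')

# Lines that contain at least one punctuation character.
with_punct=$(printf '%s\n' "$sample" | grep '[[:punct:]]')

# Lines that contain at least one uppercase letter.
with_upper=$(printf '%s\n' "$sample" | grep '[[:upper:]]')

# Number of lines that are empty or consist only of whitespace.
blank_count=$(printf '%s\n' "$sample" | grep -c '^[[:space:]]*$')

printf '%s\n' "$with_digit" "$with_punct" "$with_upper" "$blank_count"
```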

The asterisk

An asterisk after a character means that the character may appear zero or more times in the text that matches the pattern.

[root@service99 regular]# cat test.info 
goole
go go go
come on
goooooooooo
[root@service99 regular]# grep --color "o*" test.info 
goole
go go go
come on
goooooooooo
[root@service99 regular]# grep --color "go*" test.info 
goole
go go go
goooooooooo
[root@service99 regular]# grep --color "w.*d" price.txt  # often combined with .
hello,world!
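
Because "zero times" counts as a match, a pattern such as o* matches every line, which explains the first result above. The sketch below, with made-up words, contrasts it with the common idiom of writing the character twice to require at least one occurrence.

```shell
words=$(printf 'gle\ngoole\ncome on\n')

# g followed by zero or more o: "gle" matches too.
zero_or_more=$(printf '%s\n' "$words" | grep 'go*')

# g, one o, then zero or more o: at least one o is required.
one_or_more=$(printf '%s\n' "$words" | grep 'goo*')

# o* alone can match the empty string, so every line counts as a match.
all_lines=$(printf '%s\n' "$words" | grep -c 'o*')

printf '%s\n' "$zero_or_more" "$one_or_more" "$all_lines"
```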

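Extended regular expressions

The question mark

A question mark means the preceding character may appear zero times or once. It does not match repeated occurrences of that character.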

[root@service99 regular]# egrep --color "91?" price.txt 
This price is $4.99
911
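
A sketch with an optional letter shows the "zero or one" rule more clearly than the digits above. It uses grep -E, which is equivalent to egrep; the word list and variable names are made up.

```shell
words=$(printf 'color\ncolour\ncolouur\n')

# ERE: the u is optional, but it may not repeat.
ere=$(printf '%s\n' "$words" | grep -E '^colou?r$')

# BRE: an unescaped ? is a literal question mark, so nothing matches here.
# "|| true" keeps the non-matching grep from counting as a script failure.
bre=$(printf '%s\n' "$words" | grep '^colou?r$' || true)

printf 'ERE:\n%s\nBRE:\n%s\n' "$ere" "$bre"
```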

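The plus sign

A plus sign means the preceding character may appear once or more, but must appear at least once; if the character is absent, the pattern does not match.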

[root@service99 regular]# egrep --color "9+" price.txt 
This price is $4.99
911
[root@service99 regular]# egrep --color "1+" price.txt 
1234556
911
11806
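
Two typical uses of the plus sign, sketched with made-up sample lines and assuming a grep with -o: extracting each run of a character, and requiring a whole line to consist of digits.

```shell
nums=$(printf '911\n1234556\n4.99\nabc\n')

# Each run of one or more 9s, printed on its own line.
runs_of_nine=$(printf '%s\n' "$nums" | grep -Eo '9+')

# Lines made up entirely of digits: at least one digit, and nothing else.
all_digits=$(printf '%s\n' "$nums" | grep -E '^[0-9]+$')

printf 'runs:\n%s\nall digits:\n%s\n' "$runs_of_nine" "$all_digits"
```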

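Using braces

Braces put a limit on how many times a regular expression may repeat; this is usually called an interval.

- {m}: the preceding regular expression appears exactly m times

- {m,n}: the preceding regular expression appears at least m and at most n times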

[root@service99 regular]# echo "This is test,test is file." | egrep --color "test{0,1}"
This is test,test is file.
[root@service99 regular]# echo "This is test,test is file." | egrep --color "is{1,2}"
This is test,test is file.
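
In the examples above the interval applies only to the last character (the final t or s), not to the whole word. The sketch below makes that visible with -o and shows how parentheses extend the interval to a group; the inputs are made up.

```shell
# test{0,1} means "tes" followed by zero or one t.
last_char=$(echo 'tes test testt' | grep -Eo 'test{0,1}')

# (test){2} repeats the whole group twice.
grouped=$(echo 'testtest test' | grep -Eo '(test){2}')

printf 'last character only:\n%s\ngrouped:\n%s\n' "$last_char" "$grouped"
```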

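Regular expression exercises

Here is a set of exercises that practice the basic regular expressions.
Regular expressions look fairly simple when you only read the concepts or theory, but they are not so easy to use in practice; once you are fluent with them, the gain in efficiency is considerable.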

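1. Find the lines in the downloaded file that contain the keyword the

grep --color "the" regular_express.txt

2. Find the lines in the downloaded file that do not contain the keyword the

grep --color -vn "the" regular_express.txt

3. Find the keyword the in the downloaded file regardless of case

grep --color -in "the" regular_express.txt

4. Find the two words test or taste

grep --color -En 'test|taste' regular_express.txt 
grep --color -i "t[ae]ste\{0,1\}" regular_express.txt  # looser: also matches tast and teste, and -i ignores case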

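5. Find strings that contain oo

grep --color "oo" regular_express.txt

6. Find oo that is not preceded by g

grep --color "[^g]oo" regular_express.txt  # keep the whole pattern quoted so the shell cannot expand [^g] as a glob

7. Find oo that is not preceded by a lowercase letter

egrep --color "[^a-z]oo" regular_express.txt

8. Find the lines that contain a digit

egrep --color '[0-9]' regular_express.txt

9. Find the lines that start with the

egrep --color '^the' regular_express.txt

10. Find the lines that start with a lowercase letter

egrep --color '^[a-z]' regular_express.txt

11. Find the lines that do not start with an English letter

egrep --color '^[^a-zA-Z]' regular_express.txt  # [a-Z] is locale-dependent; [a-zA-Z] is portable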

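12. Find the lines that end with a period (.)

egrep --color '\.$' regular_express.txt  # the $ anchor goes at the end of the pattern, after the escaped dot

13. Find blank lines

egrep --color '^$' regular_express.txt

14. Find strings of the form g??d

egrep --color "g..d" regular_express.txt

15. Find strings with at least two consecutive o's

egrep --color "ooo*" regular_express.txt 
egrep --color 'o{2,}' regular_express.txt  # egrep uses extended syntax, so the braces are not escaped

16. Find strings that start and end with g and have at least one o, and nothing else, between the two g's

egrep --color 'go{1,}g' regular_express.txt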

17. Find the lines that contain any digit

egrep --color '[0-9]' regular_express.txt

18. Find strings with two o's

egrep --color "oo" regular_express.txt

19. Find strings where g is followed by 2 to 5 o's and then another g

egrep --color 'go{2,5}g' regular_express.txt

20. Find strings where g is followed by 2 or more o's

egrep --color 'go{2,}' regular_express.txt
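
The article does not include regular_express.txt itself, so to try the exercises you need some input. The sketch below builds a small stand-in file whose contents are entirely made up, then runs a handful of the exercises against it with grep -E (equivalent to egrep).

```shell
# Hypothetical stand-in for regular_express.txt; the real file is not shown in the article.
f=$(mktemp)
cat > "$f" <<'EOF'
I can't finish the test.
Oh! The soup tastes good.
google is a search engine
goooooogle yes!
good food.
# 12345 lines

glad to go.
EOF

the_count=$(grep -c 'the' "$f")               # exercise 1: lines containing the
any_case=$(grep -ic 'the' "$f")               # exercise 3: the in any case
with_digit=$(grep -E '[0-9]' "$f")            # exercise 8: lines with a digit
blank=$(grep -c '^$' "$f")                    # exercise 13: blank lines
at_least_one=$(grep -E 'go{1,}g' "$f")        # exercise 16: g, one or more o, g
two_to_five=$(grep -E 'go{2,5}g' "$f")        # exercise 19: g, 2 to 5 o, g

rm -f "$f"
printf '%s\n' "$the_count" "$any_case" "$with_digit" "$blank" "$at_least_one" "$two_to_five"
```

The fourth sample line has six o's between the two g's, so it satisfies exercise 16 but not exercise 19.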

That is all for this article. I hope it helps with your studies, and I hope you will continue to support 脚本之家.
