Regular Expression Matching: Functions

I. The regex module:

import re

II. Regex functions:

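1. The re.match function:

re.match(pattern, string, flags=0)
Tries to match the pattern at the start of the string. If the pattern does not match at the starting position, match() returns None.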

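1. pattern: the regular expression to match
2. string: the string to match against
3. flags: optional flags that control how matching is done, such as case sensitivity (re.I) or multi-line matching (re.M)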
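
To make the flags parameter concrete, here is a minimal sketch using two standard flags, re.I (ignore case) and re.M (multi-line). The sample strings are made up for illustration.

```python
import re

# Without flags, matching is case-sensitive, so 'hello' does not match 'Hello'.
no_flag = re.match(r'hello', 'Hello World')

# re.I makes the match case-insensitive.
ignore_case = re.match(r'hello', 'Hello World', re.I)

# re.M lets ^ match at the start of every line, not only the start of the string.
line_starts = re.findall(r'^\w+', 'one two\nthree four', re.M)

print(no_flag)              # None
print(ignore_case.group())  # Hello
print(line_starts)          # ['one', 'three']
```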

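On success, re.match returns a match object; otherwise it returns None. Use the match object's group() or groups() methods to get the matched text.
- group(num=0) returns the string matched by the whole expression. group() can also take several group numbers at once, in which case it returns a tuple of the values for those groups.
- groups() returns a tuple containing the strings of all subgroups, from group 1 up to the last group.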
```
content = 'Hello 123 4567 World_This is a Regex Demo'
result = re.match('[a-zA-Z0-9 _]*', content)
# On Python 3.7+ the object prints as re.Match; older versions show _sre.SRE_Match.
print(result)          # <re.Match object; span=(0, 41), match='Hello 123 4567 World_This is a Regex Demo'>
print(result.group())  # Hello 123 4567 World_This is a Regex Demo
print(result.span())   # (0, 41)
```
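
The example above has no capture groups, so here is a small sketch of how group() with several group numbers and groups() behave. The pattern and sample string are made up for illustration.

```python
import re

# Two parenthesized groups: a word, then a number.
m = re.match(r'(\w+) (\d+)', 'Hello 123 World')

whole = m.group()        # same as m.group(0): the entire match
first = m.group(1)       # text captured by group 1
both = m.group(1, 2)     # several group numbers at once give a tuple
all_groups = m.groups()  # every group, starting from group 1

print(whole)       # Hello 123
print(first)       # Hello
print(both)        # ('Hello', '123')
print(all_groups)  # ('Hello', '123')
```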

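2. The re.findall function

re.findall(pattern, string, flags=0)
Finds every substring of the string that matches the regular expression and returns them as a list. If nothing matches, it returns an empty list.

Note: match and search stop after one match; findall returns all matches.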
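
A minimal sketch of the difference, using a made-up input string. It also shows that when the pattern contains one capture group, findall returns the group contents instead of the whole match.

```python
import re

text = 'a1 b22 c333'  # made-up sample input

first_only = re.search(r'\d+', text)        # stops at the first match
everything = re.findall(r'\d+', text)       # collects every non-overlapping match
nothing = re.findall(r'x\d+', text)         # no match gives an empty list, not None
letters = re.findall(r'([a-z])\d+', text)   # one group: the list holds the group text

print(first_only.group())  # 1
print(everything)          # ['1', '22', '333']
print(nothing)             # []
print(letters)             # ['a', 'b', 'c']
```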

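3. The re.split function

re.split(pattern, string, maxsplit=0, flags=0)
Splits the string at every place the pattern matches and returns the pieces as a list. maxsplit limits the number of splits; the default 0 means no limit.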
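
Since the notes give no example for re.split, here is a short sketch with a made-up input string and separator pattern.

```python
import re

text = 'apple, banana;cherry  date'  # made-up sample input

# Split on a comma or semicolon (plus optional spaces), or on runs of whitespace.
parts = re.split(r'[,;]\s*|\s+', text)

# maxsplit=2 performs at most two splits; the remainder stays in the last item.
limited = re.split(r'[,;]\s*|\s+', text, maxsplit=2)

print(parts)    # ['apple', 'banana', 'cherry', 'date']
print(limited)  # ['apple', 'banana', 'cherry  date']
```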

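4. The re.sub function

re.sub(pattern, repl, string, count=0, flags=0)
Replaces the parts of the string that match the pattern.

pattern: the regular expression, usually written as a raw string r''
repl: the replacement string (it may also be a function that takes a match object)
string: the string to search in

count: the maximum number of replacements. The default is 0, which replaces every match.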

dd = 'attn: X\nDear X,'
print(re.sub(r'X', 'Mr.smith', dd))
# Result:
# attn: Mr.smith
# Dear Mr.smith,
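
As a further sketch, the same input can be used to show the count parameter, and a made-up second string shows repl given as a function. The helper name double is hypothetical.

```python
import re

dd = 'attn: X\nDear X,'

# count=1 replaces only the first occurrence.
first_only = re.sub(r'X', 'Mr.smith', dd, count=1)

# repl may be a function: it receives the match object and returns the new text.
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r'\d+', double, 'a1 b22 c333')

print(repr(first_only))  # 'attn: Mr.smith\nDear X,'
print(doubled)           # a2 b44 c666
```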

5. The re.subn function

re.subn(pattern, repl, string, count=0, flags=0)
Same as sub, but returns a tuple containing the new string and the number of replacements made.

dd = 'attn: X\nDear X,'
print(re.subn(r'X', 'Mr.smith', dd))
# Result:
# ('attn: Mr.smith\nDear Mr.smith,', 2)

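6. The re.search() function

re.search(pattern, string, flags=0)
Scans the string for the pattern and returns a match object for the first place it matches, or None if there is no match. It is similar to match, but search does not have to start matching at position 0.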

content = 'Hello 123456789 Word_This is just a test 666 Test'
result = re.search(r'(\d+).*?(\d+).*', content)

# On Python 3.7+ the object prints as re.Match; older versions show _sre.SRE_Match.
print(result)           # <re.Match object; span=(6, 49), match='123456789 Word_This is just a test 666 Test'>
print(result.group())   # same as result.group(0): the whole matched string
print(result.groups())  # ('123456789', '666')
print(result.group(0))  # 123456789 Word_This is just a test 666 Test
print(result.group(1))  # 123456789
print(result.group(2))  # 666
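
To contrast search with match directly, here is a minimal sketch that runs both on the same content string with a simpler pattern chosen for illustration.

```python
import re

content = 'Hello 123456789 Word_This is just a test 666 Test'

# match only tries at position 0, where 'H' is not a digit, so it fails.
at_start = re.match(r'\d+', content)

# search scans forward and returns the first place the pattern matches.
anywhere = re.search(r'\d+', content)

print(at_start)          # None
print(anywhere.group())  # 123456789
print(anywhere.span())   # (6, 15)
```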

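7. re.compile()

re.compile(pattern, flags=0)
Creates a pattern object from a string containing a regular expression.
It returns a compiled pattern object, not a match object. The pattern object does nothing by itself; call its methods such as findall(), match(), or search() to do the matching.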
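
A short sketch of what the compiled object is and how it can be reused. The input strings are made up, and re.Pattern as a public name assumes Python 3.7 or later.

```python
import re

word_with_o = re.compile(r'\w*o\w*')

# The compiled object is a pattern, not a match; matches come from its methods.
is_pattern = isinstance(word_with_o, re.Pattern)  # assumes Python 3.7+

# The same pattern object can be reused on different inputs.
first = word_with_o.findall('nice to meet you')
second = word_with_o.search('good morning')

# The module-level function gives the same result as the compiled method.
same = re.findall(r'\w*o\w*', 'nice to meet you') == first

print(is_pattern)      # True
print(first)           # ['to', 'you']
print(second.group())  # good
print(same)            # True
```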

Used with findall, it returns a list.

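content = '''Hello,
  I am Jerry, 
  from Chongqing,
  a mountain city, 
  nice to meet you……'''
# Raw string avoids invalid-escape warnings; re.M changes nothing here since the pattern has no ^ or $.
regex = re.compile(r'\w*o\w*', re.M)
x = regex.findall(content)
print(x)  # ['Hello', 'from', 'Chongqing', 'mountain', 'to', 'you']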

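Used with match, it returns a match object, from which group() gives a str and span() gives a tuple. match still starts at position 0 and returns None if the pattern does not match there.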

content = 'Hohi, I am lily,nice to meet you....'
regex = re.compile(r'\w*i\w?')
y = regex.match(content)
# On Python 3.7+ the type is re.Match; older versions show _sre.SRE_Match.
print(y)          # <re.Match object; span=(0, 4), match='Hohi'>
print(type(y))    # <class 're.Match'>
print(y.group())  # Hohi
print(y.span())   # (0, 4)

Used with search, the result is much the same as with match, except that search does not have to start at position 0. It also stops after the first match.

content = 'Hihi, I am lily,nice to meet you....'
regex = re.compile(r'\w*o\w?')
y = regex.search(content)
# On Python 3.7+ the type is re.Match; older versions show _sre.SRE_Match.
print(y)          # <re.Match object; span=(21, 23), match='to'>
print(type(y))    # <class 're.Match'>
print(y.group())  # to
print(y.span())   # (21, 23)