Reposting an excellent article from cnblogs (博客园) here because it is convenient to read.
Original link:
https://www.cnblogs.com/xiaxiaoxu/p/8436795.html

1. Match the first letter of every word in a line of text

#coding=utf-8
 
import re
s="i love you not because of who you are, but because of who i am when i am with you"
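# \b marks a word boundary; [a-zA-Z] restricts the match to letters (\w would also accept digits and _)
content=re.findall(r"\b[a-zA-Z]",s)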
print content
c:\Python27\Scripts>python task_test.py
['i', 'l', 'y', 'n', 'b', 'o', 'w', 'y', 'a', 'b', 'b', 'o', 'w', 'i', 'a', 'w', 'i', 'a', 'w', 'y']
 

2. Match the leading digit of every word that starts with a digit

import re
s="i love you not because 12sd 34er 56df e4 54434"
content=re.findall(r"\b\d",s)
print content
c:\Python27\Scripts>python task_test.py
['1', '3', '5', '5']

3. Match the leading run of letters and digits at the start of a string

>>> print re.match(r"\w+","123sdf").group()
123sdf
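
re.match only tries the pattern at the very start of the string, and it returns None when nothing matches there, so calling .group() directly can raise AttributeError. Below is a minimal sketch (Python 3 syntax; the sample strings and helper names are made up for illustration) that contrasts re.match with re.search and guards against None:

```python
import re

samples = ["123sdf", "  123sdf", "!?"]  # hypothetical inputs

def leading_word(text):
    """Return the run of word characters at the very start of text, or None."""
    m = re.match(r"\w+", text)   # anchored at position 0
    return m.group() if m else None

def first_word(text):
    """Return the first run of word characters anywhere in text, or None."""
    m = re.search(r"\w+", text)  # scans forward for the first hit
    return m.group() if m else None

for text in samples:
    print(repr(text), leading_word(text), first_word(text))
```

With leading spaces, leading_word gives None while first_word still finds 123sdf.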

4. Match only the lines that contain letters and digits

#coding=utf-8
 
import re
s="i love you not because\n12sd 34er 56\ndf e4 54434"
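# re.M only changes how ^ and $ behave, so it has no effect on this pattern;
# the call returns every run of word characters, not whole lines.
content=re.findall(r"\w+",s,re.M)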
print content
c:\Python27\Scripts>python task_test.py
['i', 'love', 'you', 'not', 'because', '12sd', '34er', '56', 'df', 'e4', '54434']
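
If the goal is to return whole lines rather than individual words, one possible reading is "keep every line that has at least one letter and at least one digit". That reading is an assumption, as is the helper name below. A sketch in Python 3 syntax, where re.M actually matters because the pattern uses ^ and $:

```python
import re

s = "i love you not because\n12sd 34er 56\ndf e4 54434"

# Assumption: a line qualifies when it has at least one ASCII letter
# and at least one digit. Each lookahead checks one condition; "." does
# not cross newlines, so every check stays within a single line.
LINE = re.compile(r"^(?=.*[a-zA-Z])(?=.*\d).*$", re.M)

def lines_with_letters_and_digits(text):
    """Return the lines of text that contain both a letter and a digit."""
    return LINE.findall(text)

print(lines_with_letters_and_digits(s))
```

For the sample string this keeps the second and third lines and drops the first, which has no digit.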

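5. Write a regular expression that matches all of the following strings: 'bat', 'bit', 'but', 'hat', 'hit', 'hut'

import re
s="'bat', 'bit', 'but', 'hat', 'hit', 'hut'"
# [bh], then [aiu], then t matches exactly the six words; "..t" would accept any two characters before t
content=re.findall(r"[bh][aiu]t",s)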
print content
 
c:\Python27\Scripts>python task_test.py
['bat', 'bit', 'but', 'hat', 'hit', 'hut']
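
A character class per position keeps the match tight: a looser pattern such as ..t would also accept words like cat. The sketch below (Python 3 syntax; the near-miss words and the helper name are hypothetical) checks whole words with fullmatch:

```python
import re

PATTERN = re.compile(r"[bh][aiu]t")

targets = ["bat", "bit", "but", "hat", "hit", "hut"]
near_misses = ["cat", "bet", "hot", "bats"]  # hypothetical counter-examples

def is_target(word):
    """True only if the whole word is one of the six target strings."""
    return PATTERN.fullmatch(word) is not None

for word in targets + near_misses:
    print(word, is_target(word))
```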

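6. Match all valid Python identifiers

#coding=utf-8
 
import re
s="awoeur awier !@# @#4_-asdf3$^&()+?><dfg$\n$"
# An identifier starts with a letter or underscore, followed by letters, digits or underscores.
# The leading \b stops a match from starting in the middle of a token such as "4_".
content=re.findall(r"\b[A-Za-z_]\w*",s)
print s
print content
c:\Python27\Scripts>python task_test.py
awoeur awier !@# @#4_-asdf3$^&()+?><dfg$
$
['awoeur', 'awier', 'asdf3', 'dfg']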
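
To check whether one whole string is a valid identifier, rather than scanning free text, a pattern can be combined with the standard keyword module. This is only a sketch in Python 3 syntax: it covers ASCII names, and the function name and candidate list are made up for illustration. Python 3's str.isidentifier() is broader because it also accepts non-ASCII letters.

```python
import keyword
import re

# Letter or underscore first, then letters, digits or underscores (ASCII only).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_ascii_identifier(name):
    """True if name is an ASCII identifier and not a reserved keyword."""
    return IDENTIFIER.fullmatch(name) is not None and not keyword.iskeyword(name)

candidates = ["awoeur", "_private", "asdf3", "4_", "a-b", "class", ""]
for name in candidates:
    print(repr(name), is_ascii_identifier(name))
```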

7. Extract the complete date and time fields from each line

#coding=utf-8
 
import re
s="""se234 1987-02-09 07:30:00
    1987-02-10 07:25:00"""
content=re.findall(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}",s,re.M)
print s
print content
c:\Python27\Scripts>python task_test.py
se234 1987-02-09 07:30:00
    1987-02-10 07:25:00
['1987-02-09 07:30:00', '1987-02-10 07:25:00']
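
The \d{4}-\d{2}-\d{2} pattern only checks the shape of a timestamp, so a string like 1987-13-45 99:00:00 would also match. One way to filter such values out is to pass each match to datetime.strptime. This is a sketch in Python 3 syntax; the third line of the sample text and the function name are invented for illustration:

```python
import re
from datetime import datetime

STAMP = re.compile(r"\b\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\b")

def extract_timestamps(text):
    """Return datetime objects for every well-formed, valid timestamp in text."""
    result = []
    for raw in STAMP.findall(text):
        try:
            result.append(datetime.strptime(raw, "%Y-%m-%d %H:%M:%S"))
        except ValueError:
            pass  # right shape, but not a real date or time
    return result

text = "se234 1987-02-09 07:30:00\n    1987-02-10 07:25:00\n    1987-13-45 99:00:00"
for stamp in extract_timestamps(text):
    print(stamp.isoformat(sep=" "))
```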

8. Replace the email addresses on each line with your own email address

#coding=utf-8
 
import re
s="""693152032@qq.com, werksdf@163.com, sdf@sina.com
    sfjsdf@139.com, soifsdfj@134.com
    pwoeir423@123.com"""
 
# The dot is escaped so that it matches a literal "." rather than any character.
content=re.sub(r"\w+@\w+\.com","xiaxiaoxu1987@163.com",s)
print s
print "_______________________________________"
print content
c:\Python27\Scripts>python task_test.py
693152032@qq.com, werksdf@163.com, sdf@sina.com
    sfjsdf@139.com, soifsdfj@134.com
    pwoeir423@123.com
_______________________________________
xiaxiaoxu1987@163.com, xiaxiaoxu1987@163.com, xiaxiaoxu1987@163.com
    xiaxiaoxu1987@163.com, xiaxiaoxu1987@163.com
    xiaxiaoxu1987@163.com
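
In a regular expression an unescaped . matches any character, so \w+@\w+.com and \w+@\w+\.com behave differently on some inputs. The sketch below (Python 3 syntax) uses re.subn, which also reports how many replacements were made. The string user@hostXcom and the replacement address are made-up placeholders:

```python
import re

text = "werksdf@163.com, user@hostXcom"  # the second item is not a real address
NEW = "me@example.com"                   # placeholder replacement address

# Unescaped dot: "X" is accepted in place of the "." before "com".
loose_text, loose_count = re.subn(r"\w+@\w+.com", NEW, text)

# Escaped dot: only a literal "." is accepted.
strict_text, strict_count = re.subn(r"\w+@\w+\.com", NEW, text)

print(loose_count, loose_text)
print(strict_count, strict_text)
```

The loose pattern rewrites both items, while the strict one leaves user@hostXcom untouched.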

9. Match the keyword \home:

>>> re.findall(r"\\home","skjdfoijower \home \homewer")
['\\home', '\\home']
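
A literal backslash has to be escaped for the regex engine, and a raw string stops Python from processing the backslashes first. The sketch below (Python 3 syntax, variable names chosen for illustration) shows three equivalent spellings of the same pattern, plus a \b variant that skips \homewer:

```python
import re

text = r"skjdfoijower \home \homewer"  # raw string: backslashes stay literal

raw_pattern = r"\\home"                # raw string, backslash escaped once for the regex
plain_pattern = "\\\\home"             # normal string, every backslash doubled again
escaped_pattern = re.escape("\\home")  # let re build the escaped form

for pattern in (raw_pattern, plain_pattern, escaped_pattern):
    print(re.findall(pattern, text))

# Adding \b keeps only the standalone keyword.
whole_only = re.findall(r"\\home\b", text)
print(whole_only)
```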

1. Extract the words from a string using a regular expression

#coding=utf-8
 
import re
s="""i love you not because of who 234 you are, 234 but 3234ser because of who i am when i am with you"""
 
content=re.findall(r"\b[a-zA-Z]+\b",s)
print content
c:\Python27\Scripts>python task_test.py
['i', 'love', 'you', 'not', 'because', 'of', 'who', 'you', 'are', 'but', 'because', 'of', 'who', 'i', 'am', 'when', 'i', 'am', 'with', 'you']

2. Use a regular expression to match valid email addresses:

import re
s="""xiasd@163.com, sdlfkj@.com sdflkj@180.com solodfdsf@123.com sdlfjxiaori@139.com saldkfj.com oisdfo@.sodf.com.com"""
 
# The dot is escaped so that it matches a literal "." rather than any character.
content=re.findall(r"\w+@\w+\.com",s)
print content
c:\Python27\Scripts>python task_test.py
['xiasd@163.com', 'sdflkj@180.com', 'solodfdsf@123.com', 'sdlfjxiaori@139.com']
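
The \w+@\w+ form only covers single-label .com domains. A slightly broader check is to split the text into tokens and test each one as a whole. The pattern below is a simplified assumption and not a full RFC 5322 validator, and the function name is made up for illustration (Python 3 syntax):

```python
import re

s = ("xiasd@163.com, sdlfkj@.com sdflkj@180.com solodfdsf@123.com "
     "sdlfjxiaori@139.com saldkfj.com oisdfo@.sodf.com.com")

# Simplified shape: local part, "@", then two or more dot-separated labels.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def valid_emails(text):
    """Split text on commas and whitespace, keep tokens that fully match EMAIL."""
    tokens = [tok for tok in re.split(r"[,\s]+", text) if tok]
    return [tok for tok in tokens if EMAIL.fullmatch(tok)]

print(valid_emails(s))
```

Because each token must match from start to end, entries such as sdlfkj@.com are rejected as a whole instead of being partially matched.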

Reposted
Original link: https://www.cnblogs.com/xiaxiaoxu/p/8436795.html