import re
search() and findall() work like the find feature in a word processor: they locate the text that matches a pattern.
Word processors also have a find-and-replace feature.
We can do find and replace with regular expressions as well.
namesRegex = re.compile(r'Agent \w+')
namesRegex.findall('Agent Alice gave the secret files to Agent Bob') # returns ['Agent Alice', 'Agent Bob']
We can replace the agent names with the sub() method (short for "substitute"). Its first argument is the replacement text and its second is the string to search.
namesRegex.sub('[CLASSIFIED]', 'Agent Alice gave the secret files to Agent Bob') # returns '[CLASSIFIED] gave the secret files to [CLASSIFIED]'
Let's say we want to keep part of the matched text in the replacement.
namesRegex = re.compile(r'Agent (\w)\w+') # the first character of the name is captured in group 1
namesRegex.findall('Agent Alice gave the secret files to Agent Bob') # with a group in the pattern, findall() returns only the group: ['A', 'B']
namesRegex.sub(r'Agent \1****', 'Agent Alice gave the files to Agent Bob') # raw string, so \1 reaches the regex engine unchanged; returns 'Agent A**** gave the files to Agent B****'
\1 means "whatever text group 1 captured in this match".
The second group would be \2, the third group \3, and so on.
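As a rough sketch of using more than one numbered group in a replacement: the pattern, the sample sentence, and the name fullNameRegex below are made up for illustration and are not part of the notes above.

```python
import re

# Hypothetical pattern: group 1 is the first name, group 2 is the last name.
fullNameRegex = re.compile(r'Agent (\w+) (\w+)')

# \2 is the second group and \1 is the first, so this swaps the two names.
swapped = fullNameRegex.sub(r'Agent \2, \1', 'Agent Alice Smith met Agent Bob Jones')
print(swapped)  # Agent Smith, Alice met Agent Jones, Bob
```

The replacement string is a raw string for the same reason as before: without the r prefix, Python would try to interpret \1 and \2 as escape sequences.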
In verbose mode (re.VERBOSE), whitespace inside the pattern is ignored and anything after a # on a line is treated as a comment. That means we can use triple quotes to write the pattern as a multiline string; the newlines and comments won't be part of the pattern that we are looking for.
verboseRegex = re.compile(r'''
(
(\d\d\d-)|     # Area code (without parentheses) and a dash
(\(\d\d\d\))   # -or- area code (with parentheses) and no dash
)              # Outer group keeps the | from splitting the whole pattern
\d\d\d         # First 3 digits
-              # Second dash
\d\d\d\d       # Last 4 digits
\sx\d{2,4}     # Extension, like x1234
''', re.VERBOSE)
# Remember the comma between the pattern string and re.VERBOSE.
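A minimal sketch of trying out a verbose pattern of this kind. The name phoneRegex and the sample sentences are hypothetical, and the extension is made optional here (an assumption, so that numbers without an extension also match).

```python
import re

# Assumed phone format: 555-555-1234 or (555)555-1234, extension optional.
phoneRegex = re.compile(r'''
    (\d\d\d-|\(\d\d\d\))   # area code: "555-" or "(555)"
    \d\d\d                 # first 3 digits
    -                      # second dash
    \d\d\d\d               # last 4 digits
    (\sx\d{2,4})?          # optional extension, like " x1234"
    ''', re.VERBOSE)

withDash = phoneRegex.search('Call 415-555-1234 x99 today')
withParens = phoneRegex.search('Call (415)555-1234 today')
print(withDash.group())    # 415-555-1234 x99
print(withParens.group())  # (415)555-1234
```

The indentation and comments inside the triple-quoted string do not affect the match, because re.VERBOSE strips them before the pattern is compiled.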
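We have seen re.I (short for re.IGNORECASE) and re.DOTALL. To use more than one option at once, combine them in the second argument as follows.
regex = re.compile(r'\d', re.IGNORECASE | re.DOTALL | re.VERBOSE) # combine flags with the bitwise OR operator |
The | above is Python's bitwise OR operator. It merges the flags into a single value, which is needed because re.compile() takes all its options in one flags argument after the pattern. Combining options with | is a somewhat old-fashioned style, but it is how the re module accepts several flags.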
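To see what combining flags changes in practice, here is a small sketch. The sample text and the names bothFlags and onlyIgnoreCase are invented for this example.

```python
import re

text = 'FIRST line\nsecond LINE'

# IGNORECASE lets 'first' match 'FIRST'; DOTALL lets '.' match the newline too.
bothFlags = re.compile(r'first.*line', re.IGNORECASE | re.DOTALL)
# Without DOTALL, '.' stops at the newline, so the match stays on the first line.
onlyIgnoreCase = re.compile(r'first.*line', re.IGNORECASE)

print(repr(bothFlags.search(text).group()))       # 'FIRST line\nsecond LINE'
print(repr(onlyIgnoreCase.search(text).group()))  # 'FIRST line'
```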