A phone number looks like 444-444-4444: three digits, a hyphen, three digits, a hyphen, and four digits.
import re
phoneNumRegex = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")
matchobject = phoneNumRegex.search("my number is 434-343-4343")
matchobject.group()
But what if we wanted only the area code? For that we use parentheses to mark out groups in the pattern.
phoneNumRegex = re.compile(r"(\d\d\d)-(\d\d\d-\d\d\d\d)")
matchobject = phoneNumRegex.search("my number is 434-343-4343")
matchobject.group()
matchobject.group(1)
matchobject.group(2)
Calling group() or group(0) returns the entire matched string, group(1) returns the text matched by the first group, group(2) the text matched by the second, and so on.
If the parentheses are part of the text to match, we have to escape them with a backslash: \( and \).
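As a quick sketch of the numbering described above: the snippet below also calls groups(), a match-object method that returns every captured group as a tuple, and checks for None, which search() returns when nothing matches. The variable names and the second sample string are made up for illustration.

```python
import re

phone_regex = re.compile(r"(\d\d\d)-(\d\d\d-\d\d\d\d)")
match = phone_regex.search("my number is 434-343-4343")

# search() returns None when there is no match, so check before calling group()
if match is not None:
    full_number = match.group(0)   # same as match.group()
    area_code = match.group(1)     # text captured by the first pair of parentheses
    main_number = match.group(2)   # text captured by the second pair
    all_groups = match.groups()    # tuple of every captured group
    print(full_number, area_code, main_number, all_groups)

# A string with no phone number in it (illustrative input)
no_match = phone_regex.search("no phone number here")
print(no_match)
```

Checking for None first avoids an AttributeError from calling group() on a failed search.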
phoneNumRegex = re.compile(r"\(\d\d\d\)-\(\d\d\d-\d\d\d\d\)")
matchobject = phoneNumRegex.search("my number is (434)-(343-4343)") # parentheses added to the text
matchobject.group()
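Escaped and unescaped parentheses can appear in the same pattern: the escaped ones match literal characters, while the unescaped ones still mark a group. The following is a minimal sketch under that assumption; the input string (with parentheses only around the area code) and the variable names are chosen for illustration.

```python
import re

# \( and \) match literal parentheses; the plain ( ) pairs capture groups
paren_regex = re.compile(r"\((\d\d\d)\)-(\d\d\d-\d\d\d\d)")
paren_match = paren_regex.search("my number is (434)-343-4343")

whole = paren_match.group()        # includes the literal parentheses
area_code = paren_match.group(1)   # digits only, without the parentheses
main_number = paren_match.group(2)
print(whole, area_code, main_number)
```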
The vertical bar | (above the Enter key on many keyboards) is called the pipe character. In a regex it means "either of these".
What if we wanted to find all the words that share a fixed prefix, such as batman and batmobile, which both start with bat?
batRegex = re.compile(r'bat(man|mobile|copter|cave|bat)')
The parentheses after bat contain the possible suffixes, separated by the pipe character.
matchobject = batRegex.search('batman rides his batcopter and batmobile to his batcave. nanananannnan batbat. lol')
matchobject.group()
If we want only the suffix of the first match, we can get it as follows:
matchobject.group(1)
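search() stops at the first match, so the above returned only the suffix of the first matching word, man.
To get every match, call findall() on the regex object. It returns a list rather than a match object, and because the pattern contains one group, the list holds only the text captured by that group (the suffixes).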
matchobject = batRegex.findall('batman rides his batcopter and batmobile to his batcave. nanananannnan batbat. lol')
matchobject
# Rebuild each full word by putting the prefix back in front of its suffix
for suffix in matchobject:
    print("bat" + suffix)
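Instead of gluing the prefix back on by hand, there are two other ways to get the full words, sketched below. A non-capturing group (?:...) groups the alternatives without capturing them, so findall() returns whole matches; finditer() yields match objects, so both the full match and the suffix are available. The variable names are illustrative.

```python
import re

text = "batman rides his batcopter and batmobile to his batcave. nanananannnan batbat. lol"

# (?:...) groups without capturing, so findall() returns the whole matched words
whole_word_regex = re.compile(r"bat(?:man|mobile|copter|cave|bat)")
whole_words = whole_word_regex.findall(text)
print(whole_words)

# finditer() yields one match object per match, giving access to group() and group(1)
bat_regex = re.compile(r"bat(man|mobile|copter|cave|bat)")
pairs = [(m.group(), m.group(1)) for m in bat_regex.finditer(text)]
print(pairs)
```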