The ^ anchor requires the pattern to match at the start of the string.

import re

beginsWithHelloRegex = re.compile(r'^Hello')

beginsWithHelloRegex.search('Hello there...General Kenobi')

<_sre.SRE_Match object; span=(0, 5), match='Hello'>

beginsWithHelloRegex.search('And he said Hello') is None

True
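
A minimal sketch of the same idea, assuming only the standard re module. It compares an anchored and an unanchored pattern on one string, and shows that match() fails here as well because it only tries position 0. The names anchored, unanchored and text are illustrative, not from the notes above.

```python
import re

anchored = re.compile(r'^Hello')    # ^ pins the match to the start
unanchored = re.compile(r'Hello')   # no anchor: may match anywhere

text = 'And he said Hello'

# search() with ^ fails because 'Hello' is not at position 0.
print(anchored.search(text) is None)     # True
# Without the anchor, search() finds 'Hello' later in the string.
print(unanchored.search(text).span())    # (12, 17)
# match() only tries position 0, so it fails here too.
print(unanchored.match(text) is None)    # True
```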

The $ anchor requires the pattern to match at the end of the string.

endsWithWorldRegex = re.compile(r'World$')

endsWithWorldRegex.search('Hello World')

<_sre.SRE_Match object; span=(6, 11), match='World'>

endsWithWorldRegex.search('This is a World string') is None

True

Using both ^ and $ means the pattern must match the entire string.

allDigitsRegex = re.compile(r'^\d+$')  # one or more digits, and nothing else, from start to end

With both anchors in place, every character of the string has to fit \d+, so a single non-digit makes the search fail.

allDigitsRegex.search('434343')

<_sre.SRE_Match object; span=(0, 6), match='434343'>

allDigitsRegex.search('654534354654454654545465132486434867436574')

<_sre.SRE_Match object; span=(0, 42), match='654534354654454654545465132486434867436574'>

allDigitsRegex.search('0')

<_sre.SRE_Match object; span=(0, 1), match='0'>

allDigitsRegex.search('') is None

True

allDigitsRegex.search('34343x43433') is None

True
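
One detail worth a quick sketch: by default $ also matches just before a trailing newline, so an anchored pattern can still accept a string that ends in '\n'. If that matters, fullmatch() (available since Python 3.4) is a stricter alternative. The names all_digits and digits below are illustrative.

```python
import re

all_digits = re.compile(r'^\d+$')
digits = re.compile(r'\d+')

# $ also matches just before a trailing newline, so this still matches.
print(all_digits.search('434343\n') is not None)   # True
# fullmatch() requires the pattern to cover the whole string, newline included.
print(digits.fullmatch('434343\n') is None)        # True
print(digits.fullmatch('434343').group())          # 434343
print(digits.fullmatch('34343x43433') is None)     # True
```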

. matches any single character except a newline.

atRegex = re.compile(r'.at')  # any one character followed by 'at'

atRegex.findall('The cat in the flat sat on the hat mat')

['cat', 'lat', 'sat', 'hat', 'mat']

Notice that 'flat' is not matched in full; only 'lat' is. That is because a single dot matches exactly one character.

atRegex = re.compile(r'.{1,2}at')  # 'at' preceded by one or two characters of any kind

atRegex.findall('The cat in the flat sat on the hat mat')

[' cat', 'flat', ' sat', ' hat', ' mat']

Notice that some matches now start with a space, because the dot matches a space just like any other character.
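
If the leading spaces are unwanted, one option (a sketch, not the only fix) is to swap the dot for \w, which matches letters, digits and underscore but not whitespace. The names sentence and word_at are illustrative.

```python
import re

sentence = 'The cat in the flat sat on the hat mat'

# \w matches letters, digits, and underscore, but not spaces.
word_at = re.compile(r'\w{1,2}at')
print(word_at.findall(sentence))   # ['cat', 'flat', 'sat', 'hat', 'mat']
```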

.*

Dot-star matches anything.

. (dot) matches any single character except a newline.

* (star) means zero or more of the preceding item.

nameRegex = re.compile(r'First Name: (.*) Last Name: (.*)')

nameRegex.findall('First Name: S Last Name: Dahiwal')

[('S', 'Dahiwal')]

nameRegex.findall('First Name: Sat yam Last Name: Dahiwal')

[('Sat yam', 'Dahiwal')]
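
findall() returns a list of tuples when the pattern has groups. As a small sketch, the same pattern can be used with search() to pull the groups out of a single match object; name_regex, mo, first and last are illustrative names.

```python
import re

name_regex = re.compile(r'First Name: (.*) Last Name: (.*)')
mo = name_regex.search('First Name: Sat yam Last Name: Dahiwal')

print(mo.group(1))    # Sat yam
print(mo.group(2))    # Dahiwal
print(mo.groups())    # ('Sat yam', 'Dahiwal')

# groups() returns a tuple, so it can be unpacked directly.
first, last = mo.groups()
```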

For a non-greedy match, which takes as few characters as possible, use .*?

serve = '<To serve humans> for dinner.>'

greedy = re.compile(r'<.*>')

greedy.findall(serve)

['<To serve humans> for dinner.>']

nongreedy = re.compile(r'<.*?>')

nongreedy.findall(serve)  # non-greedy: stops at the first closing angle bracket

['<To serve humans>']
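
The difference is easier to see when the string holds several bracketed pieces. A short sketch, using a made-up html string:

```python
import re

html = '<b>bold</b> and <i>italic</i>'

# Greedy: one match running from the first '<' to the last '>'.
print(re.findall(r'<.*>', html))    # ['<b>bold</b> and <i>italic</i>']
# Non-greedy: each tag is matched separately.
print(re.findall(r'<.*?>', html))   # ['<b>', '</b>', '<i>', '</i>']
```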

prime = 'Serve the public trust.\nProtect the innocent.\nUphold the law.'

print(prime)

Serve the public trust.
Protect the innocent.
Uphold the law.

dotStar = re.compile(r'.*')  # greedy

dotStar.search(prime)  # greedy, but . does not match a newline, so the match stops at the end of the first line

<_sre.SRE_Match object; span=(0, 23), match='Serve the public trust.'>

dotStar = re.compile(r'.*?')  # non-greedy version: matches as little as possible, which here is nothing

dotStar.search(prime)

<_sre.SRE_Match object; span=(0, 0), match=''>
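
The empty match happens because nothing follows .*? in the pattern, so zero characters is already enough. A small sketch (with a made-up text string) of how the lazy form behaves once something comes after it:

```python
import re

text = 'one, two, three'

# Greedy: .* grabs as much as it can, then backs up to the last comma.
print(re.search(r'.*,', text).group())         # one, two,
# Non-greedy: .*? stops at the first comma it can reach.
print(re.search(r'.*?,', text).group())        # one,
# With nothing after it, .*? is satisfied by the empty string.
print(repr(re.search(r'.*?', text).group()))   # ''
```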

To make the dot match newlines as well, pass re.DOTALL as the second argument to re.compile().

dotStar = re.compile(r'.*', re.DOTALL)

dotStar.search(prime)

<_sre.SRE_Match object; span=(0, 61), match='Serve the public trust.\nProtect the innocent.\nU>

print(dotStar.search(prime).group())

Serve the public trust.
Protect the innocent.
Uphold the law.
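
For reference, a sketch of two other ways to spell the same flag: re.S is the short name for re.DOTALL, and (?s) at the start of the pattern sets it inline. The names default, dotall and inline are illustrative.

```python
import re

prime = 'Serve the public trust.\nProtect the innocent.\nUphold the law.'

default = re.compile(r'.*')           # . stops at newlines
dotall = re.compile(r'.*', re.S)      # re.S is the short name for re.DOTALL
inline = re.compile(r'(?s).*')        # same flag written inside the pattern

print(default.search(prime).end())   # 23
print(dotall.search(prime).end())    # 61
print(inline.search(prime).end())    # 61
```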

vowelRegex = re.compile(r'[aeiou]')

vowelRegex.findall('Everything changes when you start to write something.')

['e', 'i', 'a', 'e', 'e', 'o', 'u', 'a', 'o', 'i', 'e', 'o', 'e', 'i']

The result above does not include the capital 'E'; the pattern only matches lowercase vowels.

We can make Python match case-insensitively by passing another argument to re.compile().

vowelRegex = re.compile(r'[aeiou]', re.IGNORECASE)

vowelRegex = re.compile(r'[aeiou]', re.I)

The two lines above are equivalent: re.I is the short name for re.IGNORECASE.

vowelRegex.findall('Everything changes when you start to write something.')

['E', 'e', 'i', 'a', 'e', 'e', 'o', 'u', 'a', 'o', 'i', 'e', 'o', 'e', 'i']
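
re.compile() takes a single flags argument, so to use case-insensitive matching together with re.DOTALL the flags are combined with the bitwise OR operator. A minimal sketch with a made-up two-line string:

```python
import re

text = 'Everything changes.\nEven NOW.'
pattern = r'everything.*now'

only_case = re.compile(pattern, re.IGNORECASE)
only_dotall = re.compile(pattern, re.DOTALL)
both = re.compile(pattern, re.IGNORECASE | re.DOTALL)

# Ignoring case alone is not enough: .* cannot cross the newline.
print(only_case.search(text) is None)     # True
# DOTALL alone is not enough: 'everything' does not match 'Everything'.
print(only_dotall.search(text) is None)   # True
# With both flags the pattern spans the two lines.
print(repr(both.search(text).group()))    # 'Everything changes.\nEven NOW'
```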

regex = re.compile(r'(\w+@\w+\.\w+)')

regex.findall('The email is something@gmail.com, anotherExample@ex.ex, and the final one is sdhfsd@khdfks.fkjhdfh')

['something@gmail.com', 'anotherExample@ex.ex', 'sdhfsd@khdfks.fkjhdfh']

Sometimes the part of an email address before the @ contains other characters, such as a dot.

We can allow those by writing our own character class, like the following (\w already matches _, so listing _ again is harmless but redundant):

regex = re.compile(r'([\w\._]+@\w+\.\w+)')  # outside [...] the dot is escaped as \. because a bare . means 'any character except newline'

regex.findall('The email is some.thing@gmail.com, another_Example@ex.ex, and the final one is sdh._fs._d@khdfks.fkjhdfh')

['some.thing@gmail.com', 'another_Example@ex.ex', 'sdh._fs._d@khdfks.fkjhdfh']
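
As a rough sketch, the pattern can be wrapped in a small helper. EMAIL_REGEX and extract_emails are hypothetical names, and the pattern is the simplified one from these notes, not a full email validator. Inside a character class the dot is already literal, so [\w.] behaves the same as [\w\._].

```python
import re

# Hypothetical helper built on the simplified pattern from this section;
# it is not a complete email address validator.
EMAIL_REGEX = re.compile(r'[\w.]+@\w+\.\w+')

def extract_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_REGEX.findall(text)

found = extract_emails('Write to some.thing@gmail.com or another_Example@ex.ex today.')
print(found)   # ['some.thing@gmail.com', 'another_Example@ex.ex']
```

Because the domain part is just \w+\.\w+, an address whose domain has more than one dot is only captured up to the second label.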