How to Extract String Between Two Characters or Strings in Python
You can extract a string between two characters or strings in Python using functions such as search() (from the re module) and split().
Method 1: search() function
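# import the regular expression module
import re
# 'string1' and 'string2' stand for the start and end delimiters
m = re.search('string1(.*)string2', input_string)
# re.search() returns None when the pattern is not found
ext_string = m.group(1) if m else None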
Method 2: split() function
input_string.split('string1')[1].split('string2')[0]
The following examples demonstrate how to use the search() and split() functions to extract a string between two characters or strings in Python.
Example 1: Extract string between two characters using search()
Create a simple string:
input_string = "XYZ=3496;ABC"
If you want to extract the string between two characters, such as = and ;, from the input string, you can use the search() function.
# import package
import re
m = re.search('=(.*);', input_string)
# get extracted string
ext_string = m.group(1)
print(ext_string)
# output
3496
In the above code, re.search() searches for the pattern, and m.group(1) returns the content captured by the parentheses, which is the text between the two characters. The .* pattern in the capturing group matches any character (except newline characters) zero or more times. Because .* is greedy, it extends to the last ; when the string contains more than one.
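The greedy behavior of .* matters once the closing character appears more than once. The following is a minimal sketch comparing .* with the non-greedy .*? form; the input string with two ; characters is a hypothetical example, not one from the examples above.

```python
import re

# Hypothetical input containing two ';' characters
input_string = "XYZ=3496;ABC;DEF"

# Greedy: .* extends to the last ';' in the string
greedy = re.search('=(.*);', input_string)
# Non-greedy: .*? stops at the first ';' after '='
lazy = re.search('=(.*?);', input_string)

# re.search() returns None when nothing matches, so guard before group()
greedy_result = greedy.group(1) if greedy else None
lazy_result = lazy.group(1) if lazy else None

print(greedy_result)
# output
# 3496;ABC
print(lazy_result)
# output
# 3496
```

When each delimiter occurs only once, as in "XYZ=3496;ABC", both patterns return the same result.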
Example 2: Extract string between two strings using search()
Create a simple string:
input_string = "XYZ3496ABC"
If you want to extract the string between two strings, such as XYZ and ABC, from the input string, you can use the search() function.
import re
m = re.search('XYZ(.*)ABC', input_string)
# get extracted string
ext_string = m.group(1)
print(ext_string)
# output
3496
In the above code, re.search() searches for the pattern, and m.group(1) returns the content captured by the parentheses, which is the text between the two strings. The .* pattern in the capturing group matches any character (except newline characters) zero or more times. If the pattern is not found, re.search() returns None and calling m.group(1) raises an AttributeError.
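If the delimiters contain characters that have a special meaning in regular expressions, such as ( or ., they need to be escaped before being placed in the pattern. The sketch below wraps the search() approach in a helper; the name extract_between is hypothetical, and the choice of a non-greedy .*? is an assumption that the shortest match is wanted.

```python
import re

def extract_between(text, start, end):
    """Return the text between start and end, or None if not found.

    extract_between is a hypothetical helper name used for illustration.
    """
    # re.escape() makes the delimiters match literally
    pattern = re.escape(start) + '(.*?)' + re.escape(end)
    m = re.search(pattern, text)
    # re.search() returns None when there is no match
    return m.group(1) if m else None

print(extract_between("XYZ3496ABC", "XYZ", "ABC"))
# output
# 3496
print(extract_between("price(42.50)", "(", ")"))
# output
# 42.50
print(extract_between("XYZ3496", "XYZ", "ABC"))
# output
# None
```

Returning None instead of raising an error is one possible design; raising a ValueError would suit code that treats a missing delimiter as a bug.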
Example 3: Extract string between two strings using split()
Define a simple string:
input_string = "XYZ3496ABC"
If you want to extract the string between two strings, such as XYZ and ABC, from the input string, you can use the split() function.
ext_string = input_string.split('XYZ')[1].split('ABC')[0]
print(ext_string)
# output
3496
In the above code, we first split the input string using XYZ as the delimiter. This returns a list containing two parts (the text before and after XYZ). We then split the second part using ABC as the delimiter and take the first element of the result, which is 3496.
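The chained split() call assumes both delimiters are present. If the first delimiter is missing, indexing with [1] raises an IndexError; if only the second is missing, everything after the first delimiter is returned without any warning. As a hedged alternative, the sketch below uses str.partition(), which reports whether the separator was found; the helper name between_partition is hypothetical.

```python
def between_partition(text, start, end):
    """Return the text between start and end, or None if either is missing.

    between_partition is a hypothetical helper name used for illustration.
    """
    # partition() returns (before, separator, after); the separator
    # element is an empty string when the separator is not found
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end)
    if not found_end:
        return None
    return middle

print(between_partition("XYZ3496ABC", "XYZ", "ABC"))
# output
# 3496
print(between_partition("XYZ3496", "XYZ", "ABC"))
# output
# None
```

Like the split() version, this takes the first occurrence of each delimiter and needs no import.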