How to Extract String Between Two Characters or Strings in Python

You can extract the text between two characters or two substrings in Python using re.search() (from the built-in re module) or the string method split().

Method 1: search() function

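# import the regular expression module
import re

# capture everything between string1 and string2
m = re.search(r'string1(.*)string2', input_string)
# note: m is None if the pattern is not found
ext_string = m.group(1)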
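
The snippet above assumes that a match exists and that the delimiters contain no regex metacharacters. A minimal sketch of a more defensive version follows; the helper name extract_between is hypothetical (not part of the re module), and it relies on re.escape() to treat the delimiters as literal text.

```python
import re

def extract_between(text, start, end):
    """Return the text between start and end, or None if there is no match."""
    # re.escape makes the delimiters literal, so characters such as "(" or "."
    # are not interpreted as regex syntax.
    pattern = re.escape(start) + r"(.*)" + re.escape(end)
    m = re.search(pattern, text)
    # re.search returns None when nothing matches; calling group() on None
    # would raise AttributeError, so check first.
    return m.group(1) if m else None

print(extract_between("XYZ=3496;ABC", "=", ";"))   # 3496
print(extract_between("XYZ=3496;ABC", "[", "]"))   # None
```

Returning None is only one possible design choice; raising an exception may suit code where a missing delimiter is an error.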

Method 2: split() function

ext_string = input_string.split('string1')[1].split('string2')[0]

The following examples demonstrate how to use search() and split() functions to extract a string between two characters or strings in Python.

Example 1: Extract string between two characters using search()

Create a simple string:

input_string = "XYZ=3496;ABC"

If you want to extract a string between two characters such as = and ; from the input string, you can use the search() function.

# import package
import re

m = re.search('=(.*);', input_string)
# get extracted string
ext_string = m.group(1)
print(ext_string)

# output
3496

In the above code, re.search() scans the input string for the pattern, and m.group(1) returns the text captured by the parentheses, which is the text between the two characters.

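The .* pattern inside the capturing group matches any character (except a newline) zero or more times. It is greedy, so if the closing character occurs more than once, the match extends to its last occurrence.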
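
Because .* takes as many characters as it can, the result may be longer than expected when the closing character appears several times. The sketch below uses a made-up input string (not from the examples above) to compare .* with its non-greedy form .*?, which stops at the first closing character.

```python
import re

# hypothetical input with two "=...;" pairs
text = "a=1;b=2;"

# greedy: runs up to the last ";"
greedy = re.search(r"=(.*);", text).group(1)
# non-greedy: stops at the first ";"
lazy = re.search(r"=(.*?);", text).group(1)

print(greedy)  # 1;b=2
print(lazy)    # 1
```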

Example 2: Extract string between two strings using search()

Create a simple string:

input_string = "XYZ3496ABC"

If you want to extract a string between two strings such as XYZ and ABC from the input string, you can use the search() function.

import re

m = re.search('XYZ(.*)ABC', input_string)
# get extracted string
ext_string = m.group(1)
print(ext_string)

# output
3496

In the above code, re.search() scans the input string for the pattern, and m.group(1) returns the text captured by the parentheses, which is the text between the two strings.

As in the first example, the .* pattern inside the capturing group matches any character (except a newline) zero or more times.
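
The newline exception matters when the text between the two strings spans several lines. As an illustration with a made-up multi-line input, the sketch below shows the same pattern failing by default and succeeding with the re.DOTALL flag, which lets . match newlines as well.

```python
import re

# hypothetical input where the wanted text contains a newline
text = "XYZ34\n96ABC"

# default behaviour: "." does not match "\n", so there is no match
no_flag = re.search(r"XYZ(.*)ABC", text)
# with re.DOTALL, "." also matches "\n"
with_flag = re.search(r"XYZ(.*)ABC", text, flags=re.DOTALL)

print(no_flag)                    # None
print(repr(with_flag.group(1)))   # '34\n96'
```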

Example 3: Extract string between two strings using split()

Define a simple string:

input_string = "XYZ3496ABC"

If you want to extract a string between two strings such as XYZ and ABC from the input string, you can use the split() function.

ext_string = input_string.split('XYZ')[1].split('ABC')[0]
print(ext_string)

# output
3496

In the above code, we first split the input string using XYZ as the delimiter. This returns a list of two parts: the text before XYZ (an empty string here) and the text after it. We then split the second part using ABC as the delimiter and take the first element, which is 3496.
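
To make the two steps visible, the sketch below prints the intermediate lists. Note that the chained split() raises IndexError when the first delimiter is missing, so the sketch also includes a hypothetical helper named between, built on str.partition(), that returns None instead; this is one possible alternative, not the only way to handle it.

```python
input_string = "XYZ3496ABC"

# step 1: split on the first delimiter
after_start = input_string.split("XYZ")
print(after_start)   # ['', '3496ABC']

# step 2: split the second part on the closing delimiter
before_end = after_start[1].split("ABC")
print(before_end)    # ['3496', '']

def between(text, start, end):
    """Return the text between start and end, or None if either is missing."""
    # partition returns (before, separator, after); the separator is ''
    # when it was not found
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end)
    return middle if found_end else None

print(between(input_string, "XYZ", "ABC"))  # 3496
print(between(input_string, "QRS", "ABC"))  # None
```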