Sometimes you need to do what you need to do: use regular expressions for a trivial task. Imagine that you have a big text and you need to extract all substrings that sit between two specific words or characters. In other words, from this input:
    "Joe Ivan Banana, George Joe J. Banana!, something with Joe K. Banana!"
you want to get this output:
    [' Ivan ', ' J. ', ' K. ']
Here the pair of delimiters is Joe and Banana, and each result keeps the spaces that surround it in the text. The last time I had to do this, the trivial task took me a solid 5 minutes, so I decided to invest another 10 in writing this article; hopefully, in the long run, I will save a minute or two. Long story short, this is the function:
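    import re


    def find_between(s, first, last):
        # Escape the delimiters so characters such as "." or "(" match literally.
        # (.*?) is non-greedy: each match ends at the nearest `last`.
        regex = rf'{re.escape(first)}(.*?){re.escape(last)}'
        # findall returns a list of the captured groups, or [] when nothing matches.
        return re.findall(regex, s)


    s = "Joe Ivan Banana, George Joe J. Banana!, something with Joe K. Banana!"
    found_values = find_between(s, "Joe", "Banana")
    print(found_values)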
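If the delimiters are single characters such as parentheses, or the text spans several lines, a slightly more defensive variant may help. The sketch below is only an illustration: the name `find_between_escaped` and its `strip` parameter are hypothetical and not part of the original function. It assumes the delimiters should always be treated as literal text (`re.escape`) and that a match may cross line breaks (`re.DOTALL`).

```python
import re


def find_between_escaped(text, first, last, strip=False):
    """Return every substring of `text` found between `first` and `last`."""
    # re.escape makes delimiters such as "(" or "." match literally.
    # (.*?) is non-greedy, so each match stops at the nearest `last`.
    # re.DOTALL lets "." also match newlines in multi-line text.
    pattern = re.escape(first) + r"(.*?)" + re.escape(last)
    matches = re.findall(pattern, text, flags=re.DOTALL)
    # Optionally trim the whitespace that surrounds each match.
    return [m.strip() for m in matches] if strip else matches


names = find_between_escaped(
    "Joe Ivan Banana, George Joe J. Banana!, something with Joe K. Banana!",
    "Joe", "Banana", strip=True,
)
print(names)  # ['Ivan', 'J.', 'K.']

bracketed = find_between_escaped("sum(a, b) and max(c,\nd)", "(", ")")
print(bracketed)  # ['a, b', 'c,\nd']
```

Without the escaping, a delimiter like `(` would be read as regex syntax and `re.findall` would raise `re.error` instead of returning matches.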
That’s all! 🙂