Fakhriyanto

Reputation: 727

Python: get text between two strings using a wordlist

I'm new to Python. I have a simple wordlist stored in a .txt file:

hello, hai, hi, halo

I want to get the text between two strings: it should start with any word from the wordlist and end with "." (a dot).

This is the code I tried:

import re
START = open('C:\\Users\\aaaa\\Desktop\\word.txt', 'r')

END = "\."
test = "Hello my name is aaaa."

m=re.compile('%s(.*?)%s' % (START.read(),END),re.S)
print m.search(test).group(1)

and it raises this error:

Traceback (most recent call last):
File "C:\Python27\pyhtonism.py", line 10, in <module>
print m.search(test).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
>>> 

Can anyone help?

Upvotes: 0

Views: 93

Answers (1)

maxymoo

Reputation: 36545

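There are a couple of issues here. As written, the pattern looks for the literal text hello, hai, hi, halo, which never appears in test, so search() returns None and calling .group(1) on it raises the AttributeError.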

  1. To match any one word from a list of words, you need to put the alternatives in their own group in parentheses, separated by the pipe character. In your case this is (hello|hai|hi|halo), which you can achieve by changing your regex to re.compile('(%s)(.*?)%s' % (START.read().replace(', ','|'),END))

  2. Your test string starts with "Hello" while the wordlist contains "hello", so you need a case-insensitive search. Pass the IGNORECASE flag like this:

    m = re.compile('(%s)(.*?)%s' % (START.read().replace(', ','|'),END), flags = re.IGNORECASE)

Note that after this change you need to read group(2) instead of group(1), since the first group now captures the matched wordlist entry (here 'Hello').
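Putting both points together, here is a minimal sketch, written for Python 3 (the question uses Python 2's print statement). The function name text_after_keyword is hypothetical, and the wordlist is passed in as a string instead of being read from word.txt. Two things go beyond the answer above and are my own assumptions about what you want: each word is escaped with re.escape and wrapped in \b word boundaries so that "hi" does not match inside a longer word, and the result of search() is checked for None before group() is called.

```python
import re

def text_after_keyword(wordlist_text, sentence):
    """Return the text between the first wordlist entry and the next dot, or None."""
    # Split "hello, hai, hi, halo" into words, dropping whitespace
    # (including a trailing newline from the file) and empty entries.
    words = [w.strip() for w in wordlist_text.split(",") if w.strip()]
    # Escape each word so regex metacharacters in the wordlist match literally.
    alternation = "|".join(re.escape(w) for w in words)
    # Group 1 is the matched word, group 2 is the text up to the first dot.
    # The \b boundaries are an assumption: they keep "hi" from matching inside "this".
    pattern = re.compile(r"\b(%s)\b(.*?)\." % alternation,
                         re.IGNORECASE | re.DOTALL)
    match = pattern.search(sentence)
    # search() returns None when nothing matches, so check before calling group().
    return match.group(2).strip() if match else None

print(text_after_keyword("hello, hai, hi, halo\n", "Hello my name is aaaa."))
# prints: my name is aaaa
```

To use the file from the question, read it once and pass the contents in, for example text_after_keyword(open(path).read(), test), where path is the location of your word.txt.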

Upvotes: 1
