Reputation:
I want to find every occurrence of a word in a .doc file and collect them all in a list.
Doc file content:
"i love Audi
i love audi
i love AuDi "
When I give audi or Audi as input, it should find all three differently cased "audi" in the file and return a list containing all three.
Upvotes: 0
Views: 65
Reputation: 4518
Try a regular expression: call re.findall on the word and ignore case.
import re
doc_content = 'i love Audi i love audi i love AuDi and audis but not audits or audiences'
# [s]? matches zero or one 's' after audi, so the plural form is included; the \b at the end excludes other words that begin with audi.
results = re.findall(r'\baudi[s]?\b', doc_content, re.IGNORECASE)
print(results)
['Audi', 'audi', 'AuDi', 'audis']
Here is the link for regex in Python - https://docs.python.org/3/howto/regex.html
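Since the question asks for the word to come from user input, here is a minimal sketch of the same idea wrapped in a function. The name find_word_variants is made up for illustration, and the sketch assumes the text of the .doc file has already been read into a string (extracting text from a .doc file is a separate step not covered here). It also assumes the word starts and ends with a letter, digit, or underscore, since \b only marks a boundary next to such characters.

```python
import re

def find_word_variants(word, text):
    """Return every whole-word occurrence of `word` in `text`, ignoring case.

    Hypothetical helper for illustration; not part of the re module.
    """
    # re.escape keeps characters in the user's input from being read as regex syntax.
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.findall(pattern, text, re.IGNORECASE)

# Assumed sample text, mirroring the file content from the question.
doc_content = "i love Audi\ni love audi\ni love AuDi "

print(find_word_variants("audi", doc_content))  # ['Audi', 'audi', 'AuDi']
print(find_word_variants("Audi", doc_content))  # ['Audi', 'audi', 'AuDi']
```

Either spelling of the input gives the same list, because the case of the search word is ignored and findall returns the text exactly as it appears in the document.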
Upvotes: 2
Reputation:
import re
doc_content = 'i love Audi i love audi i love AuDi... but not audis'
results = re.findall(r'\baudi\b', doc_content, re.IGNORECASE) #use \b at start and end to match whole word. This will exclude audis.
print(results)
['Audi', 'audi', 'AuDi']
This works for me. It is exactly what I was looking for: \b solved my issue. Thanks :)
Upvotes: 0
Reputation: 21
A very simple solution is to use Regular Expressions.
import re
string = "i love Audi i love audi i love AuDi"
# List both cases of each letter with no commas: a comma inside [] matches a literal comma.
result = re.findall(r'[Aa][Uu][Dd][Ii]', string)
print(result)
['Audi', 'audi', 'AuDi']
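A short comparison may help when writing character classes by hand. Everything inside square brackets is a member of the class, so a comma placed between the letters is matched as a literal comma. The sample strings below are made up for illustration.

```python
import re

# Assumed sample text containing commas.
text = "i love Audi, i love audi, i love AuDi"

# Explicit classes with both cases of each letter.
without_commas = re.findall(r'[Aa][Uu][Dd][Ii]', text)

# The same search using the IGNORECASE flag instead of spelling out each case.
ignore_case = re.findall(r'audi', text, re.IGNORECASE)

# A class written with a comma also accepts the comma itself.
comma_hit = re.findall(r'[A,a]', "a,A")

print(without_commas)  # ['Audi', 'audi', 'AuDi']
print(ignore_case)     # ['Audi', 'audi', 'AuDi']
print(comma_hit)       # ['a', ',', 'A']
```

For a single known word the two searches return the same matches; the flag is shorter and is easier to combine with \b or with a word supplied at run time.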
Upvotes: 1