Getting a specific word from a doc file irrespective of uppercase/lowercase using Python

I want to find every occurrence of a word in a .doc file and append them all to a list.

Doc file content: "i love Audi i love audi i love AuDi"

When I give audi or Audi as input, it should find all three differently cased forms of "audi" and return a list containing all three.

Upvotes: 0

Answers (3)

Greg

Reputation: 4518

Try a regular expression: call re.findall on the word and pass re.IGNORECASE so that case is ignored.

import re

doc_content = 'i love Audi i love audi i love AuDi and audis  but not audits or audiences'

# The ? makes the preceding 's' optional (zero or one), so the plural form is included.
# The \b at the end excludes other words that merely begin with "audi".
results = re.findall(r'\baudis?\b', doc_content, re.IGNORECASE)

print(results)
['Audi', 'audi', 'AuDi', 'audis']

Here is the link for regex in Python - https://docs.python.org/3/howto/regex.html
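
Since the word comes from user input in the question, the same idea can be wrapped in a small helper. This is only a sketch: the name find_word is made up for illustration, and it assumes the search word starts and ends with a letter, digit or underscore so that \b behaves as expected.

```python
import re

def find_word(text, word):
    """Return every whole-word occurrence of `word` in `text`, ignoring case."""
    # re.escape keeps characters such as '.' or '+' in the input literal.
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.findall(pattern, text, flags=re.IGNORECASE)

doc_content = 'i love Audi i love audi i love AuDi'
matches = find_word(doc_content, 'audi')
print(matches)
```

Passing 'Audi' or 'AUDI' instead of 'audi' gives the same list, because the matches are returned as they appear in the text, not as they were typed.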

Upvotes: 2

user11832421

import re

doc_content = 'i love Audi i love audi i love AuDi... but not audis'

# \b at the start and end matches the whole word only, so "audis" is excluded.
results = re.findall(r'\baudi\b', doc_content, re.IGNORECASE)

print(results)
['Audi', 'audi', 'AuDi']

This works for me. I was looking for this only. \b has solved my issue. Thanks :)

Upvotes: 0

Timm Wünsch

Reputation: 21

A very simple solution is to use Regular Expressions.

import re
string = "i love Audi i love audi i love AuDi"
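# No commas inside the brackets: a character class lists its characters directly,
# so [A,a] would also match a literal comma.
result = re.findall(r'[Aa][Uu][Dd][Ii]', string)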

print(result)
['Audi', 'audi', 'AuDi']
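
If a regular expression is not wanted at all, a plain string comparison can do the same job for text like the sample above. The sketch below is an assumption-laden alternative, not part of the original answer: find_word_no_regex is a hypothetical name, and it only works when words are separated by whitespace with no punctuation attached.

```python
def find_word_no_regex(text, word):
    """Return whitespace-separated tokens equal to `word`, ignoring case."""
    target = word.casefold()
    # Keep each token as written in the text; compare only the case-folded forms.
    return [token for token in text.split() if token.casefold() == target]

string = "i love Audi i love audi i love AuDi"
result = find_word_no_regex(string, "audi")
print(result)
```

A token such as "AuDi..." would not be matched here, which is one reason the regex approach with \b is usually the more robust choice.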

Upvotes: 1
