user11832421

Getting a specific word from a .docx file irrespective of uppercase/lowercase using Python?

I am getting the following output: [[], [], ['Audi'], ['audi'], ['AuDi']]
But I want: ['Audi', 'audi', 'AuDi']
My code is:

import re
from docx import Document

document = Document(r'C:\Users\aliassample02.docx')
list1 = []
for para in document.paragraphs:
    # findall returns a list of matches for this paragraph (possibly empty)
    results = re.findall(r'audi', para.text, re.IGNORECASE)
    list1.append(results)
print(list1)

Upvotes: 4

Views: 265

Answers (5)

user11832421

list1 = [x for para in document.paragraphs 
           for x in re.findall(r'audi', para.text, re.IGNORECASE)]

This is the best solution I have found for my question.

Upvotes: 0

user11832421

list1 = [item for sublist in list1 for item in sublist]

This list comprehension, which flattens the nested list built with append, also works for me.

Upvotes: 0

user11832421

Using extend instead of append worked for me:

list1 = []
for para in document.paragraphs:
    results = re.findall(r'audi', para.text, re.IGNORECASE)
    list1.extend(results)

Upvotes: 0

jezrael

Reputation: 862396

Use list.extend instead of list.append, so that each match is added individually rather than as a sublist:

list1 = []
for para in document.paragraphs:
    results = re.findall(r'audi', para.text, re.IGNORECASE)
    list1.extend(results)

Or you can flatten the values in a list comprehension:

list1 = [x for para in document.paragraphs 
           for x in re.findall(r'audi', para.text, re.IGNORECASE)]
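
To see why append produces nested lists while extend produces a flat one, here is a minimal sketch that runs without a .docx file. The strings in paragraphs are made-up stand-ins for the para.text values and are not taken from the original document.

```python
import re

# Hypothetical stand-ins for para.text values; not from the original file.
paragraphs = ["No match here", "", "I drive an Audi", "audi is fast", "AuDi again"]

nested = []
flat = []
for text in paragraphs:
    results = re.findall(r'audi', text, re.IGNORECASE)
    nested.append(results)  # adds the whole result list as a single element
    flat.extend(results)    # adds each match as its own element

print(nested)  # [[], [], ['Audi'], ['audi'], ['AuDi']]
print(flat)    # ['Audi', 'audi', 'AuDi']
```

Paragraphs without a match contribute an empty list to nested, which is where the leading [] entries in the question's output come from.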

EDIT: To search for several words, assuming list2 is a list of search patterns:

list1 = []
for para in document.paragraphs:
    for x in list2:
        results = re.findall(x, para.text, re.IGNORECASE)
        list1.extend(results)
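
A possible variation on the loop above, assuming list2 holds plain words rather than regular expressions: the words can be combined into one alternation pattern so that each paragraph is scanned once. The helper name find_words and the sample inputs are hypothetical and only for illustration.

```python
import re

def find_words(texts, words):
    """Return all case-insensitive matches of any word in `words`, in text order."""
    if not words:
        return []  # an empty alternation would match the empty string everywhere
    # re.escape guards against words that contain regex metacharacters.
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    matches = []
    for text in texts:
        matches.extend(pattern.findall(text))
    return matches

# Hypothetical inputs standing in for para.text values and for list2.
texts = ["Audi and bmw", "nothing", "BMW beats AuDi"]
list2 = ["audi", "bmw"]
print(find_words(texts, list2))  # ['Audi', 'bmw', 'BMW', 'AuDi']
```

Note that the result order differs from the nested loop: here matches appear in the order they occur in each paragraph, whereas the nested loop groups them by word within each paragraph.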

Upvotes: 4

random_and_unknown

Reputation: 25

You can flatten the list after collecting all the matches:

list1 = [item for sublist in list1 for item in sublist]
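
As an alternative sketch, itertools.chain.from_iterable from the standard library flattens one level of nesting in the same way. The nested list below is copied from the output shown in the question.

```python
from itertools import chain

# Nested result as produced by append in the question.
list1 = [[], [], ['Audi'], ['audi'], ['AuDi']]

# chain.from_iterable yields the items of each sublist in order.
flat = list(chain.from_iterable(list1))
print(flat)  # ['Audi', 'audi', 'AuDi']
```

Both forms only flatten a single level, which is all that is needed for the list of per-paragraph findall results.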

Upvotes: 2
