id345678

Reputation: 107

Extending a list after a for-loop append fails

I am extracting data from PDFs into lists:

from pdfminer.layout import LTChar, LTTextBox, LTTextContainer

# pages is assumed to be an iterable of page layouts,
# e.g. the result of pdfminer.high_level.extract_pages(...)
list1 = []
for page in pages:
    for lobj in page:
        if isinstance(lobj, LTTextBox):
            x, y, text = lobj.bbox[0], lobj.bbox[3], lobj.get_text()

        if isinstance(lobj, LTTextContainer):
            for text_line in lobj:
                for character in text_line:
                    if isinstance(character, LTChar):
                        Font_size = character.size
            list1.append([Font_size,(lobj.get_text())])

        if isinstance(lobj, LTTextContainer):
            for text_line in lobj:
                for character in text_line:
                    if isinstance(character, LTChar):
                        font_name = character.fontname
            list1.append(font_name)

print(list1)

This gives me a list in which each font_name sits next to the sublist holding the size and text, instead of inside it:

list = [[12.0, 'aaa'], 'IJEAMP+Times-Bold', [12.0, 'bbb'], 'IJEAOO+Times-Roman', [12.0, 'ccc'], 'IJEAMP+Times-Bold', [10.0, 'ddd'], 'IJEAOO+Times-Roman',  [10.0, 'eee'], 'IJEAOO+Times-Roman', [8.0, '2\n'], 'IJEAOO+Times-Roman', 'IJEAOO+Times-Roman']

This is how the list of lists should look:

list = [[12.0, 'aaa', 'IJEAMP+Times-Bold'], [12.0, 'bbb', 'IJEAOO+Times-Roman'], [12.0, 'ccc', 'IJEAMP+Times-Bold'], [10.0, 'ddd', 'IJEAOO+Times-Roman'],  [10.0, 'eee', 'IJEAOO+Times-Roman'], [8.0, '2\n', 'IJEAOO+Times-Roman'], 'IJEAOO+Times-Roman']

If possible, I would like an answer that fixes the error in my code. I believe this can be done without creating two lists and zipping them afterwards.

I tried list2.extend([list1, font_name]), but that doesn't do it, as the font_name keeps getting split into individual letters.

Upvotes: 1

Views: 24

Answers (1)

Patrick Artner

Reputation: 51683

You are appending to the outer list, not the list you just added into it. This adds your inner list:

list1.append([Font_size,(lobj.get_text())]) 

If you want to add font_name to that inner list, append to the last element of list1 by using

list1[-1].append(font_name)

instead of

list1.append(font_name)
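To see the difference without a PDF at hand, here is a minimal sketch. The `boxes` data and the `collect_rows` helper are hypothetical stand-ins for the values your pdfminer loop produces, not part of your code. The last lines also show why a font name can end up split into letters: `list.extend` iterates over its argument, and iterating over a string yields its characters.

```python
# Hypothetical stand-in data: (font_size, text, font_name) per text box.
boxes = [
    (12.0, "aaa", "IJEAMP+Times-Bold"),
    (12.0, "bbb", "IJEAOO+Times-Roman"),
]


def collect_rows(boxes):
    """Build one [size, text, font_name] list per text box."""
    rows = []
    for font_size, text, font_name in boxes:
        rows.append([font_size, text])  # adds a new inner list to the outer list
        rows[-1].append(font_name)      # appends to that same inner list
    return rows


rows = collect_rows(boxes)
print(rows)
# [[12.0, 'aaa', 'IJEAMP+Times-Bold'], [12.0, 'bbb', 'IJEAOO+Times-Roman']]

# extend() iterates over its argument, so a bare string is split into characters.
letters = []
letters.extend("Bold")
print(letters)  # ['B', 'o', 'l', 'd']
```

If both values are already available at the same point in the loop, appending `[font_size, text, font_name]` in one call would avoid the second step entirely.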

Upvotes: 1
