PHC

Reputation: 11

I need to get a table with data from a docx file

I am using this code:

from docx import Document

file_path = "/content/my_doc_table.docx"


document = Document(file_path)

tables = document.tables
tables

I get this object: [<docx.table.Table at 0x7f9dcde8ad90>]

Next, I want to load it into pandas. How do I open the table?

Upvotes: 1

Views: 4985

Answers (1)

maciejwww

Reputation: 1196

1. Content's structure

To see how many tables were found you can iterate over tables:

for table in tables:
    print(table)

Example output for a document with two tables in it:

<docx.table.Table object at 0x7f61ad9779d0>
<docx.table.Table object at 0x53rgad9fd498>

The tables that were found (and their columns, rows, and cells) are not only iterable, they can also be accessed by index: tables[0] gives <docx.table.Table at 0x7f61ad9779d0>.
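Before picking a table by index, it can help to see how big each one is. The sketch below assumes only that each table's rows and columns collections support len(), which python-docx tables do; describe_tables is a hypothetical helper name used for illustration, not part of the library.

```python
def describe_tables(tables):
    """Return a (row_count, column_count) pair for each table.

    Illustrative helper, not a python-docx function. It relies only on
    `table.rows` and `table.columns` supporting len().
    """
    return [(len(table.rows), len(table.columns)) for table in tables]
```

With the question's code, describe_tables(document.tables) would return one pair per table found in the file.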


2. Accessing content

To access the content of particular cells, you can reach them through either columns or rows.
Using index access as in the example above, this gives the text of the first cell in the first column of the first table:

tables[0].columns[0].cells[0].text

and here we'll print the content of all cells in the second row of the first table:

for cell in tables[0].rows[1].cells:
    print(cell.text)

Try it yourself!
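Since the question asks about pandas, the same row and cell access can be used to collect the text into a DataFrame. This is a minimal sketch, not the only way to do it: table_to_rows and table_to_dataframe are hypothetical helper names, and the code assumes the first row of the table holds the column names and that every row has the same number of cells.

```python
def table_to_rows(table):
    """Collect the text of every cell, row by row, as a list of lists.

    Illustrative helper (not part of python-docx); it uses only the
    `rows`, `cells` and `text` attributes shown above.
    """
    return [[cell.text for cell in row.cells] for row in table.rows]


def table_to_dataframe(table):
    """Build a DataFrame, ASSUMING the first row holds the column names."""
    import pandas as pd  # imported here so table_to_rows works without pandas

    rows = table_to_rows(table)
    if not rows:
        return pd.DataFrame()  # empty table: nothing to load
    header, body = rows[0], rows[1:]
    return pd.DataFrame(body, columns=header)
```

With the question's code, df = table_to_dataframe(document.tables[0]) would give a DataFrame in which every value is a string, so numeric columns may need converting afterwards, for example with pd.to_numeric. If the table contains merged cells, the same text may show up more than once in a row.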


I hope these examples are enough to show how it works.
The python-docx documentation has everything else you need.

Upvotes: 2
