seminj

Reputation: 33

I need help with read_fwf in Python pandas.

An example of the text file is shown in the picture (image not available).

In the file, the layout of the data changes after the word 'Chapter'. In other words, the reading direction changes from horizontal to vertical.

To solve this, I found read_fwf in the pandas module and tried it, but it failed.

linefwf = pandas.read_fwf('File.txt', widths=[33,33,33], header=None, nwors = 3)

The gap between the categories (Chapter, Title, Assignment) is 33 characters.

But the result (linefwf) contains every line of the file, including the horizontal categories such as Title, Date, and Reservation, as well as the blank lines.

I would like to know how to export only the vertical data.

Upvotes: 3

Views: 9135

Answers (1)

Jonathan Eunice

Reputation: 22473

Let me take a stab in the dark: you wish to turn this table into a column (aka "vertical category"), ignoring the other columns?

I didn't have your precise text, so I guesstimated it. My column widths were different than yours ([11,21,31]) and I omitted the nwors argument (you probably meant to use nrows, but it's superfluous in this case). While the column spec isn't very precise, a few seconds of fiddling left me with a workable DataFrame:

[screenshot of the resulting DataFrame]

This is pretty typical of read-in datasets. Let's clean it up slightly, by giving it real column names, and taking out the separator rows:

df.columns = list(df.loc[0])   # use the first row as the column names
df = df.loc[2:6]               # keep only the data rows (labels 2 through 6, inclusive)

This leaves df with the real column names and only the data rows (index labels 2 through 6).
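Since the original file is not available, here is a minimal sketch of the read-and-clean steps on made-up data. The sample rows, the column widths ([11, 11, 10]), and the dashed separator row are all assumptions chosen for illustration, not the asker's actual file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the file: a header row, a dashed separator row,
# then three data rows, each padded to fixed widths of 11, 11 and 10 characters.
rows = [
    ("Chapter", "Title", "Assignment"),
    ("-" * 10, "-" * 10, "-" * 10),
    ("1-1", "Intro", "Read"),
    ("1-2", "Basics", "Quiz"),
    ("1-3", "Loops", "Essay"),
]
text = "\n".join(f"{a:<11}{b:<11}{c:<10}" for a, b, c in rows)

# Read everything as data first (header=None), as in the answer.
df = pd.read_fwf(io.StringIO(text), widths=[11, 11, 10], header=None)

# Promote the first row to column names, then drop the header and separator rows.
df.columns = list(df.loc[0])
df = df.loc[2:]  # label-based slice: keeps rows 2, 3 and 4
```

With real data you would pass the file path instead of the io.StringIO object and adjust the widths until the columns line up.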

We won't take the time to reindex the rows. Assuming we want the value of a column, we can get it by indexing:

df['Chapter']

Yields:

2    1-1
3    1-2
4    1-3
5    1-4
6    1-5
Name: Chapter, dtype: object

Or if you want it not as a pandas.Series but a native Python list:

list(df['Chapter'])

Yields:

['1-1', '1-2', '1-3', '1-4', '1-5']
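If the file also contains the horizontal lines (Title, Date, and so on) before the table, as the question describes, one option is to find the line that starts with 'Chapter' and parse only from there. This is a rough sketch under assumed file contents: the preamble lines, the table rows, and the widths are hypothetical.

```python
import io
import pandas as pd

# Hypothetical file contents: horizontal "key value" lines, a blank line,
# then a fixed-width table whose header line starts with "Chapter".
preamble = ["Title       Example report", "Date        2015-01-01", ""]
table = [
    ("Chapter", "Title", "Assignment"),
    ("1-1", "Intro", "Read"),
    ("1-2", "Basics", "Quiz"),
]
lines = preamble + [f"{a:<11}{b:<11}{c:<10}" for a, b, c in table]

# Index of the first line that begins the vertical table.
start = next(i for i, line in enumerate(lines) if line.startswith("Chapter"))

# Parse only the table part; its first line is used as the column names.
table_text = "\n".join(lines[start:])
df = pd.read_fwf(io.StringIO(table_text), widths=[11, 11, 10])
chapters = df["Chapter"].tolist()
```

For a file on disk, the lines list could come from open('File.txt').read().splitlines() instead of being built by hand.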

Upvotes: 8
