user1478335

Reputation: 1839

pandas read text file into a dataframe

I have a .txt file

[7, 9, 20, 30, 50]  [1-8]
[9, 14, 27, 31, 45]  [2-5]
[7, 10, 22, 27, 38]  [1-7]

that I am trying to read into a data frame of two columns using df = pd.read_fwf(readfile, header=None). Instead of two columns, it produces a data frame with three columns, and sometimes it splits the first list of numbers into five columns.

    0              1      2
0   [7, 9, 20, 30, 50]  [1-8]
1   [9, 14, 27, 31, 45] [2-5]
2   [7, 10, 22, 27, 38] [1-7]

I do not understand what I am doing wrong. Could someone please help?

Upvotes: 3

Views: 2177

Answers (2)

Michael Szczesny

Reputation: 5036

You can exploit the two spaces between the lists:

pd.read_csv(readfile, sep=r'\s\s', header=None, engine='python')

Out:

                     0      1
0   [7, 9, 20, 30, 50]  [1-8]
1  [9, 14, 27, 31, 45]  [2-5]
2  [7, 10, 22, 27, 38]  [1-7]

Without an explicit widths argument, pd.read_fwf tries to infer the fixed widths. But the length of the first list varies from line to line, so there is no fixed width that separates every line into two columns.
The widths argument is very useful if your data has no delimiter but a fixed number of characters per value. 40 years ago this was a common data format.

# data.txt
20200810ITEM02PRICE30COUNT001
20200811ITEM03PRICE31COUNT012
20200812ITEM12PRICE02COUNT107
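
As a rough sketch of how widths could be applied to the data.txt sample above: the field widths and column names below are assumptions read off the sample lines (8 characters for the date, then each label followed by its value), not something the file format itself declares. The sample is fed through io.StringIO so the snippet runs without a file on disk.

```python
import io

import pandas as pd

# Inline copy of the data.txt sample, so no file is needed.
fixed_width_data = io.StringIO(
    "20200810ITEM02PRICE30COUNT001\n"
    "20200811ITEM03PRICE31COUNT012\n"
    "20200812ITEM12PRICE02COUNT107\n"
)

# Assumed layout: date(8) "ITEM"(4) item(2) "PRICE"(5) price(2) "COUNT"(5) count(3)
widths = [8, 4, 2, 5, 2, 5, 3]
names = ["date", "item_label", "item", "price_label", "price", "count_label", "count"]

df_fwf = pd.read_fwf(
    fixed_width_data,
    widths=widths,
    header=None,
    names=names,
    dtype=str,  # keep leading zeros such as "001"
)

# Drop the constant label columns and keep only the values.
df_fwf = df_fwf[["date", "item", "price", "count"]]
print(df_fwf)
```

Reading everything as strings is a deliberate choice here, since values like 001 would otherwise lose their leading zeros; convert individual columns afterwards if numbers are needed.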

The sep argument of pd.read_csv accepts multi-character and regex delimiters. This is often a more flexible way to split strings into columns.
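
A minimal sketch of the regex-delimiter approach on the data from the question, assuming the two fields are always separated by at least two whitespace characters. The column names numbers and span are made up for illustration, and the ast.literal_eval step is an optional extra that turns the first column from strings into real Python lists.

```python
import ast
import io

import pandas as pd

# Inline copy of the .txt file from the question.
sample = io.StringIO(
    "[7, 9, 20, 30, 50]  [1-8]\n"
    "[9, 14, 27, 31, 45]  [2-5]\n"
    "[7, 10, 22, 27, 38]  [1-7]\n"
)

# Split on runs of two or more whitespace characters; the single spaces
# inside the first list are left alone. Regex separators need the
# python engine.
df_lists = pd.read_csv(
    sample,
    sep=r"\s{2,}",
    header=None,
    names=["numbers", "span"],  # hypothetical column names
    engine="python",
)

# Both columns are plain strings after parsing; convert the first one
# into actual lists of integers.
df_lists["numbers"] = df_lists["numbers"].apply(ast.literal_eval)
print(df_lists)
```

Using \s{2,} instead of \s\s is a small hedge against lines that happen to have three or more spaces between the fields.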

Upvotes: 3

Rahul kuchhadia

Reputation: 299

You can read the file in a single line using pandas:

import pandas as pd
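# header=None keeps the first data row from being used as column names;
# a regex separator requires the python engine
df = pd.read_csv(readfile, sep=r'\s\s', header=None, engine='python')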

Upvotes: 0
