Reputation: 1839
I have a .txt file
[7, 9, 20, 30, 50] [1-8]
[9, 14, 27, 31, 45] [2-5]
[7, 10, 22, 27, 38] [1-7]
that I am trying to read into a two-column data frame using df = pd.read_fwf(readfile, header=None)
Instead of two columns, it produces a data frame with three columns, and sometimes it splits the first list of numbers into five columns:
0 1 2
0 [7, 9, 20, 30, 50] [1-8]
1 [9, 14, 27, 31, 45] [2-5]
2 [7, 10, 22, 27, 38] [1-7]
I do not understand what I am doing wrong. Could someone please help?
Upvotes: 3
Views: 2177
Reputation: 5036
You can exploit the two spaces between the lists:
pd.read_csv(readfile, sep=r'\s\s', header=None, engine='python')
Out:
0 1
0 [7, 9, 20, 30, 50] [1-8]
1 [9, 14, 27, 31, 45] [2-5]
2 [7, 10, 22, 27, 38] [1-7]
Without an explicit widths argument, pd.read_fwf tries to infer the fixed widths. But the length of the first list varies from line to line, so there is no fixed width that splits every line into two columns.
The widths argument is very useful if your data has no delimiter but a fixed number of characters per value. 40 years ago this was a common data format.
# data.txt
20200810ITEM02PRICE30COUNT001
20200811ITEM03PRICE31COUNT012
20200812ITEM12PRICE02COUNT107
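As a rough sketch of how the widths argument could be applied to the data.txt sample above: the widths 8, 6, 7, 8 are counted from the sample lines, and the column names are hypothetical labels chosen only for illustration. The data is read from an in-memory string here so the snippet runs on its own.

```python
import io

import pandas as pd

# Same three lines as the data.txt sample, held in memory.
data = (
    "20200810ITEM02PRICE30COUNT001\n"
    "20200811ITEM03PRICE31COUNT012\n"
    "20200812ITEM12PRICE02COUNT107\n"
)

# Widths counted from the sample: 8 (date) + 6 (item) + 7 (price) + 8 (count).
# The column names are hypothetical; dtype=str keeps every field as text.
fwf_df = pd.read_fwf(
    io.StringIO(data),
    widths=[8, 6, 7, 8],
    header=None,
    names=["date", "item", "price", "count"],
    dtype=str,
)
print(fwf_df)
```

With a real file you would pass its path in place of the io.StringIO object.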
The sep argument of pd.read_csv accepts multi-character and regex delimiters. This is often a more flexible way to split strings into columns.
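A minimal sketch of the regex-separator idea, using the rows from the question held in an in-memory string. The first read assumes the two lists are separated by at least two spaces. The second uses a lookaround pattern that splits only on whitespace between "]" and "[", which is an alternative of my own for files where the gap is a single space. The column names are hypothetical.

```python
import io

import pandas as pd

# Rows from the question, with two spaces between the two lists.
two_space = (
    "[7, 9, 20, 30, 50]  [1-8]\n"
    "[9, 14, 27, 31, 45]  [2-5]\n"
    "[7, 10, 22, 27, 38]  [1-7]\n"
)

# Regex separators require the Python parsing engine.
df_a = pd.read_csv(
    io.StringIO(two_space),
    sep=r"\s{2,}",          # two or more whitespace characters
    header=None,
    names=["numbers", "range"],  # hypothetical column names
    engine="python",
)

# Same rows, but with a single space between the lists.
one_space = two_space.replace("]  [", "] [")

# Split only on whitespace that sits between a closing and an opening bracket.
df_b = pd.read_csv(
    io.StringIO(one_space),
    sep=r"(?<=\])\s+(?=\[)",
    header=None,
    names=["numbers", "range"],
    engine="python",
)
print(df_a)
print(df_b)
```

Both columns come back as plain strings; turning them into Python lists would need a separate parsing step.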
Upvotes: 3
Reputation: 299
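You can read the file in a single line with pandas:
import pandas as pd
# header=None keeps the first row as data; a regex separator needs the Python engine
df = pd.read_csv(readfile, sep=r'\s\s', header=None, engine='python')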
Upvotes: 0