Jacky Man
Jacky Man

Reputation: 17

Reading .txt file into pandas with empty entries in several columns

I am trying to parse the txt file into readable pandas format.

See this sample .txt file.

Arsenal     123                             UK
Barcelona   456    Marc-Andre Ter Stegen    ES
Chelsea     789    Ceasr Azpilicueta        UK

from this .txt file, the file clearly stated three columns despite a name is missing in Arsenal.

I tried to load this .txt file into pandas and then I used str.split with more than one space command to try to separate the columns. However, the country code UK would fall to column 3 instead.

Below is my desired output:

|Arsenal   |123|NAN                  |UK|
|Barcelona |456|Marc-Andre Ter Stegen|ES|
|Chelsea   |789|Cesar Azpilicueta    |UK|

Upvotes: 0

Views: 130

Answers (1)

Naveed
Naveed

Reputation: 11650

you can use read_fwf (fix width format) to read in the file

# to read in a dataframe
df=pd.read_fwf(r'csv2.txt', header=None) 
df
         0        1     2                       3
0   Arsenal     123     NaN                     UK
1   Barcelona   456     Marc-Andre Ter Stegen   ES
2   Chelsea     789     Ceasr Azpilicueta       UK
# to write to a CSV
(pd.read_fwf(r'csv2.txt', header=None)
 .fillna('NAN')
 .to_csv('out.csv', sep='|', header=None))

0|Arsenal|123|NAN|UK
1|Barcelona|456|Marc-Andre Ter Stegen|ES
2|Chelsea|789|Ceasr Azpilicueta|UK

PS: you can also pass colspec like following, describing from-to for each column

#colspecs = [(0, 6), (8, 20), (21, 33), (34, 43)]

Upvotes: 1

Related Questions