Reputation: 17
I am trying to parse the txt file into readable pandas format.
See this sample .txt file.
Arsenal 123 UK
Barcelona 456 Marc-Andre Ter Stegen ES
Chelsea 789 Ceasr Azpilicueta UK
from this .txt file, the file clearly stated three columns despite a name is missing in Arsenal.
I tried to load this .txt file into pandas and then I used str.split with more than one space command to try to separate the columns. However, the country code UK would fall to column 3 instead.
Below is my desired output:
|Arsenal |123|NAN |UK|
|Barcelona |456|Marc-Andre Ter Stegen|ES|
|Chelsea |789|Cesar Azpilicueta |UK|
Upvotes: 0
Views: 130
Reputation: 11650
you can use read_fwf (fix width format) to read in the file
# to read in a dataframe
df=pd.read_fwf(r'csv2.txt', header=None)
df
0 1 2 3
0 Arsenal 123 NaN UK
1 Barcelona 456 Marc-Andre Ter Stegen ES
2 Chelsea 789 Ceasr Azpilicueta UK
# to write to a CSV
(pd.read_fwf(r'csv2.txt', header=None)
.fillna('NAN')
.to_csv('out.csv', sep='|', header=None))
0|Arsenal|123|NAN|UK
1|Barcelona|456|Marc-Andre Ter Stegen|ES
2|Chelsea|789|Ceasr Azpilicueta|UK
PS: you can also pass colspec like following, describing from-to for each column
#colspecs = [(0, 6), (8, 20), (21, 33), (34, 43)]
Upvotes: 1