Mainland

Reputation: 4584

Python: convert a .txt file with special characters into a dataframe

I have a '.txt' file that I want to import and convert into a dataframe, but I am running into issues.

My code:

#The raw.txt file content: 
#A& B & C & D & E
#foo& 13.52 & 333.2 & 4504.4 & 0
#1 taw & 13.49 & 314.6 & 4.6 & 1.29
#2 ewq & 35.44 & 4.2 & 5.2 & 3.06
#3 asd & 13.41 & 4.1 & 6.8 & 5.04
#4 er & 13.37 & 230.0 & 7.1 & 7.07
#5 we & 13.33 & 199.7 & 8.9 & 9.12
#6 wed & 13.27 & 169.4 & 8.6 & 11.17

import pandas as pd
df = pd.read_csv('raw.txt', delimiter = "\n",sep=" & ")

print(df.columns)

Index(['A& B & C & D & E'], dtype='object')

It did not quite convert into a dataframe: it failed to recognize the columns and read all of them as a single column.

Upvotes: 1

Views: 302

Answers (2)

Quang Hoang

Reputation: 150785

delimiter and sep are aliases, so you can use either of them. Use a regular expression as the separator to absorb the spaces around each &, and skiprows=1 to skip the first row:

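# Raw string avoids an invalid-escape warning; regex separators need the Python engine
pd.read_csv('filename.txt', sep=r'\s*&\s*', skiprows=1, engine='python')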

Output:

       #A      B      C       D      E
0    #foo  13.52  333.2  4504.4   0.00
1  #1 taw  13.49  314.6     4.6   1.29
2  #2 ewq  35.44    4.2     5.2   3.06
3  #3 asd  13.41    4.1     6.8   5.04
4   #4 er  13.37  230.0     7.1   7.07
5   #5 we  13.33  199.7     8.9   9.12
6  #6 wed  13.27  169.4     8.6  11.17
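
To try the regex separator without creating a file, here is a minimal sketch that reads a few rows of the question's data from an in-memory string. It assumes the real file does not contain the leading '#' markers (they look like comments in the question), so skiprows is not needed here; the variable names are illustrative only.

```python
import io

import pandas as pd

# First rows from the question, assuming the '#' prefixes are not in the real file.
raw = """A& B & C & D & E
foo& 13.52 & 333.2 & 4504.4 & 0
1 taw & 13.49 & 314.6 & 4.6 & 1.29
2 ewq & 35.44 & 4.2 & 5.2 & 3.06
"""

# A multi-character separator is interpreted as a regular expression.
# r"\s*&\s*" matches an ampersand together with any surrounding whitespace,
# so "A& B" and "1 taw & 13.49" are split the same way.
df = pd.read_csv(io.StringIO(raw), sep=r"\s*&\s*", engine="python")

print(df.columns.tolist())
print(df)
```

Because the whitespace is consumed by the separator itself, neither the column names nor the values need to be stripped afterwards.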

Upvotes: 2

Jose Vega

Reputation: 579

In pd.read_csv(), delimiter is an alias for sep, as described in the pandas documentation for read_csv. Passing both therefore does not select the correct delimiter for your file.

You can use the following:

pd.read_csv("raw.txt", sep="&")

If you use sep=" & ", the header and the second line of your file will not split like the other rows, because they have no space before the first & ("A& B", "foo& 13.52"). Those lines end up with fewer columns than the rest, and parsing fails.

And that should work.
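
One caveat with a plain "&" separator is that the spaces around each field are kept. A minimal sketch of one way to clean them up, again reading a few rows of the question's data from a string and assuming the '#' markers are not part of the real file:

```python
import io

import pandas as pd

# First rows from the question, assuming the '#' prefixes are not in the real file.
raw = """A& B & C & D & E
foo& 13.52 & 333.2 & 4504.4 & 0
1 taw & 13.49 & 314.6 & 4.6 & 1.29
2 ewq & 35.44 & 4.2 & 5.2 & 3.06
"""

# skipinitialspace drops the spaces that follow each "&";
# spaces that precede an "&" are still part of the field.
df = pd.read_csv(io.StringIO(raw), sep="&", skipinitialspace=True)

# Remove the remaining trailing spaces from the header and the text column.
df.columns = df.columns.str.strip()
df["A"] = df["A"].str.strip()

print(df.columns.tolist())
print(df["A"].tolist())
```

This keeps the default parser and a single-character separator, at the cost of the extra stripping step.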

Upvotes: 1
