Alex J

Reputation: 92

python pandas dataframe to csv exporting a formatted text file with unique formats for each column

I am using Python 2.7.7 with Pandas on Win 7 64-bit. My input data was originally space-delimited and right-justified. I now have the data in a Pandas dataframe, which I exported as a CSV. I want to write a space-delimited, right-justified text file. The columns contain strings, ints, and floats. I tried to format one of the columns using this:

df_fg['Mem']=df_fg['Mem'].map('{:5d}'.format)

This allows me to format each column individually, which is great.

The problem is that when I use this type of formatting I cannot output a space-delimited file. Here are the various ways I tried to write the text file:

df_fg.to_csv('t.txt',index = False)

Not surprisingly, this produces a CSV file whose fields are padded with spaces.

So I thought the next logical step would be to pass "sep" to get rid of the commas.

df_fg.to_csv('t.txt',index = False,sep= ' ') 

This produces formatted text in the text file, but each element in every column is surrounded by double quotes. So I get a column that looks like this:

"    1"
"    1"

I tried various combinations of the "quoting" and "doublequote" options of .to_csv. Nothing works. I end up with either formatted text inside double quotation marks or formatted text in a comma-separated file. I can't get plain formatted text in a text file.

Maybe I should not use "map" and "format"? Any advice on how to write right-justified, space-delimited strings, ints, and floats from a dataframe or CSV would be very much appreciated.

I also attempted to write the dataframe to a string. I formatted each column in the dataframe using commands such as:

df_g['Mem']=df_g['Mem'].map('{:4d}'.format)

df_g['Date1']=df_g['Date1'].map('{:12s}'.format)

I then wrote the dataframe out with to_string, hoping that the output would be right-justified:

f2 = open('2.txt','w')
s=df_g.to_string(justify='right',index = False)
f2.write(s) 
f2.close() 

In the text file, not all columns were right-justified. Column 1 contains an integer and was right-justified as expected. Column 5 contains a float with 2 decimals and was right-justified as expected. Columns 2, 3, and 4 were strings (I used the command below to make them strings in the dataframe) and were not right-justified:

df_g['Date1']=df_g['Date1'].map('{:12s}'.format)

1,26/04/2015 ,09:19:07 ,more-text , -1600.00,

(I am showing the commas only to mark where the fields begin and end.)

So I still cannot find a way for dataframe.to_string to output right-justified strings. Interestingly, map with format DOES change the length of the strings (and the spacing), but justify='right' had no effect on them.

Any advice?

Upvotes: 1

Views: 3670

Answers (1)

JoeCondron

Reputation: 8906

I think this might give you what you're looking for. First pad the column entries as you suggest, so that every cell is a string (df_string below). Then sum along axis 1, which concatenates the strings in each row:

s = df_string.sum(axis=1)

This is a series with a string in each entry representing a row in the original df. Then just add a line break to each element and sum again:

s = (s + '\n').sum()

Then just write the file you want:

with open('t.txt', 'w') as f:
    f.write(s)
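The steps above can also be sketched end to end with a separate format for each column. This is only a minimal sketch: the sample data, the column widths, and the names `columns` and `formats` are made up for illustration, and it joins the padded cells with plain Python instead of summing strings, which should give the same text.

```python
import pandas as pd

# Hypothetical sample data; the column names loosely mirror the question.
df = pd.DataFrame({
    'Mem': [1, 23],
    'Date1': ['26/04/2015', '27/04/2015'],
    'Amount': [-1600.0, 12.5],
})

# Output order and one format spec per column (widths are assumptions).
# '>' asks for right-justification explicitly.
columns = ['Mem', 'Date1', 'Amount']
formats = {'Mem': '{:>5d}', 'Date1': '{:>12s}', 'Amount': '{:>10.2f}'}

# Pad every column to fixed-width strings.
padded = [df[col].map(formats[col].format) for col in columns]

# Concatenate the padded cells of each row, then join rows with newlines.
lines = [''.join(cells) for cells in zip(*padded)]
text = '\n'.join(lines) + '\n'

with open('t.txt', 'w') as f:
    f.write(text)
```

Because every cell already carries its own padding, no separator or quoting option is involved, so nothing gets wrapped in double quotes.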

Here's a stupidly terse one-liner example:

import pandas as pd
df = pd.DataFrame({'A': [1.2, 2.34], 'B': ['foo', 'bar']})
# On pandas >= 2.1, DataFrame.map replaces the deprecated applymap.
print((df.applymap(lambda x: '{:>20s}'.format(str(x))).sum(axis=1) + '\n').sum())

                 1.2                 foo
                2.34                 bar
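One detail is worth checking in the padding step: with str.format, strings are left-justified by default while numbers are right-justified, which would explain why columns padded with '{:12s}' end up with trailing spaces. Adding '>' to the spec, as in the one-liner, forces right-justification. A small illustration (the values are arbitrary):

```python
# Default alignment: strings pad on the right, numbers pad on the left.
left = '{:12s}'.format('09:19:07')
right = '{:>12s}'.format('09:19:07')
number = '{:12d}'.format(42)

print(repr(left))    # '09:19:07    '
print(repr(right))   # '    09:19:07'
print(repr(number))  # '          42'
```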

Upvotes: 1
