RedVII
RedVII

Reputation: 493

Pandas data frame to comma delimit with attention to data type

I have a data frame like this one:

import pandas as pd
trx = {
    'transaction_id': [1,2],
    'date': ['1/1/2017','1/2/2017'],
    'sale_amt': [50.25,99.30],
    'user': ['foo','bar']
    }
df = pd.DataFrame(trx, columns = ['transaction_id','date','sale_amt','user'])
df

   transaction_id      date  sale_amt user
0               1  1/1/2017     50.25  foo
1               2  1/2/2017     99.30  bar

Now what I want to do is convert this small data frame to a comma delimited list for each row, and I've already managed to do that like so:

df2 = df.apply(lambda row: ','.join(map(str,row)),axis=1)
df2

0    1,1/1/2017,50.25,foo
1     2,1/2/2017,99.3,bar

Fair enough, but I want this to be more dynamic. I want single quotes to wrap around the text and date fields. So I'm thinking I can create a list with all the datatypes and take it from there, except I don't know how to do that...

coltypes = ["int","date","num","text"]

Desired output:

0    1,'1/1/2017',50.25,'foo'
1     2,'1/2/2017',99.3,'bar'

How can I achieve the desired output using the coltypes list of datatypes?

Upvotes: 0

Views: 59

Answers (2)

spicypumpkin
spicypumpkin

Reputation: 1219

Using repr() will get the job done quick and easy in your specific case.

import pandas as pd
trx = {
    'transaction_id': [1,2],
    'date': ['1/1/2017','1/2/2017'],
    'sale_amt': [50.25,99.30],
    'user': ['foo','bar']
    }

trx['date'] = list(map(repr, trx['date']))
trx['user'] = list(map(repr, trx['user']))

Alternatively, you can also apply the repr() function across the entire DataFrame as the numerical fields will not show single quotes around them.

Upvotes: 0

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210832

If you don't specify path_or_buf parameter when calling df.to_csv() function it'll return a CSV file content as a string. After that we can split it into separate rows:

In [291]: import csv

In [292]: pd.Series(df.to_csv(header=None, index=False,
     ...:                     quoting=csv.QUOTE_NONNUMERIC).split(),
     ...:           index=df.index)
     ...:
Out[292]:
0    1,"1/1/2017",50.25,"foo"
1     2,"1/2/2017",99.3,"bar"
dtype: object

Upvotes: 3

Related Questions