tumbleweed

Reputation: 4640

How to transform a Python tuple to a .csv file?

I would like to transform a Python tuple to a .csv file. Let's say I have a retrive() function and I print its result with pprint like this:

test = tuple(retrive(directory))
pprint(test, width=1)

Then:

("opinion_1.txt, I am an amateur photographer and own three DSLR c.... purchase",
 "opinion_2.txt, This my second Sony Digital Came.... good camera for a good price!",
 'opinion_3.txt, \'I ordered this camera with high hopes after  couldn\\\'t find.\'')

So, I tried this with the csv module:

with open('/Users/user/Downloads/output.csv','w') as out:
    csv_out=csv.writer(out)
    csv_out.writerow(['id','content'])
    for row in test:
        csv_out.writerow(row)

The problem is that I get a weird output which looks like this:

id,content
o,p,i,n,i,o,n,_,1,.,t,x,t,",", ,I, ,a,m, ,a,n, ,a,m,a,t,e,u,r, ,p,h,o,t,o,g,r,a,p,h,e,r, ,a,n,d, ,o,w,n, ,t,h,r,e,e, ,D,S,L,R, ,c,a,m,e,r,a,s, ,w,i,t,h, ,a, ,s,e,l,e,c,t,i,o,n, ,o,f, ,l,e,n,s,e,s,., ,H,o,w,e,v,e,r, ,t,h,a,t, ,c,o,l,l,e,c,t,i,o,n, 

How can I get something like this:

opinion_1.txt,I am an amateur photographer and own three DSLR c.... purchase
opinion_2.txt,This my second Sony Digital Came.... good camera for a good price!
opinion_3.txt,I ordered this camera with high hopes after  couldn\\\'t find.

Upvotes: 2

Views: 4883

Answers (3)

jezrael

Reputation: 862791

If you need a pandas solution, use the DataFrame constructor and to_csv:

import pandas as pd

df = pd.DataFrame([ x.split(',') for x in test ])
df.columns = ["id","content"]
print(df)
#              id                                            content
#0  opinion_1.txt   I am an amateur photographer and own three DS...
#1  opinion_2.txt   This my second Sony Digital Came.... good cam...
#2  opinion_3.txt   'I ordered this camera with high hopes after ...

#for testing
#print(df.to_csv(index=False))
df.to_csv("/Users/user/Downloads/output.csv", index=False)
#id,content
#opinion_1.txt, I am an amateur photographer and own three DSLR c.... purchase
#opinion_2.txt, This my second Sony Digital Came.... good camera for a good price!
#opinion_3.txt, 'I ordered this camera with hig

If a string contains multiple commas, you can split on the first occurrence of , only:

import pandas as pd

test = ("opinion_1.txt,a", "opinion_2.txt,b", "opinion_3.txt,c", "opinion_3.txt,b,c,k")
print(test)

# maxsplit=1 splits only at the first comma
print([x.split(',', 1) for x in test])
#[['opinion_1.txt', 'a'],
# ['opinion_2.txt', 'b'],
# ['opinion_3.txt', 'c'],
# ['opinion_3.txt', 'b,c,k']]

df = pd.DataFrame([x.split(',', 1) for x in test])
df.columns = ["id", "content"]
print(df)
#              id content
#0  opinion_1.txt       a
#1  opinion_2.txt       b
#2  opinion_3.txt       c
#3  opinion_3.txt   b,c,k

print(df.to_csv(index=False))
#id,content
#opinion_1.txt,a
#opinion_2.txt,b
#opinion_3.txt,c
#opinion_3.txt,"b,c,k"

Upvotes: 1

Alexander

Reputation: 109546

Your parsing breaks if one of your sentences contains multiple commas, like this:

s = "opinion_4.txt, Oh my, what happens with really, really long sentences?"

>>> s.split(", ")
['opinion_4.txt',
 'Oh my',
 'what happens with really',
 'really long sentences?']

A better approach would be to find the first comma and then split the sentence using slicing at this location:

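for line in test:
    # index of the first ', ' separating the id from the content
    comma_idx = line.find(', ')
    # writerow takes a single sequence of fields, so pass a list
    csv_out.writerow([line[:comma_idx], line[comma_idx+2:]])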

For the sentence above, it would result in this:

('opinion_4.txt', 'Oh my, what happens with really, really long sentences?')
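
As a rough sketch of the same idea, str.partition also splits at the first occurrence of a separator, and it copes with lines that contain no separator at all (where find would return -1 and the slicing would silently cut the wrong characters). The helper name split_first is hypothetical and not part of the answer above:

```python
def split_first(line, sep=", "):
    """Split line into [id, content] at the first occurrence of sep."""
    # partition returns (head, sep, tail); tail is '' when sep is absent
    head, _, tail = line.partition(sep)
    return [head, tail]


s = "opinion_4.txt, Oh my, what happens with really, really long sentences?"
row = split_first(s)
print(row)
```

The returned list can be passed straight to csv_out.writerow(...).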

Upvotes: 1

Andrii Rusanov

Reputation: 4606

The csv writer iterates over whatever you pass to writerow. Here that is a string from the tuple, so every character becomes its own field. Change your code to:

for row in test:
    csv_out.writerow(row.split(', ', 1))

This splits each string in the tuple at the first occurrence of ', '. That produces two elements per row, which is what the csv writer needs.
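
To see both behaviours side by side, here is a minimal self-contained sketch. It assumes Python 3, uses made-up sample strings shaped like the ones in the question, and writes to an in-memory io.StringIO buffer instead of a real file, so the path from the question is not needed:

```python
import csv
import io

# Sample data assumed to resemble the tuple in the question.
test = (
    "opinion_1.txt, I am an amateur photographer",
    "opinion_2.txt, Good camera, good price!",
)

# A str is itself a sequence of characters, so passing it to writerow
# produces one field per character.
broken = io.StringIO()
csv.writer(broken).writerow("ab, c")

# Splitting once on ', ' gives exactly two fields per row.
out = io.StringIO()  # stands in for open(path, 'w', newline='')
csv_out = csv.writer(out)
csv_out.writerow(["id", "content"])
for row in test:
    csv_out.writerow(row.split(", ", 1))

result = out.getvalue()
print(result)
```

Content that still contains a comma is quoted by the writer, so it stays in a single column when the file is read back. When writing to a real file in Python 3, the csv documentation recommends opening it with newline=''.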

Upvotes: 4
