Reputation: 427
I have a dataframe like the one below:
import numpy as np
import pandas as pd
data = {'Words': ['actually', 'he', 'came', 'from', 'home', 'and', 'played'],
        'Col2': ['2', '0', '0', '0', '1', '0', '3']}
data = pd.DataFrame(data)
The dataframe looks like this:
      Words Col2
0  actually    2
1        he    0
2      came    0
3      from    0
4      home    1
5       and    0
6    played    3
I write this dataframe to disk with the following command:
np.savetxt('/folder/file.txt', data.values,fmt='%s', delimiter='\t')
The next script then reads it back with this line:
data = load_file('/folder/file.txt')
Below is the load_file function, which reads a text file into a list of lines:
def load_file(filename):
    with open(filename, 'r', encoding='utf-8') as f:
        data = f.readlines()
    return data
The result is a list of tab-separated strings, one per row.
print(data)
gives me the following output:
['actually\t2\n', 'he\t0\n', 'came\t0\n', 'from\t0\n', 'home\t1\n', 'and\t0\n', 'played\t3\n']
I don't want to write the file to disk and then read it back for processing. Instead, I want to convert the dataframe to a list of tab-separated strings and process it directly. How can I achieve this?
I checked existing answers, but most of them convert a list to a dataframe, not the other way around.
Thanks in advance.
Upvotes: 2
Views: 2521
Reputation: 13349
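Try using .to_csv(). With no path given it returns the CSV text as a string, which you can split into lines (splitlines() avoids the empty trailing item that split('\n') would leave):
df_list = data.to_csv(header=None, index=False, sep='\t').splitlines()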
df_list:
['actually\t2',
'he\t0',
'came\t0',
'from\t0',
'home\t1',
'and\t0',
'played\t3'
]
If you also want the trailing newline on each item, as in your file-based output:
df_list = [line + '\n' for line in data.to_csv(header=None, index=False, sep='\t').splitlines()]
df_list:
['actually\t2\n',
'he\t0\n',
'came\t0\n',
'from\t0\n',
'home\t1\n',
'and\t0\n',
'played\t3\n'
]
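As a quick sanity check, the sketch below compares the in-memory result with the file round trip from the question. It writes to a temporary directory instead of '/folder/file.txt', and the names tmp_dir, path, in_memory and from_file are only illustrative. It assumes the cells contain no tabs, quotes or embedded newlines, since to_csv would quote such values.

```python
import os
import tempfile

import numpy as np
import pandas as pd

data = pd.DataFrame({
    'Words': ['actually', 'he', 'came', 'from', 'home', 'and', 'played'],
    'Col2': ['2', '0', '0', '0', '1', '0', '3'],
})

# In-memory version: serialize to a string, split into rows, re-attach '\n'.
csv_text = data.to_csv(header=False, index=False, sep='\t')
in_memory = [line + '\n' for line in csv_text.splitlines()]

# File-based version from the question, written to a temporary directory
# (an assumption made here so the example does not need '/folder').
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'file.txt')
    np.savetxt(path, data.values, fmt='%s', delimiter='\t')
    with open(path, 'r', encoding='utf-8') as f:
        from_file = f.readlines()

# True when both approaches produce the same list of lines.
print(in_memory == from_file)
```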
Upvotes: 2
Reputation: 85
I think this achieves the same result without writing to the drive:
df_list = list(data.apply(lambda row: row['Words'] + '\t' + row['Col2'] + '\n', axis=1))
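Note that the line above relies on both columns already holding strings. If some columns might be numeric, or there are more than two columns, a variant along the following lines may be easier to reuse. This is only a sketch: the integer Col2 is an assumption made here to show the conversion, not the data from the question.

```python
import pandas as pd

data = pd.DataFrame({
    'Words': ['actually', 'he', 'came'],
    'Col2': [2, 0, 0],  # integers here (assumption), unlike the question's strings
})

# Convert every cell to str, then join each row with tabs and append '\n'.
rows = data.astype(str).values.tolist()
df_list = ['\t'.join(row) + '\n' for row in rows]

print(df_list)
# ['actually\t2\n', 'he\t0\n', 'came\t0\n']
```

Because it joins whatever columns are present, this does not need the column names spelled out.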
Upvotes: 1