Wood
Wood

Reputation: 23

Writing to multiple columns in csv

I'm trying to read one csv file and write specific rows of that file into another file.

The code runs fine, but the output is not formatted properly:

import pandas as pd
import sys

f = open("output.csv", 'w')
sys.stdout = f

df = pd.read_csv('original_file.csv', low_memory=False)

print df[(df.name == 'fullName')]
print df[(df.name == 'LastName')]

f.close()

In the original file there are multiple columns, all filled with strings. I want to print every row where the name column equals fullName and LastName. However output.csv has all of the data crammed into a single column.

I'm doing all of this on Ubuntu using Vim. I don't know if that would make a difference.

How do I get the output data to write to its corresponding column in output.csv?

Upvotes: 2

Views: 2684

Answers (2)

Andrey Portnoy
Andrey Portnoy

Reputation: 1509

The last line of my solution is wrong. Because of operator precedence rules, the boolean array is being compared to a dataframe column, which is not what one might be looking for.

What you are doing essentially is you write two columns sequentially. Try the following:

import pandas as pd

# read file
df = pd.read_csv('original_file.csv', low_memory=False)

# write select columns of the dataframe to output.csv
df[df['name'] == 'fullName' | df['name'] == 'LastName' ].to_csv('output.csv')

Upvotes: 0

Adarsh Chavakula
Adarsh Chavakula

Reputation: 1589

This should work:

df = pd.read_csv('original_file.csv', low_memory=False) # read dataframe
new_df = df.loc[(df.name == 'fullName')|(df.name == 'LastName')] # select rows with name == fullname or lastname
new_df.to_csv("output.csv", index=False) # write to csv

Upvotes: 2

Related Questions