Reputation: 23
I'm trying to read one csv file and write specific rows of that file into another file.
The code runs fine, but the output is not formatted properly:
import pandas as pd
import sys
f = open("output.csv", 'w')
sys.stdout = f
df = pd.read_csv('original_file.csv', low_memory=False)
print df[(df.name == 'fullName')]
print df[(df.name == 'LastName')]
f.close()
In the original file there are multiple columns, all filled with strings. I want to print every row where the name
column equals fullName
and LastName
. However output.csv
has all of the data crammed into a single column.
I'm doing all of this on Ubuntu using Vim. I don't know if that would make a difference.
How do I get the output data to write to its corresponding column in output.csv
?
Upvotes: 2
Views: 2684
Reputation: 1509
The last line of my solution is wrong. Because of operator precedence rules, the boolean array is being compared to a dataframe column, which is not what one might be looking for.
What you are doing essentially is you write two columns sequentially. Try the following:
import pandas as pd
# read file
df = pd.read_csv('original_file.csv', low_memory=False)
# write select columns of the dataframe to output.csv
df[df['name'] == 'fullName' | df['name'] == 'LastName' ].to_csv('output.csv')
Upvotes: 0
Reputation: 1589
This should work:
df = pd.read_csv('original_file.csv', low_memory=False) # read dataframe
new_df = df.loc[(df.name == 'fullName')|(df.name == 'LastName')] # select rows with name == fullname or lastname
new_df.to_csv("output.csv", index=False) # write to csv
Upvotes: 2