Reputation: 793
I'm writing a script to reduce a large .xlsx file with headers into a CSV, and then write a new CSV file with only the required columns based on the header names.
import pandas
import csv
df = pandas.read_csv('C:\\Python27\\Work\\spoofing.csv')
time = df["InviteTime (Oracle)"]
orignum = df["Orig Number"]
origip = df["Orig IP Address"]
destnum = df["Dest Number"]
df.to_csv('output.csv', header=[time,orignum,origip,destnum])
The error I'm getting is with that last bit of code, and it says
ValueError: Writing 102 cols but got 4 aliases
I'm sure I'm overlooking something stupid, but I've read over the to_csv
documentation on the pandas website and I'm still at a loss. I know I'm misusing the to_csv parameters but I can't seem to get my head around the documentation.
Any help is appreciated, thanks!
Upvotes: 58
Views: 159251
Reputation: 605
column_list=["column_name1", "column_name2", "column_name3", "column_name4"]
#filter the dataframe beforehand
ds[column_list].to_csv('output.csv',index=False)
#or use columns arg
ds.to_csv('output.csv', columns = column_list,index=False)
I provide index=False
arg in order to write only column values
Upvotes: 3
Reputation: 7022
The way to select specific columns is this -
header = ["InviteTime (Oracle)", "Orig Number", "Orig IP Address", "Dest Number"]
df.to_csv('output.csv', columns = header)
Upvotes: 125