Harrison Boles
Harrison Boles

Reputation: 793

How to write specific columns of a dataframe to a CSV?

I'm writing a script to reduce a large .xlsx file with headers into a CSV, and then write a new CSV file with only the required columns based on the header names.

import pandas
import csv

df = pandas.read_csv('C:\\Python27\\Work\\spoofing.csv')

time = df["InviteTime (Oracle)"]
orignum = df["Orig Number"]
origip = df["Orig IP Address"]
destnum = df["Dest Number"]

df.to_csv('output.csv', header=[time,orignum,origip,destnum])

The error I'm getting is with that last bit of code, and it says

ValueError: Writing 102 cols but got 4 aliases

I'm sure I'm overlooking something stupid, but I've read over the to_csv documentation on the pandas website and I'm still at a loss. I know I'm misusing the to_csv parameters but I can't seem to get my head around the documentation.

Any help is appreciated, thanks!

Upvotes: 58

Views: 159251

Answers (2)

Konstantinos Roditakis
Konstantinos Roditakis

Reputation: 605

column_list=["column_name1", "column_name2", "column_name3", "column_name4"]

#filter the dataframe beforehand
ds[column_list].to_csv('output.csv',index=False)

#or use columns arg
ds.to_csv('output.csv', columns = column_list,index=False)

I provide index=False arg in order to write only column values

Upvotes: 3

user1827356
user1827356

Reputation: 7022

The way to select specific columns is this -

header = ["InviteTime (Oracle)", "Orig Number", "Orig IP Address", "Dest Number"]
df.to_csv('output.csv', columns = header)

Upvotes: 125

Related Questions