stratofortress
stratofortress

Reputation: 453

how to choose which column to write in (.csv) in python

import csv

f = csv.reader(open('lmt.csv','r'))         # open input file for reading
Date, Open, Hihh, mLow, Close, Volume = zip(*f)    #s plit it into separate columns

ofile = open("MYFILEnew1.csv", "wb")            # output csv file
c = csv.writer(ofile)

item = Date   
item2 = Volume
rows = zip(item, item)
i = 0
for row in item2:
    print row
    writer = csv.writer(ofile, delimiter='\t')
    writer.writerow([row])

ofile.close() 

Above is what I have produced so far.

As you can see in the 3rd line, I have extracted 6 columns from a spreadsheet. I want to create a .csv file under the name of MYFILEnew1.csv which only has two columns, Date and Volume.

What I have above creates a .csv that only writes Volume column into the first column of the new .csv file. How would you go about placing Date into the second column?

For example

Date       Open   High      Low     Close    Volume
17-Feb-16   210   212.97    209.1   212.74  1237731

is what i have. and Id like to produce a new csv file such that it has

Date           Volume
17-Feb-16      1237731

Upvotes: 3

Views: 6946

Answers (3)

Cleb
Cleb

Reputation: 26027

If I understand you question correctly, you can achieve that very easily using panda's read_csv and to_csv (@downvoter: Could you explain your downvote, please!?); the final solution to your problem can be found below EDIT2:

import pandas as pd

# this assumes that your file is comma separated
# if it is e.g. tab separated you should use pd.read_csv('data.csv', sep = '\t')
df = pd.read_csv('data.csv')

# select desired columns
df = df[['Date', 'Volume']]

#write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)

So, if your data.csv file looks like this:

Date,Open,Hihh,mLow,Close,Volume
1,5,9,13,17,21
2,6,10,14,18,22
3,7,11,15,19,23
4,8,12,16,20,24

The the MYFILEnew1.csv would look like this after running the script above:

Date    Volume
1   21
2   22
3   23
4   24

EDIT

Using your data (tab separated, stored in the file data3.csv):

Date    Open    Hihh    mLow    Close   Volume
17-Feb-16   210 212.97  209.1   212.74  1237731

Then

import pandas as pd

df = pd.read_csv('data3.csv', sep='\t') 

# select desired columns
df = df[['Date', 'Volume']]

# write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)

gives the desired output

Date    Volume
17-Feb-16   1237731

EDIT2

Since your header in your input csv file seems to be messed up (as discussed in the comments), you have to rename the first column. The following now works fine for me using your entire dataset:

import pandas as pd

df = pd.read_csv('lmt.csv', sep=',') 

# get rid of the wrongly formatted column name
df.rename(columns={df.columns[0]: 'Date' }, inplace=True)

# select desired columns
df = df[['Date', 'Volume']]

# write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)

Upvotes: 3

mechanical_meat
mechanical_meat

Reputation: 169434

You were so close:

import csv

f = csv.reader(open('lmt.csv','rb')) # csv is binary
Date, Open, Hihh, mLow, Close, Volume = zip(*f)

rows = zip(Date, Volume)

ofile = open("MYFILEnew1.csv", "wb")  
writer = csv.writer(ofile)
for row in rows:
    writer.writerow(row) # row is already a tuple so no need to make it a list

ofile.close()

Upvotes: 2

cderwin
cderwin

Reputation: 425

Here I would suggest using the csv module's csv.DictReader object to read and write from the files. To read the file, you would do something like

import csv
fieldnames=('Date', 'Open', 'High', 'mLow', 'Close', 'Volume')
with open('myfilename.csv') as f:
    reader = csv.DictReader(f, fieldnames=fieldnames)

Beyond this, you will just need to filter out the keys you don't want from each row and similarly use the csv.DictWriter class to write to your export file.

Upvotes: 2

Related Questions