gesgallar

Reputation: 51

Read columns from csv file and put them into a new csv file using pandas

I have the following code using pandas:

import pandas as pd

file1_csv = 'fileX.csv'
# Read only columns 0 and 43; the file has no header row
data = pd.read_csv(file1_csv, header=None, usecols=[0, 43])
print(data)

The result is this:

          0          43
0  57669557  2020-02-15
1  57779240  2017-02-15
2  96951148  2018-07-24

What I need is to put this result into a new csv file and have something like this:

col1, col2
57669557,2020-02-15
57779240,2017-02-15
96951148,2018-07-24

My code is like this:

final = pd.DataFrame(data, columns=['col1','col2'])
final.to_csv('finalFile.csv', index=False)

But the output is wrong. It generates the following:

col1,col2
,
,
,

Upvotes: 3

Views: 50

Answers (1)

Henry Ecker

Reputation: 35626

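When the DataFrame constructor is given an already indexed structure (like another DataFrame), the columns argument selects columns by their existing labels; it does not overwrite the column names. Since data has no columns labelled 'col1' or 'col2' (its labels are 0 and 43), the new DataFrame is filled with NaN, which to_csv writes as empty fields.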

We need to do something like:

final = pd.DataFrame(data)
final.columns = ['col1', 'col2']  # Overwrite Column Names
final.to_csv('finalFile.csv', index=False)

Or pass a non-indexed structure like an array (to_numpy), so there are no existing labels to select from:

# Break existing index alignment
final = pd.DataFrame(data.to_numpy(), columns=['col1','col2'])
final.to_csv('finalFile.csv', index=False)

Or use any of the other ways to rename or overwrite the existing columns, such as rename or set_axis.
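As a minimal sketch of those alternatives, assuming a small stand-in DataFrame with the same labels (0 and 43) instead of the real fileX.csv, rename and set_axis could be used like this. The variable names renamed, relabeled and csv_text are only illustrative.

```python
import pandas as pd

# Stand-in for the frame read with header=None, usecols=[0, 43]
data = pd.DataFrame({
    0: [57669557, 57779240, 96951148],
    43: ['2020-02-15', '2017-02-15', '2018-07-24'],
})

# rename maps old labels to new ones; labels not in the mapping are left as-is
renamed = data.rename(columns={0: 'col1', 43: 'col2'})

# set_axis replaces all column labels at once and returns a new DataFrame
relabeled = data.set_axis(['col1', 'col2'], axis=1)

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = renamed.to_csv(index=False)
print(csv_text)
```

rename is the safer choice when only some labels should change, while set_axis (like assigning to .columns) requires exactly one new label per existing column.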

These approaches produce the expected finalFile.csv:

col1,col2
57669557,2020-02-15
57779240,2017-02-15
96951148,2018-07-24

Take a look at this toy example showing columns selecting values from the existing DataFrame:

import pandas as pd

data = pd.DataFrame({
    0: [57669557, 57779240, 96951148],
    43: ['2020-02-15', '2017-02-15', '2018-07-24']
})
print(data)
final = pd.DataFrame(data, columns=[43])
print(final)

Program output:

# data
          0          43
0  57669557  2020-02-15
1  57779240  2017-02-15
2  96951148  2018-07-24

# final (Only column 43 was selected)
           43
0  2020-02-15
1  2017-02-15
2  2018-07-24
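The same toy data can also reproduce the empty output from the question. This is a small sketch, not the asker's exact file: asking the constructor for labels that do not exist in data yields all-NaN columns, and NaN is written to CSV as an empty field.

```python
import pandas as pd

data = pd.DataFrame({
    0: [57669557, 57779240, 96951148],
    43: ['2020-02-15', '2017-02-15', '2018-07-24'],
})

# 'col1' and 'col2' are not labels in data, so both columns come back as NaN
final = pd.DataFrame(data, columns=['col1', 'col2'])
print(final.isna().all().all())

# NaN values are serialized as empty strings, giving rows that contain only ","
csv_text = final.to_csv(index=False)
print(csv_text)
```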

Upvotes: 1
