Kai

Reputation: 357

Adding column to .CSV file in Python and calculate values

I have checked many solutions but I have been unable to apply any to my problem.

I have a .csv file, like this:

    Header_A;Header_B
    0;1
    1;4
    5;6
    6;7
    9;8

Now I want to add another column, "Header_C", in an idiomatic Python way and compute each of its values (x) as the sum of the first two columns in the same row, using a function such as:

    def add(a, b):
        x = a + b
        return x

where x is the value written to column Header_C and a, b are the values of columns Header_A and Header_B in the same row.

The result should look like this:

    Header_A;Header_B;Header_C
    0;1;1
    1;4;5
    5;6;11
    6;7;13
    9;8;17

If possible, I would like to do this without installing additional modules. The output can be a new .csv file.

Thanks a lot!

Upvotes: 0

Views: 3098

Answers (2)

zipa

Reputation: 27869

pandas is your solution:

    import pandas as pd

    # The file is semicolon-delimited, so pass the separator explicitly
    df = pd.read_csv('a.csv', sep=';')
    df['Header_C'] = df['Header_A'] + df['Header_B']

    df.to_csv('b.csv', sep=';', index=False)

For more information on pandas, see http://pandas.pydata.org/

Upvotes: 1

Kai

Reputation: 357

I still got the same error even with the line

    df = pd.read_csv('a.csv', sep=';')

But you inspired me and gave me the idea that the problem might be the header! So I tried a few things and now it actually works. Here is the fully working code:

    import pandas

    df = pandas.read_csv("a.csv", sep=';', names=['Header_A', 'Header_B'], header=0)
    df['Header_C'] = df['Header_A'] + df['Header_B']
    df.to_csv("b.csv", sep=';', index=False)

If header is set to None, pandas reads the header line as an ordinary data row, so both columns are treated as strings and adding them concatenates instead, which results in things like this:

9 + 3 = 93

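Setting header=0 tells pandas that the first line is the header, which is then replaced by the names passed in names, so the remaining values are parsed as numbers. I am not sure my explanation is accurate, but now the program does what I want! Thanks a lot!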
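The string effect described above can be reproduced in plain Python, without pandas: values read from a text file stay strings until they are converted, and + on two strings concatenates them. A tiny sketch (the variable names are made up for illustration):

```python
# Values as they come out of a text file: still strings
a, b = "9", "3"

concatenated = a + b        # string concatenation
summed = int(a) + int(b)    # numeric addition after conversion

print(concatenated)  # 93
print(summed)        # 12
```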

However, I am still interested in a solution that uses the csv module, or pure Python without any module! Anyone?
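One possible approach with only the standard library's csv module is sketched below. It assumes the layout from the question: a semicolon-delimited file with one header line and integer values in the first two columns. The function name add_sum_column and its parameters are made up for this example; add is the function from the question.

```python
import csv


def add(a, b):
    """Return the sum of the two column values."""
    return a + b


def add_sum_column(src_path, dst_path, delimiter=";"):
    """Copy src_path to dst_path, appending Header_C = Header_A + Header_B."""
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")

        # The first line is the header: copy it and append the new column name
        header = next(reader)
        writer.writerow(header + ["Header_C"])

        for row in reader:
            if not row:  # skip blank lines
                continue
            # csv yields strings, so convert before adding
            total = add(int(row[0]), int(row[1]))
            writer.writerow(row + [total])
```

Calling add_sum_column("a.csv", "b.csv") on the sample file from the question should then write the three-column result to b.csv. If the columns can hold decimals, float would have to replace int in the conversion.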

Upvotes: 0
