Nanoboss

Reputation: 313

Add a column to a csv file at specified index with different values

I would like to add a column at a given index, with a different value in each row (the value is computed from the other values in that row). This is a sample of my CSV:

org,repo_name,stars_count,fork_count,commit_count
freeCodeCamp,freeCodeCamp,303178,22005,23183,1703
vuejs,vue,140222,20150,3016,82
twbs,bootstrap,133730,65555,18714,46
...

So far I tried the answer provided here: python pandas insert column

def func(f):
    files = f
    df = pd.read_csv(files)
    df = df.convert_objects(convert_numeric=True)
    df.insert(2, 'new', 1000)
    df.to_csv(files) 

The result is a new column inserted at index 2, with the value 1000 in every row:

,org,repo_name,new,stars_count,fork_count,commit_count
freeCodeCamp,freeCodeCamp,303178,1000,22005,23183,1703
vuejs,vue,140222,1000,20150,3016,82
twbs,bootstrap,133730,1000,65555,18714,46
...

How can I modify this to add a specific value to each row instead of 1000 everywhere? And how do I add a header so that I get the following output? Please note that score1 ... scoreN are int variables, not strings, and you can assume they have already been computed.

org,repo_name,score,new,stars_count,fork_count,commit_count
freeCodeCamp,freeCodeCamp,303178,score1,22005,23183,1703
vuejs,vue,140222,score2,20150,3016,82
twbs,bootstrap,133730,score3,65555,18714,46
...

Thanks.

Upvotes: 1

Views: 929

Answers (2)

Serge Ballesta

Reputation: 149075

Pandas is close to overkill if all you need is to insert a new column into a CSV file; the standard csv module is enough:

import csv

with open('input.csv') as fdin, open('output.csv', 'w', newline='') as fdout:
    rd = csv.DictReader(fdin)
    fields = list(rd.fieldnames)
    fields.insert(2, 'new')          # position of the new column in the header
    wr = csv.DictWriter(fdout, fieldnames=fields)
    wr.writeheader()
    for row in rd:
        # compute_val is your own function; row is a dict keyed by column name
        row['new'] = compute_val(row)    # or compute_val(**row)
        wr.writerow(row)
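
For illustration, here is a self-contained sketch of the same approach that runs on in-memory text instead of files. The compute_val rule (stars plus forks), the insert_column helper, and the sample rows are assumptions made up for the example, not something taken from the question.

```python
import csv
import io


def compute_val(row):
    # Hypothetical scoring rule, for illustration only: stars plus forks.
    return int(row["stars_count"]) + int(row["fork_count"])


def insert_column(fdin, fdout, index, name, func):
    """Copy CSV rows from fdin to fdout, inserting a computed column."""
    rd = csv.DictReader(fdin)
    fields = list(rd.fieldnames)
    fields.insert(index, name)
    wr = csv.DictWriter(fdout, fieldnames=fields, lineterminator="\n")
    wr.writeheader()
    for row in rd:
        row[name] = func(row)  # value depends on the other fields of the row
        wr.writerow(row)


# Illustrative data with as many values per row as there are header fields.
sample = io.StringIO(
    "org,repo_name,stars_count,fork_count\n"
    "vuejs,vue,140222,20150\n"
    "twbs,bootstrap,133730,65555\n"
)
out = io.StringIO()
insert_column(sample, out, 2, "score", compute_val)
result = out.getvalue()
print(result)
```

One caveat: if a row has more values than the header has fields, as in the sample in the question, DictReader stores the extras under the key None, and DictWriter then raises a ValueError by default because that key is not in fieldnames. The header and the rows need to agree first.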

Upvotes: 0

You can try something like this:

len_df = len(df.index)+1
df["new"] = ["score"+str(i) for i in range(1,len_df)]

I hope this helps. To set the value of a single row by position instead, this may be useful:

df.iloc[2, df.columns.get_loc("new")] = score_value

Note that score_value is an int.
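
If the scores are already computed as ints, a minimal pandas sketch could pass them to insert() as a list, one value per row, which also places the column at the index you want. The scores list and the sample rows below are made-up placeholders; index=False is what keeps the extra leading column out of the written file.

```python
import io

import pandas as pd

# Illustrative data with as many values per row as there are header fields.
csv_text = (
    "org,repo_name,stars_count,fork_count\n"
    "vuejs,vue,140222,20150\n"
    "twbs,bootstrap,133730,65555\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Hypothetical precomputed integer scores, one per row, in row order.
scores = [7, 3]

# insert() accepts a list-like whose length matches the number of rows.
df.insert(2, "score", scores)

# index=False leaves the row index out of the output.
result = df.to_csv(index=False)
print(result)
```

To write to disk, pass a path as the first argument of to_csv and keep index=False.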

Upvotes: 1
