musava_ribica
musava_ribica

Reputation: 482

How to append a new value in a CSV file in Python?

I have a CSV sheet, having data like this:

| not used | Day 1 | Day 2 |
| Person 1    | Score | Score |
| Person 2    | Score | Score |

But with a lot more rows and columns. Every day I get progress of how much each person progressed, and I get that data as a dictionary where keys are names and values are score amounts.

The thing is, sometimes that dictionary will include new people and not include already existing ones. Then, if a new person comes, it will add 0 as every previous day and if the dict doesn't include already existing person, it will give him 0 score to that day

My idea of solving this is doing lines = file.readlines() on that CSV file, making a new list of people's names with

for line in lines:
   names.append(line.split(",")[0])

then making a copy of lines (newLines = lines) and going through dict's keys, seeing if that person is already in the csv, if so, append the value followed by a comma But I'm stuck at the part of adding score of 0 Any help or contributions would be appreciated

EXAMPLE: Before I will have this

-,day1,day2,day3
Mark,1500,0,1660
John,1800,1640,0
Peter,1670,1680,1630
Hannah,1480,1520,1570

And I have this dictionary to add

{'Mark': 1750, 'Hannah':1640, 'Brian':1780}

The result should be

-,day1,day2,day3,day4
Mark,1500,0,1660,1750
John,1800,1640,0,0
Peter,1670,1680,1630,0
Hannah,1480,1520,1570,1640
Brian,0,0,0,1780

See how Brian is in the dict and not in the before csv and he got added with any other day score 0. I figured out that one line .split(',') would give a list of N elements, where N - 2 will be amount of zero scores to add prior to first day of that person

Upvotes: 0

Views: 129

Answers (1)

tdelaney
tdelaney

Reputation: 77407

This is easy to do in pandas as an outer join. Read the CSV into a dataframe and generate a new dataframe from the dictionary. The join is almost what you want except that since not-a-number values are inserted for empty cells, you need to fill the NaN's with zero and reconvert everything to integer.

The one potential problem is that the CSV is sorted. You don't simply have the new rows appended to the bottom.

import pandas as pd
import errno
import os

INDEX_COL = "-"

def add_days_score(filename, colname, scores):
    try:
        df = pd.read_csv(filename, index_col=INDEX_COL)
    except OSError as e:
        if e.errno == errno.ENOENT:
            # file doesn't exist, create empty df
            df = pd.DataFrame([], columns=[INDEX_COL])
            df = df.set_index(INDEX_COl)
        else:
            raise
    new_df = pd.DataFrame.from_dict({colname:scores})
    merged = df.join(new_df, how="outer").fillna(0).astype(int)
    try:
        merged.to_csv(filename + ".tmp", index_label=[INDEX_COL])
    except:
        raise
    else:
        os.rename(filename + ".tmp", filename)
    return merged

#============================================================================
# TEST
#============================================================================

test_file = "this_is_a_test.csv"
before = """-,day1,day2,day3
Mark,1500,0,1660
John,1800,1640,0
Peter,1670,1680,1630
Hannah,1480,1520,1570
"""

after = """-,day1,day2,day3,day4
Brian,0,0,0,1780
Hannah,1480,1520,1570,1640
John,1800,1640,0,0
Mark,1500,0,1660,1750
Peter,1670,1680,1630,0
"""

test_dicts = [
    ["day4", {'Mark': 1750, 'Hannah':1640, 'Brian':1780}],
]

open(test_file, "w").write(before)

for name, scores in test_dicts:
    add_days_score(test_file, name, scores)

print("want\n", after, "\n")
got = open(test_file).read()
print("got\n", got, "\n")
if got != after:
    print("FAILED")

Upvotes: 1

Related Questions