Heap
Heap

Reputation: 31

Computing aggregate by creating nested dictionary on the fly

I'm new to python and I could really use your help and guidance at the moment. I am trying to read a csv file with three cols and do some computation based on the first and second column i.e.

A   spent   100     A   spent   2040
A   earned  60
B   earned  48
B   earned  180
A   spent   40
.
.
.

Where A spent 2040 would be the addition of all 'A' and 'spent' amounts. This does not give me an error but it's not logically correct:

for row in rows:
    cols = row.split(",")
    truck = cols[0]
    if (truck != 'A' and truck != 'B'):
        continue
    record = cols[1]
    if(record != "earned" and record != "spent"):
        continue
    amount = int(cols[2])
    #print(truck+" "+record+" "+str(amount))

    if truck in entries:
        #entriesA[truck].update(record)
        if record in records:
            records[record].append(amount)
        else:
            records[record] = [amount]
    else:
        entries[truck] = records
        if record in records:
            records[record].append(amount)
        else:
            entries[truck][record] = [amount]
print(entries)

I am aware that this part is incorrect because I would be adding the same inner dictionary list to the outer dictionary but I'm not sure how to go from there:

entries[truck] = records
if record in records:
    records[record].append(amount)

However, Im not sure of the syntax to create a new dictionary on the fly that would not be 'records'

I am getting:

{'B': {'earned': [60, 48], 'spent': [100]}, 'A': {'earned': [60, 48], 'spent': [100]}}

But hoping to get:

{'B': {'earned': [48]}, 'A': {'earned': [60], 'spent': [100]}}

Thanks.

Upvotes: 3

Views: 226

Answers (2)

chthonicdaemon
chthonicdaemon

Reputation: 19750

For the kind of calculation you are doing here, I highly recommend Pandas.

Assuming in.csv looks like this:

truck,type,amount
A,spent,100
A,earned,60
B,earned,48
B,earned,180
A,spent,40

You can do the totalling with three lines of code:

import pandas
df = pandas.read_csv('in.csv')
totals = df.groupby(['truck', 'type']).sum()

totals now looks like this:

              amount
truck type          
A     earned      60
      spent      140
B     earned     228

You will find that Pandas allows you to think on a much higher level and avoid fiddling with lower level data structures in cases like this.

Upvotes: 2

user6902061
user6902061

Reputation:

if record in entries[truck]:
    entries[truck][record].append(amount)
else:
    entries[truck][record] = [amount]

I believe this is what you would want? Now we are directly accessing the truck's records, instead of trying to check a local dictionary called records. Just like you did if there wasn't any entry of a truck.

Upvotes: 0

Related Questions