Salamat Begmanov

Reputation: 51

How are 'sum' and 'first' in the dictionary related to pandas.Series?

I am learning marketing analytics and got stuck on the following snippet, specifically the operations variable. How can the string 'sum' give the sum of a column, and the string 'first' give the first unique value from a column?

operations = {'revenue':'sum',                      
              'InvoiceDate':'first',                
              'CustomerID':'first'}

df = df.groupby('InvoiceNo').agg(operations)

I was thinking to relate it to pandas.Series.first and pandas.Series.sum but could not find examples.

Book explanation: In the preceding code snippet, we first specified the aggregation functions that we will use for each column, and then performed groupby and applied those functions. InvoiceDate and CustomerID will be the same for all rows for the same invoice, so we can just take the first entry for them. For revenue, we sum the revenue across all items for the same invoice to get the total revenue for that invoice.

Result:

           revenue          InvoiceDate  CustomerID
InvoiceNo
581583      124.60  2011-12-09 12:23:00     13777.0
581584      140.64  2011-12-09 12:25:00     13777.0
581585      329.05  2011-12-09 12:31:00     15804.0
581586      339.20  2011-12-09 12:49:00     13113.0
581587      249.45  2011-12-09 12:50:00     12680.0

Upvotes: 1

Views: 41

Answers (1)

jezrael

Reputation: 862511

The function used here is GroupBy.first, not Series.first. It returns the first value in each group. The dictionary maps column names to the aggregation function applied to each column.
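To illustrate, here is a minimal sketch on a small made-up DataFrame (the rows below are invented for the example, not taken from the book's dataset). It suggests that passing the strings 'sum' and 'first' to agg gives the same result as calling the groupby methods sum() and first() on each column yourself. Note that, as far as I know, GroupBy.first returns the first non-null value per group.

```python
import pandas as pd

# Invented sample data: two line items on invoice 1, one on invoice 2.
df = pd.DataFrame({
    "InvoiceNo": [1, 1, 2],
    "revenue": [10.0, 5.0, 7.5],
    "InvoiceDate": pd.to_datetime(
        ["2011-12-09 12:23", "2011-12-09 12:23", "2011-12-09 12:25"]
    ),
    "CustomerID": [13777.0, 13777.0, 15804.0],
})

# Column name -> name of the groupby aggregation to apply to that column.
operations = {"revenue": "sum", "InvoiceDate": "first", "CustomerID": "first"}

grouped = df.groupby("InvoiceNo")
result = grouped.agg(operations)

# The same aggregation written out by hand, one column at a time.
manual = pd.DataFrame({
    "revenue": grouped["revenue"].sum(),
    "InvoiceDate": grouped["InvoiceDate"].first(),
    "CustomerID": grouped["CustomerID"].first(),
})

print(result)
print(manual)
```

Both frames should hold one row per InvoiceNo, with revenue summed and the first InvoiceDate and CustomerID of each group.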

Upvotes: 1
