Reputation: 179
I have a dataframe of price changes by category (columns) and date (rows) and I am keen to compare each subsequent row after the first, back to the first to see how it has changed as a percentage over time.
For example, my current df looks like:
df = pd.DataFrame({'date': ['2020-03-22', '2020-03-23', '2020-03-24', '2020-03-25', '2020-03-26'], 'price': [2, 3, 6, 7, 12]})
| date | price
|-------------|--------
0| 2020-03-22 | 2
1| 2020-03-23 | 3
2| 2020-03-24 | 6
3| 2020-03-25 | 5
4| 2020-03-26 | 12
and I am keen to change this price column to be the comparison back to row 0 such as the below:
| date | price
|-------------|--------
0| 2020-03-22 | 1
1| 2020-03-23 | 1.5
2| 2020-03-24 | 3
3| 2020-03-25 | 2.5
4| 2020-03-26 | 6
Note, my actual data frame is approx 25 cols wide and 350 rows, so cherry-picking columns and rows won't work. Im thinking perhaps a loop could work, but am concerned that would be rather resource heavy. Please note, the dataframe itself is produced out of a pivot function, not sure if that helps at all.
Any help will be appreciated!
Upvotes: 1
Views: 1209
Reputation: 862681
Use Series.div
for divide by first value of column:
df['price'] = df['price'].div(df['price'].iloc[0])
print (df)
date price
0 2020-03-22 1.0
1 2020-03-23 1.5
2 2020-03-24 3.0
3 2020-03-25 3.5
4 2020-03-26 6.0
If need working with multiple columns without first column use DataFrame.iloc
for select:
df.iloc[:, 1:] = df.iloc[:, 1:].div(df.iloc[0, 1:])
print (df)
date price
0 2020-03-22 1.0
1 2020-03-23 1.5
2 2020-03-24 3.0
3 2020-03-25 3.5
4 2020-03-26 6.0
Or create DatetimeIndex
:
df = df.set_index('date')
df= df.div(df.iloc[0])
print (df)
price
date
2020-03-22 1.0
2020-03-23 1.5
2020-03-24 3.0
2020-03-25 3.5
2020-03-26 6.0
If ned working onlt with numeric columns:
cols = df.select_dtypes(np.number).columns
df[cols] = df[cols].div(df.iloc[0, df.columns.get_indexer(cols)])
Another solution from @piRSquared, thank you:
df = df.assign(**df.select_dtypes(np.number).pipe(lambda d: d / d.iloc[0]))
print (df)
date price
0 2020-03-22 1.0
1 2020-03-23 1.5
2 2020-03-24 3.0
3 2020-03-25 3.5
4 2020-03-26 6.0
Upvotes: 2