Xavier Conzet

Reputation: 500

Creating a new column in Pandas

Thank you in advance for taking the time to help me! (Code provided below) (Data Here)

I am trying to average the first 3 columns and insert the result as a new column labeled 'Topsoil'. What is the best way to do that?

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
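# Load the data, using the second CSV column as a datetime index
raw_data = pd.read_csv('all-deep-soil-temperatures.csv', index_col=1, parse_dates=True)
df_all_stations = raw_data.copy()
# df_selected_station is assumed to be a single-station subset of df_all_stations (selection step not shown)
# Forward-fill missing values
df_selected_station.ffill(inplace=True)
# Resample to daily means and tag each row with its day of the year
df_selected_station_D = df_selected_station.resample(rule='D').mean()
df_selected_station_D['Day'] = df_selected_station_D.index.dayofyear
# Average each day of the year across all years
mean = df_selected_station_D.groupby(by='Day').mean()
mean['Day'] = mean.index
#mean.head()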


Upvotes: 0

Views: 110

Answers (4)

jezrael

Reputation: 862661

Use DataFrame.iloc to select columns by position, then take the row-wise mean of the first 3 columns:

mean['Topsoil'] = mean.iloc[:, :3].mean(axis=1)
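As a minimal sketch of the line above, here is the same call on a small made-up frame. The column names and values are assumptions for illustration, not the asker's real data.

```python
import pandas as pd

# Hypothetical daily means; column names and values are made up for illustration
mean = pd.DataFrame(
    {
        "5 cm": [1.0, 4.0],
        "10 cm": [2.0, 5.0],
        "15 cm": [3.0, 6.0],
        "50 cm": [10.0, 20.0],
    },
    index=pd.Index([1, 2], name="Day"),
)

# iloc[:, :3] keeps all rows and the first three columns by position;
# mean(axis=1) then averages across those columns for each row
mean["Topsoil"] = mean.iloc[:, :3].mean(axis=1)

print(mean["Topsoil"].tolist())  # [2.0, 5.0]
```

Because the selection is positional, it only gives the intended result if the three topsoil columns really are the first three in the frame.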

Upvotes: 0

Subasri sridhar

Reputation: 831

Try this, which selects the three columns by name:

mean['avg3col']=mean[['5 cm', '10 cm','15 cm']].mean(axis=1)
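Selecting by name can be safer than selecting by position when the column order is not guaranteed. The sketch below uses the column names from this answer with made-up values, and assumes a hypothetical layout in which 'Day' happens to be the first column.

```python
import pandas as pd

# Hypothetical frame where 'Day' comes first; values are made up for illustration
mean = pd.DataFrame(
    {"Day": [1, 2], "5 cm": [1.0, 4.0], "10 cm": [2.0, 5.0], "15 cm": [3.0, 6.0]}
)

topsoil_cols = ["5 cm", "10 cm", "15 cm"]

# Label-based selection always averages the three named depth columns
by_label = mean[topsoil_cols].mean(axis=1)

# Position-based selection would average 'Day', '5 cm' and '10 cm' in this layout
by_position = mean.iloc[:, :3].mean(axis=1)

print(by_label.tolist())  # [2.0, 5.0]
```

In this layout the two results differ, which is why the label-based form is the more robust choice if the names are known.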

Upvotes: 1

David

Reputation: 8298

You could use the apply method in the following way:

mean['Topsoil'] = mean.apply(lambda row: np.mean(row.iloc[0:3]), axis=1)

You can read about the apply method at the following link: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html

The idea is that apply with axis=1 calls the same function once for each row.
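To illustrate the row-by-row behaviour, the sketch below runs apply on a small made-up frame (names and values are assumptions) and compares it with the vectorized mean over the same columns. Both should give the same numbers; on large frames the vectorized form is usually faster, because apply calls a Python function for every row.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are made up for illustration
mean_df = pd.DataFrame(
    {"5 cm": [1.0, 4.0], "10 cm": [2.0, 5.0], "15 cm": [3.0, 6.0], "50 cm": [10.0, 20.0]}
)

# apply with axis=1 passes each row to the lambda as a Series
via_apply = mean_df.apply(lambda row: np.mean(row.iloc[0:3]), axis=1)

# Vectorized equivalent: row-wise mean of the first three columns
via_mean = mean_df.iloc[:, :3].mean(axis=1)

print(via_apply.tolist())  # [2.0, 5.0]
print(via_mean.tolist())   # [2.0, 5.0]
```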

Note: It is unwise to give a data structure the same name as a function. In your case, mean_df would be a better name than mean.

Upvotes: 0

callmeanythingyouwant

Reputation: 1997

df['new column'] = (df['col1'] + df['col2'] + df['col3'])/3
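One difference from the mean-based answers is worth noting: adding the columns and dividing by 3 propagates missing values, while DataFrame.mean skips them by default. A minimal sketch with made-up data (the frame and the result column names are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in the second row
df = pd.DataFrame(
    {"col1": [1.0, 4.0], "col2": [2.0, np.nan], "col3": [3.0, 6.0]}
)

# Manual arithmetic: any NaN in a row makes the whole result NaN
df["sum_div_3"] = (df["col1"] + df["col2"] + df["col3"]) / 3

# mean(axis=1) skips NaN by default and averages the remaining values
df["row_mean"] = df[["col1", "col2", "col3"]].mean(axis=1)

print(df["sum_div_3"].isna().tolist())  # [False, True]
print(df["row_mean"].tolist())          # [2.0, 5.0]
```

If the data has already been forward-filled, as in the question, the two approaches should agree.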

Upvotes: 0
