Reputation: 1623
I create a dataframe df_energy :
df_energy=pd.read_csv('C:/Users/Demonstrator/Downloads/power.csv', delimiter=';', parse_dates=[0], infer_datetime_format = True)
with this structure :
df_energy.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 43229 entries, 0 to 43228 Data columns (total 6 columns): TIMESTAMP 43229 non-null datetime64[ns] P_ACT_KW 40376 non-null float64 PERIODE_TARIF 43209 non-null object P_SOUSCR 37501 non-null float64 SITE 43229 non-null object TARIF 43229 non-null object dtypes: datetime64[ns](1), float64(2), object(3) memory usage: 2.0+ MB TIMESTAMP P_ACT_KW PERIODE_TARIF P_SOUSCR SITE TARIF 2015-07-31 23:00:00 12.0 HC NaN ST GEREON TURPE_HTA5 2015-07-31 23:10:00 466.0 HC 425.0 ST GEREON TURPE_HTA5 2015-07-31 23:20:00 18.0 HC 425.0 ST GEREON TURPE_HTA5 2015-07-31 23:30:00 17.0 HC 425.0 ST GEREON TURPE_HTA5
As I am starting learning python, I would like to know can I add three new columns : High_energy, Medium_energy and low_energy .
High_energy contains the P_ACT_KW value if P_ACT_KW > 400, Medium_energy contains the P_ACT_KW value if P_ACT_KW is between 200 and 400, Low_energy contains the P_ACT_KW value if P_ACT_KW < 200. For example :
TIMESTAMP P_ACT_KW PERIODE_TARIF P_SOUSCR SITE TARIF High_energy Medium_energy Low_energy 2015-07-31 23:00:00 12.0 HC NaN ST GEREON TURPE_HTA5 0 0 12 2015-07-31 23:10:00 466.0 HC 425.0 ST GEREON TURPE_HTA5 466 0 0 2015-07-31 23:20:00 18.0 HC 425.0 ST GEREON TURPE_HTA5 0 0 18 2015-07-31 23:30:00 17.0 HC 425.0 ST GEREON TURPE_HTA5 0 0 17
Thank you
Kind regards
Upvotes: 0
Views: 75
Reputation: 7828
you can use np.where from numpy as:
Sample df:
Out[71]:
TIMESTAMP P_ACT_KW PERIODE_TARIF P_SOUSCR SITE \
0 2015-07-31 23:00:00 12.0 HC NaN ST GEREON
1 2015-07-31 23:10:00 466.0 HC 425.0 ST GEREON
2 2015-07-31 23:20:00 18.0 HC 425.0 ST GEREON
3 2015-07-31 23:30:00 17.0 HC 425.0 ST GEREON
TARIF
0 TURPE_HTA5
1 TURPE_HTA5
2 TURPE_HTA5
3 TURPE_HTA5
df['high_energy']=np.where(df['P_ACT_KW']>400,df['P_ACT_KW'],0)
df['medium_energy']=np.where((df['P_ACT_KW']>200)&(df['P_ACT_KW']<400),df['P_ACT_KW'],0)
df['low_energy']=np.where(df['P_ACT_KW']<200,df['P_ACT_KW'],0)
Out[72]:
TIMESTAMP P_ACT_KW PERIODE_TARIF P_SOUSCR SITE \
0 2015-07-31 23:00:00 12.0 HC NaN ST GEREON
1 2015-07-31 23:10:00 466.0 HC 425.0 ST GEREON
2 2015-07-31 23:20:00 18.0 HC 425.0 ST GEREON
3 2015-07-31 23:30:00 17.0 HC 425.0 ST GEREON
TARIF high_energy medium_energy low_energy
0 TURPE_HTA5 0.0 0.0 12.0
1 TURPE_HTA5 466.0 0.0 0.0
2 TURPE_HTA5 0.0 0.0 18.0
3 TURPE_HTA5 0.0 0.0 17.0
Upvotes: 2