Reputation: 10158
I have some working code to convert hours to minutes (with a condition on the hour value) but I'm currently iterating over the dataframe rows and that doesn't seem very efficient for larger dataframes
I have 2 pandas series (in reality these are columns of a larger dataframe):
hours = pd.Series([0,10,15,20,30])
mins = pd.Series([10,0,0,20,10])
What I want to do is the following:
I'm expecting the output to be the following series (note indices 2 and 4):
0 10.0
1 600.0
2 15.0
3 1220.0
4 40.0
I wrote the following function that achieves what I want:
def convert_time(hours, minutes):
df_duration = pd.DataFrame({"hours": hours, "minutes": minutes}).astype(float)
series_converted = pd.Series()
# Replace nan with zero
df_duration = df_duration.fillna(0)
# Convert out of bound hour values, add to minutes
for i in range(df_duration.shape[0]):
if df_duration.iloc[i]["hours"] in (15,30,45,60,90):
cur_hours = 0
cur_mins = df_duration.iloc[i]["hours"] + df_duration.iloc[i]["minutes"]
else:
cur_hours = df_duration.iloc[i]["hours"]
cur_mins = df_duration.iloc[i]["minutes"]
series_converted.set_value(i,(cur_hours * 60) + cur_mins)
return series_converted
Whats the correct way to do this in Pandas without iterating over dataframe rows? I can't seem to figure out how to use pandas masking across 2 columns when a condition is involved
Upvotes: 1
Views: 208
Reputation: 394459
You can use where
with isin
to only perform the conversion where the values are not in the list of values you want, then you can fillna
with the normal operation:
In [134]:
(hours.where(~hours.isin([15,30,45,60,90])) * 60 + mins).fillna(hours + mins)
Out[134]:
0 10.0
1 600.0
2 15.0
3 1220.0
4 40.0
dtype: float64
You can see this broken down:
In [135]:
hours.isin([15,30,45,60,90])
Out[135]:
0 False
1 False
2 True
3 False
4 True
dtype: bool
In [136]:
~hours.isin([15,30,45,60,90])
Out[136]:
0 True
1 True
2 False
3 True
4 False
dtype: bool
In [138]:
hours.where(~hours.isin([15,30,45,60,90]))
Out[138]:
0 0.0
1 10.0
2 NaN
3 20.0
4 NaN
dtype: float64
In [139]:
(hours.where(~hours.isin([15,30,45,60,90])) * 60 + mins)
Out[139]:
0 10.0
1 600.0
2 NaN
3 1220.0
4 NaN
dtype: float64
Upvotes: 2