finethen

Reputation: 433

Python: Working with columns inside a pandas Dataframe

Good evening,

Is it possible to run a calculation on, say, two columns of a DataFrame and add a third column holding the result?

Dataframe (original):

name        time_a      time_b
name_a      08:00:00    09:00:00
name_b      07:45:00    08:15:00
name_c      07:00:00    08:10:00
name_d      06:00:00    10:00:00

To be specific: is it possible to compute the difference between two times (time_b - time_a) and store it in a new column (time_c) at the end of the DataFrame?

Dataframe (new):

name        time_a      time_b      time_c
name_a      08:00:00    09:00:00    01:00:00
name_b      07:45:00    08:15:00    00:30:00
name_c      07:00:00    08:10:00    01:10:00
name_d      06:00:00    10:00:00    04:00:00

Thanks and a good night!

Upvotes: 0

Views: 52

Answers (1)

Yaakov Bressler

Reputation: 12018

If your columns are in datetime or timedelta format:

# The new column holds timedelta values
df["time_c"] = df["time_b"] - df["time_a"]
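A minimal sketch of that first case, assuming the times arrive as `HH:MM:SS` strings and are parsed with `pd.to_datetime`. The sample rows are copied from the question; the `format` string is an assumption about how the input looks.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["name_a", "name_b", "name_c", "name_d"],
    "time_a": ["08:00:00", "07:45:00", "07:00:00", "06:00:00"],
    "time_b": ["09:00:00", "08:15:00", "08:10:00", "10:00:00"],
})

# Parse the strings into datetime64 columns. Both columns get the same
# placeholder date, so it cancels out in the subtraction below.
df["time_a"] = pd.to_datetime(df["time_a"], format="%H:%M:%S")
df["time_b"] = pd.to_datetime(df["time_b"], format="%H:%M:%S")

# Subtracting two datetime columns yields a timedelta column
df["time_c"] = df["time_b"] - df["time_a"]
```

Since the parsed columns now carry a date part, you may prefer the timedelta conversion if you want the original columns to stay "time only".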

If your columns hold datetime.time objects (which appears to be the case), they cannot be subtracted directly, so combine each one with the same date first:

import datetime

def time_diff(time_1, time_2):
    """Return the difference time_2 - time_1 as a timedelta."""
    # Attach the same date to both times so they become subtractable datetimes
    today = datetime.date.today()
    start = datetime.datetime.combine(today, time_1)
    end = datetime.datetime.combine(today, time_2)
    return end - start

# Apply the function
df["time_c"] = df[["time_a","time_b"]].apply(lambda arr: time_diff(*arr), axis=1)
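For reference, here is a self-contained sketch of the same idea that can be run as-is. The fixed `ANCHOR` date and the two sample rows are my own assumptions for illustration; any date works because it cancels out in the subtraction. It also pairs the columns with `zip` instead of a row-wise `apply`, which is just one possible way to do it.

```python
import datetime
import pandas as pd

# Hypothetical anchor date, chosen arbitrarily; it drops out of the result.
ANCHOR = datetime.date(2000, 1, 1)

def time_diff(time_1, time_2):
    """Return time_2 - time_1 as a datetime.timedelta."""
    start = datetime.datetime.combine(ANCHOR, time_1)
    end = datetime.datetime.combine(ANCHOR, time_2)
    return end - start

df = pd.DataFrame({
    "name": ["name_a", "name_b"],
    "time_a": [datetime.time(8, 0), datetime.time(7, 45)],
    "time_b": [datetime.time(9, 0), datetime.time(8, 15)],
})

# Walk both columns in parallel and compute one difference per row
df["time_c"] = [time_diff(a, b) for a, b in zip(df["time_a"], df["time_b"])]
```

Note that if time_b is earlier than time_a (for example, a span crossing midnight), the result is a negative timedelta, so you would need to handle that case separately.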

Alternatively, you can convert to a timedelta by first converting to a string:

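import pandas as pd

# str() of a datetime.time gives "HH:MM:SS", which pd.to_timedelta can parse
df["time_a"] = pd.to_timedelta(df["time_a"].astype(str))
df["time_b"] = pd.to_timedelta(df["time_b"].astype(str))
df["time_c"] = df["time_b"] - df["time_a"]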
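A timedelta column prints as something like "0 days 01:00:00" rather than the "01:00:00" shown in the question. If you want that exact text, one option is to format it yourself. The sketch below assumes non-negative differences; `format_hms` and the `time_c_text` column are hypothetical names introduced only for this example.

```python
import pandas as pd

def format_hms(delta):
    """Hypothetical helper: render a non-negative Timedelta as HH:MM:SS."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

df = pd.DataFrame({
    "name": ["name_a", "name_b", "name_c", "name_d"],
    "time_a": ["08:00:00", "07:45:00", "07:00:00", "06:00:00"],
    "time_b": ["09:00:00", "08:15:00", "08:10:00", "10:00:00"],
})

# Convert the HH:MM:SS strings to timedeltas and subtract
df["time_c"] = pd.to_timedelta(df["time_b"]) - pd.to_timedelta(df["time_a"])

# Keep the timedelta column for arithmetic; add a text column for display
df["time_c_text"] = [format_hms(delta) for delta in df["time_c"]]
```

Keeping the numeric timedelta column alongside the text one means you can still sort, sum, or compare the durations later.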

Upvotes: 1
