anon

Reputation: 866

How to create a column timestamp while using apply in a pandas dataframe?

I am applying functions to pandas dataframe columns, for example:

def foo(x):
    return 1 + x

Then, I apply the function to a column:

df['foo'] = df['a_col'].apply(foo)

How can I also get a column with the number of milliseconds that foo takes to finish for each row? For instance:

A time_milisecs
2 0.1
4 0.2
4 0.3
3 0.3
4 0.2

Where A is the column that contains the result of the sum.

Upvotes: 0

Views: 68

Answers (1)

jpp

Reputation: 164693

You can use the time module. Since you also want a new series holding the result of the calculation, have the function return a (result, elapsed time) tuple, convert the resulting series of tuples to a dataframe, and assign it back as two columns.

Here's a demonstration:

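import time

import pandas as pd

df = pd.DataFrame({'A': [2, 4, 4, 3, 4]})

def foo(x):
    tstart = time.time()
    time.sleep(0.25)  # stand-in for the real work being timed
    tend = time.time()
    # return the result together with the elapsed time in milliseconds
    return 1 + x, (tend - tstart) * 10**3

# expand the series of tuples into two columns; index=df.index keeps the
# rows aligned even if df does not have a default integer index
df[['B', 'B_time']] = pd.DataFrame(df['A'].apply(foo).values.tolist(), index=df.index)

print(df)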

   A  B      B_time
0  2  3  250.014544
1  4  5  250.014305
2  4  5  250.014305
3  3  4  250.014305
4  4  5  250.014067

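From Python 3.7 onward, you can use time.perf_counter_ns, which returns wall-clock time as an integer number of nanoseconds. time.process_time_ns is also available, but it counts only CPU time, so time spent in time.sleep would not show up in the measurement.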
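If you would rather not edit every function you want to time, the same idea can be moved into a small wrapper. This is only a sketch: the names timed and clock are made up for illustration and are not part of pandas or the time module, and the 0.05 second sleep simply stands in for real work. It also shows how the choice of clock changes the result, since a CPU-time clock does not count time spent sleeping.

```python
import functools
import time

def timed(func, clock=time.perf_counter_ns):
    """Wrap func so it returns (result, elapsed_milliseconds).

    `timed` and `clock` are hypothetical names used for this sketch.
    `clock` must return an integer number of nanoseconds.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = clock()
        result = func(*args, **kwargs)
        elapsed_ms = (clock() - start) / 1e6  # nanoseconds -> milliseconds
        return result, elapsed_ms
    return wrapper

def foo(x):
    time.sleep(0.05)  # stand-in for real work
    return 1 + x

# Wall-clock timing: the sleep is included in the measurement.
wall_result, wall_ms = timed(foo)(2)

# CPU-time timing: the sleep is not counted, so the figure is much smaller.
cpu_result, cpu_ms = timed(foo, clock=time.process_time_ns)(2)

print(wall_result, wall_ms, cpu_ms)
```

With a dataframe, df['A'].apply(timed(foo)) should give the same kind of series of (result, time) tuples as in the demonstration above, so it can be expanded into two columns in the same way.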

Upvotes: 2
