Ottpocket

Reputation: 307

DataFrame Creation from DataFrame.apply

I have a function that returns a pd.DataFrame given a row of another dataframe:

import pandas as pd

def func(row):
    if row['col1'] == 1:
        return pd.DataFrame({'a': [1], 'b': [11]})
    else:
        return pd.DataFrame({'a': [-1, -2], 'b': [-11, -22]})

I want to apply func to each row of another dataframe and combine the results into a new dataframe, like below:

df = pd.DataFrame({'col1':[1,2,3],'col2':[11,22,33]})

# do some cool pd.DataFrame.apply stuff
# resulting in the below dataframe
pd.DataFrame({
    'a':[1,-1,-2,-1,-2],
    'b':[11,-11,-22,-11,-22]
})

Currently, I use the code below for the desired result:

# Series.iteritems() was removed in pandas 2.0; Series.items() is the equivalent
pd.concat([mini[1] for mini in df.apply(func, axis=1).items()])

While this works, it is fairly ugly. Is there a more elegant way to build the new dataframe from df?

Upvotes: 1

Views: 35

Answers (2)

Ottpocket

Reputation: 307

In the interest of speed, I ended up using the multiprocessing module to apply func in parallel to the dictionaries created from single rows. Here is the code:

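import multiprocessing
import os

# Each row reaches func as a plain dict, so row['col1'] still works.
# On platforms that spawn workers (Windows, macOS), run this under
# an `if __name__ == "__main__":` guard.
with multiprocessing.Pool(os.cpu_count()) as p:
    results = p.map(func, df.to_dict('records'))
pd.concat(results)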

Upvotes: 0

mozway

Reputation: 262284

You could use:

pd.concat(df.apply(func, axis=1).tolist())

Output:

   a   b
0  1  11
0 -1 -11
1 -2 -22
0 -1 -11
1 -2 -22
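
Note that the index above repeats (0, 0, 1, 0, 1), while the frame requested in the question has a default 0..4 index. A minimal sketch of one way to get that, assuming a plain list comprehension over `df.iterrows()` in place of `apply` (a swap made here for illustration, not part of the answer above); `ignore_index=True` can be passed to `pd.concat` in the same way with the `apply`-based call:

```python
import pandas as pd


def func(row):
    # Same function as in the question.
    if row['col1'] == 1:
        return pd.DataFrame({'a': [1], 'b': [11]})
    else:
        return pd.DataFrame({'a': [-1, -2], 'b': [-11, -22]})


df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [11, 22, 33]})

# Build one small frame per row, then concatenate them in a single call.
# ignore_index=True drops the repeated per-frame labels and renumbers from 0.
result = pd.concat(
    [func(row) for _, row in df.iterrows()],
    ignore_index=True,
)
print(result)
```

Concatenating once at the end, rather than appending inside a loop, avoids copying the accumulated frame on every iteration.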

Upvotes: 1
