add-semi-colons

Reputation: 18810

Pandas create a new column by applying function that returns dictionary

The goal is to create new columns from the values a function returns for each row of an existing DataFrame column.

df = pd.DataFrame({"A": ["The quick brown fox jumps over the lazy dog", "Glib jocks quiz nymph to vex dwarf"], "B": [10, 20]})

                                             A   B
0  The quick brown fox jumps over the lazy dog  10
1           Glib jocks quiz nymph to vex dwarf  20

There is an existing method:

def returnTopic(model, query, numberOftopics): 
    # strip out topics per query/row and return topics that are relevant to a query/row
    return topicDict

topicDict contains {'x': ['fox','brown'], 'y':['jumps','over','the']}

I want to create two new columns from these return elements in the dictionary.

                                             A   B  x               y
0  The quick brown fox jumps over the lazy dog  10 ['fox','brown'] ['jumps','over','the']
1           Glib jocks quiz nymph to vex dwarf  20

Here is my attempt:

df['x'] = df.apply(lambda x: returnTopic(tmodel['x'], x['A'], 2))

Upvotes: 1

Views: 1803

Answers (2)

David Dale

Reputation: 11424

You can create a new DataFrame from records, and concatenate it to the old one.

Like this:

import pandas as pd
df = pd.DataFrame({"A": ['The quick brown fox jumps over the lazy dog',
                         'Glib jocks quiz nymph to vex dwarf'], "B": [10, 20]})

def f(text, something_else):
    return {'x':len(text), 'y': text.count(' ')}

new_df = pd.concat([df, pd.DataFrame.from_records(df['A'].apply(lambda x: f(x, 0)))], axis=1)
print(new_df)

This prints:

                                             A   B   x  y
0  The quick brown fox jumps over the lazy dog  10  43  8
1           Glib jocks quiz nymph to vex dwarf  20  34  6
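The same pattern should also work when the function returns lists rather than numbers, as in the question. Here is a minimal sketch, assuming a hypothetical stand-in `fake_return_topic` for the asker's `returnTopic` (the real function and model are not shown, so the word slicing below is only placeholder logic):

```python
import pandas as pd

df = pd.DataFrame({"A": ["The quick brown fox jumps over the lazy dog",
                         "Glib jocks quiz nymph to vex dwarf"], "B": [10, 20]})

def fake_return_topic(text):
    # Hypothetical stand-in for returnTopic: returns a dict of two word lists
    words = text.split()
    return {"x": words[:2], "y": words[2:5]}

# One dict per row, collected into a plain list of records
records = df["A"].apply(fake_return_topic).tolist()

# Each dict key becomes a column; reuse df's index so the rows line up
topics = pd.DataFrame(records, index=df.index)

new_df = pd.concat([df, topics], axis=1)
print(new_df)
```

Passing `index=df.index` matters if the original frame does not have the default 0..n-1 index, because `pd.concat` with `axis=1` aligns rows by index label.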

Upvotes: 3

cs95

Reputation: 402333

Have your function return a pd.Series object:

def foo(x): 
    ...
    return pd.Series(topicDict)

Now call apply row-wise (axis=1), so that foo receives one row at a time:

v = df.apply(foo, axis=1)
v
              x                   y
0  [fox, brown]  [jumps, over, the]
1  [fox, brown]  [jumps, over, the]

Concatenate the result with the original, using pd.concat.

pd.concat([df, v], axis=1)

                                             A   B             x  \
0  The quick brown fox jumps over the lazy dog  10  [fox, brown]   
1           Glib jocks quiz nymph to vex dwarf  20  [fox, brown]   

                    y  
0  [jumps, over, the]  
1  [jumps, over, the]
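Putting the steps together, here is a self-contained sketch of this approach. The body of `foo` is an assumption: it slices the words of column `A` as placeholder logic in place of the real topic lookup, which the question does not show.

```python
import pandas as pd

df = pd.DataFrame({"A": ["The quick brown fox jumps over the lazy dog",
                         "Glib jocks quiz nymph to vex dwarf"], "B": [10, 20]})

def foo(row):
    # Placeholder logic (assumption): the real code would call returnTopic here
    words = row["A"].split()
    topic_dict = {"x": words[:2], "y": words[2:5]}
    # Returning a Series makes apply expand the keys into columns
    return pd.Series(topic_dict)

v = df.apply(foo, axis=1)               # DataFrame with columns x and y
result = pd.concat([df, v], axis=1)     # axis must be passed by keyword in pandas 2.x
print(result)
```

Each cell in `x` and `y` holds a Python list, so both columns have dtype `object`.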

Upvotes: 2
