Reputation: 18810
The goal is to create new columns from the values a function returns when it is applied, row by row, to an existing DataFrame column.
df = pd.DataFrame({"A": ["The quick brown fox jumps over the lazy dog", "Glib jocks quiz nymph to vex dwarf"], "B": [10, 20]})
A B
0 The quick brown fox jumps over the lazy dog 10
1 Glib jocks quiz nymph to vex dwarf 20
There is an existing function:
def returnTopic(model, query, numberOftopics):
    # strip out topics per query/row and return topics that are relevant to a query/row
    return topicDict
topicDict contains a dictionary such as {'x': ['fox','brown'], 'y':['jumps','over','the']}
I want to create two new columns, one per key of the returned dictionary.
A B x y
0 The quick brown fox jumps over the lazy dog 10 ['fox','brown'] ['jumps','over','the']
1 Glib jocks quiz nymph to vex dwarf 20
Here is my attempt:
df['x'] = df.apply(lambda x: returnTopic(tmodel['x'], x['A'], 2))
Upvotes: 1
Views: 1803
Reputation: 11424
You can create a new DataFrame from records, and concatenate it to the old one.
Like this:
import pandas as pd
df = pd.DataFrame({"A": ['The quick brown fox jumps over the lazy dog',
'Glib jocks quiz nymph to vex dwarf'], "B": [10, 20]})
def f(text, something_else):
    return {'x': len(text), 'y': text.count(' ')}
new_df = pd.concat([df, pd.DataFrame.from_records(df['A'].apply(lambda x: f(x, 0)))], axis=1)
print(new_df)
It will print:
A B x y
0 The quick brown fox jumps over the lazy dog 10 43 8
1 Glib jocks quiz nymph to vex dwarf 20 34 6
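The same record-based pattern also works when the dictionary values are lists, as in the question. The following is only a sketch: `fake_topics` is a hypothetical stand-in for `returnTopic` (which is not shown in full), and it simply slices the words of each sentence. Building the new frame with `index=df.index` and using `join` keeps the rows aligned even if `df` does not have a default index.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["The quick brown fox jumps over the lazy dog",
          "Glib jocks quiz nymph to vex dwarf"],
    "B": [10, 20],
})

def fake_topics(text):
    # Hypothetical stand-in for returnTopic: returns two lists of words.
    words = text.split()
    return {"x": words[:2], "y": words[2:5]}

# One dict per row, collected into a plain list of records.
records = df["A"].map(fake_topics).tolist()

# Each dict key becomes a column; the list values are stored as objects.
topics = pd.DataFrame(records, index=df.index)

# join aligns on the index, so the new columns land on the right rows.
result = df.join(topics)
print(result)
```

Here `result` has the columns `A`, `B`, `x` and `y`, with a list in each cell of `x` and `y`.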
Upvotes: 3
Reputation: 402333
Have your function return a pd.Series object:
def foo(x):
    ...
    return pd.Series(topicDict)
Now, call apply along the first axis (axis=1), so that foo receives one row at a time:
v = df.apply(foo, 1)
v
x y
0 [fox, brown] [jumps, over, the]
1 [fox, brown] [jumps, over, the]
Concatenate the result with the original, using pd.concat.
pd.concat([df, v], axis=1)
A B x \
0 The quick brown fox jumps over the lazy dog 10 [fox, brown]
1 Glib jocks quiz nymph to vex dwarf 20 [fox, brown]
y
0 [jumps, over, the]
1 [jumps, over, the]
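For reference, the steps above can be put together into one runnable sketch. The function `return_topics_stub` is a hypothetical placeholder for the question's `returnTopic` (the real one takes a model and is not shown), so the lists it produces are illustrative only. The axis is passed by keyword, since recent pandas versions no longer accept it positionally in `pd.concat`.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["The quick brown fox jumps over the lazy dog",
          "Glib jocks quiz nymph to vex dwarf"],
    "B": [10, 20],
})

def return_topics_stub(text, number_of_topics):
    # Hypothetical placeholder for returnTopic: slices the lowercased words.
    words = text.lower().split()
    n = number_of_topics
    return {"x": words[:n], "y": words[n:n + 3]}

def foo(row):
    # Returning a Series makes apply expand the dict keys into columns.
    return pd.Series(return_topics_stub(row["A"], 2))

# axis=1 passes each row to foo; the result is a DataFrame with columns x and y.
v = df.apply(foo, axis=1)

# Glue the new columns onto the original frame, side by side.
out = pd.concat([df, v], axis=1)
print(out)
```

Returning a Series from the applied function is convenient but can be slow on large frames, because a new Series is built for every row.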
Upvotes: 2