cap

Reputation: 349

Nested for loop using lambda function

I have a nested for loop something like:

count = 0
for x in df['text']:
  for i in x:
    if i in someList:
      count += 1

Here df['text'] is a Series of lists of words, such as ['word1', 'word2', 'etc'].
I know I can just use the for loop, but I want to convert it into a lambda function.
I tried:
df['in'] = df['text'].apply(lambda x: [count++ for i in x if i in someList])
but that is not valid syntax. How can I modify it to get the result I want?

Upvotes: 2

Views: 7864

Answers (3)

piRSquared

Reputation: 294258

Setup

import numpy as np
import pandas as pd

someList = [*'ABCD']
df = pd.DataFrame(dict(text=[*map(list, 'AB CD AF EG BH IJ ACDE'.split())]))

df

           text
0        [A, B]
1        [C, D]
2        [A, F]
3        [E, G]
4        [B, H]
5        [I, J]
6  [A, C, D, E]

Numpy and __contains__

i = np.arange(len(df)).repeat(df.text.str.len())
a = np.zeros(len(df), int)
np.add.at(a, i, [*map(someList.__contains__, np.concatenate(df.text))])
df.assign(**{'in': a})

           text  in
0        [A, B]   2
1        [C, D]   2
2        [A, F]   1
3        [E, G]   0
4        [B, H]   1
5        [I, J]   0
6  [A, C, D, E]   3
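
The NumPy version above is compact, so here is a rough step-by-step sketch of the same idea. The sample frame and the intermediate names (`lengths`, `row_ids`, `flat`, `hits`, `counts`) are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only to keep the example small
someList = ['A', 'B', 'C', 'D']
df = pd.DataFrame({'text': [['A', 'B'], ['E', 'G'], ['A', 'C', 'D', 'E']]})

# Number of elements in each row's list
lengths = df['text'].str.len().to_numpy()

# For every flattened element, the position of the row it came from
row_ids = np.arange(len(df)).repeat(lengths)

# All list elements laid out in one flat array
flat = np.concatenate(df['text'].tolist())

# 1 where the element is in someList, 0 otherwise
hits = np.array([w in someList for w in flat], dtype=int)

# Unbuffered add: accumulate each hit into the slot of its source row
counts = np.zeros(len(df), dtype=int)
np.add.at(counts, row_ids, hits)

print(counts.tolist())  # [2, 0, 3]
```

`np.add.at` is used instead of `counts[row_ids] += hits` because plain fancy-index assignment would apply only one increment per repeated index.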

map lambda and __contains__

df.assign(**{'in': df.text.map(lambda x: sum(map(someList.__contains__, x)))})

           text  in
0        [A, B]   2
1        [C, D]   2
2        [A, F]   1
3        [E, G]   0
4        [B, H]   1
5        [I, J]   0
6  [A, C, D, E]   3

Upvotes: 2

chepner

Reputation: 531065

You don't need any additional functions. Just create a sequence of ones (one per matching element) and sum it.

count = sum(1 for x in df['text'] for i in x if i in someList)
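
Note that this produces a single total over the whole column. If a per-row count is wanted instead, as the `df['in']` column in the question suggests, the same generator can probably be moved inside `apply`. A minimal sketch, assuming a small made-up frame (the data below is hypothetical):

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
someList = ['A', 'B', 'C', 'D']
df = pd.DataFrame({'text': [['A', 'B'], ['A', 'F'], ['E', 'G']]})

# Grand total: one 1 per matching element across all rows
count = sum(1 for x in df['text'] for i in x if i in someList)

# Per-row count: the inner generator alone, applied to each list
df['in'] = df['text'].apply(lambda x: sum(1 for i in x if i in someList))

print(count)               # 3
print(df['in'].tolist())   # [2, 1, 0]
```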

Upvotes: 2

BENY

Reputation: 323226

I think you need to expand each list into its own row of columns and then use isin, since with pandas we usually try to avoid explicit for loops.

df['in'] = pd.DataFrame(df['text'].tolist(), index=df.index).isin(someList).sum(axis=1)
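
To make the one-liner easier to follow, here it is split into its intermediate steps. This is only a sketch on a made-up frame, and the names `wide` and `mask` are introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data with lists of unequal length
someList = ['A', 'B', 'C', 'D']
df = pd.DataFrame({'text': [['A', 'B'], ['E', 'G'], ['A', 'C', 'D', 'E']]})

# One column per list position; shorter lists are padded with missing values
wide = pd.DataFrame(df['text'].tolist(), index=df.index)

# True where a cell holds a value from someList; padding cells are False
mask = wide.isin(someList)

# Summing booleans across each row counts the matches
df['in'] = mask.sum(axis=1)

print(df['in'].tolist())  # [2, 0, 3]
```

Building the wide frame costs memory proportional to the longest list times the number of rows, so this trades space for avoiding a Python-level loop.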

Upvotes: 4
