Reputation: 2533

How to embed a conditional in list comprehension?

Assume this dataset:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['John','William', 'Nancy', 'Susan', 'Robert', 'Lucy', 'Blake', 'Sally', 'Bruce', 'Mike', 'Jeff'],
    'injury': ['right hand broken', 'lacerated left foot', 'foot broken', 'right foot fractured', '', 'sprained finger', 'chest pain', 'swelling in arm', 'laceration to arms, hands, and foot', np.nan, 'swollen cheek']
    })

    name      injury
0   John      right hand broken
1   William   lacerated left foot
2   Nancy     foot broken
3   Susan     right foot fractured
4   Robert  
5   Lucy      sprained finger
6   Blake     chest pain
7   Sally     swelling in arm
8   Bruce     laceration to arms, hands, and foot
9   Mike      NaN
10  Jeff      swollen cheek

I reduce each injury to only the selected body parts:

selected_words = ["hand", "foot", "finger", "chest", "arms", "arm", "hands"]

df["injury"] = (
    df["injury"]
    .str.replace(",", "")
    .str.split(" ", expand=False)
    .apply(lambda x: ", ".join(set([i for i in x if i in selected_words])))
)

But this raises an error on the NaN value at index 9:

TypeError: 'float' object is not iterable

How would I modify the list comprehension such that:

it checks for NaN values
it outputs NaN for a row that is blank or contains no body part from selected_words (e.g. index 10)

The desired output is:

name        injury
0   John    hand
1   William foot
2   Nancy   foot
3   Susan   foot
4   Robert  NaN
5   Lucy    finger
6   Blake   chest
7   Sally   arm
8   Bruce   hand, foot, arm
9   Mike    NaN
10  Jeff    NaN

I tried the following:

.apply(lambda x: ", ".join(set([i for i in x if i in selected_words and i is not np.nan else np.nan])))

But the syntax is invalid.

Any assistance would be most appreciated. Thanks!

Upvotes: 3

Answers (2)

Mateus

Reputation: 64

You can call .dropna() before .apply(), so the lambda never receives the NaN row:

df["injury"].str.replace(",", "").str.split(" ", expand=False).dropna().apply(lambda x: ", ".join(set([i for i in x if i in selected_words])))

0                 hand
1                 foot
2                 foot
3                 foot
4                     
5               finger
6                chest
7                  arm
8    foot, hands, arms

Was this the result you wanted?
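
One detail about this approach: `.dropna()` removes row 9 from the intermediate result, but assigning the result back to `df["injury"]` aligns on the index, so that row comes back as NaN. Rows that end up as empty strings still need converting. The following is a minimal sketch on a shortened version of the sample data; `keep_selected` and `cleaned` are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd

selected_words = ["hand", "foot", "finger", "chest", "arms", "arm", "hands"]

# Shortened sample: one match, one blank, one NaN, one row with no match
df = pd.DataFrame({
    "name": ["John", "Robert", "Mike", "Jeff"],
    "injury": ["right hand broken", "", np.nan, "swollen cheek"],
})

def keep_selected(words):
    # words is always a list here, because NaN rows were dropped first
    return ", ".join(w for w in words if w in selected_words)

cleaned = (
    df["injury"]
    .str.replace(",", "", regex=False)
    .str.split(" ")
    .dropna()                 # removes the NaN row (index 2)
    .apply(keep_selected)
    .replace("", np.nan)      # blank and no-match rows become NaN
)

# Assignment aligns on the index, so the dropped row is filled with NaN
df["injury"] = cleaned
print(df)
```

This keeps the lambda-style filter unchanged and pushes the NaN handling into pandas methods on either side of it.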

Upvotes: 1

cafce25

Reputation: 27567

Your problem isn't that i is np.nan; it's that x is. After .str.split(), the missing row is still a float NaN rather than a list, and you can't iterate over a float in a comprehension. I think you probably want to turn your lambda into a named function and pass that, like so:

def get_set_of_body_parts(words):
    # A missing row arrives as np.nan instead of a list of words
    if words is np.nan:
        return np.nan
    else:
        return ", ".join(set([i for i in words if i in selected_words]))

df = pd.DataFrame({
    'name': ['John','William', 'Nancy', 'Susan', 'Robert', 'Lucy', 'Blake', 'Sally', 'Bruce', 'Mike', 'Jeff'],
    'injury': ['right hand broken', 'lacerated left foot', 'foot broken', 'right foot fractured', '', 'sprained finger', 'chest pain', 'swelling in arm', 'laceration to arms, hands, and foot', np.nan, 'swollen cheek']
    })

selected_words = ["hand", "foot", "finger", "chest", "arms", "arm", "hands"]

df["injury"] = (
   df["injury"]
   .str.replace(",", "")
   .str.split(" ", expand=False)
   .apply(get_set_of_body_parts)
)
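
The function above returns an empty string, not NaN, for the blank row and for rows with no match. A possible extension that covers those cases is sketched below, on a shortened version of the sample data. `extract_body_parts` is a hypothetical name, and the `isinstance` check is a swapped-in alternative to `words is np.nan` that does not depend on NaN being the same object.

```python
import numpy as np
import pandas as pd

selected_words = ["hand", "foot", "finger", "chest", "arms", "arm", "hands"]

def extract_body_parts(words):
    # After .str.split, a missing value is not a list, so pass NaN through
    if not isinstance(words, list):
        return np.nan
    # dict.fromkeys drops duplicates while keeping first-seen order
    matches = list(dict.fromkeys(w for w in words if w in selected_words))
    # No match (including the blank row) also becomes NaN
    return ", ".join(matches) if matches else np.nan

df = pd.DataFrame({
    "name": ["John", "Robert", "Bruce", "Mike", "Jeff"],
    "injury": [
        "right hand broken",
        "",
        "laceration to arms, hands, and foot",
        np.nan,
        "swollen cheek",
    ],
})

df["injury"] = (
    df["injury"]
    .str.replace(",", "", regex=False)
    .str.split(" ")
    .apply(extract_body_parts)
)
print(df)
```

Unlike `set`, this keeps the words in the order they appear in the text. It does not merge plurals, so "arms" and "hands" stay as written rather than becoming "arm" and "hand".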

If you really want a lambda, the same check fits in a conditional expression:

.apply(lambda x: np.nan if x is np.nan else ", ".join(set([i for i in x if i in selected_words])))
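
On the syntax question itself: a trailing `if` inside a comprehension is a filter and cannot take an `else`, which is why the attempted version fails to parse. A conditional expression (`a if cond else b`) can go before the `for` to map each item, or around the whole comprehension to guard the input, as the lambda above does. A small sketch of the three forms, where `parts_or_nan` is a hypothetical helper:

```python
import numpy as np

selected_words = ["hand", "foot", "finger", "chest", "arms", "arm", "hands"]
words = ["right", "hand", "broken"]

# Trailing `if` is a filter: non-matching items are dropped, no `else` allowed
filtered = [w for w in words if w in selected_words]

# A conditional expression before `for` keeps every item and maps it
mapped = [w if w in selected_words else None for w in words]

# To guard against a non-list input, wrap the whole comprehension
def parts_or_nan(x):
    return [w for w in x if w in selected_words] if isinstance(x, list) else np.nan

print(filtered)
print(mapped)
print(parts_or_nan(np.nan))
```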

Upvotes: 1
