Reputation: 820

Create a new column of lists in Pandas dataframe with unique values from another column

I have a dataframe with a column of lists:

    full_list_to_check
 0          NaN 
 1          NaN 
 2    [1, 2, 3, 4, 5] 
 3        [6, 6] 
 4        [11, 11]

I need to create a new column where it shows a distinct list for each row if duplicates exist in the list, otherwise just the same list.

  full_list_to_check            new_col
 0          NaN                   NaN
 1          NaN                   NaN
 2    [1, 2, 3, 4, 5]           [1, 2, 3, 4, 5]
 3        [6, 6]                  [6]
 4        [11, 11]                [11]

I have tried this:

df['new_col'] = df['full_list_to_check'].apply(lambda x: list(set(x)))

But I get this error:

TypeError: 'float' object is not iterable

Upvotes: 2

Answers (3)

luigigi

Reputation: 4215

You could use:

df['new_col'] = df['full_list_to_check'].apply(lambda x: list(set(x)) if isinstance(x,list) else x)

The other answers only works if there are no other values then lists or NaN in your data.

Upvotes: 2

Manualmsdos

Reputation: 1545

You must check Nan:

df['full_list_to_check'].apply(lambda x: list(set(x)) if not np.any(pd.isna(x)) else np.nan)

Update:

df['full_list_to_check'].apply(lambda x: list(set(x)) if x is not np.nan else np.nan)

0                NaN
1                NaN
2    [1, 2, 3, 4, 5]
3                [6]
4               [11]

Upvotes: 4

Mikhail

Reputation: 799

You can try this:

df['new_col'] = df.loc[~df['full_list_to_check'].isna(), 'full_list_to_check'].apply(lambda x: list(set(x)))

full_list_to_check new_col
0 NaN              NaN
1 NaN              NaN
2 [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
3 [6, 6]           [6]
4 [11, 11]         [11]

Upvotes: 2

Create a new column of lists in Pandas dataframe with unique values from another column

Answers (3)

Related Questions