Elisa L.
Elisa L.

Reputation: 297

How to copy a field based on a condition of another column in Python?

I need to copy a column's field into a variable, based on a specific condition, and then delete it.

This dataframe contains data of some kids, that have their favourite toy and colour associated:

data = {'Kid': ['Richard', 'Daphne', 'Andy', 'May', 'Claire', 'Mozart', 'Jane'],
        'Toy':  ['Ball', 'Doll', 'Car', 'Barbie', 'Frog', 'Bear', 'Doll'],
        'Colour': ['white', np.nan, 'red', 'pink', 'green', np.nan, np.nan]
        }

df = pd.DataFrame (data, columns = ['Kid', 'Toy','Colour'])

print (df)

The dataframe looks like this:

       Kid       Toy Colour
0  Richard      Ball  white
1   Daphne      Doll    NaN
2     Andy       Car    red
3      May    Barbie   pink
4   Claire      Frog  green
5   Mozart      Bear    NaN
6     Jane      Doll    NaN

The condition is: If a kid does have a toy, but it does not have a colour, then save both the kid and the toy in a separate array as follows and maintain the order/matching:

toy_array = ["Doll", "Bear", "Doll"]
kid_array = ["Daphne", "Mozart", "Jane"]

And then delete the toy from the dataframe. So the final dataframe should look like this:

       Kid     Toy Colour
0  Richard    Ball  white
1   Daphne     NaN    NaN
2     Andy     Car    red
3      May  Barbie   pink
4   Claire    Frog  green
5   Mozart     NaN    NaN
6     Jane     NaN    NaN

I got inspired by many sources, along with this one, and I tried this:

kid_array.append(df.loc[(df['Toy'] != np.nan) & (df['Colour'] == np.nan)])
print(kid_array)

I am at the very beginning, I highly appreciate all your help if you could possibly help me!

Upvotes: 3

Views: 105

Answers (2)

jezrael
jezrael

Reputation: 862511

Test missing and no misisng values by Series.isna and Series.notna and then set missing values to Toy column by DataFrame.loc:

mask = df['Toy'].notna() & df['Colour'].isna()

df.loc[mask, 'Toy'] = np.nan

Or in Series.mask:

df['Toy'] = df['Toy'].mask(mask)

Or by numpy.where:

df['Toy'] = np.where(mask, np.nan, df['Toy'])

print (df)
       Kid     Toy Colour
0  Richard    Ball  white
1   Daphne     NaN    NaN
2     Andy     Car    red
3      May  Barbie   pink
4   Claire    Frog  green
5   Mozart     NaN    NaN
6     Jane     NaN    NaN

If need lists:

toy_array = df.loc[mask, 'Toy'].tolist()
kid_array = df.loc[mask, 'Kid'].tolist()

print (toy_array)
['Doll', 'Bear', 'Doll']

print (kid_array)
['Daphne', 'Mozart', 'Jane']

Upvotes: 1

Dattaprasad Ekavade
Dattaprasad Ekavade

Reputation: 128

Your logic is correct, just the function to compare needs to be matched with the correct function used for comparison in Numpy Library

numpy.isnan()

Try the following code

kid_array.append(df.loc[(!numpy.isnan( df['Toy'])) & (!numpy.isnan(df['Colour']))])

Upvotes: 0

Related Questions