high54life
high54life

Reputation: 55

Pandas-copying values from one row to another with condition

given is the following code. I want to copy values from "price" to "newprice" when there is no value provided.

import pandas as pd
articles = {"number": ["111", "222", "333"],
          "price": [12.46, 33.66, 43.35],
          "name": ["Article1", "Article2", "Article3"],
          "newprice": [11.24, None, None]}
df = pd.DataFrame(articles)
print(df)

# Select empty entries
mask = df['newprice'].isnull()
# Version 1 (not working)
df.loc[mask, ['newprice']] = df.loc[mask, ['price']]
print(df)

# Output
#  number  price      name  newprice
# 0    111  12.46  Article1     11.24
# 1    222  33.66  Article2       NaN
# 2    333  43.35  Article3       NaN
# Version 2 (working)
df.loc[mask, ['newprice']] = df['price']
print(df)

# Output
#  number  price      name  newprice
# 0    111  12.46  Article1     11.24
# 1    222  33.66  Article2     33.66
# 2    333  43.35  Article3     43.35

In Version 1 there are still Nans in the "newprice" col, while on Version 2 the values are copyed as intended. I filter both sides of the assignment by the same mask, so there should an exact match. I mean to know that both sides of the assignemnt have to provide the exact same amount of results. So why is Version 1 not working?

Thanks in advance for your help.

Upvotes: 5

Views: 6258

Answers (2)

moys
moys

Reputation: 8033

Using the np.where is about 4-5 times faster that the loc with mask & is much more easier to understand.

df['newprice'] = np.where(df['newprice'].isnull(), df['price'], df['newprice'])

Output

number  price   name    newprice
0   111     12.46   Article1    11.24
1   222     33.66   Article2    33.66
2   333     43.35   Article3    43.35

enter image description here

Upvotes: -1

jezrael
jezrael

Reputation: 862481

I think here is problem if [] in both loc, so you assign one column DataFrame. There is different names price vs newprice, so failed.

Solution is remove [] for assign Series:

df.loc[mask, 'newprice'] = df.loc[mask, 'price']
print(df)
  number  price      name  newprice
0    111  12.46  Article1     11.24
1    222  33.66  Article2     33.66
2    333  43.35  Article3     43.35

Assign one column DataFrame is similar with no condition (failed):

df.loc[mask, 'newprice'] = df[['price']]

vs assign Series (working):

df.loc[mask, 'newprice'] = df['price']

Upvotes: 3

Related Questions