WF30
WF30

Reputation: 253

How do I move data from one column to another in Python DataFrame

There are Five columns in the DataFrame 'Product', 'Type', 'Rating', 'Reviews' and 'Category'.

The DataFrame looks like this:

  Product    Type     Rating  Reviews    Category 
1  'A'      'Free'     5       10         1
2  'B'        2       19        3         NaN
3  'C'      'Free'     3       15         1
4  'D'      'Paid'     4       20         2

I want to modify only rows that contain NaNs. In my case I need to move only the values in Index 2 (Row 2) by keeping the other indexes(rows) values unchanged.

The resulting DataFrame should look like this:

  Product    Type     Rating  Reviews    Category 
1  'A'      'Free'     5       10         1
2  'B'       NaN       2       19         3         
3  'C'      'Free'     3       15         1
4  'D'      'Paid'     4       20         2

Upvotes: 1

Views: 1994

Answers (3)

Pablo C
Pablo C

Reputation: 4771

You can achieve it with Dataframe.isnull, DataFrame.any and DataFrame.shift:

mask = df.isnull().any(1)
df.loc[mask,"Type":]] = df.loc[mask,"Type":]].shift(axis = 1)
print(df)
#  Product    Type Rating  Reviews  Category
#1     'A'  'Free'      5       10       1.0
#2     'B'     NaN      2       19       3.0
#3     'C'  'Free'      3       15       1.0
#4     'D'  'Paid'      4       20       2.0

With the mask you can determine which rows has NaN values.

Upvotes: 0

Mykola Zotko
Mykola Zotko

Reputation: 17911

You can use the method shift:

df.loc[2, 'Type':] = df.loc[2, 'Type':].shift()

Output:

  Product    Type Rating  Reviews  Category
1     'A'  'Free'      5       10       1.0
2     'B'     NaN      2       19       3.0
3     'C'  'Free'      3       15       1.0
4     'D'  'Paid'      4       20       2.0

Upvotes: 2

Vladyslav Sheruda
Vladyslav Sheruda

Reputation: 1875

One of the options is to utilize numpy.where, using which you can easily control your data flow.

It might not be the best way, but it's one of the easiest and it's pretty clear what happens here:

import numpy as np

(
   df_data.assign(
      Type=lambda df: np.where(
         pd.isnull(df["Category"]),
         np.nan,
         df["Type"]
      ),
      Rating=lambda df: np.where(
         pd.isnull(df["Category"]),
         df["Type"],
         df["Rating"]
      ),
      Reviews=lambda df: np.where(
         pd.isnull(df["Category"]),
         df["Rating"],
         df["Reviews"]
      ),
      Category=lambda df: np.where(
         pd.isnull(df["Category"]),
         df["Reviews"],
         df["Category"]
      )
   ) 
)

If multiple columns contain NaNs, you can use the next syntax:

np.where(
  (df["Category"].isna()) | (df["Type"].isna()),   # or
  ...
)

Upvotes: 0

Related Questions