Pankaj Singh
Pankaj Singh

Reputation: 586

Reassign multiple columns with a single value

Here is a sample data

import pandas as pd
cols = ['Country','Name','SomeNumber','SomeDate']
sourceData  = [('WI','Vivian',34,'#1985-01-01#'),
               ('IND','Sam',56,'#1988-02-01#'),
               ('NZ','Richard',324,'#1987-07-01#'),
               ('AUS','Don',98,'#1998-07-12#'),
               ('SL','Simth',101,'#2001-07-12#'),]
x = pd.DataFrame(sourceData,columns=cols)
x
  Country     Name  SomeNumber      SomeDate
0      WI   Vivian          34  #1985-01-01#
1     IND      Sam          56  #1988-02-01#
2      NZ  Richard         324  #1987-07-01#
3     AUS      Don          98  #1998-07-12#
4      SL    Simth         101  #2001-07-12#

What I want to do is, update every column, every value in the table with values missing, except 'Name' columns

Now, update the dataframe should look like this:

   Country     Name SomeNumber SomeDate
0  MISSING   Vivian    MISSING  MISSING
1  MISSING      Sam    MISSING  MISSING
2  MISSING  Richard    MISSING  MISSING
3  MISSING      Don    MISSING  MISSING
4  MISSING    Simth    MISSING  MISSING

Be advised, I do not want to do something like this considering, in the real world, I have 114 columns:

x['Country'] = 'MISSING'
x['SomeNumber'] = 'MISSING'
x['SomeDate'] = 'MISSING'

I tried this:

cols.remove('Name')
x[cols] = 'MISSING"

But it gave me the following warning, which I want to avoid:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#inde
xing-view-versus-copy
  inTardisMissingInSource[cols] = 'MISSING'
C:\tardis\desktop\environment\python\lib\site-packages\pandas\core\indexing.py:477: SettingWithCopyW
arning:

Upvotes: 1

Views: 311

Answers (1)

cs95
cs95

Reputation: 403248

The SettingWithCopyWarning is a good indication that you're using boolean indexing in the wrong place. You should instead use df.loc, like this:

In [1430]: x.loc[:, x.columns.difference(['Name'])] = 'MISSING'

In [1431]: x
Out[1431]: 
   Country     Name SomeNumber SomeDate
0  MISSING   Vivian    MISSING  MISSING
1  MISSING      Sam    MISSING  MISSING
2  MISSING  Richard    MISSING  MISSING
3  MISSING      Don    MISSING  MISSING
4  MISSING    Simth    MISSING  MISSING

The main bit is the x.columns.difference([...]). Pass in a list of column headers you want to exclude, and those columns will not be picked up for the assignment.

Note that this mixed assignment will change the dtype of the rows, use with caution.


If you don't want inplace assignment, you can use df.assign by unpacking a dictionary:

In [1435]: x.assign(**{ k : 'MISSING' for k in x.columns.difference(['Name'])})
Out[1435]: 
   Country     Name SomeNumber SomeDate
0  MISSING   Vivian    MISSING  MISSING
1  MISSING      Sam    MISSING  MISSING
2  MISSING  Richard    MISSING  MISSING
3  MISSING      Don    MISSING  MISSING
4  MISSING    Simth    MISSING  MISSING

Upvotes: 4

Related Questions