user1855463
user1855463

Reputation: 67

Assign value of existing column to new columns in pandas based on multiple conditions

I am trying to create a new column in pandas data frame based on existing columns.

column1 column2 column3 y1 y2 y3
100 200 300 2020 2021 2022
100 200 300 2021 2022 2023
100 200 300 2019 2020 2021

I want a new columns vals which take values of col1 if currentyear = y1 and col2 if currentyear = y2

column1 column2 column3 y1 y2 y3 Vals
100 200 300 2020 2021 2022 200
100 200 300 2021 2022 2023 100
100 200 300 2019 2020 2021 300

I am trying below code :

    def assignvalues(df):
        if df['y1'] == currentyear:
            df['Vals'] = df['col1']
        elif df['y2'] == currentyear:
            df['Vals'] = df['col2']
        elif df['y3'] == currentyear:
            df['Vals'] = df['col3']
df.apply(assignvalues)

It does create columns but doesn't store any values.

Upvotes: 1

Views: 1316

Answers (2)

SeaBean
SeaBean

Reputation: 23217

You can use np.select() for faster execution, as follows:

import numpy as np

currentyear = pd.to_datetime('now').year

condlist = [df['y1'] == currentyear, 
            df['y2'] == currentyear, 
            df['y3'] == currentyear]

choicelist = [df['column1'],
              df['column2'],
              df['column3']]            

df['Vals'] = np.select(condlist, choicelist, default=np.nan)

The default value if no match of currentyear is NaN, you can set it to 0 or other values at your choice by modifying the default= parameter.

Result:

print(df)

   column1  column2  column3    y1    y2    y3  Vals
0      100      200      300  2020  2021  2022   200
1      100      200      300  2021  2022  2023   100
2      100      200      300  2019  2020  2021   300

Upvotes: 1

tlentali
tlentali

Reputation: 3455

From your DataFrame :

>>> import pandas as pd
>>> from io import StringIO

>>> df = pd.read_csv(StringIO("""
... column1,column2,column3,y1,y2,y3
... 100,200,300,2020,2021,2022
... 100,200,300,2021,2022,2023
... 100,200,300,2019,2020,2021"""))
>>> df
    column1 column2 column3 y1      y2      y3
0   100     200     300     2020    2021    2022
1   100     200     300     2021    2022    2023
2   100     200     300     2019    2020    2021

And the function assignvalues, which now return the value from the expected column for each if. We set the currentyear at 2021 for example :

>>> def assignvalues(df):
...     if df['y1'] == currentyear:
...         return df['column1']
...     elif df['y2'] == currentyear:
...         return df['column2']
...     elif df['y3'] == currentyear:
...         return df['column3']

>>> currentyear = 2021

We can assign to df["Vals"] an apply(), as you did, with an axis=1 parameter to get the expected result :

>>> df["Vals"] = df.apply(assignvalues, axis=1)
>>> df
    column1 column2 column3 y1      y2      y3      Vals
0   100     200     300     2020    2021    2022    200
1   100     200     300     2021    2022    2023    100
2   100     200     300     2019    2020    2021    300

Upvotes: 1

Related Questions