Richard Kapustynskyj
Richard Kapustynskyj

Reputation: 95

How can I use the value of a cell in a row to chose find a column name in a pandas dataframe?

I have a dataframe that looks like this

   Index  Variable1  Value1  Variable2  Value2  Cat  Dog  Cow
    1      Cat        7       Sheep      7       0    0    0
    2      Sheep      2       Cat        6       0    0    0
    3      Cow        3       Dog        2       0    0    0

How can I efficiently populate the Cat, Dog and Cow columns with the values in the Value column when the Variable column is equal to the appropriate column name? So it looks something like this

Index  Variable1  Value1  Variable2  Value2  Cat  Dog  Cow
1      Cat        7       Sheep      7       7    0    0
2      Sheep      2       Cat        6       6    0    0
3      Cow        3       Dog        2       0    2    3

I've made a nested for loop that loops over each "Variable" column and then each row in that column that populates the data of each animal based on the value in that cell. But I'm 100% sure this is bad way of doing this.

Upvotes: 3

Views: 158

Answers (3)

jezrael
jezrael

Reputation: 863256

Use wide_to_long for reshape, then filter values by list by DataFrame.query, reshape by Series.unstack and last use DataFrame.update:

df = pd.DataFrame({'Variable1': ['Cat', 'Sheep', 'Cow'],
                   "Value1":[7, 2, 3], 
                   "Variable2": ['Sheep','Cat','Dog'], 
                   'Value2':[7,6,2],
                   'Cat':[0,0,0],
                   'Dog':[0,0,0],
                   'Cow':[0,0,0]}, index=[1,2,3])

L = ['Cat','Dog','Cow']
#or if possible select last 3 column names
#L = df.columns[-3:]
df1 = (pd.wide_to_long(df.reset_index(), ['Variable','Value'],i='index', j='tmp')
        .reset_index(level=1, drop=True)
        .query("Variable in @L")
        .set_index('Variable', append=True)['Value']
        .unstack(fill_value=0))
print (df1)
Variable  Cat  Cow  Dog
index                  
1           7    0    0
2           6    0    0
3           0    3    2

df.update(df1)
print (df)
  Variable1  Value1 Variable2  Value2  Cat  Dog  Cow
1       Cat       7     Sheep       7    7    0    0
2     Sheep       2       Cat       6    6    0    0
3       Cow       3       Dog       2    0    2    3

Upvotes: 0

sammywemmy
sammywemmy

Reputation: 28709

#create variables
vals = df.filter(like = 'Value').columns
variables = df.filter(like = 'Variable').columns
animals = df.iloc[:,-3:].columns

#lump all 'Variable_' and 'Value_' into one df
res = pd.concat(df.filter(ent).set_axis(['val','var'],axis=1) for ent in zip(vals,variables))
res

    val var
0   7   Cat
1   2   Sheep
2   3   Cow
0   7   Sheep
1   6   Cat
2   2   Dog

#pivot res
out = (res
       .pivot(columns='var',values='val')
       .fillna(0)
       .astype(int)
       .filter(animals)
      )
out


var Cat Dog Cow
0   7   0   0
1   6   0   0
2   0   2   3

#final result
result = pd.concat([df.iloc[:,:-3],out],axis=1)
result


  Index Variable1   Value1  Variable2   Value2  Cat Dog Cow
0   1     Cat         7      Sheep       7       7   0   0
1   2     Sheep       2      Cat         6       6   0   0
2   3     Cow         3      Dog         2        0  2   3

Upvotes: 1

whege
whege

Reputation: 1441

A good solution would be vectorized operations, which are generally faster than loops. NumPy comes into its own here with np.where:

import pandas as pd
import numpy as np

df = pd.DataFrame.from_dict({'V1': ['Cat', 'Sheep', 'Cow'],
"Va1":[7, 2, 3], "v2": ['Sheep','Cat','Dog'], 'va2':[7,6,2]})

df['Cat'] = np.where(df['V1'] == 'Cat', df['Va1'], np.where(df['v2'] == 'Cat', df['va2'], 0))
df['Dog'] = np.where(df['V1'] == 'Dog', df['Va1'], np.where(df['v2'] == 'Dog', df['va2'], 0))
df['Cow'] = np.where(df['V1'] == 'Cow', df['Va1'], np.where(df['v2'] == 'Cow', df['va2'], 0))

enter image description here

Basically it checks Variable1 for the animal in question, and on matching fills with Value1, else it does the same check for Variable and Value 2, otherwise filles with 0.

Upvotes: 0

Related Questions