Kiran

Reputation: 159

Using pandas, how do I assign a weightage to some parameters (columns) and get an output based on those weights?

In my data set, all parameters are qualitative variables.

When all the parameters (columns) have different values in a single row, we assign a weightage to each variable:

for the Irrigated column we give 40% weightage,
for Soil we give 35% weightage, and
for Seed Variety we give 25% weightage.

Any suggestions would be helpful.

>>> import pandas as pd
>>> data = {'District':  ['Ahmednagar', 'Aurangabad','Jalna','Buldhana','Amravati','Nashik','Pune','Palghar'],
        'Soil': ['B','A','D','D','A','B','D','A' ],
    'Irrigated': ['B','B','D','A','A','B','C','A' ],
    'Seed Variety': ['A','B','B','B','A','A','A','D']
        }
>>> data
{'District': ['Ahmednagar', 'Aurangabad', 'Jalna', 'Buldhana', 'Amravati', 'Nashik', 'Pune', 'Palghar'], 'Soil': ['B', 'A', 'D', 'D', 'A', 'B', 'D', 'A'], 'Seed Variety': ['A', 'B', 'B', 'B', 'A', 'A', 'A', 'D'], 'Irrigated': ['B', 'B', 'D', 'A', 'A', 'B', 'C', 'A']}
>>> df = pd.DataFrame (data, columns = ['District','Soil','Irrigated','Seed Variety'])
>>> df
     District  ... Seed Variety
0  Ahmednagar  ...            A
1  Aurangabad  ...            B
2       Jalna  ...            B
3    Buldhana  ...            B
4    Amravati  ...            A
5      Nashik  ...            A
6        Pune  ...            A
7     Palghar  ...            D

[8 rows x 4 columns]
>>> 

Upvotes: 2

Views: 306

Answers (1)

filbranden

Reputation: 8898

so when all parameters giving different value, then it will be select output for Irrigated column value [...] if more than 2 times repeated then output will be display as which value repeated 2 times.

That means the only case in which the output differs from "Irrigated" is when the other two columns, "Soil" and "Seed Variety", have the same value as each other.

So I'd start by populating "Output" to match "Irrigated", and then, in a follow-up step, set it to the value of one of the other columns in the rows where those two columns agree:

df['Output'] = df['Irrigated']
df.loc[df['Soil'] == df['Seed Variety'], 'Output'] = df['Soil']

That should do it.
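
As a quick check, here is a minimal, self-contained sketch of the same two steps run against the data from the question. The helper name `pick_output` and the one-row `demo` frame are hypothetical, added only for illustration. Note that in the question's sample data "Soil" and "Seed Variety" only agree in the Amravati row, where all three columns are 'A', so "Output" ends up identical to "Irrigated" there; the `demo` row shows the override actually taking effect.

```python
import pandas as pd


def pick_output(frame):
    """Start from 'Irrigated'; use 'Soil' where Soil and Seed Variety agree."""
    out = frame['Irrigated'].copy()
    agree = frame['Soil'] == frame['Seed Variety']
    out.loc[agree] = frame.loc[agree, 'Soil']
    return out


# Data copied from the question.
df = pd.DataFrame({
    'District': ['Ahmednagar', 'Aurangabad', 'Jalna', 'Buldhana',
                 'Amravati', 'Nashik', 'Pune', 'Palghar'],
    'Soil': ['B', 'A', 'D', 'D', 'A', 'B', 'D', 'A'],
    'Irrigated': ['B', 'B', 'D', 'A', 'A', 'B', 'C', 'A'],
    'Seed Variety': ['A', 'B', 'B', 'B', 'A', 'A', 'A', 'D'],
})
df['Output'] = pick_output(df)
print(df[['District', 'Output']])

# Hypothetical row (not in the question): Soil and Seed Variety agree
# on 'A' and outvote Irrigated's 'B'.
demo = pd.DataFrame({'Soil': ['A'], 'Irrigated': ['B'], 'Seed Variety': ['A']})
print(pick_output(demo).tolist())
```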

Later on, if you want to calculate the total percentage, you can do that by comparing the resulting "Output" to each source column and multiplying each match by that column's weight:

df['Output(%)'] = (
    (df['Output'] == df['Soil']) * 35.0 +
    (df['Output'] == df['Irrigated']) * 40.0 +
    (df['Output'] == df['Seed Variety']) * 25.0
)
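
If the weights or the number of columns might change later, one alternative is to compute the winner directly as a weighted vote: for each row, add up the weights of the columns that share a value and pick the value with the largest total. This is a different technique from the two-step assignment above, not a restatement of it, but with the 40/35/25 weights it gives the same "Output" and "Output(%)". The names `WEIGHTS` and `weighted_vote` are hypothetical, and the row-wise `apply` is a sketch that favors readability over speed.

```python
import pandas as pd

# Weights from the question; listed first = wins ties (none occur with 40/35/25).
WEIGHTS = {'Irrigated': 40.0, 'Soil': 35.0, 'Seed Variety': 25.0}


def weighted_vote(row):
    """Sum the weight behind each distinct value in the row; return the winner."""
    totals = {}
    for col, weight in WEIGHTS.items():
        totals[row[col]] = totals.get(row[col], 0.0) + weight
    winner = max(totals, key=totals.get)
    return pd.Series({'Output': winner, 'Output(%)': totals[winner]})


# Data copied from the question.
df = pd.DataFrame({
    'District': ['Ahmednagar', 'Aurangabad', 'Jalna', 'Buldhana',
                 'Amravati', 'Nashik', 'Pune', 'Palghar'],
    'Soil': ['B', 'A', 'D', 'D', 'A', 'B', 'D', 'A'],
    'Irrigated': ['B', 'B', 'D', 'A', 'A', 'B', 'C', 'A'],
    'Seed Variety': ['A', 'B', 'B', 'B', 'A', 'A', 'A', 'D'],
})
result = df.join(df.apply(weighted_vote, axis=1))
print(result[['District', 'Output', 'Output(%)']])

# Hypothetical row (not in the question): Soil + Seed Variety (60) beat Irrigated (40).
extra = weighted_vote(pd.Series({'Soil': 'A', 'Irrigated': 'B', 'Seed Variety': 'A'}))
print(extra['Output'], extra['Output(%)'])
```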

Upvotes: 1
