dsexplorer
dsexplorer

Reputation: 105

Use column headers to find matching value and get value in matching column for the row

I have a dataframe

 ID | calls_to_cust| 1 | 2 |  3| 4  | 5 
 123| 3            | 43| 12| 12| 3  | 42 
 133| 2            | 42| 32| 13| 4  | 7
 134| 6            | 42| 32| 13| 4  | 7  

My result should be

 ID | calls_to_cust| 1 | 2 |  3| 4  | 5   |value
 123| 3            | 43| 12| 12| 3  | 42  |12
 133| 2            | 42| 32| 13| 4  | 7   |32
 134| 6            | 42| 32| 13| 4  | 7   |7

I want to look up the value in calls_to_cust in the column headers and use the matching value for that row and column number to be the value that is in the new column value

def match(x):
    for i in x.columns:
        if x['total_calls_hcp'] == i:
            return i.value()
    else:
        return 0

I tried writing a function but it is not working out. I cannot use the df.lookup because the values are float and throwing an error.

EDIT: If the value does not exist then I want to pull the value of the max

Upvotes: 1

Views: 878

Answers (3)

anky
anky

Reputation: 75080

You can also use numpy here to vectorize this along with Index.get_indexer to find the index of the column (map(str) to convert the values to string if the columns are strings and not int)

df['value'] = df.to_numpy()[df.index,df.columns.get_indexer(df['calls_to_cust'].map(str))]

print(df)

    ID  calls_to_cust   1   2   3  4   5  value
0  123              3  43  12  12  3  42     12
1  133              2  42  32  13  4   7     32

Upvotes: 1

Henry Ecker
Henry Ecker

Reputation: 35636

Can use pd.factorize then numpy indexing to lookup:

import numpy as np

idx, cols = pd.factorize(df['calls_to_cust'])
df['value'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]

df:

    ID  calls_to_cust   1   2   3  4   5  value
0  123              3  43  12  12  3  42     12
1  133              2  42  32  13  4   7     32

Upvotes: 1

Andrej Kesely
Andrej Kesely

Reputation: 195438

Try:

df["value"] = df.apply(lambda x: x[str(x["calls_to_cust"])], axis=1)
print(df)

Prints:

    ID  calls_to_cust   1   2   3  4   5  value
0  123              3  43  12  12  3  42     12
1  133              2  42  32  13  4   7     32

Upvotes: 2

Related Questions