Reputation: 23
I am looking for a way to pick the N smallest values in each row of a DataFrame, together with their column names. Here is what I have so far:
import numpy as np
import pandas as pd

np.random.seed(1)
df = pd.DataFrame(np.random.randint(1, 10, (10, 10)))
df.columns = list('ABCDEFGHIJ')
N = 2
# column positions of the N smallest values in each row
idx = np.argsort(df.values, 1)[:, 0:N]
df = pd.concat([pd.DataFrame(df.values.take(idx), index=df.index), pd.DataFrame(df.columns[idx], index=df.index)], keys=['Value', 'Columns']).sort_index(level=1)
Now I have the column position of every value, but when I try to get the values from the DataFrame, it only takes values from the first row. What do I have to change in the code?
The original df (before it is overwritten by the concat) looks like:
  A B C D E F G H I J
0 6 9 6 1 1 2 8 7 3 5
1 6 3 5 3 5 8 8 2 8 1
2 7 8 7 2 1 2 9 9 4 9
....
My output should look like:
0 D E
0 1 1
1 J H
1 1 2
Upvotes: 2
Views: 213
Reputation: 20669
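You can use np.take_along_axis to take the values from the DataFrame row by row. ndarray.take without an axis argument indexes into the flattened array, which is why the code in the question only returns values from the first row. Then use np.insert to interleave the values with the corresponding column names.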
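# idx is the same as the one used in the question; df is the original
# DataFrame, before the question's pd.concat overwrote it.
vals = np.take_along_axis(df.values, idx, axis=1)  # row-wise lookup
cols = df.columns.values[idx]  # matching column names
indices = np.r_[: len(vals)]  # same as np.arange(len(vals))
# insert each row of cols before the matching row of vals
out = np.insert(vals.astype(str), indices, cols, axis=0)
index = np.repeat(indices, 2)  # each original row label appears twice
df = pd.DataFrame(out, index=index)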
This gives:
  0 1
0 D E
0 1 1
1 J H
1 1 2
2 E D
2 1 2
3 E I
3 2 2
4 A D
4 1 1
5 I J
5 1 3
6 E I
6 1 2
7 B H
7 1 3
8 G I
8 1 1
9 E A
9 1 2
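To make the difference between the two lookups easier to see, here is a minimal, self-contained sketch on a small hand-written frame (hypothetical data, not the random frame from the question). It compares ndarray.take without an axis against np.take_along_axis, and then builds a labelled layout similar to the one the question's pd.concat was aiming for. The names flat and result and the use of kind="stable" are my own choices for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

# Small hand-written frame so the result can be checked by eye
# (hypothetical data, not the random frame from the question).
df = pd.DataFrame(
    [[6, 9, 1, 3],
     [4, 2, 8, 5],
     [7, 0, 3, 1]],
    columns=list("ABCD"),
)
N = 2

# Column positions of the N smallest values in each row.
# kind="stable" keeps tied values in left-to-right column order.
idx = np.argsort(df.values, axis=1, kind="stable")[:, :N]

# ndarray.take without an axis indexes the flattened array, so every
# position in idx (0..3 here) points into the first row.
flat = df.values.take(idx)

# take_along_axis applies each row of idx to the matching row of the data.
vals = np.take_along_axis(df.values, idx, axis=1)
cols = df.columns.values[idx]

# One "Columns" row and one "Value" row per original row,
# grouped by the original row label.
result = (
    pd.concat(
        [pd.DataFrame(cols, index=df.index), pd.DataFrame(vals, index=df.index)],
        keys=["Columns", "Value"],
    )
    .swaplevel()
    .sort_index()
)
print(result)
```

In this sketch, flat only ever contains values from the first row (6, 9, 1, 3), while vals holds the two smallest values of each row. Unlike the np.insert approach, the values in result are not converted to strings, and each row is labelled as either "Columns" or "Value".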
Upvotes: 2