Reputation: 149
I have a data frame in pandas, where 1 appears in different columns for every rows. The column where 1 appears for the first time in a row is different for different rows. I need to create an additional column (column index) in which as value I want to return the index number of the column where 1 appears for the first time in that row.
Example dataframe:
IDs q1 q2 q3 q4 q5 q6 q7 q8
1111 0 0 0 1 0 0 0 1
1122 0 0 1 0 0 1 0 0
the output should like this:
IDs q1 q2 q3 q4 q5 q6 q7 q8 column_index
1111 0 0 0 1 0 0 0 1 5
1122 0 0 1 0 0 1 0 0 4
It would be helpful if anyone can provide the code useful in pandas. Thanks in advance.
Upvotes: 2
Views: 511
Reputation: 670
You can always just write a simple function and then use apply on the dataframe.
def get_first(row):
for i, col in enumerate(row.index.tolist()):
if row[col] == 1:
return i
df['column_index'] = df.apply(get_first, axis=1)
Probably a cool tricky way to do this with pandas, but this works.
You could also do this if you don't want to write a function, but it's a lot less readable
df['first_col'] = df.apply(lambda row: [row.index.tolist().index(c) for c in row.index.tolist() if row[c] == 1][0], axis=1)
Upvotes: 2
Reputation: 25259
A simple idxmax
and get_indexer
from df.columns
df['column_index'] = df.columns.get_indexer(df.drop('IDs',1).idxmax(1))+1
Out[52]:
IDs q1 q2 q3 q4 q5 q6 q7 q8 column_index
0 1111 0 0 0 1 0 0 0 1 5
1 1122 0 0 1 0 0 1 0 0 4
Upvotes: 2
Reputation: 157
try something very basilar such as follows:
for i in range(df.iloc[:,0].size):
j=0
while df.iloc[i,j]=!1:
df.iloc[i,'index column']=j
j=j+1
regards
Upvotes: 1