Reputation: 565
I wish to extract a DataFrame of (floating-point) numbers based on the position of the first instance of MrkA and Mrk1. I am not interested in the second instance of MrkA, because I already know which columns to extract via the df1 line.
Input:
df = pd.DataFrame({'A':['sdfg',23,'MrkA',34,0,56],'B':['jfgh',23,'sdfg','MrkB',0,56], 'C':['cvb',7,'dsfgA','ghks',47,3],'D':['rrb',7,'gfd',3,0,7],'E':['dfg',7,'gfd',5,12,1],'F':['dfg',7,'sdfA',5,0,4],'G':['dfg',7,'sdA',5,8,9],'H':['dfg',7,'gfA',5,0,8],'I':['dfg',7,'sdfA',5,7,23]})
      A     B      C     D     E     F    G    H     I
0  sdfg  jfgh    cvb   rrb   dfg   dfg  dfg  dfg   dfg
1    23    23      7     7     7     7    7    7     7
2  MrkA  sdfg  dsfgA  MrkA   gfd  sdfA  sdA  gfA  sdfA
3    34  Mrk1   ghks     3  Mrk2     5    5    5     5
4     0     0     47     0    12     0    8    0     7
5    56    56      3     7     1     4    9    8    23
for i,j in range(df.shape[1]):
    for k,l in range(df.shape[0]):
        if df.iloc[k,i] == 'MrkA' and df.iloc[l,j] == 'Mrk1':
            col = i
            row = k
            df1=df.iloc[row+2:,[col,col+1,col+2,col+4,col+5,col+7,col+8]]
            break
Output: cannot unpack non-iterable int object
Desired Output:
    A   B   C   E  F  H   I
4   0   0  47  12  0  0   7
5  56  56   3   1  4  8  23
How shall I proceed?
Upvotes: 0
Views: 61
Reputation: 8318
Your problem is that df.shape[0] and df.shape[1] are each a single integer, so range(value) yields one integer per iteration. Trying to unpack each of those integers into two loop variables (for i,j in ...) is what causes the error.
It should be:
for i in range(df.shape[1]):
    for j in range(df.shape[0]):
Then you can apply the desired logic to extract the rows.
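To make the cause of the error concrete, here is a minimal sketch in plain Python. The tuple `shape` is a stand-in I am assuming for `df.shape` (6 rows, 9 columns, as in the question), and `error_message` and `pairs` are illustrative names only.

```python
# Stand-in for df.shape of the question's DataFrame: 6 rows, 9 columns.
shape = (6, 9)

error_message = None
try:
    # range() yields one int per iteration, so "i, j" has nothing to unpack.
    for i, j in range(shape[1]):
        pass
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Nested loops with one variable each visit every (row, column) pair instead.
pairs = [(r, c) for r in range(shape[0]) for c in range(shape[1])]
print(len(pairs))
```

The same nesting applies when the loop bodies index into the DataFrame with `df.iloc[r, c]`.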
Note that it's unclear why you ignore the second row (index 1), which is also all numeric. If that is only a typo, you can try the following to extract all the fully numeric rows and apply your logic from there:
df[df.applymap(np.isreal).all(1)]
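As a variation on the one-liner above, the following sketch keeps only the fully numeric rows without relying on `applymap` (which recent pandas releases deprecate in favor of `DataFrame.map`). It swaps in `pd.to_numeric(..., errors="coerce")`, which is my substitution rather than the method from the answer, and it uses a cut-down three-column copy of the question's data to stay short.

```python
import pandas as pd

# Cut-down copy of the question's DataFrame (columns A, B, C only).
df = pd.DataFrame({
    'A': ['sdfg', 23, 'MrkA', 34, 0, 56],
    'B': ['jfgh', 23, 'sdfg', 'MrkB', 0, 56],
    'C': ['cvb', 7, 'dsfgA', 'ghks', 47, 3],
})

# Convert each column to numbers; anything non-numeric becomes NaN.
as_numbers = df.apply(pd.to_numeric, errors="coerce")

# Keep the rows in which every cell survived the conversion.
numeric_rows = df[as_numbers.notna().all(axis=1)]
print(numeric_rows)
```

On this data the surviving rows are those with index 1, 4 and 5, so row 1 would still need to be dropped by whatever rule excludes it.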
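It is also not clear from your specific example what the logic is: the DataFrame you construct contains MrkB rather than Mrk1, and it is unclear why the D column disappeared from the desired output. A hard-coded example that finds the MrkA/MrkB pair and prints every row from two below MrkA onward (all columns kept) should be something similar to the following: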
import pandas as pd
df = pd.DataFrame({'A':['sdfg',23,'MrkA',34,0,56],'B':['jfgh',23,'sdfg','MrkB',0,56], 'C':['cvb',7,'dsfgA','ghks',47,3],'D':['rrb',7,'gfd',3,0,7],'E':['dfg',7,'gfd',5,12,1],'F':['dfg',7,'sdfA',5,0,4],'G':['dfg',7,'sdA',5,8,9],'H':['dfg',7,'gfA',5,0,8],'I':['dfg',7,'sdfA',5,7,23]})
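for r in range(0, df.shape[0] - 1):
    for c in range(df.shape[1] - 1):
        # MrkA with MrkB one row down and one column to the right
        if df.iloc[r, c] == "MrkA" and df.iloc[r + 1, c + 1] == "MrkB":
            # print everything from two rows below MrkA onward
            print(df.iloc[r + 2:, :])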
This gives:
    A   B   C  D   E  F  G  H   I
4   0   0  47  0  12  0  8  0   7
5  56  56   3  7   1  4  9  8  23
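If the goal is the seven-column output from the question, one possible sketch is to locate the first MrkA without explicit loops and then reuse the column offsets from the question's df1 line. It assumes that at least one MrkA exists and that the offsets 0, 1, 2, 4, 5, 7, 8 are the intended ones; `hits` and `offsets` are illustrative names, and the MrkB neighbor check from the loop above is left out for brevity.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['sdfg', 23, 'MrkA', 34, 0, 56],
    'B': ['jfgh', 23, 'sdfg', 'MrkB', 0, 56],
    'C': ['cvb', 7, 'dsfgA', 'ghks', 47, 3],
    'D': ['rrb', 7, 'gfd', 3, 0, 7],
    'E': ['dfg', 7, 'gfd', 5, 12, 1],
    'F': ['dfg', 7, 'sdfA', 5, 0, 4],
    'G': ['dfg', 7, 'sdA', 5, 8, 9],
    'H': ['dfg', 7, 'gfA', 5, 0, 8],
    'I': ['dfg', 7, 'sdfA', 5, 7, 23],
})

# (row, col) positions of every cell equal to "MrkA", in row-major order.
hits = np.argwhere((df == "MrkA").to_numpy())

# Use the first instance only (assumes at least one match exists).
row, col = (int(v) for v in hits[0])

# Column offsets copied from the question's df1 line.
offsets = [0, 1, 2, 4, 5, 7, 8]
df1 = df.iloc[row + 2:, [col + k for k in offsets]]
print(df1)
```

With this data the first MrkA sits at row 2, column A, so df1 holds rows 4 and 5 of columns A, B, C, E, F, H and I.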
Upvotes: 1