Unpacking error with multiple variables in for loop

Question

I want to extract a DataFrame of floating-point numbers based on the first occurrence of the markers MrkA and Mrk1. I am not interested in the second occurrence of MrkA, because the df1 line in the loop below already specifies which columns to extract.

Input:

df = pd.DataFrame({'A':['sdfg',23,'MrkA',34,0,56],'B':['jfgh',23,'sdfg','MrkB',0,56], 'C':['cvb',7,'dsfgA','ghks',47,3],'D':['rrb',7,'gfd',3,0,7],'E':['dfg',7,'gfd',5,12,1],'F':['dfg',7,'sdfA',5,0,4],'G':['dfg',7,'sdA',5,8,9],'H':['dfg',7,'gfA',5,0,8],'I':['dfg',7,'sdfA',5,7,23]})

      A     B      C     D     E     F    G    H     I
0  sdfg  jfgh    cvb   rrb   dfg   dfg  dfg  dfg   dfg
1    23    23      7     7     7     7    7    7     7
2  MrkA  sdfg  dsfgA  MrkA   gfd  sdfA  sdA  gfA  sdfA
3    34  Mrk1   ghks     3  Mrk2     5    5    5     5
4     0     0     47     0    12     0    8    0     7
5    56    56      3     7     1     4    9    8    23

for i,j in range(df.shape[1]):
    for k,l  in range(df.shape[0]):
        if df.iloc[k,i] == 'MrkA'and df.iloc[l,j] == 'Mrk1':
            col = i
            row = k
            df1=df.iloc[row+2:,[col,col+1,col+2,col+4,col+5,col+7,col+8]]
            break

Output: cannot unpack non-iterable int object

Desired Output:

    A   B   C   E  F  H   I
4   0   0  47  12  0  0   7
5  56  56   3   1  4  8  23

How shall I proceed?

David · Accepted Answer

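The problem is that df.shape[0] and df.shape[1] are each a single integer, so range(...) over either of them yields one integer per iteration. Unpacking that one integer into two loop variables (i, j or k, l) is what raises the error.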

It should be:

for i in range(df.shape[1]):
    for j in range(df.shape[0]):

Then you can apply the desired logic to extract the rows.
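To make the failure and the fix concrete, here is a minimal sketch. The small three-row frame and the names error_message and visited are made up for illustration; they are not from the question.

```python
import pandas as pd

# Small illustrative frame (an assumption, not the question's data).
df = pd.DataFrame({"A": ["x", "MrkA", 1], "B": ["y", 2, 3]})

# range() yields one int per step, so two loop targets cannot be filled.
error_message = None
try:
    for i, j in range(df.shape[1]):
        pass
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# One loop variable per range() visits every (row, column) position.
visited = []
for col in range(df.shape[1]):
    for row in range(df.shape[0]):
        visited.append((row, col))
print(len(visited))  # 3 rows * 2 columns = 6 positions
```

If a single loop over index pairs is preferred, itertools.product(range(df.shape[0]), range(df.shape[1])) yields two-element tuples, and those can be unpacked into two variables.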

Note that it's unclear why you ignore the second row (index 1), which is also entirely numeric. If that is only a typo, you can extract all the fully numeric rows with the following and apply your logic to the result:

import numpy as np
df[df.applymap(np.isreal).all(axis=1)]
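As an alternative sketch of the same idea, the check can be written with isinstance so that it does not depend on how np.isreal treats strings, and without applymap, which newer pandas releases deprecate in favour of DataFrame.map. The helper is_number and the three-column subset of the question's data are assumptions made for brevity.

```python
import numbers

import pandas as pd

# First three columns of the question's frame, enough to show the filter.
df = pd.DataFrame({
    "A": ["sdfg", 23, "MrkA", 34, 0, 56],
    "B": ["jfgh", 23, "sdfg", "MrkB", 0, 56],
    "C": ["cvb", 7, "dsfgA", "ghks", 47, 3],
})


def is_number(value):
    """Return True for ints, floats and other numeric scalars."""
    return isinstance(value, numbers.Number)


# Boolean frame: True where the cell holds a number.
numeric_mask = df.apply(lambda column: column.map(is_number))

# Keep only the rows in which every cell is numeric.
numeric_rows = df[numeric_mask.all(axis=1)]
print(numeric_rows)
```

On this subset the filter keeps the rows with index 1, 4 and 5, which shows why row 1 has to be excluded separately if it is not wanted.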

Edit

The logic is still not clear from your specific example:

In the DataFrame you constructed there is no Mrk1, only MrkB.
Why does column D disappear from the desired output?

A hard-coded example that gives the desired output should be something similar to the following:

import pandas as pd
df = pd.DataFrame({'A':['sdfg',23,'MrkA',34,0,56],'B':['jfgh',23,'sdfg','MrkB',0,56], 'C':['cvb',7,'dsfgA','ghks',47,3],'D':['rrb',7,'gfd',3,0,7],'E':['dfg',7,'gfd',5,12,1],'F':['dfg',7,'sdfA',5,0,4],'G':['dfg',7,'sdA',5,8,9],'H':['dfg',7,'gfA',5,0,8],'I':['dfg',7,'sdfA',5,7,23]})

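# Stop one short of each edge so that r + 1 and c + 1 stay in bounds.
for r in range(df.shape[0] - 1):
    for c in range(df.shape[1] - 1):
        # MrkB must sit one row down and one column right of MrkA.
        if df.iloc[r, c] == "MrkA" and df.iloc[r + 1, c + 1] == "MrkB":
            print(df.iloc[r + 2:, :])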

This gives:

    A   B   C  D   E  F  G  H   I
4   0   0  47  0  12  0  8  0   7
5  56  56   3  7   1  4  9  8  23
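The output above still contains columns D and G. A possible way to reach the desired output is to reuse the column offsets from the df1 line in the question. This is only a sketch: it assumes the second marker is MrkB (as in the constructed DataFrame), and the names offsets and cols are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["sdfg", 23, "MrkA", 34, 0, 56],
    "B": ["jfgh", 23, "sdfg", "MrkB", 0, 56],
    "C": ["cvb", 7, "dsfgA", "ghks", 47, 3],
    "D": ["rrb", 7, "gfd", 3, 0, 7],
    "E": ["dfg", 7, "gfd", 5, 12, 1],
    "F": ["dfg", 7, "sdfA", 5, 0, 4],
    "G": ["dfg", 7, "sdA", 5, 8, 9],
    "H": ["dfg", 7, "gfA", 5, 0, 8],
    "I": ["dfg", 7, "sdfA", 5, 7, 23],
})

# Offsets relative to the marker column, copied from the question's df1 line.
offsets = [0, 1, 2, 4, 5, 7, 8]

df1 = None
for r in range(df.shape[0] - 1):
    for c in range(df.shape[1] - 1):
        if df.iloc[r, c] == "MrkA" and df.iloc[r + 1, c + 1] == "MrkB":
            # Drop offsets that would run past the last column.
            cols = [c + k for k in offsets if c + k < df.shape[1]]
            df1 = df.iloc[r + 2:, cols]
            break
    if df1 is not None:
        break  # a plain break only leaves the inner loop

print(df1)
```

The second break matters: in the question's code a single break exits only the inner loop, so the outer loop would keep scanning and could overwrite df1 on a later match.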
