How to retain the index in the output after running pandas iterrows?

Question

In the following pandas dataframe:

d1 = pd.read_csv('to_count.mcve.txt', sep='\t')
d1 = d1.set_index(['pos'], append=True)

       M1           M2       F1   F2
  pos                        
0 23   A,B,A,C,D    A,C,B    A    D
1 24   A,B,B,C,B    A,B,A    B    B
2 28   C,B,C,D,E    B,C      E    C

I used the code below to do some counting:

hapX_count = pd.DataFrame()
hapY_count = pd.DataFrame()
for index, lines in d1.iterrows():
    hap_x = lines['F1']
    hap_y = lines['F2']
    x_count = lines.apply(lambda x: x.count(hap_x)/2 if len(x) > 5 else x.count(hap_x))
    y_count = lines.apply(lambda x: x.count(hap_y)/2 if len(x) > 5 else x.count(hap_y))

    hapX_count = hapX_count.append(x_count)
    hapY_count = hapY_count.append(y_count)


print(hapX_count)

Output is:

         F1   F2   M1   M2
(0, 23)  1.0  0.0  1.0  1.0
(1, 24)  1.0  1.0  1.5  1.0
(2, 28)  1.0  0.0  0.5  0.0

How can I get the index values (pos) back as they were in the original data? I could use the index to look up the position inside each tuple, but I want to automate the process so that all index levels are retained, because my original data will have more than one index level (not just pos).

Thanks,

scomes · Accepted Answer

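You can replace the two lines above your for loop with the lines below. The slice d1.index[0:0] is empty but is still a MultiIndex with the same level names as the index of d1, so the empty DataFrames start out with that index structure instead of a default one.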

hapX_count = pd.DataFrame(index=d1.index[0:0])
hapY_count = pd.DataFrame(index=d1.index[0:0])
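
As a side note, DataFrame.append was removed in pandas 2.0, so the loop in the question will not run unchanged on recent versions. The sketch below is one possible alternative, not the method used above: it rebuilds the sample data inline (instead of reading to_count.mcve.txt), checks that the empty slice keeps the level names, and then does the counting with DataFrame.apply along axis=1, which keeps the index of d1 on its own. The helper name count_hap is made up for illustration.

```python
import pandas as pd

# Sample data from the question, built inline so the example is self-contained.
d1 = pd.DataFrame(
    {
        "pos": [23, 24, 28],
        "M1": ["A,B,A,C,D", "A,B,B,C,B", "C,B,C,D,E"],
        "M2": ["A,C,B", "A,B,A", "B,C"],
        "F1": ["A", "B", "E"],
        "F2": ["D", "B", "C"],
    }
).set_index("pos", append=True)

# The empty slice used in the answer: zero rows, but the level names survive.
empty_index = d1.index[0:0]


def count_hap(row, hap):
    """Hypothetical helper: same counting rule as the lambdas in the question."""
    return row.apply(lambda x: x.count(hap) / 2 if len(x) > 5 else x.count(hap))


# Row-wise apply returns one Series per row; pandas stacks them into a
# DataFrame that reuses d1's MultiIndex, including the 'pos' level name.
hapX_count = d1.apply(lambda row: count_hap(row, row["F1"]), axis=1)
hapY_count = d1.apply(lambda row: count_hap(row, row["F2"]), axis=1)

print(hapX_count)
```

Because the result shares its index with d1, any additional index levels set on d1 would carry over without extra bookkeeping.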
