combinations on pandas dataframe

Question

I have a pandas DataFrame that contains points, and I want to find all possible lines between those points. Sample of the data:

     x_val  y_val
114   1.28   1.72
90    3.47   3.71
37    0.13   1.86

I tried the following:

       H = x_s.apply(lambda x: list(combinations(x.values, 2)))

but the results are not what I expected; the x and y values get mixed:

            x_val         y_val
114  (1.28, 3.47)  (1.72, 3.71)
90   (1.28, 0.13)  (1.72, 1.86)
37   (3.47, 0.13)  (3.71, 1.86)

So the first tuple in each row mixes x values and the second mixes y values, whereas I was looking for lines that are each made up of two of the original points. I know how to reorder the result afterwards, but is there a way to get it in that form in the first place and avoid the extra reordering step?

lbd · Accepted Answer

You could go through the index:

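from itertools import combinations
import pandas as pd

# Pair up index labels, then look up each full point (x and y together)
b = {'{}_{}'.format(idx1, idx2): (df.loc[idx1].values, df.loc[idx2].values) for idx1, idx2 in combinations(df.index, 2)}

pd.DataFrame(b).transpose()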

This gives you:

                   0             1
114_90  [1.28, 1.72]  [3.47, 3.71]
114_37  [1.28, 1.72]  [0.13, 1.86]
90_37   [3.47, 3.71]  [0.13, 1.86]
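
If flat numeric columns are easier to work with than arrays stored in cells, the same idea of combining index labels can be sketched as below. This is only one possible layout, not part of the original answer: the column names x1, y1, x2, y2 and the index level names start and end are illustrative choices, and the DataFrame is rebuilt from the sample in the question.

```python
from itertools import combinations

import pandas as pd

# Sample points from the question.
df = pd.DataFrame(
    {"x_val": [1.28, 3.47, 0.13], "y_val": [1.72, 3.71, 1.86]},
    index=[114, 90, 37],
)

# Combine index labels rather than column values, so each endpoint
# keeps its own x and y together.
pairs = list(combinations(df.index, 2))

# One row per line. The column names and index level names below are
# illustrative assumptions, not something the question prescribes.
lines = pd.DataFrame(
    [(*df.loc[i], *df.loc[j]) for i, j in pairs],
    columns=["x1", "y1", "x2", "y2"],
    index=pd.MultiIndex.from_tuples(pairs, names=["start", "end"]),
)

print(lines)
```

Each row then holds both endpoints of one line, and the two-level index records which original points it connects, so no reordering is needed afterwards.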
