Reputation: 3771
I'm trying to find the convex hull of a series of points based on two columns of a pandas dataframe.
My current code is:
# Create column of point co-ordinates
df['xy'] = df.apply(lambda x: [x['col_1'], x['col_2']], axis=1)
# Return a numpy array of the point coordinates
point_list = df.xy.values
# pass the list to ConvexHull (imported using: from scipy.spatial import ConvexHull)
hull = ConvexHull(point_list)
I get this error when I run it:
Traceback (most recent call last):
  File "<ipython-input-41-517201a29182>", line 1, in <module>
    hull = ConvexHull(point_list)
  File "qhull.pyx", line 2220, in scipy.spatial.qhull.ConvexHull.__init__ (scipy\spatial\qhull.c:19058)
  File "C:\Users\****\AppData\Local\Continuum\Anaconda\lib\site-packages\numpy\core\numeric.py", line 550, in ascontiguousarray
    return array(a, dtype, copy=False, order='C', ndmin=1)
ValueError: setting an array element with a sequence.
Any thoughts on this?
Best Regards,
Upvotes: 4
Views: 1635
Reputation: 394269
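What you're doing is more complicated than it needs to be. df.xy.values is a 1-D object array whose elements are Python lists, so NumPy cannot convert it to the 2-D float array that ConvexHull expects, hence the ValueError. You can pass the DataFrame columns directly to ConvexHull: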
In [311]:
import numpy as np
import pandas as pd
from scipy.spatial import ConvexHull
df = pd.DataFrame({'col_1':np.random.randn(30), 'col_2':np.random.randn(30), 'col3':0})
df
Out[311]:
col3 col_1 col_2
0 0 0.837349 1.526832
1 0 -0.282778 -0.150751
2 0 -0.331192 -0.382630
3 0 -0.933054 -0.234423
4 0 1.074336 -1.180293
5 0 0.296417 0.626924
6 0 0.806266 -0.501335
7 0 -1.192482 -1.793160
8 0 0.920646 1.377393
9 0 -1.255671 0.428256
10 0 -1.518031 0.888582
11 0 1.231974 0.566314
12 0 -0.717847 -0.236354
13 0 0.758947 -0.286670
14 0 -1.546001 1.774912
15 0 -0.707825 -0.529058
16 0 0.446111 0.406430
17 0 0.711017 0.774281
18 0 -2.616337 0.293725
19 0 -0.370344 -0.471336
20 0 -0.281950 -0.243941
21 0 -1.088772 -1.471154
22 0 -0.422274 -0.266592
23 0 0.423735 -0.341429
24 0 1.166969 -0.329791
25 0 0.689842 1.143460
26 0 0.462430 -0.843409
27 0 3.071030 1.615058
28 0 -0.812258 0.272436
29 0 0.707237 -1.717054
Then I can pass the columns directly:
hull = ConvexHull(df[['col_1','col_2']])
import matplotlib.pyplot as plt
# Plot the points, then draw each hull edge (simplex) as a black line
plt.plot(df['col_1'], df['col_2'], 'o')
for simplex in hull.simplices:
    plt.plot(df['col_1'].iloc[simplex], df['col_2'].iloc[simplex], 'k-')
Which produces this plot:
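As a supplement to the session above, here is a minimal sketch of why the list-per-row approach fails and two ways around it. The sample data, the seed, and the variable names (xy, point_list, points, from_lists, hull_points) are assumptions made for illustration; only the column names col_1 and col_2 come from the question. The failing conversion is reproduced with np.ascontiguousarray, the call that appears in the traceback.

```python
import numpy as np
import pandas as pd
from scipy.spatial import ConvexHull

# Hypothetical sample data using the question's column names.
rng = np.random.default_rng(0)
df = pd.DataFrame({"col_1": rng.standard_normal(30),
                   "col_2": rng.standard_normal(30)})

# A column of Python lists, similar in spirit to the question's 'xy' column.
xy = pd.Series([[x, y] for x, y in zip(df["col_1"], df["col_2"])])
point_list = xy.values          # 1-D array of dtype object, shape (30,)

# The conversion from the traceback cannot turn list elements into floats.
try:
    np.ascontiguousarray(point_list, dtype=np.double)
    failed = False
except ValueError:
    failed = True

# Option 1: take the two columns as a 2-D float array directly.
points = df[["col_1", "col_2"]].to_numpy()

# Option 2: if the list column already exists, rebuild a 2-D array from it.
from_lists = np.array(xy.tolist())

hull = ConvexHull(points)

# hull.vertices holds the row indices of the points on the hull boundary.
hull_points = points[hull.vertices]
```

Either option gives ConvexHull an array of shape (n_points, 2). Indexing df.iloc[hull.vertices] instead of points would return the corresponding DataFrame rows, which may be handier if other columns are needed.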
Upvotes: 4