madtowneast

Reputation: 2390

Python, NumPy - Trying to split an array according to a condition

I am trying to find clusters (i.e. groups within an array where the difference between [n+1] and [n] is less than a certain value) inside an array. I have a numpy array that is a sequence of time stamps. I can find the difference between time stamps using numpy.diff(), but I have a hard time trying to determine clusters without looping through the array. To exemplify this:

t = np.array([ 147, 5729, 5794, 5806, 6798, 8756, 8772, 8776, 9976])
dt = np.diff(t)
# dt is array([5582,   65,   12,  992, 1958,   16,    4, 1200])

If my cluster condition is dt < 100, then t[1], t[2], and t[3] would be one cluster and t[5], t[6], and t[7] would be another. I have tried playing around with numpy.where(), but I have had no success tuning the conditions to separate out the clusters, i.e.

cluster1 = np.array([5729, 5794, 5806])
cluster2 = np.array([8756, 8772, 8776])

or something along those lines.

Any help is appreciated.

Upvotes: 3

Views: 2887

Answers (1)

HYRY

Reputation: 97261

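import numpy as np

t = np.array([ 147, 5729, 5794, 5806, 6798, 8756, 8772, 8776, 9976])
dt = np.diff(t)
# A gap of 100 or more ends a cluster; +1 turns each gap index
# into the index of the first element of the next group.
pos = np.where(dt >= 100)[0] + 1
# np.split cuts t at those indices and returns a list of sub-arrays.
print(np.split(t, pos))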

The output is:

[array([147]), 
array([5729, 5794, 5806]), 
array([6798]), 
array([8756, 8772, 8776]), 
array([9976])]
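
The split also returns the isolated time stamps as one-element arrays, while the question asks only for the actual clusters. A minimal sketch of one way to drop the singletons follows. The helper name find_clusters and its parameters max_gap and min_size are hypothetical names chosen for illustration, not part of NumPy. It treats a gap of exactly max_gap as a break, to match the dt < 100 condition in the question, and it assumes the time stamps are already sorted.

```python
import numpy as np

def find_clusters(t, max_gap=100, min_size=2):
    """Group sorted time stamps whose consecutive gaps are below max_gap.

    Hypothetical helper: the name and the parameters are illustrative only.
    """
    t = np.asarray(t)
    # Index of the first element of each new group: the gap to the
    # previous element is max_gap or larger.
    breaks = np.where(np.diff(t) >= max_gap)[0] + 1
    groups = np.split(t, breaks)
    # Keep only groups with at least min_size elements (drops singletons).
    return [g for g in groups if len(g) >= min_size]

t = np.array([147, 5729, 5794, 5806, 6798, 8756, 8772, 8776, 9976])
clusters = find_clusters(t)
print(clusters)
# [array([5729, 5794, 5806]), array([8756, 8772, 8776])]
```

Filtering by length after the split keeps the vectorized np.diff / np.where / np.split steps unchanged; the list comprehension only runs over the groups, not over the individual time stamps.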

Upvotes: 7
