Reputation: 619
E.g., if I've got
MAX_ALLOWED_DIFF = 3
nums=[1, 2, 4, 10, 13, 2, 5, 5, 5]
the output should be
groups = [[1, 2, 4], [10, 13], [2, 5, 5, 5]]
The context: I had a pandas.Series object nums
and I used
groups = nums.groupby(nums.diff().gt(DETECTION_MAX_DIFF_NS).cumsum()).apply(list).tolist()
to subsample in the same fashion, but I noticed that there are a lot of duplicates in my Series nums, and after I use the .unique() method I don't have a Series object anymore; I've got a 1D numpy.ndarray instead.
I believe I could use something like pandas.Series(nums.unique()), but I don't like this hack.
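For reference, here is a minimal runnable sketch of the setup above, assuming pandas is available and using the example values from the question; the .abs() call is an addition (it also appears in the answers below) so that a large drop, not just a large rise, starts a new group:
import pandas as pd

MAX_ALLOWED_DIFF = 3
nums = pd.Series([1, 2, 4, 10, 13, 2, 5, 5, 5])

# start a new group wherever the absolute jump from the previous element exceeds the threshold
groups = nums.groupby(nums.diff().abs().gt(MAX_ALLOWED_DIFF).cumsum()).apply(list).tolist()
print(groups)               # [[1, 2, 4], [10, 13], [2, 5, 5, 5]]

# .unique() returns a plain 1D numpy.ndarray, not a Series
print(type(nums.unique()))  # <class 'numpy.ndarray'>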
Upvotes: 2
Views: 102
Reputation: 323306
So we use drop_duplicates, which keeps nums as a pd.Series:
nums = nums.drop_duplicates()
nums.groupby(nums.diff().abs().gt(MAX_ALLOWED_DIFF).cumsum()).apply(list).tolist()
Out[447]: [[1, 2, 4], [10, 13], [5]]
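A self-contained version of the above, for reference, assuming the example input from the question. Note that drop_duplicates() removes the later 2 and the repeated 5s, which is why the last group comes out as [5] rather than [2, 5, 5, 5]:
import pandas as pd

MAX_ALLOWED_DIFF = 3
nums = pd.Series([1, 2, 4, 10, 13, 2, 5, 5, 5])

# drop_duplicates() returns a Series, unlike .unique(), which returns an ndarray
nums = nums.drop_duplicates()   # 1, 2, 4, 10, 13, 5

groups = nums.groupby(nums.diff().abs().gt(MAX_ALLOWED_DIFF).cumsum()).apply(list).tolist()
print(groups)                   # [[1, 2, 4], [10, 13], [5]]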
Upvotes: 3
Reputation: 88246
Given that you've tagged the question with numpy too, here's one way to do it:
import numpy as np

thr = 3
# indices where the absolute jump from the previous element exceeds the threshold
ix = np.flatnonzero(np.concatenate([[False], np.abs(np.diff(nums)) > thr]))
np.split(nums, ix)
Output
[array([1, 2, 4]), array([10, 13]), array([2, 5, 5, 5])]
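If plain lists are preferred over arrays, to match the output format in the question, the pieces from np.split can be converted, for example:
groups = [chunk.tolist() for chunk in np.split(np.asarray(nums), ix)]
# [[1, 2, 4], [10, 13], [2, 5, 5, 5]]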
Upvotes: 2
Reputation: 221594
Here's one approach -
>>> import numpy as np
>>> idx = np.r_[0, np.flatnonzero(np.abs(np.diff(nums)) > MAX_ALLOWED_DIFF) + 1, len(nums)]
>>> [nums[i:j] for i, j in zip(idx[:-1], idx[1:])]
[[1, 2, 4], [10, 13], [2, 5, 5, 5]]
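One way to package this approach as a small reusable helper, for illustration (the name split_by_diff is not from the original answer):
import numpy as np

def split_by_diff(seq, max_diff):
    # boundaries: position 0, every index right after a too-large absolute jump, and the end
    arr = np.asarray(seq)
    idx = np.r_[0, np.flatnonzero(np.abs(np.diff(arr)) > max_diff) + 1, len(arr)]
    return [list(seq[i:j]) for i, j in zip(idx[:-1], idx[1:])]

print(split_by_diff([1, 2, 4, 10, 13, 2, 5, 5, 5], 3))
# [[1, 2, 4], [10, 13], [2, 5, 5, 5]]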
Upvotes: 3