Istrel

Reputation: 2588

Generate 1d numpy with chunks of random length

I need to generate a 1D array where repeated sequences of integers are separated by a random number of zeros.

So far I am using the following code for this:

import numpy as np
from random import normalvariate

regular_sequence = np.array([1, 2, 3, 4, 5], dtype=int)
n_iter = 10
lag_mean = 10  # mean length of zeros sequence
lag_sd = 1  # standard deviation of zeros sequence length

# Sequence of lag lengths
lag_seq = [int(round(normalvariate(lag_mean, lag_sd))) for x in range(n_iter)]

# Concatenate a zeros block and the regular sequence for each lag
seq = [np.concatenate((np.zeros(x, dtype=int), regular_sequence)) for x in lag_seq]
seq = np.concatenate(seq)

It works, but it is very slow when I need many long sequences. So, how can I optimize it?

Upvotes: 6

Views: 683

Answers (4)

B. M.

Reputation: 18638

This is not a trivial problem, because the data is misaligned. Performance also depends on what counts as a long sequence. Take the example of a "square" problem: many long regular sequences and zero runs (n_iter == n_reg == lag_mean):

import numpy as np
n_iter = 1000
n_reg = 1000
regular_sequence = np.arange(n_reg, dtype=int)
lag_mean = n_reg # mean length of zeros sequence
lag_sd = lag_mean/10 # standard deviation of zeros sequence length
lag_seq = np.int64(np.random.normal(lag_mean, lag_sd, n_iter)) # Sequence of lag lengths

First, your solution:

def seq_hybrid():
    seqs = [np.concatenate((np.zeros(x, dtype=int), regular_sequence)) for x in lag_seq]
    seq = np.concatenate(seqs)
    return seq

Then a pure numpy one:

def seq_numpy():
    seq = np.zeros(lag_seq.sum() + n_iter*n_reg, dtype=int)
    cs = np.cumsum(lag_seq + n_reg) - n_reg
    indexes = np.add.outer(cs, np.arange(n_reg))
    seq[indexes] = regular_sequence
    return seq

A for-loop solution:

def seq_python():
    seq = np.empty(lag_seq.sum() + n_iter*n_reg, dtype=int)
    i = 0
    for lag in lag_seq:
        for k in range(lag):
            seq[i] = 0
            i += 1
        for k in range(n_reg):
            seq[i] = regular_sequence[k]
            i += 1
    return seq

And a just-in-time compiled version with numba:

from numba import jit
seq_numba=jit(seq_python)

Now the timings:

In [96]: %timeit seq_hybrid()
10 loops, best of 3: 38.5 ms per loop

In [97]: %timeit seq_numpy()
10 loops, best of 3: 34.4 ms per loop

In [98]: %timeit seq_python()
1 loops, best of 3: 1.56 s per loop

In [99]: %timeit seq_numba()
100 loops, best of 3: 12.9 ms per loop

Your hybrid solution is about as fast as the pure numpy one in this case, because the performance depends essentially on the inner loop, and yours (zeros plus concatenate) is already a numpy loop. Predictably, the python solution is slower, by the traditional factor of about 40x. But numpy is not optimal here, because it must use fancy indexing, which the misaligned data makes necessary. In this case numba can help: the elementary operations are done at C level, for a 120x gain over the python solution this time.

For other values of n_iter and n_reg, the speedup factors compared to the python solution are:

n_iter= 1000, n_reg= 1000 : seq_numba 124, seq_hybrid 49, seq_numpy 44. 
n_iter= 10, n_reg= 100000 : seq_numba 123, seq_hybrid 104, seq_numpy 49. 
n_iter= 100000, n_reg= 10 : seq_numba 127, seq_hybrid 1, seq_numpy 42. 
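As a quick sanity check (my own addition, not part of the original timings), the fancy-indexing construction can be verified against the hybrid one on a small random input:

```python
import numpy as np

rng = np.random.default_rng(0)
n_iter, n_reg = 50, 7
regular_sequence = np.arange(n_reg)
lag_seq = rng.integers(1, 2 * n_reg, size=n_iter)  # small random zero-run lengths

def seq_hybrid():
    # Build each zeros block + sequence pair, then concatenate them all
    return np.concatenate(
        [np.concatenate((np.zeros(x, dtype=int), regular_sequence))
         for x in lag_seq])

def seq_numpy():
    # Allocate the full output, then scatter the sequence via fancy indexing
    seq = np.zeros(lag_seq.sum() + n_iter * n_reg, dtype=int)
    cs = np.cumsum(lag_seq + n_reg) - n_reg          # start index of each copy
    seq[np.add.outer(cs, np.arange(n_reg))] = regular_sequence
    return seq

assert np.array_equal(seq_hybrid(), seq_numpy())
```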

Upvotes: 1

TheBlackCat

Reputation: 10308

The best approach, I think, would be to use convolution. You can figure out the lag lengths, combine them with the length of the sequence, and use that to compute the starting point of each regular sequence. Set those starting points to one, then convolve with your regular sequence to fill in the values.

import numpy as np

regular_sequence = np.array([1,2,3,4,5], dtype=int)
n_iter = 10000000
lag_mean = 10 # mean length of zeros sequence
lag_sd = 1 # standard deviation of zeros sequence length

# Sequence of lag lengths; entries after the first also include
# the length of the preceding regular sequence
lag_lens = np.round(np.random.normal(lag_mean, lag_sd, n_iter)).astype(int)
lag_lens[1:] += len(regular_sequence)
starts_inds = lag_lens.cumsum()

# Put a 1 at each start position, then convolve with the regular sequence
seq = np.zeros(lag_lens.sum() + 1, dtype=int)
seq[starts_inds] = 1
seq = np.convolve(seq, regular_sequence)

This approach takes something like 1/20th the time on large sequences, even after changing your version to use the numpy random number generator.
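For completeness, here is a small self-contained check (my own sketch, not from the answer) that the convolution construction reproduces the naive concatenation exactly when the ones are placed at the cumulative lag lengths:

```python
import numpy as np

rng = np.random.default_rng(1)
regular_sequence = np.array([1, 2, 3, 4, 5])
lag_seq = rng.integers(8, 13, size=20)  # stand-in for the rounded normal draws

# Naive construction: a zeros block, then the sequence, repeated
expected = np.concatenate(
    [np.concatenate((np.zeros(x, dtype=int), regular_sequence))
     for x in lag_seq])

# Convolution construction: 1 at each start position, convolved with the sequence
lag_lens = lag_seq.copy()
lag_lens[1:] += len(regular_sequence)        # account for the preceding copy
seq = np.zeros(lag_lens.sum() + 1, dtype=int)
seq[lag_lens.cumsum()] = 1
seq = np.convolve(seq, regular_sequence)

assert np.array_equal(seq, expected)
```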

Upvotes: 2

Divakar

Reputation: 221574

You can pre-compute the indices where the repeated regular_sequence elements are to be placed, and then set those positions from regular_sequence in a vectorized manner. To pre-compute those indices, use np.cumsum to get the start of each chunk of regular_sequence, then add a contiguous range of integers extending to the size of regular_sequence to get all the indices that are to be updated. Thus, the implementation would look something like this -

# Size of regular_sequence
N = regular_sequence.size

# Use cumsum to pre-compute the start of every occurrence of regular_sequence
offset_arr = np.cumsum(lag_seq)
idx = np.arange(offset_arr.size)*N + offset_arr

# Setup output array
out = np.zeros(idx.max() + N, dtype=regular_sequence.dtype)

# Broadcast the start indices to include the entire length of regular_sequence
# to get all positions where regular_sequence elements are to be set
np.put(out, idx[:,None] + np.arange(N), regular_sequence)
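The broadcasting step `idx[:,None] + np.arange(N)` is what expands each start index into a full row of target positions; a tiny illustrative example (values of my own choosing):

```python
import numpy as np

idx = np.array([2, 9])   # hypothetical start index of each copy
N = 3                    # hypothetical length of regular_sequence

# Column vector + row vector broadcasts to one row of positions per copy
positions = idx[:, None] + np.arange(N)

print(positions)  # [[ 2  3  4]
                  #  [ 9 10 11]]
```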

Runtime tests -

def original_app(lag_seq, regular_sequence):
    seq = [np.concatenate((np.zeros(x, dtype=int), regular_sequence)) for x in lag_seq]
    return np.concatenate(seq)

def vectorized_app(lag_seq, regular_sequence):
    N = regular_sequence.size
    offset_arr = np.cumsum(lag_seq)
    idx = np.arange(offset_arr.size)*N + offset_arr
    out = np.zeros(idx.max() + N, dtype=regular_sequence.dtype)
    np.put(out, idx[:,None] + np.arange(N), regular_sequence)
    return out

In [64]: # Setup inputs
    ...: regular_sequence = np.array([1,2,3,4,5], dtype=int)
    ...: n_iter = 1000
    ...: lag_mean = 10 # mean length of zeros sequence
    ...: lag_sd = 1 # standard deviation of zeros sequence length
    ...: 
    ...: # Sequence of lag lengths
    ...: lag_seq = [int(round(normalvariate(lag_mean, lag_sd))) for x in range(n_iter)]
    ...: 

In [65]: out1 = original_app(lag_seq, regular_sequence)

In [66]: out2 = vectorized_app(lag_seq, regular_sequence)

In [67]: %timeit original_app(lag_seq, regular_sequence)
100 loops, best of 3: 4.28 ms per loop

In [68]: %timeit vectorized_app(lag_seq, regular_sequence)
1000 loops, best of 3: 294 µs per loop

Upvotes: 4

Back2Basics

Reputation: 7806

I thought an answer posted on this question had a good approach using a binary mask and np.convolve, but that answer got deleted and I don't know why. Here it is, with 2 concerns addressed.

def insert_sequence(lag_seq, regular_sequence):
    offsets = np.cumsum(lag_seq)
    start_locs = np.zeros(offsets[-1] + 1, dtype=regular_sequence.dtype)
    start_locs[offsets] = 1
    return np.convolve(start_locs, regular_sequence)

lag_seq = np.random.normal(15, 1, 10).astype(int)
regular_sequence = np.arange(1, 6)
seq = insert_sequence(lag_seq, regular_sequence)

print(repr(seq))
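One caveat worth noting (my observation, not from the answer): because the offsets here are the raw cumulative lags, two copies of regular_sequence overlap and their values add up whenever a lag is shorter than the sequence, so this version relies on the lags staying comfortably above len(regular_sequence):

```python
import numpy as np

regular_sequence = np.arange(1, 6)   # [1, 2, 3, 4, 5]
lag_seq = np.array([3, 3])           # lags shorter than the sequence length

offsets = np.cumsum(lag_seq)         # starts at 3 and 6 -- only 3 apart
start_locs = np.zeros(offsets[-1] + 1, dtype=int)
start_locs[offsets] = 1
seq = np.convolve(start_locs, regular_sequence)

print(seq)  # [0 0 0 1 2 3 5 7 3 4 5] -- the 5 and 7 come from overlapping copies
```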

Upvotes: 0
