user3222184
user3222184

Reputation: 1111

Compress long vector consists of 0 and 1

Say I got a vector of size [1 x 300], in which each element consists of 0 or 1. I might need to store a bunch of these iteratively during the run time. How do I effectively represent it so that I can effeciently store them (python)?


I guess there are two ways to do it. The first method is to do something like a bitmap (do they even have this in python)?

The second approach I was thinking maybe is to store the 1's position.

eg. [0, 1, 1, 1]. I will store them as [1,2,3].

Any ideas?

Upvotes: 3

Views: 277

Answers (1)

Matteo Italia
Matteo Italia

Reputation: 126777

An alternative often used in raster filled shapes processing (where you typically have large uniform areas) is to store your data as spans, i.e. store just the length of each run of 0s or 1s (essentially, it's RLE with the item of each run implicit in the position). You can choose arbitrarily that the first value (and so, all even values) represents a run of 0s, while the second (and so, all odd values) a run of 1s. So, something like

0 0 0 0 0 1 1 0 0 0 1 1 1 1

becomes

5 2 3 4

Appending to such a structure is trivial:

def append(l, value):
    cur = (len(l) + 1) % 2
    if value == cur:
        l[-1] += 1
    else:
        l.append(1)

Upvotes: 3

Related Questions