Mr Awesome8
Mr Awesome8

Reputation: 281

Efficient way of finding sequential numbers across multiple arrays?

I'm not looking for any code or having anything being done for me. I need some help to get started in the right direction but do not know how to go about it. If someone could provide some resources on how to go about solving these problems I would very much appreciate it. I've sat with my notebook and am having trouble designing an algorithm that can do what I'm trying to do.

I can probably do:

foreach element in array1
    foreach element in array2
        check if array1[i] == array2[j]+x

I believe this would work for both forward and backward sequences, and for the multiples just check array1[i] % array2[j] == 0. I have a list which contains int arrays and am getting list[index] (for array1) and list[index+1] for array2, but this solution can get complex and lengthy fast, especially with large arrays and a large list of those arrays. Thus, I'm searching for a better solution.


I'm trying to come up with an algorithm for finding sequential numbers in different arrays.

For example:

[1, 5, 7] and [9, 2, 11] would find that 1 and 2 are sequential.

This should also work for multiple sequences in multiple arrays. So if there is a third array of [24, 3, 15], it will also include 3 in that sequence, and continue on to the next array until there isn't a number that matches the last sequential element + 1.

It also should be able to find more than one sequence between arrays.

For example:

[1, 5, 7] and [6, 3, 8] would find that 5 and 6 are sequential and also 7 and 8 are sequential.


I'm also interested in finding reverse sequences.

For example: [1, 5, 7] and [9, 4, 11]would return 5 and 4 are reverse sequential.


Example with all:

[1, 5, 8, 11] and [2, 6, 7, 10] would return 1 and 2 are sequential, 5 and 6 are sequential, 8 and 7 are reverse sequential, 11 and 10 are reverse sequential.

It can also overlap:

[1, 5, 7, 9] and [2, 6, 11, 13] would return 1 and 2 sequential, 5 and 6 sequential and also 7 and 6 reverse sequential.

I also want to expand this to check numbers with a difference of x (above examples check with a difference of 1).



In addition to all of that (although this might be a different question), I also want to check for multiples,

Example: [5, 7, 9] and [10, 27, 8] would return 5 and 10 as multiples, 9 and 27 as multiples.

and numbers with the same ones place.

Example: [3, 5, 7] and [13, 23, 25] would return 3 and 13 and 23 have the same ones digit.

Upvotes: 1

Views: 368

Answers (2)

bigballer
bigballer

Reputation: 146

Use a dictionary (set or hashmap)

dictionary1 = {}

Go through each item in the first array and add it to the dictionary. [1, 5, 7]

Now dictionary1 = {1:true, 5:true, 7:true}

dictionary2 = {}

Now go through each item in [6, 3, 8] and lookup if it's part of a sequence.

6 is part of a sequence because dictionary1[6+1] == true so dictionary2[6] = true

We get dictionary2 = {6:true, 8:true}

Now set dictionary1 = dictionary2 and dictionary2 = {}, and go to the third array.. and so on.

We only keep track of sequences. Since each lookup is O(1), and we do 2 lookups per number, (e.g. 6-1 and 6+1), the total is n*O(1) which is O(N) (N is the number of numbers across all the arrays).

Upvotes: 1

Special Sauce
Special Sauce

Reputation: 5602

The brute force approach outlined in your pseudocode will be O(c^n) (exponential), where c is the average number of elements per array and n is the number of total arrays.

If the input space is sparse (meaning there will be more missing numbers on average than presenting numbers), then one way to speed up this process is to first create a single sorted set of all the unique numbers from all your different arrays. This "master" set will then allow you to early exit (i.e. break statements in your loops) on any sequences which are not viable.

For example, if we have input arrays [1, 5, 7] and [6, 3, 8] and [9, 11, 2], the master ordered set would be {1, 2, 3, 5, 6, 7, 8, 9, 11}. If we are looking for n+1 type sequences, we could skip ever continuing checking any sequence that contains a 3 or 9 or 11 (because the n+1 value in not present at the next index in the sorted set. While the speedups are not drastic in this particular example, if you have hundreds of input arrays and very large range of values for n (sparsity), then the speedups should be exponential because you will be able to early exit on many permutations. If the input space is not sparse (such as in this example where we didn't have many holes), the speedups will be less than exponential.

A further improvement would be to store a "master" set of key-value pairs, where the key is the n value as shown in the example above, and the value portion of the pair is a list of the indices of any arrays that contain that value. The master set of the previous example would then be: {[1, 0], [2, 2], [3, 1], [5, 0], [6, 1], [7, 0], [8, 1], [9, 2], [11, 2]}. With this architecture, scan time could potentially be as low as O(c*n), because you could just traverse this single sorted master set looking for valid sequences instead of looping over all the sub-arrays. By also requiring the array indexes to increment, you can clearly see that the 1->2 sequence can be skipped because the arrays are not in the correct order, and the same with the 2->3 sequence, etc. Note this toy example is somewhat oversimplified because in practice you would need a list of indices for the value portions of the key-value pairs. This would be necessary if the same value of n ever appeared in multiple arrays (duplicate values).

Upvotes: 1

Related Questions