akaHuman

Reputation: 1352

Check if array B is a permutation of A

I tried to find a solution to this but couldn't get much out of my head.

We are given two unsorted integer arrays A and B. We have to check whether array B is a permutation of A. How can this be done? Even XORing the numbers won't work, as there are several counterexamples that have the same XOR value but are not permutations of each other.
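To make the XOR objection concrete, here is a small sketch. The arrays and the helper name `xor_all` are chosen only for illustration:

```python
from functools import reduce
from operator import xor

def xor_all(values):
    """XOR every element together; the result does not depend on order."""
    return reduce(xor, values, 0)

a = [1, 2]
b = [0, 3]
# Both arrays XOR to 3, yet b is not a rearrangement of a.
print(xor_all(a) == xor_all(b))   # True
print(sorted(a) == sorted(b))     # False
```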

A solution needs to be O(n) time and with space O(1)

Any help is welcome!! Thanks.

Upvotes: 10

Views: 9630

Answers (9)

Kunukn

Reputation: 2236

The solution needs to be O(n) time and with space O(1). This leaves out sorting and the space O(1) requirement is a hint that you probably should make a hash of the strings and compare them.

If you have access to a prime number list, use cheeken's solution.

Note: If the interviewer says you don't have access to a prime number list, then generate the prime numbers and store them. This is O(1) space because the alphabet length is a constant.

Else here's my alternative idea. I will define the Alphabet as = {a,b,c,d,e} for simplicity. The values for the letters are defined as:

a, b, c, d, e
1, 2, 4, 8, 16

Note: If the interviewer says this is not allowed, then make a lookup table for the alphabet. This takes O(1) space because the size of the alphabet is a constant.

Define a function which can find the distinct letters in a string.

// set bit value of char c in variable i and return result
distinct(char c, int i) : int

E.g. distinct('a', 0) returns 1
E.g. distinct('a', 1) returns 1
E.g. distinct('b', 1) returns 3

Thus if you iterate the string "aab" the distinct function should give 3 as the result

Define a function which can calculate the sum of the letters in a string.

// return sum of c and i
sum(char c, int i) : int

E.g. sum('a', 0) returns 1
E.g. sum('a', 1) returns 2
E.g. sum('b', 2) returns 4

Thus if you iterate the string "aab" the sum function should give 4 as the result

Define a function which can calculate the length of the letters in a string.

// return length of string s
length(string s) : int

E.g. length("aab") returns 3

Running the methods on two strings and comparing the results takes O(n) running time. Storing the hash values takes O(1) in space.

 e.g.
 distinct of "aab" => 3
 distinct of "aba" => 3
 sum of "aab" => 4
 sum of "aba" => 4
 length of "aab" => 3
 length of "aba" => 3

Since all the values are equal for both strings, they must be a permutation of each other.

EDIT: This solution is not correct with the given alphabet values, as pointed out in the comments.
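One way to see why the three values are not enough is the sketch below. It uses the letter values from this answer; the helper `fingerprint` and the two test strings are picked here for illustration and are not taken from the comments:

```python
# Letter values from the answer: a=1, b=2, c=4, d=8, e=16.
VALUE = {ch: 1 << i for i, ch in enumerate("abcde")}

def fingerprint(s):
    """Return (distinct bitmask, sum of letter values, length) for s."""
    distinct = 0
    total = 0
    for ch in s:
        distinct |= VALUE[ch]   # set the bit for this letter
        total += VALUE[ch]      # add the letter's value
    return distinct, total, len(s)

s1 = "abbbbc"   # 1 + 4*2 + 4 = 13
s2 = "aaabcc"   # 3*1 + 2 + 2*4 = 13
print(fingerprint(s1))            # (7, 13, 6)
print(fingerprint(s2))            # (7, 13, 6)
print(sorted(s1) == sorted(s2))   # False
```

Both strings produce the same triple even though neither is a rearrangement of the other.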

Upvotes: 1

Antti Huima

Reputation: 25522

The question is theoretical, but you can do it in O(n) time and O(1) space. Allocate an array of 2^32 counters and set them all to zero. This is an O(1) step because the array has constant size. Then iterate through the two arrays. For array A, increment the counters corresponding to the integers read. For array B, decrement them. If you run into a negative counter value during the iteration of array B, stop: the arrays are not permutations of each other. Otherwise, at the end (assuming A and B have the same size, a prerequisite) the counter array is all zero and the two arrays are permutations of each other.

This is an O(1) space and O(n) time solution. It is not practical, but it would easily pass as a solution to the interview question. At least it should.
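A scaled-down sketch of the counting idea follows. It assumes every value satisfies 0 <= x < value_range; the function name and the `value_range` parameter are hypothetical and stand in for the full 2^32-entry table:

```python
def is_permutation_by_counting(a, b, value_range=256):
    """Check whether b is a permutation of a with a fixed-size counter table.

    Assumes every element x satisfies 0 <= x < value_range. The table size
    depends only on that range, not on the length of the inputs.
    """
    if len(a) != len(b):
        return False
    counters = [0] * value_range
    for x in a:
        counters[x] += 1
    for x in b:
        counters[x] -= 1
        if counters[x] < 0:
            return False  # b contains x more often than a does
    return True  # equal lengths and no negative counter: all counters are zero
```

For example, `is_permutation_by_counting([1, 1, 3, 4], [4, 1, 3, 1])` returns True, while `is_permutation_by_counting([1, 2], [0, 3])` returns False.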

More obscure solutions

  • Using a nondeterministic model of computation, checking that the two arrays are not permutations of each other can be done in O(1) space and O(n) time by guessing an element that has a differing count in the two arrays, and then counting the instances of that element in both arrays.

  • In a randomized model of computation, construct a random commutative hash function and calculate the hash values for the two arrays. If the hash values differ, the arrays are not permutations of each other. Otherwise they might be. Repeat many times to bring the probability of error below the desired threshold. This is also an O(1) space, O(n) time approach, but randomized.

  • In a parallel computation model, let 'n' be the size of the input array. Allocate 'n' threads. Every thread i = 1 .. n reads the ith number from the first array; let that be x. Then the same thread counts the number of occurrences of x in the first array, and then checks for the same count in the second array. Every single thread uses O(1) space and O(n) time.

  • Interpret an integer array [a1, ..., an] as the polynomial x^a1 + x^a2 + ... + x^an, where x is a free variable, and then check numerically for the equivalence of the two polynomials obtained. Use floating point arithmetic for O(1) space and O(n) time operation. This is not an exact method because of rounding errors and because numerical checking for equivalence is probabilistic. Alternatively, interpret the polynomial over integers modulo a prime number, and perform the same probabilistic check.
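The last bullet (the polynomial taken modulo a prime) could be sketched roughly as below. The modulus 2^61 - 1, the function names and the `trials` parameter are assumptions made for this example, and the elements are assumed to be non-negative integers well below the modulus:

```python
import random

P = (1 << 61) - 1  # the Mersenne prime 2**61 - 1, used as the modulus

def poly_value(values, x):
    """Evaluate x**a1 + x**a2 + ... + x**an modulo P."""
    total = 0
    for a in values:
        total = (total + pow(x, a, P)) % P
    return total

def probably_permutation(a, b, trials=5):
    """Return False if a and b are certainly not permutations of each other,
    True if every random evaluation point gave equal values."""
    if len(a) != len(b):
        return False
    for _ in range(trials):
        x = random.randrange(2, P)  # random evaluation point
        if poly_value(a, x) != poly_value(b, x):
            return False
    return True
```

A False result is always correct. A True result can be wrong only if every random point happened to be a root of the difference of the two polynomials, which becomes less likely with each extra trial.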

Upvotes: 11

datenwolf

Reputation: 162174

You're given two constraints: O(n) time, where n is the total length of A and B, and O(1) memory.

If two series A and B are permutations of each other, then there is also a series C that results from permuting either A or B. So the problem becomes permuting both A and B into series C_A and C_B and comparing them.

One such permutation would be sorting. There are several sorting algorithms which work in place, so you can sort A and B in place. Now in a best case scenario Smooth Sort sorts with O(n) computational and O(1) memory complexity, in the worst case with O(n log n) / O(1).

The per-element comparison then takes O(n), but since in O notation O(2*n) = O(n), using Smooth Sort followed by a comparison gives you an O(n) / O(1) check of whether two series are permutations of each other. However, in the worst case it will be O(n log n) / O(1).
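A minimal sketch of the sort-then-compare approach follows. Note the substitution: it uses a plain in-place heapsort instead of Smooth Sort, because heapsort is shorter to write. It keeps the O(1) extra memory but is always O(n log n), without Smooth Sort's O(n) best case. All function names are made up for this example:

```python
def _sift_down(arr, start, end):
    """Restore the max-heap property for arr[start:end], rooted at start."""
    root = start
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        if child + 1 < end and arr[child] < arr[child + 1]:
            child += 1  # pick the larger of the two children
        if arr[root] >= arr[child]:
            return
        arr[root], arr[child] = arr[child], arr[root]
        root = child

def heapsort(arr):
    """Sort arr in place using O(1) extra space and O(n log n) time."""
    n = len(arr)
    for start in range(n // 2 - 1, -1, -1):  # build the max-heap
        _sift_down(arr, start, n)
    for end in range(n - 1, 0, -1):          # move the maximum to the end
        arr[0], arr[end] = arr[end], arr[0]
        _sift_down(arr, 0, end)

def is_permutation_by_sorting(a, b):
    """Sort both lists in place (this mutates them), then compare."""
    if len(a) != len(b):
        return False
    heapsort(a)
    heapsort(b)
    return all(x == y for x, y in zip(a, b))
```

Sorting in place reorders the caller's arrays, which may or may not be acceptable depending on the setting.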

Upvotes: 1

Sachin

Reputation: 18747

If we need not sort this in-place, then the following approach might work:

  1. Create a HashMap with the array element as key and the number of occurrences as value (to handle multiple occurrences of the same number).
  2. Traverse array A.
  3. Insert the array elements in the HashMap.
  4. Next, traverse array B.
  5. Search every element of B in the HashMap. If the corresponding value is 1, delete the entry. Else, decrement the value by 1.
  6. If we are able to process entire array B and the HashMap is empty at that time, Success. else Failure.

HashMap will use constant space and you will traverse each array only once.
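The steps above might look like this in Python, with a `dict` standing in for the HashMap; the function name is made up for this sketch. One thing to keep in mind is that the dictionary holds one entry per distinct value in A, so its size grows with the number of distinct values:

```python
def is_permutation_by_hashmap(a, b):
    """Count the elements of a, then cancel them out with the elements of b."""
    counts = {}
    for x in a:
        counts[x] = counts.get(x, 0) + 1
    for x in b:
        if x not in counts:
            return False       # x is missing from a, or already used up
        if counts[x] == 1:
            del counts[x]      # last occurrence: delete the entry
        else:
            counts[x] -= 1
    return not counts          # success only if the map is empty
```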

Not sure if this is what you are looking for. Let me know if I have missed any constraint about space/time.

Upvotes: 4

cheeken

Reputation: 34655

If we are allowed to freely access a large list of primes, you can solve this problem by leveraging properties of prime factorization.

For both arrays, calculate the product of Prime[i] for each integer i, where Prime[i] is the ith prime number. The value of the products of the arrays are equal iff they are permutations of one another.

Prime factorization helps here for two reasons.

  1. Multiplication is commutative and associative, and so the ordering of the operands to calculate the product is irrelevant. (Some alluded to the fact that if the arrays were sorted, this problem would be trivial. By multiplying, we are implicitly sorting.)
  2. Prime numbers multiply losslessly. If we are given a number and told it is the product of only prime numbers, we can calculate exactly which prime numbers were fed into it and exactly how many.

Example:

a = 1,1,3,4
b = 4,1,3,1
Product of ith primes in a = 2 * 2 * 5 * 7 = 140
Product of ith primes in b = 7 * 2 * 5 * 2 = 140

That said, we probably aren't allowed access to a list of primes, but this seems a good solution otherwise, so I thought I'd post it.
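A rough sketch of the prime-product check, assuming all elements are positive integers. Instead of a ready-made list of primes, the hypothetical helper `nth_prime` finds them by slow trial division, and Python's arbitrary-precision integers hide the fact that the product itself grows with the input:

```python
import math

def nth_prime(n):
    """Return the n-th prime by trial division, with nth_prime(1) == 2."""
    count = 0
    candidate = 1
    while count < n:
        candidate += 1
        if all(candidate % d for d in range(2, math.isqrt(candidate) + 1)):
            count += 1
    return candidate

def prime_product(values):
    """Multiply together Prime[v] for every v in values."""
    product = 1
    for v in values:
        product *= nth_prime(v)
    return product

a = [1, 1, 3, 4]
b = [4, 1, 3, 1]
print(prime_product(a))   # 140
print(prime_product(b))   # 140
```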

Upvotes: 8

uncle_xia

Reputation: 1

I just found a counterexample, so the assumption below is incorrect.


I cannot prove it, but I think this may be true.

Since all elements of the arrays are integers, suppose each array has 2 elements, and we have

a1 + a2 = s
a1 * a2 = m

b1 + b2 = s
b1 * b2 = m

then {a1, a2} == {b1, b2}

If this is true, it is also true for arrays that have n elements.

So we compare the sum and product of each array; if they are equal, one is a permutation of the other.
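For reference, one counterexample with three elements is shown below. It was picked for this sketch and is not necessarily the one the author found:

```python
from math import prod

def sum_and_product(values):
    """Return the pair (sum, product) of the given integers."""
    return sum(values), prod(values)

a = [1, 6, 6]
b = [2, 2, 9]
print(sum_and_product(a))        # (13, 36)
print(sum_and_product(b))        # (13, 36)
print(sorted(a) == sorted(b))    # False
```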

Upvotes: 0

beaker

Reputation: 16801

I apologize for posting this as an answer as it should really be a comment on antti.huima's answer, but I don't have the reputation yet to comment.

The size of the counter array seems to be O(log(n)) as it is dependent on the number of instances of a given value in the input array.

For example, let the input array A be all 1's with a length of (2^32) + 1. This will require a counter of 33 bits to encode (which, in practice, would double the size of the array, but let's stay with theory). Double the size of A (still all 1 values) and you need 34 bits for each counter, and so on.

This is a very nit-picky argument, but these interview questions tend to be very nit-picky.

Upvotes: 5

Rob Neuhaus

Reputation: 9290

I'd use a randomized algorithm that has a low chance of error.

The key is to use a universal hash function.

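def hash_array(array, hash_fn):
  # Combine the per-item hashes with XOR, so the order of the items does not matter.
  # Caveat: XOR cancels pairs of equal items, so arrays with repeated values
  # may need a different combiner, such as addition modulo a large number.
  cur = 0
  for item in array:
    cur ^= hash_fn(item)
  return cur

def are_perm(a1, a2):
  # pick_random_universal_hash_func() is assumed to be provided elsewhere.
  hash_fn = pick_random_universal_hash_func()
  return hash_array(a1, hash_fn) == hash_array(a2, hash_fn)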

If the arrays are permutations, it will always be right. If they are different, the algorithm might incorrectly say that they are the same, but it will do so with very low probability. Further, you can get an exponential decrease in the chance of error with a linear amount of work by asking many are_perm() questions on the same input. If it ever says no, then they are definitely not permutations of each other.
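A self-contained sketch of the repeated randomized check, under several assumptions: a keyed BLAKE2b digest from Python's hashlib stands in for the random universal hash function, the per-item hashes are combined by addition modulo 2^64 so that repeated values do not cancel, and the items are assumed to fit in signed 64-bit integers. The names `multiset_hash` and `are_perm_repeated` are made up for this example:

```python
import hashlib
import os

MASK = (1 << 64) - 1  # keep the running total within 64 bits

def multiset_hash(array, key):
    """Add up a keyed 64-bit hash of every item; addition is order-independent."""
    total = 0
    for item in array:
        data = item.to_bytes(8, "little", signed=True)
        digest = hashlib.blake2b(data, digest_size=8, key=key).digest()
        total = (total + int.from_bytes(digest, "little")) & MASK
    return total

def are_perm_repeated(a1, a2, trials=3):
    """Return False as soon as one trial disagrees; True if all trials agree."""
    if len(a1) != len(a2):
        return False
    for _ in range(trials):
        key = os.urandom(16)  # fresh random key for each trial
        if multiset_hash(a1, key) != multiset_hash(a2, key):
            return False
    return True
```

A False answer is definite; a True answer only means that no trial found a difference.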

Upvotes: 0

wildplasser

Reputation: 44250

You can convert one of the two arrays into an in-place hashtable. This will not be exactly O(N), but it will come close, in non-pathological cases.

Just use [number % N] as its desired index, or place it in the chain that starts there. If any element has to be replaced, it can be placed at the index where the offending element started. Rinse, wash, repeat.

UPDATE: This is a similar (N=M) hash table. It did use chaining, but it could be downgraded to open addressing.

Upvotes: 0
