Thursday 23 April 2015

Data Structure and Algorithm: Minimal Swaps from Array A to B (Google Interview Question)

1. Problem Description
This is a Google interview question for software engineer from careercup. The following is the original thread,
"
You are given two arrays - A & B. The length of the arrays is the same - N. The values in the arrays are integers from 0 to N - 1 in random order and no number is repeated. You need to list a sequence of two-element swaps required to reorder the array A into the order of the array B. Additionally, at each step of the sequence you are allowed to swap two elements only if one of them is 0.
"

2. Analysis
Given the size of array as N and all unique elements (according to the problem description), all the total different combination of arrays are N! (factorial of N). A and B are just two combinations of N! Based on the fact all the combinations can be reached from a initial state (let's say in a ascending order) via swapping there must exist a minimal swaps that can make A and B the exactly same.

As we are talking about the minimal swaps of A to reach B, all the elements in A that are already in the correct locations should stay untouched and those are not should be swapped to reach B. Here comes the problem of how to reach B via minimal swaps.

The current solution proposed by some contributors
The solution proposed by some contributes are not the solution to reach B with the minimal swaps. Here is how it is done. Go through form the beginning of A and B and comparing two elements at the same location. If they are the same then carry on to the next location. If not the same, do the following, (suppose at this location x in A and y in B and x != y)
    - Swap x with 0 in A
    - Swap 0 with y in A
Now at this location A and B have the same value y. And in this way one difference causes two swaps. In general given M difference between A and B, the minimal swaps will be 2M.

The minimal swaps solution
Let's take another look at the example appearing in the thread.
int a[] = {1, 3, 2, 4, 0};
int b[] = { 4, 0, 3, 2, 1};
The solution proposed by Diapankar and a few others needs 10 swaps to reach B. According to the above analysis, 10 is exactly twice of the number of different element between A and B, 5.

Is this really need 10 swaps. Here is the solution that it should look like,
    - 1st swap:   1 and 0 => 0 3 2 4 1
    - 2nd swap: 4 and 0 => 4 3 2 0 1
    - 3rd swap:  2 and 0 => 4 3 0 2 1
    - 4th swap:  3 and 0 => 4 0 3 2 1
This takes exactly 4 swaps. If recall my other few blog entries, the difference between A and B can be called "edit distance". As all elements are unique and A and B are composed of the same elements. The actual question can be generalized as how many swaps to reduce the "edit distance" between A and B to 0. And what is the minimal swaps?

My solution as illustrated in the above example follows,
    - Find out the element 0 in A and its location i
    - Find out the element x in B at the same location i
    - Swap x with 0 in A
    - Repeat last 3 steps until there is no difference
The number of swaps of this solution is M - 1, where M is the "edit difference".

Here is a special case, When element 0 appears in the same location in A and B at the beginning or during the swap, two extra swap is need. In this case the minimal swap will be M+1.
int a[] = {1, 0, 2, 4, 3};
int b[] = { 4, 0, 3, 2, 1};
In the above case the edit distance M is equal to 4. But as the element 0 is not appearing in the editing difference list, therefore have to make it in the list. The solution is simple swap it out. For instance swap 0 with 3 (then M' = 5), Then we need M'-1 = 4 swaps to reach B. Therefore in total M'-1+1 = 5 swaps to from A to B. In this special case the minimal swaps is M'=M+1.

3. Solution
Here is the pseudo-code,
    - 1. Extract out the different elements into the edit distance list
    - 2. Check if 0 is appearing in edit distance list
        - 2.1 If not, swap 0 with one of element in the edit distance and add it into this list
    - 3. From the list look at the location of element "0".
    - 4. Find out the value x in B at the same location
    - 5. Swap 0 with x in the edit distance list (in A)
    - 6. Remove x from the edit distance list
    - 7. Repeat last 5 steps until exhaust the edit distance list
Bear in mind that Step 2.1 can only happen once. Either it happens at the very beginning when exacting the edit distance list, or it could happen after Step 5.
    -
Quite a few contributes pointed out how to do Step 3, 4 and 5 via using a N size array to save the index of each element in A, because the range of elements is from 0 to N-1 and without duplicate. In this special case it works perfectly fine. But in a more general case that keeps the uniqueness of elements but does not have the range constraint, a hash map should be used as the data structure with the element as the key and its index in A as the value.

In all here this solution has O(N) computation complexity and O(N) space complexity and guarantees minimal swaps as M-1/M+1.

No comments:

Post a Comment