Saturday 9 May 2015

Data Structure and Algorithm - Find the longest strictly ascending sub-sequence of array (Google Interview Question)

1. Problem Description
This is a Google interview question from careercup. Here is the original thread,

"
".

2. Examples
I am to propose a tree structure to insert all the elements of array into the tree. Only the best/longest possible solutions will remain in the tree and one of the solutions will be found when reaching the end of array. 
Let's take a look how the two examples in the thread can be worked out by hand, then we can look into the tree structure. Apologize for my handwriting as the train was tumbling when I was working on them.

Example 1: [-1, 2, 100, 100, 101, 3, 4, 5, -7]
Solution of Example 1


Example 2: [10, 8, 6, 10, 2, 10 , 6, 14, 4, 9, 5, 13, 3, 11, 7, 15, 80, 9, 10, 10, 10, 10]

Solution of Example 2 - 1
Solution of Example 2 - 2
Solution of Example 2 - 3

Let's take a look at the more difficult one - Example 2. Here is some detailed explanation why each step is taken so.
Step a: - Insert 10 into an empty node
Step b: - Insert 8 into  left of 10, because 8 < 10.
            - Remove 10 from the right branch.
              Reason: any sub-sequence involving 10 can be replaced by 8. The left branch (8)
                            is always better or at least as good/long as the right branch (10).
Step c: - Insert 6 into the left of 8, because 6 < 8
            - Remove 8 from the right branch. The same reason as Step b
Step d: - Insert 10 as the child of 6, because 10 > 6 (build up the sub-sequence)
Step e: - Insert 2 into the left branch of 6, because 2 < 6
            - Keep both branches
              Reason: the left branch (2) has the potential to overtake the right branch (6), for
                            instance 3,  4, 5 following next in the array.
Step f: - Insert 10 as child of 2. because 10  > 2
            - Ignore inserting 10 as child of 6, because 6 already has a child 10
            - Remove the right branch 6, 10
              Reason: The same as Step b. In general if two branches has the same depth and
                            each element in the left branch is =< it counterpart at the same
                            level in the right branch, then the whole right branch can be removed.
                            In this case the parent node of two branches are 2 and 6, and their
                            parent node is the root, therefore 6 and 10 can be removed.
Step g: - Insert 6 as a child of 2 and stays at the left branch
               Reason: as child of 2 because 6 > 2 and stays at the left branch because 6 < 10
            - Remove 10 as Step b
Step h: - Insert 14 as a child of 6 because of 14 > 2 and 14 > 6
Step  i: - Insert 4 as a child as 2 and stay as  left branch. Same reason as Step g
            - Keep both branches as Step e
Step  j: - Insert 9 as child of 4 and 6. And stays the left branch as child of 6
            - Remove 14 as Step b
            - Then continue to remove 6 and 9 as Step f
Step k: - Insert 5 as child of 4 and stays in the left branch
            - Remove 9 as Step b
Step  l: - Insert 13 as child of 5 as 13 is the largest value
Step m: - Insert 3 as child of 2
             - Keep both branches as Step i
Step n: - Insert 11 as child of 3 and child of 5 and stay at the left branch of 5
            - Remove 13 as Step b
Step n':- Remove 11 as child of 3
              Reason: any following number > 11 conforms the sub-sequence, the right
                            branch is preferred because it is longer than left branch. (Currently
                            the depth of left and right are 3 and 4)
Step o: - Insert 7 as child of 3 and 5. Stays at the left branch of 5
            - Remove 11 as Step b
            - Remove 7 as child of 3 as Step n'
Step p: - Insert 15 as child of 3 and child of 7
            - Remove 15 as the child 3 as Step n'
Step q: - Insert 80 as child of 3 and child of 15
            - Remove 80 as child of 3 as Steop n'
Step r: - Insert 9 as child of 3 and child of 7. Stays at the left branch of 7
            - Remove 9 as child of 3 as Step n'
Step s: - Insert 10 as child of 3 and child of 9
            - Remove 15 and 80 as Step f
            - Remove 10 as child of 3 as Step n'
Step t: - Ignore 10 as duplicate of last value
Step u: - Ignore 10 as duplicate of last value
Step v: - Ignore 10 as duplicate of last value 
End:     - Reach the end of the array
             - One of solutions 2, 4, 5, 7, 9, 10
             - Length: 6

In Step n, Step o, Step p, Step q, Step r and Step s, the insertion to the left branch can be avoided actually in the first place then no bother deleting again. It will be discussed in the next section.

3. Analysis
Now let's take a close look at the tree structure and operations based on the examples in Section 2.
Branches 
    - 2, therefore this is a binary tree
Ordered
    - Yes
    - The children are greater than parents
    - The left branches is less than the right branches
Insertion
    - Give x, Find the node, y, that is one of following cases. And insert x as a child of y
        * y is a leaf node and less than x, for instance Step d and Step h
        * y is not a leaf node but its child is greater than x, for instance Step g and Step i
    - If y is the root, then insert as its left node, for instance Step b and Step c
Deletion after insertion of x
    -  If y is not a left node and x has a sibling z (must be greater than x)
        * Delete z, if z has no children, for instance Step k
    - x has a counterpart, z, at the same level at the right branch,
        * Delete x, if x > z and z has child/children, for instance Step p and Step q
    - Compare x with its sibling z
        * Delete the right branch up to the common ancestor of x and z, if x is less
           than z and x's ancestors are less than z's ancestor at each level,
           for instance Step f, Step g and Step j 

Exemption from insertion
Just examining Step n, Step o, Step p, Step q, Step r and Step s, probably do not need to insert x into the left branch. Do not insert x, if meet all the following conditions
    - y is a left node it has a sibling z
    - y < z
    - x < y
    - x < z
    - z's child is less than x as well

Solution
Guarantee find one of the longest sub-sequences but not all the solutions. In Example 2, Step s shows that 2, 4, 5, 7, 15, 80 is also a solution.

Complexity
Assume the size of the array is N and  the size of the longest sub-sequence as M,
    - Computation complexity: worst case N*M.
        * For instance given array is sorted and ascending, then  M=N, the tree will be like
           drop line. Each note has an element and each insertion will have to  compare
           all from the top to the bottom for insertion. (This worst case can be dropped
           to a linear solution by caching the leaf node)
    - Space complexity: worst case N

4. Solution and Implementation
I will discuss them in my next two blogs.

No comments:

Post a Comment