Sunday, 17 August 2014

Data Structure and Algorithm - Linked List: Circle Detection

This is one of interviewing questions that my friends and I all have encountered and used in quite a few occasions. And I believe it is worth of one blog.

1. Problem
Given the fact that in my last two blog entries, Data Structure and Algorithm - Linked List: Common API and Data Structure and Algorithm - Linked List: Merge Sort, these algorithms and implementation are only operational on Linked List without circle linking. Otherwise some of them will be lost in loop. For instance, LL_PushBack(), this operation will be trapped in the loop to find the last node, if a circle linking exists in the Linked List.

Hence it will be circular to guarantee that there does not exist circle linking in the Linked List when using these functions. Hence circle detection algorithm is useful for this occasion.The goal is to design/implement an algorithm to remove the circle linking if it does exist.

2. Solution 1 - Hash Table
This is the first intuitive solution popping into my head. It is similar to use a linear method to find the very first repeating number given an array. Use hash table to contain the hash value of the pointer in order to speed up the searching. It guarantees that traversing the Linked List once will find the solution. But one drawback of this solution is that its correctness depends on the quality of the hash function. Given a perfect hash function to guarantee different input producing different output, then this solution will work as expected.

// Pseudo-code
//********************************************************************************
curHead <- Current head of this Liked List
ptrHashTable <- Initialize this hash table
lastNodeOfCircle <- nullptr
while (curHead is not nullptr)
    hashVal <- Compute the hash value of curHead
    if (hashVal does not exist in ptrHashTable)
        ptrHashTable <- Insert hashVal
        lastNodeOfCircle <- curHead
        curHead <- (curHead->next)
    else
        lastNodeOfCircle <- nullptr
        Found and return
Not found and return
//********************************************************************************

3. Solution 2 - Two Speeds of Traversing
The basic idea behind of this solution is to use two pointers traversing the Linked List with two different speed. The slower pointer traverses one node by another and the faster pointer traverses twice faster than the slower. Think about the clock. Obviously three hands (second, minute and hour) go around and around the clock. Assuming at the moment of midnight three hands meet at the 12, (actually no matter any position these three hands are currently now), the second hand can always catch the minute and hour hands and the minute hand can always catch the hour hand. Because the second hand moves faster than other two and the minute hand moves faster than the hour hand. This algorithm uses the exactly the same idea - the faster can always catch the slower in a circle arena.

If the circle linking does exists, traverse the Linked List and the faster pointer will eventually catch the slower one. If does not exist, the faster pointer will reach the end of the Linked List first. In order to remove the circle linking, the key is to find the last node in the Linked List. In the clock example, if the head is set as "12", then the last node of the circle is "11". Removing the circle is simple - break the link and set the "next" of the last node in the circle as nullptr.

// C implementation
//********************************************************************************
// LinkedListCircleDetect.h
#ifndef LINKED_LIST_CIRCLE_DETECT_H
#define LINKED_LIST_CIRCLE_DETECT_H

#pragma once

#include "LinkedListElement.h"

template<typename T>
LinkedListElement<T>* LL_GetOneOfNodesInCircleIfExists(LinkedListElement<T>* head)
{
    if (head == nullptr) {
        return nullptr;
    }

    LinkedListElement<T>* slow = head;
    LinkedListElement<T>* fast = head;

    while (1) {
        for (int speed = 0; speed < 2; ++speed) {
            fast = fast->next;
            if (fast == nullptr) {
                return nullptr;
            }
            if (slow == fast) {
                return slow;
            }
        }

        slow = slow->next;
    }

    return nullptr;
}

template<typename T>
bool LL_CheckIfCircleExists(LinkedListElement<T>* head)
{
    return LL_GetOneOfNodesInCircleIfExists(head) != nullptr;
}

template<typename T>
size_t LL_GetCircleSizeIfExists(LinkedListElement<T>* head)
{
    LinkedListElement<T>* oneNodeInCircle = LL_GetOneOfNodesInCircleIfExists(head);

    if (oneNodeInCircle == nullptr) {
        return 0;
    }

    size_t count = 1;
    LinkedListElement<T>* cur = oneNodeInCircle->next;
    while (cur != oneNodeInCircle) {
        cur = cur->next;
        ++count;
    }

    return count;
}

template<typename T>
LinkedListElement<T>* LL_DetectTheLastNodeOfCircleIfExists(LinkedListElement<T>* head)
{
    size_t circleSize = LL_GetCircleSizeIfExists(head);

    if (circleSize == 0) {
        return nullptr;
    }

    LinkedListElement<T>* cur = head;
    LinkedListElement<T>* tempOffsetByCircleSize = head;

    while (--circleSize > 0) {
        tempOffsetByCircleSize = tempOffsetByCircleSize->next;
    }

    while (cur != tempOffsetByCircleSize->next) {
        cur = cur->next;
        tempOffsetByCircleSize = tempOffsetByCircleSize->next;
    }

    return tempOffsetByCircleSize;
}

/*
 * If circle does not exsit, return false
 * If circle exists, then remove the circle and return true
 */
template<typename T>
bool LL_RemoveCircleIfExists(LinkedListElement<T>** head)
{
    if (head == nullptr) {
        return false;
    }
    LinkedListElement<T>* lastNodeOfCircle = LL_DetectTheLastNodeOfCircleIfExists(*head);
    if (lastNodeOfCircle == nullptr) {
        return false;
    }

    lastNodeOfCircle->next = nullptr;
    return true;
}

#endif

// test.cxx
void TestLinkedListCircleDetector()
{
    LinkedListElement<int>* head = nullptr;

    // empty list
    assert(LL_GetOneOfNodesInCircleIfExists(head) == nullptr);
    assert(LL_CheckIfCircleExists(head) == false);
    assert(LL_GetCircleSizeIfExists(head) == 0);
    assert(LL_DetectTheLastNodeOfCircleIfExists(head) == nullptr);
    assert(LL_RemoveCircleIfExists(&head) == false);

    LL_PushFront(&head, 11);
    LinkedListElement<int>* tail = head;

    // point to itself
    tail->next = head;
    assert(LL_GetOneOfNodesInCircleIfExists(head) != nullptr);
    assert(LL_CheckIfCircleExists(head) == true);
    assert(LL_GetCircleSizeIfExists(head) == 1);
    assert(LL_DetectTheLastNodeOfCircleIfExists(head) == tail);
    assert(LL_RemoveCircleIfExists(&head) == true);
    assert(head->next == nullptr);
    assert(head->data == 11);

    // construct 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
    LL_PushFront(&head, 10);
    LL_PushFront(&head, 9);
    LL_PushFront(&head, 8);
    LL_PushFront(&head, 7);
    LL_PushFront(&head, 6);
    LL_PushFront(&head, 5);
    LL_PushFront(&head, 4);
    LL_PushFront(&head, 3);
    LL_PushFront(&head, 2);
    LL_PushFront(&head, 1);
    LL_PushFront(&head, 12);

    assert(LL_GetOneOfNodesInCircleIfExists(head) == nullptr);
    assert(LL_CheckIfCircleExists(head) == false);
    assert(LL_GetCircleSizeIfExists(head) == 0);
    assert(LL_DetectTheLastNodeOfCircleIfExists(head) == nullptr);
    assert(LL_RemoveCircleIfExists(&head) == false);

    // setup the circle
    tail->next = head;
    assert(LL_GetOneOfNodesInCircleIfExists(head) != nullptr);
    assert(LL_CheckIfCircleExists(head) == true);
    assert(LL_GetCircleSizeIfExists(head) == 12);
    assert(LL_DetectTheLastNodeOfCircleIfExists(head) == tail);
    assert(LL_RemoveCircleIfExists(&head) == true);
    assert(tail->next == nullptr);
    assert(LL_GetSize(head) == 12);
    int val;
    assert(LL_GetNth(head, 12, val) == false);

    // setup the circle
    tail->next = head;
    LL_PushFront(&head, 20);
    LL_PushFront(&head, 30);
    LL_PushFront(&head, 40);
    LL_PushFront(&head, 50);
    LL_PushFront(&head, 60);
    LL_PushFront(&head, 70);
    assert(LL_GetOneOfNodesInCircleIfExists(head) != nullptr);
    assert(LL_CheckIfCircleExists(head) == true);
    assert(LL_GetCircleSizeIfExists(head) == 12);
    assert(LL_DetectTheLastNodeOfCircleIfExists(head) == tail);
    assert(LL_RemoveCircleIfExists(&head) == true);
    assert(tail->next == nullptr);
    assert(LL_GetSize(head) == 18);
    assert(LL_GetNth(head, 18, val) == false);

    LL_Delete(&head);
}
//********************************************************************************

4. Summary
The first solution using hash table is to guarantee O(N) computation complexity and O(N) space complexity. The second solution is to guarantee O(N) computation complexity as well but only O(1) space complexity. But the second solution could run ~3 times slower than the first solution in the worst case, because it needs to traverse the Linked List to detect if the circle does exist, traverse again to find out the size of circle, and traverse the third time to find out the last node of the circle. And it is fair that the first solution trades space for speed.

Bibliography:
[1] http://cslibrary.stanford.edu/105/LinkedListProblems.pdf
[2] Glenn W. Rowe, Data Structure & Algorithm C++, Prentice Hall, 1996

No comments:

Post a Comment