Journal Articles

CVu Journal Vol 17, #5 - Oct 2005 + Programming Topics

Browse in :

All > Journals > CVu > 175 (15)
All > Topics > Programming (877)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Pointer Reversal: An Algorithm Design Technique

Author: Administrator

Date: 02 October 2005 06:00:00 +01:00 or Sun, 02 October 2005 06:00:00 +01:00

Summary:

To summarize, as memory goes too low, garbage collection could kick in, but it in turn needs memory to maintain a traversal stack â€¦ Quite a catch-22.

Body:

Why Do We Need Yet Another Algorithm Technique?

I came across the Schorr and Waite algorithm while studying garbage collection in Knuth. (See algorithm E, section 2.3.5 of Knuth Volume I.) One phase in garbage collection typically consists of marking nodes of a data structure not in use. These data structures are Lists. Knuth defines Lists as "a finite sequence of zero or more atoms or Lists". Any forest is a List. Unlike forests, Lists can have cycles. And just the way forests are represented as binary trees, two links are used, each link with a different meaning.

There are three types of Nodes:

List Heads
Atom Nodes
List Nodes

An ATOM field is maintained, which is true for Atom nodes and false for List nodes. For a List node, two links, DLINK and RLINK are maintained. If ATOM is false and DLINK is non-null, it points to a List Head. If ATOM is true, DLINK is irrelevant.

Given a collection of List Heads, you need to visits all nodes reachable from each List Head and mark them. You need to keep track of where you came from and so need to maintain a stack. But there might not be enough memory to keep the stack!!!

To summarize, as memory goes too low, garbage collection could kick in, but it in turn needs memory to maintain a traversal stack … Quite a catch-22.

The Schorr and Waite is an elegant algorithm that solves this problem. It alters the pointer fields of the data structure itself to maintain the traversal stack ( Pointer Reuse ☺ ). Suppose Node A points to Node B and B points to Node C. When the algorithm is executing and marking Node C, it temporarily alters the pointers so that C points to B and B points to A. Hence the name Pointer Reversal. When the algorithm pops the stack, it restores ( reverses back ) the pointers to their original values.

The ATOM Trick

List Nodes have an ATOM bit which is used skip ATOM nodes, after they are marked. For a List Node, this bit is always false. The algorithm uses this bit again ( in a context where it is known to be false ) for remembering state for Linked Nodes!! As pseudo-code

if ( ATOM = true )
{
 mark this node and skip rest of the processing
}
else
{
  if ( we changed DLINK as a stack link )
    ATOM = true
  .....
/*Now here, when we want to reverse the link*/
  if(  ATOM = true )
  { 
    we came over a DLINK ( and hence changed it ) 
  }
  else
  {
    we came over a RLINK (and hence changed it)
  }
}

The clever trick is manipulation of the ATOM bit. While processing a List Node ATOM is false, so we set ATOM = true to remember one pointer and ATOM = false ( the default ) to remember other.

Application to Binary Trees (And The Good Old Inorder Traversal)

However, as it turns out, we don't need this ATOM bit for binary trees… We exploit the properties of binary trees (no cycles, all nodes are List nodes etc.) and use the traversal property ( inorder traversal ) in place of the ATOM bit.

The algorithm, in Knuthspeak, looks as below: ( See notation below )

K1. Set P <- Root, T <- ^. [ Root is root of binary tree, ^ stands for null pointer ].

K2. If  LLINK( P ) = ^ goto K3. Otherwise, set R <- LLINK(P), 
  LLINK(P) <- T, T <- P [ pointer reversed ], P <- R and repeat this step.

K3. Set MARK(P) <- 1. If RLINK(P) = ^ goto K4. Otherwise, set R <- RLINK(P), 
  RLINK(P) <- T, T <- P, P <- R. Goto K2.

K4. If T = ^, the algorithm terminates. Otherwise, if MARK(T) = 0, Q <- LLINK(T), 
  LLINK(T) = P, P <- T, T <- Q and goto K3. Else, [parent is marked ], 
  set Q <- RLINK(T), RLINK(T) <- P, P <- T, T <- Q and repeat this step. 

Notation: 1. = is equality operator (= =)
                2. T <- P is assignment - T is assigned the value of P
                3. LLINK(P) is really P->LLINK in C/C++

Step K2 reverses left pointer.
K3 reverses the right pointer.
K4 pops back ( i.e. restores pointers )

Note that we are essentially traversing (marking nodes in) the tree in in-order traversal. It is essentially an in-order traversal without any explicit stack. Instead we need a MARK bit ( but note that the MARK bit really need not be explicitly maintained - you can use other contextual info ).

Note that, in K3, when a node is marked, its left subtree ( if any ) is all marked.

I tried the algorithm with boundary test cases (a binary tree with just the root node, a degenerate tree with only left or only right nodes etc…). It works correctly…

The Master Verifies…

This appears as an exercise in Knuth (2.3.5 - exercise 7). When I come up with a solution, verifying it always teaches me something more… This was a rare case where Don's agreed with mine… Sometime back, I solved exercise 10 and when tallied with the book solution, I came across a very novel use of queues.

Over the years, as I keep reading Knuth I have grown a deep respect and adoration for these volumes. The text is crisp, sharp and cracking the exercises make you ecstatic. Nirvana, anyone?

Notes:

More fields may be available via dynamicdata ..