Journal Articles

CVu Journal Vol 14, #4 - Aug 2002 + Design of applications and programs
Browse in : All > Journals > CVu > 144 (17)
All > Topics > Design (236)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: An Introduction to 4DML

Author: Administrator

Date: 03 August 2002 13:15:53 +01:00 or Sat, 03 August 2002 13:15:53 +01:00

Summary: 

In my research project I have developed a data structure called 4DML (four-dimensional markup language), which is something like a cross between a tree and a matrix.

Body: 

In my research project I have developed a data structure called 4DML (four-dimensional markup language), which is something like a cross between a tree and a matrix. It's not particularly efficient, but it can be an interesting way of representing things; if you can think in the appropriate paradigm, there are some tasks that can be implemented quite rapidly.

Computer Science has always used tree-like structures to represent things. Mathematical expressions are structured hierarchically, and so are computer languages, as you would know if you've ever tried to use a compiler generator. There is a vast body of theory that deals with storing and manipulating tree-like data structures for all kinds of purposes. The recent popularity of generalised markup languages such as XML is a continuation of this trend.

Much data is also matrix-like in nature. Tables and spreadsheets are matrices. A musical score (which represents parallel streams of events) is matrix-like, and so are parallel translations of literary works. Often it is possible to read a document in several different ways, using, in effect, several different methods of indexing into the items of data that make up the document. It can be useful to treat these indices as the different dimensions of a multi-dimensional matrix, so that switching from one system to another amounts to switching to a different dimension.

However, not everything makes sense as a matrix. Sometimes it is better to use a conventional tree-like hierarchical structure, particularly if the data's structure is recursive (as is the case with mathematical expressions, which can contain other expressions to any depth). But using a hierarchical structure throughout makes things more complex when they would perhaps be better represented as matrices.

The 4DML data structure aims to represent data in such a way that it can be treated as either or both, as appropriate. As a side-effect, it also allows multiple, independent hierarchies over the same data. When you read a book, you are looking at at least two independent hierarchies - the logical hierarchy of headings, paragraphs and sentences, and the physical hierarchy of pages, columns and lines. Sometimes it is necessary to store both; for example, Braille versions of books frequently have special notes to tell the blind reader where they are in the printed edition, so that, if they are in a classroom full of sighted people and the teacher says "now look at the top of the page'', they'll be able to find where this is in the Braille. Some religious texts have chapter and verse numbers, which is yet another hierarchy and can be more useful than layout (life in a diverse classroom would be so much easier if all teaching materials were numbered like that).

Why might mixing trees with matrices be useful? For one thing, navigation around the structure becomes easier to program. For example, imagine a document with several sections. Each section has several subsections, and each subsection has several paragraphs. Now imagine you have an iterator (or cursor) that is pointing to section 2, subsection 3, paragraph 6. If the document is represented as a hierarchical, tree-like structure, it should be fairly simple to move that iterator on to the next paragraph or the previous paragraph, or to go up one level and move it on to the next or previous subsection. This kind of navigation is sufficient for most conventional tasks that involve documents.

However, now suppose that the document has a translation into another language. The translation is represented as a second document, with the same structure of sections, subsections and paragraphs as the first. The iterator, as before, is in section 2, subsection 3, paragraph 6, of document 1. For some applications, it can make sense to jump from this point to the same point in document 2, that is, document 2, section 2, subsection 3, paragraph 6. If the documents can only be navigated hierarchically, then this jump will involve quite a number of navigational steps. In this particular case, the problem can be addressed by maintaining two separate iterators (one in each document), but things could get more complex.

For example, imagine a music book with many songs. Each song has many translations, each translation has several verses, and each verse has many words. Each song also has music, which has several parts for different instruments. The whole thing can be represented hierarchically, but it would be a fairly complex task to jump from (song 3, translation 1, verse 2, note 5) to the same place in translation 2, or to the same place in verse 3, or to the data that represents which notes are being played at that time. Such jumps would be required very frequently by certain kinds of typesetting algorithms.

Problems like the above arise because some hierarchical representations are not "simply connected". In a tree-like data structure, there may be no direct link from, for example, (song 3, translation 1, verse 2, note 5) to the same place in translation 2, even though there is a conceptual link as far as some types of processing are concerned. If conceptual links are not physically represented in the data structure, then the algorithms that use them will become more complex and difficult to develop, since there is a more considerable code overhead for navigating around the data.

The problem could be addressed by adding more links (pointers) in the hierarchical structure, but doing this can become cumbersome. However, adding more dimensions can allievate the need for such pointers, because the items that are linked are adjacent to each other in a different dimension. (Of course, the actual implementation might use references or pointers, but the programmer doesn't have to think about that.)

To illustrate: take a piece of paper, mark any two points on it, and then try to fold the paper in such a way that those two points touch each other. In two-dimensional space (on the piece of paper), those two points are quite far apart, but in our 3D space, they are touching. If you were trying to explain all this to imaginary beings who can only understand two dimensions, then you would have to say that there is some kind of "wormhole" magically linking the two points together. But if you can think in three dimensions, you don't have to invent such things as wormholes - the points are touching because the paper is folded, and they are adjacent to each other in the third dimension. Mathematically, the space on the paper used to be "simply connected" (each place on the paper was only "touching" the places next to it), but when the paper was folded, it was no longer simply connected (some places were now touching other places that were quite far away on the paper). However, in three dimensions, it is still simply connected.

Then why not work with N-dimensional matrices directly?

Computer memory is one-dimensional; everything is indexed by memory address. It is possible to represent objects of two or more dimensions in computer memory, but doing so requires somebody (either the programmer or the compiler writer) to think about how to store such things in a one-dimensional space. This can be particularly difficult if there is no limit to how the object can grow. Anybody who has tried to write a simple full-screen text editor from scratch will testify that this is not as easy as a novice might imagine; as the user moves the cursor on a two-dimensional screen, the editor has to calculate where it is in the one-dimensional sequence of characters that makes up the file being edited.

This problem is made more acute when the number of dimensions is not constant. Additionally, when representing a document in an Ndimensional matrix, the resulting matrix is sparse; most of the data is close to the origin, and it would be very wasteful to allocate a large Ndimensional array that is big enough to hold the maximum extent of every dimension.

There are many ways of representing sparse N-dimensional matrices in computer memory. 4DML is one of these, but it has been designed in such a way that it can also represent hierarchical structures, or a mixture of the two. To begin with, we note that an N-dimensional matrix can be represented as a set of three-dimensional points:

(dimension, position, object)

In this representation, each object (which we assume can be uniquely identified) is given a position on each of the dimensions. For example, if object Q is at co-ordinates (5,8,13,21,34,55) then the threedimensional point-set will contain the following points (in no particular order):

(1, 5, Q), (2, 8, Q), (3, 13, Q), (4, 21, Q), (5, 34, Q), (6, 55, Q)

In a real application, it is useful to give the values of the first co-ordinate names rather than numbers; 4DML terms them "elements", since they often reflect the names of XML elements, and it is important not to confuse a dimension in the N-dimensional matrix with a dimension of the 3D representation.

The fourth dimension in 4DML arises from the need to also represent hierarchical structures. It is a depth dimension, and it is used to disambiguate elements with the same name at different depths in the tree.

Thus 4DML is a collection of points of the form:

(element, position, depth, object)

These points can represent any hierarchical data structure and also any N-dimensional matrix, or a combination of the two. They can also represent several different structures over the same data. The ordering of the points themselves is not significant, which means they can be kept in a hash table or other efficient data structure.

Transforming it

An algorithm called "transformation by model" takes two 4DML documents, an "input" and a "model", and transforms the input as necessary to reflect the structure of the model. The algorithm can also report which objects have been lost in the process, if any.

The algorithm works by performing a top-down traversal of the model, and reads off relevant parts of the input as it does so. Since 4DML can be read in many different ways, it is not difficult to read it in whatever way is dictated by the structure of the model, regardless of whether or not this matches the original structure. Hence it is possible to perform complex structural transformations merely by writing down the form of the desired result. Of course, it is also possible to follow the structure of the input like a conventional stylesheet processor.

I won't go into the precise workings of the algorithm here (it would take too long) but readers are welcome to download and experiment with the hacky Python prototype on my website, at http://www.cus.cam.ac.uk/~ssb22/4dml/. The prototype can take its input in XML as well as a few simple text-based languages specially invented for the purpose (these are usually easier to hand-code than XML is). It is only a proof of concept though; it can be used, but it would need more development before it is a quality software deliverable (but then, academic research is not about making software deliverables).

Of course, feedback is most welcome, although I cannot guarantee to reply to all of your hundreds of emails :-)

Notes: 

More fields may be available via dynamicdata ..