We present a unified and systematic approach to basic principles of Arbology, a new algorithmic discipline focusing on algorithms on trees. Stringology, a highly developed algorithmic discipline in the area of string processing, can use finite automata as its basic model of computation. For various kinds of linear notations of ranked and unranked ordered trees it holds that subtrees of a tree in a linear notation are substrings of the tree in the linear notation. Arbology uses pushdown automata reading such linear notations of trees as its basic model of computation. Basic principles known from stringology are used for the construction of particular arbology algorithms, in which the underlying tree structure is processed with the use of the pushdown store. Arbology results are shown for the basic problems subtree matching and tree indexing for ranked and unranked ordered trees.
In this work we deal with tree pattern matching over ranked trees, where the pattern set to be matched against is defined by a regular tree expression. We present a new method that uses a tree automaton constructed inductively from a regular tree expression. First we construct a special tree automaton for the regular tree expression of the pattern E, which is somehow a generalization of Thompson automaton for strings. Then we run the constructed automaton on the subject tree t. The pattern matching algorithm requires an O(|t||E|) time complexity, where |t| is the number of nodes of t and |E| is the size of the regular tree expression E. The novelty of this contribution besides the low time complexity is that the set of patterns can be infinite, since we use regular tree expressions to represent patterns.