MAF : Reference : Word-ordering methods

Word-ordering methods

MAF supports all the methods used to order words permitted in KBMAG, and some others, and the code is constructed so as to make it easy to add more if required. With some word-orderings, such as "shortlex", automatic structures are assumed to be required; for others they are only computed if explicitly requested. You can use command line options to over-ride the default behaviour.

In MAF the word-ordering method is always set in the input file, and, unlike KBMAG, the use of command line options to change the word-ordering is not supported. This is mainly because a user could generate the automata for a different word-ordering and then inadvertently try to use them as though they applied to to the ordering specified in the input file.

The following options are currently supported as values for the ordering field of an input file:

  1. ordering := "shortlex" Words are ordered so that all words of a particular length come before any longer word. Amongst words of equal length, lexicographical ordering is used, with words earlier in the lexicographical ordering coming first.
  2. ordering := "wtlex"

    This ordering, "weighted lex ordering" requires an additional field, weight, to appear in the input file, containing a list of positive integers, one for each generator, like this:

    weight := [2,1,6,3,4],

    The length of the list of weights must be equal to the number of generators. The assignment of the weight field must come after the generatorOrder field and before the equations field. Using weight values above 65535 could lead to erroneous results, because integer overflows will be possible on very long words.

    The weight of a word is computed by adding up the weights of the generators in the word. Heavier words come later in the ordering than all lighter words. Amongst words of equal weight lexicographic ordering is used to decide which word comes first.

  3. ordering := "wtshortlex"

    This ordering, "weighted shortlex ordering" requires also requires the weight field, and is defined almost identically to "wtlex" order. The only difference is that in the definition above "Amongst words of equal weight lexicographic ordering is used to decide which word comes first" is replaced by "Amongst words of equal weight shortlex ordering is used to decide which word comes first".

  4. ordering := "recursive"

    Use a recursive word-ordering. There are various ways to define this. Perhaps the quickest is as follows:

    Let u and v be words in the generators. If one of u and v, say v, is empty, then uv.
    Otherwise, let u = u′*a and v = v′*b, where a and b are generators.
    Then u > v if and only if at least one of the following holds:

    1. a = b and u′ > v′
    2. a > b and u > v′
    3. b > a but u′ > v

    In a confluent rewriting system using ordering := "recursive", the reduced words will have as few of the later generators in the ordering as possible, and these will come as early in the word as possible. If generator b occurs later than generator a in generatorOrder then a*b comes later in the ordering than any word of the form b*a^n. More problematically, (from the point of view of achieving success with MAF) if c > b > a in the ordering, then a*c comes later in the ordering than c*w where w is any word at all containing just a and b.

  5. ordering := "rt_recursive"

    This ordering, "right-recursive", is very similar to recursive ordering. The only difference from the definition above is that instead of:
    "Otherwise, let u = u′*a and v = v′*b,..." we have
    "Otherwise, let u = a*u′ and v = b*v′, ..."

    As with recursive ordering, the reduced words will have as few as possible of the later generators, but in this ordering the later generators will come as late in the word as possible. ordering := "rt_recursive" is slightly more like ordering := "shortlex", in that it will prefer a*b to b*a. In fact for some Coxeter groups they give the same results.

  6. ordering := "wreathprod"

    "Wreath Product" ordering is rather complex, as can be seen if you examine the code for comparing two words using this ordering in the source file alphabet.cpp. In this ordering each generator must be assigned a level via the level field like this:

    level := [4,3,2,1],

    The length of the list of levels must be equal to the number of generators. The assignment of the level field must come after the generatorOrder field and before the list of equations. The smallest permitted value for the level of any generator is 1.

    In order to compare two words we proceed as follows:

    1. First remove any common prefix or trailing subword they may have. If nothing is left the words are equal. If something is left proceed to step 2.
    2. Find the maximum level of generator occurring in the symbols remaining for each word. If this is different then the word containing the generator at a higher level is greater. If the maximum is the same proceed to step 3.
    3. Extract from each word, in order, all the generators at this highest level. These maximal level subwords are compared using shortlex, and if they are not equal the word which gave the greater maximal level subword is greater. If they are equal proceed to step 4.
    4. Now starting from the words as they were after step 1, discard, from both words, all symbols from the first occurrence of a generator at the maximal level, regardless of their level. In other words, replace each word by its longest prefix that consists only of generators from a lower level than the maximal level. Note that these words cannot be equal, because we removed any common symbols from the front of the words at step 1. Now return to step 2.

    ordering:="recursive" behaves identically to the special case of wreath product ordering in which the level of generator number i is i. On the other hand, if all the generators have the same level, then "wreathprod" is the same as "shortlex". So, depending on the levels given to the generators, ordering:="wreathprod" is capable of behaving like both those orderings. (This is part of the reason for its use in coset systems).

    Note that we could have used some other way of comparing the maximal subwords in step 3 above rather than shortlex. For example,we could use a "wtlex" comparison. Also, in step 4 we might choose to look to the right of the maximal subwords rather than the left. MAF's "wreathprod" ordering could more accurately be described as "left wreath product order over left shortlex".

The remaining word-ordering methods are new to MAF. The author does not know if they have been described and named elsewhere. Several of them are only included because the author uses them for testing purposes, or was interested in experimenting with them. The first two are "right" versions of orders already defined, and are useful.

  1. ordering := "short_rtlex"

    Words are ordered so that all words of a particular length come before any longer word. Amongst words of equal length, right lexicographical ordering is used, with words earlier in the ordering coming first. Right lexicographical ordering is like ordinary lexicographical ordering but with the words read from right to left rather than left to right. In other words, it is not the first different generator that is decisive between two words of the same length, but the last.

    This ordering will usually give similar results to ordinary shortlex, but might be more convenient in some cases. For example, if you want to use a word-acceptor automaton to control a drawing application, and the words represent some sequence of transformations performed left-first rather than right-first. (So that the action of a word a*b*c on some point x, might be represented as x*a*b*c). It is perfectly possible for a rewriting system to be automatic using "short_rtlex" order when it is not automatic using "shortlex" order, and vice versa.

  2. ordering := "rt_wreathprod"

    This ordering is identical in its definition to "wreathprod" ordering, except that at step 4 we look to the right of the maximal subwords, rather than the left. This order could therefore be described as "right wreath product order over left shortlex". This word-ordering is actually a better choice for coset systems, than "wreathprod" ordering, because it works correctly when the rewriting system for the group does not use shortlex order. If all the generators are at different levels, this ordering is identical to "rt_recursive" ordering with the generators reordered according to increasing level. When all the generators are at the same level, this ordering is again the same as "shortlex".

The next few orderings were added to provide easy test-cases for MAF's facilities for non-shortlex automatic structures.

  1. ordering := "short_wtlex"

    Words are ordered so that all words of a particular length come before any longer word. Amongst words of equal length, "wtlex" ordering is used, with words earlier in "wtlex" order coming first. As you would expect this ordering uses the weight field.

  2. ordering := "short_fptp" (fptp stands for first past the post)

    Words are ordered so that all words of a particular length come before any longer word. Words of equal length are ordered by considering the left-most occurrence of the latest generator that occurs in either word, ignoring all positions where the generators are equal. Whichever word has this generator is the greater.

  3. ordering := "short_accented_lex"
  4. ordering := "short_multiaccented_lex"

    These two orders use the level field. ordering := "short_accented_lex" behaves almost identically to ordinary shortlex. However, if at the first place where the words differ the letters are from the same level, the decision on the order of the words is only provisional and can be over-ridden by the first position in which the generators are from different levels. This ordering is similar to how words are ordered in French and other European languages with accented characters, hence the name.

    With ordering:="short_multiaccentedlex", if at a later position in the word the order has still not been decided by the occurrence of generators from different levels, and again there are two different generators from the same level, and both generators are at a higher level than has been used to make a provisional decision before, then the provisional decision is over-ridden.

The remaining orderings were added for experimental purposes, and do not seem to be useful.

  1. ordering := "short_recursive"

    Words are ordered so that all words of a particular length come before any longer word. Amongst words of equal length, recursive ordering is used, with words earlier in the recursive ordering coming first.

  2. ordering := "short_rt_recursive"

    Words are ordered so that all words of a particular length come before any longer word. Amongst words of equal length, right recursive ordering is used, with words earlier in the right recursive ordering coming first.

  3. ordering := "grouped"

    This ordering uses the level field, but can be considered to be the limiting case of weighted lex ordering in which the weights of two generators, if different, differ by an infinite ratio. When comparing words first compares the number of generators in each word at the highest level, if this is not equal, then the word with fewer generators at this level comes earlier. The words are then compared at the next level, and so on. If the words are still equal (so that they are certainly the same length), the word which comes first in lexicographical ordering comes earlier.

  4. ordering := "ranked"

    This word-ordering uses the level field,and is intermediate in behaviour between "wtlex" and "wreathprod". The words are first compared by ignoring all the letters except those at the highest level. If they are different, the word which has the remainder which comes first in shortlex ordering comes first. If they are equal we then compare the words by ignoring all the letters except those at the next highest level (now also ignoring the letters at the very highest level). We keep doing this until we have compared the words at every level. If they are still equal after all that, then the word which comes first in lexicographical ordering comes first.

    ordering:="ranked" can only differ from ordering:="grouped" if at least two generators are at the same level.

    This ordering is similar to "wreathprod" ordering in that it will make the word arbitrarily longer just to improve the word lexicographically at a higher level, but unlike it in that equations such as [a*c,c*w] are impossible unless w contains at most one a, and no generators of the same or higher level than a.

  5. ordering := "nestedrank"

    This ordering is very similar to the last, but in the event of a draw at a higher level the higher level letters are still taken into account when the generators from the next level down are compared. Therefore this ordering is slightly more like "rt_wreathprod" orderings than "ranked" is.

Input files using the last three word-orderings tend to produce output rewriting systems with many fewer equations than a "shortlex" input file would, but many more than "recursive" and "rt_recursive" or "wreathprod" and "rt_wreathprod" input files.

MAF does not currently support building word-acceptors from word-difference machines for "ranked" or "nestedrank" order.

The author had hoped that these three orderings would prove genuinely useful for rewriting systems. Unfortunately, although they tend to produce confluent systems with far fewer equations than shortlex systems, and are less prone than recursive orderings to pathological behaviour, they also seem to be less likely to collapse quickly, and to be much larger before the collapse.

Further remarks on word-ordering

"short_fptp", "short_accentedlex" and "short_multiaccentedlex" were created to provide the easy test cases for building non-shortlex automatic structures. All these word-orderings can be recognised by an FSA, so that they are equally suited to the construction of an automatic structure as shortlex. The author has not found any examples which are automatic using one of these orders but not "shortlex", but thinks it is possible that this might happen, although since the FSA for these orderings have more states than the shortlex automaton does it would probably an unusual occurrence. Construction of word-acceptors for these word-orderings is slightly more difficult than for "shortlex" or "short_rtlex" word-ordering, because both the latter two can use highly optimised code for building the word-acceptor, whereas the former all need to use the generic non shortlex code.

The various "wtlex" type word-orderings cannot be recognised by an FSA, but even so, construction of automatic structures is quite often possible.

Geodesic and non-geodesic word-orderings

The word-orderings whose name begins with short all have the property that the "reduced" word for an element is a shortest equal word, and they will be termed geodesic word-orderings. For all the other word-orderings "reduction" can increase the length of a word, and so the orderings are termed non-geodesic. In the case of weighted lex type orderings ("wtlex" and"wtshortlex"), the potential increase is not usually too severe, but with some of the others, and especially with the the four orderings "recursive", "rt_recursive", "wreathprod", and "rt_wreathprod" the increase in length is potentially unlimited. The first two of these will be called recursive type word-orderings, and all four called wreath type word-orderings. Even where the equations in a rewriting system are all quite short, and when the increase in length in any one equation is moderate, reduction of some words can sometimes become very difficult, even impossible within reasonable time and space constraints, when these orderings are used. Both MAF and KBMAG can run into severe difficulties with wreath type word-orderings. A sign that MAF is in trouble is that the "Depth" value reported in its progress messages becomes extremely high and stays high, and that the message "Performing a difficult reduction" appears frequently. MAF tries hard not to get into this kind of state. Although MAF has to allow the creation of equations that increase the length of words, during Knuth-Bendix completion it attempts to give preference to equations that do not do so, and tries to eliminate perverse equations. Also MAF simply aborts any reduction that takes too long, saves the equation it was working on in its current state if necessary, and only returns to the equation later on if it still seems to be needed.

The four wreath type orderings can have remarkable behaviour, at least when most generators are at different levels. There are input files, with which automata can do nothing useful when ordering is set to "shortlex", but for which a confluent rewriting system is found within seconds when one or other of these orderings is used with a suitable ordering of generators or choice of levels. The confluent systems that are found in such cases are usually much smaller, in terms of the number of equations, than the "shortlex" one where both exist, which means these orderings are potentially very useful for large finite groups as well as for infinite ones that are not shortlex confluent (see MAF Tutorial 3 for an example). On the other hand the equations that emerge are often frequently severely unbalanced, with the RHS much longer than the LHS. This means that "reduction" becomes a severely misleading term.

If you use one of these orderings, and MAF gets into difficulties, it is worth trying to run automata again with the -work_order n option or the -strategy n, or some of the other special options. This makes it possible to set flags that alter MAF's algorithms in various ways. This may be enough for the problem to be avoided.

Tips on Word Ordering

  1. If using a recursive type of word-ordering, make sure you order the generators sensibly. Using one of the recursive orderings frequently results in the elimination of one or more generators - if you specify the order incorrectly you will not eliminate the generators you want to eliminate. Generators you want to keep should come first.

  2. If you want to use a wreath type word-ordering, but the input file is actually confluent when using "shortlex" ordering, it may be better to create an input file from the .kbprog file for the "shortlex ordering". It sometimes happens that MAF is unable to create a confluent rewriting system from the original axiom set, but can do so from the shortlex confluent rewriting system.

  3. Good results are often obtained with the "wreathprod" and "rt_wreathprod" word-orderings if each generator has the same level as its inverse, but increasing levels are used otherwise. Sometimes a symmetry between generators may suggest other ways of grouping generators at the same level. Another possibility, of interest in the case where all the generators are torsion elements, is to assign, for each mutually inverse pair of generators, one generator to level 1, and the other to level 2, as this will result in output files where all the accepted words are "positive" words.

    If all the generators are at different levels, then "wreathprod" ordering is equivalent to "recursive", and "rt_wreathprod" to "rt_recursive" ordering, in each case with the generators re-ordered according to increasing level. MAF's performance will be marginally better if you actually change the input file accordingly.

  4. You can often use a recursive type word-ordering to change generators. For example if a group is generated by a,b it is also generated by a,c where c = a*b, so that adding c as a new generator which comes before b in generatorOrder will cause b to be eliminated. The author used this approach to find some two-generator presentations of symmetric groups included in the examples. However, using a coset system with named subgroup generators is normally a better approach to changing generators.