MAF : Overview : Computations on whole groups or monoids

MAF: Computations on whole groups or monoids

Readers interested in such matters should note that MAF does not deal with semigroups. It is a trivial operation to convert a semigroup into corresponding monoid, and equally trivial to convert the automata MAF produces for such a monoid into automata for the original semigroup. If support for semigroups is desired it will be easy to add.

Most of the computations to be described in this section are usually performed by invoking MAF's principal utility: automata.

Presentations, rewriting systems and "input files"

In order for MAF to be able to perform computations on a group or monoid, it must first be supplied with an appropriate presentation, in the form of a text file containing a standard GASP format rewriting system. A file in this format will generally be called simply an input file in this document.

Although the MAF user will probably think of an input file as a presentation it is important to remember that there are some differences from a group presentation:

MAF works with monoid presentations. This means that MAF needs to be told, which, if any, of the generators are invertible, and that both the generator and its inverse need to be included in the list of generators when an input file is created from a group presentation.
The input file must contain, at least implicitly, information which establishes a reduction ordering of all the words in the generators.

An informal definition of the syntax of input files syntax can be found in MAF Reference : Input files. The various options for ordering words are discussed in MAF Reference : Word-ordering methods.

Readers who have not already done so, and who have not previously used KBMAG, should refer to MAF Tutorial 1: Processing an input file before continuing.

Searching for a solution to the word problem

Once an input file has been created, one may need to decide what kind of a solution to the word problem MAF should look for. The good news is that in many cases this is unnecessary. However, it is important that the user should be aware of the issues, because otherwise MAF may run in an inefficient manner. The various possibilities are discussed more fully in Overview: Solutions to the word problem. automata will attempt to construct either a finite confluent rewriting system, or an automatic structure (or very likely both). In either case the computation usually uses the Knuth-Bendix procedure. An alternative, for input files describing a finite group, is to attempt to use coset enumeration to find the coset table of the group.

Bad news

Not all input files have a finite confluent rewriting system, and many infinite groups and monoids may not have a finite confluent rewriting system for any choice of generating set and reduction ordering. The existence of a finite confluent rewriting system for any particular choice of generating set and word-ordering does not guarantee its existence for any other choice.

Not all groups have automatic structures. There are infinite groups, with a solvable word problem, for which it is possible to show that no such structure can exist. If a group does have an automatic structure then the group is called an "automatic group". In principle, the existence of an automatic structure for some choice of generating set does guarantee the existence of one for any other finite generating set; this is why it is meaningful to talk of a group's being automatic. An algorithm exists for constructing an automatic structure on an arbitrary finite set of generators given any particular automatic structure. MAF is only able to find automatic structures that are natural for the chosen generating set and word-ordering: ones in which the accepted word for each element is the least word in the word-ordering that is equal to it as a group element. However, if you have succeeded in computing an automatic structure using some generating set, MAF is able to compute a word acceptor for any other genererating set using the utility gpxlatwa.

It will be useful, informally, to speak of an input file as being automatic or confluent, where by this the terms are understood to mean, "MAF can compute an automatic structure from the input file" and "MAF can compute a finite confluent rewriting system from the input file". This makes it easier to state an important fact: an input file can be either, neither, or both of these. MAF's ability to find either type of structure for a group is heavily dependent on the choice of presentation and word-ordering (a first hint of this can be found by referring to MAF Tutorials: 2 - Choosing generators). So, given a presentation for an unknown group, one is immediately faced with the dilemma of which kind of structure to look for. This dilemma does not as occur for monoids, since MAF cannot construct automatic structures for them.

MAF searches for automatic structures and confluent rewriting systems

A major difference between MAF and KBMAG from the point of view of a user hoping to find a solution to the word problem for a particular group, is that with KBMAG one has to decide in advance, whether to look for an automatic structure, or for a confluent rewriting system. In MAF the situation is different: MAF's default behaviour, usually, is to look for both at the same time. This is not always the advantage it might seem to be - it might sometimes be better to run two copies of automata independently, one looking for an automatic structure, the other for a confluent rewriting system. Also, in some cases, it will quickly become apparent to the user that there is little or no hope of finding an automatic structure. In such cases it is usually much better to interrupt automata, and to start it again with an option which tells it only to look for a confluent rewriting system.

MAF has no means of knowing in advance whether its computations will succeed (no such means can exist), and it will run for ever until they do, unless interrupted, or it gives up due to limits placed on it being exceeded, or insufficient resources are available for it to continue.

Forcing MAF to search just for an automatic structure or just for a confluent rewriting system

The user can supply various command line options, which will control what automata MAF searches for, and when they are searched for. These may be termed goal setting options. We shall introduce the most important of these now. Refer to MAF Reference (Usage): automata for more complete information on all the many command line options accepted by automata.

To search only for a confluent rewriting system

There are two command line options for doing this: -no_wd and -confluent. The options only differ in what happens when MAF does succeed in finding a confluent rewriting system. With the first option, this is the only structure that MAF will compute and output. With the second option, MAF will then attempt to compute an automatic structure as well, using a word-acceptor computed from the rewriting system as the basis of the structure. Both these options may substantially speed up processing, and do usually do so if the input file presents a finite group. The -confluent option is on by default if the input file uses a word-ordering for which the construction of automatic structures is difficult. For large finite groups automatic structures can require a lot of computation, and the confluent rewriting system will usually be more useful in practice, and so the -no_wd option is often used with such groups.

To search only for an automatic structure

The -no_kb option tells MAF not to use the Knuth-Bendix procedure, and instead to attempt to construct an automatic structure immediately, using only the equations contained in the input file, and any others that are readily deducible from them. For infinite automatic groups this option often produces the fastest results, even for some groups for which the construction of the automatic structure is quite difficult. However, it can happen that MAF has great difficulty in building trial automatic structures. In such cases it may be best to run automata without this option, to allow more word-differences to be discovered first (though for the most difficult automatic groups it is often the case that there is almost no hope of running Knuth-Bendix for long enough to avoid this difficulty - it is very difficult to know when all the word-differences have been discovered). The -no_kb option is disastrous if the presentation is actually a difficult presentation of a finite group. MAF will usually eventually be able to discover this, but it will take much longer than it would have done had the Knuth-Bendix procedure been used. If a group is finite MAF will insist on finding the confluent rewriting system that certainly exists before it computes the automatic structure.

To search for an automatic structure when this is not the default behaviour

MAF's default behaviour is to search for an automatic structure only when an input file for a group uses a geodesic word-ordering (i.e. a word-ordering where a word is always greater than any shorter word). To make MAF search for an automatic structure for a word-ordering such as wtlex then one can use either of two options: -wd or -force_differences. The behaviour of the latter option is described in the next section. The -wd option will cause MAF to search for word-differences with its default strategy, in which equations with word-differences are favoured only until there are 5000 word-differences or so.

Using a preliminary run of `automata` to decide upon the best strategy

The -force_differences command line option will ensure that MAF computes word-differences if it is possible to do so, and will not stop favouring equations that contain new word-differences if the number of word-differences is high. One can run automata like this for a minute or two to help decide on the best strategy. If the number of word differences increases rapidly, and soon exceeds 5000, then there is little or no hope that MAF will be able to compute an automatic structure. In such cases it is best to stop automata and to restart it with the -confluent or -no_wd options. If, on the other hand, the number of word differences has stopped increasing rapidly by this time, then there is a good chance that the -no_kb option will give the fastest results. Start a second copy of automata, specifying the -no_kb option, this time leaving the first copy running, and wait until one of them succeeds. If memory becomes short then stop whichever seems to be making less progress.

A note about word-acceptors

It is perfectly possible for an input file to have a provably correct word-acceptor which does not form part of an automatic structure. This will happen, for example, whenever there is a finite confluent rewriting system for the input file, but the number of word-differences is not finite. It may also happen that while the minimal confluent rewriting system is not finite, there are only a finite number of word-differences for it, so that the equations that appear in it can be recognised by an automaton. In such cases one can construct a word-acceptor, even if it is not possible to construct the general multiplier because that requires an infinite number of word-differences. This situation does sometimes arise in practice. MAF can find such word-acceptors but cannot prove them correct.

MAF prefers, whenever possible, to compute the word-acceptor for an input file from a confluent rewriting system, rather than from a word-difference machine. It is worth having a word-acceptor even when a confluent rewriting system exists, because although both structures can be used to enumerate the reduced word for each element of the group, the former will usually have many fewer states, and use much less memory. For largish finite groups it is often very difficult, if possible at all, to construct a word-acceptor from the word-difference machine, because there are too many word-differences, whereas the computation of the word-acceptor from a confluent rewriting system is very rapid. Usually, though not always, MAF refrains from trying to build an automatic structure prior to a rewriting system becoming confluent, in the case that it will do so, because such groups usually have many word-differences.

The `-strategy string` option

MAF has a very large number of command line options that modify the behaviour of its Knuth-Bendix implementation. These are all documented in Usage: automata. It is quite difficult to remember how to use all of these various options, so to make life simpler the -strategy string option is provided. This can select one from a range predefined strategies for MAF to follow, and also apply modifications to the chosen strategy. One should do this only if the strategy MAF selects when no specific command line options are given does not seem to work well: some of the strategy options have very bad behaviour with many input files, whereas the automatically chosen strategy MAF will usually work reasonably well, even if it is not necessarily the fastest possible for that particular input file. The various strategies, and the combination of other options that they select are described in detail in Usage: automata, but the most important of the strategies are introduced here:

First come three strategies that are not normally specified explicitly, but which are nevertheless important; they select combinations of options that MAF commonly uses in particular scenarios:

-strategy short selects the strategy that is usually followed when MAF is looking for a confluent rewriting system and the word-ordering is geodesic.
-strategy wreath selects the strategy that is usually followed when MAF is looking for a confluent rewriting system and the word-ordering is a wreath product ordering, or close to such an ordering.
-strategy sparse selects the strategy that is usually followed when MAF is looking for a shortlex automatic structure.

Note that MAF does not actually select any of these options as its "default strategy", because the "strategy" option does not actually exist internally, and nor do these options. However, if no specific command line options are chosen, the internal options that MAF really does use in those scenarios would be set to the same values, or at least very similar ones, as these specific strategy options select. The reason why you might want to specify one of these options specifically is that you can use "strategy modifiers" as a way to investigate whether a minor modification to the default strategy might work better for a particular input file.

Next we mention some strategy options that may sometimes be useful:

-strategy long and -strategy era are important strategies, because one of them may be the only strategy that works well with some input files. Unfortunately MAF is not able to determine for itself when an input file is going to have this unfortunate property, and neither strategy works well the majority of the time, so MAF never works like this by default. If an input file does work well with either of these strategies it is rather likely that KBMAG will process the file as it stands more efficiently than MAF. It is also likely that MAF will be able to process the file much faster than that, and using a default strategy, if the word-ordering method is changed to "recursive" or a similar wreath product ordering. An input file has a good chance of running more quickly with one or other of these strategies if most of the following conditions apply:
1. Some of the equations in the input file are much longer than others. (Only equations where the total length is more than 4 count here - all equations of length 4 and below are ignored when MAF is trying to decide on a strategy for processing the file).
2. There are many overlaps involving only short equations.
3. There are "hidden" short equations, in other words, equations involving only short words that are not obvious consequences of the axioms in the input file, and which can only be deduced from much longer equations.
4. The input file uses a geodesic word-ordering such as "shortlex".
-strategy quick and -strategy easy turn off some of the additional processing MAF normally performs. As a result some input files will be processed more quickly if one of these two options is used. On the other hand, for some input files these strategies will take much longer than the default strategy, and it is impossible to know for sure which strategy without trying all of them.

Using the progress messages to decide on command line options

The progress messages usually output by automata are not there just to let you know that automata has not hung. They contain information which may help you to decide whether the options that are being used are good.

If too much time seems to be spent processing "special overlaps" or "conjugating" you might want to try the effect of -special_overlaps 0 or -conjugation 0
When processing an input file that uses a wreath product type word-ordering, if too much time seems to be spent "Checking secondaries", and especially if the number of L2/L3 nodes high compared to the number of L1 nodes, then experiment with the -weed_secondary or -secondary 1 options, or alternatively try the -consider_secondary option.

`gptcenum`: an alternative to `automata`

For finite groups, but not monoids, there is a third possibility for finding a solution to the word problem: instead of using the Knuth-Bendix procedure one can use coset enumeration. If it is possible to enumerate the cosets of the trivial group, then the standardised coset table provides an effective solution to the word problem. On average, and when it works well, computing a coset table using coset enumeration is significantly faster than computing a confluent rewriting system using the Knuth-Bendix procedure, and the coset table may well require less memory than a confluent rewriting system and index automaton. More information about MAF's coset enumerator gptcenum can be found in the relevant section of MAF Reference: RWS Utilities.

For difficult presentations, automata is very often far superior in performance, both in execution time, and the amount of memory required. On the other hand, for presentations of very large finite objects with very large confluent rewriting systems gptcenum is likely to be much better than automata. It is certainly worth experimenting with both programs.

Axiom checking. An important difference from KBMAG

As usually implemented by MAF the algorithm used to construct an automatic structure is theoretically incomplete. When MAF creates an automatic structure for a group it checks that it defines a closed multiplication on the language of the word-acceptor, but it does not check that the multiplication is associative, nor that it actually satisfies the group axioms. In theory MAF should perform some checks to ensure the proposed automatic structure really does define a group multiplication that satisfies the axioms. There is a good reason why it does not do so: the checks frequently take much longer to perform than constructing the automatic structure takes in the first place, and they have never been known to fail (for a group automatic structure). It is possible to make these checks using MAF, either by supplying a command line option when automata is invoked, or by running a separate axiom checking program afterwards. The first approach is usually faster, the second approach marginally safer, because more memory will be available if the automatic structure is checked later on by a separate utility.

automata checks the axioms for the automatic structure if the command line option -validate is specified. KBMAG users may like to note that this option is automatically supplied if the MAF version of autgroup is used to build an automatic structure instead of automata.

The separate utility for checking axioms is called gpaxioms; it is described in Usage information for RWS Utilities.

MAF does check the axioms by default when an automatic structure for a coset system is built, because for coset systems the checks can sometimes fail.

Examples

In this section, we mention some of the example input files, all to be found in various subfolders of the examples folder. These can usefully be used as test examples, and some of them have been included to demonstrate particular features.

The input files in trivial are all presentations of the trivial group. Note, in particular, degen4a, degen4b and degen4c. These are the first three of an infinite sequences of increasingly complicated presentations of the trivial group, due to B.H. Neumann. automata will run very quickly on all of these presentations, whereas KBMAG can only do the first two within a reasonable time. The -confluent, or the -nowd command line options speed up the calculation considerably in the case of degen4c, the primary reason for this being that automata computation of word-differences is very time consuming for this presentation. The fastest results for degen4c can perhaps be obtained by changing the word-ordering to "recursive" and using the command line options -strategy easy -work_order 4: when these two options are used the computation takes only one second!. For shortlex -strategy easy/i -work_order 6 is equally fast.

The input files in automatic are all for automatic groups. Most of the examples in this folder are of at least moderate difficulty and most run much faster with the -nokb option than without. Easier examples are to be found in the hyperbolic and coxeter folders. The most difficult input files for automatic groups are to be found in the hard folder - all those examples require several hours at least.

The example subgroups/ab2 is the free abelian group on two generators, as is subgroups/ab2_abAB. The former, which has generatorOrder := [a,A,b,B] has a finite confluent rewriting system. The latter, which has generatorOrder := [a,b,A,B], does not. It is almost always best to order the alphabet so that generators and their inverses appear next to one another if they are different.

The monoids directory contains a number of monoids. a4monoid illustrates what might happen if the inverses field is omitted or not set up. What would be a group of order 12 as a group presentation, becomes an infinite monoid. Several of the examples in this folder are related to the Fibonacci groups. Others are monoid presentations, where generators are not supplied with inverses. Try f25monoid, which is the presentation of the Fibonacci group F(2,5), but as a monoid. In fact, the structure is almost identical to the group in this example. The group is cyclic of order 11. The monoid has order 12, the extra element being either the a^11 word (which is not recognised as being equal to IdWord), or IdWord itself. The corresponding semigroup (without the empty IdWord) is isomorphic to the group. f27monoid and f27monoid_shortlex are both monoid presentations corresponding to the Fibonacci group F(2,7), which has order 29. As before the monoid has one extra element, which in this case is a^29. This example is another interesting example of the differences between MAF and KBMAG. KBMAG cannot easily solve this example when shortlex ordering is used, but with MAF it completes in around 5 seconds with the default strategy, and in similar times with some but not all of the other predefined strategies. With this example it is important to allow "special overlap" processing, and to use an "open" strategy that makes use of new equations as soon as possible. Particularly bad results are obtained if -strategy era is used. For f27monoid, which uses "recursive" ordering KBMAG's "maxstoredlen" option is set in the input file. However, the values specified for this option could not possibly work with MAF, because several of the equations in the confluent set exceed the rhs limit. MAF ignores this KBMAG specific option, and, with no limits on equation length, completes almost instantly. Use of the "pool" is crucial to MAF's success for this and most other input files using a wreath product word-ordering.

In the hyperbolic folder there are many hyperbolic examples of Von Dyck and triangle groups. All the groups are automatic for any choice of generating set, and it would seem that all of them have finite confluent rewriting systems as well, but not all have them for "shortlex" word-ordering. The rule seems to be that all the triangle groups have a finite "shortlex" rewriting system, but the Von Dyck groups only have them when exactly one of the 3 numbers representing the angles of the triangle is even, when the natural 3 generator presentation must be used. All the Von Dyck groups seem to be confluent on a 2 generator presentation using "recursive" ordering. These groups are amongst those discussed in a paper of Le Chenadec from 1986 entitled "Complete group presentations", but the author has not investigated whether the confluent rewriting systems found by MAF are the same as the ones in that paper.

In the pathological folder there are a large number of input files that are all for the same group : #8 in the list of presentations of groups of order 3⁸ given in "Groups of Deficiency Zero". The files vary only in the word-ordering except that in some cases the relators have been reversed. Some of these can be processed quickly, while others seems to be hopeless cases, and a wide range of unpleasant behaviours can be observed. For instance with "rt_recursive" ordering the input file can currently be processed fairly quickly with generatorOrder=[A,a,B,b,C,c] but not with generatorOrder=[a,A,b,B,b,c,C]. Many of these files seem extremely sensitive to minor code changes in MAF, or work well only when the command line options are carefully chosen.

In the recursive subfolder there are more examples using the "recursive" word-ordering For nilp2, nonhopf, heinnilp and verifynilp, a wreath product word-ordering is essential. (These examples also work if they are changed to use an ordering where each generator is at the same level as its inverse. The author has never come across an input file where this is not the case). All these examples are from KBMAG. The last two of these are examples of the use of Knuth-Bendix to prove that a presentation defines a nilpotent group. The verifynilp again show the same difference in behaviour from KBMAG as did f27monoidexample. In KBMAG the maxstoredlen parameter (or equivalently the -mrl option) is used to make kbprog complete more quickly for some examples. MAF has to ignore this setting, because in every one of the examples included with KBMAG that use such a setting, the KBMAG value of the setting prevents MAF from ever finding the correct confluent rewriting system, whereas when the setting is ignored MAF very quickly completes successfully! MAF does allow for such a setting however, in case it should ever prove useful: MAF's equivalent of KBMAG's maxstoredlen is maf_maxstoredlen. However, the author has yet to find a single example where the use of this setting is at all beneficial. In most cases it just prevents MAF from ever finding a confluent set.