This description is how MAF works is intended to give a flavour of how MAF looks for an automatic structure or a confluent rewriting system, not as a complete description. If you are at all interested you are recommended to refer to the C++ code, which the author has gone to considerable effort to document thoroughly with extensive comments.
This explanation disregards the relatively trivial stages of initially reading the input file, and finally writing the output files. Otherwise the processing of MAF in automata and kbprog falls into three distinct areas, two of which are fairly definitely ordered in time, but all three of which are essentially co-routines of each other.
In fact a better term for this phase would be the output phase since its code is in fact responsible for creating all the output files that automata may generate - it is for example responsible for turning MAF's very different internal data structures that represent a confluent rewriting system into the corresponding .kbprog and .reduce files). Word-difference machines are also turned from an internal representation into their output form via this code.
In fact the distinction between these three areas of processing, though helpful in terms of understanding how MAF works, and for relating MAF to KBMAG, is much less clear in practice than the previous section suggests.
For example, Difference Tracker may quite strongly affects how the Knuth-Bendix phase proceeds, since MAF uses information that this phase returns to help decide which equations should be considered next. Difference Tracker is also quite capable of itself discovering new equations (because it enforces various consistency requirements in the word-difference machine), and Difference Tracker will ask Node Manager to insert these just like equations discovered in the course of Knuth-Bendix completion (though both Job Manager and Equation Manager will get a say as to when and if this gets done). Conversely, Node Manager may detect errors in the word-difference machine, which Job Manager will then schedule for correction, and at the appropriate time it will inform Difference Tracker of the details of the error to be corrected, and ask it to perform the correction. In the event of a major collapse Rewriter Machine may decide to delete Difference Tracker altogether, and then recreate it when the rewriting system has stabilised again.
Again, even the distinction between the Knuth-Bendix phase and the automatic structure phase is not clear-cut. For example, if it is discovered during the latter phase that the rewriting system should be confluent, the automatic structure phase will temporarily abort itself and the Knuth-Bendix phase is resumed until confluence really has been achieved. Also "corrections" discovered by the various checks in the automatic structure phase are again fed into the Rewriter Machine, and not just into the word-difference machine (as they would be in KBMAG), and this can result in the discovery of many further equations and word-differences. And, in principle at least, MAF can decide that it made a poor decision when it switched to the automatic structure phase, go back to to Knuth-Bendix phase, and then try the automatic structure phase again later. In fact, this is rarely occurs, because the "correction" capabilities of the automatic structure phase are usually strong enough that they can succeed in building the automatic structure even if the Knuth-Bendix phase is omitted entirely. In fact for many "easy" infinite groups this is actually the fastest way of proceeding. The reason for keeping the Knuth-Bendix phase is two-fold:
For groups with many word-differences, especially largish finite groups, MAF's Knuth-Bendix phase is far superior to its automatic structure phase at finding equations quickly, and for such groups attempting to build an automatic structure, which in any case is not really needed since we can find a confluent rewriting system, would fail because the automata would be too big prior to minimisation.
Even for groups where Knuth-Bendix can be omitted (and this includes all of the most difficult examples supplied with KBMAG) doing so may waste time, especially when the group is finite.
The "Knuth-Bendix" phase is far from being a straight-forward Knuth-Bendix implementation:
The Knuth-Bendix implementation does not need to laboriously compare every possible pair of equations. The Node Manager object allows it to quickly create a list of which equations possibly could overlap and where, so that overlaps can be discovered very quickly. Unless you ask MAF to look for non-minimal overlaps, this list will only include the minimal overlaps (an overlap is minimal if no subword that does not include either the first or the last symbol in the overlap word is reducible). In fact MAF can even afford to build the list of overlaps for any particular equation, and consider these overlaps, multiple times, and it may well choose to do so, because the equations from many overlaps will usually be discarded initially, because they have properties that the "committee" regards as being undesirable.
Node Manager functions as a fast index automaton, which means that when a word is being reduced there is never any need to backtrack except when a rewrite is done. In KBMAG word reduction during the Knuth Bendix phase has to backtrack whenever the index automaton gets to a "no LHS here state", but there are no such states in Node Manager, because "prefix change" transitions are constantly maintained.
As each equation is created it immediately forces all equations in which its LHS occurs as a subword of the LHS or RHS to be either eliminated (if the LHS occurs as a subword in the prefix of the LHS), demoted (if the LHS occurs as a trailing subword of the original LHS), or scheduled for correction (if the LHS occurs as a subword of the RHS). When an equation is demoted MAF needs to consider the overlap now formed by the old and new LHS. A new primary equation may result, and MAF may or may not decide to keep the old LHS, perhaps after updating its RHS. All this processing may result in many new equations.
Doing either of these things has been dismissed as impractical in most other Knuth-Bendix implementations, due to the overhead it entails. In theory MAF might need to examine every node multiple times for every equation it inserts to make fast reduction possible. However MAF maintains data which usually allows it to exclude large parts of Node Manager's data from consideration when this is done. In practice MAF's current approach seems to work very well the majority of the time; in fact it is largely only doing the work it would have to do anyway at the time it wanted to find the overlaps of the new equation were the data not maintained as it is.
The updating is quite a delicate operation because Node Manager is modifying itself, and deleting parts of a structure to which there are potentially many outstanding pointers: for example the Difference Tracker holds pointers to word-differences, and these are stored as states within Node Manager. So there is a reference counting scheme to keep such data alive until all the pointers are released: the data is removed from the tree but knows enough about where it came from to be able to reconstruct its former context.
When word-differences become reducible, or when equations which "own" word-differences are eliminated, MAF will tell the word-difference machine to correct itself. The first of these may processes result in the discovery of new equations: the relevant code is very similar to coincidence processing in a coset enumerator, because two word-differences may merge into one, which can have dramatic consequences.
When MAF finds two LHS with a common cancellable final symbol and an identical RHS it creates a new equation (All equations are grouped by RHS so this is obvious to it). (A symbol would be cancellable on the right if there is an equation with an empty RHS in which it is the first symbol of the LHS). MAF is, of course, very careful not to assume cancellation applies generally - it is a program that works with monoids, not just groups). A similar check is performed for LHS with a common prefix, but only if both equations are primary.
Equations created through this and other similar checks, are called deductions (equations discovered by Difference Tracker are also treated as deductions, and right conjugation is handled through a deduction as well). Though all these equations would eventually be discovered by Knuth-Bendix completion, it is beneficial to discover them early as there is very little overhead in doing so.
From time to time MAF may use the current state of the word-difference machine (if it exists) to look for words that ought to be reducible but which Rewriter Machine/Node Manager does not yet know are reducible. Such words often lead to the creation of spurious word differences, so eliminating them helps MAF determine when word-differences are stabilising.
MAF limits the amount of data in Node Manager by putting "uninteresting" long new equations into the pool, or even discarding them altogether. Periodically the pool is checked, and more equations from it are put into Node Manager. This is very important for the good performance of MAF with "pathological" presentations. MAF does discard some equations, but only if they will be reconstructed later on. In each phase of the Knuth-Bendix procedure the various objects that make up Rewriter Machine have to abide by a "contract". Initially the contract is flexible, and allows the "committee" to throw away many equations and ignore unpromising overlaps. Gradually the contract becomes progressively more onerous, forcing MAF to retain more and more data in each phase, so that if there is a finite confluent rewriting system, it will be discovered. This mode of operation makes MAF function rather like a Felsch style coset enumerator, because it will try to discover as many short equations as possible before creating long ones. On the other hand it does consider long overlaps early, so that a short equation that arises from a long overlap will be discovered as soon as possible. This style of processing probably accounts for MAF's outstanding performance with the degen4c presentation
In essence MAF follows a similar scheme to KBMAG for building the automatic structure:
In this outline the algorithm appears simpler than KBMAG's which once it has a word-acceptor at all only builds a new word-acceptor if step 2 fails. In MAF step 2 usually cannot fail because the word-difference machine is maintained in a way that guarantees that if u*x = v is recognised as a reduction, then so also is v*X = u. Building a word-acceptor from a difference machine without this property is the the only reason that building the multiplier can fail, provided the word-acceptor being used has been built from the strongest word-difference machine currently available. MAF will only keep word-acceptors from one iteration to the next if the word-acceptor appears to be stable, and it takes longer to build than the multiplier.
Although this outline omits steps which we shall come onto later, it would be the path taken through the code for "easy" input files.
MAF does usually use the strongest word-difference machine when building the word-acceptor, which contains all possible transitions between all known word differences. However, if building the word-acceptor takes a long time MAF may try to use a weak word-difference machine, which only contains the transitions that have actually been proved to be necessary through their being needed to read the equation using the word-difference machine. In this case building the multiplier can fail. MAF will try to check for this in advance by looking for words that were rejected by the last word-acceptor built from a strong word-difference machine, that are now accepted by the acceptor just built from the weak word-difference machine. Because of all the extra complication this entails MAF will usually only try this once, and decide that using the strong word-difference machine is the best policy, but there are some groups for which building the acceptor from a weak word-difference machine does prove better. For example, for examples/coxeter/5335 trial word-acceptors built from the weak word-difference machine are built far more quickly, and have far fewer states than acceptors built from the strong word-difference machine, and so MAF uses these for most iterations of the automatic structure phase for this group.
The handling of strong and weak word-difference machines is somewhat analogous to KBMAG's use of .diff2 and .diff1 word-difference machines, but the weak word-difference machine usually has many more states and transitions between states than the .diff1 machine would, since it contains all known word-differences and all transitions that have been seen in an equation, whether or not the equation is primary.
In the case where MAF is building an automatic structure for a word-ordering other than shortlex a certain amount of pre-processing is performed against strong word-difference machines to eliminate transitions which ought not to be present in the final correct word-difference machine. This usually improves the chances of building a word-acceptor successfully even though it may sometimes cause the word-acceptor to accept words that could in fact have been rejected.
IdWord
, and wd is a word-difference, have previously unknown reductions. Perform consistency checks on the word-difference machine. Rebuild it if need be.