MAF: Computations on subgroups

MAF : Computations on subgroups

MAF: Computations on subgroups

What are coset systems for?

Computations on whole groups or monoids described how to use MAF to try to solve the word problem for a group or monoid. However, one is frequently concerned with the properties of subgroups of a group. We may want to know what the structure of the subgroup generated by a certain set of elements of the group is, and which members of the original group are actually members of it, and, for the elements that are not, whether two elements are in the same (right) coset of the subgroup. Having a solution to the word problem for the group is not enough to let us solve these problems. The coset membership problem is called the generalised word problem: subgroup membership is then just the special case of whether a word belongs to the same coset as the identity. It is often the case that, even when the word problem for a group cannot be solved, the generalised word problem can be solved for some of its subgroups. Clearly if we can find any subgroup in which two words belong to distinct cosets the words represent different elements of the group. The generalised word problem can be investigated, for a particular group and subgroup, through the use of a coset rewriting system. Coset rewriting systems, which from now on we shall coset systems for short, are very similar to the monoid and group rewriting systems we have been dealing with up until now.

What is different about coset systems?

The main differences between using MAF to perform computations on a subgroup with a coset system, and using it to perform computations on a whole group or monoid with an ordinary input file (which we should here think of as a rewriting system), lie, from the MAF user's point of view, in how the input file for a coset system is created, and in how automata, and the other MAF components that can deal with coset systems are then invoked.

The input file for a coset system is not created by the user directly. Instead it is created by MAF itself, using an input file for the group G containing the subgroup, and a second input file in a new format, also created by the user, which is called a substructure file. The substructure file specifies the generators of the subgroup as words in the group generators, and, optionally, may also give them names (so that the elements of the subgroup can be expressed as words in the subgroup generators). From these two files MAF then constructs the coset rewriting system, and this is used as the input file actually processed for computations involving the subgroup. Since there are several different types of coset system, we shall defer discussion of the nature of the substructure files until later in this section.

In order to process a coset system with automata or another MAF utility, the user will normally specify the command line option -cos. A further command line parameter, indicating the suffix used for the substructure file may also need to be specified. MAF follows KBMAG in insisting that all substructure files should have the same basename as the input file describing the group.

Coset system basics

From now on G will denote a group (or monoid) defined by a rewriting system in an input file, and H will be the subgroup of G in which we are currently interested.

MAF supports three different types of coset system, but before we can explain the differences between them, we first need to introduce term coset equation, and to do this we need to describe some general features of MAF's coset systems, and to point out an important difference from an ordinary rewriting system.

In a rewriting system for G all the equations represent true equations between group elements. However, in a coset system an equation may represent either an equation between elements (of either the group or the subgroup), or an equation between cosets of elements, or possibly both. In order to understand the meaning of the equations appearing in a coset system we must learn to distinguish between various different types of generator, word and equation that may appear in the system.

Generators of coset systems

There are up to three types of generator in a coset rewriting system:

The G-generators, which are the generators of G coming from the original input file for G. These are always present.
The coset symbol (usually either _H or _N), which, is a new, non-invertible generator that represents the subgroup H as a whole. This generator is always present in a coset system.
Finally, and optionally, the H-generators, which are all distinct from the G-generators (even when some of the subgroup generators are also group generators), and which are independent names for the generators of the subgroup H.

Words appearing in coset systems

The words in these generators are now divided into four types:

The empty word (denoted as usual by IdWord)
Words in the G-generators alone (G-words)
Words in the H-generators alone (H-words)
"Mixed" words - all other words in these generators.

The empty word is, uniquely, both a G-word and an H-word.

A mixed word may only appear in a coset system if it has the form u*x*v where u is an H-word (possibly empty), x is the coset symbol, and v is a G-word (possibly empty).

Equations appearing in coset systems

The LHS and RHS of any equation appearing in a coset system must be words of the same type. Thus there are three types of equation in a coset system:

Equations between G-words. All such equations are true as equations between elements of G.
Equations between H-words. Of course these are only present if there are H-generators. All such equations are true as equations between elements both of G and H.
Finally, and most importantly, the equations involving mixed words, which are the coset equations. In all these the coset symbol necessarily appears on both sides of the equation. Coset equations always represent equations between cosets, but whether or not they also represent equations between elements of G depends on the type of the coset system.

Coset equations in the input file generated for a coset system will all have the form x*w=h*x where x is the coset symbol, w is one of the subgroup generators listed in the substructure file, and h is either the corresponding H-generator or the empty word (if H-generators are not present). Usually the equations in output files will take the form x*u=h*x*v where u and v are different G-words, and h is some (possibly empty) H-word. However, other types of coset equation are possible if G is a monoid, or if the subgroup generators are named but do not have named inverses.

An input file created for a coset system by MAF will always be valid. If you choose to edit/create a coset system input file manually, but do not follow this pattern, then MAF will either refuse to treat the file as a coset system, or it may even refuse to process it altogether. In either case diagnostics will be issued so that the location of the violation can be determined.

Word-ordering for coset systems

The word-ordering options for coset systems are much more restricted than they are for input files for groups and monoids. In MAF a coset system will use either "wreathprod", "coset", or "rt_wreathprod" ordering. The word-ordering will be generated automatically based on the ordering given in the base input file for the group. Even though the ordering method in the coset system appears to be different from the original input file, the actual ordering of the G-words will be the same as for the original input file, so long as the original ordering was one of "shortlex", "recursive", "rt_recursive", "wreathprod", or "rt_wreathprod". In all other cases the ordering of the G-words is changed. MAF will issue a warning if this happens.

More flexible word-ordering support will be added to coset systems in future.

The three types of coset system

We are now ready to describe the three types of coset system. The first two types are also supported by KBMAG (with limitations), but the third type is new to MAF.

Simple coset systems

In the first type of coset system, called a simple coset system, a coset equation is an equation of the form. H*u=H*v, and means that the set of of all elements of the form h*u and the set of all elements of the form h*v are equal as sets, where in each case h denotes a member of H. Obviously such an equation does not imply that u=v. In fact equations of this type appear in the coset system in the form [_H*u,_H*v] where _H is the coset symbol.

Computations with this type of coset system are often much easier than for the next.
Coset systems with named subgroup generators

In the second type of coset system, called a coset system with named subgroup generators, we express our coset equations differently: in the case where H*u=H*v is true as a coset equation in the sense just explained, we will then find a specific h such that u=h*v and then write the coset equation as H*u=h*H*v. This obviously implies our first form of the coset equation, and if we remove the H on both the LHS and the RHS, we are left with a true equation of words in the group. Now coset equations appear in the coset system in the form [_H*u,h*_H*v] where u and v are words in the original group generators, and h is a word in the named subgroup generators. Although _H is non-invertible it cancels on the right. This is very important as this is the main method by which equations in the H-generators alone will be discovered.

The main benefits of using this type of coset system are:
1. A subgroup presentation can be computed using the specified subgroup generators.
2. Each element of the group can be expressed in the form h*v where h is a subgroup member and v is a canonical coset representative.
Despite these benefits, a simple coset system may well give better results in practice. This is likely where the subgroup has a very large number of generators. In addition, it can happen that the H-words appearing in the prefix of the RHS of coset equations become extremely long, and in this case it is also usually better to use one of the other types of coset system. Alternatively you can try to add some redundant subgroup generators in the hope of reducing the length of such words. Unless you want to generate a presentation for the subgroup using a specific set of generators, it is sometimes best to avoid coset systems with named subgroup generators.
Normal closure coset systems

The third type of coset system is very similar to the first. However, in a coset system of this type, the subgroup is assumed to be normal. Therefore, whenever a coset equation of the form H*u=H*v is found MAF will deduce that all words of the form g*u*v^-1*g^-1 are elements of the subgroup. The advantage of using this type of coset system, which is called a normal closure coset system is that you do not have to know in advance a complete generating set for the subgroup. However, this option must be used with caution, because it could easily happen that the normal closure of a subgroup is not finitely generated, and in this case MAF has no hope of success.

In this type of coset system coset equations take the form [_N*u,_N*v] where _N is now used as the coset symbol instead of _H.

This type of coset system is often useful for determining a finite generating set for a normal subgroup, and is convenient for computing a presentation of the subgroup corresponding to a finite-index quotient. Once a finite generating set has been found you can, if you wish, create a corresponding coset system with named subgroup generators.

Creating a coset system

Cosets of submonoids of monoids do not in general behave in a manner suitable for use with coset systems (or much else for that matter). For example, if L is a submonoid of M and x ∈ L, then it does not even follow that L*x=L - it may be a proper subset of L: as an example consider the monoid of positive integers under multiplication, and any submonoid of the form {N*n ∪ 1}.

We have already mentioned that a coset system is usually generated automatically from the input file for G, and a substructure file describing H, so before we can start using coset systems we must learn how to produce the substructure files they will be built from. This seems to be an appropriate time to point out that a substructure file should always describe a subgroup, even if the original input file is for a monoid, because the notion of cosets is somewhat meaningless for submonoids (as is explained in the side panel). So substructure files might also be referred to as a subgroup file, even though, in the case of a monoid, it would be up to the user to verify that the given set of substructure generators did in fact generate a group and not just a monoid. (This will certainly be true if the subgroup generators are words in generators of G for which an inverse was specified).

It is suggested that the reader refers to MAF: Reference: Substructure files at this point.

Computations for coset systems available in MAF

All the programs in MAF work equally well with rewriting systems that define groups or monoids, and with coset rewriting systems. So here we merely introduce programs and details that were not relevant when considering the use of automata to solve the word problem for groups and monoids. Usually this just means starting the relevant component of MAF with a command line option of -cos, and possibly an explicit subgroup suffix (in the case where your substructure file is not called simply groupname.sub, or as described in Coset system filenames, you can simply supply the program with the full name of the coset system file if you prefer, provided the coset system has been created already.

`gpsublowindex` and `gpsubmake`

The gpsublowindex utility can be used to search for subgroups up to a given index, and so can actually create substructure files for you. The gpsubmake utility can be used to create substructure files for the derived subgroup and power subgroups (the subgroup generated by all elements of the form gⁿ for some n).

`gptcenum`

This is probably a good time to point out that MAF can now also perform coset enumeration using coset systems. This is done using the program gptcenum rather than automata. Coset enumeration is only possible for finite index subgroups, but in this case it will often, though by no means always, be a faster way of verifying that a subgroup has finite index than using automata. Currently gptcenum can only compute a coset table (which can be used to perform word reduction), a coset word-acceptor, and a subgroup presentation on Schreier generators. More facilities will probably be added to gptcenum in future releases, and it is possible that it will be enhanced at some point to understand either ACE input files, or the GAP files used with the GAP ACE interface.

`automata`

The aim of automata, when used with a coset system, is either to find a confluent rewriting system for it, or to generate an automatic structure for it, or preferably both. There are only a few differences in the operation and use of automata from what was described in MAF: Computations on whole groups and monoids. As before, either, neither, or both of these structures may exist. However, there are a few new features:

In the case of finite index subgroups, it is possible to get automata to stop when it has found a rewriting system that is "partially confluent". Refer to Confluent and partially confluent coset systems below for further information about this.
automata will also perform some additional computations for a coset system that have no counterpart in an ordinary rewriting system:
- If the subgroup has finite index, and has sufficiently small index, the coset table is computed.
- It will attempt to compute a presentation of the subgroup, either on Schreier generators (when the subgroup generators are not named), or on the named subgroup generators (when they are).
- If an automatic structure has been computed for the group, automata will attempt to compute a word-acceptor which accepts precisely those words which are accepted by the word-acceptor for G and which are elements of H. This automaton will also be computed if a word-acceptor for the group has been computed and the subgroup has finite index and the coset table has been computed, whether or not an automatic structure has been computed.

Confluent and partially confluent coset systems

If the output coset system system is confluent in the strict sense, then the various types of equation present in a coset system will be confluent individually in the following sense.
1. The G-equations will form a confluent system for G, just as they would if automata had been run on the original rewriting system for G.
2. The equations involving the coset symbol _H together with the G-equations will be sufficient to reduce any word of the form _H*w, with w a word in the G-generators, to _H*w′ (or h*_H*w′ if H-generators are present), where w′ is the least word (under the ordering being used) in the G-generators that lies in the coset Hw. If H-generators are present, then the equation w = h*w′ is a true equation in G, and h is the least word in the H-generators for the corresponding element of the subgroup.
3. The H-equations, if present, will form a confluent rewriting system for H (and so we shall, in particular, have computed a (probably highly redundant) presentation for the subgroup H).
A confluent rewriting system for a coset system may well contain a great deal of information that is not required in order to find the coset representative of any group element. For, example in the extreme case of a subgroup of index 1, we do not need any G-equations at all to find the coset representative of a group element.
In a "partially confluent" coset system, neither the G-equations, nor the H-equations (if present) are necessarily confluent (though it is possible that either one or the other set might be). However, the set of equations is confluent in so far as for any G-word u, the least element of the coset containing u (in the reduction ordering), can be found unambiguously by rewriting the word _H*u where _H is assumed to be the coset symbol for the system. However, if subgroup generators are named, the H-word appearing in the reduced word for _H*u might not be irreducible.

MAF can recognise this limited type of confluence when either the -detect_finite_index or -prove_finite_index option is supplied to automata. It arises when the _H language is finite (the subgroup has finite index), and all overlaps of the G- and _H- equations which are needed to reduce words of the form _H*w*g where w is a coset representative are locally confluent. The H-equations, if present, are not relevant, because no attention is paid to the H-words appearing on the RHS of coset equations in this situation. If MAF outputs such a coset system, it will be given the usual .kbprog suffix, but the "isConfluent := true" field normally found in such output files will be missing.

For coset systems with named subgroup generators the command line options -no_h and -ignore_h_length are frequently useful. Surprisingly, the subgroup presentations that are generated by MAF are often better (i.e. smaller), when the -no_h option is used, than when it is not.

For full details of all thse options refer to Options for coset systems.

Automatic structures for coset systems

It is difficult to know whether an automatic coset structure does actually exist for a coset system: it is neither a necessary nor a sufficient condition that an automatic structure exists for the whole group, though it is certainly more likely that MAF will succeed in finding such an automatic coset structure if an automatic structure for the group has already been found. Of course, a coset system for a subgroup of finite index is always automatic.

In order for an automatic coset structure to exist the the set of all word-differences arising from the G-word and coset equations required for reducing words to the coset representative must be finite. (Word-differences are not computed for H-word equations). For this to be the case it must also be true that the set of all words h in H that occur in the rewriting equations of the form [_H*u,h*_H*v] must also be finite, since these words appear as initial states of the word-difference machine. (Even if H-words are not present in the coset system, the initial states will be different if different subgroup elements are required to make the coset equation true as a group equation)

Subgroup presentations

If MAF succeeds in finding an automatic coset structure for a coset system then it is possible to compute a presentation of the subgroup described by the substructure file whether or not the subgroup has finite index. It is also possible to compute a subgroup presentation if the subgroup has finite index, even if an automatic coset structure is not computed, or if the coset system is confluent and uses named subgroup generators (whether or not the subgroup has finite index).

automata always computes a subgroup presentation if it is possible to do so. It outputs two files for the presentation, a file with the suffix .rws, which is an ordinary input file for the group described by the substructure file, and a file with the suffix .pres, which is a file containing GAP source code that creates the group as a FpGroup. This file is primarily created for compatibility with KBMAG. It can be processed with GAP's Tietze transformation functions, but you will probably find it more convenient to use MAF's simplify program against the .rws file instead.

It is sometimes convenient to be able to perform the computation of a subgroup presentation separately, and you can use the utility gpsubpres to do this. For example, you might have used a coset system with named subgroup generators, only to find that the equations in the subgroup presentation are horribly long, so that it might be better to have a presentation on Schreier generators. Or, you might have edited the file containing the presentation, or run simplify against it, and then found you wanted to recover the original presentation. Occasionally it might happen that the subgroup presentation takes a very long time to compute, and you might decide to interrupt automata and try to compute it using gpsubpres with new options.

The KBMAG work-alike autcos only computes a subgroup presentation if the -p option is specified, and only in the GAP source code format, and only when an automatic structure is computed. As noted elsewhere this presentation will always be on Schreier generators: the information needed to compute a presentation on the user generators is missing from the automatic structures created by autcos.

In fact the presentations produced by MAF will not be usually be very redundant (at least for low index subgroups) and in fact for simple examples may form an independent set of axioms for the subgroup. For subgroups with a high index the number of generators can be very high if Schreier generators are used: for example, while investigating the group in examples/unknown/g9(1_3), whose order is unknown, the author tried to compute a presentation of a subgroup of index 9576, which is known to be a perfect group. There are fairly small generating sets for this subgroup, but it does not seem to be at all easy to calculate a presentation using them: using named generators results in a coset system which quickly produces unmanageably large equations. With unnamed generators the coset multiplier and table can be found quickly, but a presentation computed on the Schreier generators has in excess of 17000 generators.

Creating subgroup word-acceptors

A subgroup word-acceptor for a subgroup H of a group G is a finite state automaton with alphabet equal to the monoid generating set for G, which accepts precisely those words that lie in H and are accepted by the word-acceptor for G. Thus they accept a unique word for each element of H. If such an acceptor can be constructed then an arbitrary word in the G-generators can then be tested for membership in H by first reducing it to its G-normal form (using reduce) and then testing for acceptance by the word-acceptor for H. Thus the membership problem can be solved for H. MAF creates such word-acceptors by first creating a "membership" FSA which tries to decide this question for an arbitrary word (this FSA is similar to a coset table constructed by a Todd-Coxeter coset enumeration). However, this latter membership FSA can only ever be correct if the index of the subgroup is finite (because otherwise it would have to have an infinite number of states). In the infinite case, for some words this FSA will reach the failure state, which means the coset for the word is unknown (though it would be possible to find it if the coset rewriting system is either confluent or automatic). Yet, it may happen that when we restrict ourselves to considering the language, L₀, of reduced words of G, that we may be able to create a more limited membership FSA that says whether an irreducible word is in the subgroup (and this is the subgroup word-acceptor), because it may be the case that the set of all cosets corresponding to prefixes of irreducible words that are members of the subgroup is finite. This set, which we shall call C_H = {H*w(i) : w ∈ L₀ and w ∈ H and i ∈ N} (where w(i) denotes the prefix of length i of w or w if i > l(w). In this case we can construct a subgroup word-acceptor by "anding" the word-acceptor for the group with a suitably large provisional "membership" FSA. Indeed it is not to hard to prove that this is both a necessary and sufficient condition for the subgroup word-acceptor to exist, given the existence of a word-acceptor for G in the first place.

Even so, whether or not we can construct such a word-acceptor is rather a tricky question in practice. Even if we have successfully computed a confluent or an automatic structure for a coset system, (in which case we certainly can compute whether a reduced word w lies in H, by the simple expedient of reducing _H*w), we might not be able to construct the desired FSA, because the set C_H might not in fact be finite. Conversely, even when we cannot find a confluent or an automatic structure for a coset system, then, at least in the case where G was automatic, if C_H is finite, we can construct the subgroup word-acceptor, through the use of a suitable large provisional coset rewriting system.

Theoretically, if G is automatic, and an automatic structure has previously been computed, then this method will work precisely when H is a quasiconvex subgroup of G. The algorithm used for constructing a subgroup word-acceptor is as follows:

If the subgroup has sufficiently small finite index construct the coset table and and this with the word-acceptor for the group. In this case there is nothing further to be done.
Otherwise, construct the general multiplier in the group G for the subgroup generators of H. (We can always do this since G is automatic). Let us call this FSA GM_GH.
Construct a "membership" FSA M_H which we hope can recognise all the cosets C_H.
Then construct a candidate subgroup word-acceptor W_H by "anding" M_H with the the word-acceptor, W_G, for the group.
Clearly W_H should be closed under the multiplication in G defined by GM_GH. So we construct a candidate general multiplier in H by intersecting GM_GH and W_H, and this should contain an accept state for every word accepted by W_H and every sub-generator. In fact it should contain such an accept state whether we form either the "left" or the "right" "exists" FSA for the multiplier. If it does not, then we can find one or more G-reduced words that are in H but are not accepted by W_H, which means that our M_H did not have a big enough language. We then return to step ii), and use these words to improve M_H.
If, as we hope, C_H really is finite, then eventually this process will terminate, and not only have we found the subgroup word-acceptor, we have found a multiplier for it, and so have proved that the language it accepts really does define a subgroup.

When automata is attempting to analyse a coset system, its primary aim is to find a confluent rewriting system or automatic structure for it. In the latter case if it succeeds in doing so, it will then attempt to construct a subgroup word-acceptor, using the algorithm described, provided the automatic structure for G has previously been computed. However, in other cases it does not do. The program gpsubwa can be used to attempt to construct a subgroup word-acceptor in the case were automata only found a provisional (or possibly a confluent) coset rewriting system, but G itself already does have an automatic structure.

KBMAG compatibility issues for coset systems

Word-ordering

In KBMAG's implementation of coset systems, use of "wreathprod" ordering is mandated. The intention behind this is to ensure that all equations produced by Knuth-Bendix completion will take the desired form: [_H*u,h*_H*v]. This is the case if the original input rewriting system uses "shortlex" ordering. However, if "recursive" ordering is used, then "wreathprod" ordering does not work. The file ab2_recursive in the examples/subgroups directory demonstrates this. KBMAG's kbprogcos will not work properly with the coset system at all. This happens because in wreath product ordering h*_H*b is greater than _H*b*a. KBMAG assumes it can move the H-word from the LHS to the RHS of this equation, when clearly it cannot. KBMAG should not allow coset systems to based on input rewriting systems that do not use "shortlex" ordering.

KBMAG would also change "rt_recursive" ordering in the original input file to "wreathprod" ordering in the coset system. This gives a different ordering of G-words, whereas MAF will change the ordering to an identical "rt_wreathprod" ordering. So, in general, it is unwise to attempt to use coset systems created by KBMAG in MAF. Instead the coset system should be regenerated. (This would usually happen automatically, unless you specify an explicit coset system filename rather than relying on the -cos option).

MAF is able to deal this problem: it does this by examining the level field of the coset system. If it determines that problems with word-ordering will arise then it silently changes the ordering method from "wreathprod" to "coset". This is an ordering method new to MAF which works as follows: to compare two words u,v first split each word into its initial H-word and its final G-word. First compare the G-words. If these are not equal then their ordering determines the ordering of the whole word. If the G-words are equal parts compare the H-words.

This approach would work for any method of ordering G-words and H-words, so MAF could in principle be much more flexible in the range of word-ordering methods it supports for coset rewriting systems. It is the author's intention to support this in future.

The problems that arise when the original input file uses "recursive" or "wreathprod" ordering do not arise with the "rt_recursive" and "rt_wreathprod" orderings, because ties between equal high level words are decided on the right rather than the left.

Automatic structures

MAF can compute automatic structures for any type of coset system (unlike KBMAG, which only supports them for coset systems with named subgroup generators), and can also generate a subgroup presentation and subgroup word-acceptor equally well with any kind of coset system.

Subgroup presentations

KBMAG cannot compute a subgroup presentation on the specified subgroup generators, even when they are named. This limitation also applies if you use MAF's autcos or autgroup to create an automatic coset structure. Even if you then invoke gpsubpres separately any subgroup presentation computed from such an automatic structure will be on Schreier generators, because the information required to compute a presentation on the named subgroup generators is missing from the automatic structure.

Examples of Knuth-Bendix on coset rewriting systems

The examples/subgroups directory contains a number of examples of rewriting systems for groups together with some defined subgroups.

examples/wallpaper contains a presentation of the 244 (or more accurately, in terms of the presentation used, the 424) triangle group, together with subgroup files for each of the crystallographic wallpaper groups which occur as its subgroups - these can be used to generate nice presentations of each of these groups: there is at least one subgroup file for each group that will generate a confluent and automatic presentation. There is a similar set of presentations of the wallpaper groups which occur as subgroups of the 236 triangle subgroup.

examples/hyperbolic contains some subgroup files which work with many different hyperbolic triangle groups. These can be used to find analogous subgroups to some of the Euclidean wallpaper groups.