Newsgroups: sprouts-theory
From: Dan Hoey
Date: Wed, 24 Dec 2008 15:00:38 -0500
Subject: Re: Sprouts notation

Yper Cube sent me a note requesting input on Sprouts notation suitable for a position library as in GLOP. With his permission, I'm responding to sprouts-theory generally.

Yper Cube wrote:
> I liked the parenthesis idea. It doesn't work in positions arising from
> tori and proj. plane surfaces but maybe it can still be used sometimes.
> (Example: we can't use it in .abcabc.
> but we can write .abacbc.
> as .(b)(b). , can't we?

I think this is a good idea when possible.

> I would also prefer GLOP to use "{" to mark the start of a region and not
> only to mark its ending and "." to mark for boundary starting as well. Like,
> instead of:
> 0.0.AB.}AB.}]!
> use something like
> {.0.0.AB.}{.AB.}]!
> or maybe
> [.0.0.AB.}{.AB.}]
> In a recent sprouts theory post, I mentioned another possible
> technique to compact notation further.
> Use of * or ^ when we have multiple identical boundaries.
> Example. instead of 0.0.0.0.}], use:
> 0*4.}]
> (GLOP uses the above now)
> or
> [{(.0.)^4}]
> or
> [{(.0.)*4}]

I strongly disagree with this. We should save our matching brackets for things that need to be grouped recursively, whereas boundaries within regions within lands are handled by hierarchy. Boundaries within a region should be _separated_ (as with "."), regions should be separated (say, with "/"), and lands should be separated (say, with "%"). The use of a trailing separator is superfluous and should be avoided, so the above would be 0.0.AB/AB . The "!" for end-of-position should only be needed if there's some other thing following the position. A trailing separator should be permitted on input, but ignored.

For nonplanar regions, I suggest a region should be suffixed with +n for T^n or -n for P^n. Since the region will always be followed by "/", "%", "!", or end-of-string, there is no restriction on multiple digits for n.
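To make the separator hierarchy concrete, here is a minimal Python sketch of how a reader might split a position into lands ("%"), regions ("/"), and boundaries ("."), accept and ignore a trailing separator, stop at "!", and pick up a +n / -n surface suffix on a region. The function names and the (boundaries, genus) return shape are assumptions made for illustration; they are not part of GLOP or of the proposal, and the sketch ignores game labels, partial positions, and brackets entirely.

```python
import re

def split_ignoring_trailing(text, sep):
    """Split on sep, dropping the empty piece left by a trailing separator."""
    parts = text.split(sep)
    if parts and parts[-1] == "":
        parts.pop()
    return parts

def parse_region(text):
    """Return (boundaries, genus): genus is +n for T^n, -n for P^n, 0 if planar.

    Assumes "+" and "-" occur only in the optional suffix at the end of a region.
    """
    genus = 0
    m = re.search(r"([+-])(\d+)$", text)
    if m:
        genus = int(m.group(1) + m.group(2))   # multi-digit n is allowed
        text = text[:m.start()]
    return split_ignoring_trailing(text, "."), genus

def parse_position(text):
    """Return a list of lands; each land is a list of (boundaries, genus) regions."""
    text = text.split("!", 1)[0]               # "!" ends the position
    return [[parse_region(region)
             for region in split_ignoring_trailing(land, "/")]
            for land in split_ignoring_trailing(text, "%")]

print(parse_position("0.0.AB/AB"))
print(parse_position("0.AB+2/AB-13%1"))
```

With this sketch, "0.0.AB/AB" and the same string written with trailing separators and a final "!" parse to the same structure, which is the behaviour asked for above.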
> Using parenthesis, we can extend this to include more complex
> objects (than boundaries)
> Like, instead of 0.0.0.0.2.2.2.2.2.AB.CD.EF.}AB.}CD.}EF.} ,use:
> {(.0.)*4(.2.)*5(.AB.{.AB.})*3}

This can be done in several ways without the leading or trailing separators. Perhaps the simplest is to always use a kind of brackets (say "<>"), so that

    <0.>4 means 0.0.0.0
    <1.2A/>2 means 1.2A/1.2A
    <12>32 means 1212122
    <(>3<(<)>2>3 means (((())())()) .

The <...> is always followed by a single digit specifying the repeat. A position canonicalizer may not want to use all of these constructs, but they should be recognized on input.

Your last example in which the parentheses imply a change of variables is somewhat problematic. I think it should be done with a different approach, which I call the "partial position" representation. I've talked about partial positions before--I think they will eventually appear in the database by themselves, along with simplifications such as the ^x:x.0 = ^x:x2 that has been observed before.

A partial position is a land that has some unmatched pivots called "parameters". The parameters are listed in the front of the partial position in a sort of lambda notation, like ^xyz:1xy2z.2 . This refers to a part of a position whose pivots are to be matched up with another part of the position in a way to be determined. Copies of the partial position may be matched up in several different places.

When there is only one parameter, the presence of that parameter outside the partial position denotes a pivot attached to a copy of the partial position. So ^x:xA/A1%xx.x means AB.C/AD/D1/BE/E1/CF/F1 . Partial positions can instantiate other partial positions: ^x:x((1))%^z:x.xz%zzx means ABC/D.EA/D((1))/E((1))/F.GB/F((1))/G((1))/C((1)) .

With a multiple-parameter partial position instantiated more than once, we must specify which parameters go with which instantiation.
In the simplest case, on a plane where the matching pivots appear in the same region (and therefore on a single boundary), the parameters parenthesize, so we can specify parameters as [x...y...z] without ambiguity. For instance, ^xyz:1xy2z.2%xy[z1[xyz]yx]z would mean ABF1GHJEDC/1AB2C.2/1DE2F.2/1GH2J.2 .

In more complicated situations, we can specify multiple parameter sets for a partial position as in ^xyz^tuv:x(y(z))%xtuyvz for ADEBFC/A(B(C))/D(E(F)) . The primary parameter set can still be parenthesized if necessary, but the others must be unambiguously matched with each other.

Finally, I would like a construct for specifying the join of two partial positions. I propose using "=xyz=tuv" , where the first "=" replaces the preceding "%" character. So ^xyzw:xyzw1=xyzw=xywz refers to the (nonplanar) position ABCD1/ABDC1 .

On alphabets: Glop introduced the idea of lower-case characters for pier spots, which can appear on multiple boundaries within a position. Upper-case characters were reserved for pivots. The problem is that we sometimes run out of one kind or another. With the addition of parameter letters, the problem is worse, so I propose a more general approach.

For readability, a program is encouraged to use ABC... for pivots, abc... for labeled pier spots, and xyz... for parameters so far as possible. The rules for recognizing which is which should be loosened so that any letter can be used for any of these purposes.

A letter that appears in a lambda expression "^xyz:" is always a parameter, and those are the only parameters. For multiple parameters in brackets, each parameter must appear just once except within inner brackets. Other multiple parameter sets appearing on a boundary are assumed to match each other, so ^xy%xy.xy means AB.CD/AB/CD; we need ^xy^zw:xy%xz.yw to refer to AC.BD/AB/CD . No other unbracketed parameters of the set may appear on that boundary.
Other sets of unbracketed parameters that appear within a region are assumed to match each other, so ^xy%Axy/A1xy means ABC/A1CD/BC/CD ; again we can't have other stray members of the set in the region. If none of these is possible, each set of parameters can appear only once in a land: ^xy^zw^tu:xz.yt/uz for AD.BE/FC/AB/CD/EF .

Excluding the parameters, a letter that appears twice on a boundary is a pier spot, and may appear only twice on that boundary. Excluding these, a letter that appears on different boundaries in the same region may appear only twice in the region, and the two letters match each other. (This can only happen on nonplanar boards). After all of these are taken into account, remaining letters must appear exactly twice per land, and match each other.

Just in case we end up with a need for more than 52 letters, I believe we should allow a sequence like &word; to be used for a letter, where "word" is composed entirely of letters.

So the grammar we have is

    := | | "!"
    := ( "+" | "-" ) ":"
    := | "=" "=" | "%"
    := "^" ":" | "^" ":"
    := | "%" | "%"
    := | "/" | "/"
    := | ( "+" | "-" )
    := | "." | "."
    := |
    := | "(" ")" | "[" "]"
    := | 1 | 2 | "&" ";"
    := |

This grammar accidentally prohibits extended letters as parameters, but maybe that's a good thing. There are other requirements such as that parameter lists can't repeat a letter, but that is for semantic analysis. The gamelabel is in case someone needs to mark the position as normal or misere or specify an initial number of spots. I suppose I should allow the number or sign to be omitted.

The use of "<>" for repetition is a metalanguage used for abbreviating this language. It is expanded textually first before applying this grammar. We still have a set of brackets "{}" that aren't used, but I find they tend to look too much like parentheses to be very useful. Other symbols such as @~#$*_\,;?'"` may turn out to have some use as well.
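Since the "<>" repetition is meant to be expanded textually before the grammar is applied, it can be handled by a small preprocessing pass. The following Python sketch is one possible way to do that, assuming (as stated earlier) that every <...> is followed by exactly one repeat digit and that groups may nest. The names expand and _expand are hypothetical, chosen only for this illustration.

```python
def expand(text):
    """Textually expand every <body>d group, where d is a single repeat digit."""
    out, _ = _expand(text, 0, top=True)
    return out

def _expand(text, pos, top):
    """Expand from pos until the matching '>' (or end of text at top level)."""
    out = []
    while pos < len(text):
        ch = text[pos]
        if ch == "<":
            # Recurse for the group body, then read the single repeat digit.
            body, pos = _expand(text, pos + 1, top=False)
            if pos >= len(text) or not text[pos].isdigit():
                raise ValueError("'<...>' must be followed by a repeat digit")
            out.append(body * int(text[pos]))
            pos += 1
        elif ch == ">":
            if top:
                raise ValueError("unmatched '>'")
            return "".join(out), pos + 1
        else:
            out.append(ch)
            pos += 1
    if not top:
        raise ValueError("unmatched '<'")
    return "".join(out), pos

print(expand("<12>32"))
print(expand("<(>3<(<)>2>3"))
print(expand("<0.>4"))
```

Because the expansion is purely textual, <0.>4 comes out as 0.0.0.0. with a trailing "." ; that is harmless under the rule that a trailing separator is permitted on input but ignored.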
Finally, I should note that GLOP databases need to have a version number at the top so that older versions can be recognized and converted to newer ones.

Dan