## The Wheels hit a bump

To recap, the hypothesis is that the VMS text was written by use of a number of cipher wheels, each wheel containing a number of glyphs from which none or one was used. In addition, it was theorized that one of the wheels contained just the Gallows glyphs. The attractiveness of this hypothesis can be summarized as follows:

- The lengths of words created using a set of wheels in this way should be binomially distributed. This is the case, and was first observed by Stolfi.
- The number of words containing a gallows glyph should be about 50% (since the gallows wheel is chosen 50% of the time). This is approximately true (the VMS has about 60%).
- The number of words in the VMS that begin with a gallows glyph is about 13%, and this closely matches the number obtained when the wheel containing the gallows glyphs is the third wheel.
- The average length of a word that starts with a gallows glyph should be shorter than the average length of all words that contain a gallows glyph (since the first two wheels were not used). This is also true, by one glyph on average.
- If the gallows glyphs only appear on one wheel, then gallows glyphs should never appear next to one another in a word. This is approximately true: there is one case in the VMS where two gallows glyphs appear together.

This may in fact be two words: EVA ot and EVA kchedy. (Aside: the challenge of deciding where one VMS word ends and the next begins is well known – what is a space, how big is it, and did the transcribers get it right?!)

So far, so good. But from the wheels hypothesis we can make another prediction: for words containing a gallows glyph, there should be at most two glyphs preceding the gallows, if the gallows are all on the third wheel. This is not the case: there are many words that have more than two glyphs preceding the gallows, even if you count some glyph combinations such as EVA qo and EVA ch, sh as one glyph.

Another prediction we can make is for the number of words that end with a gallows glyph. This number can be calculated from the wheel number and layout, and it turns out to be much smaller than what is observed in the VMS. Specifically, in the VMS, there are 85 words that end in a gallows glyph (about 1% of all words), but only about 0.1% are predicted.

## Grove Word Lengths

In the previous post, we looked at how the Grove words (words with an initial gallows glyph) are distributed in the VMS, and how their frequency is explained by the use of cipher wheels to generate VMS words.

Marco commented on that post with the astute observation that if this generation scheme is valid then gallows initial words should be shorter than other words, on average, as only wheels 3 onwards are used to create them.

Here are the data: these show the lengths of Grove words compared with the lengths of other words that contain at least one gallows glyph:

This confirms that, yes, Grove words are on average shorter than other gallows words (by about 1 glyph) – perhaps more evidence for the validity of this scheme?

For interest (as requested by Rene), here are the distributions for EVA l and EVA r:

## Fun with Grove Words and Cipher Wheels

What is a Grove word? The answer is a little fuzzy, but simplistically a Grove word is a VMS word that begins with one of the gallows glyphs. These words are often page or paragraph initial. Emma May Smith has a good explanation in her recent blog entry.

Mr. Grove observed the peculiar feature that *some* words beginning with a gallows glyph are also valid words if you remove the gallows glyph. For example, the word EVA **kodaiin** starts with gallows **k**, and **odaiin** is also a valid word.

It turns out that if you look at all words in the VMS, 46% of them have this property: remove the first glyph and you are left with a valid VMS word. Compare this with English, where only around 8% of words produce valid words if you remove the first letter. Making up the 46% we have 38% from non-Grove words (i.e. non-gallows initial), and 8% from Grove words.
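The "remove the first glyph" statistic can be sketched in a few lines. This is a minimal illustration, not the code behind the 46% figure: it assumes the statistic is taken over distinct words, that each EVA character counts as one glyph, and the function name `headless_fraction` and the sample list are made up for the example.

```python
def headless_fraction(words):
    """Fraction of distinct words whose tail (first character removed)
    is itself a word in the same vocabulary."""
    vocab = set(words)
    hits = sum(1 for w in vocab if len(w) > 1 and w[1:] in vocab)
    return hits / len(vocab)


# Tiny illustrative sample, NOT real VMS statistics.
sample = ["kodaiin", "odaiin", "daiin", "chedy", "qokedy", "okedy"]
print(headless_fraction(sample))
```

Running the same function over a full transcription word list (and over an English word list) would give the two percentages being compared here.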

To round out the statistics, about 13% of all VMS words have an initial gallows glyph.

Consider the nine wheels above, where one of the wheels contains gallows glyphs, and the other wheels contain other glyphs. These wheels can be selected in 2^{9} -1 i.e. 511 different ways, to make words of length between 1 and 9.

The probability of selecting wheel 3 as the first wheel for the word is about 12.5% (0.5 × 0.5 × 0.5: skip wheel 1, skip wheel 2, use wheel 3). In other words, with these 9 wheels, 12.5% of the time we’d create a gallows-initial “Grove” word – very close to what we observe in the VMS (13%). In fact, this figure of 12.5% is essentially independent of the number of cipher wheels: as long as there are at least three wheels, they are used left to right with a 50% chance each, and the gallows glyphs fully occupy the third wheel, then about 12.5% of the generated words will be Grove types.
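As a check on that figure, here is a small brute-force sketch that enumerates every non-empty selection of wheels and counts those whose first used wheel is the gallows wheel. The function name `grove_fraction` is hypothetical, and the sketch assumes every non-empty selection is equally likely. Because the empty selection is excluded, the result is 2^(N-3) / (2^N - 1) rather than exactly 1/8: for nine wheels that is 64/511, about 12.5%, while for very few wheels it is noticeably higher (1/7 for three wheels).

```python
from itertools import product


def grove_fraction(n_wheels, gallows_wheel=3):
    """Fraction of non-empty wheel selections whose first used wheel
    is the gallows wheel (wheels are numbered from 1)."""
    grove = total = 0
    for picks in product((0, 1), repeat=n_wheels):
        if not any(picks):
            continue  # the empty selection produces no word
        total += 1
        first_used = picks.index(1) + 1
        if first_used == gallows_wheel:
            grove += 1
    return grove / total


for n in (3, 6, 9, 12):
    print(n, round(grove_fraction(n), 4))
```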

As a corollary, it’s clear that for Grove types generated with the wheels, removing the first glyph will produce a valid word, as it is equivalent to generating a word starting at wheel 4 or later.

So what of the 54% of VMS words for which removing the initial glyph does not produce a valid VMS word? This can be explained if the number of different words used and written in the VMS is simply less than the total number of possible words that the author’s wheels can produce. What is the expected vocabulary size if we know there are 7,552 words written in the VMS (Takeshi), and we are missing 54% of them? It is simply 1.54 × 7,552 = 11,630 words, or thereabouts.

(Aside: the wheels above could just as easily be represented and used as a table with nine columns.)

In summary, “Grove” words (gallows initial) are ~13% of all words in the Voynich manuscript, and this fraction is what you’d expect if the text was produced using cipher wheels.

## Word Length Distributions

In the previous blog post, we looked at the distribution of word lengths in the EVA transcription, and compared it with the binomial distribution for 9, as per the work of Stolfi. They matched well enough, as I had denoted EVA ch, sh, ain, aiin and qo as single glyphs, in a similar fashion as Stolfi:

“For this page, we will define symbol as Currier did; i.e. EVA ch and sh will be counted as single symbols, and so are EVA cth, ckh, etc.” (https://www.ic.unicamp.br/~stolfi/voynich/00-12-21-word-length-distr/)

i.e. he reduced some of the EVA glyph sequences to single symbols.
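A minimal sketch of such a reduction, assuming a greedy longest-match over the five groups named above (ch, sh, ain, aiin, qo). The function name `reduced_glyphs` is made up for illustration, and greedy matching is an assumption about how the reduction was applied.

```python
# Groups treated as single glyphs, longest first so that "aiin" wins over "ain".
REDUCED_GROUPS = ("aiin", "ain", "ch", "sh", "qo")


def reduced_glyphs(word, groups=REDUCED_GROUPS):
    """Split an EVA word into glyphs, treating the listed groups as one glyph."""
    glyphs, i = [], 0
    while i < len(word):
        for group in groups:
            if word.startswith(group, i):
                glyphs.append(group)
                i += len(group)
                break
        else:
            # No group matches here: the single EVA character is one glyph.
            glyphs.append(word[i])
            i += 1
    return glyphs


print(reduced_glyphs("qokaiin"))       # reduced length 3, raw EVA length 7
print(len(reduced_glyphs("chedy")))    # reduced length 4, raw EVA length 5
```

The reduced word length is then `len(reduced_glyphs(word))`, while the unreduced length is simply `len(word)`.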

Without making these reductions, so leaving the EVA transcription unchanged, the distributions of course tend to higher values. As a check of my sanity, Marco Ponzi was kind enough to send me a list of VMS words he’d extracted from the ZL transcription, so that I could compare it with the words I extracted from the Takahashi EVA. In the following plot I show the three word length distributions: EVA, ZL and the reduced EVA with ch, sh, ain, aiin and qo as single glyphs.

Reassuringly, the EVA and ZL (green and blue curves) match quite well, as they should, and the Reduced matches Stolfi’s result. (Curiously, the ZL transcription has a total of 8078 different words, compared with 7552 for Takahashi EVA – which warrants further investigation.)

The EVA distribution now matches a binomial of (n=12,p=0.5), i.e. using 12 cipher wheels with a probability of 50% for a glyph being used from each wheel.

## Nine Cipher Wheels

UPDATE (12 Aug 2021): the plots and results discussed in this post used a version of EVA that replaces some common glyph sequences by a single glyph, namely ch, sh, ain, aiin, and qo. Clearly, this tends to reduce the average word length. A later post will discuss the distributions obtained with words without this simplification.

The lengths of VMS words follow the binomial distribution for 9, as observed by Stolfi, and as discussed in Rene’s recent paper. This binomial distribution can be obtained from a set of 9 cipher wheels, where each wheel has a 50% chance of contributing one of its glyphs to the word being assembled, and the lengths of the resulting words plotted:

In the above plot, the orange line shows the distribution of word lengths from EVA, and the blue line shows the distribution of word lengths obtained by using the following set of 9 cipher wheels to generate a large number of random words:

With the cipher wheels shown, about 50% of the generated words will contain a gallows glyph, and this is, perhaps not coincidentally, the case in the VMS text, too.
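The link between the wheels and the binomial shape can be sketched with a short simulation. This is an illustration only: it models word lengths, not glyphs, because the wheel contents are shown only in the figure. Dropping zero-length "words" and renormalising the binomial accordingly are my assumptions, and the function names are hypothetical.

```python
import random
from collections import Counter
from math import comb


def simulate_lengths(n_wheels=9, p=0.5, n_words=100_000, seed=0):
    """Simulated word-length distribution: each wheel contributes
    one glyph with probability p; empty words are discarded."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_words):
        length = sum(rng.random() < p for _ in range(n_wheels))
        if length:
            counts[length] += 1
    total = sum(counts.values())
    return {k: counts[k] / total for k in range(1, n_wheels + 1)}


def truncated_binomial(n, p):
    """Binomial(n, p) probabilities for k = 1..n, renormalised without k = 0."""
    pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
    norm = 1 - pmf[0]
    return {k: pmf[k] / norm for k in range(1, n + 1)}


sim = simulate_lengths()
pmf = truncated_binomial(9, 0.5)
for k in sorted(pmf):
    print(k, round(sim[k], 3), round(pmf[k], 3))
```

The two columns should agree to within sampling noise. Comparing either of them with the measured EVA length distribution is the test described above.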

Using the same technique as applied in my earlier blog post, where I looked at the counts between gallows glyphs in the VMS text, we can look at the same distributions for words generated with the above wheels, assembled into lines of text, and ignoring spaces between words. The results are very similar, and shown below.

Here are the others:

## Cipher Wheels – Genetic Algorithm – Some Results

The latest run of the GA produced the following prediction after 5000 Epochs: 12 cipher wheels, covering 96% of all VMS words, as shown below.

The GA was free to use up to 13 wheels, and as few as 3: these 12 are the best fit it found. The blue segments shown are those that can be used to create the example word “pchodol”, i.e. “p” from Wheel 1, “c” from Wheel 3, “h” from 4, “o” from 7, “d” from 8, “o” from 10, and finally “l” from 12.

There is some redundancy: another way of making “pchodol” is to take “p”(1), “c”(5), “h”(6), “o”(7), “d”(8), “o”(10), “l”(12).
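This kind of redundancy can be enumerated mechanically. The sketch below lists every way of spelling a word left to right using at most one glyph group per wheel. The 12 wheels in `TOY_WHEELS` are hypothetical: they are NOT the GA's result (which is only shown in the figure), just a toy layout arranged so that both decompositions of “pchodol” mentioned above exist. The function name `encodings` is also made up.

```python
def encodings(wheels, word, start=0):
    """Yield each way to spell `word` using at most one glyph group per wheel,
    wheels used left to right. Each way is a list of (wheel_number, group)."""
    if not word:
        yield []
        return
    for i in range(start, len(wheels)):
        for group in wheels[i]:
            if word.startswith(group):
                for rest in encodings(wheels, word[len(group):], i + 1):
                    yield [(i + 1, group)] + rest


# HYPOTHETICAL wheels for illustration only.
TOY_WHEELS = [
    ("p", "f", "t", "k"),  # 1
    ("o", "a"),            # 2
    ("c", "cth"),          # 3
    ("h", "t"),            # 4
    ("c", "ckh"),          # 5
    ("h", "ee"),           # 6
    ("o", "a"),            # 7
    ("d", "k"),            # 8
    ("y",),                # 9
    ("o", "e"),            # 10
    ("r",),                # 11
    ("l", "y"),            # 12
]

for way in encodings(TOY_WHEELS, "pchodol"):
    print([wheel for wheel, _ in way])
```

With this toy layout there are three ways, because “c” from Wheel 3 can also be followed by “h” from Wheel 6. Counting the ways per word would give one measure of how redundant a wheel set is.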

## Observations on the Wheel Configuration

The un-benched Gallows glyphs “p”, “f”, “t”, “k” all appear in the first Wheel. They also appear in Wheels 4 and 5, and Wheels 8, 9 and 10. It’s curious that the GA has divided them up in the later Wheels. Their appearance in the first Wheel covers the Grove words. Their appearance later allows for the Gallows word with preceding glyphs.

The GA has only found it necessary to include two of the benched Gallows: “cth” (3) and “ckh” (5). The other benched Gallows “cfh” and “cph” can of course be formed from “c”, “h”, “p” and “f” in the wheels. Why did the GA find it expedient to include “cth” and “ckh” as unique glyphs?

The GA was allowed to use the glyphs “in” and “iin” as if they were single glyphs, but it has not seen the need, perhaps suggesting that they are not single glyphs after all.

The existence of “ee” and “e” in Wheels 6 and 10 is probably a result of the scoring system employed, that tries to only take at most one glyph from each wheel, otherwise “ee” could be made simply by taking “e” twice – but that is penalised.

## A Comment on Repeating Words

**f75r** in the VMS contains the infamous word sequence “qokedy qokedy qokedy”. The Wheels allow qokedy to be created in three different ways (shown below). This is thus a possible explanation of how that sequence occurs – when ciphering three different plaintext words using the Wheels, which Wheels and which positions to use will be different, but the end result is the same VMS word repeated.

## Cipher Wheels – Genetic Algorithm

In my last blog post I talked about a Genetic Algorithm (GA) that works with a set of chromosomes. Each of the chromosomes comprises and is defined by a set of cipher wheels containing VMS glyphs. The goal is to find a chromosome whose set of cipher wheels is able to reproduce as many VMS words as possible, following the rules to be described below.

The hypothesis behind this investigation is that the VMS scribe was using a set of cipher wheels to cipher a plaintext. If the cipher wheels can be identified purely by looking at their output then it would allow further investigations such as what their positions would need to be for each word along the lines of the folio, whether there is a preferred set of wheels for Currier A and Currier B, whether there is a set of wheels that fit the labels better, and so on.

When each chromosome is created it is assigned a number Nwheels of cipher wheels, where **2 < Nwheels < 13**. For each of the Nwheels, the chromosome places a number of glyphs or glyph groups, Nglyphs, around the wheel, where **2 < Nglyphs < 25**. The glyphs are taken at random from the following EVA set:

Master glyph groups

q d l r s n x i m g c s ee a y o e h ch sh ee in iin t p k f cth cph ckh cfh

I’ll refer to the above as “glyph groups”: most are single glyphs but some are groups of two or three glyphs. (The above set is the Stolfi Core/Crust/Mantle EVA glyphs, plus a couple of extras like “in” and “iin”.) The initial population of chromosomes is of size **P** (typically 200).

Here is an example chromosome showing its 8 wheels of 8 glyph groups (this is just one chromosome from the population):

A successful chromosome will be able to encode any word from the VMS, using the following prescription:

- The wheels are used from left to right
- Each wheel may be skipped, or it can be used more than once

In the example given above, the VMS word qoteeody is encoded by selecting “q” from Wheel 1, nothing from Wheel 2, then “o”, “t”, “ee”, “o”, “d” and “y” from Wheels 3 to 8.

Here’s a different chromosome:

This chromosome has 10 wheels, each of 6 glyph groups. (The chromosomes are not required to have the same number of glyph groups in each wheel.)
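The encoding prescription (wheels used left to right, each wheel skipped or used more than once) can be sketched as a small memoised search. The 8 wheels in `TOY_WHEELS` are hypothetical stand-ins, not the chromosome in the figure, and `can_encode` is an illustrative name rather than the GA's actual routine.

```python
from functools import lru_cache


def can_encode(wheels, word):
    """True if `word` can be spelled by visiting the wheels left to right,
    taking zero, one or several glyph groups from each wheel."""
    wheels = tuple(tuple(w) for w in wheels)

    @lru_cache(maxsize=None)
    def solve(wheel_idx, pos):
        if pos == len(word):
            return True
        if wheel_idx == len(wheels):
            return False
        # Option 1: take a matching group from this wheel and stay on it.
        for group in wheels[wheel_idx]:
            if group and word.startswith(group, pos):
                if solve(wheel_idx, pos + len(group)):
                    return True
        # Option 2: skip ahead to the next wheel.
        return solve(wheel_idx + 1, pos)

    return solve(0, 0)


# HYPOTHETICAL 8-wheel chromosome for illustration only.
TOY_WHEELS = [
    ("q", "d", "s"), ("a", "l"), ("o", "ch"), ("t", "k"),
    ("ee", "e"), ("o", "a"), ("d", "r"), ("y", "n"),
]

print(can_encode(TOY_WHEELS, "qoteeody"))   # q(1), o(3), t(4), ee(5), o(6), d(7), y(8)
print(can_encode(TOY_WHEELS, "yq"))         # "q" would have to come after "y"
```

A scoring function could call a routine like this for every VMS word and count the successes. Tracking how many groups were taken from each wheel would be needed for the "none or one is preferred" penalty.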

## How does it work?

Each chromosome is assigned a score, by measuring how well it is able to generate a set of VMS words. The score is affected by the following metrics:

- How many VMS words it can successfully reproduce
- The number of duplicate glyph groups amongst the wheels (fewer is better)
- The number of glyph groups per wheel – a uniform distribution amongst the wheels is preferred
- For each VMS word, the number of glyph groups selected from each wheel – none or one is preferred

Once all chromosomes in the population of size **P** have been assigned a score, they are ordered by decreasing score. The top half of the population is retained. The remainder are replaced by **P/2** new chromosomes, created as follows:

- **Crossover:** Take two randomly selected chromosomes from the retained population and mate them together. Suppose we select chromosomes C1 and C2: the mating procedure is to select a wheel at random in each of C1 and C2, then select a set of random glyph groups (slices) from each of the selected wheels, and swap them over. So, C1 accepts a donated set of slices of one of C2’s wheels, and C2 accepts a donated set of slices from C1’s wheels. The donated slices are removed from the donor. Chromosomes C1 and C2 are added to the population.
- **Mutation:** With some probability, select a wheel at random in C1, and a glyph group at random in the wheel, and replace it by one of the master glyph groups. Do the same for C2.
- **Add Wheel:** With some probability add a cipher wheel to C1, of a randomly selected size, and with a randomly selected set of glyph groups. Do the same for C2.
- **Add New:** Add **Nnew** (typically 10) freshly generated chromosomes to make the population have size **P**. This inserts fresh blood into the population.

With the new population, a score is assigned to each of the chromosomes, and the process repeats. In fact there is an intermediate step called “Cull”, which cleans up a bit:

- For each glyph group in each wheel in each chromosome, calculate the number of times that glyph group was used to successfully create a VMS word. If that number is less than a threshold value **MinUses** (typically 10), remove the glyph group from the wheel. This cleans up the wheels so that they do not contain slices/glyph groups that are rarely used.
- For each chromosome, remove any wheel that has no slices/glyph groups left.

The process above defines one Epoch. The GA is allowed to run for many Epochs, but is stopped when the best chromosome (the one with the highest score) doesn’t change over several Epochs.
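One Epoch of this loop can be sketched as follows. This is a simplified illustration, not the actual GA: the mutation and add-wheel probabilities are invented, the Cull step is omitted, the master list is de-duplicated, and the `distinct_groups` score is a placeholder that merely rewards variety (the real score depends on how many VMS words are reproduced). All function names are hypothetical.

```python
import random

MASTER = ("q d l r s n x i m g c ee a y o e h ch sh in iin "
          "t p k f cth cph ckh cfh").split()


def random_wheel(rng):
    return [rng.choice(MASTER) for _ in range(rng.randint(3, 24))]  # 2 < Nglyphs < 25


def random_chromosome(rng):
    return [random_wheel(rng) for _ in range(rng.randint(3, 12))]   # 2 < Nwheels < 13


def take_slices(wheel, rng):
    """Remove and return a random non-empty set of glyph groups from a wheel."""
    picked = set(rng.sample(range(len(wheel)), rng.randint(1, len(wheel))))
    taken = [g for i, g in enumerate(wheel) if i in picked]
    wheel[:] = [g for i, g in enumerate(wheel) if i not in picked]
    return taken


def crossover(c1, c2, rng):
    """Copy both parents, then swap random slices between one wheel of each."""
    a, b = [list(w) for w in c1], [list(w) for w in c2]
    wheel_a, wheel_b = rng.choice(a), rng.choice(b)
    from_a, from_b = take_slices(wheel_a, rng), take_slices(wheel_b, rng)
    wheel_a.extend(from_b)
    wheel_b.extend(from_a)
    return a, b


def next_generation(population, score, rng, n_new=10,
                    p_mutate=0.1, p_add_wheel=0.05):  # probabilities are assumptions
    size = len(population)
    kept = sorted(population, key=score, reverse=True)[: size // 2]
    children = []
    while len(kept) + len(children) + n_new < size:
        for child in crossover(*rng.sample(kept, 2), rng):
            if rng.random() < p_mutate:
                wheel = rng.choice(child)
                wheel[rng.randrange(len(wheel))] = rng.choice(MASTER)
            if rng.random() < p_add_wheel and len(child) < 12:
                child.append(random_wheel(rng))
            children.append(child)
    fresh = [random_chromosome(rng) for _ in range(n_new)]
    return (kept + children + fresh)[:size]


def distinct_groups(chromosome):
    """PLACEHOLDER score: number of distinct glyph groups on the wheels."""
    return len({g for wheel in chromosome for g in wheel})


rng = random.Random(1)
population = [random_chromosome(rng) for _ in range(20)]
for epoch in range(5):
    population = next_generation(population, distinct_groups, rng, n_new=4)
    print(epoch, max(map(distinct_groups, population)))
```

Because the top half is always retained unchanged, the best score can never decrease from one Epoch to the next, which is what makes the "stop when the best chromosome no longer changes" rule workable.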

At the end of each Epoch, the GA prints out some status information, for example:

`Epoch 916 Best score 979.994 Worst -310.009 Good words 2699 / 2814 [0, 0, 1, 2, 1, 5, 8, 172, 9, 2, 0, 0]`

This shows that at Epoch 916, the chromosome with the best score (979.994) was able to successfully reproduce 2699 of the 2814 VMS words it was presented with. The worst chromosome had a score of -310.009. The list of integers shows the distribution of wheel numbers across the chromosome population: 172 have 8 wheels, for example.

In the next post I will show some results.

## Cipher Wheels

Here is a set of three cipher wheels inspired by Rene’s recent arXiv paper, and based on Stolfi’s core-crust-mantle work.

These wheels can account for ~93% of all VMS words if you follow these rules:

- Select none or more glyphs from the outer ring
- Append none or **one** glyph from the middle Gallows ring
- Append none or more glyphs from the inner ring

(The outer ring is Stolfi Crust + Mantle + EVA aoye (**17** glyphs), the middle ring is Stolfi Core (8 glyphs), the inner ring is Stolfi Crust + Mantle + EVA aoye (**17** glyphs).)
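The three rules amount to a simple pattern, which can be sketched as a regular expression. The glyph sets here are my assumption about the ring contents: 17 outer/inner glyphs (q d l r s n x i m g, ch sh ee, and a o y e) and 8 core glyphs (t k p f and their benched forms). The figure may differ in detail, and `fits_three_rings` is an illustrative name.

```python
import re

# ASSUMED ring contents; see the note above.
OUTER = r"(?:ch|sh|ee|[qdlrsnximgaoye])"   # also used for the inner ring
CORE = r"(?:cth|ckh|cph|cfh|[tkpf])"       # the Gallows ring

# Zero or more outer glyphs, at most one gallows, zero or more inner glyphs.
THREE_RING = re.compile(rf"{OUTER}*{CORE}?{OUTER}*")


def fits_three_rings(word):
    """True if the whole word can be produced by the three-ring rules."""
    return THREE_RING.fullmatch(word) is not None


for word in ("qokedy", "daiin", "chcthy", "otkchedy"):
    print(word, fits_three_rings(word))
```

Applying `fits_three_rings` to every word in a transcription and taking the fraction that pass is the kind of count that the ~93% figure represents.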

I do like the number **17** as it jibes with f57v.

## Genetic Algorithm

Practically, it’s extremely doubtful that the above wheels were what was used to write the VMS! What seems more likely is that there were some number of wheels, N, and when a plaintext word was being enciphered only one glyph was used from each of the N wheels, in the style of the middle Gallows ring in the illustration above. What is also likely is that it wasn’t as simple as this, but it’s a good working hypothesis.

The EVA transcription’s VMS words are almost universally 10 glyphs long or shorter (Rene goes into detail about this in his paper), which suggests that N, the number of wheels, is at most 10. Each of the N wheels should have one or more glyphs around it, and when enciphering a word, the user may skip one or more of the N wheels (or each of the N wheels has a position for a null glyph: it amounts to the same thing).

The question then arises, what set of N glyph wheels is able to reproduce all the EVA words in the VMS, when only one glyph may be selected from each wheel?

This is an ideal task for a Genetic Algorithm! Specify a chromosome as having a number of wheels randomly selected between 3 and 10, and for each of those wheels assign between 4 and 24 glyphs (selected from all the VMS glyphs) at random to make up the wheel. Generate hundreds of such chromosomes as the initial population.

To evaluate the fitness of each chromosome, feed it each of the VMS words in turn, and decide whether the chromosome’s wheels can be used to recreate that word. In particular, score each chromosome taking into account:

- The number of VMS words it can successfully reproduce.
- The number of glyphs that need to be selected from each wheel before moving to the next wheel (lower is better, 1 is ideal, zero is also fine).
- The wheel sizes: chromosomes with wheels of all the same size score higher than those with widely different wheel sizes.

Armed with the fitness function, score each chromosome, order the population by decreasing fitness, discard the bottom half, generate new chromosomes by mating the top half, mutate some of them, re-score, rinse and repeat until the winning chromosome reveals the best N wheels for the problem.

Results will be forthcoming …!

## The Gallows and Benched Gallows

Let’s explore the number of glyphs that tend to appear between the Gallows glyphs. As a reminder, the Gallows glyphs are:

To illustrate the counting method, an example line of VMS text is:

The line above can be represented as

Gxxxx-xxGxxx-Gxx-xGxxx-xx-Gxxxx-xGxxxx

where “x” denotes any non-Gallows glyph, “G” denotes any one of the eight Gallows glyphs, and “-” denotes a space. The counting we’ll use ignores spaces and only counts the “x”s. For this study we only consider lines of text as separate units – no counting across line breaks.

So, the count of glyphs between the initial G and the second G is 6, and between the second G and the third G is 3. For the last G in the line, we count the number of glyphs following: so in the line above the count is 4.
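The counting method can be sketched as below. The function name `gallows_gaps` and the set of separator characters are assumptions, and every non-Gallows EVA character is counted as one glyph. Benched Gallows are listed first so that “cth” is matched before “t”.

```python
GALLOWS = ("cth", "ckh", "cph", "cfh", "t", "k", "p", "f")


def gallows_gaps(line, gallows=GALLOWS, separators=" .-"):
    """For each Gallows in a line, count the non-Gallows glyphs that follow it
    before the next Gallows (or the end of the line). Spaces are ignored."""
    text = "".join(ch for ch in line if ch not in separators)
    gaps, current, i = [], None, 0
    while i < len(text):
        found = next((g for g in gallows if text.startswith(g, i)), None)
        if found is not None:
            if current is not None:
                gaps.append(tuple(current))
            current = [found, 0]
            i += len(found)
        else:
            if current is not None:
                current[1] += 1   # glyphs before the first Gallows are not counted
            i += 1
    if current is not None:
        gaps.append(tuple(current))
    return gaps


# The schematic line from the text, with "G" standing for any Gallows.
schematic = "Gxxxx-xxGxxx-Gxx-xGxxx-xx-Gxxxx-xGxxxx"
print([n for _, n in gallows_gaps(schematic, gallows=("G",))])
```

For the schematic line this gives 6, 3, 3, 5, 5 and 4, matching the worked counts above. Collecting these counts per Gallows type over all lines yields the distributions discussed next.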

First of all, we’ll look at the differences between the distributions of counts for the four unbenched Gallows glyphs. The “Mode” value is the most likely/common count.

Some remarks about these distributions:

- The sequence GG i.e. where the count is zero, is very rare: this means Gallows glyphs rarely are found next to one another. (Just as capital letters are uncommonly found together in English prose.)
- Most often, there are between 5 and 7 glyphs that follow a Gallows before the next Gallows appears.
- The shapes of the distributions are similar: they have Full Width at Half Maximum (FWHM) values of around 6 glyphs, and long tails.
- The distributions are unrelated to VMS word lengths, as we are not counting any spaces.

**Observations**

- Gallows EVA p and f tend to be followed by one or two more glyphs than EVA t and k (the Modes are 7, 6, 5, 5 respectively).

What do EVA p and f have in common that EVA t and k do not?

Now let’s look at the Benched Gallows. I will call these T, K, P, F: in EVA they are cth ckh cph cfh. The same counting method is used as above.

Again, some features of the distributions:

- The sequence GG is very rare.
- G is most likely followed by 3 glyphs.
- FWHM are between 3 and 4 glyphs.
- Unlike the un-benched gallows, the count of glyphs following is apparently unrelated to the curlicrossbar.

**Observations**

The number of glyphs that follow a benched Gallows is typically 3, before the next Gallows is encountered. For un-benched Gallows, this number is typically 5 to 7. This does not enthusiastically support the hypothesis that the benched Gallows are simply the same as un-benched gallows with the two adjoining glyphs written separately. For example, take a common sequence, with 7 intervening glyphs:

pxxxxxxxp

is clearly not equivalent to the common sequence for benched of:

PxxxP

since PxxxP written as un-benched would be

cphxxxcph

which only has 5 intervening glyphs. On the other hand there are some relatively common sequences such as:

kxxxxxk

which is equivalent to the common sequence:

KxxxK

The discussion above raises the question: what are the count distributions for the number of glyphs that lie between specific pairs of un-benched or benched Gallows? Is there some peculiarity of the counts between different pairs of gallows?

## Using t-distributed Stochastic Neighbor Embedding (t-SNE) to cluster folios

For this attack we’ll use the Takeshi EVA transcription to count the number of times each glyph appears on each folio. This gives us a vector of probabilities for each glyph, for each folio – the vectors are 24 long, as there are 24 EVA glyphs in the alphabet.

For example, here is the probability vector for f1r:

*1r 28 lines {‘a’: 0.08917835671342686, ‘c’: 0.08216432865731463, ‘e’: 0.05110220440881764, ‘d’: 0.06212424849699399, ‘f’: 0.00501002004008016, ‘i’: 0.08617234468937876, ‘h’: 0.12324649298597194, ‘k’: 0.045090180360721446, ‘*’: 0.012024048096192385, ‘m’: 0.001002004008016032, ‘l’: 0.03507014028056112, ‘o’: 0.11923847695390781, ‘n’: 0.050100200400801605, ‘p’: 0.012024048096192385, ‘s’: 0.06412825651302605, ‘r’: 0.04408817635270541, ‘t’: 0.03907815631262525, ‘y’: 0.07915831663326653}*

(This reads as glyph “a” appears 8.9% of the time on f1r, glyph “c” 8.2% of the time, and so on.)
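A minimal sketch of how such a per-folio vector could be computed. The function names, the separator characters and the example string are illustrative assumptions, not the original script.

```python
from collections import Counter


def glyph_frequencies(folio_text, ignore=" .,-=\n"):
    """Relative frequency of each EVA character on a folio,
    ignoring word and line separators."""
    counts = Counter(ch for ch in folio_text if ch not in ignore)
    total = sum(counts.values())
    return {glyph: n / total for glyph, n in counts.items()}


def to_vector(freqs, alphabet):
    """Fixed-order vector so that all folios share the same columns."""
    return [freqs.get(glyph, 0.0) for glyph in alphabet]


freqs = glyph_frequencies("daiin.dal")   # toy text, not a real folio
print(freqs)
print(to_vector(freqs, "adilnx"))
```

Stacking one such vector per folio gives a folios-by-glyphs matrix, which is the kind of input that scikit-learn's `TSNE(n_components=3).fit_transform(...)` accepts.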

The question is: how similar are these frequency distributions amongst all the folios? Using t-SNE (implemented in scikit-learn here: http://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html) we can try to find a 3D arrangement of all the folios in which folios with similar glyph frequency vectors lie close together.

Here’s a typical result: each folio appears as a point in 3D space …

The colour coding is: red dots are folios that Currier identified as “Language A”, blue are “Language B”, and the remaining black dots do not have an assignment.

It’s clear that the red and blue are well separated, reinforcing Currier’s assignments. Thus this is independent support of Currier’s theory.

There are a couple of notable features:

- f57r and f57v are labelled as Language A (red) – but it looks like they should be labelled as Language B (blue)
- The unassigned folios (black dots) look like they are all Language B