Fun with Grove Words and Cipher Wheels

What is a Grove word? The answer is a little fuzzy, but simplistically a Grove word is a VMS word that begins with one of the gallows glyphs. These words are often page or paragraph initial. Emma May Smith has a good explanation in her recent blog entry.

Mr. Grove observed the peculiar feature that some words beginning with a gallows glyph are also valid words if you remove the gallows glyph. For example, the word EVA kodaiin starts with gallows k, and odaiin is also a valid word.

It turns out that if you look at all words in the VMS, 46% of them have this property: remove the first glyph and you are left with a valid VMS word. Compare this with English, where only around 8% of words produce valid words if you remove the first letter. Making up the 46% we have 38% from non-Grove words (i.e. non-gallows initial), and 8% from Grove words.
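The "remove the first glyph" test can be sketched in a few lines. This is only an illustration: the word list is a made-up toy sample (not real VMS counts), the function name is hypothetical, each character is treated as one glyph, and the fraction is taken over distinct words rather than over every occurrence.

```python
def strip_first_fraction(words):
    """Fraction of distinct words that are still valid words
    once their first glyph (here: first character) is removed."""
    vocab = set(words)
    hits = [w for w in vocab if len(w) > 1 and w[1:] in vocab]
    return len(hits) / len(vocab)

# Toy, loosely EVA-styled vocabulary, purely for illustration.
toy = ["kodaiin", "odaiin", "daiin", "aiin", "chol", "ol", "shedy"]

# kodaiin -> odaiin, odaiin -> daiin and daiin -> aiin survive; the rest do not.
print(strip_first_fraction(toy))
```

Running the same function over a full transcription and over an English word list is one way to reproduce the kind of comparison quoted above, although the exact figures will depend on the transcription and on how glyphs are split.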

To round out the statistics, about 13% of all VMS words have an initial gallows glyph.

Consider the nine wheels above, where one of the wheels contains gallows glyphs, and the other wheels contain other glyphs. These wheels can be selected in 2^9 - 1 = 511 different ways, to make words of length between 1 and 9.

The probability of selecting wheel 3 as the first wheel for the word is about 12.5% (64 of the 511 selections, if each selection is equally likely). In other words, with these 9 wheels, 12.5% of the time we’d create a gallows-initial “Grove” word, very close to what we observe in the VMS (13%). In fact, this figure barely depends on the number of cipher wheels: with N wheels it is 2^(N-3) / (2^N - 1), which tends to 1/8 = 12.5% as N grows. So as long as there are more than a handful of wheels, they are used left to right, and the gallows glyphs fully occupy the third wheel, about 12.5% of the generated words will be Grove types.
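The count behind that percentage can be checked by brute force. The sketch below assumes, as the argument above does, that every non-empty selection of wheels is equally likely and that wheels are always read left to right; the function name is made up for illustration.

```python
from itertools import combinations

def first_wheel_fraction(n_wheels, gallows_wheel=3):
    """Count the non-empty wheel selections whose leftmost wheel
    is the gallows wheel, and the total number of selections."""
    total = 0
    gallows_first = 0
    wheels = range(1, n_wheels + 1)
    for size in range(1, n_wheels + 1):
        for subset in combinations(wheels, size):
            total += 1
            # combinations() yields sorted tuples, so subset[0] is the leftmost wheel
            if subset[0] == gallows_wheel:
                gallows_first += 1
    return gallows_first, total

for n in (3, 5, 9, 12):
    hits, total = first_wheel_fraction(n)
    print(n, hits, total, round(100 * hits / total, 2))
```

For nine wheels this gives 64 selections out of 511. With only three wheels the share is 1 in 7, so the 12.5% figure is best read as the value the fraction settles on once there are more than a few wheels.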

As a corollary, it’s clear that for Grove types generated with the wheels, removing the first glyph will produce a valid word, as it is equivalent to generating a word starting at wheel 4 or later.
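A small simulation makes the corollary concrete. The wheel contents below are invented for illustration (five wheels, one character per glyph, gallows only on the third wheel) and are not a proposal for the real wheels. The one edge case is a word consisting of the gallows glyph alone, which leaves nothing behind, so it is excluded.

```python
from itertools import combinations, product

# Hypothetical wheel contents; wheel 3 holds the gallows glyphs.
WHEELS = [
    ["q"],
    ["o", "y"],
    ["k", "t", "p", "f"],  # gallows wheel
    ["a", "e"],
    ["d", "l", "r"],
]
GALLOWS = set(WHEELS[2])

def generate_words(wheels):
    """Every word obtainable by picking a non-empty set of wheels,
    reading them left to right, and taking one glyph from each."""
    words = set()
    indices = range(len(wheels))
    for size in range(1, len(wheels) + 1):
        for subset in combinations(indices, size):
            for glyphs in product(*(wheels[i] for i in subset)):
                words.add("".join(glyphs))
    return words

words = generate_words(WHEELS)
# Gallows-initial words with at least one glyph after the gallows.
grove = {w for w in words if w[0] in GALLOWS and len(w) > 1}
print(len(grove), all(w[1:] in words for w in grove))
```

Every gallows-initial word here passes the check, because what remains after the gallows is exactly a word built from the wheels to its right.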

So what of the 54% of VMS words for which removing the initial glyph does not produce a valid VMS word? This can be explained if the number of different words used and written in the VMS is simply less than the total number of possible words that the author’s wheels can produce. What is the expected vocabulary size if we know there are 7,552 words written in the VMS (Takeshi), and we are missing 54% of them? It is simply 1.54 x 7,552 ≈ 11,630 words, or thereabouts.

(Aside: the wheels above could just as easily be represented and used as a table with nine columns.)

In summary, “Grove” words (gallows initial) are ~13% of all words in the Voynich manuscript, and this fraction is what you’d expect if the text was produced using cipher wheels.

  1. August 18, 2021 at 9:50 pm


    If one looks at numbers, then removing the first digit always results in another valid number. This is also true for Roman numerals and the Greek way of generating numbers. Of course, the wheels generate an enumeration system (rather than numbering), so they do the same.

    An alternative explanation (which I like) is that, if somehow such a system is at the basis of the Voynich “word generation”, then not all combinations were valid.

    In the hypothetical situation that someone first creates a lexicon, then the text written based on that lexicon doesn’t need all the words.
    For example.

  2. Marco Ponzi
    August 19, 2021 at 6:17 am

    Hi Julian,
    as Emma wrote, Grove words begin with one of the gallows glyphs and appear at the beginning of paragraphs.

    Apparently, you missed the second part of the definition.

    The gallows-initial-words you discuss are a large superset of Grove words (~3000 vs ~600 tokens). Actual Grove words represent about 85% of paragraph-initial words; they have distinct features, e.g. on average they are ~1 character longer than other words, 11% of tokens include two gallows in the same word, and 15% of tokens include the rare gallows-sh sequence. In the much larger class of words that you discuss, all these interesting features are so diluted that they become almost invisible.

  3. Marco Ponzi
    August 19, 2021 at 8:27 am

    I also have a simple question about gallows-initial-words and how they are supposed to be created with wheels. If ordinary words use up to nine wheels, while gallows-initial-words do not use more than seven (they skip the first two), aren’t the latter going to be one character shorter than average (3.5 vs 4.5)?

    • JB
      August 19, 2021 at 9:39 am

      Hi Marco, yes, the gallows-initial words would tend to be shorter, because they are only using up to 7 wheels (if there are indeed 9 wheels in total). I haven’t yet looked at the word length distribution for gallows-initial words. A related problem is the words that contain more than one gallows. These are completely disallowed by the N wheels system unless a) the gallows appear on more than one wheel, or b) (more likely) the spaces between words are wrong or somehow artificial. Perhaps there are one or more wheels with a space as one of their glyphs, and that is the only way a space can appear on a line? What I need to do is to check gallows words in the labels: what is their length distribution? That might be revealing.

  4. April 23, 2022 at 5:51 pm

    Julian – Have you received a circular from a researcher promoting a kind of cipher-wheels-with-Turkic-language idea?

    I have responded to his email/circular by suggesting that he take trouble to read, and acknowledge, precedents for each of those elements (cipher-wheels; Turkic language) he combines in his paper now published through Researchgate.

    Perhaps you’d like to offer a comment on the value of his ideas – it’s one for the cryptographers.
