Strong | Computational Attacks on the Voynich Manuscript

Cipher Wheels – Genetic Algorithm – Some Results

August 7, 2021 JB 2 comments

The latest run of the GA produced the following prediction after 5000 Epochs: 12 cipher wheels, covering 96% of all VMS words, as shown below.

Candidate Set of Cipher Wheels for Generating the VMS text

The GA was free to use up to 13 wheels, and as few as 3: these 12 are the best fit it found. The blue segments shown are those that can be used to create the example word “pchodol”, i.e. “p” from Wheel 1, “c” from Wheel 3, “h” from 4, “o” from 7, “d” from 8, “o” from 10, and finally “l” from 12.

There is some redundancy: another way of making “pchodol” is to take “p”(1), “c”(5), “h”(6), “o”(7), “d”(8), “o”(10), “l”(12).
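The redundancy can be made concrete with a small enumeration. This is only a sketch under stated assumptions: the `WHEELS` table lists just the glyph placements named in this post (the full wheel contents are in the figure, not reproduced here), and the rule that wheels are visited in increasing order, with at most one glyph taken from each, is an assumed reading of the examples above. The names `WHEELS` and `spellings` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# ASSUMPTION: partial wheel contents, listing only the glyph placements the
# post states explicitly; the real wheels hold many more glyphs.
WHEELS: Dict[int, Set[str]] = {
    1: {"p", "f", "t", "k"},
    3: {"c", "cth"},
    4: {"h"},
    5: {"c", "ckh"},
    6: {"h"},
    7: {"o"},
    8: {"d"},
    10: {"o"},
    12: {"l"},
}

Path = List[Tuple[str, int]]  # (glyph, wheel number) pairs


def spellings(word: str, wheels: Dict[int, Set[str]], start: int = 1) -> List[Path]:
    """List every way to build `word`, taking at most one glyph per wheel
    and visiting the wheels in increasing order (an assumed reading rule)."""
    if not word:
        return [[]]
    found: List[Path] = []
    for wheel in sorted(w for w in wheels if w >= start):
        for glyph in sorted(wheels[wheel]):
            if word.startswith(glyph):
                # Consume this glyph, then continue on the later wheels only.
                for rest in spellings(word[len(glyph):], wheels, wheel + 1):
                    found.append([(glyph, wheel)] + rest)
    return found


for path in spellings("pchodol", WHEELS):
    print(" ".join(f"{glyph}({wheel})" for glyph, wheel in path))
```

With this toy table the enumeration finds both spellings given above, plus a third that mixes them: “c” from Wheel 3 followed by “h” from Wheel 6.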

Observations on the Wheel Configuration

The un-benched Gallows glyphs “p”, “f”, “t”, “k” all appear in the first Wheel. They also appear in Wheels 4 and 5, and Wheels 8, 9 and 10. It’s curious that the GA has divided them up in the later Wheels. Their appearance in the first Wheel covers the Grove words. Their appearance later allows for Gallows words with preceding glyphs.

The GA has only found it necessary to include two of the benched Gallows: “cth” (3) and “ckh” (5). The other benched Gallows “cfh” and “cph” can of course be formed from “c”, “h”, “p” and “f” in the wheels. Why, then, did the GA find it expedient to include “cth” and “ckh” as unique glyphs?

The GA was allowed to use the glyphs “in” and “iin” as if they were single glyphs, but it has not seen the need, perhaps suggesting that they are not single glyphs after all.

The existence of both “ee” and “e” in Wheels 6 and 10 is probably a result of the scoring system employed, which tries to take at most one glyph from each wheel: “ee” could otherwise be made simply by taking “e” twice from the same wheel, but that is penalised.

A Comment on Repeating Words

f75r in the VMS contains the infamous word sequence “qokedy qokedy qokedy”. The Wheels allow qokedy to be created in three different ways (shown below). This is thus a possible explanation of how that sequence occurs – when ciphering three different plaintext words using the Wheels, which Wheels and which positions to use will be different, but the end result is the same VMS word repeated.

Categories: Uncategorized Tags: gallows, Genetic Algorithm, Repeating Words, Strong

Current Status

March 3, 2010 JB 6 comments

This is my personal summary of where I am at the moment, in particular which theories I’ve rejected (for better or worse!)

Theory: VMs words are anagrams of a plaintext that has been enciphered into the VMs glyphs
- Attempts to find solutions with many mappings (1-, 2- and 3-grams) and various languages/dictionaries fail to find even mediocre matches
- Unusual prevalence of e.g. “8am 8am 8am” not explained by this theory
Theory: VMs words are in fact pieces of plaintext words, that need to be a) combined b) deciphered
- Trials with delimiters like VMs “o” and “9” and with many mappings and languages/dictionaries fail to find good matches
- But this would explain “8am 8am 8am” at a stretch
Theory: VMs words contain numeric codes, that use a Selenus type code table, with e.g. gallows characters used as multipliers
- There are too many VMs characters for this to work: only, say, 4 gallows characters and ten digits are needed for a minimal implementation, so what are all the rest for?
- Doesn’t explain “8am 8am 8am”
Theory: VMs words are phonetic codes for a reading of the manuscript
- Mapping the words to Soundex or Double Metaphone and comparing with plaintexts produces a poor frequency match (but is this a good test – see e.g. Robert Firth’s notes)
- This could explain “8am 8am 8am”
Theory: The text is produced by a polyalphabetic cipher with rotating/repeating sequences (a la Strong)
- Multiple attempts to fit this theory using various alphabet lengths and sequence lengths fail to find a convincing match, although plausible results can be generated
- Would explain “8am 8am 8am”
Procedure: since the cipher/code/whatever it is changes at least between sections, and possibly between folios (and maybe even within a folio), examining large quantities of VMs text for statistical properties is very misleading. Only text within a single side of a folio should be tackled for decryption.

Categories: 8am 8am 8am, 9, anagrams, cipher, codes, Double Metaphone, gallows, Languages, n-grams, o, phonetic, polyalphabetic, Robert Firth, soundex, Strong Tags: 8am, 9, cipher, codes, Double Metaphone, n-grams, phonetic, polyalphabetic, soundex, Strong

Strong’s Cipher

February 26, 2010 JB 1 comment

Strong’s “peculiar system of a double reversed arithmetic progression of a multiple alphabet” is a puzzling description, but GC recently (Feb 2010) explained it as “‘double reversed arithmetic progression’ as defined by the string 1-3-5-7-9-7-5-3-1-4-7-4” (although I think the sequence given is an example, rather than the definition). The number of alphabets is “a handful”.

If we suppose that the cipher is indeed constructed like this, then can we crack it computationally?

First we need to make some assumptions. Let’s generously assume that the number of alphabets is 10. Let’s then assume that these alphabets are rotated through in a sequence that is 17 long (the number 17 is picked since it crops up as a feature of the VMs text in many places). Let’s not assume that the sequence is double, or reversed, or anything else: it’s just a sequence of alphabet numbers. Let’s assume that each alphabet contains 21 characters: abcdefghilmnopqrstuvx

We then take a sample of VMs text (I chose the first “paragraph” of f1v)

h1s9 1o8am oe oek1c9 1ay Fax ap 9kcc9 1ay oy o19 81o eho89 oho8ay 1o89 8o H9 HoH9 29 8h2ii9 K9 hok1o89 8ae 8oe 1ohco 8aiy 8ap so1c9 1o ho89

and, equipped with a large dictionary of Latin words, we start to build a possible cipher. To do this, we start by looking at the first VMs word “h1s9”, and pick a Latin word of the same length, at random: “acri”. With this pair we can start to construct the cipher table:

Voynich        o 9 e 1 8 a h y c k 2 i K s m H p F x g &
Alphabet 0     . . . . . . a . . . . . . . . . . . . . .
Alphabet 1     . . . c . . . . . . . . . . . . . . . . .
Alphabet 2     . . . . . . . . . . . . . r . . . . . . .
Alphabet 3     . i . . . . . . . . . . . . . . . . . . .
Alphabet 4     . . . . . . . . . . . . . . . . . . . . .
Alphabet 5     . . . . . . . . . . . . . . . . . . . . .
Alphabet 6     . . . . . . . . . . . . . . . . . . . . .
Alphabet 7     . . . . . . . . . . . . . . . . . . . . .
Alphabet 8     . . . . . . . . . . . . . . . . . . . . .
Alphabet 9     . . . . . . . . . . . . . . . . . . . . .

We continue with the next word: “1o8am” and a random Latin word of the same length: “paveo”, and update the table:

Voynich        o 9 e 1 8 a h y c k 2 i K s m H p F x g &
Alphabet 0     . . . . . . a . . . . . . . . . . . . . .
Alphabet 1     . . . c . . . . . . . . . . . . . . . . .
Alphabet 2     . . . . . . . . . . . . . r . . . . . . .
Alphabet 3     . i . . . . . . . . . . . . . . . . . . .
Alphabet 4     . . . p . . . . . . . . . . . . . . . . .
Alphabet 5     a . . . . . . . . . . . . . . . . . . . .
Alphabet 6     . . . . v . . . . . . . . . . . . . . . .
Alphabet 7     . . . . . e . . . . . . . . . . . . . . .
Alphabet 8     . . . . . . . . . . . . . . o . . . . . .
Alphabet 9     . . . . . . . . . . . . . . . . . . . . .

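The next word is “oe” and the random Latin word is “do”. The Latin letter “d” goes under the Voynich “o” column in Alphabet 9. The next alphabet in the sequence is Alphabet 0 again, so the Latin letter “o” has to be placed under the Voynich “e” column in Alphabet 0: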

Voynich        o 9 e 1 8 a h y c k 2 i K s m H p F x g &
Alphabet 0     . . o . . . a . . . . . . . . . . . . . .
Alphabet 1     . . . c . . . . . . . . . . . . . . . . .
Alphabet 2     . . . . . . . . . . . . . r . . . . . . .
Alphabet 3     . i . . . . . . . . . . . . . . . . . . .
Alphabet 4     . . . p . . . . . . . . . . . . . . . . .
Alphabet 5     a . . . . . . . . . . . . . . . . . . . .
Alphabet 6     . . . . v . . . . . . . . . . . . . . . .
Alphabet 7     . . . . . e . . . . . . . . . . . . . . .
Alphabet 8     . . . . . . . . . . . . . . o . . . . . .
Alphabet 9     d . . . . . . . . . . . . . . . . . . . .

We continue in this vein, picking random Latin words to match the VMs words, and attempting to place them into the cipher. This starts off easily, but rapidly becomes impossible with the Latin words chosen: when we come to place a letter into the required column at the current alphabet in the sequence, we find either that the position is already occupied by a different letter, or that the alphabet already contains that letter in a different column.
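The placement step and its two failure conditions can be sketched as follows. This is a minimal illustration, not the program actually used: the names `try_place` and `empty_table` are hypothetical, each alphabet is stored as a dict from Voynich glyph to Latin letter, and `SEQUENCE` is an assumed 17-long sequence whose first eleven values match the tables above.

```python
from typing import Dict, List, Optional, Sequence

N_ALPHABETS = 10
# ASSUMPTION: a 17-long alphabet sequence; only its first eleven values
# (0..9, then 0) are implied by the worked tables.
SEQUENCE = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 5, 7, 8]

Table = List[Dict[str, str]]  # one {voynich_glyph: latin_letter} per alphabet


def empty_table() -> Table:
    return [{} for _ in range(N_ALPHABETS)]


def try_place(table: Table, sequence: Sequence[int], offset: int,
              vms_word: str, latin_word: str) -> Optional[Table]:
    """Return a copy of `table` with the word pair added, or None on a clash.

    `offset` is the number of glyphs already consumed, so glyph i of the
    word uses alphabet sequence[(offset + i) % len(sequence)].
    """
    if len(vms_word) != len(latin_word):
        return None
    new = [dict(alphabet) for alphabet in table]
    for i, (glyph, letter) in enumerate(zip(vms_word, latin_word)):
        alphabet = new[sequence[(offset + i) % len(sequence)]]
        if glyph in alphabet:
            if alphabet[glyph] != letter:
                return None  # position already occupied by a different letter
        elif letter in alphabet.values():
            return None      # letter already in this alphabet, other column
        else:
            alphabet[glyph] = letter
    return new


# Rebuild the three steps of the walkthrough.
table = empty_table()
table = try_place(table, SEQUENCE, 0, "h1s9", "acri")
table = try_place(table, SEQUENCE, 4, "1o8am", "paveo")
table = try_place(table, SEQUENCE, 9, "oe", "do")
print(table[0], table[9])
```

Returning a fresh copy rather than mutating in place keeps a failed placement from leaving half a word in the table, which makes it cheap to try another Latin word.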

In such cases we try to select a different Latin word to see if it will fit. If we exhaust all possible Latin words, then we backtrack to the beginning, and start afresh with a new sequence and new choices.

Most of the time, this algorithm doesn’t get further than a few words into the text before failing. Occasionally it gets quite a long way. Of course, the search space of possible Latin word combinations is staggering …
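A rough sketch of this try-and-restart loop is given below. It is an illustration under assumptions rather than the original code: `place`, `attempt` and `search` are hypothetical names, the dictionary is a plain list of words, and each restart simply draws a fresh random sequence of alphabet numbers.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

Table = List[Dict[str, str]]  # one {voynich_glyph: latin_letter} per alphabet


def place(table: Table, sequence: Sequence[int], offset: int,
          vms_word: str, latin_word: str) -> Optional[Table]:
    """Add one word pair to a copy of the table, or return None on a clash."""
    new = [dict(alphabet) for alphabet in table]
    for i, (glyph, letter) in enumerate(zip(vms_word, latin_word)):
        alphabet = new[sequence[(offset + i) % len(sequence)]]
        if alphabet.get(glyph, letter) != letter or (
                glyph not in alphabet and letter in alphabet.values()):
            return None
        alphabet[glyph] = letter
    return new


def attempt(vms_words: List[str], dictionary: List[str],
            sequence: Sequence[int], n_alphabets: int,
            rng: random.Random) -> Tuple[List[str], Table]:
    """One pass over the text: for each VMs word, try same-length Latin
    words in random order and stop at the first word that nothing fits."""
    table: Table = [{} for _ in range(n_alphabets)]
    plaintext: List[str] = []
    offset = 0
    for vms_word in vms_words:
        candidates = [w for w in dictionary if len(w) == len(vms_word)]
        rng.shuffle(candidates)
        for latin_word in candidates:
            new_table = place(table, sequence, offset, vms_word, latin_word)
            if new_table is not None:
                table = new_table
                plaintext.append(latin_word)
                offset += len(vms_word)
                break
        else:
            break  # every candidate clashed: give up on this attempt
    return plaintext, table


def search(vms_words: List[str], dictionary: List[str], n_alphabets: int = 10,
           seq_len: int = 17, restarts: int = 1000,
           seed: int = 0) -> Tuple[List[str], List[int]]:
    """Restart with a new random sequence each time; keep the longest run."""
    rng = random.Random(seed)
    best_words: List[str] = []
    best_sequence: List[int] = []
    for _ in range(restarts):
        sequence = [rng.randrange(n_alphabets) for _ in range(seq_len)]
        words, _table = attempt(vms_words, dictionary, sequence, n_alphabets, rng)
        if len(words) > len(best_words):
            best_words, best_sequence = words, sequence
        if len(best_words) == len(vms_words):
            break
    return best_words, best_sequence
```

Scoring an attempt by how many words were placed before the first dead end is the simplest choice; it says nothing about whether the resulting Latin is meaningful.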

This is one of the more interesting attempts at deciphering the sample text:

Voynich          o 9 e 1 8 a h y c k 2 i K s m H p F x g &
Alphabet 0       l . t a s . b o f . n . . . . . . . . . .
Alphabet 1       m i . o l b . . r . . t . . . g . . . . .
Alphabet 2       f a u e . . . s . . . i . n . . . . . . .
Alphabet 3       . o . v f . i . . n u . . . . . . e . . .
Alphabet 4       . a . h d i . . . . . . o . . . . . . . .
Alphabet 5       e s . i t o . . . . . . . . . . . . a . .
Alphabet 6       u . . . r s c . . . . . . . . . . . . . .
Alphabet 7       u . i . n b . s r . . . . . . m t . . . .
Alphabet 8       e i . . . d u . . c . . . . a . . . . . .
Alphabet 9       s . . u . . . . . n . . . . . a . . . . .
Sequence vals = 0 1 2 3 4 5 6 7 8 9 0 1 2 3 5 7 8

h1s9 1o8am oe oek1c9 1ay Fax ap 9kcc9 1ay oy o19 81o eho89 oho8ay 1o89 8o H9 HoH9 29 8h2ii9 K9 hok1o89 8ae 8oe 1ohco 8aiy 8ap so1c9 1o ho89

bono herba st muniri abs eia st infra vos eo meo diu iussi fiendo offa tu mi alga us nuntio os cuculla
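The Latin line can be reproduced mechanically from the table and the sequence values above. The sketch below copies both from this post; the function name `decode` and the convention of printing “?” for an empty table cell are assumptions made for illustration. The glyph counter runs across word breaks, so each glyph uses the next alphabet number in the 17-long sequence.

```python
VOYNICH = "o 9 e 1 8 a h y c k 2 i K s m H p F x g &".split()

# Rows copied from the table above; "." marks an unassigned cell.
ROWS = [
    "l . t a s . b o f . n . . . . . . . . . .",
    "m i . o l b . . r . . t . . . g . . . . .",
    "f a u e . . . s . . . i . n . . . . . . .",
    ". o . v f . i . . n u . . . . . . e . . .",
    ". a . h d i . . . . . . o . . . . . . . .",
    "e s . i t o . . . . . . . . . . . . a . .",
    "u . . . r s c . . . . . . . . . . . . . .",
    "u . i . n b . s r . . . . . . m t . . . .",
    "e i . . . d u . . c . . . . a . . . . . .",
    "s . . u . . . . . n . . . . . a . . . . .",
]
SEQUENCE = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 5, 7, 8]

# One {voynich_glyph: latin_letter} dict per alphabet, skipping empty cells.
TABLE = [
    {glyph: letter for glyph, letter in zip(VOYNICH, row.split()) if letter != "."}
    for row in ROWS
]


def decode(text: str, table=TABLE, sequence=SEQUENCE) -> str:
    """Decipher `text`, advancing through `sequence` one glyph at a time."""
    words, position = [], 0
    for word in text.split():
        letters = []
        for glyph in word:
            alphabet = table[sequence[position % len(sequence)]]
            letters.append(alphabet.get(glyph, "?"))  # "?" = empty cell
            position += 1
        words.append("".join(letters))
    return " ".join(words)


print(decode("h1s9 1o8am oe"))  # bono herba st
```

Decoding the first 22 VMs words this way yields exactly the Latin line shown; the attempt did not assign Latin words to the remaining words of the sample.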

Categories: cipher, Latin, polyalphabetic, Strong Tags: cipher, Latin, polyalphabetic, Strong
