Archive for the ‘e’ Category

Language A and B Again

March 13, 2013 12 comments

A tentative conclusion from comparing Language A and Language B is that the non-gallows glyphs are used in the same way in both Languages.

That is to say, they appear to mean the same thing. So the “o” in A means the same as the “o” in B.

There is some persistent “mixing” between the e/y glyphs, which is illustrated by the example result below:

ABMixing

There is also some doubt about the “8” glyph, which sometimes seems to mix with the gallows glyphs (e.g. in some cases, the “8” in A appears to function in the same way as a gallows glyph in B, and vice versa). This may simply be an error in the comparison method, or it may be that the “8” is a null, or it may be due to some other effect.

The gallows glyphs are different: they don’t appear to mean the same in A and B. I’m focussing on those glyphs now.

The Relationship Between Currier Languages “A” and “B”

March 1, 2013 24 comments

Captain Prescott Currier, a cryptographer, looked at the Voynich many moons ago, and made some very perceptive comments about it, which can be read on Rene Zandbergen’s site.

In particular, he noticed that the handwriting was different between some folios and others, and he also noticed (based on glyph/character counts) that there were two “languages” being used.

When I first looked at the manuscript, I was principally considering the initial (roughly) fifty folios, constituting the herbal section. The first twenty-five folios in the herbal section are obviously in one hand and one “language,” which I called “A.” (It could have been called anything at all; it was just the first one I came to.) The second twenty-five or so folios are in two hands, very obviously the work of at least two different men. In addition to this fact, the text of this second portion of the herbal section (that is, the next twenty-five or thirty folios) is in two “languages,” and each “language” is in its own hand. This means that, there being two authors of the second part of the herbal section, each one wrote in his own “language.” Now, I’m stretching a point a bit, I’m aware; my use of the word language is convenient, but it does not have the same connotations as it would have in normal use. Still, it is a convenient word, and I see no reason not to continue using it.

We can look at some statistics to see what he was referring to. Let’s compare the most common words in Folios 1 to 25 (in the Herbal section, Language A, written in Hand 1) and in Folios 107 to 116 (in the Recipes section, Language B, written in a different Hand):

Comparison between word frequencies in Languages A and B

So, for example, in Language A the most common word is “8am” and it occurs 192 times in the folios, whereas in Language B the most common word is “am”, occurring 137 times.

We might expect that these are the same word, enciphered differently. The question then is, how does one convert between words in Language A and words in Language B, and vice versa? In the case of “8am” to “am”, it’s just a question of dropping the “8”, as if “8” is a null character in Language A. In the case of the next most popular words, “1oe” (A) and “1c89” (B), it looks like “oe” (A) converts to “c89” (B). And so on.
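The two hand-made conversions above can be written down as substitution rules and applied mechanically. The following is only a sketch: the function name `apply_rules` and the rule table `A_TO_B` are made up for illustration, and treating “8” as a null wherever it occurs in a word is an assumption that goes beyond the single “8am” example.

```python
def apply_rules(word, rules):
    """Rewrite `word` left to right, always trying the longest rule first."""
    keys = sorted(rules, key=len, reverse=True)
    out = []
    i = 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                out.append(rules[key])
                i += len(key)
                break
        else:
            # No rule matches at this position: copy the glyph unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


# Hypothetical A -> B rules read off the two examples in the text:
# "8" is dropped (treated as a null) and "oe" becomes "c89".
A_TO_B = {"8": "", "oe": "c89"}

converted = [apply_rules(w, A_TO_B) for w in ["8am", "1oe"]]
print(converted)  # ['am', '1c89']
```

Trying the longest rule first matters once the rule table contains overlapping nGrams, such as a single glyph and a pair that starts with it.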

If we look at the most popular nGrams (substrings) in both Languages, perhaps there is a mapping that translates between the two. Perhaps the cipher machinery that was used to generate the text had different settings, that produced Language A in one configuration, and Language B in another. Perhaps, if we look at the nGram correspondence that results in the best match between the two Languages, a clue will be revealed as to how that machinery worked.

This involves some software (I’m using Python now, which is fun). The software first calculates the word frequencies for Language A and B in a set of folios (the table above is an output from this stage). It then calculates the nGram frequencies for each Language. Here are the top 10:
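A minimal sketch of those two counting stages, using only the standard library. The post does not say which nGram lengths the software counts or how the transcription is tokenised, so `max_n` and the whitespace split are assumptions, and the sample words are a made-up stand-in for a real transcription.

```python
from collections import Counter


def word_frequencies(words):
    """Count how often each word occurs."""
    return Counter(words)


def ngram_frequencies(words, max_n=3):
    """Count every substring of length 1..max_n over all word occurrences."""
    counts = Counter()
    for word in words:
        for n in range(1, max_n + 1):
            for i in range(len(word) - n + 1):
                counts[word[i:i + n]] += 1
    return counts


# Made-up sample standing in for the words of one Language.
sample = "8am 1oe 8am am".split()

word_counts = word_frequencies(sample)
ngram_counts = ngram_frequencies(sample, max_n=2)

print(word_counts.most_common(3))
print(ngram_counts.most_common(10))
```

Here each nGram is counted once per occurrence of the word that contains it, so common words weigh more heavily; counting over distinct words instead would be an equally plausible choice.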

nGramFrequencies

The software then runs a Genetic Algorithm to find the best mapping between the two sets of nGrams, so that when the mapping is applied to all words in Language B, it produces a set of words in Language A, the frequencies of which most closely match the frequencies of words observed in Language A (i.e. the frequencies shown in the first table above).
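For readers who want to experiment, here is a small sketch of how such a search could be set up. It is not the software used for the results in this post: the encoding of a mapping as a dictionary, the fitness measure (difference in relative word frequencies), the uniform crossover, the mutation rate and all function names are assumptions, and the word counts are toy values.

```python
import random
from collections import Counter


def convert(word, mapping):
    """Rewrite a Language B word, trying the longest nGram first."""
    keys = sorted(mapping, key=len, reverse=True)
    out, i = [], 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                out.append(mapping[key])
                i += len(key)
                break
        else:
            out.append(word[i])  # glyph not covered by the mapping
            i += 1
    return "".join(out)


def fitness(mapping, counts_a, counts_b):
    """0 for a perfect match; more negative as the frequencies diverge."""
    counts_a = Counter(counts_a)
    predicted = Counter()
    for word, n in counts_b.items():
        predicted[convert(word, mapping)] += n
    total_a = sum(counts_a.values())
    total_p = sum(predicted.values())
    vocabulary = set(counts_a) | set(predicted)
    return -sum(abs(counts_a[w] / total_a - predicted[w] / total_p)
                for w in vocabulary)


def evolve(counts_a, counts_b, sources, targets,
           pop_size=30, generations=40, mutation_rate=0.3, seed=0):
    """Evolve a mapping {B nGram: A nGram} towards the best fitness."""
    rng = random.Random(seed)

    def score(mapping):
        return fitness(mapping, counts_a, counts_b)

    population = [{s: rng.choice(targets) for s in sources}
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        survivors = population[:pop_size // 2]  # keep the fitter half
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            # Uniform crossover: each nGram's image comes from either parent.
            child = {s: rng.choice((mum[s], dad[s])) for s in sources}
            if rng.random() < mutation_rate:
                child[rng.choice(sources)] = rng.choice(targets)
            children.append(child)
        population = survivors + children
    return max(population, key=score)


# Toy counts, not the real folio statistics.
counts_a = Counter({"8am": 5, "1oe": 3})
counts_b = Counter({"am": 5, "1c89": 3})
sources = ["am", "c89", "1"]        # nGrams seen in Language B
targets = ["8am", "oe", "1", "am"]  # candidate nGrams in Language A

best = evolve(counts_a, counts_b, sources, targets)
print(best, fitness(best, counts_a, counts_b))
```

Because the fitter half of each generation is carried over unchanged, the best score never gets worse from one generation to the next, but nothing guarantees that the search reaches the best possible mapping.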

Here is an initial result. With the following mapping, you can take most common words in Language B, and convert them to Language A.

Table for converting between a Language B word and a Language A word

A couple of remarks. This is an early result and probably not the best match. There are some interesting correspondences:

  • “9” and “c” are immutable: each maps to itself, and so appears to have the same function in both Languages
  • “4o” in Language B maps to “o” in Language A, and vice versa!
  • “ha” in Language B maps to “h” in Language A, as if “a” is a null

In the Comments, Dave suggested looking at word pair frequencies between the Languages. Here is a table of the most common pairs in each Language.

Common word pairs in Languages A and B
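A sketch of how such a pair count could be produced. It assumes that a “pair” means two adjacent words on the same line, in order, and that pairs do not run across line breaks; the sample lines are made up.

```python
from collections import Counter


def word_pair_frequencies(lines):
    """Count ordered pairs of adjacent words within each line."""
    pairs = Counter()
    for line in lines:
        words = line.split()
        pairs.update(zip(words, words[1:]))
    return pairs


# Made-up lines standing in for transcribed text.
lines = ["8am 1oe 8am", "1oe 8am am"]

pair_counts = word_pair_frequencies(lines)
print(pair_counts.most_common(5))
```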

For clarity, I am using what I call the “HerbalRecipesAB” folios for this study, i.e.:

Using folios for HerbalRecipeAB : [107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]

More results coming …

How was the Voynich Manuscript text written?

August 23, 2012 7 comments

I’ve spent many happy hours poring over the text, and am convinced that it is not as “simple” as it appears (i.e. the “words” are not words at all). Here are some conjectures:

  1. The lines look as if they were written left to right, i.e. as if the glyphs were written down one after another from left to right, but they were not.
  2. The scribe started with the drawing and started writing glyphs at various positions on the page.
  3. The method used for choosing each glyph and for deciding its position involved a mechanical apparatus, perhaps a set of co-rotating cipher wheels that were used to convert each character in the Latin plaintext into a VMs glyph and page position.
  4. The apparatus is set to a new starting position for each folio/page (so, e.g., the Betony labels on the three folios the plant appears on are different).
  5. The density of ink is a clue to the order in which the glyphs were written (nib/quill freshly dipped and full of ink, or almost dry).
  6. At some point the scribe finishes writing the needed glyphs, and then fills out the spaces with pseudo-random words.
  7. There is no punctuation because what is seen are not words. What is seen makes no grammatical sense because the glyphs are not ordered and positioned linearly across the page.
  8. Perhaps the secret to unwinding the cipher is in the labels. The labels on one page are constrained to have been produced by the same initial position of the cipher apparatus, and they must come from the plaintext label.

There are so many clues as to what is going on, yet putting them all together is hugely challenging.

For example, Jim Reeds suggested years ago that the order in which the text had been written on the sunflower page, f33v:

f33v

was first the text to the left of the left stalk, second the text in between the stalks, and finally the text to the right of the last stalk. This is compelling, since the ink density looks different, and the lines don’t line up well across the stalks. It becomes clearer if you saturate the image:

f33v Saturated

And in that image, what jumps out are the glyphs that are darker than the others. Those can be seen more clearly in black/white:

f33v monochrome drop

where the “o”, “y”, “8”, “e” stick out like sore thumbs. Most of those are in the left section, some in the middle, and fewer in the right. Why are these glyphs bolder? Why are they inked more heavily? Were these the glyphs initially placed on the page, carrying the real information, with the rest, unimportant and pseudo-random, all added later to make the text look “normal”?
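The same kind of look at the page can be tried on any scan. The sketch below is not how the images above were produced. It assumes the page has already been loaded as rows of (r, g, b) pixels with 0-255 channels, and the saturation factor, the darkness threshold and the split into equal-width bands (rather than regions bounded by the stalks) are all arbitrary choices.

```python
import colorsys


def saturate(pixel, factor=2.0):
    """Scale the HSV saturation of one (r, g, b) pixel, capped at 1.0."""
    r, g, b = (c / 255 for c in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    r, g, b = colorsys.hsv_to_rgb(h, min(1.0, s * factor), v)
    return tuple(round(c * 255) for c in (r, g, b))


def is_dark(pixel, threshold=96):
    """True if the pixel's luma is below the threshold (heavily inked)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b < threshold


def dark_mask(image, threshold=96):
    """Reduce an image (rows of pixels) to black/white: True means dark."""
    return [[is_dark(p, threshold) for p in row] for row in image]


def dark_counts_by_band(mask, bands=3):
    """Count dark pixels in equal-width vertical bands, left to right."""
    width = len(mask[0])
    counts = [0] * bands
    for row in mask:
        for x, dark in enumerate(row):
            counts[x * bands // width] += dark
    return counts


# Tiny made-up "image": dark ink on the left, parchment elsewhere.
image = [
    [(30, 20, 10), (200, 180, 150), (60, 40, 30)],
    [(40, 30, 20), (90, 70, 50), (210, 190, 160)],
]

mask = dark_mask(image)
print(dark_counts_by_band(mask))
```

Counting dark pixels per band gives a rough number to set against the impression that the heavily inked glyphs cluster on the left of the page.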

Categories: 8, ay, Characters, e, f33v, Features, Jim Reeds, Latin, o, oy, Theories, Writing, y