Language “A” and “B” Conversions

This is an update to my previous two posts on this topic.

I have been concentrating on searching for the correspondence between glyphs used in Language A and glyphs used in Language B. As a reminder, the method is to take all words in, say, Language A, and “convert” them to words in Language B by changing the glyphs according to a candidate mapping table. The frequency of each converted word in Language B is then compared with the frequency of the original word in Language A: the closer the frequencies, the better the mapping match.
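A minimal sketch of this comparison, under stated assumptions: each character of a transcribed word is one glyph, the mapping table is a plain dictionary, and the score is a sum of squared differences between relative word frequencies. That score is only an illustrative choice, not necessarily the measure behind the values reported in this post, and the function names and toy words are hypothetical.

```python
from collections import Counter

def convert(word, mapping):
    """Replace each glyph of a word using the candidate mapping table."""
    return "".join(mapping.get(glyph, glyph) for glyph in word)

def relative_freqs(words):
    """Map each distinct word to its share of the word list."""
    counts = Counter(words)
    total = float(sum(counts.values()))
    return {word: count / total for word, count in counts.items()}

def mismatch(words_a, words_b, mapping):
    """Assumed score: squared frequency differences, lower is better."""
    freq_a = relative_freqs(convert(word, mapping) for word in words_a)
    freq_b = relative_freqs(words_b)
    all_words = set(freq_a) | set(freq_b)
    return sum((freq_a.get(w, 0.0) - freq_b.get(w, 0.0)) ** 2 for w in all_words)

# Toy word lists, invented for illustration only.
words_a = ["8am", "oe", "oe"]
words_b = ["8am", "oy", "oy"]
print(mismatch(words_a, words_b, {"e": "y"}))  # candidate mapping e -> y
print(mismatch(words_a, words_b, {}))          # identity mapping
```

With these toy lists the e -> y mapping scores lower than the identity mapping, which is the behaviour a search over mapping tables relies on.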

Method Check using only Language A words

As a check of the method, I took the Herbal folios 1-25 (all in Language A) and split them into two groups: 1-12 and 13-25, and I then artificially labelled the latter group as Language B. Then I ran the matching procedure, which produced the following result:

Epoch 62 Best chromosome 0 Value= 5.62272615159e-05
Chromosome ['o', '9', '1', 'i', '8', 'a', 'e', 'c', 'k', 'y', 'h', 'N', '2', '4', 's', 'g', 'p', '?', 'K', 'H']
ngramsA    ['o', '9', '1', 'i', '8', 'a', 'e', 'c', 'h', 'y', 'k', 'N', '2', '4', 's', 'g', 'p', '?', 'K', 'H']

This is good and reassuring, since it shows that the words in folios 13-25 have essentially the same frequency distribution when their glyphs are mapped to the same glyphs in folios 1-12.
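The output above reports a best “chromosome” per epoch, which I read here as a candidate glyph ordering aligned position by position with the ngramsA row. As a rough illustration of how such a search can proceed, here is a much simpler stand-in: mutation-only hill climbing that swaps two glyphs at a time. This is not the genetic algorithm that produced the result above; the function names, the default epoch count and the toy score are assumptions.

```python
import random

def swap_mutation(chromosome, rng):
    """Return a copy of the chromosome with two randomly chosen glyphs exchanged."""
    child = list(chromosome)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

def search(start, score, epochs=200, seed=0):
    """Keep a mutated chromosome whenever it scores no worse (lower is better)."""
    rng = random.Random(seed)
    best, best_score = list(start), score(start)
    for _ in range(epochs):
        child = swap_mutation(best, rng)
        child_score = score(child)
        if child_score <= best_score:
            best, best_score = child, child_score
    return best, best_score

# Toy score: number of positions that differ from a known target ordering.
target = list("o91i8")
start = list("8i19o")
score = lambda chromosome: sum(a != b for a, b in zip(chromosome, target))
print(search(start, score))
```

In a real run the score would be the word-frequency comparison between the two folio groups rather than a known target.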

Removal of Glyph Variants in Voyn_101

As the tests progressed, it became clear that some of the glyphs GC defined in Voyn_101 were in fact variants of more common glyphs. The most obvious were the “m”, “n”, “M” glyphs mentioned before – with these included, the conversions between Language B and Language A were of much poorer quality than if they were expanded to “iiN”, “iN” and “iiiN” respectively. After some time weeding out these variants, the following table was arrived at:

seek =  ["3", "5", "+", "%", "#", "6", "7", "A", "X", 
         "I", "C", "z", "Z", "j", "u", "d", "U", "P", 
         "Y", "$", "S", "t", "q",
         "m", "M", "n", "Y", "!", ")", "*", "b", "J", "E", "x", "B", "D", "T", "Q", "W", "w", "V", "(", "&"]
repl =  ["2", "2", "2", "2", "2", "8", "8", "a", "y", 
         "ii", "cc", "iy", "iiy", "g", "f", "ccc", "F", "ip",
         "y", "s", "cs", "s", "iip",
         "iiN", "iiiN", "iN", "y", "2", "9", "p", "y", "G", "c", "y", "cccN", "ccN", "s", "p", "h", "h", "K", "9", "8"]

I am very confident that the glyphs remaining after using the above conversion table are the base set.  The base set of glyphs is thus:

Language A frequency order: 'o', 'c', '9', '1', 'a', '8', 'e', 'i', 'h', 'y', 'k', 's', '2', 'N', '4', 'g', 'p', '?', 'K', 'H', 'f', 'G', 'F', 'L', 'l', 'v', 'r', 'R'
Language B frequency order: 'c', 'o', '9', 'a', '8', 'e', '1', 'h', 'i', 'y', 'k', '2', 'N', 's', '4', 'g', 'p', 'f', '?', 'H', 'K', 'G', 'F', 'l', 'L', 'R', 'r', 'v'

where “?” represents all very rare glyphs (such as the “picnic table” glyph). There are thus 27 glyphs (15 gallows and 12 regular) excluding the rare special glyphs like the picnic table.
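As a sketch of how the conversion table and the frequency orders fit together: variant glyphs are first expanded to their base spelling, and the base glyphs are then counted. The code below uses only a short excerpt of the seek/repl lists, assumes one transcription character per glyph, and runs on invented toy words; the function names are hypothetical.

```python
from collections import Counter

# Excerpt of the seek/repl table above; the full lists plug in the same way.
seek = ["m", "M", "n", "I", "C", "A"]
repl = ["iiN", "iiiN", "iN", "ii", "cc", "a"]
VARIANTS = dict(zip(seek, repl))

def to_base_glyphs(word):
    """Expand every variant glyph in a word to its base-glyph spelling."""
    return "".join(VARIANTS.get(glyph, glyph) for glyph in word)

def glyph_frequency_order(words):
    """Return the base glyphs used in `words`, most frequent first."""
    counts = Counter(glyph for word in words for glyph in to_base_glyphs(word))
    return [glyph for glyph, _ in counts.most_common()]

# Toy words, invented for illustration only.
print(to_base_glyphs("8am"))
print(glyph_frequency_order(["oc9", "oc", "o"]))
```

Running the second function separately over the Language A and Language B word lists would give two orderings of the kind listed above.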

Glyph Mixing Between A and B

I ran many trials using the base set of glyphs, comparing various sections of the VMs written in the different hands. In particular, the following folio collections were defined:

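# Folio numbers in each collection; list() keeps the concatenation valid on Python 3 too.
Special = {'HerbalRecipeAB': list(range(107,117)) + list(range(1,26)),
           'HerbalAB': list(range(1,57)),
           'HerbalBalneoAB': list(range(1,26)) + list(range(75,85)),
           'HerbalAstroAB': list(range(1,13)) + list(range(67,75)),
           'PharmaRecipeAB': [88,89,99,100,101,102] + list(range(103,117)),
           'AllAB': list(range(1,117))}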

The collection I used the most was the one called “HerbalBalneoAB”, which contains Herbal folios written in Language A, and Balneo folios written in Language B. The nice feature of this collection is that the number of words is around the same for both Languages, which makes comparing counts very easy:

Total words =  2846  Total Language A =  1581  Total Language B =  1584

As an example, here is a trial result for HerbalBalneoAB:

Language B ['o', '9', '1', 'a', 'i', 'f', 'c', 'y', 'h', 'e', 'K', 'N', '2', 's', '4', 'g', 'p', '8', 'k', 'H']
Language A ['o', '9', '1', 'a', 'i', '8', 'c', 'e', 'h', 'y', 'k', 'N', '2', 's', '4', 'g', 'p', 'K', '?', 'H']

In all the tests I ran, there were some common features in the results:

  • Mixing between “e” and “y” – when writing Language A, the use of “e” appears to be equivalent to the use of “y” in Language B, and vice versa
  • Mixing between 8, f, F, k, K, g, G, r, R, ? and so on – the Gallows glyphs swap amongst themselves, and with “8”

Just about all trials showed the “e”/“y” mixing. Tony Gaffney pointed out that these two glyphs are quite similar in stroke construction. The appearance of “8” amongst the swapping Gallows glyphs is curious.
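The mixing can be read off a trial result mechanically. The sketch below takes the two rows of the HerbalBalneoAB trial above, treats each column as “this Language A glyph corresponds to that Language B glyph” (an assumption about how the rows align), and chains together the glyphs that differ. The function name is hypothetical.

```python
def mixing_groups(lang_a, lang_b):
    """Chain together glyphs that differ between two aligned glyph rows."""
    moved = {a: b for a, b in zip(lang_a, lang_b) if a != b}
    targets = set(moved.values())
    # Visit open chains first (glyphs nothing maps onto), then closed cycles.
    starts = [a for a in lang_a if a in moved and a not in targets]
    starts += [a for a in lang_a if a in moved]
    groups, seen = [], set()
    for start in starts:
        if start in seen:
            continue
        chain, glyph = [start], start
        seen.add(start)
        while glyph in moved and moved[glyph] not in seen:
            glyph = moved[glyph]
            chain.append(glyph)
            seen.add(glyph)
        groups.append(chain)
    return groups

lang_b = ['o', '9', '1', 'a', 'i', 'f', 'c', 'y', 'h', 'e', 'K', 'N', '2', 's', '4', 'g', 'p', '8', 'k', 'H']
lang_a = ['o', '9', '1', 'a', 'i', '8', 'c', 'e', 'h', 'y', 'k', 'N', '2', 's', '4', 'g', 'p', 'K', '?', 'H']
print(mixing_groups(lang_a, lang_b))
```

For this trial the result is two groups: a chain running ?, k, K, 8, f, and the pair e, y.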

  1. March 5, 2013 at 11:57 am

    I think there may be a glitch in the font conversions – when I convert what you have given in Courier New to Voynich all is well –
    but in the line you have – Mixing between “8”, “f”, “F”, “k”, “K”, “g”, “G”, “r”, “R”, “?”
    (which is in Verdana) the “8” changes to “82” whilst the rest convert correctly.
    Maybe this is why the “8” is swapping with gallows glyphs!?

    • JB
      March 5, 2013 at 12:08 pm

      Hi Tony, I think that must be a WordPress weirdness – the software is certainly using the single character 8 – perhaps when I copy and paste into the non-Courier font it has added something – I can see the effect myself when taking the above line, copying it into Word and then changing the font to “Voynich”. If I remove the quotes around the characters, all is well.

  2. March 5, 2013 at 12:38 pm

    Strange coincidence then that 8 is the anomaly in both.

  3. JB
    March 5, 2013 at 1:53 pm

    tony :

    Strange coincidence then that 8 is the anomaly in both.

    I removed the quotes in the bullet point, and it should be fine now. The 8 does indeed mix with the Gallows, for some reason. One theory is that the 8 is a null, and so it doesn’t have any information content and can thus find itself swapping with rarer glyphs, like the Gallows.

  4. March 5, 2013 at 2:08 pm

    8 as a null – never in a million years – carry on.

  5. JB
    March 5, 2013 at 2:14 pm

    tony :

    8 as a null – never in a million years – carry on.

    Any particular reason why not?

  6. March 5, 2013 at 2:42 pm

    Because it isn’t a cipher – it’s a drawing of writing. I haven’t looked at it for about a year now (still read the threads though) and thought what you are doing comparing the ‘languages’ might throw some light on the order of insertion, which I was finding too difficult. Oddly enough, looking through your previous posts the other night, under ‘Ink density and glyph order’ where you look at f49v: take another look at it. Although I had seen this before, I now realised what it is: the VM author has filled in the first bit, a half-finished canvas as it were, and has written down the side the order in which his friend is to complete it. (The friend isn’t quite as skilful with the quill!)

  7. March 5, 2013 at 2:51 pm

    PS. A random distribution with order imposed on it!

    • March 7, 2013 at 3:20 am

      Hi Julian,
      You’ll have spotted in realigning the column on 49v that the 8 corresponds to the gallows!!
      Could these all be nulls?
      I suspend my belief of it being just a drawing of writing.
      One of my first attempts was the following – (scroll to bottom of page)
      I treated the 8 as a c in this – maybe it is a null and the solution is something along those lines after all.
      What you’ve probably not noticed is that when 4o first appears in the VM it has a bar over it – thereafter it appears only 9 times but with an arc above it – one of these is on 49v and 3 are on the reverse page 48r. Maybe these pages were once at the beginning?
      Good luck with it. Tony

      • JB
        March 7, 2013 at 8:46 am

        Hi Tony,
        You are ahead of me: what do you mean by “realigning the columns” on f49v? I looked at this page many times, with the tantalising “1 2 3 4 5” on the LHS, and the column of glyphs to the right, before the text proper. That column contains an 8 – is that what you are referring to? (I notice also it contains a “y” but no “e” …)

        I am reading your stuff on aerobush – fascinating!

  8. March 7, 2013 at 9:55 am

    In Eva the single column of characters is –
    F o r y e s k s F o r y e ? s F o r y c ? 8 y c k y
    (a few characters unclear)
    Splitting at F gives –
    F o r y e s k s
    F o r y e ? s
    F o r y e ? 8 y c k y
    The ? here may be the ascender of EVA s
    Me thinks not a coincidence this time.

    • JB
      March 7, 2013 at 1:30 pm

      I need to make an EVA-to-Voyn_101 translator!

      In Voyn_101 I think those glyphs are:

      f o y 9 c | h s
      g o y 9 c | s
      g o y 9 c | 8 9 c h 9

      where | is your ascender (sure looks like it!), the same flourish that appears on the 2 (and the instances of “4o” you mentioned).

      Observation: where the “8” glyphs on this folio appear on a line, they do so mostly in pairs.
