Archive for the ‘Features’ Category

Using t-distributed Stochastic Neighbor Embedding (TSNE) to cluster folios

September 26, 2017 19 comments

For this attack we’ll use the Takeshi EVA transcription to count the number of times each glyph appears on each folio. Dividing each count by the folio’s total gives a vector of glyph probabilities for each folio – the vectors are 24 long, as there are 24 EVA glyphs in the alphabet.

For example, here is the probability vector for f1r:

1r 28 lines {'a': 0.08917835671342686, 'c': 0.08216432865731463, 'e': 0.05110220440881764, 'd': 0.06212424849699399, 'f': 0.00501002004008016, 'i': 0.08617234468937876, 'h': 0.12324649298597194, 'k': 0.045090180360721446, '*': 0.012024048096192385, 'm': 0.001002004008016032, 'l': 0.03507014028056112, 'o': 0.11923847695390781, 'n': 0.050100200400801605, 'p': 0.012024048096192385, 's': 0.06412825651302605, 'r': 0.04408817635270541, 't': 0.03907815631262525, 'y': 0.07915831663326653}

(This reads as glyph “a” appears 8.9% of the time on f1r, glyph “c” 8.2% of the time, and so on.)
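
As a rough illustration of how such a probability vector could be computed, here is a minimal sketch. It assumes the transcription for one folio is available as a plain string with whitespace between words and lines; the function names and the sample text are hypothetical and are not taken from the code used for this post.

```python
from collections import Counter

def glyph_probabilities(folio_text):
    """Return {glyph: relative frequency} for one folio's transcription text."""
    # Count only glyph characters; spaces and line breaks are not glyphs.
    counts = Counter(ch for ch in folio_text if not ch.isspace())
    total = sum(counts.values())
    return {glyph: n / total for glyph, n in counts.items()} if total else {}

def to_vector(probs, alphabet):
    """Lay the probabilities out in a fixed glyph order, using 0.0 for absent glyphs."""
    return [probs.get(glyph, 0.0) for glyph in alphabet]

# Hypothetical two-line snippet, not the real f1r text.
sample = "daiin chol\nchor daiin"
probs = glyph_probabilities(sample)
alphabet = sorted(probs)
vector = to_vector(probs, alphabet)
```

Using one fixed alphabet for every folio keeps the vectors the same length and in the same glyph order, which is what makes them comparable from folio to folio.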

The question is: how similar are these frequency distributions across all the folios? Using t-SNE (implemented in scikit-learn here: http://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html) we can try to find a 3D arrangement of the folios in which folios with similar glyph frequency vectors end up close together.
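
For readers who want to try this, a minimal sketch of the embedding step with scikit-learn's `TSNE` class might look like the following. The input matrix here is random stand-in data (40 hypothetical folios by 24 glyph probabilities), and the perplexity and random seed are arbitrary assumptions, not the settings used for the plot in this post.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in data: 40 hypothetical folios x 24 glyph probabilities (rows sum to 1).
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(24), size=40)

# perplexity must be smaller than the number of folios; 10 is an arbitrary choice.
tsne = TSNE(n_components=3, perplexity=10, random_state=0)
embedding = tsne.fit_transform(X)  # shape (40, 3): one 3D point per folio
```

Each row of `embedding` can then be plotted as a point and coloured by its Currier language label. Different seeds and perplexity values give different layouts, so it is worth checking that any apparent clusters survive several runs.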

Here’s a typical result: each folio appears as a point in 3D space …

The colour coding is: red dots are folios that Currier identified as “Language A”, blue are “Language B”, and the remaining black dots do not have an assignment.

The red and blue dots are clearly well separated, which gives independent support to Currier’s assignments.

There are a couple of notable features:

  • f57r and f57v are labelled as Language A (red) – but it looks like they should be labelled as Language B (blue)
  • The unassigned folios (black dots) look like they are all Language B

Puzzles of the Voynich Manuscript

December 19, 2016 4 comments

I just published the guide “Puzzles of the Voynich Manuscript” as an ebook on Amazon. A paperback version is also available. From the blurb:

This illustrated guide to the Voynich Manuscript is targeted mainly at those who have recently come across the book and are wondering what all the fuss is about, and why, after more than a century of effort, nobody has cracked its code yet. It should also be useful as a set of tests for those who believe they may have cracked the code, so that they can see how their solution matches up against each of the puzzles or notable features described. And finally, it is hopefully of interest to those already familiar with the manuscript – perhaps they will find something new or thought provoking within.

Readers of this blog, who tend to be Voynich experts already, will probably not find much (if anything) new in the guide, as it is principally intended for newcomers to the Manuscript.

Categories: Features

Are the Glyphs placed in specific folio locations?

June 6, 2016 15 comments

Based on a lot of circumstantial evidence related to the weirdness of the Voynich text (such as the odd repeating words, the curious faintness and boldness of some glyphs, and the sometimes curious positioning of text words and lines), it appears that the folios may not have been written Left to Right (or Right to Left) and Top to Bottom.

Instead, suppose the scribe started each folio with a prescription: for example “put an h-Gallows at the top left, then put a c in the middle of the folio, then a 9 at the end of the last line”, and so on. This would be sort of like filling out the answers to a bizarre crossword puzzle.

If there was such a prescription, might it explain some of the Voynich text features?

In the following selected charts I’m showing a virtual folio from the Recipes section. Each chart has lines and columns. Line 1 position 1 is the top left of the folio. Let’s look at the chart for glyph “o”:

Recipes_o

Each disc indicates that the “o” appears at least twice in that location in the Recipes. The size of the disc indicates how many times it appears there: the bigger the disc, the more times it appeared. The random appearance of the chart suggests that “o” is not placed on the page in any particular pattern.
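
The counting behind such a chart can be sketched as follows. This is only an illustration under stated assumptions: each folio is taken to be a list of text lines, every character (including a space) occupies one column, and the function names and sample folios are hypothetical.

```python
from collections import Counter

def position_counts(folios, glyph):
    """Count how often `glyph` occurs at each (line, position) across folios.

    `folios` is a list of folios, each a list of text lines; line and
    position numbers are 1-based, so (1, 1) is the top left.
    """
    counts = Counter()
    for lines in folios:
        for line_no, line in enumerate(lines, start=1):
            for pos, ch in enumerate(line, start=1):
                if ch == glyph:
                    counts[(line_no, pos)] += 1
    return counts

def discs(counts, minimum=2):
    """Keep only locations seen at least `minimum` times (one disc each)."""
    return {loc: n for loc, n in counts.items() if n >= minimum}

# Two tiny hypothetical folios, not real transcription data.
folios = [["sol or", "sar ol"], ["sor ol", "dar os"]]
s_discs = discs(position_counts(folios, "s"))
```

The values in the resulting dictionary would set the disc sizes, and the keys give the line and column at which each disc is drawn.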

Let’s now look at the “s” glyph:

Recipes_s

Here it is clear that this glyph vastly prefers the first column, but not the first line. It is infrequently found elsewhere on the folio. In contrast, take a look at the rare glyphs (I just call them “?”):

Recipes_?

These abhor the early columns, and love the ends of the lines. They also seem to prefer the ends of the first lines (notice a little cluster there). Perhaps they hate the “s” glyphs…

The “4” glyph:

Recipes_4

The gap after the first column is explained by how “4” only appears at the start of a word.

Here are some more glyphs:

Recipes_y

Recipes_1

Recipes_2

Recipes_8

Recipes_9

No conclusions here, as usual!

Addendum: the distribution for “c”:

Recipes_c

Common Words in Language A that are Rare in Language B

March 15, 2013 40 comments

The question was posed: which words are common in Language A but rare in Language B, and vice versa?

For this study I used the Herbal/Balneo folios that are Language A and B respectively (folios 1-25 and 75-84).

There are around 2900 unique words in total, with around 1600 being used in Language A and around 1630 in Language B, so only around 330 words appear in both.

Here are the results. The tables show the words in order of decreasing value of the frequency in A (B) divided by the frequency in B (A), and show the number of occurrences of each word in both Languages.
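
The ranking described above can be sketched as below. The handling of words that never occur in the other Language is an assumption on my part (a stand-in count of 0.5 so the division is defined), and the word lists are hypothetical rather than the real Herbal/Balneo data.

```python
from collections import Counter

def rank_by_ratio(words_a, words_b, pseudo=0.5):
    """Rank words by relative frequency in A divided by relative frequency in B.

    `pseudo` is an assumed stand-in count for words that never occur in B,
    so that the division is always defined.
    """
    count_a, count_b = Counter(words_a), Counter(words_b)
    total_a, total_b = len(words_a), len(words_b)
    rows = []
    for word, n_a in count_a.items():
        n_b = count_b.get(word, 0)
        denominator = (n_b if n_b else pseudo) / total_b
        rows.append((word, n_a, n_b, (n_a / total_a) / denominator))
    # Largest ratio first: most characteristic of A relative to B.
    return sorted(rows, key=lambda row: row[3], reverse=True)

# Hypothetical word lists, not the real Herbal/Balneo counts.
lang_a = ["chol", "chol", "chol", "daiin", "chor", "chol"]
lang_b = ["qokedy", "daiin", "daiin", "chol", "qokedy", "shedy"]
common_in_a = rank_by_ratio(lang_a, lang_b)
common_in_b = rank_by_ratio(lang_b, lang_a)
```

Each row holds the word, its count in the first list, its count in the second list, and the ratio, which matches the columns described for the tables. Swapping the arguments gives the “common in B, rare in A” ordering.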

Common in A, rare in B


Common in B, rare in A


Conclusion? I have no idea … for now.

Categories: Features, Folios, Languages

Language A and B Again

March 13, 2013 12 comments

A tentative conclusion from comparing Language A and Language B is that the non-gallows glyphs are used in the same way in both Languages.

That is to say, they appear to mean the same thing. So the “o” in A means the same as the “o” in B.
There is some persistent “mixing” between the e/y glyphs, which is illustrated by the example result below:
ABMixing
There is also some doubt about the “8” glyph, which sometimes seems to mix with the gallows glyphs (e.g. in some cases, the “8” appears in A to function in the same way as a gallows glyph in B and vice versa). This may simply be an error in the comparison method, or it may be that the “8” is a null, or it may be due to some other effect.
The gallows glyphs are different – they don’t appear to mean the same in A and B. I’m focussing on those glyphs now.

How was the Voynich Manuscript text written?

August 23, 2012 7 comments

I’ve spent many happy hours poring over the text, and am convinced that it is not as “simple” as it appears (i.e. the “words” are not words at all). Here are some conjectures:

  1. The lines look as though the glyphs were written down from left to right, but they were not.
  2. The scribe started with the drawing and started writing glyphs at various positions on the page.
  3. The method used for choosing each glyph and for deciding its position involved a mechanical apparatus, perhaps a set of co-rotating cipher wheels that were used to convert each character in the Latin plaintext into a VMs glyph and page position.
  4. The apparatus is set to a new starting position for each folio/page (so e.g. Bettony labels on the three folios the plant appears on are different).
  5. The density of ink is a clue to the order in which the glyphs were written (nib/quill freshly dipped and full of ink, or almost dry).
  6. At some point the scribe finishes writing the needed glyphs, and then fills out the spaces with pseudo-random words.
  7. There is no punctuation because what is seen are not words. What is seen makes no grammatical sense because the glyphs are not ordered and positioned linearly across the page.
  8. Perhaps the secret to unwinding the cipher is in the labels. The labels on one page are constrained to have been produced by the same initial position of the cipher apparatus, and they must come from the plaintext label.

There are so many clues as to what is going on, yet putting them all together is hugely challenging.

For example, Jim Reeds suggested years ago that the order in which the text had been written on the sunflower page, f33v:

f33v

was first the text to the left of the left stalk, second the text in between the stalks, and finally the text to the right of the last stalk. This is compelling, since the ink density looks different, and the lines don’t line up well across the stalks. It becomes clearer if you saturate the image:

f33v Saturated

And in that image, what jumps out are the glyphs that are darker than the others. Those can be seen more clearly in black/white:

f33v monochrome drop

where the “o”, “y”, “8”, “e” stick out like sore thumbs. Most of those are in the left section, some in the middle, and fewer in the right. Why are these glyphs bolder, why are they inked more heavily? Were these the glyphs initially placed on the page, containing the real information, with the rest, unimportant and pseudo-random, all added later to make the text look “normal”?

Categories: 8, ay, Characters, e, f33v, Features, Jim Reeds, Latin, o, oy, Theories, Writing, y

Odd Distributions of “oy” and “ay”

August 23, 2012 6 comments
A few weeks ago I posted some images showing the positions of the 
gallows characters on each of the VMs folios.
(The blog post is here if you missed it: https://voynichattacks.wordpress.com/2012/06/29/page-positional-gallows-mk-ii/ )

With a couple of small changes to the code, I have generated a set of 
images showing the positions of the "oy" and "ay" glyphs on each of the 
folios. (I believe the oy and ay are transcribed in EVA as ol and or, 
not sure.) This was prompted by the observations that
a) these glyph pairs often occur many times on a folio,
b) on some folios they don't appear at all,
c) on some other folios only "ay" appears, on others only "oy",
d) often the "oy" glyphs appear to the left of each line, and the "ay" 
to the right, and sometimes vice-versa.

I wanted to link to a few example images from the set. The colour code 
is "oy" yellow and "ay" pink, with the coloured square indicating the 
position of the "o" or "a", a grey square indicating another glyph, and 
a black square a space.
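
For anyone wanting to reproduce this kind of image, the colour coding can be sketched 
as below. This is not the code used to generate the images: the function name and 
sample lines are hypothetical, and it assumes each folio line is a plain string 
written in the same "oy"/"ay" notation used in this post.

```python
def colour_grid(lines):
    """Map each character of a folio to a colour name, one row per text line.

    Colour code from the post: "o" of "oy" is yellow, "a" of "ay" is pink,
    any other glyph is grey, and a space is black.
    """
    grid = []
    for line in lines:
        row = []
        for i, ch in enumerate(line):
            pair = line[i:i + 2]  # this glyph plus the one after it
            if pair == "oy":
                row.append("yellow")
            elif pair == "ay":
                row.append("pink")
            elif ch == " ":
                row.append("black")
            else:
                row.append("grey")
        grid.append(row)
    return grid

# Hypothetical two-line folio written with the post's "oy"/"ay" notation.
grid = colour_grid(["oy day", "ay oy"])
```

Each row of the grid corresponds to one line of the folio, so drawing one coloured 
square per entry gives an image laid out like the examples linked below.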

1)  Examples of "oy"s at the left, and "ay"s at the right:
f18v http://imageshack.us/photo/jjbunn/31/gffolio18v.jpg/
f29v http://imageshack.us/photo/jjbunn/571/gffolio29v.jpg/

2) Example of the opposite: "ay"s at the left, "oy"s at the right:
f26v http://imageshack.us/photo/jjbunn/826/gffolio26v.jpg/

3) Example of only "oy" on the folio:
f21r http://imageshack.us/photo/jjbunn/809/gffolio21r.jpg/

4) Example of only "ay" on the folio:
f26r http://imageshack.us/photo/jjbunn/402/gffolio26r.jpg/

5) Example of numerous "oy"s to only one "ay":
f37v http://imageshack.us/photo/jjbunn/51/gffolio37v.jpg/

6) Example of an even mixture of both types, across the lines:
f39v http://imageshack.us/photo/jjbunn/19/gffolio39v.jpg/

What might be going on here? Nick Pelling commented on my blog that GC, 
while working on the Voyn_101 transcription, got the impression that the 
change from dominant "oy" to dominant "ay" was a vocabulary change in 
the text (at least, that's what I understood from Nick's comment).

I'd welcome comments on this. Also, if you would like me to generate 
images for your favourite glyph's distribution, it's a trivial process - 
just let me know.
Categories: ay, Glen Claston, Nick Pelling, oy