The Vigenère Cipher

The safest way to communicate in secret for over 300 years.

Apr 08, 2021


In 1553 an Italian by the name of Giovan Battista Bellaso devised an easy-to-use cipher (a cipher swaps letters for other letters, as opposed to a code, where words are swapped for other words) that allowed anyone to write a letter readable only by the intended recipient. To anyone without the keyword, the letter would just be unintelligible gibberish.

Before we learn how that was done, we need to learn the simple type of cipher that it was a major upgrade to. That simple cipher is a “monoalphabetic simple substitution cipher.” To make one you scramble up the alphabet, and then match that up with the regular alphabet. It would look something like this:

IOJQPRNLSTXDKMZVHUEWACGFBY

ABCDEFGHIJKLMNOPQRSTUVWXYZ

So any time you want to use the letter A, you would use an I; any time you want to use a B, you would use an O; and so on. Of course, the scrambled alphabet could be in any order.
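
For anyone who would rather see that as code, here is a minimal Python sketch of the same idea. The scrambled alphabet is the one shown above; the function name `substitute` and the sample message are just my own picks for illustration.

```python
import string

PLAIN = string.ascii_uppercase            # ABCDEFGHIJKLMNOPQRSTUVWXYZ
SCRAMBLED = "IOJQPRNLSTXDKMZVHUEWACGFBY"  # the scrambled alphabet from above

# Lookup tables: plain letter -> cipher letter, and the reverse.
ENCRYPT = str.maketrans(PLAIN, SCRAMBLED)
DECRYPT = str.maketrans(SCRAMBLED, PLAIN)

def substitute(text, table):
    """Swap every letter using the given table; spaces and punctuation pass through."""
    return text.upper().translate(table)

secret = substitute("ATTACK AT DAWN", ENCRYPT)
print(secret)
print(substitute(secret, DECRYPT))
```

Swapping in a different scrambled alphabet only means changing the `SCRAMBLED` string.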

This can be sufficiently hard to crack if you are just kids having fun in your clubhouse, but it will not cut the mustard if you’re dealing with real deal grown up secrets for which the wrong person getting the information is very dangerous. This cipher falls prey quickly to the first tool of any cryptanalyst: frequency analysis. From my days in high school, when I was steeped in cryptology, I have memorized the order of frequency for the eight most common letters: E-T-A-O-I-N-S-H.

Frequency analysis - Wikipedia — English Language Letters by Usage

With a sufficiently long encoded message using the cipher described above, a cryptanalyst will figure out that the reason P is coming up so frequently is that it represents E, and the reason W is coming up so frequently is that it represents T. Once you’ve got T and E, it’s pretty easy to figure out that the letter that keeps coming between those two in a three-letter word represents H. Before you know it your entire message is plainly readable.
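
The counting part of frequency analysis is easy to sketch in Python. This is only the tallying step, with a function name (`letter_frequencies`) I made up for illustration; the guessing of which letter stands for which is still up to you.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, count) pairs, most frequent first, ignoring non-letters."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    return Counter(letters).most_common()

# On a long ciphertext, the top few letters are the first candidates
# for E, T, A and so on.
for letter, count in letter_frequencies("Hello, World!")[:3]:
    print(letter, count)
```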

The Vigenère cipher ingeniously complicates the process for the writer and receiver of the cipher a little bit (we are going from monoalphabetic to polyalphabetic), while complicating the cracking of it for an outsider immeasurably. Well, if you measure it in human effort, it complicated it by approximately three centuries. Here’s how it works:

A simpler version of the already simple substitution cipher described above is called a “Caesar Cipher.” Instead of a randomly scrambled alphabet, you just shift the alphabet a certain number of letters, so for instance you might have:

XYZABCDEFGHIJKLMNOPQRSTUVW

ABCDEFGHIJKLMNOPQRSTUVWXYZ
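
In code, a Caesar shift is just arithmetic on letter positions, wrapping around at 26. A rough Python sketch (the function name `caesar` is my own), where the example above corresponds to a shift of -3:

```python
def caesar(text, shift):
    """Shift every letter by `shift` places, wrapping around the alphabet."""
    out = []
    for c in text.upper():
        if "A" <= c <= "Z":
            out.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(c)  # leave spaces and punctuation alone
    return "".join(out)

# Shifting by -3 reproduces the top row of the example above.
print(caesar("ABCDEFGHIJKLMNOPQRSTUVWXYZ", -3))
```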

A Vigenère Cipher uses a table of all 26 possible Caesar Ciphers:

The Vigenère Square or Vigenère Table, also known as the *tabula recta.* It is used for encryption and decryption.

Say it is the year 1600 and we want to write letters to each other where we talk shit about the King; but we don’t want to be executed if the letter falls into the wrong hands. First we would need to agree on a keyword between us. If we need to start communicating in secret before a keyword can be agreed on, we would have to just use our best judgement to think of a word that has special meaning specifically to the two of us, use that, and hope the recipient tries that keyword out. For the purposes of this example let’s say we use the keyword “VEGA.”

If I wanted to encrypt a message using the Vigenère Cipher with the keyword VEGA that said “Thank you for reading my substack,” I would line up the keyword repeatedly on top of my message, like this:

VEGAVEGAVEGAVEGAVEGAVEGAVEGA

THANKYOUFORREADINGMYSUBSTACK

Then for each letter I would look at the column for the letter of the keyword I’m on, and match that with the corresponding row for the letter I’m trying to encrypt.

So for encrypting the letter T, using the first letter V of my keyword, I would go over to the V column, go down to the T row, and find the letter O. Doing this repeatedly my message would get encrypted as:

OLGNF CUU ASX RZEJIIK SY NYHSOEIK

To decrypt it and read my message the recipient would just reverse the order of operations I did to encrypt it. You’d look at what letter O, under the V column, really represented. You would find out it is T. And so on.
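
Looking things up in the table is the same as adding letter positions (A=0, B=1, … Z=25) and wrapping around at 26, so the whole procedure fits in a few lines of Python. This is a sketch under my own assumptions: the function name `vigenere` is made up, and I only advance the keyword on letters, so spaces don't use up a keyword letter.

```python
def vigenere(text, keyword, decrypt=False):
    """Encrypt (or decrypt) text by shifting each letter by the matching keyword letter."""
    key = [ord(k) - ord("A") for k in keyword.upper()]
    sign = -1 if decrypt else 1
    out = []
    i = 0  # counts letters only, so spaces do not advance the keyword
    for c in text.upper():
        if "A" <= c <= "Z":
            shift = sign * key[i % len(key)]
            out.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
            i += 1
        else:
            out.append(c)
    return "".join(out)

secret = vigenere("THANK YOU FOR READING MY SUBSTACK", "VEGA")
print(secret)
print(vigenere(secret, "VEGA", decrypt=True))
```

The first word comes out as OLGNF, matching the table lookup described above (T under V gives O), and decrypting is the same walk with the shifts subtracted instead of added.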

For three centuries you could be supremely confident that if you communicated using this method, then even if your letter fell into the wrong hands, it would not be decipherable.

Human ingenuity is a helluva drug though, and eventually some crafty minds found a way to crack it. If you’re sufficiently arrogant you can pause your reading here for a bit and ponder a way to do what all of humanity failed to do for ~300 years.

I’ll encode a longer message using a keyword I will not tell you, and then we will crack it together:

Olk szguny tandg cam my tci mrzezenx uf vpr Pproc Rexs, zzkrtfudt otorw zhvx. Zhz jornx vuimi wvv can txeoxe gmigt, olk tcmxd jrk wvw somerlt ehhjvxeix gny rut kextdgalvvry drzemiytdrm tj etagcfe bmbei luw jrk sdhkd dx can. Xne nxxiik uf qmitjvoen Lgnimhag mtfgmitzh un olk Rjqgnn eltzv ne nyiczwyfppry xvusnij tci Glkw grz xxugc ytprtiik. Zhz pgso fuof M liimyhzh can hkek aurf valzw lom jucpwkd nyiczwy ii e jinxxaxxkd rsxly. M gm cebiik yhmmsp vrj rdgk fjv ruign aih zhz wguxi zhvx mozw ciol zhz wnrdqv in idtmisegc ypdge. Tci reikzh jj zhdw senwggz my ljrm eisagc xnao xne yiirttzijr hy feyifm kxvqonvxooi wnoppj bz wacxiysayr.

Here is how to crack it, in a nutshell (pun not intended, but noted).

What you do first is look for strings of three or more characters in a row that get repeated in the text. Here we see OLK comes up three different times and ZHZ comes up four different times. I have bolded them so it’s a bit easier to see. We can reasonably assume that each OLK is the same word, encrypted with the same part of the keyword, and the same goes for each ZHZ. That can only happen if the distance between two repeats is a multiple of the keyword length. So if we measure the distances between the OLKs and between the ZHZs, we can infer the length of the keyword by finding the greatest common divisor of all those distances (strictly speaking, the keyword length is that number or one of its divisors). In this case the greatest common divisor, and hence the length of my keyword, is 4.
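
Here is a rough Python sketch of that first step. The function names are mine, and it uses a simplifying assumption worth flagging: it takes the greatest common divisor of every repeat it finds, so a single coincidental repeat in a long text can drag the answer down to 1. A person doing this by hand would throw out the obvious flukes, or stick to a few long, frequent repeats like OLK and ZHZ.

```python
from collections import defaultdict
from functools import reduce
from math import gcd

def repeated_distances(ciphertext, length=3):
    """Distances between consecutive occurrences of every repeated letter string."""
    letters = "".join(c for c in ciphertext.upper() if "A" <= c <= "Z")
    positions = defaultdict(list)
    for i in range(len(letters) - length + 1):
        positions[letters[i:i + length]].append(i)
    distances = []
    for spots in positions.values():
        distances.extend(b - a for a, b in zip(spots, spots[1:]))
    return distances

def guess_key_length(ciphertext, length=3):
    """GCD of all repeat distances, or None if nothing repeats."""
    distances = repeated_distances(ciphertext, length)
    return reduce(gcd, distances) if distances else None

# Tiny made-up sample: ABC sits at positions 0, 4 and 12, BCX at 1 and 5.
print(guess_key_length("ABCXABCXYZWQABC"))
```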

What you do next is split the text into as many groups as the keyword has letters. Letters 1, 5, 9, 13, 17… go together (as they have all been encrypted using the first letter of the keyword), letters 2, 6, 10, 14, 18… go together (as they have all been encrypted using the second letter of the keyword), and so on. These four groups are then just four different monoalphabetic simple substitution ciphers (four Caesar shifts, in fact), which we can easily attack and crack with frequency analysis.
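
The splitting and the per-group frequency attack can be sketched in Python too. Everything here is my own illustration rather than a standard recipe: the function names are made up, the `ENGLISH` table holds approximate letter percentages for typical English text, and each group is scored by trying all 26 shifts and keeping the one whose result looks most like English.

```python
# Approximate letter frequencies (percent) for typical English text.
ENGLISH = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}

def split_columns(ciphertext, key_length):
    """Group letters 1, 5, 9... then 2, 6, 10... and so on (for key_length 4)."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    return ["".join(letters[i::key_length]) for i in range(key_length)]

def best_shift(column):
    """Try all 26 shifts and keep the one whose decryption looks most like English."""
    def score(shift):
        return sum(ENGLISH[chr((ord(c) - 65 - shift) % 26 + 65)] for c in column)
    return max(range(26), key=score)

def recover_keyword(ciphertext, key_length):
    """One recovered shift per group, turned back into a keyword letter."""
    return "".join(chr(best_shift(col) + 65)
                   for col in split_columns(ciphertext, key_length))

# Sanity check: the letter E encrypted over and over with VEGA gives ZIKE.
print(recover_keyword("ZIKEZIKEZIKE", 4))
```

The scoring only becomes reliable when each group has a decent number of letters, which is why a long message matters.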

If you actually go through the work to do that you will find out that I once again used “VEGA” as my keyword, and the message was:

The Second Punic War is the greatest of all Punic Wars, everybody knows that. The first Punic War was pretty great, the third one was morally abhorrent and not particularly interesting to analyze given how one sided it was. The string of victories Hannibal inflicted on the Romans after he successfully crossed the Alps are truly stunning. The last book I finished was Deep Work: Rules for Focused Success in a Distracted World. I am having shrimp and rice for lunch and the sauce that goes with the shrimp is extremely spicy. The length of this message is long enough that the decryption by Kasiki examination should be successful.

Not exactly a plan for regicide; sometimes when you violate someone’s Fourth Amendment rights to surreptitiously read their encrypted messages you merely find out boring stuff, like their hobbies or what they had for lunch.

It wasn’t until 1863, 310 years after the Vigenère Cipher was invented, that Friedrich Kasiski devised and published this method of cracking it (although it appears Charles Babbage figured out how to do this as early as 1846, meaning the cipher only truly stood unsolved for 293 years).

Modern encryption is quite a bit different. One of the best-known methods is called RSA. It involves computers, very large prime numbers, and two keys: one (the “public key”) for the sender to encrypt with and a second (the “private key”) for the receiver to decrypt with. This method is actually rather slow, so it is mostly used just to encrypt a key (playing the role “VEGA” played in our examples), which is then used in a second, much faster layer of encryption that actually encrypts and decrypts the information.
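
To give a feel for the two-key idea, here is a toy Python sketch of the arithmetic behind RSA. The primes are tiny numbers I picked for illustration, the function names are my own, and this "textbook" version leaves out the padding and the enormous primes that real RSA depends on, so treat it as a demonstration only and nothing you would ever use for actual secrets. It needs Python 3.8 or later for the modular inverse.

```python
# Toy numbers only: real RSA uses primes hundreds of digits long, plus padding.
p, q = 61, 53
n = p * q                  # part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent, must share no factor with phi
d = pow(e, -1, phi)        # private exponent: the inverse of e modulo phi

def rsa_encrypt(m, public_key):
    """Anyone holding the public key (e, n) can encrypt a number m < n."""
    e, n = public_key
    return pow(m, e, n)

def rsa_decrypt(c, private_key):
    """Only the holder of the private key (d, n) can undo it."""
    d, n = private_key
    return pow(c, d, n)

message = 65               # stand-in for the secret key being sent
ciphertext = rsa_encrypt(message, (e, n))
print(ciphertext, rsa_decrypt(ciphertext, (d, n)))
```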

The most interesting story in the history of cryptology is probably the story of the German Enigma machine, Bletchley Park, Alan Turing, and the Bombe. It is the story that sparked my entire interest in history and cryptology in fact. But that’s another story, for another day, and another post.


Jaime's Musings
