Cryptoquip Solver

I wrote a Cryptoquip solver.

example cryptoquip

My big contribution here is using the “shape” of the words to narrow the choice of clear text letters.

I get the Denver Post newspaper on Wednesdays and Sundays, originally for the Jumble puzzle, later for the Sudoku, and now for the Cryptoquip.

As shown above, the Cryptoquip is a phrase, proverb, punny joke or witty saying, but the letters are enciphered. It’s a pure substitution cipher, but not a Caesar Cipher: if ‘A’ is enciphered as ‘M’, ‘B’ is not necessarily enciphered as ‘N’; it could be any other letter. I believe that a letter is never enciphered as itself, but I could be wrong, so my solver does not assume that.

The usual method of breaking these kinds of ciphers has you doing frequency analysis: counting frequencies of cipher letters, then matching high-frequency-of-appearance cipher letters to high-frequency-of-appearance letters in plain text, as in the famed “ETAOIN SHRDLU” most-common-letters mnemonic. I’ve tried this method of deciphering before. You need a lot of cipher text to get it to work well.
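For comparison, here is a minimal Python sketch of the counting step in frequency analysis. The function name letter_frequencies is made up for illustration, and the cipher text is the example puzzle from this post. It only counts letters; matching the counts against “ETAOIN SHRDLU” is left to the reader.

```python
from collections import Counter

def letter_frequencies(cipher_text):
    """Return (letter, count) pairs, most frequent first, ignoring non-letters."""
    letters = [ch for ch in cipher_text.upper() if ch.isalpha()]
    return Counter(letters).most_common()

cipher = ("TQLP YPDILY QDFL XQL GLYUHL XR YQRT XQLUH OHDXLZSVPLYY "
          "GR CRS XQUPI XQLC TUVV OUFL ZDPOY")

# Show the five most common cipher letters in this one short puzzle.
for letter, count in letter_frequencies(cipher)[:5]:
    print(letter, count)
```

With only one sentence of cipher text, the counts are small and close together, which is why this method alone tends to struggle on a Cryptoquip.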

I suppose part of the fun of doing Cryptoquips by hand is using your intuition about word length and punctuation and so forth, and also making trials that probably won’t work. The blurb in the Denver Post advises you to look at punctuation.

Cryptoquips don’t encipher word boundaries, so there’s more information in a given Cryptoquip than just letter frequencies: word length and repeated letters are revealed.

My program uses the length and letters of an enciphered word to create a “shape” or “configuration”.

Enciphered word “OHDXLZSVPLYY” has the shape “012345678499”:

OHDXLZSVPLYY
012345678499

The “shape” has the same length as the cipher word. Each distinct cipher letter gets a number in order of first appearance, so both ‘L’ cipher letters show up as ‘4’ characters in the shape, and both ‘Y’ cipher letters show up as ‘9’ characters.
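A shape like this can be computed in a few lines. The following is a Python sketch, not the solver's actual code. The SYMBOLS string is an assumption: the post only shows digits, so letters are used here for an eleventh and later distinct letter.

```python
# Assumption: digits first, then letters, giving 26 symbols in total,
# enough for a word in which every letter of the alphabet is distinct.
SYMBOLS = "0123456789abcdefghijklmnop"

def shape(word):
    """Replace each letter by a symbol for the order of its first appearance."""
    first_seen = {}
    out = []
    for ch in word.lower():
        if ch not in first_seen:
            first_seen[ch] = SYMBOLS[len(first_seen)]
        out.append(first_seen[ch])
    return "".join(out)

print(shape("OHDXLZSVPLYY"))  # 012345678499
print(shape("gratefulness"))  # 012345678499
print(shape("TUVV"))          # 0122
```

A cipher word and its clear text always have the same shape, because a substitution cipher replaces each letter consistently.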

It’s possible to narrow which words might match an enciphered word based on the “shape”.

OHDXLZSVPLYY has only 3 “shape” matches in my dictionary:

  • gracefulness
  • gratefulness
  • motherliness

Notice that all 3 matches end in “ness”, so it’s possible to match 3 cipher letters (P, L and Y) with clear text letters (n, e and s) based on the shape of this word alone.
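The lookup and the “all matches agree” step can be sketched in Python as follows. The names index_by_shape and forced_letters are hypothetical, and the six-word list stands in for a real dictionary file.

```python
from collections import defaultdict

def shape(word):
    """Tuple of first-appearance indexes, e.g. 'will' -> (0, 1, 2, 2)."""
    first_seen = {}
    return tuple(first_seen.setdefault(ch, len(first_seen)) for ch in word.lower())

def index_by_shape(words):
    """Group dictionary words under their shape."""
    index = defaultdict(list)
    for word in words:
        index[shape(word)].append(word.lower())
    return index

def forced_letters(cipher_word, candidates):
    """Map cipher letter -> clear letter wherever every candidate agrees."""
    forced = {}
    for pos, cipher_letter in enumerate(cipher_word.lower()):
        letters_here = {word[pos] for word in candidates}
        if len(letters_here) == 1:
            forced[cipher_letter] = letters_here.pop()
    return forced

# Stand-in dictionary; a real run would load a full word list.
words = ["gracefulness", "gratefulness", "motherliness", "when", "show", "will"]
index = index_by_shape(words)

matches = index[shape("OHDXLZSVPLYY")]
print(matches)                                  # ['gracefulness', 'gratefulness', 'motherliness']
print(forced_letters("OHDXLZSVPLYY", matches))  # {'l': 'e', 'p': 'n', 'y': 's'}
```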

When a cipher letter appears in more than one enciphered word, it’s also possible to eliminate candidate clear text letters by comparing the letters that each word’s shape allows at that cipher letter’s position.

Cipher letter T appears in 3 enciphered words in the example above, TQLP, YQRT, TUVV, which have shapes “0123”, “0123” and “0122”.

Based on those shapes, T can match:

  • “0123”, position 0 - a b c d e f g h i j k l m n o p q r t u v w x y z
  • “0123”, position 3 - a b c d e f g h i j l m n o p r s t u v w x y z
  • “0122”, position 0 - b c d f g h i j k l m n p q r s t v w x y

Letters common to all 3: b c d f g h i j l m n p r t v w x y
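This elimination is a set intersection. A short Python sketch, using the three letter lists above as input (the function name common_letters is made up for illustration):

```python
def common_letters(*letter_lists):
    """Return the sorted letters that appear in every space-separated list."""
    sets = [set(letters.split()) for letters in letter_lists]
    return sorted(set.intersection(*sets))

# Candidate clear text letters for cipher letter T, copied from the lists above.
first_of_0123 = "a b c d e f g h i j k l m n o p q r t u v w x y z"
last_of_0123 = "a b c d e f g h i j l m n o p r s t u v w x y z"
first_of_0122 = "b c d f g h i j k l m n p q r s t v w x y"

print(" ".join(common_letters(first_of_0123, last_of_0123, first_of_0122)))
# b c d f g h i j l m n p r t v w x y
```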

This elimination can sometimes narrow the candidates for a cipher letter down to a single clear text letter, just as the shape matching above did.

Once shape matching and clear text letter elimination are done, my program creates Regular Expressions from any solved letters and the clear text letters that remain.

A Regular Expression is a pattern of letters and special characters. The special characters can represent “any letter”, “this range of letters”, “zero or more of this letter”, “this letter or that letter”, and a few other things.

My program comes up with these regular expressions for the example 4-letter words:

TQLP:   ^[b-df-mprtv-y][a-df-ik-moprt-z]en$
YQRT:   ^s[a-df-ik-moprt-z][a-df-mo-rt-z][b-df-mprtv-y]$
TUVV:   ^[b-df-mprtv-y][achiloruwy][il][il]$

The regular expression for words that TQLP could decipher to matches 8 dictionary words. That narrows down the search substantially from the 2727 dictionary words that have the shape “0123”.
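Here is one way such a pattern could be built in Python from the remaining candidate letters. This is a sketch under assumptions: the rule of writing runs of three or more consecutive letters as a range is inferred from the patterns shown above, the names char_class and word_regex are hypothetical, and the four test words are a stand-in for a dictionary.

```python
import re

def char_class(letters):
    """Write a set of letters as a regex class, e.g. 'bcdfgh' -> '[b-df-h]'."""
    letters = sorted(set(letters))
    if not letters:
        raise ValueError("no candidate letters left")
    if len(letters) == 1:
        return letters[0]  # a solved letter needs no brackets
    parts = []
    i = 0
    while i < len(letters):
        j = i
        # Extend j to the end of a run of consecutive letters.
        while j + 1 < len(letters) and ord(letters[j + 1]) == ord(letters[j]) + 1:
            j += 1
        if j - i >= 2:
            parts.append(letters[i] + "-" + letters[j])
        else:
            parts.append("".join(letters[i:j + 1]))
        i = j + 1
    return "[" + "".join(parts) + "]"

def word_regex(cipher_word, possible):
    """Build an anchored pattern from each cipher letter's candidate letters."""
    return "^" + "".join(char_class(possible[c]) for c in cipher_word.lower()) + "$"

possible = {"t": "bcdfghijklmprtvwxy", "u": "achiloruwy", "v": "il"}
pattern = word_regex("TUVV", possible)
print(pattern)  # ^[b-df-mprtv-y][achiloruwy][il][il]$

words = ["will", "tall", "bill", "well"]
print([w for w in words if re.match(pattern, w)])  # ['will', 'tall', 'bill']
```

The pattern on its own does not require the last two letters to be equal; that constraint comes from only testing words that already have the shape “0122”.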

The program creates a new dictionary based on the “shapes” of the words matching the per-enciphered-word regular expressions, and a new, smaller set of letters that might solve each enciphered letter. It repeats the cycle until all enciphered letters have a clear text match, or for 6 cycles, whichever comes first.
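The whole cycle can be sketched in Python as below. This is a simplified illustration, not the solver's actual code: it checks candidate letters with plain set membership in place of regular expressions, it also stops early when a cycle changes nothing (an added assumption), and the four-word dictionary is a toy stand-in. It expects lowercase input.

```python
import string

MAX_CYCLES = 6  # the cycle limit mentioned above

def shape(word):
    """Tuple of first-appearance indexes, e.g. 'will' -> (0, 1, 2, 2)."""
    first_seen = {}
    return tuple(first_seen.setdefault(ch, len(first_seen)) for ch in word)

def narrow(cipher_words, dictionary):
    """Shrink each cipher letter's candidate set over repeated cycles."""
    possible = {c: set(string.ascii_lowercase) for w in cipher_words for c in w}
    for _ in range(MAX_CYCLES):
        before = {c: set(s) for c, s in possible.items()}
        for cipher_word in cipher_words:
            # Keep words with the right shape whose letters are still allowed.
            candidates = [
                word for word in dictionary
                if shape(word) == shape(cipher_word)
                and all(p in possible[c] for c, p in zip(cipher_word, word))
            ]
            # Each cipher letter may only be a letter seen at its position.
            for pos, c in enumerate(cipher_word):
                possible[c] &= {word[pos] for word in candidates}
        solved = all(len(s) == 1 for s in possible.values())
        if solved or possible == before:
            break
    return possible

toy_dictionary = ["when", "will", "show", "ball"]
result = narrow(["tqlp", "tuvv"], toy_dictionary)
print({c: "".join(sorted(s)) for c, s in sorted(result.items())})
# {'l': 'e', 'p': 'n', 'q': 'h', 't': 'w', 'u': 'i', 'v': 'l'}
```

In this toy run the first cycle leaves T as either “w” or “s”, TUVV then rules out “s”, and the second cycle uses that to drop “show” as a candidate for TQLP.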

That’s how I used the extra information about word length and letter position in words to solve Cryptoquips.

Solution to the example

tqlp ypdily qdfl xql glyuhl xr yqrt xqluh ohdxlzsvplyy gr crs xqupi
when snakes have the desire to show their gratefulness do you think
xqlc tuvv oufl zdpoy
they will give fangs