Aquiman asked that I get this started by posting the charts and basic info. He will chime in soon with his thoughts. He is still doing analysis on this, and looking at other codes. Remember, he started this as someone who thought that the Kite - Wilks work on the codes was likely wrong, and he thought a probability analysis would prove that. So, I think, as someone who is cautious and careful by nature, and likes hard evidence, he is still processing what this means, applying it to different codes and has not yet formed a final conclusion.

Even so, the numbers in this first-stage report are quite impressive, and they refute one of the central beliefs of the critics: that "virtually any" letter string can produce the 18-letter name of Theodore J Kaczynski.

Aquiman has a master's degree in Applied Physics and works at one of the top ten research universities in America, where, among other things, he does failure analysis using computers and probability/statistical models.

AQUIMAN:

Over the last several weeks, I attempted to perform a comprehensive analysis of AK's (et al.) cipher theories. It turned out to be a very difficult task and I now hate math. What follows will, no doubt, be quite a boring read for most; however, I believe it may shed some light on how useful probability analysis can (or cannot) be with regard to proving or disproving theories. For those without a math background, I hope I've explained things well enough to understand probability analysis in general. For the math geeks (like myself), I hope I don't bore you... but at least you will be able to correct any mistakes I may have made along the way. Comments, criticisms, and suggestions are welcome. Be gentle.

*********

First, I want to discuss some basic probability and then delve into why this was not the typical probability exercise. Let’s take the name “Theodore J. Kaczynski.” For each of the 18 letters in his name, we have a bag filled with all 26 letters of the alphabet. If you reach into the first bag, the probability of pulling out the “T” is one in 26. Reaching into the second bag, the probability is again 1/26 that the “H” will be chosen. The overall probability that a “T” and an “H” will be chosen from the first and second bags respectively is 1/26 x 1/26 = 1/676. Someone choosing one letter from each bag, hoping to spell out the entire 18-letter name, should expect the chances to be very slim: (1/26)^18 to be precise.
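For anyone who wants to check the arithmetic, here is a minimal Python sketch of the "one bag per letter" calculation. The names NAME and ordered_probability are just illustrative labels of mine, not anything from the original analysis, and the name is written without the spaces and period.

```python
from fractions import Fraction

# The 18-letter name with spaces and the period removed (illustrative constant).
NAME = "THEODOREJKACZYNSKI"


def ordered_probability(name: str, alphabet_size: int = 26) -> Fraction:
    """Probability of drawing the letters of `name` in exact order,
    one uniform draw from a full alphabet per position."""
    return Fraction(1, alphabet_size) ** len(name)


if __name__ == "__main__":
    print(ordered_probability("TH"))         # the two-bag example from the text
    print(float(ordered_probability(NAME)))  # the full 18-letter case
```

Using Fraction keeps the result exact instead of rounding it to a floating-point number right away.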

Let’s make the odds a little better. You can now choose one letter from each bag, but you don’t have to choose the letters in “name order.” You can then rearrange them any way you like to try and spell the name. The number of possible arrangements of 18 different letters would be 18! (i.e., 18x17x16…x2x1). Not all the letters in THIS name, however, are unique; there are two E’s, two O’s, and two K’s. Therefore, the number of unique arrangements is given by 18!/(2!x2!x2!). The overall probability then of choosing 18 letters (each from a different bag) and obtaining any one of those arrangements is (1/26)^18 x 18!/(2!x2!x2!).
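The rearrangement count can be checked the same way. This is a rough sketch only; distinct_arrangements and unordered_probability are names I made up for illustration. It divides 18! by the factorial of each letter's repeat count, which for this name is the 2! x 2! x 2! from the E's, O's, and K's.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

NAME = "THEODOREJKACZYNSKI"  # 18 letters, spaces and period removed


def distinct_arrangements(name: str) -> int:
    """Number of distinct orderings of the letters in `name`
    (n! divided by the factorial of each letter's repeat count)."""
    total = factorial(len(name))
    for count in Counter(name).values():
        total //= factorial(count)
    return total


def unordered_probability(name: str, alphabet_size: int = 26) -> Fraction:
    """Probability that one uniform draw per bag yields the letters of
    `name` in some order."""
    return Fraction(distinct_arrangements(name), alphabet_size ** len(name))


if __name__ == "__main__":
    print(distinct_arrangements(NAME))
    print(float(unordered_probability(NAME)))
```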

This is where things begin to get tricky. AK, Kite, et al have proposed that you can use a Caesar shift (of -9, -6, -3, +3, +6, or +9) on each letter, essentially giving you six additional letters from each bag; not just any letters, but those within the constraints of the allowed shifts. The difficulty now lies in determining the number of permutations of 18 sets of seven letters, several of which may be duplicated between sets.

The total number of arrangements when choosing one of seven letters from each of the 18 sets is given by 7^18; however, they are not all unique. Many arrangements are duplicates, not just because of the three pairs of duplicate letters in the name, but also because of the Caesar shifts. Take the letters “H” and “E” for example. With Caesar shifts, they form the following sets: {Q, N, K, H, E, B, Y} and {N, K, H, E, B, Y, V}. There are six duplicates just within those two sets. Eighteen sets will share many more.
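To make the overlap concrete, here is a small sketch that builds the seven-letter set for any letter under the shifts described above (with a shift of 0 included for the letter itself). The helper name shift_set is my own label, not part of the original program.

```python
import string

ALPHABET = string.ascii_uppercase
# Allowed Caesar shifts from the text, plus 0 for the unshifted letter.
SHIFTS = (-9, -6, -3, 0, 3, 6, 9)


def shift_set(letter: str) -> set:
    """All letters reachable from `letter` by one allowed shift,
    wrapping around the 26-letter alphabet."""
    idx = ALPHABET.index(letter)
    return {ALPHABET[(idx + s) % 26] for s in SHIFTS}


if __name__ == "__main__":
    h_set = shift_set("H")
    e_set = shift_set("E")
    print(sorted(h_set))
    print(sorted(e_set))
    print(sorted(h_set & e_set))  # letters the two sets share
```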

A “closed-form” mathematical solution to this problem is not readily apparent (and likely does not exist). Therefore, a brute-force computer program was developed to determine the number of unique arrangements and corresponding probability of obtaining at least one of those arrangements. For 18 letters with Caesar shifts, the number of possible arrangements is astronomically large (7^18 is roughly 1.6 x 10^15 before duplicates are removed). To obtain a “true” probability for the proposed solution of 18 letters, it would literally take years to run all the permutations, even if I had the computing power; my computer could only handle nine letters with Caesar shifts. The results of the analysis of two nine-letter sets are shown below.
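To give a feel for what a brute-force count involves, here is a toy sketch. It is not the program described above; it is one possible reading of the problem, under my own assumptions: draw one letter per bag, allow any reordering, allow one shift per letter, and count the draws that can spell a short target. The function names and the three-letter target “THE” are purely illustrative, and the approach is only practical for very short targets.

```python
from fractions import Fraction
from itertools import permutations, product
import string

ALPHABET = string.ascii_uppercase
SHIFTS = (-9, -6, -3, 0, 3, 6, 9)  # allowed shifts, 0 = leave the letter alone


def shift_set(letter, shifts=SHIFTS):
    """Letters reachable from `letter` by one allowed shift."""
    idx = ALPHABET.index(letter)
    return {ALPHABET[(idx + s) % 26] for s in shifts}


def can_spell(draw, target, shifts=SHIFTS):
    """True if some reordering of `draw` puts, in every position, a letter
    that one allowed shift turns into the target letter. The default shift
    list is symmetric, so 'd shifts to t' is the same as 'd is in
    shift_set(t)'."""
    allowed = [shift_set(ch, shifts) for ch in target]
    return any(
        all(letter in allowed[i] for i, letter in enumerate(order))
        for order in permutations(draw)
    )


def brute_force_probability(target, shifts=SHIFTS):
    """Try every possible draw (26^n of them) and return the exact
    fraction that can spell `target`."""
    draws = product(ALPHABET, repeat=len(target))
    hits = sum(can_spell(draw, target, shifts) for draw in draws)
    return Fraction(hits, 26 ** len(target))


if __name__ == "__main__":
    print(brute_force_probability("THE", shifts=(0,)))  # reordering only
    print(brute_force_probability("THE"))               # reordering + shifts
```

The work grows as 26^n draws times n! reorderings, which is why this kind of exhaustive check runs out of steam long before 18 letters.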