The Code Book: The Secret History of Codes and Code-breaking. Simon Singh

… contains four letters. Each letter of the keyword defines a different cipher alphabet in the Vigenère square, as shown in Table 7. The e column of the square has been highlighted to show how the letter e is enciphered differently, depending on which letter of the keyword is defining the encipherment:

      If the K of KING is used to encipher e, then the resulting ciphertext letter is O.

      If the I of KING is used to encipher e, then the resulting ciphertext letter is M.

      If the N of KING is used to encipher e, then the resulting ciphertext letter is R.

      If the G of KING is used to encipher e, then the resulting ciphertext letter is K.

      Table 7 A Vigenère square used in combination with the keyword KING. The keyword defines four separate cipher alphabets, so that the letter e may be encrypted as O, M, R or K.

      Similarly, whole words will be enciphered in different ways: the word the, for example, could be enciphered as DPR, BUK, GNO or ZRM, depending on its position relative to the keyword. Although this makes cryptanalysis difficult, it is not impossible. The important point to note is that if there are only four ways to encipher the word the, and the original message contains several instances of the word the, then it is highly likely that some of the four possible encipherments will be repeated in the ciphertext. This is demonstrated in the following example, in which the line The Sun and the Man in the Moon has been enciphered using the Vigenère cipher and the keyword KING.

      Keyword     K I N G K I N G K I N G K I N G K I N G K I N G
      Plaintext   t h e s u n a n d t h e m a n i n t h e m o o n
      Ciphertext  D P R Y E V N T N B U K W I A O X B U K W W B T

      The word the is enciphered as DPR in the first instance, and then as BUK on the second and third occasions. The reason for the repetition of BUK is that the second the is displaced by eight letters with respect to the third the, and eight is a multiple of the length of the keyword, which is four letters long. In other words, the second the was enciphered according to its relationship to the keyword (the is directly below ING), and by the time we reach the third the, the keyword has cycled round exactly twice, to repeat the relationship, and hence repeat the encipherment.
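      The example above can be reproduced with a short sketch. It assumes that each key letter simply shifts the plaintext letter by the key letter's position in the alphabet (which is what reading along a row of the Vigenère square amounts to) and that spaces are dropped before encipherment; the function name vigenere_encipher is chosen here for illustration only.

```python
from string import ascii_uppercase as ALPHABET


def vigenere_encipher(plaintext: str, keyword: str) -> str:
    """Encipher the letters of plaintext with keyword; non-letters are dropped."""
    letters = [c for c in plaintext.upper() if c in ALPHABET]
    out = []
    for i, letter in enumerate(letters):
        # The key letter selects a row of the square, i.e. a shift of the alphabet.
        key_letter = keyword[i % len(keyword)].upper()
        shift = ALPHABET.index(key_letter)
        out.append(ALPHABET[(ALPHABET.index(letter) + shift) % 26])
    return "".join(out)


ciphertext = vigenere_encipher("The Sun and the Man in the Moon", "KING")
print(ciphertext)  # DPRYEVNTNBUKWIAOXBUKWWBT

# The four ways "the" can be enciphered, one per starting point in the keyword.
keyword = "KING"
variants = [vigenere_encipher("the", keyword[i:] + keyword[:i]) for i in range(4)]
print(variants)  # ['DPR', 'BUK', 'GNO', 'ZRM']

# 0-based positions of the two BUK occurrences and the gap between them.
first = ciphertext.find("BUK")
second = ciphertext.find("BUK", first + 1)
print(first, second, second - first)  # 9 17 8
```

      The gap of eight letters is twice the keyword length, which is why both occurrences of the fall under ING.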

      Babbage realised that this sort of repetition provided him with exactly the foothold he needed in order to conquer the Vigenère cipher. He was able to define a series of relatively simple steps which could be followed by any cryptanalyst to crack the hitherto chiffre indéchiffrable. To demonstrate his brilliant technique, let us imagine that we have intercepted the ciphertext shown in Figure 13. We know that it was enciphered using the Vigenère cipher, but we know nothing about the original message, and the keyword is a mystery.

      The first stage in Babbage’s cryptanalysis is to look for sequences of letters that appear more than once in the ciphertext. There are two ways that such repetitions could arise. The most likely is that the same sequence of letters in the plaintext has been enciphered using the same part of the key. Alternatively, there is a slight possibility that two different sequences of letters in the plaintext have been enciphered using different parts of the key, coincidentally leading to the identical sequence in the ciphertext. If we restrict ourselves to long sequences, then we largely discount the second possibility, and, in this case, we shall consider repeated sequences only if they are of four letters or more. Table 8 is a log of such repetitions, along with the spacing between the repetitions. For example, the sequence E-F-I-Q appears in the first line of the ciphertext and then in the fifth line, shifted forward by 95 letters.

      Figure 13 The ciphertext, enciphered using the Vigenère cipher.
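      The search for repeated sequences can be sketched in a few lines. Since the ciphertext of Figure 13 is not reproduced here, the sketch is run on the short KING ciphertext from the earlier example; the function name repeated_sequences and its min_len parameter are illustrative assumptions, and the function only looks at sequences of exactly min_len letters rather than every longer repeat.

```python
from collections import defaultdict


def repeated_sequences(ciphertext: str, min_len: int = 4) -> dict[str, list[int]]:
    """Map each repeated run of min_len letters to the spacings between its occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - min_len + 1):
        positions[ciphertext[i:i + min_len]].append(i)
    # Keep only runs seen more than once; record gaps between consecutive hits.
    return {
        seq: [b - a for a, b in zip(pos, pos[1:])]
        for seq, pos in positions.items()
        if len(pos) > 1
    }


sample = "DPRYEVNTNBUKWIAOXBUKWWBT"  # "the sun and the man in the moon" under KING
print(repeated_sequences(sample))     # {'BUKW': [8]}
print(repeated_sequences(sample, 3))  # {'BUK': [8], 'UKW': [8]}
```

      Even this short text yields a four-letter repeat, because both occurrences of the are followed by m.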

      As well as being used to encipher the plaintext into ciphertext, the keyword is also used by the receiver to decipher the ciphertext back into plaintext. Hence, if we could identify the keyword, deciphering the text would be easy. At this stage we do not have enough information to work out the keyword, but Table 8 does provide some very good clues as to its length. Having listed which sequences repeat themselves and the spacing between these repetitions, the rest of the table is given over to identifying the factors of the spacing – the numbers that will divide into the spacing. For example, the sequence W-C-X-Y-M repeats itself after 20 letters, and the numbers 1, 2, 4, 5, 10 and 20 are factors, because they divide perfectly into 20 without leaving a remainder. These factors suggest six possibilities:

      (1) The key is 1 letter long and is recycled 20 times between encryptions.

      (2) The key is 2 letters long and is recycled 10 times between encryptions.

      (3) The key is 4 letters long and is recycled 5 times between encryptions.

      (4) The key is 5 letters long and is recycled 4 times between encryptions.

      (5) The key is 10 letters long and is recycled 2 times between encryptions.

      (6) The key is 20 letters long and is recycled 1 time between encryptions.

      The first possibility can be excluded, because a key that is only 1 letter long gives rise to a monoalphabetic cipher – only one row of the Vigenère square would be used for the entire encryption, and the cipher alphabet would remain unchanged; it is unlikely that a cryptographer would do this. To indicate each of the other possibilities, a mark is placed in the appropriate column of Table 8. Each mark indicates a potential key length.
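      The list of possibilities for a spacing of 20 can be generated mechanically. The following is a minimal sketch; the function name key_length_candidates is hypothetical, and a key length of 1 is left out for the reason just given.

```python
def key_length_candidates(spacing: int) -> list[tuple[int, int]]:
    """Return (key length, times recycled) for every factor of spacing above 1."""
    return [
        (length, spacing // length)
        for length in range(2, spacing + 1)
        if spacing % length == 0
    ]


print(key_length_candidates(20))
# [(2, 10), (4, 5), (5, 4), (10, 2), (20, 1)]
```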

      Table 8 Repetitions and spacings in the ciphertext.

      To identify whether the key is 2, 4, 5, 10 or 20 letters long, we need to look at the factors of all the other spacings. Because the keyword seems to be 20 letters or smaller, Table 8 lists those factors that are 20 or smaller for each of the other spacings. There is a clear propensity for a spacing divisible by 5. In fact, every spacing is divisible by 5. The first repeated sequence, E-F-I-Q, can be explained by a keyword of length 5 recycled nineteen times between the first and second encryptions. The second repeated sequence, P-S-D-L-P, can be explained by a keyword of length 5 recycled just once between the first and second encryptions. The third repeated sequence, W-C-X-Y-M, can be explained by a keyword of length 5 recycled four times between the first and second encryptions. The fourth repeated sequence, E-T-R-L, can be explained by a keyword of length 5 recycled twenty-four times between the first and second encryptions. In short, everything is consistent with a five-letter keyword.
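      The same conclusion can be checked by intersecting the factors of all four spacings. In this sketch the spacings 95 and 20 are taken directly from the text, while 5 and 120 are inferred from the statement that a five-letter keyword is recycled once and twenty-four times respectively; Table 8 itself is not reproduced here, and the helper name factors_up_to is an assumption.

```python
from functools import reduce
from math import gcd

# Spacing between the two occurrences of each repeated sequence.
spacings = {"EFIQ": 95, "PSDLP": 5, "WCXYM": 20, "ETRL": 120}


def factors_up_to(n: int, limit: int = 20) -> set[int]:
    """Factors of n between 2 and limit inclusive."""
    return {k for k in range(2, min(n, limit) + 1) if n % k == 0}


# A plausible key length must divide every spacing.
common = set.intersection(*(factors_up_to(s) for s in spacings.values()))
print(sorted(common))                  # [5]
print(reduce(gcd, spacings.values()))  # 5
```

      Taking the greatest common divisor gives the same answer in one step, although listing the factors, as Table 8 does, is more forgiving when one of the repetitions is a coincidence.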

      Assuming that the keyword is indeed 5 letters long, the next step is to work out the actual letters of the keyword. For the time being, let us call the keyword L1-L2-L3-L4-L5, such that L1 represents the first letter of the keyword, and so on. The process of encipherment would have begun with enciphering the first letter of the plaintext according to the first letter of the keyword, L1. The letter L1 defines one row of the Vigenère square, and effectively provides a monoalphabetic substitution cipher alphabet for the first letter of the plaintext. However, when it comes to encrypting the second letter of the plaintext, the cryptographer would have used L2 to define a different row of the Vigenère square, effectively providing a different monoalphabetic substitution cipher alphabet. The third letter of plaintext would be encrypted according to L3, the fourth according to L4, and the …