Category Archives: Monoalphabetic encryption

In monoalphabetic encryption each character is replaced by another according to a fixed rule; such ciphers are vulnerable to frequency analysis of the letters

Monoalphabetic encryption

A cipher system is monoalphabetic when each character is always replaced by the same fixed character of the ciphertext alphabet

From ancient times to the present day, people have sent secret messages

The need to communicate secretly has arisen in diplomacy and among the military

With the advent of electronic communications, the interest in keeping messages unintelligible to everyone except the receiver has done nothing but increase

To introduce a few terms before we begin: cryptology is the discipline dedicated to secret communication

Cryptography is the part of cryptology that deals with the design and implementation of secrecy systems, and cryptanalysis is the part dedicated to breaking such systems

I would like to start with a very simple system that can be explained mathematically using modular arithmetic

Perhaps the first of these systems originated with Julius Caesar. Encryption consisted simply of replacing each letter by the one three places further along in the alphabet; that is, A was transformed into D, B into E, and so on until Z became C

Throughout this article, for simplicity, I will use the standard 26-letter English alphabet:

\tiny\begin{pmatrix} 0& 1& 2& 3& 4& 5& 6& 7& 8& 9& 10& 11& 12& 13& 14& 15& 16& 17& 18& 19& 20& 21& 22& 23& 24& 25 \\ A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z \\ \end{pmatrix}

This is enough for most text-based ciphers and has the advantage that the letters occupy consecutive positions in the ASCII code, which makes it very convenient to program

The cipher of Julius Caesar can then be expressed as C\equiv P+3\pmod{26}, where we have assigned A the number 0, B the number 1, ..., and Z the number 25, and \pmod{26} indicates that we take the remainder of dividing by 26 (in the C language we would use the % operator). C is the ciphertext and P the original
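As a minimal sketch of the formula above, the following Python function maps each letter to its number, adds the shift modulo 26, and maps back. The name caesar_shift and the choice to drop non-letters are assumptions made for illustration, not part of the original scheme.

```python
def caesar_shift(text: str, shift: int = 3) -> str:
    """Shift each letter A-Z by `shift` positions, wrapping modulo 26."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            p = ord(ch) - ord("A")        # A -> 0, B -> 1, ..., Z -> 25
            c = (p + shift) % 26          # C = P + shift (mod 26)
            out.append(chr(c + ord("A")))
        # anything that is not a letter (spaces, punctuation) is dropped
    return "".join(out)


print(caesar_shift("ABZ"))  # DEC
```

The modulo operation is what makes Z wrap around to C instead of running past the end of the alphabet.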

Frequency of letters

In the cryptanalysis of some classical methods it is useful to know the frequency of letters, pairs of letters, and words in the language in which we assume the message is written

Here are some useful data for the English language:

High-frequency letters
Letter Frequency %
E 12.70
T 9.06
A 8.17
O 7.51
I 6.97
N 6.75
S 6.33
H 6.09

Medium-frequency letters
Letter Frequency %
D 4.25
L 4.03
C 2.78
U 2.76
M 2.41
W 2.36
F 2.23
G 2.02

Low-frequency letters
Letter Frequency %
Y 1.97
P 1.93
B 1.49
V 0.98
K 0.77

The remaining letters J, Q, X and Z have a frequency of less than 0.5% and can therefore be considered “rare”

Summarizing the above data and applying them by groups of letters, we could say:

  • The vowels occupy about 38% of the text

  • Only E can be identified with relative reliability, because it stands out well above the others (T and A follow at some distance)

  • The high-frequency letters account for about 63% of the total

  • The most frequent consonants are T, N, S, H (around 28%)

  • The least common letters are J, Q, X and Z (little more than 1%)
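The group percentages in this summary can be checked directly against the letter tables. The sketch below copies the tabulated frequencies into a dictionary and adds them up by group; the names FREQ and group_share are illustrative only.

```python
# Letter frequencies (%) copied from the tables in this section.
FREQ = {
    "E": 12.70, "T": 9.06, "A": 8.17, "O": 7.51, "I": 6.97, "N": 6.75,
    "S": 6.33, "H": 6.09, "D": 4.25, "L": 4.03, "C": 2.78, "U": 2.76,
    "M": 2.41, "W": 2.36, "F": 2.23, "G": 2.02, "Y": 1.97, "P": 1.93,
    "B": 1.49, "V": 0.98, "K": 0.77,
}


def group_share(letters: str) -> float:
    """Total percentage of text occupied by the given letters."""
    return sum(FREQ[ch] for ch in letters)


for name, letters in [("vowels", "AEIOU"),
                      ("high-frequency letters", "ETAOINSH"),
                      ("T, N, S, H", "TNSH")]:
    print(f"{name}: {group_share(letters):.2f}%")
# vowels: 38.11%
# high-frequency letters: 63.58%
# T, N, S, H: 28.23%
```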

Most frequent words
Word Frequency (per billion)
THE 56271872
OF 33950064
AND 29944184
TO 25956096
IN 17420636
I 11764797
THAT 11073318
WAS 10078245
HIS 8799755
HE 8397205
IT 8058110

Two-letter words
Word Frequency (per billion)
OF 33950064
TO 25956096
IN 17420636
HE 8397205
IT 8058110
IS 7557477
AS 7037543
BE 5662527
ON 5113263
AT 5091841

Three-letter words
Word Frequency (per billion)
THE 56271872
AND 29944184
WAS 10078245
HIS 8799755
FOR 7097981
HAD 6139336
YOU 6048903
NOT 5741803
HER 5202501

Four-letter words
Word Frequency (per billion)
THAT 11073318
WITH 7725512
HAVE 4346500
FROM 4108111
WERE 3323884
SAID 2637136
THEM 2509917
BEEN 2357654
WILL 2320022
WHEN 1980046
MORE 1899787

Example of monoalphabetic encryption

Take the message MESSAGE SENT YESTERDAY. We break up the word structure of the message by deleting spaces and punctuation marks, if any, writing for example MESSAGESENTYESTERDAY, and we get the numerical equivalents of these letters:

\tiny\begin{pmatrix} 12& 4& 18& 18& 0& 6& 4& 18& 4& 13& 19& 24& 4& 18& 19& 4& 17& 3& 0& 24 \\ M& E& S& S& A& G& E& S& E& N& T& Y& E& S& T& E& R& D& A& Y \\ \end{pmatrix}

Applying the transformation P+3\pmod{26}, these become

\tiny\begin{pmatrix} 15& 7& 21& 21& 3& 9& 7& 21& 7& 16& 22& 1& 7& 21& 22& 7& 20& 6& 3& 1 \\ P& H& V& V& D& J& H& V& H& Q& W& B& H& V& W& H& U& G& D& B \\ \end{pmatrix}

that is to say the encrypted message is now PHVVDJHVHQWBHVWHUGDB

A cipher of this type is ridiculously easy to break (though remember that it was also very easy to make): it is enough to test the 25 possible offsets from P + 1 to P + 25, and at a glance we will know which one is the message

In this case we have used a form of cryptanalysis called “brute force”, because we test all the possible keys (here, the displacements)

There are some ways to improve this method without complicating it too much. The first is based on choosing a key word or phrase in which all the letters are different; let's say that we choose VIRTUAL ZONE

We then write the normal alphabet along with the transformed one as follows:

\tiny\begin{pmatrix} A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z \\ V& I& R& T& U& A& L& Z& O& N& E& B& C& D& F& G& H& J& K& M& P& Q& S& W& X& Y \\ \end{pmatrix}

and now the message along with the encryption would be

\tiny\begin{pmatrix} M& E& S& S& A& G& E& S& E& N& T& Y& E& S& T& E& R& D& A& Y \\ C& U& K& K& V& L& U& K& U& D& M& X& U& K& M& U& J& T& V& X \\ \end{pmatrix}

Now a brute-force attack is “somewhat” more expensive, since you would have to try all the possible substitution alphabets, of which there are 26!=403291461126605635584000000, a few more than the 25 from before
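The construction of the keyword alphabet can be sketched as follows. This is an illustration under stated assumptions: the function names keyword_alphabet and substitute are invented here, and the code assumes the rule shown in the table above, namely the distinct key letters first and then the unused letters in alphabetical order.

```python
import math
import string

ALPHABET = string.ascii_uppercase


def keyword_alphabet(keyword: str) -> str:
    """Key letters (without repeats) followed by the unused letters in order."""
    seen = []
    for ch in keyword.upper() + ALPHABET:
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def substitute(text: str, cipher_alphabet: str) -> str:
    """Replace each plaintext letter by the one below it in the cipher alphabet."""
    letters = "".join(ch for ch in text.upper() if ch in ALPHABET)
    return letters.translate(str.maketrans(ALPHABET, cipher_alphabet))


cipher_alphabet = keyword_alphabet("VIRTUAL ZONE")
print(cipher_alphabet)                                         # VIRTUALZONEBCDFGHJKMPQSWXY
print(substitute("MESSAGE SENT YESTERDAY", cipher_alphabet))   # CUKKVLUKUDMXUKMUJTVX
print(math.factorial(26))                                      # 403291461126605635584000000
```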

This method has the following weakness: with certain keys, the final letters of the alphabet are left unchanged, which greatly facilitates the work of the cryptanalyst

The key in our example was chosen so that it contains letters such as V, U and Z from near the end of the alphabet, which produce greater “disorder” in the transformed alphabet

In any case, a cipher like this is attacked with what is called frequency analysis. It consists of the following: knowing the frequency of letters in English (if you do not know the language of the original, it can cost you more work), you try to guess which plaintext letter corresponds to each ciphertext letter

For example, in the last encrypted message, CUKKVLUKUDMXUKMUJTVX, the most repeated letter is U. Since the most frequent letter in English is E, we may conjecture that U corresponds to E, as is indeed the case; continuing in the same way with the other letters, we can work out enough to read the original message
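Counting the letters of a ciphertext is the first step of such an analysis. A small sketch using the standard library's Counter (the function name letter_ranking is an assumption for illustration):

```python
from collections import Counter


def letter_ranking(ciphertext: str) -> list[tuple[str, int]]:
    """Return (letter, count) pairs, most frequent first."""
    counts = Counter(ch for ch in ciphertext.upper() if ch.isalpha())
    return counts.most_common()


ranking = letter_ranking("CUKKVLUKUDMXUKMUJTVX")
print(ranking[:2])  # [('U', 5), ('K', 4)]
```

With only twenty letters the ranking is a weak signal beyond the first one or two positions; on longer texts the lower ranks become informative as well.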

Caesar encryption

In the first century B.C. a basic cipher appears, known generically as the Caesar cipher in honor of Julius Caesar, in which a monoalphabetic transformation is applied to the plaintext

The Caesar cipher applies a constant shift of b characters to the plaintext

Example of Caesar encryption

We take b equal to 3, so that the cipher alphabet is the same as the plaintext alphabet but shifted 3 positions to the right modulo n, where n is the number of letters in the alphabet

To encrypt we will use:

C_i\equiv(M_i+b)\pmod{n}

To decrypt we will use:

M_i\equiv(C_i+n-b)\pmod{n}

In the English alphabet, as there are 26 letters, n will be 26
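The two formulas translate almost literally into code. The following sketch works on lists of letter numbers rather than on text, so that it mirrors the notation M_i, C_i, b and n; the function names are illustrative assumptions.

```python
def encrypt(m: list[int], b: int, n: int = 26) -> list[int]:
    """C_i = (M_i + b) mod n for every number in the message."""
    return [(m_i + b) % n for m_i in m]


def decrypt(c: list[int], b: int, n: int = 26) -> list[int]:
    """M_i = (C_i + n - b) mod n; adding n keeps the value non-negative."""
    return [(c_i + n - b) % n for c_i in c]


message = [12, 4, 18, 24]           # M, E, S, Y
cipher = encrypt(message, b=3)
print(cipher)                       # [15, 7, 21, 1]
print(decrypt(cipher, b=3))         # [12, 4, 18, 24]
```

In Python the term n in the decryption formula is not strictly needed, because % already returns a non-negative result for a positive modulus; it is kept here to match the formula, and it matters in languages such as C, where the remainder of a negative number can be negative.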

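We have the following message that we want to encrypt (in the examples, C labels the clear text and M the encrypted message, the reverse of the subscripted symbols in the formulas above):

C=MESSAGE SENT YESTERDAY

Its characters correspond to numbers according to the following matrix: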

\tiny\begin{pmatrix} 0& 1& 2& 3& 4& 5& 6& 7& 8& 9& 10& 11& 12& 13& 14& 15& 16& 17& 18& 19& 20& 21& 22& 23& 24& 25 \\ A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z \\ \end{pmatrix}

\tiny\begin{pmatrix} 12& 4& 18& 18& 0& 6& 4& 18& 4& 13& 19& 24& 4& 18& 19& 4& 17& 3& 0& 24 \\ M& E& S& S& A& G& E& S& E& N& T& Y& E& S& T& E& R& D& A& Y \\ \end{pmatrix}

We get the following results:

\begin{array}{l} (12+3)\pmod{26}\equiv 15 \\ (4+3)\pmod{26}\equiv 7 \\ (18+3)\pmod{26}\equiv 21 \\ (18+3)\pmod{26}\equiv 21 \\ (0+3)\pmod{26}\equiv 3 \\ (6+3)\pmod{26}\equiv 9 \\ (4+3)\pmod{26}\equiv 7\\ (18+3)\pmod{26}\equiv 21 \\ (4+3)\pmod{26}\equiv 7 \\ (13+3)\pmod{26}\equiv 16 \\ (19+3)\pmod{26}\equiv 22 \\ (24+3)\pmod{26}\equiv 1 \\ (4+3)\pmod{26}\equiv 7 \\ (18+3)\pmod{26}\equiv 21 \\ (19+3)\pmod{26}\equiv 22 \\ (4+3)\pmod{26}\equiv 7 \\ (17+3)\pmod{26}\equiv 20 \\ (3+3)\pmod{26}\equiv 6 \\ (0+3)\pmod{26}\equiv 3 \\ (24+3)\pmod{26}\equiv 1 \end{array}

Applying the transformation P+3\pmod{26}, the letters become

\tiny\begin{pmatrix} 15& 7& 21& 21& 3& 9& 7& 21& 7& 16& 22& 1& 7& 21& 22& 7& 20& 6& 3& 1 \\ P& H& V& V& D& J& H& V& H& Q& W& B& H& V& W& H& U& G& D& B \\ \end{pmatrix}

So the encrypted message is:

M=PHVVDJHVHQWBHVWHUGDB

We can decipher the previous M:

\begin{array}{l} (15+26-3)\pmod{26}\equiv 12 \\ (7+26-3)\pmod{26}\equiv 4 \\ (21+26-3)\pmod{26}\equiv 18 \\ (21+26-3)\pmod{26}\equiv 18 \\ (3+26-3)\pmod{26}\equiv 0 \\ (9+26-3)\pmod{26}\equiv 6 \\ (7+26-3)\pmod{26}\equiv 4 \\ (21+26-3)\pmod{26}\equiv 18 \\ (7+26-3)\pmod{26}\equiv 4 \\ (16+26-3)\pmod{26}\equiv 13 \\ (22+26-3)\pmod{26}\equiv 19 \\ (1+26-3)\pmod{26}\equiv 24 \\ (7+26-3)\pmod{26}\equiv 4 \\ (21+26-3)\pmod{26}\equiv 18 \\ (22+26-3)\pmod{26}\equiv 19 \\ (7+26-3)\pmod{26}\equiv 4 \\ (20+26-3)\pmod{26}\equiv 17 \\ (6+26-3)\pmod{26}\equiv 3 \\ (3+26-3)\pmod{26}\equiv 0 \\ (1+26-3)\pmod{26}\equiv 24 \end{array}

\tiny\begin{pmatrix} 12& 4& 18& 18& 0& 6& 4& 18& 4& 13& 19& 24& 4& 18& 19& 4& 17& 3& 0& 24 \\ M& E& S& S& A& G& E& S& E& N& T& Y& E& S& T& E& R& D& A& Y \\ \end{pmatrix}

Recovering the original C: C=MESSAGESENTYESTERDAY

This encryption system, simple, suitable and even quite ingenious for its time, offers a very weak level of security

Cryptanalysis of Caesar encryption

When each character of the plaintext alphabet is always replaced by the same single character of the cipher alphabet, the cryptogram can be broken easily using statistical techniques of the language, provided that we have a sufficient amount of ciphertext

The unicity distance is given by the ratio between the entropy of the key H(K) and the redundancy of the language D. If n = 26, there are only 25 possible alphabets, so H(K)=\log_2{25}\approx 4.64

Since the redundancy D is equal to 4.03, we have N=\frac{H(K)}{D}\approx\frac{4.64}{4.03}\approx 1.15. Therefore, we need a minimum of 2 characters
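The arithmetic can be reproduced in a couple of lines. This sketch takes the redundancy value 4.03 from the text as given; the function name unicity_distance is an assumption for illustration.

```python
import math


def unicity_distance(num_keys: int, redundancy: float) -> float:
    """N = H(K) / D, with H(K) = log2(number of equally likely keys)."""
    return math.log2(num_keys) / redundancy


n_caesar = unicity_distance(25, 4.03)
print(round(n_caesar, 2))    # 1.15
print(math.ceil(n_caesar))   # 2
```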

An elementary form of cryptanalysis is to write under the ciphertext all the strings, meaningful or not, obtained by applying to the cryptogram shifts of 1, \cdots, n-1 characters, n being the number of characters of the alphabet used. One of these strings will be the plaintext, and this is true regardless of the value of the constant shift

b Cipher
1 QIWWEKIWIRXCIWXIVHEC
2 RJXXFLJXJSYDJXYJWIFD
3 SKYYGMKYKTZEKYZKXJGE
4 TLZZHNLZLUAFLZALYKHF
5 UMAAIOMAMVBGMABMZLIG
6 VNBBJPNBNWCHNBCNAMJH
7 WOCCKQOCOXDIOCDOBNKI
8 XPDDLRPDPYEJPDEPCOLJ
9 YQEEMSQEQZFKQEFQDPMK
10 ZRFFNTRFRAGLRFGREQNL
11 ASGGOUSGSBHMSGHSFROM
12 BTHHPVTHTCINTHITGSPN
13 CUIIQWUIUDJOUIJUHTQO
14 DVJJRXVJVEKPVJKVIURP
15 EWKKSYWKWFLQWKLWJVSQ
16 FXLLTZXLXGMRXLMXKWTR
17 GYMMUAYMYHNSYMNYLXUS
18 HZNNVBZNZIOTZNOZMYVT
19 IAOOWCAOAJPUAOPANZWU
20 JBPPXDBPBKQVBPQBOAXV
21 KCQQYECQCLRWCQRCPBYW
22 LDRRZFDRDMSXDRSDQCZX
23 MESSAGESENTYESTERDAY
24 NFTTBHFTFOUZFTUFSEBZ
25 OGUUCIGUGPVAGUVGTFCA
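A table like this one can be generated mechanically. The sketch below applies every shift from 1 to 25 to the cryptogram; the names shift_text and brute_force are assumptions for illustration, and picking out the meaningful row is still left to the reader's eye.

```python
def shift_text(text: str, b: int) -> str:
    """Shift every letter of an uppercase A-Z string by b positions modulo 26."""
    return "".join(chr((ord(ch) - ord("A") + b) % 26 + ord("A")) for ch in text)


def brute_force(ciphertext: str) -> dict[int, str]:
    """Map each shift b = 1..25 to the candidate text it produces."""
    return {b: shift_text(ciphertext, b) for b in range(1, 26)}


candidates = brute_force("PHVVDJHVHQWBHVWHUGDB")
for b, text in candidates.items():
    print(b, text)
# The row for b = 23 reads MESSAGESENTYESTERDAY
```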

It is easy to conclude that a monoalphabetic substitution cipher such as Caesar's offers a minimal level of security, since breaking it took nothing more than a pencil, paper and a little patience to build the table above, nothing out of this world

This weakness is due to the very small number of possible shifts: only the 25 values that correspond to the characters of the alphabet, that is, 1\leq b\leq 25, since a shift of zero or of a multiple of twenty-six would be the same as transmitting in the clear

The following relation therefore holds between the decryption operation D and the encryption operation E in the ring of integers modulo n:

D_b=E_{n-b}\Rightarrow D_3=E_{26-3}=E_{23}
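This relation follows from the encryption and decryption formulas given earlier. A short derivation, reading E_b and D_b as encryption and decryption with shift b (this reading of the subscripts is an assumption about the notation):

```latex
\begin{array}{l}
D_b(C_i)\equiv C_i-b\equiv C_i+(n-b)\equiv E_{n-b}(C_i)\pmod{n} \\
D_3(C_i)\equiv C_i+23\equiv E_{23}(C_i)\pmod{26}
\end{array}
```

In other words, decrypting with shift 3 is the same as encrypting again with shift 23, which is why the plaintext shows up in row 23 of the table above.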

Caesar encryption with key

The Caesar cipher with key was created to increase the security of the Caesar cipher, that is to say, its unicity distance. To do so, we include in the cipher alphabet a key k, consisting of a word or phrase, which is written starting at position p_0 of the plaintext alphabet

Repeated characters of the key are not used. Once the key has been placed at the given position, the remaining letters of the alphabet are added in order and in a modular way (wrapping around at the end), to obtain the cipher alphabet

This type of cipher no longer satisfies the condition of constant displacement

Example of Caesar encryption with key

We take p_0 = 3 and the key is going to be:

k = I’M BORED

To encrypt we will use:

C_i\equiv(M_i+b)\pmod{n}

To decrypt we will use:

M_i\equiv(C_i+n-b)\pmod{n}

In the English alphabet, as there are 26 letters, n will be 26

We have the following message that we want to encrypt:

C=MESSAGE SENT YESTERDAY

Its characters correspond to numbers according to the following matrix:

\tiny\begin{pmatrix}0& 1& 2& 3& 4& 5& 6& 7& 8& 9& 10& 11& 12& 13& 14& 15& 16& 17& 18& 19& 20& 21& 22& 23& 24& 25 \\ A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z \\ \end{pmatrix}

To the previous matrix we add the key, taking care to eliminate repeated letters

\tiny\begin{pmatrix} 0& 1& 2& 3& 4& 5& 6& 7& 8& 9& 10& 11& 12& 13& 14& 15& 16& 17& 18& 19& 20& 21& 22& 23& 24& 25 \\ A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z \\ & & & I& M& B& O& R& E& D \\ \end{pmatrix}

Now we add the other letters of the alphabet in order and in a modular way, to obtain the complete cipher alphabet

\tiny\begin{pmatrix} 0& 1& 2& 3& 4& 5& 6& 7& 8& 9& 10& 11& 12& 13& 14& 15& 16& 17& 18& 19& 20& 21& 22& 23& 24& 25 \\ A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z \\ X& Y& Z& I& M& B& O& R& E& D& A& C& F& G& H& J& K& L& N& P& Q& S& T& U& V& W \\\end{pmatrix}
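The construction of the last row can be sketched in code. The function name keyed_caesar_alphabet is hypothetical; the logic assumes the rule described above: the distinct key letters are placed from position p_0, and the unused letters follow in alphabetical order, wrapping around modulo 26.

```python
import string

ALPHABET = string.ascii_uppercase


def keyed_caesar_alphabet(key: str, p0: int) -> str:
    """Build the cipher row: key at position p0, then the remaining letters."""
    key_letters = []
    for ch in key.upper():
        if ch in ALPHABET and ch not in key_letters:   # skip repeats and non-letters
            key_letters.append(ch)
    rest = [ch for ch in ALPHABET if ch not in key_letters]
    row = [""] * 26
    for i, ch in enumerate(key_letters + rest):
        row[(p0 + i) % 26] = ch                        # wrap around past Z
    return "".join(row)


print(keyed_caesar_alphabet("I'M BORED", 3))  # XYZIMBOREDACFGHJKLNPQSTUVW
```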

We get the following results:

\begin{array}{l} (12+3)\pmod{26}\equiv 15 \\ (4+3)\pmod{26}\equiv 7 \\ (18+3)\pmod{26}\equiv 21 \\ (18+3)\pmod{26}\equiv 21 \\ (0+3)\pmod{26}\equiv 3 \\ (6+3)\pmod{26}\equiv 9 \\ (4+3)\pmod{26}\equiv 7\\ (18+3)\pmod{26}\equiv 21 \\ (4+3)\pmod{26}\equiv 7 \\ (13+3)\pmod{26}\equiv 16 \\ (19+3)\pmod{26}\equiv 22 \\ (24+3)\pmod{26}\equiv 1 \\ (4+3)\pmod{26}\equiv 7 \\ (18+3)\pmod{26}\equiv 21 \\ (19+3)\pmod{26}\equiv 22 \\ (4+3)\pmod{26}\equiv 7 \\ (17+3)\pmod{26}\equiv 20 \\ (3+3)\pmod{26}\equiv 6 \\ (0+3)\pmod{26}\equiv 3 \\ (24+3)\pmod{26}\equiv 1 \end{array}

Applying the transformation P+3\pmod{26} and looking each result up in the last row of the matrix (the cipher alphabet), the letters become

\tiny\begin{pmatrix} 15& 7& 21& 21& 3& 9& 7& 21& 7& 16& 22& 1& 7& 21& 22& 7& 20& 6& 3& 1 \\ J& R& S& S& I& D& R& S& R& K& T& Y& R& S& T& R& Q& O& I& Y \\ \end{pmatrix}

So the encrypted message is:

M=JRSSIDRSRKTYRSTRQOIY

We can decipher the previous M:

\begin{array}{l} (15+26-3)\pmod{26}\equiv 12 \\ (7+26-3)\pmod{26}\equiv 4 \\ (21+26-3)\pmod{26}\equiv 18 \\ (21+26-3)\pmod{26}\equiv 18 \\ (3+26-3)\pmod{26}\equiv 0 \\ (9+26-3)\pmod{26}\equiv 6 \\ (7+26-3)\pmod{26}\equiv 4 \\ (21+26-3)\pmod{26}\equiv 18 \\ (7+26-3)\pmod{26}\equiv 4 \\ (16+26-3)\pmod{26}\equiv 13 \\ (22+26-3)\pmod{26}\equiv 19 \\ (1+26-3)\pmod{26}\equiv 24 \\ (7+26-3)\pmod{26}\equiv 4 \\ (21+26-3)\pmod{26}\equiv 18 \\ (22+26-3)\pmod{26}\equiv 19 \\ (7+26-3)\pmod{26}\equiv 4 \\ (20+26-3)\pmod{26}\equiv 17 \\ (6+26-3)\pmod{26}\equiv 3 \\ (3+26-3)\pmod{26}\equiv 0 \\ (1+26-3)\pmod{26}\equiv 24 \end{array}

We look these results up in the first part of the matrix (the plaintext alphabet)

\tiny\begin{pmatrix} 12& 4& 18& 18& 0& 6& 4& 18& 4& 13& 19& 24& 4& 18& 19& 4& 17& 3& 0& 24 \\ M& E& S& S& A& G& E& S& E& N& T& Y& E& S& T& E& R& D& A& Y \\ \end{pmatrix}

Recovering the original C:

C=MESSAGESENTYESTERDAY
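The worked example can be replayed in code. Note that this sketch follows the procedure exactly as the example carries it out (first add 3 modulo 26, then look the result up in the keyed row, and the reverse for decryption); it is not presented as the only way to use a keyed alphabet. The function names and the hard-coded row are assumptions for illustration.

```python
import string

PLAIN = string.ascii_uppercase
CIPHER_ROW = "XYZIMBOREDACFGHJKLNPQSTUVW"   # last row of the matrix in the example


def encrypt(message: str, b: int = 3) -> str:
    """Shift each letter number by b, then read the letter from the keyed row."""
    return "".join(CIPHER_ROW[(PLAIN.index(ch) + b) % 26]
                   for ch in message.upper() if ch in PLAIN)


def decrypt(cryptogram: str, b: int = 3) -> str:
    """Find each letter in the keyed row, undo the shift, read the plain row."""
    return "".join(PLAIN[(CIPHER_ROW.index(ch) + 26 - b) % 26]
                   for ch in cryptogram)


encrypted = encrypt("MESSAGE SENT YESTERDAY")
print(encrypted)            # JRSSIDRSRKTYRSTRQOIY
print(decrypt(encrypted))   # MESSAGESENTYESTERDAY
```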

Because there is a greater number of possible alphabets, there is greater uncertainty about the key. The unicity distance of this cipher is higher and, therefore, the system is stronger

Cryptanalysis of Caesar encryption with key

It is impossible to establish a single, direct mathematical relationship between the plaintext alphabet and the cipher alphabet. The only route left to us is to apply language statistics to the cryptogram, observing for example the relative frequency with which the characters appear in the ciphertext

This type of statistical attack is valid for monoalphabetic ciphers with a key as well as for those without one. However, in the great majority of cases it will require an amount of ciphertext considerably larger than that of the previous example, a dash of intuition and a bit of luck

The unicity distance is given by the ratio between the entropy of the key H(K) and the redundancy of the language D. If the alphabet has n characters, there are n! possible permutations of its elements, so N=\frac{H(K)}{D}=\frac{\log_2{n!}}{D}

Using Stirling's approximation we have \log_2{n!}\approx n\cdot\log_2{\frac{n}{e}}, so the unicity distance is N\approx\frac{n\cdot\log_2{\frac{n}{e}}}{D}. Since the redundancy D is equal to 4.03 and n = 26, we get N\approx\frac{26\cdot\log_2{\frac{26}{e}}}{4.03}\approx 21.02. Therefore, we need at least 22 characters
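As a quick numerical check, the sketch below evaluates the approximation used above next to the exact value of \log_2{26!}. The redundancy 4.03 is taken from the text as given, and the variable names are illustrative.

```python
import math

n = 26
D = 4.03   # redundancy value used in the text

stirling_bits = n * math.log2(n / math.e)     # approximation of log2(n!)
exact_bits = math.log2(math.factorial(n))     # exact log2(n!)

print(round(stirling_bits / D, 2))   # 21.02
print(round(exact_bits / D, 2))      # 21.93
print(math.ceil(stirling_bits / D), math.ceil(exact_bits / D))   # 22 22
```

The simplified form of Stirling's formula slightly underestimates \log_2{n!}, but after rounding up both values lead to the same conclusion of 22 characters.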

Since the encryption operation establishes a direct correspondence between the characters of the plaintext and those of the cipher alphabet, the characteristic frequency relationships of the language are preserved. Therefore, it is very likely that the ciphertext letter C_i with the highest relative frequency corresponds to the letter M_i with the highest relative frequency in the language.

Therefore, if the letter W has the highest frequency in the cryptogram, we can assume, with very good expectations of success, that it stands for the letter E of the plaintext and that, therefore, the shift applied is 18, the distance that separates the two letters in the alphabet

These assumptions will only have some validity if the amount of ciphertext is large enough for the statistical properties of the language to hold. In essence, we are comparing the frequency distribution of all the elements of the cryptogram with the characteristic distribution of the language, in order to find that constant displacement
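For a cipher with constant displacement, this reasoning reduces to a one-line computation. A minimal sketch, assuming the most frequent ciphertext letter stands for E (the function name guess_shift is hypothetical):

```python
from collections import Counter


def guess_shift(ciphertext: str, assumed_plain: str = "E") -> int:
    """Guess b as the distance from `assumed_plain` to the most frequent letter."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    most_frequent = Counter(letters).most_common(1)[0][0]
    return (ord(most_frequent) - ord(assumed_plain)) % 26


print(guess_shift("PHVVDJHVHQWBHVWHUGDB"))  # 3, because H is the most frequent letter
```

If W were the most frequent letter, the same function would return 18, as in the text. On short cryptograms the guess can easily be wrong, so in practice one would try the next most frequent letters as well.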

Polybius encryption

The Greek historian Polybius (203-120 B.C.) created a system for sending messages by means of torches

The method consisted essentially of creating a 5 \times 5 square matrix such as the following

\begin{pmatrix}&1&2&3&4&5\\1&A&B&C&D&E\\2&F&G&H&I/J&K\\3&L&M&N&O&P\\4&Q&R&S&T&U\\5&V&W&X&Y&Z\end{pmatrix}

Each letter of the message is represented by the numbers of the row and column whose intersection contains the letter you want to send
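A small sketch of the square in code, with I and J sharing a cell as in the matrix above. The function names are assumptions for illustration, and the output format (pairs separated by spaces) is a choice made here for readability.

```python
# Rows 1..5 of the square; J is folded into I because they share a cell.
SQUARE = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]


def polybius_encode(text: str) -> str:
    """Encode each letter as its row digit followed by its column digit."""
    codes = []
    for ch in text.upper().replace("J", "I"):
        for r, row in enumerate(SQUARE, start=1):
            c = row.find(ch)
            if c != -1:
                codes.append(f"{r}{c + 1}")
                break
    return " ".join(codes)


def polybius_decode(codes: str) -> str:
    """Decode space-separated row/column pairs back to letters."""
    return "".join(SQUARE[int(pair[0]) - 1][int(pair[1]) - 1]
                   for pair in codes.split())


encoded = polybius_encode("POLYBIUS")
print(encoded)                    # 35 34 31 54 12 24 45 43
print(polybius_decode(encoded))   # POLYBIUS
```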

Although the method of Polybius did not initially have a cryptographic purpose, it is the basis of later systems and the first known case of multiliteral monoalphabetic substitution

A variant of the Polybius cipher, used by the communists in the Spanish Civil War, consisted of generating a table with three rows and ten columns

The first row had no number, and the second and third rows were numbered, respectively, with two of the digits not used by the columns occupied in the first row

The columns were numbered with a permutation of the digits from zero to nine

The encryption process consisted of putting a word of eight or fewer different letters in the first row

Repeated letters were removed from this word, and the remaining letters of the alphabet were arranged in the other two rows

Encryption is similar to that of Polybius, but here each letter can be encoded as one or two digits

Example of the communist variant of Polybius encryption

The Spanish communists had to send the following message, which they did not want Franco's troops to intercept

C=EN PIE FAMELICA LEGION

Taking as a key:

K=FUSIL

Using the following table

\tiny\begin{pmatrix}&8&3&0&2&4&6&1&7&5&9\\&F&U&S&I&L\\5&A&B&C&D&E&G&H&J&K&M\\1&N/\widetilde{N}&O&P&Q&R&T&V&X&Y&Z \end{pmatrix}

So the encrypted message is:

M=54 18 10 2 54 8 58 59 54 4 2 50 58 4 54 56 2 13 18

We can decipher the previous M

We go to the encryption table. If we have two digits, the first indicates the row and the second the column

The letter at the intersection of the two is the letter used in the decrypted message

If we have a single digit, it indicates the column, and the row is the one that holds the key

There may be confusion in the case of 18, because N and Ñ share the same position; it all depends on the context of the message (in this case it is N)

We repeat the process until we obtain the clear message

Recovering the original C:

C=ENPIEFAMELICALEGION
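The table of this example can be turned into a pair of lookup dictionaries. This is a sketch under stated assumptions: the column order, the row digits 5 and 1 and the row contents are copied from the table above, the names are invented for illustration, and Ñ is not handled separately because it shares the code 18 with N.

```python
COLUMNS = "8302461759"            # column labels, left to right
ROWS = {
    "": "FUSIL",                  # key row: single-digit codes
    "5": "ABCDEGHJKM",            # row numbered 5
    "1": "NOPQRTVXYZ",            # row numbered 1
}


def build_table() -> dict[str, str]:
    """Map every letter to its code: row digit (if any) plus column digit."""
    table = {}
    for row_digit, letters in ROWS.items():
        for col_digit, letter in zip(COLUMNS, letters):
            table[letter] = row_digit + col_digit
    return table


ENCODE = build_table()
DECODE = {code: letter for letter, code in ENCODE.items()}


def encode(message: str) -> str:
    return " ".join(ENCODE[ch] for ch in message.upper() if ch in ENCODE)


def decode(codes: str) -> str:
    return "".join(DECODE[code] for code in codes.split())


encrypted = encode("EN PIE FAMELICA LEGION")
print(encrypted)           # 54 18 10 2 54 8 58 59 54 4 2 50 58 4 54 56 2 13 18
print(decode(encrypted))   # ENPIEFAMELICALEGION
```

Because the row digits 5 and 1 never appear as single-digit codes in this table, a stream of digits could also be decoded without the separating spaces: a 5 or a 1 always announces a two-digit code.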