Contents

1 Monoalphabetic encryption
- 1.1 Frequency of letters
- 1.2 Example of monoalphabetic encryption

Monoalphabetic encryption

A cipher system is monoalphabetic when each plaintext character is always replaced by the same fixed character of the ciphertext alphabet.

From ancient times to the present day, people have sent secret messages.

The need to communicate secretly has arisen in diplomacy and in military affairs.

With the advent of electronic communications, the interest in keeping messages unintelligible to everyone except the intended receiver has done nothing but increase.

To introduce a few terms before we begin, we will say that cryptology is the discipline dedicated to secret communication.

Cryptography is the part of cryptology that deals with the design and implementation of secret systems, and cryptanalysis is the part dedicated to breaking such systems.

I would like to start with a very simple system that can be described mathematically using modular arithmetic.

Perhaps the first of these systems originated with Julius Caesar. Encryption simply replaced each letter by the one three places further along in the alphabet; that is, A was transformed into D, B into E, and so on until Z became C.

Throughout this article, for simplicity, I will use the standard 26-letter English alphabet:

$\tiny\begin{pmatrix} 0& 1& 2& 3& 4& 5& 6& 7& 8& 9& 10& 11& 12& 13& 14& 15& 16& 17& 18& 19& 20& 21& 22& 23& 24& 25 \\ A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z \\ \end{pmatrix}$

This is enough for most text-based ciphers, and it has the advantage that the letters occupy consecutive positions in the ASCII code, which makes it very convenient to program.

Well, the cipher of Julius Caesar can be expressed as $C\equiv P+3\pmod{26}$, where we have assigned the number 0 to A, 1 to B, ..., and 25 to Z, and $\pmod{26}$ indicates that we take the remainder of dividing by 26 (in the C language we use the % operator). Here C is the ciphertext letter and P the original plaintext letter.
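As a minimal sketch of this formula, here written in Python rather than C, encryption and decryption of a single letter code might look as follows. The function names are my own choices for illustration and do not come from any library. Python's `%` plays the same role as the C operator; unlike C, it never returns a negative result for a positive modulus, so the decryption needs no extra adjustment.

```python
ALPHABET_SIZE = 26


def letter_to_code(letter):
    """Map 'A'..'Z' to 0..25, relying on consecutive ASCII positions."""
    return ord(letter) - ord("A")


def code_to_letter(code):
    """Map 0..25 back to 'A'..'Z'."""
    return chr(code + ord("A"))


def caesar_encrypt_code(p, shift=3):
    """Return C = (P + shift) mod 26 for a letter code P in 0..25."""
    return (p + shift) % ALPHABET_SIZE


def caesar_decrypt_code(c, shift=3):
    """Return P = (C - shift) mod 26, undoing caesar_encrypt_code."""
    return (c - shift) % ALPHABET_SIZE


# A goes to D, and Z wraps around to C.
print(code_to_letter(caesar_encrypt_code(letter_to_code("A"))))
print(code_to_letter(caesar_encrypt_code(letter_to_code("Z"))))
```

Decryption is the same formula with the shift subtracted, $P\equiv C-3\pmod{26}$.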

Frequency of letters

In the cryptanalysis of some classical methods it is useful to know the frequency of letters, pairs of letters, and words in the language in which we assume the message is written.

Here are some data useful for the English language:

High-frequency letters
Letter	Frequency (%)
E	12.70
T	9.06
A	8.17
O	7.51
I	6.97
N	6.75
S	6.33
H	6.09

Medium-frequency letters
Letter	Frequency (%)
D	4.25
L	4.03
C	2.78
U	2.76
M	2.41
W	2.36
F	2.23
G	2.02

Low-frequency letters
Letter	Frequency (%)
Y	1.97
P	1.93
B	1.49
V	0.98
K	0.77

The letters J, Q, X and Z each have a frequency below 0.5% and can therefore be considered “rare”.

Summarizing the above data and applying them by groups of letters, we could say:

The vowels make up about 38% of the text
Only E and A can be identified with relative reliability, because they stand out well above the others
The high-frequency letters account for about 63% of the total
The most frequent consonants are T, N, S and H (around 28% together)
The least common letters are J, Q, X and Z (little more than 1%)
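The group totals in this summary can be checked directly against the letter tables. The following is a small sketch that only adds up the percentages listed above; the dictionary name `FREQ` and the helper `group_total` are my own.

```python
# Percentages copied from the letter tables in this section.
# Letters that the tables do not list (such as J, Q, X, Z) are left out.
FREQ = {
    "E": 12.70, "T": 9.06, "A": 8.17, "O": 7.51, "I": 6.97, "N": 6.75,
    "S": 6.33, "H": 6.09, "D": 4.25, "L": 4.03, "C": 2.78, "U": 2.76,
    "M": 2.41, "W": 2.36, "F": 2.23, "G": 2.02, "Y": 1.97, "P": 1.93,
    "B": 1.49, "V": 0.98, "K": 0.77,
}


def group_total(letters):
    """Sum the table percentages of the given letters."""
    return sum(FREQ[ch] for ch in letters)


vowels = group_total("AEIOU")
high_frequency = group_total("ETAOINSH")
top_consonants = group_total("TNSH")

print(f"vowels: {vowels:.2f}%")
print(f"high-frequency letters: {high_frequency:.2f}%")
print(f"T, N, S, H: {top_consonants:.2f}%")
```

The three sums come out close to the rounded figures of 38%, 63% and 28% quoted in the summary.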

Most frequent words
Word	Frequency (per billion)
THE	56271872
OF	33950064
AND	29944184
TO	25956096
IN	17420636
I	11764797
THAT	11073318
WAS	10078245
HIS	8799755
HE	8397205
IT	8058110

Two-letter words
Word	Frequency (per billion)
OF	33950064
TO	25956096
IN	17420636
HE	8397205
IT	8058110
IS	7557477
AS	7037543
BE	5662527
ON	5113263
AT	5091841

Three-letter words
Word	Frequency (per billion)
THE	56271872
AND	29944184
WAS	10078245
HIS	8799755
FOR	7097981
HAD	6139336
YOU	6048903
NOT	5741803
HER	5202501

Four-letter words
Word	Frequency (per billion)
THAT	11073318
WITH	7725512
HAVE	4346500
FROM	4108111
WERE	3323884
SAID	2637136
THEM	2509917
BEEN	2357654
WILL	2320022
WHEN	1980046
MORE	1899787

Example of monoalphabetic encryption

Take the message MESSAGE SENT YESTERDAY. We first hide the word structure of the message by deleting the spaces and punctuation marks, if any, writing for example MESSAGESENTYESTERDAY, and then we take the numerical equivalents of these letters:

$\tiny\begin{pmatrix} 12& 4& 18& 18& 0& 6& 4& 18& 4& 13& 19& 24& 4& 18& 19& 4& 17& 3& 0& 24 \\ M& E& S& S& A& G& E& S& E& N& T& Y& E& S& T& E& R& D& A& Y \\ \end{pmatrix}$

which, by applying the transformation $C\equiv P+3\pmod{26}$, become

$\tiny\begin{pmatrix} 15& 7& 21& 21& 3& 9& 7& 21& 7& 16& 22& 1& 7& 21& 22& 7& 20& 6& 3& 1 \\ P& H& V& V& D& J& H& V& H& Q& W& B& H& V& W& H& U& G& D& B \\ \end{pmatrix}$

That is to say, the encrypted message is now PHVVDJHVHQWBHVWHUGDB.
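The three steps of this example (removing the word structure, converting letters to numbers, and shifting modulo 26) can be sketched in Python as follows. The helper names `normalize`, `to_codes` and `from_codes` are hypothetical and chosen only for this illustration.

```python
def normalize(message):
    """Keep only the letters A-Z, upper-cased (drops spaces and punctuation)."""
    return "".join(ch for ch in message.upper() if "A" <= ch <= "Z")


def to_codes(text):
    """Convert letters to their numerical equivalents 0..25."""
    return [ord(ch) - ord("A") for ch in text]


def from_codes(codes):
    """Convert numbers 0..25 back to letters."""
    return "".join(chr(c + ord("A")) for c in codes)


plain = normalize("MESSAGE SENT YESTERDAY")
plain_codes = to_codes(plain)
cipher_codes = [(p + 3) % 26 for p in plain_codes]  # C = P + 3 (mod 26)
cipher = from_codes(cipher_codes)

print(plain)
print(plain_codes)
print(cipher_codes)
print(cipher)
```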

A cipher of this type is ridiculously easy to break (but remember that it was also very easy to apply): it is sufficient to test the 25 possible offsets, from P + 1 to P + 25, and at a glance we will see which one yields the message.

In this case we have used a form of cryptanalysis called “brute force”, because we test all the possible keys (here, the displacements).
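A brute-force attack of this kind fits in a few lines. In this sketch, `shift_text` and `brute_force` are my own names; the candidate key k is undone by subtracting it, which is the same as adding the offset 26 − k.

```python
def shift_text(text, shift):
    """Shift every letter A-Z of text by `shift` positions modulo 26."""
    return "".join(
        chr((ord(ch) - ord("A") + shift) % 26 + ord("A")) for ch in text
    )


def brute_force(ciphertext):
    """Return a dict mapping each candidate key 1..25 to its decryption."""
    return {key: shift_text(ciphertext, -key) for key in range(1, 26)}


candidates = brute_force("PHVVDJHVHQWBHVWHUGDB")
for key, text in candidates.items():
    print(f"{key:2d}  {text}")
```

Reading down the printed list, only the line for key 3 is recognizable English.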

There are some ways to improve this method without complicating it too much. The first is based on choosing a key word or phrase in which all the letters are different; let's say that we choose VIRTUAL ZONE.

We then write the normal alphabet along with the transformed one, which starts with the letters of the key and continues with the remaining letters in alphabetical order:

$\tiny\begin{pmatrix} A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z \\ V& I& R& T& U& A& L& Z& O& N& E& B& C& D& F& G& H& J& K& M& P& Q& S& W& X& Y \\ \end{pmatrix}$
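The bottom row of this table can be generated from the key. The sketch below assumes, as the table suggests, that the key letters come first and the unused letters follow in alphabetical order; the function name `keyed_alphabet` is my own, and ignoring repeated key letters is an extra assumption that the key VIRTUAL ZONE never exercises.

```python
import string


def keyed_alphabet(key):
    """Build a substitution alphabet: key letters first, then the rest in order."""
    seen = []
    for ch in key.upper():
        # Spaces and other non-letters are skipped; repeated letters are
        # ignored (an assumption, since the article's key has none).
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    rest = [ch for ch in string.ascii_uppercase if ch not in seen]
    return "".join(seen + rest)


cipher_alphabet = keyed_alphabet("VIRTUAL ZONE")
print(string.ascii_uppercase)
print(cipher_alphabet)
```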

and now the message along with the encryption would be

$\tiny\begin{pmatrix} M& E& S& S& A& G& E& S& E& N& T& Y& E& S& T& E& R& D& A& Y \\ C& U& K& K& V& L& U& K& U& D& M& X& U& K& M& U& J& T& V& X \\ \end{pmatrix}$
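Applying such a substitution table is a plain letter-for-letter translation. As a sketch, Python's built-in `str.maketrans` and `str.translate` can do both directions; the constant names are mine, and the cipher alphabet is copied from the bottom row of the keyed-alphabet table.

```python
import string

PLAIN_ALPHABET = string.ascii_uppercase
CIPHER_ALPHABET = "VIRTUALZONEBCDFGHJKMPQSWXY"  # bottom row of the keyed table

ENCRYPT = str.maketrans(PLAIN_ALPHABET, CIPHER_ALPHABET)
DECRYPT = str.maketrans(CIPHER_ALPHABET, PLAIN_ALPHABET)

ciphertext = "MESSAGESENTYESTERDAY".translate(ENCRYPT)
recovered = ciphertext.translate(DECRYPT)

print(ciphertext)
print(recovered)
```

Under this table every Y, including the final one, encrypts to X, because X sits below Y in the table.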

Now a brute-force attack is “somewhat” more expensive, since one would have to try all the possible substitution alphabets, of which there are $26!=403291461126605635584000000$; that is, a few more than the 25 from before.
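The size of this key space is easy to confirm with the standard library. This is only a quick check of the figure quoted above; the variable names are my own.

```python
import math

substitution_keys = math.factorial(26)  # permutations of a 26-letter alphabet
caesar_keys = 25                        # non-trivial shifts of the Caesar cipher

print(substitution_keys)
print(f"about 2^{math.log2(substitution_keys):.1f} keys")
print(f"{substitution_keys // caesar_keys} times more keys than the shift cipher")
```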

This method has the following weakness: with certain keys, the final letters of the alphabet are left unchanged, and this greatly facilitates the work of the cryptanalyst.

The key in our example was chosen so that it contains letters such as V, U and Z, which lie near the end of the alphabet and therefore produce greater “disorder” in the transformed alphabet.
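To see this weakness concretely, one can list the letters that a given key leaves unchanged. In the sketch below the key CODE is a hypothetical weak key of my own choosing, and the helper names are likewise illustrative; the construction of the alphabet (key letters first, then the rest in order) follows the table above.

```python
import string


def keyed_alphabet(key):
    """Key letters first (non-letters and repeats dropped), then the rest in order."""
    letters = []
    for ch in key.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in letters:
            letters.append(ch)
    return "".join(letters)


def unchanged_letters(key):
    """Return the plaintext letters that the keyed alphabet maps to themselves."""
    cipher = keyed_alphabet(key)
    return [p for p, c in zip(string.ascii_uppercase, cipher) if p == c]


weak = unchanged_letters("CODE")          # hypothetical weak key
good = unchanged_letters("VIRTUAL ZONE")  # the key used in this article

print("CODE leaves unchanged:", "".join(weak))
print("VIRTUAL ZONE leaves unchanged:", "".join(good) or "(none)")
```

With CODE, whose latest letter is O, every letter from P to Z maps to itself, whereas VIRTUAL ZONE leaves no letter in place.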

In any case, a cipher like this is attacked with what is called frequency analysis. It consists of the following: knowing the frequency of letters in English (if you do not know in which language the original is written, it can cost you more work), you try to guess which plaintext letter corresponds to each ciphertext letter.

For example, in the last encrypted message, CUKKVLUKUDMXUKMUJTVX, we note that the most repeated letter is U. Since the most frequent letter in English is E, we may conjecture that U corresponds to E, as is indeed the case. Continuing in this way with the other letters, we can work out enough of them to read the original message.
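The first step of this analysis, counting the ciphertext letters, can be sketched with `collections.Counter`. To keep the example self-contained, the ciphertext is recomputed from the keyed alphabet of the table; the variable names are my own.

```python
from collections import Counter
import string

CIPHER_ALPHABET = "VIRTUALZONEBCDFGHJKMPQSWXY"  # keyed alphabet from the table
ciphertext = "MESSAGESENTYESTERDAY".translate(
    str.maketrans(string.ascii_uppercase, CIPHER_ALPHABET)
)

counts = Counter(ciphertext)
for letter, count in counts.most_common(3):
    print(letter, count)

# Conjecture: the most frequent ciphertext letter stands for E,
# the most frequent letter in English.
most_common_cipher_letter = counts.most_common(1)[0][0]
guess = {most_common_cipher_letter: "E"}
print(guess)
```

With only 20 letters the counts are a rough guide at best; here the first guess happens to be right, but for the remaining letters a longer ciphertext would give much more reliable statistics.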

Secarcam's Computer Science Web

Web page of Sergio Cárcamo Garcia dedicated to the computing and related topics such as programming languages, statistics, mathematics, etc
