
Substitution Techniques

Ravneet Kaur Sidhu


MTECH (CS)- 120457071
Substitution Techniques
Basic building block of all encryption techniques
Letters of plaintext -> letters or numbers or symbols
If the plaintext is viewed as a sequence of bits, then,
plaintext bit patterns -> ciphertext bit patterns
Substitution Techniques
Various classical substitution techniques are:
Caesar Cipher
Monoalphabetic Ciphers
Playfair Cipher
Hill Cipher
Polyalphabetic Ciphers
One-Time Pad

Caesar Cipher
Earliest known substitution cipher
By Julius Caesar
Replaces each letter of the alphabet with the letter
standing three places further down the alphabet
Caesar Cipher (cont.)
can define transformation as:
a b c d e f g h i j k l m n o p q r s t u v w x y z
D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
(alphabet is wrapped around- A,B,...,Z,A,..)
Example:
meet me after the toga party
PHHW PH DIWHU WKH WRJD SDUWB


Caesar Cipher (cont.)
Mathematically give each letter a number
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Caesar cipher can be expressed as:
c = E(3, p) = (p + 3) mod (26)
General Caesar cipher :
c = E(k, p) = (p + k) mod (26)
p = D(k, c) = (c - k) mod (26)
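The two formulas above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function names `caesar_encrypt` and `caesar_decrypt` are hypothetical, and the sketch assumes lowercase plaintext, uppercase ciphertext, and that non-letters pass through unchanged.

```python
import string

ALPHABET = string.ascii_lowercase  # a=0, b=1, ..., z=25


def caesar_encrypt(plaintext, k=3):
    """c = (p + k) mod 26, applied letter by letter."""
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26].upper())
        else:
            out.append(ch)  # spaces and punctuation are left as they are
    return "".join(out)


def caesar_decrypt(ciphertext, k=3):
    """p = (c - k) mod 26, applied letter by letter."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - k) % 26])
        else:
            out.append(ch)
    return "".join(out)


print(caesar_encrypt("meet me after the toga party"))
print(caesar_decrypt("PHHW PH DIWHU WKH WRJD SDUWB"))
```

With k = 3 this should reproduce the toga-party example from the earlier slide.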



Cryptanalysis of Caesar Cipher
Brute-force cryptanalysis is easily performed, because:
Encryption and decryption algorithms are known
Only 25 keys to try
Language of the plaintext is known and easily recognizable
(if the plaintext were, e.g., a compressed text file, the trial
decryptions would not be recognizable as plaintext)
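A brute-force attack on the Caesar cipher can be sketched as below. The helper names `shift_back` and `brute_force` are hypothetical, and the sketch assumes the ciphertext contains only ASCII letters and spaces. Recognizing the right candidate is left to the reader's eye, which is exactly the third condition listed above.

```python
def shift_back(ciphertext, k):
    """Undo a shift of k on every letter; other characters are kept."""
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - k) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext):
    """Try all 25 non-trivial keys and return {key: candidate plaintext}."""
    return {k: shift_back(ciphertext, k) for k in range(1, 26)}


candidates = brute_force("PHHW PH DIWHU WKH WRJD SDUWB")
for k, text in candidates.items():
    print(k, text)
```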


Monoalphabetic Ciphers
Arbitrary substitution
each plaintext letter maps to a different random
ciphertext letter
Key: 26 letters long


Monoalphabetic Ciphers (cont.)
Example:
Plain: abcdefghijklmnopqrstuvwxyz
Cipher: DKVQFIBJWPESCXHTMYAUOLRGZN
Plaintext: ifwewishtoreplaceletters
Ciphertext: WIRFRWAJUHYFTSDVFSFUUFYA

Monoalphabetic- single cipher alphabet is used per
message
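The example above can be reproduced with a simple translation table. This is a sketch only; the names `mono_encrypt` and `mono_decrypt` are hypothetical, and the key is the cipher alphabet from the example.

```python
import string

PLAIN = string.ascii_lowercase
CIPHER = "DKVQFIBJWPESCXHTMYAUOLRGZN"  # the 26-letter key from the example

ENC_TABLE = str.maketrans(PLAIN, CIPHER)
DEC_TABLE = str.maketrans(CIPHER, PLAIN)


def mono_encrypt(plaintext):
    """Replace each plaintext letter by its fixed ciphertext letter."""
    return plaintext.lower().translate(ENC_TABLE)


def mono_decrypt(ciphertext):
    """Invert the substitution using the reversed table."""
    return ciphertext.upper().translate(DEC_TABLE)


print(mono_encrypt("ifwewishtoreplaceletters"))
```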

Monoalphabetic Cipher Security
Total keys: 26! ≈ 4 x 10^26
Still not secure
Problem: language characteristics


Cryptanalysis of Monoalphabetic Cipher
Given ciphertext:
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
Relative frequencies of the letters:

P 13.33 H 5.83 F 3.33 B 1.67 C 0.00
Z 11.67 D 5.00 W 3.33 G 1.67 K 0.00
S 8.33 E 5.00 Q 2.50 Y 1.67 L 0.00
U 8.33 V 4.17 T 2.50 I 0.83 N 0.00
O 7.50 X 4.17 A 1.67 J 0.83 R 0.00
M 6.67
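The relative frequencies in the table can be recomputed from the ciphertext. A minimal sketch, assuming the three ciphertext lines are joined without separators; the function name `relative_frequencies` is hypothetical.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)


def relative_frequencies(text):
    """Return {letter: percentage}, most frequent letter first."""
    counts = Counter(ch for ch in text if "A" <= ch <= "Z")
    total = sum(counts.values())
    return {letter: round(100 * n / total, 2)
            for letter, n in counts.most_common()}


freqs = relative_frequencies(CIPHERTEXT)
for letter, pct in freqs.items():
    print(letter, pct)
```

Letters that never occur (C, K, L, N, R in the table) simply do not appear in the result.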
Cryptanalysis of Monoalphabetic Cipher (cont.)
Frequencies are compared to a standard frequency
distribution for English (Fig.)
Cipher letters P and Z are the equivalents of plain letters e and t
High frequencies: cipher letters {S,U,O,M,H} -> plain letters {a,h,i,n,o,r,s}
Low frequencies: cipher letters {A,B,G,Y,I,J} -> plain letters {b,j,k,q,v,x,z}
Cryptanalysis of Monoalphabetic Cipher (cont.)
The most common digram - th
Given ciphertext:
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
Correspondence:
Z with t
W with h

Equate P with e (cipher letters P and Z are the equivalents of plain letters e and t)
Translate ZWP as "the" (the most common trigram)
Cipher letter S is the equivalent of plain letter a
Translate ZWSZ as th_t, i.e. "that"
Cryptanalysis of Monoalphabetic Cipher (cont.)
Z - t, W - h, P - e, S - a
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
.t.a........e..e.te..a.that.e.e.a.......a...t
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
...e.t...ta.t.ha.e.ee..a.e..th....t..a.
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
.e..e.e.tat..e...the..et............
(. = letter not yet identified)


proceeding with trial and error (+ adding spaces)
finally get:
it was disclosed yesterday that several informal but
direct contacts have been made with political
representatives of the viet cong in moscow


Cryptanalysis of Monoalphabetic Cipher (cont.)
Monoalphabetic ciphers reflect the frequency data of the original
plaintext
Countermeasure: homophones (multiple substitutes
for a single letter)
Example:
e -> cipher symbols 16, 74, 35 and 21 (each used in
rotation)
Even so, multiple-letter patterns (e.g. digram frequencies) survive in the
ciphertext, so cryptanalysis remains relatively straightforward
Playfair Cipher
Playfair - multiple-letter encryption cipher
Plaintext digrams- single units
Plaintext digrams -> Ciphertext digrams
invented by Charles Wheatstone in 1854, but named
after his friend Baron Playfair
Playfair Cipher (cont.)
Playfair algorithm is based on the use of a 5x5 matrix
of letters constructed using a keyword.
Fill in the letters of the keyword (minus duplicates)
Fill rest of the matrix with remaining letters in
alphabetic order
Playfair Key Matrix
M O N A R
C H Y B D
E F G I/J K
L P Q S T
U V W X Z
5x5 matrix
Keyword- MONARCHY
Playfair Cipher
plaintext is encrypted two letters at a time
if a pair is a repeated letter, insert a filler letter such as 'x' between them
Example: balloon -> ba lx lo on
if both letters fall in the same row, replace each with letter
to right (wrapping back to start from end)
Example: ar encrypted as RM
Playfair Cipher (cont.)
if both letters fall in the same column, replace each with the
letter below it (wrapping to top from bottom)
Example: mu encrypted as CM
Playfair Cipher (cont.)
otherwise each letter is replaced by the letter in the same row
and in the column of the other letter of the pair
Example: hs becomes BP, ea becomes IM (or JM)
For h: take h's row and s's column, giving B
For s: take s's row and h's column, giving P
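The matrix construction and the three encryption rules can be sketched as follows. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, i and j are assumed to share one cell (stored as "i"), and an odd final letter is assumed to be padded with the same filler 'x'.

```python
def build_matrix(keyword):
    """5x5 matrix: keyword letters (minus duplicates), then the rest of the alphabet."""
    seen = []
    for ch in keyword.lower() + "abcdefghijklmnopqrstuvwxyz":
        if not "a" <= ch <= "z":
            continue
        ch = "i" if ch == "j" else ch  # i and j share a cell
        if ch not in seen:
            seen.append(ch)
    return [seen[r * 5:(r + 1) * 5] for r in range(5)]


def to_digrams(plaintext, filler="x"):
    """Split into letter pairs, inserting the filler between repeated letters."""
    letters = ["i" if c == "j" else c for c in plaintext.lower() if "a" <= c <= "z"]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        b = letters[i + 1] if i + 1 < len(letters) else filler
        if a == b:
            pairs.append((a, filler))  # repeated letter: pad and re-use b next round
            i += 1
        else:
            pairs.append((a, b))
            i += 2
    return pairs


def encrypt_pair(matrix, a, b):
    pos = {matrix[r][c]: (r, c) for r in range(5) for c in range(5)}
    (ra, ca), (rb, cb) = pos[a], pos[b]
    if ra == rb:  # same row: letter to the right, wrapping around
        return matrix[ra][(ca + 1) % 5] + matrix[rb][(cb + 1) % 5]
    if ca == cb:  # same column: letter below, wrapping around
        return matrix[(ra + 1) % 5][ca] + matrix[(rb + 1) % 5][cb]
    # otherwise: own row, column of the other letter
    return matrix[ra][cb] + matrix[rb][ca]


def playfair_encrypt(plaintext, keyword):
    matrix = build_matrix(keyword)
    return "".join(encrypt_pair(matrix, a, b)
                   for a, b in to_digrams(plaintext)).upper()


for row in build_matrix("monarchy"):
    print(" ".join(row).upper())
print(playfair_encrypt("hs", "monarchy"))
```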
Security of Playfair Cipher
security much improved over monoalphabetic
since have 26 x 26 = 676 digrams
would need a 676 entry frequency table to analyse (versus
26 for a monoalphabetic)
and correspondingly more ciphertext
was widely used for many years
eg. by US & British military in WW1
Security of Playfair Cipher(cont.)
Relatively easy to break, because much of the structure of the plaintext
language is left intact
A few hundred letters of ciphertext are generally sufficient

Hill Cipher
Developed by the mathematician Lester Hill in 1929
Encryption algorithm:
Takes m successive plaintext letters & substitutes for them
m ciphertext letters.
Substitution is determined by m linear equations
Hill Cipher Encryption
For m = 3:
c1 = (k11 p1 + k12 p2 + k13 p3) mod 26
c2 = (k21 p1 + k22 p2 + k23 p3) mod 26
c3 = (k31 p1 + k32 p2 + k33 p3) mod 26
(3 successive plaintext letters p1, p2, p3; 3 ciphertext letters c1, c2, c3; 3 linear equations)

In matrix form:
| c1 |   | k11 k12 k13 | | p1 |
| c2 | = | k21 k22 k23 | | p2 |  mod 26
| c3 |   | k31 k32 k33 | | p3 |
or,
C = E(K, P) = KP mod 26

C and P: column vectors (length 3) representing ciphertext and plaintext
K: 3x3 matrix, representing the encryption key
Hill Cipher Encryption (cont.)
Example:
Plaintext- paymoremoney
Encryption key:
    | 17 17  5 |
K = | 21 18 21 |
    |  2  2 19 |
First 3 letters of plaintext as a vector: p (15), a (0), y (24)

C = KP mod 26
    | 15 |   | 375 |          | 11 |
K   |  0 | = | 819 | mod 26 = | 13 | = LNS
    | 24 |   | 486 |          | 18 |


Continuing in this fashion,
ciphertext :
LNSHDLEWMTRW
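The block-by-block computation C = KP mod 26 can be sketched in plain Python. The function name `hill_apply` is hypothetical, and the sketch assumes the text length is already a multiple of m (no padding rule is given in the slides).

```python
K = [[17, 17, 5],
     [21, 18, 21],
     [2, 2, 19]]


def hill_apply(matrix, text):
    """Multiply each block of m letters (as a column vector) by the matrix, mod 26."""
    m = len(matrix)
    nums = [ord(ch) - ord("a") for ch in text.lower()]
    if len(nums) % m != 0:
        raise ValueError("text length must be a multiple of the block size m")
    out = []
    for i in range(0, len(nums), m):
        block = nums[i:i + m]
        for row in matrix:
            out.append(sum(k * p for k, p in zip(row, block)) % 26)
    return "".join(chr(n + ord("A")) for n in out)


print(hill_apply(K, "pay"))           # first block of the example
print(hill_apply(K, "paymoremoney"))  # whole message
```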

Hill Cipher Decryption
Uses the inverse of K, i.e. K^-1
P = D(K, C) = K^-1 C mod 26 = K^-1 KP mod 26 = P

Example:
Inverse of key
       |  4  9 15 |
K^-1 = | 15 17  6 |
       | 24  0 17 |

         | 17 17  5 | |  4  9 15 |   | 443 442 442 |          | 1 0 0 |
K K^-1 = | 21 18 21 | | 15 17  6 | = | 858 495 780 | mod 26 = | 0 1 0 | = I
         |  2  2 19 | | 24  0 17 |   | 494  52 365 |          | 0 0 1 |
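Both the identity check and the decryption can be verified with a few lines of Python. This sketch takes K and K^-1 from the slides as given (it does not compute the inverse), and the function names `mat_mul_mod` and `hill_decrypt` are hypothetical.

```python
K = [[17, 17, 5],
     [21, 18, 21],
     [2, 2, 19]]

K_INV = [[4, 9, 15],
         [15, 17, 6],
         [24, 0, 17]]


def mat_mul_mod(a, b, mod=26):
    """Matrix product a*b with every entry reduced modulo mod."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) % mod
             for j in range(len(b[0]))]
            for i in range(len(a))]


def hill_decrypt(ciphertext, k_inv):
    """P = K^-1 C mod 26, applied to each block of m ciphertext letters."""
    m = len(k_inv)
    nums = [ord(ch) - ord("A") for ch in ciphertext.upper()]
    out = []
    for i in range(0, len(nums), m):
        block = nums[i:i + m]
        for row in k_inv:
            out.append(sum(k * c for k, c in zip(row, block)) % 26)
    return "".join(chr(n + ord("a")) for n in out)


print(mat_mul_mod(K, K_INV))
print(hill_decrypt("LNSHDLEWMTRW", K_INV))
```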
Security of Hill Cipher
3x3 Hill cipher hides both single-letter and two-letter
frequency
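If m plaintext-ciphertext pairs are known, an m x m Hill
cipher can be easily broken (known-plaintext attack)
Arrange the m plaintext vectors as the columns of P and the matching
ciphertext vectors as the columns of C, so that C = KP mod 26
Find P^-1 (it must exist mod 26)
K = CP^-1 mod 26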

Polyalphabetic Cipher
General name -polyalphabetic substitution ciphers
improve security using multiple cipher alphabets
make cryptanalysis harder with more alphabets to
guess and flatter frequency distribution

Vigenere Cipher
Simplest polyalphabetic substitution cipher
Uses 26 Caesar ciphers, with shifts 0 to 25
Vigenere table: an aid for this scheme

Vigenere Table (figure not reproduced)
Encryption & Decryption In Vigenere Cipher
Encryption:
Key letter x
Plaintext letter y
Ciphertext letter V: at the intersection of the x-th row and the y-th column
Decryption:
Key letter x
Ciphertext letter V
Plaintext letter y: the column containing V in the x-th row

Vigenere Cipher Example
keyword = deceptive
key: deceptivedeceptivedeceptive
plaintext: wearediscoveredsaveyourself
ciphertext:ZICVTWQNGRZGVTWAVZHCQYGLMGJ
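Looking up row x and column y in the table is equivalent to adding the key letter and the plaintext letter mod 26, which gives a short sketch. The function names are hypothetical, and the sketch assumes the message contains letters only.

```python
from itertools import cycle


def vigenere_encrypt(plaintext, keyword):
    """c_i = (p_i + k_(i mod len(keyword))) mod 26."""
    return "".join(
        chr((ord(p) - ord("a") + ord(k) - ord("a")) % 26 + ord("A"))
        for p, k in zip(plaintext.lower(), cycle(keyword.lower()))
    )


def vigenere_decrypt(ciphertext, keyword):
    """p_i = (c_i - k_(i mod len(keyword))) mod 26."""
    return "".join(
        chr((ord(c) - ord("A") - (ord(k) - ord("a"))) % 26 + ord("a"))
        for c, k in zip(ciphertext.upper(), cycle(keyword.lower()))
    )


print(vigenere_encrypt("wearediscoveredsaveyourself", "deceptive"))
```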

Security of Vigenere Cipher
Multiple ciphertext letters for each plaintext letter, so letter
frequency information is obscured
Still, not all information about the plaintext structure is lost
Improvement over the Playfair cipher, but considerable frequency
information remains
If a monoalphabetic substitution is used,
statistical properties of the ciphertext should be the same as
that of the language of the plaintext.
Security of Vigenere Cipher (cont.)
If a Vigenere cipher is suspected,
progress depends on determining the length of the
keyword.
Key length determination
If two identical sequences of plaintext letters occur at a distance
that is an integer multiple of the keyword length, they will
generate identical ciphertext sequences
key: deceptivedeceptivedeceptive
plaintext: wearediscoveredsaveyourself
ciphertext:ZICVTWQNGRZGVTWAVZHCQYGLMGJ

Assumption: keyword length is 3 or 9
(VTW repeats at a distance of 9 characters)
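The search for repeated ciphertext sequences can be sketched as below. The helper names `repeated_sequences` and `candidate_key_lengths` are hypothetical; the sketch only looks at sequences of a fixed length (3 by default) and lists the divisors of each distance as candidate keyword lengths.

```python
from collections import defaultdict


def repeated_sequences(ciphertext, length=3):
    """Map each repeated substring to the distances between its occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: [b - a for a, b in zip(pos, pos[1:])]
            for seq, pos in positions.items() if len(pos) > 1}


def candidate_key_lengths(distance):
    """Divisors of the distance (other than 1) are possible keyword lengths."""
    return [d for d in range(2, distance + 1) if distance % d == 0]


repeats = repeated_sequences("ZICVTWQNGRZGVTWAVZHCQYGLMGJ")
print(repeats)
for seq, distances in repeats.items():
    print(seq, [candidate_key_lengths(d) for d in distances])
```

A repetition can also occur by chance, so on longer messages several repeated sequences are normally compared before settling on a length.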
Autokey System
Periodic nature of keyword can be eliminated (by using
nonrepeating keyword, as long as the message)
Vigenere proposed autokey system
Keyword concatenated with plaintext itself to provide
a running key
key: deceptivewearediscoveredsav
plaintext: wearediscoveredsaveyourself
ciphertext:ZICVTWQNGKZEIIGASXSTSLVVWLA
Problem: key and plaintext share the same letter frequency
distribution, so statistical techniques can still be
applied
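The autokey scheme differs from the plain Vigenere cipher only in how the key stream is built, as this sketch shows. The function names are hypothetical, and the message is assumed to contain letters only.

```python
def autokey_encrypt(plaintext, keyword):
    """Key stream = keyword followed by the plaintext itself."""
    plaintext = plaintext.lower()
    key = (keyword.lower() + plaintext)[:len(plaintext)]
    return "".join(
        chr((ord(p) - ord("a") + ord(k) - ord("a")) % 26 + ord("A"))
        for p, k in zip(plaintext, key)
    )


def autokey_decrypt(ciphertext, keyword):
    """Each recovered plaintext letter is appended to the key stream."""
    key = list(keyword.lower())
    plain = []
    for i, c in enumerate(ciphertext.upper()):
        p = chr((ord(c) - ord("A") - (ord(key[i]) - ord("a"))) % 26 + ord("a"))
        plain.append(p)
        key.append(p)
    return "".join(plain)


print(autokey_encrypt("wearediscoveredsaveyourself", "deceptive"))
```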


One-Time Pad
Random key as long as the message, not repeated
Key is used to encrypt and decrypt a single message and is then
discarded
One-time pad
Unbreakable since ciphertext bears no statistical
relationship to the plaintext
Problems : generation & safe distribution of key
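A letter-based one-time pad can be sketched as follows. Two assumptions that are not stated in the slides: the pad is combined with the message by letter-wise addition mod 26 (as in the Vigenere scheme), and Python's `secrets` module stands in for a true random source. The function names are hypothetical.

```python
import secrets


def generate_pad(length):
    """One random key letter per message letter; never reuse a pad."""
    return "".join(chr(secrets.randbelow(26) + ord("a")) for _ in range(length))


def otp_encrypt(plaintext, pad):
    if len(pad) < len(plaintext):
        raise ValueError("pad must be at least as long as the message")
    return "".join(
        chr((ord(p) - ord("a") + ord(k) - ord("a")) % 26 + ord("A"))
        for p, k in zip(plaintext.lower(), pad)
    )


def otp_decrypt(ciphertext, pad):
    return "".join(
        chr((ord(c) - ord("A") - (ord(k) - ord("a"))) % 26 + ord("a"))
        for c, k in zip(ciphertext.upper(), pad)
    )


message = "wearediscoveredsaveyourself"
pad = generate_pad(len(message))
ciphertext = otp_encrypt(message, pad)
print(ciphertext)
print(otp_decrypt(ciphertext, pad))
```

The sketch makes the two problems above concrete: the pad is as long as the message, and sender and receiver both need the same pad before any message is sent.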

Thanks
