MTECH (CS) - 120457071

Substitution Techniques
- Basic building block of all encryption techniques
- Letters of plaintext are replaced by other letters, numbers, or symbols
- If the plaintext is viewed as a sequence of bits: plaintext bit patterns -> ciphertext bit patterns

Substitution Techniques (cont.)
Classical substitution techniques:
- Caesar Cipher
- Monoalphabetic Ciphers
- Playfair Cipher
- Hill Cipher
- Polyalphabetic Ciphers
- One-Time Pad

Caesar Cipher
- Earliest known substitution cipher, used by Julius Caesar
- Replaces each letter of the alphabet with the letter standing three places further down the alphabet

Caesar Cipher (cont.)
The transformation can be defined as:
plain:  a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
(the alphabet wraps around: A, B, ..., Z, A, ...)
Example:
plaintext:  meet me after the toga party
ciphertext: PHHW PH DIWHU WKH WRJD SDUWB

Caesar Cipher (cont.)
Mathematically, give each letter a number:
a b c d e f g h i j k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
The Caesar cipher can be expressed as:
c = E(3, p) = (p + 3) mod 26
General Caesar cipher, for a shift k in the range 1 to 25:
c = E(k, p) = (p + k) mod 26
p = D(k, c) = (c - k) mod 26
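
The two formulas for the general Caesar cipher can be tried out with a minimal Python sketch. The function names `caesar_encrypt` and `caesar_decrypt` are illustrative choices, not part of the slides; the sketch assumes the letters a-z and passes every other character through unchanged.

```python
import string

ALPHABET = string.ascii_lowercase  # a=0, b=1, ..., z=25

def caesar_encrypt(plaintext: str, k: int = 3) -> str:
    """Apply c = (p + k) mod 26 to each letter; other characters pass through."""
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            p = ALPHABET.index(ch)
            out.append(ALPHABET[(p + k) % 26].upper())
        else:
            out.append(ch)
    return "".join(out)

def caesar_decrypt(ciphertext: str, k: int = 3) -> str:
    """Apply p = (c - k) mod 26 to each letter; other characters pass through."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            c = ALPHABET.index(ch)
            out.append(ALPHABET[(c - k) % 26])
        else:
            out.append(ch)
    return "".join(out)

print(caesar_encrypt("meet me after the toga party"))   # PHHW PH DIWHU WKH WRJD SDUWB
print(caesar_decrypt("PHHW PH DIWHU WKH WRJD SDUWB"))   # meet me after the toga party
```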

Cryptanalysis of Caesar Cipher
Brute-force cryptanalysis is easily performed because:
- The encryption and decryption algorithms are known
- There are only 25 keys to try
- The language of the plaintext is known and easily recognizable
(If the plaintext is, for example, a compressed text file that is then enciphered, the output of a brute-force trial is not recognizable as plaintext, so the attack becomes harder.)
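
The brute-force attack amounts to trying all 25 shifts and reading the results. The sketch below assumes uppercase ciphertext over A-Z; `shift_back` and `brute_force` are hypothetical helper names. Picking the readable candidate is left to the human reader, which is exactly the "language is recognizable" condition.

```python
def shift_back(ciphertext: str, k: int) -> str:
    """Undo a Caesar shift of k: p = (c - k) mod 26 for the letters A-Z."""
    return "".join(
        chr((ord(ch) - ord("A") - k) % 26 + ord("a")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext.upper()
    )

def brute_force(ciphertext: str) -> dict[int, str]:
    """Return the candidate plaintext for every key from 1 to 25."""
    return {k: shift_back(ciphertext, k) for k in range(1, 26)}

for k, candidate in brute_force("PHHW PH DIWHU WKH WRJD SDUWB").items():
    print(f"{k:2d}  {candidate}")
```

Only the line for key 3 reads as English, which identifies both the key and the plaintext.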

Monoalphabetic Ciphers
- Arbitrary substitution: each plaintext letter maps to a different, randomly chosen ciphertext letter
- The key is a permutation of the alphabet, 26 letters long
- Monoalphabetic: a single cipher alphabet is used per message
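
As an illustration of a 26-letter key, the following sketch builds a random permutation and uses it for encryption and decryption. The names `make_key`, `mono_encrypt`, and `mono_decrypt` are assumptions made for this example, and Python's `random` module is used only for demonstration; it is not a cryptographically secure source of keys.

```python
import random
import string

def make_key(rng: random.Random) -> str:
    """Return a random permutation of the 26 uppercase letters."""
    letters = list(string.ascii_uppercase)
    rng.shuffle(letters)
    return "".join(letters)

def mono_encrypt(plaintext: str, key: str) -> str:
    """Replace plaintext letter number i (a=0) with key[i]."""
    table = str.maketrans(string.ascii_lowercase, key)
    return plaintext.lower().translate(table)

def mono_decrypt(ciphertext: str, key: str) -> str:
    """Invert the substitution defined by the key."""
    table = str.maketrans(key, string.ascii_lowercase)
    return ciphertext.upper().translate(table)

key = make_key(random.Random(0))  # fixed seed only to make the sketch repeatable
ciphertext = mono_encrypt("meet me after the toga party", key)
print(key)
print(ciphertext)
print(mono_decrypt(ciphertext, key))
```

The Caesar cipher is the special case in which the key is the alphabet rotated by three places, `DEFGHIJKLMNOPQRSTUVWXYZABC`.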

Monoalphabetic Cipher Security
- Total number of keys: 26! ≈ 4 x 10^26
- Still not secure
- Problem: the cipher preserves the statistical characteristics of the plaintext language
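
The size of the key space can be checked directly; this short sketch only evaluates 26! and compares it with the 25 keys of the Caesar cipher.

```python
import math

caesar_keys = 25
mono_keys = math.factorial(26)    # number of permutations of 26 letters

print(mono_keys)                  # 403291461126605635584000000
print(f"{mono_keys:.2e}")         # 4.03e+26
print(mono_keys // caesar_keys)   # how many times larger than the Caesar key space
```

The large key space rules out brute force, but it does not help against the frequency analysis described next.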

Cryptanalysis of Monoalphabetic Cipher
Given ciphertext:
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
Relative frequencies of the letters (in percent):
P 13.33   H 5.83   F 3.33   B 1.67   C 0.00
Z 11.67   D 5.00   W 3.33   G 1.67   K 0.00
S  8.33   E 5.00   Q 2.50   Y 1.67   L 0.00
U  8.33   V 4.17   T 2.50   I 0.83   N 0.00
O  7.50   X 4.17   A 1.67   J 0.83   R 0.00
M  6.67

Cryptanalysis of Monoalphabetic Cipher (cont.)
- The frequencies are compared to a standard frequency distribution for English (Fig.)
- Cipher letters P and Z are likely the equivalents of plain letters e and t
- High frequencies: cipher letters {S, U, O, M, H} likely correspond to plain letters from {a, h, i, n, o, r, s}
- Low frequencies: cipher letters {A, B, G, Y, I, J} likely correspond to plain letters from {b, j, k, q, v, x, z}

Cryptanalysis of Monoalphabetic Cipher (cont.)
- The most common digram in English is th
- In the ciphertext, the digram ZW appears three times
- Correspondence: Z with t, W with h
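
The frequency table and the digram count can be reproduced with a short sketch. The helper names `relative_frequencies` and `digram_counts` are illustrative; the three ciphertext lines are joined into one string, on the assumption that the line breaks are only layout.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)

def relative_frequencies(text: str) -> dict[str, float]:
    """Percentage of the text taken up by each letter, most frequent first."""
    counts = Counter(text)
    return {letter: round(100 * n / len(text), 2) for letter, n in counts.most_common()}

def digram_counts(text: str) -> Counter:
    """Count every pair of adjacent letters (overlapping pairs included)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

for letter, percent in relative_frequencies(CIPHERTEXT).items():
    print(f"{letter} {percent:5.2f}")

print("ZW occurs", digram_counts(CIPHERTEXT)["ZW"], "times")
```

Letters that never occur (C, K, L, N, R) are simply absent from the result, which corresponds to the 0.00 entries in the table.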

Cryptanalysis of Monoalphabetic Cipher (cont.)
- With Z as t and W as h, equate P with e
- The sequence ZWP then translates as "the", the most common trigram in English

Cryptanalysis of Monoalphabetic Cipher (cont.)
- The sequence ZWSZ translates as th_t
- This suggests that cipher letter S is the equivalent of plain letter a, giving "that"

Cryptanalysis of Monoalphabetic Cipher (cont.)
Substituting Z - t, W - h, P - e, S - a (a dot marks a letter not yet determined):
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
.t.a........e..e.te..a.that.e.e.a.......a...t
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
...e.t...ta.t.ha.e.ee..a.e..th....t..a.
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
.e..e.e.tat..e...the..et............

Cryptanalysis of Monoalphabetic Cipher (cont.)
Proceeding with trial and error, and adding spaces, the complete plaintext is:
it was disclosed yesterday that several informal but direct contacts have been made with political representatives of the viet cong in moscow
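
The intermediate step of this analysis, in which only the four correspondences Z - t, W - h, P - e, S - a are known, can be reproduced mechanically. In this sketch `partial_decrypt` is a hypothetical helper, and the dot used for undetermined letters is a presentation choice.

```python
LINES = [
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ",
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX",
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ",
]

# Correspondences established so far by frequency analysis.
PARTIAL_KEY = {"Z": "t", "W": "h", "P": "e", "S": "a"}

def partial_decrypt(ciphertext: str, key: dict[str, str], unknown: str = ".") -> str:
    """Substitute the known letters and mark every other position as unknown."""
    return "".join(key.get(ch, unknown) for ch in ciphertext)

for line in LINES:
    print(line)
    print(partial_decrypt(line, PARTIAL_KEY))
```

Adding a new guess to `PARTIAL_KEY` and rerunning is one way to carry out the trial-and-error stage.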

Cryptanalysis of Monoalphabetic Cipher (cont.)
- Monoalphabetic ciphers reflect the frequency data of the original plaintext
- Countermeasure: homophones (multiple substitutes for a single letter)
- Example: the letter e could be assigned the cipher symbols 16, 74, 35, and 21, each used in rotation
- Multiple-letter patterns (such as digram frequencies) still survive in the ciphertext, so cryptanalysis remains straightforward

Playfair Cipher
- Playfair is a multiple-letter encryption cipher
- Plaintext digrams are treated as single units: plaintext digrams -> ciphertext digrams
- Invented by Charles Wheatstone in 1854, but named after his friend Baron Playfair

Playfair Cipher (cont.)
- The Playfair algorithm is based on a 5x5 matrix of letters constructed using a keyword
- Fill in the letters of the keyword (minus duplicates)
- Fill the rest of the matrix with the remaining letters in alphabetic order
- I and J share one cell

Playfair Key Matrix
5x5 matrix for the keyword MONARCHY:
M  O  N  A    R
C  H  Y  B    D
E  F  G  I/J  K
L  P  Q  S    T
U  V  W  X    Z

Playfair Cipher (cont.)
- Plaintext is encrypted two letters at a time
- If a pair is a repeated letter, insert a filler such as 'x'
  Example: balloon -> ba lx lo on
- If both letters fall in the same row, replace each with the letter to its right (wrapping from the end of the row back to the start)
  Example: ar is encrypted as RM

Playfair Cipher (cont.)
- If both letters fall in the same column, replace each with the letter below it (wrapping from the bottom to the top)
  Example: mu is encrypted as CM

Playfair Cipher (cont.)
- Otherwise, each letter is replaced by the letter that lies in its own row and in the column occupied by the other letter of the pair
  Example: hs becomes BP, and ea becomes IM (or JM)
  For h: take h's row and s's column, giving B. For s: take s's row and h's column, giving P.

Security of Playfair Cipher
- Security is much improved over monoalphabetic ciphers, since there are 26 x 26 = 676 digrams
- An attacker would need a 676-entry frequency table to analyse (versus 26 for a monoalphabetic cipher), and correspondingly more ciphertext
- Was widely used for many years, e.g. by the US and British military in WW1

Security of Playfair Cipher (cont.)
- Nevertheless relatively easy to break, because it still leaves much of the structure of the plaintext language intact
- A few hundred letters of ciphertext are generally sufficient
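
The Playfair matrix construction and the three digram rules (same row, same column, rectangle) can be sketched as follows. The function names are illustrative, the matrix is stored as a flat list of 25 letters, and J is merged into I. As a simplifying assumption, a trailing single letter is also padded with the filler, and the sketch does not handle a plaintext that itself contains a doubled filler letter.

```python
def build_matrix(keyword: str) -> list[str]:
    """Return the 5x5 matrix as 25 letters in row-major order (J merged into I)."""
    seen: list[str] = []
    for ch in keyword.upper() + "ABCDEFGHIKLMNOPQRSTUVWXYZ":
        ch = "I" if ch == "J" else ch
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return seen

def split_digrams(plaintext: str, filler: str = "X") -> list[str]:
    """Split into letter pairs, inserting the filler between repeated letters."""
    letters = ["I" if c == "J" else c for c in plaintext.upper() if c.isalpha()]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        b = letters[i + 1] if i + 1 < len(letters) else None
        if b is None or a == b:
            pairs.append(a + filler)  # repeated or trailing letter gets the filler
            i += 1
        else:
            pairs.append(a + b)
            i += 2
    return pairs

def encrypt_pair(pair: str, m: list[str]) -> str:
    """Encrypt one digram with the row, column, or rectangle rule."""
    ra, ca = divmod(m.index(pair[0]), 5)
    rb, cb = divmod(m.index(pair[1]), 5)
    if ra == rb:  # same row: take the letter to the right, wrapping around
        return m[ra * 5 + (ca + 1) % 5] + m[rb * 5 + (cb + 1) % 5]
    if ca == cb:  # same column: take the letter below, wrapping around
        return m[((ra + 1) % 5) * 5 + ca] + m[((rb + 1) % 5) * 5 + cb]
    return m[ra * 5 + cb] + m[rb * 5 + ca]  # own row, other letter's column

def playfair_encrypt(plaintext: str, keyword: str) -> str:
    m = build_matrix(keyword)
    return "".join(encrypt_pair(p, m) for p in split_digrams(plaintext))

matrix = build_matrix("MONARCHY")
for row in range(5):
    print(" ".join(matrix[row * 5:row * 5 + 5]))
print(split_digrams("balloon"))                            # ['BA', 'LX', 'LO', 'ON']
print([encrypt_pair(p, matrix) for p in ("AR", "MU", "HS", "EA")])
```

The last line reproduces the slide examples RM, CM, BP, and IM; the shared I/J cell is printed as I.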
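
Hill Cipher
- Developed by the mathematician Lester Hill in 1929
- Encryption algorithm: takes m successive plaintext letters and substitutes for them m ciphertext letters
- The substitution is determined by m linear equations

Hill Cipher Encryption
For m = 3:
\[
\begin{aligned}
c_1 &= (k_{11} p_1 + k_{12} p_2 + k_{13} p_3) \bmod 26 \\
c_2 &= (k_{21} p_1 + k_{22} p_2 + k_{23} p_3) \bmod 26 \\
c_3 &= (k_{31} p_1 + k_{32} p_2 + k_{33} p_3) \bmod 26
\end{aligned}
\]
In matrix form:
\[
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
=
\begin{pmatrix} k_{11} & k_{12} & k_{13} \\ k_{21} & k_{22} & k_{23} \\ k_{31} & k_{32} & k_{33} \end{pmatrix}
\begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix}
\bmod 26
\]
or, C = E(K, P) = KP mod 26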
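- 3 successive plaintext letters: p1, p2, p3
- 3 ciphertext letters: c1, c2, c3
- 3 linear equations
- C and P are column vectors of length 3, representing the ciphertext and the plaintext respectively
- K is a 3x3 matrix, representing the encryption key

Hill Cipher Encryption (cont.)
Example:
Plaintext: paymoremoney
Encryption key:
\[
K = \begin{pmatrix} 17 & 17 & 5 \\ 21 & 18 & 21 \\ 2 & 2 & 19 \end{pmatrix}
\]
The first 3 letters of the plaintext, p (15), a (0), y (24), form the vector
\[
P = \begin{pmatrix} 15 \\ 0 \\ 24 \end{pmatrix}
\]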
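C = KP mod 26:
\[
K \begin{pmatrix} 15 \\ 0 \\ 24 \end{pmatrix}
= \begin{pmatrix} 375 \\ 819 \\ 486 \end{pmatrix} \bmod 26
= \begin{pmatrix} 11 \\ 13 \\ 18 \end{pmatrix}
= \text{LNS}
\]
Continuing in this fashion, the ciphertext for the entire plaintext is LNSHDLEWMTRW.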
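
The block-by-block computation C = KP mod 26 can be checked with a small sketch in plain Python. The names `mat_vec_mod` and `hill_encrypt` are illustrative; the sketch assumes a plaintext made only of the letters a-z whose length is a multiple of the block size.

```python
K = [
    [17, 17, 5],
    [21, 18, 21],
    [2, 2, 19],
]

def mat_vec_mod(matrix: list[list[int]], vector: list[int], modulus: int = 26) -> list[int]:
    """Multiply a matrix by a column vector and reduce each entry mod 26."""
    return [sum(k * p for k, p in zip(row, vector)) % modulus for row in matrix]

def hill_encrypt(plaintext: str, key: list[list[int]]) -> str:
    """Encrypt m letters at a time with C = KP mod 26 (a=0, ..., z=25)."""
    m = len(key)
    nums = [ord(ch) - ord("a") for ch in plaintext.lower() if ch.isalpha()]
    if len(nums) % m:
        raise ValueError("plaintext length must be a multiple of the block size")
    out: list[int] = []
    for i in range(0, len(nums), m):
        out.extend(mat_vec_mod(key, nums[i:i + m]))
    return "".join(chr(c + ord("A")) for c in out)

print(mat_vec_mod(K, [15, 0, 24]))      # [11, 13, 18], i.e. LNS
print(hill_encrypt("paymoremoney", K))  # LNSHDLEWMTRW
```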

Hill Cipher Decryption
- Uses the inverse of K, written K^{-1}
\[
P = D(K, C) = K^{-1} C \bmod 26 = K^{-1} K P \bmod 26 = P
\]
Example: inverse of the key
\[
K^{-1} = \begin{pmatrix} 4 & 9 & 15 \\ 15 & 17 & 6 \\ 24 & 0 & 17 \end{pmatrix}
\]
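
A short sketch can confirm that this matrix really is the inverse of the example key modulo 26 and that it recovers the plaintext. The names `mat_mul_mod` and `hill_decrypt` are illustrative; the ciphertext LNSHDLEWMTRW is the one from the encryption example.

```python
K = [[17, 17, 5], [21, 18, 21], [2, 2, 19]]
K_INV = [[4, 9, 15], [15, 17, 6], [24, 0, 17]]

def mat_mul_mod(a: list[list[int]], b: list[list[int]], modulus: int = 26) -> list[list[int]]:
    """Multiply two matrices and reduce every entry mod 26."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) % modulus for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def hill_decrypt(ciphertext: str, key_inverse: list[list[int]]) -> str:
    """Decrypt m letters at a time with P = K^{-1} C mod 26."""
    m = len(key_inverse)
    nums = [ord(ch) - ord("A") for ch in ciphertext.upper() if ch.isalpha()]
    out: list[int] = []
    for i in range(0, len(nums), m):
        block = nums[i:i + m]
        out.extend(sum(k * c for k, c in zip(row, block)) % 26 for row in key_inverse)
    return "".join(chr(p + ord("a")) for p in out)

print(mat_mul_mod(K, K_INV))                # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(hill_decrypt("LNSHDLEWMTRW", K_INV))  # paymoremoney
```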
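\[
K K^{-1} \bmod 26 = I
\]

Security of Hill Cipher
- A 3x3 Hill cipher hides both single-letter and two-letter frequency information
- It is easily broken with a known-plaintext attack: if m plaintext-ciphertext pairs are known, each of length m, the key of an m x m Hill cipher can be recovered
- Arrange the m plaintext vectors as the columns of a matrix P and the corresponding ciphertext vectors as the columns of C, so that C = KP mod 26
- If P is invertible mod 26, find P^{-1} and compute K = C P^{-1} mod 26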

Polyalphabetic Cipher
- General name: polyalphabetic substitution ciphers
- Improve security by using multiple cipher alphabets
- Make cryptanalysis harder, with more alphabets to guess and a flatter frequency distribution

Vigenere Cipher
- The simplest polyalphabetic substitution cipher
- Consists of the 26 Caesar ciphers, with shifts 0 to 25; each key letter selects the shift for one plaintext letter
- The Vigenere table is an aid for this scheme

Vigenere Table (figure)

Encryption & Decryption in Vigenere Cipher
Encryption:
- Given key letter x and plaintext letter y, the ciphertext letter V is at the intersection of the row labeled x and the column labeled y
Decryption:
- Given key letter x and ciphertext letter V, find V in the row labeled x; the plaintext letter y is the label of the column containing V
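
Looking up row x and column y in the table is the same as adding the two letter numbers modulo 26, so the scheme can be sketched without the table. The function names are illustrative, the input is assumed to contain only the letters a-z, and the key "deceptive" with the plaintext "wearediscoveredsaveyourself" is the example used in the security discussion of this cipher.

```python
from itertools import cycle

def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Shift each plaintext letter by the matching key letter: c = (p + k) mod 26."""
    out = []
    for p, k in zip(plaintext.lower(), cycle(keyword.lower())):
        shift = ord(k) - ord("a")
        out.append(chr((ord(p) - ord("a") + shift) % 26 + ord("A")))
    return "".join(out)

def vigenere_decrypt(ciphertext: str, keyword: str) -> str:
    """Undo the shift selected by each key letter: p = (c - k) mod 26."""
    out = []
    for c, k in zip(ciphertext.upper(), cycle(keyword.lower())):
        shift = ord(k) - ord("a")
        out.append(chr((ord(c) - ord("A") - shift) % 26 + ord("a")))
    return "".join(out)

ciphertext = vigenere_encrypt("wearediscoveredsaveyourself", "deceptive")
print(ciphertext)                                 # ZICVTWQNGRZGVTWAVZHCQYGLMGJ
print(vigenere_decrypt(ciphertext, "deceptive"))  # wearediscoveredsaveyourself
```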

Security of Vigenere Cipher
- There are multiple ciphertext letters for each plaintext letter, one for each distinct letter of the keyword
- Still, not all information about the plaintext structure is lost
- An improvement over the Playfair cipher, but considerable frequency information remains
- If a monoalphabetic substitution is used, the statistical properties of the ciphertext should be the same as those of the language of the plaintext

Security of Vigenere Cipher (cont.)
- If a Vigenere cipher is suspected, progress depends on determining the length of the keyword
Key length determination:
- If two identical sequences of plaintext letters occur at a distance that is an integer multiple of the keyword length, they will generate identical ciphertext sequences
key:        deceptivedeceptivedeceptive
plaintext:  wearediscoveredsaveyourself
ciphertext: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
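
The search for repeated ciphertext sequences can be automated. This sketch looks for repeated trigrams and lists the factors of the distance between them as candidate keyword lengths; `repeated_sequences` and `candidate_key_lengths` are hypothetical helper names, and the trigram length of 3 is an assumption chosen to suit this short example.

```python
from collections import defaultdict

def repeated_sequences(ciphertext: str, length: int = 3) -> dict[str, list[int]]:
    """Map every sequence that occurs more than once to its start positions."""
    positions: dict[str, list[int]] = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}

def candidate_key_lengths(distance: int) -> list[int]:
    """Keyword lengths consistent with a repeat at this distance (its factors above 1)."""
    return [n for n in range(2, distance + 1) if distance % n == 0]

repeats = repeated_sequences("ZICVTWQNGRZGVTWAVZHCQYGLMGJ")
for seq, pos in repeats.items():
    distance = pos[1] - pos[0]
    print(seq, pos, "distance", distance, "candidates", candidate_key_lengths(distance))
```

With longer ciphertexts several repeats are usually found, and the keyword length is sought among the common factors of their distances.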
- The plaintext sequence "red" occurs twice, 9 positions apart, and both times produces the ciphertext sequence VTW
- Assumption: the keyword length is a factor of 9, i.e. 3 or 9

Autokey System
- The periodic nature of the keyword can be eliminated by using a nonrepeating keyword that is as long as the message
- Vigenere proposed the autokey system: the keyword is concatenated with the plaintext itself to provide a running key
key:        deceptivewearediscoveredsav
plaintext:  wearediscoveredsaveyourself
ciphertext: ZICVTWQNGKZEIIGASXSTSLVVWLA
- Problem: the key and the plaintext share the same frequency distribution of letters, so statistical techniques can still be applied
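
The autokey scheme differs from the repeating-key scheme only in how the running key is built. In this sketch (function names are illustrative, input assumed to be the letters a-z only) decryption has to rebuild the running key as it goes, because each recovered plaintext letter becomes a later key letter.

```python
def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Encrypt with the running key: keyword followed by the plaintext itself."""
    plaintext = plaintext.lower()
    running_key = (keyword.lower() + plaintext)[:len(plaintext)]
    return "".join(
        chr((ord(p) + ord(k) - 2 * ord("a")) % 26 + ord("A"))
        for p, k in zip(plaintext, running_key)
    )

def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Decrypt letter by letter, appending each recovered letter to the key."""
    key = list(keyword.lower())
    out = []
    for i, c in enumerate(ciphertext.upper()):
        shift = ord(key[i]) - ord("a")
        p = chr((ord(c) - ord("A") - shift) % 26 + ord("a"))
        out.append(p)
        key.append(p)  # recovered plaintext extends the running key
    return "".join(out)

ciphertext = autokey_encrypt("wearediscoveredsaveyourself", "deceptive")
print(ciphertext)                                # ZICVTWQNGKZEIIGASXSTSLVVWLA
print(autokey_decrypt(ciphertext, "deceptive"))  # wearediscoveredsaveyourself
```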

One-Time Pad
- The key is random, as long as the message, and not repeated
- The key is used to encrypt and decrypt a single message and is then discarded

One-Time Pad (cont.)
- Unbreakable, since the ciphertext bears no statistical relationship to the plaintext
- Problems: generation and safe distribution of the key
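
A one-time pad over the 26-letter alphabet can be sketched in the same style as the earlier ciphers. The function names are illustrative, the pad is drawn with Python's `secrets` module as one possible source of randomness, and the sketch leaves out the hard parts named above: distributing the pad safely and making sure it is never reused.

```python
import secrets
import string

def generate_pad(length: int) -> str:
    """Draw one random key letter per message letter."""
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))

def otp_encrypt(plaintext: str, pad: str) -> str:
    """c = (p + k) mod 26, with a pad at least as long as the message."""
    if len(pad) < len(plaintext):
        raise ValueError("the pad must be at least as long as the message")
    return "".join(
        chr((ord(p) + ord(k) - 2 * ord("a")) % 26 + ord("A"))
        for p, k in zip(plaintext.lower(), pad)
    )

def otp_decrypt(ciphertext: str, pad: str) -> str:
    """p = (c - k) mod 26, using the same pad once and then discarding it."""
    return "".join(
        chr((ord(c) - ord("A") - (ord(k) - ord("a"))) % 26 + ord("a"))
        for c, k in zip(ciphertext.upper(), pad)
    )

message = "wearediscoveredsaveyourself"
pad = generate_pad(len(message))
ciphertext = otp_encrypt(message, pad)
print(pad)
print(ciphertext)
print(otp_decrypt(ciphertext, pad))
```

Because the pad is random, the printed pad and ciphertext differ on every run; only the decrypted message stays the same.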