Fall 2018 CS 222: Discrete Structures 1
Cryptography Intro and RSA
Well, a gentle intro to cryptography, followed by a description of public key crypto and RSA.
Definition
- Cryptology is the study of secret writing
- Concerned with developing algorithms which may be used:
– To conceal the content of some message from all except the sender and recipient (privacy or secrecy), and/or
– To verify the correctness of a message to the recipient (authentication or integrity)
- The basis of many technological solutions to computer and communication security problems
Terminology
- Cryptography: The art or science encompassing the principles and methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back to its original form
- Plaintext: The original intelligible message
- Ciphertext: The transformed message
- Cipher: An algorithm for transforming an intelligible message into one that is unintelligible
Terminology (cont.)
- Key: Some critical information used by the cipher, known only to the sender & receiver
– Or perhaps only known to one or the other
- Encrypt: The process of converting plaintext to ciphertext using a cipher and a key
- Decrypt: The process of converting ciphertext back into plaintext using a cipher and a key
- Cryptanalysis: The study of principles and methods of transforming an unintelligible message back into an intelligible message without knowledge of the key!
Concepts
- Encryption: The mathematical operation mapping plaintext to ciphertext using the specified key: C = E_K(P)
- Decryption: The mathematical operation mapping ciphertext to plaintext using the specified key: P = E_K^(-1)(C) = D_K(C)
- Cryptographic system: The family of transformations from which the cipher function E_K is chosen
– It is a family of transformations since each key K effectively creates a different transformation
Concepts (cont.)
- Key: The parameter which selects which individual transformation is used; it is selected from a keyspace K
- Usually assume the cryptographic system is public, and only the key is secret information
– Why? Because we don’t want to rely on “security through obscurity”
Rough Classification
- Symmetric-key encryption algorithms
- Public-key encryption algorithms
- Digital signature algorithms
- Hash functions
- Cipher Classes
– Block ciphers
– Stream ciphers
Symmetric-Key Encryption System
[Figure: A key source produces a random key K, which reaches both parties over a secure key channel. The message source encrypts M with key K, C = E_K(M), and sends C over an insecure communication channel that the adversary can observe. The message destination decrypts C with the saved key K, M = D_K(C).]
Symmetric-Key Encryption Algorithms
- A symmetric-key encryption algorithm is one where the sender and the recipient share a common, or closely related, key
– Managing this key is nontrivial
– Plus there is the question: how does the key come to be shared?
- Historically, symmetric-key algorithms were developed first
– They are generally good at efficiently encrypting large amounts of data
- As of Feb. 2017, an Intel i7 with integrated AES instruction set can encrypt almost 12 GB/s
Exhaustive Key Search
- Always theoretically possible to simply try every key
- Most basic attack; the work is proportional to the size of the key space, so it doubles with every added key bit
- Typically, key is large enough so that exhaustive search is not computationally feasible
– Do the math: Consider a 128-bit key. Key space is 2^128, roughly 3.4 x 10^38 keys. One billion machines each testing one billion keys each second require (3.4 x 10^38)/(10^18) seconds to test them all. That’s 3.4 x 10^20 seconds, or about 10.8 trillion years
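The arithmetic above can be reproduced in a few lines of Python. This is only a sketch: the function name is made up for illustration, and the default machine count and per-machine rate are simply the figures assumed on the slide.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def brute_force_years(key_bits, machines=10**9, keys_per_second=10**9):
    """Worst-case years needed to try every key of the given bit length."""
    keyspace = 2 ** key_bits                      # number of possible keys
    seconds = keyspace / (machines * keys_per_second)
    return seconds / SECONDS_PER_YEAR

if __name__ == "__main__":
    for bits in (56, 64, 128):
        print(f"{bits}-bit key: {brute_force_years(bits):.3g} years")
```

Each extra key bit doubles the result, which is why key length matters far more than faster hardware.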
The Caesar Cipher
- 2000 years ago Julius Caesar used a simple substitution cipher, now known as the Caesar cipher
– First attested use in military affairs (e.g., Gallic Wars)
- Concept: replace each letter of the alphabet with another letter that is k letters after the original letter
- Example: replace each letter by the 3rd letter after it
Plain: I CAME I SAW I CONQUERED
Cipher: L FDPH L VDZ L FRQTXHUHG
The Caesar Cipher
- Can describe this mapping (or translation alphabet) as:
Plain: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: DEFGHIJKLMNOPQRSTUVWXYZABC
General Caesar Cipher
- Can use any shift from 1 to 25
– I.e. replace each letter of the message by a letter a fixed distance away
- Specify key letter as the letter a plaintext A maps to
– E.g. a key letter of F means A maps to F, B to G, ... Y to D, Z to E, i.e. shift letters by 5 places
- Hence have 26 (25 useful) ciphers
– Hence breaking this is easy. Just try all 25 keys one by one.
Mathematics
- If we assign the letters of the alphabet the numbers from 0 to 25, then the Caesar cipher can be expressed mathematically as follows:
For a fixed key k, and for each plaintext letter p, substitute the ciphertext letter C given by
C = (p + k) mod 26
Decryption is equally simple:
p = (C – k) mod 26
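The two formulas translate almost directly into code. The following is a minimal sketch, assuming uppercase A to Z only and leaving every other character unchanged; the function names are illustrative, not part of the original slides.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A'..'Z' numbered 0..25

def caesar_encrypt(plaintext, k):
    """Apply C = (p + k) mod 26 to each letter; other characters pass through."""
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            p = ALPHABET.index(ch)
            out.append(ALPHABET[(p + k) % 26])
        else:
            out.append(ch)
    return "".join(out)

def caesar_decrypt(ciphertext, k):
    """Apply p = (C - k) mod 26, i.e. encrypt with the opposite shift."""
    return caesar_encrypt(ciphertext, -k)

def brute_force(ciphertext):
    """Try all 25 useful keys and return (key, candidate plaintext) pairs."""
    return [(k, caesar_decrypt(ciphertext, k)) for k in range(1, 26)]
```

Calling brute_force on a ciphertext and scanning the 25 candidates by eye is the "try all 25 keys" attack from the previous slide.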
Mixed Monoalphabetic Cipher
- Rather than just shifting the alphabet, could shuffle (jumble) the letters arbitrarily
- Each plaintext letter maps to a different random ciphertext letter, or even to one of 26 arbitrary symbols
- Key is 26 letters long
Security of Mixed Monoalphabetic Cipher
- With a key of length 26, now have a total of 26! ~ 4 x 10^26 keys
– A computer capable of testing a key every ns would take more than 12.5 billion years to test them all.
– On average, expect to take more than 6 billion years to find the key.
- With so many keys, might think this is secure…but you’d be wrong
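A quick check of these figures, as a sketch that assumes exactly one key tested per nanosecond and a 365.25-day year (the variable names are illustrative):

```python
import math

keys = math.factorial(26)                    # number of 26-letter permutations
ns_per_year = 1e9 * 365.25 * 24 * 3600       # nanoseconds in one year
years_worst = keys / ns_per_year             # try every key
years_average = years_worst / 2              # expect to succeed about halfway

if __name__ == "__main__":
    print(f"keys = {keys:.3e}")
    print(f"worst case = {years_worst:.3e} years, average = {years_average:.3e} years")
```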
Security of Mixed Monoalphabetic Cipher
- Variations of the monoalphabetic substitution cipher were used in government and military affairs for many centuries into the Middle Ages
- The method of breaking it, frequency analysis, was discovered by Arabic scientists
- All monoalphabetic ciphers are susceptible to this type of analysis
Language Redundancy and Cryptanalysis
- Human languages are redundant
- Letters in a given language occur with different frequencies.
– Ex. In English, letter e occurs about 12.75% of the time, while letter z occurs only 0.25% of the time.
- In English the letter e is by far the most common letter
Language Redundancy and Cryptanalysis
- t,r,n,i,o,a,s occur fairly often; the others are relatively rare
- w,b,v,k,x,q,j,z occur least often
- So, calculate frequencies of letters occurring in ciphertext and use this as a guide to guess at the letters. This greatly reduces the key space that needs to be searched.
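Counting letters is easy to automate. The sketch below computes letter frequencies and, as the simplest possible use of them, guesses a Caesar shift by assuming the most frequent ciphertext letter stands for E. That assumption only holds for reasonably long English text, and the function names are made up for illustration. A full attack on a mixed alphabet would continue letter by letter with the same idea.

```python
from collections import Counter
import string

def letter_frequencies(text):
    """Return {letter: fraction of all letters} for the letters A-Z in text."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    counts = Counter(letters)
    total = len(letters)
    return {c: counts[c] / total for c in counts}

def guess_caesar_shift(ciphertext):
    """Guess the shift by assuming the most common ciphertext letter is E."""
    freqs = letter_frequencies(ciphertext)
    most_common = max(freqs, key=freqs.get)
    return (ord(most_common) - ord("E")) % 26
```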
Language Redundancy and Cryptanalysis
- Tables of single, double, and triple letter frequencies are available
Public Key Cryptography
Terminology
- Asymmetric cryptography
- Public key (known to entire world)
- Private key (kept secret)
- Encryption process (P to C with public key)
- Decryption process (C to P with private key)
- Can also do this in reverse: encrypt with private key, decrypt with public key
- This doesn’t keep info secret, but does verify who sent it! (Called a digital signature. Only the holder of the private key can sign, so it can’t be forged)
Uses
- Orders of magnitude slower than symmetric key crypto, so usually used to initiate a symmetric key session
- Much easier to configure, so used widely in network protocols to establish a temporary shared key that is used to transmit the secret (symmetric) key
Uses
- Transmitting over insecure channel
- Alice has key pair <Apu, Apr>, Bob has key pair <Bpu, Bpr> (public, private)
- Alice to Bob: encrypt m with Bpu
- Bob to Alice: encrypt m with Apu
- Accurately knowing the public key of the other person is one of the biggest challenges of using public key crypto.
The General Idea
- We use two one-way functions
– Multiplication vs factoring
– Modular exponentiation vs modular logarithm
- Both can be one-way trap door processes
The General Idea
- Multiplication
– Relatively easy, even if you are multiplying two huge numbers
- Factoring
– Difficult: with every known method, need to check many possibilities
– Think of it as finding the combination for a lock (prime factorization)
– Here: n = pq, where p and q are both (very) large primes
The General Idea
- Modular exponentiation
– Relatively easy: Think of a clock face with the requisite number of numbers on it
– Modular multiplication is like winding a length of rope around it and seeing where it stops (thanks to Khan Academy)
46 mod 12 = 10
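To see why modular exponentiation counts as "relatively easy", here is a sketch of the usual square-and-multiply method. It needs only about as many multiplications as the exponent has bits. The function name mod_exp is illustrative; Python's built-in three-argument pow(base, exponent, modulus) computes the same result.

```python
def mod_exp(base, exponent, modulus):
    """Compute (base ** exponent) % modulus by repeated squaring."""
    if modulus == 1:
        return 0
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current low bit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next bit
        exponent >>= 1
    return result
```

Because every intermediate value is reduced mod the modulus, the numbers never grow beyond twice the modulus length, even for exponents with thousands of bits.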
The General Idea
- Computing modular logarithm (discrete logarithm) is difficult
- Modular exponentiation distributes values in a manner close to uniformly random around the clock face
- Finding discrete log means testing many possible values
- For large numbers, this is a prohibitively expensive operation
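For contrast with the forward direction, the naive way to undo a modular exponentiation is to try exponents one at a time. The sketch below does exactly that (the function name is illustrative). Better algorithms exist, but none known is efficient for the sizes used in practice; this loop takes up to p steps, which is hopeless when p has hundreds of digits.

```python
def discrete_log(g, h, p):
    """Return the smallest x >= 0 with g**x % p == h % p, or None if none exists."""
    target = h % p
    value = 1                     # g**0
    for x in range(p):
        if value == target:
            return x
        value = (value * g) % p   # step from g**x to g**(x+1)
    return None
```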
The General Idea
- We use two tools to make this work
- First, the Euler totient function
- This is a one-way trap door function!
- Second, Euler’s Theorem
Totient Function
- Allegedly from total and quotient
- How many numbers less than n are relatively prime to n?
- Totient function, φ(n), gives this.
- If n is prime, φ(n) = n-1 (1, 2, …, n-1)
- If p and q are distinct primes, φ(pq) = (p-1)(q-1)
– p, 2p, …, (q-1)p and q, 2q, …, (p-1)q are not relatively prime to pq, so we have
– (pq – 1) – [(q-1) + (p-1)] = (p-1)(q-1)
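The counting argument can be checked by brute force for small numbers. This sketch counts directly from the definition (the function name is illustrative); it is far too slow for RSA-sized n, which is the whole point of the trap door.

```python
from math import gcd

def totient(n):
    """Count the integers k in 1..n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

if __name__ == "__main__":
    p, q = 17, 11
    print(totient(p), totient(q), totient(p * q), (p - 1) * (q - 1))
```

Note that the product formula needs p and q to be different primes: φ(9) is 6, not (3-1)(3-1) = 4.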
Totient Function
- This is a trap door function
- Difficult, in general, to determine the value of φ(n)
- Easy if you know the prime factorization of n
Euler’s Theorem
Upshot: We can reduce exponents mod the totient function: if x is relatively prime to n, then x^k mod n = x^(k mod φ(n)) mod n
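For reference, the standard statement of Euler's theorem is: if gcd(a, n) = 1, then a^φ(n) = 1 mod n. The sketch below checks that statement, and the exponent-reduction upshot, numerically for a small modulus. The helper names are illustrative, and φ(n) is passed in rather than computed.

```python
from math import gcd

def check_euler(n, phi):
    """True if a**phi % n == 1 for every a in 1..n-1 that is coprime to n."""
    return all(pow(a, phi, n) == 1 for a in range(1, n) if gcd(a, n) == 1)

def reduce_exponent(x, k, n, phi):
    """Compute x**k % n using the reduced exponent k mod phi (needs gcd(x, n) == 1)."""
    return pow(x, k % phi, n)

if __name__ == "__main__":
    n, phi = 187, 160          # n = 17 * 11, phi = 16 * 10
    print(check_euler(n, phi))
    print(pow(5, 1000, n), reduce_exponent(5, 1000, n, phi))
```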
RSA
- Key length variable (but should now be at least 2048 bits; 1024-bit keys are no longer considered adequate)
- Plaintext block must be smaller than key length
- Ciphertext block will be length of key
RSA
- Choose two large primes p and q (around 1024 bits each). Let n = pq (very difficult to factor)
- Choose a number e that is relatively prime to φ(n). Can do this since you know p and q and thus φ(pq), and from the derivation know exactly which numbers are relatively prime!
- Public key is <e, n>
- To make the private key, find d that is the multiplicative inverse of e mod φ(n) (so ed = 1 mod φ(n)) (use the extended Euclidean algorithm)
- Private key is <d, n>
- To encrypt a number m (with m < n), compute c = m^e mod n.
- To decrypt: m = c^d mod n.
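The only step here that is not plain arithmetic is finding d. A minimal sketch of the extended Euclidean algorithm and the resulting key setup follows; the function names are illustrative, and the primes are passed in rather than generated, so this is a toy and not a usable implementation (real RSA also needs padding and properly generated large primes).

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def mod_inverse(e, phi):
    """Return d with (e * d) % phi == 1, or raise if e and phi share a factor."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e is not relatively prime to phi")
    return x % phi

def make_keys(p, q, e):
    """Return (public, private) = ((e, n), (d, n)) for primes p, q and exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    return (e, n), (mod_inverse(e, phi), n)
```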
RSA Example
1. Select primes: p = 17 & q = 11
2. Compute n = pq = 17×11 = 187
3. Compute φ(n) = (p–1)(q–1) = 16×10 = 160
4. Select e: gcd(e,160) = 1; choose e = 7
5. Determine d: de = 1 mod 160 and d < 160. Value is d = 23 since 23×7 = 161 = 1×160 + 1
6. Publish public key KU = {7,187}
7. Keep secret private key KR = {23,17,11}
RSA Example cont
- Sample RSA encryption/decryption is:
- Given message M = 88 (nb. 88 < 187)
- Encryption:
C = 88^7 mod 187 = 11
- Decryption:
M = 11^23 mod 187 = 88
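The worked example can be replayed with Python's built-in pow. This is a sketch using the toy numbers from the slides; pow(e, -1, phi) for the modular inverse assumes Python 3.8 or later, and the helper names are illustrative.

```python
p, q = 17, 11
n = p * q                      # 187
phi = (p - 1) * (q - 1)        # 160
e = 7
d = pow(e, -1, phi)            # modular inverse of e mod phi (Python 3.8+)

def encrypt(m):
    """c = m^e mod n"""
    return pow(m, e, n)

def decrypt(c):
    """m = c^d mod n"""
    return pow(c, d, n)

ciphertext = encrypt(88)
recovered = decrypt(ciphertext)
# Decryption undoes encryption for every possible message block 0..n-1.
round_trip_ok = all(decrypt(encrypt(m)) == m for m in range(n))

if __name__ == "__main__":
    print(d, ciphertext, recovered, round_trip_ok)
```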
Questions
- Why does it work?
- Why is it secure?
- Are operations sufficiently efficient?
- How do we find big primes?
Why Does It Work?
- We chose d and e so that de = 1 mod φ(n), so for any x,
- x^(ed) mod n = x^(ed mod φ(n)) mod n = x^1 mod n = x mod n.
- And (x^e)^d = x^(ed)
Why Is It Secure?
- We’re not sure it is, but it seems to be
- Based on premise that factoring a big number is difficult.
– Semiprimes, the product of two (not necessarily distinct) primes, are the most difficult numbers to factor.
– Largest such semiprime factored as of these slides (2018) is RSA-768, 768 bits, 232 decimal digits.
- Took two years, hundreds of machines, several research institutions, and highly optimized code.
- Equivalent of 2000 CPU years on a single-core 2.2 GHz AMD Opteron
Why Is It Secure?
- If you can factor n, you’re golden:
– Problem is one of inverting the modular exponential (strictly, finding an e-th root mod n rather than a logarithm)
– Why? Adversary knows <e,n>. So for message m, knows ciphertext is c = m^e mod n.
– So if adversary can reverse the exponentiation (that is, find the number x s.t. x^e mod n = c), she’s got the original message m!
– Remember how we originally find this inverse: By knowing φ(n). Which is difficult to know if you can’t factor n
Why Is It Secure?
- We don’t know that there are not easier ways to break it (we do know that breaking it is no harder than factoring)
- We do know that it can be broken with a quantum computer using Shor’s Algorithm (1994), which has cubic time and linear space complexity in the number of bits of the number being factored
– So if quantum computers become practical...
Finding Big Primes
- No nice way of absolutely determining that a huge number is prime, but we can guess pretty accurately
- Fermat’s Theorem: If p is prime, and 0 < a < p, then a^(p-1) = 1 mod p.
– Works because though it’s possible for a^(n-1) = 1 mod n for a non-prime n, it’s not likely. For a randomly generated number of about 100 digits, probability that n is not prime but the relation holds is about 1 in 10^(13).
– Other similar probabilistic algorithms exist for finding large primes
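A minimal sketch of the Fermat test follows. The function names and the default of five random bases are assumptions made for illustration. A False answer is always correct (n is composite); a True answer only means "probably prime".

```python
import random

def fermat_passes(n, a):
    """True if a^(n-1) = 1 mod n, i.e. base a gives no evidence that n is composite."""
    return pow(a, n - 1, n) == 1

def fermat_probably_prime(n, rounds=5, rng=random):
    """Run the Fermat test with several random bases a in 2..n-2."""
    if n < 4:
        return n in (2, 3)
    return all(fermat_passes(n, rng.randrange(2, n - 1)) for _ in range(rounds))
```

The composite 341 = 11 × 31 shows why one base is not enough: it passes with base 2 but fails with base 3.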
Finding Big Primes
- Update: usually a Fermat test with base 2 is applied first, because it can be optimized
- Then several Miller-Rabin tests are applied
- How many depends on how small you want the probability of being wrong
- Typically somewhere around 20 tests run
- Gets probability of being wrong down to around 2^(-100)
- See FIPS Pub 186-4 for details
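The pipeline described above can be sketched as follows: trial division by a few small primes, a base-2 Fermat test, then Miller-Rabin rounds with random bases. This is an illustrative sketch, not the FIPS 186-4 procedure; the function names, the small-prime list, and the default of 20 rounds are assumptions.

```python
import random

def miller_rabin_round(n, a, d, r):
    """One Miller-Rabin round for odd n, where n - 1 == d * 2**r with d odd."""
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(r - 1):
        x = (x * x) % n
        if x == n - 1:
            return True
    return False          # a is a witness: n is definitely composite

def is_probable_prime(n, rounds=20, rng=random):
    """False means composite for certain; True means prime with high probability."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    if pow(2, n - 1, n) != 1:             # cheap base-2 Fermat test first
        return False
    d, r = n - 1, 0
    while d % 2 == 0:                     # write n - 1 as d * 2**r
        d //= 2
        r += 1
    return all(miller_rabin_round(n, rng.randrange(2, n - 1), d, r)
               for _ in range(rounds))
```

For example, 2047 = 23 × 89 passes the base-2 checks but is caught by a Miller-Rabin round with base 3.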
UPDATE!!! The AKS Algorithm!
- The Agrawal-Kayal-Saxena Primality Test
– Published in 2002 (after previous slide created)
– A deterministic polynomial time primality-proving algorithm
– Developed by three researchers at the Indian Institute of Technology Kanpur
– Answered a centuries old question (and in a surprising way)!
– Won 2006 Gödel Prize and 2006 Fulkerson Prize
- Unfortunately the “constants” involved in the computational complexity estimates are very large
– So not yet practical for identifying large primes (but making this competitive with probabilistic algorithms is an ongoing research area)
2016 UPDATE!!! The AKS Algorithm!
- The Agrawal-Kayal-Saxena Primality Test
– Still not used
– Real difficulty: probability of a hardware error running this algorithm is higher than the probability of accidentally choosing a non-prime with the earlier methods!
Diffie-Hellman
- Oldest public key cryptosystem still in use
- Does neither encryption nor digital signatures.
- Used because it is fastest at what it does: allow two individuals to agree on a symmetric key even though they can only communicate over insecure channels.
- Remarkable because neither Alice nor Bob needs any a priori information, yet after the exchange of two messages, they share a secret number.
- One bad thing: no authentication, so Alice may be setting up a key with Trudy!
The Process
- Alice and Bob agree on two numbers, p and g, where p is a large prime and g is a number less than p (with some restrictions)
- Each chooses a random 1024 bit number (SA for Alice, SB for Bob).
- Alice computes TA = g^SA mod p. Bob computes TB = g^SB mod p.
- They exchange their T values
- Alice computes TB^SA mod p, Bob computes TA^SB mod p.
- Done: TB^SA = (g^SB)^SA = g^(SB*SA) = g^(SA*SB) = (g^SA)^SB = TA^SB mod p.
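The exchange can be sketched in a few lines. The function names are illustrative, and the demo uses the tiny textbook parameters p = 23, g = 5 purely so the numbers are readable; a real exchange uses a prime of 2048 bits or more and, as the previous slide notes, needs authentication added on top.

```python
import secrets

def dh_keypair(p, g):
    """Pick a random secret S in 2..p-2 and return (S, T) with T = g^S mod p."""
    secret = secrets.randbelow(p - 3) + 2
    public = pow(g, secret, p)
    return secret, public

def dh_shared(their_public, my_secret, p):
    """Combine the other side's T value with my own secret: T^S mod p."""
    return pow(their_public, my_secret, p)

if __name__ == "__main__":
    p, g = 23, 5                               # toy parameters, NOT secure
    sa, ta = dh_keypair(p, g)                  # Alice
    sb, tb = dh_keypair(p, g)                  # Bob
    # Only ta and tb cross the insecure channel.
    print(dh_shared(tb, sa, p) == dh_shared(ta, sb, p))
```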
Why It Is Secure
- Whole world knows g^SA mod p and g^SB mod p, but getting g^(SA*SB) mod p means, as far as anyone knows, having to do a modular logarithm
– If can find y such that g^y = g^SA mod p, then know SA.
- And well, it’s not exactly secure -- it has a problem with a