Post-Quantum Cryptographic Hardware Primitives

Lake Bu, Rashmi Agrawal, Hai Cheng, and Michel A. Kinsy


Adaptive and Secure Computing Systems Laboratory
Department of Electrical and Computer Engineering, Boston University
(bulake, rashmi23, chenghai, mkinsy)@bu.edu
arXiv:1903.03735v1 [cs.CR] 9 Mar 2019

ABSTRACT
The development and implementation of post-quantum cryptosystems have become a pressing issue in the design of secure computing systems, as general quantum computers have become more feasible in the last two years. In this work, we introduce a set of hardware post-quantum cryptographic primitives (PCPs) consisting of four frequently used security components, i.e., public-key cryptosystem (PKC), key exchange (KEX), oblivious transfer (OT), and zero-knowledge proof (ZKP). In addition, we design a high-speed polynomial multiplier to accelerate these primitives. These primitives will aid researchers and designers in constructing quantum-proof secure computing systems in the post-quantum era.

KEYWORDS
Post-quantum cryptography, public-key system, key exchange, oblivious transfer, zero-knowledge proof, FPGA-based prototyping.

1 INTRODUCTION
In the last three years, we have witnessed a raft of breakthroughs and several key milestones towards the development of general quantum computers. These advances bring critical challenges to classical cryptosystems like RSA (Rivest-Shamir-Adleman), ECC (Elliptic Curve Cryptography), and ElGamal. The strength of these classic algorithms rests on the hardness of the integer factorization and discrete logarithm problems, which does not hold under quantum computing approaches. Thus, researchers have been actively investigating new algorithms and designs for cryptosystems for the post-quantum era. Among these techniques, designs based on ring learning with errors (Ring-LWE) [2] have thus far proven to be the most promising approach. Ring-LWE-based cryptosystems have the following advantages: (i) their security reduces to variants of the shortest vector problem (SVP) and closest vector problem (CVP), which are known to be NP-hard, and so far there are no efficient classical or quantum algorithms to solve them; (ii) they can support homomorphic encryption (HE) schemes; (iii) they have much smaller key sizes compared with other cryptosystems; (iv) finally, in some cases, they lend themselves to more efficient hardware implementations than their classical competitors.

In contrast to the extensive literature on the study and software implementation of the Ring-LWE algorithm, there has been little work on its efficient hardware implementation. Recently, a handful of works have explored the FPGA implementation of the KEX [3], and even fewer the PKC [4]. There is also a general lack of discussion on the design and hardware implementation of other cryptographic primitives such as oblivious transfer (OT) and zero-knowledge proof (ZKP), which play critical roles in many applications such as private machine learning and crypto-currencies using blockchain. Therefore, in this work, we construct a small representative set of reusable, standalone hardware modules of these post-quantum cryptographic primitives (PCPs). They can serve as the fundamental building blocks for a wide range of secure systems. In this work, we demonstrate (1) a high-speed polynomial multiplier design to aid in the efficient hardware implementation of these primitives, and (2) new algorithms for the OT and ZKP primitives.

2 THE PCP HARDWARE PRIMITIVES

2.1 The Public-Key Cryptosystem (PKC) and Key Exchange (KEX) Primitives
The detailed algorithms of the public-key cryptosystem (PKC) and key exchange (KEX) can be found in [2] and [1], respectively. For brevity, we will only briefly introduce the PKC algorithm, since many of its sub-modules are reused in the KEX primitive.

Algorithm 2.1. Let the ring $R_q$ be $R_q = R/\langle q \rangle = \mathbb{Z}_q[x]/\langle f(x) \rangle$, where $f(x) = x^n + 1$ is an irreducible polynomial with $n$ a power of 2, and $q \equiv 1 \bmod 2n$ is a large prime number. Thus $R_q$ is a ring of integer polynomials modulo both $f(x)$ and $q$, and it has $q^n$ elements. Let $\mathcal{X}$ be a Gaussian distribution of "small" errors/noise. If $t = \lfloor q/2 \rfloor$, $a, b \in R_q$ and $s, e, r_0, r_1, r_2 \leftarrow \mathcal{X}$, then the public-key encryption protocol between Alice and Bob is as follows.

Key generation: Alice picks $s$ and a random $e$ to generate the public key $pk = \{a, b\}$ and the private key $sk = \{s\}$ by:

$$b = a \cdot s + e \tag{1}$$

Encryption: Bob converts his message (plaintext) into a binary vector $m$ of length $n$, and generates the cipher $\{c_0, c_1\}$ as:

$$\begin{cases} c_0 = b \cdot r_0 + r_2 + tm, \\ c_1 = a \cdot r_0 + r_1. \end{cases} \tag{2}$$

Decryption: Alice decrypts the cipher by:

$$m = \lceil (c_0 - c_1 \cdot s)/t \rfloor, \tag{3}$$

where $\lceil \cdot \rfloor$ stands for taking the nearest binary integer.

The basic operations of the algorithms are: polynomial addition, polynomial subtraction, scalar multiplication, scalar division followed by rounding to the nearest binary integer, and polynomial multiplication. Most of the operations are component-wise, or can be reduced to conditional assignment. The polynomial multiplication operation has the highest hardware implementation complexity. An efficient multiplication module will substantially improve the hardware implementation efficiency of the entire hardware crypto-primitive suite. Figure 1 shows a system architecture using the commonly shared hardware modules.

One of the common implementations of the polynomial multiplier is negative wrapped convolution combined with the butterfly number-theoretic transform (NTT, the finite field version of the FFT). This approach takes $O(n \log n)$ multiplications and has a time complexity of $O(\log n)$. In this work, we introduce a new high-speed design of the modular polynomial multiplier, named Preemptive Adaptive Reduction Multiplier (PARM).
[Figure: block diagram of the Key Generation, Encryption, and Decryption modules and their interface to the application system. The modules are built from shared units: noise samplers, an RNG, polynomial multipliers with modular reduction, polynomial adders and subtractors, a scalar multiplier, and a nearest-binary-integer unit.]

Figure 1: The three core building blocks for the primitives: Key Generation (KeyGen), Encryption (Enc), and Decryption (Dec).
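The three blocks of Figure 1 correspond to Eqs. (1) to (3) of Algorithm 2.1. The following is a minimal software sketch of that data flow, not the hardware design. The toy parameters `N = 8` and `Q = 257`, the helper names, and the use of uniform noise from {-1, 0, 1} in place of the Gaussian distribution are all assumptions made for illustration, and the parameters are far too small to be secure.

```python
import random

N = 8        # ring degree n (toy value, a power of 2)
Q = 257      # prime modulus q with q ≡ 1 (mod 2n)
T = Q // 2   # t = floor(q / 2)


def poly_mul(a, b):
    """Schoolbook product in Z_q[x]/(x^n + 1): x^n wraps around as -1."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                c[k] = (c[k] + a[i] * b[j]) % Q
            else:
                c[k - N] = (c[k - N] - a[i] * b[j]) % Q
    return c


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]


def sample_noise(rng):
    """Assumed stand-in for the Gaussian sampler: coefficients in {-1, 0, 1}."""
    return [rng.choice((-1, 0, 1)) % Q for _ in range(N)]


def keygen(rng):
    """Eq. (1): b = a*s + e; returns (public key, private key)."""
    a = [rng.randrange(Q) for _ in range(N)]
    s, e = sample_noise(rng), sample_noise(rng)
    b = poly_add(poly_mul(a, s), e)
    return (a, b), s


def encrypt(pk, m, rng):
    """Eq. (2): c0 = b*r0 + r2 + t*m, c1 = a*r0 + r1."""
    a, b = pk
    r0, r1, r2 = sample_noise(rng), sample_noise(rng), sample_noise(rng)
    tm = [(T * bit) % Q for bit in m]
    c0 = poly_add(poly_add(poly_mul(b, r0), r2), tm)
    c1 = poly_add(poly_mul(a, r0), r1)
    return c0, c1


def decrypt(sk, c0, c1):
    """Eq. (3): round each coefficient of (c0 - c1*s) to the nearer of 0 and t."""
    u = poly_sub(c0, poly_mul(c1, sk))
    return [1 if abs(x - T) < min(x, Q - x) else 0 for x in u]


rng = random.Random(2019)
pk, sk = keygen(rng)
message = [1, 0, 1, 1, 0, 0, 1, 0]
assert decrypt(sk, *encrypt(pk, message, rng)) == message
```

With these toy parameters the accumulated noise $e \cdot r_0 + r_2 - r_1 \cdot s$ has coefficients of magnitude at most 17, well below $t/2$, so decryption always recovers the message.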
PARM calculates the generalized representation of the product in advance. Thus, given two polynomial multiplicands, their product can be computed immediately in one step.

Alg. 1: Preemptive Adaptive Reduction Multiplier (PARM)
  Let $a = \{a_0, \cdots, a_{n-1}\}$, $b = \{b_0, \cdots, b_{n-1}\} \in \mathbb{Z}_q[x]/\langle f(x) \rangle$ (where $f(x) = x^n + 1$) be two length-$n$ vectors, and $P(x)$ the primitive polynomial of the ring.
  Let $d = \{d_0, \cdots, d_{n-1}\}$, $e = \{e_0, \cdots, e_{n-1}\}$, where $d_i$, $e_i$ are merely variable names.

  Precompute:
    $\hat{c} \leftarrow d \circledast e$ ($\circledast$ for convolution) such that
      $\hat{c} = \hat{c}_0 + \hat{c}_1 x + \cdots + \hat{c}_{n-1} x^{n-1} + \hat{c}_n x^n + \cdots + \hat{c}_{2n-2} x^{2n-2}$
    # Approach 1: by using P(x) for reduction
    for $i = n$ to $2n-2$ do
      $x^i = l_0 + l_1 x + \cdots + l_{n-1} x^{n-1}$
    end for
    By substituting $\{x^n, \cdots, x^{2n-2}\}$ into $\hat{c}$:
      $c = c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}$
    # Approach 2: by using f(x) for reduction
    $c \leftarrow \hat{c} \bmod f(x) = c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}$
    Denote $c_i = g_i(d, e)$, where $g_i$ is a general representation of $c_i$ by $d$, $e$.

  Real-time:
    for $i = 0$ to $n-1$ do
      $c_i \leftarrow g_i(a, b)$
    end for
    return $c$

As shown in the outline of Algorithm 1, the bulk of the work is performed in the "Precompute" stage, which is done only once in the lifetime of the multiplier. The real-time computation is just $n$ calculations of $c_i \leftarrow g_i(a, b)$, which can be done fully in parallel (1 cycle) given enough multipliers ($n^2$). The resource and time complexities of the PARM algorithm are $O(n^2)$ multiplications and $O(1)$ cycles of latency, while the NTT-based complexities are $O(n \log n)$ multiplications and $O(\log n)$ cycles of latency.

2.2 The Oblivious Transfer (OT) Primitive
The OT mechanism enables a receiver to choose and receive a certain piece of information out of many pieces from the sender, while remaining oblivious to the other pieces. The sender is also oblivious to the exact piece selected. OT is a widely used protocol in privacy-preserving computations between two or multiple parties. The proposed OT primitive is constructed on the foundation of the PKC primitive. We denote $\mathrm{KeyGen}(s, a) = b$ as the Key Generation module, $\mathrm{Enc}_{pk}(m) = \{c_0, c_1\}$ as the Encryption function, and $\mathrm{Dec}_{sk}(\{c_0, c_1\}) = m$ as the Decryption module (Figure 1). The proposed OT algorithm over ring $R_q$ is as follows.

Algorithm 2.2. Suppose Alice uses KeyGen() to generate and send a public key to Bob, and keeps the private pairing key to herself. Alice has $l$ $n$-bit binary messages $\{m_1, \cdots, m_l\}$ and $l$ $n$-bit random binary vectors $\{r_1, \cdots, r_l\}$.

(1) Alice sends $\{r_1, \cdots, r_l\}$ to Bob. Bob chooses the $c$-th vector $r_c$ in order to acquire $m_c$. Thus Bob generates a random binary vector $K \in R_2^n$ and computes $v$ to send to Alice:

$$v = r_c + \mathrm{Enc}_{pk}(K). \tag{4}$$

(2) For all $i \in \{1, 2, \cdots, l\}$, Alice computes the set $\{m'_i\}$ and sends it back to Bob:

$$m'_i = \mathrm{Dec}_{sk}(v - r_i) \oplus m_i, \tag{5}$$

where $\oplus$ is bitwise XOR.

(3) Bob computes his desired $m_c$ while remaining oblivious to the other $m_i$, where $i \neq c$:

$$m_c = m'_c \oplus K. \tag{6}$$

2.3 The Zero-Knowledge Proof (ZKP) Primitive
The ZKP enables an entity to prove to a verifier that it knows a secret value $s$, without revealing any information (including the value of $s$) apart from the fact that it knows the value. Similar to the OT primitive, the ZKP primitive is designed using the building blocks of the PKC algorithm in Section 2.1. The proposed ZKP algorithm over ring $R_q$ is as follows.

Algorithm 2.3. Suppose Alice has a secret value $s$ and needs to prove her ownership of $s$ to Bob.

(1) Alice uses $\mathrm{KeyGen}(s, a)$ to generate $b$. Alice selects a binary vector $m$, and samples $e', r \leftarrow \mathcal{X}$ to generate $c$:

$$c = a \cdot r + mt + e', \tag{7}$$

where $t = \lfloor q/2 \rfloor$. Alice sends $\{a, b, m, c\}$ to Bob, and keeps $s$ to herself.

(2) Bob samples $u \leftarrow \mathcal{X}$, and interactively sends it to Alice.

(3) Alice responds with $x$ to Bob:

$$x = r + s \cdot u. \tag{8}$$

(4) Bob computes and verifies whether:

$$\lceil (c - a \cdot x + b \cdot u)/t \rfloor \stackrel{?}{=} m, \tag{9}$$

where $\lceil \cdot \rfloor$ stands for taking the nearest binary integer.

If the equality in Eq. (9) holds, then Alice has successfully proved her ownership of $s$ to Bob. Indeed, substituting Eqs. (1), (7), and (8) gives $c - a \cdot x + b \cdot u = mt + e' + e \cdot u$, so the check succeeds whenever the noise term $e' + e \cdot u$ is small relative to $t$.

REFERENCES
[1] Erdem Alkim, Léo Ducas, Thomas Pöppelmann, and Peter Schwabe. 2016. Post-quantum Key Exchange - A New Hope. In USENIX Security Symposium, Vol. 2016.
[2] Vadim Lyubashevsky, Chris Peikert, and Oded Regev. 2010. On ideal lattices and learning with errors over rings. In Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 1–23.
[3] Tobias Oder and Tim Güneysu. 2017. Implementing the NewHope-Simple key exchange on low-cost FPGAs. Progress in Cryptology - LATINCRYPT 2017 (2017).
[4] Sujoy Sinha Roy, Frederik Vercauteren, Nele Mentens, Donald Donglong Chen, and Ingrid Verbauwhede. 2014. Compact ring-LWE cryptoprocessor. In International Workshop on Cryptographic Hardware and Embedded Systems. Springer, 371–391.
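As a supplementary software sketch of Algorithm 2.3 (an illustration under stated assumptions, not the authors' hardware implementation), the following code runs one round of the proof with toy parameters. The values `N = 8` and `Q = 257`, the function names, and the replacement of the Gaussian distribution by uniform noise from {-1, 0, 1} are assumptions, and the parameters are far too small for real security.

```python
import random

N, Q = 8, 257    # toy ring degree and prime modulus (assumed values)
T = Q // 2       # t = floor(q / 2)


def poly_mul(a, b):
    """Schoolbook product in Z_q[x]/(x^n + 1): x^n wraps around as -1."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                c[k] = (c[k] + a[i] * b[j]) % Q
            else:
                c[k - N] = (c[k - N] - a[i] * b[j]) % Q
    return c


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]


def sample_noise(rng):
    """Assumed stand-in for the Gaussian sampler: coefficients in {-1, 0, 1}."""
    return [rng.choice((-1, 0, 1)) % Q for _ in range(N)]


def prover_commit(a, s, rng):
    """Step (1): b = a*s + e and c = a*r + m*t + e'. Also returns the secret r."""
    e, e_prime, r = sample_noise(rng), sample_noise(rng), sample_noise(rng)
    m = [rng.randrange(2) for _ in range(N)]
    b = poly_add(poly_mul(a, s), e)
    tm = [(T * bit) % Q for bit in m]
    c = poly_add(poly_add(poly_mul(a, r), tm), e_prime)
    return b, m, c, r


def prover_respond(r, s, u):
    """Step (3), Eq. (8): x = r + s*u."""
    return poly_add(r, poly_mul(s, u))


def verifier_check(a, b, m, c, u, x):
    """Step (4), Eq. (9): round (c - a*x + b*u)/t to bits and compare with m."""
    w = poly_add(poly_sub(c, poly_mul(a, x)), poly_mul(b, u))
    bits = [1 if abs(v - T) < min(v, Q - v) else 0 for v in w]
    return bits == m


rng = random.Random(7)
a = [rng.randrange(Q) for _ in range(N)]
s = sample_noise(rng)                  # Alice's secret
b, m, c, r = prover_commit(a, s, rng)
u = sample_noise(rng)                  # Bob's challenge, step (2)
x = prover_respond(r, s, u)
assert verifier_check(a, b, m, c, u, x)
```

This sketch exercises only completeness, that is, an honest prover who knows $s$ is accepted. It does not test soundness or the zero-knowledge property.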