- This article is about the RSA cryptosystem. For other meanings, see RSA (disambiguation).
In cryptography, RSA is an algorithm for public key encryption. It was the first algorithm known to be
suitable for signing as well as encryption, and one of the first great advances in public key cryptography. RSA is still widely used in electronic commerce protocols, and is believed to be secure given
sufficiently long keys.
History of RSA
The algorithm was described in 1977 by Ron
Rivest, Adi Shamir and Len
Adleman at MIT; the
letters RSA are the initials of their surnames.
Clifford Cocks, a British mathematician working for GCHQ, described an equivalent system in an internal document in 1973. His discovery, however, was not revealed until 1997 due to its top-secret
classification.
The algorithm was patented by MIT in 1983 in the United States of America as
U.S. Patent 4405829 (http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1=4405829.WKU.&OS=PN/4405829&RS=PN/4405829).
It expired 21 September 2000. Since
the algorithm had been published prior to patent application, regulations in much of the rest of the world precluded patents
elsewhere. Had Cocks' work been publicly known, a patent in the US would not have been possible either.
Operation
Key generation
Suppose a user Alice wishes to allow Bob to send her a private message over an insecure transmission medium. She takes the following steps
to generate a public key and a private key:
1. Choose two large prime numbers p ≠ q randomly and
independently of each other. Compute N = p q.
2. Compute φ = (p-1)(q-1).
3. Choose an integer 1 < e < φ which is coprime to φ.
4. Compute d such that d e ≡ 1 (mod φ).
- Steps 3 and 4 can be performed with the extended Euclidean algorithm; see modular arithmetic.
- Equivalently, step 4 can be carried out by finding an integer x for which d = (x(p-1)(q-1) + 1)/e
is an integer, then reducing d modulo (p-1)(q-1).
- From step 2 onwards, PKCS#1 v2.1 uses λ = lcm(p-1, q-1) instead of φ = (p-1)(q-1).
N and e form the public key; N and d form the private key. Only d actually needs to be kept secret,
since N is public. Alice transmits the public key to Bob, and keeps the private key secret. p and
q are equally sensitive, since they are the factors of N and allow computation of d given e. They
are either securely deleted, or kept secret along with d in order to speed up decryption and signing using
the Chinese Remainder Theorem.
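The key generation steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not a secure implementation: the function name `generate_keypair` is hypothetical, the primes are passed in rather than chosen at random, and the built-in `pow(e, -1, phi)` (Python 3.8 or later) stands in for the extended Euclidean algorithm.

```python
from math import gcd

def generate_keypair(p, q, e):
    """Toy RSA key generation from two given distinct primes (hypothetical helper)."""
    if p == q:
        raise ValueError("p and q must be distinct")
    N = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < e < phi) or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi and be coprime to phi")
    d = pow(e, -1, phi)  # modular inverse, so that d*e = 1 (mod phi)
    return (N, e), (N, d)

# Small parameters taken from the worked example in this article.
public_key, private_key = generate_keypair(61, 53, 17)
```

With these inputs the function returns the public key (3233, 17) and the private key (3233, 2753).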
Encrypting messages
Suppose Bob wishes to send a message m to Alice. He turns m into a number n < N, using some
previously agreed-upon reversible protocol known as a padding scheme.
So Bob has n, and knows N and e, which Alice has announced. He then computes the ciphertext c
corresponding to n:
- c = n^e mod N
This can be done quickly using the method of exponentiation by squaring. Bob then transmits c to Alice.
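Exponentiation by squaring can be sketched as follows. The function names `mod_pow` and `encrypt` are illustrative only; in practice Python's built-in three-argument `pow` already does the same job.

```python
def mod_pow(base, exponent, modulus):
    """Right-to-left square-and-multiply: returns base**exponent % modulus."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                     # current exponent bit is set
            result = (result * base) % modulus
        base = (base * base) % modulus       # square for the next bit
        exponent >>= 1
    return result

def encrypt(n, e, N):
    """Textbook RSA encryption of an already padded number n < N."""
    return mod_pow(n, e, N)
```

The loop runs once per bit of the exponent, so the cost grows with the bit length of e rather than with e itself.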
Decrypting messages
Alice receives c from Bob, and knows her private key d. She can recover n from c by the following procedure:
- n = c^d mod N
Given n, she can recover the original message m. The decryption procedure works because
- c^d ≡ (n^e)^d ≡ n^(ed) (mod N).
Now, since ed ≡ 1 (mod p-1) and ed ≡ 1 (mod q-1), Fermat's little theorem yields
- n^(ed) ≡ n (mod p)
and
- n^(ed) ≡ n (mod q).
Since p and q are distinct prime numbers, applying the Chinese remainder theorem to these two congruences yields
- n^(ed) ≡ n (mod N).
Thus,
- c^d ≡ n (mod N).
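The argument can be checked numerically for small parameters, and the same Chinese remainder theorem step gives the faster decryption mentioned under key generation. The sketch below assumes the toy values p = 61, q = 53, e = 17, d = 2753 from the worked example; `decrypt_crt` is a hypothetical name, and the recombination used is Garner's formula, one common way to apply the theorem.

```python
p, q, e, d = 61, 53, 17, 2753
N = p * q

def decrypt_crt(c):
    # Work modulo p and q separately, with exponents reduced mod p-1 and q-1
    # (valid by Fermat's little theorem).
    n_p = pow(c % p, d % (p - 1), p)
    n_q = pow(c % q, d % (q - 1), q)
    # Recombine the two residues into the unique value below N.
    q_inv = pow(q, -1, p)
    h = (q_inv * (n_p - n_q)) % p
    return n_q + h * q

# Exhaustive checks, feasible only because N is tiny.
identity_holds = all(pow(pow(n, e, N), d, N) == n for n in range(N))
crt_agrees = all(decrypt_crt(c) == pow(c, d, N) for c in range(N))
```

Both checks succeed for every residue modulo N, including those that share a factor with N.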
A worked example
Here is an example of RSA encryption and decryption. The parameters used here are artificially small.
We let
| p = 61 | first prime number (to be kept secret or deleted securely) |
| q = 53 | second prime number (to be kept secret or deleted securely) |
| N = pq = 3233 | modulus (to be made public) |
| e = 17 | public exponent (to be made public) |
| d = 2753 | private exponent (to be kept secret) |
The public key is (e, N). The private key is d. The encryption function is:
- encrypt(n) = n^e mod N = n^17 mod 3233
where n is the plaintext. The decryption function is:
- decrypt(c) = c^d mod N = c^2753 mod 3233
where c is the ciphertext.
To encrypt the plaintext value 123, we calculate
- encrypt(123) = 123^17 mod 3233 = 855
To decrypt the ciphertext value 855, we calculate
- decrypt(855) = 855^2753 mod 3233 = 123
Both of these computations can be done efficiently using the square-and-multiply algorithm for modular exponentiation.
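The example can be reproduced in a few lines of Python, using the built-in three-argument `pow` for modular exponentiation. The variable names are illustrative.

```python
N, e, d = 3233, 17, 2753

ciphertext = pow(123, e, N)        # encrypt(123)
recovered = pow(ciphertext, d, N)  # decrypt(855)

print(ciphertext, recovered)  # 855 123
```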
Padding schemes
The padding scheme must be carefully constructed so that no values of m cause security problems. For example, if we
simply take the ASCII representation of m and concatenate the bits together to
create n, then a message consisting of a single ASCII NUL character (whose numeric value is 0) would produce
n = 0, which produces a ciphertext of 0 regardless of what e and N are used. Likewise, a single ASCII
SOH (whose numeric value is 1) would always produce a ciphertext of 1. In fact, for systems which conventionally use
small values of e, such as 3, all single character ASCII messages encoded using this scheme would be insecure, since the
largest n would have a value of 255, and 255^3 is less than any reasonable modulus, so decrypting would be
simply a matter of taking the cube root of the ciphertext with no regard for the modulus N. Consequently, standards such
as PKCS have been carefully designed to allow arbitrary messages to be securely encrypted.
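The small-exponent problem can be demonstrated with a short sketch. The modulus below is an assumption made for illustration: it is the product of two well-known primes chosen so that e = 3 is a valid exponent, and it is far smaller than a real key. `integer_cube_root` is a hypothetical helper.

```python
def integer_cube_root(x):
    """Largest integer r with r**3 <= x, found by binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= x:
        hi *= 2
    while lo < hi - 1:            # invariant: lo**3 <= x < hi**3
        mid = (lo + hi) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid
    return lo

p, q = 1000000007, 998244353      # illustrative primes, not from the article
N = p * q
e = 3

n = 255                           # largest single-character value
c = pow(n, e, N)                  # no reduction happens: c is exactly 255**3
recovered = integer_cube_root(c)  # the attacker needs neither d nor the factors of N
```

Because n^3 never exceeds N, the modular reduction has no effect and the ordinary integer cube root of c is the plaintext.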
Also consider the Speed section below, which explains why m will almost never be
the desired message itself, but rather a randomly selected message key.
Signing messages
RSA can also be used to sign a message. Suppose Alice wishes
to send a signed message to Bob. She produces a hash value of the message, raises it to the power of d mod N (as she does when
decrypting a message), and attaches it as a "signature" to the message. When Bob receives the signed message, he raises the
signature to the power of e mod N (as he does when encrypting a message), and compares the resulting hash value
with the message's actual hash value. If the two agree, he knows that the author of the message was in possession of Alice's
secret key, and that the message has not been tampered with since.
Note that secure padding schemes are essential for the security of message signing as they are for message encryption, and
that the same key should not be used for both encryption and signing purposes.
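A minimal sketch of the hash-then-sign procedure follows. It assumes the toy key from the worked example and SHA-256 as the hash function, and it omits padding entirely, so it illustrates only the arithmetic. The names `toy_hash`, `sign` and `verify` are hypothetical.

```python
import hashlib

N, e, d = 3233, 17, 2753  # toy key from the worked example

def toy_hash(message: bytes) -> int:
    # Reducing the digest mod N is needed only because this N is tiny;
    # it throws away almost all of the hash's strength.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Raise the hash value to the private exponent d."""
    return pow(toy_hash(message), d, N)

def verify(message: bytes, signature: int) -> bool:
    """Raise the signature to the public exponent e and compare hashes."""
    return pow(signature, e, N) == toy_hash(message)

message = b"attack at dawn"
signature = sign(message)
```

Here `verify(message, signature)` returns True, while any other signature value for the same message is rejected.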
Security
The best known attacks on RSA depend on solving the problem of factoring very large numbers; if a new, sufficiently fast factorization method were developed, it
might be possible to break RSA.
As of 2004, the largest number factored by general-purpose methods was 174 decimal
digits (576 binary bits) long, using state-of-the-art distributed methods. RSA keys are typically 1024–2048 bits long. Some
experts believe that 1024-bit keys may become breakable in the near term (though this is disputed); none see any way that
2048-bit keys could be broken in the foreseeable future.
A working quantum computer implementing Shor's algorithm could render RSA insecure through fast factorization,
but few believe such a computer could be built, at least in the near term.
Suppose Eve, an eavesdropper, intercepts the public key N and e, and the ciphertext c. However, she is
unable to directly obtain d, which Alice keeps secret. The problem of finding an n such that n^e ≡ c
(mod N) is known as the RSA problem.
The most effective way known for Eve to deduce n from c is to factor N into p and q, in
order to compute (p-1)(q-1) which allows the determination of d from e. No polynomial-time method for
factoring large integers on a classical computer has yet been found, but it has not been proven that none exists. See integer factorization for a discussion of this problem.
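For a toy modulus the attack through factoring is easy to carry out. The sketch below uses trial division, which is hopeless for keys of realistic size, and the hypothetical name `recover_private_exponent`.

```python
def recover_private_exponent(N, e):
    """Factor a small modulus by trial division, then derive d from e."""
    p = next(k for k in range(2, N) if N % k == 0)  # smallest prime factor
    q = N // p
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # same computation Alice did during key generation

d = recover_private_exponent(3233, 17)
```

For the public key (3233, 17) of the worked example this recovers d = 2753.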
It has not been proven that factoring N is the only way of deducing n from c, but no easier method has
been discovered (at least to public knowledge).
Therefore, it is generally presumed that Eve is defeated in practice if N is sufficiently large.
If N is 256 bits or shorter, it can be factored in a few hours on a personal computer, using software already freely available. If N
is 512 bits or shorter, it can be factored by several hundred computers as of 1999. A
theoretical hardware device named TWIRL and described by Shamir and Tromer in 2003 called
into question the security of 1024-bit keys. It is currently recommended that N be at least 2048 bits long.
In 1994, Peter Shor showed that a
quantum computer could in principle perform the factorization in
polynomial time. If (or when) quantum computers become a practical technology, Shor's algorithm will make RSA and related algorithms obsolete.
Should an efficient classical factorization algorithm be discovered or a practical quantum computer constructed, using still larger
key lengths would provide a stopgap measure. However, any such security break in RSA would obviously be retroactive: an
eavesdropper could record a public key and some ciphertext encrypted with it, and then hope that a future breakthrough in
factorization allows them to break the cipher and read the messages.
- See also: RSA Factoring Challenge,
RSA numbers
Practical considerations
Key generation
Finding the large primes p and q is usually done by testing random numbers of the right size with probabilistic
primality tests which quickly eliminate almost all non-primes.
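One widely used probabilistic test is Miller-Rabin. The article does not say which test is meant, so the following is a sketch of one possible choice. The function names, the number of rounds and the use of the `secrets` module for randomness are assumptions.

```python
import secrets

def is_probable_prime(n, rounds=40):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 = 2**s * t with t odd.
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    for _ in range(rounds):
        a = 2 + secrets.randbelow(n - 3)  # random base in [2, n-2]
        x = pow(a, t, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def random_prime(bits):
    """Draw random odd candidates of the given bit length until one passes."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate
```

Each round lets a composite slip through with probability at most 1/4, so 40 rounds leave a negligible error probability.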
p and q should not be 'too close', lest the Fermat factorization for N be successful. Furthermore, if either p-1
or q-1 has only small prime factors, N can be factored quickly and these values of p or q should
therefore be discarded as well.
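The danger of primes that lie too close together can be seen from a sketch of Fermat factorization, which searches for a representation N = a^2 - b^2 = (a-b)(a+b). The function name `fermat_factor` is illustrative, and the code assumes N is odd.

```python
from math import isqrt

def fermat_factor(N):
    """Fermat's method for odd N: find a such that a*a - N is a perfect square."""
    a = isqrt(N)
    if a * a < N:
        a += 1                      # start at ceil(sqrt(N))
    while True:
        b2 = a * a - N
        b = isqrt(b2)
        if b * b == b2:
            return a - b, a + b     # N = (a - b) * (a + b)
        a += 1
```

For N = 3233 the very first candidate works: a = 57 gives 57^2 - 3233 = 16 = 4^2, hence the factors 53 and 61. The number of iterations grows with the distance between the two primes.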
One should not employ a prime search method which gives any information whatsoever about the primes to the attacker. In
particular, a good random number generator for the
start value needs to be employed. Note that the requirement here is both 'random' and 'unpredictable'. These are
not the same criteria; a number may have been chosen by a random process (i.e., no pattern in the results), but if it is
predictable in any manner (or even partially predictable), the method used will result in loss of security. For example, the
random number table published by the Rand Corp in the 1950s might very well be truly random, but it has been published and thus
can serve an attacker as well. If the attacker can guess half of the digits of p or q, they can quickly compute the
other half (shown by Coppersmith in 1997).
It is important that the secret key d be large enough. Wiener showed in 1990 that
if p is between q and 2q (which is quite typical) and d < N^(1/4)/3, then d
can be computed efficiently from N and e. The encryption key e = 2 should also not be used.
Speed
RSA is very much slower than DES and other symmetric cryptosystems. In practice, Bob typically encrypts a secret message with a symmetric
algorithm, encrypts the (comparatively short) symmetric key with RSA, and transmits both the RSA-encrypted symmetric key and the
symmetrically-encrypted message to Alice.
This procedure raises additional security issues. For instance, it is of utmost importance to use a strong random number generator for the symmetric key, because
otherwise Eve could bypass RSA by guessing the symmetric key.
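The hybrid procedure can be sketched as follows. Everything here is an assumption made for illustration: the RSA key is built from two well-known primes and is far too small for real use, the key is encrypted without padding, and a SHA-256 counter-mode keystream stands in for a real symmetric cipher such as DES. All function names are hypothetical.

```python
import hashlib
import secrets

# Illustrative RSA key; a real modulus would be at least 2048 bits.
p, q = 1000000007, 998244353
N, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def keystream_xor(key: int, data: bytes) -> bytes:
    """Stand-in symmetric cipher: XOR data with a SHA-256 counter keystream."""
    key_bytes = key.to_bytes(8, "big")  # this N fits in 8 bytes
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key_bytes + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

def hybrid_encrypt(message: bytes):
    session_key = 2 + secrets.randbelow(N - 2)  # unpredictable symmetric key
    wrapped_key = pow(session_key, e, N)        # RSA encrypts only the key
    return wrapped_key, keystream_xor(session_key, message)

def hybrid_decrypt(wrapped_key: int, body: bytes) -> bytes:
    session_key = pow(wrapped_key, d, N)        # recover the symmetric key
    return keystream_xor(session_key, body)
```

The session key is drawn from the operating system's cryptographic random source through the `secrets` module, which addresses the requirement that the symmetric key be unpredictable.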
Key distribution
As with all ciphers, how RSA public keys are distributed is important to security. Key distribution must be secured against a
man-in-the-middle attack. Suppose Eve has some way
to give Bob arbitrary keys and make him believe they belong to Alice. Suppose further that Eve can intercept transmissions
between Alice and Bob. Eve sends Bob her own public key, which Bob believes to be Alice's. Eve can then intercept any ciphertext
sent by Bob, decrypt it with her own secret key, keep a copy of the message, encrypt the message with Alice's public key, and
send the new ciphertext to Alice. In principle, neither Alice nor Bob would be able to detect Eve's presence. Defenses against
such attacks are often based on digital certificates or other
components of a public key infrastructure.
Timing attacks
Kocher described an ingenious new attack on RSA in 1995: if the attacker Eve knows
Alice's hardware in sufficient detail and is able to measure the decryption times for several known ciphertexts, she can deduce
the decryption key d quickly. This attack can also be applied against the RSA signature scheme. One way to thwart the
attack is to ensure that the decryption operation takes a constant amount of time for every ciphertext. Another way makes use of
the multiplicative property of RSA. Instead of computing c^d mod N, Alice first chooses a secret random value
r and computes (r^e c)^d mod N. The result of this computation is r n mod N, where n is the padded plaintext, and so the
effect of r can be removed by multiplying by its inverse. A new value of r is chosen for each ciphertext. With this
technique, known as message blinding, the decryption time is no longer correlated to the value of the ciphertext and so
the timing attack fails.
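Blinding can be sketched with the toy key from the worked example. The function name `blinded_decrypt` is hypothetical, and the sketch shows only the arithmetic; it makes no claim to run in constant time.

```python
import secrets
from math import gcd

N, e, d = 3233, 17, 2753  # toy key from the worked example

def blinded_decrypt(c):
    # Choose a fresh random r that is invertible modulo N.
    while True:
        r = 2 + secrets.randbelow(N - 2)
        if gcd(r, N) == 1:
            break
    blinded = (pow(r, e, N) * c) % N   # r^e * c
    rn = pow(blinded, d, N)            # equals r * n mod N
    return (rn * pow(r, -1, N)) % N    # multiply by r's inverse to get n
```

The private-key exponentiation is applied to a value that changes randomly on every call, while the returned plaintext is the same as for ordinary decryption.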
Adaptive chosen ciphertext attacks
In 1998, Daniel Bleichenbacher described the first practical adaptive chosen ciphertext attack,
against RSA-encrypted messages using the PKCS #1 v1.5 padding scheme (a padding scheme adds structure to an RSA-encrypted message, so it is possible to
determine whether a decrypted message is valid). Due to flaws with the PKCS #1 scheme, Bleichenbacher was able to mount a
practical attack against RSA implementations of the Secure Socket
Layer protocol, and potentially reveal session keys. As a result of this work, cryptographers now recommend the use of
provably secure padding schemes such as Optimal Asymmetric Encryption Padding, and RSA Laboratories has released new versions
of PKCS #1 that are not vulnerable to these attacks.
External links
- PKCS #1: RSA Cryptography Standard (http://www.rsasecurity.com/rsalabs/node.asp?id=2125) (RSA Laboratories website)
- A Method for Obtaining Digital Signatures and Public-Key Cryptosystems (http://theory.lcs.mit.edu/~rivest/rsapaper.pdf), R. Rivest, A. Shamir, L. Adleman, Communications of the ACM, Vol. 21 (2), 1978, pages 120-126. Previously released as an MIT "Technical Memo" in April 1977.
- Initial publication of the RSA scheme.
- An Introduction to the RSA Cryptosystem (http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=544&printer=t), M. Griep, Oct. 2002 (from the DevHood (http://www.devhood.com) website)
- An introduction to the RSA cryptosystem for people with a general interest in the mathematics behind the cryptosystem or possible programming implementations.