Speeding up modular exponentiation using CRT

This is the fifth post in a series of posts on the Chinese remainder theorem (CRT). When solving a congruence equation with a composite modulus, it is often easier to convert the problem to one of solving several congruence equations with smaller moduli that are primes or powers of primes. Then combine the individual solutions using the Chinese remainder theorem. In this post, we demonstrate this process for modular exponentiation x \equiv c^d \ (\text{mod} \ m) where the exponent d and the modulus m are large. Given x \equiv c^d \ (\text{mod} \ m), the algorithm discussed here is to produce a system of equations with smaller moduli and exponents that give the same answer as for the original problem.

The previous posts in the series on CRT: first post; second post; third post; fourth post

____________________________________________________________________________

Preliminary discussion

The exponentiation c^d is usually programmed using the fast powering algorithm. The CRT method will convert the exponentiation c^d to several exponentiations that involve much smaller exponents and moduli, thus greatly reducing the calculation time, in particular speeding up the fast powering algorithm. One application of the CRT method is to improve the run time of the decryption process in the RSA algorithm (up to four times faster).

As already mentioned, the goal of the algorithm discussed here is to produce an equivalent system of linear congruence equations. Once this system is produced, the Chinese remainder theorem only guarantees a solution and does not actually produce a solution. So we need to know how to Chinese remainder, i.e. using an algorithm for solving a simultaneous linear congruences. We can use the one discussed here (the first method) or here (the second method). The following examples are worked using the second method.

The CRT algorithm discussed here makes use of Euler’s theorem, which states that a^{\phi(m)} \equiv 1 \ (\text{mod} \ m) whenever a and the modulus m are relatively prime where \phi(m) is the phi function evaluated at m. For this reason, the algorithm requires the evaluation of the phi function.

When calculating c^d \ (\text{mod} \ m), the use of the phi function \phi(m) is to reduce the exponent d by the largest multiple of \phi(m). For example, since \phi(17)=16 and 2^{16} \equiv 1 \ (\text{mod} \ 17), the problem of 2^{250} \ (\text{mod} \ 17) is converted to finding 2^{10} \ (\text{mod} \ 17). Here, the original exponent of 250 is reduced to 10 after taking out the largest multiple of 16. Note that 10 \equiv 250 \ (\text{mod} \ \phi(17)=16). In general, we want to replace the original exponent d by a smaller exponent d_1 where d_1 \equiv d \ (\text{mod} \ \phi(m)).

____________________________________________________________________________

Examples

In the following three examples, we use the algorithm discussed here to solve systems of linear congruence equations (this is the iterative approach). These examples are by no means realistic since the numbers used are small. So they are for demonstration of how CRT works.

Example 1
Calculate x \equiv 2^{3163} \ (\text{mod} \ 3969).

First, factor the modulus 3969=3^4 \times 7^2=81 \times 49. Now the problem is converted to solving the following system of two equations:

    x \equiv 2^{3163} \ (\text{mod} \ 81)

    x \equiv 2^{3163} \ (\text{mod} \ 49)

By CRT, any solution to the two equations is also a solution to the original equation. However, the exponent of 3163 should first be reduced. To this end, calculate the phi function, where \phi(3^4)=3^2 \cdot (3-1)=54 and \phi(7^2)=7 \cdot (7-1)=42. We should reduce from 3163 the largest multiple of 54 in the first equation and reduce the largest multiple of 42 in the second equation. In other words, reduce the exponent 3163 modulo the two phi function values:

    3163 \equiv 31 \ (\text{mod} \ 54)

    3163 \equiv 13 \ (\text{mod} \ 42)

As a result, we solve the following two equations:

    x \equiv 2^{3163} \equiv 2^{31} \equiv 65 \ (\text{mod} \ 81)

    x \equiv 2^{3163} \equiv 2^{13} \equiv 9 \ (\text{mod} \ 49)

Note that the original exponentiation 2^{3163} is turned into the easier ones of 2^{31} and 2^{13}. The following gives the solution to the above two equations.

    \displaystyle \begin{aligned} x_0&=65+81 \cdot 23 \cdot (9-65) \ (\text{mod} \ 3969)  \\&\equiv -104263 \ (\text{mod} \ 3969) \\&\equiv 2900 \ (\text{mod} \ 3969) \end{aligned}

    where 23 is obtained by solving for y in 81y \equiv 1 \ (\text{mod} \ 49)

By CRT, the answer to the original problem is 2^{3163} \equiv 2900 \ (\text{mod} \ 3969). \square

Example 2
Calculate x \equiv 3^{3163} \ (\text{mod} \ 3969).

In this example, the number 3 and the modulus 3969 are not relatively prime. The CRT method still applies. As in Example 1, the problem can be reduced in the following way:

    x \equiv 3^{3163} \equiv 3^{31} \equiv 0 \ (\text{mod} \ 81)

    x \equiv 2^{3163} \equiv 3^{13} \equiv 10 \ (\text{mod} \ 49)

The first equation is congruent to 0 since 3^{31} contains 81 as a factor. The following gives the solution to the above two equations:

    \displaystyle \begin{aligned} x_0&=0+81 \cdot 23 \cdot (10-0) \ (\text{mod} \ 3969)  \\&\equiv 18630 \ (\text{mod} \ 3969) \\&\equiv 2754 \ (\text{mod} \ 3969) \end{aligned}

By CRT, the answer to the original problem is 3^{3163} \equiv 2754 \ (\text{mod} \ 3969). \square

Example 3
The above 2 examples use small numbers to illustrate the CRT technique. In this example, we use slightly larger numbers. Calculate x \equiv 355^{d} \ (\text{mod} \ m) where d=\text{1,759,695,794} and m=\text{3,055,933,789}=1277 \cdot 1439 \cdot 1663.

As in the other examples, we break up the problem in three congruences. The three factors of the modulus are prime numbers. Thus we reduce the exponent d by multiples of a prime factor less one.

    \text{1,759,695,794} \equiv 1198 \ (\text{mod} \ 1276)

    \text{1,759,695,794} \equiv 814 \ (\text{mod} \ 1438)

    \text{1,759,695,794} \equiv 110 \ (\text{mod} \ 1662)

Then the original problem is transformed to solving the following three equations.

    x \equiv 355^{d} \equiv 355^{1198} \equiv 189 \ (\text{mod} \ 1277)

    x \equiv 355^{d} \equiv 355^{814} \equiv 1010 \ (\text{mod} \ 1439)

    x \equiv 355^{d} \equiv 355^{110} \equiv 315 \ (\text{mod} \ 1663)

Notice that the original exponentiation is transformed to the smaller ones of 355^{1198}, 355^{814} and 355^{315}. The remaining task is to solve the system of three equations. One way to find to solution to the above three equations is to use the iterative approach, starting with the solution x_1=189 to the first equation. Then find the solution x_2 to the first two equations and then the solution x_3 to all three equations.

    \displaystyle \begin{aligned} x_2&=189+1277 \cdot 151 \cdot (1010-189) \ (\text{mod} \ 1277 \cdot 1439)  \\&\equiv 158311156 \ (\text{mod} \ 1837603) \\&\equiv 277298 \ (\text{mod} \ 1837603) \end{aligned}

    where 151 is obtained by solving for y in 1277y \equiv 1 \ (\text{mod} \ 1439)

    \displaystyle \begin{aligned} x_3&=277298+(1277 \cdot 1439) \cdot 970 \cdot (315-277298) \ (\text{mod} \ 1277 \cdot 1439 \cdot 1663)  \\&\equiv 277298+1782474910 \cdot (-276983) \ (\text{mod} \ m) \\&\equiv 277298+1414954310 \ (\text{mod} \ m) \\&\equiv 1415231608 \ (\text{mod} \ m) \end{aligned}

    where 970 is obtained by solving for y in (1277 \cdot 1439) y \equiv 1 \ (\text{mod} \ 1663)

By CRT, the answer to the original problem is 355^{d} \equiv 1415231608 \ (\text{mod} \ m). \square

____________________________________________________________________________

The CRT algorithm to speed exponentiation

Suppose we wish to evaluate x \equiv c^{d} \ (\text{mod} \ m) where the prime factorization of m is m=p_1^{n_1} \cdot p_2^{n_2} \cdots p_t^{n_t}. The numbers p_i are distinct prime numbers and each exponent n_i \ge 1. To prepare for the calculation, do the following:

    Let m_i=p_i^{n_i} for each i.

    Calculate \phi(m_i) for each i.

Case 1. The base c and the modulus m are relatively prime.
Then the answer to the original exponentiation problem is identical to the solution to the following system of t congruence equations:

    x \equiv c^{d_1} \ (\text{mod} \ m_1)

    x \equiv c^{d_2} \ (\text{mod} \ m_2)

    \cdots

    x \equiv c^{d_t} \ (\text{mod} \ m_t)

where d_i \equiv d \ (\text{mod} \ \phi(m_i)) for each i. If possible, each c^{d_i} should be reduced modulo m_i.

Case 2. The base c and the modulus m are not relatively prime.
In this case, c and m have prime factors in common (at least one p_i). The idea here is that for any p_i that is a prime factor of c, the equation x \equiv c^{d_i} \ (\text{mod} \ m_i) in Case 1 is replaced by x \equiv 0 \ (\text{mod} \ m_i). Then solve the resulting system of equations (see Example 2). Essentially, Case 2 can fall under Case 1 with c^{d_i} being congruent to zero. We call out Case 2 for the sake of clarity.

The original exponentiation c^d boils down to solving an appropriate system of CRT congruences as described above. Once the equivalent system of congruences is set up, use the algorithm discussed here or here to do Chinese remaindering.

Comment
Both Case 1 and Case 2 produce a system of linear congruence equations that have identical solution to the original equation. This is a result of using CRT (see Theorem D and Theorem G here). The savings in the calculation come in the form of the smaller exponentiations in the resulting congruence equations.

In Case 2, some of the congruence equations are x \equiv 0. This is because the base c and some moduli m_i are not relatively prime. For these moduli, c^{d_i} would contain p_i^{n_i} as a factor (assuming that d_i \ge n_i). Hence, x \equiv c^{d_i} \equiv 0 \ (\text{mod} \ m_i).

The two-equation case
When the modulus is the product of two factors that are relatively prime, the CRT algorithm involves only two equations. We write out the solution explicitly for this case. To evaluate x \equiv c^{d} \ (\text{mod} \ m) where m=m_1 \cdot m_2 and m_1 and m_2 are relatively prime, solve the following system of two equations,

    x \equiv c^{d_1} \ (\text{mod} \ m_1)

    x \equiv c^{d_2} \ (\text{mod} \ m_2)

The solution is x \equiv  c^{d_1}+m_1 \cdot v_1 \cdot (c^{d_2}-c^{d_1}) \ (\text{mod} \ m) where v_1 is the multiplicative inverse of m_1 modulo m_2. If possible, each c^{d_i} should be reduced modulo m_i.

____________________________________________________________________________

RSA application

The algorithm to speed the exponentiation is possible because the factorization of the modulus m is known (as a result, the values of \phi(m_i) are known). Knowing the values of \phi(m_i) makes it possible to reduce the large exponent d. For this reason, the decryption process in the RSA algorithm is a perfect place to apply the CRT technique described here.

With the RSA cryptosystem, a public key consists of N and e, where N is a large modulus that is a product of two large primes p and q (the two primes are not published) and e is the encryption exponent. Say Bob is the originator of the RSA public key. Bob also generates a private key, which is a number d that is used for decrypting any messages that he receives. The number d must be kept private. The prime factors p and q must also be kept secret since knowing p and q can derive d.

Suppose that Alice has a message to send to Bob. She can do so using the published key of N and e through the exponentiation c \equiv m^e \ (\text{mod} \ N). Here, m is the plaintext (the message to be sent) and c is the ciphertext (the encrypted message). Upon receiving the ciphertext c, Bob can then decrypt through the exponentiation m \equiv c^d \ (\text{mod} \ N) where d is the decryption exponent. In realistic RSA calculation, the public modulus N and the private decryption exponent d are large integers (N is at minimum a 2048-bit number). With CRT, the decryption can be reduced to two much smaller exponentiations. The effect can be at least four times faster.

We illustrate this with an example. This is a toy example since the numbers used are small. It is only intended as an illustration.

Example 4
Suppose the public key consists of N=\text{17,086,049} and e=65537. Bob has the additional information of N=p \cdot q where p=3863 and q=4423, which are kept secret. Knowing p and q allows Bob to compute d=\text{5,731,241}. Suppose that Bob receives a message c= 4831984 from Alice. Use the CRT approach to find the plaintext m.

The exponentiation is m \equiv 4831984^{5731241} \ (\text{mod} \ N), which is equivalent to the following two equations by CRT:

    m \equiv 4831984^{5731241} \ (\text{mod} \ 3863)

    m \equiv 4831984^{5731241} \ (\text{mod} \ 4423)

which is further simplified to:

    m \equiv 4831984^{5731241} \equiv 4831984^{33} \equiv 3084 \ (\text{mod} \ 3863)

    m \equiv 4831984^{5731241} \equiv 4831984^{329} \equiv 1436 \ (\text{mod} \ 4423)

where 33 \equiv 5731241 \ (\text{mod} \ 3862) and 329 \equiv 5731241 \ (\text{mod} \ 4422). Then the following is the plaintext (the original message).

    \displaystyle \begin{aligned} m&=3084+3863 \cdot 1319 \cdot (1436-3084) \ (\text{mod} \ 3863 \cdot 4423)  \\&\equiv 3084+5095297 \cdot (-1648) \ (\text{mod} \ N) \\&\equiv 9289736 \ (\text{mod} \ N) \end{aligned}

    where 1319 is obtained by solving for y in 3683y \equiv 1 \ (\text{mod} \ 4423)

The answer is 4831984^{5731241} \equiv 9289736 \ (\text{mod} \ N). \square

____________________________________________________________________________

Closing comment

In conclusion, we state the explcit formula for providing the CRT answer to the RSA decryption.

    m \equiv  c^{d_p}+p \cdot p_{inv} \cdot (c^{d_q}-c^{d_p}) \ (\text{mod} \ p \cdot q) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)

where d_p \equiv d \ (\text{mod} \ p-1), d_q \equiv d \ (\text{mod} \ q-1) and p_{inv} is the multiplicative inverse of p modulo q.

The decryption formula of (1) represends tremendous saving in calculation (up to four times faster). It is possible only for the holder of the RSA private key. It requires knowledge of the decryption exponent d, which is calculated from the factors p and q of the modulus N.
____________________________________________________________________________
\copyright \ 2015 \text{ by Dan Ma}

Advertisements

Factorization versus primality testing

Let n be a large positive integer whose “prime versus composite” status is not known. One way to know whether n is prime or composite is to factor n into its prime factors. If there is a non-trivial factor (one that is neither 1 nor n), it is composite. Otherwise n is prime. This may sound like a reasonable approach in performing primality testing – checking whether a number is prime or composite. In reality, factoring and primality testing, though related, are very different problems. For a very large number (e.g. with at least 300 decimal digits), it is possible that, even with the state of the art in computing, factoring it may take more than a few million years. On the other hand, it will take a modern computer less than a second to determine whether a 300-digit number is prime or composite. Interestingly this disparity is one reason that makes the RSA work as a practical and secure cryptosystem. In this post, we use the RSA cryptosystem as an example to give a sense that factoring is a “hard” problem while primality testing is an “easy” problem. The primality test used in the examples is the Fermat primality test.

____________________________________________________________________________

The brute force approach

There is a natural and simple approach in factoring, which is to do trial divisions. To factor the number n, we divide n by every integer a in the range 1<a<n. Once a factor a is found, we repeat the process with the complementary factor \frac{n}{a} until all the prime factors of n are found. This is simple in concept and is sure to produce the correct answer. For applications in cryptography, this brute force approach is essentially useless since the amount of time to try every candidate factor is prohibitively huge. The amount of time required may be more than the age of the universe if the brute force approach is used.

The brute force approach can be improved upon slightly by doing the trial divisions using candidate factors up to \sqrt{n}. It is well known that if a composite integer n is greater than one, then it has a prime divisor d such that 1<d \le \sqrt{n}. So instead of dividing n by every number a with 1<a<n, we can divide n by every prime number a with 1<a \le \sqrt{n}. But even this improved brute force approach will still take too long to be practical.

Let’s look at a quick example for brute force factoring. Let n=96638243. Note that \sqrt{n}=\sqrt{96638243}=9676. There are 1192 odd primes less than 9676. In dividing n by these primes, we stop at 127 and we have n=96638243=127 \cdot 737309. We now focus the attention on 737309. Note that \sqrt{737309}=858.67 and there are 147 odd primes less than 858. Dividing 737309 by these 147 candidate factors, we find that none of them is a factor. We can conclude 737309 is prime. Then we have the factorization n=96638243=127 \cdot 737309.

____________________________________________________________________________

Example of RSA

RSA is a public-key cryptosystem and is widely used for secure data transmission. The RSA public key consists of two parts. One is the modulus that is the product of two distinct prime factors. Suppose the modulus is called N and we have N=pq where p and q are distinct prime numbers. How large does N have to be? The larger the N is, the more secure RSA is. The current practice is that for corporate use the modulus is at least a 1024-bit number (the bit size is called the key length). If data is extra sensitive or if the data needs to be retained for a long time, then a larger key length should be used (e.g. 2048-bit). With a 1024-bit modulus N=pq, each prime factor is a 512-bit number. The other part of the RSA public key is the encryption key, which is an integer e that is relatively prime to the integer (p-1) \cdot (q-1).

Let’s say we want to generate a 1024-bit modulus. There are two challenges with a key of this size. One is that a reliable way is needed to obtain two prime numbers that are 512-bit long. Given a large integer that is at least 512-bit long, how do we determine reliably that it is prime? Is it possible to factor a 512-bit integer using the brute force approach? The other challenge is from the perspective of an attacker – successful factoring the 1024-bit modulus would break RSA and allow the attacker to read the secret message. Let’s look at the problem of the attacker trying to factor a 1024-bit number. A 1024-bit number is approximately 2^{1024}. The following calculation converts it to a decimal basis:

    \displaystyle 2^{1024}=(10^{\text{log(2)}})^{1024} \approx 10^{308.25}

We use \text{log}(x) to denote the logarithm of base 10. Note that 1024 \cdot \text{log}(2)=308.25. So a 1024-bit number has over 300 digits.

Let’s see what the challenge is if you want to factor a 1024-bit number. Suppose your chosen large number n is such that n \approx 10^{308}. Note that \sqrt{10^{308}}=10^{154}. According to the improved brute force approach described above, in effect you will need to divide n by every prime number less than 10^{154}.

Now let’s get an estimate on the number of prime numbers less than 10^{154}. According to the prime number theorem, the number of prime numbers at most x is approximately

    \displaystyle \pi(x) \approx \frac{x}{\text{ln}x}

where \pi(x) is the number of primes at most x. Then \pi(10^{154}) \approx 2.82 \cdot 10^{151}. This is certainly a lot of prime numbers to check.

It is hard to comprehend such large numbers. Let’s put this into perspective. Currently the world population is about 7 billion. Let’s say each person in the world possesses a supercomputer that can check 10^{40} prime numbers per second (i.e. to check whether they are factors of the number n). This scenario clearly far exceeds the computing resources that are currently available. Suppose that the 7 billion supercomputers are available and that each one can check 10^{40} many primes per second. Then in each second, the following is the number of prime numbers that can be checked by the 7 billion supercomputers.

    \displaystyle 7 \cdot 10^9 \cdot 10^{40}=7 \cdot 10^{49} \text{ prime numbers per second}

The following is the number of seconds it will take to check 2.82 \cdot 10^{151} many prime numbers:

    \displaystyle \frac{2.82 \cdot 10^{151}}{7 \cdot 10^{49}} \approx 4 \cdot 10^{101} \text{ seconds}

The universe is estimated to be about 13 billion years old. The following calculation converts it to seconds.

    13 \text{ billion years}=13 \cdot 10^9 \cdot 365 \cdot 24 \cdot 3600 \approx 4 \cdot 10^{17} \text{ seconds}

With 7 billion fast suppercomputers (one for each person in the world) running in the entire life of the universe, you can only finish checking

    \displaystyle \frac{4 \cdot 10^{17}}{4 \cdot 10^{101}}=\frac{1}{10^{84}}

of the 2.82 \cdot 10^{151} many prime numbers. Note that \frac{1}{10^{84}} is a tiny portion of 1%. So by taking the entire life of the universe to run the 7 billion supercomputers, each checking 10^{40} many candidate prime factors per second, you would not even make a dent in the problem!

The security of RSA rests on the apparent difficulty of factoring large numbers. If the modulus N=pq can be factored, then an eavesdropper can obtain the private key from the public key and be able to read the message. The difficulty in factoring means there is a good chance that RSA is secure. In order to break RSA, an attacker would probably have to explore other possible vulnerabilities instead of factoring the modulus.

By carrying out a similar calculation, we can also see that factoring a 512-bit number by brute force factoring is also not feasible. Thus in the RSA key generation process, it is not feasible to use factoring as a way to test primality. The alternative is to use efficient primality tests such as Fermat test or Miller-Rabin test. The computation for these tests is based on the fast powering algorithm, which is a very efficient algorithm.

____________________________________________________________________________

The story told by RSA numbers

The required time of more than the life of the universe as discussed above is based on the naïve brute force approach of factoring. There are many other factoring approaches that are much more efficient and much faster, e.g., the quadratic sieve algorithm, the number field sieve algorithm, and the general number field sieve algorithm. For these methods, with ample computing resources at the ready, factoring a 1024-bit or 2048-bit number may not take the entire life of the universe but make take decades or more. Even with these better methods, the disparity between slow factoring and fast primality testing is still very pronounced and dramatic.

The best evidence of slow factoring even with using modern methods is from the RSA numbers. The RSA numbers are part of the the RSA Factoring Challenge, which was created in 1991 to foster research in computational number theory and the practical difficulty of factoring large integers. The challenge was declared inactive in 2007. The effort behind the successful factorization of some of these numbers gives us an idea of the monumental challenges in factoring large numbers.

According to the link given in the above paragraph, there are 54 RSA numbers, ranging from 330 bits long to 2048 bits long (100 decimal digits to 617 decimal digits). Each of these numbers is a product of two prime numbers. Of these 54 numbers, 18 were successfully factored (as of the writing of this post). They were all massive efforts involving large groups of volunteers (in some cases using hundreds or thousands of computers), spanning over months or years. Some of methods used are the quadratic sieve algorithm, the number field sieve algorithm, and the general number field sieve algorithm.

The largest RSA number that was successfully factored is the RSA-768, which is 768 bits long and has 232 decimal digits (completed in December 2009). The method used was the Number Field Sieve method. There were 4 main steps in this effort. The first step is the polynomial selection, which took half a year using 80 processors. The second step is the sieving step, which took almost two years on many hundreds of machines. If only using a single core 2.2 GHz AMD Opteron processor with 2 GB RAM, the second step would take about 1500 years! The third step is the matrix step, which took a couple of weeks on a few processors. The final step took a few days, which involved a great deal of debugging.

The number field sieve method is the fastest known method for factoring large numbers that are a product of two primes (i.e. RSA moduli). The effort that went into factoring RSA-768 was massive and involved many years of complicated calculations and processing. This was only a medium size number on the list!

Another interesting observation that can be made is on the RSA numbers that have not been factored yet. There are 36 unfactored numbers in the list. One indication that RSA is secure in the current environment is that the larger numbers in the list are not yet factored (e.g. RSA-1024 which is 1024-bit long). Successful factorization of these numbers has important security implication for RSA. The largest number on the list is RSA-2048, which is 2048-bit long and has 617 digits. It is widely believed that RSA-2048 will stay unfactored in the decades to come, barring any dramatic and significant advance in computing technology.

The factoring challenge for the RSA numbers certainly provides empirical evidence that factoring is hard. Of course, no one should be complacent. We should not think that factoring will always be hard. Technology will continue to improve. A 768-bit RSA modulus was once considered secure. With the successful factorization of RSA-768, key size of 768 bits is no longer considered secure. Currently 1024 bit key size is considered secure. The RSA number RSA-1024 could very well be broken in within the next decade.

There could be new advances in factoring algorithm too. A problem that is thought to be hard may eventually turn out to be easy. Just because everyone thinks that there is no fast way of factoring, it does not mean that no such method exists. It is possible that someone has discovered such a method but decides to keep it secret in order to maintain the advantage. Beyond the issue of factoring, there could be some other vulnerabilities in RSA that can be explored and exploited by attackers.

____________________________________________________________________________

Fermat primality test

We now give some examples showing primality testing is a much better approach (over factoring) if the goal is to check the “prime or composite” status only. We use Fermat primality test as an example.

Example 1
Let n=15144781. This is a small number. So factoring would be practical as a primality test. We use it to illustrate the point that the “prime or composite” status can be determined without factoring. One option is to use Fermat’s little theorem (hence the name of Fermat primality test):

    Fermat’s little theorem
    If n is a prime number and if a is an integer that is relatively prime to n, then a^{n-1} \equiv 1 \ (\text{mod} \ n).

Consider the contrapositive of the theorem. If we can find an a, relatively prime to n such that a^{n-1} \not \equiv 1 \ (\text{mod} \ n), then we know for sure n is not prime. Such a value of a is said to be a Fermat witness for the compositeness of n.

If a Fermat witness is found, then we can say conclusively that n is composite. On the other hand, if a is relatively prime to n and a^{n-1} \equiv 1 \ (\text{mod} \ n), then n is probably a prime. We can then declare n is prime or choose to run the test for a few more random values of a.

The exponentiation a^{n-1} \ (\text{mod} \ n) is done using the fast powering algorithm, which involves a series of squarings and multiplications. Even for large moduli, the computer implementation of this algorithm is fast and efficient.

Let’s try some value of a, say a=2. Using an online calculator, we have

    2^{15144780} \equiv 1789293 \not \equiv 1 \ (\text{mod} \ 15144781)

In this case, one congruence calculation tells us that n=15144781 is not prime (if it were, the congruence calculation would lead to a value of one). It turns out that n=15144781 is a product of two primes where n=15144781=3733 \cdot 4057. Of course, this is not a secure modulus for RSA. The current consensus is to use a modulus that is at least 1024-bit long.

Example 2
Let n=15231691. This is also a small number (in relation what is required for RSA). Once again this is an illustrative example. We calculate a^{15231690} \ (\text{mod} \ 15231691) for a=2,3,4,5,6,7, the first few values of a. All such congruence values are one. We suspect that n=15231691 may be prime. So we randomly choose 20 values of a and compute a^{15231690} \ (\text{mod} \ 15231691). The following shows the results.

    \left[\begin{array}{rrr}      a & \text{ } & a^{n-1} \ \text{mod} \ n \\      \text{ } & \text{ } & n=15,231,691  \\      \text{ } & \text{ } & \text{ }  \\      3,747,236 & \text{ } & 1  \\      370,478 & \text{ } & 1  \\      12,094,560 & \text{ } & 1  \\      705,835 & \text{ } & 1  \\      10,571,714 & \text{ } & 1  \\      15,004,366 & \text{ } & 1  \\      12,216,046 & \text{ } & 1  \\        10,708,300 & \text{ } & 1  \\      6,243,738 & \text{ } & 1  \\      1,523,626 & \text{ } & 1  \\      10,496,554 & \text{ } & 1  \\      10,332,033 & \text{ } & 1  \\      10,233,123 & \text{ } & 1  \\      3,996,691 & \text{ } & 1  \\            4,221,958 & \text{ } & 1  \\      3,139,943 & \text{ } & 1  \\      1,736,767 & \text{ } & 1  \\      12,672,150 & \text{ } & 1  \\      12,028,143 & \text{ } & 1  \\      8,528,642 & \text{ } & 1    \end{array}\right]

For all 20 random values of a, a^{15231690} \equiv 1 \ (\text{mod} \ 15231691). This represents strong evidence (though not absolute proof) that n=15231691 is a prime. In fact, we can attach the following probability statement to the above table of 20 random values of a.

    If n=15231691 were a composite number that has at least one Fermat witness, there is at most a 0.0000953674% chance that 20 randomly selected values of a are not Fermat witnesses.

    In other words, if n=15231691 were a composite number that has at least one Fermat witness, there is at most a 0.0000953674% chance of getting 20 1’s in the above computation.

In general, if n has at least one Fermat witness, the probability that all k randomly selected values of a with 1<a<n are not Fermat witnesses is at most 0.5^k. For k=20, 0.5^{20}=0.000000953674, which is 0.0000953674%. The probability statement should give us enough confidence to consider n=15231691 a prime number.

There is a caveat that has to be mentioned. For the above probability statement to be valid, the number n must have at least one Fermat witness. If a number n is composite, we would like the test to produce a Fermat witness. It turns out that there are composite numbers that have no Fermat witnesses. These numbers are called Carmichael numbers. If n is such a number, a^{n-1} \equiv 1 \ (\text{mod} \ n) for any a that is relatively prime to the number n. In other words, the Fermat test will always indicate “probably prime” for Carmichael numbers. Unless you are lucky and randomly pick a value of a that shares a common prime factor with n, the Fermat test will always incorrectly identify a Carmichael number n as prime. Fortunately Carmichael numbers are rare, even though there are infinitely many of them. In this previous post, we estimate that a randomly selected 1024-bit odd integer has a less than one in 10^{88} chance of being a Carmichael number!

The Fermat test is a powerful test when the number being tested is a prime number or a composite number that has a Fermat witness. For Carmichael numbers, the test is likely to produce a false positive (identifying a composite number as prime). Thus the existence of Carmichael numbers is the biggest weakness of the Fermat test. Fortunately Carmichael numbers are rare. Though they are rare, their existence may still make the Fermat test unsuitable in some situation, e.g., when you test a number provided by your adversary. If you really want to avoid situations like these, you can always switch to the Miller-Rabin test.

____________________________________________________________________________

\copyright \ 2014 \text{ by Dan Ma}

An upper bound for Carmichael numbers

It is well known that Fermat’s little theorem can be used to establish the compositeness of some integers without actually obtaining the prime factorization. Fermat’s little theorem is an excellent test for compositeness as well as primality. However, there are composite numbers that evade the Fermat test, i.e. the Fermat test will fail to indicate that these composite integers are composite. These integers are called Carmichael numbers. However, Carmichael numbers are rare. We illustrate this point by doing some calculation using an upper bound for Carmichael numbers.

Let p be a prime number. According to Fermat’s little theorem, a^{p-1} \equiv 1 \ (\text{mod} \ p) for all integer a that is relatively prime to p (i.e., the GCD of a and p is 1). The Fermat primality test goes like this. Suppose that the “composite or prime” status of the positive integer n is not known. We randomly pick a number a \in \left\{2,3,\cdots,n-1 \right\}. If a is relatively prime to n and if a^{n-1} \not \equiv 1 \ (\text{mod} \ n), then we are certain that n is composite even though we may not know its prime factorization. Such a value of a is said to be a Fermat witness for (the compositeness of) a. If a^{n-1} \equiv 1 \ (\text{mod} \ n), then n is probably prime. But to be sure, repeat the calculation with more values of a. If the calculation is done for a large number of randomly selected values of a and if the calculation for every one of the values of a indicates that n is probably prime, we will have high confidence that n is prime. In other words, the probability of making a mistake is very small.

However here is a wrinkle in the Fermat test. There are composite numbers which have no Fermat witnesses. These numbers are called Carmichael numbers. Specifically a positive composite integer n is a Carmichael number if a^{n-1} \equiv 1 \ (\text{mod} \ n) for all a relatively prime to n. In other words, if n is a Carmichael number, the Fermat test always indicates n is probably prime no matter how many values of a you use in the test. Fortunately Carmichael numbers are rare. The upper bound discussed below gives an indication of why this is the case.

____________________________________________________________________________

An upper bound

For each positive integer n, let C(n) be the number of Carmichael numbers that are less than n. The following is an upper bound for C(n).

    \displaystyle C(n)<n \cdot \text{exp}\biggl(-\frac{(\text{ln} \ n) \ (\text{ln} \ \text{ln} \ \text{ln} \ n)}{\text{ln} \ \text{ln} \ n}  \biggr)

The formula is found here (credited to Richard G. E. Pinch). We use this upper bound to find out the chance of encountering a Carmichael numbers. As shown below, the upper bound can overestimate C(n). The main point we like to make is that even with the overestimation of Carmichael numbers represented by the above upper bound, the number of Carmichael number is extremely small in relation to n. This is even more so when n is large (e.g. a 1024-bit integer). Thus for a randomly selected 1024-bit odd number, the probability that it is a Carmichael number is practically zero (see Examples 4 and 5 below).

____________________________________________________________________________

Examples

Example 1
The first 10 Carmichael numbers are 561, 1105, 1729, 2465, 2821, 6601, 8911, 10585, 15841, 29341. Furthermore, there are only 16 Carmichael numbers less than 100,000. Let n=10^5. According to the above formula, the following is the upper bound for C(10^5):

    \displaystyle C(10^5)<10^5 \cdot \text{exp}\biggl(-\frac{(\text{ln} \ 10^5) \ (\text{ln} \ \text{ln} \ \text{ln} \ 10^5)}{\text{ln} \ \text{ln} \ 10^5}  \biggr)=1485

The bound of 1485 is a lot more than the actual count of 16. Even with this inflated estimate, when you randomly select an odd positive integer less than 10,000, the probability of getting a Carmichael number is 0.0297. With the actual count of 16, the probability is 0.00032.

Example 2
Here’s another small example. There are only 2,163 Carmichael numbers that are less than 25,000,000,000. Let n=2.5 \cdot 10^{10}.

    \displaystyle C(2.5 \cdot 10^{10})<2.5 \cdot 10^{10} \cdot \text{exp}\biggl(-\frac{(\text{ln} \ 2.5 \cdot 10^{10}) \ (\text{ln} \ \text{ln} \ \text{ln} \ 2.5 \cdot 10^{10})}{\text{ln} \ \text{ln} \ 2.5 \cdot 10^{10}}  \biggr)=4116019

This inflated bound is a more than 1900 times over the actual count of 2163. But even with this inflated bound, the probability of a random odd integer being Carmichael is under 0.00033 (about 3 in ten thousands). With the actual count of 2163, the probability is 0.00000017 (less than one in a million chance).

Example 3
Here’s a larger example. A calculation was made by Richard G. E. Pinch that there are 20,138,200 many Carmichael numbers up to 10^{21}. Let’s compare the actual probability and the probability based on the upper bound. The following is the upper bound of C(10^{21}).

    \displaystyle C(10^{21})<10^{21} \cdot \text{exp}\biggl(-\frac{(\text{ln} \ 10^{21}) \ (\text{ln} \ \text{ln} \ \text{ln} \ 10^{21})}{\text{ln} \ \text{ln} \ 10^{21}}  \biggr) \approx 4.6 \cdot 10^{13}

The actual count of 20,138,200 is about 2 \cdot 10^{7}. So 4.6 \cdot 10^{13} is an inflated estimate. The following shows the probability of randomly selecting an odd integer that is Carmichael (both actual and inflated).

    \displaystyle \text{inflated probability}=\frac{4.6 \cdot 10^{13}}{0.5 \cdot 10^{21}}=\frac{4.6 \cdot 10^{13}}{5 \cdot 10^{20}}=\frac{0.92}{10^{7}} \approx \frac{1}{10.9 \cdot 10^6} <\frac{1}{10^6}

    \displaystyle \text{actual probability}=\frac{2 \cdot 10^{7}}{0.5 \cdot 10^{21}}=\frac{2 \cdot 10^{7}}{5 \cdot 10^{20}}=\frac{0.4}{10^{13}} = \frac{1}{25 \cdot 10^{12}} <\frac{1}{10^{12}}

Even with the inflated upper bound, the chance of randomly picking a Carmichael number is less than one in a million. With the actual count of 20,138,200, the chance of randomly picking a Carmichael number is less than one in a trillion!

Remark
The number 10^{21} is quite small in terms of real world applications. For example, in practice, the RSA algorithm requires picking prime numbers that are at least 512-bit long. The largest 512-bit numbers are approximately 10^{154}. What is the chance of randomly picking a Carmichael number in this range? First, let’s look at the Carmichael numbers up to the limit 10^{100}. Then we look at 10^{154}.

Example 4
Here’s the estimates for C(10^{100}) based on the above upper bound.

    \displaystyle C(10^{100})<10^{100} \cdot \text{exp}\biggl(-\frac{(\text{ln} \ 10^{100}) \ (\text{ln} \ \text{ln} \ \text{ln} \ 10^{100})}{\text{ln} \ \text{ln} \ 10^{100}}  \biggr) \approx 7.3 \cdot 10^{68}

    \displaystyle \text{probability}=\frac{7.3 \cdot 10^{68}}{0.5 \cdot 10^{100}}=\frac{7.3 \cdot 10^{68}}{5 \cdot 10^{99}}=\frac{1.46}{10^{31}} \approx \frac{1}{6.8 \cdot 10^{30}} <\frac{1}{10^{30}}

Thus the chance of randomly picking a Carmichael number under 10^{100} is less than one in 10^{30}, i.e., practically zero.

Example 5
Here’s the example relevant to the RSA algorithm. As mentioned above, the RSA algorithm requires that the modulus in the public key is a product of two primes. The current practice is for the modulus to be at least 1024 bits. Thus each prime factor of the modulus is at least 512-bit. A 512-bit number can be as large as 10^{154} in decimal terms. When picking candidate for prime numbers, it is of interest to know the chance of picking a Carmichael number. We can get a sense of how small this probability is by asking: picking an odd integer under the limit 10^{154}, what is the chance that it is a Carmichael number? Here’s the estimates:

    \displaystyle C(10^{154})<10^{154} \cdot \text{exp}\biggl(-\frac{(\text{ln} \ 10^{154}) \ (\text{ln} \ \text{ln} \ \text{ln} \ 10^{154})}{\text{ln} \ \text{ln} \ 10^{154}}  \biggr) \approx 3.7 \cdot 10^{107}

    \displaystyle \text{probability}=\frac{3.7 \cdot 10^{107}}{0.5 \cdot 10^{154}}=\frac{7.4}{10^{47}}=\frac{0.74}{10^{46}} < \frac{1}{10^{46}}

Thus a randomly selected odd integer under 10^{154} has a less than one in 10^{46} chance of being a Carmichael number!

Example 6
In some cases, for stronger security, the modulus in the RSA should be longer than 1024 bits, e,g, 2048 bits. If the modulus is a 2048-bit number, each prime in the modulus is a 1024-bit number. A 1024-bit number can be as large as 10^{308} in decimal terms. In picking an odd integer under the limit 10^{308}, what is the chance that it is a Carmichael number? Here’s the estimates:

    \displaystyle C(10^{308})<10^{308} \cdot \text{exp}\biggl(-\frac{(\text{ln} \ 10^{308}) \ (\text{ln} \ \text{ln} \ \text{ln} \ 10^{308})}{\text{ln} \ \text{ln} \ 10^{308}}  \biggr) \approx 5 \cdot 10^{219}

    \displaystyle \text{probability}=\frac{5 \cdot 10^{219}}{0.5 \cdot 10^{308}}=\frac{5 \cdot 10^{219}}{5 \cdot 10^{307}} \approx \frac{1}{10^{88}}

Thus a randomly selected odd integer under 10^{308} has a less than one in 10^{88} chance of being a Carmichael number!

Remark
The above examples demonstrate that Carmichael numbers are rare. Even though the Fermat primality test “fails” for these numbers, the Fermat test is still safe to use because Carmichael numbers are hard to find. However, if you want to eliminate the error case of Carmichael numbers, you may want to consider using a test that will never misidentify Carmichael numbers. One possibility is to use the Miller-Rabin test.

____________________________________________________________________________

\copyright \ 2014 \text{ by Dan Ma}

Congruence Arithmetic and Fast Powering Algorithm

In some cryptography applications such as RSA algorithm, it is necessary to compute \displaystyle a^w modulo m where the power w and the modulus m are very large numbers. We discuss and demonstrate an efficient algorithm that can handle such calculations. This general algorithm has various names such as fast powering algorithm, square-and-multiply algorithm and exponentiation by squaring.

The problem at hand is to compute a^w \ (\text{mod} \ m). The naïve approach is to compute by repeatedly multiplying by a and reducing modulo m. When the power w is large (e.g. an integer with hundreds of digits), this approach is difficult or even impossible (given the current technology). In this post we discuss an alternative that is known as the fast powering algorithm.

___________________________________________________________________________________________________________________

An Example

Compute 1286^{1171} modulo 1363.

Using the naïve approach described earlier, we would do something like the following:

    \displaystyle \begin{aligned} 1286^{1171} &=(1286 \cdot 1286) \cdot 1286^{1269} \equiv 477 \cdot 1286^{1269} \ \ \ \ \ \ \ \ \ \ \text{mod} \ 1363 \\&=(477 \cdot 1286) \cdot 1286^{1268} \equiv 72 \cdot 1286^{1268} \  \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 1363 \\&=(72 \cdot 1286) \cdot 1286^{1267} \equiv 1271 \cdot 1286^{1267} \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 1363 \\&\text{     }\cdots \\&\text{     }\cdots \\&\text{     }\cdots \end{aligned}

In this naïve approach, we would multiply two numbers at a time and then reduce the result modulo 1363 so that the numbers do not get too large. The above example would involve 1170 multiplications and then 1170 divisions for the reduction modulo 1363. Great difficulty comes when the power is not 1171 and instead is an integer with hundreds or even thousands of digits.

Note that in the above naïve approach, the power is reduced by one at each step. In the fast power alternative, the power comes down by an exponent of two in each step. The idea is to use the binary expansions of the exponent 1171 to transform the computation of 1286^{1171} into a series of squarings and multiplications. To this end, we write 1171 as a sum of powers of two as follows:

    (1) \ \ \ \ \ \ \ \ \ 1171=2^0+2^1+2^4+2^7+2^{10}

Next we compute 1286^{2^0},1286^{2^1},1286^{2^2},1286^{2^3},\cdots,1286^{2^{10}} modulo 1363. Note that each term is the square of the preceding term, hence the word square in the name “square-and-multiply”. To make it easier to see, we put the results in the following table.

    \displaystyle (2) \ \ \ \ \ \ \ \ \ \begin{bmatrix} \text{ i }&\text{ }&1286^{2^i}&\text{ }&\text{squaring}&\text{ }&\text{modulo } 1363&\text{ }&\text{ }  \\\text{ }&\text{ }&\text{ } \\ 0&\text{ }&1286^{2^0}&\text{ }&\text{ }&\text{ }&\equiv 1286&\text{ }&*  \\ 1&\text{ }&1286^{2^1}&\text{ }&\equiv 1286^2&\text{ }&\equiv 477&\text{ }&*  \\ 2&\text{ }&1286^{2^2}&\text{ }&\equiv 477^2&\text{ }&\equiv 1271&\text{ }&\text{ }  \\ 3&\text{ }&1286^{2^3}&\text{ }&\equiv 1271^2&\text{ }&\equiv 286&\text{ }&\text{ }  \\ 4&\text{ }&1286^{2^4}&\text{ }&\equiv 286^2&\text{ }&\equiv 16&\text{ }&*  \\ 5&\text{ }&1286^{2^5}&\text{ }&\equiv 16^2&\text{ }&\equiv 256&\text{ }&\text{ } \\ 6&\text{ }&1286^{2^6}&\text{ }&\equiv 256^2&\text{ }&\equiv 112&\text{ }&\text{ } \\ 7&\text{ }&1286^{2^7}&\text{ }&\equiv 112^2&\text{ }&\equiv 277&\text{ }&* \\ 8&\text{ }&1286^{2^8}&\text{ }&\equiv 277^2&\text{ }&\equiv 401&\text{ }&\text{ } \\ 9&\text{ }&1286^{2^9}&\text{ }&\equiv 401^2&\text{ }&\equiv 1330&\text{ }&\text{ } \\ 10&\text{ }&1286^{2^{10}}&\text{ }&\equiv 1330^2&\text{ }&\equiv 1089&\text{ }&*  \end{bmatrix}

Note that the rows marked by * in the above table are the results that we need. In the above table, there are 10 multiplications for the squarings and 10 divisions for the reduction modulo 1363.

Now 1286^{1171} is calculated as follows:

    \displaystyle \begin{aligned} (3) \ \ \ \ \ \ \ \ \ 1286^{1171} &=1286^{2^0} \cdot 1286^{2^1} \cdot 1286^{2^4} \cdot1286^{2^7} \cdot 1286^{2^{10}} \\&\equiv 1286 \ \ \cdot 477 \ \ \ \ \cdot 16 \ \ \ \ \ \cdot 277 \ \ \ \ \cdot 1089 \ \ \text{mod} \ 1363 \\&\equiv 72 \ \ \ \ \cdot 16 \ \ \ \ \ \cdot 277 \ \ \ \ \cdot 1089 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 1363 \\&\equiv 1152 \ \ \cdot 277 \ \ \cdot 1089 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 1363 \\&\equiv 162 \ \ \cdot 1089 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 1363 \\&\equiv 591 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 1363 \end{aligned}

We have the answer 1286^{1171} \equiv 591 \ (\text{mod} \ 1363). The calculation in (3) is the step that gives the word “multiply” in the name “square-and-multiply”. In this step, we multiply the results obtained from the previous step.

We now tally up the amount of work that is done. The calculation in table (2) requires 10 multiplications for the squaring and 10 divisions for the reduction modulo 1363. The calculation in (3) requires 4 multiplications and 4 divisions for the reduction modulo 1363. Together, there are 14 multiplications and 14 divisions. In contrast, the naïve approach would require 1170 multiplications and 1170 divisions!

___________________________________________________________________________________________________________________

Description of Fast Powering Algorithm

To compute a^w \equiv (\text{mod} \ m), use the following steps. The following steps correspond with the steps in the above example.

Step (1)

Compute the binary expansions of the power w.

    \displaystyle w=C_0 +C_1 \cdot 2^{1}+C_2 \cdot 2^{2}+\cdots+C_{k-1} \cdot 2^{k-1}+C_k \cdot 2^{k}

where each j, C_j=0 or C_j=1. In particular, we assume that C_k=1.

Step (2)

For each j=0,1,2,\cdots,k, compute \displaystyle a^{2^j} \equiv A_j modulo m. Note that each \displaystyle a^{2^j} \equiv A_j is the result of squaring the previous term \displaystyle a^{2^{j-1}} \equiv A_{j-1}. We arrange the calculation in the following table.

    \displaystyle (2) \ \ \ \ \ \ \ \ \ \begin{bmatrix} \text{ i }&\text{ }&a^{2^i}&\text{ }&\text{squaring}&\text{ }&\text{modulo } m&\text{ }&\text{ }  \\\text{ }&\text{ }&\text{ } \\ 0&\text{ }&a^{2^0}&\text{ }&\text{ }&\text{ }&A_0\equiv a&\text{ }&\text{ }  \\ 1&\text{ }&a^{2^1}&\text{ }&\equiv A_0^2&\text{ }&\equiv A_1&\text{ }&\text{ }  \\ 2&\text{ }&a^{2^2}&\text{ }&\equiv A_1^2&\text{ }&\equiv A_2&\text{ }&\text{ }  \\ 3&\text{ }&a^{2^3}&\text{ }&\equiv A_2^2&\text{ }&\equiv A_3&\text{ }&\text{ }  \\ \text{ }&\text{ }&\cdots&\text{ }&\cdots&\text{ }&\cdots&\text{ }&\text{ }  \\ \text{ }&\text{ }&\cdots&\text{ }&\cdots&\text{ }&\cdots&\text{ }&\text{ } \\ \text{ }&\text{ }&\cdots&\text{ }&\cdots&\text{ }&\cdots&\text{ }&\text{ } \\ k-1&\text{ }&a^{2^{k-1}}&\text{ }&\equiv A_{k-2}^2&\text{ }&\equiv A_{k-1}&\text{ }&\text{ } \\ k&\text{ }&a^{2^k}&\text{ }&\equiv A_{k-1}^2&\text{ }&\equiv A_k&\text{ }&\text{ }   \end{bmatrix}

Step (3)

Compute a^w \equiv (\text{mod} \ m) using the following derivation.

    \displaystyle \begin{aligned}(3) \ \ \ \ \ \ \ \ \  a^{w}&=a^{C_0 +C_1 \cdot 2^{1}+C_2 \cdot 2^{2}+\cdots+C_{k-1} \cdot 2^{k-1}+C_k \cdot 2^{k}} \\&=a^{C_0} \cdot a^{C_1 \cdot 2^{1}} \cdot a^{C_2 \cdot 2^{2}} \cdots a^{C_{k-1} \cdot 2^{k-1}} \cdot a^{C_k \cdot 2^{k}} \\&=a^{C_0} \cdot (a^{2^{1}})^{C_1} \cdot (a^{2^{2}})^{C_2} \cdots (a^{2^{k-1}})^{C_{k-1}} \cdot (a^{2^{k}})^{C_k} \\&\equiv A_0^{C_0} \cdot (A_1)^{C_1} \cdot (A_2)^{C_2} \cdots (A_{k-1})^{C_{k-1}} \cdot (A_k)^{C_k} \ \ \ \ \ (\text{mod} \ m) \end{aligned}

The last line in (3) is to be further reduced modulo m. In the actual calculation, only the terms with C_j=1 need to be used.

We now establish an upper bound for the number multiplications. Step (2) requires k multiplications and k divisions to reduce modulo m. Step (3) requires at most k many multiplications since some of the C_j many be zero. Step (3) also requires at most k many divisions to reduce modulo m. So altogether, the algorithm requires at most 2k multiplications and 2k divisions.

From Step (1), we know that \displaystyle 2^k \le w. Take natural log of both sides, we have \displaystyle k \le \frac{\text{ln}(w)}{\text{ln}(2)} and \displaystyle 2 \cdot k \le \frac{2 \cdot \text{ln}(w)}{\text{ln}(2)}. So the fast powering algorithm requires at most

    \displaystyle \frac{2 \cdot \text{ln}(w)}{\text{ln}(2)}

many multiplications and at most that many divisions to compute the congruence calculation a^w \equiv (\text{mod} \ m).

For example, when the power w=2^{127}-1, which is a Mersenne prime, which has 39 digits. Now w \approx 2^{127}. By the above calculation, the fast powering algorithm would take at most 254 multiplications and at most 254 divisions to do the power congruence computation.

The fast powering calculations demonstrated in this post can be done by hand (using a hand-held calculator). In real applications, such calculations should of course be done in a computer.

___________________________________________________________________________________________________________________

Another Example

Use the fast power algorithm to show that

    4030^{2657} \equiv 21144 \ (\text{mod} \ 55049)

    21144^{79081} \equiv 4030 \ (\text{mod} \ 55049)

Note that one congruence is encryption and the other one is decryption. We demonstrate the second calculation.

In doing the second calculation, we use a little bit of help from Fermat’s little theorem. The modulus 55049 is a prime number. So 21144^{55048} \equiv 1 \ (\text{mod} \ 55049). Thus we have:

    21144^{79081}=21144^{24033} \cdot 21144^{55048} \equiv 21144^{24033} \ (\text{mod} \ 55049)

Step (1)

The binary expansions of 24033 are:

    24033=2^0+2^5+2^6+2^7+2^8+2^{10}+2^{11}+2^{12}+2^{14}

Step (2)

Compute 21144^{2^j} modulo 55049 for each j. The computation is displayed in the following table. The rows with * are the results that we need for Step (3).

    \displaystyle \begin{bmatrix} \text{ i }&\text{ }&21144^{2^i}&\text{ }&\text{squaring}&\text{ }&\text{modulo } 55049&\text{ }&\text{ }  \\\text{ }&\text{ }&\text{ } \\ 0&\text{ }&21144^{2^0}&\text{ }&\text{ }&\text{ }&\equiv 21144&\text{ }&*  \\ 1&\text{ }&21144^{2^1}&\text{ }&\equiv 21144^2&\text{ }&\equiv 15807&\text{ }&\text{ }  \\ 2&\text{ }&21144^{2^2}&\text{ }&\equiv 15807^2&\text{ }&\equiv 48887&\text{ }&\text{ }  \\ 3&\text{ }&21144^{2^3}&\text{ }&\equiv 48887^2&\text{ }&\equiv 41483&\text{ }&\text{ }  \\ 4&\text{ }&21144^{2^4}&\text{ }&\equiv 41483^2&\text{ }&\equiv 7549&\text{ }&\text{ }  \\ 5&\text{ }&21144^{2^5}&\text{ }&\equiv 7549^2&\text{ }&\equiv 11686&\text{ }&* \\ 6&\text{ }&21144^{2^6}&\text{ }&\equiv 11686^2&\text{ }&\equiv 41076&\text{ }&* \\ 7&\text{ }&21144^{2^7}&\text{ }&\equiv 41076^2&\text{ }&\equiv 40975&\text{ }&* \\ 8&\text{ }&21144^{2^8}&\text{ }&\equiv 40975^2&\text{ }&\equiv 11174&\text{ }&* \\ 9&\text{ }&21144^{2^9}&\text{ }&\equiv 11174^2&\text{ }&\equiv 7144&\text{ }&\text{ } \\ 10&\text{ }&21144^{2^{10}}&\text{ }&\equiv 7144^2&\text{ }&\equiv 6313&\text{ }&* \\ 11&\text{ }&21144^{2^{11}}&\text{ }&\equiv 6313^2&\text{ }&\equiv 53542&\text{ }&* \\ 12&\text{ }&21144^{2^{12}}&\text{ }&\equiv 53542^2&\text{ }&\equiv 14040&\text{ }&* \\ 13&\text{ }&21144^{2^{13}}&\text{ }&\equiv 14040^2&\text{ }&\equiv 46180&\text{ }&\text{ } \\ 14&\text{ }&21144^{2^{14}}&\text{ }&\equiv 46180^2&\text{ }&\equiv 49189&\text{ }&* \end{bmatrix}

Step (3)

Compute 21144^{79081} \ (\text{mod} \ 55049).

    \displaystyle \begin{aligned} 21144^{24033} &=21144^{2^0} \cdot 21144^{2^5} \cdot 21144^{2^6} \cdot 21144^{2^7} \cdot 21144^{2^{8}} \cdot 21144^{2^{10}} \cdot 21144^{2^{11}} \cdot 21144^{2^{12}} \cdot 21144^{2^{14}} \\&\equiv 21144 \cdot 11686 \cdot 41076 \cdot 40975 \cdot 11174 \cdot 6313 \cdot 53542 \cdot 14040 \cdot 49189 \ \ \text{mod} \ 55049 \\&\equiv 25665 \cdot 40975 \cdot 11174 \cdot 6313 \cdot 53542 \cdot 14040 \cdot 49189 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 55049 \\&\equiv 22328 \cdot 11174 \cdot 6313 \cdot 53542 \cdot 14040 \cdot 49189 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 55049 \\&\equiv 11004 \cdot 6313 \cdot 53542 \cdot 14040 \cdot 49189 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 55049 \\&\equiv 51463 \cdot 53542 \cdot 14040 \cdot 49189 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 55049 \\&\equiv 9300 \cdot 14040 \cdot 49189 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \  \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 55049 \\&\equiv 50821 \cdot 49189 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 55049 \\&\equiv 4030 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{mod} \ 55049   \end{aligned}

___________________________________________________________________________________________________________________

\copyright \ 2013 \text{ by Dan Ma}

Fermat’s Little Theorem and RSA Algorithm

RSA is a cryptographic algorithm that is used to send and receive messages. We use the Fermat’s Little Theorem to prove that RSA works correctly and accurately. In other words, the decrypted message is indeed the original message from the sender. Mathematically we show that applying the encryption function and the decryption function successively produces the identity function.

To see how RSA works, see the previous post An Illustration of the RSA Algorithm.

___________________________________________________________________________________________________________________

RSA Algorithm

We first briefly describe the algorithm and then present the mathematical statement to validate.

Let N=p \cdot q where p and q are two prime numbers. Let \phi=(p-1) \cdot (q-1). Choose an integer e with 1<e<N such that e and \phi are relatively prime.

The public key consists of N and e where e is the encryption key. Once it is published, anyone can use it to encrypt messages to send to the creator of the public key. The following is the encryption function:

    f(M) \equiv M^e \ (\text{mod} \ N)

where M is a positive integer and is the original message.

The private key is a positive integer d that satisfies:

    d \cdot e \equiv 1 \ (\text{mod} \ \phi=(p-1) \cdot (q-1))

In other words, d is the multiplicative inverse of e in the modular arithmetic of modulo \phi. The above condition is equivalent to: de-1=(p-1) \cdot (q-1) \cdot k for some integer k.

The number d is the decryption key that will be used to decode messages. So it should remain private.

Once the creator of the public key receives an encrypted message C=f(M), he or she uses the following decryption function to obtain the original message M.

    g(C) \equiv C^d \ (\text{mod} \ N)

___________________________________________________________________________________________________________________

The Mathematical Statement to Validate

What we prove is that the decryption function is to undo the encryption function. Specifically, we prove the following:

    g(C)=g(f(M))=(M^e)^d=M^{ed} \equiv M \ (\text{Mod} \ N) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)

In other words, applying the decryption function g to the encryption function f produces the original message.

___________________________________________________________________________________________________________________

Fermat’s Little Theorem

In this section, we list out the tools we need to prove the correctness of RSA.

Theorem 1 (Fermat’s Little Theorem)
If p is a prime number and a is an integer such that a and p are relatively prime, then

    a^{p-1}-1 is an integer multiple of p

    or equivalently a^{p-1} \equiv 1 \ (\text{mod} \ p).

For a proof of Fermat’s little theorem, see this post.

Lemma 2 (Euclid’s Lemma)
Let a, b and d be integers where d \ne 0. Then if d divides a \cdot b (symbolically d \lvert a \cdot b), then either d \lvert a or d \lvert b.

Euclid’s Lemma is needed to prove the following Lemma.

Lemma 3
Let M be an integer. Let p and q be prime numbers with p \ne q.

Then if a \equiv M \ (\text{mod} \ p) and a \equiv M \ (\text{mod} \ q), then a \equiv M \ (\text{mod} \ p \cdot q).

Proof of Lemma 3
Suppose we have a \equiv M \ (\text{mod} \ p) and a \equiv M \ (\text{mod} \ q). Then for some integers i and j, we have:

    a=M+p \cdot i and a=M+q \cdot j.

Then p \cdot i=q \cdot j. This implies that p divides q \cdot j (p \lvert q \cdot j). By Euclid’s lemma, we have either p \lvert q or p \lvert j. Since p and q are distinct prime numbers, we cannot have p \lvert q. So we have p \lvert j and that j=p \cdot w for some integer w.

Now, a=M+q \cdot j=M+q \cdot p \cdot w, implying that a \equiv M \ (\text{mod} \ p \cdot q). \blacksquare

___________________________________________________________________________________________________________________

The Proof of (1)

We now prove the property (1) described above. We show that

    (M^e)^d=M^{ed} \equiv M \ (\text{Mod} \ N=p \cdot q)

We first show that M^{ed} \equiv M \ (\text{Mod} \ p) and M^{ed} \equiv M \ (\text{Mod} \ q). Then the desired result follows from Lemma 3.

To show M^{ed} \equiv M \ (\text{Mod} \ p), we consider two cases: M \equiv 0 \ (\text{Mod} \ p) or M \not \equiv 0 \ (\text{Mod} \ p).

Case 1. M \equiv 0 \ (\text{Mod} \ p). Then M is an integer multiple of p, say M=p \cdot w where w is an integer. Then M^{ed}=(p \cdot w)^{ed}=p \cdot p^{ed-1} \cdot w^{ed}. So both M and M^{ed} are integer multiples of p. Thus M^{ed} \equiv M \ (\text{Mod} \ p).

Case 2. M \not \equiv 0 \ (\text{Mod} \ p). This means that p and M are relatively prime (having no common divisor other than 1). Thus we can use Fermat’s Little Theorem. We have M^{p-1} \equiv 1 \ (\text{mod} \ p).

From the way the decryption key d is defined above, we have ed-1=(p-1) \cdot (q-1) \cdot k for some integer k. We then have:

    \displaystyle \begin{aligned} M^{ed}&=M^{ed-1} \cdot M \\&=M^{(p-1) \cdot (q-1) \cdot k} \cdot M \\&=(M^{p-1})^{(q-1) \cdot k} \cdot M \\&\equiv (1)^{(q-1) \cdot k} \cdot M \ (\text{Mod} \ p) \ * \\&\equiv M \ (\text{Mod} \ p) \end{aligned}

At the step with *, we apply Fermat’s Little Theorem. So we have M^{ed} \equiv M \ (\text{Mod} \ p).

The same reason reasoning can show that M^{ed} \equiv M \ (\text{Mod} \ q).

By Lemma 3, it follows that M^{ed} \equiv M \ (\text{Mod} \ N=p \cdot q). \blacksquare

___________________________________________________________________________________________________________________

\copyright \ 2013 \text{ by Dan Ma}

Revised August 9, 2014.

An Illustration of the RSA Algorithm

The following 617-digit number (all 617 digits starting with 251 and ending in 357) is a product of two prime numbers p and q. How long will it take to find these two prime factors?

    25195908475657893494027183240048398571429282126204
    03202777713783604366202070759555626401852588078440
    69182906412495150821892985591491761845028084891200
    72844992687392807287776735971418347270261896375014
    97182469116507761337985909570009733045974880842840
    17974291006424586918171951187461215151726546322822
    16869987549182422433637259085141865462043576798423
    38718477444792073993423658482382428119816381501067
    48104516603773060562016196762561338441436038339044
    14952634432190114657544454178424020924616515723350
    77870774981712577246796292638635637328991215483143
    81678998850404453640235273819513786365643912120103
    97122822120720357

The above number is a challenge to the public to come up with the two prime factors (see RSA challenge number and RSA factoring challenge). No one has been able to successfully meet the challenge. In fact, even with the current state of the art in supercomputing technology, it could take millions of years to factor a number as big as the one shown above. According to the organization that posed the challenge (RSA Laboratories), barring fundamental algorithmic or computing advances, the above number or other similarly sized number is expected to stay unfactored in the decades to come.

RSA is a cryptographic algorithm that is used to send and receive sensitive data. The name RSA stands for the last names of the creators of the algorithm – Ron Rivest, Adi Shamir and Leonard Adleman.

The algorithm uses a pair of large prime numbers in setting up a public key to encrypt and a private key to decrypt data. The public key is known to the public and includes a large integer N (such as the one indicated at the beginning of the post) that is a product of two prime numbers p and q. The private key is computed using two prime factors p and q and is needed to decrypt the messages encrypted by the public key. The RSA scheme is built on the monumental difficulty in factoring large numbers. Barring some system vulnerability that hackers can exploit, it is all but certain that messages sent via RSA scheme can only be read by the intended party – the legitimate party that possesses the private key.

This post gives an illustration of how the RSA algorithm work using much smaller prime numbers. The RSA algorithm is also an application of the Fermat’s Little Theorem (see the post Fermat’s Little Theorem and RSA Algorithm).

The example given below makes use of congruence arithmetic. If a, b and m are integers, the symbol a \equiv b \ (\text{mod } m) means that the difference a-b is divisible by m (we say that a is congruent to b modulo m). For example, 15 \equiv 3 \ (\text{mod } 12) (in terms of clock arithmetic, 6 hours after 9 o’clock is not 15th o’clock but is 3 o’clock). For more information about congruence relation and congruence arithmetic, see this blog post or this Wikipedia entry.

___________________________________________________________________________________________________________________

Example of RSA

The example uses small prime numbers to make the illustration easy to follow. The example has the following steps:

  1. Andy creates a public key and a private key. He gives the public key to Becky and keeps the private key to himself.
  2. Becky then uses the public key to encrypt a message and sends it to Andy.
  3. Andy then uses the private key to decrypt the message received from Becky.
  4. \text{ }

______________________________________________________
Step 1 – Generating the Public Key and Private Key

Choose two prime numbers p and q. In this example, let p=29 and q=47. Then calculate the following numbers:

    N=p \times q=1363

    (p-1) \times (q-1)=1288

Choose an integer e with 1<e<N such that e and (p-1) \times (q-1) are relatively prime (meaning the only common divisor is 1). For this example, Andy chooses e=11.

The public key consists of N=1363 and e=11, which Andy passes to Becky. Becky will use the public key to encrypt any message that she wishes to send to Andy (see Step 2 below). The number e is the encryption key.

Andy can also make the public key known to any one from whom Andy wishes to receive message.

The private key is the number d such that:

    e \cdot d \equiv 1 \ (\text{mod } (p-1) \cdot (q-1))

In this example, find the number d such that

    11 \cdot d \equiv 1 \ (\text{mod } 1288)

Specifically, we need to find the number d that satisfies 11d=1+1288 k for some integer k. When k=10, we have a solution d=1171.

The number d=1171 will be used to decrypt any message that will come from Becky. So Andy needs to keep d private and secure.

______________________________________________________
Step 2 – Encrypting a Message using the Public Key

Becky wishes to send a message T (in text) to Andy. Becky first translates T into an integer M (e.g. using a mapping to translate strings of letters and other symbols into blocks of numbers). This same mapping will be used by Andy to translate the decrypted message M back to the text message T. Note that the mapping for translating letters into numbers and back is not part of the RSA algorithm, which only encrypts and decrypts numeric messages.

Suppose the message is M=289. The following is the encryption function – the function that generates the encrypted message C

    f(M) \equiv M^e \ (\text{mod } N)

In this example, the encryption function is:

    f(M) \equiv 289^{11} \ (\text{mod } 1363)

The encrypted message is the number C=f(M) is the remainder that is obtained when 289^{11} is divided by N=1363. This can be done in a process using congruence arithemtic that can be described as “divide and conquer”. The process is to reduce the exponent by half in each step of the step by step approach.

First, note that 289^2 \equiv 378 \ (\text{mod} \ 1363) since 378 is the remainder when 289^2 is divided by 1363 (i.e. division algorithm). Then use the division algorithm successively as shown in the following:

    \displaystyle \begin{aligned} 289^{11}&\equiv (289^2)^5 \cdot 289 \ (\text{mod} \ 1363) \\&\text{ } \\&\equiv 378^5 \cdot 289 \equiv (378^2)^2 \cdot 378 \cdot 289 \\&\text{ } \\& \equiv 1132^2 \cdot 378 \cdot 289 \\&\text{ } \\&\equiv 1132^2 \cdot 202 \\&\text{ } \\&\equiv 204 \cdot 202 \\&\text{ } \\&\equiv 318 \ (\text{mod} \ 1363) \end{aligned}

In the “divide and conquer” approach, a congruence calculation in done in each step to reduce the exponent (applying the division algorithm). For example, to go from the first line to the second line, the remainder 378 is obtained when 289^2 is divided by 1363. The number 318 is the remainder when 204 \cdot 202 is divided by 1363.

______________________________________________________
Step 3 – Decrypting the Message Receiving from Becky

Andy receives the encrypted message C=318. To decrypt C, the private key d that is calculated in Step 1 above is used. The following is the decryption function – the function that generates the decrypted message M.

    g(C) \equiv C^d \ (\text{mod } N)

In particular, Any needs to find the number M=g(C) that satisfies the following:

    g(C) \equiv 318^{1171} \ (\text{mod } 1363)

The decrypted message M=g(C) is the remainder when 318^{1171} is divided by 1363. A “divide and conquer” approach can be used.

    \displaystyle \begin{aligned} 318^{1171}&\equiv (318^2)^{585} \cdot 318 \ (\text{mod} \ 1363) \\&\text{ } \\&\equiv 262^{585} \cdot 318 \equiv (262^2)^{292} \cdot 262 \cdot 318 \\&\text{ } \\& \equiv 494^{292} \cdot 262 \cdot 318 \\&\text{ } \\&\equiv 494^{292} \cdot 173 \equiv (494^2)^{146} \cdot 173 \\&\text{ } \\& \equiv 59^{146} \cdot 173 \equiv (59^2)^{73} \cdot 173 \\&\text{ } \\& \equiv 755^{73} \cdot 173 \equiv (755^2)^{36} \cdot 755 \cdot 173 \\&\text{ } \\& \equiv 291^{36} \cdot 755 \cdot 173 \\&\text{ } \\&\equiv 291^{36} \cdot 1130 \equiv (291^2)^{18} \cdot 1130 \\&\text{ } \\& \equiv 175^{18} \cdot 1130 \equiv (175^2)^{9} \cdot 1130 \\&\text{ } \\& \equiv 639^{9} \cdot 1130 \equiv (639^2)^{4} \cdot 639 \cdot 1130 \\&\text{ } \\& \equiv 784^{4} \cdot 639 \cdot 1130 \\&\text{ } \\&\equiv 784^{4} \cdot 1043 \equiv (784^2)^{2} \cdot 1043 \\&\text{ } \\& \equiv 1306^{2} \cdot 1043 \\&\text{ } \\&\equiv 523 \cdot 1043  \\&\text{ } \\&\equiv 289 \ (\text{mod} \ 1363) \end{aligned}

We just illustrates how Andy decrypts Becky’s message of C=318 to find the original message of M=289. The calculation is tedious if done manually but is simple for a computer.

___________________________________________________________________________________________________________________

Summary

We summarize the main steps in the above example.

______________________________________________________
Step 1 – Generating the Public Key and Private Key

Choose two prime numbers p and q. Then calculate the following numbers:

    N=p \times q

    (p-1) \times (q-1)

    Choose an integer e with 1<e<N such that e and (p-1) \times (q-1) are relatively prime (meaning the only common divisor is 1).

The public key consists of N and e, which Andy passes to Becky. Becky will use the public key to encrypt any message that she wishes to send to Andy. The number e is the encryption key.

Andy can also make the public key known to any one from whom Andy wishes to receive message.

The private key is the number d such that:

    e \cdot d \equiv 1 \ (\text{mod } (p-1) \cdot (q-1))

The number d is the decryption key and will be used to decrypt any message that will come from Becky. So Andy needs to keep d private and secure.

______________________________________________________
Step 2 – Encrypting a Message using the Public Key

Becky wishes to send a message T (in text) to Andy. Becky first translates T into an integer M (e.g. using a mapping to translate strings of letters and other symbols into blocks of numbers). This same mapping will be used by Andy to translate the decrypted message M back to the text message T.

The following is the encryption function – the function that generates the encrypted message C.

    f(M) \equiv M^e \ (\text{mod } N)

The encrypted message that Becky will send to Andy is C=f(M).

______________________________________________________
Step 3 – Decrypting the Message Receiving from Becky

Andy receives the encrypted message C. To decrypt C, the private key d that is calculated in Step 1 above is used. The following is the decryption function – the function that generates the decrypted message M.

    g(C) \equiv C^d \ (\text{mod } N)

The original message sent from Becky is M=g(C).

Note that M=g(f(M)). Thus the function f is the inverse function of g (and vice versa). This fact can be established by using Fermat’s Little Theorem (see the post Fermat’s Little Theorem and RSA Algorithm).

___________________________________________________________________________________________________________________

RSA Exercise

Suppose you are Andy. You set p=53 and q=67. You also choose e=7.

You give the public key N=3551 and e=7.

Becky sends you an encrypted message C=2691.

What is the original message?

___________________________________________________________________________________________________________________

More Information about RSA

A lot of information about RSA can be found online. Here’s a few useful links for introductory information.

___________________________________________________________________________________________________________________

\copyright \ 2013 \text{ by Dan Ma}

Revised August 9, 2014.