Speeding up modular exponentiation using CRT

This is the fifth post in a series of posts on the Chinese remainder theorem (CRT). When solving a congruence equation with a composite modulus, it is often easier to convert the problem to one of solving several congruence equations with smaller moduli that are primes or powers of primes. Then combine the individual solutions using the Chinese remainder theorem. In this post, we demonstrate this process for modular exponentiation x \equiv c^d \ (\text{mod} \ m) where the exponent d and the modulus m are large. Given x \equiv c^d \ (\text{mod} \ m), the algorithm discussed here is to produce a system of equations with smaller moduli and exponents that give the same answer as for the original problem.

The previous posts in the series on CRT: first post; second post; third post; fourth post

____________________________________________________________________________

Preliminary discussion

The exponentiation c^d is usually programmed using the fast powering algorithm. The CRT method will convert the exponentiation c^d to several exponentiations that involve much smaller exponents and moduli, thus greatly reducing the calculation time, in particular speeding up the fast powering algorithm. One application of the CRT method is to improve the run time of the decryption process in the RSA algorithm (up to four times faster).

As already mentioned, the goal of the algorithm discussed here is to produce an equivalent system of linear congruence equations. Once this system is produced, the Chinese remainder theorem only guarantees a solution and does not actually produce a solution. So we need to know how to Chinese remainder, i.e. using an algorithm for solving a simultaneous linear congruences. We can use the one discussed here (the first method) or here (the second method). The following examples are worked using the second method.

The CRT algorithm discussed here makes use of Euler’s theorem, which states that a^{\phi(m)} \equiv 1 \ (\text{mod} \ m) whenever a and the modulus m are relatively prime where \phi(m) is the phi function evaluated at m. For this reason, the algorithm requires the evaluation of the phi function.

When calculating c^d \ (\text{mod} \ m), the use of the phi function \phi(m) is to reduce the exponent d by the largest multiple of \phi(m). For example, since \phi(17)=16 and 2^{16} \equiv 1 \ (\text{mod} \ 17), the problem of 2^{250} \ (\text{mod} \ 17) is converted to finding 2^{10} \ (\text{mod} \ 17). Here, the original exponent of 250 is reduced to 10 after taking out the largest multiple of 16. Note that 10 \equiv 250 \ (\text{mod} \ \phi(17)=16). In general, we want to replace the original exponent d by a smaller exponent d_1 where d_1 \equiv d \ (\text{mod} \ \phi(m)).

____________________________________________________________________________

Examples

In the following three examples, we use the algorithm discussed here to solve systems of linear congruence equations (this is the iterative approach). These examples are by no means realistic since the numbers used are small. So they are for demonstration of how CRT works.

Example 1
Calculate x \equiv 2^{3163} \ (\text{mod} \ 3969).

First, factor the modulus 3969=3^4 \times 7^2=81 \times 49. Now the problem is converted to solving the following system of two equations:

    x \equiv 2^{3163} \ (\text{mod} \ 81)

    x \equiv 2^{3163} \ (\text{mod} \ 49)

By CRT, any solution to the two equations is also a solution to the original equation. However, the exponent of 3163 should first be reduced. To this end, calculate the phi function, where \phi(3^4)=3^2 \cdot (3-1)=54 and \phi(7^2)=7 \cdot (7-1)=42. We should reduce from 3163 the largest multiple of 54 in the first equation and reduce the largest multiple of 42 in the second equation. In other words, reduce the exponent 3163 modulo the two phi function values:

    3163 \equiv 31 \ (\text{mod} \ 54)

    3163 \equiv 13 \ (\text{mod} \ 42)

As a result, we solve the following two equations:

    x \equiv 2^{3163} \equiv 2^{31} \equiv 65 \ (\text{mod} \ 81)

    x \equiv 2^{3163} \equiv 2^{13} \equiv 9 \ (\text{mod} \ 49)

Note that the original exponentiation 2^{3163} is turned into the easier ones of 2^{31} and 2^{13}. The following gives the solution to the above two equations.

    \displaystyle \begin{aligned} x_0&=65+81 \cdot 23 \cdot (9-65) \ (\text{mod} \ 3969)  \\&\equiv -104263 \ (\text{mod} \ 3969) \\&\equiv 2900 \ (\text{mod} \ 3969) \end{aligned}

    where 23 is obtained by solving for y in 81y \equiv 1 \ (\text{mod} \ 49)

By CRT, the answer to the original problem is 2^{3163} \equiv 2900 \ (\text{mod} \ 3969). \square

Example 2
Calculate x \equiv 3^{3163} \ (\text{mod} \ 3969).

In this example, the number 3 and the modulus 3969 are not relatively prime. The CRT method still applies. As in Example 1, the problem can be reduced in the following way:

    x \equiv 3^{3163} \equiv 3^{31} \equiv 0 \ (\text{mod} \ 81)

    x \equiv 2^{3163} \equiv 3^{13} \equiv 10 \ (\text{mod} \ 49)

The first equation is congruent to 0 since 3^{31} contains 81 as a factor. The following gives the solution to the above two equations:

    \displaystyle \begin{aligned} x_0&=0+81 \cdot 23 \cdot (10-0) \ (\text{mod} \ 3969)  \\&\equiv 18630 \ (\text{mod} \ 3969) \\&\equiv 2754 \ (\text{mod} \ 3969) \end{aligned}

By CRT, the answer to the original problem is 3^{3163} \equiv 2754 \ (\text{mod} \ 3969). \square

Example 3
The above 2 examples use small numbers to illustrate the CRT technique. In this example, we use slightly larger numbers. Calculate x \equiv 355^{d} \ (\text{mod} \ m) where d=\text{1,759,695,794} and m=\text{3,055,933,789}=1277 \cdot 1439 \cdot 1663.

As in the other examples, we break up the problem in three congruences. The three factors of the modulus are prime numbers. Thus we reduce the exponent d by multiples of a prime factor less one.

    \text{1,759,695,794} \equiv 1198 \ (\text{mod} \ 1276)

    \text{1,759,695,794} \equiv 814 \ (\text{mod} \ 1438)

    \text{1,759,695,794} \equiv 110 \ (\text{mod} \ 1662)

Then the original problem is transformed to solving the following three equations.

    x \equiv 355^{d} \equiv 355^{1198} \equiv 189 \ (\text{mod} \ 1277)

    x \equiv 355^{d} \equiv 355^{814} \equiv 1010 \ (\text{mod} \ 1439)

    x \equiv 355^{d} \equiv 355^{110} \equiv 315 \ (\text{mod} \ 1663)

Notice that the original exponentiation is transformed to the smaller ones of 355^{1198}, 355^{814} and 355^{315}. The remaining task is to solve the system of three equations. One way to find to solution to the above three equations is to use the iterative approach, starting with the solution x_1=189 to the first equation. Then find the solution x_2 to the first two equations and then the solution x_3 to all three equations.

    \displaystyle \begin{aligned} x_2&=189+1277 \cdot 151 \cdot (1010-189) \ (\text{mod} \ 1277 \cdot 1439)  \\&\equiv 158311156 \ (\text{mod} \ 1837603) \\&\equiv 277298 \ (\text{mod} \ 1837603) \end{aligned}

    where 151 is obtained by solving for y in 1277y \equiv 1 \ (\text{mod} \ 1439)

    \displaystyle \begin{aligned} x_3&=277298+(1277 \cdot 1439) \cdot 970 \cdot (315-277298) \ (\text{mod} \ 1277 \cdot 1439 \cdot 1663)  \\&\equiv 277298+1782474910 \cdot (-276983) \ (\text{mod} \ m) \\&\equiv 277298+1414954310 \ (\text{mod} \ m) \\&\equiv 1415231608 \ (\text{mod} \ m) \end{aligned}

    where 970 is obtained by solving for y in (1277 \cdot 1439) y \equiv 1 \ (\text{mod} \ 1663)

By CRT, the answer to the original problem is 355^{d} \equiv 1415231608 \ (\text{mod} \ m). \square

____________________________________________________________________________

The CRT algorithm to speed exponentiation

Suppose we wish to evaluate x \equiv c^{d} \ (\text{mod} \ m) where the prime factorization of m is m=p_1^{n_1} \cdot p_2^{n_2} \cdots p_t^{n_t}. The numbers p_i are distinct prime numbers and each exponent n_i \ge 1. To prepare for the calculation, do the following:

    Let m_i=p_i^{n_i} for each i.

    Calculate \phi(m_i) for each i.

Case 1. The base c and the modulus m are relatively prime.
Then the answer to the original exponentiation problem is identical to the solution to the following system of t congruence equations:

    x \equiv c^{d_1} \ (\text{mod} \ m_1)

    x \equiv c^{d_2} \ (\text{mod} \ m_2)

    \cdots

    x \equiv c^{d_t} \ (\text{mod} \ m_t)

where d_i \equiv d \ (\text{mod} \ \phi(m_i)) for each i. If possible, each c^{d_i} should be reduced modulo m_i.

Case 2. The base c and the modulus m are not relatively prime.
In this case, c and m have prime factors in common (at least one p_i). The idea here is that for any p_i that is a prime factor of c, the equation x \equiv c^{d_i} \ (\text{mod} \ m_i) in Case 1 is replaced by x \equiv 0 \ (\text{mod} \ m_i). Then solve the resulting system of equations (see Example 2). Essentially, Case 2 can fall under Case 1 with c^{d_i} being congruent to zero. We call out Case 2 for the sake of clarity.

The original exponentiation c^d boils down to solving an appropriate system of CRT congruences as described above. Once the equivalent system of congruences is set up, use the algorithm discussed here or here to do Chinese remaindering.

Comment
Both Case 1 and Case 2 produce a system of linear congruence equations that have identical solution to the original equation. This is a result of using CRT (see Theorem D and Theorem G here). The savings in the calculation come in the form of the smaller exponentiations in the resulting congruence equations.

In Case 2, some of the congruence equations are x \equiv 0. This is because the base c and some moduli m_i are not relatively prime. For these moduli, c^{d_i} would contain p_i^{n_i} as a factor (assuming that d_i \ge n_i). Hence, x \equiv c^{d_i} \equiv 0 \ (\text{mod} \ m_i).

The two-equation case
When the modulus is the product of two factors that are relatively prime, the CRT algorithm involves only two equations. We write out the solution explicitly for this case. To evaluate x \equiv c^{d} \ (\text{mod} \ m) where m=m_1 \cdot m_2 and m_1 and m_2 are relatively prime, solve the following system of two equations,

    x \equiv c^{d_1} \ (\text{mod} \ m_1)

    x \equiv c^{d_2} \ (\text{mod} \ m_2)

The solution is x \equiv  c^{d_1}+m_1 \cdot v_1 \cdot (c^{d_2}-c^{d_1}) \ (\text{mod} \ m) where v_1 is the multiplicative inverse of m_1 modulo m_2. If possible, each c^{d_i} should be reduced modulo m_i.

____________________________________________________________________________

RSA application

The algorithm to speed the exponentiation is possible because the factorization of the modulus m is known (as a result, the values of \phi(m_i) are known). Knowing the values of \phi(m_i) makes it possible to reduce the large exponent d. For this reason, the decryption process in the RSA algorithm is a perfect place to apply the CRT technique described here.

With the RSA cryptosystem, a public key consists of N and e, where N is a large modulus that is a product of two large primes p and q (the two primes are not published) and e is the encryption exponent. Say Bob is the originator of the RSA public key. Bob also generates a private key, which is a number d that is used for decrypting any messages that he receives. The number d must be kept private. The prime factors p and q must also be kept secret since knowing p and q can derive d.

Suppose that Alice has a message to send to Bob. She can do so using the published key of N and e through the exponentiation c \equiv m^e \ (\text{mod} \ N). Here, m is the plaintext (the message to be sent) and c is the ciphertext (the encrypted message). Upon receiving the ciphertext c, Bob can then decrypt through the exponentiation m \equiv c^d \ (\text{mod} \ N) where d is the decryption exponent. In realistic RSA calculation, the public modulus N and the private decryption exponent d are large integers (N is at minimum a 2048-bit number). With CRT, the decryption can be reduced to two much smaller exponentiations. The effect can be at least four times faster.

We illustrate this with an example. This is a toy example since the numbers used are small. It is only intended as an illustration.

Example 4
Suppose the public key consists of N=\text{17,086,049} and e=65537. Bob has the additional information of N=p \cdot q where p=3863 and q=4423, which are kept secret. Knowing p and q allows Bob to compute d=\text{5,731,241}. Suppose that Bob receives a message c= 4831984 from Alice. Use the CRT approach to find the plaintext m.

The exponentiation is m \equiv 4831984^{5731241} \ (\text{mod} \ N), which is equivalent to the following two equations by CRT:

    m \equiv 4831984^{5731241} \ (\text{mod} \ 3863)

    m \equiv 4831984^{5731241} \ (\text{mod} \ 4423)

which is further simplified to:

    m \equiv 4831984^{5731241} \equiv 4831984^{33} \equiv 3084 \ (\text{mod} \ 3863)

    m \equiv 4831984^{5731241} \equiv 4831984^{329} \equiv 1436 \ (\text{mod} \ 4423)

where 33 \equiv 5731241 \ (\text{mod} \ 3862) and 329 \equiv 5731241 \ (\text{mod} \ 4422). Then the following is the plaintext (the original message).

    \displaystyle \begin{aligned} m&=3084+3863 \cdot 1319 \cdot (1436-3084) \ (\text{mod} \ 3863 \cdot 4423)  \\&\equiv 3084+5095297 \cdot (-1648) \ (\text{mod} \ N) \\&\equiv 9289736 \ (\text{mod} \ N) \end{aligned}

    where 1319 is obtained by solving for y in 3683y \equiv 1 \ (\text{mod} \ 4423)

The answer is 4831984^{5731241} \equiv 9289736 \ (\text{mod} \ N). \square

____________________________________________________________________________

Closing comment

In conclusion, we state the explcit formula for providing the CRT answer to the RSA decryption.

    m \equiv  c^{d_p}+p \cdot p_{inv} \cdot (c^{d_q}-c^{d_p}) \ (\text{mod} \ p \cdot q) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)

where d_p \equiv d \ (\text{mod} \ p-1), d_q \equiv d \ (\text{mod} \ q-1) and p_{inv} is the multiplicative inverse of p modulo q.

The decryption formula of (1) represends tremendous saving in calculation (up to four times faster). It is possible only for the holder of the RSA private key. It requires knowledge of the decryption exponent d, which is calculated from the factors p and q of the modulus N.
____________________________________________________________________________
\copyright \ 2015 \text{ by Dan Ma}

Advertisements

How to Chinese remainder, part 2

This is the fourth post in a series of posts on the Chinese remainder theorem (usually abbreviated as CRT). The previous post presents an algorithm for solving systems of linear congruence equations, which is based on a constructive proof of CRT. This post presents another algorithm that is based on another constructive proof of CRT. Links to the previous posts in the series: first post, second post, third post.

The algorithm discussed here is to solve the following is the version of the Chinese remainder theorem. This version has been shown to be equivalent to several other versions in this previous post.

Theorem G (Chinese Remainder Theorem)
Suppose m_1,m_2,\cdots,m_t are positive integers that are pairwise relatively prime. Let m=m_1 \cdot m_2 \cdots m_t. For any sequence of integers c_1,c_2,\cdots,c_t, the following system of linear congruence equations

    x \equiv c_1 \ (\text{mod} \ m_1)

    x \equiv c_2 \ (\text{mod} \ m_2)

      \cdots

    x \equiv c_{t-1} \ (\text{mod} \ m_{t-1})

    x \equiv c_t \ (\text{mod} \ m_t)

has a unique solution modulo m=m_1 \cdot m_2 \cdots m_t.

____________________________________________________________________________

Discussion of Proof/Algorithm

To understand the algorithm, it is a good idea to understand the constructive proof of CRT on which the algorithm is based. The constructive proof is to build up a solution to the system of equations in CRT in an iterative fashion. The first equation (stated in the above statement of CRT) has a solution, which is x_1=c_1. Then the solution x_1 is used to build a solution to the first two equations, say x_2, which is then used to build the solution to the first three equations, and so on. So this constructive proof of a solution is an inductive proof.

Let’s look at the case of two equations: x \equiv c_1 \ (\text{mod} \ m_1) and x \equiv c_2 \ (\text{mod} \ m_2). Then x_1=c_1 is the solution to the first equation. Express the equation x_1 \equiv c_1 \ (\text{mod} \ m_1) as c_1=x_1+m_1 y for some integer y. We would like x_1+m_1 y to be the solution to the second equation too. Then we need to solve for y in the following equation

    x_1+m_1 \cdot y \equiv c_2 \ (\text{mod} \ m_2) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (*)

Once we know what y is, x_1+m_1 \cdot y is the desired solution to the system of two equations. To solve for y, subtract x_1 on both sides of (*) and then multiply both sides by the inverse of m_1. Then y=v_1 \cdot (c_2-x_1) where v_1 is the multiplicative inverse of m_1 modulo m_2. The desired solution is x_2 where x_2=x_1+m_1 \cdot v_1 \cdot (c_2-x_1). From the way it is set up, x_2 is a solution to both equations.

Let’s look at the case of t equations as described in the statement of Theorem F. Let x_{t-1} be the solution to the first t-1 equations. We will use this to build x_{t}, the solution to all the t equations. The key is to solve for y in the equation

    x_{t-1}+m_1 \cdot m_2 \cdots m_{t-1} \cdot y \equiv c_{t} \ (\text{mod} \ m_{t})

Then y=v_{t-1} \cdot (c_{t}-x_{t-1}) where v_{t-1} is the multiplicative inverse of m_1 \cdot m_2 \cdots m_{t-1} modulo m_{t}. Then the following

    x_{t}=x_{t-1}+(m_1 \cdot m_2 \cdots m_{t-1}) \cdot v_{t-1} \cdot (c_{t}-x_{t-1})

It is clear that x_{t} \equiv x_{t-1} \equiv c_i \ (\text{mod} \ m_i) for i=1,\cdots,t-1. This is because the term (m_1 \cdot m_2 \cdots m_{t-1}) \cdot v_{t-1} \cdot (c_{t}-x_{t-1}) in x_{t} vanishes. From the way x_{t} is set up, it is clear that x_{t} \equiv c_{t} \ (\text{mod} \ m_{t}).

The inductive argument just presented establishes that the system of linear congruence equations in Theorem F always has a solution. An algorithm can be created based on this inductive proof.

____________________________________________________________________________

An algorithm for CRT

Assume that the system has t equations, as described and notated in the statement of Theorem F above. The algorithm is then to build a series of numbers

    x_1, x_2, \cdots, x_j, \cdots, x_t \ \ \ \ \ \ 2 \le j \le t

leading to the desired solution x_t where x_1=c_1 and x_j is the solution to the first j equations and is obtained using x_{j-1} as follows:

    x_{j}=x_{j-1}+(m_1 \cdot m_2 \cdots m_{j-1}) \cdot v_{j-1} \cdot (c_{j}-x_{j-1}) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)

    v_{j-1} is defined such that (m_1 \cdot m_2 \cdots m_{j-1}) \cdot v_{j-1} \equiv 1 \ (\text{mod} \ m_j)

Note that the number v_{j-1} is the multiplicative inverse of m_1 \cdot m_2 \cdots m_{j-1} modulo m_j. The desired solution x_t may have to be reduced modulo m=m_1 \cdot m_2 \cdots m_{t}.

In a computer implementation, the algorithm in (1) is carried out iteratively to find the solution. If the calculation is to be carried by hand as an exercise, an alternative to memorizing the formula (1) is to know that each step x_{j} is obtained from solving for y in the following equation:

    x_j \equiv x_{j-1}+m_1 \cdot m_2 \cdots m_{j-1} \cdot y \equiv c_{j} \ (\text{mod} \ m_{j})

Comment
If there are t equations, the algorithm as described here requires the finding of t-1 applications of the extended Euclidean algorithm (for the multiplicative inverse in each induction step). If the given equations are in the form a_i \cdot x \equiv b_i \ (\text{mod} \ m_i), then t additional applications of the extended Euclidean algorithm are required to convert the equations to the form x \equiv c_i \ (\text{mod} \ m_i). Overall, the amount of computation is comparable to the algorithm discussed in the previous post.

The iterative nature of this algorithm is useful in the situations where the solution to a smaller system is known. For example, if we add an additional equation to an existing system of equations, then the solution to the new system can be found based on the solution to the smaller system. In particular, with one additional application of the extended Euclidean algorithm, the new solution can be found. The following two examples demonstrate how this works.

____________________________________________________________________________

Examples

Example 1
Solve the following system of linear congruence equations.

    x \equiv 3 \ (\text{mod} \ 17)

    x \equiv 10 \ (\text{mod} \ 21)

    x \equiv 15 \ (\text{mod} \ 29)

This is Example 1 from this previous post. The results are x_1=3, x_2=241 and x_3=7381 as derived below:

    \displaystyle \begin{aligned} x_2&\equiv 3+17 \cdot 5 \cdot (10-3) \ (\text{mod} \ 17 \cdot 21) \\&\equiv 598 \ (\text{mod} \ 357) \\&\equiv 241 \ (\text{mod} \ 357)  \end{aligned}

    where 5 is obtained by solving 17y \equiv 1 \ (\text{mod} \ 21)

    \displaystyle \begin{aligned} x_3&\equiv 241+(17 \cdot 21) \cdot 13 \cdot (15-241) \ (\text{mod} \ 17 \cdot 21 \cdot 29) \\&\equiv -1048625 \ (\text{mod} \ 10353) \\&\equiv 7381 \ (\text{mod} \ 10353)  \end{aligned}

    where 13 is obtained by solving (17 \cdot 21) y \equiv 1 \ (\text{mod} \ 29)

Example is now completed. \square

Example 2
To illustrate the iterative nature of the algorithm, we add an addition equation to the system in Example 1. Solve the following system of linear congruence equations.

    x \equiv 3 \ (\text{mod} \ 17)

    x \equiv 10 \ (\text{mod} \ 21)

    x \equiv 15 \ (\text{mod} \ 29)

    x \equiv 16 \ (\text{mod} \ 31)

From Example 1, the solution to the first three equations is x_3=7381 \ (\text{mod} \ 10353) where 10353 is the product of the first three moduli. The following gives the solution:

    \displaystyle \begin{aligned} x_4&\equiv 7381+10353 \cdot 30 \cdot (16-7381) \ (\text{mod} \ 17 \cdot 21 \cdot 29 \cdot 31) \\&\equiv -2287487969 \ (\text{mod} \ 320943) \\&\equiv 193735 \ (\text{mod} \ 320943)  \end{aligned}

    where 30 is obtained by solving 10353 y \equiv 1 \ (\text{mod} \ 31)

The above gives the unique solution to the given system of linear equations. \square

____________________________________________________________________________
\copyright \ 2015 \text{ by Dan Ma}

How to Chinese remainder, part 1

This is the third post in a series of posts on the Chinese remainder theorem (first post and second post). In this and the next post, we highlight two algorithms for obtaining the solution of a system of linear congruence equations. The Chinese remainder theorem guarantees that, under certain condition, any system of linear congruence equations has a unique solution (unique modulo the product of all the moduli of the congruence equations). The theorem is usually abbreviated as CRT. Even though the solution is unique, there are more than one way to solve systems of simultaneous linear congruence equations. In this post, we extract an algorithm from a constructive proof of CRT that is found in many textbooks (used here in this previous post). In the next post, we present another algorithm for solving systems of linear congruence equations.

The statement of CRT only guarantees the existence of a solution and does not show how to derive the unique solution. In many situations, it is sufficient to know that solutions exist (e.g. using CRT to prove theorems). In other situations where actual solutions are sought, we need more than the statement of CRT. In these situations, an algorithm is needed to actually find a solution. It is critical to have an algorithm for solving CRT problems, especially in computer implementation.

The algorithm discussed here is to solve the following is the version of the Chinese remainder theorem. This version has been shown to be equivalent to several other versions in this previous post.

Theorem F (Chinese Remainder Theorem)
Suppose m_1,m_2,\cdots,m_t are positive integers that are pairwise relatively prime. Let m=m_1 \cdot m_2 \cdots m_t. Suppose we have the following system of linear congruence equations

    a_1 \ x \equiv b_1 \ (\text{mod} \ m_1)

    a_2 \ x \equiv b_2 \ (\text{mod} \ m_2)

      \cdots

    a_{t-1} \  x \equiv b_{t-1} \ (\text{mod} \ m_{t-1})

    a_t \  x \equiv b_t \ (\text{mod} \ m_t)

such that each of the linear congruence equations has a unique solution, i.e. for each i, a_i and m_i are relatively prime. Then the system of linear congruence equations has a unique solution modulo m=m_1 \cdot m_2 \cdots m_t.

____________________________________________________________________________

Examples

We first start with two examples.

Example 1
Solve the following system of linear congruence equations.

    x \equiv 3 \ (\text{mod} \ 17)

    x \equiv 10 \ (\text{mod} \ 21)

    x \equiv 15 \ (\text{mod} \ 29)

Both 17 and 29 are prime. The factors of 21 are 3 and 7. The three moduli are pairwise relatively prime (i.e. no two of which have a common factor other than 1). Thus CRT implies that the three equations have a solution x_0. The solution is unique modulo 10353 = 17 x 21 x 29. This means that if y is another solution, then y\equiv x_0 \ (\text{mod} \ 10353).

The solution is x_0=3 \cdot 609 \cdot 11+10 \cdot 493 \cdot 19+15 \cdot 357 \cdot 13 \ (\text{mod} \ 10353) where

    609=21 \cdot 29 and 609 \cdot 11 \equiv 1 \ (\text{mod} \ 17)

    493=17 \cdot 29 and 493 \cdot 19 \equiv 1 \ (\text{mod} \ 21)

    357=17 \cdot 21 and 357 \cdot 13 \equiv 1 \ (\text{mod} \ 29)

The number 609 is the product of the remaining moduli (all the modulo not 17). Because 609 and the modulus 17 are relatively prime, the number 609 has a multiplicative inverse modulo 17. The multiplicative inverse is 11, found using the extended Euclidean algorithm. The other two pairs “493 and 19” and “357 and 13” are found similarly. Thus the solution x_0 is the sum of three numbers. The first is 3 x 609 x 11, where 3 is the solution to the first equation, and 609 and 11 multiplicative inverses to each other modulo 17. The other two numbers 10 x 493 x 19 and 15 x 357 x 13 are obtained similarly.

After reducing modulo 10353, the solution is

    \displaystyle \begin{aligned} x_0&\equiv 3 \cdot 609 \cdot 11+10 \cdot 493 \cdot 19+15 \cdot 357 \cdot 13 \ (\text{mod} \ 10353) \\&\equiv 20097+93670+69615 \ (\text{mod} \ 10353) \\&\equiv 9744+493+7497 \ (\text{mod} \ 10353) \\&\equiv 17734 \ (\text{mod} \ 10353) \\&\equiv 7381 \ (\text{mod} \ 10353)  \end{aligned}

The algorithm demonstrated here requires the computation of three multiplicative inverses. One try and true method for finding multiplicative inverse in the extended Euclidean algorithm (i.e. applying the Euclidean algorithm and then work backward). We demonstrate how to solve 609y \equiv 1 \ (\text{mod} \ 17). To this end, solve the linear diophantine equation 609y+17z=1. Then work backward.

    \displaystyle \begin{aligned} &609=17 \cdot 35+1 \\&17=14 \cdot 1+3 \\&14=3 \cdot 4+2 \\&3=2 \cdot 1+1 \\&2=1 \cdot 2+0 \\&\text{ } \\&\text{ } \end{aligned}        \ \ \ \ \ \ \ \ \   \displaystyle \begin{aligned} 1&=3-2 \\&=3-(14-3 \cdot 4) \\&=14(-1)+3(5) \\&=14(-1)+(17-14)5 \\&=17(5)+14(-6) \\&=17(5)+(609-17 \cdot 35) (-6) \\&=609(-6)+17(215) \end{aligned}

The calculation on the left shows the repeated applications of divisions to show that (609,17)=1. The calculation on the right works backward to solve 609y+17z=1. The solution is y=-6, the least residue of which is y=11. \square

Example 2
Solve the following system of linear congruence equations.

    3 \ x \equiv 1 \ (\text{mod} \ 5)

    4 \ x \equiv 6 \ (\text{mod} \ 14)

    5 \ x \equiv 11 \ (\text{mod} \ 3)

The algorithm used in Example 1 works here. The only difference is that each equation needs to be solved individually. The numbers here are small. The equations can be solved by inspection. Otherwise, each equation can also be solved by solving an appropriate linear diophantine equation (as discussed here). The following are the solutions to the above equations (individually).

    x \equiv 2 \ (\text{mod} \ 5)

    x \equiv 5 \ (\text{mod} \ 14)

    x \equiv 1 \ (\text{mod} \ 3)

The problem is then to solve the system of the above 3 linear congruence equations. The solution is x_0=2 \cdot 42 \cdot 8+5 \cdot 15 \cdot 1+1 \cdot 70 \cdot 1 where:

    42=14 \cdot 3 and 42 \cdot 8 \equiv 1 \ (\text{mod} \ 5)

    15=5 \cdot 3 and 15 \cdot 1 \equiv 1 \ (\text{mod} \ 14)

    70=5 \cdot 14 and 70 \cdot 1 \equiv 1 \ (\text{mod} \ 3)

The above calculation requires solving three linear congruence equations individually and finding three multiplicative inverses. Both tasks can be done using the extended Euclidean algorithm. After further reduction modulo 210, the unique solution is:

    \displaystyle \begin{aligned} x_0&\equiv 2 \cdot 42 \cdot 8+5 \cdot 15 \cdot 1+1 \cdot 70 \cdot 1 \ (\text{mod} \ 210) \\&\equiv 672+75+70 \ (\text{mod} \ 210) \\&\equiv 42+75+70 \ (\text{mod} \ 210) \\&\equiv 187 \ (\text{mod} \ 210)   \end{aligned}

The above gives the unique solution to the given system of linear equations. \square

____________________________________________________________________________

An algorithm for CRT

Now we describe the algorithm that is described by the above two examples. Suppose that the following system of equations such that the moduli m1,m_2,\cdots,m_t are pairwise relatively prime and such that each equation by itself has a unique solution, i.e. a_i and m_i are relatively prime for each i.

    a_1 \ x \equiv b_1 \ (\text{mod} \ m_1)

    a_2 \ x \equiv b_2 \ (\text{mod} \ m_2)

      \cdots

    a_{t-1} \  x \equiv b_{t-1} \ (\text{mod} \ m_{t-1})

    a_t \  x \equiv b_t \ (\text{mod} \ m_t)

The following three steps describe how to obtain the solution to this system of equations.

Step 1
Solve each equation a_i \ x \equiv b_i \ (\text{mod} \ m_i) individually to obtain the solution x \equiv c_i \ (\text{mod} \ m_i). The original system is equivalent to the following equivalent system of equations, i.e. any solution to one system is the solution to the other.

    x \equiv c_1 \ (\text{mod} \ m_1)

    x \equiv c_2 \ (\text{mod} \ m_2)

      \cdots

    x \equiv c_{t-1} \ (\text{mod} \ m_{t-1})

    x \equiv c_t \ (\text{mod} \ m_t)

Step 2
For each i, let n_i be the product of all moduli m_j where j \ne i. Each n_i is relatively prime to m_i. Thus n_i has a multiplicative inverse modulo m_i. Equivalently solve the following equations individually.

    n_1 \ y \equiv 1 \ (\text{mod} \ m_1)

    n_2 \ y \equiv 1 \ (\text{mod} \ m_2)

      \cdots

    n_{t-1} \  y \equiv 1 \ (\text{mod} \ m_{t-1})

    n_t \  y \equiv 1 \ (\text{mod} \ m_t)

For each i, let w_i be the inverse of n_i modulo m_i, i.e. n_i^{-1} \equiv w_i \ (\text{mod} \ m_i).

Step 3
The solution is given by

    x_0 \equiv c_1 \cdot n_1 \cdot w_1+c_2 \cdot n_2 \cdot w_2+\cdots+c_t \cdot n_t \cdot w_t \ (\text{mod} \ m)

where m=m_1 \cdot m_2 \cdots m_t.

Let’s look at what the algorithm entails computationally. Step 1 requires solving a set of linear congruence equations individually (equivalently, solving linear diophantine equations individually), unless the equations are given in the form x \equiv c_i \ (\text{mod} \ m_i). The extended Euclidean algorithm is an excellent approach, especially in computer implementation.

Step 2 involves finding multiplicative inverses of the numbers n_i. The extended Euclidean algorithm is also an excellent approach, especially in computer implementation.

Step 3 is to gather up the results from Step 1 and Step 2. The computation here is to reduce modulo m=m_1 \cdot m_2 \cdots m_t.

____________________________________________________________________________

Why does the algorithm work?

The number x_0 \equiv c_1 \cdot n_1 \cdot w_1+c_2 \cdot n_2 \cdot w_2+\cdots+c_t \cdot n_t \cdot w_t \ (\text{mod} \ m) is a solution to each of the equations in the given system of linear congruence equations. To see that x_0 is a solution to the equation a_1 \ x \equiv b_1 \ (\text{mod} \ m_1), note that:

    \displaystyle \begin{aligned} a_1 \cdot x_0&\equiv a_1 \cdot (c_1 \cdot n_1 \cdot w_1+c_2 \cdot n_2 \cdot w_2+\cdots+c_t \cdot n_t \cdot w_t) \\&\equiv a_1 \cdot c_1 \cdot n_1 \cdot w_1+a_1 \cdot c_2 \cdot n_2 \cdot w_2+\cdots+a_1 \cdot c_t \cdot n_t \cdot w_t  \\&\equiv a_1 \cdot c_1 \cdot n_1 \cdot w_1 \\&\equiv a_1 \cdot c_1 \\&\equiv b_1 \ (\text{mod} \ m_1) \end{aligned}

In the above derivation, all the terms containing n_j with j \ne 1 drop out. This is because n_j has m_1 as a factor. The product n_1 \cdot w_1 drops out since w_1 is the multiplicative inverse if n_1 modulo m_1. Finally a_1 \cdot c_1 \equiv b_1 \ (\text{mod} \ m_1) since c_1 is a solution of the equation a_1 \ x \equiv b_1 \ (\text{mod} \ m_1). By the same reasoning, x_0 is the solution to all the other equations in the system.

To see that the solution x_0 is unique, see the proof in this previous post.

____________________________________________________________________________
\copyright \ 2015 \text{ by Dan Ma}

Versions of the Chinese remainder theorem

This post is the second post in a series of posts on the Chinese remainder theorem (CRT). The previous post sets up the scene by discussing the fundamental theorem of arithmetic. In this post, we derive the various versions of CRT from a lemma (Lemma A below) that is equivalent to the fundamental theorem of arithmetic.

Links to the other posts in the series: first post, third post, fourth post.

____________________________________________________________________________

The starting point

Lemma A and Theorem B are discussed in the previous post. Lemma A is shown to be equivalent to the fundamental theorem of arithmetic. Lemma A makes CRT possible.

Lemma A
Let a, b and d>0 be integers. Suppose that \text{GCD}(a,d)=1, i.e. the greatest common divisor of a and d is 1. If d \lvert (a \cdot b), then d \lvert b.

Theorem B
The following conditions are equivalent:

  1. The statement of Lemma A.
  2. (Euclid’s lemma) If p is a prime number and p \lvert (a \cdot b), then either p \lvert a or p \lvert b.
  3. Every positive integer n>1 can be written as a product of prime numbers and that this product (called a factorization) is unique.

In this post, we start from Lemma A and derive several versions of CRT. First, let’s look at a consequence of Lemma A.

Lemma C
Let m_1,m_2,\cdots,m_t be positive integers that are pairwise relatively prime. Let M=m_1 \cdot m_2 \cdots m_t. Then for any integer n,

    M \lvert n if and only if m_i \ \lvert \ n for each i.

Proof
The direction \rightarrow is clear.

We show the direction \leftarrow. Suppose that for each i, m_i \lvert n. Then n=m_1 \cdot r_1 for some integer r_1. Note that m_2 \lvert n=m_1 \cdot r_1. By Lemma 1, m_2 \lvert r_1. We can write n=m_1 \cdot m_2 \cdot r_2 for some integer r_2. Continue the same argument, it follows that n=m_1 \cdot m_2 \cdots m_t \cdot r for some integer r. This means that M \lvert n. \square

____________________________________________________________________________

The Chinese remainder theorem

The following are several statements of the Chinese remainder theorem. The first version is a re-statement of Lemma C using congruence notation.

Theorem D (Chinese Remainder Theorem)
Let m_1,m_2,\cdots,m_t be positive integers that are pairwise relatively prime. Let M=m_1 \cdot m_2 \cdots m_t. Then for any integers a and b,

    a \equiv b \ (\text{mod} \ M) if and only if a \equiv b \ (\text{mod} \ m_i) for each i.

The proof of Lemma C would take care of Theorem D. Note that a \equiv b \ (\text{mod} \ y) means that y \lvert (a-b).

Theorem E (Chinese Remainder Theorem)
Let m_1,m_2,\cdots,m_t be positive integers that are pairwise relatively prime. Let M=m_1 \cdot m_2 \cdots m_t. Then for any integers a and b,

    the integer x_0 is a solution of the linear congruence equation a \cdot x \equiv b \ (\text{mod} \ M) if and only if the integer x_0 is satisfies simultaneously the following linear congruence equations:

      a \cdot x \equiv b \ (\text{mod} \ m_1)

      a \cdot x \equiv b \ (\text{mod} \ m_2)

      \cdots

      a \cdot x \equiv b \ (\text{mod} \ m_t)

Proof
By Theorem D, a \cdot x_0 \equiv b \ (\text{mod} \ M) if and only if a \cdot x_0 \equiv b \ (\text{mod} \ m_i) for each i. \square

Theorem F (Chinese Remainder Theorem)
Suppose m_1,m_2,\cdots,m_t are positive integers that are pairwise relatively prime. Let m=m_1 \cdot m_2 \cdots m_t. Suppose we have the following system of linear congruence equations

    a_1 \ x \equiv b_1 \ (\text{mod} \ m_1)

    a_2 \ x \equiv b_2 \ (\text{mod} \ m_2)

      \cdots

    a_{t-1} \  x \equiv b_{t-1} \ (\text{mod} \ m_{t-1})

    a_t \  x \equiv b_t \ (\text{mod} \ m_t)

such that each of the linear congruence equations has a unique solution, i.e. for each i, a_i and m_i are relatively prime. Then the system of linear congruence equations has a unique solution modulo m=m_1 \cdot m_2 \cdots m_t.

Proof
First, a solution is produced. Then it is shown that the solution is unique. To produce the solution, the first step is solve each equation individually. Because a_i and m_i are relatively prime, the equation a_i \ x \equiv b_i \ (\text{mod} \ m_i) has a unique solution c_i. Thus a_i \ c_i \equiv b_i \ (\text{mod} \ m_i) for each i.

Next, let N_i be the product of all the moduli m_j where j \ne i. Note that N_i and m_i are still relatively prime. There exists a unique n_i such that N_i \ n_i \equiv  1 \ (\text{mod} \ m_i). In other words, n_i is the multiplicative inverse of N_i modulo m_i.

The proposed solution is x_0=c_1 \cdot N_1 \cdot n_1+c_2 \cdot N_2 \cdot n_2+\cdots+c_t \cdot N_t \cdot n_t. The following shows that x_0 is the solution to each equation.

    \displaystyle \begin{aligned} a_i x_0 &\equiv a_i (c_1 \cdot N_1 \cdot n_1+c_2 \cdot N_2 \cdot n_2+\cdots+c_i \cdot N_i \cdot n_i+ \cdots +c_t \cdot N_t \cdot n_t) \\&\equiv a_i \cdot c_1 \cdot N_1 \cdot n_1+a_i \cdot c_2 \cdot N_2 \cdot n_2+\cdots+a_i \cdot c_i \cdot N_i \cdot n_i +\cdots + a_i \cdot c_t \cdot N_t \cdot n_t \\&\equiv a_i \cdot c_i \cdot N_i \cdot n_i \\&\equiv a_i \cdot c_i \\&\equiv b_i  \ (\text{mod} \ m_i)  \end{aligned}

All the terms containing N_j with j \ne i drop out since N_j contains m_i as a factor. The product N_i \cdot n_i drops out since it is congruent to 1 modulo m_i. Then a_i \cdot c_i is congruent to b_i modulo m_i.

To show the uniqueness, suppose x is also a solution to the system of linear congruence equations. Then a_i \ x \equiv a_i \ x_0 \ (\text{mod} \ m_i) for each i. Multiplying the inverse of a_i on both sides, we have and x \equiv x_0 \ (\text{mod} \ m_i) for each i. By Theorem E, x \equiv x_0 \ (\text{mod} \ m). \square

Theorem G (Chinese Remainder Theorem)
Suppose m_1,m_2,\cdots,m_t are positive integers that are pairwise relatively prime. Let m=m_1 \cdot m_2 \cdots m_t. For any sequence of integers c_1,c_2,\cdots,c_t, the following system of linear congruence equations

    x \equiv c_1 \ (\text{mod} \ m_1)

    x \equiv c_2 \ (\text{mod} \ m_2)

      \cdots

    x \equiv c_{t-1} \ (\text{mod} \ m_{t-1})

    x \equiv c_t \ (\text{mod} \ m_t)

has a unique solution modulo m=m_1 \cdot m_2 \cdots m_t.

Proof
This follows directly from Theorem F. \square

____________________________________________________________________________

To complete the loop

To complete the loop, we show that Theorem G implies Theorem D. Suppose that a \equiv b \ (\text{mod} \ m_i) for each i. Consider the following system of congruence equations:

    x \equiv b \ (\text{mod} \ m_1)

    x \equiv b \ (\text{mod} \ m_2)

    \cdots

    x \equiv b \ (\text{mod} \ m_t)

The number b is clearly a solution to this system of equations. By Theorem G, this solution is unique. So any other solution to this system must be congruent to b modulo m=m_1 \cdot m_2 \cdots m_t. By assumption, a is a solution to the system. So a \equiv b \ (\text{mod} \ m). \square

The loop D \rightarrow E \rightarrow F \rightarrow G \rightarrow D is now complete. Theorem G is the usual statement of CRT. Based on the loop, any one of the theorems can be called the Chinese remainder theorem. Note that the loop by itself does not establish the Chinese remainder theorem. For that to happen, some condition outside the loop must imply one condition in the loop. The above discussion shows that the fundamental theorem of arithmetic (in the form of Lemma A) implies Theorem D.

____________________________________________________________________________

Comments

CRT is a versatile tool and is found to be useful in many areas of mathematics. One approach in applying CRT is that of a divide and conquer idea. The original problem is divided into smaller problems, which can be solved independently of one another. At the end, the results of the smaller problems are combined to form the solution of the original problem. For example, in solving a linear congruence equation a \cdot x \equiv b \ (\text{mod} \ M) for a large modulus M, CRT in the form of Theorem E suggests that the problem can be broken up into a set of linear congruence equations with smaller moduli. For this reason, CRT is important in computing (both theory and applications), especially in computing intensive areas such as coding theory and cryptography.

In the next posts, we discuss two algorithms for solving CRT simultaneous systems of equations and look at a few applications.

____________________________________________________________________________
\copyright \ 2015 \text{ by Dan Ma}

Another look at the fundamental theorem of arithmetic

This is the first post of a series of blog posts on the Chinese remainder theorem, often abbreviated CRT. In this post, we take another look at the fundamental theorem of arithmetic. This post is background for the next post, which will show that CRT can be built step by step from the fundamental theorem of arithmetic.

Links to the other posts in the series:
second post, third post, fourth post.

Every integer can be factored into a product of primes in essentially one way. This fact is called the fundamental theorem of arithmetic. For example, the number 84 is 2 x 2 x 3 x 7. The ordering of the primes is not important here since, for example, 2 x 2 x 3 x 7 is the same as 3 x 2 x 7 x 2. The example of 84 seems to suggest that the fundamental theorem of arithmetic is simply an exercise at the elementary school. In general, factoring a number is a very hard problem. For integers with hundreds or thousands of digits, finding the factors may take more seconds than the number of atoms in the universe! The fundamental theorem of arithmetic guarantees that such factorization exists for any integer, large or small. Because of this, the prime numbers are the building blocks of the integers (the atoms of arithmetic).

What is the importance of knowing that every integer can be factored into prime numbers in a unique way? A short answer is that the fundamental theorem of arithmetic is a foundation result. A vast array of mathematical structures are built on this foundation. The fundamental theorem of arithmetic is the beginning point of many mathematical stories. In this and subsequent posts, we highlight one such example. We show that the Chinese remainder theorem follows from the fundamental theorem of arithmetic. CRT combines beauty and utility. Even though CRT has been known for at least 1,000 years, it continues to present new applications in many areas, e.g. in coding theory, cryptography and the theory of computing. Surprisingly, the proof of CRT is quite simple. CRT follows from a lemma that is equivalent to the fundamental theorem of arithmetic. As a result, the discussion in this and the subsequent posts is a clear demonstration of the power of the fundamental theorem of arithmetic.

In this post, we discuss the fundamental theorem of arithmetic. In the next posts, we discuss CRT.

____________________________________________________________________________

The lemma

The starting point is the following lemma.

Lemma A
Let a, b and d>0 be integers. Suppose that \text{GCD}(a,d)=1, i.e. the greatest common divisor of a and d is 1. If d \lvert (a \cdot b), then d \lvert b.

Proof
Suppose that \text{GCD}(a,d)=1. By the extended Euclidean algorithm, the linear diophantine equation ax+dy=1 is solvable in integers. Let x and y be integers that satisfy this equation. Multiply this equation by b to obtain abx+dby=b. Since d \lvert (a \cdot b), d divides both terms on the left hand side of the last equation. Thus d \lvert b. \square

Lemma A is a simple statement. The proof is also simple, simply using the extended Euclidean algorithm, which is the Euclidean algorithm working backward. the Euclidean alogorithm consists, at heart, of a series of divisions to derive the greatest common divisor. As discussed below, the simple Lemma A leads to the fundamental theorem of arithmetic and Lemma A can also be derived from the fundamental theorem of arithmetic.

____________________________________________________________________________

The fundamental theorem of arithmetic

The fundamental theorem of arithmetic states that any positive integer greater than 1 can be expressed as a product of primes and that the factorization is unique. The theorem is proved in this previous post. Here, we show that it is equivalent to Lemma A.

Theorem B
The following conditions are equivalent:

  1. The statement of Lemma A.
  2. (Euclid’s lemma) If p is a prime number and p \lvert (a \cdot b), then either p \lvert a or p \lvert b.
  3. (Fundamental Theorem of Arithmetic) Every positive integer n>1 can be written as a product of prime numbers and that this product (called a factorization) is unique.

Condition 2 is usually referred to as the Euclid’s lemma. Condition 3 is, of course, a statement of the fundamental theorem of arithmetic. The story we would like to tell is that Lemma A is equivalent to the fundamental theorem of arithmetic. Thus any consequence of Lemma A is also a consequence of the fundamental theorem of arithmetic.

The proof of 1 \rightarrow 2 \rightarrow 3 is done in this previous post. We also like to point out one additional argument is needed to establish the equivalence of the three conditions. The proof to establish the fundamental theorem of arithmetic in the previous post uses an induction argument to show that every integer is a product of prime numbers. So condition 2 plus the induction argument establish condition 3. The role of condition 2 is to establish the uniqueness of the product of primes. We only need to show 3 \rightarrow 1.

Proof
3 \rightarrow 1
Suppose that a and d have no prime factors in common and that d \lvert (a \cdot b). So we have a \cdot b=d \cdot m for some integer m. By condition 3, we can express a and d as a product of primes as follows:

    \displaystyle p_1^{\alpha_1} \cdot p_2^{\alpha_2} \cdots p_t^{\alpha_t} \times b=q_1^{\delta_1} \cdot q_2^{\delta_2} \cdots q_r^{\delta_r} \times m

The numbers p_i are the prime factors of a and the numbers q_i are the prime factors of d. The exponents \alpha_i and \delta_j are positive integers. Note that p_i \ne q_j for any i and j. Each q_i^{\delta_i} must appear in the prime factorization of the left-hand side. Since q_i^{\delta_i} cannot appear in the factorization of a, it must be in the factorization of b. \square

The loop 1 \rightarrow 2 \rightarrow 3 \rightarrow 1 shows the equivalence of the three statements. From a logical standpoint, any one of these statements is the fundamental theorem of arithmetic. As a demonstration of the power of the fundamental theorem of arithmetic, the next post shows that the Chinese remainder theorem is derived using Lemma A.

____________________________________________________________________________
\copyright \ 2015 \text{ by Dan Ma}

Euler’s phi function is multiplicative

This post gives a proof that Euler’s phi function is multiplicative. The proof is a simpler and more elegant proof, as compared to the one presented here. The combinatorial argument is greatly simplified using the Chinese remainder theorem (abbreviated CRT).

The letter m is used below to denote the modulus in modular arithmetic. The phi function \phi(m) is defined to be the count of all the integers 0 \le a \le m-1 such that a and m are relatively prime. If m is prime, then \phi(m)=m-1 since all integers from 1 to m-1 have no factors in common with m (other than 1). If m=10, then \phi(10)=4 since 1, 3, 7, and 9 are the only numbers that are relatively prime to m=10.

To properly work with the phi function, conmsider the following setting. Given a modulus m, a set of interest is \mathbb{Z}_m=\left\{0,1,2,\cdots,m-1 \right\}. This is the set of all least resdues modulo m, i.e., every integer is congruent modulo m to exactly one element of \mathbb{Z}_m. Another set of interest is \mathbb{Z}_m^*, which is the set of all elements a in \mathbb{Z}_m such that a and m are relatively prime. In other words, \phi(m) is defined to be the cardinality of the set \mathbb{Z}_m^*.

The set \mathbb{Z}_m is called the ring of integers modulo m since it satisfies the definition of a ring with regard to addition and multiplication modulo m. The set \mathbb{Z}_m^* with the multiplication modulo m is a group, i.e. every element of \mathbb{Z}_m^* has a multiplicative inverse with respect to the binary operation of multiplication modulo m. The set \mathbb{Z}_m^* is called the multiplicative group of integers modulo m. The fact that \mathbb{Z}_m^* is a group is established by the following theorem, which is proved here.

Theorem 1
Let a be an integer in \mathbb{Z}_m. The following conditions are equivalent.

  1. The numbers a and m are relatively prime, i.e. \text{GCD}(a,m)=1.
  2. There is a b \in \mathbb{Z}_m such that a \cdot b \equiv 1 \ (\text{mod} \ m).
  3. Some positive power of a modulo m is 1, i.e., a^n \equiv 1 \ (\text{mod} \ m) for some positive integer n.

Another interesting fact to point out is that when m is a prime number, \mathbb{Z}_m is a field, i.e. it is a commutative ring in which every non-zero element has a multiplicative inverse. Another terminology that is used by some authors is that the elements in \mathbb{Z}_m that have multiplicative inverses are called units. Thus \mathbb{Z}_m^* is also called the group of units of the ring of integers modulo m. Our notation here is not standard. The usual notation for the ring \mathbb{Z}_m is \mathbb{Z}/m\mathbb{Z}. The usual notation for \mathbb{Z}_m^* is (\mathbb{Z}/m\mathbb{Z})^*.

The phi function is multiplicative in the following sense.

Theorem 2
Let m and n be positive integers such that they are relatively prime. Then \phi(m \times n)=\phi(m) \times \phi(n).

Theorem 2 is established by the following results.

Lemma 3
Let m and n be positive integers such that they are relatively prime. Then the set \mathbb{Z}_{mn} and the set \mathbb{Z}_m \times \mathbb{Z}_n have the same cardinality.

Proof
To prove that the two sets have the same cardinality, we define a bijection f: \mathbb{Z}_{mn} \rightarrow \mathbb{Z}_m \times \mathbb{Z}_n, i.e. f is a one-to-one function that maps \mathbb{Z}_{mn} onto the set \mathbb{Z}_m \times \mathbb{Z}_n. For each a \in \mathbb{Z}_{mn}=\left\{0,1,2,\cdots,m \cdot n-1 \right\}, define f(a)=(c, d) where c \in \mathbb{Z}_m with c \equiv a \ (\text{mod} \ m) and d \in \mathbb{Z}_n with d \equiv a \ (\text{mod} \ n).

To see that f is one-to-one, let f(a)=f(b). Then we have a \equiv b \ (\text{mod} \ m) and a \equiv b \ (\text{mod} \ n). By the Chinese remainder theorem, a \equiv b \ (\text{mod} \ mn). Since a,b \in \mathbb{Z}_{mn}, a=b.

To show that f maps \mathbb{Z}_{mn} onto the set \mathbb{Z}_m \times \mathbb{Z}_n, let (c,d) \in \mathbb{Z}_m \times \mathbb{Z}_n. Consider the equations x \equiv c \ (\text{mod} \ m) and x \equiv d \ (\text{mod} \ n). By CRT, these two equations have a simultaneous solution a that is unique modulo m \times n. This means that f(a)=(c, d). \square

Lemma 4
Let m and n be positive integers such that they are relatively prime. Then the set \mathbb{Z}_{mn}^* and the set \mathbb{Z}_m^* \times \mathbb{Z}_n^* have the same cardinality.

Proof
The same mapping f defined in the proof of Lemma 3 is used. We show that when f is restricted to the set \mathbb{Z}_{mn}^*, it is a bijection from \mathbb{Z}_{mn}^* onto the set \mathbb{Z}_m^* \times \mathbb{Z}_n^*. First, we show that for any a \in \mathbb{Z}_{mn}^*, f(a) is indeed in \mathbb{Z}_m^* \times \mathbb{Z}_n^*. To see this, note that a and m \times n are relatively prime. So it must be that a and m are relatively prime too and that a and n are relatively prime.

The function f is one-to-one as shown above. The remaining piece is that for each (c,d) \in \mathbb{Z}_m^* \times \mathbb{Z}_n^*, there is some a \in \mathbb{Z}_{mn}^* such that f(a)=(c,d). As in the proof of Lemma 3, there is some a \in \mathbb{Z}_{mn} such that f(a)=(c,d). Note that a and m are relatively prime and a and n are relatively prime. This means that a and m \times n are relatively prime (see Lemma 4 here). Thus the function f is a one-to-one from \mathbb{Z}_{mn}^* onto \mathbb{Z}_m^* \times \mathbb{Z}_n^*. \square

Proof of Theorem 2
Lemma 4 shows that the set \mathbb{Z}_{mn}^* and the set \mathbb{Z}_m^* \times \mathbb{Z}_n^* have the same cardinality. First, \phi(m \times n) is the cardinality of the set \mathbb{Z}_{mn}^*. Theorem 2 is established by noting that \phi(m) \times \phi(n) is the cardinality of \mathbb{Z}_m^* \times \mathbb{Z}_n^*. \square

____________________________________________________________________________

Evaluating the phi function

The combinatorial argument for \phi(m \times n)=\phi(m) \times \phi(n) is greatly simplified by using the Chinese remainder theorem. Compare the above proof with the lengthier proof in an earlier post (see Theorem 3 here). With the multiplicative property of the phi function, we can evaluate the phi function for any positive integer.

For any positive integer m, consider its unique prime factorization m=p_1^{e_1} \cdot p_2^{e_2} \cdots p_t^{e_t}. Then we have:

    \displaystyle \begin{aligned} \phi(m)&=\phi(p_1^{e_1} \cdot p_2^{e_2} \cdots p_t^{e_t}) \\&=\phi(p_1^{e_1}) \times \phi(p_2^{e_2}) \times \cdots \times \phi(p_t^{e_t}) \\&=p_1^{e_1-1} \cdot (p_1-1) \times p_2^{e_2-1} \cdot (p_2-1) \times \cdots \times p_t^{e_t-1} \cdot (p_t-1) \\&=p_1^{e_1} \cdot \biggl(1-\frac{1}{p_1}\biggr) \times p_2^{e_2} \cdot \biggl(1-\frac{1}{p_2}\biggr) \times \cdots \times p_t^{e_t} \cdot \biggl(1-\frac{1}{p_t}\biggr) \\&=p_1^{e_1} \cdot p_2^{e_2} \cdots p_t^{e_t} \cdot \biggl(1-\frac{1}{p_1}\biggr) \times  \cdot \biggl(1-\frac{1}{p_2}\biggr) \times \cdots \times  \cdot \biggl(1-\frac{1}{p_t}\biggr) \\&=m \cdot \biggl(1-\frac{1}{p_1}\biggr) \times  \cdot \biggl(1-\frac{1}{p_2}\biggr) \times \cdots \times  \cdot \biggl(1-\frac{1}{p_t}\biggr)\end{aligned}

In the above evaluation, this fact is used. For any prime p, \phi(p^n)=p^{n-1} \times (p-1) (see see Theorem 2 here). We also use the extended version of Theorem 2: if m_1,m_2,\cdots,m_t are pairwise relatively prime, then

    \phi(m_1 \times m_2 \times \cdots \times m_t)=\phi(m_1) \times \phi(m_2) \times \cdots \times \phi(m_t).

The extended result is to extend the product to more than two numbers. It is derived by a simple inductive argument.

____________________________________________________________________________

A formula for the phi function

The above evaluation leads us to the following way to express the phi function. For m=p_1^{e_1} \cdot p_2^{e_2} \cdots p_t^{e_t}, we have the following:

    \displaystyle \phi(m)=m \ \prod \limits_{p \lvert m} \biggl( 1-\frac{1}{p} \biggr)

In the above product, the p ranges over all prime divisors of m. A quick example: for 33075=3^3 \cdot 5^2 \cdot 7^2,

    \displaystyle \begin{aligned}\phi(33075)&=33075 \times \biggl( 1-\frac{1}{3} \biggr) \times \biggl( 1-\frac{1}{5} \biggr) \times \biggl( 1-\frac{1}{7} \biggr) \\&=33075 \times \frac{2}{3} \times \frac{4}{5} \times \frac{6}{7} \\&=315 \times 2 \times 4 \times 6 \\&=15120 \end{aligned}

One comment. For the above formula to work, we only need to know the prime divisors of the number, not necessarily the powers of the primes. For example, for 33075, we only need to know that the prime divisors are 3, 5 and 7.

____________________________________________________________________________

\copyright \ 2015 \text{ by Dan Ma}

The primitive root theorem revisited

The primitive root theorem lists out all possible values of m for which primitive roots exist in the modulo m arithmetic. The theorem is proved in this previous post and several blog posts leading to it. In this post, we reorganize the proof and add some additional information. The proof of the primitive root theorem detailed below shows that the list of values of the moduli m is very restrictive and, in addition, that such moduli m are such that there are no extra square root beside 1 and -1. The crux of the argument is a lemma that indicates that most moduli have a third square root of 1 in addition to 1 and -1. The proof is also an opportunity for applying the Chinese remainder theorem (usually abbreviated CRT).

____________________________________________________________________________

The primitive root theorem

In this post, m is a positive integer that serves as the modulus. Given m, it is of interest to know all the integers a where 1 \le a <m and that a and the modulus m are relatively prime. The number of such values of a is denoted by \phi(m) (this is called the phi function). For example, for m=11, \phi(11)=10. For m=10, \phi(10)=4 since there are only 4 numbers that are relatively prime to 10, namely 1, 3, 7, and 9.

A positive integer g is said to be a primitive root modulo m if \phi(m) is the least positive integer x such that g^x \equiv 1 \ (\text{mod} \ m). If g is said to be a primitive root modulo m, it is necessary that g and the modulus m are relatively prime. This is because of this basic fact: a and the modulus m are relatively prime if and only a \cdot b \equiv 1 \ (\text{mod} \ m) if and only of a^t \equiv 1 \ (\text{mod} \ m) for some integer t. The middle condition is saying that a has a multiplicative modulo m.

The following lemma is alluded to at the beginning. The idea will used through the proof of the primitive root theorem. So it is extracted as a lemma to make the argument easier to follow.

Lemma 1
Let c and d be integers such that c>2 and d>2 and such that c and d are relatively prime. Let m=c \cdot d. Then there exist x_0 such that x_0 \not \equiv \pm 1 \ (\text{mod} \ m) and such that x_0^2 \equiv 1 \ (\text{mod} \ m), i.e. x_0 is a third square root of 1 modulo m that is neither 1 nor -1.

Proof
Consider the two equations x \equiv 1 \ (\text{mod} \ c) and x \equiv -1 \ (\text{mod} \ d). By the Chinese remainder theorem, there is a solution x_0 to this system of equations. We have x_0^2 \equiv 1 \ (\text{mod} \ c) and x_0^2 \equiv 1 \ (\text{mod} \ d). By the Chinese remainder theorem again, x_0^2 \equiv 1 \ (\text{mod} \ m). There are three possibilities for x_0, 1, -1 or another value. If x_0 \equiv 1 \ (\text{mod} \ m), then 1 \equiv -1 \ (\text{mod} \ d), which means d \lvert 2, contradicting that d>2. Similarly, x_0 \equiv -1 \ (\text{mod} \ m) would lead to c \lvert 2, which contradicts c>2. The only possibility is that x_0 \not \equiv \pm 1 \ (\text{mod} \ m). \square

The proof of Lemma 1 makes use of CRT twice. Lemma 1 will be used several times in the proof of 2 \longrightarrow 3 in the primitive root theorem.

The Primitive Root Theorem
Let m be a positive integer. Then the following conditions are equivalent:

  1. There exists a primitive root modulo m.
  2. The equation x^2 \equiv 1 \ (\text{mod} \ m) has no solution outside of \pm 1 modulo m. In other words, the only square roots of 1 modulo m are \pm 1.
  3. The only possibilities for m are:
    • m=2,
    • m=4,
    • m is the power of an odd prime,
    • m is twice the power of an odd prime.

Proof
1 \longrightarrow 2
Suppose that g is a primitive root modulo m. Suppose that condition 2 does not hold. As a result, there exists some positive integer a \not \equiv \pm 1 \ (\text{mod} \ m) such that a^2 \equiv 1 \ (\text{mod} \ m). Since g is a primitive root modulo m, a \equiv g^h \ (\text{mod} \ m) for some integer integer h with 1 \le h < \phi(m) (see Theorem 5 here). If h = \phi(m), then a \equiv 1 \ (\text{mod} \ m). Thus 1 \le h < \phi(m).

Furthermore, a^2 \equiv g^{2h} \equiv 1 \ (\text{mod} \ m). We have \phi(m) \le 2h, since \phi(m) is the least power x such that g^x is congruent to 1 modulo m. Since g^{2h}=g^{\phi(m)} g^{2h-\phi(m)}, g^{2h} \equiv g^{2h-\phi(m)} \equiv 1 \ (\text{mod} \ m). We have \phi(m) \le 2h-\phi(m) since \phi(m) is the least power x such that g^x is congruent to 1 modulo m. The last inequality leads to \phi(m) \le h, contradicting the earlier observation of h < \phi(m). Thus condition 2 must holds if condition 1 is true.

2 \longrightarrow 3
We show that for any m outside of the four possibilities listed in condition 3, condition 2 does not hold. This means that if condition 2 holds for m, m must be one of the four possibilities. We consider the cases of m even and m odd separately.

Case 1. The modulus m is odd. Since m is not the power of one single odd prime, it must be the product of two or more powers of odd primes. Let m=c \cdot d where c a power of an odd prime and d is the product of one ore more powers of odd primes. It is clear that c and d are relatively prime. It is also the case that c>2 and d>2 since each of them has at least one odd prime factor. By Lemma 1, there exists a square root x_0 of 1 such that x_0 \not \equiv \pm 1 \ (\text{mod} \ m).

For the second case, m is even. We break this up into two cases. One is that m is a power of 2. The other is that m is not a power of 2. In these two cases, the goal is still to show that if m is outside of the four possibilities in condition 3, then condition 2 does not hold.

Case 2a. The modulus m is a power of 2.
Then m=2^h where h \ge 3. In this case, we can actually exhibit a square root that is neither 1 nor -1. Define x_0=2^{h-1}-1. Since h \ge 3, x_0>1. Consider the following derivation:

    \displaystyle x_0^2=(2^{h-1}-1)^2=2^{2(h-1)}-2^h+1=2^h (2^{h-2}-1)+1

The above derivation shows that x_0^2 \equiv 1 \ (\text{mod} \ m=2^h). It is clear that x_0 \not \equiv \pm 1 \ (\text{mod} \ 2^h).

Case 2b. The modulus m is not a power of 2.
Since m is even and m is not a power of 2, it must be that m=2^k \times q where q is odd. Either q is the power of an odd prime or it is not. If q is the power of an odd prime, then k \ge 2. If q is not the power of an odd prime (i.e. q is of case 1 above), then k \ge 1. To make the argument clear, we consider the two sub cases separately.

Case 2b-1. The modulus m=2^k \times p^j where k \ge 2, j \ge 1 and p is an odd prime.
Let c=2^k and d=p^j. Note that m=c \cdot d. By Lemma 1, there exists a square root x_0 of 1 modulo m such that x_0 \not \equiv \pm 1 \ (\text{mod} \ m).

Case 2b-2. The modulus m=2^k \times b where k \ge 1 and b is an odd integer that is not the power of an odd prime.
Consider the prime factorization of b, say b=p_1^{e_1} \cdot p_2^{e_2} \cdots p_t^{e_t} where t \ge 2. Let c=2^k \cdot p_1^{e_1} and d=p_2^{e_2} \cdots p_t^{e_t}. We can also apply Lemma 1 to obtain a square root of 1 modulo m=c \cdot d that is neither 1 nor -1.

3 \longrightarrow 1
It is clear that there are primitive roots modulo both m=2 and m=4. For the case of power of an odd prime, see this previous post. For the case of twice the power of an odd prime, see this previous post. \square

____________________________________________________________________________

Comments

Lemma 1 plays a prominent role in the proof of the direction 2 \longrightarrow 3. Lemma 1 shows that it is easy to find moduli that have a third square root of 1 (other than 1 and -1). As the primitive root theorem shows, having a third square root of 1 is the condition that kills the possibility of having a primitive root. Lemma 1 and the primitive root theorem speak to different sides of the same coin. One tells us that moduli with no primitive roots are easy to find. The other says that only in rare cases do you find moduli that admit primitive roots, namely the moduli that are not product of two factors, each of which is greater than 2, that are relatively prime. The proof of 2 \longrightarrow 3 may seem tedious (by listing out all cases that satisfy Lemma 1). The key to understanding the proof is through Lemma 1.

____________________________________________________________________________

\copyright \ 2015 \text{ by Dan Ma}