## PARI/GP programming for basic cryptography

In studying cryptography, we often need to convert a string of plaintext characters into an integer. The integer corresponding to the plaintext string can then be encrypted using various encryption schemes that operate on numbers. A case in point is the RSA public key cryptosystem. Encryption and decryption in RSA work with integers and use number theoretic operations. Given a plaintext represented as a string of characters, we require a way to convert it to an integer prior to encryption. Furthermore, the ciphertext corresponding to a plaintext is most likely itself an integer. After decrypting the ciphertext, we recover the integer representation of the original plaintext.

In this post, we use PARI/GP to study the RSA cryptosystem. The reader is assumed to have some familiarity with basic concepts from elementary number theory, basic programming constructs, and public key cryptography, in particular, the RSA cryptosystem. This post is not a beginner’s howto on PARI/GP. We assume that the reader has some prior experience in using PARI/GP.

**What is PARI/GP?**

The computer algebra system PARI/GP [2] is primarily aimed at computation in number theory. It can be used as a powerful, programmable desktop calculator similar to the Linux command `bc`. It can also be used to study number theory and basic cryptography.

**Strings to numbers & vice versa**

PARI/GP provides two commands to convert from a string of characters to integers and vice versa. The command `Vecsmall` converts a character string into the corresponding vector of ASCII encodings. Let `M` be the plaintext string `HELLO WORLD`. To get a vector of ASCII encodings corresponding to `M`, we proceed as follows:

`gp > M = "HELLO WORLD"`

`%1 = "HELLO WORLD"`

`gp > P = Vecsmall(M)`

`%2 = Vecsmall([72, 69, 76, 76, 79, 32, 87, 79, 82, 76, 68])`

Now `P` is a vector of ASCII encodings. Notice that `H` has the ASCII encoding 72, `E` has the encoding 69, and so on. We also need to pay particular attention to the white space between `HELLO` and `WORLD`. The white space character obtained with the spacebar key has ASCII encoding 32. To convert `P` back to a character string, we use the command `Strchr`:

`gp > Strchr(P)`

`%3 = "HELLO WORLD"`

We need to concatenate all ASCII encodings in the vector `P` to obtain an integer corresponding to the message `M`. This can be accomplished using the command `Str`, which concatenates all of its arguments into a single string. The result of concatenating all elements of `P` is a string representation of an integer, and we still need to convert that string into the integer it represents. One simple method is to assign the digits to some variable, excluding the double quotation characters:

`gp > a = "";`

`gp > for (i = 1, length(P), a = Str(a,P[i]))`

`gp > a`

`%5 = "7269767679328779827668"`

`gp > a = 7269767679328779827668`

`%6 = 7269767679328779827668`
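Retyping the digits by hand is error-prone. As a minimal sketch, assuming a reasonably recent GP version (for `my` and `#`), the digit string can instead be passed to the built-in `eval`. The helper name `str2int` is our own and not part of PARI/GP. Note also that this concatenation is only easy to reverse when every character has a two-digit ASCII code, as is the case for `HELLO WORLD`.

```gp
\\ str2int is a hypothetical helper name, not a PARI/GP built-in.
\\ It concatenates the decimal ASCII codes of s, then evaluates the digit string.
str2int(s) =
{
  my(v = Vecsmall(s), a = "");
  for (i = 1, #v, a = Str(a, v[i]));
  eval(a);
}

str2int("HELLO WORLD")   \\ 7269767679328779827668, as in the session above
```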

**The RSA cryptosystem**

We now consider the RSA algorithm for encryption and decryption. The following is a list of steps in this algorithm, taken from the book [3, p.165]:

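1. Choose two primes $p$ and $q$ and let $n = pq$.
2. Let $e \in \mathbb{Z}$ be positive such that $\gcd(e, \varphi(n)) = 1$.
3. Compute $d \in \mathbb{Z}$ such that $de \equiv 1 \pmod{\varphi(n)}$.
4. Our public key is the pair $(n, e)$ and our private key is the triple $(p, q, d)$.
5. For any non-zero integer $m < n$, encrypt $m$ using $c \equiv m^e \pmod{n}$.
6. Decrypt $c$ using $m \equiv c^d \pmod{n}$.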
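The steps above can be traced with deliberately tiny numbers. The following is only an illustrative sketch: the primes 61 and 53, the exponent 17 and the message 65 are our own choices, far too small for real use, and the variable name `phi` is ours.

```gp
\\ Toy RSA walk-through; all concrete numbers are assumptions for illustration.
p = 61; q = 53;
n = p * q;                     \\ 3233
phi = (p - 1) * (q - 1);       \\ 3120, the value of Euler's phi at n
e = 17;                        \\ gcd(17, 3120) = 1
d = lift(Mod(e, phi)^(-1));    \\ 2753, since 17 * 2753 = 15 * 3120 + 1
m = 65;                        \\ a message smaller than n
c = lift(Mod(m, n)^e);         \\ 2790, the ciphertext
lift(Mod(c, n)^d)              \\ 65, the original message
```

Because the message is smaller than $n$ here, decryption returns it exactly.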

**Public and private keys**

Choosing a prime number is easy. PARI/GP provides a list of precomputed primes, accessible via the command `primes`. For example, `primes(100)` returns a list of the first 100 primes. The command `random(n)` returns a pseudo-random integer $r$ such that $0 \le r < n$. Thus we can work through step 1 in the algorithm as follows:

`gp > p = primes(40000)[random(40000) + 1]`

`%7 = 338251`

`gp > q = primes(40000)[random(40000) + 1]`

`%8 = 140333`

`gp > n = p * q`

`%9 = 47467777583`
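The session above builds the same list of 40000 primes twice and could, in principle, pick the same prime for both $p$ and $q$. A possible alternative, swapped in here as a sketch rather than the method used in this post, is the built-in `nextprime`, which returns the smallest (pseudo)prime at or above its argument. The bound `10^6` is an arbitrary assumption.

```gp
\\ Draw two distinct pseudo-random primes below roughly 10^6.
\\ The bound 10^6 is an arbitrary choice for illustration.
p = nextprime(random(10^6));
q = nextprime(random(10^6));
while (q == p, q = nextprime(random(10^6)));   \\ insist on p != q
n = p * q;
[isprime(p), isprime(q), p != q]               \\ [1, 1, 1]
```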

For step 2, we need to find a positive integer $e$ that is coprime to $\varphi(n)$, where $\varphi$ is Euler’s totient (or phi) function. The command `eulerphi(n)` counts the number of integers $a$, with $1 \le a \le n$, such that $\gcd(a, n) = 1$. The greatest common divisor of two integers can be computed using the command `gcd`. Using a loop, we can compute the required value of $e$ as follows:

`gp > e = random(eulerphi(n))`

`%10 = 43556566359`

`gp > while (gcd(e, eulerphi(n)) != 1, e = random(eulerphi(n)))`

`gp > e`

`%11 = 3511612661`
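Calling `eulerphi(n)` forces PARI/GP to factor $n$, which is exactly the task an attacker faces. Since we already know $p$ and $q$, a cheaper route is the identity $\varphi(n) = (p-1)(q-1)$ for distinct primes. The sketch below assumes the primes from the session above; the variable name `phi` and the extra guard `e < 2` are our own additions.

```gp
p = 338251; q = 140333;          \\ the primes chosen above
n = p * q;
phi = (p - 1) * (q - 1);         \\ equals eulerphi(n) for distinct primes p and q
e = random(phi);
\\ Redraw until e is at least 2 and coprime to phi.
while (e < 2 || gcd(e, phi) != 1, e = random(phi));
[phi == eulerphi(n), gcd(e, phi)]   \\ [1, 1]
```

Computing `phi` once and reusing it also avoids the repeated factorization inside the loop.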

To calculate a value for $d$ in step 3, we use the extended Euclidean algorithm. By definition of congruence, the congruence

$$de \equiv 1 \pmod{\varphi(n)}$$

is equivalent to

$$de - k\varphi(n) = 1$$

where $k \in \mathbb{Z}$. From above, we already know the numeric values of $e$ and $\varphi(n)$. The extended Euclidean algorithm allows us to compute $d$ and $-k$. In PARI/GP, this can be accomplished via the command `bezout`. Given two integers $x$ and $y$, the command `bezout(x, y)` returns a vector $[u, v, g]$ such that $g = \gcd(x, y)$ and $g = ux + vy$:

`gp > d = bezout(e, eulerphi(n))[1]`

`%12 = 5558100941`
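A Bézout coefficient may in general be negative, in which case it would have to be reduced modulo $\varphi(n)$ before use as an exponent. One way to sidestep this, shown here as a sketch rather than the approach taken in this post, is to invert $e$ directly in the ring of integers modulo $\varphi(n)$. The variable name `phi` is our own; the values of $p$, $q$ and $e$ are those from the sessions above.

```gp
p = 338251; q = 140333;
phi = (p - 1) * (q - 1);
e = 3511612661;                  \\ the public exponent found above
d = lift(Mod(e, phi)^(-1))       \\ 5558100941, always in the range [0, phi - 1]
(d * e) % phi                    \\ 1, confirming d*e = 1 mod phi
```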

Thus our RSA public key is $(n, e) = (47467777583, 3511612661)$ and our corresponding private key is $(p, q, d) = (338251, 140333, 5558100941)$.

**Encryption and decryption**

Recall that our message is the string `HELLO WORLD`, which can be represented by the integer $m = 7269767679328779827668$. To encrypt our message, we raise $m$ to the power of $e$ and reduce the result modulo $n$. The command `Mod(m^e, n)` first computes `m^e`, then reduces the result modulo `n`. When the exponent is large, as it is here, the intermediate power has far too many digits to store, so the computation is slow at best and can overflow the PARI stack. Brute-force modular exponentiation is inefficient and can quickly exhaust the computer’s memory.

Modular exponentiation can be performed efficiently with the method of repeated squaring, also called the squaring trick [1, p.879]. We use the binary representation of the exponent to repeatedly square the running result, reducing it modulo a fixed integer at every step. The following is a PARI/GP script to perform modular exponentiation using repeated squaring. The pseudocode can be found in [1, p.879].

    /* Modular exponentiation using repeated squaring. */
    /* That is, we want to compute a^b mod n. */
    modexp(a, b, n) = { \
      local(d, bin); \
      d = 1; \
      bin = binary(b); \
      for (i = 1, length(bin), \
        d = Mod(d*d, n); \
        if (bin[i] == 1, \
          d = Mod(d*a, n); \
        ); \
      ); \
      return(d); \
    }

The script can be saved to a file called, say, `modexp.gp`. The file can then be read into PARI/GP using the command `\r`. Notice the backslash character at the end of each line of the function definition, except for the last line. This character tells PARI/GP that the script continues on the next line. If we perform brute force modular exponentiation, we would get something similar to the following:

`gp > m = 7269767679328779827668;`

`gp > Mod(m^e, n)`

`*** length (lg) overflow`

On the other hand, let us read our script into PARI/GP and perform modular exponentiation using repeated squaring:

`gp > \r /home/mvngu/modexp.gp`

`gp > modexp(m, e, n)`

`%14 = Mod(16017121090, 47467777583)`

`gp > c = 16017121090`

`%15 = 16017121090`
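As a sanity check, repeated squaring can be compared against PARI/GP's own arithmetic on `Mod` objects: in `Mod(a, n)^b` the base is reduced first, so every intermediate product stays below $n$. The sketch below restates `modexp` without line-continuation backslashes, assuming a recent GP version in which braces alone let a definition span several lines and `my` is available. The test values 65, 17 and 3233 are arbitrary small numbers of our own choosing.

```gp
\\ Same algorithm as the script above, written for a recent GP version.
modexp(a, b, n) =
{
  my(d = 1, bin = binary(b));
  for (i = 1, #bin,
    d = Mod(d*d, n);                     \\ square for every bit of the exponent
    if (bin[i] == 1, d = Mod(d*a, n));   \\ multiply when the bit is set
  );
  d;
}

modexp(65, 17, 3233) == Mod(65, 3233)^17   \\ 1, the two methods agree
lift(modexp(65, 17, 3233))                 \\ 2790, as a plain integer
```

The built-in `lift` strips the `Mod(..., n)` wrapper, which avoids retyping the ciphertext by hand.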

Thus $c = 16017121090$ is the ciphertext. To recover our plaintext, we raise $c$ to the power of $d$ and reduce the result modulo $n$. Again, we can use modular exponentiation via repeated squaring to decrypt $c$:

`gp > modexp(c, d, n)`

`%16 = Mod(32758012945, 47467777583)`

`gp > Mod(m, n)`

`%17 = Mod(32758012945, 47467777583)`

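Our result is $32758012945$ rather than $m$ itself. This is because $m > n$: RSA recovers the plaintext only modulo $n$, and $32758012945$ is congruent to $m$ modulo $n$. To recover the full plaintext, $n$ must be larger than $m$, which can be arranged by choosing larger primes or by splitting the message into blocks that are each smaller than $n$.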
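One way to keep every plaintext integer below $n$ is to encrypt the message in small blocks. The following is a rough sketch under several assumptions of our own: the helper names `pack`, `unpack` and `block` are hypothetical, blocks hold 4 bytes encoded in base 256 (so each block is below $256^4 < n$), and the message is plain ASCII so that no block starts with a zero byte. It uses the key from this post and a recent GP version.

```gp
\\ All helper names below are hypothetical, not PARI/GP built-ins.
p = 338251; q = 140333; n = p * q;
phi = (p - 1) * (q - 1);
e = 3511612661;
d = lift(Mod(e, phi)^(-1));

\\ Pack a vector of byte values into one base-256 integer.
pack(v) =
{
  my(x = 0);
  for (i = 1, #v, x = 256 * x + v[i]);
  x;
}

\\ Recover the byte values from a base-256 integer (leading byte must be nonzero).
unpack(x) =
{
  my(v = []);
  while (x > 0, v = concat(x % 256, v); x = x \ 256);
  v;
}

\\ Bytes of the b-th block of at most k bytes taken from P.
block(P, b, k) = vector(min(k, #P - k*(b - 1)), j, P[k*(b - 1) + j]);

k = 4;                               \\ 256^4 < n, so every block is smaller than n
P = Vecsmall("HELLO WORLD");
nblocks = (#P + k - 1) \ k;          \\ number of blocks, rounding up

\\ Encrypt each block separately.
C = vector(nblocks, b, lift(Mod(pack(block(P, b, k)), n)^e));

\\ Decrypt each block and reassemble the bytes.
R = [];
for (b = 1, nblocks, R = concat(R, unpack(lift(Mod(C[b], n)^d))));
Strchr(R)                            \\ "HELLO WORLD"
```

Unlike the single-integer encoding, this round trip returns the original string, because no block ever exceeds $n$.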

**References**

- [1] T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein. Introduction to Algorithms (2nd ed.). The MIT Press, 2001.
- [2] PARI/GP version 2.3.4, Bordeaux, France, 2008, viewed 2008-07-31.
- [3] W. Trappe and L. C. Washington. Introduction to Cryptography with Coding Theory (2nd ed.). Pearson Prentice Hall, 2006.

**Comments**

You might be interested in looking at William Stein’s notes for his number theory course at http://modular.fas.harvard.edu/edu/Fall2001/124/ which use PARI extensively, and which include, among other things, an implementation of RSA.

Well, I came across your page, and it is very timely for me, as I am fiddling with custom encryption using PARI/GP and have more limited exposure to Sage.

You seem to be using eulerphi(n) repeatedly. While mathematically correct, it requires the same amount of calculations as breaking the code (by factoring n). RSA *relies* on the fact that it is not easy to do (with bigger primes p and q). You could simply use the fact that phi(n) = (p-1)(q-1).

The decoding of the encrypted message seems to be a problem as well. I cannot see how you can recover your original m modulo n if m is greater than n. You should always have n > m, for example by dividing your message into parts of smaller size.

Your modular exponentiation function is well-written, but it’s not necessary when using Pari. Instead of finding `Mod(m^e, n)`, you can ask Pari for `Mod(m,n)^e`. Pari will do all the calculations modulo n, so there is no risk of length overflows. I’ve tested this command and your modexp, and they seem to run in the same amount of time.