Home > cryptography, documentation, number theory, open source software, Pari/GP, programming > Pari/GP programming for basic cryptography

## Pari/GP programming for basic cryptography

In studying cryptography, there are occasions in which we need to convert a string of plaintext characters into an integer. The integer corresponding to the plaintext string can then be encrypted using various encryption schemes that operate on numbers. A case in point is the RSA public key cryptosystem. Encryption and decryption in RSA work with integers and use number theoretic operations. Given a plaintext $p$ represented as a string of characters, we require a way to convert $p$ to an integer prior to encryption. Furthermore, if $c$ is the ciphertext corresponding to $p$, then it is most likely that $c$ is an integer. After decrypting $c$, we recover the original integer representation of $p$.

In this post, we use PARI/GP to study the RSA cryptosystem. The reader is assumed to have some familiarity with basic concepts from elementary number theory, basic programming constructs, and public key cryptography, in particular, the RSA cryptosystem. This post is not a beginner’s howto on PARI/GP. We assume that the reader has some prior experience in using PARI/GP.

What is PARI/GP?

The computer algebra system PARI/GP [2] is primarily aimed at computation in number theory. It can be used as a powerful, programmable desktop calculator similar to the Linux command $\texttt{bc}$. It can also be used to study number theory and basic cryptography. Here are screenshots of PARI/GP sessions under Linux and Windows:

A PARI/GP session under Ubuntu Linux.

A PARI/GP session under Windows XP.

Strings to numbers & vice versa

PARI/GP provides two commands to convert from a string of characters to integers and vice versa. The command $\texttt{Vecsmall}$ is able to convert a character string into a corresponding vector of ASCII encodings. Let $M$ be the plaintext string $\texttt{HELLO WORLD}$. To get a vector of ASCII encodings corresponding to $M$, we proceed as follows:

gp > M = "HELLO WORLD" %1 = "HELLO WORLD" gp > P = Vecsmall(M) %2 = Vecsmall([72, 69, 76, 76, 79, 32, 87, 79, 82, 76, 68])

Now $P$ is a vector of ASCII encodings. Notice that $\texttt{H}$ has the ASCII encoding $\texttt{72}$, $\texttt{E}$ has the encoding $\texttt{69}$ and so on. We also need to pay particular attention to the white space between $\texttt{HELLO}$ and $\texttt{WORLD}$. The white space character that is obtained with the spacebar key has ASCII encoding $\texttt{32}$. To convert $P$ to a character string, we use the command $\texttt{Strchr}$:

gp > Strchr(P) %3 = "HELLO WORLD"

We need to concatenate all ASCII encodings in the vector $P$ to obtain an integer corresponding to the message $M$. This can be accomplished using the command $\texttt{Str}$, which concatenates all of its arguments into a single string. The result of concatenating all elements of the vector $P$ is a string representation of an integer. We need to somehow convert that string into the integer that it represents. One simple method is to assign the digits to some variable, excluding the double quotation characters:

gp > a = ""; gp > for (i = 1, length(P), a = Str(a,P[i])) gp > a %5 = "7269767679328779827668" gp > a = 7269767679328779827668 %6 = 7269767679328779827668

The RSA cryptosystem

We now consider the RSA algorithm for encryption and decryption. The following is a list of steps in this algorithm, taken from the book [3, p.165]:

1. Choose two primes $p$ and $q$ and let $n = pq$.
2. Let $e \in \mathbb{Z}$ be positive such that $\gcd \big( e, \varphi(n) \big) = 1$.
3. Compute $d \in \mathbb{Z}$ such that $de \equiv 1 \pmod{\varphi(n)}$.
4. Our public key is the pair $(n,e)$ and our private key is the triple $(p,q,d)$.
5. For any non-zero integer $m$, encrypt $m$ using $c \equiv m^e \pmod{n}$.
6. Decrypt $c$ using $m \equiv c^d \pmod{n}$.

Public and private keys

Choosing a prime number is easy. PARI/GP provides a list of precomputed primes, accessible via the command $\texttt{primes}$. For example, $\texttt{primes(100)}$ returns a list of the first 100 primes. The command $\texttt{random(n)}$ returns a pseudo-random integer $k$ such that $0 \leq k \leq n-1$. Thus we can work through step 1 in the algorithm as follows:

gp > p = primes(40000)[random(40000) + 1] %7 = 338251 gp > q = primes(40000)[random(40000) + 1] %8 = 140333 gp > n = p * q %9 = 47467777583

For step 2, we need to find a positive integer that is coprime to $\varphi(n)$, called Euler’s totient (or phi) function. The command $\texttt{eulerphi(n)}$ counts the number of integers $k$, with $1 \leq k \leq n$, such that $\gcd( k,n) = 1$. The greatest common divisor of two integers can be computed using the command $\texttt{gcd}$. Using a loop, we can compute the required value of $e$ as follows:

gp > e = random(eulerphi(n)) %10 = 43556566359 gp > while (gcd(e, eulerphi(n)) != 1, e = random(eulerphi(n))) gp > e %11 = 3511612661

To calculate a value for $d$ in step 3, we use the extended Euclidean algorithm. By definition of congruence, the congruence

$de \equiv 1 \pmod{\varphi(n)}$

is equivalent to

$de - k \cdot \varphi(n) = 1$

where $k \in \mathbb{Z}$. From above, we already know the numeric values of $e$ and $\varphi(n)$. The extended Euclidean algorithm allows us to compute $d$ and $-k$. In PARI/GP, this can be accomplished via the command $\texttt{bezout}$. Given two integers $x$ and $y$, the command $\texttt{bezout(x, y)}$ returns a vector $\texttt{[u, v, g]}$ such that $g = \gcd(x,y)$ and $ux + vy = g$:

gp > d = bezout(e, eulerphi(n))[1] %12 = 5558100941

Thus our RSA public key is $(n, e) = (47467777583,\, 3511612661)$ and our corresponding private key is $(p, q, d) = (338251,\, 140333,\, 5558100941)$.

Encryption and decryption

Recall that our message is the string $\texttt{HELLO WORLD}$, which can be represented by the integer $\texttt{m = 7269767679328779827668}$. To encrypt our message, we raise $m$ to the power of $e$ and reduce the result modulo $n$. The command $\verb!Mod(a^b, n)!$ first computes $\verb!a^b!$, then reduces the result modulo $\texttt{n}$. If the exponent $\texttt{b}$ is a “large” integer, say with more than 20 digits, then performing modular exponentiation takes more than a few seconds. Sometimes this can result in the PARI stack to overflow. Brute force modular exponentiation is inefficient and, when performed using a computer, can quickly consume the computer’s memory.

There is a trick to efficiently perform modular exponentiation. Called the squaring trick, or the method of repeated squaring [1, p.879], we use the binary representation of the exponent to repeatedly perform squaring and reduce the running result modulo some fixed integer. The following is a PARI/GP script to perform modular exponentiation using repeated squaring. The pseudocode can be found in [1, p.879].

/* Modular exponentiation using repeated squaring. */
/* That is, we want to compute a^b mod n. */
modexp(a, b, n) = { \
local(d, bin); \
d = 1; \
bin = binary(b); \
for (i = 1, length(bin), \
d = Mod(d*d, n); \
if (bin[i] == 1, \
d = Mod(d*a, n); \
); \
); \
return(d); \
}

The script can be saved to a file called, say, $\texttt{/home/mvngu/modexp.gp}$. The file can then be read into PARI/GP using the command $\verb!\r!$. Notice the backslash character $\verb!\ !$ at the end of each line, except for the last line. This character tells PARI/GP that the script continues on the next line. If we perform brute force modular exponentiation, we would get something similar to the following:

gp > m = 7269767679328779827668; gp > Mod(m^e, n) *** length (lg) overflow

On the other hand, let us read our script into PARI/GP and perform modular exponentiation using repeated squaring:

gp > \r /home/mvngu/modexp.gp gp > modexp(m, e, n) %14 = Mod(16017121090, 47467777583) gp > c = 16017121090 %15 = 16017121090

Thus $\texttt{c = 16017121090}$ is the ciphertext. To recover our plaintext, we raise $\texttt{c}$ to the power of $\texttt{d}$ and reduce the result modulo $\texttt{n}$. Again, we can use modular exponentiation via repeated squaring to decrypt $\texttt{c}$:

gp > modexp(c, d, n) %16 = Mod(32758012945, 47467777583) gp > Mod(m, n) %17 = Mod(32758012945, 47467777583)

Although our result is $\texttt{32758012945}$, this integer is equivalent to $\texttt{m = 7269767679328779827668}$ modulo $\texttt{n = 47467777583}$. Hence we have recovered our plaintext.

References

1. T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein. Introduction to Algorithms (2nd ed.). The MIT Press, 2001.
2. PARI/GP version 2.3.4, Bordeaux, France, 2008, viewed 2008-07-31
3. W. Trappe and L. C. Washington. Introduction to Cryptography with Coding Theory (2nd ed.). Pearson Prentice Hall, 2006.
1. 29 August 2008 at 1:05 pm

You might be interested in looking at William Stein’s notes for his number theory course at http://modular.fas.harvard.edu/edu/Fall2001/124/ which use PARI extensively, and which include, among other things, an implementation of RSA.

2. 29 August 2008 at 1:40 pm

You might be interested in William Stein’s number theory course from 2001:

http://modular.fas.harvard.edu/edu/Fall2001/124/

which uses PARI extensively, and which discusses (among other things) RSA.

3. 22 December 2008 at 3:31 pm

well, came across your page, very timely for me as I am fiddling with a custom encryption using Pari/GP and also have more limited xposure to SAGE.

4. 17 March 2010 at 11:56 am

You seem to be using eulerphi(n) repeatedly. While mathematically correct, it requires the same amount of calculations as breaking the code (by factoring n). RSA *relies* on the fact that it is not easy to do (with bigger primes p and q). You could simply use the fact that phi(n) = (p-1)(q-1).

The decoding of the encrypted message seems to be a problem as well. I cannot see how you can recover your original m modulo n if m is greater than n. You should always have n > m, for example by dividing your message to parts of smaller size.

5. 1 May 2010 at 9:06 pm

Your modular exponentiation function is well-written, but it’s not necessary when using Pari. Instead of finding
Mod(m^e, n),