Archive

Archive for the ‘Python’ Category

Converting from binary to integer

26 October 2010 1 comment

The following is an updated and edited version of my posts to this sage-support thread.

Problem

You have a bitstring as output by

sage.crypto.stream.blum_blum_shub

and you want to convert that bitstring to an integer. Or in general, you want to convert a bit vector to its integer representation.

Solution

Here are two ways, assuming that you want the bits in little-endian order, i.e. you read the bits from right to left in increasing order of powers of 2.

sage: version()
'Sage Version 4.5.3, Release Date: 2010-09-04'
sage: from sage.crypto.stream import blum_blum_shub
sage: b = blum_blum_shub(length=6, lbound=10**4, ubound=10**5); b
100110
sage: type(b)
<class 'sage.monoids.string_monoid_element.StringMonoidElement'>
sage: # read in little-endian order
sage: # conversion using Python's built-in int()
sage: int(str(b), base=2)
38
sage: # conversion using Sage's built-in Integer()
sage: Integer(str(b), base=2)
38


Now assume you read the bitstring as output by blum_blum_shub in big-endian order, i.e. from left to right in increasing order of powers of 2. You simply convert the bitstring to a string, reverse that string, and apply any of the above two methods.

sage: # reversing a string
sage: str(b)
'100110'
sage: str(b)[::-1]
'011001'
sage: # read in big-endian order
sage: int(str(b)[::-1], base=2)
25
sage: Integer(str(b)[::-1], base=2)
25


Or you can do as follows:

sage: b = "100110"
sage: sum(Integer(i) * (2^Integer(e)) for e, i in enumerate(b))
25
sage: sum(Integer(i) * (2^Integer(e)) for e, i in enumerate(b[::-1]))
38


Another way is to use Horner’s method. Here’s a Sage function that computes the integer representation of a bit vector read using big-endian order. A usage example is also shown.

sage: def horner(A, x0):
...       # Evaluate the polynomial P(x) at x = x_0.
...       #
...       # INPUT
...       #
...       # - A -- list of coefficients of P where A[i] is the coefficient of
...       #   x_i.
...       # - x0 -- the value x_0 at which to evaluate P(x).
...       #
...       # OUTPUT
...       #
...       # An evaluation of P(x) using Horner's method.
...       i = len(A) - 1
...       b = A[i]
...       i -= 1
...       while i >= 0:
...           b = b*x0 + A[i]
...           i -= 1
...       return b
sage: A = [1, 0, 0, 1, 1, 0]
sage: horner(A, 2)
25


As an exercise, modify the function horner to output the integer representation of a bit vector that is read using little-endian order.

Optimized parity testing

To test the parity of an integer is to determine whether it is even or odd. Letting n = 132469, we can test the parity of n by computing its value modulo 2. This can be done either by using the Sage built-in mod function, the Python modulo operator %, or by using the Python bitwise operator &. The operator & is bitwise conjunction, i.e. it corresponds to multiplication over the Galois field $\mathbf{F}_2$ of two elements. The integer n = 132469 is odd, hence the result of parity testing via mod, %, and & should each return 1.

sage: n = 132469
sage: mod(n, 2).lift()
1
sage: mod(n, 2)
1
sage: n % 2
1
sage: n & 1
1


However, the test using the bitwise operator & is the fastest of all:

sage: %timeit mod(n, 2).lift()
625 loops, best of 3: 42.2 micro second per loop
sage: %timeit mod(n, 2)
625 loops, best of 3: 42.2 micro second per loop
sage: %timeit n % 2
625 loops, best of 3: 1.22 micro second per loop
sage: %timeit n & 1
625 loops, best of 3: 1.02 micro second per loop


The program below demonstrates how to do parity testing using C.

#include <stdio.h>

/* Parity testing using bitwise AND.
*/
static void parity_and(int n)
{
if (n & 1)
printf("%i is odd\n", n);
else
printf("%i is even\n", n);
}

/* Parity testing using the modulus operator.
*/
static void parity_mod(int n)
{
if (n % 2)
printf("%i is odd\n", n);
else
printf("%i is even\n", n);
}

int main(void)
{
int n = 132469;
int m = 132470;

printf("Parity testing using modulus operator\n");
parity_mod(n);
parity_mod(m);
printf("Parity testing using bitwise AND\n");
parity_and(n);
parity_and(m);

return 0;
}


The output of the program is:

Parity testing using modulus operator
132469 is odd
132470 is even
Parity testing using bitwise AND
132469 is odd
132470 is even


List within a list

The problem below was asked on the AskSage forum. Below I restate the problem and three possible solutions.

Problem

I have a short list of primes

sage: L = [2, 5, 23]


and a longer list of primes

sage: P = primes_first_n(20); P
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]


How do I check that all members of L are contained within P?

Solution

You can use a brute-force search by defining your own custom function. This option doesn’t assume that elements in your lists are unique. Your lists can contain duplicate elements if you want.

sage: def is_sublist(shortlist, longlist):
....:     for e in shortlist:
....:         if not (e in longlist):
....:             return False
....:     return True
....:
sage: L = [2, 5, 23]
sage: P = primes_first_n(20); P
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
sage: is_sublist(L, P)
True
sage: L + [23]
[2, 5, 23, 23]
sage: is_sublist(L + [23], P)
True
sage: L.append(next_prime(P[-1])); L
[2, 5, 23, 73]
sage: is_sublist(L, P)
False
sage: is_sublist(L + [23], P)
False


Alternatively, you can use the built-in functions itertools.imap and all. The function itertools.imap is efficient when your lists are large, e.g. having hundreds or even hundreds of thousands of elements. This second option doesn’t care if your lists have duplicate elements.

sage: import itertools
sage: L = [2, 5, 23]
sage: P = primes_first_n(20); P
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
sage: L + [23]
[2, 5, 23, 23]
sage: all(itertools.imap(lambda x: x in P, L))
True
sage: all(itertools.imap(lambda x: x in P, L + [23]))
True
sage: L.append(next_prime(P[-1])); L
[2, 5, 23, 73]
sage: all(itertools.imap(lambda x: x in P, L))
False
sage: all(itertools.imap(lambda x: x in P, L + [23]))
False


Or, as Mitesh Patel said, you could use set. This third approach assumes that the elements in each list are unique, i.e. each list doesn’t contain duplicate elements.

sage: L = [2, 5, 23]
sage: P = set(primes_first_n(20))
sage: set(L)
set([2, 5, 23])
sage: set(L).issubset(P)
True
sage: set(L + [23])
set([2, 5, 23])
sage: set(L + [23]).issubset(P)
True
sage: L.append(111); L
[2, 5, 23, 111]
sage: set(L)
set([2, 111, 5, 23])
sage: set(L + [111])
set([2, 111, 5, 23])
sage: set(L + [111]).issubset(P)
False
sage: set(L).issubset(P)
False


Thematic tutorials

The release of Sage 4.5.2 exposes a new category of documentation called “thematic tutorials”. Like any tutorial, a thematic tutorial discusses how to use Sage but with a focus on a particular topic. The thematic tutorial page currently lists two tutorials:

More are currently in preparation. See, for example, tickets 8469, 8442, 8467, 3624, and 8466. Any help is appreciated in reviewing or working on these tickets. Any suggestions for thematic tutorial topics? Or are there any existing tutorials that should go under the rubric of thematic tutorials?

Typesetting Sage code listings in LaTeX

I needed to typeset Sage code listings in a LaTeX document. I have used the listings package before, but I want to customize with colours for keywords, comments, strings, etc. Borrowing some customization from William Stein’s ICMS 2010 paper, I think I have my customization just about right. The relevant code snippet for the preamble is:

%% For typesetting code listings
\usepackage{listings}
\lstdefinelanguage{Sage}[]{Python}
{morekeywords={False,sage,True},sensitive=true}
\lstset{
frame=none,
showtabs=False,
showspaces=False,
showstringspaces=False,
keywordstyle={\ttfamily\color{dbluecolor}\bfseries},
stringstyle={\ttfamily\color{dgraycolor}\bfseries},
language=Sage,
basicstyle={\fontsize{10pt}{10pt}\ttfamily},
aboveskip=0.3em,
belowskip=0.1em,
numbers=left,
numberstyle=\footnotesize
}
\definecolor{dblackcolor}{rgb}{0.0,0.0,0.0}
\definecolor{dbluecolor}{rgb}{0.01,0.02,0.7}
\definecolor{dgreencolor}{rgb}{0.2,0.4,0.0}
\definecolor{dgraycolor}{rgb}{0.30,0.3,0.30}
\newcommand{\dblue}{\color{dbluecolor}\bf}
\newcommand{\dred}{\color{dredcolor}\bf}
\newcommand{\dblack}{\color{dblackcolor}\bf}


And here’s an example

\begin{lstlisting}
sage: R.<x> = ZZ[]
sage: type(R.an_element())
<type 'sage.rings...Polynomial_integer_dense_flint'>
sage: R.<x,y> = ZZ[]
sage: type(R.an_element())
<type 'sage.rings...MPolynomial_libsingular'>
sage: R = PolynomialRing(ZZ, 'x', implementation='NTL')
sage: type(R.an_element())  # this is a comment
<type 'sage.rings...Polynomial_integer_dense_ntl'>
sage: def abc():
...       """
...       This should be a very long comment.
...       That should span multiple lines.
...       To illustrate what colour Sage comments look like.
...       To get a feel for the color when rendered using LaTeX.
...       """
...       return 2
\end{lstlisting}


This renders as follows:

Pushing towards 90% doctest coverage for Sage 5.0

This is an edited version of my post to sage-devel. One of the main goals of the upcoming Sage 5.0 release is to get doctest coverage of the Sage library up to at least 90%. As of Sage 4.4.4.alpha0, the overall weighted coverage is 82.7%. To get a sense of which modules in the Sage library need work on their coverage scores, you could use the coverage script as follows:

$./sage -coverage /path/to/module.py[x]  Or you could do the following to get the coverage scores of all modules, including a coverage summary: $ ./sage -coverageall


You might be interested in knowing which modules have a certain coverage percentage, in which case you could save the output of -coverageall to a text file and then grep that file for certain coverage scores. At this repository is a script to generate various types of coverage analysis reports. You can also find the script here. The script currently supports the following reports

1. The coverage summary of all modules.
2. Modules with 100% coverage.
3. Modules with zero coverage.
4. Modules with between 1% and 9% coverage.
5. Modules with between 10% and 19% coverage.
6. Modules with between 20% and 29% coverage.
7. Modules with between 30% and 39% coverage.
8. Modules with between 40% and 49% coverage.
9. Modules with between 50% and 59% coverage.
10. Modules with between 60% and 69% coverage.
11. Modules with between 70% and 79% coverage.
12. Modules with between 80% and 89% coverage.
13. Modules with between 90% and 99% coverage.

Each report has links to detailed reports for individual modules. To run the script, copy it to the SAGE_ROOT of a Sage source or binary installation and do

[mvngu@sage sage-4.4.4.alpha0]\$ ./coverage-status.py
Coverage report of all modules...
Summary of doctest coverage...
Modules with 0% coverage...
Modules with 100% coverage...
Coverage reports within certain ranges...
Detailed coverage report for all modules...
Format the detailed coverage reports...
Format the summary reports...
Generate index.html...


And you’re done. Here is a report generated by the script. The idea is to provide an overview of which modules need work. I’d be interested to know what other types of doctest coverage reports people would like to see. Comments, suggestions, critiques, etc. are welcome.

Binary trees and branch cut

1 May 2010 1 comment

The topic and content of this post originate from a question to sage-support. I’m posting the essential responses here so it doesn’t get lost in the sage-support archive.

Problem

How do you construct a binary tree in Sage? If T is a binary tree, how do you cut off a branch of that tree?

Solution

There is as yet no class for representing binary trees in Sage. However, you could use either the classes Graph or DiGraph to construct a graph T and then use the method T.is_tree() to determine whether or not T is a tree. There is also a balanced tree generator. Also missing is a method to determine whether or not a tree is binary. That can be remedied by defining your own function to test the number of children a vertex has. Using Graph to construct a tree, and then test that tree to see that it is binary, is rather difficult because unless you label the vertices to indicate their parents, you don’t know which vertex is a child of which other vertex.

In general, I prefer using DiGraph to construct a tree T and then use the method T.neighbors_out() in testing whether or not T is a binary tree. The reason is that in a digraph that represents a tree, you can think of the out-neighbors of a vertex as being the children of that vertex. Here is an example demonstrating the construction of a binary tree rooted at vertex v. By definition, a vertex in a binary tree has at most 2 children. The session below uses this definition to test whether or not a tree is binary.

sage: T = DiGraph({"v": ["a", "w"],
....: "w": ["x", "y"],
....: "x": ["c", "b"],
....: "y": ["z", "d"],
....: "z": ["f", "e"]})
sage: T.vertices()
['a', 'b', 'c', 'd', 'e', 'f', 'v', 'w', 'x', 'y', 'z']
sage: T.edges(labels=None)
[('v', 'a'), ('v', 'w'), ('w', 'x'), ('w', 'y'), ('x', 'b'), ('x', 'c'), ('y', 'd'), ('y', 'z'), ('z', 'e'), ('z', 'f')]
sage: T.is_tree()
True
sage: def is_binary_tree(tree):
....:     for v in tree.vertex_iterator():
....:         if len(tree.neighbors_out(v)) > 2:
....:             return False
....:     return True
....:
sage: is_binary_tree(T)
True


Nathann Cohen offered another way to test that a graph is a binary tree.

sage: def is_binary_tree(g):
....:     if g.is_tree() and max(g.degree()) == 3 and g.degree().count(2) == 1:
....:         return True
....:     return False
....:
sage: is_binary_tree(T)
True


Once you have determined the root vertex of a branch that you want to cut off, you could use breadth-first search (or depth-first search) to determine all vertices in that branch. Again, assume that your binary tree T is represented using the DiGraph class and V is a list of vertices in the branch you want to cut off. You can use the method T.delete_vertices() to cut off that branch. Deleting a vertex v not only deletes v, but also all edges incident on that vertex. Say you have constructed your tree as in the above session and you have determined that the vertex y is the root of the branch you want to cut off. Here is how you can cut off that branch:

sage: V = list(T.breadth_first_search("y"))
sage: V
['y', 'd', 'z', 'e', 'f']
sage: T.delete_vertices(V)
sage: T.vertices()
['a', 'b', 'c', 'v', 'w', 'x']
sage: T.edges(labels=None)
[('v', 'a'), ('v', 'w'), ('w', 'x'), ('x', 'b'), ('x', 'c')]