Archive - programming

Interoperability between Source Code Management systems with Tailor

Tailor is an application that lets you migrate changesets between different kinds of source code repositories. It is written in Python and supports most open source SCM systems.

With Tailor you can:

  1. Create a local repository using your favorite source control system of a project managed by another source control system.
  2. Do your work.
  3. Export your changes back to the original repository.

Here’s an example of using Tailor to import a branch of Axiom (uses Subversion) into a local darcs repository:

First of all, we create a configuration file for the project:

$ tailor --verbose --source-kind svn --target-kind darcs \
--source-repository https://svn.sourceforge.net/svnroot/axiom \
--source-module branches/build-improvements \
--start-revision INITIAL \
--target-repository file:///home/rwx/lab/math/axiom/axiom-darcs/ \
--target-module axiom-build-improvements axiom-build-improvements \
> axiom-build-improvements.tailor

Now that we have the config file stored in axiom-build-improvements.tailor, we can launch the tool to do the initial import:

$ tailor --configfile=axiom-build-improvements.tailor

After a while, we have a local darcs repository for that branch and we can write:

$ darcs get ~/lab/math/axiom/axiom-darcs

to get a working copy in which to make our modifications.
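The two Tailor invocations above could also be scripted. What follows is only a minimal sketch: the function names bootstrap_args and bootstrap are hypothetical, the flags are exactly the ones used in the commands above, and reading the trailing axiom-build-improvements argument as a project name is an assumption.

```python
import subprocess


def bootstrap_args(source_kind, target_kind, source_repository,
                   source_module, target_repository, target_module, project):
    """Build the argument list for the config-generating call shown above."""
    return [
        "tailor", "--verbose",
        "--source-kind", source_kind,
        "--target-kind", target_kind,
        "--source-repository", source_repository,
        "--source-module", source_module,
        "--start-revision", "INITIAL",
        "--target-repository", target_repository,
        "--target-module", target_module,
        project,  # assumed to be the project name (last argument above)
    ]


def bootstrap(config_path, **options):
    """Save the generated configuration, then run the initial import."""
    with open(config_path, "w") as config:
        subprocess.run(bootstrap_args(**options), stdout=config, check=True)
    subprocess.run(["tailor", "--configfile=" + config_path], check=True)
```

Calling bootstrap("axiom-build-improvements.tailor", source_kind="svn", target_kind="darcs", ...) with the values used earlier would run the same two steps, provided tailor is on the PATH.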

Today there is a myriad of SCM tools available, and projects like Tailor will become increasingly important because they let you stick to your tools of choice. Instead of learning the details of other source control systems, you can focus on what you do best: writing code.

Halberd screen shots

Because a picture is worth a thousand words, I uploaded some screen shots of halberd in action.

Halberd 0.2.1 is out!

I just released the next revision (0.2.1) of halberd, my load balancer detection tool. If you’re curious about the way the program works, you can read this part of the user’s guide.

Halberd has been tested in real world scenarios for quite some time and it seems to be solid. I hope the wider audience it is gaining now will uncover some bugs; after fixing those, I'll consider it stable software. Future work could happen in the following areas:

  • Clustering algorithm
    • The module Halberd.clues.analysis currently implements an ad-hoc hierarchical clustering algorithm to isolate possible real servers. I would like halberd to report to users the degree of trust they should place in its conclusions.
    • I think the way to go would be to test some algorithms in R (fuzzy clustering comes to mind) using real world data and see what works best before implementing anything.
  • SSL session reuse
    • When an SSL/TLS session begins, the server issues an SSL session ID to the client. This ID will be used to resume transactions between client and server (remember the stateless nature of HTTP).
    • Some load balancers can keep track of which real server dealt with which SSL session and direct the client to the right server (the one having the client’s session ID in its cache). This could be used by halberd as an extra technique to enumerate real servers.
  • Test suite improvements
    • The test harness is tied to my own development environment. This should change.
    • More tests never hurt.
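
To give a feel for what hierarchical clustering means here, this is a minimal sketch of single-linkage agglomerative clustering over plain numbers. It is not the code in Halberd.clues.analysis. The function name cluster, the threshold parameter, and the idea of reducing each clue to one number (say, a clock difference in seconds) are all assumptions made for illustration.

```python
def cluster(values, threshold):
    """Single-linkage agglomerative clustering of numbers.

    Starts with one cluster per value and keeps merging the two closest
    clusters until the closest pair is more than `threshold` apart.
    """
    clusters = [[v] for v in sorted(values)]
    while len(clusters) > 1:
        # With sorted 1-D data, the closest pair of clusters is always adjacent.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[i] > threshold:
            break
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


# Hypothetical clock differences (in seconds) observed across replies.
groups = cluster([0, 1, 2, 30, 31, 95], threshold=5)
print(len(groups), "candidate real servers:", groups)
```

The size of the gaps left between the final clusters, compared with the threshold, is one possible starting point for the kind of confidence figure mentioned above.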

Introduction to the lambda calculus

A Brave New Hope briefly reviews an interesting text on the lambda calculus. This reminded me of one of the books that got me started in functional programming: An introduction to functional programming through lambda calculus by Greg Michaelson. It is an enjoyable and fast-paced text which I’d recommend if you’re looking for a good introduction to the subject.

First public release of halberd

Halberd is a tool I wrote two years ago to detect HTTP load balancers. I recently decided I should polish it, write some documentation and release it.

You can use halberd as a stand-alone command or as a Python module to be imported by other software.

Here it is for your enjoyment.

Python code coverage revisited

Yesterday Ned Batchelder published an updated version of the code coverage tool for Python I mentioned in a past entry.

How do I use nmap XML?

Recently, in the nmap-dev mailing list, Fyodor asked:

In what ways do you use the Nmap XML output? Do you parse it from within a higher level program, transform it to HTML with XSLT, use it to populate a database, use XPath to parse the results from the command-line in a way that is as easy as awk/sed/cut/etc. on the normal output, or something else entirely?

I’ll share here my approach to nmap output parsing.

For my automated scans I use a combination of Python, Bash and AWK scripts. I always keep nmap scans in XML, even when they will be consumed by Bash/AWK scripts.

With Python I just parse the XML with libxml’s Python bindings.
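
As a rough illustration of the Python side, here is a minimal sketch that pulls the port table out of a scan. It uses the standard library's xml.etree.ElementTree rather than the libxml bindings mentioned above, so that it runs without extra packages. SAMPLE is a hand-written fragment trimmed down to the port, state and service elements of nmap's XML output, and list_ports is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily trimmed stand-in for a real nmap XML report.
SAMPLE = """<nmaprun>
  <host>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh"/>
      </port>
      <port protocol="udp" portid="636">
        <state state="open|filtered"/>
      </port>
    </ports>
  </host>
</nmaprun>"""


def list_ports(xml_text):
    """Return (protocol, port, state, service) tuples for every <port>."""
    rows = []
    for port in ET.fromstring(xml_text).iter("port"):
        state = port.find("state")
        service = port.find("service")
        rows.append((
            port.get("protocol"),
            int(port.get("portid")),
            state.get("state") if state is not None else "",
            service.get("name", "") if service is not None else "",
        ))
    return rows


for row in list_ports(SAMPLE):
    print(*row)
```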

With Bash and/or AWK I transform the XML output into PYX format with a custom-made utility called xmltopyx.

For those not familiar with PYX, it is a way of converting XML documents into a more grep/AWK friendly format. More information about it can be found here and here.
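
In PYX every line starts with a one-character marker: ( opens an element, A gives an attribute as name and value, - carries character data and ) closes an element. The following is a minimal sketch of such a converter built on xml.sax. It is not the xmltopyx utility itself; PyxHandler and xml_to_pyx are hypothetical names. As a simplification it drops whitespace-only text and emits each chunk of text the parser reports on its own line.

```python
import xml.sax


class PyxHandler(xml.sax.ContentHandler):
    """Collect PYX lines while the SAX parser walks the document."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def startElement(self, name, attrs):
        self.lines.append("(" + name)
        for key in attrs.getNames():
            self.lines.append("A%s %s" % (key, attrs.getValue(key)))

    def endElement(self, name):
        self.lines.append(")" + name)

    def characters(self, content):
        # Simplification: skip whitespace-only text between elements.
        if content.strip():
            self.lines.append("-" + content.replace("\n", "\\n"))


def xml_to_pyx(xml_bytes):
    """Convert an XML document (as bytes) into a list of PYX lines."""
    handler = PyxHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.lines


print("\n".join(xml_to_pyx(b'<port portid="22"><state state="open"/></port>')))
```

Because each fact ends up on a line of its own, tools such as grep and AWK can match on the first character and the element or attribute name without any XML awareness.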

An example of xmltopyx + AWK usage:

$ xmltopyx nmap-sample-tcpudp-portscan.xml | awk -f getports
tcp 21 open ftp
tcp 22 open ssh
tcp 53 open domain
udp 53 open|filtered domain
tcp 111 open rpcbind
udp 111 open|filtered rpcbind
udp 608 open|filtered sift-uft
tcp 611 open npmp-gui
udp 636 open|filtered
tcp 639 open
udp 664 open|filtered
udp 667 open|filtered
tcp 670 open
tcp 953 open rndc
tcp 2049 open nfs
udp 2049 open|filtered nfs
tcp 3128 open squid-http
udp 3130 open|filtered squid-ipc
udp 3401 open|filtered squid-snmp
udp 4827 open|filtered squid-htcp
udp 32768 open|filtered omad
udp 32771 open|filtered sometimes-rpc6

Feeding that output into a Bash loop such as while read proto port state service; do ... ; done is then very simple.
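
The getports AWK script is not reproduced here. As a guess at the kind of logic involved, this Python sketch walks PYX lines and prints one "proto port state service" row per port element. The name ports_from_pyx and the sample lines are hypothetical, and the element and attribute names assume nmap's usual port, state and service layout.

```python
def ports_from_pyx(lines):
    """Turn PYX lines into 'proto port state service' rows."""
    rows = []
    current = None   # fields of the <port> being read, if any
    element = None   # name of the most recently opened element
    for line in lines:
        kind, rest = line[:1], line[1:]
        if kind == "(":
            element = rest
            if rest == "port":
                current = {"protocol": "", "portid": "", "state": "", "service": ""}
        elif kind == "A" and current is not None:
            name, _, value = rest.partition(" ")
            if element == "port" and name in ("protocol", "portid"):
                current[name] = value
            elif element == "state" and name == "state":
                current["state"] = value
            elif element == "service" and name == "name":
                current["service"] = value
        elif kind == ")" and rest == "port" and current is not None:
            # Empty fields (e.g. an unknown service) are left out of the row.
            rows.append(" ".join(v for v in current.values() if v))
            current = None
    return rows


SAMPLE_PYX = [
    "(port", "Aprotocol tcp", "Aportid 22",
    "(state", "Astate open", ")state",
    "(service", "Aname ssh", ")service",
    ")port",
    "(port", "Aprotocol udp", "Aportid 636",
    "(state", "Astate open|filtered", ")state",
    ")port",
]
print("\n".join(ports_from_pyx(SAMPLE_PYX)))
```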

Python code coverage and reverse engineering

I just found these Python coding style guidelines and I want to comment on the following paragraph regarding changing a function/method's behaviour:

Even with unit testing, it’s really hard to track dependencies like this. We would need code coverage tools and exhaustive tests, neither of which we had.

I also find it hard to track dependencies in Python code, and I'm aware of at least one code coverage tool for Python that is useful for improving test suites (but try to avoid some common pitfalls).
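
To make the idea of line coverage concrete, here is a minimal sketch built on sys.settrace from the standard library. It is not the coverage tool referred to above, only an illustration of the principle; executed_lines and classify are hypothetical names.

```python
import sys


def executed_lines(func, *args, **kwargs):
    """Run func and return the line offsets (relative to its def) that ran."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(previous)
    return seen


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


# A test suite that only ever passes positive numbers never reaches
# offset 2, the 'return "negative"' line.
print(sorted(executed_lines(classify, 5)))
```

Lines that never show up across a whole test run point at behaviour the suite does not exercise, which is exactly where a changed function can break callers unnoticed.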

Parsing incorrect HTML

TidyLib provides a command-line tool and a library to turn badly formed HTML/XHTML pages into standards compliant ones.

Besides helping the webmaster who wants to make sure a website is well-formed, this package is also useful for web client developers who want to sanitize invalid HTML before feeding it to their parsers.

There’s even a Python interface called uTidyLib.
