Tag Archive - programming

First public release of halberd

Halberd is a tool I wrote two years ago to detect HTTP load balancers. I recently decided I should polish it, write some documentation and release it.

You can use halberd as a stand-alone command or as a Python module imported by other software.

Here it is for your enjoyment.

Python code coverage revisited

Yesterday Ned Batchelder published an updated version of the code coverage tool for Python I mentioned in a past entry.

How do I use nmap XML?

Recently, in the nmap-dev mailing list, Fyodor asked:

In what ways do you use the Nmap XML output? Do you parse it from within a higher level program, transform it to HTML with XSLT, use it to populate a database, use XPath to parse the results from the command-line in a way that is as easy as awk/sed/cut/etc. on the normal output, or something else entirely?

I’ll share here my approach to nmap output parsing.

For my automated scans I use a combination of Python, Bash and AWK scripts. I always keep nmap scan results in XML, even when they will be consumed by Bash/AWK scripts.

With Python I just parse the XML with libxml’s Python bindings.
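
As a rough illustration of the Python side, here is a minimal sketch that uses the standard library's xml.etree.ElementTree instead of the libxml bindings mentioned above. The embedded sample document and the list_ports helper are made up for this example; only the host, address, port, state and service elements and their attributes follow nmap's XML output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample of nmap XML output (not from a real scan).
SAMPLE = """<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh"/>
      </port>
      <port protocol="udp" portid="53">
        <state state="open|filtered"/>
        <service name="domain"/>
      </port>
      <port protocol="tcp" portid="639">
        <state state="open"/>
      </port>
    </ports>
  </host>
</nmaprun>
"""


def list_ports(xml_text):
    """Return (address, protocol, port, state, service) tuples, one per port."""
    root = ET.fromstring(xml_text)
    rows = []
    for host in root.iter("host"):
        address = host.find("address").get("addr")
        for port in host.iter("port"):
            state = port.find("state").get("state")
            # nmap omits <service> when it has no name for the port.
            service = port.find("service")
            name = service.get("name", "") if service is not None else ""
            rows.append((address, port.get("protocol"),
                         int(port.get("portid")), state, name))
    return rows


if __name__ == "__main__":
    for row in list_ports(SAMPLE):
        print(*row)
```

Returning plain tuples keeps the sketch easy to feed into a database insert or a report, but a real script would probably also want the hostnames and scan timestamps.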

With Bash and/or AWK I first transform the XML output into PYX format with a custom-made utility called xmltopyx.

For those not familiar with PYX, it is a way of converting XML documents into a more grep/AWK friendly format. More information about it can be found here and here.
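
To make the format concrete, here is a small sketch of an XML to PYX conversion written with Python's xml.sax module. It is not the xmltopyx utility itself, just an illustration under the usual PYX conventions as I understand them: a line starting with "(" opens an element, "A" carries an attribute name and value, "-" carries character data, and ")" closes an element. The xml_to_pyx and PyxHandler names are invented for the example.

```python
import xml.sax


def _escape(text):
    # PYX is line-oriented, so backslashes, newlines and tabs are escaped.
    return text.replace("\\", "\\\\").replace("\n", "\\n").replace("\t", "\\t")


class PyxHandler(xml.sax.ContentHandler):
    """Collect one PYX line per SAX event."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def startElement(self, name, attrs):
        self.lines.append("(" + name)
        for key in attrs.getNames():
            self.lines.append("A" + key + " " + _escape(attrs.getValue(key)))

    def endElement(self, name):
        self.lines.append(")" + name)

    def characters(self, content):
        # The parser may deliver a text node in several chunks;
        # each chunk becomes its own "-" line.
        self.lines.append("-" + _escape(content))


def xml_to_pyx(xml_text):
    """Return the PYX lines for an XML document given as a string."""
    handler = PyxHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines


if __name__ == "__main__":
    doc = '<port protocol="tcp" portid="22"><state state="open"/></port>'
    print("\n".join(xml_to_pyx(doc)))
```

Because every event sits on its own line with a one-character type prefix, tools such as grep and AWK can select elements and attributes without any XML parsing of their own.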

An example of xmltopyx + AWK usage:

$ xmltopyx nmap-sample-tcpudp-portscan.xml | awk -f getports
tcp 21 open ftp
tcp 22 open ssh
tcp 53 open domain
udp 53 open|filtered domain
tcp 111 open rpcbind
udp 111 open|filtered rpcbind
udp 608 open|filtered sift-uft
tcp 611 open npmp-gui
udp 636 open|filtered
tcp 639 open
udp 664 open|filtered
udp 667 open|filtered
tcp 670 open
tcp 953 open rndc
tcp 2049 open nfs
udp 2049 open|filtered nfs
tcp 3128 open squid-http
udp 3130 open|filtered squid-ipc
udp 3401 open|filtered squid-snmp
udp 4827 open|filtered squid-htcp
udp 32768 open|filtered omad
udp 32771 open|filtered sometimes-rpc6

From there, piping the output of getports into a while read proto port state service; do ... ; done loop in Bash is very simple.
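
The getports AWK script is not reproduced here, but the logic it needs is short. The following Python sketch shows one way such a filter could work on PYX input, assuming lines starting with "(" open an element, "A" lines carry "name value" attribute pairs, and ")" lines close an element. The ports_from_pyx function and the sample lines are hypothetical.

```python
def ports_from_pyx(lines):
    """Yield (proto, port, state, service) tuples from PYX lines of an nmap scan."""
    element = None   # name of the element whose attributes are being read
    current = None   # fields of the <port> currently being assembled
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        kind, rest = line[0], line[1:]
        if kind == "(":
            element = rest
            if rest == "port":
                current = {"proto": "", "port": "", "state": "", "service": ""}
        elif kind == "A" and current is not None:
            name, _, value = rest.partition(" ")
            if element == "port" and name == "protocol":
                current["proto"] = value
            elif element == "port" and name == "portid":
                current["port"] = value
            elif element == "state" and name == "state":
                current["state"] = value
            elif element == "service" and name == "name":
                current["service"] = value
        elif kind == ")" and rest == "port" and current is not None:
            yield (current["proto"], current["port"],
                   current["state"], current["service"])
            current = None


if __name__ == "__main__":
    # Hypothetical PYX fragment describing two ports.
    sample = [
        "(port", "Aprotocol tcp", "Aportid 21",
        "(state", "Astate open", ")state",
        "(service", "Aname ftp", ")service",
        ")port",
        "(port", "Aprotocol udp", "Aportid 636",
        "(state", "Astate open|filtered", ")state",
        ")port",
    ]
    for fields in ports_from_pyx(sample):
        print(" ".join(fields).rstrip())
```

Ports without a service element simply get an empty last field, which matches the shape of the listing above where some lines stop after the state.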

Python code coverage and reverse engineering

I just found these Python coding style guidelines and I want to comment on the following paragraph about changing a function's or method's behaviour:

Even with unit testing, it’s really hard to track dependencies like this. We would need code coverage tools and exhaustive tests, neither of which we had.

I also find it hard to track dependencies in Python code, and I'm aware of at least one code coverage tool for Python that is useful for improving test suites (but try to avoid some common pitfalls).
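
To show what a coverage tool measures, here is a toy sketch built on the standard library's sys.settrace hook. It is not the tool referred to above and makes no claim about how that tool is implemented; the executed_lines and classify names are invented for the example. It records which lines of a single function run for a given input, which is enough to spot a branch that a test never reaches.

```python
import sys


def executed_lines(func, *args):
    """Run func(*args) and return the line offsets (from the def line) that ran."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Ignore frames that do not belong to the function under study.
        if frame.f_code is not code:
            return None
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return seen


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


if __name__ == "__main__":
    # Offset 2 is the 'return "negative"' line; it only runs for negative input.
    print(sorted(executed_lines(classify, 5)))
    print(sorted(executed_lines(classify, -1)))
```

A test suite that only ever calls classify with positive numbers would never record offset 2, and that kind of gap is exactly what a coverage report makes visible.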

Parsing incorrect HTML

TidyLib provides a command-line tool and a library to turn badly formed HTML/XHTML pages into standards-compliant ones.

Besides helping webmasters who want to make sure their websites are well-formed, this package is also useful for web client developers who want to sanitize invalid HTML before feeding it to their parsers.

There’s even a Python interface called uTidyLib.
