Parsing incorrect HTML
TidyLib provides a command-line tool and a library that turn badly formed HTML/XHTML pages into standards-compliant ones.
Besides helping webmasters make sure their websites are well-formed, the package is also useful for web client developers who want to sanitize invalid HTML before feeding it to their parsers.
There’s even a Python interface called uTidyLib.
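As a rough illustration of the "sanitize before parsing" idea, here is a minimal sketch that pipes a fragment through the command-line tool from Python. It deliberately uses the `tidy` executable rather than uTidyLib, and it assumes that executable is installed and on your PATH. The helper name `tidy_to_xhtml` is made up for this example and is not part of either package.

```python
import shutil
import subprocess


def tidy_to_xhtml(html):
    """Return a cleaned-up XHTML document, or None if `tidy` is not installed.

    Hypothetical helper: it only wraps the `tidy` command-line tool.
    """
    exe = shutil.which("tidy")
    if exe is None:
        return None  # assumption: the caller decides what to do without tidy

    result = subprocess.run(
        [
            exe,
            "-q",                      # suppress non-essential output
            "-utf8",                   # read and write UTF-8
            "-asxhtml",                # convert the input to XHTML
            "--show-warnings", "no",   # do not report warnings
            "--force-output", "yes",   # emit output even if errors were found
        ],
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=False,  # tidy exits non-zero on warnings and errors
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    cleaned = tidy_to_xhtml("<p>unclosed <b>bold")
    print(cleaned if cleaned is not None else "tidy is not installed")
```

The returned string is a complete document, so it can be handed to a stricter parser afterwards. Note that `check=False` is intentional: the tool signals warnings and errors through its exit status, which should not be treated as a failure of the call itself.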
About this entry
You’re currently reading “Parsing incorrect HTML,” an entry on Reality tunnels
- Published: 10.05.04 / 1am
- Category: programming, python