Parsing incorrect HTML

TidyLib provides a command-line tool and a library that turn badly formed HTML/XHTML pages into standards-compliant ones.

It is useful for webmasters who want to make sure their websites are well-formed, and also for web client developers who want to sanitize invalid HTML before feeding it to their parsers.

There’s even a Python interface called uTidyLib.
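As a rough idea of what that looks like in practice, here is a minimal sketch that cleans up a broken fragment through uTidyLib. It assumes the package is installed and importable as `tidy`, and that the underlying Tidy shared library can be found on the system. The helper name `tidy_to_xhtml` and the particular option values are my own choices for illustration, not something the library prescribes.

```python
def tidy_to_xhtml(markup):
    """Return a cleaned-up XHTML version of `markup`, or None if
    uTidyLib (or the Tidy shared library behind it) is unavailable.

    `tidy_to_xhtml` is a hypothetical helper name used for this sketch.
    """
    try:
        import tidy  # module provided by uTidyLib (assumed installed)
    except (ImportError, OSError):
        # ImportError: the Python package is missing.
        # OSError: the package is there but the shared library is not.
        return None

    # Keyword arguments map to Tidy configuration options, with
    # underscores standing in for the dashes used in Tidy's own names.
    options = dict(
        output_xhtml=1,  # emit XHTML rather than HTML
        add_xml_decl=0,  # leave out the <?xml ...?> declaration
        indent=1,        # indent the output for readability
        tidy_mark=0,     # do not add a "generator" meta tag
    )
    document = tidy.parseString(markup, **options)
    return str(document)


if __name__ == "__main__":
    broken = "<p>An unclosed paragraph with <b>sloppy <i>nesting</b></i>"
    cleaned = tidy_to_xhtml(broken)
    if cleaned is None:
        print("uTidyLib is not available on this machine")
    else:
        print(cleaned)
```

The cleaned string can then be handed to a stricter parser that would have choked on the original markup.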
