[weboob] What library to use to parse HTML pages
laurent at bachelier.name
Wed Mar 31 18:16:56 CEST 2010
Since weboob requires an Internet connection anyway, why not use
weboob on another machine, over SSH, from the n900?
On Wed, Mar 31, 2010 at 18:07, Romain Bignon <romain at peerfuse.org> wrote:
> Historically, the 'AuM' backend has used html5lib to parse HTML pages.
> This library has serious performance problems, and it is also not packaged
> on every system (for example, juke tells me that it is not available on the
> Nokia N900 phone).
> I propose using xml.dom.minidom instead, a lightweight implementation of DOM.
> It is part of the standard library, so it probably performs better, is
> probably better supported, and is available on every system with Python.
> The only potential problem is: how tolerant is it of bad HTML?
> So I'll run some tests to find out whether this is a good solution. If you
> have other ideas, don't hesitate to share them.
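
On the tolerance question in the quoted message, here is a minimal sketch of
the kind of test that could be run. It assumes only the standard library; the
helper name try_parse and the three sample strings are made up for
illustration and are not taken from any real backend page. xml.dom.minidom
sits on top of the expat XML parser, so the expectation is that anything that
is not well-formed XML raises ExpatError rather than being repaired.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def try_parse(markup):
    """Return (True, document) if minidom accepts the markup, else (False, error)."""
    try:
        return True, minidom.parseString(markup)
    except ExpatError as exc:
        return False, exc


# Hypothetical inputs, chosen to mimic common sloppy HTML.
SAMPLES = [
    ("well-formed", "<html><body><p>hello</p></body></html>"),
    ("unclosed <br>", "<html><body><p>hello<br></p></body></html>"),
    ("bare ampersand", "<html><body><p>a & b</p></body></html>"),
]

for name, markup in SAMPLES:
    ok, result = try_parse(markup)
    if ok:
        print("%s: parsed" % name)
    else:
        print("%s: rejected (%s)" % (name, result))
```

If the last two samples are rejected, as expected from a strict XML parser,
minidom alone would not be enough for real-world pages, and some tolerant
pre-processing step would still be needed in front of it.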