christophe.benz at gmail.com
Thu Apr 1 15:39:32 CEST 2010
On Wed, 31 Mar 2010 11:08:45 +0200,
Romain Bignon <romain at peerfuse.org> wrote:
> Christophe, I see you have added a new capability named
> ICapUpdatable, to replace the behavior of the removed
> ICapMessages.iter_new_messages() method.
> I think I understand how it is supposed to work, but this is not
> exactly what I wanted to do with iter_new_messages().
> Problem is we need to keep a copy of every message, even the ones we
> don't care about. So for example if you run the 'monboob' daemon
> (which is the new name of the 'mail' frontend) for a long period,
> every new article or comment sent to the user will be stored in
> memory indefinitely.
I thought about working with a hash, or a primary key, in order to
avoid storing each item entirely.
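The hash idea above could be sketched like this (a minimal illustration, not weboob code; the class and method names are assumptions): keep only a digest per message, so memory grows with the number of items seen rather than their size.

```python
import hashlib

class SeenTracker:
    """Remember messages by digest instead of storing them entirely."""

    def __init__(self):
        self._seen = set()

    def is_new(self, message_id, content):
        # Hash a stable identifier plus the content, so both brand-new
        # items and edited items are detected as "new".
        digest = hashlib.sha1(
            ('%s:%s' % (message_id, content)).encode('utf-8')
        ).hexdigest()
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True
```

With this, the second time the same (id, content) pair is seen, `is_new()` returns False, and only 40-character digests are kept in memory.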
> Also, for example with the DLFP backend, it has to open every article
> page each time and return every comment, only to then discard the
> articles that are already stored and keep just a few of them.
The aim is obviously not to crawl the entire website :-)
We could add a parameter to limit the quantity of data. It could be a
date, or a page-number limit (if we crawl a website using
a /1.html, /2.html pattern, for example).
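A page-number limit over that kind of URL pattern could look roughly like this (a sketch under assumptions: `fetch_page` is a hypothetical callable standing in for the backend's HTTP layer):

```python
def iter_recent_articles(fetch_page, max_pages=3):
    """Yield articles from /1.html, /2.html, ... up to max_pages pages.

    fetch_page is a hypothetical helper that takes a path like '/1.html'
    and returns the articles found on that page.
    """
    for page_number in range(1, max_pages + 1):
        for article in fetch_page('/%d.html' % page_number):
            yield article
```

The caller decides how far back to crawl by choosing `max_pages` (or, in the date variant, by passing a cutoff date), so the backend never walks the entire site.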
> Instead, I think a good way would be to re-introduce the
> 'iter_new_messages' method (without removing 'iter_messages', which is
> useful for a graphical frontend), and the backend takes care of
> returning only new messages since the last call.
> Of course, this is just a proposal; what do you think about it?
For the moment, each backend will implement its own way to detect new
items, and if similar approaches emerge, we will factor out the code.
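One way a backend could honour the proposed contract, returning only messages that appeared since the last call, might look like this (names and the hard-coded message list are assumptions for illustration, not the actual weboob API):

```python
class ExampleBackend:
    """Sketch of a backend whose iter_new_messages yields each message once."""

    def __init__(self):
        self._returned_ids = set()

    def iter_messages(self):
        # Stand-in for the real fetch; an actual backend would query the
        # website here and could return the same items on every call.
        return [('1', 'first'), ('2', 'second')]

    def iter_new_messages(self):
        # Yield only messages not returned by a previous call.
        for msg_id, content in self.iter_messages():
            if msg_id not in self._returned_ids:
                self._returned_ids.add(msg_id)
                yield msg_id, content
```

The first call yields both messages; a second call yields nothing, because the backend itself tracks what it has already handed out.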