[weboob] [PATCH 1/1] leboncoin: fix cost for items in rent search page

Simon Lipp laiquo at hwold.net
Tue Jan 3 23:03:32 CET 2017


> The content of itemprops is often less dirty than the text content.
Are you sure about that? This is the first time I have seen @content used
as an alias for text(). The XPath specification says nothing of the kind:
@content should select the content attribute of the current node,
which is not what I want. A quick test in the Firefox console with
document.evaluate() shows that text() gives me what I want and
@content doesn’t.
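
To make the difference concrete, here is a minimal sketch using only
Python's standard library. The markup is hypothetical (it is not copied
from leboncoin), and xml.etree.ElementTree does not evaluate text() or
@content directly, so the three values below are only rough analogues of
what the corresponding XPath expressions would select on this fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (not copied from leboncoin): a price element that
# also contains a nested "C.C." tag and has no content attribute.
SNIPPET = (
    '<section class="item_infos">'
    '<h3 class="item_price">650 €<span class="item_tag">C.C.</span></h3>'
    '</section>'
)


def price_views(markup):
    """Return three readings of the price node: direct text, full text, @content."""
    root = ET.fromstring(markup)
    price = root.find('./*[@class="item_price"]')
    # Text directly inside the element, before the nested tag:
    # roughly what .../text() selects on this fragment.
    direct_text = (price.text or '').strip()
    # Text of the element and all its descendants: the string value
    # obtained when the element itself is selected.
    full_text = ''.join(price.itertext()).strip()
    # Value of the content attribute, or None when the attribute is absent:
    # the analogue of .../@content.
    content_attr = price.get('content')
    return direct_text, full_text, content_attr


direct_text, full_text, content_attr = price_views(SNIPPET)
print(repr(direct_text), repr(full_text), repr(content_attr))
```

On this fragment only the direct text leaves out the "C.C." tag, and the
content attribute lookup finds nothing because the element has no such
attribute.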

Regards,

Simon Lipp

>
> Regards
>
> On Tuesday, 3 January 2017 at 16:47 +0100, Simon Lipp wrote:
>> On the rent search page, a tag ("C.C.") is attached to the price,
>> which breaks CleanDecimal. Leave it out of the XPath selector.
>>
>> Signed-off-by: Simon Lipp <laiquo at hwold.net>
>> ---
>>   modules/leboncoin/pages.py | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/modules/leboncoin/pages.py b/modules/leboncoin/pages.py
>> index 59b448116..ea205c3f8 100644
>> --- a/modules/leboncoin/pages.py
>> +++ b/modules/leboncoin/pages.py
>> @@ -106,7 +106,7 @@ class HousingListPage(HTMLPage):
>>                               '\\2', default=None)
>>   
>>               obj_title = CleanText('./@title|./section/p[@class="item_title"]')
>> -            obj_cost = CleanDecimal('./section[@class="item_infos"]/*[@class="item_price"]',
>> +            obj_cost = CleanDecimal('./section[@class="item_infos"]/*[@class="item_price"]/text()',
>>                                       replace_dots=(',', '.'),
>>                                       default=Decimal(0))
>>               obj_currency = Regexp(CleanText('./section[@class="item_infos"]/*[@class="item_price"]'),



