Reputation: 11
revenues_in = MapCompose(MatchEndDate(float))
revenues_out = Compose(imd_filter_member, imd_mult, imd_max)
def add_xpath(self, field_name, xpath, *processors, **kw):
    values = self._get_values(xpath, **kw)
    self.add_value(field_name, values, *processors, **kw)
    return len(self._values[field_name])

def add_xpaths(self, name, paths):
    for path in paths:
        match_count = self.add_xpath(name, path)
        if match_count > 0:
            return match_count
    return 0
self.add_xpaths('revenues', [
    '//us-gaap:Revenues',
    '//us-gaap:SalesRevenueNet',
    '//us-gaap:SalesRevenueGoodsNet',
    '//us-gaap:SalesRevenueServicesNet',
    '//us-gaap:RealEstateRevenueNet',
    '//*[local-name()="NetRevenuesIncludingNetInterestIncome"]',
    '//*[contains(local-name(), "TotalRevenues") and contains(local-name(), "After")]',
    '//*[contains(local-name(), "TotalRevenues")]',
    '//*[local-name()="InterestAndDividendIncomeOperating" or local-name()="NoninterestIncome"]',
    '//*[contains(local-name(), "Revenue")]'
])
Currently, the code only spits out the first match in the list of xpaths. I'd like it to return the maximum value out of all xpaths that matched. Please advise.
This is of course a subsection of the code that I thought was relevant. If you'd like to see any additional code, please visit https://github.com/eliangcs/pystock-crawler/tree/master/pystock_crawler
Thank you for your time and help!
Upvotes: 1
Views: 101
Reputation: 298
This isn't working because the add_xpaths function returns as soon as one XPath produces a match. That exits the loop before the remaining paths are ever tried. Instead, you need to accumulate the count in a variable and return it only after you've looped through every path.
Instead of this:
def add_xpaths(self, name, paths):
    for path in paths:
        match_count = self.add_xpath(name, path)
        if match_count > 0:
            return match_count
    return 0
Try this:
def add_xpaths(self, name, paths):
    match_count = 0
    for path in paths:
        match_count += self.add_xpath(name, path)
    return match_count
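To see the difference without running the crawler, here is a minimal sketch. SketchLoader, load_max and the sample figures are all made up for illustration; it is a plain-Python stand-in, not Scrapy's ItemLoader. It assumes the field's output step ends up taking the maximum of the collected values, which is what imd_max in your revenues_out suggests. Note also that this stand-in's add_xpath returns the number of matches for that one path, whereas the add_xpath in the question appears to return the running total for the field, so summing its return values may overcount.

```python
class SketchLoader:
    """Hypothetical stand-in for the item loader (not Scrapy's API)."""

    def __init__(self, document):
        # `document` maps an XPath string to the numbers it would match.
        self._document = document
        self._values = {}

    def add_xpath(self, field_name, xpath):
        # Collect whatever this path matches; return the per-path match count.
        matches = self._document.get(xpath, [])
        self._values.setdefault(field_name, []).extend(matches)
        return len(matches)

    def add_xpaths_first_match(self, name, paths):
        # Original behaviour: stop at the first path that matches anything.
        for path in paths:
            match_count = self.add_xpath(name, path)
            if match_count > 0:
                return match_count
        return 0

    def add_xpaths_all(self, name, paths):
        # Suggested behaviour: try every path and keep all matched values.
        match_count = 0
        for path in paths:
            match_count += self.add_xpath(name, path)
        return match_count

    def load_max(self, field_name):
        # Assumed stand-in for an output processor that picks the maximum.
        values = self._values.get(field_name, [])
        return max(values) if values else None


# Invented sample data: two of the three paths match something.
document = {
    '//us-gaap:Revenues': [100.0],
    '//us-gaap:SalesRevenueNet': [250.0, 175.0],
}
paths = [
    '//us-gaap:Revenues',
    '//us-gaap:SalesRevenueNet',
    '//us-gaap:RealEstateRevenueNet',
]

first = SketchLoader(document)
first_count = first.add_xpaths_first_match('revenues', paths)
first_max = first.load_max('revenues')

everything = SketchLoader(document)
all_count = everything.add_xpaths_all('revenues', paths)
all_max = everything.load_max('revenues')

print(first_count, first_max)
print(all_count, all_max)
```

With the early return, only the value from the first matching path is ever collected, so the maximum can only come from that path. Looping over every path puts all matched values into the field, and the maximum is then taken across all of them.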
Upvotes: 1