David R

Reputation: 1044

Better way to write this elementTree search operation?

I'm working on a Python script to study option pricing. I do a lot of pre-processing in XSL and output an XML file that is read into my Python script using the ElementTree API.

The xml has the following structure:
The children of the root XML node are stock elements, one for each stock.
The children of a <stock> element are day elements, one for each day.
The children of <day> elements are f elements, one for each forward-day I look at.
The f elements have some attributes: "days" for how far forward the day is, and "change" for the change in the stock price on that day.

Because stocks do not trade every day, the sequence of "@days" has gaps. For example, a day element associated to a Thursday might look like:

<day [info in attributes]>
<f days="1" change="-3.1"/>
<f days="4" change="-1"/>
<f days="5" change="0.4"/>
<f days="6" change="1.1"/>
...
</day>
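For concreteness, here is a minimal sketch of loading a file shaped like this and walking it with ElementTree. The root tag name and the "symbol" and "date" attributes are made up for illustration; only the f elements and their "days" and "change" attributes come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: <stocks>, "symbol" and "date" are invented names.
SAMPLE = """
<stocks>
  <stock symbol="ABC">
    <day date="2013-01-03">
      <f days="1" change="-3.1"/>
      <f days="4" change="-1"/>
      <f days="5" change="0.4"/>
      <f days="6" change="1.1"/>
    </day>
  </stock>
</stocks>
"""

root = ET.fromstring(SAMPLE)
for stock in root:        # children of the root are <stock> elements
    for day in stock:     # children of a <stock> are <day> elements
        # Each <f> child carries its forward offset in "days".
        offsets = [int(f.get("days")) for f in day]
        print(stock.get("symbol"), day.get("date"), offsets)
```

Note the gap between 1 and 4 in the offsets, which is the situation the rest of the question is about.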

Right now, I'm trying to look for historical data and find instances where @days = X, where X is an input. But if a given <day> element does not have such a day, I will settle for an f element where @days = X - 1. If it does not have one of those, I'll look for an f element where @days = X + 1.

Unfortunately, chaining a lookup such as f[@days = X] straight into .get('change') raises an error when there is no f element with days = X. So I'm currently doing it the following way:

Changes = []
for day in Test_Stock:
    forward_days = [int(f.get('days')) for f in day]
    if X in forward_days:
        expiry_day = [f for f in day if int(f.get('days')) == X]
        Changes.append(float(expiry_day[0].get('change')))
    elif (X - 1) in forward_days:
        proxy_day = [f for f in day if int(f.get('days')) == (X - 1)]
        Changes.append(float(proxy_day[0].get('change')))
    elif (X + 1) in forward_days:
        proxy_day = [f for f in day if int(f.get('days')) == (X + 1)]
        Changes.append(float(proxy_day[0].get('change')))

This gives the intended result, but I would hope there is a simpler way to do this, and I would like to know how to work better with ElementTree objects.

Upvotes: 1

Views: 62

Answers (2)

BenC

Reputation: 451

Instead of .get, try .find with an XPath expression, which returns None when no element matches instead of raising an exception.
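A small sketch of that behaviour, assuming a day element like the one in the question (the inline XML here is made-up sample data):

```python
import xml.etree.ElementTree as ET

# Made-up sample: a <day> with forward days 1 and 4 only.
day = ET.fromstring('<day><f days="1" change="-3.1"/><f days="4" change="-1"/></day>')

hit = day.find("f[@days='4']")   # an Element, because days="4" exists
miss = day.find("f[@days='2']")  # None, because there is no days="2"

print(hit.get("change"))
print(miss is None)
```

The attribute predicate compares strings, so this assumes the file writes days as plain integers such as "4" and not "04".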

Using find lets ElementTree do the searching for you, instead of needing to filter manually with [f for f in day if int(f.get('days')) == x].

So, for each day, the f element you want, if it exists, will be the first non-None item in this list:

[day.find("f[@days='%d']" % index)
 for index in [X, X - 1, X + 1]]
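Spelled out as an ordinary loop, that idea might look like the sketch below. The function name collect_changes and the sample XML are my own placeholders, not anything from the question.

```python
import xml.etree.ElementTree as ET

def collect_changes(days, target):
    """For each <day>, take the change at target, else target - 1, else target + 1."""
    changes = []
    for day in days:
        for candidate in (target, target - 1, target + 1):
            f = day.find("f[@days='%d']" % candidate)
            # Compare against None explicitly: an Element without children
            # may evaluate as false, so "if f:" would skip real matches.
            if f is not None:
                changes.append(float(f.get("change")))
                break
    return changes

# Made-up sample data: three days with different gaps.
stock = ET.fromstring(
    '<stock>'
    '<day><f days="1" change="-3.1"/><f days="4" change="-1"/></day>'
    '<day><f days="3" change="0.5"/></day>'
    '<day><f days="7" change="2.0"/></day>'
    '</stock>'
)
result = collect_changes(stock, 2)
print(result)
```

With target 2, the first day falls back to days="1", the second to days="3", and the third has no acceptable match and is skipped.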

This also means that you can write your entire function as a horrifying one-liner like this (untested):

Changes = [float(f.get('change'))
           for f in (next((m for m in (day.find("f[@days='%d']" % candidate)
                                       for candidate in [X, X - 1, X + 1])
                           if m is not None), None)
                     for day in Test_Stock)
           if f is not None]

but please don't do this.

Definitely look into .find, though.

Upvotes: 0

Because the processing for X, X-1 and X+1 is the same, you can use a loop like:

Changes = []
for day in Test_Stock:
    forward_days = [int(f.get('days')) for f in day]
    # Try the exact day first, then the day before, then the day after.
    for x in [X, X - 1, X + 1]:
        if x in forward_days:
            expiry_day = [f for f in day if int(f.get('days')) == x]
            Changes.append(float(expiry_day[0].get('change')))
            break
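If you would rather not scan the f elements twice per day, one possible variant is to build a dictionary from days to change once and then probe it. This is only a sketch: the function name and the sample XML are hypothetical, and it is a different technique from the list filtering above.

```python
import xml.etree.ElementTree as ET

def collect_changes_by_lookup(days, target):
    """Same fallback order (target, target - 1, target + 1), using a dict per day."""
    changes = []
    for day in days:
        # One pass over the <f> children: forward-day offset -> change.
        change_by_offset = {int(f.get("days")): float(f.get("change")) for f in day}
        for candidate in (target, target - 1, target + 1):
            if candidate in change_by_offset:
                changes.append(change_by_offset[candidate])
                break
    return changes

# Made-up sample data.
stock = ET.fromstring(
    '<stock>'
    '<day><f days="1" change="-3.1"/><f days="4" change="-1"/></day>'
    '<day><f days="5" change="0.4"/><f days="6" change="1.1"/></day>'
    '</stock>'
)
lookup_result = collect_changes_by_lookup(stock, 5)
print(lookup_result)
```

One difference to be aware of: if a day ever contained two f elements with the same days value, the dictionary would keep the last one, whereas the list version takes the first.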

Upvotes: 1
