Reputation: 2446

Python ranged lookup table

I'm trying to crate a data structure sort of like a dictionary in Python, but with ranged keys. Here's the scenario:

I'm implementing a programming language for backwards compatibility with an old game. Each instruction in the game is put into a list, and a counter is incremented on every encounter of a newline character. The hope is that if the language has a run-time error, it will be able to look up what line the instruction was on in the table and display that back to the user.

So, if the error occurred on instruction 20, and line 3 contained instructions 15-30, then polling this structure for 20 should return 3.

Having a dictionary with an entry for every single instruction is possible, but also highly wasteful. Is there another way this can be done?

Upvotes: 1

Answers (3)

Joel Cornett

Reputation: 24788

Here's a class which adjusts the dict keys:

class rdict(dict):

    def __init__(self, incSize, *args):
        self.incSize = incSize
        dict.__init__(self, *args)

    def __normRange(self, keyArg, incSize):
        return keyArg // incSize * incSize

    def __getitem__(self, key):
            return dict.__getitem__(self, self.__normRange(key, self.incSize))

        def __setitem__(self, key, value):
            dict.__setitem__(self, self.__normRange(key, self.incSize), value)

    g = input("hi")
    d = rdict(3)
    d[10.4] = "a"
    print d

It outputs: {9.0: "a"}

If you aren't going to use it much, forget the class and just run your inputs through the function normRange

Upvotes: 0

kindall

Reputation: 184280

"Having a dictionary with an entry for every single instruction is possible, but also highly wasteful." Do you expect programs written in the language to be millions of instructions long? If not, this is exactly what I'd recommend. Don't optimize prematurely. Most people interpret this adage as referring to performance, but it applies to resource use as well.

If you do need to optimize space, what I'd recommend, assuming you're using Python 2.6 or later, is a bytearray. As its name implies, it is an array of bytes and can therefore represent values of 0-255. Each item in the array would represent the number of statements on the corresponding line. To convert from an instruction number back to a line number would be something like this:

instcounts = bytearray((2, 4, 6, 3, 1, 1, 5, 2, 3))

def getline(instnumber):
    count = line = 0
    while count <= instnumber and line < len(instcounts):
        count += instcounts[line]
        line += 1
    return line  # conveniently, this will be 1-indexed  :-)

getline(15)    # tells me instruction 15 is on line 5

The advantage of this over torek's answer is that it stores essentially the same information in only one byte per line of source code: much more efficient than a list. You have to do a little extra work to figure out the line number, but in practice you'll never notice even on very large files, and you will only run it when printing error messages anyway, hardly a speed-critical function. The function above takes about a tenth of a second on a bytearray representing 1,000,000 lines, and that's unoptimized code under Python 3.1 on a Windows XP VM running on a 4-year-old Mac Pro.

Memory required? 1,087,199 bytes, about a quarter of that required for just for the list in torek's solution -- not counting the int objects that are referenced in the list, each of which is 14 bytes (CPython reuses a pool of small integer objects, however, so small files might not use much if any extra memory for the integers).

By the way, CPython uses a very similar scheme (see func_code.co_lnotab or __code__.co_lnotab on any function).

Upvotes: 2

torek

Reputation: 489223

From your description, you have a monotonically increasing counter and a map function of the form (constants below just made up for example):

 0 <= x <  5: 0
 5 <= x <  7: 1
 7 <= x < 10: 2
10 <= x < 18: 3
18 <= x < 21: 4

It seems as though a simple array (list, in Python) of transition points will suffice:

[5, 7, 10, 18, 21]

and then given x, you find the lowest value of i such that arr[i] >= x.

You can find that index with an interpolative search, which will be roughly O(log log n) where n is the length of the array (code left as an exercise :-) ).

Upvotes: 3

Python ranged lookup table

Answers (3)

Related Questions