Reputation: 2818
Say, there's a tree data structure, each leaf of which defines a set of keys for lookup:
*
|- A = 1, B = 2
|- *
|- C = 4
|- *
|- D = 5
|- D = 6, E = 7
I need a way of finding the value of the key for any given leaf while the tree is being traversed depth-first.
There are two approaches I have thought of:
If the value is not found in current leaf, check the dictionary of its parent and so on back to the root of the tree.
There is a global dictionary and each leaf inserts/removes its keys when being traversed. The lookup in performed in this global dictionary.
It's most likely that there will be many leafs with a few keys in each, and about 3-4 lookups for each key.
Which approach is more efficient? Or, maybe, there's another way of doing it that is better than both?
Upvotes: 2
Views: 232
Reputation: 7837
For a reasonably performant solution, use functional/persistent data structures.
They will have signatures like
insert :: Map -> key -> value -> Map
delete :: Map -> key -> Map
and so on. That is, each operation returns a new map with the operation performed on it, but the old map is also still valid. For tree-based maps this can be done with only a constant factor overhead; so operations will still run in log(n) time. (The main technique is path copying, FWIW.)
The way to gainfully use them is this: every time you encounter a sub-scope, keep the state of the parent scope around while evolving the state of the child scope as you encounter variables. Once you're done with the sub-scope, revert back to using the map that matches the parent scope. (Always do lookups in the most current, child-most map.)
The branch of research you want to look to, for better answers, is Persistent Data Structures. Here's Erik DeMaine giving a lecture about the topic, in case you want to familiarize yourself: https://www.youtube.com/watch?v=T0yzrZL1py0&list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf&index=1
With log(n) time map operations I figure your whole thing will run in O(n log n) time. I wish I knew of a way to make it run in linear time, but I don't. I do not, for example, know about a persistent array-based hash map.
Upvotes: 0
Reputation: 27962
The programming language you are implementing will for sure define exact rules for name resolution. I don't think it will lead to a depth-first search. Name resolution rules, very fruequently, look somethink like this:
using
/ import
or whatever other construct linking to some other scope, perform a search in that other scope (all of such scopes, sequentially), and recurse within it:
In other words, you gradually go up in the tree of enclosing scopes, and decide whether to search any 'foreign' referenced scopes. Statements such as using
/ import
lead to references among scopes, which in turn cause what is viewed as a tree of scopes to be in fact a directed graph.
Regarding the lookup table construction, I'd start with a simple hashtable. Prefix trees (tries) work well for these scenarios, too.
Last but not least, I wouldn't care much of lookup performance unless I'd face a real performance problem when compiling tens or maybe hundreds of thousands of lines of code.
Upvotes: 2