Reputation: 3872
I am trying to make a trie in c++, now what my basic data structure looks like is ..
struct node{
int count; no of times this node has been visited.
struct node* child[ALPHABET_SIZE]; // Let ALPHABET_SIZE be 26
}
When the string size gets large a lot of allocated memory is wasted. Like if we insert "he"
Our tree would be
root---->h--->e
|--->e
We see that at root only 2/26th
of the memory allocated is being used. How to improve ??.
Upvotes: 4
Views: 3487
Reputation: 21
Use adjacency list.
Rather than a tree, we can create a list of nodes. A node will be dictionaries, each having "current value" (the alphabet) and "next state" (list of indices of the child nodes). We can add other required attributes in the node.
In your case: The list will be -
[{"value":"", "next_state":[1 ]}, {"value":"h", "next_state":[2]}, {"value":"e", "next_state":[ ]}]
Now say, we add "his". The list will be updated to :
[{"value":"", "next_state": [1 ]}, {"value": "h", "next_state": [2 , 3]}, {"value":"e", "next_state":[ ]}, {"value":"i", "next_state":[4]}, {"value":"s", "next_state":[ ]},]
Notice, the next_state
of node in index-1. We have two child nodes - "e" and "i".
It's very efficient and easy to implement. However, the trie operations will be considerably slower.
Upvotes: 0
Reputation: 11807
Instead of creating a fixed size array for each node, create an array with 1 element and resize it (replace it with a new array with size+1) when you insert a child. Inserting would be slower so you might test and change the resizing algorithm (size+1 or size*2 or size + size/2) so that there are fewer allocations if it gets too slow.
Upvotes: 0
Reputation: 74390
Some very basic advice:
Upvotes: 2