Reputation: 10343
If you look at the node definitions for a simple Trie and a simple K-ary tree, they look the same.
(using C++ notation)
template <size_t K>
trieNode
{
trieNode *[K]
};
template <size_t K>
KaryNode
{
KaryNode *[K]
};
At its simplest a K-ary tree has multiple children per node (2 for a binary tree)
And a Trie has "multiple children per node"
It seems that a K-ary tree makes it's choice of child based on comparison( < or > ) of Keys
While a Trie makes it's choice of child based on (unary) equality of sub-spans of the Key
Since neither data structure has made it into any standards, what would be best definition of each, and how would they be differentiated?
Upvotes: 7
Views: 3490
Reputation: 208456
From the point of view of the shape of the data structure, a trie is clearly an N-ary tree, in the same way that a balanced binary search tree is a binary tree, the difference being in how the data structure manages the data.
A binary search tree is a binary tree with additional constraint that the keys in the nodes are ordered, a balanced binary tree adds on top of that a constraint on the difference between the lengths of different branches.
Similarly, a trie is a N-ary tree with additional constrains that determine how the keys are managed.
Let's try a definition of what a trie is:
A trie is an efficient data structure used to implement a dictionary in which keys are sequences lexicographically. The implementation uses an N-ary tree where the branching factor is the range of valid values for each element in the key sequence[1] and each node may or not hold a value, but always holds a subsequence of the key being stored [2]. For each node in the tree, the concatenation of the subsequences of keys stored in the nodes from the root to any given node represent the key for the value stored, if the node holds a value, and/or a common prefix for all nodes in this subtree.
This layout of data allows for linear lookups on the size of the keys, and sharing the prefix allows for compact representations for many natural languages (like Spanish, where different forms of each verb differ only on the last few suffix characters).
1: That keys are sequences is an important premise, as the main advantage of the tries is that they split the key into different nodes along the path.
2: Depending on the implementation each node might maintain a single element (character) from the sequence or a combination.
Upvotes: 5
Reputation: 4955
A binary tree refers to the shape of the tree without saying anything about how the tree will be used. A binary search tree is a binary tree that is being used in a particular way.
Similarly, a k-ary tree = n-ary tree = multi-way tree refers to the shape of the tree. A trie is a multiway tree that is being used in a particular way.
(But, be careful, just like there are many variations on binary search trees, there are many different variations on tries.)
So, what makes a trie a trie?
A trie is usually used to represent a collection of sequences, such as strings. A particular key is stored, not in a single node like in a binary search tree, but rather split up across many levels of the tree. Here's a picture of a trie containing the strings "can", "car", "cat", and "do".
.
/ \
c/ \d
/ \
. .
| |
a| |o
| |
. .
/|\
n/r| \t
/ | \
. . .
As you can see, it may easier to think of the characters as being associated with the edges instead of the nodes, but any particular implementation might represent it either way.
The many varieties of tries differ in things like how they handle cases where one key is a prefix of another (eg, "cat" and "catastrophe"), and how/whether to compress long common substrings.
Upvotes: 3
Reputation: 10343
Thanks to user534498's comment about Knuth's "Taocp volume 3, chapter 6.2 & 6.3"
Knuth claims - Ch 6.3
A trie is essentially an M-ary tree, whose nodes are M-place vectors with components corresponding to digits or characters. each node on level l represent the set of all keys that begin with a certain sequence of l characters; the node specifies an M-way branch, depending on the (l +1)st character.
K-ary, M-ary and N-ary being synonyms, it seems the answer is yes.
Upvotes: 0
Reputation: 3984
K-nary tree: each node has at most K children. Trie: the children of each node is not limited to a number (theoretically). In practice of course there's always a limit. For example for an asian word trie, the number of children of each node is limited to the size of asian characters, which is probably say 5000 or 10000.
Upvotes: 0