JRG
JRG

Reputation: 4177

Serialize and Deserialize Trie-like Data Structure

I am trying to serialize and deserialize a Trie-like data structure which has data/character in each node. So to form a whole word, traversal from root to leaf node is required.

Serialization and De-serialization should be in pre-order traversal i.e. process the children in DFS approach.

# marks end of traversal for that node i.e. that the trie-like node doesn't have any more children.

Here is what I have tried.

public class SerializeDeserialize {

    public static void main(String[] args) {
        // prepare TrieNode Tree
        TrieNodeSD root = buildTrienodeTree();
        StringBuilder sb = new StringBuilder();
        serialize(root, sb);
        sb.deleteCharAt(sb.length()-1);
        System.out.println(sb.toString());
        System.out.println();
        TrieNodeSD newRoot = deserialize(sb.toString().split(","), new int[] {0});
        StringBuilder newsb = new StringBuilder();
        serialize(newRoot, newsb);
        newsb.deleteCharAt(newsb.length()-1);
        System.out.println(newsb.toString());
    }

    private static void serialize(TrieNodeSD node, StringBuilder sb) {
        if (node == null) return;
        sb.append(node.character + ",");
        if (node.characters != null && node.characters.size() > 0) {
            for (Character c : node.characters.keySet()) {
                serialize(node.characters.get(c), sb);
            }
        }
        sb.append("#,");
    }

    // DOESN'T WORK!!
    private static TrieNodeSD deserialize(String[] data, int[] t) {
        if (t[0] >= (data.length-1) || data[t[0]].equals("#")) return null;
        TrieNodeSD node = new TrieNodeSD(data[t[0]].charAt(0));
        t[0] = t[0] + 1;
        TrieNodeSD child = deserialize(data, t);
        if (child != null) node.characters.put(child.character, child);
        return node;
    }

    private static TrieNodeSD buildTrienodeTree() {
        TrieNodeSD root = new TrieNodeSD('A');

        root.characters.put('B', new TrieNodeSD('B'));
        root.characters.get('B').characters.put('E', new TrieNodeSD('E'));
        root.characters.get('B').characters.put('F', new TrieNodeSD('F'));
        root.characters.get('B').characters.get('F').characters.put('K', new TrieNodeSD('K'));

        root.characters.put('C', new TrieNodeSD('C'));

        root.characters.put('D', new TrieNodeSD('D'));
        root.characters.get('D').characters.put('G', new TrieNodeSD('G'));
        root.characters.get('D').characters.put('H', new TrieNodeSD('H'));
        root.characters.get('D').characters.put('I', new TrieNodeSD('I'));
        root.characters.get('D').characters.put('J', new TrieNodeSD('J'));

        return root;
    }
}

class TrieNodeSD {
    Map<Character, TrieNodeSD> characters;
    char character;
    public TrieNodeSD(char c) {
        this.characters = new HashMap<Character, TrieNodeSD>();
        this.character = c;
    }
    @Override
    public String toString() { return this.character + "";  }
}

Serializing gives output in pre-order format (e.g. A,B,E,#,F,K,#,#,#,C,#,D,G,#,H,#,I,#,J,#,#,#).

PROBLEM: during de-serialization, the code doesn't process all the children correctly and doesn't associate them with correct parent.

Can someone suggest how to fix processing in deserialize method or help me with pointers what am i missing?

Upvotes: 1

Views: 3661

Answers (2)

JRG
JRG

Reputation: 4177

Finally found the way to de-serialize a pre-order serialized form of Trie-Like data structure.

import java.util.HashMap;
import java.util.Map;

/**
 *                              A<br>
 *                  /           |           \<br>
 *                  B           C           D<br>
 *              /       \           /   /       \   \<br>
 *              E       F           G   H       I   J<br>
 *                      |<br>
 *                      K<br>
 * 
 *
 */
public class SerializeDeserialize {

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        StringBuilder newsb = new StringBuilder();

        // prepare TrieNode Tree
        TrieNodeSD root = buildTrienodeTree();

        // serialize tree into string
        serialize(root, sb);
        sb.deleteCharAt(sb.length() - 1);
        System.out.println(sb.toString());
        System.out.println();

        // construct tree again from serialized string
        TrieNodeSD newRoot = deserialize(sb.toString().split(","), new int[] { 0 });

        // Verify : again serialize above de-serialized tree to match both
        // trees serialized format.
        serialize(newRoot, newsb);
        newsb.deleteCharAt(newsb.length() - 1);
        System.out.println(newsb.toString());
    }

    private static void serialize(TrieNodeSD node, StringBuilder sb) {
        if (node == null) return;
        sb.append(node.character + ",");
        if (node.characters != null && node.characters.size() > 0) {
            for (Character c : node.characters.keySet()) {
                serialize(node.characters.get(c), sb);
            }
        }
        sb.append("#,");
    }

    private static TrieNodeSD deserialize(String[] data, int[] t) {
        if (t[0] >= (data.length - 1) || data[t[0]].equals("#")) return null;
        TrieNodeSD node = new TrieNodeSD(data[t[0]].charAt(0));
        while (true) {
            t[0] = t[0] + 1;
            TrieNodeSD child = deserialize(data, t);
            if (child != null) node.characters.put(child.character, child);
            else break;
        }
        return node;
    }

    private static TrieNodeSD buildTrienodeTree() {
        TrieNodeSD root = new TrieNodeSD('A');

        root.characters.put('B', new TrieNodeSD('B'));
        root.characters.get('B').characters.put('E', new TrieNodeSD('E'));
        root.characters.get('B').characters.put('F', new TrieNodeSD('F'));
        root.characters.get('B').characters.get('F').characters.put('K', new TrieNodeSD('K'));

        root.characters.put('C', new TrieNodeSD('C'));

        root.characters.put('D', new TrieNodeSD('D'));
        root.characters.get('D').characters.put('G', new TrieNodeSD('G'));
        root.characters.get('D').characters.put('H', new TrieNodeSD('H'));
        root.characters.get('D').characters.put('I', new TrieNodeSD('I'));
        root.characters.get('D').characters.put('J', new TrieNodeSD('J'));

        return root;
    }
}

class TrieNodeSD {
    Map<Character, TrieNodeSD> characters;
    char character;

    public TrieNodeSD(char c) {
        this.characters = new HashMap<Character, TrieNodeSD>();
        this.character = c;
    }

    @Override
    public String toString() {
        return this.character + "";
    }
}

Sample Run: Serialize given Trie-Like data structure in pre-order traversal, use the serialized string to construct Trie-data like data structure i.e. de-serialize and finally serialize it again to verify that the serialized form matches actual tree.

A,B,E,#,F,K,#,#,#,C,#,D,G,#,H,#,I,#,J,#,#,#

A,B,E,#,F,K,#,#,#,C,#,D,G,#,H,#,I,#,J,#,#,#

Upvotes: 3

Hearen
Hearen

Reputation: 7828

Not quite sure about your trie data structure, but if you are referring to the trie then there must be some misunderstandings.

There is a clear specification for trie in wiki.

...Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string...

(content from wiki, I just added the emphasis)

PROBLEM: during de-serialization, the code doesn't process all the children correctly and doesn't associate them with correct parent.

Even for a tree structure with key in the node, still your solution won't work since you are ignoring the size of the children by using a map instead of a fixed-sized array which is quite essential to deserialise the serialised data.

Using map makes it impossible to determine which node is the parent and which nodes are the children.

As for binary search tree or the real trie tree, their structure are predefined, via which you then can serialise and deserialise the tree since they're deterministic.


Perhaps Radix tree is what you really want.

By the way, you can actually serialise and deserialise directly in the *Node.

For example serialisation can something just like this:

@Override
public String toString() {
    List<String> resultList = new ArrayList<>();
    for (TrieNode child : children) {
        if (child == null) resultList.add("#");
        else resultList.add(child.toString());
    }
    return resultList.stream().collect(Collectors.joining(","));
}

Upvotes: 0

Related Questions