Reputation: 8477
I like to make data classes immutable to make concurrent programming easier. But making a completely immutable hierarchy seems problematic.
Consider this simple tree class:
public class SOTree {
private final Set<SOTree> children = new HashSet<>();
private SOTree parent;
public SOTree(SOTree parent) {
this.parent = parent;
}
public SOTree(Set<SOTree> children) {
for (SOTree next : children)
children.add(next);
}
public Set<SOTree> getChildren() {
return Collections.unmodifiableSet(children);
}
public SOTree getParent() {
return parent;
}
}
Now, if I want to create a hierarchy of these, when I construct it, either the parent has to exist before the current node, or the children have to exist first.
SOTree root = new SOTree((SOTree)null);
Set<SOTree> children = createChildrenSomehow(root);
//how to add children now? or children to the children?
or
Set<SOTree> children = createChildrenSomehow(null);
SOTree root = new SOTree(children);
//how to set parent on children?
Without forcing this to be a singly linked tree, is there any clever way to construct such a tree and still have all the nodes completely immutable?
Upvotes: 27
Views: 9450
Reputation: 60987
Eric Lippert recently blogged about this problem. See his blog post Persistence, Facades and Roslyn's Red-Green Trees. Here's an excerpt:
We actually do the impossible by keeping two parse trees. The "green" tree is immutable, persistent, has no parent references, is built "bottom-up", and every node tracks its width but not its absolute position. When an edit happens we rebuild only the portions of the green tree that were affected by the edit, which is typically about O(log n) of the total parse nodes in the tree.
The "red" tree is an immutable facade that is built around the green tree; it is built "top-down" on demand and thrown away on every edit. It computes parent references by manufacturing them on demand as you descend through the tree from the top. It manufactures absolute positions by computing them from the widths, again, as you descend.
Upvotes: 15
Reputation: 491
I recently faced a similar problem - https://medium.com/hibob-engineering/from-list-to-immutable-hierarchy-tree-with-scala-c9e16a63cb89
The approach is to build the tree bottom-up, having the bottom nodes constructed first, then go up towards the top node(s).
Phase 1 - Sort by depthIn order to start from the bottom, the algorithm sorts the nodes by their depth in the hierarchy (There's an O(n) way of doing it as you'll see in the link).
Phase 2 - Build the tree from the bottomBottom nodes have no children, so the algorithm constructs the bottom nodes.
Then, go one layer up and construct the nodes who have children with the nodes that were processed a layer before. The algorithm continues until it reaches the top node(s).
Upvotes: 5
Reputation: 81217
The proper approach for constructing an immutable tree would be to have the constructor of each node invoke the child nodes' constructors with itself as a parameter, with the provisos that the child node's constructor must not cause a a rooted reference to itself to be stored anywhere, nor use the passed-in parameter for any purpose except to initialize a field which the constructor would use for no purpose except to accept such initialization. Additionally, the parent node's constructor should avoid using any members of the child node which would dereference the "parent" field.
Although such a technique would seem to violate the rule that the constructors of immutable objects should not the fledgeling objects as parameters to other routines, the "real" rule is that the constructor of an immutable object should not allow a reference to the fledgeling object to be used in such fashion as to directly or indirectly access any fields which have not yet attained their final value. In general, if a fledgeling object exposes a reference to itself to the outside world, it will have no control over what outside code may do with it. In the particular case of calling a child node constructor, however, assuming the code for the child node honors the above requirements, there will be no rooted reference to the parent node, except through the parent node itself. Consequently, there will be no danger that any code that would do anything unexpected with the fledgeling node will get a reference to it.
Upvotes: 4
Reputation: 41
Building efficient, immutable data structures can be challenging. Fortunately, there are folks who have figured out how to implement many of these already. Take a look here for a discussion of large variety of immutable data structures.
This is an area that I'm still trying to bring myself up to speed on as well, so I'm not able to recommend the exact subset of these structures you should be looking at, but one data structure for working with trees that can be very useful is Zippers.
Upvotes: 4
Reputation: 5574
You've correctly stated your problem as one of chicken and egg. Another way of restating the problem that might shed a solution is that you want to grow a tree (root, trunk, leaves and all - all at once).
Once you accept that the computer can only process things step by step, a series of possible solution emerges:
Have a look at how Clojure creates immutable data structures. In Clojure's case, each operation on a tree (such as adding a node) returns a new tree.
Make tree creation atomic. You can create a special format and then deserialize the tree. Since all serialization methods are internal, you do not have to expose any mutable methods.
Just before the factory returns a constructed tree, lock it down with a flag first. This is an analog of atomic operation.
Use package level methods to construct the tree. This way the mutation methods on nodes couldn't be accessed by external packages.
Create nodes on the fly when they are accessed. This means your internal tree representation can never be changed, since changing the nodes have no effect on your tree structure.
Upvotes: 5
Reputation: 372982
Two thoughts:
Use some sort of tree factory. You could describe the tree using mutable structures, then have a factory that would assemble an immutable tree. Internally, the factory would have access to the fields of the different nodes and so could rewire internal pointers as necessary, but the produced tree would be immutable.
Build an immutable tree wrapper around a mutable tree. That is, have the tree construction use mutable nodes, but then build a wrapper class that then provides an immutable view of the tree. This is similar to (1), but doesn't have an explicit factory.
Hope this helps!
Upvotes: 11
Reputation: 189796
Without forcing this to be a singly linked tree, is there any clever way to construct such a tree and still have all the nodes completely immutable?
Keep your interfaces and implementations decoupled, and don't restrict your nodes of the tree to being the same class as the tree itself.
One solution to this issue is to store the node hierarchy in some other immutable representation, and when a caller calls getChildren()
or getParent()
whatever, it lazily constructs the child nodes from that immutable representation. If you want node.getChildren().get(i).getParent() == node
to be true (rather than .equals(node)
-- i.e. identity rather than equality), then you'll have to cache the node objects so you can redeliver them accordingly.
Upvotes: 3
Reputation: 1161
Since you want them to be immutable, you'd have to just do it at construction. Make one constructor that takes both parent and children, instead of two separate constructors.
Upvotes: -3