Cherokee

Reputation: 143

How does Java serialization solve circular reference problems?

I ran a serialization test in Java and found that Java serialization handles circular references correctly. How does it do that?

The following code works correctly:

import java.io.*;

public class SerializableTest {

    static class Employee implements Serializable{

        private static final long serialVersionUID = 1L;

        String name;

        int age;

        Employee leader;

        public void say(){
            System.out.println("my name is " + name + ". and I'm " + age + " years old.");
        }

    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {

        File file = new File("tempPath");

        Employee employee = new Employee();
        employee.name = "Tom";
        employee.age = 41;
        employee.leader = employee; // circular reference: the employee is their own leader
        employee.say();

        // try-with-resources flushes and closes the stream before the file is read back
        try (ObjectOutput objectOutput = new ObjectOutputStream(new FileOutputStream(file))) {
            objectOutput.writeObject(employee);
        }

        try (ObjectInput objectInput = new ObjectInputStream(new FileInputStream(file))) {
            Employee readEmployee = (Employee) objectInput.readObject();

            readEmployee.say();
            readEmployee.leader.say();
        }
    }
}

Upvotes: 1

Views: 1043

Answers (4)

chaokunyang

Reputation: 2322

When the JDK serializes an object, it records the object reference in an identity-based table (conceptually an IdentityHashMap; ObjectOutputStream uses its own internal handle table). The next time the same object is serialized, the JDK writes only a reference to it (an auto-incrementing id), which avoids duplicate serialization. Similar mechanisms are used by https://github.com/alipay/fury and https://github.com/EsotericSoftware/kryo.

Upvotes: 0

Cherokee

Reputation: 143

I found the answer in the book "Core Java® Volume II—Advanced Features, Tenth Edition". In section 2.4, "Object Input/Output Streams and Serialization", the description is:

Behind the scenes, an ObjectOutputStream looks at all the fields of the objects and saves their contents. For example, when writing an Employee object, the name, date, and salary fields are written to the output stream.

However, there is one important situation that we need to consider: What happens when one object is shared by several objects as part of its state?

Saving such a network of objects is a challenge. Of course, we cannot save and restore the memory addresses for the secretary objects. When an object is reloaded, it will likely occupy a completely different memory address than it originally did.

Instead, each object is saved with a serial number, hence the name object serialization for this mechanism. Here is the algorithm:

  1. Associate a serial number with each object reference that you encounter (as shown in Figure 2.6).

  2. When encountering an object reference for the first time, save the object data to the output stream.

  3. If it has been saved previously, just write “same as the previously saved object with serial number x.”

When reading back the objects, the procedure is reversed.

  1. When an object is specified in an object input stream for the first time, construct it, initialize it with the stream data, and remember the association between the serial number and the object reference.

  2. When the tag “same as the previously saved object with serial number x” is encountered, retrieve the object reference for the sequence number.

This is how Java serializes and deserializes an object graph, including one that contains cycles.
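The serial-number algorithm quoted above can be sketched in a few lines. This is a toy illustration, not the JDK's implementation: the `Node` class, the `NEW`/`REF`/`NULL` token format, and the `encode`/`decode` helpers are all made up for this example, and an `IdentityHashMap` stands in for whatever table the real stream uses.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class SerialNumberSketch {

    // Hypothetical node type with a single reference field.
    static class Node {
        final String name;
        Node next;
        Node(String name) { this.name = name; }
    }

    // Writing side: returns the token "stream" for the graph reachable from root.
    static List<String> encode(Node root) {
        List<String> out = new ArrayList<>();
        write(root, new IdentityHashMap<>(), out);
        return out;
    }

    private static void write(Node node, Map<Node, Integer> serials, List<String> out) {
        if (node == null) { out.add("NULL"); return; }
        Integer serial = serials.get(node);
        if (serial != null) {
            // Seen before: write only "same as object with serial number x".
            out.add("REF " + serial);
            return;
        }
        // First sighting: assign the next serial number BEFORE descending,
        // so a cycle back to this node terminates with a REF token.
        serials.put(node, serials.size());
        out.add("NEW " + node.name);
        write(node.next, serials, out);
    }

    // Reading side: the procedure reversed.
    static Node decode(List<String> tokens) {
        return read(tokens.iterator(), new ArrayList<>());
    }

    private static Node read(Iterator<String> in, List<Node> bySerial) {
        String token = in.next();
        if (token.equals("NULL")) return null;
        String payload = token.substring(4);
        if (token.startsWith("REF ")) {
            return bySerial.get(Integer.parseInt(payload));
        }
        Node node = new Node(payload);
        bySerial.add(node);              // remember it before reading its fields
        node.next = read(in, bySerial);
        return node;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a;                      // cycle: a -> b -> a

        List<String> tokens = encode(a);
        System.out.println(tokens);      // [NEW a, NEW b, REF 0]

        Node copy = decode(tokens);
        System.out.println(copy.next.next == copy); // true
    }
}
```

The key detail on both sides is that the object is registered before its fields are processed; registering it afterwards would recurse forever on a cycle.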

Upvotes: 0

Tom Hawtin - tackline

Reputation: 147154

There are two key things for circular references in Java Serialization: backreferences and construction nesting.

Java Serialization performs a depth first traversal. Consider this example.

class Outer implements java.io.Serializable {
    Inner inner;
}

One object contains another (assuming inner is non-null). The outer object starts to be written to the stream, in the middle of which the inner object is written, followed by the rest of the outer object. Likewise for reading. Java Serialization does not delay writing out the outer object until after the inner objects have been cleanly constructed.

An analogy to normal Java code would be constructing nested objects in the constructor of outer objects.

    // Like Java Serialization
    Outer() {
        this.inner = new Inner();
    }

Rather than constructing the nested objects and passing references to the constructor of the outer object.

    // Not like Java Serialization
    Outer(Inner inner) {
        this.inner = inner;
    }

    ... new Outer(new Inner()) ...

Backreferences are needed even if you just wanted a directed acyclic graph of objects. Consider a simple example.

class Foo implements java.io.Serializable {
    Object a = new SerialBar();
    Object b = a;
}

For any deserialised instance foo, we should find that foo.a == foo.b. To achieve this, serialisation checks whether the stream has already started serialising the referenced object, and if so inserts a back reference rather than reserialising the object. On deserialisation, the object is remembered as soon as it is constructed, before either its default field data has been read or readObject/readExternal has started.
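A small round trip makes the back-reference behaviour visible. This is a minimal sketch: `SerialBar` is not defined in the answer, so the empty class below is a hypothetical stand-in, and `roundTrip` is a helper invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SharedReferenceDemo {

    // Hypothetical stand-in for the SerialBar class mentioned above.
    static class SerialBar implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static class Foo implements Serializable {
        private static final long serialVersionUID = 1L;
        Object a = new SerialBar();
        Object b = a;                   // same instance as a
    }

    // Serialises to an in-memory buffer and reads the object back.
    static Foo roundTrip(Foo original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Foo) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Foo original = new Foo();
        Foo copy = roundTrip(original);
        System.out.println(copy.a == copy.b);      // true: b was written as a back reference
        System.out.println(copy.a == original.a);  // false: the copy is a new object
    }
}
```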

Putting those two things together, we see that a nested object can receive a reference to an outer object. Note that the nested object sees a partially constructed outer object, with all the fun that entails.
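The "partially constructed" caveat can be observed with a custom readObject on the nested object. The sketch below is illustrative only: the `label`, `owner`, and `labelSeenDuringRead` names are invented for this example, and the exact moment at which the outer object's fields get assigned is an observation about how the standard JDK implementation behaves, so treat it as an assumption rather than a guarantee.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class PartialOuterDemo {

    static class Outer implements Serializable {
        private static final long serialVersionUID = 1L;
        String label = "outer";
        Inner inner;
    }

    static class Inner implements Serializable {
        private static final long serialVersionUID = 1L;
        Outer owner;                           // back reference to the enclosing object
        transient String labelSeenDuringRead;  // what owner.label looked like mid-read

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            // owner already points at the outer object, but the outer
            // object's own fields have not been filled in yet.
            labelSeenDuringRead = owner.label;
        }
    }

    // Serialises to an in-memory buffer and reads the object back.
    static Outer roundTrip(Outer original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Outer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Outer original = new Outer();
        original.inner = new Inner();
        original.inner.owner = original;       // cycle: outer -> inner -> outer

        Outer copy = roundTrip(original);
        System.out.println(copy.inner.owner == copy);        // true
        System.out.println(copy.inner.labelSeenDuringRead);  // null
        System.out.println(copy.label);                      // outer
    }
}
```

Once deserialisation finishes, the outer object is complete; the hazard only applies to code that runs while the graph is still being read, such as readObject or readResolve on a nested object.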

Upvotes: 0

Peter Lawrey

Reputation: 533492

Java Serialization uses an identity-based hash table (in effect an IdentityHashMap) to map every reference it serializes to an id. The first time it serializes an object, it writes the object's contents and assigns it an id. After that, it writes just the id, which allows circular references and means only one copy of an object is written no matter how many times it is referenced.

The downside is that if you keep the object stream open and don't call reset(), it retains a reference to every object you have ever sent, so memory usage keeps growing. Also, if you change an object and send it again, the changes won't reach the reader, because the stream only sends the reference to the object again.
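Both effects are easy to reproduce. In this minimal sketch the `Counter` class and the `roundTrip` helper are hypothetical names for illustration; the same object is written twice, with its field changed in between, once without and once with a call to reset().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class ResetDemo {

    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    // Writes the same Counter twice, mutating it in between, and returns
    // the values the reader sees for the first and second readObject().
    static int[] roundTrip(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                Counter counter = new Counter();
                counter.value = 1;
                out.writeObject(counter);

                counter.value = 2;
                if (resetBetweenWrites) {
                    out.reset();        // forget every object written so far
                }
                out.writeObject(counter);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                Counter first = (Counter) in.readObject();
                Counter second = (Counter) in.readObject();
                return new int[] { first.value, second.value };
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundTrip(false))); // [1, 1]
        System.out.println(Arrays.toString(roundTrip(true)));  // [1, 2]
    }
}
```

Without reset(), the second write is only a back reference, so the reader gets the first instance again and never sees the change. With reset(), the object is written in full a second time and arrives as a separate instance.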

Upvotes: 3
