Aviral Srivastava
Aviral Srivastava

Reputation: 4582

What is meant by fan-out in entity-relationships graph databases?

In the Dgraph paper, it is mentioned that entities are sharded randomly across servers and they carry relationships with other entities with themselves. This results in a high fan-out result set in intermediate steps of a graph query causing broadcasting on servers and high latency to perform joins.

I do not understand the meaning of fan-out here.

I tried understanding from the basics and this is what I found:

I am not certain how the above to points relate to an entity based graph as discussed in the Dgraph paper. Is it the case that entities have many relationships with other entities and so, there are many edges attached to one entity?

Upvotes: 0

Views: 1145

Answers (1)

djhallx
djhallx

Reputation: 739

In short: Yes, there can be cases where an entity can have many edges attached to it.

In the first paper, shards are based on the predicate "lives-in" and "eats". In this description all of the "lives-in" data is in one shard and all of the "eats" data is in another shard and the paper says: "Assuming the worst case scenario where the cluster is so big each shard lives on a separate server."

Now, consider a query that covers several predicates, each in its own shard, and each shard living on a different server.

Find any person 
    that lives-in ???, 
    that eats ???, 
    that drinks ???,
    that drives ???,
    that reads ???

As the number of predicates grows, (i.e. the number of interrogated shards), we see the high fan-out described by the paper. The number of cluster "broadcasts" grows significantly, causing the query latency to grow proportionally.

Upvotes: 0

Related Questions