template boy
template boy

Reputation: 10490

Union-find expressed as a social network

This is an interview question that I am trying to answer:

Given a social network containing N members and a log file containing M timestamps at which times pairs of members formed friendships, design an algorithm to determine the earliest time at which all members are connected (i.e, every member is a friend of a friend of a friend...of a friend). Assume that the log file is sorted by timestamp and that friendship is an equivalence relation. The running time of your algorithm should be M log N or better and use extra space proportional to N.

The first thing I thought was..."I can't do this!".

But then I thought how can this social network be expressed as a data structure. Union-find is a data structure that can be used. Now I have to understand what it means when all members are connected. How can I view the actual data structure and what it looks like when every member became friends with each other?

I think only until I can understand visually or conceptually how the system becomes fully connected can I begin to figure out how to find the timestamp that corresponds with that event.

Upvotes: 28

Views: 19840

Answers (10)

Peter
Peter

Reputation: 1

Data source: the columns are num1, num2, date (use an integer to represent). you can save the data source as connections.txt for the program to use:

10
1 2 1
3 4 2 
5 8 3
6 2 4
7 3 5
9 1 6
4 6 7
5 9 8
0 2 9

10 is the total nodes of this union-find data structure.

use the below code to run the program: java-algs4 Ex1 < connections.txt

public class Ex1 {
    public static void main(String[] args) {
        int n = StdIn.readInt();
        WeightedQuickUnionUF uf = new WeightedQuickUnionUF(n);
        while (uf.count() > 1 && !StdIn.isEmpty()) {
            int p = StdIn.readInt();
            int q = StdIn.readInt();
            if (uf.find(p) == uf.find(q)) continue;
            uf.union(p, q);
            int time = StdIn.readInt();
            StdOut.println(p + " " + q + " " + time);
        }
        StdOut.println(uf.count());
    }
}

Upvotes: 0

Shandur
Shandur

Reputation: 41

Both answers from Andres Soto and Monil look very valid here.

One thing I want to point out is that it seems like the Kruskal's algorithm - i.e., finding a minimum spanning tree/forest (MST) of an undirected edge-weighted graph. Basically, there are members (vertices of the graph) and timestamps (weighted edges between vertices) - so-called, connected graph. Since you're given a sorted list, the algorithm fits perfectly - you use weighted quick-union (even with path compression) and track the size of each tree to identify MST (check Monil answer).

Note: if the graph is disconnected, then a minimum spanning tree is composed of a minimum spanning tree for each component/tree.

UPD: noticed the comment from Benjamin below the question that also mentioned the algorithm.

Upvotes: 0

nusgeek
nusgeek

Reputation: 1

Adding my opinion to @Andrés Soto. I agree with his answer overall but I think we don't need getComponent() method.

In WeightedUnionFind, remember that we have an array size and we need to compare the size of two components so that we can determine which one unites which.

    public class WeightedQuickUnionUF {
        private int[] parent;   // parent[i] = parent of i
        private int[] size;     // size[i] = number of elements in subtree rooted at i
        private int count;      // number of components

After each effective union, we will update the size. So we only need to judge whether the updated size equals to the N. If yes, this step is one which all persons are connected.

Upvotes: 0

Harshit Singh
Harshit Singh

Reputation: 1

This can be easily done by adding a count for number of connected component. Regard N members as N objects, regard M timestamps as M unions. Then “the earliest time at which all members are connected” is the time when the number of connected component is equal to 1.

Upvotes: 0

ROHAN ARORA
ROHAN ARORA

Reputation: 51

To find if all the members are connected I used the concept of weighted Quick-union. If the size of tree gets equal to the n then we can say all the members are connected. My Code:

class MyClas {
    private int[] a;
    private int[] size;
    int N=0;
    public MyClas(int n){
        N=n;
        a = new int[n];
        size = new int[n];
        for(int i=0;i<n;i++){
            a[i]=i;
            size[i]=1;
        }
    }
    private int root(int x){
        while(x != a[x]){
            x=a[x];
        }
        return x;
    }
    public boolean connected(int p, int q){
        return root(p)==root(q);
    }
    public void union(int p,int q, String timestamp){
        int i = root(p);
        int j = root(q);
        if(i == j) return;
        if(size[i] < size[j]){
            a[i] = j;
            size[j]+=size[i];
            if(size[j]==N){
                System.out.println("All Members are connected at Timestamp"+ timestamp);
            }
        }
        else{
            a[j] = i;
            size[i]+=size[j];
            if(size[i]==N){
                System.out.println("All Members are connected at Timestamp"+ timestamp);
            }
        }
    }

}
public class MyClass {
    public static void main(String args[]) {
      MyClas obj = new MyClas(6);
      obj.union(1,5, "2019-08-14 18:00:00");
      obj.union(2,4, "2019-08-14 18:00:01");
      obj.union(1,3, "2019-08-14 18:00:02");
      obj.union(5,2, "2019-08-14 18:00:03");
      obj.union(0,3,"2019-08-14 18:00:04");
      obj.union(2,1,"2019-08-14 18:00:05");

    }
}

Upvotes: 5

humkins
humkins

Reputation: 10715

Social network can be represented as a tree or graph where each node have 0 to n connections.

Main part of data structure which you need is an array of integers, where each element index can be interpreted as social network member ID, and value is ID of another member which is root for first one.

You need to read the log and to perform union operation in your data structure on each log record (building the tree) and analyze two criteria

  • each member has a connection (Set<Integer> haveConnection)
  • there is only one global root (because from the beginning you will have a lot of not connected subnetworks with own roots) (Set<Integer> roots)

Once both criteria satisfied - all members of your network are connected.

Good luck!

Upvotes: 2

Amit Kumar Verma
Amit Kumar Verma

Reputation: 3

Adding to @Andrés's Answer. Here is the method to check all of them are connected or not,to be added in WeightedQuickUnionFind Class.

public boolean isAllConnected(){
    int N = id.length;
    while(--N>0 && id[N] == id[0]);
    return N == 0;
}

Upvotes: -1

Monil
Monil

Reputation: 11

The answer provided by Andres is the right one. There is also another way. You can use extra space propotional to N elements here. So you can keep an array of total number of nodes in a tree and in the union function after merging both the trees you will have to add the number of nodes in the tree being merged to the number of nodes in the tree which is the root of both. So after adding you just have to check if total number of nodes in the new tree is equal to N. If it is then it will be the earliest time when all N members are friends of each other.

Upvotes: 0

Andr&#233;s Soto
Andr&#233;s Soto

Reputation: 1004

Ok to solve this exercise I did the assumption that the log file will look something like this:

0 1 2015-08-14 18:00:00
1 9 2015-08-14 18:01:00
0 2 2015-08-14 18:02:00
0 3 2015-08-14 18:04:00
0 4 2015-08-14 18:06:00
0 5 2015-08-14 18:08:00
0 6 2015-08-14 18:10:00
0 7 2015-08-14 18:12:00
0 8 2015-08-14 18:14:00
1 2 2015-08-14 18:16:00
1 3 2015-08-14 18:18:00
1 4 2015-08-14 18:20:00
1 5 2015-08-14 18:22:00
2 1 2015-08-14 18:24:00
2 3 2015-08-14 18:26:00
2 4 2015-08-14 18:28:00
5 5 2015-08-14 18:30:00

Where the 2 first numbers are the members who formed the friendship follow by the timestamp.

Another important thing to call out is that the exercise mention that the the file is sorted, so I have decide to sort it on a ascending order.

With this information, you can use the WeightedQuickUnionFind data structure provided in the class and simple process the file performing union operation on the members, once that you make the union you can ask for how many components are in the structure, if there is only one that means all members have equivalent relation.

Here is the code I did:

public static void main(String[] args) {

        int n = StdIn.readInt();
        WeightedQuickUnion uf = new WeightedQuickUnion(n);
        String date, time;
        //timestamps are sorted ascending
        while (!StdIn.isEmpty()) {

            int p = StdIn.readInt();
            int q = StdIn.readInt();
            date = StdIn.readString();
            time = StdIn.readString();


            uf.union(p, q);

            StdOut.println("["+p+","+q+"]");

            if(uf.getComponents() == 1){
                StdOut.println("All members were connected at: " + date + time);
                break;
            }

        }

The performance will be M lg N because you are iterating M times (the amount of lines in the log file) and the union operations takes: lg n.

Upvotes: 25

Paul Hankin
Paul Hankin

Reputation: 58409

When you add a friendship to the union-find datastructure, you can note if it results in two graph components being joined. Simply keep adding edges until N-1 of these merging events have happened.

In pseudo-code form:

G := UnionFind(1..N)
count := 0
for timestamp, p1, p2 in friendships {
    if G.Find(p1) != G.Find(p2) {
        G.Union(p1, p2)
        count++
        if count == N-1 {
            return timestamp
        }
    }
}
return +infinity

Upvotes: 26

Related Questions