Naved Mir

Reputation: 97

How can I optimize the creation of node relationships?

I have created roughly 200,000 nodes. I want to create relationships between them based on each user's Facebook friend list.

First I get the parent node reference:

var userquery = client.Cypher
    .Match("(n:User)")
    .Where("n.UserID=" + _ui.UserID)
    .Return<Node<UserInfo>>("n")
    .Results
    .Single();

Then I get references to the friends, based on the Facebook friend list:

var friendquery1 = client.Cypher
    .Match("(n:User)")
    .Where("n.ThirdPartyFriendsIds In[" + _ui.ThirdPartyFriendsIds + "]")
    .Return<Node<UserInfo>>("n");

Then I create the relationships using the references I got:

var friendquery = friendquery1.Results;

foreach(Node<UserInfo> friendnode in friendquery)
{
    client.CreateRelationship(userquery.Reference, new UserRelationship (friendnode.Reference));
}

Can anyone help me optimize this? Creating the relationships takes a considerable amount of time.

Upvotes: 0

Views: 177

Answers (1)

Tatham Oddie

Reputation: 4290

There are many things you can improve here.

Injection risk and parameters

Do not ever write code in your queries like this:

.Where("n.UserID=" + _ui.UserID)

That is a) bad for performance, because it breaks all query plan caching, and b) a major security hole just waiting to happen. It's called an "injection" attack. You can look up "SQL injection" for lots of information about it, but the same risk applies to Cypher.

Do this instead:

.Where((User n) => n.UserID == _ui.UserID)

If you can't model it using a lambda expression like this, you can use parameters instead:

.Where("n.UserID = {userId}")
.WithParams(new { userId = _ui.UserID })
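
To make both problems concrete, here is a small self-contained sketch in plain C# (no Neo4jClient involved) of what the concatenated WHERE text ends up looking like. The BuildConcatenated helper and the hostile input value are made up for illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds the WHERE text the unsafe way, by concatenation.
static string BuildConcatenated(string userId) => "n.UserID=" + userId;

// Parameterized form: the text never changes; the value travels separately.
const string parameterizedText = "n.UserID = {userId}";

var normal = BuildConcatenated("42");
var hostile = BuildConcatenated("42 OR true");

Console.WriteLine(normal);   // n.UserID=42
Console.WriteLine(hostile);  // n.UserID=42 OR true  (the condition is now always true)

// Every distinct value produces a distinct query text, so a cached plan
// for one value cannot be reused for the next.
var distinctTexts = new HashSet<string>
{
    BuildConcatenated("1"),
    BuildConcatenated("2"),
    BuildConcatenated("3"),
};
Console.WriteLine(distinctTexts.Count);  // 3

// With a parameter there is exactly one text, whatever the value is.
Console.WriteLine(parameterizedText);    // n.UserID = {userId}
```

With the parameterized form, the hostile string would simply be compared against UserID as a value and would never become part of the query.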

Node references

Do not use Node<T>, like Node<UserInfo>. The concept of using raw node ids is deprecated as of Neo4j 2.0, and is being phased out over time. There are lots of reasons for this, and plenty of other writing about it, so I won't duplicate that here.

If you want the data, just use UserInfo. If you need to do something with that node, add more clauses to your Cypher query.

Avoid anything other than IGraphClient.Cypher, like IGraphClient.CreateRelationship

All of the other methods use the Neo4j REST API, which is gradually being replaced by Cypher calls.

In this case, your performance problem is because you are running an entire network request and DB transaction for every single relationship that you want to create. This is very slow.

Instead, do as much as you can in one Cypher query.
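
As a rough back-of-the-envelope sketch of why this matters, the snippet below compares the per-request overhead of the two approaches. Both figures (500 friends, 5 ms per round trip) are assumptions chosen purely for illustration; measure your own setup.

```csharp
using System;

// Assumed figures, for illustration only.
const int friendCount = 500;      // friends of one user
const double roundTripMs = 5.0;   // assumed cost of one request plus its transaction

// One CreateRelationship call per friend means one round trip per friend.
int perRelationshipTrips = friendCount;

// One Cypher query that creates every relationship is a single round trip.
int singleQueryTrips = 1;

double perRelationshipMs = perRelationshipTrips * roundTripMs;
double singleQueryMs = singleQueryTrips * roundTripMs;

Console.WriteLine($"Per relationship: {perRelationshipTrips} trips, about {perRelationshipMs} ms of overhead");
Console.WriteLine($"Single query: {singleQueryTrips} trip, about {singleQueryMs} ms of overhead");
```

The single query still has to do the actual work on the server; these numbers only count the fixed cost paid per request, which is the part the loop multiplies.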

Putting it all together

Let's ignore C# and go back to Cypher for a second.

Find the user:

MATCH (user:User)
WHERE user.UserID = {userId}

Find all of the friends:

MATCH (friend:User)
WHERE friend.ThirdPartyFriendsId IN {thirdPartyFriendIds}

Create the relationship, only if it doesn't already exist (so you can run the query many times without creating duplicates):

CREATE UNIQUE (user)-[:FRIENDS_WITH]->(friend)

Now let's put all that together as one:

MATCH (user:User)
WHERE user.UserID = {userId}
MATCH (friend:User)
WHERE friend.ThirdPartyFriendsId IN {thirdPartyFriendIds}
CREATE UNIQUE (user)-[:FRIENDS_WITH]->(friend)

Now let's convert that to C#:

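client.Cypher
    .Match("(user:User)")
    .Where((User user) => user.UserID == _ui.UserID)
    .Match("(friend:User)")
    .Where("friend.ThirdPartyFriendsId IN {thirdPartyFriendIds}")
    // The parameter must be a collection of ids, not a comma-separated string.
    .WithParams(new { thirdPartyFriendIds = _ui.ThirdPartyFriendsIds })
    // Create each relationship only if it does not already exist.
    .CreateUnique("(user)-[:FRIENDS_WITH]->(friend)")
    .ExecuteWithoutResults();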

Note that we have to call ExecuteWithoutResults at the end, because we're not returning any results. (We don't need to return any.)

Disclaimer: I've just typed all of this straight into the answer window here, so there may be minor mistakes. Don't copy and paste my code. Follow the principles.

Even faster

The approach so far still requires you to run it once for each user you have.

Something like this would create the relationships between all of your users in one go. (But might take a long time, depending on how many users you have.)

MATCH (user:User)
MATCH (friend:User)
WHERE friend.ThirdPartyFriendsId IN user.ThirdPartyFriendsIds
CREATE UNIQUE (user)-[:FRIENDS_WITH]->(friend)
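
If one query over every user turns out to run too long, one possible middle ground (an assumption on my part, not something tested against the asker's data) is to process the users in batches, with one query per batch. The sketch below only shows the batching side in plain C#; the id values, the batch size and the {userIds} parameter name are all hypothetical, and Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Linq;

// Hypothetical input: the UserID values of all users, loaded beforehand.
int[] allUserIds = Enumerable.Range(1, 2500).ToArray();

// Assumed batch size; tune it for your data.
const int batchSize = 1000;

// Enumerable.Chunk splits the sequence into arrays of at most batchSize items.
int[][] batches = allUserIds.Chunk(batchSize).ToArray();

foreach (int[] batch in batches)
{
    // Each batch would be passed as the {userIds} parameter of one query:
    // the all-users query above, with "WHERE user.UserID IN {userIds}"
    // added to its first MATCH.
    Console.WriteLine($"Batch of {batch.Length} users, first UserID {batch[0]}");
}
```

That keeps the number of round trips small (three here instead of 2,500) while bounding how much work any single transaction has to do.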

Key principles

Tell Neo4j what you're trying to do in total, rather than whispering lots of tiny little commands at it. It will do better when it knows a whole batch of things to do.

Never ever construct queries by concatenating dynamic string values into them. As well as being very very unsafe, it will be slow because you cause the query to be compiled every single time, and that is expensive.

Always start with Cypher, before you go to C#. You will get more help, because anybody who works with Neo4j can help instead of just the .NET people, and you will get a better result, because you will do more batch-based operations.

Upvotes: 2
