Manos Nathanail

Reputation: 57

Neo4j graph algorithm / Node similarity

Consider the graph below.

I have a job that requires some skills, and with this list of skills I am searching for candidates who know them (I mean the set of skills required by a specific job). This is the easy part.

The relationships have an attribute years_of_experience. Correct results require a condition where candidate.years_of_experience >= skill.years_of_experience. I would like to use a procedure like gds.nodeSimilarity to get a list of candidates with a similarity score for each of them. May I have some help with this query? I have tried, but with no luck so far.

Graph example

Example:

MATCH aa= (job:JobNode{job_id:'feed85b9-041c-4bb5-b48a-963c9f927e1d'})-[r:REQUIRES]->(s:SkillNode) 
with job 
MATCH bb= (job:JobNode{job_id:'feed85b9-041c-4bb5-b48a-963c9f927e1d'})-[r:REQUIRES]->(s:SkillNode)<-[:KNOWS]-(c:CandidateNode) 
WITH {item:id(job), categories: collect(id(c))} AS userData 
WITH collect(userData) AS data 
CALL gds.alpha.ml.ann.stream({ data: data, algorithm: 'jaccard' }) 
YIELD item1, item2, similarity 
return data

Upvotes: 1

Views: 773

Answers (1)

jose_bacoy

Reputation: 12684

You should start with the candidates rather than the job. This is because you are comparing the similarities among the candidates based on the skills that the job requires.

MATCH (job:JobNode{job_id:'feed85b9-041c-4bb5-b48a-963c9f927e1d'})-[r:REQUIRES]->(s:SkillNode)<-[:KNOWS]-(c:CandidateNode)
MATCH (c)-[:APPLIED_FOR]-(job)
WITH {item:id(c), categories: collect(id(job))} AS userData
WITH collect(userData) AS data
CALL gds.alpha.ml.ann.stream({
  data: data,
  algorithm: 'jaccard'
})
YIELD item1, item2, similarity
RETURN gds.util.asNode(item1).name AS Candidate1, gds.util.asNode(item2).name AS Candidate2, similarity
ORDER BY Candidate1

 Result:
 ╒════════════╤════════════╤════════════╕
 │"Candidate1"│"Candidate2"│"similarity"│
 ╞════════════╪════════════╪════════════╡
 │"Leo"       │"Manos"     │1.0         │
 ├────────────┼────────────┼────────────┤
 │"Manos"     │"Leo"       │1.0         │
 └────────────┴────────────┴────────────┘
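
To see why both rows report 1.0: Jaccard similarity is the size of the intersection divided by the size of the union, and in the query above each candidate's categories contain only the id of the one job. The plain-Python sketch below illustrates that arithmetic; it is not GDS code. It assumes the category lists are compared as sets, and all ids are made up for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, compared as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


JOB_ID = 42  # hypothetical internal node id of the job

# As in the query above: every candidate's categories hold only the job id
# (repeated once per matched skill), so all candidates look identical.
categories = {"Leo": [JOB_ID, JOB_ID], "Manos": [JOB_ID, JOB_ID]}
same_job = jaccard(categories["Leo"], categories["Manos"])

# Hypothetical alternative: compare candidates by the skill ids they know.
skills = {"Leo": {1, 2}, "Manos": {2, 3}}
by_skill = jaccard(skills["Leo"], skills["Manos"])

print(same_job, by_skill)
```

With these made-up ids, the job-based comparison gives 1.0, while the skill-based comparison gives one shared skill out of three distinct ones.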

EDITED: I thought the question was about the node similarity algorithm in the Neo4j Graph Data Science library. The question is actually about calculating, for each candidate, the fraction of the job's required skills that the candidate knows.

Steps:

  1. Get the total number of skills required for the job
  2. Get all candidates with those skills that the job requires
  3. Ensure that this candidate has applied for that job
  4. Return the candidate name and the number of matching skills divided by the total skills needed for that job

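Note: the expression round(10^2 * x) / 10^2 is a workaround for rounding to two decimal places, since single-argument round() rounds to the nearest integer. Because 10^2 is a float, it also keeps skills/total_skills from being evaluated as integer division. If you want 3 decimal places, use 10^3.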

MATCH (job:JobNode{job_id:'<id>'})-[:REQUIRES]->(sk:SkillNode)
WITH job, count(sk) as total_skills
MATCH (job)-[r:REQUIRES]->(s:SkillNode)<-[k:KNOWS]-(c:CandidateNode)
MATCH (c)-[:APPLIED_FOR]-(job)
WITH c as candidate, count(s) as skills, total_skills
RETURN candidate.name, round(10^2*skills/total_skills)/10^2 as percent

Result:
╒════════════════╤═════════╕
│"candidate.name"│"percent"│
╞════════════════╪═════════╡
│"Leo"           │0.67     │
├────────────────┼─────────┤
│"Manos"         │0.67     │
└────────────────┴─────────┘
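
The query above does not apply the years_of_experience condition from the question. In Cypher this would presumably be a WHERE clause on the second MATCH comparing k.years_of_experience with r.years_of_experience. The plain-Python sketch below shows the same calculation with that filter, assuming both the REQUIRES and KNOWS relationships carry years_of_experience. The skill names and numbers are invented for illustration.

```python
# Hypothetical data: years required per skill (REQUIRES.years_of_experience).
required = {"python": 3, "neo4j": 2, "docker": 1}

# Hypothetical data: years each candidate has per skill (KNOWS.years_of_experience).
candidates = {
    "Leo": {"python": 5, "neo4j": 1},
    "Manos": {"python": 3, "neo4j": 4, "java": 6},
}


def coverage(known, required):
    """Fraction of required skills the candidate knows with enough experience."""
    if not required:
        return 0.0
    met = sum(
        1
        for skill, years in required.items()
        if skill in known and known[skill] >= years
    )
    return round(met / len(required), 2)


scores = {name: coverage(known, required) for name, known in candidates.items()}

# Best-matching candidates first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

In this made-up data, Leo knows neo4j but with fewer years than required, so only one of the three skills counts for him, while Manos meets two of the three.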

Upvotes: 1
