Reputation: 57
Consider the graph below.
I have a job that requires some skills, and with this list of skills I am searching for candidates who know them (that is, the set of skills required by a specific job). This is the easy part.
The relationships have an attribute years_of_experience. A correct result requires a condition such as candidate.years_of_experience >= skill.years_of_experience.
I would like to use a procedure like
gds.nodeSimilarity
to get a list of candidates with a similarity score for each of them.
Could I get some help with this query? I have tried, but with no luck so far.
Example:
MATCH aa= (job:JobNode{job_id:'feed85b9-041c-4bb5-b48a-963c9f927e1d'})-[r:REQUIRES]->(s:SkillNode)
with job
MATCH bb= (job:JobNode{job_id:'feed85b9-041c-4bb5-b48a-963c9f927e1d'})-[r:REQUIRES]->(s:SkillNode)<-[:KNOWS]-(c:CandidateNode)
WITH {item:id(job), categories: collect(id(c))} AS userData
WITH collect(userData) AS data
CALL gds.alpha.ml.ann.stream({ data: data, algorithm: 'jaccard' })
YIELD item1, item2, similarity
return data
Upvotes: 1
Views: 773
Reputation: 12684
You should start with the candidates rather than the job. This is because you are comparing the similarities among the candidates based on the skills that the job requires.
MATCH (job:JobNode{job_id:'feed85b9-041c-4bb5-b48a-963c9f927e1d'})-[r:REQUIRES]->(s:SkillNode)<-[:KNOWS]-(c:CandidateNode)
MATCH (c)-[:APPLIED_FOR]-(job)
WITH {item:id(c), categories: collect(id(job))} AS userData
WITH collect(userData) AS data
CALL gds.alpha.ml.ann.stream({
data: data,
algorithm: 'jaccard'
})
YIELD item1, item2, similarity
return gds.util.asNode(item1).name AS Candidate1, gds.util.asNode(item2).name AS Candidate2, similarity
ORDER BY Candidate1
Result:
╒════════════╤════════════╤════════════╕
│"Candidate1"│"Candidate2"│"similarity"│
╞════════════╪════════════╪════════════╡
│"Leo" │"Manos" │1.0 │
├────────────┼────────────┼────────────┤
│"Manos" │"Leo" │1.0 │
└────────────┴────────────┴────────────┘
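For reference, a Jaccard score such as the one reported above is the size of the intersection of two sets divided by the size of their union. The following is only a minimal plain-Cypher sketch of that formula, applied to the job's required skills versus each candidate's known skills, with no GDS call. The parameter $jobId is a hypothetical placeholder, and the sketch assumes the labels and relationship types used in the question.

```cypher
// Sketch only: Jaccard similarity between a job's required skills
// and each candidate's known skills, computed without GDS.
// $jobId is a hypothetical query parameter.
MATCH (job:JobNode {job_id: $jobId})-[:REQUIRES]->(s:SkillNode)
WITH collect(DISTINCT s) AS required
MATCH (c:CandidateNode)-[:KNOWS]->(ks:SkillNode)
WITH c, required, collect(DISTINCT ks) AS known
// Skills the candidate knows that the job also requires
WITH c, required, known, [x IN known WHERE x IN required] AS shared
WHERE size(shared) > 0
RETURN c.name AS candidate,
       // |A ∩ B| / |A ∪ B|, where |A ∪ B| = |A| + |B| - |A ∩ B|
       toFloat(size(shared)) / (size(required) + size(known) - size(shared)) AS jaccard
ORDER BY jaccard DESC
```

Because the union includes skills the job does not require, a candidate with many extra skills scores lower here than under a simple "percentage of required skills" measure.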
EDITED: I thought the question was about the node similarity algorithm in the Neo4j Graph Data Science library. The question is actually about calculating the percentage of the job's required skills that each candidate has.
Steps:
Note: the expression round(10^2 * x) / 10^2 is a workaround for rounding to two decimal places, because single-argument round() rounds to the nearest integer. If you want 3 decimal places, use 10^3.
MATCH (job:JobNode{job_id:'<id>'})-[:REQUIRES]->(sk:SkillNode)
WITH job, count(sk) as total_skills
MATCH (job)-[r:REQUIRES]->(s:SkillNode)<-[k:KNOWS]-(c:CandidateNode)
MATCH (c)-[:APPLIED_FOR]-(job)
WITH c as candidate, count(s) as skills, total_skills
RETURN candidate.name, round(10^2*skills/total_skills)/10^2 as percent
Result:
╒════════════════╤═════════╕
│"candidate.name"│"percent"│
╞════════════════╪═════════╡
│"Leo" │0.67 │
├────────────────┼─────────┤
│"Manos" │0.67 │
└────────────────┴─────────┘
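The query above does not yet apply the years_of_experience condition from the question. A possible sketch of how it could be added follows, assuming years_of_experience is stored on both the REQUIRES and the KNOWS relationships (the question only says "the relationships have an attribute"). The parameter $jobId is a hypothetical placeholder.

```cypher
// Sketch only: share of required skills for which the candidate has
// at least the required years of experience.
// Assumption: years_of_experience exists on both REQUIRES and KNOWS.
MATCH (job:JobNode {job_id: $jobId})-[:REQUIRES]->(sk:SkillNode)
WITH job, count(DISTINCT sk) AS total_skills
MATCH (job)-[r:REQUIRES]->(s:SkillNode)<-[k:KNOWS]-(c:CandidateNode)
WHERE k.years_of_experience >= r.years_of_experience
WITH c, count(DISTINCT s) AS qualified_skills, total_skills
RETURN c.name AS candidate,
       // 100.0 forces floating-point division before rounding to 2 decimals
       round(100.0 * qualified_skills / total_skills) / 100.0 AS percent
ORDER BY percent DESC
```

Candidates who meet the experience requirement for none of the skills do not appear in the result. To restrict the list to applicants as in the query above, add MATCH (c)-[:APPLIED_FOR]-(job) after the second MATCH.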
Upvotes: 1