David A Stumpf

Reputation: 793

Neo4j Full Text Index: Term(s) found

I'm using Neo4j's db.index.fulltext.queryNodes and getting nice results using wildcards. It is built on Lucene 5.5.5 but may not implement all of Lucene's functionality.

Is there a way to return the specific term that was found within the text being searched? For instance, if I search for Sm*th, I would like to see Smith or Smyth.

In Neo4j, I created the index on a node property, "ancestral_surnames", which is an unstructured string of surnames.

CREATE FULLTEXT INDEX ancestor_surnames_names FOR (n:ancestor_surnames)
ON EACH [n.name]

I then search for a surname and the associated DNA_Match node:

CALL db.index.fulltext.queryNodes('ancestor_surnames_names', 'Stinn*tt ') YIELD node, score 
WITH score,node.p as match,node.name as anc_names 
MATCH (m:DNA_Match{fullname:match}) 
return distinct m.fullname,match,anc_names order by m.fullname

I get back the full unstructured list of surnames and would like to extract out the term that was found.

Upvotes: 1

Views: 344

Answers (2)

David A Stumpf

Reputation: 793

Thanks Jose ...

Here is the full query that gave me the desired result, including a list of the two words that were found. The query itself will need to be created dynamically in Java code in the user-defined function I'm creating. But this is slick!

CALL db.index.fulltext.queryNodes('ancestor_surnames_names', 'Stennett AND Kent') YIELD node, score
WITH score, node.p AS match, node.name AS anc_names, node
MATCH (m:DNA_Match {fullname: match})
WITH m, score, anc_names, [w IN split(node.name, " ") WHERE w =~ 'Stennett' OR w =~ 'Kent' | w] AS words
RETURN DISTINCT m.fullname AS DNA_Match_name,
       CASE WHEN m.RN IS NULL THEN '-' ELSE toString(m.RN) END AS RN,
       round(score, 2) AS score, words, anc_names AS ancestor_list
ORDER BY score DESC, m.fullname
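Since the query has to be generated from Java, here is a minimal sketch of one way to do it. It is not the actual user-defined function: the class name MatchedWordsQuery, its helper methods, and the parameter names $search and $terms are all made up for illustration. It assumes the search string contains only plain terms joined by AND, OR or NOT (no wildcards, phrases or grouping), and it passes the values as query parameters instead of concatenating them into the Cypher text. The comparison uses toLower because the full-text search itself is normally case-insensitive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; none of these names come from Neo4j or from the original query.
final class MatchedWordsQuery {
    // Lucene boolean operators, which are not search terms themselves.
    private static final Set<String> OPERATORS = Set.of("AND", "OR", "NOT");

    // Parameterised variant of the query above: $search is the full-text query,
    // $terms is the list of plain terms to look for in the surname string.
    static final String CYPHER =
        "CALL db.index.fulltext.queryNodes('ancestor_surnames_names', $search) YIELD node, score\n"
      + "MATCH (m:DNA_Match {fullname: node.p})\n"
      + "RETURN DISTINCT m.fullname AS DNA_Match_name, round(score, 2) AS score,\n"
      + "       [w IN split(node.name, ' ')\n"
      + "          WHERE any(t IN $terms WHERE toLower(w) = toLower(t)) | w] AS words,\n"
      + "       node.name AS ancestor_list\n"
      + "ORDER BY score DESC, DNA_Match_name";

    // Split a simple query such as "Stennett AND Kent" into its terms.
    // Assumption: tokens are separated by whitespace and operators are upper case.
    static List<String> terms(String luceneQuery) {
        List<String> result = new ArrayList<>();
        for (String token : luceneQuery.trim().split("\\s+")) {
            if (!token.isEmpty() && !OPERATORS.contains(token)) {
                result.add(token);
            }
        }
        return result;
    }

    // Parameter map to send along with CYPHER.
    static Map<String, Object> params(String luceneQuery) {
        return Map.of("search", luceneQuery, "terms", terms(luceneQuery));
    }
}
```

MatchedWordsQuery.CYPHER and MatchedWordsQuery.params("Stennett AND Kent") would then be handed to whatever executes the query inside the function. Wildcard terms would need a regular expression with =~ rather than the equality test used here.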

Upvotes: 1

jose_bacoy

Reputation: 12684

The example below searches the Movies database. You can do something similar for your node Person.

For Neo4j version 4.2 and below, create the index with the procedure syntax:

CALL db.index.fulltext.createNodeIndex("titlesIndex", ["Movie"], ["title"])

Or, for Neo4j version 4.3+, use the CREATE FULLTEXT INDEX command:

CREATE FULLTEXT INDEX titlesIndex FOR (n:Movie) ON EACH [n.title]

Then query with the * or ? wildcards, as below.

CALL db.index.fulltext.queryNodes("titlesIndex", "?re*") YIELD node, score
WITH node.title as title, score
MATCH (n:Movie {title:title}) 
WITH [w in split(n.title," ") where w =~ '(?i).*re.*' | w] as words, n, score
RETURN words[0] as match, n.title, score   

// where (?i) makes the match case-insensitive
//       .* matches any sequence of characters before and after "re"
// Note that =~ must match the whole word, and words[0] returns only the first matching word.


╒════════╤══════════════════════╤═══════╕
│"match" │"n.title"             │"score"│
╞════════╪══════════════════════╪═══════╡
│"Dreams"│"What Dreams May Come"│1.0    │
├────────┼──────────────────────┼───────┤
│"Green" │"The Green Mile"      │1.0    │
└────────┴──────────────────────┴───────┘
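If the matching words are extracted in Java rather than in Cypher, the wildcard term can be translated into a regular expression first. The sketch below is only an illustration: the class WildcardTerms and its methods are hypothetical names, and it assumes the term uses only the * and ? wildcards and that words are separated by whitespace. Cypher's =~ operator uses Java regular expression syntax, so the same translation idea applies to the pattern in the list comprehension above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Neo4j or Lucene.
final class WildcardTerms {
    // Translate a wildcard term into a case-insensitive Java regex:
    // '*' becomes ".*" (any run of characters), '?' becomes "." (exactly one character),
    // and every other run of characters is quoted so it matches literally.
    static Pattern toPattern(String wildcardTerm) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcardTerm.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    // Return the whitespace-separated words of text that match the whole wildcard term.
    static List<String> matchedWords(String text, String wildcardTerm) {
        Pattern pattern = toPattern(wildcardTerm);
        List<String> words = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (pattern.matcher(word).matches()) {
                words.add(word);
            }
        }
        return words;
    }
}
```

For example, WildcardTerms.matchedWords("Jones Smith Brown Smyth", "Sm*th") returns [Smith, Smyth]. This translation is stricter than '(?i).*re.*': for "?re*" it requires exactly one character before "re", which is what the full-text query asked for.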

Upvotes: 1
