Reputation: 1782
I have an application that (among other things) stores a file system tree in a Neo4j graph; that is, each directory and file is a node. Some of these files are Office documents, text files or PDFs, and I would like to provide some search functionality.
The search should scan both node properties and file content, and return the most relevant nodes.
--------------------------------------------------
update for extra information:
The graph lets me filter down to a subset of files. File nodes also contain custom metadata that needs to be searched. One of many applications is:
A user searches for a "term" > the graph is used to find the files this search applies to (depending on user groups & rights, for example) > both the node properties and the file content of those files are searched for "term" > the most relevant results are returned.
Some files might also be linked to others for one reason or another, and those linked files should be searched too, but with lower priority (a "term" hit there should ideally count for less than a hit on the initial file).
The real-life case is about ten times more complex than this, so I cannot replace or remove the graph DB, or drop the influence of the graph results on result relevance.
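To make the weighting idea concrete, here is a rough sketch of what I mean. The names `rank` and `LINKED_WEIGHT`, the tuple layout and the 0.5 discount are all made up for illustration; they are not from my real code.

```python
LINKED_WEIGHT = 0.5  # assumed discount for files reached through a link


def rank(candidates, linked_weight=LINKED_WEIGHT):
    """Score candidates and return them best-first.

    candidates: iterable of (file_id, term_hits, reached_via_link) tuples,
    where reached_via_link is True for files found by following a link
    from an initial file rather than matched directly.
    """
    scored = []
    for file_id, term_hits, reached_via_link in candidates:
        # A hit in a linked file counts for less than a hit in an initial file.
        weight = linked_weight if reached_via_link else 1.0
        scored.append((file_id, term_hits * weight))
    # Highest score first; ties broken by file id for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

So a linked file with 3 hits would rank below an initial file with 2 hits, which is the behaviour I am after.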
--------------------------------------------------
My questions are:
Thanks in advance guys.
Further details:
Upvotes: 0
Views: 454
Reputation: 7521
If you want to scan file content, you're probably better off choosing another data store for that content. Neo4j works well for searching things like file names and directory structures, but I believe you're talking about scanning byte arrays, and there are better systems out there for that.
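As a rough illustration of that split, the sketch below keeps content search in a separate index and only uses the graph to decide which file ids are allowed. `ContentIndex` is a toy in-memory stand-in for a real full-text engine, and `allowed_ids` is assumed to come from whatever graph query you already run for groups and rights. Everything here is hypothetical, not a Neo4j API.

```python
import re
from collections import Counter, defaultdict


class ContentIndex:
    """Toy in-memory inverted index; a stand-in for a real full-text engine."""

    def __init__(self):
        # term -> Counter mapping file_id to number of occurrences
        self._postings = defaultdict(Counter)

    def add(self, file_id, text):
        """Index the extracted text of one file under its graph node id."""
        for token in re.findall(r"\w+", text.lower()):
            self._postings[token][file_id] += 1

    def search(self, term, allowed_ids):
        """Return (file_id, hit_count) pairs, restricted to allowed_ids."""
        hits = self._postings.get(term.lower(), {})
        matches = [(fid, n) for fid, n in hits.items() if fid in allowed_ids]
        # Most hits first; ties broken by file id for a stable order.
        return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


index = ContentIndex()
index.add("f1", "budget report, budget draft")
index.add("f2", "budget summary")
index.add("f3", "holiday photos")

allowed = {"f1", "f3"}  # assumed: ids returned by the graph query
print(index.search("budget", allowed))
```

The point is only that the graph answers "which files?" and the content index answers "how well does each one match?"; you would join the two on the node id.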
Upvotes: 2