Reputation: 53
I have a working Adobe ColdFusion application that indexes roughly 2,000 PDF files via Solr and returns the expected results. However, each search query against the collection generally takes 25-30 seconds.
This is how I indexed the 2k PDF files to Solr:
<!--- query database files --->
<cfset getfiles = application.file.getfiles()>
<!--- create solr query set --->
<cfset filesQuery = QueryNew("
    fileUID
    , filepath
    , title
    , description
    , fileext
    , added
")>
<!--- create new file query with key path and download url --->
<cfoutput query="getfiles">
    <cfset ext = trim(getfiles.fileext)>
    <cfset path = expandpath('/docs/#fileUID#.#ext#')>
    <cfscript>
        newRow = QueryAddRow(filesQuery);
        QuerySetCell(filesQuery, "fileUID", "#fileUID#");
        QuerySetCell(filesQuery, "filepath", "#path#");
        QuerySetCell(filesQuery, "title", "#filename#");
        QuerySetCell(filesQuery, "description", "#description#");
        // fileext is used as custom1 below, so it has to be populated too
        QuerySetCell(filesQuery, "fileext", "#ext#");
        QuerySetCell(filesQuery, "added", "#added#");
    </cfscript>
</cfoutput>
<!--- index the bunch --->
<cfindex
    query = "filesQuery"
    collection = "resumes"
    action = "update"
    type = "file"
    key = "filepath"
    title = "title"
    body = "title, description"
    custom1 = "fileext"
    custom2 = "added"
    category = "file"
    status = "filestatus">
This is how the files are being searched and where the (25-30 second) Solr search happens:
<!--- imagine form with (form.search) for terms --->
<cfsearch name = "results"
    collection = "resumes"
    criteria = "#form.search#"
    contextPassages = "1"
    contextBytes = "300"
    maxrows = "100"
    contextHighlightBegin = "<strong>"
    contextHighlightEnd = "</strong>">
<!--- show (results) query --->
Some additional info on the project: all of the files are less than 1 page in length, so no content was cut off when the files were indexed into Solr. I have played with the Solr buffer limit within the ColdFusion Administrator with no major discernible change in time (currently at 40). I am on a development VM with MS Server 2003, a 1.86 GHz Xeon, Adobe ColdFusion 9.0.1 and 1GB RAM. The JVM is Sun Microsystems (14.3-b01). Almost nothing else is running server-side, so performance should be unaffected by external factors.
It is providing expected and perfect results, just not in a timely fashion.
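To narrow down where the 25-30 seconds go, the same search can be timed with and without the context-passage attributes. This is only a minimal sketch, assuming form.search has already been submitted; the variable names (plainResults, contextResults, plainMs, contextMs) are made up for illustration. Solr may serve the second call from a warm cache, so it is worth repeating the test or swapping the order before drawing conclusions.

```cfm
<!--- Sketch only: assumes form.search exists, as in the search code above --->

<!--- 1) plain search, no context passages or highlighting --->
<cfset startPlain = getTickCount()>
<cfsearch name = "plainResults"
    collection = "resumes"
    criteria = "#form.search#"
    maxrows = "100">
<cfset plainMs = getTickCount() - startPlain>

<!--- 2) same search with context passages requested --->
<cfset startContext = getTickCount()>
<cfsearch name = "contextResults"
    collection = "resumes"
    criteria = "#form.search#"
    contextPassages = "1"
    contextBytes = "300"
    maxrows = "100">
<cfset contextMs = getTickCount() - startContext>

<cfoutput>
    Without context: #plainMs# ms (#plainResults.recordCount# rows)<br>
    With context: #contextMs# ms (#contextResults.recordCount# rows)
</cfoutput>
```

If the two timings are close, the context attributes are probably not the main cost and the delay lies elsewhere.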
Upvotes: 4
Views: 1529
Reputation: 7519
You could try using CFSolrLib. It uses the Solr API. It's possible you may get a performance boost by bypassing <cfsearch>.
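A quick way to check whether bypassing <cfsearch> helps, before pulling in a library, is to call Solr's HTTP select handler directly with <cfhttp>. This is a rough sketch, not CFSolrLib's API. It assumes the bundled Solr instance listens on localhost:8983 and exposes the collection as a core named "resumes"; adjust host, port and path to whatever your installation actually uses. The variable names (solrUrl, solrResponse, solrData) are hypothetical.

```cfm
<!--- Sketch only: query Solr over HTTP instead of going through cfsearch --->
<!--- Assumed: Solr on localhost:8983 with the collection exposed as core "resumes" --->
<cfset solrUrl = "http://localhost:8983/solr/resumes/select">

<cfhttp url="#solrUrl#" method="get" result="solrResponse" timeout="30">
    <!--- q = search terms, rows = max results, wt = response format --->
    <cfhttpparam type="url" name="q" value="#form.search#">
    <cfhttpparam type="url" name="rows" value="100">
    <cfhttpparam type="url" name="wt" value="json">
</cfhttp>

<!--- Solr's JSON response carries the hit count and its own query time (QTime, in ms) --->
<cfset solrData = deserializeJSON(solrResponse.fileContent)>
<cfoutput>
    Found #solrData.response.numFound# documents, Solr QTime: #solrData.responseHeader.QTime# ms
</cfoutput>
```

If QTime is small while <cfsearch> still takes 25-30 seconds for the same terms, the overhead is likely on the ColdFusion side rather than in Solr itself.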
Upvotes: 2