Reputation:
I try to index PDF files with Sitecore 7. I installed IFilter , but I received on crawlers log next error :
ManagedPoolThread #17 09:24:20 WARN LuceneIndexOperations : Update : Could not build document data 4433434-3443-3223-91c4-233232. Skipping.
Exception: System.Runtime.InteropServices.COMException
Message: Error HRESULT E_FAIL has been returned from a call to a COM component.
Source: mscorlib
at System.Runtime.InteropServices.ComTypes.IPersistFile.Load(String pszFileName, Int32 dwMode)
at Sitecore.ContentSearch.Extracters.IFilterTextExtraction.FilterLoader.LoadAndInitIFilter(String fileName, String extension)
at Sitecore.ContentSearch.Extracters.IFilterTextExtraction.FilterReader..ctor(String fileName)
at Sitecore.ContentSearch.ComputedFields.MediaItemIFilterTextExtractor.ComputeFieldValue(IIndexable indexable)
at Sitecore.ContentSearch.ComputedFields.MediaItemContentExtractor.ComputeFieldValue(IIndexable indexable)
at Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder.AddComputedIndexFields()
at Sitecore.ContentSearch.LuceneProvider.LuceneIndexOperations.GetIndexData(IIndexable indexable, IIndexable latestVersion, IProviderUpdateContext context)
at Sitecore.ContentSearch.LuceneProvider.LuceneIndexOperations.BuildDataToIndex(IProviderUpdateContext context, IIndexable version, IIndexable latestVersion)
at Sitecore.ContentSearch.LuceneProvider.LuceneIndexOperations.<>c__DisplayClass7.<Update>b__0(Item version)
What I have to do work because on Sitecore documentation they said it must work out of the box.
Upvotes: 6
Views: 2435
Reputation: 353
Here is the solution I took since I didn't like the idea of coping iFilter related DLLs into the system path.
%ProgramFiles%\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin\
.iisreset
For your consideration:
After these steps PDF indexing started working fine on my instance of Sitecore 7 running on Windows 8.1.
Upvotes: 2
Reputation:
I had the same issue and I received from Sitecore support next response (it works fine after):
1) Copy all the Adobe iFilter .dll files into the "\System32\Inetsrv" folder. This is the working directory for IIS on Windows Server. The Adobe iFilter .dll files are stored at the "C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin" folder by default. Also you can use the "IFilter Explorer" tool to detect the folder where the .dll files are stored: http://www.citeknet.com/Products/IFilters/IFilterExplorer/tabid/62/Default.aspx For more details please see the screenshot: http://screencast.com/t/xmWukanM+
2) Delete all the files under the "Website/App_Data/MediaCache" folder;
3) Rebuild the Sitecore Search Indexes (Sitecore -> Control Panel -> Indexing -> Indexing Manager);
4) Clear the Sitecore cache (the http://{hostname}/sitecore/admin/cache.aspx tool); 5) Restart the IIS;
Upvotes: 5