user3341332

Reputation: 439

Analysis of Solr index

I have a Solr 4 instance running with about two million entries, each of which is a notice published by a stock exchange. To give you an idea of the schema, the main components are as follows:

<field name="UID" type="string" indexed="true" stored="true" required="true" multiValued="false" /> 
<field name="company" type="text_general" indexed="true" stored="true" />
<field name="datetime" type="date" indexed="true" stored="true" />
<field name="title" type="text_general" indexed="true" stored="true" />
<field name="url" type="text_general" indexed="true" stored="true" />
<field name="notice" type="text_general" indexed="true" stored="true" />
<field name="cachefile" type="text_general" indexed="true" stored="true" />

Is there a way to prepare queries that will give me some interesting facts and figures about the index?

For example:

At the moment I'm not sure whether this can be done with some clever query syntax, or whether I need to use a module (Statistics/Analytics?).

Upvotes: 0

Views: 56

Answers (1)

Toke Eskildsen

Reputation: 729

  1. Top ten companies that have entries (and the number of notices for each): Facet on company and do a *:* (match-all) search. If there is one document per notice, the facet counts in the response are the result you want.
  2. Number of notices published each year: Do range faceting on datetime with year as the gap.
  3. Most and least popular day/month for publishing notices: Add two explicit fields for day and month and facet on those. Maybe also index the weekday, while you are at it?
  4. Most popular hour of the day for publishing notices: Make a field containing only the hour, facet on that.
  5. Longest notice (by number of characters): A function query is the candidate here, but there is no strLength-function. Besides, it would not work as you use a text-field for the notices. Instead you could introduce a new field containing the length of the notice and sort on that.
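
As a rough illustration of the suggestions above, here is a minimal Python sketch. The first two functions build the request parameters for a match-all facet query on company and for a yearly range facet on datetime, using Solr's standard facet parameters. The third derives the extra values for items 3 to 5 before a document is indexed. The field names day, month, weekday, hour and notice_length are hypothetical and would have to be added to the schema, and the sketch assumes datetime arrives as a string in Solr's UTC format. Note also that faceting on a tokenized text_general field counts individual terms, so whole company names would likely need a string-typed copy of the field, which is not shown here.

```python
from datetime import datetime
from urllib.parse import urlencode


def top_companies_params(limit=10):
    """Item 1: match-all query that returns only facet counts for `company`."""
    return [
        ("q", "*:*"),
        ("rows", "0"),            # no documents needed, only the counts
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", "company"),
        ("facet.limit", str(limit)),
    ]


def notices_per_year_params(start_year, end_year):
    """Item 2: range facet on `datetime` with a gap of one year."""
    return [
        ("q", "*:*"),
        ("rows", "0"),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.range", "datetime"),
        ("facet.range.start", f"{start_year}-01-01T00:00:00Z"),
        ("facet.range.end", f"{end_year}-01-01T00:00:00Z"),
        ("facet.range.gap", "+1YEAR"),
    ]


def derived_fields(doc):
    """Items 3-5: extra values to store alongside a notice at index time.

    The returned field names are hypothetical, not part of the original schema.
    """
    ts = datetime.strptime(doc["datetime"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "day": ts.day,
        "month": ts.month,
        "weekday": ts.isoweekday(),   # 1 = Monday ... 7 = Sunday
        "hour": ts.hour,
        "notice_length": len(doc["notice"]),
    }


# Query strings to append to the core's /select handler URL.
print(urlencode(top_companies_params()))
print(urlencode(notices_per_year_params(2005, 2015)))
```

Once the derived fields are indexed, the day, month and hour questions become plain field facets like the first function, and the longest notice is the first hit when sorting on notice_length in descending order.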

Upvotes: 1
