Reputation: 671
QUESTION below:
Data structure in SOLR:
<field name="id" type="string" required="true"/>
<field name="session_id" type="string" required="true"/>
<field name="action_type" required="true"/>
<field name="error_msg" required="false"/>
(all fields have: indexed="true" stored="true" multiValued="false") only 'error' field is not required (can be null).
There is equivalent table in oracle:
TABLE SOLR_TEST
(
ID NUMBER NOT NULL ,
SESSION_ID VARCHAR2(20 BYTE) NOT NULL ,
ACTION_TYPE VARCHAR2(20 BYTE) NOT NULL ,
ERROR_MSG VARCHAR2(20 BYTE)
);
there is sample data (the same for SOLR and Oracle)
ID SESSION_ID ACTION_TYPE ERROR_MSG
-- -------------------- -------------------- --------------------
1 00001 SELECTED_ACTION
2 00001 SELECTED_ACTION
3 00001 OTHER
4 00002 A2 ERROR_001
5 00002 OTHER
6 00003 SELECTED_ACTION ERROR_002
7 00004 A1 ERROR_001
8 00005 A2
9 00005 SELECTED_ACTION
10 00005 SELECTED_ACTION ERROR_003
11 00006 SELECTED_ACTION
12 00006 OTHER ERROR_004
QUESTION:
How to create in SOLR query which will return:
all session_id
which have specified action_type
but never happen specified action_type
with non empty error_msg
or equivalent of this query in Oracle:
select distinct session_id
from SOLR_TEST
where action_type='SELECTED_ACTION'
and not session_id in
( select session_id
from SOLR_TEST
where action_type='SELECTED_ACTION'
and error_msg is not null
);
result for this query is:
SESSION_ID
--------------------
00001
00006
e.g. SOLR query like this is not working:
http://solrhost/solr/collection/select?rows=1&q=-(error_msg:[*+TO+*]+AND+action_type:SELECTED_ACTION)&wt=xml&indent=true&facet=true&facet.field=session_id&facet.zeros=false&fq=action_type:SELECTED_ACTION
// EDIT /////////////////////////////////////
real schema looks like this:
<schema name="elogging" version="1.5">
<fields>
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="action_type" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="session_id" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="error_msg" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="_version_" type="long" indexed="true" stored="true"/>
</fields>
<uniqueKey>id</uniqueKey>
<types>
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="uuid" class="solr.UUIDField" indexed="true"/>
</types>
<updateRequestProcessorChain name="uniq-fields">
<processor class="org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactory">
<lst name="fields">
<str>id</str>
</lst>
</processor>
<processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
</schema>
// EDIT 2 //////////////////////
SOLR query is not working as I expect - this SOLR query returns something like:
select distinct session_id
from SOLR_TEST
where action_type='SELECTED_ACTION'
and error_msg is null;
SESSION_ID
--------------------
00001
00005
00006
Value '00005' is wrong because there is a row:
10 00005 SELECTED_ACTION ERROR_003
// EDIT 3 ////////////
this SOLR query also not working (the same issue as for previous):
http://solrhost/solr/collection/select?rows=1&q=action_type:SELECTED_ACTION+AND+-{!join+from=session_id+to=session_id}error_msg:*+AND+action_type:SELECTED_ACTION&wt=xml&indent=true&facet=true&facet.field=session_id&facet.zeros=false
// EDIT 4 ///////
*fixes schema - 'error_msg' is indexed*
// EDIT 5 /////
There you have sample data for SOLR:
id,session_id,action_type,error_msg
1,00001,SELECTED_ACTION,
2,00001,SELECTED_ACTION,
3,00001,OTHER,
4,00002,A2,ERROR_001
5,00002,OTHER,
6,00003,SELECTED_ACTION,ERROR_002
7,00004,A1,ERROR_001
8,00005,A2,
9,00005,SELECTED_ACTION,
10,00005,SELECTED_ACTION,ERROR_003
11,00006,SELECTED_ACTION,
12,00006,OTHER,ERROR_004
and
result from SOLR for this data and query http://localhost:8983/solr/collection3/select?rows=1&q=-(error_msg:[*+TO+*]+AND+action_type:SELECTED_ACTION)&wt=xml&indent=true&facet=true&facet.field=session_id&facet.zeros=false&fq=action_type:SELECTED_ACTION
:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">30</int>
<lst name="params">
<str name="facet.zeros">false</str>
<str name="facet">true</str>
<str name="indent">true</str>
<str name="q">
-(error_msg:[* TO *] AND action_type:SELECTED_ACTION)
</str>
<str name="facet.field">session_id</str>
<str name="wt">xml</str>
<str name="fq">action_type:SELECTED_ACTION</str>
<str name="rows">1</str>
</lst>
</lst>
<result name="response" numFound="4" start="0">
<doc>
<str name="id">1</str>
<str name="session_id">00001</str>
<str name="action_type">SELECTED_ACTION</str>
<long name="_version_">1449881246216749056</long>
</doc>
</result>
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">
<lst name="session_id">
<int name="00001">2</int>
<int name="00005">1</int>
<int name="00006">1</int>
</lst>
</lst>
<lst name="facet_dates"/>
<lst name="facet_ranges"/>
</lst>
</response>
Upvotes: 1
Views: 1274
Reputation: 2947
This is kind of tricky, because as far as I know(and I would be very happy if someone could prove this wrong) - it's not possible to reuse parts of query result in another query(e.g. filter query or nested query).
So, here is as close as I can get currently:
Query:
http://localhost:8983/solr/stack19588325/select?q=action_type%3A%22SELECTED_ACTION%22&fq=%7B!tag%3Ddt%7Daction_type%3ASELECTED_ACTION+AND+error_msg%3A%5B*+TO+*%5D+AND+_query_%3A%7B!join+from%3Dsession_id+to%3Dsession_id+v%3D%24qq%7D&rows=0&wt=xml&indent=true&facet=true&facet.mincount=1&facet.field={!ex=dt%20key=nonfilter_session_id}session_id&facet.field=session_id&qq=-error_msg:[*%20TO%20*]
Result:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
<lst name="params">
<str name="qq">-error_msg:[* TO *]</str>
<str name="q">action_type:"SELECTED_ACTION"</str>
<arr name="facet.field">
<str>{!ex=dt key=nonfilter_session_id}session_id</str>
<str>session_id</str>
</arr>
<str name="indent">true</str>
<str name="fq">{!tag=dt}action_type:SELECTED_ACTION AND error_msg:[* TO *] AND _query_:{!join from=session_id to=session_id v=$qq}</str>
<str name="facet.mincount">1</str>
<str name="rows">0</str>
<str name="wt">xml</str>
<str name="facet">true</str>
<str name="_">1382878844535</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
</result>
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">
<lst name="nonfilter_session_id">
<int name="00001">2</int>
<int name="00005">2</int>
<int name="00003">1</int>
<int name="00006">1</int>
</lst>
<lst name="session_id">
<int name="00005">1</int>
</lst>
</lst>
<lst name="facet_dates"/>
<lst name="facet_ranges"/>
</lst>
</response>
So, as you see here, we have two different facet results:
So, if there won't be any better choice - you can make an intersection of those two sets, and there will be only those session_id's that are expected.
Upvotes: 1