Reputation: 175
I'm trying to delete documents from a Solr index using pysolr, both by id and by query. In both cases the operation fails for ids like this one: cr-10.1002/(sici)1520-6688(199621)15:2<476::aid-pam7>3.3.co;2-2
with the following error:
pysolr.SolrError: Solr responded with an error (HTTP 400): [Reason: Unexpected character '4' (code 52) in content after '<' (malformed start element?).
at [row,col {unknown-source}]: [1,53]]
The Lucene query parser documentation (https://lucene.apache.org/core/7_2_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters) makes no mention of escaping angle brackets at all. I tried escaping them anyway, with no luck.
Any idea what I can do to delete these documents?
EDIT: updated the ID to match the error
Upvotes: 0
Views: 1065
Reputation: 175
I ended up using the JSON API like this:
import requests
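url = 'http://localhost:8983/solr/collection/update'  # update endpoint of the collection
ids_to_delete = ['a', 'b<c', 'd:e']  # sent as JSON strings, so '<' and ':' need no escaping
response = requests.post(url, json={'delete': ids_to_delete})
response.raise_for_status()  # raise if Solr returns an HTTP error status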
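The error in the question reads like an XML parser complaint, which suggests (this is an inference, not something I verified in pysolr) that the id reached Solr inside an XML body with the `<` left unescaped. Sending JSON sidesteps that. Below is a minimal sketch of building the JSON bodies for both delete-by-id and delete-by-query. The helper names `build_delete_by_id_payload` and `build_delete_by_query_payload` are hypothetical, and quoting the value as a phrase assumes the field is a plain string field such as `id`.

```python
import json


def build_delete_by_id_payload(ids):
    """Build the JSON body for a Solr delete-by-id command (hypothetical helper)."""
    # Ids are literal values here, not query syntax, so nothing is escaped.
    return {'delete': list(ids)}


def build_delete_by_query_payload(field, value):
    """Build the JSON body for a Solr delete-by-query command (hypothetical helper)."""
    # Inside a quoted phrase only backslashes and double quotes need escaping;
    # characters such as '<', ':' and '(' are taken literally.
    quoted = value.replace('\\', '\\\\').replace('"', '\\"')
    return {'delete': {'query': f'{field}:"{quoted}"'}}


doc_id = 'cr-10.1002/(sici)1520-6688(199621)15:2<476::aid-pam7>3.3.co;2-2'

by_id = build_delete_by_id_payload([doc_id])
by_query = build_delete_by_query_payload('id', doc_id)

# Show the exact bodies that would be posted to the update endpoint.
print(json.dumps(by_id))
print(json.dumps(by_query))
```

Either dictionary can be passed as `json=` to the `requests.post` call above. Depending on the autocommit settings of the collection, the deletion may not be visible until a commit happens.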
Upvotes: 2