arthur

Reputation: 1064

How to delete all attributes from the schema in solr?

Deleting all documents from solr is

curl 'http://localhost:8983/solr/trans/update?commit=true' -d "<delete><query>*:*</query></delete>"

Adding a (static) attribute to the schema is

curl -X POST -H 'Content-type:application/json' --data-binary '{ "add-field":{"name":"trans","type":"string","stored":true, "indexed":true}}' http://localhost:8983/solr/trans/schema

Deleting one attribute is

curl -X POST -H 'Content-type:application/json' -d '{ "delete-field":{"name":"trans"}}' http://arteika:8983/solr/trans/schema

Is there a way to delete all attributes from the schema?

Upvotes: 3

Views: 2530

Answers (1)

Just a student

Reputation: 11050

From at least version 6.6 of the Schema API up to the current version 7.5, you can pass multiple commands in a single POST (see the 6.6 and 7.5 documentation, respectively). Several formats are accepted, but the most intuitive one (I think) is passing an array for the action you want to perform:

curl -X POST -H 'Content-type: application/json' -d '{
  "delete-field": [
    {"name": "trans"},
    {"name": "other_field"}
  ]
}' 'http://arteika:8983/solr/trans/schema'
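
For reference, the same request body can be built programmatically. This is only a minimal sketch: build_command_payload is a hypothetical helper name, not part of Solr or any client library, and it only serializes the JSON shown above without sending anything.

```python
import json


def build_command_payload(command, items):
    """Serialize one Schema API command with a list of arguments.

    `command` is a command name such as "delete-field"; `items` is a list
    of argument objects, one per field to act on. (Hypothetical helper.)
    """
    return json.dumps({command: list(items)})


# Same body as the curl example: delete two fields in a single POST.
payload = build_command_payload(
    "delete-field",
    [{"name": "trans"}, {"name": "other_field"}],
)
print(payload)
```

The resulting string can be sent as the POST body with the Content-type: application/json header, exactly as in the curl call.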

So how do we obtain the names of the fields we want to delete? By querying the schema:

curl -X GET -H 'Content-type: application/json' 'http://arteika:8983/solr/trans/schema'

The relevant parts of the response are the copyFields, dynamicFields and fields keys of the schema object.
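
To illustrate which parts of the response matter, here is a small sketch that pulls the names out of an already-parsed response. The sample dictionary is made up for illustration (a real schema has many more entries and keys), and schema_names is a hypothetical helper name.

```python
def schema_names(response):
    """Collect what is needed to build delete commands from a parsed schema response."""
    schema = response["schema"]
    return {
        # copy field rules are identified by their (source, dest) pair
        "copyFields": [(c["source"], c["dest"]) for c in schema.get("copyFields", [])],
        # dynamic field rules and fields are identified by name
        "dynamicFields": [d["name"] for d in schema.get("dynamicFields", [])],
        "fields": [f["name"] for f in schema.get("fields", [])],
    }


# Made-up sample in the shape described above (assumption, heavily trimmed).
sample = {
    "schema": {
        "fields": [{"name": "trans", "type": "string"}],
        "dynamicFields": [{"name": "*_s", "type": "string"}],
        "copyFields": [{"source": "trans", "dest": "trans_copy"}],
    }
}

names = schema_names(sample)
print(names)
```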

I automated clearing all copy field rules, dynamic field rules and fields as follows. You can of course use any kind of script that is available to you. I used Python 3 (might work with Python 2, I did not test that).

import json
import requests

# load schema information once and reuse the parsed response
api = 'http://arteika:8983/solr/trans/schema'
headers = {'Content-type': 'application/json'}
schema = requests.get(api).json()['schema']

# delete copy field rules first, since they refer to fields
payload = {'delete-copy-field': [{'source': o['source'], 'dest': o['dest']}
                                 for o in schema['copyFields']]}
requests.post(api, data=json.dumps(payload), headers=headers)

# delete dynamic field rules
payload = {'delete-dynamic-field': [{'name': o['name']}
                                    for o in schema['dynamicFields']]}
requests.post(api, data=json.dumps(payload), headers=headers)

# delete fields
payload = {'delete-field': [{'name': o['name']} for o in schema['fields']]}
requests.post(api, data=json.dumps(payload), headers=headers)

One note: at first I received status 400 responses with null error messages. It took me a while to figure out a fix, so I'm sharing what worked for me. Setting the default attribute of the updateRequestProcessorChain in solrconfig.xml to false (default="${update.autoCreateFields:false}") and restarting the Solr service made those errors go away. The fields I was deleting had been created automatically, which may have something to do with it.
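
Since the script above ignores the responses to its POST requests, failures like these 400s are easy to miss. Below is a rough sketch of a check you could run on each response's status code and body text. Note the assumptions: summarize_response is a hypothetical helper, and the "error" and "msg" keys are my guess at the shape of the error body, so adjust them to whatever your Solr version actually returns.

```python
import json


def summarize_response(status_code, body_text):
    """Return a one-line summary of a Schema API response (hypothetical helper)."""
    try:
        body = json.loads(body_text)
    except ValueError:
        # Not JSON at all: show the start of the raw body instead.
        return f"HTTP {status_code}: non-JSON body: {body_text[:200]!r}"
    # Assumed shape: failures carry an "error" object with a "msg" string.
    error = body.get("error") if isinstance(body, dict) else None
    if status_code < 400 and not error:
        return f"HTTP {status_code}: ok"
    msg = error.get("msg") if isinstance(error, dict) else None
    return f"HTTP {status_code}: {msg or 'no error message in response'}"


print(summarize_response(400, '{"error": {"msg": null, "code": 400}}'))
```

With requests, this would be called as summarize_response(r.status_code, r.text) after each post.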

Upvotes: 2
