Reputation: 193
We are working on a project with a running Milvus standalone instance that has approximately 70M entities inserted.
However, we encountered an issue with the delete operation. We attempted to delete entities using the method linked here: https://github.com/milvus-io/pymilvus/blob/131989ac95587ce4664109ba9a2e2323cea42f33/pymilvus/orm/collection.py#L522.
The call to delete appeared to succeed, but the entities were not deleted: the total entity count stayed the same. The collection was loaded with MMAP enabled. Why did the delete call have no effect, and what is the correct way to delete entities from a collection?
We also have a question about inserting into an existing collection. New data will be inserted into the collection daily, so do we need to call create_index and load again every time new data is inserted, or will Milvus build the index for new data automatically, given that the collection has MMAP enabled? Also, do we need to release the collection to build the index for new data?
Upvotes: 4
Views: 317
Reputation: 54
The delete operation in Milvus is an asynchronous process. When you call delete(), the request is first received by the proxy node, which forwards it to Pulsar or Kafka. The data node and query node then consume the request from Pulsar or Kafka asynchronously. The delete operation is fully complete only once the request has been applied to the appropriate segment.
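As a minimal sketch of issuing such a delete from pymilvus, the helper below builds a primary-key expression and passes it to delete(). The helper name, the primary key field name "id", and the way the collection object is obtained are assumptions for illustration, not something stated in the question.

```python
import json


def delete_by_primary_keys(collection, pk_field, pk_values):
    """Delete entities whose primary key is in pk_values.

    `collection` is assumed to be a loaded pymilvus Collection, obtained
    elsewhere (for example Collection("my_collection") after connecting).
    Returns the delete count reported by the server for this request.
    """
    pk_values = list(pk_values)
    if not pk_values:
        return 0  # nothing to delete, avoid sending an empty expression
    # json.dumps renders ints as [1, 2, 3] and strings as ["a", "b"],
    # which gives an expression such as: id in [1, 2, 3]
    expr = f"{pk_field} in {json.dumps(pk_values)}"
    result = collection.delete(expr)
    # The request has been accepted at this point; because deletes are
    # applied asynchronously, it may not be visible to every reader yet.
    return result.delete_count
```

For example, delete_by_primary_keys(collection, "id", [1, 2, 3]) sends the expression id in [1, 2, 3].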
The reason collection.num_entities shows the same number of entities after deletion is that this property does not take deleted items into account. Instead, I recommend using collection.query(expr="", output_fields=["count(*)"]). This query accounts for the deleted items and returns the correct number of rows after deletion.
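A small sketch of that count query, wrapped in a helper, might look as follows. The helper name is hypothetical, and passing consistency_level="Strong" is an assumption on my part so that the query waits for recent deletes to be applied; drop it if your pymilvus version or workload does not call for it.

```python
def count_live_entities(collection):
    """Return the number of rows that remain after deletes are applied.

    `collection` is assumed to be a loaded pymilvus Collection.
    """
    rows = collection.query(
        expr="",                      # empty expression: match every entity
        output_fields=["count(*)"],   # ask for a row count instead of fields
        consistency_level="Strong",   # assumption: see the latest deletes
    )
    # The result is a list with a single row keyed by "count(*)".
    return rows[0]["count(*)"]
```

Comparing this value before and after a delete call is a more reliable check than comparing collection.num_entities.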
For inserting data into an existing collection, there's no need to call create_index() again. Calling create_index() once sets the index type, and each segment has its own independent index. When a new segment is generated, Milvus will automatically build an index for it. If you wish to change the index type or its parameters, you should release the collection, call drop_index() to remove the old index, and then use create_index() to create the new one.
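The two cases can be sketched as below: a daily insert that leaves the index alone, and an index change that follows the release, drop, create sequence. Both helper names, the flush() call after the insert, the final load() call, and any index parameters you pass in are assumptions for illustration; a collection with more than one index would also need an index name passed to drop_index().

```python
def append_daily_batch(collection, rows):
    """Insert new data; no create_index() call is needed afterwards.

    `collection` is assumed to be a pymilvus Collection that already has
    an index defined. Returns the number of inserted entities.
    """
    result = collection.insert(rows)
    # Assumption: flushing seals the newly written data so Milvus can
    # build the per-segment index in the background.
    collection.flush()
    return result.insert_count


def replace_index(collection, field_name, index_params):
    """Switch index type or parameters: release, drop, create, reload."""
    collection.release()      # the collection must not be loaded
    collection.drop_index()   # remove the old index
    collection.create_index(field_name=field_name, index_params=index_params)
    collection.load()         # make the collection searchable again
```

A call such as replace_index(collection, "embedding", {"index_type": "HNSW", "metric_type": "L2", "params": {"M": 16, "efConstruction": 200}}) uses a hypothetical field name and example parameter values, not ones taken from the question.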
Upvotes: 1