Reputation: 11
I am trying to understand how this stateful application stores data in its PVCs. My understanding was that Kubernetes StatefulSets (for example a MySQL stateful application) keep a duplicate copy of the same data in every PVC/replica, so all PVCs should hold the same data, and if a pod goes down the client gets the requested data from another pod whose PVC has the same data.
I've installed Qdrant with Helm in AKS (4 nodes, 4 replicas with 4 PVCs: 4 Azure SSD disks with the default storage class, 500G each). We have pushed 1M collections with shard values, and I can see that only two of the PVCs got data, shown below:
k exec q181-qdrant-0 -- df -ah
Filesystem Size Used Avail Use% Mounted on
/dev/sde 503G 14G 490G 3% /qdrant/storage
k exec q181-qdrant-1 -- df -ah
/dev/sdc 503G 24G 480G 5% /qdrant/storage
k exec q181-qdrant-2 -- df -ah
/dev/sdc 503G 5.4M 503G 1% /qdrant/storage
k exec q181-qdrant-3 -- df -ah
/dev/sdd 503G 9.1M 503G 1% /qdrant/storage
k get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
q181-qdrant-0 1/1 Running 0 4h44m 10.3.6.155 aks-qdrant64gb-12987685-vmss000000 <none> <none>
q181-qdrant-1 1/1 Running 0 4h49m 10.3.6.139 aks-qdrantnoaz-19721947-vmss000005 <none> <none>
q181-qdrant-2 1/1 Running 0 4h48m 10.3.6.11 aks-qdrant64gb-12987685-vmss000004 <none> <none>
q181-qdrant-3 1/1 Running 0 4h50m 10.3.6.20 aks-qdrant64gb-12987685-vmss000003 <none> <none>
GET /cluster #Result
{
"result": {
"status": "enabled",
"peer_id": 3922056191064933,
"peers": {
"3922056191064933": {
"uri": "http://q181-qdrant-0.q181-qdrant-headless:6335/"
},
"612800828958104": {
"uri": "http://q181-qdrant-2.q181-qdrant-headless:6335/"
},
"5183492229046375": {
"uri": "http://q181-qdrant-3.q181-qdrant-headless:6335/"
},
"1755630610601120": {
"uri": "http://q181-qdrant-1.q181-qdrant-headless:6335/"
}
},
"raft_info": {
"term": 402,
"commit": 871,
"pending_operations": 0,
"leader": 1755630610601120,
"role": "Follower",
"is_voter": true
},
"consensus_thread_status": {
"consensus_thread_status": "working",
"last_update": "2024-03-20T18:10:53.895767796Z"
},
"message_send_failures": {}
},
"status": "ok",
"time": 0.0000052
}
I am trying to create shards in Qdrant and understand how it stores the data. StatefulSets in Kubernetes are supposed to store multiple copies of the same data in all the replicas, but here that is not happening.
Upvotes: 0
Views: 303
Reputation: 19
Which shard is at which node? You need to call GET /collections/my-collection/cluster for this info. You'll get something like:
{
"result": {
"peer_id": 6534799014422152,
"shard_count": 3,
"local_shards": [
{
"shard_id": 0,
"points_count": 62223,
"state": "Active"
},
{
"shard_id": 1,
"points_count": 71779,
"state": "Active"
}
],
"remote_shards": [
{
"shard_id": 0,
"peer_id": 2455022185625782,
"state": "Active"
},
{
"shard_id": 1,
"peer_id": 1073561671703112,
"state": "Active"
}
],
"shard_transfers": []
},
"status": "ok",
"time": 2.1128e-05
}
You can correlate this with the results of GET /cluster to find the shards for each node.
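For example, here is a minimal sketch of how to check one pod at a time (the collection name my-collection is a placeholder and 6333 is Qdrant's default HTTP port; adjust both to your deployment):
kubectl port-forward q181-qdrant-0 6333:6333 &
# local_shards lists the shards stored on this peer's own PVC
curl -s http://localhost:6333/collections/my-collection/cluster | jq '.result.peer_id, .result.local_shards'
kill %1   # stop the port-forward, then repeat for q181-qdrant-1..3
With the default settings each shard appears in the local_shards of exactly one peer, so a pod with no local shards will show an almost empty PVC, which would explain your df output.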
For the rest of your questions: a StatefulSet does not copy data between replicas by itself; whether data is duplicated is up to the application. Qdrant keeps only one copy of each shard unless you set replication_factor higher than its default of 1 when creating the collection, which is why only some of your PVCs contain data. All pods belong to the same cluster as long as they have the same bootstrap URI. Read more at https://qdrant.tech/documentation/guides/distributed_deployment.
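If you want every shard stored on more than one node, set replication_factor (and optionally shard_number) when you create the collection. A sketch, with placeholder collection name and vector parameters:
curl -X PUT http://localhost:6333/collections/my-collection \
  -H 'Content-Type: application/json' \
  -d '{
        "vectors": { "size": 768, "distance": "Cosine" },
        "shard_number": 4,
        "replication_factor": 2
      }'
# replication_factor: 2 keeps two copies of every shard on different peers,
# so the data survives the loss of a single pod/PVC.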
Upvotes: 1