Reputation: 6548
I am having a hard time understanding some of the terminology of distributed computing:
1) What is a node? Does it refer to a machine within a distributed system, or to a process run by a single machine?
2) What is the relation between a shard and a node within a cluster?
3) I understand that sharding is a separation of the data inside a table/collection across multiple shards using shard keys. Is sharding a physical separation or a logical separation?
Upvotes: 2
Views: 239
Reputation: 6548
I found all my answers and cleared up my confusion here: Elastic Search 5.x: Basic Concepts
Note: this reference guide is for version 5.x. I was previously looking at the 2.x version, which does not have a clear explanation of these issues. The links provided by @Artholl in his answer also belong to 2.x.
Upvotes: 0
Reputation: 93
Considering the elasticsearch tag on your question, here is the Elasticsearch nomenclature:
According to https://www.elastic.co/guide/en/elasticsearch/guide/current/_an_empty_cluster.html
Elasticsearch Node:
A node is a running instance of Elasticsearch.
Elasticsearch Cluster:
A cluster consists of one or more nodes with the same cluster.name that are working together to share their data and workload.
According to https://www.elastic.co/guide/en/elasticsearch/guide/current/_add_an_index.html
Elasticsearch Shard:
A shard is a low-level worker unit that holds just a slice of all the data in the index.
A shard is a single instance of Lucene, and is a complete search engine in its own right.
Okay, now we have seen the concepts of cluster, node and shard in Elasticsearch. These definitions are quite different from the ones given by xosp7tom, because they are specific to ES.
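To make the shard/node relation a bit more concrete, here is a minimal Python sketch. It assumes the routing rule the guide describes (roughly `hash(routing) % number_of_primary_shards`). The names `shard_for` and `allocate`, the node names, and the round-robin placement are all hypothetical, and `zlib.crc32` is only a stand-in for whatever hash Elasticsearch really uses.

```python
import zlib


def shard_for(routing_key, num_primary_shards):
    """Map a document's routing key to a shard number.

    Assumption: crc32 stands in for the real hash function.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def allocate(num_shards, nodes):
    """Place shards on nodes round-robin (hypothetical placement policy)."""
    return {shard: nodes[shard % len(nodes)] for shard in range(num_shards)}


nodes = ["node-1", "node-2"]      # two running instances in one cluster
placement = allocate(5, nodes)    # five shards spread over those nodes

for doc_id in ["user-1", "user-2", "user-3"]:
    shard = shard_for(doc_id, 5)
    print(doc_id, "-> shard", shard, "on", placement[shard])
```

In this sketch the shard a document belongs to depends only on its key and the shard count, while the node holding that shard is a separate placement decision. That suggests reading sharding as a logical partition of the index, with each shard physically stored on whichever node currently hosts it.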
One piece of advice: read the Elasticsearch chapter at https://www.elastic.co/guide/en/elasticsearch/guide/current/distributed-cluster.html if you want more information on how the Elasticsearch team built their distributed search engine. It is pretty interesting and a good introduction to distributed systems!
Upvotes: 1
Reputation: 2183
To 1):
A node refers to one machine of a cluster. A socket refers to one processor of a machine. A core refers to one processing unit of a socket. A CPU is typically the same as a core.
For example, Tianhe-2, as one cluster, has 16,000 nodes, 32,000 CPU sockets (plus 48,000 Xeon Phi coprocessors), and 3,120,000 cores in total. https://www.top500.org/system/177999
Upvotes: 1