Reputation: 4884
I am looking for information on techniques, algorithms, etc. for maintaining network integrity in a dynamic peer-to-peer network. Practical implementations, academic papers, and anything else in that category are welcome.
Imagine a network that is solely peer-to-peer based, where each node is only connected to x other nodes. Without a grand list of all nodes, each node is responsible for maintaining its own connection to the network. Nodes go down and come up dynamically, so each node needs to ask its neighbors (and their neighbors?) for new nodes to connect to, in order to maintain its x connections.
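The neighbor-asking scheme described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a real networked implementation: the `Node` class, `TARGET_CONNECTIONS` constant, and in-memory neighbor sets are all hypothetical stand-ins for actual peers and wire messages.

```python
import random

TARGET_CONNECTIONS = 4  # the "x" from the question


class Node:
    """Toy peer that refills its neighbor set by asking its current
    neighbors for their own peer lists (here, reading them directly)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbors = set()

    def known_peers(self):
        # Peers two hops away: the union of our neighbors' neighbor sets,
        # minus ourselves and nodes we are already connected to.
        peers = set()
        for n in self.neighbors:
            peers |= n.neighbors
        peers.discard(self)
        return peers - self.neighbors

    def maintain(self):
        # Top up to TARGET_CONNECTIONS using peers learned from neighbors.
        candidates = list(self.known_peers())
        random.shuffle(candidates)
        while len(self.neighbors) < TARGET_CONNECTIONS and candidates:
            peer = candidates.pop()
            self.neighbors.add(peer)
            peer.neighbors.add(self)
```

In a real network the `known_peers` step would be a request/response exchange, and you would also want liveness checks before counting a neighbor toward x.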
My main interests are avoiding network segmentation (where two halves of the network are connected only through one node on each side, so that if either of those nodes goes down, the network splits in two) and efficient routing (distance metrics, etc.), but anything relating to networks matching this description would be interesting.
I am currently looking at the Chord DHT protocol as it has some similarity to what I'm asking.
Upvotes: 3
Views: 791
Reputation: 37600
Various ad-hoc P2P networks have been developed for ubiquitous computing, and they would probably fit your needs. They have been used, for example, by the military: small capsules are deployed, each talking to its neighbors and usually relaying up to some command center. If you don't have a center, your problem may be related to distributed computing. In any case, here are some links:
Upvotes: 2
Reputation: 3609
Just to avoid reinventing the wheel, take a look at the various routing protocols. OSPF might be a good starting point, given your scenario. Of course there are many, many variables that might make it not the best choice for you. Such as:
Upvotes: 1
Reputation: 40193
Use Chord. http://en.wikipedia.org/wiki/Chord_(peer-to-peer)
I've implemented it in projects before and it addresses these problems.
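The core of Chord is consistent hashing onto a ring: nodes and keys share one identifier space, and a key is owned by its successor, the first node clockwise from it. A minimal sketch (using a 16-bit space for readability; real Chord uses 160-bit SHA-1 IDs, and does the lookup via finger tables rather than a sorted list):

```python
import hashlib
from bisect import bisect_left

M = 16  # bits in the identifier space (real Chord uses 160)


def chord_id(name):
    # Hash a node name or key onto the ring, as Chord does with SHA-1.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)


def successor(node_ids, key_id):
    # The successor of a key is the first node ID at or after it,
    # moving clockwise around the ring (wrapping past the top).
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]
```

When a node fails, its keys fall through to the next node on the ring, which is what makes Chord robust to churn; Chord additionally keeps a successor list precisely to survive the segmentation scenario in the question.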
Upvotes: 0
Reputation: 3414
Have you looked at Kademlia? It is similar to Chord, and versions of it are used by BitTorrent and eMule. The paper lists a few measures for ensuring network integrity, even in the face of attack. The two basic ones are
I'm not sure how much of this applies to Chord, because I haven't read as much about it, but I think going with a DHT is a good idea unless you need fuzzy searching.
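For reference, Kademlia's distinctive feature is its XOR distance metric: the distance between two IDs is simply their bitwise XOR, treated as an integer. Each lookup step keeps the k known nodes closest to the target, which is how the protocol converges on a key. A minimal sketch (the function names are mine, not from the paper):

```python
def xor_distance(a, b):
    # Kademlia's metric: the distance between two IDs is their bitwise XOR.
    return a ^ b


def closest_nodes(known_ids, target, k=3):
    # Keep the k known nodes closest to the target under XOR distance;
    # iterating this against ever-closer nodes is the Kademlia lookup.
    return sorted(known_ids, key=lambda n: xor_distance(n, target))[:k]
```

Because XOR is symmetric, a node tends to receive queries from exactly the nodes it would itself route through, so routing tables stay fresh as a side effect of normal traffic.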
Upvotes: 0
Reputation: 3812
These are my thoughts only -- not a complete solution, and not tested in practice -- but they may still touch on a number of interesting problems and potential solutions.
Standardised time for node failure and rejoining must be recorded and managed. To achieve this, the network does not operate on a real-time basis but on an animation-frame-number basis. Have N front-end processors (FEPs) assigning an FEP ID, a job ID, and a network animation frame number to incoming jobs. There are a number of real-time issues that even quantizing time does not quite address; in some exception cases it's a bit like accounting, where events are posted to the period in which they should be regarded as occurring rather than to when any cash actually moves.
For high performance, the heartbeat packets must also contain details of jobs being performed and recently completed or abandoned as well as the inventory of hosts in the network.
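A heartbeat payload along these lines might look like the following sketch (the `Heartbeat` class and its field names are hypothetical, just to make the contents concrete):

```python
from dataclasses import dataclass, field


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload carrying job and host state
    alongside the usual liveness signal."""
    sender_id: str
    frame: int                                      # animation frame number, not wall-clock time
    jobs_in_progress: list = field(default_factory=list)
    jobs_completed: list = field(default_factory=list)
    jobs_abandoned: list = field(default_factory=list)
    known_hosts: list = field(default_factory=list)  # this node's view of the host inventory
```

Piggybacking this state on heartbeats means peers learn about job progress and membership changes without any extra message round trips.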
The network proceeds to process work items and publish their results to adjacent peers or FEPs. FEPs forward completed job details to clients, and can take over for failed FEPs, since the only state in an FEP is the last serial number stamped on a request.
The network must have a quorum to continue. External monitors track connectivity and inform the nodes that experience connectivity changes whether they are now inside or outside the quorum.
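The quorum rule is what prevents the split-brain problem from the question: with a strict-majority test, at most one partition of a segmented network can keep processing. A one-line sketch (function name is mine):

```python
def has_quorum(reachable, total):
    # Strict majority: after a network split, at most one side can
    # satisfy this, so two halves can never both keep processing.
    return reachable > total // 2
```

Note the even-split case: if 4 nodes partition 2/2, neither side has quorum and the whole network stalls, which is the price of ruling out divergence.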
Where a work item is not completed because its machine fails, or when a new node joins the network, a new work-allocation policy must be established, based on work item ID, to distribute the work across the remaining nodes until the failed node comes back online.
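One way to get an allocation policy keyed on work item ID that stays stable across node failure and rejoin is rendezvous (highest-random-weight) hashing -- my suggestion, not something the paragraph above prescribes. When a node drops out, only the items it owned are reassigned; everything else keeps its owner:

```python
import hashlib


def owner(work_item_id, live_nodes):
    """Rendezvous hashing: every (node, item) pair gets a pseudo-random
    score, and the highest-scoring live node owns the item."""
    def score(node):
        h = hashlib.sha1(f"{node}:{work_item_id}".encode()).hexdigest()
        return int(h, 16)
    return max(live_nodes, key=score)
```

When the failed node rejoins, exactly the items it previously owned flow back to it, so the rebalancing on both failure and recovery is minimal.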
For cases where multiple nodes perform the same job (duplicated effort -- possible, but minimised by designing the usual timeouts sensibly), the jobs must be rollbackable, with the conflict resolved using Markov chains.
To detect possible duplications reliably, jobs must auto-rollback in less time than the timeout for receiving job results that applies during a crisis period, i.e. when nodes are failing. A shorter timeout applies when nodes are not failing.
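The timeout ordering described above can be pinned down in a few lines. The constants and their values here are illustrative only; the point is the invariant between them:

```python
NORMAL_TIMEOUT = 5.0    # seconds to wait for job results while nodes are healthy
CRISIS_TIMEOUT = 20.0   # longer result timeout used while nodes are failing
AUTO_ROLLBACK = 15.0    # a duplicated job must roll itself back by this deadline


def result_timeout(in_crisis):
    # Pick the applicable timeout for receiving job results.
    return CRISIS_TIMEOUT if in_crisis else NORMAL_TIMEOUT


# The invariant that makes duplicate detection reliable: a duplicated
# job rolls itself back before the crisis-period result timeout fires.
assert AUTO_ROLLBACK < CRISIS_TIMEOUT
```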
Upvotes: 2
Reputation: 45081
The Netsukuku project aims to create protocols and a software implementation for large wifi-based ad-hoc networks.
From their FAQ: "The Netsukuku project is based on the very simple idea of exploiting the great potentiality of the wifi connectivity, making the PCs of wireless communities act as routers and handle together an ad-hoc network even bigger than the Internet."
Upvotes: 3