Reputation: 1645
I have a collection of objects (I'll call them nodes) that are connected to each other. Every node is connected to at least one other node, and the entire collection of nodes is one big blob (no outliers). The number of nodes is a multiple of three.
What I want to do is divide the nodes into groups of three connected nodes (with no overlapping between groups).
Every node has a unique ID. The nodes are stored in a document wherein every line is the ID of a node followed by the nodes it is connected to. It looks something like this:
30:240
40:210/240/230/60
50:80/220/230/70/210/190/200
60:240/40/230/220
70:90/200/50/220
80:50/170/190/210
90:200/70/180
100:110/160/190/170/200/180
110:100/180
160:170/100
170:80/160/190/100
180:90/200/110/100
190:50/80/200/170/100
200:90/70/100/50/190/180
210:50/80/40/230
220:50/230/70/60
230:50/210/60/40/220
240:40/30/60
I've made some images to help visualize it:
I want to divide my nodes into groups of three, but they must be connected. There are many possible combinations of grouping the nodes. One possibility is the following image:
However, I am struggling to come up with an algorithm that can ensure that every node is in a group of three connected nodes (and no overlapping). In order to avoid overlapping, I "claim" nodes while making groups, which is my way of checking if a node already belongs to another group.
I have come up with the following code:
nodes = dict()
with open("input.txt") as f:
    for line in f.readlines():
        split = line.split(":")
        node = int(split[0])
        nodes[node] = []
        for connection in split[1].split("/"):
            nodes[node].append(int(connection.strip()))

groups = dict()
claimed = []
for node in nodes:
    if node in claimed:
        continue
    connections = nodes[node]
    if len(connections) > 1:
        for i in range(len(connections)):
            if node in groups and len(groups[node]) == 2:
                break
            if not connections[i] in claimed:
                if not node in groups:
                    groups[node] = []
                groups[node].append(connections[i])
                claimed.append(connections[i])
    if not node in claimed:
        claimed.append(node)

f = open("output.txt", 'w')
for group in groups:
    line = str(group) + ":"
    for node in groups[group]:
        line += str(node) + "/"
    line = line[:-1] + "\n"
    f.write(line)
f.close()
However, this produces the following:
160:170/100
80:190
230:50/210
70:90/200
110:180
240:40/30
220:60
Which can be visualized like this:
As you can see, there are four groups of three (in blue), and three groups of two (in orange). This is because I claim nodes as I am making the groups, causing some nodes to no longer have two unclaimed connected nodes for a group.
So my question is: How can I have my nodes claim connected nodes in such a way that other nodes can still form their own groups of three?
Or more basically: How can I divide connected nodes into groups of three connected nodes with no overlapping? Every node needs to be in a group of three.
My question is more about the algorithm than code. I am also open to an entirely different algorithm.
Upvotes: 3
Views: 1120
Reputation: 7100
data.py
node_string = """
30:240
40:210/240/230/60
50:80/220/230/70/210/190/200
60:240/40/230/220
70:90/200/50/220
80:50/170/190/210
90:200/70/180
100:110/160/190/170/200/180
110:100/180
160:170/100
170:80/160/190/100
180:90/200/110/100
190:50/80/200/170/100
200:90/70/100/50/190/180
210:50/80/40/230
220:50/230/70/60
230:50/210/60/40/220
240:40/30/60
"""
node_string_1 = """
(238,140,141):(208,134,191)/(117,119,222)/(229,134,102)/(167,234,229)
(176,120,113):(150,201,147)/(111,237,177)
(113,171,164):(237,128,198)/(159,172,151)/(202,171,225)/(240,168,211)/(111,218,139)
(208,134,191):(238,140,141)/(229,134,102)
(131,150,110):(212,209,198)/(155,133,217)/(168,140,187)/(173,204,120)/(147,210,106)/(207,149,101)/(228,172,108)
(229,134,102):(208,134,191)/(238,140,141)/(167,234,229)
(212,209,198):(109,129,191)/(155,133,217)/(147,210,106)/(131,150,110)
(105,199,125):(150,201,147)/(175,118,238)/(240,168,211)
(175,118,238):(105,199,125)/(240,168,211)
(114,178,129):(164,233,166)/(227,123,219)/(135,156,161)/(224,183,104)/(228,172,108)
(147,210,106):(212,209,198)/(173,204,120)/(131,150,110)/(143,147,114)
(222,128,188):(135,156,161)/(237,128,198)/(159,172,151)/(111,237,177)/(150,201,147)
(166,218,221):(207,149,101)/(143,114,198)/(159,172,151)/(202,171,225)
(143,147,114):(173,204,120)/(107,184,168)/(147,210,106)
(111,218,139):(132,138,234)/(202,171,225)/(113,171,164)
(117,119,222):(109,129,191)/(155,133,217)/(238,140,141)/(167,234,229)
(107,184,168):(164,233,166)/(173,204,120)/(143,147,114)/(227,123,219)
(224,183,104):(164,233,166)/(114,178,129)/(111,237,177)/(135,156,161)
(228,172,108):(227,123,219)/(173,204,120)/(135,156,161)/(159,172,151)/(207,149,101)/(131,150,110)/(114,178,129)
(227,123,219):(173,204,120)/(164,233,166)/(107,184,168)/(228,172,108)/(114,178,129)
(168,140,187):(155,133,217)/(207,149,101)/(176,204,213)/(143,114,198)/(131,150,110)/(159,166,127)
(207,149,101):(168,140,187)/(159,172,151)/(166,218,221)/(143,114,198)/(131,150,110)/(228,172,108)
(159,172,151):(207,149,101)/(135,156,161)/(237,128,198)/(202,171,225)/(113,171,164)/(166,218,221)/(222,128,188)/(228,172,108)
(143,114,198):(168,140,187)/(207,149,101)/(202,171,225)/(166,218,221)/(184,163,168)/(159,166,127)
(164,233,166):(227,123,219)/(114,178,129)/(107,184,168)/(224,183,104)
(150,201,147):(176,120,113)/(240,168,211)/(237,128,198)/(105,199,125)/(111,237,177)/(222,128,188)
(111,237,177):(135,156,161)/(224,183,104)/(222,128,188)/(176,120,113)/(150,201,147)
(159,166,127):(184,163,168)/(168,140,187)/(176,204,213)/(143,114,198)
(155,133,217):(212,209,198)/(168,140,187)/(167,234,229)/(176,204,213)/(109,129,191)/(117,119,222)/(131,150,110)
(109,129,191):(212,209,198)/(117,119,222)/(155,133,217)
(132,138,234):(184,163,168)/(202,171,225)/(111,218,139)
(240,168,211):(150,201,147)/(237,128,198)/(105,199,125)/(175,118,238)/(113,171,164)
(167,234,229):(155,133,217)/(117,119,222)/(238,140,141)/(176,204,213)/(229,134,102)
(173,204,120):(227,123,219)/(147,210,106)/(143,147,114)/(107,184,168)/(131,150,110)/(228,172,108)
(135,156,161):(114,178,129)/(224,183,104)/(159,172,151)/(111,237,177)/(222,128,188)/(228,172,108)
(237,128,198):(150,201,147)/(159,172,151)/(222,128,188)/(240,168,211)/(113,171,164)
(176,204,213):(155,133,217)/(168,140,187)/(159,166,127)/(167,234,229)
(202,171,225):(132,138,234)/(184,163,168)/(159,172,151)/(113,171,164)/(166,218,221)/(143,114,198)/(111,218,139)
(184,163,168):(143,114,198)/(132,138,234)/(159,166,127)/(202,171,225)
"""
node_grouper.py
import itertools
import math
from collections import deque
from ast import literal_eval as totuple

import data


class NodeGrouper():
    finished = False
    steps = 0
    nodes = {}
    groups = {}
    weights = {}
    best_group = []
    returned_results_cache = []

    # Parse a node name.
    # Works with two different conventions:
    # - node names that are numbers
    # - node names that are tuples
    def parse_node(self, string):
        try:
            return totuple(string)
        except (ValueError, SyntaxError):
            return int(string)

    # Get a dictionary from the source string.
    # receives node_string (str): the node string passed in settings
    # returns {160: [170, 100], 240: [40, 30, 60] ... }
    def parse_string(self, node_string):
        nodes = {}
        for line in node_string.split():
            split = line.split(":")
            node = self.parse_node(split[0])
            if node not in nodes:
                nodes[node] = set()
            for connection in split[1].split("/"):
                connection = self.parse_node(connection.strip())
                if connection not in nodes:
                    nodes[connection] = set()
                nodes[node].add(connection)
                nodes[connection].add(node)
        self.nodes = nodes
        for node in self.nodes:
            self.nodes[node] = sorted(self.nodes[node])
        return nodes

    # Get all possible combinations.
    # Build a tuple with every possible 3-node connection.
    # receives nodes (dict) as created by self.parse_string()
    # returns ((90, 180, 200), (80, 100, 190), (60, 70, 220) ... )
    def possible_combinations(self, nodes):
        possible_links = {}
        for node, links in nodes.items():
            link_possibilities = itertools.combinations(links, self.nodes_per_group - 1)
            for link_possibility in link_possibilities:
                if node not in possible_links:
                    possible_links[node] = []
                possible_links[node] += [link_possibility]
        groups = set()
        for node, possible_groups in possible_links.items():
            for possible_group in possible_groups:
                groups.add(tuple(sorted(possible_group + (node,))))
        return tuple(groups)

    # Handle status messages (if settings['log_status'] == True).
    def log_status(self):
        # log how many steps were needed
        if self.settings['log_status']:
            print('Error margin (unused nodes): %s' % (len(self.nodes) - (len(self.best_group) * 3)))
            print('%s steps' % self.steps)
            print(self.format_output(self.best_group))

    # Create an output string in the format of the Stack Overflow question.
    def format_output(self, group_combination):
        out_dict = {}
        for group in group_combination:
            for node in group:
                links = set([n for n in group if n != node])
                if set(links).issubset(set(self.nodes[node])):
                    if node not in out_dict:
                        out_dict[node] = set()
                    out_dict[node] = links
                    break
        output = ''
        for node, links in out_dict.items():
            output += '%s:' % str(node)
            output += '/'.join([str(link) for link in links])
            output += '\n'
        return output

    # Start with groups whose nodes are harder to match.
    def sort_possible_groups(self, queried_group):
        gid = self.groups.index(queried_group) + 1
        for node in queried_group:
            if node in self.preferred_nodes:
                self.weights[gid - 1] -= 3
        return self.weights[gid - 1]

    def initial_weights(self):
        weights = []
        groups_with_this_node = {}
        for group_id, group in enumerate(self.groups):
            ordinarity = 0
            for node in group:
                if node not in groups_with_this_node:
                    groups_with_this_node[node] = sum(x.count(node) for x in self.groups)
                ordinarity += groups_with_this_node[node] * 100
            weights += [ordinarity]
        return weights

    def selected_group_available(self, group_1, group_pool):
        for group_2 in group_pool:
            for node in group_1:
                self.steps += 1
                if self.settings['log_status']:
                    # log status every log_period steps, if specified in the settings
                    if self.steps % self.settings['log_period'] == 0:
                        self.log_status()
                if node in group_2:
                    return False
        return True

    def get_results(self, i=None, weights=None):
        group_length = len(self.groups)
        group_length_iterator = range(group_length)
        groups = deque(sorted(self.groups, key=self.sort_possible_groups))
        output = [groups[i]]
        for j in group_length_iterator:
            if i == j:
                continue
            selected_group = tuple(groups[j])
            if self.selected_group_available(selected_group, output):
                output += [selected_group]
        match = sorted(tuple(output))
        if len(output) > len(self.best_group):
            self.best_group = output
        if len(output) >= self.maximum_groups or \
            (
                self.settings['maximum_steps'] > 0 and
                self.steps > self.settings['maximum_steps']
            ):
            self.finished = True
            # print(len(output))
        if match not in self.returned_results_cache:
            self.returned_results_cache += [match]
            self.preferred_nodes = [n for n in self.nodes]
            for g in match:
                for n in g:
                    if n in self.preferred_nodes:
                        self.preferred_nodes.remove(n)
        return len(output)

    def __init__(self, **settings):
        # save settings in a property
        self.settings = {}
        for key, value in settings.items():
            self.settings[key] = value
        self.nodes_per_group = 3
        self.nodes = self.parse_string(self.settings['node_string'])
        self.preferred_nodes = [node for node in self.nodes]
        self.groups = self.possible_combinations(self.nodes)
        self.groupable_nodes = list(set(sum(self.groups, ())))
        self.groupable_nodes_total = len(self.groupable_nodes)
        self.maximum_groups = int(math.floor(self.groupable_nodes_total / self.nodes_per_group))
        self.weights = self.initial_weights()

    def calculate(self):
        for i in range(len(self.groups)):
            self.get_results(i)
            if self.finished:
                break


if __name__ == "__main__":
    ng = NodeGrouper(
        # 0 is for unlimited
        # (try until all nodes are used)
        maximum_steps=10000000,
        # Log number of iterations
        log_status=False,
        # Log every x iterations
        log_period=5000000,
        node_string=data.node_string_1,
    )
    ng.calculate()
    print(ng.format_output(ng.best_group))
It handled this example in 1,285,000,000 steps:
Upvotes: 1
Reputation: 6342
Create adjacency lists, such as the ones below, based on the connected nodes.
90 -> 180, 200, 70
70 -> 90, 200, 220
180 -> 90, 110, 100, 200
110 -> 180, 100
100 -> 110, 180, 200, 190, 160, 170
Navigation should be as follows: starting from a node, claim unclaimed connected nodes into the current group.
Increment the group number once the current group is full.
And so on.
Once the condition (number of groups * 3) == total number of nodes is satisfied, break out of the loop.
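The adjacency-list step could be sketched like this (a minimal sketch; the `build_adjacency` name and the list-of-lines input format are my assumptions, based on the question's file layout):

```python
def build_adjacency(lines):
    """Build symmetric adjacency sets from 'node:neighbour/neighbour/...' lines."""
    adj = {}
    for line in lines:
        node, _, rest = line.partition(":")
        node = int(node)
        neighbours = [int(n) for n in rest.split("/")]
        adj.setdefault(node, set()).update(neighbours)
        # mirror every edge so that 90 -> 70 also implies 70 -> 90
        for n in neighbours:
            adj.setdefault(n, set()).add(node)
    return adj

adj = build_adjacency(["90:180/200/70", "70:90/200/220"])
# adj[90] is {70, 180, 200}; adj[220] is {70}
```

Making the lists symmetric up front matters here, because the input file (e.g. `30:240`) does not repeat every edge in both directions.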
Upvotes: 1
Reputation: 45
I'm not sure if it works in every edge case, but if you can find a path through the graph that visits every node exactly once (a Hamiltonian path), you can simply split that path into sections of three: each section stays connected through the path's own edges. Depending on the size of the graph, you could potentially find such a path with a simple backtracking DFS.
EDIT: For this to work you need to trim off all tails (dead-end branches with no way back), but that just means you'll need to count off the nodes slightly differently when grouping the ones adjacent to a tail.
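This idea could be sketched as follows (the `hamiltonian_path` and `groups_of_three` names are mine; note that backtracking search for such a path is exponential in the worst case, so this only suits small graphs):

```python
def hamiltonian_path(adj):
    """Backtracking DFS for a path that visits every node exactly once."""
    nodes = list(adj)

    def extend(path, visited):
        if len(path) == len(nodes):
            return path
        for nxt in adj[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                found = extend(path + [nxt], visited)
                if found:
                    return found
                visited.remove(nxt)
        return None

    # try every start node, since a Hamiltonian path need not begin at any particular one
    for start in nodes:
        found = extend([start], {start})
        if found:
            return found
    return None


def groups_of_three(adj):
    """Split a full-coverage path into consecutive triples."""
    path = hamiltonian_path(adj)
    if path is None or len(path) % 3:
        return None
    # consecutive triples are connected through the path's own edges
    return [path[i:i + 3] for i in range(0, len(path), 3)]

# a 6-node path graph splits cleanly into two connected triples
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
print(groups_of_three(adj))  # [[1, 2, 3], [4, 5, 6]]
```

A graph can have a valid partition into connected triples without having any Hamiltonian path, so a `None` result here does not prove no grouping exists.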
Upvotes: 2
Reputation: 77
The bigger question is how many nodes your search will have to handle. If the number of nodes stays low, you could simply iterate over all possible combinations and validate each candidate solution.
But if you have several hundred nodes, this becomes infeasible; then you would need heuristic search (AI-style) code that does the work for you.
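The validation step for that brute-force approach could look like this (a sketch; `is_valid_grouping` is a name I'm introducing, and it relies on the fact that three nodes with at least two edges among them are always connected):

```python
def is_valid_grouping(groups, adj):
    """Check that groups are disjoint connected triples covering every node."""
    seen = set()
    for g in groups:
        if len(g) != 3 or seen & set(g):
            return False  # wrong size or overlaps an earlier group
        seen |= set(g)
        # three nodes with at least two edges among them are always connected
        edges = sum(1 for a in g for b in g if a < b and b in adj[a])
        if edges < 2:
            return False
    return seen == set(adj)  # every node must be used

adj = {1: [2, 3], 2: [1], 3: [1], 4: [5], 5: [4, 6], 6: [5]}
print(is_valid_grouping([[1, 2, 3], [4, 5, 6]], adj))  # True
print(is_valid_grouping([[1, 2, 4], [3, 5, 6]], adj))  # False: 1-2-4 is not connected
```

Even with a cheap validator, the number of ways to partition n nodes into triples grows super-exponentially, which is why enumeration only works for small n.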
Upvotes: 0