Reputation: 23

networkx position (biggest) nodes in the middle of a graph

I've been creating graphs with the networkx package and everything works fine. I would like to make the graphs even better by placing the bigger nodes in the middle of the graph and the layout functions from networkx does not seem to do the job. The nodes represent the size of degree (the higher connected the node, the bigger).

Is there any way to program these graphs in such a way that the bigger nodes are positioned in the middle? It does not have to be automated, i could also manually choose the nodes and give them the middle position but i can also not find how to do this.

If this is not possible with networkx or something else; is there any way to do it with Gephi or cytoscape? I had trouble with Gephi that it does not import the graph the same way i see it in my jupyter notebook (the colors, the node- and edge-sizes do not import).

To summarize; i want to put bigger nodes in the middle of my graph but i dont mind how i get it done (with networkx, matplotlib or whatever).

Unfortunately i cannot provide my actual graphs but here is an example which can look like one of my graphs; it is a directed weighted graph.

    G = nx.gnp_random_graph(15, 0.2, directed=True)
    d = dict(G.degree(weight='weight'))
    d = {k: v/10 for k, v in d.items()}
    edge_size = [(float(i)/sum(weights))*100 for i in weights]
    node_size = [(v*1000) for v in d.values()]
    nx.draw(G,width=edge_size,node_size=node_size)

Upvotes: 2

Answers (1)

Paul Brodersen

Reputation: 13041

There are several options:

import networkx as nx

G = nx.gnp_random_graph(15, 0.2, directed=True)
node_degree = dict(G.degree(weight='weight'))

# A) Precompute node positions, and then manually over-ride some node positions.
node_positions = nx.spring_layout(G)
node_positions[0] = (0.5, 0.5) # by default, networkx plots on a canvas with the origin at (0, 0) and a width and height of 1; (0.5, 0.5) is hence the center
nx.draw(G, pos=node_positions, node_size=[100 * node_degree[node] for node in G])
plt.show()

# B) Use netgraph to draw the graph and then drag the nodes around with the mouse.
from netgraph import InteractiveGraph # pip install netgraph
plot_instance = InteractiveGraph(G, node_size=node_degree)
plt.show()

# C) Modify the Fruchterman-Reingold algorithm to include a gravitational force that pulls nodes with a large "mass" towards the center.
# This is left as an exercise to the interested reader (i.e. very non-trivial).

Edit: option C is non-trivial but also very do-able.

Here is my stab at it.

#!/usr/bin/env python
# coding: utf-8

"""
FR layout but with an additional gravitational pull towards a gravitational center.
The pull is proportional to the mass of the node.
"""
import numpy as np
import matplotlib.pyplot as plt

# pip install netgraph
from netgraph._main import BASE_SCALE
from netgraph._utils import (
    _get_unique_nodes,
    _edge_list_to_adjacency_matrix,
)
from netgraph._node_layout import (
    _is_within_bbox,
    _get_temperature_decay,
    _get_fr_repulsion,
    _get_fr_attraction,
    _rescale_to_frame,
    _handle_multiple_components,
    _reduce_node_overlap,
)

DEBUG = False

@_handle_multiple_components
def get_fruchterman_reingold_newton_layout(edges,
                                           edge_weights         = None,
                                           k                    = None,
                                           g                    = 1.,
                                           scale                = None,
                                           origin               = None,
                                           gravitational_center = None,
                                           initial_temperature  = 1.,
                                           total_iterations     = 50,
                                           node_size            = 0,
                                           node_mass            = 1,
                                           node_positions       = None,
                                           fixed_nodes          = None,
                                           *args, **kwargs):
    """Modified Fruchterman-Reingold node layout.

    Uses a modified Fruchterman-Reingold algorithm [Fruchterman1991]_ to compute node positions.
    This algorithm simulates the graph as a physical system, in which nodes repell each other.
    For connected nodes, this repulsion is counteracted by an attractive force exerted by the edges, which are simulated as springs.
    Unlike the original algorithm, there is an additional attractive force pulling nodes towards a gravitational center, in proportion to their masses.

    Parameters
    ----------
    edges : list
        The edges of the graph, with each edge being represented by a (source node ID, target node ID) tuple.
    edge_weights : dict
        Mapping of edges to edge weights.
    k : float or None, default None
        Expected mean edge length. If None, initialized to the sqrt(area / total nodes).
    g : float or None, default 1.
        Gravitational constant that sets the magnitude of the gravitational pull towards the center.
    origin : tuple or None, default None
        The (float x, float y) coordinates corresponding to the lower left hand corner of the bounding box specifying the extent of the canvas.
        If None is given, the origin is placed at (0, 0).
    scale : tuple or None, default None
        The (float x, float y) dimensions representing the width and height of the bounding box specifying the extent of the canvas.
        If None is given, the scale is set to (1, 1).
    gravitational_center : tuple or None, default None
        The (float x, float y) coordinates towards which nodes experience a gravitational pull.
        If None, the gravitational center is placed at the center of the canvas defined by origin and scale.
    total_iterations : int, default 50
        Number of iterations.
    initial_temperature: float, default 1.
        Temperature controls the maximum node displacement on each iteration.
        Temperature is decreased on each iteration to eventually force the algorithm into a particular solution.
        The size of the initial temperature determines how quickly that happens.
        Values should be much smaller than the values of `scale`.
    node_size : scalar or dict, default 0.
        Size (radius) of nodes.
        Providing the correct node size minimises the overlap of nodes in the graph,
        which can otherwise occur if there are many nodes, or if the nodes differ considerably in size.
    node_mass : scalar or dict, default 1.
        Mass of nodes.
        Nodes with higher mass experience a larger gravitational pull towards the center.
    node_positions : dict or None, default None
        Mapping of nodes to their (initial) x,y positions. If None are given,
        nodes are initially placed randomly within the bounding box defined by `origin` and `scale`.
        If the graph has multiple components, explicit initial positions may result in a ValueError,
        if the initial positions fall outside of the area allocated to that specific component.
    fixed_nodes : list or None, default None
        Nodes to keep fixed at their initial positions.

    Returns
    -------
    node_positions : dict
        Dictionary mapping each node ID to (float x, float y) tuple, the node position.

    References
    ----------
    .. [Fruchterman1991] Fruchterman, TMJ and Reingold, EM (1991) ‘Graph drawing by force‐directed placement’,
       Software: Practice and Experience

    """

    # This is just a wrapper around `_fruchterman_reingold`, which implements (the loop body of) the algorithm proper.
    # This wrapper handles the initialization of variables to their defaults (if not explicitely provided),
    # and checks inputs for self-consistency.

    assert len(edges) > 0, "The list of edges has to be non-empty."

    if origin is None:
        if node_positions:
            minima = np.min(list(node_positions.values()), axis=0)
            origin = np.min(np.stack([minima, np.zeros_like(minima)], axis=0), axis=0)
        else:
            origin = np.zeros((2))
    else:
        # ensure that it is an array
        origin = np.array(origin)

    if scale is None:
        if node_positions:
            delta = np.array(list(node_positions.values())) - origin[np.newaxis, :]
            maxima = np.max(delta, axis=0)
            scale = np.max(np.stack([maxima, np.ones_like(maxima)], axis=0), axis=0)
        else:
            scale = np.ones((2))
    else:
        # ensure that it is an array
        scale = np.array(scale)

    assert len(origin) == len(scale), \
        "Arguments `origin` (d={}) and `scale` (d={}) need to have the same number of dimensions!".format(len(origin), len(scale))
    dimensionality = len(origin)

    if gravitational_center is None:
        gravitational_center = origin + 0.5 * scale
    else:
        # ensure that it is an array
        gravitational_center = np.array(gravitational_center)

    if fixed_nodes is None:
        fixed_nodes = []

    connected_nodes = _get_unique_nodes(edges)

    if node_positions is None: # assign random starting positions to all nodes
        node_positions_as_array = np.random.rand(len(connected_nodes), dimensionality) * scale + origin
        unique_nodes = connected_nodes

    else:
        # 1) check input dimensionality
        dimensionality_node_positions = np.array(list(node_positions.values())).shape[1]
        assert dimensionality_node_positions == dimensionality, \
            "The dimensionality of values of `node_positions` (d={}) must match the dimensionality of `origin`/ `scale` (d={})!".format(dimensionality_node_positions, dimensionality)

        is_valid = _is_within_bbox(list(node_positions.values()), origin=origin, scale=scale)
        if not np.all(is_valid):
            error_message = "Some given node positions are not within the data range specified by `origin` and `scale`!"
            error_message += "\n\tOrigin : {}, {}".format(*origin)
            error_message += "\n\tScale  : {}, {}".format(*scale)
            error_message += "\nThe following nodes do not fall within this range:"
            for ii, (node, position) in enumerate(node_positions.items()):
                if not is_valid[ii]:
                    error_message += "\n\t{} : {}".format(node, position)
            error_message += "\nThis error can occur if the graph contains multiple components but some or all node positions are initialised explicitly (i.e. node_positions != None)."
            raise ValueError(error_message)

        # 2) handle discrepancies in nodes listed in node_positions and nodes extracted from edges
        if set(node_positions.keys()) == set(connected_nodes):
            # all starting positions are given;
            # no superfluous nodes in node_positions;
            # nothing left to do
            unique_nodes = connected_nodes
        else:
            # some node positions are provided, but not all
            for node in connected_nodes:
                if not (node in node_positions):
                    warnings.warn("Position of node {} not provided. Initializing to random position within frame.".format(node))
                    node_positions[node] = np.random.rand(2) * scale + origin

            unconnected_nodes = []
            for node in node_positions:
                if not (node in connected_nodes):
                    unconnected_nodes.append(node)
                    fixed_nodes.append(node)
                    # warnings.warn("Node {} appears to be unconnected. The current node position will be kept.".format(node))

            unique_nodes = connected_nodes + unconnected_nodes

        node_positions_as_array = np.array([node_positions[node] for node in unique_nodes])

    total_nodes = len(unique_nodes)

    if isinstance(node_size, (int, float)):
        node_size = node_size * np.ones((total_nodes))
    elif isinstance(node_size, dict):
        node_size = np.array([node_size[node] if node in node_size else 0. for node in unique_nodes])

    if isinstance(node_mass, (int, float)):
        node_mass = node_mass * np.ones((total_nodes))
    elif isinstance(node_mass, dict):
        node_mass = np.array([node_mass[node] if node in node_mass else 0. for node in unique_nodes])

    adjacency = _edge_list_to_adjacency_matrix(
        edges, edge_weights=edge_weights, unique_nodes=unique_nodes)

    # Forces in FR are symmetric.
    # Hence we need to ensure that the adjacency matrix is also symmetric.
    adjacency = adjacency + adjacency.transpose()

    if fixed_nodes:
        is_mobile = np.array([False if node in fixed_nodes else True for node in unique_nodes], dtype=bool)

        mobile_positions = node_positions_as_array[is_mobile]
        fixed_positions = node_positions_as_array[~is_mobile]

        mobile_node_sizes = node_size[is_mobile]
        fixed_node_sizes = node_size[~is_mobile]

        mobile_node_masses = node_mass[is_mobile]
        fixed_node_masses = node_mass[~is_mobile]

        # reorder adjacency
        total_mobile = np.sum(is_mobile)
        reordered = np.zeros((adjacency.shape[0], total_mobile))
        reordered[:total_mobile, :total_mobile] = adjacency[is_mobile][:, is_mobile]
        reordered[total_mobile:, :total_mobile] = adjacency[~is_mobile][:, is_mobile]
        adjacency = reordered
    else:
        is_mobile = np.ones((total_nodes), dtype=bool)

        mobile_positions = node_positions_as_array
        fixed_positions = np.zeros((0, 2))

        mobile_node_sizes = node_size
        fixed_node_sizes = np.array([])

        mobile_node_masses = node_mass
        fixed_node_masses = np.array([])

    if k is None:
        area = np.product(scale)
        k = np.sqrt(area / float(total_nodes))

    temperatures = _get_temperature_decay(initial_temperature, total_iterations)

    # --------------------------------------------------------------------------------
    # main loop

    for ii, temperature in enumerate(temperatures):
        candidate_positions = _fruchterman_reingold_newton(mobile_positions, fixed_positions,
                                                           mobile_node_sizes, fixed_node_sizes,
                                                           adjacency, temperature, k,
                                                           mobile_node_masses, fixed_node_masses,
                                                           gravitational_center, g)
        is_valid = _is_within_bbox(candidate_positions, origin=origin, scale=scale)
        mobile_positions[is_valid] = candidate_positions[is_valid]

    # --------------------------------------------------------------------------------
    # format output

    node_positions_as_array[is_mobile] = mobile_positions

    if np.all(is_mobile):
        node_positions_as_array = _rescale_to_frame(node_positions_as_array, origin, scale)

    node_positions = dict(zip(unique_nodes, node_positions_as_array))

    return node_positions


def _fruchterman_reingold_newton(mobile_positions, fixed_positions,
                                 mobile_node_radii, fixed_node_radii,
                                 adjacency, temperature, k,
                                 mobile_node_masses, fixed_node_masses,
                                 gravitational_center, g):
    """Inner loop of modified Fruchterman-Reingold layout algorithm."""

    combined_positions = np.concatenate([mobile_positions, fixed_positions], axis=0)
    combined_node_radii = np.concatenate([mobile_node_radii, fixed_node_radii])

    delta = mobile_positions[np.newaxis, :, :] - combined_positions[:, np.newaxis, :]
    distance = np.linalg.norm(delta, axis=-1)

    # alternatively: (hack adapted from igraph)
    if np.sum(distance==0) - np.trace(distance==0) > 0: # i.e. if off-diagonal entries in distance are zero
        warnings.warn("Some nodes have the same position; repulsion between the nodes is undefined.")
        rand_delta = np.random.rand(*delta.shape) * 1e-9
        is_zero = distance <= 0
        delta[is_zero] = rand_delta[is_zero]
        distance = np.linalg.norm(delta, axis=-1)

    # subtract node radii from distances to prevent nodes from overlapping
    distance -= mobile_node_radii[np.newaxis, :] + combined_node_radii[:, np.newaxis]

    # prevent distances from becoming less than zero due to overlap of nodes
    distance[distance <= 0.] = 1e-6 # 1e-13 is numerical accuracy, and we will be taking the square shortly

    with np.errstate(divide='ignore', invalid='ignore'):
        direction = delta / distance[..., None] # i.e. the unit vector

    # calculate forces
    repulsion    = _get_fr_repulsion(distance, direction, k)
    attraction   = _get_fr_attraction(distance, direction, adjacency, k)
    gravity      = _get_gravitational_pull(mobile_positions, mobile_node_masses, gravitational_center, g)

    if DEBUG:
        r = np.median(np.linalg.norm(repulsion, axis=-1))
        a = np.median(np.linalg.norm(attraction, axis=-1))
        g = np.median(np.linalg.norm(gravity, axis=-1))
        print(r, a, g)

    displacement = attraction + repulsion + gravity

    # limit maximum displacement using temperature
    displacement_length = np.linalg.norm(displacement, axis=-1)
    displacement = displacement / displacement_length[:, None] * np.clip(displacement_length, None, temperature)[:, None]

    mobile_positions = mobile_positions + displacement

    return mobile_positions


def _get_gravitational_pull(mobile_positions, mobile_node_masses, gravitational_center, g):
    delta = gravitational_center[np.newaxis, :] - mobile_positions
    direction = delta / np.linalg.norm(delta, axis=-1)[:, np.newaxis]
    magnitude = mobile_node_masses - np.mean(mobile_node_masses)
    return g * magnitude[:, np.newaxis] * direction


if __name__ == '__main__':

    import networkx as nx
    from netgraph import Graph

    G = nx.gnp_random_graph(15, 0.2, directed=True)
    node_degree = dict(G.degree(weight='weight'))

    node_positions = get_fruchterman_reingold_newton_layout(
        list(G.edges()),
        node_size={node : BASE_SCALE * degree for node, degree in node_degree.items()},
        node_mass=node_degree, g=2
    )
    Graph(G, node_layout=node_positions, node_size=node_degree)
    plt.show()

Upvotes: 3

networkx position (biggest) nodes in the middle of a graph

Answers (1)

Edit: option C is non-trivial but also very do-able.

Related Questions