MDL Network Backbones

Tutorial

Code to perform network backboning method derived in “Fast nonparametric inference of network backbones for weighted graph sparsification” (Kirkley, 2024, https://arxiv.org/abs/2409.06417).

The function allows for both directed and undirected networks and provides options to adjust the analysis according to the network’s characteristics.

elist: A list consisting of directed tuples $(i, j, w_{ij})$ representing edges from node $i$ to node $j$ with weight $w_{ij}$ .
directed: Specify whether the input edge list is directed or undirected.
out_edges: Determine whether to track out-edges or in-edges attached to each node in the local pruning method.
allow_empty: Decide whether to allow empty backbones when the minimum description length is achieved with no edges.

Outputs include the global and local network backbones along with their inverse compression ratios:

backbone_global: Edge list of the global MDL-optimal backbone.
backbone_local: Edge list of the local MDL-optimal backbone.
compression_global: Inverse compression ratio for the global backbone.
compression_local: Inverse compression ratio for the local backbone.

The method minimizes the following MDL objectives according to Eq. 1 and Eq. 5 in https://arxiv.org/abs/2409.06417:

Microcanonical Global Description Length Objective of Network Backbone:

$\mathcal{L}_M^{(\text{global})}(G^{(b)}) = \log (E+1) + \log (W-E+1) + \log \binom{E}{E^{(b)}} +\log \binom{W^{(b)}-1}{E^{(b)}-1} +\log\binom{W-W^{(b)}-1}{E-E^{(b)}-1}$

Microcanonical Local Description Length Objective of Network Backbone:

$\mathcal{L}_M^{(\text{local})}\left(G^{(b)}\right) = \log\binom{N+W-E-1}{W-E} + \sum_{i=1}^{N} \Bigg(\log(k_i + 1) + \log(s_i - k_i + 1) + \log \binom{k_i}{k_i^{(b)}} + \log \binom{s_i^{(b)} - 1}{k_i^{(b)} - 1} + \log \binom{s_i - s_i^{(b)} - 1}{k_i - k_i^{(b)} - 1} \Bigg)$

MDL Backboning

This module provides functions to calculate the MDL-optimal network backbones for both global and local perspectives.

Functions
Function	Description
logchoose(n, k)	Compute the logarithm of the binomial coefficient.
logmultiset(n, k)	Compute the logarithm of the multiset coefficient.
to_undirected(edge_list, policy=”sum”)	Convert a directed edge list to an undirected edge list by merging edges.
MDL_backboning(elist, directed=True, out_edges=True, allow_empty=True)	Compute the MDL-optimal global and local network backbones.

Reference

function logchoose(n, k) [source]

Description: Compute the logarithm of the binomial coefficient.

Parameters:

(n, k)

n: Total number of items.
k: Number of chosen items.

Returns:

float: Logarithm of the binomial coefficient.

function logmultiset(n, k) [source]

Description: Compute the logarithm of the multiset coefficient.

Parameters:

(n, k)

n: Number of types.
k: Number of items.

Returns:

float: Logarithm of the multiset coefficient.

function to_undirected(edge_list, policy="sum") [source]

Description: Convert a directed edge list to an undirected edge list by merging edges.

Parameters:

(edge_list, policy="sum")

edge_list: List of directed edges as tuples (i, j, w_ij).
policy: Policy for merging edges, can be "sum", "max", "min", or "error". Defaults to "sum".

Returns:

list: Undirected edge list as tuples (i, j, w_ij) where edges are merged according to the specified policy.

function MDL_backboning(elist, directed=True, out_edges=True, allow_empty=True) [source]

Description: Compute the MDL-optimal global and local network backbones from the given edge list.

Parameters:

(elist, directed=True, out_edges=True, allow_empty=True)

elist: List of edges as tuples (i, j, w_ij).
directed: Boolean indicating if the network is directed, defaults as `True`.
out_edges: Boolean indicating whether to track out-edges (`True`) or in-edges (`False`), defaults as `True`.
allow_empty: Allows empty backbones if `True`, defaults as `True`.

Returns:

backbone_global: Edge list of the global MDL-optimal backbone.
backbone_local: Edge list of the local MDL-optimal backbone.
compression_global: Inverse compression ratio for the global backbone.
compression_local: Inverse compression ratio for the local backbone.

Demo

Example Code

Step 1: Import necessary libraries

import networkx as nx
import matplotlib.pyplot as plt
from paninipy.mdl_backboning import MDL_backboning

Step 2: Define the weighted edge list

# Weighted edge list for the example
elist = [
    (0, 1, 12), (0, 3, 20), (0, 4, 8),
    (1, 2, 1), (1, 4, 3),
    (2, 0, 1), (2, 1, 3),
    (3, 2, 3), (3, 4, 1),
    (4, 3, 1)
]

Step 3: Compute backbones and compression ratios

# Compute backbones using out-edges
backbone_global, backbone_local, compression_global, compression_local = MDL_backboning(
    elist, directed=True, out_edges=True
)

Step 4: Visualize the original network and backbones

def visualize_backbones(elist, backbone_global, backbone_local, compression_global, compression_local):
    G_original = nx.DiGraph()
    G_global = nx.DiGraph()
    G_local = nx.DiGraph()

    for i, j, w in elist:
        G_original.add_edge(i, j, weight=w)
    for i, j, w in backbone_global:
        G_global.add_edge(i, j, weight=w)
    for i, j, w in backbone_local:
        G_local.add_edge(i, j, weight=w)

    pos = nx.spring_layout(G_original, seed=42)
    W_original = sum([d['weight'] for u, v, d in G_original.edges(data=True)])
    E_original = G_original.number_of_edges()

    W_global = sum([d['weight'] for u, v, d in G_global.edges(data=True)])
    E_global = G_global.number_of_edges()

    W_local = sum([d['weight'] for u, v, d in G_local.edges(data=True)])
    E_local = G_local.number_of_edges()

    plt.figure(figsize=(18, 6))

    plt.subplot(1, 3, 1)
    nx.draw_networkx_nodes(G_original, pos, node_color='lightblue', node_size=500)
    nx.draw_networkx_edges(G_original, pos, arrowstyle='->', arrowsize=15)
    nx.draw_networkx_labels(G_original, pos)
    plt.title('Original Network')
    plt.axis('off')
    plt.text(0.5, -0.1, f'Total weight of the network = {W_original}\nTotal number of edges = {E_original}', ha='center', transform=plt.gca().transAxes)

    plt.subplot(1, 3, 2)
    nx.draw_networkx_nodes(G_global, pos, node_color='red', node_size=500)
    nx.draw_networkx_edges(G_global, pos, arrowstyle='->', arrowsize=15)
    nx.draw_networkx_labels(G_global, pos)
    plt.title('Global Backbone')
    plt.axis('off')
    plt.text(0.5, -0.1, f'Total weight of the network = {W_global}\nTotal number of edges = {E_global}\nInverse compression ratio = {compression_global:.4f}', ha='center', transform=plt.gca().transAxes)

    plt.subplot(1, 3, 3)
    nx.draw_networkx_nodes(G_local, pos, node_color='lightgreen', node_size=500)
    nx.draw_networkx_edges(G_local, pos, arrowstyle='->', arrowsize=15)
    nx.draw_networkx_labels(G_local, pos)
    plt.title('Local Backbone')
    plt.axis('off')
    plt.text(0.5, -0.1, f'Total weight of the network = {W_local}\nTotal number of edges = {E_local}\nInverse compression ratio = {compression_local:.4f}', ha='center', transform=plt.gca().transAxes)
    plt.tight_layout()
    plt.savefig("mdl_network_backbones.png", bbox_inches='tight', dpi=200)
    plt.show()

visualize_backbones(elist, backbone_global, backbone_local, compression_global, compression_local)

Example Output

Visualization of the original network and the extracted backbones with statistics.

Left: Original weighted, directed network, with edge width proportional to weight. Center: Global MDL backbone, which learns a global threshold on the edge weights for network sparsification. Right: Local MDL backbone using out-neighborhoods. The local MDL method learns a threshold adapted to each neighborhood’s weight heterogeneity. Summary statistics are shown below each network.

Paper Source

If you use this algorithm in your work, please cite:

A. Kirkley, “Fast nonparametric inference of network backbones for weighted graph sparsification.” arXiv preprint arXiv:2409.06417 (2024). Paper: https://arxiv.org/abs/2409.06417