decentralized-gnn

A repository for implementing and simulating decentralized Graph Neural Network algorithms for the classification of peer-to-peer nodes. The developed code supports the publication p2pGNN: A Decentralized Graph Neural Network for Node Classification in Peer-to-Peer Networks.

⚡ Quick Start

To generate a local instance of a decentralized learning device:

from decentralized.devices import GossipDevice
from decentralized.mergers import SlowMerge
from learning.nn import MLP
node = ... # a node identifier object (can be any object)
features = ... # feature vector, should have the same length for each device
labels = ... # one hot encoding of class labels, zeroes if no label is known
predictor = MLP(features.shape[0], labels.shape[0])  # a trainable classifier (a pretrained model can be used instead)
device = GossipDevice(node, predictor, features, labels, gossip_merge=SlowMerge)

In this code, the type of the device (GossipDevice) and the variable merge protocol (SlowMerge) work together to define a decentralized learning setting for a Graph Neural Network that runs on unstructured peer-to-peer links of uncertain availability. The communication network itself is the graph being analysed, operating under the assumption that communicating peers are related (e.g., they could be friends in decentralized social networks).
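
Once set up, a device can report a prediction for its own node through the same predict method that the Simulations section below calls on devices; for example:

prediction = device.predict()  # current label prediction for this device's node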

Whenever possible (at the latest, whenever devices contact each other for other reasons), perform the following information exchange between linked devices u and v:

send = u.send()
receive = v.receive(u.name, send)
u.ack(v.name, receive)
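
For example, a simulation could iterate over all pairs of devices whose link happens to be available in the current round and run this exchange for each of them. The snippet below is only a sketch, where links is an assumed list of (u, v) device pairs that can communicate right now:

for u, v in links:  # links: assumed list of device pairs able to communicate in this round
    send = u.send()
    receive = v.receive(u.name, send)
    u.ack(v.name, receive)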

🛠️ Simulations

Clone the repository and install all dependencies with:

pip install -r requirements.txt

This will also install the infrastructure needed by torch geometric to automatically download datasets. Set up and run simulations over many devices, automatically generated from existing datasets, with the following code:

from decentralized.devices import GossipDevice
from decentralized.mergers import AvgMerge
from decentralized.simulation import create_network

dataset_name = ... # "cora", "citeseer" or "pubmed"
network, test_labels = create_network(dataset_name, 
                                      GossipDevice,
                                      pretrained=False,
                                      gossip_merge=AvgMerge,
                                      gossip_pull=False,
                                      seed=0,
                                      min_communication_rate=0,
                                      max_communication_rate=0.1)
for epoch in range(800):
    network.round()
    accuracy_base = sum(1. if network.devices[u].predict(False) == label else 0 for u, label in test_labels.items()) / len(test_labels)
    accuracy = sum(1. if network.devices[u].predict() == label else 0 for u, label in test_labels.items()) / len(test_labels)
    print(f"Epoch {epoch} \t Acc {accuracy:.3f} \t Base acc {accuracy_base:.3f}")

In the above snippet, datasets are downloaded automatically and devices are instantiated with the desired settings. In each round, every pair of linked devices communicates with a fixed probability in the range [min_communication_rate, max_communication_rate]; this probability differs between pairs of devices but does not change over time.
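
In other words, each linked pair is assigned its own fixed rate once and keeps it for the whole simulation. The following sketch only illustrates that behaviour (it is not the library's internal code; edges is an assumed list of device pairs, and rates are drawn uniformly here purely for illustration):

import random

random.seed(0)
# one fixed rate per linked pair, drawn once and reused in every round (illustration only)
rates = {edge: random.uniform(0.0, 0.1) for edge in edges}

def communicates(edge):
    # in each round, a pair communicates independently with its fixed rate
    return random.random() < rates[edge]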

Everything runs on numpy because, at the time of the first implementation, adequate GPU memory was hard to find.

Reproducibility

Running implementations can be found in experiments.py, with centralized equivalents in centralized_experiments.py. The publication's experiments used dgl's dataset downloader; since this no longer works properly, we have switched to torch geometric. Other than this, default settings match the publication.

Reducing resources

Some merge schemes need a lot of memory to simulate. Reduce consumption to a fraction of the original by moving from the default np.float64 numeric format to a less precise one. Do so with the pattern demonstrated below, where the selected datatype is passed to the dtype argument of numpy array creation:

learning.optimizers.Variable.datatype = np.float16 # do this before calling `create_network`
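
For instance, a half-precision version of the simulation above could look like the following sketch; it reuses the same create_network call, and only the datatype line is new (np.float16 needs a quarter of the memory of the default np.float64):

import numpy as np
import learning.optimizers
from decentralized.devices import GossipDevice
from decentralized.mergers import AvgMerge
from decentralized.simulation import create_network

# switch to half precision before any devices or networks are created
learning.optimizers.Variable.datatype = np.float16

network, test_labels = create_network("cora",
                                      GossipDevice,
                                      pretrained=False,
                                      gossip_merge=AvgMerge,
                                      gossip_pull=False,
                                      seed=0,
                                      min_communication_rate=0,
                                      max_communication_rate=0.1)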

📝 Options

The following parameters are provided for experimentation in the decentralized simulation. Several of the options are experimental.

device_type
- decentralized.devices.GossipDevice: A device that shares predictions with the devices it communicates with.
- decentralized.devices.EstimationDevice: A device that preserves anonymity by sharing synthetically generated predictions that emulate its own predictions on synthetic data generated by its local model. Experimental and unstable. DO NOT USE.
- decentralized.devices.EstimationDevice: A device that shares a corpus of synthetically generated predictions based on a similar strategy as above. Experimental and unstable. DO NOT USE.

classifier
- learning.nn.MLP: Use a multilayer perceptron as the base classifier.
- learning.nn.LR: Use logistic regression as the base classifier.

gossip_merge
- decentralized.mergers.AvgMerge (default): When performing gossip learning (i.e., not when pretrained), averages each device's trained parameters with those of its communicating neighbors. This is the standard gossip averaging algorithm.
- decentralized.mergers.FairMerge: A variation of the above that tries to converge to a quantity that best estimates the true average. Experimental.
- decentralized.mergers.TopologicalMerge: Another variation with the same goal as FairMerge. Experimental.
- decentralized.mergers.SlowMerge: Similar to AvgMerge but converges more slowly by retaining a greater fraction of each node's learned parameters.

smoother
- decentralized.mergers.NoSMooth: Default implementation of decentralized graph signals without any improvements.
- decentralized.mergers.Smoothen: Reduces the statistical bias of decentralized graph signals. Experimental but promising.
- decentralized.mergers.DecoupleNormalization: Decouples the order of magnitude of diffused values while diffusing decentralized graph signals.

pretrained
- true/false: Whether the classifier's parameters should be pre-trained and shared, with the p2p architecture providing only a refinement (true). Otherwise, decentralized training protocols are employed.

gossip_pull
- true/false: Whether the gossip training strategy should retrieve model parameters from a random device in the network (when true and not pretrained). This requires communication-on-request guarantees and is not very realistic for social media networks.

Internally, decentralized graph signals are diffused by the decoupled GNN with the decentralized.mergers.PPRVariable class. The smoother argument may improve the diffusion.
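
As an example of combining these options, the sketch below passes a non-default merger and smoother to create_network; the smoother keyword name is assumed from the table above rather than confirmed, while the remaining arguments mirror the Simulations section:

from decentralized.devices import GossipDevice
from decentralized.mergers import SlowMerge, Smoothen
from decentralized.simulation import create_network

# the smoother keyword is an assumption based on the options listed above
network, test_labels = create_network("citeseer",
                                      GossipDevice,
                                      pretrained=False,
                                      gossip_merge=SlowMerge,
                                      gossip_pull=False,
                                      smoother=Smoothen,
                                      seed=0,
                                      min_communication_rate=0,
                                      max_communication_rate=0.1)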

📓 Citation

@article{krasanakis2022p2pgnn,
  title={p2pgnn: A decentralized graph neural network for node classification in peer-to-peer networks},
  author={Krasanakis, Emmanouil and Papadopoulos, Symeon and Kompatsiaris, Ioannis},
  journal={IEEE Access},
  volume={10},
  pages={34755--34765},
  year={2022},
  publisher={IEEE}
}
