Implementing FedRep #127

emersodb · 2024-04-22T20:52:53Z

PR Type

Feature

Implementation of FedRep. This includes an example of the method's use, unit tests, and an addition to the smoke tests.

Note: Most of the file changes are very small typo fixes.

There is also a slight refactor of the base models for FedPer, MOON, and FedRep, as they all belong to the category of "sequentially split" models, in some way. This is a first pass at this refactor and will need to be integrated into @sanaAyrml's work in her branch.
Also adding Ditto to the smoke tests.
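
For readers unfamiliar with the term, here is a minimal sketch of what "sequentially split" means in this context, using plain Python dicts in place of real model weights. The `base_module.` key prefix and both helper names are assumptions made for illustration, not the fl4health API. The idea is that the model is a feature extractor followed by a head, and in FedPer and FedRep only the feature extractor's parameters are shared with the server while the head stays local.

```python
from typing import Dict, List

# Hypothetical prefix marking the shared (feature extractor) half of the model.
# Everything else, e.g. "head_module.*", is treated as client-local.
BASE_PREFIX = "base_module."


def layers_to_exchange(state: Dict[str, List[float]]) -> List[str]:
    """Names of the parameters that would be shared with the server."""
    return sorted(name for name in state if name.startswith(BASE_PREFIX))


def merge_server_update(
    local_state: Dict[str, List[float]], server_state: Dict[str, List[float]]
) -> Dict[str, List[float]]:
    """Overwrite shared parameters with the server's copy; keep the head local."""
    merged = dict(local_state)
    for name in layers_to_exchange(local_state):
        if name in server_state:
            merged[name] = list(server_state[name])
    return merged


# Toy "weights": one shared layer and one personal head layer.
local = {
    "base_module.conv1.weight": [0.1, 0.2],
    "head_module.fc.weight": [0.9],
}
server = {"base_module.conv1.weight": [0.5, 0.5]}
updated = merge_server_update(local, server)
```

In this toy setup the server's values replace the shared layer, and the head layer is left untouched.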

Clickup Ticket(s): Link(s) if applicable.

Tests Added

Added to the Smoke Tests
Tests have also been added in tests/clients/test_fedrep_client.py

… method. There is also a slight refactor of the base models for FedPer, MOON, and FedRep, as they all belong to the category of SequentiallySplit models, in some way. Also adding FedRep and Ditto to the smoke tests.

emersodb · 2024-04-22T20:55:49Z

examples/fedper_example/client.py

 from fl4health.utils.load_data import load_mnist_data
 from fl4health.utils.metrics import Accuracy, Metric
 from fl4health.utils.sampler import MinorityLabelBasedSampler


-class MnistFedPerClient(MoonClient):
+class MnistFedPerClient(FedPerClient):


Changing this client to no longer inherit from MoonClient. I liked the idea of mixing them because they are so closely related, but it can be a little confusing. It also means we no longer need to remember to specify the exchanger when using vanilla FedPer. We can still use the MoonClient to combine the idea with FedPer; I'm just removing it from the examples to reduce confusion.
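
A rough sketch of the inheritance change, assuming stand-in classes rather than the real fl4health ones. Every class name ending in `Sketch` and the placeholder return values are hypothetical. The point is only that once the FedPer-specific client supplies the exchanger, the example client no longer has to override that method.

```python
from typing import Any, Dict


class BaseClientSketch:
    """Hypothetical stand-in for a generic client base class."""

    def get_parameter_exchanger(self, config: Dict[str, Any]) -> str:
        raise NotImplementedError("subclasses must choose an exchanger")


class FedPerClientSketch(BaseClientSketch):
    """Supplies the exchanger once, so example clients do not need to."""

    def get_parameter_exchanger(self, config: Dict[str, Any]) -> str:
        # Placeholder string standing in for a real exchanger object.
        return "fixed-layer-exchanger"


class MnistFedPerClientSketch(FedPerClientSketch):
    """Only dataset- and model-specific hooks remain in the example client."""

    def get_criterion(self, config: Dict[str, Any]) -> str:
        # Placeholder string standing in for a real loss function.
        return "cross-entropy"


client = MnistFedPerClientSketch()
exchanger = client.get_parameter_exchanger({})
```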

emersodb · 2024-04-22T20:56:07Z

examples/fedper_example/client.py

@@ -56,10 +52,6 @@ def get_optimizer(self, config: Config) -> Optimizer:
    def get_criterion(self, config: Config) -> _Loss:
        return torch.nn.CrossEntropyLoss()

-    def get_parameter_exchanger(self, config: Config) -> ParameterExchanger:


Now appears in FedPerClient

emersodb · 2024-04-22T20:57:11Z

examples/models/fedper_cnn.py

@@ -1,31 +0,0 @@
-import torch


Moved this model into the same file as the one used for FedRep, since they are both just sequentially split models for different datasets (MNIST vs. CIFAR).

emersodb · 2024-04-22T20:57:49Z

examples/apfl_example/README.md


-In this demo, APFL is applied to an augmented version of the MNIST dataset that is non--IID. The FL server expects three clients to be spun up (i.e. it will wait until three clients report in before starting training). Each client has a modified version of the MNIST dataset. This modification essentially subsamples a certain number from the original training and validation sets of MNIST in order to synthetically induce local variations in the statistical properties of the clients training/validation data. In theory, the models should be able to perform well on their local data while learning from other clients data that has different statistical properties. The proportion of labels at each client is determined by dirichlet distribution across the classes. The lower the beta parameter is for each class, the higher the degree of the label heterogeneity.
+In this demo, APFL is applied to an augmented version of the MNIST dataset that is non--IID. The FL server expects three clients to be spun up (i.e. it will wait until three clients report in before starting training). Each client has a modified version of the MNIST dataset. This modification essentially subsamples a certain number from the original training and validation sets of MNIST in order to synthetically induce local variations in the statistical properties of the clients training/validation data. In theory, the models should be able to perform well on their local data while learning from other clients data that has different statistical properties. The proportion of labels at each client is determined by Dirichlet distribution across the classes. The lower the beta parameter is for each class, the higher the degree of the label heterogeneity.


Dirichlet is the name of a German mathematician, hence the capitalization.

fl4health/clients/moon_client.py

emersodb · 2024-04-22T21:01:25Z

tests/clients/test_instance_level.py

@@ -32,7 +32,7 @@ def __getitem__(self, index: int) -> Tuple[torch.Tensor, torch.Tensor]:
        return self.data[index], self.targets[index]


-class TestClient(InstanceLevelPrivacyClient):
+class ClientForTest(InstanceLevelPrivacyClient):


These classes are just renamed to address a pytest warning that it can't collect them as tests. Pytest finds test classes with a simple name match on the prefix "Test", so it also tries to collect these helper clients. The rename just removes the warnings.
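
A small sketch of the naming issue, using a hypothetical stub in place of the real client base class. With pytest's default settings, classes whose names start with `Test` are treated as candidate test classes, so a helper class needs either a different name (the approach taken here) or an explicit opt-out via `__test__ = False`.

```python
class InstanceLevelClientStub:
    """Hypothetical stand-in for the real client base class."""

    def __init__(self, client_name: str) -> None:
        self.client_name = client_name


# Approach used in this PR: a name that does not match the default "Test*" pattern.
class ClientForTest(InstanceLevelClientStub):
    pass


# Alternative: keep a Test-prefixed name but opt out of collection explicitly.
class TestHelperClient(InstanceLevelClientStub):
    __test__ = False


def test_helper_clients_are_usable() -> None:
    # Both helpers behave identically; only how pytest treats their names differs.
    assert ClientForTest("a").client_name == "a"
    assert TestHelperClient("b").client_name == "b"
```

The rename keeps the helper out of collection without any extra attribute, which is why it is the lighter-touch option for test files like this one.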

sanaAyrml

I just left a few comments. Overall it is a very neat PR, so I approve it.

fl4health/model_bases/sequential_split_models.py

fl4health/model_bases/fedper_base.py

fl4health/clients/moon_client.py

fl4health/clients/fedrep_client.py

tests/smoke_tests/run_smoke_test.py

…base classes for FedPer, FedRep, and MOON to bury the flatten features functionality in the base class for SequentiallySplitModels. With this change in functionality the FedPer class became essentially a pass through. So I eliminated it and made some modifications for the FedPerClient to make it clear that a SequentiallySplitExchangeBaseModel was the right type to be used for this client. Also added a test for the FedPerClient and modified the fedper base test.

sanaAyrml

Nice changes, just 2 comments and then good to go.

examples/fedper_example/server.py

fl4health/model_bases/sequential_split_models.py

emersodb requested review from lotif, fatemetkl, jewelltaylor, sanaAyrml and yc7z April 22, 2024 20:52

Small fixes to documentation and reverting changes to run_fl_local

6e8775a

emersodb added 2 commits April 22, 2024 18:06

Taking a shot at fixing smoke tests.

e5f636f

Smoke test fix number 2

6d3a0a0

sanaAyrml approved these changes May 7, 2024

emersodb added 3 commits May 7, 2024 10:26

Merge branch 'main' into dbe/implement_fed_rep

1684cca

Merge branch 'main' into dbe/implement_fed_rep

f2563d1

emersodb requested a review from sanaAyrml May 8, 2024 14:37

sanaAyrml approved these changes May 8, 2024

Two small PR comment fixes.

cec1334

emersodb merged commit 2bbf8ae into main May 8, 2024
6 checks passed

emersodb deleted the dbe/implement_fed_rep branch May 8, 2024 17:54
