
Changes to Support Expanded Experimentation with FedDG-GA #252

Merged
merged 9 commits into main from dbe/support_changes_for_feddgga
Nov 4, 2024

Conversation

emersodb
Collaborator

PR Type

Feature/Experimentation.

Short Description

The changes in this PR are targeted at enabling expanded experimentation with the FedDG-GA strategy for a wider set of FL approaches. Also included is a rename, FedDgGaStrategy -> FedDgGa, to better match the naming of our other strategies, along with renaming the associated file from feddg_ga_strategy.py to feddg_ga.py, since it already lives under the strategies folder.

For FENDA+Ditto, there is also a fix moving from SequentiallySplitExchangeBaseModel to SequentiallySplitModel. In this setting we want to exchange the whole Ditto model, not just the feature extractor component. It wasn't causing a functional bug, since a FullParameterExchanger was being used anyway, but the typing was misleading.

Finally, I moved some "client agnostic" functionality out of basic client and into a utils file to help trim a few functions from the BasicClient class.

Tests Added

Added a test for the new FedDG-GA strategy that is compatible with adaptive constraint server-client pairs.

from fl4health.utils.losses import EvaluationLosses, LossMeter, LossMeterType, TrainingLosses
from fl4health.utils.metrics import TEST_LOSS_KEY, TEST_NUM_EXAMPLES_KEY, Metric, MetricManager
from fl4health.utils.random import generate_hash
from fl4health.utils.typing import LogLevel, TorchFeatureType, TorchInputType, TorchPredType, TorchTargetType


class LoggingMode(Enum):
Collaborator Author

Moved this into its own file, utils/logging.py. In future work, we'll probably want to create a logging module to abstract some of the logging components that currently reside in BasicClient anyway. So this is a small step in that direction.

@@ -247,8 +240,13 @@ def process_config(self, config: Config) -> Tuple[Union[int, None], Union[int, N
except ValueError:
evaluate_after_fit = False

try:
Collaborator Author

This will control whether the contents of the loss dictionary should be packed into the metrics for communication with the server or not. Packing the losses makes the full set of additional losses available to the server, which is essential for FedDG-GA to work for an expanded set of FL techniques. However, it also means the server is made aware of these losses, which may be advantageous in other settings as well.
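A minimal sketch of what this packing amounts to. The function name, key prefix, and values below are illustrative assumptions for the sketch, not the actual fl4health API:

```python
from typing import Dict


def pack_losses_with_metrics(metrics: Dict[str, float], loss_dict: Dict[str, float]) -> Dict[str, float]:
    # Copy so the caller's metrics dictionary is left untouched.
    packed = dict(metrics)
    for loss_name, loss_value in loss_dict.items():
        # Prefix loss entries so they are identifiable server-side.
        packed[f"val - {loss_name}"] = loss_value
    return packed


# Hypothetical client-side values for illustration.
client_metrics = {"val - prediction - accuracy": 0.91}
client_losses = {"checkpoint": 0.42, "global": 0.40}
packed = pack_losses_with_metrics(client_metrics, client_losses)
```

The server then sees the losses alongside ordinary metrics in the same dictionary, which is what lets a strategy like FedDG-GA read them without any side channel.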

@@ -326,21 +326,6 @@ def fit(self, parameters: NDArrays, config: Config) -> Tuple[NDArrays, int, Dict
metrics,
)

def evaluate_after_fit(self) -> Tuple[float, Dict[str, Scalar]]:
Collaborator Author

This is being removed in favor of just using validate with pack_losses_with_val_metrics = True. We were packing the loss here specifically to facilitate FedDG-GA.

Collaborator Author

It's possible that in the future we'll want to reinstate this function so it can be overridden in child classes, but for now I think it can be dropped.

@@ -536,69 +527,6 @@ def get_client_specific_reports(self) -> Dict[str, Any]:
"""
return {}

def _move_data_to_device(
Collaborator Author

@emersodb emersodb Oct 10, 2024

This is a pretty generic function. Moved it to utils/client.py and made a slight typing improvement. Let me know if you disagree with the change.
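For context, a framework-agnostic sketch of what such a helper does: anything with a `.to(device)` method (e.g. a torch.Tensor) or a dict of such objects gets moved. The `FakeTensor` class is a stand-in so the sketch runs without torch; the real utility is typed against torch types:

```python
from typing import Any, Dict, Union


def move_data_to_device(data: Union[Any, Dict[str, Any]], device: str) -> Union[Any, Dict[str, Any]]:
    # Dispatch on dicts of tensors vs. a single tensor, mirroring the
    # union input types the client code handles.
    if isinstance(data, dict):
        return {key: value.to(device) for key, value in data.items()}
    return data.to(device)


class FakeTensor:
    """Tiny stand-in for torch.Tensor so this sketch runs without torch."""

    def __init__(self) -> None:
        self.device = "cpu"

    def to(self, device: str) -> "FakeTensor":
        self.device = device
        return self


batch = move_data_to_device({"image": FakeTensor(), "mask": FakeTensor()}, "cuda:0")
single = move_data_to_device(FakeTensor(), "cuda:1")
```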

Collaborator

Nice!


def is_empty_batch(self, input: Union[torch.Tensor, Dict[str, torch.Tensor]]) -> bool:
Collaborator Author

This is a pretty generic function. Moved it to utils/client.py. Let me know if you disagree with the change.
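A sketch of the check's logic, under the assumption that a batch is "empty" when a tensor-like input (or any entry of a dict batch) has length zero. Plain lists stand in for tensors here so the sketch runs without torch:

```python
from typing import Dict, Sized, Union


def is_empty_batch(input: Union[Sized, Dict[str, Sized]]) -> bool:
    # For dictionary batches, treat the batch as empty if any entry has
    # length zero; otherwise check the single tensor-like input directly.
    if isinstance(input, dict):
        return any(len(tensor) == 0 for tensor in input.values())
    return len(input) == 0
```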

@@ -818,9 +749,12 @@ def _validate_or_test(
metrics = metric_manager.compute()
self._log_results(loss_dict, metrics, logging_mode=logging_mode)

if include_losses_in_metrics:
Collaborator Author

This is where we inject the loss_dict results into the metrics during validation.

@@ -1034,24 +977,6 @@ def set_optimizer(self, config: Config) -> None:
assert not isinstance(optimizer, dict)
self.optimizers = {"global": optimizer}

def clone_and_freeze_model(self, model: nn.Module) -> nn.Module:
Collaborator Author

This is a pretty generic function. Moved it to utils/client.py. Let me know if you disagree with the change.
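Conceptually, clone-and-freeze deep-copies the model, disables gradients on every parameter, and puts the copy in eval mode. The classes below are stand-ins for torch.nn.Module so this sketch runs without torch; the real helper operates on actual modules:

```python
import copy


class FakeParam:
    """Stand-in for a torch parameter."""

    def __init__(self) -> None:
        self.requires_grad = True


class FakeModel:
    """Stand-in for torch.nn.Module."""

    def __init__(self) -> None:
        self._params = [FakeParam(), FakeParam()]
        self.training = True

    def parameters(self):
        return self._params

    def eval(self) -> None:
        self.training = False


def clone_and_freeze_model(model: FakeModel) -> FakeModel:
    # Deep-copy so the original model keeps training normally.
    cloned = copy.deepcopy(model)
    for parameter in cloned.parameters():
        parameter.requires_grad = False
    cloned.eval()
    return cloned


model = FakeModel()
frozen = clone_and_freeze_model(model)
```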

@@ -1241,35 +1166,6 @@ def update_before_epoch(self, epoch: int) -> None:
"""
pass

def maybe_progress_bar(self, iterable: Iterable) -> Iterable:
Collaborator Author

This is a pretty generic function. Moved it to utils/client.py. Let me know if you disagree with the change.
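A sketch of the idea: wrap the iterable in tqdm when progress display is enabled, otherwise return it untouched. The enable flag is passed as a plain argument here for illustration; the actual helper reads client state:

```python
from typing import Iterable


def maybe_progress_bar(iterable: Iterable, display_progress_bar: bool) -> Iterable:
    if not display_progress_bar:
        return iterable
    try:
        # tqdm is an optional dependency in this sketch; fall back to the
        # bare iterable if it isn't installed.
        from tqdm import tqdm

        return tqdm(iterable)
    except ImportError:
        return iterable
```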

@@ -9,7 +9,7 @@
from fl4health.checkpointing.client_module import ClientCheckpointModule
from fl4health.clients.ditto_client import DittoClient
from fl4health.model_bases.fenda_base import FendaModel
from fl4health.model_bases.sequential_split_models import SequentiallySplitExchangeBaseModel
from fl4health.model_bases.sequential_split_models import SequentiallySplitModel
Collaborator Author

While this doesn't actually change the flow of the example, since it uses a FullParameterExchanger, the previous type was misleading, given that it's meant to facilitate exchanging only the feature extraction module.

@@ -23,7 +30,7 @@ class FairnessMetricType(Enum):
"""Defines the basic types for fairness metrics, their default names and their default signals"""

ACCURACY = "val - prediction - accuracy"
LOSS = "val - loss"
Collaborator Author

Changing this to checkpoint, as that is the name of the vanilla loss for basic clients during validation.

# Setting self.num_rounds once and doing some sanity checks
assert self.on_fit_config_fn is not None, "on_fit_config_fn must be specified"
config = self.on_fit_config_fn(server_round)
assert "evaluate_after_fit" in config, "evaluate_after_fit must be present in config"
Collaborator Author

FedDG-GA requires evaluate_after_fit to be present and true. Similarly, it requires pack_losses_with_val_metrics to be present and true to allow the server to see the measured validation loss. So when using the strategy we check for them and throw if not present.


assert self.on_evaluate_config_fn is not None, "on_evaluate_config_fn must be specified"
config = self.on_evaluate_config_fn(server_round)
assert "pack_losses_with_val_metrics" in config, "pack_losses_with_val_metrics must be present in config"
Collaborator Author

During validation, we need the server to have access to the losses. So pack_losses_with_val_metrics must be present and true for FedDG-GA to work properly.

fit_metrics = [(res.num_examples, res.metrics) for _, res in results]
metrics_aggregated = self.fit_metrics_aggregation_fn(fit_metrics)
elif server_round == 1: # Only log this warning once
log(WARNING, "No fit_metrics_aggregation_fn provided")
Collaborator Author

Rather than relying on the original aggregation approach and discarding the weight aggregation, which wastes computation, we just do the metrics aggregation here, since that's all we want anyway.
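A minimal sketch of example-count-weighted metrics aggregation, similar in spirit to what a fit_metrics_aggregation_fn might do. This exact function is an illustration, not the fl4health implementation:

```python
from typing import Dict, List, Tuple


def weighted_average_metrics(fit_metrics: List[Tuple[int, Dict[str, float]]]) -> Dict[str, float]:
    # Each entry pairs a client's example count with its metrics dictionary.
    total_examples = sum(num_examples for num_examples, _ in fit_metrics)
    aggregated: Dict[str, float] = {}
    for num_examples, metrics in fit_metrics:
        for name, value in metrics.items():
            # Accumulate each metric weighted by the client's share of examples.
            aggregated[name] = aggregated.get(name, 0.0) + value * num_examples / total_examples
    return aggregated


# Hypothetical results from two clients: (num_examples, metrics).
fit_metrics = [(10, {"accuracy": 0.8}), (30, {"accuracy": 0.9})]
aggregated = weighted_average_metrics(fit_metrics)
```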

self.evaluation_metrics[cid] = eval_res.metrics
# adding the loss to the metrics
val_loss_key = FairnessMetricType.LOSS.value
self.evaluation_metrics[cid][val_loss_key] = eval_res.loss
Collaborator Author

No need to force the loss into the dictionary here. It should be packed with the metrics. If it's not, something else has gone wrong.

@@ -0,0 +1,268 @@
from logging import INFO, WARNING
Collaborator Author

This class is needed in order to facilitate FedDG-GA for clients that also require adaptive constraint considerations, such as Ditto, MR-MTL, and FedProx clients. This strategy needs to coordinate the exchange of the loss weights in addition to the Generalization Adjustments for aggregation.

updated_weights, train_loss = self.parameter_packer.unpack_parameters(
parameters_to_ndarrays(fit_res.parameters)
)
# Modify the parameters in-place to just be the model weights.
Collaborator Author

Since the training losses are packed with the weights, we need to extract the losses and repack the bare weights into the FitRes parameters object so that FedDG-GA can proceed unabated.
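A sketch of the unpack step, assuming (for illustration only) a payload layout where the final array in the packed list holds the scalar training loss and everything before it is model weights:

```python
from typing import List, Tuple


def unpack_weights_and_loss(packed: List[List[float]]) -> Tuple[List[List[float]], float]:
    # By the assumed convention, the last array carries the packed train loss;
    # the rest are the model's weight arrays.
    *weights, loss_array = packed
    return weights, loss_array[0]


# Two hypothetical weight arrays plus the packed loss.
packed = [[0.1, 0.2], [0.3], [1.25]]
weights, train_loss = unpack_weights_and_loss(packed)
```

After unpacking, the bare `weights` would be written back into the FitRes parameters so the downstream aggregation only ever sees model parameters.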

Collaborator Author

Deleted this, as it seemed to be a carbon copy of the BasicExample readme? Perhaps that's not true. Just let me know.

@emersodb emersodb marked this pull request as ready for review October 10, 2024 15:35
@emersodb emersodb requested a review from lotif October 10, 2024 15:52
Collaborator

@jewelltaylor jewelltaylor left a comment

Everything looks pretty much good to go for me! I will wait until you have a chance to take a look at the minor comments I made and then take one last pass tomorrow.

@emersodb emersodb requested a review from jewelltaylor November 1, 2024 16:00
Collaborator

@jewelltaylor jewelltaylor left a comment

Changes look good to me!

@emersodb emersodb merged commit 3e237cf into main Nov 4, 2024
6 checks passed
@emersodb emersodb deleted the dbe/support_changes_for_feddgga branch November 4, 2024 13:40