Utilities¶

Functionals¶

ddpw.functional.seed_generators(seed: int) → None¶

Seed [pseudo]random number generators from various dependencies.

Parameters:: seed (int) – The seed.

ddpw.functional.average_params_grads(module: Module, params: bool = True, grads: bool = False) → None¶

Averages the parameters of the given module and/or their gradients across all the GPUs (copied over from a PyTorch blog and further modified here).

Parameters:

module (torch.nn.Module) – The module whose parameters/gradients are to be averaged.
params (bool) – Whether to average the parameters or not. Default: True.
grads (bool) – Whether to average the gradients or not. Default: False.

ddpw.functional.optimiser_to(optimiser: Optimizer, device: device) → None¶

This function offers a simple way to move all the parameters optimised by an optimiser to the specified device. This function has been taken as is from a solution on PyTorch Discuss.

Parameters:

optimiser (torch.optim.Optimizer) – The optimiser to move to a device.
device (torch.device) – The device to which to move the optimiser.

ddpw.functional.has_batch_norm(module: Module) → bool¶

This function checks if a module has batch normalisation layer(s) in it.

Parameters:: module (torch.nn.Module) – The module to be checked for containing any batch normalisation layers.
Returns bool:: Whether or not the module has batch normalisation layer(s) in it.

ddpw.functional.to(module: Module, local_rank: int, sync_modules: bool = True, device: Device = Device.GPU, *ddp_args, **ddp_kwargs) → Module¶

A quick and minimal function that works on all devices to move the given module to the specified device: CPU, GPU, or MPS. If the platform is set up in an IPC fashion, this function optionally moves the module in a distributed data parallel-fashion and synchronises batch normalisation layers, if any.

Parameters:

module (torch.nn.Module) – The module to be moved.
local_rank (int) – The local rank of the device.
sync_modules (bool) – Whether to synchronise the modules across devices or not. If yes, the module becomes DistributedDataParallel. Default: True.
device (Device) – The type of device to which to move the module. This argument is useful because once device has been globally set, this function can be called regardless of the current device in which the items are and thus helps avoid additional checks. Default: Device.GPU.
ddp_args – Arguments to DistributedDataParallel.
ddp_kwargs – Keyworded arguments to DistributedDataParallel.

Returns torch.nn.Module:

The module moved to the appropriate device.

ddpw.functional.get_dataset_sampler(dataset: Dataset, global_rank: int, platform: Platform) → DistributedSampler | None¶

This function selects a portion of the original dataset shared by other devices. If the device is CPU or MPS, no sharing is necessary.

Parameters:

dataset (torch.utils.data.Dataset) – The dataset from which to sample for the current device.
global_rank (int) – The global rank of the device.
platform (Platform) – Platform-related configurations.

Returns torch.utils.data.DistributedSampler:

Dataset sampler for the given dataset and world size.

ddpw.functional.device(module: Module) → device¶

Given a module, this function returns the device on which it currently resides. If the module has no parameters, the current device is returned by default.

Parameters:: module (torch.nn.Module) – The module whose device is sought.
Returns torch.device:: The device of the module.

ddpw.functional.set_device(local_rank: int, platform: Platform) → None¶

Sets the device for the thread from which this is called.

Parameters:

local_rank (int) – The local rank of the device.
platform (Platform) – Platform-related configurations.