Introduction¶
Distributed Data Parallel Wrapper (DDPW) is a lightweight Python wrapper for PyTorch users. It is written in Python 3.10.
DDPW lets compute-intensive tasks (such as training models) be written without worrying about the underlying compute platform (CPU, Apple SoC, GPUs, or SLURM, via Submitit); the platform is instead specified simply as an argument. This considerably reduces the need to change code for each type of platform.
DDPW handles basic logistics such as spawning processes on GPUs/SLURM nodes and setting up inter-process communication, and it provides simple default utility methods to move modules to devices and to obtain dataset samplers, allowing the user to focus on the main aspects of the task.
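For context, the following is a rough sketch, in plain PyTorch, of the kind of per-process boilerplate these utilities stand in for; the names used here (model, dataset, local_rank) are illustrative and not part of DDPW's API.

import torch
from torch.nn.parallel import DistributedDataParallel
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def setup(model: torch.nn.Module, dataset, local_rank: int):
    # assumes torch.distributed.init_process_group(...) has already been called
    device = torch.device('cuda', local_rank)

    # move the module to its device and wrap it for gradient synchronisation
    model = DistributedDataParallel(model.to(device), device_ids=[local_rank])

    # give each process its own disjoint shard of the dataset
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    return model, loader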
Installation¶
DDPW is distributed on PyPI. The source code is available on GitHub and can be used to manually build the package.
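A typical installation from PyPI (assuming the distribution name matches the import name, ddpw):

pip install ddpw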
Target platforms
This wrapper is released for all architectures but is tested only on Linux (x86-64) and Apple SoC.
Usage¶
from ddpw import Platform, Wrapper

# some task
def task(global_rank, local_rank, process_group, args):
    print(f'This is GPU {global_rank}(G)/{local_rank}(L); args = {args}')

# platform (e.g., 4 GPUs)
platform = Platform(device='gpu', n_gpus=4)

# wrapper
wrapper = Wrapper(platform=platform)

# start
wrapper.start(task, ('example',))
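Since the platform is described entirely by the Platform object, the same task can be dispatched elsewhere by constructing the Platform differently; the task itself stays unchanged. A minimal sketch (the accepted device strings and any fields beyond those shown in these examples are assumptions; see the API for the authoritative options):

# same task, different platform: only the Platform construction changes
cpu_platform = Platform(device='cpu', n_cpus=4)  # 'cpu' is an assumed device string
Wrapper(platform=cpu_platform).start(task, ('example',))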
As a decorator¶
from ddpw import Platform, wrapper

@wrapper(Platform(device='gpu', n_gpus=2, n_cpus=2))
def run(a, b):
    # some task
    pass
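Invoking the decorated function (e.g. run('some', 'arguments')) then presumably launches it on the configured platform; see the API for the exact calling convention.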
Refer to the API for more configuration options, or to the MNIST example for an illustration.