TorchDevice is a class in the torchdevice.py module that intercepts PyTorch calls related to GPU hardware, enabling transparent code portability between NVIDIA CUDA and Apple Silicon (MPS) hardware. It lets developers write code that runs unmodified on both device types and is intended to assist in porting code from CUDA to MPS.
- Automatic Device Redirection: Intercepts `torch.device` instantiation and redirects it based on available hardware (CUDA, MPS, or CPU).
- Mocked CUDA Functions: Provides mocked implementations of CUDA-specific functions, enabling code that uses CUDA functions to run on MPS hardware.
- Unified Memory Handling: Handles differences in memory management between CUDA and MPS, providing reasonable values for memory-related functions.
- Logging and Debugging: Outputs informative log messages indicating how calls are intercepted and handled, assisting in code migration and debugging.
- Transparent Integration: Works transparently without requiring changes to existing codebases.
- Python 3.7 or higher
- PyTorch installed with appropriate support for your hardware:
- For CUDA support on NVIDIA GPUs
- For MPS support on Apple Silicon (macOS 12.3+)
- Additional Python packages: `numpy`, `psutil`
Follow the official PyTorch installation instructions to install PyTorch with the appropriate support for your hardware.
If you want the latest nightly builds for Apple Silicon, use pip to install them:
pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
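After installing, you can verify that your build sees the MPS backend. This check assumes a recent PyTorch (1.12+, where `torch.backends.mps` was introduced):

```python
import torch

print(torch.__version__)
# Guarded lookup so this also runs on builds without the MPS backend.
mps_backend = getattr(torch.backends, "mps", None)
mps_ok = bool(mps_backend and mps_backend.is_available())
print(f"MPS available: {mps_ok}")  # True on Apple Silicon with an MPS-enabled build
```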
- Clone the Repository:
git clone https://github.com/unixwzrd/TorchDevice.git
- Navigate to the Project Directory:
cd TorchDevice
- Install Dependencies:
pip install -r requirements.txt
Alternatively, install dependencies manually:
pip install numpy psutil
IMPORTANT - For Apple Silicon, you will want NumPy linked against the Apple Accelerate framework. This should be redone whenever NumPy gets downgraded or overwritten by another package (SciPy, etc.). As far as I can tell, the prebuilt binaries are not linked against the Apple Accelerate framework, and NumPy does a lot of heavy lifting for PyTorch. Linking it properly can yield roughly an 8x performance improvement for vector operations.
Here is how you can ensure NumPy is linked properly for your machine:
# NumPy Rebuild with Pip
CFLAGS="-I/System/Library/Frameworks/vecLib.framework/Headers -Wl,-framework -Wl,Accelerate -framework Accelerate" pip install numpy==1.26.* --force-reinstall --no-deps --no-cache --no-binary :all: --compile -Csetup-args=-Dblas=accelerate -Csetup-args=-Dlapack=accelerate -Csetup-args=-Duse-ilp64=true
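Once rebuilt, you can confirm which BLAS/LAPACK libraries NumPy was linked against with `numpy.show_config()`; on a correctly linked Apple Silicon build, the output should mention Accelerate:

```python
import numpy as np

# Prints NumPy's build configuration, including the BLAS/LAPACK backends.
# Look for 'accelerate' in the blas/lapack sections of the output.
np.show_config()
```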
- Install the TorchDevice Module:
Since `TorchDevice` is a single Python file, you can copy `TorchDevice.py` to your project's directory or install it as a package:
python setup.py install
Alternatively, install `TorchDevice` as a package:
pip install .
Import `torchdevice` in your code before or after importing `torch`. The module will automatically apply patches to intercept and redirect PyTorch calls.
import torchdevice # import torchdevice to apply patches
import torch
device = torch.device('cuda') # This will be redirected based on available hardware
# Your existing PyTorch code
- Device Selection: `TorchDevice` will select the appropriate device based on hardware availability:
  - If CUDA is requested but not available, it will redirect to MPS if available.
  - If MPS is requested but not available, it will redirect to CUDA if available.
  - If neither is available, it will default to CPU.
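The fallback order above can be sketched as a small pure-Python helper. This is an illustration of the rules, not TorchDevice's actual code; `cuda_ok` and `mps_ok` stand in for the real hardware-availability checks:

```python
def select_device(requested: str, cuda_ok: bool, mps_ok: bool) -> str:
    """Return the device type the fallback rules would settle on."""
    if requested == "cuda":
        # CUDA requested: fall back to MPS, then CPU.
        return "cuda" if cuda_ok else ("mps" if mps_ok else "cpu")
    if requested == "mps":
        # MPS requested: fall back to CUDA, then CPU.
        return "mps" if mps_ok else ("cuda" if cuda_ok else "cpu")
    return "cpu"

# On a typical Apple Silicon machine (no CUDA, MPS present):
print(select_device("cuda", cuda_ok=False, mps_ok=True))  # → mps
```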
- Logging: The module outputs log messages indicating how calls are intercepted and handled. These messages include the caller's filename, function name, and line number.
- Unsupported Functions: Functions that are not supported on the current hardware are stubbed and will log a warning but allow execution to continue.
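The stubbing behaviour can be illustrated with a generic decorator pattern. This is a hypothetical sketch, not the actual TorchDevice code: the stub logs a warning and returns a harmless default instead of raising.

```python
import functools
import logging

def stub_if_unsupported(fn):
    """Hypothetical sketch: replace an unsupported function with a stub
    that warns and lets execution continue."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        logging.warning("%s is not supported on this device; continuing.", fn.__name__)
        return None  # harmless default instead of raising
    return wrapper

@stub_if_unsupported
def cuda_only_op():
    raise RuntimeError("would require CUDA")

result = cuda_only_op()  # logs a warning and returns None
```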
You can include the demo scripts provided earlier in your project to showcase how to use `TorchDevice`:
demo_basic_tensor.py
demo_matrix_multiplication.py
demo_neural_network.py
demo_unsupported_functions.py
demo_device_info.py
Ensure these scripts are updated with any changes you've made to `TorchDevice.py`.
- Precision Support on MPS: The MPS backend does not support `float64` (double precision). Use `float32` instead.
- Multiple Devices on MPS: MPS does not support multiple devices. Calls to set or get devices will be redirected appropriately.
- Partial CUDA Functionality: While many CUDA functions are mocked, some functionality cannot be fully replicated on MPS hardware.
- Performance Considerations: Mocked functions may not reflect actual hardware performance or capabilities.
Contributions are welcome! Please submit a pull request or open an issue to discuss potential changes or improvements.
This project is licensed under the Apache License, Version 2.0.
Copyright 2024 Michael P. Sullivan - [email protected]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.