patch dataproc bm scripts and instructions [skip ci] #819

Merged: 4 commits, Jan 12, 2025
notebooks/dataproc/README.md (3 additions, 4 deletions)

@@ -28,11 +28,10 @@ If you already have a Dataproc account, you can run the example notebooks on a D
 ```
 - Create a cluster with at least two single-gpu workers. **Note**: in addition to the initialization script from above, this also uses the standard [initialization actions](https://github.com/GoogleCloudDataproc/initialization-actions) for installing the GPU drivers and RAPIDS:
 ```
-export CUDA_VERSION=11.8
 export RAPIDS_VERSION=24.12.0
 
 gcloud dataproc clusters create $USER-spark-rapids-ml \
---image-version=2.1-ubuntu \
+--image-version=2.2-ubuntu22 \
 --region ${COMPUTE_REGION} \
 --master-machine-type n1-standard-16 \
 --master-accelerator type=nvidia-tesla-t4,count=1 \
@@ -42,11 +41,11 @@ If you already have a Dataproc account, you can run the example notebooks on a D
 --worker-machine-type n1-standard-16 \
 --num-worker-local-ssds 4 \
 --worker-local-ssd-interface=NVME \
---initialization-actions gs://goog-dataproc-initialization-actions-us-central1/gpu/install_gpu_driver.sh,gs://${GCS_BUCKET}/spark_rapids.sh,gs://${GCS_BUCKET}/spark_rapids_ml.sh \
+--initialization-actions gs://${GCS_BUCKET}/spark-rapids.sh,gs://${GCS_BUCKET}/spark_rapids_ml.sh \
+--initialization-action-timeout=20m \
 --optional-components=JUPYTER \
 --metadata gpu-driver-provider="NVIDIA" \
 --metadata rapids-runtime=SPARK \
---metadata cuda-version=${CUDA_VERSION} \
 --metadata rapids-version=${RAPIDS_VERSION} \
 --bucket ${GCS_BUCKET} \
 --enable-component-gateway \
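With this patch the cluster pulls both initialization actions from the user's own bucket rather than the regional goog-dataproc-initialization-actions bucket, so both scripts must be staged there before the gcloud command runs. A minimal sketch of that staging step, assuming GCS_BUCKET is already exported; the source path for spark-rapids.sh is an assumption, not part of this PR:

```bash
# Hypothetical staging step for the two init actions referenced by
# --initialization-actions above; adjust source paths to your checkout.
gsutil cp spark-rapids.sh gs://${GCS_BUCKET}/spark-rapids.sh
gsutil cp notebooks/dataproc/spark_rapids_ml.sh gs://${GCS_BUCKET}/spark_rapids_ml.sh
```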
notebooks/dataproc/spark_rapids_ml.sh (4 additions, 9 deletions)

@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, NVIDIA CORPORATION.
+# Copyright (c) 2025, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,17 +16,12 @@
 
 RAPIDS_VERSION=24.12.0
 
-# patch existing packages
-mamba install "llvmlite<0.40,>=0.39.0dev0" "numba>=0.56.2"
-
-# dataproc 2.1 pyarrow and arrow conda installation is not compatible with cudf
-mamba uninstall -y pyarrow arrow
 
 # install cudf and cuml
 pip install --upgrade pip
-pip install cudf-cu11~=${RAPIDS_VERSION} cuml-cu11~=${RAPIDS_VERSION} cuvs-cu11~=${RAPIDS_VERSION} \
-    pylibraft-cu11~=${RAPIDS_VERSION} \
-    rmm-cu11~=${RAPIDS_VERSION} \
+pip install cudf-cu12~=${RAPIDS_VERSION} cuml-cu12~=${RAPIDS_VERSION} cuvs-cu12~=${RAPIDS_VERSION} \
+    pylibraft-cu12~=${RAPIDS_VERSION} \
+    rmm-cu12~=${RAPIDS_VERSION} \
    --extra-index-url=https://pypi.nvidia.com
 
 # install spark-rapids-ml
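The init action now installs cu12 wheels, presumably because the Dataproc 2.2 image line pairs with CUDA 12; that also explains dropping the CUDA 11 era llvmlite/numba and pyarrow mamba workarounds. One quick way to confirm the script did its job is to import the wheels on a cluster node. A hypothetical sanity check, not part of the patch:

```bash
# Hypothetical post-install check, run on a cluster node:
# confirm the cu12 wheels resolved to the pinned 24.12 release line.
python -c "import cudf, cuml; print(cudf.__version__, cuml.__version__)"
```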
python/benchmark/dataproc/init_benchmark.sh (4 additions, 9 deletions)

@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, NVIDIA CORPORATION.
+# Copyright (c) 2025, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -24,19 +24,14 @@ function get_metadata_attribute() {
 
 RAPIDS_VERSION=$(get_metadata_attribute rapids-version 24.12.0)
 
-# patch existing packages
-mamba install "llvmlite<0.40,>=0.39.0dev0" "numba>=0.56.2"
-
 # install cudf and cuml
 # using ~= pulls in lates micro version patches
 pip install --upgrade pip
 
-# dataproc 2.1 pyarrow and arrow conda installation is not compatible with cudf
-mamba uninstall -y pyarrow arrow
 
-pip install cudf-cu11~=${RAPIDS_VERSION} cuml-cu11~=${RAPIDS_VERSION} cuvs-cu11~=${RAPIDS_VERSION} \
-    pylibraft-cu11~=${RAPIDS_VERSION} \
-    rmm-cu11~=${RAPIDS_VERSION} \
+pip install cudf-cu12~=${RAPIDS_VERSION} cuml-cu12~=${RAPIDS_VERSION} cuvs-cu12~=${RAPIDS_VERSION} \
+    pylibraft-cu12~=${RAPIDS_VERSION} \
+    rmm-cu12~=${RAPIDS_VERSION} \
    --extra-index-url=https://pypi.nvidia.com
 
 # install benchmark files
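Here the RAPIDS version pin is read from cluster metadata via get_metadata_attribute, whose definition sits above this hunk. For orientation, a sketch of how that helper is conventionally written in Dataproc init actions (this is the common pattern from the initialization-actions repo, not code from this diff):

```bash
# Conventional Dataproc metadata helper (sketch, not from this diff):
# returns the named cluster metadata attribute, or a default if unset.
function get_metadata_attribute() {
  local -r attribute_name="$1"
  local -r default_value="$2"
  /usr/share/google/get_metadata_value "attributes/${attribute_name}" || echo -n "${default_value}"
}
```

This is what lets the cluster-creation command's `--metadata rapids-version=${RAPIDS_VERSION}` flag flow through to the pip installs above.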
python/benchmark/dataproc/start_cluster.sh (3 additions, 2 deletions)

@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, NVIDIA CORPORATION.
+# Copyright (c) 2025, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -32,6 +32,7 @@ gpu_args=$(cat <<EOF
    --master-accelerator type=nvidia-tesla-t4,count=1
    --worker-accelerator type=nvidia-tesla-t4,count=1
    --initialization-actions gs://${BENCHMARK_HOME}/spark-rapids.sh,gs://${BENCHMARK_HOME}/init_benchmark.sh
+   --initialization-action-timeout=20m
    --metadata gpu-driver-provider="NVIDIA"
    --metadata rapids-runtime=SPARK
    --metadata benchmark-home=${BENCHMARK_HOME}
@@ -62,7 +63,7 @@ if [[ $? == 0 ]]; then
 else
    set -x
    gcloud dataproc clusters create ${cluster_name} \
-   --image-version=2.1-ubuntu \
+   --image-version=2.2-ubuntu22 \
    --region ${COMPUTE_REGION} \
    --master-machine-type n1-standard-16 \
    --num-workers 2 \
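Taken together, a hypothetical launch of the patched benchmark cluster. How the script is driven is an assumption here: the diff only shows that it expands ${BENCHMARK_HOME} and ${COMPUTE_REGION}, so this sketch supplies them as environment variables and stages the init actions first:

```bash
# Hypothetical end-to-end launch; paths and variable handling are
# assumptions about how the script is invoked, not part of the diff.
export BENCHMARK_HOME=my-bucket/benchmark   # assumption: GCS prefix holding the init actions
export COMPUTE_REGION=us-central1           # assumption: your region

gsutil cp spark-rapids.sh python/benchmark/dataproc/init_benchmark.sh gs://${BENCHMARK_HOME}/
./python/benchmark/dataproc/start_cluster.sh
```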