Thrust update 1.17 (ROCm#245)
* Update .gitlab-ci.yml file

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml file

* Update .gitlab-ci.yml file

* Update .gitlab-ci.yml file

* Thrust Update 1.17.0

All commits of 1.17.0:

Bump CUB

Re-apply PR with the new documentation framework.

This reverts commit 4d657ac37f4a76548e3cf52cb459606e426aa570.

Finish removing pinned_allocator (fb24e3278)

Fix some issues introduced in #1475.

Docs/Doxygen: #ifdef out some of the `optional` implementation details that
cause Doxygen to choke.

Use the `nullptr` literal instead of `std::nullptr_t` parameters in
`thrust::pointer` and `thrust::device_ptr` to silence spurious "set but not
used" warnings from old GCC versions.

Fix merge conflict left in CHANGELOG.md from the documentation PR.

Docs: Remove just-the-docs `{: .btn }` syntax for link buttons from `README.md` as GitHub Flavored Markdown doesn't recognize it.

add sccache to build script

edit sccache flags with cmake env variables

add newline at end of file

zero sccache statistics

update s3 key prefix

Add support for local build caching.

Integrate sccache logging output

Disable sccache on nvc++ builds.

Docs/Doxybook: Add examples of `\param` and `\tparam` to the Doxybook rendering
test.

Bump CUB.

Fix contributing guide link to cmake options

Fix CI label for GCC 11 builder.

Increase contrast of search input text in docs.

Bump CUB.

Add GitHub action to validate links in markdown files (#1640)

Add GitHub action to run xrefcheck on markdown files.

Handle when CMAKE_INSTALL_LIBDIR has nested directories (#1653)

* Handle when CMAKE_INSTALL_LIBDIR has nested directories

* simplify the implementation

* depend on cub with the same changes

Bump CUB.

Fix gpuCI links in README.

Bump CUB.

Waive some additional GCC11 miscompiles.

Bump CUB.

Fix bug in permutation_iterator example.

Fixes #1660.

Bump CUB.

Add make_tagged_iterator

make_tagged_iterator<Tag, Iterator>(iterator);
The `Iterator` template parameter can be omitted, since it is deduced from the function argument.

Add trailing return type to support C++11

Co-authored-by: Jake Hemstad <[email protected]>

Bump CUB

Bump CUB.

Fix thrust::reduce_by_key for 2^31 elements

Style updates to memmon.py.

Update memmon.py:

- Print a message immediately when the fail threshold is exceeded.
  This helps locate issues since the command string may not contain
  useful information.
- Don't fail the build over memmon issues. We should revisit this, but
  due to sccache, these failures manifest intermittently.

Bump CUB.

Bump CUB.

Bump CUB.

add gdb pretty-printer for thrust vectors

add gdb pretty-printer for thrust device_reference

Add __forceinline__ to thrust::detail::wrapped_function::operator()

Add missing header.

Add utilities to convert contiguous iterators to pointers.

All of these are internal implementation details in the
`thrust::detail` namespace:

Contiguous iterators only:
- `contiguous_iterator_traits`
- `contiguous_iterator_raw_pointer_t`
- `contiguous_iterator_raw_pointer_cast`

These work on all iterators, but convert to a
raw pointer if given a contiguous iterator.
- `try_unwrap_contiguous_iterator_return_t`
- `try_unwrap_contiguous_iterator`

Update the CUDA scan_by_key impl to use cub's ScanByKey.

Split the scan_by_key test into inclusive/exclusive tests.

This test was consuming excessive memory during nvc++ compilation.
Splitting into two TUs should remedy this.

Ran clang-format on the new test files, but the contents are the same.

Address review suggestions.

Add an example that shows how to use custom CUDA streams.

Fixes #1626.

Cover par_nosync in new stream example.

Co-authored-by: Jake Hemstad <[email protected]>

add unique_count algorithm

Add a counting equivalent to unique_* algorithms
that can be used to allocate the correct amount of data
before actually filling it.

Addresses issue #1612
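
As a rough illustration of the count-then-fill pattern described above, here is a host-only sketch in standard C++. It is not Thrust's implementation and does not use the Thrust API; `unique_count_sketch` and `unique_exact` are hypothetical names, and the sketch assumes equality comparison with `==`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Count the runs of consecutive equal elements in [first, last).
// This is the number of elements a unique-style copy would write.
template <typename ForwardIt>
std::size_t unique_count_sketch(ForwardIt first, ForwardIt last) {
  if (first == last) return 0;
  std::size_t count = 1;  // the first element always starts a run
  ForwardIt prev = first;
  for (++first; first != last; ++first, ++prev) {
    if (!(*prev == *first)) ++count;  // a new run starts here
  }
  return count;
}

// Allocate the output with exactly the right size, then fill it.
template <typename T>
std::vector<T> unique_exact(const std::vector<T>& in) {
  std::vector<T> out(unique_count_sketch(in.begin(), in.end()));
  auto end = std::unique_copy(in.begin(), in.end(), out.begin());
  assert(end == out.end());  // the count matched what was written
  (void)end;                 // silence unused warning when NDEBUG is set
  return out;
}
```

The point of the separate counting pass is that the output buffer is sized once, up front, instead of being over-allocated to the input size and trimmed afterwards.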

unique_count: weaken iterator requirements

unique: improve template parameter naming

The interface specifies ForwardIterator,
not InputIterator

unique: test with ForwardIterator parameters

improve forward_iterator_wrapper

* use iterator traits
* use hidden friend operators
* fix member access operator

Co-authored-by: Eric Niebler <[email protected]>

unique_count: add missing cuda tests

use thrust iterator categories in iterator wrapper

Revert "use thrust iterator categories in iterator wrapper"

This reverts commit fac36573bec54519d713d06f46fa45292714a7d2.

Revert "improve forward_iterator_wrapper"

This reverts commit 1532df8007ff38189cdb88738eafb1759b90b377.

Revert "unique: test with ForwardIterator parameters"

This reverts commit 0b41e08165825d55145442ebe07e87c3dc85351f.

Bump CUB.

Bump CUB.

Fix some exec space annotations.

Use CUB version of adjacent difference

Compare value types instead of iterator once

Better name for in-place execution

Bump CUB.

Update changelog for 1.17.0.

* Add hip backend and test for unique_count

Test based on upstream thrust test ported to google test.

* bump cub version to thrust 1.17 proper

* add equivalent reduce_by_key large indices hip test

* Updating changelog for ROCm 5.4

Co-authored-by: Lőrinc Serfőző <[email protected]>
Co-authored-by: Gergely Meszaros <[email protected]>
Co-authored-by: Stanley Tsang <[email protected]>
4 people authored Sep 1, 2022
1 parent 767898e commit f9e0d85
Showing 300 changed files with 8,276 additions and 3,938 deletions.
3 changes: 0 additions & 3 deletions .gitignore
@@ -57,11 +57,8 @@ compile_commands.json
CTestTestfile.cmake
thrust/system/cuda/detail/.gitignore
*.bash
*.log
.p4config
run
build*
doc/html
discrete_voronoi.pgm

# End of https://www.gitignore.io/api/c++,cmake
95 changes: 39 additions & 56 deletions .gitlab-ci.yml
@@ -1,5 +1,5 @@
# ########################################################################
# Copyright 2019-2021 Advanced Micro Devices, Inc.
# Copyright 2019-2022 Advanced Micro Devices, Inc.
# ########################################################################

include:
@@ -10,6 +10,7 @@ include:
- /deps-cmake.yaml
- /deps-rocm.yaml
- /gpus-rocm.yaml
- /rules.yaml

stages:
- build # Tests if builds succeed (CMake)
@@ -20,13 +21,6 @@ variables:
PACKAGE_DIR: $BUILD_DIR/package
ROCPRIM_GIT_BRANCH: develop_stream

.deps:doc:
image: sphinxdoc/sphinx-latexpdf
before_script:
- $SUDO_CMD apt update -qq
- $SUDO_CMD apt install -y -qq doxygen
- $SUDO_CMD pip3 install sphinx_rtd_theme breathe exhale

.cmake-latest:
extends:
- .deps:rocm
@@ -45,31 +39,20 @@ variables:

.install-rocprim:
script:
- git clone https://github.com/RadeonOpenCompute/rocm-cmake.git
- cd rocm-cmake && mkdir build && cd build
- cmake
-G Ninja
-D CMAKE_CXX_COMPILER=hipcc
-D CMAKE_BUILD_TYPE=Release
../
- $SUDO_CMD cmake
--build ./ --target install

- branch_name="$ROCPRIM_GIT_BRANCH"
- if [ $CI_COMMIT_BRANCH == develop ] || [ $CI_COMMIT_BRANCH == master ]; then branch_name=$CI_COMMIT_BRANCH;
- fi;
- git clone -b $branch_name https://gitlab-ci-token:${CI_JOB_TOKEN}@${ROCPRIM_GIT_URL}
- cd rocPRIM
- mkdir build
- cd build
- git clone -b $branch_name https://gitlab-ci-token:${CI_JOB_TOKEN}@${ROCPRIM_GIT_URL} $CI_PROJECT_DIR/rocPRIM
- cmake
-G Ninja
-D CMAKE_CXX_COMPILER=hipcc
-D CMAKE_BUILD_TYPE=Release
-D BUILD_TEST=OFF
-D BUILD_EXAMPLE=OFF
-D ROCM_DEP_ROCMCORE=OFF
../.
-S $CI_PROJECT_DIR/rocPRIM
-B $CI_PROJECT_DIR/rocPRIM/build
- cd $CI_PROJECT_DIR/rocPRIM/build
- cpack
-G "DEB"
- $SUDO_CMD dpkg -i rocprim*.deb
@@ -78,25 +61,16 @@ variables:
stage: build
extends:
- .gpus:rocm-gpus
- .rules:build
tags:
- rocm-build
script:
- !reference [.install-rocprim, script]
# Setup env vars for testing
- rng_seed_count=0; prng_seeds="0";
- if [ $CI_COMMIT_BRANCH == develop_stream ] ; then rng_seed_count=3; prng_seeds="0 1000";
- fi;
# Build rocThrust
- mkdir -p $BUILD_DIR
- cd $BUILD_DIR
- git clone https://github.com/RadeonOpenCompute/rocm-cmake.git
- cd rocm-cmake && mkdir build && cd build
- cmake
-G Ninja
-D CMAKE_CXX_COMPILER=hipcc
-D CMAKE_BUILD_TYPE=Release
../
- $SUDO_CMD cmake
--build ./ --target install
- cd ../..
- cmake
-G Ninja
-D CMAKE_CXX_COMPILER=hipcc
@@ -109,18 +83,18 @@ variables:
-D RNG_SEED_COUNT=$rng_seed_count
-D PRNG_SEEDS=$prng_seeds
-S $CI_PROJECT_DIR
-B $BUILD_DIR
- cmake
--build $BUILD_DIR
-B $CI_PROJECT_DIR/build
- cmake --build $CI_PROJECT_DIR/build
artifacts:
paths:
- $BUILD_DIR/test/*
- $BUILD_DIR/testing/*
- $BUILD_DIR/deps/*
- $BUILD_DIR/CMakeCache.txt
- $BUILD_DIR/CTestTestfile.cmake
- $CI_PROJECT_DIR/build/test/*
- $CI_PROJECT_DIR/build/testing/*
- $CI_PROJECT_DIR/build/deps/*
- $CI_PROJECT_DIR/build/CMakeCache.txt
- $CI_PROJECT_DIR/build/CTestTestfile.cmake
- $CI_PROJECT_DIR/build/.ninja_log
exclude:
- $BUILD_DIR/**/*.o
- $CI_PROJECT_DIR/build/**/*.o
expire_in: 2 weeks

build:cmake-latest:
@@ -131,9 +105,6 @@ build:cmake-latest:

build:cmake-minimum:
stage: build
needs:
- job: build:cmake-latest
artifacts: false
extends:
- .cmake-minimum
- .build:common
@@ -142,13 +113,16 @@ build:package:
stage: build
extends:
- .cmake-minimum
- .rules:build
tags:
- rocm-build
script:
- !reference [.install-rocprim, script]
- cmake
-S $CI_PROJECT_DIR
-B $PACKAGE_DIR
-G Ninja
-D CMAKE_BUILD_TYPE=release
-D CMAKE_BUILD_TYPE=Release
-D CMAKE_CXX_COMPILER=hipcc
- cd $PACKAGE_DIR
- cpack
@@ -165,39 +139,48 @@ test:package:
- build:package
extends:
- .cmake-minimum
- .rules:test
tags:
- rocm-build
script:
- !reference [.install-rocprim, script]
- $SUDO_CMD dpkg -i $PACKAGE_DIR/rocthrust*.deb
# Test install
- cmake
-S $CI_PROJECT_DIR/extra
-B package_test
-B $CI_PROJECT_DIR/package_test
-G Ninja
-D CMAKE_CXX_COMPILER=hipcc
-D CMAKE_BUILD_TYPE=Release
-D ROCPRIM_ROOT=/opt/rocm/rocprim
- cmake
--build package_test
- cmake --build $CI_PROJECT_DIR/package_test
# Remove rocPRIM and rocThrust
- $SUDO_CMD dpkg -r rocthrust-dev
- $SUDO_CMD dpkg -r rocprim-dev

test:doc:
extends: .deps:doc
image: sphinxdoc/sphinx-latexpdf
needs: []
stage: test
extends:
- .rules:test
before_script:
- $SUDO_CMD apt update -qq
- $SUDO_CMD apt install -y -qq doxygen
- $SUDO_CMD pip3 install sphinx_rtd_theme breathe exhale
script:
- bash -x ./docs/run_doc.sh
- bash -x $CI_PROJECT_DIR/docs/run_doc.sh

test:
stage: test
extends:
- .cmake-latest
- .cmake-minimum
- .rules:test
- .gpus:rocm
needs:
- build:cmake-latest
- build:cmake-minimum
script:
- cd $BUILD_DIR
- cd $CI_PROJECT_DIR/build
- cmake
-D CMAKE_PREFIX_PATH=/opt/rocm
-P $CI_PROJECT_DIR/cmake/GenerateResourceSpec.cmake
10 changes: 9 additions & 1 deletion CHANGELOG.md
@@ -2,13 +2,21 @@

Full documentation for rocThrust is available at [https://rocthrust.readthedocs.io/en/latest/](https://rocthrust.readthedocs.io/en/latest/)

## (Unreleased) rocThrust 2.17.0 for ROCm 5.4
### Added
- Updated to match upstream Thrust 1.17.0

## (Unreleased) rocThrust 2.16.0 for ROCm 5.3
### Added
- Updated to match upstream Thrust 1.16.0
### Changed
- rocThrust functionality dependent on device malloc works is functional as ROCm 5.2 reneabled device malloc. Device launched `thrust::sort` and `thrust::sort_by_key` are available for use.

## rocThrust 2.15.0 for ROCm 5.2
### Added
- Packages for tests and benchmark executable on all supported OSes using CPack.
### Known issues
- async_copy, partition, and stable_sort_by_key unit tests are failing on HIP on Windows.

## rocThrust 2.14.0 for ROCm 5.1
### Added
2 changes: 1 addition & 1 deletion dependencies/cub
Submodule cub updated 52 files
+54 −0 CHANGELOG.md
+2 −2 CONTRIBUTING.md
+1 −0 README.md
+0 −10 cmake/CubBuildCompilerTargets.cmake
+2 −1 cmake/CubInstallRules.cmake
+0 −195 common.mk
+6 −5 cub/agent/agent_adjacent_difference.cuh
+2 −2 cub/agent/agent_radix_sort_downsweep.cuh
+6 −5 cub/agent/agent_radix_sort_histogram.cuh
+4 −6 cub/agent/agent_radix_sort_upsweep.cuh
+14 −11 cub/agent/agent_scan_by_key.cuh
+15 −19 cub/agent/agent_select_if.cuh
+28 −60 cub/block/block_adjacent_difference.cuh
+29 −60 cub/block/block_discontinuity.cuh
+1 −1 cub/block/block_shuffle.cuh
+11 −1 cub/cmake/cub-header-search.cmake.in
+62 −0 cub/detail/choose_offset.cuh
+78 −90 cub/device/device_adjacent_difference.cuh
+18 −0 cub/device/device_partition.cuh
+3 −23 cub/device/device_radix_sort.cuh
+4 −4 cub/device/device_scan.cuh
+13 −26 cub/device/dispatch/dispatch_adjacent_difference.cuh
+11 −6 cub/device/dispatch/dispatch_radix_sort.cuh
+274 −112 cub/device/dispatch/dispatch_segmented_sort.cuh
+3 −5 cub/device/dispatch/dispatch_select_if.cuh
+1 −1 cub/device/dispatch/dispatch_three_way_partition.cuh
+1 −1 cub/device/dispatch/dispatch_unique_by_key.cuh
+2 −1 cub/grid/grid_even_share.cuh
+3 −0 cub/iterator/tex_obj_input_iterator.cuh
+7 −335 cub/iterator/tex_ref_input_iterator.cuh
+15 −0 cub/thread/thread_operators.cuh
+1 −1 cub/util_device.cuh
+2 −2 cub/version.cuh
+1 −1 cub/warp/warp_load.cuh
+0 −1 experimental/.gitignore
+0 −1,072 experimental/defunct/example_coo_spmv.cu
+0 −2,134 experimental/defunct/test_device_seg_reduce.cu
+0 −106 experimental/histogram/histogram_cub.h
+0 −185 experimental/histogram/histogram_gmem_atomics.h
+0 −195 experimental/histogram/histogram_smem_atomics.h
+0 −635 experimental/histogram_compare.cu
+0 −1,244 experimental/sparse_matrix.h
+0 −917 experimental/spmv_compare.cu
+0 −30 experimental/spmv_script.sh
+56 −36 test/test_device_adjacent_difference.cu
+150 −39 test/test_device_radix_sort.cu
+115 −0 test/test_device_select_if.cu
+57 −0 test/test_device_select_unique.cu
+1 −80 test/test_iterator.cu
+306 −0 test/test_iterator_deprecated.cu
+0 −1 tune/.gitignore
+0 −763 tune/tune_device_reduce.cu