Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to the code base from the Patatrack CMSSW release #120

Conversation

fwyzard
Copy link
Contributor

@fwyzard fwyzard commented Nov 5, 2020

Highlights:

  • remove the dependency on the CUB external
  • reduce caching allocator memory usage
  • update CUDA to version 11.0 and C++17
  • update the code base from CMSSW 11.2.x
  • import more tests from the cuda directory

@fwyzard fwyzard force-pushed the cudadev-update-CMSSW_11_2_0_pre8_Patatrack branch from 56003a9 to 7de7cee Compare November 5, 2020 14:57
@fwyzard fwyzard changed the title [CUDADEV] Update to the code base from the Patatrack CMSSW release Update to the code base from the Patatrack CMSSW release Nov 5, 2020
@fwyzard fwyzard force-pushed the cudadev-update-CMSSW_11_2_0_pre8_Patatrack branch from 7de7cee to 5b720d5 Compare November 5, 2020 16:51
Copy link
Collaborator

@makortel makortel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also update the README in https://github.com/cms-patatrack/pixeltrack-standalone#cudadev to point to CMSSW_11_2_0_pre8_Patatrack (which I presume to be the version where you picked the code based on the branch name)

@@ -38,15 +42,13 @@ export CUDA_LDFLAGS := -L$(CUDA_BASE)/lib64 -lcudart -lcudadevrt
export CUDA_NVCC := $(CUDA_BASE)/bin/nvcc
define CUFLAGS_template
$(2)NVCC_FLAGS := $$(foreach ARCH,$(1),-gencode arch=compute_$$(ARCH),code=sm_$$(ARCH)) -Wno-deprecated-gpu-targets -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored --expt-relaxed-constexpr --expt-extended-lambda --generate-line-info --source-in-ptx --cudart=shared
$(2)NVCC_COMMON := -std=c++14 -O3 $$($(2)NVCC_FLAGS) -ccbin $(CXX) --compiler-options '$(HOST_CXXFLAGS) $(USER_CXXFLAGS)'
$(2)NVCC_COMMON := -std=c++17 -O3 $$($(2)NVCC_FLAGS) -ccbin $(CXX) --compiler-options '$(HOST_CXXFLAGS) $(USER_CXXFLAGS)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would bump the minimum CUDA requirement to 11, right? (ok, I see you updated the README). I need to understand the impact on Kokkos.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that Kokkos 3.1.01 works fine with the combination of CUDA 11 and -std=c++17 here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which leaves the (builds of) cuda and cudauvm programs to break until they have been migrated to the changes. Maybe we can live a short period of time with them broken.

src/cuda/CUDACore/deviceAllocatorStatus.cc Outdated Show resolved Hide resolved
src/cuda/CUDACore/deviceAllocatorStatus.h Outdated Show resolved Hide resolved
src/cudadev/CUDACore/deviceAllocatorStatus.cc Show resolved Hide resolved
src/cudadev/CUDACore/deviceAllocatorStatus.h Show resolved Hide resolved
@@ -135,6 +138,8 @@ void go() {
}

int main() {
cms::cudatest::requireDevices();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this test require devices? It is supposed to run on CPU.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.
It's like this in CMSSW, but I agree the requirement should be removed.

@@ -0,0 +1,56 @@
#include "CUDADataFormats/TrackingRecHit2DCUDA.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future note: this file is from cuda instead of CMSSW

@@ -0,0 +1,75 @@
#include "CUDADataFormats/TrajectoryStateSoA.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future note: this file is from cuda instead of CMSSW.

#include <iostream>
#include <sstream>

#include "plugin-Validation/SimpleAtomicHisto.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future reference, this is from cuda instead of CMSSW

@@ -0,0 +1,2 @@
#define USE_BL
#include "testRiemannFit.cc"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future reference: this file is from cuda instead of CMSSW

@fwyzard fwyzard force-pushed the cudadev-update-CMSSW_11_2_0_pre8_Patatrack branch from 5b720d5 to ccba578 Compare November 5, 2020 22:51
Highlights:
  - remove the dependency on the CUB external
  - reduce caching allocator memory usage
  - update CUDA to version 11.0 and C++17
  - update the code base from CMSSW 11.2.0-pre8
@fwyzard fwyzard force-pushed the cudadev-update-CMSSW_11_2_0_pre8_Patatrack branch from ccba578 to eda7d59 Compare November 5, 2020 23:07
@fwyzard
Copy link
Contributor Author

fwyzard commented Nov 5, 2020

I've tried to separate the files from cuda instead of from CMSSW to a second commit

@makortel makortel self-requested a review November 10, 2020 02:06
@makortel
Copy link
Collaborator

(I'm still running tests before hitting the merge button)

@makortel
Copy link
Collaborator

The alpaka and alpakatest fail to build with

.../pixeltrack-standalone/external/alpaka/include/alpaka/core/Concepts.hpp:81:33: error: '__T0' was not declared in this scope
                 static_assert(std::is_base_of<type, TDerived>::value, "The type implementing the concept has to be a publicly accessible base class!");
                                 ^~~~
.../pixeltrack-standalone/external/alpaka/include/alpaka/core/Concepts.hpp:81:33: note: suggested alternative: '__y0'
                 static_assert(std::is_base_of<type, TDerived>::value, "The type implementing the concept has to be a publicly accessible base class!");
                                 ^~~~
                                 __y0
.../pixeltrack-standalone/external/alpaka/include/alpaka/core/Concepts.hpp:81:47: error: template argument 1 is invalid
                 static_assert(std::is_base_of<type, TDerived>::value, "The type implementing the concept has to be a publicly accessible base class!");

Updating Alpaka to 0.5.0 changed the error to

.../pixeltrack-standalone/external/alpaka/include/alpaka/core/UniformCudaHip.hpp: In instantiation of 'void alpaka::uniform_cuda_hip::detail::rtCheckIgnore(const Error_t&, const char*, const char*, const int&, TErrors&& ...) [with TErrors = {cudaError}; alpaka
::uniform_cuda_hip::detail::Error_t = cudaError]':
.../pixeltrack-standalone/external/alpaka/include/alpaka/queue/cuda_hip/QueueUniformCudaHipRtBase.hpp:196:342:   required from here
.../pixeltrack-standalone/external/alpaka/include/alpaka/core/UniformCudaHip.hpp:86:61: error: missing braces around initializer for 'std::__array_traits<cudaError, 1>::_Type' {aka 'cudaError [1]'} [-Werror=missing-braces]
                     std::array<Error_t, sizeof...(ignoredErrorCodes)> const aIgnoredErrorCodes{ignoredErrorCodes...};
                                                             ^~~~~~~~~~~~~~~~~~

which can be worked around. (FYI @ghugo83)

I think this PR can move forward now, and then I/we'll address the issues to bring cuda, cudauvm, alpakatest, and alpaka back to life.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants