-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preliminary fix for Base.unsafe_wrap #583
base: master
Are you sure you want to change the base?
Conversation
@pxl-th can you please review? |
Can you show an example of when it allocated? Wrapping host arrayjulia> x = zeros(Float32, 4, 4);
julia> xd = unsafe_wrap(ROCArray, pointer(x), size(x));
:3:hip_context.cpp :152 : 2775812923 us: [pid:13824 tid:0x7fa8c6f40d00] hipCtxCreate ( 0x7ffdc05f8250, 0, 0 )
:3:hip_context.cpp :164 : 2775812936 us: [pid:13824 tid:0x7fa8c6f40d00] hipCtxCreate: Returned hipSuccess :
:3:hip_device_runtime.cpp :652 : 2775812941 us: [pid:13824 tid:0x7fa8c6f40d00] hipSetDevice ( 0 )
:3:hip_device_runtime.cpp :656 : 2775812945 us: [pid:13824 tid:0x7fa8c6f40d00] hipSetDevice: Returned hipSuccess :
:3:hip_device_runtime.cpp :637 : 2775812955 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDeviceCount ( 0x7ffdc05f84e8 )
:3:hip_device_runtime.cpp :639 : 2775812959 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDeviceCount: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2775812963 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f83f8 )
:3:hip_device_runtime.cpp :630 : 2775812966 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_stream.cpp :375 : 2775812975 us: [pid:13824 tid:0x7fa8c6f40d00] hipStreamCreateWithPriority ( 0x7fa8bc099620, 0, 0 )
:3:rocdevice.cpp :2768: 2775812984 us: [pid:13824 tid:0x7fa8c6f40d00] number of allocated hardware queues with low priority: 0, with normal priority: 0, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp :2846: 2775825071 us: [pid:13824 tid:0x7fa8c6f40d00] created hardware queue 0x7fa89c00a000 with size 16384 with priority 1, cooperative: 0
:3:rocdevice.cpp :2938: 2775825081 us: [pid:13824 tid:0x7fa8c6f40d00] acquireQueue refCount: 0x7fa89c00a000 (1)
:4:rocdevice.cpp :2099: 2775825338 us: [pid:13824 tid:0x7fa8c6f40d00] Allocate hsa host memory 0x7fa6e0600000, size 0x100000
:3:devprogram.cpp :2686: 2775957593 us: [pid:13824 tid:0x7fa8c6f40d00] Using Code Object V5.
:3:hip_stream.cpp :390 : 2775958610 us: [pid:13824 tid:0x7fa8c6f40d00] hipStreamCreateWithPriority: Returned hipSuccess : stream:0xb7eb70
:3:hip_device_runtime.cpp :622 : 2775958618 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f8518 )
:3:hip_device_runtime.cpp :630 : 2775958622 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2775958632 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f8538 )
:3:hip_device_runtime.cpp :630 : 2775958635 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_memory.cpp :1233: 2775958655 us: [pid:13824 tid:0x7fa8c6f40d00] hipHostRegister ( 0x7fa8bcd1bf40, 64, 2 )
:4:rocmemory.cpp :967 : 2775958684 us: [pid:13824 tid:0x7fa8c6f40d00] Locking to pool 0x7802e0, size 0x40, HostPtr = 0x7fa8bcd1bf40, DevPtr = 0x7fa8bcd1bf40
:3:hip_memory.cpp :1235: 2775958688 us: [pid:13824 tid:0x7fa8c6f40d00] hipHostRegister: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2775958692 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f84d8 )
:3:hip_device_runtime.cpp :630 : 2775958694 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_memory.cpp :3350: 2775958702 us: [pid:13824 tid:0x7fa8c6f40d00] hipHostGetDevicePointer ( 0x7fa8bc099630, 0x7fa8bcd1bf40, 0 )
:3:hip_memory.cpp :3364: 2775958706 us: [pid:13824 tid:0x7fa8c6f40d00] hipHostGetDevicePointer: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2775963126 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f8e38 )
:3:hip_device_runtime.cpp :630 : 2775963133 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_device_runtime.cpp :607 : 2775963139 us: [pid:13824 tid:0x7fa8c6f40d00] hipDeviceSynchronize ( )
:3:hip_device_runtime.cpp :610 : 2775963143 us: [pid:13824 tid:0x7fa8c6f40d00] hipDeviceSynchronize: Returned hipSuccess : Wrapping device arrayjulia> x = ROCArray{Float32}(undef, 4, 4);
:3:hip_device_runtime.cpp :622 : 2870790570 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f8438 )
:3:hip_device_runtime.cpp :630 : 2870790582 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_mempool.cpp :59 : 2870790591 us: [pid:13824 tid:0x7fa8c6f40d00] hipDeviceGetMemPool ( 0x7fa8be5ed090, 0 )
:3:hip_mempool.cpp :64 : 2870790594 us: [pid:13824 tid:0x7fa8c6f40d00] hipDeviceGetMemPool: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2870790686 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f8058 )
:3:hip_device_runtime.cpp :630 : 2870790690 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_mempool.cpp :69 : 2870790699 us: [pid:13824 tid:0x7fa8c6f40d00] hipMallocAsync ( 0x7fa8be5ed110, 64, stream:0xb7eb70 )
:4:rocdevice.cpp :2227: 2870790782 us: [pid:13824 tid:0x7fa8c6f40d00] Allocate hsa device memory 0x7fa6e0200000, size 0x40
:3:rocdevice.cpp :2266: 2870790786 us: [pid:13824 tid:0x7fa8c6f40d00] device=0xb19f90, freeMem_ = 0x5feffffc0
:3:hip_mempool_impl.cpp :208 : 2870790791 us: [pid:13824 tid:0x7fa8c6f40d00] Pool AllocMem: 0x7fa6e0200000, 0x10b7250
:3:hip_mempool.cpp :88 : 2870790795 us: [pid:13824 tid:0x7fa8c6f40d00] hipMallocAsync: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2870794261 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f8e38 )
:3:hip_device_runtime.cpp :630 : 2870794267 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_device_runtime.cpp :607 : 2870794271 us: [pid:13824 tid:0x7fa8c6f40d00] hipDeviceSynchronize ( )
:3:hip_device_runtime.cpp :610 : 2870794275 us: [pid:13824 tid:0x7fa8c6f40d00] hipDeviceSynchronize: Returned hipSuccess :
julia> xd = unsafe_wrap(ROCArray, pointer(x), size(x); lock=false);
:3:hip_device_runtime.cpp :622 : 2875471365 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice ( 0x7ffdc05f8e38 )
:3:hip_device_runtime.cpp :630 : 2875471383 us: [pid:13824 tid:0x7fa8c6f40d00] hipGetDevice: Returned hipSuccess :
:3:hip_device_runtime.cpp :607 : 2875471392 us: [pid:13824 tid:0x7fa8c6f40d00] hipDeviceSynchronize ( )
:3:hip_device_runtime.cpp :610 : 2875471399 us: [pid:13824 tid:0x7fa8c6f40d00] hipDeviceSynchronize: Returned hipSuccess : You can see that in both cases logs do not display any allocation. Also the tests needs to be updated, they are failing right now.
Ideally yes, but I don't have much time at the moment to work on that. |
@pxl-th even if it doesn't allocate, it still doesn't control the ownership of the buffer. The Base definition has a keyword argument I might be able to contribute managed buffers to AMDGPU.jl if no one is doing it. |
I don't have any objections regarding these changes (as long as they work). FYI, current implementation also does not take the ownership, when creating @luraess this might be breaking for you. |
@pxl-th you are correct, I edited my last comment to reflect this. At its current state however, I will update the tests and let CI run this. In terms of implementing the managed buffer, is there anything on the LLVM side I should fix (e.g. address space)? Also do you know what the exact difference between managed buffers and host-accessible buffers in HIP/HSA? |
The one place we are currently using Would be good to sync on that. |
This PR should fix
Base.unsafe_wrap
, which allocated new memory instead of wrapping over the passed pointer.