Crashes ccminer with more than 10 GPUs #55
Comments
if running 2 instances works, I guess it is the way to go... |
This is strange. At first it looks like a page file size issue but if the page file is the entire 120 GB SSD
It isn't the total number of GPUs because they all work when split 7 + 7.
There doesn't seem to be an internal limitation in ccminer because it can start up to 10 GPUs
A resource issue seems unlikely because running 2 instances uses more overhead because of
I'm suspecting a timing problem with 2 components: the number of GPUs starting up at once
Some test observations:
Starting 10 with 0 running works.
The last test is interesting. The problem isn't the total number of GPUs, nor the number
It must be something happening during the startup. This is where the page file size would
The page file size issue, as I understand it, is essentially a race condition among the GPUs
The symptoms here are similar except that available VM exceeds the total mem of all the GPUs
As a trial and error exercise it might be a good idea to slow down the thread creation at startup.
I would also suggest some more testing for consistency and more focus on the tipping point. Hopefully a consistent pattern will emerge. |
Ok, I'll do the testing and report later |
I'm hoping it's just a timing issue and that a little pause between creating mining threads would
Otherwise if the tests are consistent it may point to an area for further investigation.
I hope DJM34 doesn't see this as interference, I'm just trying to help. |
Maybe it is something to try. (I can't test it myself as I don't have such a rig.) |
Something I've noticed, not specific to this issue, but memory in general is ccminer uses 45G VM |
ccminer requires around 4.5-4.7 GB of virtual memory per GPU (this is the way CUDA allocates VRAM). Make sure you are really using the latest release; on Linux, clone the master |
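The per-GPU figure above can be turned into a quick estimate. The following is only a minimal sketch: the function names are hypothetical, and it assumes the usable commit limit is roughly physical RAM plus page file, ignoring whatever the OS and other processes need.

```c
/* Virtual memory (GB) that `gpus` GPUs would reserve at `per_gpu_gb` each.
 * The 4.5-4.7 GB per-GPU range is the figure quoted in this thread. */
static double vm_needed_gb(int gpus, double per_gpu_gb)
{
    return gpus * per_gpu_gb;
}

/* Rough commit limit (GB): physical RAM plus page file.
 * Assumption: OS and other process usage is ignored. */
static double commit_limit_gb(double ram_gb, double pagefile_gb)
{
    return ram_gb + pagefile_gb;
}
```

For 10 GPUs this gives about 45 to 47 GB, and for 14 GPUs about 63 to 65.8 GB. The rig in this issue (16 GB RAM plus a 120 GB page file disk) would have a limit of roughly 136 GB under this simple model, which is why a plain shortage of virtual memory looks unlikely.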
From what I can tell v1.3.1 has the latest updates. I recloned anyway but it made no difference.
Here's a snapshot from top:
top - 14:06:22 up 20 days, 1:41, 1 user, load average: 0.09, 0.08, 0.07
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
I don't know how to obtain the same info on Windows, I checked task manager and
It might be worth noting the usage varies by around 4G, from less than 45G to more than 49G,
This doesn't seem like the leak you described, VM jumps to either 45 or 49G, and toggles
If you can't reproduce it could be cuda or driver version, or OS related. |
My analysis has been incorrect. Tpruvot also shows very high VM usage. Actually it shows
I have no idea what it means except that it isn't unique to your fork or to MTP. |
this is late but @JayDDee i think you were on to something with this. "As a trial and error exercise it might be a good idea to slow down the thread creation at startup." |
I will look into something like this; I did indeed see something a little weird in the number of threads that were created. (Sorry for the delay in replying, I don't often read emails sent through GitHub; 3/4 are for other zcoin projects I don't participate in.) |
all good, I'm bringing this thread back from the dead lol, wasn't exactly expecting a next day answer :P |
I'll be implementing staged thread startup in cpuminer-opt, a 10 ms, usleep( 10000 ), pause
The issues are different with a CPU miner but the theory to smooth out sudden demand increases
It's one of those things that are worth doing just for the sake of it. |
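For illustration, staged thread startup might look roughly like the sketch below. This is not the actual cpuminer-opt or ccminer code: `start_threads_staged` and `mark_started` are hypothetical names, and the sketch assumes POSIX threads with `usleep()` available (so it would not build as-is on Windows).

```c
#define _DEFAULT_SOURCE   /* exposes usleep() in glibc under strict -std modes */
#include <pthread.h>
#include <unistd.h>

/* Hypothetical helper: start n threads, pausing delay_us microseconds
 * between consecutive pthread_create() calls so resource demand ramps up
 * gradually instead of all at once. Returns the number of threads started;
 * it stops at the first creation failure. */
static int start_threads_staged(pthread_t *threads, int n,
                                void *(*worker)(void *), void *args[],
                                unsigned int delay_us)
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[i], NULL, worker,
                           args ? args[i] : NULL) != 0)
            break;
        started++;
        /* No pause needed after the last thread. */
        if (i + 1 < n && delay_us > 0)
            usleep(delay_us);   /* e.g. 10000 us = 10 ms, as suggested above */
    }
    return started;
}

/* Demo worker (hypothetical): each thread only records that it ran. */
static void *mark_started(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}
```

A caller would pass one argument pointer per thread, call `start_threads_staged(threads, n, mark_started, args, 10000)`, and then join the first `started` threads. A real miner would replace the demo worker with its per-device mining thread; whether a 10 ms pause is enough to avoid the allocation error reported here is untested.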
I apologize for the delay in responding. In addition to this, the problem of 100% CPU utilization by the ccminer process has returned on the latest NVIDIA drivers (both Game Ready and Studio). I now use T-Rex for production; it can run with all 14 GPUs for a month or maybe longer (I reboot the system every month to apply updates) without any errors and without overloading the CPU. But not in solo mode... |
Actually, it seems I have found a way to reduce CPU usage. I will try to commit the change soon. First, in the command line use "--cpu-affinity 1" (or 2 or 4) to force the GPU thread to run on one CPU thread. It should work like this, but with some instabilities in GPU usage. Let me know if at least the CPU affinity has an effect for you. |
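The suggested values 1, 2 and 4 read like a bitmask in which bit N selects logical CPU N, although the comment does not spell that out. Under that assumption, a minimal sketch for decoding such a mask (the helper name is hypothetical and is not part of ccminer):

```c
#include <stddef.h>

/* Hypothetical helper: decode an affinity bitmask into CPU indices.
 * Assumption: bit N of the mask selects logical CPU N.
 * Writes up to max_cpus indices into cpus[] and returns how many were written. */
static size_t affinity_mask_to_cpus(unsigned long mask, int cpus[], size_t max_cpus)
{
    size_t count = 0;
    for (int bit = 0; mask != 0 && count < max_cpus; bit++, mask >>= 1) {
        if (mask & 1UL)
            cpus[count++] = bit;
    }
    return count;
}
```

With this reading, a mask of 1 selects CPU 0, 2 selects CPU 1, and 4 selects CPU 2, while 3 would select CPUs 0 and 1 together.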
I have a rig with 14 GPUs, but ccminer can't start with more than 10 GPUs. If ccminer is run with 11 or more GPUs, it stops with a CUDA memory allocation error. Only splitting the GPUs between 2 ccminer processes (7 GPUs per instance) allows all GPUs to work.
Tested combinations:
CryptoDredge and T-Rex work fine with all 14 GPUs, but don't support solo mode.
Rig HW:
ASUS B250 MINING EXPERT MB, Intel Core i5-7500 CPU, 16 GB (2x8 GB) 2400 MHz DDR4 memory, 1x1080Ti + 13x1080 GPUs, 120 GB SSD system disk + separate 120 GB SSD for the page file.