configuration and hashrate problem #1739

Open
iskyd opened this issue Jul 23, 2018 · 10 comments

iskyd commented Jul 23, 2018

Hi, I've got two different servers: one with a single-core 3.5 GHz CPU and the other with a dual-core 3.5 GHz CPU.
The dual-core processor is an Intel Core Processor (Haswell, no TSX). It doesn't have L3 cache.
The single-core processor is an Intel Core Processor (Broadwell). It doesn't have L3 cache.
I simply expected the dual-core server to mine twice as fast as the single-core one.
This is my CPU configuration on the single-core server:

"cpu_threads_conf" :
[
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 0 },
],

And this is the dual-core configuration:

"cpu_threads_conf" :
[
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 0 },
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 1 }
],

What I notice from my pool is that over the past 48 hours the single-core server has averaged a hashrate of 78, while the dual-core server has averaged 92.

Why does this happen?
Am I missing something?
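
To put a number on the gap described above, here is a minimal sketch. It assumes the pool figures are in H/s and that one core of each machine would hash at the same rate, which is only approximate since the two servers are different CPU generations. The function name is hypothetical.

```python
def scaling_efficiency(multi_rate, single_rate, threads):
    """Fraction of ideal linear scaling achieved by `threads` threads."""
    return multi_rate / (single_rate * threads)

single = 78.0  # pool average for the single-core server (from the post)
dual = 92.0    # pool average for the dual-core server (from the post)

print(f"per-thread rate on dual-core: {dual / 2:.1f} H/s")
print(f"scaling efficiency: {scaling_efficiency(dual, single, 2):.0%}")
```

A value of 100% would mean the second core doubled the hashrate; anything well below that suggests the two threads are competing for a shared resource.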

iskyd changed the title from "mining configuration and hashrate" to "configuration and hashrate" Jul 23, 2018
iskyd changed the title from "configuration and hashrate" to "configuration and hashrate problem" Jul 23, 2018
Contributor

Spudz76 commented Jul 23, 2018

What coin? What actual CPU models? What OS?
How is there no L3? Those rates seem too high for literally none.

If you don't have 2048K of L3 per core, then adding the second core will only cause contention for the limited cache space and slow both threads down. It's better to run a single thread and try to get the CPU to turbo (it won't while all cores are running).
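
As a rough illustration of the 2048K-per-thread rule of thumb above, the sketch below estimates how many mining threads fit in a given cache. The function name is hypothetical, and in a KVM guest the cache size reported to the guest may not reflect what the host actually provides, so treat the result as an upper bound at best.

```python
SCRATCHPAD_KB = 2048  # per-thread working set quoted in the comment above

def max_threads(cache_kb, scratchpad_kb=SCRATCHPAD_KB):
    """Number of threads whose working sets fit in cache_kb at once."""
    return cache_kb // scratchpad_kb

for cache_kb in (2048, 4096, 8192):
    print(cache_kb, "KB ->", max_threads(cache_kb), "thread(s)")
```

If the cache is shared between cores, adding a thread beyond this count means the working sets start evicting each other, which is the contention described above.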

Also, if it's an HT virtual core ("1x core, 2x threads") and not actually two cores, then running on the HT core will drag down the real one, as they share and contend for even more than caches.
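
One way to check for HT virtual cores is to compare the `siblings` and `cpu cores` fields of `/proc/cpuinfo` on x86 Linux. The following is a minimal sketch with hypothetical helper names; the sample text is trimmed from the dual-core output posted later in this thread, and on a real machine you would pass in `open("/proc/cpuinfo").read()` instead.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict per logical processor."""
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            cpus.append(fields)
    return cpus

def has_hyperthreading(cpus):
    """True if any package reports more siblings than physical cores."""
    return any(
        int(c.get("siblings", 1)) > int(c.get("cpu cores", 1)) for c in cpus
    )

# Trimmed sample: two sockets, one core each, one thread per core.
SAMPLE = """\
processor\t: 0
physical id\t: 0
siblings\t: 1
cpu cores\t: 1

processor\t: 1
physical id\t: 1
siblings\t: 1
cpu cores\t: 1
"""

cpus = parse_cpuinfo(SAMPLE)
print(len(cpus), "logical CPUs, hyperthreading:", has_hyperthreading(cpus))
```

Inside a virtual machine these fields describe the virtual topology, so they cannot tell you whether the hypervisor has mapped two vCPUs onto HT siblings of one physical core.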

Different core families mine better or worse, but you may benefit from compiling locally on each rig, so the compiler detects all CPU features and builds optimal code. The release is built for just SSE3 and AES-NI and may not take advantage of other features (that's up to the compiler).

Speaking of compilers, on Linux I have had better performance from clang-3.8 than from any gcc.

Author

iskyd commented Jul 24, 2018

Hi, I'm mining Monero.
Both servers are running Ubuntu 16.04.
These are the single-core CPU details:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 61
Model name:            Intel Core Processor (Broadwell)
Stepping:              2
CPU MHz:               3504.000
BogoMIPS:              7008.00
Virtualization:        VT-x
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0

These are the dual core cpu details:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             2
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Model name:            Intel Core Processor (Haswell, no TSX)
Stepping:              1
CPU MHz:               3504.000
BogoMIPS:              7008.00
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0,1

Are you suggesting that I run just a single entry in "cpu_threads_conf" instead of one per core? What about affine_to_cpu?

Contributor

Spudz76 commented Jul 24, 2018

Eww, lscpu output sucks; please post the output of cat /proc/cpuinfo instead.

Contributor

Spudz76 commented Jul 24, 2018

apt install clang-6.0
and then
CC=/usr/bin/clang-6.0 CXX=/usr/bin/clang++-6.0 cmake -DCMAKE_BUILD_TYPE=Release -DCUDA_ENABLE=OFF -DOpenCL_ENABLE=OFF ..

That will gain some hashrate by itself. Your hashrates seem about right for low-power CPUs.

Author

iskyd commented Jul 24, 2018

cat /proc/cpuinfo

dual-core

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel Core Processor (Haswell, no TSX)
stepping	: 1
microcode	: 0x1
cpu MHz		: 3504.000
cache size	: 4096 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single retpoline kaiser fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat
bugs		: cpu_meltdown spectre_v1 spectre_v2
bogomips	: 7008.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel Core Processor (Haswell, no TSX)
stepping	: 1
microcode	: 0x1
cpu MHz		: 3504.000
cache size	: 4096 KB
physical id	: 1
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single retpoline kaiser fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat
bugs		: cpu_meltdown spectre_v1 spectre_v2
bogomips	: 7008.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

single core

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel Core Processor (Broadwell)
stepping	: 2
microcode	: 0x1
cpu MHz		: 3504.000
cache size	: 4096 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single retpoline kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap xsaveopt arat
bugs		: cpu_meltdown spectre_v1 spectre_v2
bogomips	: 7008.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

Can you explain a little more about clang++-6?
Do I need to recompile xmr-stak?

Thanks.

Contributor

Spudz76 commented Jul 24, 2018

Also disable the iGPU and KVM/VT-d in the BIOS.

Intel iGPUs tend to share cache and get in the way of actual processing. KVM/VT-d are pointless as well unless you are actually running qemu or something.

Contributor

Spudz76 commented Jul 24, 2018

Yes, since clang 6.0 generates better CPU code. Release binaries are meant to work well on many systems, but not optimally on every system ("generic", with only SSE3 and AES-NI enabled). Compiling locally also detects your rig's capabilities and generates code specifically for them.

Otherwise your hashrates are correct. You can get a few more, though, by using a better compiler and compiling locally for each CPU type.

I am working with others on PR #1604, which helps Haswell/Broadwell cores beyond what is possible with the current code. They benefit from piling on tons of work (10, 20, or 100 or more) versus a single thread per core, as apparently it forces more of the internal "guesses" to be correct for the workload. Chopping the work into single threads that each run one chunk seems to confuse these cores and make them slower: internal optimizations like SmartCache and various prefetching can't see far enough into the future, so they don't work as well as they would with a larger roadmap of what's coming down the pipeline. That large stack of work hurts performance on other core types, so it's not the default yet.

You could try low_power_mode:5 and no_prefetch:false on them and see if it helps at all. I am working on a 10-way patch that works with the current 2.4.7 code; you could compile from my fork dev-hax, which has it. Then run with low_power_mode:10 and no_prefetch:true.
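
For reference, the settings suggested above can be written out as a "cpu_threads_conf" block in the same layout as the configs in the original post. This is a minimal sketch with a hypothetical helper name, not part of xmr-stak; the trailing comma after the last entry mirrors the single-core config in the original post.

```python
def cpu_threads_conf(cores, low_power_mode=False, no_prefetch=True):
    """Render a cpu_threads_conf block with one thread pinned per core."""
    def lit(value):
        # Booleans are written as true/false; integers are written as-is.
        return str(value).lower() if isinstance(value, bool) else str(value)

    rows = [
        f'{{ "low_power_mode" : {lit(low_power_mode)}, '
        f'"no_prefetch" : {lit(no_prefetch)}, "affine_to_cpu" : {core} }},'
        for core in cores
    ]
    return '"cpu_threads_conf" :\n[\n' + "\n".join(rows) + "\n],"

# The first suggestion from the comment above, for the dual-core server.
print(cpu_threads_conf([0, 1], low_power_mode=5, no_prefetch=False))
```

Passing `low_power_mode=10, no_prefetch=True` would render the second suggestion, which according to the comment only applies when running a build from the fork.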

Contributor

Spudz76 commented Jul 24, 2018

Also, pool hashrate is not always meaningful. Your slowest rig could go on a huge luck run and have an apparent hashrate much higher than the others for an hour.

Better to refer to the real rates reported by the miner when you press H.
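
To see how much luck can move a pool-side figure, here is a minimal sketch. It assumes the pool estimates hashrate from the number of accepted shares in a time window and models share arrivals as a Poisson process. The share difficulty of 5000 is an assumption for illustration only (pools vary), and the function names are hypothetical.

```python
import math

def expected_shares(hashrate, share_difficulty, window_s):
    """Mean number of shares found in window_s seconds."""
    return hashrate * window_s / share_difficulty

def relative_noise(hashrate, share_difficulty, window_s):
    """Relative standard deviation of a share-count-based hashrate estimate.

    For a Poisson count with mean m the standard deviation is sqrt(m),
    so the estimate's relative spread is 1 / sqrt(m).
    """
    return 1.0 / math.sqrt(expected_shares(hashrate, share_difficulty, window_s))

RATE = 78.0        # H/s, from the original post
DIFFICULTY = 5000  # assumed share difficulty; pools vary

for label, seconds in (("1 hour", 3600), ("48 hours", 48 * 3600)):
    noise = relative_noise(RATE, DIFFICULTY, seconds)
    print(f"{label}: about +/- {noise:.0%} (one standard deviation)")
```

Under these assumptions the spread shrinks with the square root of the window length, so short windows are far noisier than multi-day averages. The miner's own report does not depend on share luck at all.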


minzak commented Jul 29, 2018

@Spudz76 if I enable KVM/VT-d in the BIOS but don't run any virtualization software, is that fine?
Or should KVM/VT-d be disabled in the BIOS in any case for best performance?
Thanks.

Contributor

Spudz76 commented Jul 29, 2018

I just shut off every technology that I am not using.
I do not know if it makes a difference; it's just best practice.
Fewer things for the computer to pay attention to and bug out on.
Sound off, driver blocked; HDMI sound off, drivers blocked; etc.

I do run a couple of CPU miners on the Proxmox hypervisor OS with a VM running as well. It works fine, and is probably the same. But one extra hash a second, when you mine for months (three million seconds), is three million more hashes done. So I'd rather just turn it off and believe in invisibly higher average rates, even when I can't see a 10H difference.
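
The arithmetic in the comment above, written out as a small sketch (the function name is hypothetical):

```python
SECONDS_PER_DAY = 86_400

def extra_hashes(rate_gain_hps, seconds):
    """Additional hashes computed from a sustained rate gain."""
    return rate_gain_hps * seconds

seconds = 3_000_000
print(f"{seconds / SECONDS_PER_DAY:.1f} days")
print(f"{extra_hashes(1, seconds):,} extra hashes at +1 H/s")
```

Three million seconds works out to a little over a month of continuous mining.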


3 participants