Hashing #41
Comments
My interpretation is that you assume that the hashcode of an object is derived from its value.
So, no, by definition, it is not "broken". Since you don't define what's better, or what's terrible, I have to disregard the issue as random ramblings of someone grumpy because the sun doesn't shine... If you tell me what you would consider "better", we may have a way to constructively discuss the issue.
Grumpy? Guilty as charged! But I was mostly going on this from
So I agree that for non-primitive types, it's unreasonable to assume anything very much about hashing.
I take your point about me not providing an alternative. So let me do that. It seems reasonable to me that SOM should define "sensible" hashes for primitive types (ints, doubles, strings); any other types (including instances) would come under the "no guarantees given about the hash at all" category.
Well, Though, what's a better hash for integer?
String uses, at least on CSOM, the following: SOM-java uses Java's String.hashCode(), which, if I read the documentation correctly, is where CSOM got its hash code from: https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#hashCode()
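For reference, a minimal sketch of the polynomial that the linked Java documentation describes for `String.hashCode()` (`s[0]*31^(n-1) + ... + s[n-1]`, with 32-bit overflow). It is written in Python purely for illustration; the function name `java_string_hash` is hypothetical, and this is not taken from the CSOM or SOM-java sources.

```python
def java_string_hash(s: str) -> int:
    """Mimic the documented java.lang.String.hashCode() polynomial."""
    h = 0
    # Java strings are sequences of UTF-16 code units, so hash those.
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        # Wrap around like a 32-bit Java int.
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed Java int.
    return h - 0x100000000 if h >= 0x80000000 else h


print(java_string_hash("abc"))  # 96354
```

Because the hash is computed from the characters alone, two equal strings always get the same hash, which is the property the thread is asking about for primitive types.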
Gosh, there are lots of integer hashing algorithms, and I just use other people's! https://en.wikipedia.org/wiki/Hash_function#Hashing_integer_data_types has some thoughts on this. I've found FNV to be decent for ints http://www.isthe.com/chongo/tech/comp/fnv/index.html but I don't know enough to claim that it's the absolute best.
Ok, I guess this means something like FNV-1a:
Where And 32 bit FNV_prime = 2^24 + 2^8 + 0x93 = 16777619
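A minimal sketch of 32-bit FNV-1a applied to integers, in Python for illustration only. The prime is the one quoted above; the offset basis `2166136261` comes from the FNV reference page rather than from this thread, and feeding the integer in as 8 little-endian bytes (the hypothetical `hash_int` helper) is an assumption, not something any SOM implementation is known to do.

```python
FNV_OFFSET_BASIS_32 = 2166136261  # 0x811C9DC5, from the FNV reference (not this thread)
FNV_PRIME_32 = 16777619           # 2^24 + 2^8 + 0x93


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the result in 32 bits
    return h


def hash_int(value: int) -> int:
    # Assumption: hash the integer as 8 little-endian two's-complement bytes.
    # Values outside the signed 64-bit range raise OverflowError.
    return fnv1a_32(value.to_bytes(8, "little", signed=True))


print(hex(fnv1a_32(b"a")))  # 0xe40c292c
```

Unlike `hashcode = (^self)`, nearby integers such as 1 and 2 end up with hashes that differ in many bits, which tends to spread keys more evenly across hash table buckets.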
Personally I don't think a language needs to define what the hashing algorithm is, other than guaranteeing when equality and hashing are expected to give compatible answers. FNV is just, AFAIK, a decent algorithm for small data like ints.
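The "compatible answers" guarantee can be stated as: whenever two values compare equal, their hashes must also be equal. A rough sketch of a checker for that property, in Python for illustration; `check_hash_contract` and its parameters are hypothetical names, not part of any SOM test suite.

```python
from itertools import combinations


def check_hash_contract(values, equals, hashcode):
    """Return the pairs that compare equal but hash differently.

    An empty result means no violation was found among the given values.
    """
    return [
        (a, b)
        for a, b in combinations(values, 2)
        if equals(a, b) and hashcode(a) != hashcode(b)
    ]


# Python's own hash agrees with ==, so 1 and 1.0 hash alike.
print(check_hash_contract([1, 1.0, 2, "x"], lambda a, b: a == b, hash))  # []
```

A hash that depends on something other than the value (for example the object's type or identity) would show up here as a non-empty list.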
Well, and then there's benchmarking. Do you want this to be another variable you have to account for? I am not saying this should be spec'ed and tested. Though, if you want this to be changed in the core-lib, we'll need implementations. And based on what I read quickly, and what you said, the above seems a likely candidate. On a slightly related note: SOM implementations are allowed to replace core-lib methods with VM-level substitutes. So, you can replace
I think this sort of non-determinism is endemic in modern software. For example, if you use an implementation-language-level hashmap (in Java or Rust or whatever) in your interpreter, you'll have this problem anyway. The only realistic solution we have to this at the moment, IMHO, is just to run things more often and get decent confidence intervals. I'd love to do better than that, but it's not easy.
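As a rough illustration of "run things more often and get decent confidence intervals", here is a sketch in Python that turns repeated benchmark timings into an interval for the mean. It uses a plain normal approximation, which is an assumption that gets shaky with few runs (a t-distribution or bootstrap would be more careful); `mean_confidence_interval` and the sample timings are made up for the example.

```python
import math
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Normal-approximation interval for the mean; z=1.96 is roughly 95%."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half_width, mean + half_width


# Hypothetical wall-clock times (ms) from five runs of the same benchmark.
low, high = mean_confidence_interval([10.0, 12.0, 11.0, 13.0, 9.0])
print(f"mean lies roughly in [{low:.2f}, {high:.2f}] ms")
```

If the intervals for two VM variants overlap heavily, a difference in hashing is unlikely to be distinguishable from run-to-run noise without more runs.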
I'm a bit surprised at that, as this afternoon I discovered (but haven't accounted for!) the fact that SOM programs can tell which methods are user-level or primitives ( |
This should be a representation of what the source code says.
In an ideal world, you can cheat, as long as you don't get caught. Elsewhere, don't remember where, there was a discussion on checking that the method in the core-lib is what one expects it to be.
Hashing in Java SOM seems to be broken at least for doubles:
prints different numbers when I run it:
i.e. this program prints:
I'm reasonably sure it's supposed to be all `true`s? I wondered if this is why `Integer.som` defines `hashcode = (^self)`? On a related note, `Integer`'s hash function is terrible and I'd much rather provide a better one: would it be better for important types such as this to have the VM provide the hash (since then the hard work of providing a decent hash is outsourced to someone else). Can we maybe get rid of the `hashcode = (^self)` from `Integer`?