-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
constructive collisions #6
Comments
Hi, thanks for your suggestion. The behavior of handling duplicate keys isn’t explicitly guaranteed by the standard for If you need custom duplicate handling, consider preprocessing your input data to remove or merge duplicates before constructing the table. Alternatively, you can insert elements one by one using |
Hi, I meant different keys that map to the same value, not the same keys. |
Hi, I’m not entirely sure I understand what you mean. If you’re suggesting that we allow different keys to map to the same slot (i.e., introduce collisions), that runs counter to the whole purpose of this project. The goal here is to create a perfect hash map, which by definition avoids collisions. If we allowed collisions, we’d have to handle them like a normal hash map, losing the advantages of a perfect hashing approach. Could you clarify your use case or the rationale behind this request? |
The key collisions would only be allowed if the stored value was the same, thus saving both time (as in reducing the need to rehash) and space while preserving the perfect hash (unless the stored value was changed later). |
Thanks for clarifying. If I understand your suggestion correctly, you’d like different keys that hash to the same slot—and have identical values—to share that single slot. While that might sound straightforward, it would actually require redesigning large parts of the data structure due to the following reasons: Storing Multiple KeysEven if two keys share the same value, we still need both keys available for lookups. Our current design stores exactly one key-value pair per slot, and changing that would mean creating an entirely new scheme that accommodates multiple keys for a single slot. Comparing ValuesTo confirm whether two values are the same, we’d need to compare them. That’s not part of our current API requirements, nor does the standard library mandate that all value types be comparable. Adding value-comparison logic would significantly alter our design. Cache Friendliness & Space UsageOur layout is optimized for storing one pair per slot in a compact array. If we allow an unknown number of keys in any single slot, it’s not obvious how to maintain both the perfect-hash efficiency and cache-friendliness. Because of these issues, it’s not a direction we plan to pursue. Our goal is to provide a perfect hash solution without collisions or extra comparisons. If collisions need to be managed—or if value comparisons are required—there are more traditional hash table implementations that handle those scenarios well. Thanks again for the feedback. |
Hi, I was assuming that if this separate mode was enabled, the user would be responsible for handling if the mapped values were to suddenly change for the keys which map to the same slot, not the library. This would enable the support to be trivial on the library side. I see you are not interested, which is fine. |
Please add a mode where hash collisions are allowed if two different keys have the same value stored in the table.
The text was updated successfully, but these errors were encountered: