Navigation menu
Personal tools
Not logged in
Talk
Contributions
Create account
Log in
Namespaces
WoW
Talk
English
Views
Read
Edit
History
More
Search
Navigation
Home
Random page
Help using wiki
Editions
for WoW
for WildStar
for Solar2D
Documentation
for WoW
for WildStar
Reference
WoW
⦁ FrameXML
⦁ AddOns
⦁ API
⦁ WoW Lua
WildStar
⦁ AddOns
⦁ API
⦁ WildStar Lua
Engine
Tools
What links here
Related changes
Special pages
Page information
Site
Recent Changes
Editing
WoW:StringHash/Analysis
(section)
Jump to navigation
Jump to search
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Collision finding test == <div style="margin-left: 3%;"> This test exhaustively computes all possible values of a given input set and tests for collisions. The size of the input sets is unfortunately constrained by RAM and CPU, but it is believed to be a fair indication of hash strength. === ASCII codes 64-89 (upper case A-Z), 3 bytes === <div style="margin-left: 3%;"> JavaHash 17576 non-colliding values StringHash 17576 non-colliding values </div> === ASCII codes 64-89 (upper case A-Z), 4 bytes === <div style="margin-left: 3%;"> JavaHash 456976 non-colliding values StringHash 456976 non-colliding values </div> === ASCII codes 32-127 (all printable), 2 bytes === <div style="margin-left: 3%;"> JavaHash 62 non-colliding values 62 instances of 2 inputs producing the same hash (124 strings) 2638 instances of 3 inputs producing the same hash (7914 strings) 279 instances of 4 inputs producing the same hash (1116 strings) StringHash 9216 non-colliding values Already here, with only searching all possible combinations of 2 letters of printable ASCII, the Java hash is showing poor results. </div> === ASCII codes 32-127 (all printable), 3 bytes === <div style="margin-left: 3%;"> This is about as exhaustive we can go in Lua without resulting to data compression techniques for the result sets: all possible combinations of 3 letters of printable ASCII for a total of 884736 outputs. JavaHash 62 non-colliding values 62 instances of 2 inputs producing the same hash (124 strings) 1630 instances of 3 inputs producing the same hash (4890 strings) 224 instances of 4 inputs producing the same hash (896 strings) 62 instances of 5 inputs producing the same hash (310 strings) 1630 instances of 6 inputs producing the same hash (9780 strings) 62 instances of 7 inputs producing the same hash (434 strings) 224 instances of 8 inputs producing the same hash (1792 strings) 68606 instances of 9 inputs producing the same hash (617454 strings) 5214 instances of 10 inputs producing the same hash (52140 strings) 5214 instances of 11 inputs producing the same hash (57354 strings) 9672 instances of 12 inputs producing the same hash (116064 strings) 558 instances of 13 inputs producing the same hash (7254 strings) 558 instances of 14 inputs producing the same hash (7812 strings) 558 instances of 15 inputs producing the same hash (8370 strings) StringHash 884736 non-colliding values </div> === ASCII codes 32-127 (all printable), 3 bytes, with prefix === <div style="margin-left: 3%;"> The same input values as the 32-127 x 3 test, except all prefixed with the string "01234567890123456789". This is to test for loss of accuracy. JavaHash 62 non-colliding values 62 instances of 2 inputs producing the same hash (124 strings) 1630 instances of 3 inputs producing the same hash (4890 strings) 224 instances of 4 inputs producing the same hash (896 strings) 62 instances of 5 inputs producing the same hash (310 strings) 1630 instances of 6 inputs producing the same hash (9780 strings) 62 instances of 7 inputs producing the same hash (434 strings) 224 instances of 8 inputs producing the same hash (1792 strings) 68606 instances of 9 inputs producing the same hash (617454 strings) 5214 instances of 10 inputs producing the same hash (52140 strings) 5214 instances of 11 inputs producing the same hash (57354 strings) 9672 instances of 12 inputs producing the same hash (116064 strings) 558 instances of 13 inputs producing the same hash (7254 strings) 558 instances of 14 inputs producing the same hash (7812 strings) 558 instances of 15 inputs producing the same hash (8370 strings) StringHash 884736 non-colliding values </div> === ASCII codes 32-127 (all printable), 3 bytes, with suffix === <div style="margin-left: 3%;"> The same input values as the 32-127 x 3 test, except all suffixed with the string "01234567890123456789". This is to test for loss of "old" information (data from earlier on in the string) in the mixing function. JavaHash 62 non-colliding values 62 instances of 2 inputs producing the same hash (124 strings) 1630 instances of 3 inputs producing the same hash (4890 strings) 224 instances of 4 inputs producing the same hash (896 strings) 62 instances of 5 inputs producing the same hash (310 strings) 1630 instances of 6 inputs producing the same hash (9780 strings) 62 instances of 7 inputs producing the same hash (434 strings) 224 instances of 8 inputs producing the same hash (1792 strings) 68606 instances of 9 inputs producing the same hash (617454 strings) 5214 instances of 10 inputs producing the same hash (52140 strings) 5214 instances of 11 inputs producing the same hash (57354 strings) 9672 instances of 12 inputs producing the same hash (116064 strings) 558 instances of 13 inputs producing the same hash (7254 strings) 558 instances of 14 inputs producing the same hash (7812 strings) 558 instances of 15 inputs producing the same hash (8370 strings) StringHash 884736 non-colliding values </div> === ASCII codes 64-82 (A-S), 5 bytes === <div style="margin-left: 3%;"> The Lua memory churn becomes somewhat ugly here, consuming several hundred megabytes. Even attempting A-Z with the same length brings a machine with 1GB RAM to its knees. JavaHash 2476099 non-colliding values StringHash 2476099 non-colliding values </div> === Result analysis === <div style="margin-left: 3%;"> The Java hash is shown to produce many more collisions even for rather simple inputs. The only cases where it behaves well are the A-Z cases; as long as the input subset is constricted to 32 unique values, it is collision free for 6 bytes (6 * 5 bits = 30 bits), in other words, it is a perfect hash for e.g. short strings consisting of only upper case ASCII letters A-Z. This is an expected property of the algorithm. Though taken with the rather abysmal results for more generic strings above (or, indeed, any sort of string longer than 6 bytes), the Java hash demonstrably makes for a poor choice for general input. </div> </div> </div>
Summary:
Please note that all contributions to AddOn Studio are considered to be released under the Creative Commons Attribution-NonCommercial-ShareAlike (see
AddOn Studio Wiki:Copyrights
for details).
Submissions must be written by you, or copied from a public domain or similar free resource (see
AddOn Studio Wiki:Copyrights
for details).
Cancel
Editing help
(opens in new window)