Overview
HashStrings is a WPF desktop application that plots the distribution of string hash values produced by different algorithms. The application and the results produced by the algorithms are discussed in detail at:
http://www.orthogonal.com.au/computers/hashstrings
There is an important comment in the code regarding performance and coding style that is worth duplicating here:
It's unclear what is the best way of running hashes at high-speed and displaying them in the image. In the original Forms app a tight loop wrote to the Bitmap and a timer would blit it to the screen. In WPF a similar technique is used, where a timer causes a burst of hash loops to update a writeable Bitmap which seems to display when it unlocks. Locking the Bitmap, then setting pixels and marking the 1-pixel rectangles as dirty in a tight loop doesn't seem very elegant or efficient, but web searches don't reveal the best professional technique for doing this. Until a great discovery is made, we'll just stick to the current technique. Advice on this whole issue would be most welcome.
Controls
Hash Algorithm
There are currently 23 small classes that implement the IHasher interface and are annotated with the [Hasher] attribute. When the app starts, all annotated classes are instantiated and listed in the ComboBox.
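The discovery step described above can be sketched with reflection. This is a minimal illustration, not the app's actual code: the member `Hash(ReadOnlySpan<char>)`, the attribute's `Name` property, and the `HasherCatalog` and `BernsteinXorHasher` classes are assumptions made for the example. Only the names IHasher and [Hasher] come from the project.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Assumed shape of the attribute: the real one may carry different data.
[AttributeUsage(AttributeTargets.Class)]
public sealed class HasherAttribute : Attribute
{
    public HasherAttribute(string name) => Name = name;
    public string Name { get; }
}

// Assumed shape of the interface: the real member names may differ.
public interface IHasher
{
    int Hash(ReadOnlySpan<char> text);
}

// Hypothetical example hasher so the sketch has something to discover.
[Hasher("Bernstein XOR")]
public sealed class BernsteinXorHasher : IHasher
{
    public int Hash(ReadOnlySpan<char> text)
    {
        uint hash = 5381;
        foreach (char c in text)
        {
            hash = ((hash << 5) + hash) ^ c;
        }
        return (int)hash;
    }
}

public static class HasherCatalog
{
    // Finds every concrete class that implements IHasher and carries [Hasher],
    // creates one instance of each via its parameterless constructor,
    // and returns them sorted by display name.
    public static List<(string Name, IHasher Hasher)> Discover(Assembly assembly)
    {
        var result = new List<(string Name, IHasher Hasher)>();
        foreach (Type type in assembly.GetTypes())
        {
            if (!type.IsClass || type.IsAbstract) continue;
            if (!typeof(IHasher).IsAssignableFrom(type)) continue;

            var attribute = type.GetCustomAttribute<HasherAttribute>();
            if (attribute == null) continue;

            result.Add((attribute.Name, (IHasher)Activator.CreateInstance(type)));
        }
        result.Sort((a, b) => string.CompareOrdinal(a.Name, b.Name));
        return result;
    }
}
```

A list built this way could be bound directly to a ComboBox, with the `Name` element as the display text.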
The original Windows Forms app used the fancy technique of reading the source code for the hash algorithm classes and compiling them into a dynamic assembly. Although that technique was technically interesting, it was overkill for the replacement WPF app and it was dropped.
String Length Range
Specifies the range of lengths, between 1 and 100, of the random strings that are hashed. The strings are composed of random characters drawn from the 96 printable characters in the range 32 to 127. To improve performance, strings are not created and then converted to bytes for hashing; string creation is skipped and a Span<char> over a mutable buffer is passed to the hashers.
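The buffer technique can be sketched as follows. This is an illustration under stated assumptions, not the app's code: the `RandomText` class, the `Fill` method and its parameters are hypothetical names chosen for the example.

```csharp
using System;

public static class RandomText
{
    // Fills the front of a reusable buffer with random characters whose codes
    // lie in the range 32 to 127 (96 values) and returns a span over just
    // that part. No string object is allocated.
    public static Span<char> Fill(char[] buffer, Random rng, int minLength, int maxLength)
    {
        // Random.Next excludes the upper bound, so add 1 to make it inclusive.
        int length = rng.Next(minLength, maxLength + 1);
        Span<char> span = buffer.AsSpan(0, length);
        for (int i = 0; i < span.Length; i++)
        {
            span[i] = (char)rng.Next(32, 128);
        }
        return span;
    }
}
```

A caller would allocate one `char[100]` up front and call `RandomText.Fill(buffer, rng, 1, 100)` before each hash, so the tight loop creates no garbage.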
Timer Millisec
The interval, in milliseconds, between timer ticks. Each tick runs a burst of random strings through the selected algorithm.
Hash Loops
The number of hashes that are run in a tight loop every timer tick.
X and Y Stretch
Stretches the scale of pixel drawing by the specified factor in the X and Y axis directions. Stretching can help reveal fine patterns in the hash values.
Collision Test
Runs all combinations of 1, 2 and 3 character strings from the pool of 96 through the selected hash algorithm and displays the collision counts.
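One way to enumerate and count is sketched below. It is an assumption-laden illustration rather than the app's implementation: the `HashFunc` delegate and `CollisionTest` class are hypothetical, and a "collision" is taken here to mean a string whose hash value has already been produced by an earlier string. The pool yields 96 + 96^2 + 96^3 = 893,048 strings in total.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical delegate: a span cannot be used as a generic type argument
// in older C# versions, so Func<ReadOnlySpan<char>, int> is avoided.
public delegate int HashFunc(ReadOnlySpan<char> text);

public static class CollisionTest
{
    private const int FirstCode = 32;  // first character code in the pool
    private const int PoolSize = 96;   // codes 32 to 127

    // Hashes every 1-, 2- and 3-character string over the pool and returns
    // how many strings were hashed and how many produced an already-seen hash.
    public static (int Total, int Collisions) Run(HashFunc hash)
    {
        var seen = new HashSet<int>();
        var buffer = new char[3];
        int total = 0, collisions = 0;
        int combinations = 1;

        for (int length = 1; length <= 3; length++)
        {
            combinations *= PoolSize; // 96, 9216, 884736
            for (int n = 0; n < combinations; n++)
            {
                // Treat n as a base-96 number and write its digits as characters.
                int rest = n;
                for (int i = 0; i < length; i++)
                {
                    buffer[i] = (char)(FirstCode + rest % PoolSize);
                    rest /= PoolSize;
                }

                total++;
                if (!seen.Add(hash(buffer.AsSpan(0, length))))
                {
                    collisions++;
                }
            }
        }
        return (total, collisions);
    }
}
```

Reusing a single three-character buffer keeps the enumeration allocation-free apart from the set of seen hash values.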
Speed Test
Runs millions of hashes in a tight loop and estimates the speed in hashes per second. As multiple speed tests are run for the same algorithm, a running average is also calculated.
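A timing loop with a running average might look like the sketch below. The `SpeedTest` class, its members and the `hashOnce` callback are hypothetical names for illustration, and the running average is assumed to be a simple mean of the rates from each run; the app may measure or average differently.

```csharp
using System;
using System.Diagnostics;

public sealed class SpeedTest
{
    private double _totalRate; // sum of the rates from all runs so far
    private int _runs;

    // Mean hashes per second over every run made with this instance.
    public double AverageRate => _runs == 0 ? 0.0 : _totalRate / _runs;

    // Combined result of the last run's hashes, kept so the work has a visible use.
    public int Checksum { get; private set; }

    // Calls hashOnce the given number of times in a tight loop and returns
    // the estimated rate in hashes per second.
    public double Run(Func<int> hashOnce, int iterations)
    {
        int sink = 0;
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink ^= hashOnce();
        }
        watch.Stop();
        Checksum = sink;

        // Guard against a zero elapsed time on very short runs.
        double seconds = Math.Max(watch.Elapsed.TotalSeconds, 1e-9);
        double rate = iterations / seconds;

        _totalRate += rate;
        _runs++;
        return rate;
    }
}
```

Keeping one `SpeedTest` instance per algorithm lets repeated runs smooth out timing noise from the JIT and other processes.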
Screenshot
History Note
About 12 years ago I stumbled upon the following C# function that implemented string hashing back in .NET Framework 1.x.
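// The declaration line is missing here; this signature is assumed for illustration.
static int HashString(string s)
{
    uint hash = 5381;
    for (int i = 0; i < s.Length; i++)
    {
        uint c = s[i];
        uint u1 = (hash << 5);      // hash * 32
        hash = (u1 + hash) ^ c;     // hash * 33, XORed with the character
    }
    return (int)hash;
}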
The latest source code for String.GetHashCode() can be found in the following file (scroll down to around line 800):
https://referencesource.microsoft.com/#mscorlib/system/string.cs