Issue
Is there a known Java String with hashCode exactly equal to Integer.MIN_VALUE ? It would be helpful for writing a test for a hash table to help avoid a common mistake of running Math.Abs on the hashcode before performing the remainder operation.
Ideally the string would include only ASCII characters, but I'm not sure if it woul dbe feasible.
Solution
Based on the formula for hash code (from StringLatin1
):
public static int hashCode(byte[] value) {
int h = 0;
for (byte v : value) {
h = 31 * h + (v & 0xff);
}
return h;
}
it depends linearly on the characters, the longer the string and greater the characters, the greater the hash code will be, until it overflows. Also note that the first characters have a greater impact on the resulting hash code (more often multiplied by 31).
The basic idea of the two first algorithm is to increment the characters until the hash code turns negative, starting with the left-most character since it has a greater weight. The searched string must have the character previous to the one that caused it to overflow on each position but the last one.
The code starts testing the strings "A", "AA", "AAA", ...
until one starts returning a negative value - the previous string is used as starting value.
Now it starts incrementing the first character up to Z
or until a string with a negative hash is found. The same is done for every next character.
Since the hash code of such strings was not yet reaching Integer.MIN_VALUE
, an additional pass is done, to also test lowercase characters. This should have been integrated in the previous loop...
Now the last character is adjusted to exactly get to Integer.MIN_VALUE
- simple since the last character is just added, without multiplication to calculate the hash code.
Here the code:
var string = "A";
while ((string+"A").hashCode() > 0) {
string += "A";
}
var array = string.toCharArray();
var i = 0;
while (i < array.length) {
array[i] += 1;
if (array[i] > 'z' || new String(array).hashCode() < 0) {
array[i] -= 1;
i += 1;
continue;
}
}
i = 1;
while (i < array.length) {
if (array[i] == 'Z') {
array[i] = 'a';
}else {
array[i] += 1;
}
if (array[i] > 'Z' || new String(array).hashCode() < 0) {
if (array[i] == 'a')
array[i] = 'Z';
else
array[i] -= 1;
i += 1;
continue;
}
}
int hash = new String(array).hashCode();
if (hash > 0) {
array[array.length-1] += Integer.MAX_VALUE - hash + 1;
}
System.out.printf("%s = %d%n", new String(array), new String(array).hashCode());
This results in:
HZcxf_ = -2147483648
Merging the two incrementing loops of previous code, we have:
var string = "A";
while ((string+"A").hashCode() > 0) {
string += "A";
}
var array = string.toCharArray();
var i = 0;
while (i < array.length) {
var prev = array[i];
if (prev == 'Z') {
array[i] = 'a';
} else {
array[i] += 1;
}
if (array[i] > 'z' || new String(array).hashCode() < 0) {
array[i] = prev;
i += 1;
continue;
}
}
int hash = new String(array).hashCode();
if (hash > 0) {
array[array.length-1] += Integer.MAX_VALUE - hash + 1;
}
System.out.printf("%s = %d%n", new String(array), new String(array).hashCode());
Resulting in (slightly different than previous):
HZdZG_ = -2147483648
Another method would be more strongly based on the hash calculation, basically undoing it.
Since I did not want to work with negative number, it starts with Integer.MAX_VALUE
, which is one less than Integer.MIN_VALUE
(considering over/underflow).
First it finds out how often it must be divided by 31
until the result is less than 128 (ASCII), kind of determining the string length.
Next it loops and finds out each character with some special handling to avoid characters less than ' '.
At the end, the last character is incremented by one to move the hash code from MAX_VALUE
to MIN_VALUE
by overflowing.
var string = "";
var remain = Integer.MAX_VALUE;
var i = 0;
var multiplier = 1;
while (remain > 127) {
remain /= 31;
multiplier *= 31;
i += 1;
}
remain = Integer.MAX_VALUE;
while (i >= 0) {
var ch = (char)(remain / multiplier);
remain -= ch * multiplier;
multiplier /= 31;
if (i > 0) {
// correct if next ch will be less than ' '
var correct = (' ' - (remain / multiplier) + 30) / 31; // old fashion rounding
if (correct > 0) {
ch -= correct;
remain += correct * 31 * multiplier;
}
} else {
ch += 1;
}
string += ch;
i -= 1;
}
System.out.printf("%s = %d%n", string, string.hashCode());
And its results:
I='<*! = -2147483648
Note: the last code will definitively fail if the hash code algorithm of String
is changed! The previous two may fail, depending on how the hash calculation is changed.
Answered By - user16320675
Answer Checked By - Terry (JavaFixing Volunteer)