Issue
Consider the following code in Java 11:
StringBuilder sb = new StringBuilder("one");
sb.append("δύο"); // "two"
The first line creates a StringBuilder that uses the Latin1 coder (one byte per character). Then the second line causes the StringBuilder to realise that it needs to use the UTF16 coder instead, so it copies its current contents into a new array before appending the new UTF-16 characters.
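To make the switch concrete, here is a small sketch of the rule that decides which coder is enough. It does not inspect StringBuilder's internals (the coder is not visible through the public API); it only checks whether every char falls in the Latin-1 range U+0000 to U+00FF. The class name CoderDemo and the helper fitsLatin1 are made up for this illustration.

```java
// Hypothetical demo class, not part of the JDK.
final class CoderDemo {
    // True if every char of s is in the Latin-1 range (U+0000..U+00FF),
    // i.e. the text could be stored with one byte per character.
    static boolean fitsLatin1(String s) {
        return s.chars().allMatch(c -> c <= 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1("one")); // true
        System.out.println(fitsLatin1("δύο")); // false: Greek letters are outside Latin-1
    }
}
```

As soon as one appended char fails this test, a one-byte-per-character array can no longer hold the content, which is why the existing bytes have to be copied into a wider array.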
The StringBuilder class has a constructor overload that takes an initial capacity argument, which is designed to avoid reallocation if you already know the required size of the string to be built. But if you start with a Latin-1 string (for example, English text) and then append a string containing characters outside Latin-1, this constructor overload does not help: the byte array is still reallocated when the coder changes.
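A short sketch of why this is easy to miss: capacity() is reported in chars, so it stays the same across the coder switch even though (as far as I can tell from the OpenJDK 11 source) the backing byte array is replaced by one twice as large. The class and method names below are hypothetical.

```java
// Hypothetical demo class, not part of the JDK.
final class CapacityDemo {
    // Builds a mixed Latin-1 / non-Latin-1 string in a presized builder
    // and returns the capacity the builder reports afterwards.
    static int capacityAfterMixedAppend(int initialCapacity) {
        StringBuilder sb = new StringBuilder(initialCapacity);
        sb.append("one"); // fits in Latin-1
        sb.append("δύο"); // forces the switch to UTF16
        return sb.capacity();
    }

    public static void main(String[] args) {
        // The reported capacity is unchanged; the internal copy is invisible here.
        System.out.println(capacityAfterMixedAppend(100)); // 100
    }
}
```

So the public API gives you no signal that the copy happened; you would only see it in a profiler or an allocation trace.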
Is there a way to create a StringBuilder instance that uses UTF16 right from the start?
Solution
There is nothing in the Java 11 or Java 12 version of StringBuilder that would do this.
The real question is how important the potential performance gain is to you. Profile your application to find out whether this unwanted reallocation accounts for a significant share of its overall running time.
If it is going to make a significant difference, you could implement your own version of StringBuilder (implementing the same interfaces for compatibility; StringBuilder itself is final, so it cannot be subclassed).
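As a rough idea of what such a class might look like, here is a minimal sketch backed by a char[], so it is two bytes per character from the start and never has to switch representation. The name Utf16Builder is invented for this example, and only CharSequence and Appendable are implemented; a real replacement would need insert, delete, and the rest of the StringBuilder API.

```java
import java.util.Arrays;

// Hypothetical class, not part of the JDK: a char[]-backed builder.
final class Utf16Builder implements CharSequence, Appendable {
    private char[] buf;
    private int count;

    Utf16Builder(int capacity) {
        buf = new char[capacity];
    }

    // Grows the buffer only when the requested size exceeds the capacity.
    private void ensureCapacity(int min) {
        if (min > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(min, buf.length * 2 + 2));
        }
    }

    @Override
    public Utf16Builder append(CharSequence csq) {
        CharSequence s = (csq == null) ? "null" : csq;
        return append(s, 0, s.length());
    }

    @Override
    public Utf16Builder append(CharSequence csq, int start, int end) {
        CharSequence s = (csq == null) ? "null" : csq;
        if (start < 0 || start > end || end > s.length()) {
            throw new IndexOutOfBoundsException();
        }
        int len = end - start;
        ensureCapacity(count + len);
        if (s instanceof String) {
            // Bulk copy for the common case of appending a String.
            ((String) s).getChars(start, end, buf, count);
        } else {
            for (int i = 0; i < len; i++) {
                buf[count + i] = s.charAt(start + i);
            }
        }
        count += len;
        return this;
    }

    @Override
    public Utf16Builder append(char c) {
        ensureCapacity(count + 1);
        buf[count++] = c;
        return this;
    }

    @Override
    public int length() {
        return count;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException();
        }
        return buf[index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || start > end || end > count) {
            throw new IndexOutOfBoundsException();
        }
        return new String(buf, start, end - start);
    }

    @Override
    public String toString() {
        return new String(buf, 0, count);
    }
}
```

Usage would look like new Utf16Builder(64).append("one").append("δύο").toString(). The trade-off is that pure Latin-1 content takes twice the memory while it is being built, and toString() still copies the chars into a new String, so it is worth measuring before assuming this is faster.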
Alternatively, if you were prepared to wait, you could download the OpenJDK source code, develop / build / test an extension to StringBuilder, and submit it as a patch for consideration. (If you included benchmarks that demonstrated a clear performance benefit, that would help the chances of inclusion.)
Answered By - Stephen C
Answer Checked By - Terry (JavaFixing Volunteer)