The Whiteboard is a gathering of programmers from around the world interested in investigating every aspect of software development. Topics range from coffee to software design and everything in between. Have you accepted Haskell into your home and into your heart?
More than occasionally I’ve seen Java code that looks like:
or something to that effect. The key point is the series of
append calls on string constants, one right after another.
PMD complains about this, but its explanation of why is rather vague
at best. So, the question is: “why shouldn’t you do this?”
Let’s write some code. Two classes…
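The original listings are not reproduced here, so what follows is only a sketch of what the two classes plausibly look like, inferred from the bytecode walkthrough later in the post. The class names Foo and Bar come from the file listing; the method bodies, the build() helpers, and the variable names are assumptions. Both classes are declared without public so they can sit in one file for illustration, whereas the post compiled them from separate files.

```java
// Sketch only: the bodies are assumed, not copied from the original post.

// Concatenates two string literals with the + operator.
class Foo {
    static String build() {
        return "foo" + "bar";
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}

// Builds the same text with separate append calls on a StringBuilder.
class Bar {
    static String build() {
        StringBuilder sb = new StringBuilder();
        sb.append("foo");
        sb.append("bar");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

Both versions print the same text; the difference only shows up in the compiled class files.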
The above are compiled to two separate classes with javac.
The first thing to notice is the size of the files:
-rw-r--r-- 1 shagie staff 593 Jan 17 19:02 Bar.class
-rw-r--r-- 1 shagie staff 199 Jan 17 19:01 Bar.java
-rw-r--r-- 1 shagie staff 412 Jan 17 19:01 Foo.class
-rw-r--r-- 1 shagie staff 135 Jan 17 19:01 Foo.java
So, let’s look at those files with the Java disassembler, javap (its -v flag prints the constant pool along with the bytecode).
That list is the start of the constant pool of the class file.
It is defined in Section 4.1 of the JVM specification:
The constant_pool is a table of structures (§4.4) representing various string constants, class and interface names, field names, and other constants that are referred to within the ClassFile structure and its substructures. The format of each constant_pool table entry is indicated by its first “tag” byte.
The thing to note here is that the String foobar is one string - not
two.
Further down, when we look at the bytecode invoked for the main
method:
You can see ldc #2, which loads constant #2 from the
constant pool, and it’s right there. One constant, one instruction.
Looking at Bar.class now,
In here you can see the two string constants foo and bar in
slots 4 and 6 of the constant pool. While I didn’t show them all,
Foo.class had 28 constants and Bar.class has 41 constants. These
point to various things in the code such as method references or UTF-8
names, boring stuff like that. But it’s about half again as many for the
append approach.
When looking at the bytecode itself,
you can see the StringBuilder getting created, some stack work with
the dup, the invocation of init (the constructor), storing the reference,
loading it, calling load constant once for foo, invoking append, popping
the returned reference off the stack, loading the StringBuilder again,
loading the next constant, invoking the virtual method and so on.
The point of all this? Don’t use a StringBuilder to build a
literal string. javac will build that literal string as part
of compilation even if you use +.
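One way to observe this folding from inside a running program, without javap, is to compare references. The sketch below relies only on standard Java semantics: a constant expression such as "foo" + "bar" is evaluated at compile time and shares one String object with the equivalent literal, while StringBuilder.toString() creates a new object at run time. The class name FoldingDemo and its method names are made up for this illustration.

```java
public class FoldingDemo {
    // javac folds this into the single constant "foobar".
    static String withPlus() {
        return "foo" + "bar";
    }

    // Assembled at run time; toString() allocates a fresh String.
    static String withBuilder() {
        StringBuilder sb = new StringBuilder();
        sb.append("foo");
        sb.append("bar");
        return sb.toString();
    }

    public static void main(String[] args) {
        String literal = "foobar";
        // == is deliberate here: it compares identity, not content.
        System.out.println(withPlus() == literal);         // true
        System.out.println(withBuilder() == literal);      // false
        System.out.println(withBuilder().equals(literal)); // true
    }
}
```

Reference comparison is used purely as a probe; ordinary code should still compare strings with equals.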
Chaining the + operator on String is the simplest to understand
and most likely correct way to build a String in any situation
outside of a loop (if you’re building a String and there’s a loop,
use a StringBuilder).
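As a rough illustration of that rule of thumb, the sketch below uses + for a fixed number of pieces and a StringBuilder when the pieces arrive in a loop. The class JoinDemo, its methods, and the sample data are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.List;

public class JoinDemo {
    // Fixed number of parts: plain + is the clearest option.
    static String greeting(String name) {
        return "Hello, " + name + "!";
    }

    // Number of parts not known up front: reuse one StringBuilder
    // across iterations instead of creating a new String each time.
    static String joinWithCommas(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(items.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(greeting("world"));
        System.out.println(joinWithCommas(Arrays.asList("foo", "bar", "baz")));
    }
}
```

For this particular case String.join(", ", items) does the same job; the explicit loop is shown only because it mirrors the situation the rule is about.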