String literals in Java classes

Some background

More than occasionally I’ve seen Java code that looks like:

StringBuilder query = new StringBuilder();
query.append("SELECT something, somethingelse");
query.append(" FROM table");
query.append(" WHERE 1=1");
ResultSet rs = stmt.executeQuery(query.toString());

or something to that effect. The key point is that several string constants are passed to append, one call right after another.

PMD complains about this, but its explanation of why is vague at best. So, the question is: “why shouldn’t you do this?”

Let’s write some code. Two classes…

public class Foo {
    public static void main(String... args) {
        String foo = "foo" + "bar";
	System.out.println(foo);
    }
}

public class Bar {
    public static void main(String... args) {
        StringBuilder foo = new StringBuilder();
	foo.append("foo");
	foo.append("bar");
	System.out.println(foo.toString());
    }
}

The above are compiled to two separate classes with javac.

The first thing to notice is the size of the files:

-rw-r--r--  1 shagie  staff  593 Jan 17 19:02 Bar.class
-rw-r--r--  1 shagie  staff  199 Jan 17 19:01 Bar.java
-rw-r--r--  1 shagie  staff  412 Jan 17 19:01 Foo.class
-rw-r--r--  1 shagie  staff  135 Jan 17 19:01 Foo.java

So, let’s look at those files with the Java disassembler - javap.

~/De/javastr $ javap -v -c Foo.class 
Classfile /Users/shagie/Development/javastr/Foo.class
  Last modified Jan 17, 2017; size 412 bytes
  MD5 checksum 0aa784d3599dd09977b9ad6feab73dea
  Compiled from "Foo.java"
public class Foo
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #6.#15         // java/lang/Object."<init>":()V
   #2 = String             #16            // foobar
   #3 = Fieldref           #17.#18        // java/lang/System.out:Ljava/io/PrintStream;
   #4 = Methodref          #19.#20        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #5 = Class              #21            // Foo
   #6 = Class              #22            // java/lang/Object

That list is the start of the constant pool of the class file. The constant pool is described in Section 4.1 of the JVM specification:

The constant_pool is a table of structures (§4.4) representing various string constants, class and interface names, field names, and other constants that are referred to within the ClassFile structure and its substructures. The format of each constant_pool table entry is indicated by its first “tag” byte.

The thing to note here is that foobar is stored as one string constant - not two. javac has already joined "foo" and "bar" at compile time.

Further down, when we look at the bytecode invoked for the main method:

  public static void main(java.lang.String...);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC, ACC_VARARGS
    Code:
      stack=2, locals=2, args_size=1
         0: ldc           #2                  // String foobar
         2: astore_1
         3: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
         6: aload_1
         7: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        10: return
}

You can see ldc #2, which loads constant #2 from the constant pool, and it’s right there. One constant, one instruction.
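The same folding can be observed from Java code, without javap. The sketch below is only an illustration: the class name ConstantFoldingDemo and its method names are made up for this example. It relies on the language rule that compile-time constant strings are interned, so a folded "foo" + "bar" is the very same object as the literal "foobar", while a concatenation involving a non-final variable produces a new String at run time.

```java
// Sketch only: ConstantFoldingDemo and its methods are hypothetical names.
class ConstantFoldingDemo {
    // Both operands are literals, so this is a compile-time constant expression.
    static String folded() {
        return "foo" + "bar";
    }

    // prefix is a non-final local variable, so this concatenation happens at run time.
    static String concatenatedAtRuntime() {
        String prefix = "foo";
        return prefix + "bar";
    }

    public static void main(String... args) {
        // Compile-time constants are interned: same object as the literal.
        System.out.println(folded() == "foobar");                     // true
        // Same characters, but a newly created String object.
        System.out.println(concatenatedAtRuntime() == "foobar");      // false
        System.out.println(concatenatedAtRuntime().equals("foobar")); // true
    }
}
```

The == comparisons are used here purely to expose object identity; ordinary code should still compare strings with equals.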

Looking at Bar.class now,

Classfile /Users/shagie/Development/javastr/Bar.class
  Last modified Jan 17, 2017; size 593 bytes
  MD5 checksum ef1e0f7d6be2af193a44b83bd97751f6
  Compiled from "Bar.java"
public class Bar
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #11.#20        // java/lang/Object."<init>":()V
   #2 = Class              #21            // java/lang/StringBuilder
   #3 = Methodref          #2.#20         // java/lang/StringBuilder."<init>":()V
   #4 = String             #22            // foo
   #5 = Methodref          #2.#23         // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   #6 = String             #24            // bar

In here you can see the two string constants foo and bar in slots 4 and 6 of the constant pool. While I didn’t show them all, Foo.class has 28 constants and Bar.class has 41. These point to various things in the code such as methods or UTF-8 strings, boring stuff like that. But it’s nearly half again as many for the append approach.

When looking at the bytecode itself,

  public static void main(java.lang.String...);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC, ACC_VARARGS
    Code:
      stack=2, locals=2, args_size=1
         0: new           #2                  // class java/lang/StringBuilder
         3: dup
         4: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
         7: astore_1
         8: aload_1
         9: ldc           #4                  // String foo
        11: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        14: pop
        15: aload_1
        16: ldc           #6                  // String bar
        18: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        21: pop
        22: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
        25: aload_1
        26: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        29: invokevirtual #9                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        32: return
}

you can see the StringBuilder getting created, some stack work with the dup, the invocation of init, storing the value, loading it, calling load constant once for foo, invoking the append, popping the stack, loading the string builder again, loading the constant, invoking the virtual method and so on.

The point of all this? Don’t use a StringBuilder to build a literal string. javac will build that literal string as part of compilation even if you use +.
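Applied to the query from the start of this post, that might look like the sketch below. The class name QueryText, the QUERY constant and the viaBuilder method are hypothetical names for this illustration; the SQL text is the placeholder query from above. Because every operand is a literal, javac should fold QUERY into a single constant, while viaBuilder does the same work at run time on every call.

```java
// Sketch only: QueryText, QUERY and viaBuilder are hypothetical names.
class QueryText {
    // Every operand is a literal, so javac folds this into one constant.
    static final String QUERY =
            "SELECT something, somethingelse"
            + " FROM table"
            + " WHERE 1=1";

    // The append-based version from the opening example, kept for comparison.
    static String viaBuilder() {
        StringBuilder query = new StringBuilder();
        query.append("SELECT something, somethingelse");
        query.append(" FROM table");
        query.append(" WHERE 1=1");
        return query.toString();
    }

    public static void main(String... args) {
        // Same text either way; only QUERY is assembled at compile time.
        System.out.println(QUERY.equals(viaBuilder())); // true
        System.out.println(QUERY);
    }
}
```

The line breaks between the pieces keep the readability that usually motivates the append style in the first place.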

Chaining the + operator is the simplest to understand, and the most likely to be correct, way to build a String anywhere outside of a loop (if you’re building a String in a loop, use a StringBuilder).
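For the loop case, a minimal sketch of the difference follows. LoopConcat, joinWithBuilder and joinWithPlus are hypothetical names invented for this example. Both methods return the same text, but the += version creates a new String on each concatenation, copying everything built so far, whereas the StringBuilder version appends into one buffer.

```java
// Sketch only: LoopConcat and its methods are hypothetical names.
class LoopConcat {
    // Joins the parts with ", " using a single StringBuilder.
    static String joinWithBuilder(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    // Same result, but every += builds a brand-new String from the old one.
    static String joinWithPlus(String... parts) {
        String result = "";
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                result += ", ";
            }
            result += parts[i];
        }
        return result;
    }

    public static void main(String... args) {
        System.out.println(joinWithBuilder("foo", "bar", "baz")); // foo, bar, baz
        System.out.println(joinWithPlus("foo", "bar", "baz"));    // foo, bar, baz
    }
}
```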