There comes a point in every java program’s lifetime where it must answer one of life’s fundamental questions: how am I going to talk to my config file (or, indeed, any file-like resource that’s not another class)?
As good test-driven developers, we’ll usually have a short digression about the definition of unit testing before agreeing that there should be at least one test that reads from a real file in the same way that the production code will, and then start to write that test.
I have noticed a prevalence of projects that choose to use the src/test/resources area of a project in order to get the configuration file onto the classpath. The production code then reads that file from the root of the classpath and everything is fine. The discussion of whether this is the right thing to do appears to have happened long ago, unbeknownst to me, and that is a little problematic, because I am not convinced it is a good idea. I would much prefer it if most resources of this sort could live on the file system.
We’ll get to that shortly (if you must know the answer, skip to the conclusion); first, though:
A little history
Perhaps readers have come across the concept of “maven layout”. This is a convention for arranging the source code in a project so that maven can easily understand and build it. Maven layout has become standard even in projects not built by maven, and that is, for the most part, no bad thing; in particular, I like that it allows for multiple languages within the same project.
One unfortunate introduction that maven makes, however, is the inclusion of src/main|test/resources. This is an area in which we can drop resources and expect them to turn up on the classpath without really thinking about it. As such, there is temptation to avoid answering the question of “where should we store this file” when there’s an easy get out of jail answer right in front of us.
This is fine right up until the point where we stop using maven: resources in that directory no longer get bundled on to the classpath at the desired time, and a cacophony of alarm bells rings when we try to run our program or its tests.
An easy way out
When in maven layout, do as maven does: try to follow the classpath and assembly rules that maven applies to that filesystem structure accordingly. That’s not the point I want to make, however (and why use !maven if you’re going to make your specific !maven exactly like maven?), so forget I said anything.
Why is this bad?
..and why is the filesystem a better solution?
#1 Bad diagnostics
This is probably the one that sways me most. If we mis-specify a file path in our program, figuring out where it was supposed to be is as simple as logging the result of invoking getAbsolutePath on the java.io.File object that’s not finding bits where it should be.
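To make that concrete, here is a minimal sketch of the file case. The class name FileDiagnostics, the helper requireFile and the path conf/config.txt are all made up for illustration; the only real API involved is java.io.File.

```java
import java.io.File;
import java.io.FileNotFoundException;

public class FileDiagnostics {

    // Returns the file if it exists; otherwise fails with a message that
    // names the absolute path that was actually checked.
    static File requireFile(String path) throws FileNotFoundException {
        File file = new File(path);
        if (!file.isFile()) {
            throw new FileNotFoundException(
                "Expected config at " + file.getAbsolutePath());
        }
        return file;
    }

    public static void main(String[] args) {
        try {
            // Hypothetical relative path, resolved against the working directory.
            requireFile("conf/config.txt");
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

When the file is missing, the message tells us exactly which location on disk was consulted, which is usually all we need to fix the problem.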
What about loading a resource from the classpath? Well, a standard call to getResource will return us a java.net.URL. That sounds, initially, as if it might be quite useful. How about when the resource isn’t found, though? Well, in that case, we get back null. That is considerably worse than the file case; we have no idea where the classloader looked, and therefore no idea how to fix the problem.
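A rough sketch of the classpath equivalent follows. The class ResourceLookup, the helper find and the resource name config.txt are hypothetical. Printing the java.class.path system property is the nearest thing to a diagnostic that I know of, and it only describes the default application classpath, not whatever a custom classloader might be consulting.

```java
import java.net.URL;

public class ResourceLookup {

    // Looks a name up via the class loader that loaded this class.
    // Returns null when nothing matches; no search locations are reported.
    static URL find(String name) {
        return ResourceLookup.class.getClassLoader().getResource(name);
    }

    public static void main(String[] args) {
        URL url = find("config.txt"); // hypothetical resource name
        if (url == null) {
            // Best-effort hint only: this property need not reflect what
            // a custom class loader actually searched.
            System.err.println("config.txt not found; java.class.path = "
                + System.getProperty("java.class.path"));
        } else {
            System.out.println("Loaded from " + url);
        }
    }
}
```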
If we’re lucky, we can get an idea of what the classpath is by looking in a debugger (sun.misc.URLClassPath is relatively easy to figure out). If we’re in a more complicated world with custom classloaders, however, this gets far more difficult.
[ file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/jce.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/rhino.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/charsets.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/resources.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/jsse.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/compilefontconfig.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/management-agent.jar, file:/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/rt.jar, file:/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/javazic.jar, file:/usr/share/java/java-atk-wrapper.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/ext/localedata.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/ext/sunpkcs11.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/ext/pulse-java.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/ext/dnsns.jar, file:/usr/lib/jvm/java-6-openjdk-common/jre/lib/ext/sunjce_provider.jar, file:/home/james/blogs/the_class_path/out/production/the_class_path/, file:/home/james/idea-IC-123.94/lib/idea_rt.jar ]
A sample collection of paths from a
So, let’s now imagine that we’re successfully running our program, and we want to know which config file is in use. We work out the list of places on the classpath, and find that config.txt is in three of them. Which one is in use? Yes, usually it is the one which is found first (just as it would be for an actual object file), but the order of finding is an implementation detail of a specific classloader, not a guarantee. With this ambiguity comes the dubious ability to override configuration by racing to put config.txt in the first listed place on the classpath.
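If we do find ourselves in this position, ClassLoader.getResources will at least enumerate every match rather than just the winner. The sketch below assumes Java 7 or later; the class DuplicateResources and the helper findAll are invented names, and config.txt is the hypothetical file from the paragraph above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DuplicateResources {

    // Lists every location from which the given loader can see the name,
    // in the order the loader reports them.
    static List<URL> findAll(ClassLoader loader, String name) {
        try {
            return new ArrayList<>(Collections.list(loader.getResources(name)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = DuplicateResources.class.getClassLoader();
        System.out.println("getResource picks: " + loader.getResource("config.txt"));
        for (URL candidate : findAll(loader, "config.txt")) {
            System.out.println("candidate: " + candidate);
        }
    }
}
```

More than one candidate means the file in use depends on how that particular classloader orders its search, which is exactly the ambiguity described above.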
With a file – well, there is a possible ambiguity: if we specify the path to a file in relative form, our lookup is affected by the current working directory of the java process we’re in. If this error does occur, though, we can trivially create a diagnostic message that tells us the absolute expected path of the resource we are seeking, and amend appropriately.
So what is the classpath for, then?
The classpath is a path, or collection of paths, where compiled java object files are found, just as LD_LIBRARY_PATH is a collection of places for the linker to look for shared objects. Wikipedia’s definition provides some more detail – and also avoids even the slightest hint of loading anything other than classes from it, to my delight.
Exceptions to the rule
Some java deployment platforms take the file system option off the table. Is this an excuse to start bundling non class resources on to the classpath? No.
- The best ones provide better APIs for storage/retrieval of non-class resources.
- A file is really only one type of URL. We could try loading configuration from a different protocol, like http.
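As a sketch of that second point: if the configuration loader accepts a URL instead of a file path, the same code can read from file: or http: locations. Everything named here (UrlConfig, load, the example URL in the comment) is hypothetical, and treating the config as a java.util.Properties file is my assumption rather than anything the argument depends on. It assumes Java 7 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.util.Properties;

public class UrlConfig {

    // Reads key=value pairs from whatever the URL points at, naming the
    // location in the error message if it cannot be read.
    static Properties load(URL location) throws IOException {
        Properties properties = new Properties();
        try (InputStream in = location.openStream()) {
            properties.load(in);
        } catch (IOException e) {
            throw new IOException("Could not read config from " + location, e);
        }
        return properties;
    }

    public static void main(String[] args) throws IOException {
        // e.g. file:/etc/myapp/config.properties (a made-up location)
        URL location = URI.create(args[0]).toURL();
        System.out.println(load(location));
    }
}
```

Unlike the null from getResource, a failure here carries the full location that was tried.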
Conclusion, and tl;dr for the impatient
The class path is for classes. While it does provide enough scaffolding to create a general-purpose resource-loading solution, using it as such is error-prone and unclear to both users and maintainers.
- The clue is in the name.
- The classpath. A path for classes.
- This ‘filesystem’ concept. Might it be useful for storing files?