Daniel Lemire's blog

Java Serialization is not for long term storage

Using serialization for long-term storage is a common mistake. In fact, Microsoft made it with Microsoft Word, and it is a well-known source of trouble (ever had a corrupted file you could not recover?). Serialization in Java was never advertised as a viable long-term storage mechanism. We serialize in order to send objects over a wire (RMI), or for lightweight persistence (especially of non-critical data). I’m not making this up; this is how Sun documents it.

Also, Sun makes no promise that you’ll be able to deserialize if your code changes. Ever heard of the java.io.InvalidClassException class? That’s what you’ll get in your face if you ever change the class you used to serialize (even if you change it just a little bit, unless you pinned a serialVersionUID and the change happens to be a compatible one).

Think about the following scenario:

  1. You serialize some objects you care about.
  2. Weeks pass by.
  3. For some reason or other, you change the class. Suppose, for example, that you delete a field or you move the class up or down in the hierarchy. You don’t keep the old class around anymore.
  4. That’s it: you can never deserialize your objects again unless you reverse engineer the format. Java won’t stop and ask you how you want it fixed; it will just throw an exception, with no direct way for you to fix this. You’ll need to “hand recover” your data. Have fun. If the data was not your own, and it was hand crafted by a client, you are probably going to lose your job.
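
The failure in step 4 is hard to show in one file, because it needs two versions of the same class. As a rough stand-in, the sketch below serializes an object and then flips one bit of the serialVersionUID recorded in the stream, which is roughly what the stream looks like to the JVM after the class has changed and its computed UID no longer matches. The names `UidMismatchDemo`, `Point` and `withDifferentUid` are made up for this illustration, and the offset arithmetic assumes the standard stream layout with an ASCII class name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class UidMismatchDemo {
    // Hypothetical class with no explicit serialVersionUID:
    // the JVM computes one from the shape of the class.
    static class Point implements Serializable {
        int x = 1;
        int y = 2;
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Assumed layout: 4 header bytes, TC_OBJECT, TC_CLASSDESC,
    // 2-byte name length, the class name, then the 8-byte UID.
    static byte[] withDifferentUid(byte[] data, Class<?> cls) {
        byte[] copy = data.clone();
        int uidOffset = 4 + 1 + 1 + 2 + cls.getName().length();
        copy[uidOffset + 7] ^= 1; // flip the lowest bit of the stored UID
        return copy;
    }

    // True if deserialization is rejected with InvalidClassException.
    static boolean failsWithInvalidClass(byte[] data) throws IOException, ClassNotFoundException {
        try {
            deserialize(data);
            return false;
        } catch (InvalidClassException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] saved = serialize(new Point());
        System.out.println("untouched stream rejected: " + failsWithInvalidClass(saved));
        System.out.println("stream with other UID rejected: "
                + failsWithInvalidClass(withDifferentUid(saved, Point.class)));
    }
}
```

Note that the exception is all you get: the data is still in the byte array, but the standard API offers no way to say “read it anyway”.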

And let’s not even get into what happens if you must exchange your data with other software not written in Java.

If you really care about your data, dump it in a custom XML format. It isn’t that hard.
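
To give an idea of how little code this takes, here is a minimal sketch using the StAX API (`javax.xml.stream`) that ships with the JDK. The `Point` class and the `<point x="..." y="..."/>` format are invented for the example; a real format would need to cover your own fields and probably carry a version attribute.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class PointXml {
    // Hypothetical data class used only for this sketch.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Writes the point as a single <point> element with x and y attributes.
    static String toXml(Point p) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartDocument();
        w.writeStartElement("point");
        w.writeAttribute("x", Integer.toString(p.x));
        w.writeAttribute("y", Integer.toString(p.y));
        w.writeEndElement();
        w.writeEndDocument();
        w.close();
        return out.toString();
    }

    // Reads the first <point> element found in the document.
    static Point fromXml(String xml) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && r.getLocalName().equals("point")) {
                    // Attributes are looked up by name, so extra or reordered
                    // attributes in the file do not break the reader.
                    int x = Integer.parseInt(r.getAttributeValue(null, "x"));
                    int y = Integer.parseInt(r.getAttributeValue(null, "y"));
                    return new Point(x, y);
                }
            }
        } finally {
            r.close();
        }
        throw new XMLStreamException("no <point> element found");
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = toXml(new Point(3, 4));
        Point back = fromXml(xml);
        System.out.println(xml + " -> " + back.x + "," + back.y);
    }
}
```

The point of the design is that the file no longer depends on the Java class at all: you can rename the class, move it in the hierarchy, or read the file from another language, and when something does go wrong you can open the file in a text editor.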