Recovering a Corrupted embedded Apache Derby Database after an error XJ040 (or if you got an error XSDG2)

Apache Derby is a usually a strong database but sometimes it might get corrupted.

This is what worked for me after many embedded  derby databases became corrupted because we finished the free space on the partition (yes, shame on us).

To make things worse, we didn’t always have a recent backup of the db, which just grew the complexity of our problem.

The Errors we got were XSDG2 or XJ040

After discovering that some database were corrupted because the clients were trying to write data onto the database but the space on the devices was finished (error XJ040), we soon noticed that the databases corrupted themselves in three different ways.

The first one didn’t even allow us to access/boot the embedded database and gave this error

------------ BEGIN SHUTDOWN ERROR STACK -------------

ERROR XSDG2: Invalid checksum on Page Page(0,Container(0, 3632)), expected=1.828.510.840, on-disk version=3.463.751.691, page dump follows: Hex dump:

In the other situation the database was accessible but some table were inconsistent.

ERROR X0X62: Inconsistency found between table 'APP.T1' and index
'T1_I'. Error when trying to retrieve row location '(1,6)' from the
table. The full index key,including the row location, is '{ 1, (1,6) }'.
The suggested corrective action is to recreate the index.

The third one was a mix, because the database was accessible, but once we tried using a specific table, that table would make the connection crash (dunno if that’s the right term to use), making impossible every action.

Our first problem was: how do we get to boot the embedded db again?

How to gain access to Derby even if it won’t boot

This is an “hack” that worked many times for us.
If the database has some dirt transaction that are still pending in its internal logs then it might not even boot.

So in these cases the best way to access again the database is to replace the internal “log” folder with the one from an empty derby database.

This allowed us to get into the database with the ij utilities.

java -cp "derby.jar;derbytools.jar" org.apache.derby.tools.ij
ij> connect 'jdbc:derby:C:/tmp/private/data';

Checking the table consistency

Now that we accessed the db we checked if every table was ok with this command.

SELECT schemaname || '.' || tablename as TableName,SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename) AS OK FROM sys.sysschemas s, sys.systables t WHERE s.schemaid = t.schemaid   and t.tabletype = 'T'   ;

This command usually ended in 3 possibilities

  1. One or more tables were inconsistent
  2. The connection crashed when trying to check one table (usually the one that was writing when the space on device ended)
  3. Everything was ok (yes… sometimes things go well 😀 )

Case 1: Inconsistency – error X0X62

When we got the X0X62 error all we had to do is to call the SYSCS_COMPRESS_TABLE on the table and everything was instantly fixed.

ij> call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP','TABLE_NAME',1);

Case 2: Crash !

When the table crashed, we first re-executed the sql to know the exact table name by looking at the last table that didn’t have any error.

SELECT schemaname || '.' || tablename as TableName AS OK FROM sys.sysschemas s, sys.systables t WHERE s.schemaid = t.schemaid   and t.tabletype = 'T'   ;

Then we had only 2 options

  1. If we had a recent backup we could replace the corresponding _.dat_ table file and have the db working again.
  2. If we _didn’t_ have a recent backup we had to read the data of the corrupted table and reimport into a new clean database together with the non-corrupted one. Sounds complex but in fact it’s quite easy.

Replacing the corrupted table _.dat_ file from one of your backups 

To replace the .dat file with one of your backups all you have to do is discover what’s the right one.

To do that just execute this script (taken from the Derby Wiki )
notice: you need derby 10.10, if not I advice you to upgrade a copy of the db just to execute this, and then substitute the .dat on the original

CREATE FUNCTION bigintToHexString(hexString bigint)     RETURNS VARCHAR(16)     PARAMETER STYLE JAVA NO SQL     LANGUAGE JAVA      EXTERNAL NAME 'java.lang.Long.toHexString';

--select that will tell you which .dat file belongs to which table
select  CAST(TABLENAME as varchar(32)) as DB_table, CAST('C' ||  bigintToHexString (CONGLOMERATENUMBER)  ||  '.dat' as varchar(12)) as file, isindex from SYS.SYSCONGLOMERATES a, SYS.SYSTABLES b  where a.TABLEID = b.TABLEID AND b.TABLETYPE <> 'S' order by file;

Once you got the name, just replace it and you should be good to go.

I know it’s not the perfect solution but it worked well for us since the tables were usually enough fresh so we didn’t lose any data by using the backup.

How to read the data if you don’t have a backup

We had corrupted database we didn’t know how to recover because the connection kept crashing, making it impossible to work on them or recover data from the corrupted table.

Searching on internet I found this jira bug that was talking about dumping data from a corrupted db .

Following these instruction (that I copy just for future references in case the original goes down)

1.-Download lastest derby bin.
2.- Add DerbyRecovery-0.0.1-SNAPSHOT.jar to the ij classpath
3.- Run dataFileVTI.sql (remember to change the route to the database in the connect string)
3.- enter ij and connect to the database using upgrade=true
4.- use call SYSCS_UTIL.SYSCS_REGISTER_TOOL('customTool',true,'RawDBReader', 'CONTROL','RAW_','c:db','','APP',null); -> change the db route, password, bootpassword and schema
5.- check with select * from raw_app.table1;

we were able to read the data from the table! Awesome.
Once we got into this step this is what we’ve done

  • We created a db with all the non-corrupted data. You can use a simple tool like SQLSquirrel to transfer the data between the two database, or any other import-export procedure you might like.
    Remember: don’t try to copy the corrupted table, or the connection will crash.
  • Since the fix in the jira didn’t create a real table, we could only export the data by using the _SYSCS_UTIL.SYSCS_EXPORT_QUERY_ so we exported to a file and then reimported with a bit of work into the new db (don’t forget to recreate the table 😉 )

That’s it, it took a long time but it taught us many lessons, both in terms of backing up and how to handle this emergency situations.

Closing line

One thing I’ve discovered throughout this process of fixing corrupted databases was how well done was the Derby wiki.
Whenever you’re in trouble or you want some more info, it’s worth taking a look.

12 responses to “Recovering a Corrupted embedded Apache Derby Database after an error XJ040 (or if you got an error XSDG2)”

  1. Hello Andrea,
    I was very interested to read your blog. I have a Derby database which is corrupted and I cannot access. I followed your guide and entered this commands:
    “java -cp “derby.jar;derbytools.jar” org.apache.derby.tools.ij
    ij> connect ‘jdbc:derby:C:/my_database’;”
    But I could not connect.

    Would you be interested in looking at the problem for me? I would be willing to pay for your service if you think you have a chance to recover the data.
    thanks
    John

    Like

    1. Hi John, what error do you get in when you try to connect? (Right now I’m in Japan with no pc, but once I get home I’ll see if I got some ideas)

      Like

      1. Hello Andrea,

        This is the error:

        ij version 10.11
        ij> connect ‘jdbc:derby:C:/temp/ehour/1.4.3/home/db/ehourDb’;
        ERROR XJ040: Failed to start database ‘C:/temp/ehour/1.4.3/home/db/ehourDb’ with
        class loader sun.misc.Launcher$AppClassLoader@5e481248, see the next exception
        for details.
        ERROR XSDFI: Error encountered while trying to write data to disk during databas
        e recovery. Check that the database disk is not full. If it is then delete unne
        cessary files, and retry connecting to the database. It is also possible that t
        he file system is read only, or the disk has failed, or some other problem with
        the media. System encountered error while processing page Page(13,Container(0,
        1936)).
        ij>

        thanks for any advice you can offer.
        John

        Like

  2. Hello Andrea. I made some progress and have managed to recover some data, although unfortunately a lot of the entries are missing. I don’t understand why that is though. thanks John

    Like

    1. Good, what was the problem before? Some readonly files?
      Btw it has happened to me too that some data couldn’t be recovered, it really depends on the level of corruption.
      Did you used the method in the article? (The one with the class to read the raw data)

      Like

      1. I followed some other advice and had to load Derby db 10.11 and compile the java files in that, then I could connect ok. I’m still struggling to access the log.dat files though, as I think they must contain more of the missing data. I was using your advice and advice here:
        https://issues.apache.org/jira/browse/DERBY-6136
        https://issues.apache.org/jira/browse/DERBY-5195
        thanks, John

        Like

  3. Hi Andrea, I’m a java developer. I have used apache Derby database for database to my java desktop application. So I have a problem with this database, after a couples years, when I try to open my database, I have this strange error:

    java.sql.SQLException: Formato contenitore sconosciuto nel contenitore null : 1.701.080.940
    ….
    ….
    Caused by: ERROR XSDB2

    with apache derby community this is an error of database but I don’t know how can restore it.
    Can you help me?

    best regards
    Michele Castriotta

    Like

    1. Hi Michele 🙂
      It’s a really strange behaviour. Could it be that you’re using a different version of the db driver?
      Which version was the old derby? I guess it might be very old (10.3?) so some workaround might not work correctly as we hope

      Like

  4. Hi Andrea,

    Would you be interested in updating the article/posting an update in the comments on the changes made on the Jira ticket since this article was posted? As a hobbie Java Developer I’m trying to find the best solution for a program I’m compiling in Java 7 for backwards compatibility. The official implementation of that Jira was fully integrated into version 10.13 of Derby, which does not support Java 7. I’m thinking I would want to either use the solution you posted via my program, or try to use some of the source code that was posted later on in the Jira ticket which may be more reliable (hard to tell).

    Also, great read by the way, covers all the bases even two years later.

    Like

    1. Hi Nick and thanks 😉

      If you have the constraint to use java 7 I think there’s no choice but try to use the solution I gave through your program.

      As for the test with the 10.13, I’ll see if I can find an old corrupt db to see if using the 10.13 works out magically 😉 but I doubt I have some in my hd since it was long ago when I got this problem.

      Like

  5. hi how can view password database derby with ij, thanks

    Like

    1. Hi there, what do you mean? You need to access a password protected derby db with IJ? If so, do you have the password? (obvious question, but I wanted to be sure about the answer )

      Like

Leave a reply to Andrea Grassi Cancel reply