What is migrate4j?

migrate4j is a database migration tool. Suppose you determine that you need a new database table for your project. If you develop alone, you could write an SQL script that adds a table and manually apply it to your development system. But if you work with other developers, or need to keep a test system in sync with your development system, this becomes tedious and error-prone.

Migration tools make it possible to add your new table (or make any other schema change) in an automated fashion, ensuring all your systems stay in sync. Migration tools also make it possible to quickly and easily roll back previous changes. Unlike typing commands into an interactive SQL window or storing SQL scripts, migration tools keep a detailed history of how your database schema evolved (just in case you need to go back to a previous version). Finally, migration tools minimize or eliminate the problem of having to use vendor-specific syntax. You may never switch database products, but if you do, using a migration tool will make your life much easier.

The initial intent of migrate4j was to make a Java version of Ruby's db:migrate. If you've used db:migrate, you probably fell in love with its simple syntax, easy configuration and ability to roll changes up and back effortlessly. The intent (and the challenge) of migrate4j is to bring the power and simplicity of db:migrate to Java programmers, using familiar type safety and syntax. Along the way, we're adding functionality that makes migrate4j more than just another Ruby tool rewritten for Java: it is a Java project intended to make other Java projects even better.

Quick Start

Configure migrate4j

Manually apply the following SQL to your database:

create table version (version int primary key);
insert into version values (0);

Create a file named migrate4j.properties and add the following text (replacing the values in [brackets] with actual values for your database):

connection.url=[jdbc connection url such as "jdbc:mysql://localhost:3306/mydb"]
connection.driver=[jdbc driver class name such as "com.mysql.jdbc.Driver"]
connection.username=[username]
connection.password=[password]
migration.package.name=db.migrations

Save this file to the source directory of your project.

Write a migration

Create a new package named db.migrations in the source directory of your project. Create a new Java file in the db.migrations package named Migration_1. Add the following code:

package db.migrations; 

import static com.eroi.migrate.Define.*;
import static com.eroi.migrate.Define.DataTypes.*;
import static com.eroi.migrate.Execute.*;
import com.eroi.migrate.Migration; 

public class Migration_1 implements Migration {

  public void up() {
    createTable(
      table("simple_table",
        column("id", INTEGER, primarykey(), notnull()),
        column("desc", VARCHAR, length(50), defaultValue("NA"))));
  }

  public void down() {
    dropTable("simple_table");
  }
}

Copy the migrate4j.jar and commons-logging.jar files into the source directory of your project. Then compile Migration_1 from that directory; the following command should accomplish this:

javac -classpath migrate4j.jar db/migrations/Migration_1.java

Apply the migration

Locate migrate4j.jar, commons-logging.jar and the jar for your JDBC database driver (the following assumes all three are in your project's source directory and that the driver jar is named mysql.jar). To apply Migration_1, run the following command from that directory (on Windows, separate the classpath entries with ; instead of :):

java -cp .:mysql.jar:migrate4j.jar:commons-logging.jar com.eroi.migrate.Engine

Your database should now include simple_table. To remove the table, rerun the Engine with a target version of 0:

java -cp .:mysql.jar:migrate4j.jar:commons-logging.jar com.eroi.migrate.Engine 0

Obtaining Migrate4j

Migrate4j can be downloaded from our SourceForge site. Visit our download page or our main SourceForge page for more information. For questions, comments or help, email our list at migrate4j-users AT lists.sourceforge.net (replacing the AT with the "at symbol").

Configuring migrate4j

Connection and package configuration

Configuring migrate4j is done through the Configure class. In some cases, simply providing a migrate4j.properties file is all you need. For example, the Ant task that comes with migrate4j and the command-line entry point of the Engine class (as shown in the Quick Start section of this manual) both handle the call to Configure automatically. However, the Configure class offers a lot of flexibility.

Before any migrations are applied (or rolled back), one of the configure methods must be called on the Configure class. These methods include (among others):

configure()
configure(Connection, String)
configure(String, String, String, String, String, String, String, String, Integer, String)

The no-argument option loads properties from a file named migrate4j.properties that is found on the classpath (a sample of this file comes with the migrate4j distribution file). If your connection details change often, or are unknown until deployment, this is a convenient option. This form of configure is generally good for automating migrations within a development environment (if you choose not to use the Ant task).
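To make the file format concrete, here is a minimal sketch that reads the keys from the Quick Start section using only java.util.Properties. It is not how migrate4j loads the file internally (this manual does not describe that). The class name PropertiesSketch, the sample values, and the idea that all five keys are required are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of migrate4j.
final class PropertiesSketch {

    // Keys taken from the Quick Start sample; treating all of them as
    // required is an assumption of this sketch.
    static final String[] KEYS = {
        "connection.url", "connection.driver", "connection.username",
        "connection.password", "migration.package.name"
    };

    // Parses properties-file text. Each key must sit on its own line.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the keys from KEYS that are absent from the given properties.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : KEYS) {
            if (props.getProperty(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties props = parse(
            "connection.url=jdbc:mysql://localhost:3306/mydb\n"
            + "connection.driver=com.mysql.jdbc.Driver\n"
            + "connection.username=dev\n"
            + "connection.password=secret\n"
            + "migration.package.name=db.migrations\n");
        System.out.println("Missing keys: " + missingKeys(props));
    }
}
```

A check like this can catch a common editing slip, such as two properties accidentally joined on one line, before the Engine is run.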

The configure method that accepts a Connection and String argument takes an active java.sql.Connection object and the fully qualified package name where your Migration classes reside. This option works well for situations where a Connection is already available. For example, if you're using a connection pool or can obtain a connection from a JNDI repository, using this form of the configure method allows passing in an existing connection. One thing to remember is that migrate4j will not close the connection; you need to do this yourself after running the Engine. This form of configure is a good choice for ensuring your database schema is at the latest version during application startup (perhaps in an initialization servlet or construction of a main JFrame).

There are also methods that take multiple arguments, allowing connection values to be set programmatically. If you'd like to set values in some fashion other than from a properties file, these methods allow passing values directly to the configure method. This form of configure lends itself well to form-based tools.

The javadocs provide more information on the various forms of configure. And again, if you choose to use the Ant task to run migrations, you won't need to explicitly call configure in your code.

The version table

The simplest way of providing a version table is to create a table named "version" with a single column, also named "version". This table must be created manually. The following SQL will generate the table for most database products:

create table version (version int primary key);
insert into version values (0);

You may name the table something other than "version", though you will need to tell migrate4j about this during configuration. The version column must be named "version".

Writing Migrations

The Up and Down methods

Database schema changes are defined in Java classes that implement the com.eroi.migrate.Migration interface. Besides implementing this interface, all of your Migration classes must reside in a single package and follow a naming convention. This allows migrate4j to locate your Migrations and determine the order in which to apply them.

The com.eroi.migrate.Migration interface defines two methods: up and down. The up method is called by migrate4j when Migrations are being applied (for example, going from version 0 to 1). The down method is called when Migrations are being rolled back (for example, when removing the changes made in version 1, returning the database to version 0). As you can imagine, any changes applied in the up method should be reversed in the down method.

Naming convention

There is no default package for Migration classes. The package name must be specified during the Configuration step. For example, if using the migrate4j.properties file, set the migration.package.name property.

The names of your Migration classes allow migrate4j to understand the order in which to apply (or roll back) your schema changes. By default, Migration_1 is your initial Migration, followed by Migration_2, then Migration_3 and so on. You may substitute both the Migration prefix and the separator (_) with other values by specifying these changes during configuration (though not all forms of configure currently allow this).
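The convention can be illustrated with a small, self-contained sketch that extracts the version number from a class name and sorts names into apply order. How migrate4j parses names internally is not described in this manual, so the class MigrationNames and its methods are hypothetical, and the numeric (rather than alphabetical) ordering is an assumption based on the Migration_1, Migration_2, Migration_3 sequence above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of migrate4j.
final class MigrationNames {

    // Returns the version encoded in a name such as "Migration_12", or -1
    // when the name does not match prefix + separator + digits.
    static int versionOf(String simpleName, String prefix, String separator) {
        String head = prefix + separator;
        if (!simpleName.startsWith(head)) {
            return -1;
        }
        String tail = simpleName.substring(head.length());
        if (tail.isEmpty()) {
            return -1;
        }
        for (int i = 0; i < tail.length(); i++) {
            char c = tail.charAt(i);
            if (c < '0' || c > '9') {
                return -1;
            }
        }
        try {
            return Integer.parseInt(tail);
        } catch (NumberFormatException e) {
            return -1; // more digits than an int can hold
        }
    }

    // Keeps only names that follow the default convention and sorts them by
    // version number, so Migration_10 comes after Migration_2.
    static List<String> inApplyOrder(List<String> simpleNames) {
        List<String> ordered = new ArrayList<>();
        for (String name : simpleNames) {
            if (versionOf(name, "Migration", "_") > 0) {
                ordered.add(name);
            }
        }
        ordered.sort(Comparator.comparingInt(
            (String name) -> versionOf(name, "Migration", "_")));
        return ordered;
    }
}
```

Sorting by the parsed number matters once you pass nine migrations: a plain alphabetical sort would place Migration_10 before Migration_2.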

Defining database changes

You define changes to your database schema with the com.eroi.migrate.Execute class. You can make your source code cleaner by statically importing this class (including import static com.eroi.migrate.Execute.*; in your class file). The Execute class includes methods for many schema changes you may require, such as adding/dropping tables, columns, indexes and foreign keys. Support for stored procedures, triggers, rules, etc. is not yet implemented but may be added in the future.

The com.eroi.migrate.Execute class methods take objects that can be obtained through the com.eroi.migrate.Define class. Again, statically importing this class (and its DataTypes enum) will significantly clean up your source code. The Define class provides methods for creating objects that represent tables, columns, indexes and foreign keys. Keep in mind, these objects are simply intended for schema definitions; do not expect a table object to contain rows of records.

While migrations are mainly intended to create database structure, you can also use them to add records to your tables (useful for adding static records in lookup tables). You can obtain a connection to the database through the getConnection method of the com.eroi.migrate.Configure class. Use this connection to add, remove or modify records, but do not close it when you're finished (this is difficult for many Java developers since we're so accustomed to closing JDBC objects when finished with them).
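The following sketch shows one way to honor that rule with plain JDBC: the statement is closed, the connection is not. The LookupRows class, the status_lookup table and its id and label columns are all hypothetical. SQLException is wrapped in an unchecked exception because the up and down methods shown in this manual declare no checked exceptions; whether migrate4j expects a different error-handling style is not covered here.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of migrate4j.
final class LookupRows {

    // Inserts one row per label into the hypothetical status_lookup table
    // and returns the number of rows inserted. The statement is closed by
    // try-with-resources; the connection is deliberately left open for its
    // owner.
    static int insertStatuses(Connection connection, String... labels) {
        String sql = "insert into status_lookup (id, label) values (?, ?)";
        int inserted = 0;
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            for (int i = 0; i < labels.length; i++) {
                statement.setInt(1, i + 1);
                statement.setString(2, labels[i]);
                inserted += statement.executeUpdate();
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Could not insert lookup rows", e);
        }
        return inserted;
    }
}
```

Inside an up method, a helper like this would be handed the connection returned by Configure's getConnection method, after the createTable call that creates the table.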

Applying Migrations

Applying and rolling back migrations is done with the com.eroi.migrate.Engine class. Its migrate method can be called with a specific version number, or without any arguments. For example, calling Engine.migrate(0) means roll back (run the down method of) all Migration classes found in the migration package. On the other hand, calling Engine.migrate() means apply (run the up method of) all Migrations in the migration package.

Using a specific version number may either apply or roll back Migrations, depending on the database's current version number. For example, if the current database version is higher than 5, calling Engine.migrate(5) will roll back all Migrations numbered higher than 5 (Migration_6 and above) in the migration package. On the other hand, if the current database version is below 5, the Engine will apply all Migrations up to and including Migration_5. If the database is currently at version 5, the call has no effect.
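The rule above can be written down as a short planning function. This is only a sketch of the behavior as described, not migrate4j's actual implementation. The class MigrationPlan is hypothetical, and the ordering of rollbacks (highest version first) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of migrate4j.
final class MigrationPlan {

    // Lists the steps needed to move from the current version to the target
    // version, in execution order, as strings such as "up 4" or "down 7".
    static List<String> plan(int current, int target) {
        List<String> steps = new ArrayList<>();
        if (target > current) {
            // Apply every migration above the current version, up to and
            // including the target.
            for (int v = current + 1; v <= target; v++) {
                steps.add("up " + v);
            }
        } else {
            // Roll back every migration above the target, newest first.
            // When current == target the loop body never runs.
            for (int v = current; v > target; v--) {
                steps.add("down " + v);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(plan(7, 5)); // database at 7, target 5
        System.out.println(plan(3, 5)); // database at 3, target 5
        System.out.println(plan(5, 5)); // already at the target
    }
}
```

With a target of 5, a database at version 7 runs the down methods of Migration_7 and Migration_6, a database at version 3 runs the up methods of Migration_4 and Migration_5, and a database at version 5 runs nothing.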

Rolling back is a normal part of the development process. In fact, if you unit test your database persistence layer (for example, with DbUnit), you might consider a full rollback and reapplication of migrations before running your test suite. This ensures that changes to Migration classes are applied prior to running unit tests. After checking out or updating code from your version control system, roll back your database schema (perhaps with the Ant task) to a known stable version (such as 0), compile all new files, and then apply all Migration classes. Experiment to see what works best in your development environment.

Deploying to your production systems is different. Obviously, rolling back a production database is not wise since any drop statements will result in data loss. If you find that a schema change needs to be rolled back on your production systems, create a new Migration class that makes the change. In general, you never want to use anything other than Engine.migrate() on a production system.

When you first create a new Migration class, it's likely you won't get it right the first time. Perhaps you create a column with the wrong data type, or you decide a name you've used isn't descriptive enough. It's wise to keep Migrations that are being developed separate from Migrations that have been deployed to production. To make this easier, create two separate packages, one for developing and one for production-ready classes. For example, create new Migration classes in a dev.migrations package. When you're confident they are ready to deploy to production, copy them to db.migrations. Your development system can pass dev.migrations as its migration package name during configuration while your production system can use db.migrations.

The Ant Task and command line Engine

While your production application will probably be limited to loading new Migration classes at startup, you'll likely want more flexibility while developing. Migrate4j comes with a couple of basic tools to help.

The com.eroi.migrate.Engine class has a main method, allowing it to be launched as a standalone application. You can call this from the command line, but you will need to pass your classpath (using the -cp switch) and have a migrate4j.properties file on that classpath.

Automating schema changes to coincide with Migration class changes is made easier with the included Ant task. The migrate4j distribution file contains a file named build.sample.xml that shows how to include the migrate4j Ant task in your build files.

Getting Help

It's possible to create migrations that cause inconsistencies in your schema. If rolling back fails, it's possible that reapplying will fail and you are stuck. This is not a flaw of migrate4j, but rather a reality of programming (for migration tools, this situation is analogous to an endless loop). The easiest remedy is to manually drop all the tables in your database, manually set the version column on your version table back to 0, and then reapply your Migration classes (after you've fixed the problem that caused the error). Of course, there's no need to roll all the way back to 0 - if you can identify a known state prior to that, manually reset the database back to that point.

If you find other issues, or need additional help, by all means email the migrate4j mailing list at migrate4j-users AT lists.sourceforge.net (replacing the AT with the "at symbol"). Someone will get back to you to try to assist you.

Currently, migrate4j supports a very small number of databases. Unfortunately, DDL (data definition language) is much less consistent between database vendors than DML (data manipulation language). Therefore, it's quite possible that migrate4j will not work with your database - our status page contains more details. Again, email the mailing list and indicate which database product you are using. There may be work being done on your product that just hasn't made it into a release yet - your email may be the trigger to get it finalized.

We hope you enjoy using migrate4j and find it useful. Comments, suggestions and questions are encouraged, so don't be shy - send us an email and let us know what you think!
