Online Backup
From Neo4j Wiki
This page will teach you how to use the Neo4j online backup component.
Online backup basics
The online backup utility can be used to synchronize a destination neo4j database from a source neo4j database. The source database is a running EmbeddedGraphDatabase instance, which can continue to run as usual during the backup. The destination is either a running EmbeddedGraphDatabase or a filesystem location containing a neo4j database. The destination database has to start out as a copy of the files of the original datastore.
All completed transactions for all included data sources will be copied to the backup. Transactions that are still open don't affect the backup and are not included in it.
The component information is located at: http://components.neo4j.org/neo4j-online-backup/
Adding online-backup as a Maven dependency is done like this. The snippet uses version 0.5; 0.4-SNAPSHOT was noted as the latest as of 2010-01-13, so check the component page for the version you need:
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-online-backup</artifactId>
    <version>0.5</version>
</dependency>
If you want to download the component as a jar file, it's found here: http://m2.neo4j.org/org/neo4j/neo4j-online-backup/
Database configuration
The backup relies on using the logical logs, so the original (source) database has to be configured to keep the logs:
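// Open the source database; STORE_LOCATION_DIR is the path to its store directory
EmbeddedGraphDatabase graphDb = new EmbeddedGraphDatabase( STORE_LOCATION_DIR );
// Look up the neo4j data source by its name
XaDataSourceManager xaDsMgr = graphDb.getConfig().getTxModule().getXaDataSourceManager();
XaDataSource dataSource = xaDsMgr.getXaDataSource( "nioneodb" );
// Keep the logical logs so the backup can use them
dataSource.keepLogicalLogs( true );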
Note: All data sources included in a backup have to be set to keep their logical logs.
There are also settings for auto-rotating the logs. These are the corresponding methods, using the default values for the settings:
dataSource.setAutoRotate( true );
dataSource.setLogicalLogTargetSize( 10 * 1024 * 1024 ); // 10 MB
How to perform backup
Note: The very first backup has to be performed by shutting down the neo4j database and copying its files to the backup location. All subsequent backups can then be performed online using the online backup utility to keep the backup in sync with the live database.
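The initial offline copy can be done with any file copy tool. As a minimal sketch in Java, assuming Java 8 or later and that the source database has already been shut down, a recursive copy of the store directory could look like this. The class name InitialBackupCopy and the method copyStore are hypothetical helpers, not part of the online backup component.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical helper, not part of the online backup component.
final class InitialBackupCopy {

    /**
     * Recursively copies the store directory to the backup location.
     * Call this only after the source database has been shut down.
     */
    static void copyStore(Path source, Path target) throws IOException {
        try (Stream<Path> paths = Files.walk(source)) {
            // Files.walk visits a directory before its contents, so every
            // parent directory exists by the time its files are copied.
            for (Path path : (Iterable<Path>) paths::iterator) {
                Path destination = target.resolve(source.relativize(path).toString());
                if (Files.isDirectory(path)) {
                    Files.createDirectories(destination);
                } else {
                    Files.copy(path, destination, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: InitialBackupCopy <store-dir> <backup-dir>");
            return;
        }
        copyStore(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Subdirectories of the store are copied along with the top-level files, so the directory structure of the backup matches the source.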
The backup method can differ in two ways:
- destination is a running EmbeddedGraphDatabase instance vs. only the location of a neo4j database is given
- there is just a single data source (e.g. neo4j) vs. multiple data sources (e.g. neo4j + lucene)
We will walk you through the different alternatives below.
Single data source; backup to filesystem location
EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp();
String location = "/var/backup/neo4j-db";
Backup backup = new Neo4jBackup( graphDb, location );
backup.doBackup();
That's it.
Note: If there is a problem writing to the file system location, Backup.doBackup() will throw an IOException.
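One way to deal with that exception is to wrap the call and report the failure to the caller. The following is only a sketch, assuming Java 8 or later: BackupAction is a hypothetical stand-in for the component's Backup type (only doBackup() is modeled), and runBackup is an illustrative helper name.

```java
import java.io.IOException;

// Hypothetical helper, not part of the online backup component.
final class BackupErrorHandling {

    /** Stand-in for the component's Backup type; only doBackup() is modeled. */
    interface BackupAction {
        void doBackup() throws IOException;
    }

    /** Runs one backup and reports whether it completed. */
    static boolean runBackup(BackupAction backup) {
        try {
            backup.doBackup();
            return true;
        } catch (IOException e) {
            // Writing to the backup location failed; report it and let the
            // caller decide what to do next (retry later, alert, ...).
            System.err.println("Backup failed: " + e.getMessage());
            return false;
        }
    }
}
```

With the real component the call would presumably look like runBackup( backup::doBackup ), given that doBackup() throws IOException as the note states.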
Single data source; backup to running backup database
EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp();
String location = "/var/backup/neo4j-db";
EmbeddedGraphDatabase backupGraphDb = new EmbeddedGraphDatabase( location );
Backup backup = new Neo4jBackup( graphDb, backupGraphDb );
backup.doBackup();
backupGraphDb.shutdown();
Not much to say here. Feed both databases to Neo4jBackup and you should be fine.
Multiple data sources; backup to filesystem location
For now, this variation assumes that you're running a neo4j service together with a lucene index service (from the index component).
In this case, the Neo4jBackup constructor is given a list of data source names. The names in the example are the typical names when running neo4j + lucene.
EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp(); // assume lucene is hooked into this instance
String location = "/var/backup/neo4j-db";
Backup backup = new Neo4jBackup( graphDb, location,
    new ArrayList<String>()
    {
        {
            add( "nioneodb" );
            add( "lucene" );
        }
    } );
backup.doBackup();
Multiple data sources; backup to running data sources
In this case, the neo4j source and destination instances are used to look up the data sources in the list of names.
(TODO: missing info: how to wrap your data source to be used together with neo4j)
EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp();
String location = "/var/backup/neo4j-db";
EmbeddedGraphDatabase backupGraphDb = new EmbeddedGraphDatabase( location );
// Hook lucene into the backup database so its data source can be looked up
IndexService backupIndexService = new LuceneIndexService( backupGraphDb );
Backup backup = new Neo4jBackup( graphDb, backupGraphDb,
    new ArrayList<String>()
    {
        {
            add( "nioneodb" );
            add( "lucene" );
        }
    } );
backup.doBackup();
// Shut down the index service before the database it is attached to
backupIndexService.shutdown();
backupGraphDb.shutdown();
Manually transferring and applying logical logs
If you have a running Neo4j graph database which is set up to keep its logical logs, you can manually copy or move rotated logical logs from the server and have a client apply them on a destination database. The first step is still to start with a copy of the source database, and from there apply new logs incrementally whenever you like.
This is done by starting a new JVM and running the org.neo4j.onlinebackup.ApplyNewLogs main class with the path to the destination database, where you've put the logical logs copied or moved from the source database (keeping the directory structure from the source database). It will then apply those logs to the destination database. Example, assuming you have a running source database in /var/db and a destination database (originating from the source database at some point) in /var/backup-db:
mv /var/db/*log.v* /var/backup-db/
java -cp $CLASSPATH_INCLUDING_ONLINE_BACKUP_AND_ITS_DEPENDENCIES \
org.neo4j.onlinebackup.ApplyNewLogs /var/backup-db
If you're using LuceneIndexService/LuceneFulltextIndexService as well, you'll also have to move or copy their logs. The script can be extended like this:
mv /var/db/*log.v* /var/backup-db/
mv /var/db/lucene/*log.v* /var/backup-db/lucene/
mv /var/db/lucene-fulltext/*log.v* /var/backup-db/lucene-fulltext/
java -cp $CLASSPATH_INCLUDING_ONLINE_BACKUP_AND_INDEX_AND_THEIR_DEPENDENCIES \
org.neo4j.onlinebackup.ApplyNewLogs /var/backup-db
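If the transfer step is driven from Java rather than from a shell script, the mv commands above can be approximated as in the following sketch. It assumes Java 8 or later and reuses the *log.v* pattern and the lucene and lucene-fulltext subdirectory names from the script; the class MoveRotatedLogs and its methods are hypothetical helpers, not part of the online backup component. Applying the logs afterwards is still done by running ApplyNewLogs as shown above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the online backup component.
final class MoveRotatedLogs {

    /** Moves every regular file matching *log.v* from sourceDir to targetDir. */
    static int moveLogs(Path sourceDir, Path targetDir) throws IOException {
        if (!Files.isDirectory(sourceDir)) {
            return 0; // e.g. the store has no lucene directory
        }
        // Collect the matches first so files are not moved while iterating.
        List<Path> logs = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(sourceDir, "*log.v*")) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {
                    logs.add(candidate);
                }
            }
        }
        Files.createDirectories(targetDir);
        for (Path log : logs) {
            Path destination = targetDir.resolve(log.getFileName().toString());
            Files.move(log, destination, StandardCopyOption.REPLACE_EXISTING);
        }
        return logs.size();
    }

    /** Moves rotated logs of the store and of both lucene subdirectories. */
    static int moveAll(Path sourceDb, Path backupDb) throws IOException {
        int moved = moveLogs(sourceDb, backupDb);
        for (String sub : new String[] {"lucene", "lucene-fulltext"}) {
            moved += moveLogs(sourceDb.resolve(sub), backupDb.resolve(sub));
        }
        return moved;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: MoveRotatedLogs <source-db-dir> <backup-db-dir>");
            return;
        }
        System.out.println(moveAll(Paths.get(args[0]), Paths.get(args[1])) + " log file(s) moved");
    }
}
```

Files that do not match the pattern are left in the source directory, and the directory structure of the source database is preserved in the destination.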
Backup logs
By default, backup logs are sent to standard error (usually the console). You can also enable logging to a file, which is off by default, using the following method call:
backup.enableFileLogger();
The log file will be named backup.log and created or appended to in the current working directory.
Changed your mind? Then go:
backup.disableFileLogger();
There are three different log levels to choose from:
backup.setLogLevelNormal(); // default, few lines of output
backup.setLogLevelDebug(); // detailed output
backup.setLogLevelOff(); // no output at all
This setting affects both console and file log output.
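Because the file log is written relative to the current working directory, it can help to print where backup.log is expected to end up before enabling it. The sketch below assumes that "current working directory" means the JVM's user.dir system property; BackupLogLocation is a hypothetical helper, not part of the component.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the online backup component.
final class BackupLogLocation {

    /** Expected location of backup.log, assuming it is resolved against user.dir. */
    static Path expectedLogFile() {
        return Paths.get(System.getProperty("user.dir"), "backup.log");
    }

    public static void main(String[] args) {
        System.out.println("backup.log is expected at: " + expectedLogFile());
    }
}
```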
Summary
In summary, this is what you have to do:
- Preparation:
  - shut down the database, copy all the files to the backup location
  - configure the database to keep its logical logs in the future
- Performing backup:
  - instantiate Neo4jBackup according to your scenario (location/running, single/multiple data sources)
  - configure file output and log level of the backup log
  - off you go, doBackup()!

