Restore database dump
On this page, we walk through the steps to populate an empty Neo4j database using our database dump. This approach allows you to skip the resource-intensive bulk import process and start querying the full graph significantly faster.
The process involves:
- Downloading a multi-part archive of the database dump.
- Extracting the archive to a local directory.
- Loading the dump into your Neo4j instance.
Restore the dump if you want to sample application-specific communities or explore the graph interactively (e.g., querying multi-hop neighborhoods).
Skip it if you want a quick start for developing models using our generic, pre-sampled communities. In this case, you can jump straight to the g101 Jupyter Notebook or these quick-start examples.
Bandwidth: This process involves downloading nearly 1 TB of data; ensure you are using a stable connection without data caps.
Storage: Ensure you have at least 4.3 TB of free disk space (compressed download: ~800 GB, extracted database dump: ~800 GB, and populated Neo4j database: ~2.7 TB).
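The storage requirement above can be checked before starting the download. The following is a minimal sketch rather than part of the original instructions: `has_free_gb` is a hypothetical helper name, and it assumes 1 GB means 1024^3 bytes and that a POSIX `df` and `awk` are available.

```shell
# Succeed if the filesystem holding the given path has at least the
# requested number of gigabytes available (hypothetical helper).
has_free_gb() {
    local path="$1" required_gb="$2" available_kb required_kb
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    available_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    required_kb=$((required_gb * 1024 * 1024))
    [ "$available_kb" -ge "$required_kb" ]
}
```

For example, `has_free_gb /mnt/download/path 800 || echo "not enough space" >&2` reports a problem before any data is transferred. If the download, extraction, and database paths share one filesystem, check the combined figure instead.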
Prerequisites & setup
- Install the Neo4j graph database.
- Install the data source CLI: the AWS CLI.
- Install 7-Zip.
sudo apt update && sudo apt install p7zip-full -y
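Before continuing, it can help to confirm that the required tools are on the PATH. This is a small sketch under assumptions: `missing_tools` is a hypothetical helper, and the command names `neo4j-admin`, `aws`, and `7z` are taken from the commands used on this page.

```shell
# Print the name of every given command that is not found on PATH
# (hypothetical helper; prints nothing when all tools are present).
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}
```

Running `missing_tools neo4j-admin aws 7z` should print nothing once the prerequisites are installed.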
Download & extract archive
The database dump is compressed and split into 1070 chunks of 700 MB each (in data release v1) to make downloading more reliable.
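As a rough cross-check of the download size, the chunk count and chunk size can be multiplied out. This sketch assumes decimal units (1 GB = 1000 MB) and that every chunk is full-sized, which the last chunk may not be; `archive_size_gb` is a hypothetical helper name.

```shell
# Estimate the total archive size in GB from the chunk count and the
# chunk size in MB, using integer arithmetic and decimal units.
archive_size_gb() {
    local chunks="$1" chunk_mb="$2"
    echo $((chunks * chunk_mb / 1000))
}

archive_size_gb 1070 700   # prints 749
```

The result, about 749 GB, is in line with the rounded ~800 GB figure given for the compressed download.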
- Configure environment variables to specify the target directories for downloading and extracting the data.
# Set the download path for the multi-part archive.
# Requires ~800 GB free space.
export G_DOWNLOAD_PATH="/mnt/download/path"
# Set the extraction path for the archive.
# Requires ~800 GB free space.
export G_EXTRACT_PATH="/mnt/extract/path"
- Download the database dump files.
aws s3 sync s3://bitcoin-graph/v1/neo4j_db_dump/ "${G_DOWNLOAD_PATH}" --no-sign-request
- Extract the downloaded multi-part archive.
7z x "${G_DOWNLOAD_PATH}/neo4j.dump.gz.001" -o"${G_EXTRACT_PATH}"
By targeting the .001 file, 7-Zip will automatically detect and process the remaining parts in the sequence.
Note that decompressing ~700 GB of data is a heavy operation, and it will take several hours depending on your disk speed.
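Because extraction takes hours, it may be worth confirming that every part arrived before starting it. The sketch below is an assumption-laden convenience, not part of the official procedure: `count_parts` and `parts_complete` are hypothetical helpers, and they assume the parts are named `neo4j.dump.gz.` followed by a numeric suffix, as in the extraction command above.

```shell
# Count the archive parts present in a directory (hypothetical helper).
count_parts() {
    local dir="$1" n=0 f
    for f in "$dir"/neo4j.dump.gz.[0-9]*; do
        # When nothing matches, the unexpanded pattern is not an existing file.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Succeed only if the number of parts found equals the expected count.
parts_complete() {
    local dir="$1" expected="$2" found
    found=$(count_parts "$dir")
    if [ "$found" -ne "$expected" ]; then
        echo "found $found of $expected parts in $dir" >&2
        return 1
    fi
}
```

For data release v1, `parts_complete "${G_DOWNLOAD_PATH}" 1070` would be the matching call. A count check only detects missing files, not truncated ones; re-running the `aws s3 sync` command is a simple way to fetch anything that is absent or differs in size.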
Restore database dump
- Stop the database service.
sudo systemctl stop neo4j
- Restore the database.
sudo -u neo4j neo4j-admin database load neo4j \
    --from-path="${G_EXTRACT_PATH}" \
    --overwrite-destination \
    --verbose
Please refer to this page for documentation on the database load command.
Note: This step will take a significant amount of time (e.g., 24 hours) and requires at least 2.72 TB of free space in the Neo4j database path.
- Start the database service.
sudo systemctl start neo4j
- Enable APOC.
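The stop, load, and start steps can be wrapped in one function so that a failed load does not lead to the service being restarted. This is a sketch only: `run`, `restore_neo4j_dump`, and the `DRY_RUN` variable are hypothetical names introduced here, and the commands inside are the ones listed in the steps of this section. By default the function only prints what it would do.

```shell
# Print the command instead of running it, unless DRY_RUN is set to
# something other than 1 (for example DRY_RUN=0).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop Neo4j, load the dump from the given directory, then start Neo4j.
# The && chain stops at the first failing step, so the service is not
# restarted on top of a failed load.
restore_neo4j_dump() {
    local dump_dir="$1"
    run sudo systemctl stop neo4j &&
    run sudo -u neo4j neo4j-admin database load neo4j \
        --from-path="$dump_dir" \
        --overwrite-destination \
        --verbose &&
    run sudo systemctl start neo4j
}
```

Calling `restore_neo4j_dump "${G_EXTRACT_PATH}"` prints the three commands for review; setting `DRY_RUN=0` first executes them.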