General strategy for migrating relational data to document-based
I was recently asked how to migrate a project using relational data to a document-based approach (e.g. from .NET ORM Entity Framework into Marten). As always, it’s easy to ask a question but much harder to answer. But hell, let’s try!
Document approach differs from relational mainly by:
- using weak schema,
- having denormalised data.
Contrary to common belief, document data is structured but less rigidly, as in the relational approach. We can easily extend the schema for our documents, even for specific ones, by adding new fields. We should also not fail if the field we expect to exist, but doesn’t. That’s why probably it’s called weak schema. In my opinion, that’s not a weakness by definition. In fact, it may mean that we care about ourselves and apply backwards and forward compatibility.
Document databases work best if we access them by id, as they’re type of key-value databases. That means we should group related data and keep them together to use it efficiently. For instance, selected product items only make sense with the shopping cart. That’s why we should keep them together, use a shopping cart as a root document, and keep product items as nested collections.
Documents may also have relations. But also lightweight, without enforcement like relational databases with their foreign keys. Keeping the shopping cart example, we may have products’ definitions as separate documents and keep only their ids in the product items list. We may also take the tradeoff and keep basic information like names to reduce the need to access other documents. Document databases usually don’t provide efficient joins like relational databases. They typically do look-ups. Look-up means that you’re querying first one set of documents, then using data from them getting related documents.
So, how to migrate relational data into document-based? The simplistic answer is to migrate data by loading batches from the relational database, serialising them and storing them in batches in the document database. But that’s the technical recommendation. Moving from a relational way to the document one requires more than that, e.g. denormalising data and finding a way to break the strong relationships.
My general strategy would be:
-
Find root entities (e.g. for a shopping cart with product items, the shopping cart is the root entity).
-
Then check related properties to see if they can live independently.
-
If they can’t live separately, embed them inside the root entity. If they can, reference them by id.
-
Set those entities that can live on their own as documents.
-
Do this exercise and define policies for each table. Prepare the classes that will be loading and storing the data.
-
Set up a project to load data from a relational database and store them in a document database. The import process may take some time, so it should be run by a background worker. Remember to put extensive logging; this will help you troubleshoot issues.
-
Do a dry run and test the first migration for a single table.
-
After that, create a program that will load entities from relational and document databases and compare if they’re the same.
-
Check what type of indexes your document database provides and apply those that make sense.
-
Rinse/repeat for other data.
-
Consider also using the strangler fig pattern while migrating to your existing system to do it step by step.
Of course, you’ll need to figure out a lot of grey matter, but it’d be worth first making the change easy, then making the easy change. So, minimising the data model refactoring during migration.
It’s also better to start with simple mapping without changing schema too much(unless your data is simple). After it works, consider adjusting it to fit the document approach. Read also my other article where I lined up the heuristics on changing legacy design: What do the British writer and his fence have to do with Software Architecture?.
See also decent reference guides:
- HevoData - How to Migrate Relational Database to MongoDB?: Made Easy
- Couchbase - Migrating from Relational Databases,
- MongoDB Data Modeling.
Of course, those are just simple heuristics, as each migration should be made case by case, respecting the data we have and their business context.
Still, I hope that they will be helpful enough to start your journey and plan your migration strategy!
Cheers!
Oskar
p.s. Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help. You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to Red Cross, Ukraine humanitarian organisation or donate Ambulances for Ukraine.