Isn't data migration a major benefit of MongoDB compared to the commonly used relational databases?

Software Engineering Asked on January 4, 2022

I was reading an old answer which was recently updated, and noticed that the author doesn’t mention the simplicity of data migration as a benefit of MongoDB. I always thought that the major benefit of a NoSQL solution such as MongoDB is that it makes migrations extremely simple compared to the usual relational databases.

In fact, in the common relational databases, some changes in the schema, such as adding a column to a table, are simple. However, a lot of other changes (and from my experience, those are the ones which are the most frequent) could be extremely challenging.

Here’s an example of such a change. Recently, I got a project which was managed by a different team. In this project, they decided not to use natural keys, and this made any data manipulation very painful. Say I want to know if a given employee has the administrator role. Instead of just querying the user_role table, filtering by the employee ID, I have to first go to the user table to find the primary key corresponding to the employee ID, then to the role table to find the primary key corresponding to the administrator role, and finally SELECT from the user_role table, possibly putting the two primary keys in the wrong order and mistakenly thinking that the user is not an administrator, while he actually is.
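To make the lookup above concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names `user`, `role` and `user_role` come from the description; the column names, the sample rows and the `user_role_nk` table are assumptions made up for illustration. It contrasts the surrogate-key lookup (written here as a single join rather than three separate queries) with the same question asked against a link table that stores natural keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Surrogate-key design: user_role only holds opaque integers.
    CREATE TABLE user (id INTEGER PRIMARY KEY, employee_id TEXT UNIQUE);
    CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE user_role (user_id INTEGER REFERENCES user(id),
                            role_id INTEGER REFERENCES role(id));
    INSERT INTO user VALUES (1, 'E-1001'), (2, 'E-1002');
    INSERT INTO role VALUES (1, 'administrator'), (2, 'viewer');
    -- (1, 2) and (2, 1) mean opposite things, which is easy to mix up.
    INSERT INTO user_role VALUES (1, 2), (2, 1);

    -- Natural-key design (hypothetical): the link table is readable on its own.
    CREATE TABLE user_role_nk (employee_id TEXT, role_name TEXT);
    INSERT INTO user_role_nk VALUES ('E-1001', 'viewer'),
                                    ('E-1002', 'administrator');
""")


def is_admin_surrogate(conn, employee_id):
    """Resolve both surrogate keys through joins before reading user_role."""
    row = conn.execute(
        """SELECT 1
           FROM user_role ur
           JOIN user u ON u.id = ur.user_id
           JOIN role r ON r.id = ur.role_id
           WHERE u.employee_id = ? AND r.name = 'administrator'""",
        (employee_id,),
    ).fetchone()
    return row is not None


def is_admin_natural(conn, employee_id):
    """Read the link table directly; no other table is needed."""
    row = conn.execute(
        "SELECT 1 FROM user_role_nk "
        "WHERE employee_id = ? AND role_name = 'administrator'",
        (employee_id,),
    ).fetchone()
    return row is not None
```

Both functions give the same answer for the sample data; the difference is only in how much of the schema a person has to keep in mind while writing the query.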

The solution was to move to natural keys. However, because the original primary keys were referenced in other tables, I had to:

  1. Create extra tables (such as user2) with the new schema.
  2. Modify the application so that it handles two tables in parallel.
  3. Move the data to the new tables.
  4. Modify the application once again to give preference to new tables.
  5. Remove the old tables.
  6. Modify the application to handle the next renaming.
  7. Rename the new tables (user2 becoming user).
  8. Modify the application to forget about user2.

That’s… well… a lot of work for a change which looks so simple from a business perspective. If I were using MongoDB, I would simply:

  1. Change the application to start using natural keys, while still supporting the auto-incremented values.
  2. Migrate the data by replacing the auto-incremented identifier by the value of a natural key.
  3. Change the application once more to forget about auto-incremented values.

That’s like moving from a few days of work to something which could be performed in less than an hour, with much less risk of making a mistake somewhere in the process.
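The three MongoDB steps can be sketched roughly as follows. This is not driver code: the collection is modelled as a plain Python list of dicts, and the field names (`employee_id`, `roles`) and the helpers `find_user` and `migrate` are hypothetical. One caveat worth keeping in mind: in MongoDB the `_id` of an existing document cannot be modified, so a real migration would insert a new document and delete the old one rather than update in place, and any references held by other documents would need rewriting too.

```python
def find_user(users, employee_id):
    """Step 1: accept the natural key either as _id (new) or as a field (legacy)."""
    for doc in users:
        if doc["_id"] == employee_id or doc.get("employee_id") == employee_id:
            return doc
    return None


def migrate(users):
    """Step 2: rebuild each legacy document with the natural key as its _id."""
    migrated = []
    for doc in users:
        if "employee_id" not in doc:
            migrated.append(doc)  # already in the new shape, keep as-is
            continue
        new_doc = {k: v for k, v in doc.items()
                   if k not in ("_id", "employee_id")}
        new_doc["_id"] = doc["employee_id"]
        migrated.append(new_doc)
    return migrated


# Hypothetical legacy documents keyed by an auto-incremented value.
users = [
    {"_id": 1, "employee_id": "E-1001", "roles": ["viewer"]},
    {"_id": 2, "employee_id": "E-1002", "roles": ["administrator"]},
]
migrated = migrate(users)
```

Because `find_user` works on both shapes, the application keeps running while the data is being rewritten. Step 3 would then drop the `doc.get("employee_id")` branch.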

Am I missing something about the data migration in MongoDB compared to the data migration in most popular relational databases? Or is the simplicity of data migration still a huge benefit of MongoDB?

2 Answers

The simplicity of one process and the complexity of the other have both been greatly exaggerated. I'm sure the first one (RDBMS) can be either more complex or simpler, depending on a range of factors.

I believe the "every-table-must-have-an-autoincremental-pk-no-matter-what-and-disregard-natural-keys-no-matter-what" way of data modeling causes problems like the one you depict.

There's no doubt that making arbitrary changes is harder in a relational database, which complains about referential integrity, uniqueness, etc., than in a NoSQL database that doesn't. But I don't see why the changes in the application would be as trivial as you suggest in your second example.

The difficulty of the changes you describe in the first example comes from the database engine's strictness about relational rules. And in both cases the app needs to be changed and tested.

There's an old piece of advice that says something along the lines of "don't query tables, query views". An app usually queries more than it inserts or updates, so complex queries should already be turned into views that can be modified under the hood without breaking the app, although some changes are inevitable nonetheless. There should be very little code querying tables instead of views, and insertions and updates should be done through well-defined interfaces or in very few places.
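As a rough illustration of the "query views" idea, here is a sketch with Python's `sqlite3` module. The view name `employee_roles`, the column names and the sample rows are assumptions of mine, not something from the question. The application function only ever reads the view, so the tables underneath can move from surrogate keys to natural keys without that function changing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, employee_id TEXT UNIQUE);
    CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE user_role (user_id INTEGER, role_id INTEGER);
    INSERT INTO user VALUES (1, 'E-1001'), (2, 'E-1002');
    INSERT INTO role VALUES (1, 'administrator'), (2, 'viewer');
    INSERT INTO user_role VALUES (1, 2), (2, 1);

    -- The app reads this view, never the three tables directly.
    CREATE VIEW employee_roles AS
        SELECT u.employee_id, r.name AS role_name
        FROM user_role ur
        JOIN user u ON u.id = ur.user_id
        JOIN role r ON r.id = ur.role_id;
""")


def roles_of(conn, employee_id):
    """Application-side query: depends only on the view's columns."""
    rows = conn.execute(
        "SELECT role_name FROM employee_roles "
        "WHERE employee_id = ? ORDER BY role_name",
        (employee_id,),
    )
    return [r[0] for r in rows]


before = roles_of(conn, "E-1002")

# Schema change under the hood: store natural keys, then redefine the view.
conn.executescript("""
    CREATE TABLE user_role_nk AS
        SELECT employee_id, role_name FROM employee_roles;
    DROP VIEW employee_roles;
    DROP TABLE user_role;
    CREATE VIEW employee_roles AS
        SELECT employee_id, role_name FROM user_role_nk;
""")

after = roles_of(conn, "E-1002")  # same function, new physical schema
```

Code that writes to the tables still has to change, which is why it helps to keep writes in as few places as possible.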

I would suggest that stating "ease of migration" as an advantage of NoSQL databases is a bit misleading, because it might be a wild generalization: it all depends on the complexity of the business rules you have to enforce before and after the change, among other factors. It would be safer not to state that. An RDBMS does some things for you so that you don't have to write everything, whereas in NoSQL databases you have to enforce integrity yourself. So your mileage may vary.

On the other hand, the needs an RDBMS caters for can be wildly different from the needs a NoSQL database is required for.
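A small sketch of what "enforce integrity yourself" can mean in practice, again with made-up names. The first half lets SQLite reject a row that points at a role that does not exist; the second half models a schemaless store as plain Python objects, where the same check has to be written by hand in a hypothetical `add_role` helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE role (name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE user_role ("
    "employee_id TEXT, role_name TEXT REFERENCES role(name))"
)
conn.execute("INSERT INTO role VALUES ('administrator')")


def db_rejects_unknown_role(conn):
    """The database refuses the orphan row without any application code."""
    try:
        conn.execute("INSERT INTO user_role VALUES ('E-1001', 'no-such-role')")
    except sqlite3.IntegrityError:
        return True
    return False


# Schemaless store modelled as plain Python data (an assumption for
# illustration): nothing stops a bad write unless the application checks.
roles = {"administrator"}
user_roles = []


def add_role(employee_id, role_name):
    """Hand-written integrity check; returns False when the role is unknown."""
    if role_name not in roles:
        return False
    user_roles.append({"employee_id": employee_id, "role_name": role_name})
    return True
```

Every code path that writes `user_roles` has to go through a check like `add_role`, and that check has to be revisited whenever the data is migrated.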

Answered by Tulains Córdova on January 4, 2022

First, I’d like to narrow the migration benefit that you attribute to NoSQL in general down to document stores only, because migrating from anything to a row store, a graph database or a tuple store might require a serious mapping effort.

Document stores in general, and MongoDB in particular, do indeed offer great flexibility with regard to structure. So at first sight, since MongoDB can store anything in the form of an arbitrary document, it should logically be easy to migrate to it from a wide range of databases.

However, the devil is in the details: having the data in a document structure is one thing; using it is another. In the end, it’s a matter of trade-offs, as always. For example, when migrating from an RDBMS, you trade huge flexibility in querying (enabled by a very rigid structure) for huge flexibility in storing. So, on second thought, I would argue that the migration is not as easy if you consider the system as a whole.

Typically, a one-to-many relation in an RDBMS would be migrated to MongoDB either by embedding the dependent data or by using document references. You need to make that choice for the migration, and it is not always trivial.
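The two options for a one-to-many relation could look roughly like this. The sketch uses plain Python dicts in place of MongoDB documents, and the `customers`/`orders` data and all function names are hypothetical, chosen only to show how the choice made at migration time shapes the code that reads the data later.

```python
# Rows as they might come out of a relational export (made-up data).
customers = [{"id": 1, "name": "Ada"}]
orders = [
    {"id": 10, "customer_id": 1, "total": 25},
    {"id": 11, "customer_id": 1, "total": 40},
]


def embed(customers, orders):
    """Option A: embed the dependent rows inside the parent document."""
    return [
        {
            "_id": c["id"],
            "name": c["name"],
            "orders": [
                {"order_id": o["id"], "total": o["total"]}
                for o in orders
                if o["customer_id"] == c["id"]
            ],
        }
        for c in customers
    ]


def reference(customers, orders):
    """Option B: keep two collections and link them by identifier."""
    customer_docs = [{"_id": c["id"], "name": c["name"]} for c in customers]
    order_docs = [
        {"_id": o["id"], "customer_id": o["customer_id"], "total": o["total"]}
        for o in orders
    ]
    return customer_docs, order_docs


def total_spent_embedded(customer_doc):
    """With embedding, one document holds everything that is needed."""
    return sum(o["total"] for o in customer_doc["orders"])


def total_spent_referenced(customer_id, order_docs):
    """With references, the application follows the link itself."""
    return sum(o["total"] for o in order_docs if o["customer_id"] == customer_id)
```

Embedding makes reads of a customer's orders simple but ties the orders to their parent document; references keep orders independent at the cost of extra lookup code.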

For many-to-many relations, you might consider duplicating some data to make the related documents more convenient to use, or you may again opt for linking, but then you’ll have to code the use of those links.

And migrating to document references might require a multi-pass approach as well.

Of course, if you only look at the question from the angle of moving the data without losing anything, then it’s fine. But a more comprehensive view of migration will have to address many issues, many of them related to code.

Answered by Christophe on January 4, 2022
