Published on 5/27/22 at 3:23 PM. Updated on 5/27/22 at 3:23 PM.
My data migration framework and its Symfony bundle.
Photo by Charles J. Sharp, CC BY-SA 3.0, via Wikimedia Commons.
My 3rd Symfony bundle is available. This time it's about data migration with my framework, Fregata. It's a "pet project", but maybe some people will find a good use case for it in their work.
This post only describes the projects. If you want to jump straight to the code, check the GitHub repos:
AymDev/Fregata
AymDev/FregataBundle
Many years ago, I did my first data migration: I had a MariaDB database with no foreign key constraints and obvious normalization errors, and I had to send everything to a PostgreSQL database with a better structure.
It was not a big database, but there were many tables, and some of them had tens of thousands of rows. As the migration was only supposed to run once, I started scripting it without much care: the idea of Fregata was born...
Well, in reality it was a disaster:
First of all, it was one long PHP script with no dependencies and a lot of SQL files. I had an export of the source database on a MariaDB server, and I'd first migrate it to another database with the desired structure on the same server. The original data was a mess, and the SQL queries were very long and complex. Then I'd export the data from the 2nd MariaDB database to CSV and send it to the PostgreSQL database.
Everything ran locally, took 7 hours each time, and failed multiple times. I obviously did it in the best possible way.
I did the same kind of complex migration between 2 DBMS multiple times after that, improving the process along the way. I finally published Fregata, which was initially database-oriented and became "source agnostic" starting from v1.
Fregata's job is to migrate data from a source A to a target B, and of course you can have multiple sources and targets. However, if your migration is that simple, chances are that other tools are more suitable.
The framework doesn't care where the data comes from, so you can mix multiple data sources during a migration.
The 2 main components of a migration are tasks and migrators.
Tasks are optional and let you do things unrelated to the data migration itself: database setup, directory creation, file deletions, ...
They can run before or after the migrators, so a migration runs in 3 steps: before tasks, migrators, then after tasks.
A migrator has 3 sub-components, which is what makes it possible to mix data sources.
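To picture how splitting a migrator into parts makes mixing sources possible, here is a minimal Python sketch. It is not Fregata's PHP API: the names `CsvPuller`, `ListPuller`, `ListPusher` and `run_migrator` are hypothetical, and it assumes the three parts are a reader for the source, a writer for the target, and a driver that connects the two.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Row = Dict[str, object]


class CsvPuller:
    """Hypothetical source reader: yields rows from CSV text."""

    def __init__(self, text: str) -> None:
        self.text = text

    def pull(self) -> Iterator[Row]:
        yield from csv.DictReader(io.StringIO(self.text))


class ListPuller:
    """Hypothetical source reader: yields rows already held in memory."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self.rows = list(rows)

    def pull(self) -> Iterator[Row]:
        yield from self.rows


class ListPusher:
    """Hypothetical target writer: collects rows instead of hitting a database."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def push(self, row: Row) -> int:
        self.rows.append(row)
        return 1  # number of items written


def run_migrator(puller, pusher, transform: Optional[Callable[[Row], Row]] = None) -> int:
    """Hypothetical driver: moves every pulled row to the pusher."""
    count = 0
    for row in puller.pull():
        count += pusher.push(transform(row) if transform else row)
    return count


# Two different sources feed the same target.
target = ListPusher()
legacy_csv = "id,name\n1,Ada\n2,Linus\n"
run_migrator(
    CsvPuller(legacy_csv),
    target,
    lambda row: {"id": int(row["id"]), "name": row["name"]},  # CSV values are strings
)
run_migrator(ListPuller([{"id": 3, "name": "Grace"}]), target)
```

Because the driver only relies on `pull()` and `push()`, swapping a CSV source for a database or an API would only mean writing another reader.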
We can also declare dependencies between migrators to ensure a consistent execution order. If a migrator A needs the migrated data from migrator B, then they will run in the order B-A.
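Ordering by dependencies is a topological sort. The Python sketch below shows the idea with the standard library's `graphlib` (Python 3.9+). It says nothing about how Fregata implements it internally; the function name `execution_order` and the shape of the dependency map are assumptions made for the illustration.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return migrator names so that each one comes after its dependencies.

    `dependencies` maps a migrator name to the names it depends on.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as error:
        # args[1] holds the list of nodes forming the detected cycle
        raise ValueError(f"circular migrator dependencies: {error.args[1]}") from error


# Migrator A needs the data migrated by B, so B must run first.
order = execution_order({"A": {"B"}, "B": set()})
print(order)  # ['B', 'A']
```

A circular declaration (A needs B and B needs A) cannot be ordered, which is why the sketch turns it into an explicit error instead of picking an arbitrary order.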
The framework itself is already based on Symfony components (Console, DependencyInjection, ...), so the service configuration related to migrations is unchanged in the bundle.
The bundle's main purpose is to run migrations with Symfony Messenger, so that multiple migrations and/or migrators can run at the same time.
When a migration run starts, each component (migration, tasks and migrators) is saved in the database to keep track of the migration state between Messenger messages.
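The reason for persisting state is that each message is handled independently, possibly by a different worker, so the only shared memory is the database. Here is a minimal Python sketch of that idea using SQLite. The table layout, the status names and every function name are hypothetical; the bundle's actual entities and Messenger handlers are not shown here.

```python
import sqlite3
from typing import Callable, Dict, Iterable, Tuple


def create_schema(db: sqlite3.Connection) -> None:
    """Create a hypothetical table holding one row per component of a run."""
    db.execute(
        """
        CREATE TABLE component (
            run_id INTEGER NOT NULL,
            name   TEXT NOT NULL,
            kind   TEXT NOT NULL,
            status TEXT NOT NULL,
            PRIMARY KEY (run_id, name)
        )
        """
    )


def start_run(db: sqlite3.Connection, run_id: int, components: Iterable[Tuple[str, str]]) -> None:
    """Save every (name, kind) component up front with a 'created' status."""
    db.executemany(
        "INSERT INTO component (run_id, name, kind, status) VALUES (?, ?, ?, 'created')",
        [(run_id, name, kind) for name, kind in components],
    )
    db.commit()


def set_status(db: sqlite3.Connection, run_id: int, name: str, status: str) -> None:
    db.execute(
        "UPDATE component SET status = ? WHERE run_id = ? AND name = ?",
        (status, run_id, name),
    )
    db.commit()


def handle_message(db: sqlite3.Connection, run_id: int, name: str, work: Callable[[], None]) -> None:
    """What a single message handler might do: record progress around the work."""
    set_status(db, run_id, name, "running")
    try:
        work()
    except Exception:
        set_status(db, run_id, name, "failed")
        raise
    set_status(db, run_id, name, "finished")


def run_summary(db: sqlite3.Connection, run_id: int) -> Dict[str, int]:
    """Count the components of a run per status."""
    rows = db.execute(
        "SELECT status, COUNT(*) FROM component WHERE run_id = ? GROUP BY status",
        (run_id,),
    )
    return dict(rows.fetchall())
```

Any handler can call `run_summary` to decide whether the run is complete, without knowing which worker processed the other components.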
Saving to the database also gives you the complete run history. This is why the bundle provides a web user interface where you can start a run and see the currently configured migrations, the run history, and the details of a run.
I have many more ideas for new features and for improving how Fregata works.
Feel free to contribute if you're interested in the project. I'd also love to hear your feedback if you've tried it, or just to learn about your use cases.
Bye!