Introducing Fregata and its bundle

Published on 5/27/22 at 3:23 PM. Updated on 5/27/22 at 3:23 PM.

My data migration framework and its Symfony bundle.

Photo by Charles J. Sharp, CC BY-SA 3.0, via Wikimedia Commons.


My 3rd Symfony bundle is available. This time it's about data migration with my framework, Fregata. This is a « pet project », but maybe some people will find a good use case for it in their work.

This post only describes the projects. If you want to jump to the code, check the GitHub repos:
AymDev/Fregata
AymDev/FregataBundle

Once upon a time ...

Many years ago, I did my first data migration: I had a MariaDB database with no foreign key constraints and obvious normalization errors, and I had to send everything to a PostgreSQL database with a better structure.

It was not a big database but there were many tables, and some of them had tens of thousands of rows. As the migration was supposed to run only once, I started scripting it without much care: the idea of Fregata was born...

Well, in reality it was a disaster:

First of all, it was a long PHP script without any dependency and a lot of SQL files. I had an export of the source database on a MariaDB server, and I would first migrate it into a second database with the desired structure on the same server. The original data was a mess, so the SQL queries were very long and complex. Then I would export the data from that second MariaDB database to CSV files and load them into the PostgreSQL database.
Everything ran locally, took 7 hours each time, and failed multiple times. I obviously did it in the best possible way.

I did the same kind of complex migration between 2 DBMS several times after that, improving the process along the way. I finally published Fregata, which was initially database-oriented and became « source agnostic » starting from v1.

The framework

Purpose

Fregata's job is to migrate data from a source A to a target B, and of course you can have multiple sources and targets. However, if your migration really is that simple, chances are that other tools are more suitable.

The framework does not care where the data comes from, so you can mix multiple data sources during a migration. For example:

  • send data from a MySQL database to a PostgreSQL one with a different structure
  • get CSV files to send the contents to multiple databases
  • fetch data from a relational database, associate file content with it and make calls to a REST API
  • ...

How it works

The 2 main components of a migration are tasks and migrators.

Tasks

Tasks are optional and let you do things unrelated to the data migration itself: database setup, directory creation, file deletions, ...

They can run before or after migrators. A migration runs in 3 steps:

  • « before tasks » for initialization purposes
  • migrators
  • « after tasks » for cleaning purposes
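To make the three steps concrete, here is a minimal sketch. It is written in Python for brevity and is not Fregata's actual API: `run_migration` and the step names are hypothetical, and each task or migrator is reduced to a plain callable.

```python
from typing import Callable, List

# Hypothetical stand-in: a task or a migrator is just a callable here.
Step = Callable[[], None]


def run_migration(
    before_tasks: List[Step],
    migrators: List[Step],
    after_tasks: List[Step],
) -> None:
    """Run the three phases in order; an exception in any step stops the run."""
    for phase in (before_tasks, migrators, after_tasks):
        for step in phase:
            step()


log: List[str] = []
run_migration(
    before_tasks=[lambda: log.append("create target schema")],
    migrators=[
        lambda: log.append("migrate users"),
        lambda: log.append("migrate orders"),
    ],
    after_tasks=[lambda: log.append("remove temporary files")],
)
```

After the call, `log` holds the four entries in the order before tasks, migrators, after tasks.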

Migrators

A migrator has 3 sub-components, which make it possible to mix data sources:

  • a « puller » to fetch data all at once or by batch
  • a « pusher » to send data in the right place while transforming it if needed
  • an « executor » which makes the connection between the 2: it calls the puller and progressively sends the data to the pusher
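The split between the three sub-components can be sketched as follows. This is an illustration in Python, not Fregata's PHP interfaces: `batch_puller`, `ListPusher` and `execute` are hypothetical names, and the in-memory lists stand in for real sources and targets.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, str]


def batch_puller(rows: List[Row], batch_size: int) -> Iterator[List[Row]]:
    """Puller: fetch the source data batch by batch."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


class ListPusher:
    """Pusher: transform a row if needed and send it to the target."""

    def __init__(self) -> None:
        self.target: List[Row] = []

    def push(self, row: Row) -> int:
        # Transformation step: clean up the name and rename the column.
        self.target.append({"full_name": row["name"].strip().title()})
        return 1  # number of items written for this row


def execute(batches: Iterable[List[Row]], pusher: ListPusher) -> int:
    """Executor: call the puller and progressively feed the pusher."""
    total = 0
    for batch in batches:
        for row in batch:
            total += pusher.push(row)
    return total


source = [{"name": " ada lovelace"}, {"name": "alan TURING "}, {"name": "grace hopper"}]
pusher = ListPusher()
migrated = execute(batch_puller(source, batch_size=2), pusher)
```

Because the executor only sees an iterable of batches and an object with a `push` method, the puller could read CSV files while the pusher calls a REST API, which is the kind of mixing described above.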

We can also declare dependencies between migrators to ensure a consistent execution order. If a migrator A needs the data migrated by a migrator B, then B will run first and A second.
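Ordering migrators by their dependencies is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to show the idea. How Fregata resolves the order internally is not described here, and the migrator names are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical migrators: each key depends on the migrators in its set.
dependencies = {
    "orders": {"users", "products"},
    "users": set(),
    "products": set(),
    "invoices": {"orders"},
}

# static_order() yields every migrator after all of its dependencies.
# A circular dependency would raise graphlib.CycleError instead.
order = list(TopologicalSorter(dependencies).static_order())
```

In the resulting `order`, "users" and "products" both come before "orders", and "orders" comes before "invoices". The relative position of "users" and "products" is not guaranteed, since neither depends on the other.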

The bundle

The framework itself is already based on Symfony components (Console, DependencyInjection, ...), so the service configuration related to migrations is unchanged in the bundle.

The bundle's main purpose is to run migrations with Symfony Messenger, so that multiple migrations and/or migrators can run at the same time.
When a migration run starts, each component (migration, tasks and migrators) is saved in the database to keep track of the migration state between Messenger messages.

Saving to the database also makes the complete run history available. This is why the bundle provides a web user interface where you can start a run and see the currently configured migrations, the run history and the details of a run.

What's next

I have many more ideas to bring new features or improve how Fregata works:

  • Symfony 6 support
  • Webpack Encore support for front-end
  • manually triggered tasks and migrators
  • integrations with various packages
  • ...

Feel free to contribute if you're interested in the project. I'd also love to get your feedback if you have tried it, or just to hear about your use cases.
Bye!
