We’re big fans of Amazon Redshift. (Well, we’re big fans of data, so anything that makes it easier to store, access, analyze and utilize data is going to appeal to us.)
We do have an eensy, teensy, tiny bone to pick with how Amazon Redshift portrays the ease of migrating to their platform.
As the first of its reasons to migrate to Redshift, Amazon says, “Amazon Redshift is fully managed and simple to use, enabling you to deploy a new data warehouse in minutes and load virtually any type of data from a range of cloud or on-premises data sources.”
We’ve been around the data block a few times, and we can’t recall the last time we saw a migration to Redshift that took a few minutes.
To be fair, Amazon isn’t claiming that the migration will take you a few minutes. They’re talking about the time it takes to “deploy a new data warehouse” in Redshift.
Setting up the data warehouse can take minutes. Moving in can take months… or even years.
The joys and trials of data migration
The last time I moved apartments, it took me a grand total of 3 hours to find the new apartment I wanted.
It then took weeks to pack up my old place and more weeks to unpack at my new place… and a year later I was still tripping over that One. Last. Box.
And this was moving from a two-bedroom to a three-bedroom apartment. Had I been packing and unpacking a mansion’s worth of possessions, I probably still wouldn’t be 100% settled in.
Packing up your gigabytes or terabytes of data assets, moving to Redshift, and settling in is usually a process that takes months to years. And during the migration you’re not only handling the move itself; you also need to keep up your data management work on the old system and, at some point, on the new system too.
Packing up and moving two sets of china dishes uses more resources (time, manpower, etc.) than packing up and moving one set. If I’m a dinner host extraordinaire and actually use both sets of china, the extra resources spent moving the second one are a necessary investment. If I only need and use one set, but I don’t realize that I have two in my china closet, moving them both is unnecessarily using up resources and complicating my move.
For organizations whose BI landscapes have grown organically over time, there is often redundancy and overlap in data assets or processes.
Moving redundant data is, well, redundant.
Octopai comes into play prior to a major migration to provide a comprehensive mapping of the objects, assets, tables, and processes across the systems in your current BI landscape. With a clear visual inventory of what you have, you can make informed decisions about what needs to be transferred to Amazon Redshift and what doesn’t.
For the assets that will be transferred, Octopai helps your BI team easily see the relationships between different data sources, greatly simplifying the migration process.
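To make the "inventory first, then decide" idea concrete, here is a minimal sketch in Python. It is not Octopai's API or output format; the asset names, the `LINEAGE` mapping, and both function names are made up for illustration. It assumes lineage can be expressed as "asset X feeds assets Y and Z" and then walks backward from the reports you actually use to see which assets they depend on. Anything left over is a candidate for leaving behind.

```python
from collections import deque

# Hypothetical lineage map: each asset -> the assets that read from it.
# All names are invented for this example.
LINEAGE = {
    "crm.customers": ["dw.dim_customer"],
    "erp.orders": ["dw.fact_orders", "dw.fact_orders_copy"],
    "dw.dim_customer": ["report.sales_dashboard"],
    "dw.fact_orders": ["report.sales_dashboard"],
    "dw.fact_orders_copy": [],  # a duplicate table that nothing reads
}


def assets_feeding_reports(lineage, reports):
    """Return every asset the given reports depend on, directly or indirectly."""
    # Reverse the edges so we can walk from a report back to its sources.
    upstream = {}
    for source, targets in lineage.items():
        for target in targets:
            upstream.setdefault(target, []).append(source)

    needed = set()
    queue = deque(reports)
    while queue:
        node = queue.popleft()
        for parent in upstream.get(node, []):
            if parent not in needed:
                needed.add(parent)
                queue.append(parent)
    return needed


def migration_candidates(lineage, reports):
    """Split the inventory into assets the reports need and assets nothing uses."""
    all_assets = set(lineage) | {t for targets in lineage.values() for t in targets}
    needed = assets_feeding_reports(lineage, reports)
    unused = all_assets - needed - set(reports)
    return sorted(needed), sorted(unused)


if __name__ == "__main__":
    needed, unused = migration_candidates(LINEAGE, ["report.sales_dashboard"])
    print("Migrate:", needed)
    print("Review before migrating:", unused)
```

In this toy inventory, the duplicate orders table is the second set of china: nothing downstream uses it, so it lands on the review list instead of the moving truck. A real landscape would have far more assets and edge cases (ad hoc queries, scheduled exports), so treat the "unused" list as a prompt for a conversation, not an automatic delete.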
During the Amazon Redshift migration process, your day-to-day BI work doesn’t need to come to a halt. Often it just gets harder, since part of your data is in your old systems, and part in your new systems. Figuring out where to look and when can make you feel like you’re seeing double. It’s a great recipe for a data migraine.
Octopai’s end-to-end data lineage, from the legacy/operational system through to reporting, makes it simpler to trace data into and through Amazon Redshift, along with all the other layers of the company’s data ecosystem. Put on your Octopai glasses, and your migraine will fade as your vision clears.
Less is more.
Simpler is faster.
When you’re migrating only what you need and leaving all the rest behind, you can move light, free and quick.
Also important to a smooth move is having a clear idea of what goes where. If your movers are carrying a box of china dishes into your new house, but don’t know where you intend to store your china, you can bet that it’s going to take a while until you’re settled. Drawing up a map of what you’re taking and where it fits into your new home environment will make the process much smoother and faster.
Octopai’s data lineage mapping enables you to check pre-migration, during migration, and post-migration that everything is going to the right place. Clear information and direction translate into a fast, incident-free move.
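One simple way to picture an "is everything in the right place?" check is a comparison of row counts per table between the old system and Redshift. The sketch below is a generic illustration, not how Octopai performs its checks. The function name and the table names are hypothetical, and it assumes you have already collected the counts from each side (for example, with a `SELECT COUNT(*)` per table) into plain dictionaries.

```python
def compare_row_counts(source_counts, target_counts):
    """Compare per-table row counts and return a dict of table -> problem description.

    An empty result means every source table arrived with a matching row count
    and nothing unexpected showed up in the target.
    """
    issues = {}
    for table, expected in source_counts.items():
        actual = target_counts.get(table)
        if actual is None:
            issues[table] = "missing in target"
        elif actual != expected:
            issues[table] = f"row count mismatch: source={expected}, target={actual}"

    # Tables that exist only in the target may have landed in the wrong place.
    for table in target_counts.keys() - source_counts.keys():
        issues[table] = "unexpected in target"
    return issues


if __name__ == "__main__":
    # Hypothetical counts gathered from the legacy warehouse and from Redshift.
    legacy = {"dim_customer": 1200, "fact_orders": 98000, "dim_product": 340}
    redshift = {"dim_customer": 1200, "fact_orders": 97500, "stg_orders_tmp": 12}

    for table, problem in sorted(compare_row_counts(legacy, redshift).items()):
        print(f"{table}: {problem}")
```

Matching row counts don't prove the data is identical, so this is a first-pass smoke test. Running it before cutover, during the move, and again afterward gives you a quick answer to "did that box make it to the right room?"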
Less expensive migration (and post-migration)
In addition to the money saved by reducing the time and resources needed for migration, Octopai can also save you money in your long-term Amazon Redshift use. Amazon Redshift uses a pay-per-use model. Depending on your plan, you pay for the number of bytes scanned when you run queries, or for computing time, or for the data you store.
Eliminating redundancy means less expensive storage and often less expensive computing.
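The arithmetic behind that claim is easy to sketch. The example below uses placeholder prices and volumes that are assumptions for illustration only, not Amazon's actual rates; check the current Amazon Redshift pricing page for your region and plan. The function name is hypothetical, and the model is deliberately simplified to two pay-per-use components: data stored and bytes scanned.

```python
def monthly_cost(stored_gb, scanned_tb, price_per_gb_month, price_per_tb_scanned):
    """Estimate a monthly bill from storage volume and query scan volume."""
    return stored_gb * price_per_gb_month + scanned_tb * price_per_tb_scanned


if __name__ == "__main__":
    # Placeholder rates (assumptions, not real pricing).
    PRICE_PER_GB_MONTH = 0.02
    PRICE_PER_TB_SCANNED = 5.00

    # Hypothetical landscape in which 30% of the data turns out to be redundant.
    before = monthly_cost(10_000, 40, PRICE_PER_GB_MONTH, PRICE_PER_TB_SCANNED)
    after = monthly_cost(7_000, 28, PRICE_PER_GB_MONTH, PRICE_PER_TB_SCANNED)

    print(f"Before cleanup: ${before:,.2f} per month")
    print(f"After cleanup:  ${after:,.2f} per month")
    print(f"Saved:          ${before - after:,.2f} per month")
```

The sketch assumes scan volume shrinks in proportion to stored data, which won't always hold; queries that never touched the redundant tables cost the same either way. That is why the saving on computing is "often" rather than "always."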
Additionally, Octopai’s data flow lineage is fully aligned with the detailed Amazon Redshift metadata repository, including any object defined in an Amazon Redshift database. Octopai can provide the full picture of which tables, columns, etc. are being used and how. That reduces the number of exploration queries you need to run over the Amazon Redshift reporting and analytics platform, which directly reduces expenses.
That’s music to your CFO’s ears.
Let’s get migrating
There’s no way around it; moving always takes time and resources.
But with Octopai as an integral part of your Amazon Redshift migration team, it won’t be long until you’re unpacking that Very. Last. Box.