Source: Migrating Teradata and other data warehouses to BigQuery from Google Cloud
Traditional, on-premises data warehouses collect and store what is often an organization’s most valuable data—which helps drive growth and innovation. Organizations depend on this data to make informed and timely decisions that can shape the future of their business. But we know that traditional data warehouses can be expensive, hard to maintain, and unable to keep up with business needs.
As data rapidly increases in volume, velocity, and variety, meeting business needs becomes especially hard. We know that businesses are turning to BigQuery, our highly scalable and serverless enterprise data warehouse, to perform fast, real-time analysis of their data.
When migrating your data warehouse, you’re moving what’s essentially the center of gravity of your entire data analytics and business intelligence environment. Many business applications depend on your data warehouse for reports, data feeds, and dashboards, and the users of these business applications expect minimal to no disruption during the migration. With all this in mind, we’ve created a new data warehouse migration guide to help walk you through data warehouse migrations with as little complexity and risk as possible. In the guide, you’ll find prescriptive, end-to-end guidance to securely migrate legacy data warehouses to BigQuery. Though the guide contains some sections specific to migrations from Teradata, you’ll be able to use the vast majority of the guide for any enterprise data warehouse migration.
Building the migration framework
A migration can be a complex and lengthy endeavor, but it can be made simpler with planning. As part of the migration guide, you’ll find our suggested structured framework for data warehouse migrations, based on Agile principles. The framework facilitates the application of project management best practices, helping to bring incremental and tangible business value while managing risk and minimizing disruptions.
The framework adheres to the phases shown in the following diagram, with more details below:
1. Prepare and discover: In this initial phase, you and your stakeholders get an early opportunity to discover the use cases you’re planning for BigQuery, raise initial concerns, and, importantly, conduct an initial analysis of the expected benefits.
2. Assess and plan: The assess-and-plan phase is about taking the input from the prepare-and-discover phase, assessing that input, and then using it to plan for the migration. This phase can be broken down into the following tasks:
Assess the current state
Catalog and prioritize use cases
Define measures of success
Create a definition of “done”
Design and propose a proof-of-concept (POC), short-term state, and ideal end state
Create time and cost estimates
Identify and engage a migration partner (if applicable)
The migration guide covers each of these tasks in more detail.
3. Execute: After you’ve gathered information about your legacy data warehouse platform and created a prioritized backlog of use cases, you can group the use cases into workloads and proceed with the migration in iterations.
An iteration can consist of a single use case, a few separate use cases, or a number of use cases pertaining to a single workload. Which option you choose for the iteration depends on the interconnectivity of the use cases, any shared dependencies, and the resources you have available to undertake the work. For example, a set of use cases might have the following relationships and dependencies:
Purchase reporting can stand alone and is useful for understanding monies spent and requesting discounts.
Sales reporting can stand alone and is useful for planning marketing campaigns.
Profit and loss reporting, however, is dependent on both purchases and sales, and is useful for determining the company’s value.
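One way to reason about these dependencies is to order the use cases so that nothing is scheduled before the use cases it depends on. The following is a minimal sketch, not part of the migration guide: the use case names, the `dependencies` mapping, and the `plan_iterations` helper are all hypothetical, and it assumes the dependencies contain no cycles. It uses the `graphlib` module from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical use cases, each mapped to the use cases it depends on.
dependencies = {
    "purchase_reporting": set(),
    "sales_reporting": set(),
    "profit_and_loss_reporting": {"purchase_reporting", "sales_reporting"},
}


def plan_iterations(deps):
    """Group use cases into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies completed.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


iterations = plan_iterations(dependencies)
for number, wave in enumerate(iterations, start=1):
    print(f"Iteration {number}: {', '.join(wave)}")
```

With this input, purchase and sales reporting land in the first wave and profit and loss reporting in the second. In practice a wave might still be split across several iterations, depending on the resources available.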
With each use case, you’ll want to decide whether it will be offloaded or fully migrated. Offloading focuses on time to delivery, where speed is the top priority, and fully migrating is about ensuring all upstream dependencies are also migrated. The following diagram shows the execution process and flow in greater detail:
During the execute phase, the work to fully migrate or offload a use case or workload should focus on one or more steps, and our guide includes a document dedicated to each of those steps.
Understanding the migration architecture
After each iteration in the execution phase, you’ll likely have some use cases offloaded to BigQuery, some fully migrated, and some still in your on-premises data warehouse. This iterative approach is enabled by an architecture where both your data warehouse and BigQuery can be actively used in parallel. This architecture allows you to take data warehouse migration one step at a time, breaking down its complexity and reducing risk.
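The three states a use case can be in after an iteration can be made concrete with a small sketch. It is an illustration under stated assumptions rather than anything the guide prescribes: the `Status` enum, the `upstream` mapping, the `in_bigquery` snapshot, and the `classify` helper are hypothetical. It treats a use case as fully migrated only when everything upstream of it is also fully migrated, and it assumes the dependencies contain no cycles.

```python
from enum import Enum


class Status(Enum):
    ON_PREMISES = "on-premises"
    OFFLOADED = "offloaded"
    FULLY_MIGRATED = "fully migrated"


# Hypothetical upstream dependencies (data feeds and other use cases).
upstream = {
    "purchase_reporting": {"purchase_feed"},
    "sales_reporting": {"sales_feed"},
    "profit_and_loss_reporting": {"purchase_reporting", "sales_reporting"},
}

# Hypothetical snapshot of what is already served from BigQuery.
in_bigquery = {"purchase_feed", "purchase_reporting", "profit_and_loss_reporting"}


def classify(item, upstream, in_bigquery):
    """Return the migration status of one feed or use case."""
    if item not in in_bigquery:
        return Status.ON_PREMISES
    # In BigQuery, but only fully migrated if every upstream dependency is too.
    for dependency in upstream.get(item, ()):
        if classify(dependency, upstream, in_bigquery) is not Status.FULLY_MIGRATED:
            return Status.OFFLOADED
    return Status.FULLY_MIGRATED


statuses = {name: classify(name, upstream, in_bigquery) for name in upstream}
for name, status in statuses.items():
    print(f"{name}: {status.value}")
```

In this snapshot, profit and loss reporting counts as offloaded because sales reporting is still on-premises, so part of its data would reach BigQuery through synchronization from the legacy warehouse.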
The next diagram illustrates the architecture, showing Teradata working on-premises and BigQuery on GCP, where both can ingest from the source systems, integrate with your business applications, and provide access to the users who need it. Importantly, you can also see in the diagram that data is synchronized from Teradata to BigQuery.
The data warehouse migration guide provides a wealth of prescriptive guidance so you can structure your migration project carefully and undertake each one of its challenges in a systematic manner. Our professional services organization and our partners are ready to assist you further in your migration journey, no matter how complex it may be. And check out our migration offer for help creating a streamlined path to a modern data warehouse.