Cloud data platforms: Not just a lift and shift
February 14, 2019
Author: James Hartley
It’s hard to believe that AWS launched in its current form over 12 years ago, but in that time its service offerings have evolved massively. For a long time, migrating data warehouse and analytics applications to the cloud made no sense: most line-of-business (LOB) applications were on-premises, and the costs (in time and money) of large data transfers into the cloud were prohibitive.
That equation is changing, with cloud becoming the ‘new normal’ for enterprise applications. So much so that the 2016 IDG Enterprise Computing Survey found that data and analytics were the leading workloads moving to cloud in 2017.
Cloud offers many benefits, including cost, scalability and resilience improvements. Attractively, it lets you convert large up-front CapEx outlays into OpEx distributed more evenly over the lifetime of your system. It also brings a huge number of new technologies available as SaaS (Software as a Service) – including artificial intelligence and machine learning applications – that are simply impractical for many businesses to set up on-premises.
This is why you should see migration to the cloud as an opportunity to re-architect your workload rather than simply reimplementing the same technologies in the cloud. In fact, an architecture optimised for on-premises operation may cost you more in the cloud!
To illustrate this, let’s use an example. We’ll take a very simple data warehouse environment that we will migrate using a couple of different approaches.
Let’s suppose that this warehouse:
- Uses change data capture technology to replicate data
- Uses a database cluster to store the data
- Has an ETL process that runs each night to complete batch processes
- Exposes reports and ad-hoc query capability via a BI Server.
This is a very basic configuration that would support a department; however, many of the concepts are the same for larger implementations. The key aspects of this architecture are that:
- All components are on-site and have a fixed capacity – frequently running at that same capacity 24 hours a day
- Costs are large and up-front, with ongoing licensing, support and maintenance
- Investment is proportional to the peak capacity provisioned – which may only be fully utilised 2 days a month for month-end processing!
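That last point is worth making concrete. A rough utilisation sketch – all hourly rates and usage figures below are illustrative assumptions, not real prices – shows how much of a peak-provisioned system's cost is spent on idle capacity:

```python
# Illustrative only: compare paying for peak capacity 24/7 with paying
# only for the hours of capacity actually used. All figures are assumed.
HOURS_PER_MONTH = 730

def fixed_capacity_cost(hourly_rate: float) -> float:
    """Cost of a system sized for peak load, running all month."""
    return hourly_rate * HOURS_PER_MONTH

def on_demand_cost(hourly_rate: float, hours_used: float) -> float:
    """Cost if you only paid while that capacity was in use."""
    return hourly_rate * hours_used

# Say month-end processing needs the full cluster for 2 days (48 hours),
# and the rest of the month could run on a quarter of that capacity.
peak_rate = 10.0  # assumed $/hour for the peak-sized cluster
fixed = fixed_capacity_cost(peak_rate)
elastic = (on_demand_cost(peak_rate, 48)
           + on_demand_cost(peak_rate / 4, HOURS_PER_MONTH - 48))
print(f"fixed: ${fixed:.0f}/month, elastic: ${elastic:.0f}/month")
```

Under these assumed numbers, elastic provisioning costs less than a third of the fixed peak-sized system – the exact ratio will vary, but the shape of the saving is the point.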
In this blog, we will look at a couple of scenarios for migration to a Cloud Environment:
- Cloud Lift and Shift – Reimplementing the same architecture on cloud infrastructure; and
- Cloud Re-architecture – Redesigning the capability to make best use of cloud offerings.
Cloud lift and shift is a frequent first step as it offers a faster, lower-risk entry point into the cloud. This approach is often used for customised LOB applications that have a low tolerance for risk and cannot easily be refactored or upgraded. However, is it appropriate for a data and analytics platform?
To deliver the same capability with a Lift and Shift approach, we would just re-provision the same servers in the cloud:
- On-premise CDC servers would be replaced with cloud CDC servers (maybe AWS EC2 instances)
- On-premises database servers would be replaced with cloud database servers

And so on for each layer…
This still provides benefits: infrastructure changes (independent of the software running on them) can be implemented far faster than the weeks often required on-premises. However, the architecture is:
- Likely to have a similar TCO, shifted from up-front CapEx to an ongoing infrastructure OpEx
- Still running 24/7 and constrained by the size of the servers that you provisioned
- Still costing you a fixed amount of money (even if no-one is using it or no data is being loaded!)
- Still subject to similar software licensing costs, where licences are required to match peak capacity
- Still requiring management and administration – exactly like an on-premises data warehouse.
However, all is not lost. Depending on your appetite for change you can:
- Save money by turning off certain parts of your architecture overnight (such as your reporting server)
- Clone the environment for non-production usage much more quickly
- Turn off development and test environments as required.
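Even without re-architecting, the overnight-shutdown saving is easy to estimate. A back-of-envelope sketch (the hourly rate and operating hours below are assumptions for the example, not real AWS prices):

```python
# Illustrative saving from stopping a reporting server outside business
# hours. Rates and hours are assumptions, not real cloud prices.
def monthly_hours(hours_per_day: float, days_per_month: int = 30) -> float:
    """Total running hours per month for a given daily schedule."""
    return hours_per_day * days_per_month

def monthly_cost(hourly_rate: float, hours: float) -> float:
    """Compute cost charged only while the server is running."""
    return hourly_rate * hours

rate = 0.50  # assumed $/hour for the reporting server
always_on = monthly_cost(rate, monthly_hours(24))           # runs 24/7
business_only = monthly_cost(rate, monthly_hours(12, 22))   # 12h, 22 business days
print(f"always on: ${always_on:.2f}, business hours only: ${business_only:.2f}")
```

Running only during business hours on business days cuts the bill by roughly two thirds in this sketch – before any change to the architecture itself.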
Whilst a re-architecture has a higher risk and effort profile, the changes can deliver significant new benefits to the business. These benefits include:
- Offering the ability to scale capacity up and down on demand by moving away from legacy technologies that were not built for rapid scalability – allowing the platform to accommodate expanded user needs faster
- Delivering a true consumption-based costing model, where costs track consumption closely. Therefore your business will not incur costs 24/7 to provide for peak capacity
- Reducing the administrative burden by using SaaS or PaaS (Platform as a Service) offerings, where everyday administrative tasks like patching, backups and storage management are handled by the cloud provider – allowing you to focus your team budget on higher-value innovation or application-level activities.
We achieved these benefits by making the following changes:
- Changed from a constantly running CDC server to the Amazon Kinesis Data Firehose service to capture change data from our source system – saving money by paying only for the usage required. For example, a stream capable of 1,000 × 5 KB records per second would cost around US$118 per month (~12 TB/month)
- Changed from a self-managed database in EC2 to a Redshift cluster – optimised for analytics. This reduces management and administration requirements; increases performance for data warehousing workloads; increases availability, recoverability and durability including automated backups; and is scalable without downtime
- Updated our ETL to run on ephemeral EMR clusters (using Apache Spark, a distributed data processing engine) instead of traditional ETL software on EC2 servers. These clusters are instantiated only for the duration of batch runs and can be tuned or resized as data volumes grow or shrink – without the long lead time of the traditional ETL infrastructure they replace
- Replaced our reporting solution with Amazon QuickSight to avoid paying for constantly running servers and separate BI tool licences. QuickSight operates as a managed service hosting the reporting environment and is priced from US$9 per month.
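The streaming volume quoted above is easy to sanity-check with a few lines of arithmetic. The per-GB rate below is back-derived from the US$118 figure in the example, purely for illustration – check current Kinesis pricing before planning around it:

```python
# Sanity-check the streaming example: 1,000 records/second at 5 KB each.
# The US$118/month figure comes from the example above; the implied
# per-GB rate is derived from it for illustration only.
RECORDS_PER_SECOND = 1000
RECORD_SIZE_KB = 5
SECONDS_PER_MONTH = 60 * 60 * 24 * 30

def monthly_volume_tb() -> float:
    """Monthly ingest volume in TB (binary units: 1 TB = 1024**3 KB)."""
    kb = RECORDS_PER_SECOND * RECORD_SIZE_KB * SECONDS_PER_MONTH
    return kb / 1024**3

implied_rate_per_gb = 118 / (monthly_volume_tb() * 1024)

print(f"volume: {monthly_volume_tb():.1f} TB/month")   # ~12.1 TB
print(f"implied rate: ${implied_rate_per_gb:.4f}/GB")
```

At roughly a cent per GB ingested, with no servers to run between loads, this is a very different cost model from a CDC server billed around the clock.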
This is admittedly a simplified example; it does not consider factors such as the cost of refactoring or redeveloping potentially complex components of your warehouse. However, it demonstrates some of the benefits realisable in even a simple warehousing environment.
In summary, to best utilise the true potential of cloud computing, it is necessary to look beyond replication of the status quo in a different environment and truly consider the opportunities that are available to the organisation.
Our tips would be:
- Evaluate serverless services to ensure that you are truly paying for consumption, not for an expensive virtual server to run 24/7
- Where serverless architectures are not available, consider offerings that scale readily in response to demand
- Automate your environments to enable fast provision of new environments for test and development
- Evaluate if you really need 24/7 Development & Test environments – you can turn off these environments overnight and on non-business days
- Put in place governance controls to monitor and adjust Cloud spend. Cloud providers have numerous tools to help, but it is important to monitor usage to ensure cost control
- Modularise your architecture to allow substitution of any one part and develop a roadmap for continuous improvement. For example, AWS last year announced hundreds of new products and services!
Working with us means engaging some of Australia's most experienced and skilled cloud architects, partnering with dedicated data scientists, and working with people who understand cloud environments inside out – people who know that even the most innovative and interesting technology needs to achieve real-world outcomes.
James Hartley is Head of Data Engineering with Arq Group and is responsible for developing the Data Engineering competency nationally. He has 10+ years’ experience in the Information Management industry, predominantly working on complex Data Warehousing and Business Intelligence projects, and has undertaken roles in Australia and the UK for organisations within Financial Services, Communications, Resources, Gaming, Education and Government.