Insights
Imagine Communications boosts performance and scalability with AWS cloud migration
THE CHALLENGE
The Imagine Communications Research and Development data team challenge was twofold. First, they had to reduce prohibitive overhead and operational costs associated with developing a new product on Azure using Microsoft SQL Server. Second, they needed to deliver a new analytics platform built to leverage their existing analytic tools, provide instant elasticity, integrate multiple data sources, and reduce time to deliver new products and services.
This analytics system transforms TV ratings data provided by Nielsen and ComScore spanning more than 200 geographic markets in a single serverless Data Lake.
THE SOLUTION
After a thorough consultation and evaluation, the Beyondsoft Big Data team proposed a solution built on AWS cloud-native services, comprised of Athena, Glue, Redshift, and Redshift Spectrum. The project also included the easy deployment of Redshift clusters and event-driven pipelines to allow Imagine Communications to move large amounts of daily generated data to a Redshift cluster.
Beyondsoft also implemented tuning and revision of the former workflow. Changes included:
- Historical data and real-time data provided by Nielsen and ComScore, the rating agencies, is now stored in S3, consisting of more than 1 million files and more than 50 TB of uncompressed data.
- The input data comes in a mainframe-like format, so a Lambda/Python function was used to convert to S3 in JSON format. Lambda functions were used for high concurrency data processing for S3 files and were invoked more than 1 million times to process more than 800,000 files in one hour.
- To save money, the S3/JSON data was converted to Parquet format using Spark which runs in Fargate containers.
- The Glue Catalog has more than 2,000 tables that is queried by Athena.
- The Parquet-formatted data stored in S3 was moved to Redshift Spectrum and Redshift Clusters via staging tables.
- The Redshift cluster, which is a 2-node RA3 configuration, houses more than 20TB of raw data for 4 years.
- The ETL into the Redshift cluster is done by Python/SQL using Fargate containers.
- The Reporting UI now communicates to the Redshift spectrum and Redshift cluster through C# API.
By implementing the AWS managed services stack and architectural advancements, the Beyondsoft Big Data team successfully delivered a scalable, truly elastic, and highly performant data solution that also met Imagine Communications’ stringent success criteria. By moving to the AWS cloud platform, the overall cost to operate the new solution dropped from an average of $200,000 per month to an average of $5,000 per month.
KNOWLEDGE TRANSFER
During the engagement with Imagine Communications, the Beyondsoft team also provided education and training, so that the Imagine Communications Data Analytics Team could take ownership of the newly developed environment. As a practice, the Beyondsoft Big Data Team developed and delivered a runbook, so that Imagine Communications could not only manage the newly provided data platform, but also add to the solution in the future.
BENEFITS
Migrating from Microsoft SQL Server on Azure to Redshift on Amazon Web Services provided Imagine Communications with increased performance, high scalability, and faster data processing by leveraging a sophisticated data lake solution.
Download Case StudyHow we do it
Our success factors over the years are a testament to driving your return on investment. Singapore is our global head office and we have 15 regional offices around the world.
Nearly 3 decades of strong IT consulting and services
Global presence across four continents
Certifications* in CMMI 5, ISO 9001, ISO 45001, and ISO 27001
~30,000 global experts
Microsoft Azure Expert MSP
ISO 9001 and 45001 (certificates issued to Beyondsoft International (Singapore) Pte Ltd). ISO 27001 (certificates issued to Beyondsoft International (Singapore) Pte Ltd, Beyondsoft (Malaysia) Sdn. Bhd., and Beyondsoft Consulting Inc., Bellevue, WA, USA)