Source: Working with Qubole to tackle the challenges of machine learning at an enterprise scale from Google Cloud
With virtually unlimited storage and compute resources, the cloud has emerged as the prime location for enterprises doing large-scale big data and machine learning projects. Enterprises need ever more sophisticated technology to quickly innovate with data projects in the cloud, without compromising ease of use, scale, and security. At Google Cloud, we’re building cloud infrastructure that’s flexible and open-source-centric to meet customer needs.
Our partners are essential to our mission of helping customers grow their tech capabilities and their businesses. Qubole, a recently announced Google Cloud Platform (GCP) partner, offers an integrated cloud-native data analytics platform. Qubole provides GCP users with a unified, self-service platform where data scientists and data engineers can collaborate using familiar tools and languages, as well as performance-optimized versions of open source data processing engines. The Qubole data platform provides a range of optimized open-source engines, including Apache Spark, TensorFlow, Presto, Airflow, Hadoop, Hive, and more. With Qubole, you can quickly combine and analyze data from BigQuery and data lakes on Cloud Storage.
Building modern machine learning models
We’ve heard great stories from customers using the Qubole platform for powerful analytics, including Recursion Pharmaceuticals, True Fit, and AgilOne. AgilOne provides a customer data platform for its enterprise users that supports real-time use cases and large volumes of data. To do that, AgilOne operates complex machine learning (ML) models and stores vast quantities of data using Qubole and GCP for its 150-plus customers, including lululemon, Travelzoo, and TUMI.
AgilOne Cortex is a machine learning framework that uses supervised machine learning models to predict customer events such as purchase, subscription, and engagement. It groups customers into segments based on interest and behavior using unsupervised learning techniques. AgilOne Cortex’s recommender system models let customers orchestrate offers and messages to customers on a one-on-one basis. AgilOne uses cloud platforms to perform close to one billion predictions every day, averaging tens of millions of customer predictions for each client across all its models.
To meet the challenges of such vast amounts of data and millions of predictions, AgilOne chose Qubole and GCP to better automate the provisioning of machine learning data-processing resources based on workload, while allowing for portability across cloud providers, eliminating prototyping bottlenecks, supporting the seamless orchestration of jobs, and automating cluster management.
AgilOne now runs a variety of workloads on Qubole, including querying data, running ML models, and orchestrating ML workflows, all on a single platform with optimized versions of Apache Spark, Apache Airflow, and Zeppelin notebooks. It also uses Qubole’s APIs to automate tasks.
Using GCP and Qubole, AgilOne has seen some key benefits:
Elimination of critical bottlenecks through intelligent, autonomous and self-service provisioning of compute resources for the data science models.
Increased efficiency for AgilOne’s machine learning and ops teams.
Improved prototyping and efficient movement of ML models into production.
A simpler and faster path to production while transitioning to GCP, through a consistent user experience, tools, and technologies.
Efficient orchestration of machine-learning model lifecycle through Airflow.
Automated tasks end-to-end with Qubole APIs.
Improved customer support and added zero-downtime upgrades and roll-back capabilities.
AgilOne also uses Google Cloud Storage as the real-time data store for its customers’ transaction and event data. This repository of cleansed, deduped, and enriched data serves as the master customer record for all reporting, analyses, machine learning models, and advanced segmentation.
Limiting bottlenecks, simplifying cluster management
Using Qubole and GCP, AgilOne’s data science team can make cluster management and cluster provisioning more self-service, smarter, and less dependent on operations teams. They’re now able to make delivery of ML models more agile.
AgilOne data teams now rely less on the operations team, since infrastructure is provisioned automatically through Qubole. Qubole on GCP means that it’s now easier to provision new and larger clusters with different sets of permissions, install dependencies on VMs, maintain stable prototyping environments, and upgrade software. The data science team’s variable infrastructure needs are now addressed with intelligent automation that spins up and releases clusters and different types of nodes as needed. In Qubole’s managed Zeppelin environment, AgilOne can prototype its Python, PySpark, and Scala applications.
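To make the idea of workload-based provisioning concrete, here is a minimal sketch of a scaling rule that sizes a cluster from the amount of pending work. It is an illustration only, not Qubole's algorithm: the `ScalingPolicy` class, its field names, and the numbers used are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy; the field names are illustrative, not Qubole settings."""
    min_nodes: int
    max_nodes: int
    tasks_per_node: int


def desired_nodes(pending_tasks: int, policy: ScalingPolicy) -> int:
    """Return the node count needed for the pending work, clamped to the policy bounds."""
    needed = math.ceil(pending_tasks / policy.tasks_per_node)
    return max(policy.min_nodes, min(policy.max_nodes, needed))


# Assumed example values: keep at least 2 nodes, never exceed 20.
policy = ScalingPolicy(min_nodes=2, max_nodes=20, tasks_per_node=8)
for pending in (0, 40, 500):
    print(pending, "pending tasks ->", desired_nodes(pending, policy), "nodes")
```

With a rule like this, an idle cluster shrinks to its floor and a burst of work grows it up to its ceiling, so data scientists do not have to request capacity by hand.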
The comprehensive quality assurance and support, zero-downtime software upgrades, and rollback capabilities help to add stability for AgilOne’s ML operations. Eliminating bottlenecks has let the company build and test new models at lightning speed. This translates to a much faster go-to-market and onboarding of new clients.
Finding improved execution
AgilOne Cortex requires a powerful orchestration system to run and monitor dozens of models for all clients, and to run each model across all users every day.
Since Qubole and GCP bring open source options, AgilOne’s data science team can use Airflow, a configuration-as-code workflow engine. This has allowed AgilOne to better manage the lifecycle of its ML workflows by providing easy maintenance, versioning, and testing.
Qubole also provides customers like AgilOne with a comprehensive set of APIs critical for end-to-end automation. This includes automating such tasks as starting and stopping clusters, submitting a Spark job or changing the Spark configuration, generating reports, increasing the timeout, and more.
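The appeal of workflows as code is easiest to see in a small example. The sketch below declares a pipeline as a plain Python dependency graph and derives a run order from it. It uses only the Python standard library and is not Airflow's API (Airflow declares the same kind of graph with its own DAG and operator objects). The task names are hypothetical and are not taken from AgilOne's pipelines.

```python
from graphlib import TopologicalSorter

# Hypothetical daily pipeline: each task maps to the set of tasks it depends on.
WORKFLOW = {
    "extract_events": set(),
    "build_features": {"extract_events"},
    "train_model": {"build_features"},
    "score_customers": {"train_model"},
    "publish_predictions": {"score_customers"},
}


def execution_order(workflow: dict[str, set[str]]) -> list[str]:
    """Return one valid run order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(workflow).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(WORKFLOW), start=1):
        print(step, task)
```

Because the workflow is ordinary source code, it can be reviewed, kept in version control, and unit tested like any other module, which is what makes maintenance, versioning, and testing straightforward.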
Looking ahead with cloud
As AgilOne’s business continues to expand rapidly, its need for more data insights and more models increases. AgilOne will look to use Qubole and GCP for running ad-hoc queries for data discovery, exploration, and analyses.
From a cluster-management perspective, AgilOne wants to further use Qubole’s intelligent management of Google’s Preemptible VMs and heterogeneous cluster management capabilities to lower its ML processing costs without compromising reliability.