Source: AI in Depth: Serving a PyTorch text classifier on AI Platform Serving using custom online prediction from Google Cloud
Earlier this week, we explained in detail how you might build and serve a text classifier in TensorFlow. In today’s post, we’ll explain how to implement the same model in PyTorch, another machine learning framework, and deploy it to AI Platform Serving for online prediction. We will reuse the preprocessing implemented in Keras in the previous blog post. The code for this example can be found in this Notebook.
AI Platform ML Engine is a serverless, NoOps product that lets you train and serve machine learning models at scale. These models can then be served as REST APIs for online prediction. AI Platform Serving automatically scales to match request throughput, and provides secure authentication to its REST endpoints.
To help maintain affinity of preprocessing between training and serving, AI Platform Serving now enables users to customize the prediction routine that gets called when sending prediction requests to their model deployed on AI Platform Serving. This feature allows you to upload a Custom Model Prediction class, along with your exported model, to apply custom logic before or after invoking the model for prediction.
In other words, we can now leverage AI Platform Serving to execute arbitrary Python code, breaking the typical and previous coupling with TensorFlow. This change enables you to pick the best framework for the job, or even combine multiple frameworks into a single application. For example, we can use Keras APIs for their easy-to-use text pre-processing methods, and combine them with PyTorch for the actual machine learning model. This combination of frameworks is precisely what we’ll discuss in this blog post.
For more details on text classification, the Hacker News dataset used in the example, and the text preprocessing logic, refer to the Serving a Text Classifier with Preprocessing using AI Platform Serving blog post.
You can begin by implementing your TorchTextClassifier model class in the torch_model.py module. As shown in the following code block, we implement the same text classification model architecture described in this post, which consists of an Embedding layer, a Dropout layer, followed by two Conv1d and Pooling layers, then a Dense layer with Softmax activation at the end.
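As a rough illustration of the architecture just described, the sketch below shows one way such a model could be written in PyTorch. It is not the code from the accompanying notebook: the constructor arguments, the layer sizes, the ReLU activations, and the choice of max pooling (with a global pool after the second convolution) are all assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TorchTextClassifier(nn.Module):
    """Sketch: Embedding -> Dropout -> 2 x (Conv1d + pooling) -> Dense + Softmax."""

    def __init__(self, vocab_size, embedding_dim, num_classes,
                 num_filters=32, kernel_size=3, pool_size=3, dropout_rate=0.2):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.dropout = nn.Dropout(dropout_rate)
        self.conv1 = nn.Conv1d(embedding_dim, num_filters, kernel_size)
        self.pool1 = nn.MaxPool1d(pool_size)
        self.conv2 = nn.Conv1d(num_filters, num_filters * 2, kernel_size)
        # Global max pooling: keeps one value per filter, whatever the length.
        self.pool2 = nn.AdaptiveMaxPool1d(1)
        self.dense = nn.Linear(num_filters * 2, num_classes)

    def forward(self, token_ids):
        x = self.embeddings(token_ids)      # (batch, seq_len, embedding_dim)
        x = self.dropout(x)
        x = x.transpose(1, 2)               # Conv1d expects (batch, channels, seq_len)
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool2(F.relu(self.conv2(x))).squeeze(-1)
        return F.softmax(self.dense(x), dim=1)  # class probabilities


# Example: a batch of 4 sequences of 50 token ids, 2 output classes.
model = TorchTextClassifier(vocab_size=100, embedding_dim=16, num_classes=2)
model.eval()  # disable dropout for a deterministic forward pass
probabilities = model(torch.randint(0, 100, (4, 50)))
```

Each row of `probabilities` sums to 1, which is what a client would typically expect from a classifier with a Softmax output.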
The following code prepares both the training and evaluation data. Note that you use both fit() and transform() with the training data, while you only use transform() with the evaluation data, so that the evaluation data is encoded with the tokenizer generated from the training data. The resulting vectorized training data and eval_texts_vectorized are used to train and evaluate our text classification model, respectively.
The implementation of the TextPreprocessor class, which uses Keras APIs, is described in the Serving a Text Classifier with Preprocessing using AI Platform Serving blog post.
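To make the fit-on-training, transform-on-both pattern concrete, here is a minimal pure-Python sketch. `SimpleTextPreprocessor` is a hypothetical stand-in written for this illustration, not the Keras-based TextPreprocessor from the earlier post; its padding and unknown-word handling are simplifying assumptions.

```python
class SimpleTextPreprocessor:
    """Hypothetical stand-in for a tokenizer-based text preprocessor."""

    def __init__(self, vocab_size, max_sequence_length):
        self._vocab_size = vocab_size
        self._max_sequence_length = max_sequence_length
        self._word_index = {}

    def fit(self, texts):
        """Build the word index from the training texts only."""
        counts = {}
        for text in texts:
            for word in text.lower().split():
                counts[word] = counts.get(word, 0) + 1
        # Most frequent words get the lowest ids; id 0 is reserved for
        # padding and for words never seen during fit().
        ranked = sorted(counts, key=lambda w: (-counts[w], w))
        kept = ranked[: self._vocab_size - 1]
        self._word_index = {word: i + 1 for i, word in enumerate(kept)}

    def transform(self, texts):
        """Encode texts as fixed-length lists of ids using the fitted index."""
        rows = []
        for text in texts:
            ids = [self._word_index.get(w, 0) for w in text.lower().split()]
            ids = ids[: self._max_sequence_length]
            rows.append(ids + [0] * (self._max_sequence_length - len(ids)))
        return rows


train_texts = ["show hn a new python library", "python tips for data science"]
eval_texts = ["a new rust library"]

processor = SimpleTextPreprocessor(vocab_size=50, max_sequence_length=6)
processor.fit(train_texts)                       # fit on training data only
train_vectorized = processor.transform(train_texts)
eval_vectorized = processor.transform(eval_texts)  # reuse the fitted index
```

Because the evaluation text is never passed to fit(), the word "rust" has no entry in the index and is encoded as 0, just as an unseen word would be at serving time.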
Now you need to save the processor object—which includes the tokenizer generated from the training data—to be used when serving the model for prediction. The following code dumps the object to a new
The following code snippet shows you how to train your PyTorch model. First, you create a TorchTextClassifier object according to your parameters. Second, you implement a training loop in which each iteration generates predictions from your model (y_pred) given the current training batch, computes the loss using cross_entropy, and backpropagates the gradients. After NUM_EPOCH epochs, the trained model is saved to disk.
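The loop described above can be sketched as follows. To keep the example self-contained, it trains a small linear stand-in model on random data rather than the text classifier; the optimizer, batch size, learning rate, the temporary file name, and the decision to save only the state dict are all assumptions for illustration. Note that PyTorch's `cross_entropy` expects raw scores (logits), since it applies log-softmax internally.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_EPOCH = 3
BATCH_SIZE = 8

torch.manual_seed(0)
features = torch.randn(32, 10)          # synthetic inputs (assumption)
labels = torch.randint(0, 2, (32,))     # synthetic class ids (assumption)

model = nn.Linear(10, 2)                # stand-in for the text classifier
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
for epoch in range(NUM_EPOCH):
    for start in range(0, len(features), BATCH_SIZE):
        x_batch = features[start:start + BATCH_SIZE]
        y_batch = labels[start:start + BATCH_SIZE]

        y_pred = model(x_batch)                  # forward pass
        loss = F.cross_entropy(y_pred, y_batch)  # loss on raw scores

        optimizer.zero_grad()                    # clear old gradients
        loss.backward()                          # backpropagation
        optimizer.step()                         # update the weights

# Save the trained weights; the path is a throwaway temporary location.
model_path = os.path.join(tempfile.mkdtemp(), "model_state.pt")
torch.save(model.state_dict(), model_path)
```

Saving the state dict means the serving code must rebuild the model class before loading the weights, which is one reason the model module has to ship with the prediction package.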
In order to apply a custom prediction routine, which includes both preprocessing and postprocessing, you need to wrap this logic in a Custom Model Prediction class. This class, along with the trained model and its corresponding preprocessing object, will be used to deploy the AI Platform Serving microservices. The following code shows how the Custom Model Prediction class (CustomModelPrediction) for our text classification example is implemented.
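For readers who want to see the overall shape of such a class, here is a minimal sketch. It assumes the interface used by custom prediction routines: a `predict(instances, **kwargs)` method and a `from_path(model_dir)` class method that loads the uploaded artifacts. Everything else is hypothetical: the model is a plain function standing in for the trained PyTorch model, and the pickled state is a simple dictionary rather than the real preprocessor object.

```python
import os
import pickle
import tempfile


def stand_in_model(features):
    """Hypothetical model: favors class 1 when token id 1 is present."""
    return [[0.2, 0.8] if 1 in row else [0.7, 0.3] for row in features]


class CustomModelPrediction:
    def __init__(self, model, word_index, class_names):
        self._model = model
        self._word_index = word_index
        self._class_names = class_names

    def _preprocess(self, instances):
        # Turn raw text into token ids (0 for unknown words).
        return [[self._word_index.get(w, 0) for w in text.lower().split()]
                for text in instances]

    def _postprocess(self, scores):
        # Map each row of scores to the name of its highest-scoring class.
        labels = []
        for row in scores:
            best = max(range(len(row)), key=row.__getitem__)
            labels.append(self._class_names[best])
        return labels

    def predict(self, instances, **kwargs):
        features = self._preprocess(instances)
        scores = self._model(features)
        return self._postprocess(scores)

    @classmethod
    def from_path(cls, model_dir):
        # A real implementation would also load the trained PyTorch model
        # from model_dir here.
        with open(os.path.join(model_dir, "processor_state.pkl"), "rb") as f:
            state = pickle.load(f)
        return cls(stand_in_model, state["word_index"], state["class_names"])


# Local smoke test with a throwaway artifact directory.
model_dir = tempfile.mkdtemp()
with open(os.path.join(model_dir, "processor_state.pkl"), "wb") as f:
    pickle.dump({"word_index": {"python": 1},
                 "class_names": ["other", "python"]}, f)

predictor = CustomModelPrediction.from_path(model_dir)
predictions = predictor.predict(["Python tips", "rust news"])
```

Exercising `from_path` and `predict` locally like this is a cheap way to catch packaging or pickling problems before deploying.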
Next, you’ll want to upload your artifacts to Cloud Storage, as follows:
Your saved (trained) model file: trained_saved_model.pt (see Training and Saving the PyTorch model).
Your pickled preprocessing objects (which contain the state needed for data transformation prior to prediction): processor_state.pkl. As described in the previous, Keras-based post, the processor_state.pkl object includes the tokenizer generated from the training data.
Second, you need to upload a Python package including all the classes you’ll need for prediction (preprocessing, model classes, and post-processing, if any). In this example, you need to create a `pip`-installable tar file that includes your prediction modules, among them preprocess.py. To begin, create a setup.py file that lists the PyPI packages you need to `pip install` and use for prediction in the REQUIRED_PACKAGES variable. Because we are deploying a model implemented in PyTorch, we need to include ‘torch’ in REQUIRED_PACKAGES. Now, you can create the package by running the following command:
This will create a `.tar.gz` package under the dist directory. The name of the package will be `$name-$version.tar.gz`, where `$name` and `$version` are the name and version you specified for the package.
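For reference, a setup.py along the following lines would produce such a package. This is only a sketch: the package name and version are hypothetical placeholders, and the module list is an assumption based on the file names mentioned in this post (you would also list the module that holds your prediction class).

```python
from setuptools import setup

# PyPI packages that are pip-installed alongside the prediction code.
# 'torch' is listed because the deployed model is implemented in PyTorch.
REQUIRED_PACKAGES = ["torch"]

setup(
    name="text_classification",   # hypothetical; becomes $name
    version="0.1",                # hypothetical; becomes $version
    install_requires=REQUIRED_PACKAGES,
    # Top-level modules to ship in the package (assumed list).
    py_modules=["torch_model", "preprocess"],
)

# Build the source distribution from the directory containing this file:
#   python setup.py sdist --formats=gztar
# With the values above, the archive would be named
# dist/text_classification-0.1.tar.gz
```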
Once you have successfully created the package, you can upload it to Cloud Storage:
Let’s define the model name, the model version, and the AI Platform Serving runtime (which corresponds to a TensorFlow version) required for deploying the model.
First, you create a model in AI Platform Serving by running the following gcloud command:
Second, you create a model version using the following gcloud command, in which you specify the location of the model and preprocessing object (--origin), the location of the package(s) including the scripts needed for your prediction (--package-uris), and a pointer to your Custom Model Prediction class (--prediction-class). This should take 1-2 minutes.
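To show how these flags fit together, the sketch below assembles the command as an argument list in Python, ready to pass to `subprocess.run`. The helper function, the bucket paths, the names, the runtime version, and the `prediction_module` module name are all hypothetical placeholders; the use of the `beta` command group is an assumption based on the feature being in beta.

```python
def build_version_create_command(model_name, version_name, runtime_version,
                                 origin_uri, package_uri, prediction_class):
    """Return the gcloud argument list for creating a model version."""
    return [
        "gcloud", "beta", "ai-platform", "versions", "create", version_name,
        "--model", model_name,
        "--runtime-version", runtime_version,
        "--origin", origin_uri,                  # model + preprocessing object
        "--package-uris", package_uri,           # pip-installable package
        "--prediction-class", prediction_class,  # module.ClassName
    ]


command = build_version_create_command(
    model_name="torch_text_classification",            # hypothetical
    version_name="v1",                                  # hypothetical
    runtime_version="1.13",                             # hypothetical
    origin_uri="gs://your-bucket/models/",              # hypothetical
    package_uri="gs://your-bucket/packages/text_classification-0.1.tar.gz",
    prediction_class="prediction_module.CustomModelPrediction",
)

# To actually run it (requires an authenticated gcloud installation):
#   import subprocess
#   subprocess.run(command, check=True)
```

Keeping the command as a list rather than a single string avoids shell-quoting problems if any of the values contain spaces.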
After deploying the model to AI Platform Serving, you can invoke the model for prediction using the code described in the previous Keras-based blog post.
Note that the client of our REST API does not need to know whether the service was implemented in TensorFlow or in PyTorch. In either case, the client should send the same request, and receive a response of the same form.
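As a small illustration of that framework-agnostic contract, the sketch below builds a request body and parses a response without making any network call. It assumes the online prediction JSON format of an `instances` list in the request and a `predictions` list in the response; the helper names, the sample headlines, and the simulated response are made up for this example.

```python
import json


def build_request_body(texts):
    """Serialize raw text instances into an online prediction request body."""
    return json.dumps({"instances": texts})


def parse_response(response_text):
    """Extract the list of predictions from a prediction response body."""
    return json.loads(response_text)["predictions"]


request_body = build_request_body([
    "Show HN: a tiny text classifier",     # hypothetical headline
    "New framework release announced",     # hypothetical headline
])

# Simulated service response; the labels are placeholders.
simulated_response = '{"predictions": ["class_a", "class_b"]}'
predicted_labels = parse_response(simulated_response)
```

Nothing in either function refers to TensorFlow or PyTorch, which is the point: the serving framework can change without the client noticing.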
Although AI Platform initially supported only TensorFlow, it is now evolving into a platform that supports multiple frameworks. You can now deploy models using TensorFlow, PyTorch, or any Python-based ML framework, since AI Platform Serving supports custom prediction Python code, available in beta. This post demonstrates that you can flexibly deploy a PyTorch text classifier that uses text preprocessing logic implemented in Keras.
Feel free to reach out @GCPcloud if there are still features or other frameworks you’d like to train or deploy on AI Platform Serving.