Rolling Out a Python Machine Learning Model to Production Using a Java-Based REST API


Motivation

Once your data has been prepared for machine learning algorithms and your model is trained and fine-tuned, it is time to launch your solution.

In the previous blog post, I focused on those steps and showed how to build a model with the Python scikit-learn library. Now that the model is ready, how can you expose its knowledge to other applications?

Approach

This post presents the steps to follow once you are ready to roll out a production deployment. To this end, I will expose the model through a REST API. But I will add one more challenge: although Python has great solutions for exposing a REST API, imagine that your production environment only allows Java deployments.

The machine learning team therefore has to deliver a model that can be executed on a Java Virtual Machine.

Exporting the Model

The first post showed the steps to build the model (xgb_clf). As you can see below, I export the model to a file called "pipeline.pkl.z".

# Export the model (xgb_clf) to pipeline.pkl.z
# joblib is imported directly; sklearn.externals.joblib was removed in scikit-learn 0.23
import joblib
joblib.dump(xgb_clf, "pipeline.pkl.z", compress=9)

Translating to PMML

The next step is to translate "pipeline.pkl.z" into a PMML file. PMML (Predictive Model Markup Language) is an XML-based format for describing predictive models. For this task, I will use jpmml-sklearn, a Java library and command-line application for converting Scikit-Learn pipelines to PMML, available at https://github.com/jpmml/jpmml-sklearn.

git clone https://github.com/jpmml/jpmml-sklearn.git
cd jpmml-sklearn/
mvn package
java -jar target/converter-executable-1.4-SNAPSHOT.jar --pkl-input pipeline.pkl.z --pmml-output pipeline.pmml

Renaming the features

Next I rename the features, replacing the generic names x1 to x5. This step is optional, but it makes the PMML file more readable.

sed -e 's;x1;Pclass;g' -e 's;x2;Age;g' -e 's;x3;SibSp;g' -e 's;x4;Fare;g' -e 's;x5;Female;g' pipeline.pmml > pipeline_named.pmml
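
The same renaming can be sketched in Python for readers who prefer not to rely on sed. This is a minimal sketch, not part of the original project: the function names rename_features and rename_file are hypothetical, and the sample fragment (including the x10 field, which the real model does not have) is hand-written for illustration. Unlike the plain sed substitution, it matches whole tokens only, so a name such as x10 would be left untouched.

```python
import re

# Same name pairs as the sed command above.
RENAMES = {"x1": "Pclass", "x2": "Age", "x3": "SibSp", "x4": "Fare", "x5": "Female"}


def rename_features(pmml_text, renames=RENAMES):
    """Replace each generic feature name, matching whole tokens only."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], pmml_text)


def rename_file(src="pipeline.pmml", dst="pipeline_named.pmml"):
    """Read the converter output, rename the features, write the result."""
    with open(src, encoding="utf-8") as handle:
        text = handle.read()
    with open(dst, "w", encoding="utf-8") as handle:
        handle.write(rename_features(text))


# Hand-written fragment for illustration; the real PMML file is much larger.
sample = '<DataField name="x1" dataType="float"/><DataField name="x10"/>'
renamed = rename_features(sample)
print(renamed)
```

Calling rename_file() with its defaults would produce the same pipeline_named.pmml file name used in the rest of this post.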

As you can see, the model has 5 features: Pclass, Age, SibSp, Fare and Female.
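
One way to double-check the feature names is to list the DataField entries of the PMML data dictionary. The sketch below uses only the Python standard library; the function name data_field_names is hypothetical, and the PMML fragment is a hand-written sample rather than the real converter output. The namespace URI in the sample is an assumption about the PMML version, which is why the code compares local tag names and ignores the namespace.

```python
import xml.etree.ElementTree as ET


def data_field_names(root):
    """Return the name of every DataField element under the given root."""
    names = []
    for element in root.iter():
        # Tags look like "{namespace}DataField"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "DataField":
            names.append(element.get("name"))
    return names


# Hand-written sample; the target field name "y" matches the API response shown later.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <DataDictionary>
    <DataField name="y" optype="categorical" dataType="integer"/>
    <DataField name="Pclass" optype="continuous" dataType="float"/>
    <DataField name="Age" optype="continuous" dataType="float"/>
    <DataField name="SibSp" optype="continuous" dataType="float"/>
    <DataField name="Fare" optype="continuous" dataType="float"/>
    <DataField name="Female" optype="continuous" dataType="float"/>
  </DataDictionary>
</PMML>"""

fields = data_field_names(ET.fromstring(SAMPLE_PMML))
print(fields)
```

For the real file, the root element could be obtained with ET.parse("pipeline_named.pmml").getroot() and passed to the same function.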

Loading the Model

Before we continue, I invite you to clone my titanic-pred-api project using the command below.
git clone https://github.com/mathcunha/titanic-pred-api.git

The PMML file is loaded by the class ml.pmml.Model through the method loadEvaluator, as you can see next. The class has a single attribute, modelEvaluator, which exposes the methods used to predict an instance with the model.

// Unmarshal a PMML document from an input stream
public static PMML load(InputStream is) throws SAXException, JAXBException {
 return org.jpmml.model.PMMLUtil.unmarshal(is);
}

// Load the renamed PMML file from the classpath and build an evaluator for it
public static ModelEvaluator<?> loadEvaluator() throws SAXException, JAXBException {
 PMML pmml = load(Model.class.getResourceAsStream("/pipeline_named.pmml"));
 ModelEvaluatorFactory modelEvaluatorFactory = ModelEvaluatorFactory.newInstance();
 return modelEvaluatorFactory.newModelEvaluator(pmml);
}

Building the API

I am using Spring Boot to expose the REST API. Spring Boot is supported by many cloud providers, which allows you to deploy your model on Heroku, for example (this will be presented in another post).
The class ml.api.EvaluatorController is the controller responsible for exposing the model.
@Controller
@RequestMapping("/evaluator")
public class EvaluatorController {
    private static final Model model = new Model();

    @RequestMapping(method=RequestMethod.GET)
    public @ResponseBody String evaluate(@RequestParam(value="Pclass", required=true) Float pclass,
                @RequestParam(value="Age", required=true) Float age,
                @RequestParam(value="SibSp", required=true) Float sibsp,
                @RequestParam(value="Fare", required=true) Float fare,
                @RequestParam(value="Female", required=true) Float female) {
        return model.evaluate(Model.getFeaturesMap(pclass, age, sibsp, fare, female)).toString();
    }
}
The EvaluatorController has a single method. It requires the model's 5 features: Pclass, Age, SibSp, Fare and Female.

Running the API

To run the REST API, run the command below.
mvn spring-boot:run
If everything went well, you can call your API over HTTP (e.g. http://127.0.0.1:9000/evaluator?Pclass=3&Age=34.5&SibSp=0&Fare=14.4542&Female=0). The answer will be "{y=ProbabilityDistribution{result=0, probability_entries=[1=0.21608171, 0=0.78391826]}, probability(0)=0.78391826, probability(1)=0.21608171}".
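
A consumer of the API could look roughly like the Python sketch below, which uses only the standard library. The function names build_url, parse_probabilities and predict are hypothetical, and the host and port are simply those of the example URL above. Note that the response is the text form of a Java object rather than JSON, so the regular expression is an assumption based on the sample answer and may need adjusting if the output format changes.

```python
import re
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:9000/evaluator"  # host and port from the example URL


def build_url(pclass, age, sibsp, fare, female, base_url=BASE_URL):
    """Build the GET URL with the five query parameters the controller requires."""
    query = urlencode({"Pclass": pclass, "Age": age, "SibSp": sibsp,
                       "Fare": fare, "Female": female})
    return f"{base_url}?{query}"


def parse_probabilities(body):
    """Extract the 'probability(label)=value' pairs from the response text."""
    pairs = re.findall(r"probability\((\w+)\)=([0-9.]+(?:[eE][-+]?[0-9]+)?)", body)
    return {label: float(value) for label, value in pairs}


def predict(**features):
    """Call the running API and return the class probabilities (needs the server up)."""
    with urlopen(build_url(**features)) as response:
        return parse_probabilities(response.read().decode("utf-8"))


# Offline demonstration using the sample answer quoted above.
url = build_url(3, 34.5, 0, 14.4542, 0)
sample_body = ("{y=ProbabilityDistribution{result=0, probability_entries="
               "[1=0.21608171, 0=0.78391826]}, probability(0)=0.78391826, "
               "probability(1)=0.21608171}")
probabilities = parse_probabilities(sample_body)
print(url)
print(probabilities)
```

With the server running, predict(pclass=3, age=34.5, sibsp=0, fare=14.4542, female=0) would perform the actual HTTP call.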

Conclusion

This post showed how to deploy a Python scikit-learn model to production using a REST API implemented in Java. The API exposes the model and lets consumers use its prediction capabilities through HTTP calls.

Author

Matheus Cunha (@mathcunha) works as a Solutions Architect at SEFAZ-CE. He holds a B.Sc. in Computer Science from the Federal University of Bahia, Brazil, and an M.Sc. and a Ph.D. in Applied Informatics from the University of Fortaleza, Brazil. His main research areas are distributed systems and cloud computing.
