MLOps Basics [Week 9]: Prediction Monitoring - Kibana


What is the need for monitoring?

Monitoring systems can help give us confidence that our systems are running smoothly and, in the event of a system failure, can quickly provide appropriate context when diagnosing the root cause.

The things we want to monitor during training and inference are different. During training we are concerned about whether the loss is decreasing, whether the model is overfitting, etc.

But during inference, we want confidence that our model is making correct predictions.

There are many reasons why a model can fail to make useful predictions:

  • The underlying data distribution has shifted over time and the model has gone stale, i.e. the characteristics of the inference data differ from the characteristics of the data used to train the model.

  • The inference data stream contains edge cases (not seen during model training). In these scenarios the model might perform poorly or even raise errors.

  • The model was misconfigured in its production deployment. (Configuration issues are common)

In all of these scenarios, the model could still make a successful prediction from a service perspective, but the predictions will likely not be useful. Monitoring machine learning models can help us detect such scenarios and intervene (e.g. trigger a model retraining/deployment pipeline).

The scope of this post is to understand the basics of monitoring predictions.

There are many different tools available for monitoring; I will be using Kibana.

In this post, I will be going through the following topics:

  • Basics of CloudWatch Logs

  • Creating an Elasticsearch Cluster

  • Configuring CloudWatch Logs with Elasticsearch

  • Creating Index Patterns in Kibana

  • Creating Kibana Visualisations

  • Creating a Kibana Dashboard

Basics of CloudWatch Logs

What is CloudWatch Logs?

Amazon CloudWatch Logs is a service that collects and stores logs from your applications and infrastructure running on AWS. It provides the features expected of any log management tool: real-time monitoring, searching and filtering, and alerts.
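
These logs can also be searched programmatically. Below is a minimal sketch (not part of the original setup) that uses boto3 to pull recent events from the Lambda's log group; the log group name and filter pattern are placeholders to adapt to your setup:

```python
# A minimal sketch of searching CloudWatch Logs with boto3.
# The log group name below is a placeholder; use your Lambda's log group.
import boto3

logs = boto3.client("logs")

response = logs.filter_log_events(
    logGroupName="/aws/lambda/MLOps-Basics",  # placeholder log group name
    filterPattern="prediction",               # keep only events containing "prediction"
    limit=20,
)

for event in response["events"]:
    print(event["timestamp"], event["message"])
```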

Let's see how to check the logs for the Lambda we implemented last week.

  • Go to the MLOps-Basics Lambda. Navigate to the Monitor section and the Logs tab. There will be a button labelled View logs in CloudWatch. Click on it.

  • A new window will open (CloudWatch) which contains the logs of the Lambda. This is the log group corresponding to the Lambda. Click on the top one.

  • The logs corresponding to the latest stream will then be visible.

It will be hard to monitor the predictions using these raw logs. Having a dashboard containing the predictions and their counts will make monitoring much easier. Let's see how to stream these logs to Kibana and visualise them there.
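
Before that, it is worth noting what a single prediction log line looks like. The sketch below shows how last week's Lambda handler could emit one as a JSON line (the real handler may differ; predict() and the sentence key are placeholders, and the field names mirror the ones that show up later in Kibana):

```python
# A minimal sketch of logging a prediction as a single JSON line.
# Anything printed inside a Lambda ends up as a log event in its
# CloudWatch log group.
import json

def predict(text):
    # Placeholder for the actual model call from last week's Lambda.
    return "LABEL_1", 0.98

def lambda_handler(event, context):
    text = event.get("sentence", "")  # placeholder name for the input field
    label, score = predict(text)
    log_record = {
        "prediction": {"label": label, "score": score},
        "text": text,
    }
    print(json.dumps(log_record))  # becomes one JSON log event in CloudWatch
    return log_record
```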

Creating an Elasticsearch Cluster

Let's create an Elasticsearch cluster to which the logs will be streamed.

  • Sign in to the AWS Management Console and open the Elasticsearch Service at https://console.aws.amazon.com/es/home. Create a new domain.

  • Choose the domain type as Development and testing (change this according to your needs).

  • Give the domain a name: mlops-cluster.

  • Choose the instance type as t2.small.elasticsearch (since this is a demo).

  • Choose the network configuration as Public access so that the cluster can easily be shared with different people. (It can also be shared within a VPC, but that requires some more configuration.)

  • Get the IP of your machine using this link and then add it to the domain access policy.

  • Since the chosen instance type is t2.small, it does not support HTTPS encryption. Deselect that option.

  • After reviewing everything, create the domain. This will take some time (5+ minutes). Once the cluster is created, the status will be shown as Active. (If you prefer to script this step, a boto3 equivalent is sketched after this list.)
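
If you prefer scripting over clicking, roughly the same domain can be created with boto3. This is only a sketch under assumptions: the Elasticsearch version, account ID, region and IP address in the access policy are placeholders to replace with your own values.

```python
# A minimal sketch of creating the Elasticsearch domain with boto3 instead
# of the console. All IDs, the region and the IP address are placeholders.
import json
import boto3

es = boto3.client("es")

access_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": "es:*",
        "Resource": "arn:aws:es:us-east-1:123456789012:domain/mlops-cluster/*",
        "Condition": {"IpAddress": {"aws:SourceIp": ["203.0.113.10"]}},  # your IP
    }],
}

es.create_elasticsearch_domain(
    DomainName="mlops-cluster",
    ElasticsearchVersion="7.10",  # assumed version; pick what the console offers
    ElasticsearchClusterConfig={
        "InstanceType": "t2.small.elasticsearch",
        "InstanceCount": 1,
    },
    EBSOptions={"EBSEnabled": True, "VolumeType": "gp2", "VolumeSize": 10},
    AccessPolicies=json.dumps(access_policy),
)
```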

Creating an IAM Role with the Necessary Permissions

In order to stream logs to the Elasticsearch cluster, CloudWatch needs the necessary permissions to write to the ES cluster. Let's create a role with the required permissions.

  • Go to the IAM console and the Roles section. Click on Create role.

  • Select AWS service and the use case as Lambda.

  • Search for esfull and select the AmazonESFullAccess policy.

  • Give the role the name mlops-cluster-role and save it. (A boto3 equivalent of these steps is sketched after this list.)
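
The same role can also be created with boto3. A minimal sketch, assuming the Lambda trust relationship selected in the console above:

```python
# A minimal sketch of creating the IAM role with boto3 instead of the console.
import json
import boto3

iam = boto3.client("iam")

# Trust policy matching the "Lambda" use case chosen in the console.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="mlops-cluster-role",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Attach the managed AmazonESFullAccess policy, as done in the console.
iam.attach_role_policy(
    RoleName="mlops-cluster-role",
    PolicyArn="arn:aws:iam::aws:policy/AmazonESFullAccess",
)
```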

Configuring CloudWatch Logs with Elasticsearch

  • Now that the role has been created, go to the CloudWatch log group of the Lambda. Under Actions / Subscription filters there will be an option Create Elasticsearch subscription filter. Select it.

  • Select mlops-cluster under the Amazon ES cluster option and select mlops-cluster-role as the IAM execution role.

  • Select the log format as JSON, since we are printing the logs in JSON format. Filter patterns help in discarding the unnecessary logs and focusing on the necessary ones. As a simple use case, let's write the filter pattern as prediction. This will keep only the logs which contain prediction.

  • Test the filter pattern by selecting one of the latest log streams and then clicking Test Pattern. In the test results you will now see only the prediction-related logs.

  • Cross-check the configuration once and then create it. (A programmatic equivalent of this subscription is sketched below.)
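
For reference, the subscription filter itself can be created with boto3 as well. Note that the console additionally provisions a helper function that forwards the log events to Elasticsearch; the sketch below assumes such a forwarding destination already exists, and the log group name, filter name and destination ARN are placeholders:

```python
# A minimal sketch of creating the subscription filter with boto3. The
# console normally does this wiring for you; the destination ARN below is
# a placeholder for the forwarder that ships events to Elasticsearch.
import boto3

logs = boto3.client("logs")

logs.put_subscription_filter(
    logGroupName="/aws/lambda/MLOps-Basics",  # placeholder log group name
    filterName="prediction-to-es",            # placeholder filter name
    filterPattern="prediction",               # same pattern as in the console
    destinationArn="arn:aws:lambda:us-east-1:123456789012:function:LogsToElasticsearch",  # placeholder
)
```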

Creating Index Patterns in Kibana

Now that CloudWatch is configured with Elasticsearch, let's go to the Kibana dashboard. The Kibana link can be found on the Elasticsearch domain overview page.

  • Go to the Discover section.

  • You might see an empty page. In that case, fire some queries so that some logs get created.

  • Once a few logs are there, click on Create index pattern.

  • Add the index pattern as cwl-*, which matches all the CloudWatch Logs indices.

  • Include the @timestamp field as well.

  • Now we can see prediction.label, prediction.label.keyword, prediction.score and text among the extracted fields.

  • Once the pattern is created, the logs will be visible in Discover.

  • Create a data table by selecting the relevant fields. I have selected the fields prediction.label, prediction.label.keyword, prediction.score and text.

  • Save the data table by giving it a name.

  • Adjust the refresh interval to 5 seconds so that the latest logs are fetched, and click on Start. (A quick way to sanity-check the indexed documents programmatically is sketched below.)
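
To sanity-check that documents are actually landing in the cwl-* indices, the domain's REST endpoint can also be queried directly (possible here because the domain was created with IP-based public access). The endpoint URL below is a placeholder for your domain's endpoint:

```python
# A minimal sketch that queries the cwl-* indices directly to confirm that
# prediction logs are being indexed. The endpoint URL is a placeholder.
import requests

ES_ENDPOINT = "https://search-mlops-cluster-xxxxxxxx.us-east-1.es.amazonaws.com"

response = requests.get(
    f"{ES_ENDPOINT}/cwl-*/_search",
    params={"q": "_exists_:prediction.label", "size": 5},  # only docs with a prediction
)
response.raise_for_status()

for hit in response.json()["hits"]["hits"]:
    source = hit["_source"]
    print(source.get("prediction"), source.get("text"))
```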

Creating Kibana Visualisations

Let's create some visualisations.

  • Go to the Visualize section in Kibana.

  • Click on Create new visualization.

  • Select Vertical Bar (bar chart) as the type of visualization.

  • Select the input for the visualisation as the MLOps Datatable.

  • By default, only the Y-axis is created, with Count as the aggregation. Let's create the X-axis.

  • Select the aggregation type as Terms and select the required field. A custom label can also be provided.

  • Save the visualisation by giving it a title.

Creating a Kibana Dashboard

A dashboard can now be created with the necessary visualisations.

  • Go to the Dashboard section in Kibana.

  • Create a new dashboard.

  • Click on the Add button. This will open a pane on the right side containing all the visualisations we have created so far.

  • Select the necessary visualisations and save the dashboard by giving it a name.

  • Now the dashboard contains the visualisations. They can be arranged according to your needs.

  • In order to share this dashboard, click on the Share button at the top right and choose Permalink. Get the short URL and share it with the concerned people.

🔚

This concludes the post. We have seen how to monitor predictions using CloudWatch & Kibana.

Complete code can also be found here: Github
