Ungraded Lab Part 1 - Deploying a Machine Learning Model

Welcome to this ungraded lab!

This lab is all about deploying a real machine learning model and seeing what doing so feels like. More concretely, you will deploy a computer vision model trained to detect common objects in pictures. Deploying a model is one of the last steps in a prototypical machine learning lifecycle, but we thought it would be exciting to have you deploy a model right away. This lab uses a pretrained model called YOLOV3. This model is very convenient for two reasons: it runs really fast, and it yields accurate results for object detection.

The sequence of steps/tasks to complete in this lab is as follows:

  1. Inspect the image data set used for object detection.
  2. Take a look at the model itself.
  3. Deploy the model using fastAPI. You can check its website here.

Here is a shortcut to the instructions on how to interact with your model once it has been deployed. This will be useful later on, for now just continue with the notebook as usual.

Object Detection with YOLOV3

Inspecting the images

Let's take a look at the images that will be passed to the YOLOV3 model. This will give you a sense of the types of common objects present for detection. These images are part of the ImageNet dataset and are stored within the images directory of this environment.
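If you want to poke around yourself, here is a minimal sketch of how you could list and display these images (it assumes cv2 and matplotlib are available in this environment, and the index used is just an example):

import os
import cv2
import matplotlib.pyplot as plt

# List all of the files stored in the images directory
image_files = os.listdir("images")
print(image_files)

# Load one of them and display it (OpenCV reads images in BGR order)
img = cv2.imread(os.path.join("images", image_files[0]))
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.axis("off")
plt.show()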

Overview of the model

Now that you have a sense of the image data and the objects present, let's try and see if the model is able to detect and classify them correctly.

For this you will be using cvlib, which is a very simple but powerful library for object detection that is fueled by OpenCV and TensorFlow.

More concretely, you will use the detect_common_objects function, which takes an image formatted as a numpy array and returns:

- bbox: a list of bounding box coordinates for the detected objects
- label: a list of labels for the detected objects
- conf: a list of confidence levels for the detections

In the next section you will visually see these elements in action.
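Before that, here is a minimal sketch of what the call itself looks like (the filename is illustrative; the image just needs to be loaded as a numpy array):

import cv2
import cvlib as cv

# Load an example image as a numpy array (illustrative filename)
img = cv2.imread("images/apple.jpg")

# bbox: bounding box coordinates, label: detected object classes, conf: confidence levels
bbox, label, conf = cv.detect_common_objects(img)
print(label, conf)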

Creating the detect_and_draw_box function

Let's define the detect_and_draw_box function, which takes as input arguments:

- the filename of the image to process
- the model to use for detection (more on this below)
- the confidence level for the detections

With these inputs, it detects common objects in the image and saves a new image displaying the bounding boxes alongside the detected object. These new images will be saved within the images_with_boxes directory.

You might ask yourself why this function receives the model as an input argument, and what models there are to choose from. The answer is that detect_common_objects uses the yolov3 model by default. However, there is another option available that is much smaller and requires less computational power.

It is the yolov3-tiny version. As the name indicates, this model is designed for constrained environments that cannot store big models. With this comes a natural tradeoff: its results are less accurate than those of the full model, but it still works pretty well. Going forward you can use whichever you prefer, but by default yolov3-tiny will be used.

The model output is a vector of probabilities for the presence of different objects in the image. The last input argument, the confidence level, determines the threshold that a probability needs to surpass for a given object to be reported as detected in the supplied image. By default, detect_common_objects uses a value of 0.5 for this.
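Putting these pieces together, a minimal sketch of such a function could look like the one below (it assumes the input images live in the images directory and that the images_with_boxes directory already exists):

import cv2
import cvlib as cv
from cvlib.object_detection import draw_bbox

def detect_and_draw_box(filename, model="yolov3-tiny", confidence=0.5):
    # Read the image from the images directory
    img = cv2.imread(f"images/{filename}")

    # Detect common objects using the chosen model and confidence threshold
    bbox, label, conf = cv.detect_common_objects(img, confidence=confidence, model=model)

    # Report every detected object alongside its confidence level
    for l, c in zip(label, conf):
        print(f"Detected object: {l} with confidence level of {c}")

    # Draw the bounding boxes and save the annotated image
    output_image = draw_bbox(img, bbox, label, conf)
    cv2.imwrite(f"images_with_boxes/{filename}", output_image)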

Let's try it out for the example images.
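For instance (the filenames below are only illustrative examples of images shipped with the lab):

for image_file in ["apple.jpg", "clock.jpg", "oranges.jpg", "car.jpg"]:
    detect_and_draw_box(image_file)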

Changing the confidence level

Looks like the object detection went fairly well. Let's try it out on a more difficult image containing several objects:

The model failed to detect several fruits and misclassified an orange as an apple. This might seem strange since it was able to detect one apple before, so one might think the model has a fair representation of what an apple looks like.

One possibility is that the model did detect the other fruits but with a confidence level lower than 0.5. Let's test if this is a valid hypothesis:
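One way to do this is to call the function again with a much lower threshold (the filename and the value of 0.2 are illustrative):

detect_and_draw_box("fruits.jpg", confidence=0.2)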

By lowering the confidence level the model successfully detects most of the fruits. However, in order to correctly detect the objects present, we had to set the confidence level really low. In general, you should be careful when decreasing or increasing these kinds of parameters, as changing them might yield undesired results.

As for the orange that was misclassified as an apple in this concrete example, it serves as a reminder that these models are not perfect, and this should be considered when using them for tasks in production.

Deploying the model using fastAPI

Placing your object detection model in a server

Now that you know how the model works it is time for you to deploy it! Aren't you excited? :)

Before diving into deployment, let's quickly recap some important concepts and how they translate to fastAPI. The images that are uploaded to the server will be stored within the images_uploaded directory.

Some concept clarifications

Client-Server model

When talking about deploying, what is usually meant is to put all of the software required for predicting on a server. By doing this, a client can interact with the model by sending requests to the server.

This client-server interaction is out of the scope of this notebook but there are a lot of resources on the internet that you can use to understand it better.

The important thing you need to focus on is that the Machine Learning model lives on a server waiting for clients to submit prediction requests. The client should provide the required information that the model needs in order to make a prediction. Keep in mind that it is common to batch many predictions into a single request. The server then uses this information to return predictions to the client, who can use them at their leisure.

Let's get started by creating an instance of the FastAPI class:

from fastapi import FastAPI

app = FastAPI()

The next step is using this instance to create endpoints that will handle the logic for predicting (more on this next). Once all the code is in place to run the server you only need to use the command:

uvicorn.run(app)

Your API is coded using fastAPI but the serving is done using uvicorn, which is a really fast Asynchronous Server Gateway Interface (ASGI) implementation. Both technologies are closely interconnected and you don't need to understand the implementation details. Knowing that uvicorn handles the serving is sufficient for the purpose of this lab.

Endpoints

You can host multiple Machine Learning models on the same server. For this to work, you can assign a different endpoint to each model so you always know which model is being used. An endpoint is represented by a pattern in the URL. For example, if you have a website called myawesomemodel.com you could have three different models at the following endpoints:

myawesomemodel.com/count-cars/
myawesomemodel.com/count-apples/
myawesomemodel.com/count-plants/

Each model would do what the name pattern suggests.

In fastAPI you define an endpoint by creating a function that will handle all of the logic for that endpoint and decorating it with a decorator that specifies the HTTP method allowed (more on this next) and the URL pattern it will use.

The following example shows how to allow an HTTP GET request for the endpoint "/my-endpoint":

@app.get("/my-endpoint")
def handle_endpoint():
    ...
    ...

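For instance, the root endpoint that you will visit later in this lab could be defined like this (a sketch using the app instance created above; the exact message comes from the section on visiting the server below):

@app.get("/")
def home():
    # A simple endpoint confirming that the API is up and running
    return "Congratulations! Your API is working as expected."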
HTTP Requests

The client and the server communicate with each other through a protocol called HTTP. The key concept here is that this communication between client and server uses some verbs to denote common actions. Two very common verbs are:

- GET: retrieves information from the server.
- POST: provides information to the server, which it uses to respond to the request.

If your client does a GET request to an endpoint of a server, it will get some information from that endpoint without needing to provide anything additional. In the case of a POST request, you are explicitly telling the server that you will provide some information that it must process in some way.

Interactions with Machine Learning models living on endpoints are usually done via a POST request since you need to provide the information that is required to compute a prediction.

Let's take a look at a POST request:

@app.post("/my-other-endpoint")
def handle_other_endpoint(param1: int, param2: str):
    ...
    ...

For POST requests, the handler function takes parameters. In contrast to GET requests, POST requests expect the client to provide some information. In this case we supplied two parameters: an integer and a string.
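Putting all of the pieces together, here is a sketch of what the /predict endpoint used later in this lab could look like. The Model enum, the status code and the exact messages are illustrative, and it assumes the images_uploaded directory mentioned above exists:

import io
from enum import Enum

import cv2
import cvlib as cv
import numpy as np
from cvlib.object_detection import draw_bbox
from fastapi import FastAPI, File, HTTPException, UploadFile
from fastapi.responses import StreamingResponse

# The same FastAPI instance created earlier, repeated here so the sketch is self-contained
app = FastAPI()

class Model(str, Enum):
    # The two model options discussed in the overview above
    yolov3tiny = "yolov3-tiny"
    yolov3 = "yolov3"

@app.post("/predict")
def prediction(model: Model, file: UploadFile = File(...)):
    # 1. Validate that the uploaded file looks like an image
    if file.filename.split(".")[-1].lower() not in ("jpg", "jpeg", "png"):
        raise HTTPException(status_code=415, detail="Unsupported file provided.")

    # 2. Decode the raw bytes into an OpenCV image (numpy array)
    image_stream = io.BytesIO(file.file.read())
    image_stream.seek(0)
    file_bytes = np.asarray(bytearray(image_stream.read()), dtype=np.uint8)
    image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)

    # 3. Run the detector and draw the bounding boxes with labels
    bbox, label, conf = cv.detect_common_objects(image, model=model.value)
    output_image = draw_bbox(image, bbox, label, conf)

    # 4. Save the annotated image and stream it back to the client
    cv2.imwrite(f"images_uploaded/{file.filename}", output_image)
    file_image = open(f"images_uploaded/{file.filename}", mode="rb")
    return StreamingResponse(file_image, media_type="image/jpeg")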

Why fastAPI?

With fastAPI you can create web servers to host your models very easily. Additionally, this framework is extremely fast and it ships with a built-in client that can be used to interact with the server. To use it you will need to visit the "/docs" endpoint; you will see how to do this later on. Isn't that convenient?

Enough chatter, let's get going!

By running the following cell you will spin up the server!
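A minimal sketch of what that cell can look like is shown below (it uses the app instance defined above; nest_asyncio is assumed here so that uvicorn's event loop can run inside the notebook's own loop, and the host and port match the address mentioned later):

import nest_asyncio
import uvicorn

# Allow uvicorn to run inside Jupyter's already-running event loop
nest_asyncio.apply()

# Spin up the server on localhost port 8000
uvicorn.run(app, host="127.0.0.1", port=8000)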

This causes the notebook to block (no cells/code can run) until you manually interrupt the kernel. You can do this by clicking on the Kernel tab and then on Interrupt. You can also enter Jupyter's command mode by pressing the ESC key and tapping the I key twice.

The server is now running! Nice job!

Consume your service

Normally you would now head over to http://127.0.0.1:8000/ to see the server in action. However, the Coursera environment works somewhat differently from a regular PC. Within this environment you need to interact with the service through the navigation bar, which can be found at the top of your screen.

If you don't see this bar you might need to click on the Navigate button first.

Come back to this notebook

To come back you have two alternatives:

Visit the server

You can think of /serve/ as an alias for http://127.0.0.1:8000/.

With this in mind, to interact with the server you need to type /serve/ in this bar and press enter.

This will take you to the / endpoint of the server, which should display the message Congratulations! Your API is working as expected.

Using fastAPI's integrated client

To actually use your server for image detection you can leverage the client that comes built-in with fastAPI.

To use this client type /serve/docs in the navigation bar and press enter.

Try submitting an image and see how your API is able to detect objects within it and return a new image containing the bounding boxes alongside the labels of the detected objects.

When doing so you will get a screen that should look like the one below, follow the instructions next:

Instructions to use the client

Note: If you need to come back to this notebook to check these instructions you can do so as explained earlier. Remember that at the top of the notebook there is a shortcut to this section so you don't have to scroll all the way.

Click on top of the /predict endpoint and more options will become visible.

To test your server click on the Try it out button.

In the model field you can choose which model to use, and in the file field you can choose the image in which you want the server to detect objects.

Submit an image from your local filesystem by clicking the Choose File button, then click on the blue Execute button to send an HTTP request to the server. After doing so, scroll down and you will see the response from it. Pretty cool, right?

Try different images! You can use the ones provided with this lab or some of your own. Since the model uses the default confidence level of 0.5, it might not always succeed in detecting some objects.

To download the images provided with the lab, follow these steps:

Also, try submitting non-image files and see how the server reacts to them.

Congratulations on finishing this ungraded lab!

Real-life servers have a lot more going on in terms of security and performance. However, the code you just worked with is close to what you would see in real production environments. Hopefully, this lab served the purpose of increasing your familiarity with the process of deploying a Deep Learning model and consuming predictions from it.

Keep it up!

Consuming your model from another client

It is awesome that fastAPI allows you to interact with your API through its built-in client. However, you might wonder how to interact with your API using regular code rather than a UI.

There is a bonus section which shows how to code a minimal client in Python. This is useful to break down (at a very high level) what fastAPI's client is doing under the hood. However, this section cannot be used within Coursera's environment. For this reason, consider checking out the version of this lab that is meant to be run on your local computer. You can find it here.
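For reference, here is a minimal sketch of what such a client could look like when run on your local machine against the locally deployed server (the URL, filenames and parameters are illustrative):

import requests

# Illustrative values: local server address, model choice and an example image
url = "http://127.0.0.1:8000/predict"
params = {"model": "yolov3-tiny"}

with open("images/clock.jpg", "rb") as image_file:
    files = {"file": ("clock.jpg", image_file, "image/jpeg")}
    response = requests.post(url, params=params, files=files)

print(f"Status code: {response.status_code}")

# Save the annotated image returned by the server
with open("prediction.jpg", "wb") as out_file:
    out_file.write(response.content)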