Using ResNet18 To Classify Dogs
What is a ResNet?
A ResNet, short for “Residual Network,” is an architecture created to solve the vanishing gradient problem (gradients become so small that parameter updates barely improve performance) and the exploding gradient problem (gradients become so large that the updates destabilize learning). Solving these problems allows us to build larger and deeper networks.
They solved this by using skip connections: since stacks of deeper layers often struggle to even learn the identity mapping, each block instead learns only the difference (the residual) between the desired output and its input. The transformed input (the output of the block’s layers) is then added back to the original input before being passed on to the next block.
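To make this concrete, here is a minimal sketch of a residual block (the class name and layer sizes are illustrative, not taken from the actual ResNet18 definition):

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """A minimal sketch of a skip-connection (residual) block."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # The layers learn the residual F(x)...
        residual = self.conv2(self.relu(self.conv1(x)))
        # ...and the skip connection adds the original input back:
        # output = F(x) + x
        return self.relu(residual + x)
```

Because the block only has to push `F(x)` toward zero to represent the identity, gradients can flow through the addition unchanged, which is what keeps very deep networks trainable.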
Implementation
*This code is a modification of the example on the PyTorch website (https://pytorch.org/hub/pytorch_vision_resnet). You can get my modified code on GitHub (https://github.com/smrtdylan/Dog-Identification-Network). The dataset is Stanford Dogs, downloaded from Ajinkya Kolhe’s upload on Hugging Face.
1. First, let’s load up the model and set it to evaluation mode (which tells the model we’re testing it, not training it).
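Following the PyTorch Hub page this article is based on, that step might look like:

```python
import torch

# Download the pretrained ResNet18 from the PyTorch Hub
model = torch.hub.load("pytorch/vision", "resnet18",
                       weights="ResNet18_Weights.DEFAULT")
# Evaluation mode disables training-only behavior
# (dropout, batch-norm statistic updates)
model.eval()
```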
2. Next, we’re going to load a version of the dataset from Hugging Face. We will use the train split, since it has labels we can check the model’s answers against. We will also shuffle the dataset so that we always get random images.
3. Next, we’re going to download the ImageNet class labels the ResNet was trained on. Note that this list also contains many labels that are not in the Stanford Dogs dataset.
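The PyTorch Hub page fetches these labels as a plain text file, one class name per line; a sketch of that step:

```python
import urllib.request

# The ImageNet class list referenced on the PyTorch Hub page
LABELS_URL = ("https://raw.githubusercontent.com/pytorch/hub/"
              "master/imagenet_classes.txt")


def download_labels(url=LABELS_URL, path="imagenet_classes.txt"):
    urllib.request.urlretrieve(url, path)
    with open(path) as f:
        # One class name per line, e.g. "Samoyed", "golden retriever"
        return [line.strip() for line in f]


# categories = download_labels()
```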
Now comes the fun part: preprocessing the image, passing it into the network, getting the top 5 probabilities, and checking whether the model was correct.
4. Create the preprocessing pipeline and preprocess the image.
5. Now, we’re going to pass the images (in the same function) into the model and generate our probabilities.
6. Now, create a new function with two parameters: the probabilities from the network and the true label of the image. We are also going to print out the network’s top 5 predictions.
7. Now, we’re going to add a check (in the same function) for whether the model’s top answer was correct.
8. Now, let’s create a loop to test on some images (10 in our case)!
Here’s the final result:
Works Referenced/Extra Resources
These were the works I referenced while researching for this article. I also included some other resources that I thought would be beneficial for everybody.
“Exploding Gradient Problem” by DeepAI.
“Vanishing and Exploding Gradients Problems in Deep Learning” by GeeksForGeeks.
“ResNet” by PyTorch Team.
“What are Skip Connections in Deep Learning?” by Sivaram (Analytics Vidhya).