Building a Streamlit app to identify Indian Food — Part I (Data creation and augmentation)
Introduction
Streamlit is an open-source Python framework that lets a programmer build and publish a web app with just a few lines of code.
Being able to turn a model into an application is valuable for machine learning engineers and data scientists, as it lets them showcase their projects to the world.
In this series of articles, I will take you through a few steps to build your own image recognition application for Indian cuisine.
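To give a sense of how compact a Streamlit app can be, here is a minimal sketch of the kind of interface we will work towards. The title, labels, and file name are placeholders; the model wiring comes in later articles.

```python
# app.py -- run with: streamlit run app.py
import streamlit as st

st.title("Indian Food Classifier")

# Let the user upload a photo; for now we simply display it back.
uploaded = st.file_uploader("Upload a food photo", type=["jpg", "jpeg", "png"])
if uploaded is not None:
    st.image(uploaded, caption="Your upload")
```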
Getting the data
I have used the Indian cuisine data set available on Kaggle. This data set has 80 classes of Indian cuisine, to which I have added 5 more, for a total of 85 classes. The full list of 85 classes of Indian food can be found here.
The zipped data set folder can be found here. You will need to unzip this file to get access to the pictures.
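If you prefer to unzip the archive programmatically, here is a short sketch using Python's standard library. The archive name indian_food.zip is an assumption; substitute whatever name your download has.

```python
import zipfile

# Extract the downloaded archive into the current directory.
# "indian_food.zip" is an assumed file name -- adjust to your download.
with zipfile.ZipFile("indian_food.zip") as zf:
    zf.extractall(".")
```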
A few examples are below:
Reading the Image and Image Augmentation
Unzipping the downloaded file creates a folder named indian_food/.
This folder contains 85 sub-folders, each corresponding to one of the 85 image classes.
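A quick sanity check that all 85 class folders are present can look like this (a sketch, assuming the indian_food/ folder sits in your working directory):

```python
import os

DATA_DIR = "indian_food"

# Count only directories, in case stray files sit alongside the class folders.
classes = sorted(
    d for d in os.listdir(DATA_DIR)
    if os.path.isdir(os.path.join(DATA_DIR, d))
)
print(f"{len(classes)} classes found")  # expect 85
```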
Before we build a convolutional neural network to classify the images, we should also include versions of the images that are noisy, flipped vertically/horizontally, channel-shifted, sheared, and so on.
The code below does just that. It uses the ImageDataGenerator class from the Keras preprocessing library to generate image versions that are rotated, noisy, and shifted.
These images are then stored in a new folder called train_new_large/; the sub-folder structure (each sub-folder corresponding to a single image class) remains the same.
The snippet also makes sure that only image files are read (files with the .jpg, .png, and .jpeg extensions).
For each original image, 15 new augmented copies are added to the new folder, as sketched below.
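Here is a sketch of that augmentation loop. The folder names indian_food/ and train_new_large/ and the count of 15 copies come from the description above; the specific ImageDataGenerator parameter values and the add_noise helper are assumptions you can tune.

```python
import os

import numpy as np
from tensorflow.keras.preprocessing.image import (
    ImageDataGenerator, img_to_array, load_img)

SRC_DIR = "indian_food"        # unzipped data set
DST_DIR = "train_new_large"    # augmented output
VALID_EXT = (".jpg", ".jpeg", ".png")
COPIES_PER_IMAGE = 15


def add_noise(img):
    # Lightly perturb pixel values; the noise level here is an assumption.
    return img + np.random.normal(0.0, 10.0, img.shape)


# Augmentation parameters are illustrative, not the article's exact values.
datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.2,
    channel_shift_range=30.0,
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode="nearest",
    preprocessing_function=add_noise,
)

for class_name in os.listdir(SRC_DIR):
    class_dir = os.path.join(SRC_DIR, class_name)
    if not os.path.isdir(class_dir):
        continue
    # Mirror the per-class sub-folder structure in the output folder.
    out_dir = os.path.join(DST_DIR, class_name)
    os.makedirs(out_dir, exist_ok=True)
    for fname in os.listdir(class_dir):
        # Skip anything that is not a jpg/jpeg/png image file.
        if not fname.lower().endswith(VALID_EXT):
            continue
        img = load_img(os.path.join(class_dir, fname))
        x = img_to_array(img)[np.newaxis, ...]   # shape (1, h, w, 3)
        # Each yielded batch writes one augmented copy to out_dir.
        for i, _ in enumerate(datagen.flow(
                x, batch_size=1, save_to_dir=out_dir,
                save_prefix=os.path.splitext(fname)[0],
                save_format="jpeg")):
            if i + 1 >= COPIES_PER_IMAGE:
                break
```

With save_to_dir set, every batch the generator yields is also written to disk, so breaking out of the loop after 15 iterations caps the output at 15 augmented copies per original image.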
The above process takes about 45 minutes to an hour, depending on the system you are using.
Next Steps
In the next article, I will go through the process of importing a pre-trained EfficientNet-V2 model for our classification task. The next article can be found here.