Building a Streamlit app to identify Indian Food — Part I (Data creation and augmentation)

Pradeep Adhokshaja
3 min read · Nov 12, 2022


Streamlit Logo

Introduction

Streamlit is an open-source Python framework that lets a programmer create and publish a web app with just a few lines of code.

Building applications is important for machine learning engineers and data scientists, as it allows them to showcase their projects to the world.

In this series of articles, I will take you through a few steps to build your own image recognition application for Indian cuisine.

Getting the data

I have used the Indian cuisine data set available on Kaggle. This data set has 80 classes of Indian food. Apart from those 80 classes, I have also included 5 more. The full list of 85 classes of Indian food can be found here.

The zipped data set folder can be found here. You will need to unzip this file to get access to the pictures.

A few examples are below:

Aloo Methi
Vada
Dosa

Reading the Image and Image Augmentation

After unzipping the downloaded file, a folder with the following name is created:

indian_food/

This folder contains 85 sub-folders, each corresponding to one of the 85 image classes.
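With this layout, the class names can be recovered directly from the directory structure. A minimal sketch (the folder name `indian_food/` is from the article; the helper name is mine):

```python
from pathlib import Path

def list_classes(data_dir: str) -> list[str]:
    """Return the sorted class names, one per sub-folder of data_dir."""
    return sorted(p.name for p in Path(data_dir).iterdir() if p.is_dir())

# list_classes("indian_food") should return 85 names, one per dish class.
```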

Before we build any convolutional neural network to classify images, we need to make sure that we also include versions of the images that are noisy, vertically/horizontally flipped, channel-shifted, sheared, and so on. Training on these variations helps the model generalise beyond the exact photos in the data set.

The code below does just that. It uses the ImageDataGenerator() class from the Keras preprocessing library to generate image versions that are rotated, noisy, and shifted.

These images are then stored in a new folder called train_new_large/. The sub-folder structure (each sub-folder corresponding to a single image class) remains the same.

The code snippet below makes sure that only image files are read (files with the jpg, png, and jpeg extensions).
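A simple way to apply that filter is to check the file extension. A small sketch (the extension list comes from the article; the helper name is my own):

```python
import os

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def is_image_file(filename: str) -> bool:
    """True if the file has one of the accepted image extensions."""
    return os.path.splitext(filename)[1].lower() in IMAGE_EXTENSIONS
```

Lower-casing the extension also accepts files named e.g. `dosa.JPG`, which is common in scraped data sets.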

For each image present, 15 new augmented images are added to the new folder.

augment.py for image augmentation
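As a rough sketch, an augment.py along these lines could look like the following. The folder names and the 15-augmentations-per-image count come from the article; the specific augmentation ranges are my own illustrative choices, assuming the tensorflow.keras ImageDataGenerator API. ImageDataGenerator has no direct Gaussian-noise option, so channel shift stands in for the "noisy" variants here:

```python
import os

SOURCE_DIR = "indian_food"        # one sub-folder per class (from the article)
TARGET_DIR = "train_new_large"    # augmented output, same sub-folder layout
AUGMENTATIONS_PER_IMAGE = 15      # 15 new images per original (from the article)
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def augment_dataset(source_dir=SOURCE_DIR, target_dir=TARGET_DIR,
                    n_augmented=AUGMENTATIONS_PER_IMAGE):
    # Keras imports kept inside the function so the module loads
    # even on machines without TensorFlow installed.
    import numpy as np
    from tensorflow.keras.preprocessing.image import (
        ImageDataGenerator, load_img, img_to_array,
    )

    # Illustrative settings: rotation, shifts, shear, channel shift,
    # and flips, as described in the article. Ranges are assumptions.
    datagen = ImageDataGenerator(
        rotation_range=30,
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.2,
        channel_shift_range=30.0,
        horizontal_flip=True,
        vertical_flip=True,
        fill_mode="nearest",
    )

    for class_name in sorted(os.listdir(source_dir)):
        class_src = os.path.join(source_dir, class_name)
        if not os.path.isdir(class_src):
            continue
        class_dst = os.path.join(target_dir, class_name)
        os.makedirs(class_dst, exist_ok=True)

        for filename in os.listdir(class_src):
            # keep only jpg/jpeg/png files
            if os.path.splitext(filename)[1].lower() not in IMAGE_EXTENSIONS:
                continue
            image = img_to_array(load_img(os.path.join(class_src, filename)))
            batch = np.expand_dims(image, axis=0)  # shape (1, h, w, 3)

            # flow() yields augmented batches and writes each one to class_dst.
            flow = datagen.flow(
                batch, batch_size=1, save_to_dir=class_dst,
                save_prefix=os.path.splitext(filename)[0], save_format="jpeg",
            )
            for _ in range(n_augmented):
                next(flow)

# Usage: augment_dataset()  # reads indian_food/, writes train_new_large/
```

The author's actual script may differ in ranges and structure; the point is the shape of the loop, with one `flow(..., save_to_dir=...)` generator per source image.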

The above process will take about 45 minutes to an hour depending on the system that you are using.

Next Steps

In the next article, I will go through the process of importing a pre-trained EfficientNet-V2 model for our classification task. The next article can be found here.

