🥤Lighter deep learning for IoT devices (Raspberry Pi, mobile app, AWS Lambda)

TensorFlow Lite - Model Optimization for IoT Devices

📹 Video Tutorial

https://youtu.be/g2jI_0oqWsc

1. Introduction - The Problem

⚠️ How Big is TensorFlow?

TensorFlow installation size: 1.22 GB 😢

This massive size makes deployment difficult for:

  • 🔴 Raspberry Pi and IoT devices
  • 📱 Mobile phones
  • ☁️ AWS Lambda functions
  • ⚡ Edge computing devices

What Gets Installed?

After running pip3 install --upgrade tensorflow, you get all these dependencies:

📦 requirements.txt
absl-py==0.9.0
astor==0.8.1
cachetools==4.1.0
certifi==2020.4.5.1
chardet==3.0.4
gast==0.2.2
google-auth==1.13.1
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.28.1
h5py==2.10.0
idna==2.9
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
Markdown==3.2.1
numpy==1.18.2
oauthlib==3.1.0
opt-einsum==3.2.0
protobuf==3.11.3
pyasn1==0.4.8
pyasn1-modules==0.2.8
requests==2.23.0
requests-oauthlib==1.3.0
rsa==4.0
scipy==1.4.1
six==1.14.0
tensorboard==2.1.1
tensorflow==2.1.0
tensorflow-estimator==2.1.0
termcolor==1.1.0
urllib3==1.25.8
Werkzeug==1.0.1
wrapt==1.12.1

Additional problem: the trained models used for inference can themselves exceed 100 MB!

✅ The Solution: TensorFlow Lite

Convert your models to lightweight TensorFlow Lite format for efficient deployment on resource-constrained devices.

2. Convert to TensorFlow Lite

Official Documentation: https://www.tensorflow.org/lite

Reference Colab Notebook:
TensorFlow Lite Digit Classifier Example

Step 1: Load Your Keras Model

🐍 load_model.py
import tensorflow as tf
from tensorflow.keras.models import load_model

# Load your pre-trained Keras model
mnist_cnn_model = load_model('/content/drive/My Drive/Colab Notebooks/my_cnn_mnist_model.h5')

Step 2: Convert to TensorFlow Lite Format

🐍 convert_to_tflite.py
# Convert Keras model to TF Lite format
converter = tf.lite.TFLiteConverter.from_keras_model(mnist_cnn_model)
tflite_float_model = converter.convert()

# Show model size in KBs
float_model_size = len(tflite_float_model) / 1024
print('Float model size = %dKBs.' % float_model_size)

Step 3: Apply Quantization for Further Compression

💡 What is Quantization?

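Quantization reduces model size by storing weights as 8-bit integers instead of 32-bit floating-point values, which typically shrinks the model by about 4x with little accuracy loss. With only tf.lite.Optimize.DEFAULT set and no representative dataset, the converter applies dynamic range quantization: the weights are quantized at conversion time, while the model's inputs and outputs stay float32.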
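
To make the arithmetic concrete, here is a minimal NumPy sketch of 8-bit quantization using one scale and one zero point. It illustrates the general idea only and is not the TFLite converter's actual implementation; the function names `quantize_int8` and `dequantize` and the sample weights are made up for this example.

```python
import numpy as np

def quantize_int8(weights):
    """Map float32 weights onto int8 using one scale and zero point (illustrative)."""
    w_min = min(float(weights.min()), 0.0)  # keep 0.0 inside the representable range
    w_max = max(float(weights.max()), 0.0)
    scale = (w_max - w_min) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = int(round(-128 - w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float32 values from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical weights standing in for one layer of a model
weights = np.linspace(-1.0, 1.0, 1000, dtype=np.float32)
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)

print("float32 bytes:", weights.nbytes)  # 4 bytes per weight
print("int8 bytes:   ", q.nbytes)        # 1 byte per weight
print("max abs error:", float(np.abs(weights - restored).max()))
```

Each weight drops from 4 bytes to 1 byte, which is where the roughly 4x size reduction comes from, and the round-trip error stays within about one quantization step.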

🐍 quantize_model.py
# Re-convert the model to TF Lite using quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quantized_model = converter.convert()

# Show model size in KBs
quantized_model_size = len(tflite_quantized_model) / 1024
print('Quantized model size = %dKBs,' % quantized_model_size)
print('which is about %d%% of the float model size.' \
      % (quantized_model_size * 100 / float_model_size))

Step 4: Prepare for Inference Testing

🐍 setup_inference.py
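# Load the quantized model into an interpreter, allocate its tensors,
# and look up the input index and an accessor for the output tensor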
tflite_model = tflite_quantized_model
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

input_tensor_index = interpreter.get_input_details()[0]["index"]
output = interpreter.tensor(interpreter.get_output_details()[0]["index"])
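
Each entry returned by get_input_details() also carries the expected "shape" and "dtype" of the tensor, so a mismatch can be caught before calling set_tensor. The helper below is a hypothetical sketch, not part of the TFLite API, and the dictionary in the demo is a hand-written stand-in whose values are assumed for a 28x28 MNIST model.

```python
import numpy as np

def check_input(input_detail, array):
    """Raise ValueError if `array` does not match the tensor described by `input_detail`.

    `input_detail` is one entry of interpreter.get_input_details(); only its
    "shape" and "dtype" keys are used here.
    """
    expected_shape = tuple(int(d) for d in input_detail["shape"])
    expected_dtype = np.dtype(input_detail["dtype"])
    if tuple(array.shape) != expected_shape:
        raise ValueError(f"shape {array.shape} != expected {expected_shape}")
    if array.dtype != expected_dtype:
        raise ValueError(f"dtype {array.dtype} != expected {expected_dtype}")
    return True

# Stand-in for interpreter.get_input_details()[0] (values assumed, not read from a real model)
fake_detail = {"index": 0, "shape": np.array([1, 28, 28, 1]), "dtype": np.float32}

print(check_input(fake_detail, np.zeros((1, 28, 28, 1), dtype=np.float32)))
```

In the notebook, the same check could be run as check_input(interpreter.get_input_details()[0], test_image) once the test image has been prepared.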

Step 5: Load Test Image

🐍 load_image.py
from PIL import Image
import numpy as np

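# Load the image, force single-channel grayscale, and resample to 28x28.
# (np.resize does not resample pixels; it only repeats or truncates raw data.)
img_path = '/content/drive/My Drive/Colab Notebooks/mnist_8_585.jpg'
img = Image.open(img_path).convert('L').resize((28, 28))
# Scale pixel values to [0, 1] and add the batch and channel dimensions
im2arr = np.array(img).astype('float32') / 255.
test_image = im2arr.reshape(1, 28, 28, 1)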

Step 6: Run Inference

🐍 run_inference.py
# Set input tensor
interpreter.set_tensor(input_tensor_index, test_image)

# Run inference
interpreter.invoke()

# Post-processing: find the digit with highest probability
digit = np.argmax(output()[0])
print("Output probabilities:", output())
print("Predicted digit:", digit)
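
When the top prediction looks doubtful, it can help to see the runners-up as well as the argmax. The sketch below ranks the most likely digits from a probability vector; `top_k` is a hypothetical helper and the example vector is made up, standing in for output()[0].

```python
import numpy as np

def top_k(probabilities, k=3):
    """Return the k most likely class indices with their scores, best first."""
    probs = np.asarray(probabilities).ravel()
    order = np.argsort(probs)[::-1][:k]
    return [(int(i), float(probs[i])) for i in order]

# Made-up output vector standing in for output()[0]
example = np.array(
    [0.01, 0.02, 0.01, 0.05, 0.01, 0.02, 0.03, 0.01, 0.80, 0.04],
    dtype=np.float32,
)
for digit, score in top_k(example):
    print(f"digit {digit}: {score:.2f}")
```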

Step 7: Save the Model

🐍 save_model.py
from google.colab import files

# Save the TFLite model to Google Drive
tflite_path = '/content/drive/My Drive/Colab Notebooks/mnist.tflite'
with open(tflite_path, 'wb') as f:
    f.write(tflite_quantized_model)

# Download the saved file to your local machine
# (pass the same path that was written above, not just the file name)
files.download(tflite_path)

3. Standalone Deep Learning Inference

✨ Lightweight Installation

Instead of installing the full 1.22GB TensorFlow package, you only need the TensorFlow Lite interpreter!

Install TensorFlow Lite Interpreter

Official Guide: https://www.tensorflow.org/lite/guide/python

  1. Download the appropriate prebuilt wheel file for your platform
  2. Install using pip
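
Prebuilt wheels are built per Python version and per platform, so it helps to check both on the target device before choosing a file. The snippet below is a small standard-library sketch; `describe_platform` is a hypothetical helper, and the tag it builds assumes a CPython interpreter.

```python
import platform
import sys

def describe_platform():
    """Return the details that usually determine which wheel matches this machine."""
    return {
        # CPython version tag as it appears in wheel file names, e.g. "cp37"
        "python_tag": f"cp{sys.version_info.major}{sys.version_info.minor}",
        "system": platform.system(),    # e.g. "Linux"
        "machine": platform.machine(),  # e.g. "armv7l", "aarch64", "x86_64"
    }

info = describe_platform()
for key, value in info.items():
    print(f"{key}: {value}")
```

Run it on the device itself (for example the Raspberry Pi), since the architecture reported there is what the wheel has to match.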

Standalone Inference Code

🐍 standalone_inference.py - Complete Example
"""
Standalone TensorFlow Lite Inference Example
No full TensorFlow installation required!
"""

import tflite_runtime.interpreter as tflite
import numpy as np
from PIL import Image

# Load TFLite model
interpreter = tflite.Interpreter(model_path='./mnist.tflite')
interpreter.allocate_tensors()

# Get input and output tensors
input_tensor_index = interpreter.get_input_details()[0]["index"]
output = interpreter.tensor(interpreter.get_output_details()[0]["index"])

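# Load the test image, force single-channel grayscale, and resample to 28x28.
# (np.resize does not resample pixels; it only repeats or truncates raw data.)
img_path = './mnist_8_585.jpg'
img = Image.open(img_path).convert('L').resize((28, 28))
# Scale pixel values to [0, 1] and add the batch and channel dimensions
im2arr = np.array(img).astype('float32') / 255.
test_image = im2arr.reshape(1, 28, 28, 1)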

# Set input tensor
interpreter.set_tensor(input_tensor_index, test_image)

# Run inference
interpreter.invoke()

# Get prediction
digit = np.argmax(output()[0])
print("Output probabilities:", output())
print("Predicted digit:", digit)

📊 Size Comparison

| Component | Size | Use Case |
|---|---|---|
| Full TensorFlow | 1.22 GB | Training & Development |
| TFLite Float Model | ~100+ KB | Inference Only |
| TFLite Quantized Model | ~25 KB (75% reduction) | IoT & Mobile Devices |

🎯 Key Benefits

  • Smaller model size: 4x reduction with quantization
  • 🚀 Faster inference: Optimized for mobile and edge devices
  • 📦 Minimal dependencies: Only TFLite runtime needed
  • 💾 Lower memory footprint: Perfect for resource-constrained devices
  • 🔋 Energy efficient: Reduced power consumption
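
Speed claims are easiest to trust when measured on the target device. Here is a minimal timing sketch that accepts any zero-argument callable, such as interpreter.invoke; `measure_latency_ms` is a hypothetical helper, and the demo times a dummy workload rather than a real model.

```python
import statistics
import time

def measure_latency_ms(fn, warmup=3, runs=20):
    """Call `fn` repeatedly and return (median, maximum) latency in milliseconds."""
    for _ in range(warmup):  # warm-up calls are not timed
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)

# Dummy workload; on a device, pass interpreter.invoke instead
median_ms, worst_ms = measure_latency_ms(lambda: sum(range(10_000)))
print(f"median {median_ms:.3f} ms, worst {worst_ms:.3f} ms")
```

Comparing the float and quantized models with the same helper on the same hardware gives a fairer picture than relying on model size alone.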

📚 Resources

🙇🏻 Thank you!

main picture from Evie S.
