🥤 Lighter deep learning for IoT devices (Raspberry Pi, mobile apps, AWS Lambda)
TensorFlow Lite - Model Optimization for IoT Devices
📹 Video Tutorial
1. Introduction - The Problem
⚠️ How Big is TensorFlow?
TensorFlow installation size: 1.22 GB 😢
This massive size makes deployment difficult for:
- 🔴 Raspberry Pi and IoT devices
- 📱 Mobile phones
- ☁️ AWS Lambda functions
- ⚡ Edge computing devices
What Gets Installed?
After running pip3 install --upgrade tensorflow, you get all these dependencies:
absl-py==0.9.0
astor==0.8.1
cachetools==4.1.0
certifi==2020.4.5.1
chardet==3.0.4
gast==0.2.2
google-auth==1.13.1
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.28.1
h5py==2.10.0
idna==2.9
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
Markdown==3.2.1
numpy==1.18.2
oauthlib==3.1.0
opt-einsum==3.2.0
protobuf==3.11.3
pyasn1==0.4.8
pyasn1-modules==0.2.8
requests==2.23.0
requests-oauthlib==1.3.0
rsa==4.0
scipy==1.4.1
six==1.14.0
tensorboard==2.1.1
tensorflow==2.1.0
tensorflow-estimator==2.1.0
termcolor==1.1.0
urllib3==1.25.8
Werkzeug==1.0.1
wrapt==1.12.1
Additional Problem: Inference models can also exceed 100MB!
✅ The Solution: TensorFlow Lite
Convert your models to lightweight TensorFlow Lite format for efficient deployment on resource-constrained devices.
2. Convert to TensorFlow Lite
Official Documentation: https://www.tensorflow.org/lite
Reference Colab Notebook:
TensorFlow Lite Digit Classifier Example
Step 1: Load Your Keras Model
import tensorflow as tf
from tensorflow.keras.models import load_model
# Load your pre-trained Keras model
mnist_cnn_model = load_model('/content/drive/My Drive/Colab Notebooks/my_cnn_mnist_model.h5')
Step 2: Convert to TensorFlow Lite Format
# Convert Keras model to TF Lite format
converter = tf.lite.TFLiteConverter.from_keras_model(mnist_cnn_model)
tflite_float_model = converter.convert()
# Show model size in KBs
float_model_size = len(tflite_float_model) / 1024
print('Float model size = %dKBs.' % float_model_size)
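The notebook converts from an in-memory Keras model. If your model is instead exported as a SavedModel directory, the converter has a matching entry point; the snippet below is a hedged sketch, and the directory path is a placeholder rather than a file from the original notebook.
# Alternative: convert from a SavedModel directory (path is a placeholder)
sm_converter = tf.lite.TFLiteConverter.from_saved_model('/content/my_saved_model')
tflite_float_model = sm_converter.convert()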

Step 3: Apply Quantization for Further Compression
💡 What is Quantization?
Quantization reduces model size by converting 32-bit floating point weights to 8-bit integers, typically achieving 4x size reduction with minimal accuracy loss.
# Re-convert the model to TF Lite using quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quantized_model = converter.convert()
# Show model size in KBs
quantized_model_size = len(tflite_quantized_model) / 1024
print('Quantized model size = %dKBs,' % quantized_model_size)
print('which is about %d%% of the float model size.'
      % (quantized_model_size * 100 / float_model_size))
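Note that Optimize.DEFAULT on its own applies dynamic-range quantization, which is why the interpreter in the next steps still accepts float32 inputs. If your target needs integer-only operations (some microcontrollers and accelerators do), TensorFlow Lite also supports full-integer quantization. The sketch below is an optional extension; the representative_data_gen generator uses random placeholder inputs that you would replace with real training samples.
import numpy as np

# Optional: full-integer quantization (sketch). The representative dataset lets the
# converter calibrate activation ranges; random data here is only a placeholder.
def representative_data_gen():
    for _ in range(100):
        yield [np.random.rand(1, 28, 28, 1).astype(np.float32)]

int8_converter = tf.lite.TFLiteConverter.from_keras_model(mnist_cnn_model)
int8_converter.optimizations = [tf.lite.Optimize.DEFAULT]
int8_converter.representative_dataset = representative_data_gen
# Fail conversion if any op cannot be expressed with int8 kernels
int8_converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_int8_model = int8_converter.convert()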

Step 4: Prepare for Inference Testing
# Create input and output tensors
tflite_model = tflite_quantized_model
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_tensor_index = interpreter.get_input_details()[0]["index"]
output = interpreter.tensor(interpreter.get_output_details()[0]["index"])
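Before loading an image, it can help to confirm what the interpreter actually expects. The quick check below is an addition to the original notebook; for the dynamic-range quantized model the input should still be float32 with shape (1, 28, 28, 1).
# Inspect the input/output signatures reported by the interpreter
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
print('Input shape:', input_details['shape'], 'dtype:', input_details['dtype'])
print('Output shape:', output_details['shape'], 'dtype:', output_details['dtype'])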
Step 5: Load Test Image
from PIL import Image
import numpy as np
# Load and preprocess image
img_path = '/content/drive/My Drive/Colab Notebooks/mnist_8_585.jpg'
img = Image.open(img_path).convert('L')     # force grayscale
img = img.resize((28, 28))                  # resample to the model's 28x28 input
im2arr = np.array(img).astype('float32') / 255.
test_image = im2arr.reshape(1, 28, 28, 1)
Step 6: Run Inference
# Set input tensor
interpreter.set_tensor(input_tensor_index, test_image)
# Run inference
interpreter.invoke()
# Post-processing: find the digit with highest probability
digit = np.argmax(output()[0])
print("Output probabilities:", output())
print("Predicted digit:", digit)

Step 7: Save the Model
# Save TFLite model to Google Drive
with open('/content/drive/My Drive/Colab Notebooks/mnist.tflite', 'wb') as f:
    f.write(tflite_quantized_model)
# Download the model to your local machine
from google.colab import files
files.download('/content/drive/My Drive/Colab Notebooks/mnist.tflite')
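As an optional sanity check (assuming the same Drive path used above), you can confirm that the size of the file on disk matches the quantized size printed in Step 3:
import os

# Size of the .tflite file we just wrote, in kilobytes
tflite_path = '/content/drive/My Drive/Colab Notebooks/mnist.tflite'
print('Saved model size = %.1f KB' % (os.path.getsize(tflite_path) / 1024))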
3. Standalone Deep Learning Inference
✨ Lightweight Installation
Instead of installing the full 1.22GB TensorFlow package, you only need the TensorFlow Lite interpreter!
Install TensorFlow Lite Interpreter
Official Guide: https://www.tensorflow.org/lite/guide/python
- Download the prebuilt wheel file that matches your platform and Python version
- Install it with pip (a fallback import pattern is sketched after this list)
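Once the interpreter package is installed, inference code only needs the tiny runtime. One convenient pattern, shown below as a sketch (an assumption about your setup rather than part of the official guide), is to prefer tflite_runtime and fall back to the full TensorFlow interpreter when developing on a workstation:
# Prefer the lightweight runtime on the device; fall back to the full
# TensorFlow build on a development machine. Both expose the same Interpreter API.
try:
    import tflite_runtime.interpreter as tflite
    Interpreter = tflite.Interpreter
except ImportError:
    import tensorflow as tf
    Interpreter = tf.lite.Interpreter

interpreter = Interpreter(model_path='./mnist.tflite')
interpreter.allocate_tensors()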

Standalone Inference Code
"""
Standalone TensorFlow Lite Inference Example
No full TensorFlow installation required!
"""
import tflite_runtime.interpreter as tflite
import numpy as np
from PIL import Image
# Load TFLite model
interpreter = tflite.Interpreter(model_path='./mnist.tflite')
interpreter.allocate_tensors()
# Get input and output tensors
input_tensor_index = interpreter.get_input_details()[0]["index"]
output = interpreter.tensor(interpreter.get_output_details()[0]["index"])
# Load and preprocess test image
img_path = './mnist_8_585.jpg'
img = Image.open(img_path).convert('L')     # force grayscale
img = img.resize((28, 28))                  # resample to the model's 28x28 input
im2arr = np.array(img).astype('float32') / 255.
test_image = im2arr.reshape(1, 28, 28, 1)
# Set input tensor
interpreter.set_tensor(input_tensor_index, test_image)
# Run inference
interpreter.invoke()
# Get prediction
digit = np.argmax(output()[0])
print("Output probabilities:", output())
print("Predicted digit:", digit)
📊 Size Comparison
| Component | Size | Use Case |
|---|---|---|
| Full TensorFlow | 1.22 GB | Training & Development |
| TFLite Float Model | ~100+ KB | Inference Only |
| TFLite Quantized Model | ~25 KB (75% reduction) | IoT & Mobile Devices |
🎯 Key Benefits
- ⚡ Smaller model size: 4x reduction with quantization
- 🚀 Faster inference: Optimized for mobile and edge devices
- 📦 Minimal dependencies: Only TFLite runtime needed
- 💾 Lower memory footprint: Perfect for resource-constrained devices
- 🔋 Energy efficient: Reduced power consumption
📚 Resources
🙇🏻 Thank you!
main picture from Evie S.