✍️ From Pen to Pixel: Custom OCR Pipeline for Handwritten Journal Digitization

This project is a hand-built, end-to-end Optical Character Recognition (OCR) pipeline that transcribes my handwritten journal entries into digital text, using PyTorch, Faster R-CNN, AWS Lambda, and iOS Shortcuts. It is both a personal project and a technical showcase of deep learning, MLOps, and full-stack AI deployment.

🚀 What This Project Does

🔹 Uses a CNN-LSTM OCR model trained from scratch on 60+ pages of my handwritten journals
🔹 Fine-tunes FasterRCNN_ResNet50 to automate bounding box annotation
🔹 Chains both into an inference pipeline that extracts, segments, and transcribes handwriting
🔹 Wraps the pipeline in an API deployed via AWS Lambda & API Gateway
🔹 Accesses the API via a native iOS Shortcut app for mobile transcription
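
As a rough illustration of how the detection and OCR stages chain together, here is a minimal sketch. It is not the code from `inference.ipynb`: the callables `detect_lines`, `recognize_line`, and `crop`, the `min_score` threshold, and the single-column reading-order rule are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def reading_order(boxes: Sequence[Box]) -> List[Box]:
    """Sort line boxes top to bottom by vertical centre.

    Assumes a single-column page; ties are broken left to right.
    """
    return sorted(boxes, key=lambda b: ((b[1] + b[3]) / 2, b[0]))


def transcribe_page(
    image,
    detect_lines: Callable,    # hypothetical: image -> [(box, score), ...]
    recognize_line: Callable,  # hypothetical: line crop -> text
    crop: Callable,            # hypothetical: (image, box) -> line crop
    min_score: float = 0.5,    # assumed confidence cut-off
) -> str:
    """Detect text lines, keep confident ones, and OCR them in reading order."""
    detections = detect_lines(image)
    boxes = [box for box, score in detections if score >= min_score]
    lines = [recognize_line(crop(image, box)) for box in reading_order(boxes)]
    return "\n".join(lines)
```

In the real pipeline the detector would be the fine-tuned Faster R-CNN and the recognizer the CNN-LSTM model; passing them in as callables keeps the chaining logic independent of either model.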

📄 Full Technical Report

Want to dive deep into the models, training process, architecture, and deployment stack?

👉 Read the full PDF report

Covers:

Motivation
Data Generation and Preparation
OCR model architecture (CNN + BiLSTM + CTC)
Annotation model (Faster R-CNN with transfer learning)
AWS Lambda containerized deployment
Inference pipeline logic
iOS Shortcut integration and demo
Results, CER/WER, error correction
Challenges, learnings, and future work
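
For readers unfamiliar with CTC, the sketch below shows how per-frame model outputs can be turned into text with greedy (best-path) decoding: take the best class at each time step, collapse consecutive repeats, then drop blanks. Whether the project uses greedy or beam-search decoding is not stated here, so treat this as an assumption; the blank index `0` and the `alphabet` mapping are also illustrative choices.

```python
from itertools import groupby
from typing import Sequence

BLANK = 0  # assumed index of the CTC blank token


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding.

    frame_scores: one score vector per time step, length len(alphabet) + 1.
    alphabet: characters for class indices 1..len(alphabet) (index 0 is blank).
    """
    # 1. Best class per time step.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # 2. Collapse consecutive repeats.
    collapsed = [idx for idx, _ in groupby(best_path)]
    # 3. Remove blanks and map indices to characters.
    return "".join(alphabet[idx - 1] for idx in collapsed if idx != BLANK)
```

The blank token is what lets the model emit genuine double letters: the path `a, a, blank, a` decodes to `aa`, whereas `a, a, a` decodes to `a`.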

✍️ Image annotation process (more about this in the technical report)

📂 Key Files in This Repo

| File/Notebook | Description |
| --- | --- |
| `ocr_model.ipynb` | Trains the CNN-LSTM OCR model from scratch |
| `auto_annotation_model.ipynb` | Fine-tunes Faster R-CNN for line detection |
| `inference.ipynb` | Full pipeline: detection + OCR + decoding |
| `lambda_function.py` | AWS Lambda handler with integrated pipeline |
| `ios_app_pipeline.png` | Visual of iOS Shortcut interacting with the API |
| `From Pen to Pixel ... .pdf` | 📄 Full project report with all technical details |
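
To give a feel for the shape of a Lambda handler behind API Gateway, here is a minimal sketch. It is not the contents of `lambda_function.py`: the request format (a JSON body with a base64-encoded `image` field), the response field `text`, and the `run_pipeline` placeholder are all assumptions for illustration. The `statusCode`/`headers`/`body` response shape follows the API Gateway Lambda proxy integration.

```python
import base64
import json


def run_pipeline(image_bytes: bytes) -> str:
    """Hypothetical stand-in for the detection + OCR + decoding pipeline."""
    return f"<transcription of {len(image_bytes)} bytes>"


def _response(status: int, payload: dict) -> dict:
    """Build a response in the Lambda proxy integration format."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context):
    """Decode the uploaded image, run the pipeline, and return the text."""
    try:
        # Assumed request body: {"image": "<base64-encoded image>"}
        body = json.loads(event.get("body") or "")
        image_bytes = base64.b64decode(body["image"])
    except (ValueError, KeyError, TypeError):
        # Covers malformed JSON, a missing "image" field, and bad base64.
        return _response(400, {"error": "expected JSON body with base64 'image'"})
    return _response(200, {"text": run_pipeline(image_bytes)})
```

A client such as an iOS Shortcut would then POST the encoded photo and read the `text` field from the JSON response.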

⚙️ Tech Stack

🧠 PyTorch, Albumentations, TextBlob
📦 AWS Lambda (Dockerized), API Gateway, S3, ECR
📱 iOS Shortcuts for mobile interface
📷 VGG Image Annotator (for labeling training data)

🧪 Results

| Metric | Value |
| --- | --- |
| Character Error Rate (CER) | 2.3% |
| Word Error Rate (WER) | 9.33% |
| Average Inference Time | ~18 s (CPU, Lambda) |
| Manual Transcription Time | ~5 min/page ⏱️ |

✅ ~17x faster than manual transcription (~5 min vs. ~18 s)
✅ Fully automated pipeline
✅ Self-trained on personal dataset (60+ A5 pages)
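
CER and WER are conventionally defined as the Levenshtein edit distance between the reference and the model output, divided by the reference length (in characters or words respectively). The sketch below implements that standard definition; it is not necessarily the exact evaluation code used for the numbers above, and it assumes a non-empty reference and whitespace tokenization for words.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (insertions, deletions, substitutions cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion from ref
                curr[j - 1] + 1,         # insertion into ref
                prev[j - 1] + (r != h),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-split tokens; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

A single wrong character makes the whole word count as an error, which is why WER typically comes out several times higher than CER on the same output.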

📱 iOS Shortcut app demo (more about this in the technical report)

💡 Future Improvements

Replace Faster R-CNN with YOLOv8 or DETR
Move inference to GPU-backed container for speed
Integrate LLM-based grammar + spell checking
Create auto blog upload pipeline from transcribed text

👤 About Me

Orestas Dulinskas
MSc Data Science | AI + MLOps Engineer-in-Progress
LinkedIn | orestasdulinskas@gmail.com

If you're building real-world AI products, or want to, I'm always open to connecting.
