Ashari Abidin's Developer Docs

OpenClaw OCR MVP PaddleOCR

🦞 OpenClaw OCR MVP · PaddleOCR

MVP Stage 1: simple, correct, scalable  |  Dockerized  |  FastAPI + JSON

🎯 Fokus MVP (Tahap 1)

Yang DIBAHAS

  • πŸ“€ Upload image β†’ OCR jalan β†’ JSON output
  • πŸ“‘ OpenClaw siap panggil via REST API
  • 🐳 Fully Dockerized (docker-compose)
  • πŸ“ˆ Mudah di-scale nanti
  • 🧠 PaddleOCR sebagai engine core

Not yet (Stage 2)

  • ❌ Queue / Redis
  • ❌ Database, vector DB
  • ❌ Multi-agent, Kubernetes
  • ❌ Complex workflows
  • ❌ Heavy async workers

Simplest architecture: User → OpenClaw → OCR API (FastAPI) → PaddleOCR → JSON

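📦 Steps 1–3: Install Docker & Create the Project

Ubuntu / Linux:

sudo apt update && sudo apt install docker.io docker-compose -y

docker --version
docker compose version

Note: on some distributions the docker-compose package installs the legacy docker-compose command rather than the docker compose plugin. If docker compose version fails, either use docker-compose in the commands below or install the Compose v2 plugin.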

mkdir openclaw-ocr-mvp
cd openclaw-ocr-mvp

Final folder structure:

openclaw-ocr-mvp/
├── app/
│   ├── main.py
│   ├── ocr.py
│   ├── utils.py (optional preprocessing)
│   └── requirements.txt
├── uploads/
├── Dockerfile
└── docker-compose.yml

⚙️ Core Code: PaddleOCR + FastAPI Endpoint

app/requirements.txt

fastapi
uvicorn
python-multipart
paddleocr
paddlepaddle
opencv-python
pillow

app/ocr.py (engine)

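from paddleocr import PaddleOCR

# Load the models once at import time so every request reuses them
ocr = PaddleOCR(use_angle_cls=True, lang='en')

def run_ocr(image_path):
    # Assumes the PaddleOCR 2.x result layout: result[0] is a list of
    # [box, (text, confidence)] entries, or None when nothing is found
    result = ocr.ocr(image_path)
    output = []
    if result and result[0]:
        for line in result[0]:
            output.append({
                "text": line[1][0],
                "confidence": float(line[1][1])
            })
    return output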
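
If a caller such as OpenClaw also needs to know where each piece of text sits on the page, the bounding box can be kept alongside the text. The sketch below assumes the PaddleOCR 2.x result layout that run_ocr relies on (each line is [box, (text, confidence)]). format_lines is a hypothetical helper name, not part of PaddleOCR, and the sample data is hand-written rather than real OCR output.

```python
def format_lines(result):
    """Flatten a PaddleOCR 2.x-style result into JSON-friendly dicts.

    Assumes result[0] is a list of [box, (text, confidence)] entries,
    where box is four [x, y] corner points. Hypothetical helper.
    """
    output = []
    if not result or not result[0]:
        return output  # nothing detected
    for box, (text, confidence) in result[0]:
        output.append({
            "text": text,
            "confidence": round(float(confidence), 4),
            # Convert coordinates to plain floats so they serialize cleanly
            "box": [[float(x), float(y)] for x, y in box],
        })
    return output


# Hand-written sample in the assumed layout (not real OCR output)
sample = [[
    [[[10, 10], [120, 10], [120, 30], [10, 30]], ("Hello", 0.987654)],
]]
print(format_lines(sample))
```

Returning plain floats avoids JSON serialization problems if the engine hands back NumPy scalar types.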

app/main.py (FastAPI)

from fastapi import FastAPI, UploadFile, File
import shutil
import uuid
from ocr import run_ocr

app = FastAPI(title="OpenClaw OCR API", version="1.0")

@app.get("/")
def home():
    # Simple status endpoint for health checks
    return {"status": "running", "service": "OpenClaw OCR MVP"}

@app.post("/ocr")
async def process_ocr(file: UploadFile = File(...)):
    # Save the upload under a unique name (always stored with a .png suffix)
    filename = f"uploads/{uuid.uuid4()}.png"
    with open(filename, "wb") as buffer:
        shutil.copyfileobj(file.file, buffer)
    # Run OCR on the saved file and return the recognized lines
    result = run_ocr(filename)
    return {"status": "success", "data": result}
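
The endpoint above stores every upload with a .png suffix regardless of what was actually sent. One possible refinement, sketched here under assumptions, is to keep the real extension and reject unexpected file types. build_upload_path and ALLOWED_EXTENSIONS are hypothetical names, and the list of accepted types is only a guess at what the service would want.

```python
import uuid
from pathlib import Path

# Assumption: these are the image types the service wants to accept
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def build_upload_path(original_name, upload_dir="uploads"):
    """Return a unique path that keeps the upload's extension.

    Raises ValueError for extensions outside ALLOWED_EXTENSIONS.
    Only the extension of the client-supplied name is used, so path
    components such as "../" never reach the filesystem.
    """
    suffix = Path(original_name or "").suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {suffix or 'none'}")
    return str(Path(upload_dir) / f"{uuid.uuid4()}{suffix}")


print(build_upload_path("scan.JPG"))
```

In process_ocr this could be called as build_upload_path(file.filename), turning a ValueError into an HTTP 400 response.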

🐳 Dockerfile + docker-compose.yml

πŸ“„ Dockerfile

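FROM python:3.11
WORKDIR /app
# OpenCV commonly needs libGL at import time; installing it here is a precaution
RUN apt-get update && apt-get install -y --no-install-recommends libgl1 \
    && rm -rf /var/lib/apt/lists/*
COPY app/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app .
RUN mkdir -p uploads
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]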

📄 docker-compose.yml

version: '3'
services:
  paddleocr:
    build: .
    container_name: paddleocr-api
    ports:
      - "8000:8000"
    volumes:
      - ./uploads:/app/uploads

Build & Run: docker compose up --build (the first run takes a while because the PaddleOCR models have to be downloaded)

🧪 Test the API & Sample OCR Output

✅ Open http://localhost:8000/docs in a browser → Swagger UI is ready for testing image uploads.

✅ Endpoint POST /ocr: upload an image file and get JSON back:

{
  "status": "success",
  "data": [
    { "text": "Nama Mahasiswa", "confidence": 0.99 },
    { "text": "ASHARI ABIDIN", "confidence": 0.98 }
  ]
}

OpenClaw integration (Python example):

import requests

url = "http://localhost:8000/ocr"
# Send the image as multipart form data under the "file" field
with open("sample.png", "rb") as f:
    files = {"file": f}
    response = requests.post(url, files=files)
print(response.json())
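
Once the JSON comes back, an agent usually wants plain text rather than a list of lines. The sketch below shows one way to filter out low-confidence lines and join the rest. extract_text and the 0.8 default threshold are illustrative assumptions, and the third entry in the sample payload is made up to show the filtering.

```python
def extract_text(response_json, min_confidence=0.8):
    """Join OCR lines whose confidence meets the threshold."""
    if response_json.get("status") != "success":
        raise ValueError("OCR request did not succeed")
    lines = [
        item["text"]
        for item in response_json.get("data", [])
        if item["confidence"] >= min_confidence
    ]
    return "\n".join(lines)


# Same shape as the sample response, plus one invented noisy line
payload = {
    "status": "success",
    "data": [
        {"text": "Nama Mahasiswa", "confidence": 0.99},
        {"text": "ASHARI ABIDIN", "confidence": 0.98},
        {"text": "~~", "confidence": 0.41},
    ],
}
print(extract_text(payload))
```

With the real client, the call would be extract_text(response.json()).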
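
Preprocessing (optional but important):

Add app/utils.py with Otsu thresholding for noisy scans.

import cv2

def preprocess(path):
    # cv2.imread returns None when the file cannot be read
    img = cv2.imread(path)
    if img is None:
        raise ValueError(f"cannot read image: {path}")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Otsu picks the binarization threshold automatically
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    # Overwrite the original file with the binarized image
    cv2.imwrite(path, thresh)

Then, in main.py, import preprocess from utils and call preprocess(filename) before running OCR.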
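
To make the Otsu step less of a black box, here is a small pure-Python sketch of the idea behind it: try every possible threshold and keep the one that best separates dark pixels from bright ones (maximum between-class variance). This is an illustration only, not OpenCV's implementation, and otsu_threshold is a hypothetical name.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for 8-bit grayscale pixel values.

    Pixels <= threshold form the dark class, the rest the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the dark class
    sum0 = 0  # intensity sum of the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so there is no separation to score
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        # Between-class variance (up to a constant factor)
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Synthetic "scan": dark text pixels around 20-30, bright paper around 200-220
pixels = [20] * 60 + [30] * 10 + [200] * 30 + [220] * 20
print(otsu_threshold(pixels))
```

For this synthetic input the threshold lands between the dark and bright groups, which is exactly what lets the binarized image separate text from background.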

🏆 Why This Is the Right MVP

  • ✅ Modular: the OCR engine and the API layer are separated
  • ✅ API-based: ready to be integrated into OpenClaw as a tool / agent action
  • ✅ Dockerized & scalable: easy to replicate horizontally later
  • ✅ The agent calls POST /ocr and gets JSON back directly
  • ✅ Easy to upgrade to Stage 2 (PDF, queue, GPU)

Minimum MVP server (CPU-only)
🖥️ CPU: 4 cores  |  RAM: 8 GB  |  Disk: 20 GB
PaddleOCR is production-ready and light enough for this early stage.

Stage 2 upgrades (later, not now): PDF OCR, table extraction, Redis queue, GPU, async workers, AI cleanup, RAG, vector DB, spreadsheet export.

Why PaddleOCR? It is a balanced open-source option: multilingual, capable of document parsing, actively developed, and popular for modern production document AI.

📋 Quick Command Reference

🔨 Build & start
docker compose up --build

🛑 Stop
docker compose down

📜 Real-time logs
docker compose logs -f

🔄 Restart
docker compose restart

Make sure port 8000 is open, then visit http://localhost:8000 to check the status. Interactive Swagger docs are available at http://localhost:8000/docs.
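
The status check can also be scripted, which helps when an agent or deploy script has to wait for the container to finish loading. The sketch below polls the root endpoint until it reports "running". The function names, retry count, and delay are assumptions chosen for illustration.

```python
import json
import time
import urllib.request


def fetch_status(url="http://localhost:8000/"):
    """Return the parsed JSON body of the status endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_running(fetch=fetch_status, attempts=30, delay=2.0):
    """Poll until the service reports status "running".

    Returns True once ready, False if every attempt failed.
    """
    for _ in range(attempts):
        try:
            if fetch().get("status") == "running":
                return True
        except (OSError, ValueError):
            pass  # container still starting, or the body was not valid JSON
        time.sleep(delay)
    return False


# Typical use after `docker compose up -d`:
#     if not wait_until_running():
#         raise SystemExit("OCR service did not come up")
```

Passing the fetch function as a parameter keeps the retry logic testable without a running container.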

OpenClaw → OCR MVP Integration Flow

🦞 OpenClaw Agent → 📡 POST /ocr (FastAPI) → 🧠 PaddleOCR Engine → 📄 JSON Output

✨ The simplest architecture that is still "correct": already API-based, dockerized, scalable, and ready to be called by an agent. No over-engineering.

🎯 Kesimpulan MVP OpenClaw + PaddleOCR

βœ… Upload image β†’ OCR berjalan β†’ hasil JSON stabil.
βœ… Docker compose satu perintah langsung siap pakai.
βœ… PaddleOCR memberikan keseimbangan terbaik antara akurasi, kecepatan, dan kemudahan deployment.
βœ… Arsitektur ini menjadi fondasi tepat untuk OpenClaw sebagai agen cerdas yang membutuhkan ekstraksi teks dari gambar/dokumen.
βœ… Nanti saat traffic naik, scale dengan menambahkan load balancer, multiple container, atau pindah ke GPU β€” tanpa merusak desain inti.

Production-ready flow  |  MVP validated
OpenClaw OCR MVP | PaddleOCR + FastAPI + Docker | "Simple, yet already correct and ready to scale later."