🔩 10-AI-Powered Document Intake
Turning messy PDFs, screenshots, and spreadsheets into structured, auditable data — in seconds.
✍️ Written from Riyadh — for founders, product teams, and AI builders in regulated markets.
📂 From Raw Docs to Insights
Why the hardest part of lending AI isn’t scoring — it’s document chaos.
Walk into any credit team and you’ll see the real bottleneck:
A borrower uploads a hodgepodge of PDFs, JPEGs, and Excel files.
Someone downloads them, renames them, and drags them into folders.
Another person re-types numbers into a spreadsheet “just to be safe.”
Weeks later, underwriting finally gets “clean” data that might still be wrong.
We built Qararak’s Document Intake Engine to kill that workflow.
🏗️ 1. Ingestion Pipeline Built for Real-World Mess
Accepted formats: PDFs, images, zip bundles, e-statements, even WhatsApp screenshots.
Upload channels: Web portal, API, SFTP, mobile capture.
Every file is stamped, versioned, and sent straight to an on-prem object store — no local downloads, no email chains.
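As a rough illustration of what "stamped and versioned" could look like, here is a minimal Python sketch. It is not Qararak's implementation: `IntakeRecord`, `IntakeStore`, and the field names are hypothetical, and an in-memory dictionary stands in for the on-prem object store.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class IntakeRecord:
    """Hypothetical metadata stamped onto every uploaded file version."""
    borrower_id: str
    filename: str
    version: int        # 1 for the first upload, then 2, 3, ...
    sha256: str         # fingerprint of the exact bytes received
    received_at: str    # UTC timestamp in ISO 8601 format
    channel: str        # e.g. "web", "api", "sftp", "mobile"


class IntakeStore:
    """In-memory stand-in for the object store (assumption for this sketch)."""

    def __init__(self) -> None:
        self._history: dict[tuple[str, str], list[IntakeRecord]] = {}

    def ingest(self, borrower_id: str, filename: str,
               content: bytes, channel: str) -> IntakeRecord:
        history = self._history.setdefault((borrower_id, filename), [])
        record = IntakeRecord(
            borrower_id=borrower_id,
            filename=filename,
            version=len(history) + 1,
            sha256=hashlib.sha256(content).hexdigest(),
            received_at=datetime.now(timezone.utc).isoformat(),
            channel=channel,
        )
        history.append(record)  # earlier versions are kept, never overwritten
        return record


store = IntakeStore()
first = store.ingest("B-001", "statement.pdf", b"january data", "web")
second = store.ingest("B-001", "statement.pdf", b"corrected data", "api")
print(first.version, second.version)  # 1 2
```

Re-uploading the same filename creates a new version instead of replacing the old one, so every record stays traceable.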
🧠 2. AI-Driven Classification (Multilingual & Domain-Specific)
Standard OCR alone isn’t enough, especially when documents mix Arabic and English.
We combine:
Tesseract + EasyOCR for baseline extraction
LLM-powered layout parsing to detect tables, stamps, and handwritten notes
Custom CNN classifier trained on 40+ Saudi financial templates (bank statements, ZATCA tax returns, MOF certificates)
Outcome: a label such as “This is a 2023 audited balance sheet (Arabic), 6 pages,” with 98% precision.
📊 3. Smart Validation Rules — Not Manual Checklists
Once a doc type is confirmed, we fire validation rules in real time:
Example Rule | Logic | Result
Date range check | Transaction dates within last 12 months? | ✅ / ❌
Completeness | All mandatory columns present? | ✅ / ❌
Math integrity | Assets = Liabilities + Equity? | ✅ / ❌
Rules are decision-table driven — business users can add or edit without code.
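To make "decision-table driven" concrete, here is a small Python sketch under stated assumptions. The table layout, the check names, and the document fields (`transaction_dates`, `columns`, `assets`, and so on) are invented for illustration, and "12 months" is approximated as 365 days. The idea is that rules live in data rows while the reusable checks live in code, so adding or editing a rule means editing a row.

```python
from datetime import date, timedelta


def dates_within(doc, params, today):
    """True if every transaction date falls inside the allowed window."""
    cutoff = today - timedelta(days=params["max_age_days"])
    return all(cutoff <= d <= today for d in doc["transaction_dates"])


def columns_present(doc, params, today):
    """True if all mandatory columns appear in the extracted table."""
    return set(params["required"]) <= set(doc["columns"])


def balance_sheet_balances(doc, params, today):
    """True if Assets = Liabilities + Equity within a tolerance."""
    gap = doc["assets"] - (doc["liabilities"] + doc["equity"])
    return abs(gap) <= params["tolerance"]


# Reusable checks, referenced by name from the table below.
CHECKS = {
    "dates_within": dates_within,
    "columns_present": columns_present,
    "balance_sheet_balances": balance_sheet_balances,
}

# Hypothetical decision table: one row per rule, editable as plain data.
DECISION_TABLE = [
    {"doc_type": "bank_statement", "rule": "Date range check",
     "check": "dates_within", "params": {"max_age_days": 365}},
    {"doc_type": "bank_statement", "rule": "Completeness",
     "check": "columns_present",
     "params": {"required": ["date", "description", "amount", "balance"]}},
    {"doc_type": "balance_sheet", "rule": "Math integrity",
     "check": "balance_sheet_balances", "params": {"tolerance": 0}},
]


def validate(doc, table=DECISION_TABLE, today=None):
    """Run every rule whose doc_type matches; return rule name -> pass/fail."""
    today = today or date.today()
    return {
        row["rule"]: CHECKS[row["check"]](doc, row["params"], today)
        for row in table
        if row["doc_type"] == doc["doc_type"]
    }


sheet = {"doc_type": "balance_sheet",
         "assets": 500, "liabilities": 320, "equity": 180}
print(validate(sheet))  # {'Math integrity': True}
```

A real system would likely load the table from a database or spreadsheet rather than a Python list; the list here only keeps the sketch self-contained.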
🔁 4. Feedback Loop to the Borrower (or RM)
If a document fails validation:
Qararak generates a reason code (“Missing VAT certificate”).
It sends an API callback or a templated email.
The borrower re-uploads only what’s missing.
No phone calls. No guesswork.
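The loop above could produce a callback payload along these lines. This is a sketch, not Qararak's actual API: the function name `build_feedback`, the codes `MISSING_DOCUMENT` and `FAILED_VALIDATION`, and the payload fields are all assumptions made for illustration.

```python
import json


def build_feedback(application_id, required_docs, validation_results):
    """Build a callback payload listing only what the borrower must fix.

    validation_results maps doc type -> True/False; a doc type that is
    absent from the mapping is treated as not uploaded yet.
    """
    reasons = []
    for doc_type in sorted(required_docs):
        if doc_type not in validation_results:
            reasons.append({"code": "MISSING_DOCUMENT",
                            "doc_type": doc_type,
                            "message": f"Missing {doc_type}"})
        elif not validation_results[doc_type]:
            reasons.append({"code": "FAILED_VALIDATION",
                            "doc_type": doc_type,
                            "message": f"{doc_type} failed validation"})
    return {
        "application_id": application_id,
        "status": "action_required" if reasons else "complete",
        "reupload": [r["doc_type"] for r in reasons],
        "reasons": reasons,
    }


payload = build_feedback(
    "APP-42",
    required_docs={"VAT certificate", "bank statement"},
    validation_results={"bank statement": True},
)
print(json.dumps(payload, indent=2))
```

Because `reupload` contains only the failing or missing items, the borrower is never asked to resend documents that already passed.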
🔐 5. Compliance & Audit Trail
SHA-256 hash stored for every file version.
Linked to the borrower’s master record.
Full changelog: who viewed, validated, or rejected a doc — and why.
All data stays on-prem (PDPL-aligned), with optional encryption at rest.
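A minimal sketch of how a per-version hash and a who/what/why changelog might fit together is shown below. The `AuditTrail` class, its field names, and the `matches_recorded_hash` helper are hypothetical, and storage is an in-memory list rather than a real database. Encryption at rest is out of scope here.

```python
import hashlib
from datetime import datetime, timezone


class AuditTrail:
    """Append-only log of actions taken on document versions (illustrative)."""

    def __init__(self):
        self._entries = []

    def record(self, borrower_id, content, actor, action, reason=""):
        entry = {
            "borrower_id": borrower_id,                      # link to master record
            "sha256": hashlib.sha256(content).hexdigest(),   # file fingerprint
            "actor": actor,                                  # who
            "action": action,                                # viewed / validated / rejected
            "reason": reason,                                # why
            "at": datetime.now(timezone.utc).isoformat(),
        }
        self._entries.append(entry)
        return dict(entry)  # return a copy so callers cannot edit the log

    def history(self, borrower_id):
        """Return copies of all entries for one borrower, oldest first."""
        return [dict(e) for e in self._entries
                if e["borrower_id"] == borrower_id]


def matches_recorded_hash(content, recorded_sha256):
    """True if the bytes on hand are the same bytes that were logged."""
    return hashlib.sha256(content).hexdigest() == recorded_sha256


trail = AuditTrail()
original = b"audited balance sheet 2023"
entry = trail.record("B-001", original, "analyst@example.com",
                     "rejected", "Missing VAT certificate")
print(matches_recorded_hash(original, entry["sha256"]))         # True
print(matches_recorded_hash(original + b"!", entry["sha256"]))  # False
```

Any change to the file, even a single byte, yields a different SHA-256 digest, which is what makes the stored hash useful as evidence of which version was reviewed.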
⚡ 6. What This Unlocks
Pain Point Yesterday | With Qararak Today
Manual renaming & sorting | Auto-classification & routing
Spreadsheet re-typing | Structured JSON payloads
Version confusion | Immutable hash & timestamp
Week-long doc QA | Sub-minute AI validation
Speed goes up, errors go down — and underwriting finally gets clean, trusted data.
🧭 Final Thought
Great credit decisions start long before a model runs.
They start the moment a borrower drags a PDF into your portal.
By turning raw documents into validated, structured insights — instantly and securely — Qararak frees your team to focus on what matters: risk, not re-typing.