Platform overview

Nine forensic layers.
One clear risk verdict.

DocPrism combines deterministic forensic checks and practical analyst workflows so teams can move faster without lowering risk standards or compromising auditability.

What happens the moment you upload a document

Nine independent forensic engines run simultaneously. Results are normalized into a 0–100 risk score with per-signal evidence your reviewers can act on immediately.

🔎

Layer 01 — Critical

Font injection detection

Character-level font analysis across every word. Edited text is usually drawn with a separately embedded subset font (prefixed like ABCDEF+Arial) that differs from the fonts used elsewhere in the document, revealing exactly which words were changed, with page and word coordinates as proof.

text_injected_on_document editor_fonts_detected editor_generated_document
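
As a rough illustration of the idea, and not DocPrism's actual implementation, the sketch below assumes a PDF parser has already extracted a font name for each word. The `Word` record and both helper functions are hypothetical names. Words whose font differs from the dominant one are returned with their page and coordinates.

```python
import re
from collections import Counter
from dataclasses import dataclass

# PDF subset fonts carry a tag of six uppercase letters plus "+", e.g. "ABCDEF+Arial".
SUBSET_PREFIX = re.compile(r"^[A-Z]{6}\+")


@dataclass
class Word:  # hypothetical record produced by an upstream PDF parser
    text: str
    font: str    # font name as stored in the PDF
    page: int
    bbox: tuple  # (x0, y0, x1, y1)


def has_subset_prefix(font):
    """True if the font name starts with a subset tag."""
    return bool(SUBSET_PREFIX.match(font))


def find_injected_words(words):
    """Return the words whose font differs from the most common font."""
    if not words:
        return []
    dominant, _ = Counter(w.font for w in words).most_common(1)[0]
    return [w for w in words if w.font != dominant]
```

A real check would likely compare fonts per text block and not per document, since headings and tables often use their own fonts legitimately.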
👤

Layer 02 — Critical

Identity fraud detection

Compares the customer name visible on the page (read via OCR) against the original name stored in the raw PDF data layer. Catches name substitution attacks that completely bypass visual inspection.

customer_name_mismatch
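
The comparison can be sketched as follows, assuming both names have already been extracted as strings. The function names and the normalisation rules (case, accents, punctuation, spacing) are illustrative assumptions. How DocPrism handles titles, initials, or OCR noise is not stated here.

```python
import unicodedata


def normalize_name(name):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(cleaned.split())


def customer_name_mismatch(ocr_name, raw_layer_name):
    """True when the visible name and the raw-layer name differ after normalisation."""
    return normalize_name(ocr_name) != normalize_name(raw_layer_name)
```

Normalising first keeps harmless differences such as casing or double spaces from raising the signal.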
📄

Layer 03 — High

PDF binary & revision analysis

Reads raw binary structure to detect incremental edits, hidden revision layers, multiple %%EOF markers, and tampered cross-reference tables, all evidence that the document was modified after creation.

incremental_updates pdf_content_modified multiple_xref_tables
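
A minimal sketch of the marker count, assuming the file is available as raw bytes. The helper names are hypothetical. The count alone is only a hint: linearized PDFs and legitimately signed documents can also contain more than one end-of-file marker.

```python
def revision_markers(pdf_bytes):
    """Count structural markers in the raw bytes of a PDF."""
    return {
        "eof_markers": pdf_bytes.count(b"%%EOF"),
        "startxref_entries": pdf_bytes.count(b"startxref"),
    }


def has_incremental_updates(pdf_bytes):
    """Each incremental save appends a new section ending in its own %%EOF."""
    return revision_markers(pdf_bytes)["eof_markers"] > 1
```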
🔌

Layer 04 — High

Metadata & AI tool intelligence

Checks creator, producer, and modification fields against 70+ tools known to be used in document fraud: AI generators (Canva, Firefly, Midjourney), PDF editors (Acrobat, Smallpdf), and office suites. Stripped metadata is itself a signal.

ai_tool_in_metadata pdf_editor_in_metadata stripped_metadata
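
In outline, the lookup might resemble the sketch below. The two short tool lists reuse only the names mentioned above and stand in for the full library. The `metadata_signals` function and the `Creator`/`Producer` dictionary keys are assumptions about how the metadata is passed in.

```python
# Hypothetical, abbreviated tool lists based on the examples named above.
AI_TOOLS = ("canva", "firefly", "midjourney")
PDF_EDITORS = ("acrobat", "smallpdf")


def metadata_signals(metadata):
    """Map creator/producer metadata fields to signal names."""
    fields = [metadata.get(key) or "" for key in ("Creator", "Producer")]
    if not any(f.strip() for f in fields):
        return ["stripped_metadata"]
    haystack = " ".join(fields).lower()
    signals = []
    if any(tool in haystack for tool in AI_TOOLS):
        signals.append("ai_tool_in_metadata")
    if any(tool in haystack for tool in PDF_EDITORS):
        signals.append("pdf_editor_in_metadata")
    return signals
```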
📊

Layer 05 — High

Transaction math verification

Extracts the full transaction table, verifies every running balance end-to-end, checks the closing balance, and applies Benford's Law to detect statistically improbable digit distributions in fabricated data.

balance_mismatch running_balance_breaks benford_law_violation
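
Both checks can be sketched in a few lines, assuming the transaction table has already been extracted and amounts are passed as decimal strings. The function names are hypothetical, and the chi-square statistic is one common way to compare leading digits against Benford's expected frequencies, not necessarily the test DocPrism applies.

```python
import math
from collections import Counter
from decimal import Decimal


def running_balance_breaks(opening, rows):
    """rows: (amount, stated_balance) pairs. Return row indexes where the balance does not add up."""
    breaks, balance = [], Decimal(opening)
    for i, (amount, stated) in enumerate(rows):
        balance += Decimal(amount)
        if balance != Decimal(stated):
            breaks.append(i)
            balance = Decimal(stated)  # resynchronise so one error is reported once
    return breaks


def leading_digit(amount):
    """First non-zero digit of a non-zero amount."""
    return int(str(abs(Decimal(amount))).lstrip("0.")[0])


def benford_chi_square(amounts):
    """Chi-square statistic of leading digits against Benford's expected frequencies."""
    digits = [leading_digit(a) for a in amounts if Decimal(a) != 0]
    n = len(digits)
    if n == 0:
        return 0.0
    counts = Counter(digits)
    total = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        total += (counts.get(d, 0) - expected) ** 2 / expected
    return total
```

With nine digit classes there are eight degrees of freedom, so a statistic above roughly 15.5 would be unusual at the 5% level. Short statements rarely contain enough transactions for the test to be conclusive on its own.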
🔗

Layer 06 — High

Cross-document duplicate detection

Semantic vector embeddings identify when a new document's transaction set closely matches one already on file — even when amounts, dates, or names have been changed. Catches recycled statements across applicants.

duplicate_transactions
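
The matching step can be illustrated with plain cosine similarity, assuming each document's transaction set has already been turned into an embedding vector by some model. The function names, the in-memory dictionary, and the 0.95 threshold are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicates(new_vec, stored, threshold=0.95):
    """stored: {document_id: embedding}. Return ids at or above the threshold, best match first."""
    scored = [(cosine_similarity(new_vec, vec), doc_id) for doc_id, vec in stored.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score >= threshold]
```

Because similarity is measured on the whole vector, a statement with a few altered amounts or names can still land close to its original.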
🚫

Layer 07 — Critical

Content legitimacy scanning

Scans text for explicit fraud markers: sample watermarks, lorem ipsum filler, placeholder names (John Doe), template labels, demonstration disclaimers, and references to known fake-document generator websites.

sample_statement known_generator_site lorem_ipsum_filler
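
A marker scan of this kind can be sketched with regular expressions. The patterns below are hypothetical and cover only a few of the examples named above. `placeholder_name` is an invented label for illustration, not a documented DocPrism signal.

```python
import re

# Hypothetical marker patterns, drawn from the examples above.
MARKERS = {
    "sample_statement": re.compile(r"\bsample\b", re.IGNORECASE),
    "lorem_ipsum_filler": re.compile(r"\blorem ipsum\b", re.IGNORECASE),
    "placeholder_name": re.compile(r"\b(?:john|jane) doe\b", re.IGNORECASE),
}


def content_signals(text):
    """Return the sorted names of all markers found in the extracted text."""
    return sorted(name for name, pattern in MARKERS.items() if pattern.search(text))
```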
📏

Layer 08 — Medium

Layout & font consistency

Genuine bank statements have consistent column alignment, line spacing, and font usage. DocPrism detects alignment breaks, character spacing anomalies, and font switching mid-line — artifacts of desktop editing.

text_block_misalignment font_switching_within_line
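
As a simple illustration, assuming word positions and font names have already been extracted per row, alignment breaks and mid-line font switches can be found as shown below. The helper names and the 1.5-point tolerance are assumptions.

```python
from statistics import median


def misaligned_rows(x_positions, tolerance=1.5):
    """x_positions: left edge of one column on each row, in points.

    Return the row indexes that sit further than `tolerance` from the median edge.
    """
    if not x_positions:
        return []
    expected = median(x_positions)
    return [i for i, x in enumerate(x_positions) if abs(x - expected) > tolerance]


def font_switches_within_line(fonts):
    """fonts: font name of each word on one line, in reading order. Count the changes."""
    return sum(1 for prev, cur in zip(fonts, fonts[1:]) if prev != cur)
```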
📜

Layer 09 — Configurable

Negative database matching

A team-managed pattern library. Flag known fraud tool producers, match transaction signatures to prior cases, and blacklist document fingerprints. Admins add new patterns with no code changes needed.

negative_db_metadata_tool negative_db_transaction
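
How fingerprints are computed is not specified here. One simple possibility, shown purely as an assumption, is an exact SHA-256 hash of the file bytes checked against a stored set. Both function names are hypothetical.

```python
import hashlib


def fingerprint(document_bytes):
    """Hex SHA-256 digest of the raw file bytes."""
    return hashlib.sha256(document_bytes).hexdigest()


def negative_db_hit(document_bytes, blacklist):
    """True if the document's fingerprint is in the blacklist set."""
    return fingerprint(document_bytes) in blacklist
```

An exact hash only matches byte-identical files, so a production system would presumably pair it with the metadata and transaction patterns described above.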

Designed for production review teams

Not a lab prototype. Built to support the daily throughput of underwriting, compliance, and servicing operations.

Use cases

  • Income and bank statement validation in mortgage origination
  • Fraud triage for re-submitted or revised documents
  • Pre-underwriting quality and authenticity checks
  • Evidence support for manual review escalations
  • Multi-applicant cross-document duplicate screening
  • Synthetic identity fraud detection at submission

Deployment model

  • Self-hosted Docker Compose — deploys in under 10 minutes
  • PostgreSQL with pgvector for semantic duplicate matching
  • Role-based access: Analyst, Team Lead, Country Manager, Admin, CEO
  • Multi-organisation and multi-country team structure
  • Full audit log for every document review and decision
  • Configurable institution lists, scoring thresholds, and pattern rules

Frequently asked questions

Does DocPrism replace human fraud reviewers?

No. It accelerates and standardises review by surfacing high-confidence forensic signals and risk context, so analysts spend time on meaningful decisions rather than manually inspecting clean documents.

What document types does DocPrism support?

DocPrism supports bank statements, pay stubs, W-2, 1099, P60, P45, SA302, and PAYG summaries, submitted as PDF, JPG, PNG, TIFF, or WebP files. US, UK, and AU jurisdictions are supported with jurisdiction-aware analysis logic.

Can we tune signal behaviour for our policy?

Yes. Institution records, negative database patterns, scoring thresholds, and review rules can all be configured by your administrator with no code changes required.

Is deployment cloud-only?

No. DocPrism is designed for private self-hosted deployment. Your applicant data never leaves your controlled infrastructure, and there are zero external API dependencies during analysis.

How does the risk score work?

Signals across nine forensic categories are weighted and normalised to a 0–100 scale. Below 30 is Low; 30 to below 60 is Medium; 60 to below 80 is High; 80 and above is Critical. Each category contributes independently so you can see exactly which signals drove the score.
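
The banding can be sketched as follows. The weighting scheme (weighted share of fired signals) and the example weights are assumptions, since the actual weights are configurable and not listed here. The sketch also assumes each band includes its lower bound, so a score of exactly 60 is High and exactly 80 is Critical.

```python
def risk_score(signal_weights, fired):
    """Weighted share of fired signals, scaled to 0-100."""
    total = sum(signal_weights.values())
    if total == 0:
        return 0.0
    hit = sum(weight for name, weight in signal_weights.items() if name in fired)
    return 100.0 * hit / total


def risk_band(score):
    """Map a 0-100 score to a band; lower bounds are inclusive (assumption)."""
    if score >= 80:
        return "Critical"
    if score >= 60:
        return "High"
    if score >= 30:
        return "Medium"
    return "Low"
```

For example, with hypothetical weights of 40, 25, 20, and 15 across four signals, firing the first two gives a score of 65, which falls in the High band.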

Ready to see it on your documents?

We can run DocPrism against your sample statement types and walk through the forensic output with your team — no commitment required.