🔤

OCR — Image to Text

Extract text from photos, scanned documents, and screenshots. Supports 100+ languages via AI.

What is OCR and how do I extract text from an image? OCR (Optical Character Recognition) converts text inside photos, scans, and screenshots into editable, copyable text. With PDFdukan's free OCR tool you select an image, pick the language (100+ supported, including Urdu), and copy the extracted text. Recognition runs in your browser, so the image is never sent to a server and no signup is needed.

Free OCR — Extract Text from Images Online

PDFdukan's OCR tool uses Tesseract.js, a JavaScript and WebAssembly port of Tesseract, the open-source OCR engine whose development Google sponsored for many years, to extract text from photos, screenshots, and scanned documents. It supports over 100 languages, including English, Arabic, Urdu, and Chinese. All processing happens locally in your browser, so your images never leave your device.

🌍

100+ Languages

Extract text in 100+ languages, including Arabic, Urdu, Chinese, Japanese, and Latin-script languages.

🎯

High Accuracy

📋

Copy & Export

Copy text directly to clipboard or download as a .txt file with one click.

How to Extract Text from an Image in 3 Steps

1. Choose your image or scanned document: JPG, PNG, or WEBP photos of documents, screenshots, book pages, receipts, or handwriting-style printed text.
2. Select the language: choose the language of the text in your image (English, Urdu, Arabic, and 100+ more). Correct language selection significantly improves accuracy.
3. Run OCR and copy the text: click Extract Text. Within seconds the recognised text appears in an editable box, where you can copy it to your clipboard or download it as a .txt file.
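
As a rough illustration of step 3, the sketch below shows how a page could count the words and characters in the recognised text and derive a .txt file name from the image name. The names summarizeExtraction and txtFileNameFor are hypothetical, and nothing here is taken from PDFdukan's actual code.

```typescript
// Hypothetical helpers for illustration; not PDFdukan's real implementation.
interface ExtractionSummary {
  words: number;      // whitespace-separated tokens
  characters: number; // UTF-16 code units in the trimmed text
}

function summarizeExtraction(text: string): ExtractionSummary {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    return { words: 0, characters: 0 };
  }
  return {
    words: trimmed.split(/\s+/).length,
    characters: trimmed.length,
  };
}

// Swap the image extension for .txt, e.g. "receipt.jpg" -> "receipt.txt".
// A name without an extension simply gets ".txt" appended.
function txtFileNameFor(imageName: string): string {
  const base = imageName.replace(/\.[^./\\]+$/, "");
  return base + ".txt";
}
```

Counting on the trimmed text keeps stray leading or trailing whitespace from the OCR output out of the totals.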

Common Uses for OCR

Digitising printed documents: Convert photographed letters, certificates, contracts, and reports into editable, searchable text instead of retyping.
Extracting text from screenshots: Grab text from a screenshot of a webpage, chat, error message, or app where copying is disabled.
Book and notes scanning: Students can photograph textbook pages or class notes and extract the text for summaries and assignments.
Receipts and invoices: Pull amounts, dates, and details from photographed receipts for record-keeping and expense reports.
Old archives: Convert old printed family documents, certificates, and records into digital text before they fade.
Translating foreign text: Extract text from an image first, then paste it into a translator — useful for documents, signs, and labels.

Tips for the Best OCR Accuracy

Use a sharp, well-lit photo — blur is the biggest cause of recognition errors.
Hold the camera straight above the text — tilted text reduces accuracy. For paper documents, scan with CamMaster Scanner first to flatten and enhance the page.
Select the correct language before running OCR — the engine loads language-specific recognition models.
Higher-resolution images give better results: avoid heavily compressed WhatsApp forwards when possible and ask for the original photo.
Printed text recognises far better than handwriting — Tesseract is built for printed characters.
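
The tips above all come down to giving the engine clean, high-contrast input. As a hedged illustration of what "enhancing" a photo can mean, the sketch below converts RGBA pixels (the layout a canvas getImageData call returns) to grayscale and then stretches the contrast. This is an assumption made for illustration: the page does not say that PDFdukan or CamMaster Scanner preprocess images this way, and the function names are hypothetical.

```typescript
// Illustrative preprocessing only; an assumption, not a documented step of the tool.

// Convert RGBA pixel data (4 bytes per pixel) to one luminance byte per pixel,
// using the common Rec. 601 luma weights.
function toGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(Math.floor(rgba.length / 4));
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[i * 4];
    const g = rgba[i * 4 + 1];
    const b = rgba[i * 4 + 2];
    gray[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
  }
  return gray;
}

// Linearly stretch values so the darkest pixel becomes 0 and the brightest 255.
// A completely flat image is returned unchanged (as a copy).
function stretchContrast(gray: Uint8ClampedArray): Uint8ClampedArray {
  let min = 255;
  let max = 0;
  for (let i = 0; i < gray.length; i++) {
    if (gray[i] < min) min = gray[i];
    if (gray[i] > max) max = gray[i];
  }
  if (max <= min) {
    return gray.slice();
  }
  const out = new Uint8ClampedArray(gray.length);
  for (let i = 0; i < gray.length; i++) {
    out[i] = Math.round(((gray[i] - min) * 255) / (max - min));
  }
  return out;
}
```

Preprocessing like this cannot recover detail lost to blur or heavy compression, which is why a sharp original photo still matters most.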

Frequently Asked Questions

How accurate is the OCR text extraction?

For clear, well-lit photos of printed text, accuracy is typically 95–99%. Accuracy drops with blurry images, low light, unusual fonts, or skewed angles. Tesseract's LSTM neural-network recogniser (introduced in Tesseract 4) handles most real-world documents well, and you can edit the extracted text directly in the output box to fix any stray errors.
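
The page does not say how the 95–99% figure was measured. One common way to put a number on OCR accuracy is character accuracy: one minus the edit distance between the OCR output and a hand-checked reference, divided by the reference length. The sketch below assumes that definition; the function names are hypothetical.

```typescript
// Character accuracy = 1 - (edit distance / reference length).
// An assumed metric for illustration; the page does not define its own.

// Levenshtein distance: minimum number of single-character insertions,
// deletions, and substitutions needed to turn a into b.
// Compares UTF-16 code units, which is adequate for most scripts.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) {
    prev.push(j);
  }
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charCodeAt(i - 1) === b.charCodeAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns a value between 0 and 1; 1 means the OCR output matches the reference exactly.
function characterAccuracy(reference: string, ocrOutput: string): number {
  if (reference.length === 0) {
    return ocrOutput.length === 0 ? 1 : 0;
  }
  const errors = levenshtein(reference, ocrOutput);
  return Math.max(0, 1 - errors / reference.length);
}
```

By this measure, 95% accuracy on a 1,000-character page still leaves about 50 characters to correct by hand.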

Does it support Urdu and Arabic text?

Yes. The tool supports over 100 languages including Urdu, Arabic, Persian, Hindi, Chinese, and Japanese. Select the correct language before extraction — the engine downloads that language's trained recognition model and applies right-to-left text handling automatically.
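
Tesseract identifies each language's trained model by a short code, so a language picker has to map display names to codes somewhere. The sketch below shows one way that lookup might look, covering only the languages named above. The codes are standard Tesseract language codes; the function name, the table, and the choice of Simplified Chinese for "Chinese" are assumptions made for illustration.

```typescript
// Hypothetical lookup from a display name to a Tesseract language code.
// "chinese" is mapped to Simplified Chinese (chi_sim) as an assumption;
// Traditional Chinese uses chi_tra.
const LANGUAGE_CODES: { [name: string]: string } = {
  english: "eng",
  urdu: "urd",
  arabic: "ara",
  persian: "fas",
  hindi: "hin",
  chinese: "chi_sim",
  japanese: "jpn",
};

// Scripts in this table that are written right to left.
const RIGHT_TO_LEFT_CODES: string[] = ["urd", "ara", "fas"];

interface LanguageChoice {
  code: string;
  rightToLeft: boolean;
}

function resolveLanguage(displayName: string): LanguageChoice {
  const key = displayName.trim().toLowerCase();
  // hasOwnProperty guards against inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(LANGUAGE_CODES, key)) {
    throw new Error("Unsupported language: " + displayName);
  }
  const code = LANGUAGE_CODES[key];
  return { code: code, rightToLeft: RIGHT_TO_LEFT_CODES.indexOf(code) !== -1 };
}
```

A page could use the rightToLeft flag to set the text direction of the output box so that Urdu or Arabic results display correctly.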

Can it read handwriting?

Tesseract is designed for printed text and performs poorly on cursive or rough handwriting. Very neat, print-style handwriting may partially work. For handwritten notes, expect to correct errors manually.

Are my images uploaded to a server?

No. The entire OCR engine (Tesseract.js) runs inside your browser using WebAssembly. Your photos — ID cards, bank documents, private letters — never leave your device.

Can I extract text from a scanned PDF?

Convert the PDF pages to images first with the PDF to JPG tool, then run OCR on each page image. Alternatively, use the Searchable PDF tool to embed recognised text directly back into the PDF.

Why is the first extraction slow?

On first use, the browser downloads the OCR engine and the selected language's model (a few MB). After that, they're cached and subsequent extractions run much faster.
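
This slow-first-run behaviour is what you would expect from a download-once cache. The sketch below models it with an in-memory cache and an injected loader so that it runs without a network. The names createModelCache and ModelLoader are hypothetical, and the sketch does not show how Tesseract.js actually stores models in the browser.

```typescript
// Simplified model of "download once, reuse afterwards".
// The loader stands in for a network download and is an assumption.
type ModelLoader = (lang: string) => Uint8Array;

interface ModelCache {
  get(lang: string): Uint8Array;
  downloadCount(): number;
}

function createModelCache(load: ModelLoader): ModelCache {
  const models: { [lang: string]: Uint8Array } = {};
  let downloads = 0;
  return {
    get(lang: string): Uint8Array {
      // Only call the (slow) loader the first time a language is requested.
      if (!Object.prototype.hasOwnProperty.call(models, lang)) {
        models[lang] = load(lang);
        downloads++;
      }
      return models[lang];
    },
    downloadCount(): number {
      return downloads;
    },
  };
}
```

Each language has its own model, so switching to a language you have not used before triggers one more download even after the engine itself is cached.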

Is this tool completely free?

Yes. OCR on PDFdukan is 100% free with no page limits, no daily caps, no signup, and no watermarks. Extract text from as many images as you need.