Extract text from photos, scanned documents, and screenshots. Supports 100+ languages using a neural-network OCR engine.
What is OCR and how do I extract text from an image? OCR (Optical Character Recognition) converts text inside photos, scans, and screenshots into editable, copyable text. With PDFdukan's free OCR tool you select an image, pick the language (100+ supported, including Urdu), and copy the extracted text. It runs entirely in your browser: the image is never sent to a server, and no signup is required.
Free OCR — Extract Text from Images Online
PDFdukan's OCR tool uses Tesseract.js, a browser port of the open-source Tesseract engine whose development Google sponsored for many years, to extract text from photos, screenshots, and scanned documents. It supports over 100 languages, including English, Arabic, Urdu, and Chinese. All processing happens locally in your browser, so your images stay on your device.
🌍
100+ Languages
Extract text in more than 100 languages, including Arabic, Urdu, Chinese, Japanese, and Latin-script languages.
🎯
High Accuracy
Powered by Tesseract 4.x with an LSTM neural network for improved recognition accuracy.
📋
Copy & Export
Copy text directly to clipboard or download as a .txt file with one click.
How to Extract Text from an Image in 3 Steps
Upload your image or scanned document — JPG, PNG, or WEBP photos of documents, screenshots, book pages, receipts, or handwritten-style printed text.
Select the language — Choose the language of the text in your image (English, Urdu, Arabic, and 100+ more). Correct language selection significantly improves accuracy.
Run OCR and copy the text — Click Extract Text. Within seconds the recognised text appears in an editable box — copy it to your clipboard or download it as a .txt file.
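The three steps above can be sketched in code. This is a minimal illustration, not PDFdukan's actual implementation: the `OcrWorker` and `WorkerFactory` types and the `extractText` function are hypothetical names introduced here. They are modelled loosely on the shape of a Tesseract.js worker (a `recognize` call that resolves to `data.text` and `data.confidence`), but the exact signatures depend on the library version, so treat them as assumptions.

```typescript
// Hypothetical types for this sketch. They resemble a Tesseract.js worker,
// but check the version you install before relying on this shape.
interface OcrResult {
  data: { text: string; confidence: number };
}

interface OcrWorker {
  recognize(image: unknown): Promise<OcrResult>;
  terminate(): Promise<unknown>;
}

// A function that loads a worker for one language code, e.g. "eng" or "urd".
type WorkerFactory = (lang: string) => Promise<OcrWorker>;

interface Extraction {
  text: string;
  confidence: number;
  words: number;
  characters: number;
}

// Step 1: the caller supplies the image. Step 2: the caller supplies the
// language code. Step 3: run OCR and return the text with simple statistics.
async function extractText(
  createWorker: WorkerFactory,
  image: unknown,
  lang: string,
): Promise<Extraction> {
  const worker = await createWorker(lang);
  try {
    const { data } = await worker.recognize(image);
    const text = data.text.trim();
    const words = text === "" ? 0 : text.split(/\s+/).length;
    return { text, confidence: data.confidence, words, characters: text.length };
  } finally {
    // Release the worker even if recognition fails.
    await worker.terminate();
  }
}
```

Passing the worker factory in as a parameter keeps the sketch independent of any one library version and makes it easy to try with a stub before wiring in a real OCR engine.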
Common Uses for OCR
Digitising printed documents: Convert photographed letters, certificates, contracts, and reports into editable, searchable text instead of retyping.
Extracting text from screenshots: Grab text from a screenshot of a webpage, chat, error message, or app where copying is disabled.
Book and notes scanning: Students can photograph textbook pages or class notes and extract the text for summaries and assignments.
Receipts and invoices: Pull amounts, dates, and details from photographed receipts for record-keeping and expense reports.
Old archives: Convert old printed family documents, certificates, and records into digital text before they fade.
Translating foreign text: Extract text from an image first, then paste it into a translator — useful for documents, signs, and labels.
Tips for the Best OCR Accuracy
Use a sharp, well-lit photo — blur is the biggest cause of recognition errors.
Hold the camera straight above the text — tilted text reduces accuracy. For paper documents, scan with CamMaster Scanner first to flatten and enhance the page.
Select the correct language before running OCR — the engine loads language-specific recognition models.
Higher resolution images give better results — avoid heavily compressed WhatsApp forwards when possible; ask for the original photo.
Printed text recognises far better than handwriting — Tesseract is built for printed characters.
Frequently Asked Questions
How accurate is the OCR?
For clear, well-lit photos of printed text, accuracy is typically 95–99%. Accuracy drops with blurry images, low light, unusual fonts, or skewed angles. The LSTM neural network in Tesseract 4.x handles most real-world documents well, and you can edit the extracted text directly in the output box to fix any stray errors.
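The page does not say how its accuracy figure is measured. One common definition is character accuracy: one minus the edit distance between the recognised text and a hand-typed reference, divided by the reference length. The sketch below assumes that definition; `editDistance` and `characterAccuracy` are illustrative names, not part of any OCR library.

```typescript
// Levenshtein edit distance (insertions, deletions, substitutions), computed
// over UTF-16 code units with a rolling one-row table.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Character accuracy = 1 - (edit distance / reference length), floored at 0.
function characterAccuracy(reference: string, recognised: string): number {
  if (reference.length === 0) return recognised.length === 0 ? 1 : 0;
  return Math.max(0, 1 - editDistance(reference, recognised) / reference.length);
}
```

For example, if the reference is "hello world" (11 characters) and the OCR output is "hello w0rld", one substitution is needed, so the character accuracy is 1 - 1/11, or about 91%.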
Does it support Urdu, Arabic, and other non-Latin scripts?
Yes. The tool supports over 100 languages including Urdu, Arabic, Persian, Hindi, Chinese, and Japanese. Select the correct language before extraction: the engine downloads that language's trained recognition model and applies right-to-left text handling where the script requires it.
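As a rough illustration of language selection, the sketch below maps a few display names to the three-letter codes Tesseract uses for its trained models and flags the right-to-left scripts. The `LANGUAGE_CODES` table and the `resolveLanguage` helper are hypothetical and cover only a handful of the supported languages; how the tool itself handles text direction is not described on this page.

```typescript
// Display name -> Tesseract trained-model code. Only a small sample of the
// supported languages is listed here for illustration.
const LANGUAGE_CODES: Record<string, string> = {
  English: "eng",
  Urdu: "urd",
  Arabic: "ara",
  Persian: "fas",
  Hindi: "hin",
  "Chinese (Simplified)": "chi_sim",
  Japanese: "jpn",
};

// Codes in the table above whose scripts are written right to left.
const RIGHT_TO_LEFT_CODES = ["ara", "urd", "fas"];

function resolveLanguage(displayName: string): { code: string; rtl: boolean } {
  const code = LANGUAGE_CODES[displayName];
  if (code === undefined) {
    throw new Error("Unsupported language: " + displayName);
  }
  return { code, rtl: RIGHT_TO_LEFT_CODES.indexOf(code) !== -1 };
}
```

A page could use the `rtl` flag to set the text direction of the output box so that Urdu or Arabic results display correctly.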
Can it read handwriting?
Tesseract is designed for printed text and performs poorly on cursive or rough handwriting. Very neat, print-style handwriting may partially work. For handwritten notes, expect to correct errors manually.
Are my images uploaded to a server?
No. The entire OCR engine (Tesseract.js) runs inside your browser using WebAssembly. Your photos, including ID cards, bank documents, and private letters, never leave your device.
How do I extract text from a PDF?
Convert the PDF pages to images first with the PDF to JPG tool, then run OCR on each page image. Alternatively, use the Searchable PDF tool to embed recognised text directly back into the PDF.
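The page-by-page approach can be sketched as a simple loop. This is only an illustration under stated assumptions: `ocrPages` and the `Recognise` callback are hypothetical names, and the page images are assumed to have been produced already by a separate PDF-to-image step.

```typescript
// Any function that turns one page image into text, for example a wrapper
// around an OCR worker. The type is an assumption made for this sketch.
type Recognise = (pageImage: unknown) => Promise<string>;

// Run OCR on each page in order and join the results with page markers.
async function ocrPages(pages: unknown[], recognise: Recognise): Promise<string> {
  const sections: string[] = [];
  for (let i = 0; i < pages.length; i++) {
    // Pages are processed one at a time to keep memory use low in a browser.
    const text = await recognise(pages[i]);
    sections.push("--- Page " + (i + 1) + " ---\n" + text.trim());
  }
  return sections.join("\n\n");
}
```

Processing pages sequentially is slower than running them in parallel, but it avoids holding many large images and recognition jobs in memory at once.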
Why is the first extraction slower?
On first use, the browser downloads the OCR engine and the selected language's model (a few MB). After that, they're cached and subsequent extractions run much faster.
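The download-once behaviour can be illustrated with a small memoising wrapper. This sketch keeps the cache in memory for the lifetime of the page only; where the real tool stores the engine and models is not stated here, so `cachedLoader` and `ModelLoader` are hypothetical names used purely for illustration.

```typescript
// Any function that fetches the model for a language code.
type ModelLoader<T> = (lang: string) => Promise<T>;

// Wrap a loader so each language is fetched at most once per page session.
function cachedLoader<T>(load: ModelLoader<T>): ModelLoader<T> {
  const cache = new Map<string, Promise<T>>();
  return (lang: string): Promise<T> => {
    const hit = cache.get(lang);
    if (hit !== undefined) return hit;
    const pending = load(lang).catch((err: unknown): never => {
      // Forget failed downloads so a later call can retry.
      cache.delete(lang);
      throw err;
    });
    cache.set(lang, pending);
    return pending;
  };
}
```

Caching the promise rather than the resolved value means that two extractions started at the same time share a single download instead of fetching the model twice.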
Is the OCR tool free?
Yes. OCR on PDFdukan is 100% free with no page limits, no daily caps, no signup, and no watermarks. Extract text from as many images as you need.