How to OCR a PDF — Extract Text from Scanned PDFs for Free (2026)

Published: June 10, 2026 · 5 min read

You just received a scanned contract, a photo of a receipt, or a PDF that's nothing but images. You need the text — to copy, search, or edit it. But when you try to select text... nothing happens. That PDF is just a picture.

This is where OCR (Optical Character Recognition) comes in. OCR reads the text inside images and converts it to selectable, searchable, copyable text, and you don't need expensive software to do it.

What Is PDF OCR and When Do You Need It?

OCR is the process of converting images of text into actual text characters. You need it for:

  • Scanned documents — Contracts, invoices, forms scanned to PDF
  • Photo-based PDFs — Receipt photos, whiteboard snapshots, book pages
  • Image-heavy PDFs — Old PDFs where every page is a flattened image
  • Non-searchable PDFs — PDFs where Ctrl+F finds nothing
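A quick way to confirm that a PDF really has no text layer is to try extracting text from it. The sketch below assumes the `pdftotext` utility from Poppler is installed, which this article does not otherwise require; `has_text_layer` is a hypothetical helper name and `scanned.pdf` is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the PDF already contains extractable text.
# Assumes pdftotext (part of Poppler) is on the PATH.
has_text_layer() {
    # "-" sends the extracted text to stdout; strip all whitespace
    # (including page-break form feeds) and count what is left.
    chars=$(pdftotext "$1" - 2>/dev/null | tr -d '[:space:]' | wc -c | tr -d ' ')
    [ "${chars:-0}" -gt 0 ]
}

# Usage (scanned.pdf is a placeholder file name):
# if has_text_layer scanned.pdf; then echo "Already searchable"; else echo "Needs OCR"; fi
```

If the check fails, the pages are most likely images and one of the OCR methods in this article is the next step.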

4 Free Methods to OCR a PDF

1. Browser-Based OCR (Recommended)

Modern browser-based tools like PDF Toolbox OCR use WebAssembly-powered OCR engines that run entirely in your browser. No file uploads, no server processing — everything stays on your device.

Pros:

  • 100% private — your documents never leave your device
  • No installation, no signup
  • Works on any device: Windows, Mac, Linux, mobile
  • Free, unlimited use

Cons: Slightly slower than desktop apps for very large files (100+ pages).

2. Google Drive OCR (Built-in)

Google Drive has a built-in OCR feature that is easy to miss. Upload a PDF or image to Google Drive, right-click it, and choose Open with > Google Docs. Google Docs extracts the text into a new document automatically. It works well for clean documents.

Pros: Free with any Google account, good accuracy.

Cons: Your document must be uploaded to Google's servers. Formatting is often lost.

3. Tesseract OCR (Open Source)

Tesseract is the most popular open-source OCR engine. It was originally developed at HP, later developed with Google's backing, and is now maintained as a community open-source project. It supports 100+ languages.

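# Install Tesseract and ImageMagick (macOS); ImageMagick needs Ghostscript to read PDFs
brew install tesseract imagemagick ghostscript

# Convert each PDF page to a 300 DPI PNG (numbering starts at page_0000.png)
convert -density 300 scanned.pdf page_%04d.png

# OCR the first page; the result is written to output.txt
tesseract page_0000.png output -l eng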

Pros: Free, open-source, highly configurable, supports many languages.

Cons: Command-line only. Requires setup. PDF-to-image conversion needed first.
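The commands above handle a single page. For a multi-page scan, a loop over the converted images is one way to go. This is a minimal sketch, assuming the page images are named `page_*.png` as in the conversion step; `ocr_pages` is a hypothetical helper name, not part of Tesseract.

```shell
#!/bin/sh
# Hypothetical helper: OCR every page image in a directory with Tesseract.
# Arguments: directory holding page_*.png files, language code (defaults to eng).
ocr_pages() {
    dir=$1
    lang=${2:-eng}
    for page in "$dir"/page_*.png; do
        [ -e "$page" ] || continue      # skip when the glob matches nothing
        # Writes <page>.txt next to each image; -l selects the language model.
        tesseract "$page" "${page%.png}" -l "$lang" || return 1
    done
    # Zero-padded names sort in page order, so this joins the text in sequence.
    cat "$dir"/page_*.txt > "$dir/output.txt"
}

# Usage: ocr_pages . eng
```

Appending `pdf` to the `tesseract` command (for example `tesseract page.png out -l eng pdf`) should produce a searchable PDF for that page instead of a plain text file.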

4. Apple Preview + Live Text (Mac/iOS)

On macOS Ventura+ and iOS 15+, Apple's Live Text can recognize text in any PDF opened in Preview. Just open the PDF, click the text selection tool, and click on image text — it becomes selectable automatically.

Pros: Built-in, no extra steps, instant.

Cons: Apple-only. No batch processing.

OCR Accuracy: What Affects It

Factor           | Impact                         | Tip
-----------------|--------------------------------|---------------------------------------
DPI / Resolution | Higher = better                | 300 DPI minimum
Font clarity     | Clean fonts = 99%+ accuracy    | Avoid handwriting, decorative fonts
Language         | Latin scripts perform best     | Specify the language in OCR settings
Contrast         | High contrast = easier to read | Dark text on light background is ideal
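To make the 300 DPI guideline concrete, it helps to see what it means in pixels. The sketch below is plain arithmetic (inches times DPI); `pixels_at_dpi` is a hypothetical helper name, and the US Letter page size is just an example.

```shell
#!/bin/sh
# Hypothetical helper: pixel dimensions of a page scanned at a given DPI.
# Arguments: width in inches, height in inches, DPI (defaults to 300).
pixels_at_dpi() {
    awk -v w="$1" -v h="$2" -v dpi="${3:-300}" \
        'BEGIN { printf "%.0f x %.0f\n", w * dpi, h * dpi }'
}

pixels_at_dpi 8.5 11 300   # US Letter at 300 DPI: prints 2550 x 3300
pixels_at_dpi 8.5 11 150   # same page at 150 DPI: prints 1275 x 1650
```

A scan whose page images are much smaller than the 300 DPI figure probably falls below the recommended resolution.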

FAQ

Is online OCR safe for sensitive documents?

It depends on the tool. Server-based OCR services upload your document to their servers, which is a privacy risk for contracts, legal documents, or personal files. Browser-based OCR tools like PDF Toolbox OCR process everything locally — your file never leaves your device.

Can OCR handle handwriting?

Most OCR engines struggle with handwriting, especially cursive. Printed text yields 95-99% accuracy. Handwritten text might get 50-70%. For handwritten documents, specialized ICR (Intelligent Character Recognition) tools work better but are usually paid.

How long does OCR take?

Browser-based OCR typically processes 1-2 pages per second on a modern device. At that rate, recognizing a 10-page scanned PDF takes about 5-10 seconds, though slower devices can take longer. Server-based tools may be faster for very large files (50+ pages), but at the cost of privacy.
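For a rough estimate on a longer document, divide the page count by the throughput. The sketch below takes the 1-2 pages per second figure at face value and ignores startup time, so real runs may take longer; `estimate_seconds` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: estimated OCR time range in seconds for a page count.
# Arguments: pages, slow rate in pages/second (default 1), fast rate (default 2).
estimate_seconds() {
    awk -v pages="$1" -v slow="${2:-1}" -v fast="${3:-2}" \
        'BEGIN { printf "%.0f-%.0f seconds\n", pages / fast, pages / slow }'
}

estimate_seconds 10    # prints 5-10 seconds
estimate_seconds 50    # prints 25-50 seconds
```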

Try PDF Toolbox OCR — Free & Private

Extract text from scanned PDFs directly in your browser. Nothing is uploaded, nothing is stored. 100% free, no signup needed.
