Power Automate Extract Text from PDF - The Complete Guide (2026)

Need to extract text from a PDF in Power Automate? You’re not alone. It’s one of the most searched automation tasks, and for good reason. Whether you’re processing invoices, parsing resumes, or pulling data from contracts, getting text out of PDFs is the backbone of document automation.

In this guide, we’ll cover everything from the built-in Power Automate actions to handling the tricky cases (scanned PDFs, images, messy layouts) where native tools fall short. We’ll also show you how to plug in AI-powered extraction using ParserBee’s API to get clean, structured JSON data — all without writing a single line of code.

What Does “Extract Text from PDF” Mean in Power Automate?

When people say they want to extract text from a PDF in Power Automate, they usually mean one of these things:

Get all the raw text from a digital PDF (text you can select and copy)
Read text from a scanned PDF or image-based document using OCR
Extract specific data fields like names, dates, amounts, or addresses from a document
Pull structured data (like table rows or form fields) into a usable format

Power Automate offers different tools for each of these scenarios. The right approach depends on what kind of PDF you’re working with and what you want to do with the extracted data.

Method 1: Using the Built-in “Extract Text from PDF” Action

Power Automate Desktop includes a native “Extract text from PDF” action. Here’s how to use it:

Steps:

Open Power Automate Desktop and create a new flow
Add the “Extract text from PDF” action
Set the PDF file path (a local path; files stored in SharePoint or OneDrive need to be synced or downloaded to the machine first)
Choose the page range (all pages or specific pages)
Run the flow – the extracted text is stored in a variable

What It Does Well:

✅ Works great for digitally created PDFs (from Word, Excel, Google Docs, etc.)
✅ Fast and free. No premium connectors needed
✅ Simple to set up for basic text extraction

What It Doesn’t Do:

❌ Cannot read scanned PDFs or image-based documents
❌ Returns raw unstructured text. No field-level data extraction
❌ No support for extracting data from images (PNG, JPEG, WebP)
❌ Struggles with complex layouts, multi-column PDFs, and tables

Bottom line: If your PDFs are digitally generated and you just need the raw text, this built-in action works fine. But if you need structured data or have scanned documents, keep reading.

Method 2: Using AI Builder for Scanned PDFs and Images

For scanned PDFs and image-based documents, Power Automate offers AI Builder — Microsoft’s AI add-on that includes OCR (Optical Character Recognition).

Steps:

In your cloud flow, add the “Recognize text in an image or a PDF document” action (AI Builder)
Provide the file content from your trigger (e.g., “When a file is created in SharePoint”)
AI Builder uses OCR to read the text
Access the extracted text in subsequent actions

What It Does Well:

✅ Can read text from scanned PDFs and images
✅ Integrates directly into cloud flows

What It Doesn’t Do:

❌ Requires a premium license (AI Builder credits)
❌ Returns raw OCR text. Still no structured data extraction
❌ Accuracy varies significantly based on document quality
❌ Slow processing. Large PDFs can take minutes or even hours
❌ Has file size and page count limits
❌ Different OCR engines produce inconsistent results
❌ The OCR action cannot extract form data or specific fields on its own. That requires a separate AI Builder document processing model

The Problem: Why Native PDF Extraction Often Falls Short

Here’s the reality that most Power Automate users run into – and it’s the reason you’re probably reading this article:

1. Raw Text ≠ Usable Data

The built-in actions give you a giant block of text. If you’re extracting an invoice, you get something like:

Invoice #12345
Date: March 10, 2026
Bill To: Acme Corp
123 Main Street
Item: Consulting Services
Amount: $5,000.00
Tax: $500.00
Total: $5,500.00

Now you need to write complex regex patterns or use multiple string functions to pull out the invoice number, date, amount, etc. This is fragile, breaks when layouts change, and is not scalable.
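To make the fragility concrete, here is a minimal Python sketch of the regex approach applied to the sample text above. The patterns and the `parse_invoice` helper are our own illustration, not part of Power Automate; the same idea applies to string functions in a flow.

```python
import re

RAW_TEXT = """Invoice #12345
Date: March 10, 2026
Bill To: Acme Corp
123 Main Street
Item: Consulting Services
Amount: $5,000.00
Tax: $500.00
Total: $5,500.00"""

# One pattern per field; each is tied to the exact label wording of this layout.
PATTERNS = {
    "invoice_number": r"Invoice #(\d+)",
    "date": r"Date:\s*(.+)",
    "total": r"^Total:\s*\$([\d,]+\.\d{2})",
}


def parse_invoice(text: str) -> dict:
    """Return the captured value for each field, or None when its label is not found."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        fields[name] = match.group(1).strip() if match else None
    return fields


parsed = parse_invoice(RAW_TEXT)

# A vendor that prints "Invoice No." instead of "Invoice #" breaks the first pattern silently.
renamed = parse_invoice(RAW_TEXT.replace("Invoice #", "Invoice No. "))
```

With the original layout every field is found; after a one-word label change, `invoice_number` comes back as `None` without any error being raised, which is the kind of silent failure that makes this approach hard to maintain.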

2. Scanned Documents Are a Nightmare

Roughly 30–40% of business documents are scanned PDFs or images. The native “Extract text from PDF” action simply returns nothing for these documents, because a scanned page is a picture with no text layer to read. AI Builder helps, but adds cost, complexity, and inconsistent accuracy.

3. No Support for Images

Need to extract text from an image (PNG, JPEG) in Power Automate? The built-in PDF action doesn’t work at all. You need AI Builder or a third-party connector, which means premium costs.

4. Complex Layouts Break Everything

Multi-column PDFs, tables, forms with checkboxes, documents with headers and footers – these common formats produce garbled output with native extraction.

5. No Template-Based Extraction

What if you process the same type of document every day (invoices, purchase orders, applications)? You’d want to define once what fields to extract and have it work automatically. Native Power Automate has no built-in way to do this.

This is exactly the gap that ParserBee fills – and it works right inside your Power Automate flows.

Method 3: Using ParserBee API in Power Automate (Recommended)

ParserBee is an AI-powered document parsing platform that extracts structured data from any document – PDFs, scanned documents, and images (PNG, JPEG, WebP) – and returns clean JSON output.

The best part? You can call ParserBee’s API directly from Power Automate using the HTTP connector. No coding required.

How It Works:

Create a template on ParserBee – define the exact fields you want to extract (e.g., invoice_number, date, total_amount, line_items)
Get your API key from the ParserBee dashboard
Add an HTTP action in Power Automate to call ParserBee’s extraction API
Get structured JSON back – ready to use in your flow

What Makes ParserBee Different:

Feature	Native Power Automate	AI Builder	ParserBee API
Digital PDFs	✅	✅	✅
Scanned PDFs	❌	✅ (with limits)	✅
Images (PNG, JPEG, WebP)	❌	✅ (with limits)	✅
Structured JSON output	❌	❌	✅
Custom extraction templates	❌	Limited	✅
Multi-record extraction	❌	❌	✅
Nested/complex fields	❌	❌	✅
Works in Power Automate	✅ Native	✅ Native	✅ Via HTTP connector
No-code setup	✅	✅	✅
Pricing	Free	Premium (AI Builder credits)	Free tier available

Step-by-Step: Set Up ParserBee in Power Automate

Here’s exactly how to set up AI-powered document extraction in your Power Automate flow using ParserBee. No coding required.

Step 1: Create Your ParserBee Account and Template

Go to app.parserbee.com and create a free account
Navigate to Dashboard → Create Template
Give your template a name (e.g., “Invoice Extraction”)
Add the fields you want to extract. For an invoice, you might add:
- invoice_number (string)
- date (string)
- vendor_name (string)
- total_amount (number)
- line_items (array of objects with description, quantity, unit_price)
Click Create Template and note down the Template ID

Step 2: Get Your API Key

In the ParserBee dashboard, go to Settings → API Keys
Copy your API key – you’ll need this for the Power Automate flow

Step 3: Build Your Power Automate Flow

Here’s the flow structure:

3a. Set Up Your Trigger

Choose a trigger based on your use case:

“When a file is created (SharePoint)” – for documents uploaded to SharePoint
“When a new email arrives (Outlook)” – for email attachments
“Manually trigger a flow” – for testing

3b. Add the HTTP Action to Call ParserBee

Add a new action → Search for “HTTP” → Select “HTTP” (premium connector)
Configure it as follows:

Method: POST

URI: https://app.parserbee.com/api/v1/extract

Headers:

Key	Value
`x-api-key`	`your-parserbee-api-key`

Body (using file_url):

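If your file is accessible via a URL that ParserBee can fetch without signing in (a SharePoint link usually requires authentication, so use a sharing link that allows anonymous access, or use the file upload option below):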

{
  "template_id": "your-template-id",
  "file_url": "@{triggerOutputs()?['body/MediaUrl']}"
}
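If it helps to see the same call outside of Power Automate, here is a rough Python equivalent of the HTTP action above. The endpoint, header name, and body fields come from this guide; the `Content-Type` value and the example file URL are assumptions, and the request is only built here, not sent.

```python
import json
import urllib.request

EXTRACT_URL = "https://app.parserbee.com/api/v1/extract"  # endpoint from the HTTP action above


def build_extract_request(api_key: str, template_id: str, file_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that the HTTP action would make."""
    body = json.dumps({"template_id": template_id, "file_url": file_url}).encode("utf-8")
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            # Assumption: JSON bodies are sent as application/json.
            "Content-Type": "application/json",
        },
    )


request = build_extract_request(
    "your-parserbee-api-key",
    "your-template-id",
    "https://example.com/invoice.pdf",  # placeholder URL for illustration
)
# To actually send it: urllib.request.urlopen(request, timeout=30)
```

Testing the call this way first can make it easier to tell whether a failure comes from the flow configuration or from the request itself.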

Body (using file upload):

If you want to upload the file directly, change the content type to multipart/form-data and include:

template_id: Your template ID
file: The file content from your trigger
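For readers who want to see what a multipart/form-data upload looks like on the wire, here is a small Python sketch that encodes the two form fields by hand. The field names `template_id` and `file` come from this guide; the `encode_multipart` helper, the sample filename, and the `application/octet-stream` part type are illustrative assumptions.

```python
import uuid


def encode_multipart(template_id: str, filename: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data upload with two parts."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="template_id"',
        b"",
        template_id.encode("utf-8"),
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode("utf-8"),
        # Assumption: a generic binary type is acceptable for the file part.
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    # Multipart parts are separated by CRLF line endings.
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart("your-template-id", "invoice.pdf", b"%PDF-1.7 sample bytes")
```

The returned `content_type` string carries the boundary, so it has to be sent as the request's Content-Type header alongside the body.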

3c. Parse the JSON Response

Add a “Parse JSON” action after the HTTP step
Use this schema:

{
  "type": "object",
  "properties": {
    "success": { "type": "boolean" },
    "request_id": { "type": "string" },
    "data": { "type": "object" },
    "credits_remaining": { "type": "integer" },
    "processing_time_ms": { "type": "integer" },
    "usage": {
      "type": "object",
      "properties": {
        "pages_processed": { "type": "integer" },
        "doc_size_bytes": { "type": "integer" }
      }
    }
  }
}
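The schema only describes the shape of the response, so it is worth checking the `success` flag before using `data`. Here is a minimal Python sketch of that check; the `unwrap_extraction` helper and the `req_123` identifier are hypothetical, and in a flow you would do the same thing with a Condition action.

```python
import json


def unwrap_extraction(raw_response: str) -> dict:
    """Return the `data` object, or raise if the response is not a successful extraction."""
    payload = json.loads(raw_response)
    if payload.get("success") is not True:
        raise ValueError(f"extraction failed (request_id={payload.get('request_id')})")
    data = payload.get("data")
    if not isinstance(data, dict):
        raise ValueError("response has no `data` object")
    return data


# Sample response shaped like the schema above; the request_id is made up.
sample = (
    '{"success": true, "request_id": "req_123", '
    '"data": {"invoice_number": "INV-2026-0342"}, "credits_remaining": 487}'
)
fields = unwrap_extraction(sample)
```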

3d. Use the Extracted Data

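Now you can use the extracted fields anywhere in your flow. The schema above declares data as a generic object, so either add your template’s fields under its properties or reference them with an expression such as body('Parse_JSON')?['data']?['invoice_number']. Typical next steps: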

Save to Excel or SharePoint – Map each field to a column
Send an email notification – Include the extracted data in the email body
Create a record in Dynamics 365 or Dataverse – Populate fields automatically
Post to Microsoft Teams – Notify your team with the extracted information
Update a database – Write data to SQL Server or any connected system
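When the destination is a table (Excel, a SharePoint list, SQL), nested arrays need to be flattened into rows first. The sketch below shows one way to do that in Python, repeating the document-level fields on every line-item row. The helper names are our own, and the sample data is a trimmed version of the invoice example used later in this guide.

```python
import csv
import io


def flatten_line_items(data: dict) -> list[dict]:
    """Return one row per line item, repeating the invoice-level fields on each row."""
    header = {key: value for key, value in data.items() if key != "line_items"}
    return [{**header, **item} for item in data.get("line_items", [])]


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize rows to CSV text; column order follows the first row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


data = {
    "invoice_number": "INV-2026-0342",
    "total_amount": 4950.00,
    "line_items": [
        {"description": "Cloud Hosting (March 2026)", "quantity": 1, "unit_price": 3000.00},
        {"description": "Technical Support Package", "quantity": 1, "unit_price": 1500.00},
    ],
}
rows = flatten_line_items(data)
csv_text = rows_to_csv(rows)
```

In a flow, the equivalent is an “Apply to each” loop over the array with an “Add a row” action inside it.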

Use Case Examples

📄 Invoice Processing

Trigger: A new invoice PDF is uploaded to SharePoint

ParserBee Template Fields:

invoice_number (string)
vendor_name (string)
invoice_date (string)
due_date (string)
subtotal (number)
tax (number)
total_amount (number)
line_items (array) – with description, quantity, unit_price

What You Get Back:

{
  "success": true,
  "data": {
    "invoice_number": "INV-2026-0342",
    "vendor_name": "CloudTech Solutions",
    "invoice_date": "2026-03-10",
    "due_date": "2026-04-10",
    "subtotal": 4500.00,
    "tax": 450.00,
    "total_amount": 4950.00,
    "line_items": [
      {
        "description": "Cloud Hosting (March 2026)",
        "quantity": 1,
        "unit_price": 3000.00
      },
      {
        "description": "Technical Support Package",
        "quantity": 1,
        "unit_price": 1500.00
      }
    ]
  },
  "credits_remaining": 487,
  "processing_time_ms": 2340
}
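Structured output also makes sanity checks cheap. As a hedged example, this Python sketch verifies that the line items add up to the subtotal and that subtotal plus tax equals the total, using the sample response above. The `totals_are_consistent` helper and its half-cent tolerance are our own choices, not something the API provides.

```python
import math

invoice = {
    "subtotal": 4500.00,
    "tax": 450.00,
    "total_amount": 4950.00,
    "line_items": [
        {"description": "Cloud Hosting (March 2026)", "quantity": 1, "unit_price": 3000.00},
        {"description": "Technical Support Package", "quantity": 1, "unit_price": 1500.00},
    ],
}


def totals_are_consistent(data: dict, tolerance: float = 0.005) -> bool:
    """Check line items against the subtotal, and subtotal + tax against the total."""
    line_total = sum(item["quantity"] * item["unit_price"] for item in data["line_items"])
    return (
        math.isclose(line_total, data["subtotal"], abs_tol=tolerance)
        and math.isclose(data["subtotal"] + data["tax"], data["total_amount"], abs_tol=tolerance)
    )


is_consistent = totals_are_consistent(invoice)
```

A check like this can gate the approval step, so that invoices whose numbers do not add up are routed to a person instead of being posted automatically.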

Next Steps in Flow: Add a row to an Excel table, send approval email, create a payment entry.

📋 Resume/CV Parsing

Trigger: A candidate uploads their resume via Microsoft Forms

ParserBee Template Fields:

name (string)
email (string)
phone (string)
education (array) – with institution, degree, major, year
work_experience (array) – with company, role, duration, description
skills (array of strings)

What You Get Back:

{
  "success": true,
  "data": {
    "name": "Sarah Johnson",
    "email": "sarah.johnson@example.com",
    "phone": "(555) 123-4567",
    "education": [
      {
        "institution": "MIT",
        "degree": "B.S.",
        "major": "Computer Science",
        "year": "2022"
      }
    ],
    "work_experience": [
      {
        "company": "Google",
        "role": "Software Engineer",
        "duration": "2022 - Present",
        "description": "Full-stack development on Cloud Platform"
      }
    ],
    "skills": ["Python", "React", "AWS", "SQL", "Docker"]
  }
}

Next Steps in Flow: Create a record in your ATS, send to hiring manager, add to SharePoint list.

🧾 Receipt and Expense Extraction

Trigger: An employee forwards a receipt image via email

ParserBee Template Fields:

merchant_name (string)
date (string)
total (number)
payment_method (string)
items (array) – with name, price

This works even with:

📱 Photos of receipts (JPEG, PNG)
📑 Scanned PDFs of receipts
🌐 WebP images

📝 Contract and Agreement Parsing

Trigger: A new contract is uploaded to a SharePoint folder

ParserBee Template Fields:

parties (array of strings)
effective_date (string)
expiration_date (string)
contract_value (number)
key_terms (array of strings)
renewal_clause (string)

ParserBee vs Native Power Automate PDF Extraction

Here’s a side-by-side comparison to help you decide:

When to Use the Native “Extract Text from PDF” Action:

✅ Your PDFs are always digitally generated (never scanned)
✅ You only need the raw text as a block of unformatted content
✅ You have a simple, consistent document layout
✅ You are comfortable writing regex or string operations to parse the output

When to Use ParserBee API:

✅ You need structured JSON data – not raw text
✅ You process scanned PDFs or image-based documents
✅ You want to extract data from images (PNG, JPEG, WebP)
✅ You need to handle complex layouts, tables, or multi-page documents
✅ You want a template-based approach – define fields once, extract automatically
✅ You need nested data structures (e.g., line items on an invoice)
✅ You want to process the same document type repeatedly with consistent results
✅ You want a no-code solution that doesn’t require regex or string manipulation

Supported File Types

ParserBee supports all the common document formats you’ll encounter:

Format	Extension	Use Case
PDF	`.pdf`	Invoices, contracts, reports, resumes
PNG	`.png`	Screenshots, scanned documents, captured images
JPEG	`.jpg`, `.jpeg`	Photos of receipts, ID cards, documents
WebP	`.webp`	Web-optimized document images

Maximum file size: 50 MB
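It can save credits and failed runs to check a file against these limits before calling the API. Here is a minimal Python sketch of such a pre-check. The extension list and the 50 MB limit come from the table above; the `rejection_reason` helper is hypothetical, and treating 50 MB as 50 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import PurePosixPath
from typing import Optional

SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".webp"}
# Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 50 * 1024 * 1024


def rejection_reason(filename: str, size_bytes: int) -> Optional[str]:
    """Return why a file would be rejected, or None if it passes both checks."""
    extension = PurePosixPath(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return f"unsupported file type: {extension or '(none)'}"
    if size_bytes > MAX_FILE_BYTES:
        return "file is larger than 50 MB"
    return None
```

In a flow, the same check is a Condition on the file name and size placed before the HTTP action.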

Frequently Asked Questions

Can I use Power Automate to extract text from a scanned PDF?

Yes, but not with the built-in “Extract text from PDF” action – that only works for digitally created PDFs with selectable text. For scanned PDFs, you have two options:

AI Builder – Microsoft’s OCR add-on (requires premium license, returns raw text only)
ParserBee API – AI-powered extraction via the HTTP connector (returns structured JSON data, free tier available)

Can Power Automate extract text from images?

The built-in PDF actions don’t support images. You can use AI Builder’s “Recognize text in an image or a PDF document” action (premium), or call the ParserBee API, which supports PNG, JPEG, and WebP images natively.

How do I extract specific fields (like invoice number or date) from a PDF in Power Automate?

The native actions only return raw text – you’d need to write string functions or regex to parse individual fields. With ParserBee, you define the exact fields you want in a template, and the API returns them as a structured JSON object. No string parsing needed.

Is ParserBee free to use?

ParserBee offers a free tier with credits to get started. You can create templates, generate an API key, and start extracting data immediately. Paid plans are available for higher volume usage.

Does ParserBee work with Power Automate Cloud (not Desktop)?

Yes! ParserBee’s API works with both Power Automate Cloud and Power Automate Desktop. In Cloud flows, use the HTTP connector. In Desktop flows, use the “Invoke web service” action.

What happens if the PDF has tables?

ParserBee handles tables natively. You can define array-type fields in your template (e.g., line_items as an array of objects), and ParserBee will extract each row as a structured object with the fields you specified.

Can I extract data from multiple documents of the same type?

Absolutely – that’s exactly what templates are designed for. Create a template once for your document type (invoices, receipts, etc.), and use the same template ID in every API call. ParserBee’s AI adapts to variations in layout while extracting the same fields consistently.

Is the HTTP connector in Power Automate a premium connector?

Yes, the HTTP connector requires a Power Automate Premium license. However, if you’re processing PDFs at any meaningful scale, you likely already have a premium plan. The investment is worth it for the structured data output you get from ParserBee compared to raw text.

Can I use ParserBee to extract data from documents in languages other than English?

Yes, ParserBee’s AI engine supports multiple languages. The OCR and extraction capabilities work across different languages and character sets.

How fast is ParserBee’s extraction?

Most documents are processed in 2–5 seconds. This is significantly faster than AI Builder, which can take minutes for complex or multi-page documents.

Conclusion

Extracting text from PDFs in Power Automate doesn’t have to be painful. Here’s the simple decision tree:

Digital PDFs + raw text only? → Use the built-in “Extract text from PDF” action
Scanned PDFs or images? → You need OCR – either AI Builder (premium, raw text) or ParserBee (structured data)
Need structured data (specific fields)? → Use ParserBee API via the HTTP connector – it’s the only option that gives you clean JSON without writing regex
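The decision tree above is simple enough to write down as a function. This Python sketch is only a restatement of the three rules, with a hypothetical `choose_method` name and two yes/no inputs.

```python
def choose_method(scanned_or_image: bool, needs_structured_fields: bool) -> str:
    """Map the two questions from the decision tree to a recommended approach."""
    if needs_structured_fields:
        return "ParserBee API via the HTTP connector"
    if scanned_or_image:
        return "OCR: AI Builder (raw text) or ParserBee (structured data)"
    return "built-in Extract text from PDF action"
```

The structured-fields question is checked first because it decides the answer regardless of whether the document is scanned.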

ParserBee works inside your existing Power Automate flows. There’s nothing to migrate, no workflows to rebuild. Just add one HTTP action, point it at ParserBee’s API, and you’ll go from raw blobs of text to structured, usable data in seconds.

Get started with ParserBee for free →

Have questions about using ParserBee with Power Automate? Reach out to the ParserBee team – we’re happy to help you set up your first flow.