AI Extract Node
Use a language model to extract structured fields from unstructured text, outputting a typed JSON object.
Runs a prompt + JSON schema through a language model to extract structured data from free-form text. The output is a validated JSON object stored in a variable.
- Extract name, email, and budget from a free-text enquiry message.
- Parse a PDF (scraped with the HTTP Scrape node) into structured fields like company name, address, and services.
---
| Field | Type | Required | Description |
|---|---|---|---|
| model | enum | Yes | gpt-4o, claude-3-5-sonnet, or gemini-1.5-pro |
| prompt | string | Yes | Instructions for what to extract |
| schema | JSON object | Yes | JSON Schema defining the fields to extract |
| text | string | Yes | The source text — typically {{variables.pdfText}} or {{variables.scrapedText}} |
---
| Field | Type | Required | Description |
|---|---|---|---|
| outputVar | string | No | Variable name for the result (default: first key of schema or extractResult) |
---
| Variable | Description |
|---|---|
| {{variables.extractResult}} (or your custom outputVar name) | Structured JSON object matching your schema |
---
- Scrape or retrieve source text earlier in the flow (e.g. HTTP Scrape → `scrapedText`).
- Add an AI Extract node.
- Write the extraction prompt, e.g. `Extract the company name, registration number, and address from the following text.`
- Define your schema:
```json
{
  "type": "object",
  "properties": {
    "companyName": { "type": "string" },
    "regNumber": { "type": "string" },
    "address": { "type": "string" }
  }
}
```
- Set `text` to `{{variables.scrapedText}}`.
- Downstream nodes can reference `{{variables.extractResult.companyName}}` etc.
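
To make the dotted reference in the last step concrete, here is a minimal Python sketch of how a template such as `{{variables.extractResult.companyName}}` could be resolved against stored flow variables. The `render` helper, the regular expression, and the sample values are assumptions made for illustration; the platform's actual template engine is not described in this page and may behave differently.

```python
import re

# Hypothetical flow state after the AI Extract node has run (sample values are made up).
variables = {
    "extractResult": {
        "companyName": "Acme Ltd",
        "regNumber": "01234567",
        "address": None,  # field not found in the source text
    }
}

# Matches {{variables.some.dotted.path}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*variables\.([A-Za-z0-9_.]+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Replace each {{variables.path}} placeholder with the value at that path."""

    def lookup(match: re.Match) -> str:
        value = variables
        for key in match.group(1).split("."):
            # Missing keys or non-dict intermediates resolve to None.
            value = value.get(key) if isinstance(value, dict) else None
        # In this sketch, null or missing values render as an empty string.
        return "" if value is None else str(value)

    return PLACEHOLDER.sub(lookup, template)


print(render("Company: {{variables.extractResult.companyName}}", variables))
```

In this sketch a missing or `null` field renders as an empty string, which is one possible convention and not a documented behaviour of the node.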
---
```json
{
  "model": "gpt-4o",
  "prompt": "Extract the fields defined in the schema from the text below.",
  "schema": {
    "type": "object",
    "properties": {
      "budget": { "type": "number" },
      "timeline": { "type": "string" },
      "service": { "type": "string" }
    }
  },
  "text": "{{variables.enquiryText}}",
  "outputVar": "enquiryData"
}
```
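
The node's output is described as a validated JSON object. As a rough illustration of what that validation means for a flat schema like the one in this example, the sketch below checks a candidate result against the declared property types. `check_result` is a hypothetical helper that handles only the `string` and `number` types and treats `null` as "not found"; it is not the node's actual validator, and a full JSON Schema library covers far more.

```python
import json

schema = json.loads("""
{
  "type": "object",
  "properties": {
    "budget":   { "type": "number" },
    "timeline": { "type": "string" },
    "service":  { "type": "string" }
  }
}
""")

# Map the JSON Schema type names used above to Python types.
PY_TYPES = {"string": str, "number": (int, float)}


def check_result(result: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means the result fits."""
    problems = []
    for name, spec in schema["properties"].items():
        value = result.get(name)
        if value is None:
            continue  # null or absent: treated here as "not found", not as an error
        expected = PY_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{name}: expected {spec['type']}, got {type(value).__name__}")
    return problems


good = {"budget": 5000, "timeline": "3 months", "service": "Web design"}
bad = {"budget": "about 5k", "timeline": None, "service": "Web design"}

print(check_result(good, schema))  # []
print(check_result(bad, schema))   # ['budget: expected number, got str']
```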
---
- Larger schemas → more tokens used → higher cost per run. Keep schemas focused.
- If a field can't be found in the source text, the model will typically return `null` for that key. If a downstream step depends on the field, check for `null` with an If/Else node first.
- This node is particularly useful after the HTTP Scrape node for parsing scraped page content.
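
The `null` check mentioned in the notes amounts to the following logic, shown as a small Python sketch. `missing_fields` and the sample `enquiry_data` are hypothetical; in a flow, the same condition would be expressed in an If/Else node, not in code.

```python
def missing_fields(result: dict, required: list) -> list:
    """Return the required keys whose value is null (None) or absent."""
    return [key for key in required if result.get(key) is None]


# Sample extraction result where the model could not find a timeline.
enquiry_data = {"budget": 5000, "timeline": None, "service": "Web design"}

missing = missing_fields(enquiry_data, ["budget", "timeline"])
if missing:
    print("Ask the customer for:", ", ".join(missing))
else:
    print("All required fields present")
```

The comparison uses `is None` instead of a truthiness test so that legitimate values such as a budget of `0` or an empty string are not mistaken for missing fields.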