Document Capturing Agent

Agent Description:

The Document Capturing Agent automates the intake of insurance claim documents. It retrieves raw JSON data via an API and validates key identifiers such as claim ID and policy number. The agent then parses, cleans, and normalizes the data, handling nested structures and ensuring consistent formatting. Finally, it outputs a structured and standardized JSON file ready for downstream claim management systems.

Purpose and Components

Purpose: To automate the first step of insurance claim processing by ingesting raw JSON claim documents, validating them for key identifiers, and then normalizing the (often nested) data into a clean, standardized schema.
Components:
- Document Extractor: An agent that fetches the raw JSON claim document via an API.
- Claim Structurer: An agent that parses the raw JSON and maps it to a clean, standardized data structure.
- API Connector (GET): A tool (listed under Tools Used as Request - Get) that retrieves the raw claim data from a remote source.

Supported Capabilities

Retrieving incoming JSON claim documents using the Request - Get tool.
Capturing key fields, including claimant details, policy number, hospital data, diagnosis, billing, and insurance information, even from nested or irregular structures.
Validating that each incoming document includes a unique claim reference ID and policy number.
Passing the raw, untransformed JSON to the next agent for processing.
Parsing and mapping all raw attributes into a standardized schema (e.g., ClaimReferenceID, PolicyNumber, Claimant, HospitalDetails, Financials).
Normalizing numerical values by removing currency symbols and converting them to numbers.
Formatting nested objects (like billing breakdowns) into proper sub-JSON structures.
Outputting a final, structured JSON in a consistent format, ready for ingestion into a claim management system.

LLM Used

GPT_4O_MINI

Note: To learn more about the LLM and to modify its behavior, refer to the Configuring LLM settings section.

Sub-Agents

1. Document Extractor

Role: Data Fetcher
Scope: Retrieve raw insurance claim data in JSON format using the Request - Get tool.
Description: This agent initiates the workflow. It uses the Request - Get tool to retrieve an incoming raw JSON claim document. It performs a basic validation to confirm that a claim reference ID and policy number are present, then passes the complete, unchanged JSON object to the Claim Structurer.
LLM Used: Default (Inherits from parent agent).
Tools Used: Request - Get
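
The agent performs these steps through its LLM instructions and the Request - Get tool. As a rough illustration only, the same fetch-and-validate logic could be sketched in plain Python as follows. The key spellings (claim_id, policy_number, and so on) and the function names are assumptions made for this example; the actual field names in the source documents are not specified here.

```python
import json
from urllib.request import urlopen

# Hypothetical key spellings (compared in lowercase); real source fields may differ.
CLAIM_ID_KEYS = {"claim_id", "claimid", "claimreferenceid", "claim_reference_id"}
POLICY_KEYS = {"policy_number", "policynumber", "policy_no"}


def fetch_claim(url: str, timeout: float = 10.0) -> dict:
    """GET the raw claim document and decode it as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def find_value(node, wanted_keys):
    """Depth-first search for the first non-empty value under any wanted key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key.lower() in wanted_keys and value not in (None, "", [], {}):
                return value
        for value in node.values():
            found = find_value(value, wanted_keys)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = find_value(item, wanted_keys)
            if found is not None:
                return found
    return None


def validate_claim(raw: dict) -> dict:
    """Raise if an identifier is missing; otherwise return the document unchanged."""
    missing = []
    if find_value(raw, CLAIM_ID_KEYS) is None:
        missing.append("claim reference ID")
    if find_value(raw, POLICY_KEYS) is None:
        missing.append("policy number")
    if missing:
        raise ValueError("Missing required identifier(s): " + ", ".join(missing))
    return raw
```

The search is recursive because the identifiers may sit inside nested objects. validate_claim returns the same object it received, which mirrors the requirement that the raw JSON reaches the Claim Structurer unchanged.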

2. Claim Structurer

Role: Data Normalizer
Scope: Convert the raw JSON into a clean, standardized schema for structured data storage.
Description: This agent receives the raw JSON from the extractor. It performs the critical task of parsing the (potentially messy) data and mapping it to a clean, standardized schema. This includes normalizing numerical values (e.g., removing "$") and correctly formatting nested objects. The final output is a clean, structured JSON object ready for processing.
LLM Used: Default (Inherits from parent agent).
Tools Used: None
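
The Claim Structurer does this work through the LLM rather than through code. The sketch below only illustrates the kind of transformation described: stripping currency symbols, converting amounts to numbers, and keeping nested billing data as nested JSON. The target field names come from the standardized schema listed under Supported Capabilities; the source keys (claim_id, policy_number, claimant, hospital, billing) and the set of currency symbols are assumptions for this example.

```python
import re

# Matches strings that are only an amount, e.g. "$1,250.50" or "-300".
MONEY_PATTERN = re.compile(r"^\s*(-?)\s*[$€£₹]?\s*(\d[\d,]*(?:\.\d+)?)\s*$")


def to_number(value):
    """Convert amount-like strings to int or float; return anything else unchanged."""
    if not isinstance(value, str):
        return value
    match = MONEY_PATTERN.match(value)
    if match is None:
        return value
    sign, digits = match.groups()
    number = float(sign + digits.replace(",", ""))
    return int(number) if number.is_integer() else number


def normalize_amounts(node):
    """Apply to_number to every leaf while preserving the nested structure."""
    if isinstance(node, dict):
        return {key: normalize_amounts(value) for key, value in node.items()}
    if isinstance(node, list):
        return [normalize_amounts(item) for item in node]
    return to_number(node)


def structure_claim(raw: dict) -> dict:
    """Map hypothetical source keys onto the standardized schema."""
    return {
        "ClaimReferenceID": raw.get("claim_id"),
        "PolicyNumber": raw.get("policy_number"),
        "Claimant": raw.get("claimant", {}),
        "HospitalDetails": raw.get("hospital", {}),
        # Only the billing subtree is converted, so identifiers stay as strings.
        "Financials": normalize_amounts(raw.get("billing", {})),
    }
```

Number conversion is limited to the billing subtree on purpose: applying it everywhere would turn an all-digit policy number into an integer.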

Tools Used:

Request - Get Tool: Fetches the raw insurance claim document from a remote JSON endpoint.

Note: For details on modifying the tools, refer to the Tools Library section.

Agent Workflow Behavior Summary

The Document Extractor (start node) is triggered and uses the Request - Get tool to fetch a raw JSON claim document.
It validates that the raw JSON contains a claim ID and policy number, then passes the entire object to the Claim Structurer.
The Claim Structurer (end node) parses the raw data, normalizes fields (like numbers and nested objects), and maps everything to a clean, standardized schema.
The agent outputs the final, structured JSON, ready to be used by a claims management system.
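
The hand-off order in this summary can be sketched as a small pipeline. This is an illustration under stated assumptions, not the platform's orchestration code: run_workflow and its three callable parameters are hypothetical names, and each stage is passed in as a function so the sequence can be read without any network access.

```python
from typing import Callable

Claim = dict


def run_workflow(
    fetch: Callable[[], Claim],
    validate: Callable[[Claim], Claim],
    structure: Callable[[Claim], Claim],
) -> Claim:
    """Run the start node (fetch, then validate), then the end node (structure)."""
    raw = fetch()          # Document Extractor: retrieve the raw JSON
    validate(raw)          # Document Extractor: raises if an identifier is missing
    return structure(raw)  # Claim Structurer: receives the unchanged raw object
```

Because validation runs before the hand-off, a document without a claim ID or policy number never reaches the structuring step.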

Sample Questions:

Process the new claim document from the API.
Extract and structure the data from claim 'CL-1001'.