SubQL GraphQL Agent

A TypeScript/Node.js GraphQL agent for SubQuery Network that uses LangChain and OpenAI to query GraphQL endpoints intelligently with natural language.

Overview

This toolkit provides LLM agents with the ability to interact with any GraphQL API built with SubQuery SDK through natural language, automatically understanding schemas, validating queries, and executing complex GraphQL operations.

Key Features

Natural Language Interface: Ask questions about blockchain data in plain English
Automatic Schema Understanding: Agents learn PostGraphile v4 patterns and SubQuery entity schemas
Query Generation & Validation: Converts natural language to valid GraphQL queries with built-in validation
SubQuery SDK Optimized: Works with any project built using SubQuery SDK (Ethereum, Polkadot, Cosmos, etc.)

Design Philosophy

Solving the GraphQL Schema Size Problem

Traditional GraphQL agents face a fundamental challenge: the schema is often too large for the LLM's context. Introspection schemas for most GraphQL APIs run to tens of thousands of tokens, which makes them:

Too large for most commercial LLMs (exceeding context windows)
Too expensive for cost-effective query generation
Too noisy for reliable query construction (low signal-to-noise ratio)

Our Innovative Approach: Entity Schema + Rules

Instead of using raw GraphQL introspection schemas, we developed a compressed, high-density schema representation:

Entity Schema as Compressed Knowledge

Compact Format: Roughly 50-100x smaller than full introspection schemas (about 500-1,000 tokens versus 50,000+)
Domain-Specific: Contains project-specific entities and relationships
High Information Density: Only essential types, relationships, and patterns
Rule-Based: Combined with PostGraphile v4 patterns for query construction

Size Comparison

Traditional Approach:
├── Full GraphQL Introspection: ~50,000+ tokens
├── Context Window Usage: 80-95%
└── Result: Often fails or generates invalid queries

Our Approach:
├── Entity Schema: ~500-1,000 tokens
├── PostGraphile Rules: ~200-300 tokens
├── Context Window Usage: 5-10%
└── Result: Reliable, cost-effective query generation
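
The figures above can be sanity-checked with a back-of-the-envelope estimate. The following is a minimal sketch, assuming the common rule of thumb of about 4 characters per token (an approximation, not the tokenizer of any particular model). The helper name `estimateTokens` is hypothetical and not part of this package.

```typescript
// Rough token estimate: ~4 characters per token is a rule of thumb,
// not an exact tokenizer. Use a real tokenizer for precise numbers.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Upper-bound figures taken from the size comparison above.
const introspectionTokens = 50000;
const entitySchemaTokens = 1000;
const ruleTokens = 300;

// Entity schema plus rules, compared against full introspection.
const compactTotal = entitySchemaTokens + ruleTokens; // 1300
const reduction = introspectionTokens / compactTotal; // roughly 38x
```

Running `estimateTokens` over your own entity schema file gives a quick indication of whether it falls in the 500-1,000 token range quoted above.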

Benefits

Cost Effective: 10-20x lower token usage than traditional approaches
Higher Accuracy: Domain-specific knowledge reduces errors
Faster Responses: Smaller context means faster processing
Scalable: Works consistently across different LLM models

Architecture

Core Components

GraphQLSource - Connection wrapper for GraphQL endpoints with entity schema support
GraphQLToolkit - LangChain-compatible toolkit providing all GraphQL tools
GraphQL Agent Tools - Individual tools for specific GraphQL operations

Available Tools

graphql_schema_info - Get raw entity schema with PostGraphile v4 rules
graphql_query_validator_execute - Combined validation and execution tool (validates queries, then executes them if valid)

Installation

# Install dependencies
pnpm install

Configuration

Copy the environment example file:

cp .env.example .env

Edit .env and add your OpenAI API key:

OPENAI_API_KEY=your_openai_api_key_here

Environment Variables

# Required
OPENAI_API_KEY=your-openai-api-key

# Optional
LLM_MODEL=gpt-4o  # Default model
PORT=8000         # Server port (if running API server)
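
As an illustration of how these variables might be read and defaulted, here is a minimal sketch. `AgentEnv` and `loadAgentEnv` are hypothetical names, not part of this package; the defaults mirror the ones documented above (gpt-4o and port 8000).

```typescript
interface AgentEnv {
  openaiApiKey: string;
  model: string;
  port: number;
}

// Reads the documented variables from a plain key/value map.
// Taking the map as a parameter keeps the function easy to test.
function loadAgentEnv(env: { [key: string]: string | undefined }): AgentEnv {
  const openaiApiKey = env.OPENAI_API_KEY;
  if (!openaiApiKey) {
    throw new Error("OPENAI_API_KEY is required");
  }
  const port = Number(env.PORT || "8000");
  // Rejects NaN, fractions, and out-of-range values.
  if (!(port > 0 && port < 65536) || Math.floor(port) !== port) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  return { openaiApiKey, model: env.LLM_MODEL || "gpt-4o", port };
}
```

In a Node.js process this would typically be called as `loadAgentEnv(process.env)` after the .env file has been loaded.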

Usage

Basic Example

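import { runLangChainGraphQLAgent } from './src/langchain-agent.js';
// GraphqlProvider is used in the config below; this import path is an assumption (adjust to wherever the enum is exported)
import { GraphqlProvider } from './src/types.js';
import type { GraphQLProjectConfig } from './src/types.js';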

// Load entity schema (learn more: https://subquery.network/doc/indexer/build/graphql.html)
// This example uses SubQuery Network's schema - replace with your own project's schema
const entitySchema = `
type Indexer implements Entity {
  id: ID!
  ownerId: String!
  active: Boolean!
  rewards: [Reward!]!
  projects: [Project!]!
}

type Project implements Entity {
  id: ID!
  owner: String!
  metadata: String!
}
`;

const config: GraphQLProjectConfig = {
  cid: "subquery-network",
  schemaContent: entitySchema,
  nodeType: GraphqlProvider.SUBQL,
  updatedAt: new Date().toISOString(),
  domainName: "SubQuery Network",
  domainCapabilities: [
    "Indexer information and performance metrics",
    "Project registration and metadata",
    "Staking rewards and delegation data",
    "Network statistics and era information"
  ],
  declineMessage: "I'm specialized in SubQuery Network data queries. I can help you with indexers, projects, staking rewards, and network statistics, but I cannot assist with cooking. Please ask me about SubQuery Network data instead."
};

// Note: This example uses SubQuery Network's API - replace with your own project's endpoint
const endpoint = "https://index-api.onfinality.io/sq/subquery/subquery-mainnet";

// Query with natural language
const answer = await runLangChainGraphQLAgent(
  config,
  "Show me the top 3 indexers with their project information",
  endpoint
);

console.log(answer);

Example Natural Language Queries

Note: These examples are for the SubQuery Network demo. For your own project, the queries would be specific to your indexed blockchain data.

Basic Data Retrieval

"Show me the first 5 indexers and their IDs"
"What projects are available? Show me their owners"
"List all indexers with their project information"

Staking & Rewards

"What are my staking rewards for wallet 0x123...?"
"Show me rewards for the last era"
"Find delegations for a specific indexer"

Performance & Analytics

"Which indexers have the highest rewards?"
"Show me project performance metrics"
"List top performing indexers by era"

PostGraphile v4 Query Patterns

The agent understands PostGraphile v4 patterns automatically:

Entity Queries

Single: entityName(id: ID!) -> Full entity object
Collection: entityNames(first: Int, filter: EntityFilter) -> Connection with pagination

Filtering

filter: {
  fieldName: { equalTo: "value" }
  amount: { greaterThan: 100 }
  status: { in: ["active", "pending"] }
}
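
When filters are assembled programmatically, they need to be written as GraphQL input literals (unquoted keys) rather than JSON. The following is a minimal sketch of such a serializer; `FilterValue` and `toGraphQLInput` are hypothetical names and not part of this package. It treats every string as a string literal, so enum values would need separate handling.

```typescript
type FilterValue =
  | string | number | boolean | null
  | FilterValue[]
  | { [key: string]: FilterValue };

// Serializes a plain object into a GraphQL input literal:
// keys stay unquoted, strings are JSON-escaped, arrays and objects recurse.
function toGraphQLInput(value: FilterValue): string {
  if (value === null) return "null";
  if (typeof value === "string") return JSON.stringify(value);
  if (typeof value === "number" || typeof value === "boolean") return String(value);
  if (Array.isArray(value)) {
    return "[" + value.map((item) => toGraphQLInput(item)).join(", ") + "]";
  }
  const obj = value; // only the object case remains here
  const fields = Object.keys(obj).map((key) => key + ": " + toGraphQLInput(obj[key]));
  return "{ " + fields.join(", ") + " }";
}

const filterLiteral = toGraphQLInput({
  status: { in: ["active", "pending"] },
  amount: { greaterThan: 100 },
});
// filterLiteral is:
// { status: { in: ["active", "pending"] }, amount: { greaterThan: 100 } }
```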

Ordering

orderBy: [FIELD_NAME_ASC, CREATED_AT_DESC]

Pagination

{
  entities(first: 10, after: "cursor") {
    nodes { id, field }
    pageInfo { hasNextPage, endCursor }
  }
}
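
The pagination pattern above can be generated from a few parameters. This is a minimal sketch; `CollectionQueryOptions` and `buildCollectionQuery` are hypothetical names, not part of this package, and the function only covers `first`, `after`, and `orderBy`.

```typescript
interface CollectionQueryOptions {
  collection: string;   // plural field name, e.g. "indexers"
  fields: string[];     // fields to select on each node
  first: number;        // page size
  after?: string;       // cursor taken from a previous page's endCursor
  orderBy?: string[];   // enum values such as "ID_ASC" (not quoted)
}

// Builds a connection query that always requests pageInfo,
// so the caller can continue from endCursor.
function buildCollectionQuery(opts: CollectionQueryOptions): string {
  const args = ["first: " + opts.first];
  if (opts.after !== undefined) {
    args.push("after: " + JSON.stringify(opts.after));
  }
  if (opts.orderBy && opts.orderBy.length > 0) {
    args.push("orderBy: [" + opts.orderBy.join(", ") + "]");
  }
  return [
    "{",
    "  " + opts.collection + "(" + args.join(", ") + ") {",
    "    nodes { " + opts.fields.join(" ") + " }",
    "    pageInfo { hasNextPage endCursor }",
    "  }",
    "}",
  ].join("\n");
}
```

To fetch the next page, pass the previous response's `pageInfo.endCursor` as `after` and stop when `hasNextPage` is false.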

Agent Workflow

The agent follows this intelligent workflow:

Relevance Check: Determines if the question relates to the project data
Schema Analysis: Loads entity schema and PostGraphile rules (once per session)
Query Construction: Builds GraphQL queries using PostGraphile patterns
Validation: Validates queries against the live GraphQL schema
Execution: Executes validated queries to get real data
Summarization: Provides user-friendly responses based on actual results
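
The steps above can be sketched as a single linear function. This is a simplified illustration, not the actual agent implementation: `WorkflowSteps` and `answerQuestion` are hypothetical names, each step is injected as a function, and a real agent may loop back to query construction when validation fails rather than returning immediately.

```typescript
// Hypothetical step functions. In a real agent each would call the LLM
// or the GraphQL endpoint; injecting them keeps the flow testable.
interface WorkflowSteps {
  isRelevant(question: string): Promise<boolean>;
  loadSchema(): Promise<string>;
  buildQuery(question: string, schema: string): Promise<string>;
  validate(query: string): Promise<string[]>; // error messages, empty if valid
  execute(query: string): Promise<unknown>;
  summarize(question: string, data: unknown): Promise<string>;
}

async function answerQuestion(
  question: string,
  steps: WorkflowSteps,
  declineMessage: string
): Promise<string> {
  // 1. Relevance check: decline questions outside the project's domain.
  if (!(await steps.isRelevant(question))) return declineMessage;
  // 2-3. Load the entity schema, then construct a query from it.
  const schema = await steps.loadSchema();
  const query = await steps.buildQuery(question, schema);
  // 4. Validate before touching the endpoint.
  const errors = await steps.validate(query);
  if (errors.length > 0) {
    return "Could not build a valid query: " + errors.join("; ");
  }
  // 5-6. Execute and summarize the real result.
  const data = await steps.execute(query);
  return steps.summarize(question, data);
}
```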

Development

# Development mode with auto-reload
pnpm dev

# Build the project
pnpm build

# Run tests
pnpm test

# Type checking
pnpm typecheck

Scripts

dev - Run in development mode with file watching
build - Compile TypeScript to JavaScript
build:watch - Compile with file watching
start - Run the compiled application
test - Run tests
test:watch - Run tests in watch mode
typecheck - Run TypeScript type checking
lint - Run ESLint (if configured)
clean - Clean build output

Project Structure

sq-graphql-agent/
├── src/                    # Source code
│   ├── langchain-agent.ts  # Main agent implementation
│   ├── tools.ts            # GraphQL tools
│   ├── types.ts            # TypeScript type definitions
│   └── utils.ts            # Utility functions
├── examples/               # Usage examples
├── tests/                  # Test files
├── package.json            # Dependencies and scripts
└── tsconfig.json           # TypeScript configuration

Dependencies

Runtime Dependencies

@langchain/community - LangChain community tools
@langchain/core - LangChain core utilities
@langchain/langgraph - LangGraph for agent workflows
@langchain/openai - OpenAI integration
graphql - GraphQL query library
openai - OpenAI API client
pino - Fast JSON logger
zod - Schema validation

Development Dependencies

typescript - TypeScript compiler
tsx - TypeScript execution tool
jest - Testing framework
ts-jest - Jest TypeScript preset
@types/node - Node.js type definitions

Error Handling

The toolkit includes comprehensive error handling:

Network Issues

GraphQL endpoint connectivity problems
Timeout handling for long-running queries
Automatic retry for transient failures
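
One common way to implement retries for transient failures is exponential backoff. The following is a generic sketch under that assumption; it does not describe this toolkit's actual retry policy, and `RetryOptions`, `sleep`, and `withRetry` are hypothetical names.

```typescript
interface RetryOptions {
  retries: number;      // extra attempts allowed after the first failure
  baseDelayMs: number;  // delay before the first retry; doubles each time
  isTransient: (error: unknown) => boolean;
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => { setTimeout(resolve, ms); });
}

// Runs `task`, retrying only errors that `isTransient` accepts.
// Non-transient errors and exhausted retries are rethrown unchanged.
async function withRetry<T>(task: () => Promise<T>, opts: RetryOptions): Promise<T> {
  let attempt = 0;
  for (;;) {
    try {
      return await task();
    } catch (error) {
      if (attempt >= opts.retries || !opts.isTransient(error)) throw error;
      await sleep(opts.baseDelayMs * Math.pow(2, attempt));
      attempt += 1;
    }
  }
}
```

The `isTransient` predicate is where connection resets and timeouts would be separated from permanent errors such as invalid queries, which should not be retried.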

Schema Introspection Issues

Authorization error detection (e.g., missing headers)
Invalid endpoint error handling
Prevents caching of failed introspection results
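
To illustrate how introspection failures might be told apart and kept out of the cache, here is a minimal sketch based on HTTP status codes. The names `IntrospectionFailure`, `classifyStatus`, and `shouldCache` are hypothetical, and the status mapping is an assumption. GraphQL servers can also report errors in a 200 response body, which this sketch does not inspect.

```typescript
type IntrospectionFailure = "unauthorized" | "invalid_endpoint" | "transient" | "unknown";

// Maps an HTTP status code to a coarse failure category (null means success).
function classifyStatus(status: number): IntrospectionFailure | null {
  if (status >= 200 && status < 300) return null;
  if (status === 401 || status === 403) return "unauthorized"; // e.g. missing headers
  if (status === 404) return "invalid_endpoint";
  if (status === 408 || status === 429 || status >= 500) return "transient";
  return "unknown";
}

interface IntrospectionResult {
  status: number;
  schemaText?: string;
}

// Only successful results that actually carry a schema are cached,
// so a failed introspection is retried on the next request.
function shouldCache(result: IntrospectionResult): boolean {
  return classifyStatus(result.status) === null && !!result.schemaText;
}
```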

Query Issues

Invalid GraphQL syntax detection
Schema validation with detailed error messages
Field existence verification

Performance Considerations

Query Optimization

Always use pagination (first: N) for collection queries
Limit nested relationship depth to avoid expensive queries
Use specific field selection rather than querying all fields
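
A depth limit can be checked before a query is sent. The following is a rough sketch that scans the query text rather than parsing it; `maxSelectionDepth` is a hypothetical helper, not part of this package. It ignores braces inside argument lists and string literals but does not handle `#` comments.

```typescript
// Returns the deepest selection-set nesting in a query string.
// Braces inside parentheses (argument objects such as filters) and
// inside string literals are skipped. A character scan, not a parser.
function maxSelectionDepth(query: string): number {
  let depth = 0;
  let maxDepth = 0;
  let parens = 0;
  let inString = false;
  for (let i = 0; i < query.length; i++) {
    const ch = query.charAt(i);
    if (inString) {
      if (ch === "\\") i++;                 // skip the escaped character
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "(") parens++;
    else if (ch === ")") parens--;
    else if (parens === 0 && ch === "{") {
      depth++;
      if (depth > maxDepth) maxDepth = depth;
    } else if (parens === 0 && ch === "}") {
      depth--;
    }
  }
  return maxDepth;
}
```

A caller could reject or rewrite any generated query whose depth exceeds a chosen threshold before it reaches the endpoint.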

Caching Strategy

GraphQL schema introspection results are cached (configurable TTL)
Entity schema is loaded once per toolkit instance
No query result caching (always fresh data)
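
A time-to-live cache of the kind described for introspection results can be sketched as follows. This is an illustration only: `CacheEntry` and `TtlCache` are hypothetical names, and the toolkit's actual cache and its configuration options may differ.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<T> {
  private entries: { [key: string]: CacheEntry<T> } = {};
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if missing or expired.
  get(key: string): T | undefined {
    const entry = this.entries[key];
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      delete this.entries[key];
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries[key] = { value, expiresAt: this.now() + this.ttlMs };
  }
}
```

Keying the cache by endpoint URL would let one instance serve several projects, while query results bypass it entirely, in line with the list above.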

Resource Management

Connection pooling for HTTP requests
Automatic cleanup of resources
Memory-efficient schema processing

Model Performance Comparison

The table below summarizes how different LLMs performed with this GraphQL agent in testing:

Model	Performance	Query Accuracy	Complex Reasoning	Cost Efficiency	Recommendation
Gemini-3-flash(openrouter)	Excellent	Excellent	Excellent	Good	Recommended
GLM-4.6	Very Good	Very Good	Excellent	Excellent	Cost-Effective
Kimi-k2	Good	Fair	Fair	Very Good	Cost-Effective

Recommendation Guidelines

For Production Use:

export LLM_MODEL="gpt-4o"  # Best reliability and accuracy

For Cost-Conscious Production:

export LLM_MODEL="deepseek-v3"  # Excellent value proposition

For Development/Testing:

export LLM_MODEL="gpt-4.1-mini"  # Good balance for non-production

Comparison with Alternatives

Feature	SubQL GraphQL Agent	LangChain GraphQL	LangChain SQL Agents
Schema Size Handling	Entity compression (500 tokens)	Full schema per query (50k+ tokens)	Table schemas (compact)
Domain Flexibility	SubQuery SDK only	Any GraphQL API	Any SQL database
Schema Learning	Learns once, reasons multiple queries	Requires schema in every query	Learns schema structure
Natural Language	Full support	Limited by context size	Full support
Query Construction	PostGraphile rules	Pattern matching only	Mature SQL generation
Cost Efficiency	Low token usage	Very high token usage	Very efficient
Security & Access	API-only, no DB access	API-only, no DB access	Requires DB credentials
Setup Complexity	Simple for SubQuery	Schema management overhead	DB access + permissions

License

Apache-2.0
