Runpod can be integrated with any system that supports custom endpoint configuration. Integration is straightforward: any library or framework that accepts a custom base URL for API calls will work with Runpod without specialized adapters or connectors. This means you can use Runpod with tools like n8n, CrewAI, LangChain, and many others by simply pointing them to your Runpod endpoint URL.

Deployment options

Runpod offers four deployment options for endpoint integrations:

Public Endpoints

Public Endpoints are pre-deployed AI models that you can use without setting up your own Serverless endpoint. They’re vLLM-compatible and return OpenAI-compatible responses, so you can get started quickly or test things out without deploying infrastructure. The following Public Endpoint URLs are available for OpenAI-compatible models:
# Public Endpoint for Qwen3 32B AWQ
https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1

# Public Endpoint for IBM Granite 4.0 H Small
https://api.runpod.ai/v2/granite-4-0-h-small/openai/v1
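Because the responses are OpenAI-compatible, you can point the official openai Python client directly at a Public Endpoint. The sketch below assumes your Runpod API key is stored in a RUNPOD_API_KEY environment variable, and the model identifier passed to the client is an assumption; check the endpoint's details in the Runpod console for the exact value it expects.

# Minimal sketch: chat completion against the Qwen3 32B AWQ Public Endpoint.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1",
    api_key=os.environ["RUNPOD_API_KEY"],  # your Runpod API key
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-32B-AWQ",  # assumed model identifier; confirm in the console
    messages=[{"role": "user", "content": "Summarize what a Serverless endpoint is."}],
)
print(response.choices[0].message.content)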

vLLM workers

vLLM workers provide an inference engine that returns OpenAI-compatible responses, making them ideal for tools that expect OpenAI’s API format. When you deploy a vLLM endpoint, access it using the OpenAI-compatible API at:
https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1
Where ENDPOINT_ID is your Serverless endpoint ID.
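The same openai client pattern works against a vLLM worker once you substitute your endpoint ID into the base URL. The sketch below also enables streaming; the endpoint ID and model name are placeholders, so use the model your endpoint actually serves.

# Minimal sketch: streaming chat completion against a vLLM worker.
import os
from openai import OpenAI

ENDPOINT_ID = "your_endpoint_id"  # replace with your Serverless endpoint ID

client = OpenAI(
    base_url=f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1",
    api_key=os.environ["RUNPOD_API_KEY"],
)

stream = client.chat.completions.create(
    model="Qwen/Qwen3-32B-AWQ",  # assumed: the model configured on your endpoint
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    stream=True,
)
for chunk in stream:
    # Print tokens as they arrive; some chunks may carry no content.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)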

SGLang workers

SGLang workers also return OpenAI-compatible responses, offering optimized performance for certain model types and use cases.

Load balancing endpoints

Load balancing endpoints let you create custom endpoints where you define your own inputs and outputs. This gives you complete control over the API contract and is ideal when you need custom behavior beyond standard inference patterns.
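Because the API contract is yours, there is no fixed request shape. The following is a hypothetical sketch using the requests library against an invented /generate route; the URL, route, payload fields, and response shape all stand in for whatever schema your own worker defines, so substitute the values from your deployment.

# Hypothetical sketch: calling a custom route on a load balancing endpoint.
import os
import requests

ENDPOINT_URL = "https://your-load-balancing-endpoint-url"  # from your endpoint details
headers = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

payload = {"prompt": "Classify this ticket as bug or feature request."}  # your own schema
resp = requests.post(f"{ENDPOINT_URL}/generate", json=payload, headers=headers, timeout=60)
resp.raise_for_status()
print(resp.json())  # the response shape is whatever your worker returns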

Model configuration for compatibility

Some models require specific vLLM environment variables to work with external tools and frameworks. You may need to set a custom chat template or tool call parser to ensure your model returns responses in the format your integration expects. For example, you can configure the Qwen/qwen3-32b-awq model for OpenAI compatibility by adding these environment variables in your vLLM endpoint settings:
ENABLE_AUTO_TOOL_CHOICE=true
REASONING_PARSER=qwen3
TOOL_CALL_PARSER=hermes
These settings enable automatic tool choice selection and set the right parsers for the Qwen3 model to work with tools that expect OpenAI-formatted responses. For more information about tool calling configuration and available parsers, see the vLLM tool calling documentation.
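To confirm the parsers are working, you can send a tool-calling request and check that the response contains structured tool_calls rather than raw text. This sketch assumes the openai client, a placeholder endpoint ID, and a hypothetical get_weather tool defined only for testing.

# Minimal sketch: verifying tool calling on a configured vLLM endpoint.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",  # your vLLM endpoint
    api_key=os.environ["RUNPOD_API_KEY"],
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for testing
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="Qwen/Qwen3-32B-AWQ",  # assumed: the model your endpoint serves
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)
# With the parsers configured, tool calls come back in OpenAI's structured format.
print(response.choices[0].message.tool_calls)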

Integration tutorials

Step-by-step tutorials are available for integrating Runpod with popular tools such as n8n, CrewAI, and LangChain.

Compatible frameworks

The same integration pattern works with any framework that supports custom OpenAI-compatible endpoints, including:
  • CrewAI: A framework for orchestrating role-playing autonomous AI agents.
  • LangChain: A framework for developing applications powered by language models.
  • AutoGen: Microsoft’s framework for building multi-agent conversational systems.
  • Haystack: An end-to-end framework for building search systems and question answering.
  • n8n: A workflow automation tool with AI integration capabilities.
Configure these frameworks to use your Runpod endpoint URL as the base URL, and provide your Runpod API key for authentication.
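As one concrete example, LangChain’s ChatOpenAI wrapper accepts a custom base URL. The sketch below assumes the langchain-openai package is installed and uses placeholder endpoint and model values.

# Minimal sketch: pointing LangChain at a Runpod OpenAI-compatible endpoint.
import os
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",  # your endpoint's OpenAI route
    api_key=os.environ["RUNPOD_API_KEY"],
    model="Qwen/Qwen3-32B-AWQ",  # assumed: the model your endpoint serves
)

print(llm.invoke("Give me three names for a GPU-themed coffee shop.").content)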

Third-party integrations

For infrastructure management and orchestration, Runpod integrates with:
  • dstack: Simplified Pod orchestration for AI/ML workloads.
  • SkyPilot: Multi-cloud execution framework.
  • Mods: AI-powered command-line tool.