Learn how to integrate Runpod Serverless with CrewAI, a framework for orchestrating role-playing autonomous AI agents. By the end of this tutorial, you’ll have a vLLM endpoint running on Runpod that you can use to power your CrewAI agents.

What you’ll learn

In this tutorial, you’ll learn how to:
  • Deploy a vLLM worker on Runpod Serverless.
  • Configure your vLLM endpoint for OpenAI compatibility.
  • Connect CrewAI to your Runpod endpoint.
  • Test your integration with a simple agent.

Requirements

Before you begin, you'll need:
  • A Runpod account with an API key.
  • A CrewAI account with access to the LLM connections settings.
Step 1: Deploy a vLLM worker on Runpod

First, you’ll deploy a vLLM worker to serve your language model.
1. Create a new vLLM endpoint

Open the Runpod console and navigate to the Serverless page. Click New Endpoint and select vLLM under Ready-to-Deploy Repos.
2. Configure your endpoint

For more details on vLLM deployment options, see Deploy a vLLM worker.
In the deployment modal:
  • Enter the model name or Hugging Face model URL (e.g., openchat/openchat-3.5-0106).
  • Expand the Advanced section:
    • Set Max Model Length to 8192 (or an appropriate context length for your model).
    • You may need to enable tool calling and set an appropriate reasoning parser depending on your model.
  • Click Next.
  • Click Create Endpoint.
Your endpoint will now begin initializing. This may take several minutes while Runpod provisions resources and downloads your model. Wait until the status shows as Running.
3. Copy your endpoint ID

Once deployed, navigate to your endpoint in the Runpod console and copy the Endpoint ID. You’ll need this to connect your endpoint to CrewAI.
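Before wiring up CrewAI, you can optionally confirm the endpoint is serving requests. The sketch below sends a single chat completion using only the Python standard library; the endpoint ID, API key environment variables, and model name are placeholders for your own values, and the `/openai/v1/chat/completions` route is the OpenAI-compatible path exposed by Runpod's vLLM worker.

```python
# Optional smoke test: send one chat completion to the new endpoint.
# RUNPOD_ENDPOINT_ID and RUNPOD_API_KEY are placeholders you must supply.
import json
import os
import urllib.request

ENDPOINT_ID = os.environ.get("RUNPOD_ENDPOINT_ID", "ENDPOINT_ID")

# vLLM workers expose an OpenAI-compatible API under /openai/v1.
CHAT_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1/chat/completions"


def chat(prompt: str, api_key: str, model: str = "openchat/openchat-3.5-0106") -> str:
    """Send a single user message and return the model's reply text."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }).encode()
    req = urllib.request.Request(
        CHAT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("Say hello in one sentence.", os.environ["RUNPOD_API_KEY"]))
```

If the worker is still cold-starting, the first request may take noticeably longer while the model loads.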

Step 2: Connect CrewAI to your Runpod endpoint

Now you’ll configure CrewAI to use your Runpod endpoint as an OpenAI-compatible API.
1. Open LLM connections settings

Open the CrewAI dashboard and look for the LLM connections section.
2. Select custom OpenAI provider

Under Provider, select custom-openai-compatible from the dropdown menu.
3. Add your Runpod API key

Configure the connection with your Runpod credentials:
  • For OPENAI_API_KEY, use your Runpod API Key. You can find or create API keys in the Runpod console.
4. Configure the base URL

For OPENAI_API_BASE, enter the base URL of your vLLM worker's OpenAI-compatible API:
https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1
Replace ENDPOINT_ID with your actual endpoint ID from Step 1.
5. Test the connection

Click Fetch Available Models to test the connection. If successful, CrewAI will retrieve the list of models available on your endpoint.
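You can reproduce this check yourself: "Fetch Available Models" amounts to a GET request against the endpoint's `/models` route. The sketch below does the same with the standard library; the endpoint ID and API key environment variables are placeholders.

```python
# Sketch: list the models served by a Runpod vLLM endpoint, the same
# check CrewAI performs when fetching available models.
import json
import os
import urllib.request


def models_url(endpoint_id: str) -> str:
    """Build the OpenAI-compatible /models URL for a Runpod endpoint."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1/models"


def list_models(endpoint_id: str, api_key: str) -> list[str]:
    """Return the IDs of the models the endpoint reports."""
    req = urllib.request.Request(
        models_url(endpoint_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return [m["id"] for m in data.get("data", [])]


if __name__ == "__main__":
    # RUNPOD_ENDPOINT_ID and RUNPOD_API_KEY are placeholders you must set.
    print(list_models(os.environ["RUNPOD_ENDPOINT_ID"], os.environ["RUNPOD_API_KEY"]))
```

If this request fails, double-check that your API key is valid and that the endpoint status shows as Running.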

Step 3: Test your integration

Verify that your CrewAI agents can use your Runpod endpoint.
1. Create a test agent

Create a simple CrewAI agent that uses your Runpod endpoint for its language model.
2. Run a test task

Assign a simple task to your agent and run it to verify that it can communicate with your Runpod endpoint.
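If you're using the CrewAI Python library rather than the dashboard, the two steps above can be sketched as follows. This is a minimal sketch, not a production setup: the endpoint ID, API key environment variables, model name, and the agent's role and task text are all placeholders for your own values.

```python
# Minimal sketch: a CrewAI agent backed by a Runpod vLLM endpoint.
import os

# RUNPOD_ENDPOINT_ID and RUNPOD_API_KEY are placeholders you must supply.
RUNPOD_ENDPOINT_ID = os.environ.get("RUNPOD_ENDPOINT_ID", "ENDPOINT_ID")
RUNPOD_API_KEY = os.environ.get("RUNPOD_API_KEY", "")

# vLLM workers expose an OpenAI-compatible API under /openai/v1.
BASE_URL = f"https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/openai/v1"


def main():
    from crewai import Agent, Crew, Task, LLM

    # The "openai/" prefix tells CrewAI to treat this as an
    # OpenAI-compatible provider; the rest is the served model name.
    llm = LLM(
        model="openai/openchat/openchat-3.5-0106",
        base_url=BASE_URL,
        api_key=RUNPOD_API_KEY,
    )

    assistant = Agent(
        role="Research assistant",
        goal="Answer questions concisely",
        backstory="A helpful assistant used to smoke-test the endpoint.",
        llm=llm,
    )

    task = Task(
        description="Say hello and briefly introduce yourself.",
        expected_output="A one-sentence greeting.",
        agent=assistant,
    )

    result = Crew(agents=[assistant], tasks=[task]).kickoff()
    print(result)


if __name__ == "__main__":
    main()
```

A successful run prints the agent's reply, confirming the full round trip from CrewAI through your Runpod endpoint.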
3. Monitor requests

Monitor requests from your CrewAI agents in the endpoint details page of the Runpod console.
4. Verify responses

Confirm that your agent is receiving appropriate responses from your model running on Runpod.

Next steps

Now that you’ve integrated Runpod with CrewAI, you can: