What you’ll learn
In this tutorial, you’ll learn how to:

- Deploy a vLLM worker on Runpod Serverless.
- Configure your vLLM endpoint for OpenAI compatibility.
- Connect n8n to your Runpod endpoint.
- Test your integration with a simple workflow.
Requirements
- You’ve created a Runpod account.
- You’ve created a Runpod API key.
- You have n8n installed and running.
- (Optional) For gated models, you’ve created a Hugging Face access token.
Step 1: Deploy a vLLM worker on Runpod
First, you’ll deploy a vLLM worker to serve your language model.
Create a new vLLM endpoint
Open the Runpod console and navigate to the Serverless page. Click New Endpoint and select vLLM under Ready-to-Deploy Repos.
Configure your endpoint
For more details on vLLM deployment options, see Deploy a vLLM worker.
- Enter the model name or Hugging Face model URL (e.g., `openchat/openchat-3.5-0106`).
- Expand the Advanced section:
  - Set Max Model Length to `8192` (or an appropriate context length for your model).
  - Depending on your model, you may need to enable tool calling and set an appropriate reasoning parser.
- Click Next.
- Click Create Endpoint.
Copy your endpoint ID
Once deployed, navigate to your endpoint in the Runpod console and copy the Endpoint ID. You’ll need this to connect your endpoint to n8n.
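Before wiring up n8n, you can verify that the endpoint is live by listing the models it serves through its OpenAI-compatible API. The sketch below uses only the Python standard library; the `RUNPOD_ENDPOINT_ID` and `RUNPOD_API_KEY` environment variable names are placeholders chosen for illustration.

```python
import json
import os
import urllib.request


def openai_base_url(endpoint_id: str) -> str:
    """Build the OpenAI-compatible base URL for a Runpod Serverless endpoint."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1"


def list_models(endpoint_id: str, api_key: str) -> list[str]:
    """Return the model IDs your endpoint serves via the /models route."""
    req = urllib.request.Request(
        f"{openai_base_url(endpoint_id)}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return [m["id"] for m in json.load(resp)["data"]]


if __name__ == "__main__":
    # Placeholders: substitute your own endpoint ID and Runpod API key.
    print(list_models(os.environ["RUNPOD_ENDPOINT_ID"], os.environ["RUNPOD_API_KEY"]))
```

If the call returns your model's ID, the worker is up and ready to receive requests from n8n.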
Step 2: Connect n8n to your Runpod endpoint
Now you’ll configure n8n to use your Runpod endpoint as an OpenAI-compatible API.
Add an OpenAI Chat Model node
In your n8n workflow, add a new OpenAI Chat Model node to your canvas. Double-click the node to configure it.
Create a new credential
Click the dropdown under Credential to connect with and select Create new credential.
Add your Runpod API key
Under API Key, add your Runpod API Key. You can create an API key in the Runpod console.
Configure the base URL
Under Base URL, replace the default OpenAI URL with your Runpod endpoint URL: `https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1`. Replace `ENDPOINT_ID` with your endpoint ID from Step 1.
Save the credential
Click Save. n8n will automatically test your endpoint connection. If successful, you can start using the node in your workflow.
Step 3: Test your integration
Create a simple workflow to test your integration.
Create a test workflow
Add a Manual Trigger node and connect it to your OpenAI Chat Model node.
Configure the chat model
In the OpenAI Chat Model node, add a test message like “Hello, what can you help me with?”
Execute the workflow
Click Execute Workflow in n8n. You should see a response from your model running on Runpod.
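Under the hood, the OpenAI Chat Model node sends a standard chat completion request to your endpoint. A minimal sketch of the equivalent raw request, using only the standard library (the endpoint ID, model name, and `RUNPOD_API_KEY` environment variable are placeholders for your own values):

```python
import json
import os
import urllib.request


def chat_request(endpoint_id: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a chat completion request against the OpenAI-compatible route."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('RUNPOD_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholders: substitute your own endpoint ID and model name.
    req = chat_request(
        "your_endpoint_id",
        "openchat/openchat-3.5-0106",
        "Hello, what can you help me with?",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```

Running this from the command line should print a reply similar to what you see in the n8n node's output panel.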
Monitor requests
Monitor requests from your n8n workflow in the endpoint details page of the Runpod console.
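You can also poll your endpoint's status from a script. The sketch below assumes the `/health` route exposed by Runpod Serverless endpoints, which reports job and worker counts; the environment variable names are placeholders.

```python
import json
import os
import urllib.request


def health_url(endpoint_id: str) -> str:
    """Health route for a Runpod Serverless endpoint."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/health"


def endpoint_health(endpoint_id: str, api_key: str) -> dict:
    """Fetch job and worker status for the endpoint."""
    req = urllib.request.Request(
        health_url(endpoint_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholders: RUNPOD_ENDPOINT_ID and RUNPOD_API_KEY are assumed env vars.
    print(endpoint_health(os.environ["RUNPOD_ENDPOINT_ID"], os.environ["RUNPOD_API_KEY"]))
```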
The n8n chat feature may have trouble parsing output from vLLM depending on your model. If you experience issues, try adjusting your model’s output format or testing with a different model.
Next steps
Now that you’ve integrated Runpod with n8n, you can:

- Build complex AI-powered workflows using your Runpod endpoints.
- Explore other integration options with Runpod.
- Learn about OpenAI compatibility features in vLLM.