Category: AI/ML · Authentication: Bearer Token

Replicate REST API

Run and deploy AI models via REST API in seconds

Replicate is a cloud platform that lets developers run machine learning models through a REST API without managing infrastructure. You can deploy open-source or custom models with automatic scaling, pay-per-use pricing, and production-ready inference endpoints. Developers use it to add AI capabilities such as image generation, language models, and speech synthesis to their applications.

Base URL https://api.replicate.com/v1

API Endpoints

Method	Endpoint	Description
GET	`/models`	List all public models available on Replicate
GET	`/models/{model_owner}/{model_name}`	Get details about a specific model including versions and schema
GET	`/models/{model_owner}/{model_name}/versions`	List all versions of a specific model
GET	`/models/{model_owner}/{model_name}/versions/{version_id}`	Get details about a specific model version
POST	`/predictions`	Create a new prediction (run a model); returns a prediction object whose status and output you then retrieve by ID
GET	`/predictions/{prediction_id}`	Get the status and output of a prediction
POST	`/predictions/{prediction_id}/cancel`	Cancel an in-progress prediction
GET	`/predictions`	List all predictions for your account
POST	`/deployments/{deployment_owner}/{deployment_name}/predictions`	Create a prediction using a private deployment
GET	`/deployments`	List all deployments in your account
GET	`/deployments/{deployment_owner}/{deployment_name}`	Get details about a specific deployment
POST	`/trainings`	Start a model training job with custom data
GET	`/trainings/{training_id}`	Get the status and results of a training job
POST	`/trainings/{training_id}/cancel`	Cancel an in-progress training job
GET	`/collections/{collection_slug}`	Get a curated collection of models by category
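
The list endpoints in the table (for example `GET /models` and `GET /predictions`) can return more items than fit in one response. The following is a minimal Python sketch of walking such a list. It assumes each response is a JSON object with a `results` array and a `next` URL that is null on the last page; that response shape, the helper names `iter_results`, `http_get_json`, and `demo`, and the `max_pages` cap are illustrative assumptions rather than details taken from this page.

```python
import json
import os
import urllib.request
from typing import Callable, Iterator, Optional

BASE_URL = "https://api.replicate.com/v1"


def http_get_json(url: str) -> dict:
    """GET a URL with the bearer token from REPLICATE_API_TOKEN and parse the JSON body."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_results(
    get_json: Callable[[str], dict], path: str, max_pages: int = 10
) -> Iterator[dict]:
    """Yield items from a list endpoint, following `next` links (assumed response shape)."""
    url: Optional[str] = f"{BASE_URL}{path}"
    pages_read = 0
    while url and pages_read < max_pages:
        page = get_json(url)
        yield from page.get("results", [])
        url = page.get("next")  # assumed to be None/null on the last page
        pages_read += 1


def demo() -> None:
    """Print owner/name for the first pages of public models (needs a real token)."""
    for model in iter_results(http_get_json, "/models", max_pages=2):
        print(model.get("owner"), model.get("name"))
```

Passing the fetch function in as an argument keeps the pagination logic independent of the HTTP library, so it can be exercised with canned pages before pointing it at the live API with `demo()`.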

Code Examples

# cURL: create a prediction
curl -s -X POST \
  https://api.replicate.com/v1/predictions \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
    "input": {
      "prompt": "A serene lake at sunset with mountains",
      "num_inference_steps": 50
    }
  }'

// JavaScript (Node.js 18+, ES module): create a prediction, then poll for the result
const response = await fetch('https://api.replicate.com/v1/predictions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.REPLICATE_API_TOKEN}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    version: 'stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b',
    input: {
      prompt: 'A serene lake at sunset with mountains',
      num_inference_steps: 50
    }
  })
});

// Declared with `let` because the polling loop reassigns it
let prediction = await response.json();
console.log(prediction);

// Poll until the prediction reaches a terminal state
while (!['succeeded', 'failed', 'canceled'].includes(prediction.status)) {
  await new Promise(resolve => setTimeout(resolve, 1000));
  const statusResponse = await fetch(
    `https://api.replicate.com/v1/predictions/${prediction.id}`,
    {
      headers: {
        'Authorization': `Bearer ${process.env.REPLICATE_API_TOKEN}`
      }
    }
  );
  prediction = await statusResponse.json();
}

console.log('Output:', prediction.output);

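# Python: official client library (pip install replicate)
import replicate
import os

# The client reads REPLICATE_API_TOKEN from the environment.
# Prefer exporting it in your shell; the placeholder is only used if it is unset.
os.environ.setdefault('REPLICATE_API_TOKEN', 'r8_your_token_here')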

# Run a model
output = replicate.run(
    "stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
    input={
        "prompt": "A serene lake at sunset with mountains",
        "num_inference_steps": 50
    }
)

print(output)

# Or use the API directly
import requests

headers = {
    'Authorization': f'Bearer {os.environ["REPLICATE_API_TOKEN"]}',
    'Content-Type': 'application/json'
}

data = {
    'version': 'stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b',
    'input': {
        'prompt': 'A serene lake at sunset with mountains',
        'num_inference_steps': 50
    }
}

response = requests.post(
    'https://api.replicate.com/v1/predictions',
    headers=headers,
    json=data
)

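response.raise_for_status()  # surface HTTP errors (e.g. an invalid token) early

# The response is a prediction object; fetch /predictions/{prediction_id} until it finishes
prediction = response.json()
print(prediction)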
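
The direct-API Python example stops once the prediction is created, while the JavaScript example goes on to poll for the output. A minimal sketch of the same polling step in Python follows, using `GET /predictions/{prediction_id}` from the endpoint table. The helper names `fetch_prediction` and `wait_for_prediction`, the one-second interval, and the five-minute timeout are illustrative choices, not values prescribed by Replicate.

```python
import json
import os
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.replicate.com/v1"

# A prediction stops changing once it reaches one of these states.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def fetch_prediction(prediction_id: str) -> dict:
    """GET /predictions/{prediction_id} using the token in REPLICATE_API_TOKEN."""
    request = urllib.request.Request(
        f"{BASE_URL}/predictions/{prediction_id}",
        headers={"Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_prediction(
    prediction: dict,
    fetch: Callable[[str], dict] = fetch_prediction,
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Re-fetch the prediction until it is terminal; raise TimeoutError if it takes too long."""
    deadline = time.monotonic() + timeout
    while prediction["status"] not in TERMINAL_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {prediction['id']} did not finish in time")
        sleep(interval)
        prediction = fetch(prediction["id"])
    return prediction
```

With the `prediction` dictionary returned by `requests.post(...)` above, `wait_for_prediction(prediction)` returns the final object; check that its `status` is `succeeded` before reading `output`. The timeout guards against looping forever if a prediction stalls.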

Use Replicate from Claude / Cursor / ChatGPT

Get a hosted MCP endpoint for Replicate. Paste your Replicate API key, copy back one URL, drop it into Claude Desktop, Cursor, or any AI client that supports remote MCP. Your AI calls Replicate directly with your credentials — no local install, works on mobile.

run_replicate_model: Execute any Replicate model with the specified inputs and return the prediction results. Supports image generation, text generation, audio synthesis, and other AI tasks.

get_prediction_status: Check the status and retrieve the output of a running or completed prediction by ID. Useful for monitoring long-running model executions.

search_models: Search and filter available Replicate models by category, task type, or keywords to find suitable models for a specific use case.

train_custom_model: Start a training job to fine-tune a model on a custom dataset, producing a model tailored to a specific domain or style.

list_model_versions: Retrieve all available versions of a model to compare them and select the best one for a task.
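
To relate these tools to the REST endpoints listed earlier, here is a hedged Python sketch of one plausible tool-to-endpoint mapping. The mapping is an assumption made for illustration: this page does not document how the hosted MCP server implements each tool, and `TOOL_ENDPOINTS` and `resolve_tool` are hypothetical names. In particular, the endpoint table has no dedicated search route, so `search_models` is mapped to `GET /models` on the assumption that filtering happens elsewhere.

```python
from typing import Dict, Tuple

BASE_URL = "https://api.replicate.com/v1"

# Hypothetical mapping from MCP tool name to (HTTP method, endpoint template).
# The endpoint templates come from the endpoint table; the pairing is a guess.
TOOL_ENDPOINTS: Dict[str, Tuple[str, str]] = {
    "run_replicate_model": ("POST", "/predictions"),
    "get_prediction_status": ("GET", "/predictions/{prediction_id}"),
    "search_models": ("GET", "/models"),  # assumption: no search route in the table
    "train_custom_model": ("POST", "/trainings"),
    "list_model_versions": ("GET", "/models/{model_owner}/{model_name}/versions"),
}


def resolve_tool(tool: str, **path_params: str) -> Tuple[str, str]:
    """Return the HTTP method and full URL that a tool call would plausibly hit."""
    method, template = TOOL_ENDPOINTS[tool]
    return method, BASE_URL + template.format(**path_params)
```

For example, `resolve_tool("list_model_versions", model_owner="stability-ai", model_name="sdxl")` yields a `GET` request against `/models/stability-ai/sdxl/versions`.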

Connect in 60 seconds

Paste your Replicate key → get an MCP URL → paste into Claude/Cursor. Hosted by IOX, encrypted at rest.
