Create, modify, and delete Serverless endpoints using the GraphQL API. For the complete schema, see the GraphQL Spec.

Quick reference

| Operation | Mutation/Query |
| --- | --- |
| Create endpoint | saveEndpoint |
| Modify endpoint | saveEndpoint (with id) |
| List endpoints | myself { endpoints { ... } } |
| Delete endpoint | deleteEndpoint |

Required fields

Endpoints require the following fields:
| Field | Type | Description |
| --- | --- | --- |
| gpuIds | String | GPU tier identifier. Options: AMPERE_16 (16GB), AMPERE_24 (24GB), ADA_24 (24GB Ada), AMPERE_48 (48GB), ADA_48_PRO (48GB Ada Pro), AMPERE_80 (80GB), ADA_80_PRO (80GB Ada Pro). |
| name | String | Endpoint name. |
| templateId | String | ID of the Serverless template to use. |
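
Checking the input locally before sending a request can catch a missing field early. The following is a minimal Python sketch, not part of the API: `check_required` is a hypothetical helper, the GPU tier set is copied from the table above and may not be exhaustive, and accepting a comma-separated list of tiers is an assumption that this page does not confirm.

```python
# Hypothetical client-side check; these names are illustrative, not part of the API.
REQUIRED_FIELDS = ("gpuIds", "name", "templateId")

# GPU tier identifiers copied from the table above (may not be exhaustive).
KNOWN_GPU_IDS = {
    "AMPERE_16", "AMPERE_24", "ADA_24", "AMPERE_48",
    "ADA_48_PRO", "AMPERE_80", "ADA_80_PRO",
}


def check_required(fields: dict) -> list:
    """Return a list of problems; an empty list means the input looks complete."""
    problems = [
        f"missing required field: {key}"
        for key in REQUIRED_FIELDS
        if not fields.get(key)
    ]
    # Assumption: several tiers may be given as a comma-separated string.
    for gpu_id in str(fields.get("gpuIds") or "").split(","):
        gpu_id = gpu_id.strip()
        if gpu_id and gpu_id not in KNOWN_GPU_IDS:
            problems.append(f"unknown GPU tier: {gpu_id}")
    return problems


if __name__ == "__main__":
    # templateId is left out on purpose to show the reported problem.
    print(check_required({"gpuIds": "AMPERE_16", "name": "My Endpoint"}))
    # prints ['missing required field: templateId']
```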

Create an endpoint

curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${YOUR_API_KEY}" \
  --data '{"query": "mutation { saveEndpoint(input: { gpuIds: \"AMPERE_16\", idleTimeout: 5, locations: \"US\", name: \"My Endpoint\", flashBootType: FLASHBOOT, scalerType: \"QUEUE_DELAY\", scalerValue: 4, templateId: \"YOUR_TEMPLATE_ID\", workersMax: 3, workersMin: 0 }) { id name gpuIds idleTimeout locations flashBootType scalerType scalerValue templateId workersMax workersMin } }"}'
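
Escaping the nested quotes in the `--data` string by hand is error-prone. As a rough sketch, the same request body could be generated in Python. `to_graphql_input` and `build_create_payload` are hypothetical helpers, and the code assumes `flashBootType` is the only enum-valued field, since it is the only one this page describes as an enum.

```python
import json

# Fields whose values are GraphQL enums and are therefore written without quotes.
# Assumption: flashBootType is the only enum among the fields used here.
ENUM_FIELDS = {"flashBootType"}

# Fields to read back from the mutation, matching the curl example above.
RETURN_FIELDS = (
    "id name gpuIds idleTimeout locations flashBootType "
    "scalerType scalerValue templateId workersMax workersMin"
)


def to_graphql_input(fields: dict) -> str:
    """Render a flat dict as a GraphQL input object literal."""
    parts = []
    for key, value in fields.items():
        if key in ENUM_FIELDS:
            rendered = str(value)         # enum: bare identifier, no quotes
        else:
            rendered = json.dumps(value)  # strings are quoted, integers stay bare
        parts.append(f"{key}: {rendered}")
    return "{ " + ", ".join(parts) + " }"


def build_create_payload(fields: dict) -> str:
    """Return the JSON request body for a saveEndpoint mutation."""
    query = (
        f"mutation {{ saveEndpoint(input: {to_graphql_input(fields)}) "
        f"{{ {RETURN_FIELDS} }} }}"
    )
    return json.dumps({"query": query})


if __name__ == "__main__":
    print(build_create_payload({
        "gpuIds": "AMPERE_16",
        "name": "My Endpoint",
        "flashBootType": "FLASHBOOT",
        "templateId": "YOUR_TEMPLATE_ID",
        "workersMax": 3,
        "workersMin": 0,
    }))
```

The printed string can be passed to curl with `--data` or sent with any HTTP client.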

Configuration options

| Field | Description |
| --- | --- |
| idleTimeout | Seconds before idle workers shut down. |
| locations | Restrict to specific regions. Leave empty for any region. |
| flashBootType | Enum value for boot optimization. Set to FLASHBOOT for faster cold starts (no quotes in GraphQL). |
| scalerType | Autoscaling strategy. Options: QUEUE_DELAY, REQUEST_COUNT. |
| scalerValue | Target value for the scaler (e.g., queue delay in seconds). |
| workersMin | Minimum active workers. Set to 0 for scale-to-zero. |
| workersMax | Maximum concurrent workers. |
| networkVolumeId | Optional network volume to mount. |

Modify an endpoint

Include the endpoint id to update an existing endpoint.
curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${YOUR_API_KEY}" \
  --data '{"query": "mutation { saveEndpoint(input: { id: \"i02xupws21hp6i\", gpuIds: \"AMPERE_16\", name: \"My Endpoint\", templateId: \"YOUR_TEMPLATE_ID\", workersMax: 5 }) { id gpuIds name templateId workersMax } }"}'
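
The example above resends gpuIds, name, and templateId together with the id. Assuming an update needs those required fields each time (this page does not say so explicitly), one way to prepare the input is to merge the changed values into the endpoint's current values. `build_update_input` is a hypothetical helper, sketched in Python:

```python
REQUIRED_FIELDS = ("gpuIds", "name", "templateId")


def build_update_input(current: dict, changes: dict) -> dict:
    """Merge changes into an existing endpoint, keeping id and required fields."""
    if not current.get("id"):
        raise ValueError("an existing endpoint id is needed to update it")
    merged = {"id": current["id"]}
    # Required fields come from the changes when given, otherwise from the
    # current endpoint (for example, as returned by the list query).
    for key in REQUIRED_FIELDS:
        merged[key] = changes[key] if key in changes else current[key]
    # Append the remaining changed fields; the id itself is never overwritten.
    for key, value in changes.items():
        merged.setdefault(key, value)
    return merged


if __name__ == "__main__":
    current = {
        "id": "i02xupws21hp6i",
        "gpuIds": "AMPERE_16",
        "name": "My Endpoint",
        "templateId": "YOUR_TEMPLATE_ID",
        "workersMax": 3,
    }
    print(build_update_input(current, {"workersMax": 5}))
```

Only the id, the required fields, and the explicitly changed fields end up in the result; other current values are left for the API to keep as they are.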

List endpoints

curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${YOUR_API_KEY}" \
  --data '{"query": "query { myself { endpoints { id name gpuIds idleTimeout locations networkVolumeId scalerType scalerValue templateId workersMax workersMin pods { desiredStatus } } serverlessDiscount { discountFactor type expirationDate } } }"}'
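
A GraphQL response nests its results under a top-level `data` object that mirrors the query, and reports failures in a top-level `errors` list. The Python sketch below reads the endpoint list out of such a response. `extract_endpoints` is a hypothetical helper, and the sample response is illustrative rather than captured from the API.

```python
import json


def extract_endpoints(response_text: str) -> list:
    """Return the list of endpoints from a GraphQL response body."""
    body = json.loads(response_text)
    # GraphQL servers report failures in a top-level "errors" list.
    if body.get("errors"):
        messages = "; ".join(
            err.get("message", "unknown error") for err in body["errors"]
        )
        raise RuntimeError(f"GraphQL request failed: {messages}")
    # The "data" object mirrors the query shape: myself -> endpoints.
    return body["data"]["myself"]["endpoints"]


# Illustrative response, not captured from the real API.
SAMPLE = json.dumps({"data": {"myself": {"endpoints": [
    {"id": "i02xupws21hp6i", "name": "My Endpoint", "workersMin": 0, "workersMax": 3},
]}}})

if __name__ == "__main__":
    for endpoint in extract_endpoints(SAMPLE):
        print(endpoint["id"], endpoint["name"], endpoint["workersMax"])
```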

Delete an endpoint

An endpoint can be deleted only when its minimum and maximum worker counts are both zero, so set workersMin and workersMax to 0 (using saveEndpoint with the endpoint id) before deleting it.
curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${YOUR_API_KEY}" \
  --data '{"query": "mutation { deleteEndpoint(id: \"i02xupws21hp6i\") }"}'
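
Deleting therefore takes two requests: scale the workers to zero, then delete. The Python sketch below builds both request bodies in order. `build_delete_sequence` is a hypothetical helper, and it assumes the scale-down call must resend gpuIds, name, and templateId, as the modify example does.

```python
import json


def build_delete_sequence(endpoint: dict) -> list:
    """Return two JSON request bodies: scale workers to zero, then delete."""
    # json.dumps quotes and escapes each string value for the query text.
    scale_down = (
        "mutation { saveEndpoint(input: { "
        f"id: {json.dumps(endpoint['id'])}, "
        f"gpuIds: {json.dumps(endpoint['gpuIds'])}, "
        f"name: {json.dumps(endpoint['name'])}, "
        f"templateId: {json.dumps(endpoint['templateId'])}, "
        "workersMin: 0, workersMax: 0 }) { id workersMin workersMax } }"
    )
    delete = f"mutation {{ deleteEndpoint(id: {json.dumps(endpoint['id'])}) }}"
    return [json.dumps({"query": scale_down}), json.dumps({"query": delete})]


if __name__ == "__main__":
    bodies = build_delete_sequence({
        "id": "i02xupws21hp6i",
        "gpuIds": "AMPERE_16",
        "name": "My Endpoint",
        "templateId": "YOUR_TEMPLATE_ID",
    })
    for body in bodies:
        print(body)
```

Send the second body only after the first request has succeeded.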