Studio REST API v1.0.0

Scroll down for code samples, example requests, and responses.

DataChain Studio provides a REST API for programmatically managing datasets, jobs, and storage operations. All API endpoints require authentication and are scoped to specific teams.

Authorization:

All API endpoints require authentication via a Studio token, which must be included in the Authorization header. To get a token, run datachain auth token after logging in with datachain auth login, or copy one from the Tokens page in the Studio UI Settings. Then attach the token to the Authorization header in the following format:

Authorization: Bearer <token>

  • Base URL: https://studio.datachain.ai/api
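The header format above can be sketched as a pair of small helpers. This is only an illustration: the names BASE_URL, auth_headers, and endpoint_url are hypothetical and not part of any Studio client library, and the sketch assumes that the endpoint paths in this reference (which already begin with /api) are resolved against the host of the base URL.

```python
from urllib.parse import urlencode, urlsplit

BASE_URL = "https://studio.datachain.ai/api"  # base URL stated in this reference


def auth_headers(token: str) -> dict[str, str]:
    """Return request headers carrying the Studio token as a Bearer credential."""
    if not token:
        raise ValueError("a Studio token is required")
    return {"Accept": "application/json", "Authorization": f"Bearer {token}"}


def endpoint_url(path: str, **params: str) -> str:
    """Resolve an endpoint path such as /api/datachain/jobs/ against the Studio host.

    Assumption: documented paths already include the /api prefix, so only the
    scheme and host of BASE_URL are reused.
    """
    parts = urlsplit(BASE_URL)
    url = f"{parts.scheme}://{parts.netloc}{path}"
    return f"{url}?{urlencode(params)}" if params else url
```

For example, endpoint_url("/api/datachain/jobs/", team_name="TeamName") yields the full URL for listing a team's jobs.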

Default

Get Jobs

Code samples

import http.client

conn = http.client.HTTPSConnection("studio.datachain.ai")

# Replace <token> with your Studio token.
headers = {
    'Accept': "application/json",
    'Authorization': "Bearer <token>"
    }

conn.request("GET", "/api/datachain/jobs/?team_name=string", headers=headers)

res = conn.getresponse()
data = res.read()

print(data.decode("utf-8"))

curl --request GET \
  --url 'https://studio.datachain.ai/api/datachain/jobs/?team_name=string' \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <token>'

GET /api/datachain/jobs/

Retrieve a list of jobs with optional status filtering.

Requires a token with read access to JOB scope.
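A minimal sketch of working with this endpoint follows. The query parameter used for server-side status filtering is not shown in this reference, so the sketch only builds the team_name query string seen in the code samples and filters the decoded list on the client. The helper names and the sample data (including the placeholder status OTHER_STATUS) are hypothetical.

```python
import json
from urllib.parse import urlencode


def jobs_path(team_name: str) -> str:
    """Build the request path for GET /api/datachain/jobs/ for one team."""
    return "/api/datachain/jobs/?" + urlencode({"team_name": team_name})


def jobs_with_status(response_body: str, status: str) -> list[dict]:
    """Filter a decoded jobs list on the client by its "status" field."""
    return [job for job in json.loads(response_body) if job.get("status") == status]


# Illustrative data only; OTHER_STATUS is a placeholder, not a documented value.
sample = '[{"id": "a", "status": "CREATED"}, {"id": "b", "status": "OTHER_STATUS"}]'
print(jobs_path("TeamName"))
print([job["id"] for job in jobs_with_status(sample, "CREATED")])
```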

Example responses

200 Response

[
  {
    "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
    "url": "https://studio.datachain.ai/team/team_name/datasets/jobs/0502eef6-a32e-45fa-8e3b-d20ecpabbcf0",
    "status": "CREATED",
    "created_at": "2021-01-01T00:00:00Z",
    "created_by": "username",
    "finished_at": "2021-01-01T00:00:00Z",
    "query": "print('Hello, World!')",
    "query_type": "PYTHON",
    "team": "TeamName",
    "name": "QueryName",
    "workers": 1,
    "python_version": "3.12",
    "requirements": "numpy==1.24.0",
    "repository": "https://github.com/user/repo",
    "environment": {
      "ENV_NAME": "ENV_VALUE"
    },
    "exit_code": 0,
    "error_message": "Error message"
  }
]

Responses

Status  Meaning  Description  Schema
200     OK       OK           Inline

Response Schema

Status Code 200: an array of JobOutput objects.

Create Job

Code samples

import http.client

conn = http.client.HTTPSConnection("studio.datachain.ai")

payload = "{\"query\":\"print('Hello, World!')\",\"query_type\":\"PYTHON\",\"team_name\":\"TeamName\",\"environment\":\"ENV_NAME=ENV_VALUE\",\"workers\":1,\"query_name\":\"QueryName\",\"files\":[\"2\",\"3\"],\"python_version\":\"3.12\",\"requirements\":\"numpy==1.24.0\",\"repository\":\"https://github.com/user/repo\",\"priority\":1,\"compute_cluster_name\":\"ComputeClusterName\",\"compute_cluster_id\":1,\"start_after\":\"2021-01-01T00:00:00Z\",\"cron_expression\":\"0 0 * * *\",\"credentials_name\":\"CredentialsName\"}"

# Replace <token> with your Studio token.
headers = {
    'Content-Type': "application/json",
    'Accept': "application/json",
    'Authorization': "Bearer <token>"
    }

conn.request("POST", "/api/datachain/jobs/", payload, headers)

res = conn.getresponse()
data = res.read()

print(data.decode("utf-8"))

curl --request POST \
  --url https://studio.datachain.ai/api/datachain/jobs/ \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{"query":"print('\''Hello, World!'\'')","query_type":"PYTHON","team_name":"TeamName","environment":"ENV_NAME=ENV_VALUE","workers":1,"query_name":"QueryName","files":["2","3"],"python_version":"3.12","requirements":"numpy==1.24.0","repository":"https://github.com/user/repo","priority":1,"compute_cluster_name":"ComputeClusterName","compute_cluster_id":1,"start_after":"2021-01-01T00:00:00Z","cron_expression":"0 0 * * *","credentials_name":"CredentialsName"}'

POST /api/datachain/jobs/

Creates a job and returns the job metadata.

Note that compute_cluster_name and compute_cluster_id are mutually exclusive. Requires a token with write access to JOB scope.

Body parameter

{
  "query": "print('Hello, World!')",
  "query_type": "PYTHON",
  "team_name": "TeamName",
  "environment": "ENV_NAME=ENV_VALUE",
  "workers": 1,
  "query_name": "QueryName",
  "files": [
    "2",
    "3"
  ],
  "python_version": "3.12",
  "requirements": "numpy==1.24.0",
  "repository": "https://github.com/user/repo",
  "priority": 1,
  "compute_cluster_name": "ComputeClusterName",
  "compute_cluster_id": 1,
  "start_after": "2021-01-01T00:00:00Z",
  "cron_expression": "0 0 * * *",
  "credentials_name": "CredentialsName"
}
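Because compute_cluster_name and compute_cluster_id are mutually exclusive, it can help to check that rule before sending the request. The following is a sketch under assumptions: build_job_payload is a hypothetical helper, and this reference does not state which fields are required, so treating query and team_name as mandatory is a guess for illustration.

```python
import json
from typing import Optional


def build_job_payload(
    query: str,
    team_name: str,
    query_type: str = "PYTHON",
    compute_cluster_name: Optional[str] = None,
    compute_cluster_id: Optional[int] = None,
    **extra,
) -> str:
    """Serialize a job request body, rejecting conflicting cluster selectors."""
    if compute_cluster_name is not None and compute_cluster_id is not None:
        raise ValueError(
            "compute_cluster_name and compute_cluster_id are mutually exclusive"
        )
    # Extra keyword arguments (workers, requirements, ...) pass through unchanged.
    body = {"query": query, "query_type": query_type, "team_name": team_name, **extra}
    if compute_cluster_name is not None:
        body["compute_cluster_name"] = compute_cluster_name
    if compute_cluster_id is not None:
        body["compute_cluster_id"] = compute_cluster_id
    return json.dumps(body)
```

A call such as build_job_payload("print('Hello, World!')", "TeamName", workers=1, compute_cluster_id=1) returns a JSON string suitable as the POST body, while passing both cluster selectors raises ValueError before any request is made.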

Example responses

200 Response

{
  "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "url": "https://studio.datachain.ai/team/team_name/datasets/jobs/0502eef6-a32e-45fa-8e3b-d20ecpabbcf0",
  "status": "CREATED",
  "created_at": "2021-01-01T00:00:00Z",
  "created_by": "username",
  "finished_at": "2021-01-01T00:00:00Z",
  "query": "print('Hello, World!')",
  "query_type": "PYTHON",
  "team": "TeamName",
  "name": "QueryName",
  "workers": 1,
  "python_version": "3.12",
  "requirements": "numpy==1.24.0",
  "repository": "https://github.com/user/repo",
  "environment": {
    "ENV_NAME": "ENV_VALUE"
  },
  "exit_code": 0,
  "error_message": "Error message"
}

Responses

Status  Meaning  Description  Schema
200     OK       OK           JobOutput

Cancel Job

Code samples

import http.client

conn = http.client.HTTPSConnection("studio.datachain.ai")

payload = "{\"team_name\":\"TeamName\"}"

# Replace <token> with your Studio token.
headers = {
    'Content-Type': "application/json",
    'Accept': "application/json",
    'Authorization': "Bearer <token>"
    }

conn.request("POST", "/api/datachain/jobs/497f6eca-6276-4993-bfeb-53cbbbba6f08/cancel", payload, headers)

res = conn.getresponse()
data = res.read()

print(data.decode("utf-8"))

curl --request POST \
  --url https://studio.datachain.ai/api/datachain/jobs/497f6eca-6276-4993-bfeb-53cbbbba6f08/cancel \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{"team_name":"TeamName"}'

POST /api/datachain/jobs/{job_id}/cancel

Cancel a running or queued job.

Requires a token with write access to JOB scope.

Body parameter

{
  "team_name": "TeamName"
}
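The path template and body above can be combined as in the sketch below. The helper cancel_request is hypothetical, and validating the job id as a UUID is an assumption based on the id format in the example responses rather than a documented requirement.

```python
import json
import uuid


def cancel_request(job_id: str, team_name: str) -> tuple[str, str, str]:
    """Return (method, path, body) for POST /api/datachain/jobs/{job_id}/cancel."""
    # Assumption: job ids are UUID strings, as in the example responses.
    job_id = str(uuid.UUID(job_id))
    path = f"/api/datachain/jobs/{job_id}/cancel"
    return "POST", path, json.dumps({"team_name": team_name})
```

The returned tuple can be passed to conn.request together with the usual headers; a malformed id raises ValueError locally instead of producing a failed request.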

Example responses

200 Response

{
  "message": "Successfully canceled"
}

Responses

Status  Meaning  Description  Schema
200     OK       OK           ActionFeedback

Upload File

Code samples

import http.client

conn = http.client.HTTPSConnection("studio.datachain.ai")

payload = "-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"file\"\r\n\r\nstring\r\n-----011000010111000001101001--\r\n"

# Replace <token> with your Studio token.
headers = {
    'Content-Type': "multipart/form-data; boundary=---011000010111000001101001",
    'Accept': "application/json",
    'Authorization': "Bearer <token>"
    }

conn.request("POST", "/api/datachain/jobs/files?team_name=string", payload, headers)

res = conn.getresponse()
data = res.read()

print(data.decode("utf-8"))

# curl generates the multipart Content-Type header and boundary itself when
# --form is used; file=@file.txt uploads the contents of the local file.txt.
curl --request POST \
  --url 'https://studio.datachain.ai/api/datachain/jobs/files?team_name=string' \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <token>' \
  --form 'file=@file.txt'

POST /api/datachain/jobs/files

Upload a file to use with a job.

Use the file id returned by this endpoint in the files field of the job input. Requires a token with write access to JOB scope.

Body parameter

file: string
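To upload real file contents from Python, the multipart body has to be built by hand when using only the standard library. The sketch below shows one way to do that and to turn upload responses into a files list for the job input. Both helper names are hypothetical; sending a filename and an application/octet-stream part type, and converting ids to strings (as in the JobInput example), are assumptions.

```python
import uuid


def encode_multipart_file(filename: str, content: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body with a form field named "file".

    Returns the body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def files_field(upload_responses: list[dict]) -> list[str]:
    """Collect uploaded file ids for the job input's "files" list.

    Assumption: ids are sent as strings, as in the JobInput example ("2", "3").
    """
    return [str(item["id"]) for item in upload_responses]
```

The body and Content-Type value returned by encode_multipart_file replace the hard-coded payload and header in the http.client sample.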

Example responses

200 Response

{
  "id": 1,
  "filename": "file.txt",
  "size": 100,
  "state": "pending",
  "error": "Error message"
}

Responses

Status  Meaning  Description  Schema
200     OK       OK           UploadFileOutput

Get Clusters

Code samples

import http.client

conn = http.client.HTTPSConnection("studio.datachain.ai")

# Replace <token> with your Studio token.
headers = {
    'Accept': "application/json",
    'Authorization': "Bearer <token>"
    }

conn.request("GET", "/api/datachain/clusters/", headers=headers)

res = conn.getresponse()
data = res.read()

print(data.decode("utf-8"))

curl --request GET \
  --url https://studio.datachain.ai/api/datachain/clusters/ \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <token>'

GET /api/datachain/clusters/

Example responses

200 Response

[
  {
    "id": 1,
    "name": "ComputeClusterName",
    "status": "ACTIVE",
    "cloud_provider": "AWS",
    "cloud_credentials": "CredentialsName",
    "is_active": true,
    "default": true,
    "max_workers": 1
  }
]
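A response like the one above could be used to choose a cluster for a new job. The sketch below is one possible selection rule, not documented behavior: pick_cluster is a hypothetical helper that prefers the active cluster flagged as default and otherwise falls back to the first active one.

```python
import json


def pick_cluster(response_body: str) -> dict:
    """Return the active cluster flagged as default, else the first active one."""
    clusters = json.loads(response_body)
    active = [c for c in clusters if c.get("is_active")]
    if not active:
        raise LookupError("no active compute cluster available")
    return next((c for c in active if c.get("default")), active[0])
```

The id or name of the selected cluster would then presumably be supplied as compute_cluster_id or compute_cluster_name (one of the two, not both) when creating a job.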

Responses

Status  Meaning  Description  Schema
200     OK       OK           Inline

Response Schema

Status Code 200: an array of ComputeClusterOutput objects.

Schemas

JobOutput

{
  "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "url": "https://studio.datachain.ai/team/team_name/datasets/jobs/0502eef6-a32e-45fa-8e3b-d20ecpabbcf0",
  "status": "CREATED",
  "created_at": "2021-01-01T00:00:00Z",
  "created_by": "username",
  "finished_at": "2021-01-01T00:00:00Z",
  "query": "print('Hello, World!')",
  "query_type": "PYTHON",
  "team": "TeamName",
  "name": "QueryName",
  "workers": 1,
  "python_version": "3.12",
  "requirements": "numpy==1.24.0",
  "repository": "https://github.com/user/repo",
  "environment": {
    "ENV_NAME": "ENV_VALUE"
  },
  "exit_code": 0,
  "error_message": "Error message"
}
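For client code, a JobOutput object can be mapped onto a small typed record. The sketch below covers only a few of the fields shown above; the class name Job and the choice of which fields may be absent or null are assumptions, since this reference does not state nullability.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2021-01-01T00:00:00Z."""
    if value is None:
        return None
    # Replace the trailing Z so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Job:
    id: str
    status: str
    created_at: Optional[datetime]
    finished_at: Optional[datetime]
    exit_code: Optional[int]

    @classmethod
    def from_dict(cls, data: dict) -> "Job":
        """Build a Job from a decoded JobOutput dictionary."""
        return cls(
            id=data["id"],
            status=data["status"],
            created_at=parse_timestamp(data.get("created_at")),
            finished_at=parse_timestamp(data.get("finished_at")),
            exit_code=data.get("exit_code"),
        )
```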

JobInput

{
  "query": "print('Hello, World!')",
  "query_type": "PYTHON",
  "team_name": "TeamName",
  "environment": "ENV_NAME=ENV_VALUE",
  "workers": 1,
  "query_name": "QueryName",
  "files": [
    "2",
    "3"
  ],
  "python_version": "3.12",
  "requirements": "numpy==1.24.0",
  "repository": "https://github.com/user/repo",
  "priority": 1,
  "compute_cluster_name": "ComputeClusterName",
  "compute_cluster_id": 1,
  "start_after": "2021-01-01T00:00:00Z",
  "cron_expression": "0 0 * * *",
  "credentials_name": "CredentialsName"
}

ActionFeedback

{
  "message": "Successfully canceled"
}

JobCancelInput

{
  "team_name": "TeamName"
}

UploadFileOutput

{
  "id": 1,
  "filename": "file.txt",
  "size": 100,
  "state": "pending",
  "error": "Error message"
}

ComputeClusterOutput

{
  "id": 1,
  "name": "ComputeClusterName",
  "status": "ACTIVE",
  "cloud_provider": "AWS",
  "cloud_credentials": "CredentialsName",
  "is_active": true,
  "default": true,
  "max_workers": 1
}