
Getting Started

Raku Sight lets you upload a labeled dataset to train your own YOLO detection model in the cloud. Once training is complete, you get a dedicated inference server with both YOLO and LLM capabilities built in — call it from any application via a simple REST API.

Raku Sight login page


How it works

graph LR
  A[Create project] --> B[Upload dataset];
  B --> C[Train model in cloud];
  C --> D[Copy API key];
  D --> E[Call inference API];
  E --> F[Get detection results];

The platform handles dataset storage (R2), GPU training (Northflank), and versioned model deployment. You only interact with the dashboard and the API.


Step 1 — Create an account

Go to https://sight.raku.so/ and sign in with your operator email and password.

Need access?

Contact your workspace admin to get an operator account created. Self-signup is not currently available.


Step 2 — Create a new project

After logging in you land on the Projects dashboard. Click + New project in the top-right corner.

Projects dashboard

Give your project a name and description, then confirm. Each project has its own datasets, models, and API key.


Step 3 — Go to the Model page

Inside your project, navigate to the Models tab in the left sidebar.

Model page

This page lists all trained model versions for the project and their deployment status.


Step 4 — Upload a labeled dataset and start training

On the Models page, click Upload dataset. Select your labeled dataset archive and click Upload and start training.

Upload sidebar

Dataset format

Raku Sight expects YOLO-format annotations. Each image should have a corresponding .txt label file. Zip the entire dataset before uploading.
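
Before zipping, it can help to check the dataset locally. The sketch below is illustrative, not part of Raku Sight: it assumes the common YOLO detection label layout (`class x_center y_center width height`, coordinates normalized to 0..1) and assumes each label file sits next to its image with the same file stem. The function names (`is_valid_yolo_line`, `find_label_problems`, `zip_dataset`) are hypothetical; adjust the paths if your labels live in a separate folder.

```python
import zipfile
from pathlib import Path


def is_valid_yolo_line(line: str) -> bool:
    """Check one label line of the form: class x_center y_center width height."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        class_id = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    # Class ids are non-negative; coordinates are normalized to [0, 1].
    return class_id >= 0 and all(0.0 <= c <= 1.0 for c in coords)


def find_label_problems(dataset_dir: Path, image_exts=(".jpg", ".jpeg", ".png")) -> list[str]:
    """List images without a sibling .txt file and label lines that fail the check."""
    problems = []
    for image in sorted(dataset_dir.rglob("*")):
        if image.suffix.lower() not in image_exts:
            continue
        # Assumption: the label file sits beside the image, same stem, .txt suffix.
        label = image.with_suffix(".txt")
        if not label.exists():
            problems.append(f"missing label: {label}")
            continue
        for number, line in enumerate(label.read_text().splitlines(), start=1):
            if line.strip() and not is_valid_yolo_line(line):
                problems.append(f"bad line {number} in {label}")
    return problems


def zip_dataset(dataset_dir: Path, archive_path: Path) -> Path:
    """Zip every file under dataset_dir, storing paths relative to that folder."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(dataset_dir.rglob("*")):
            # Skip the archive itself in case it is written inside dataset_dir.
            if path.is_file() and path.resolve() != archive_path.resolve():
                archive.write(path, path.relative_to(dataset_dir).as_posix())
    return archive_path
```

A typical run would call `find_label_problems(Path("my_dataset"))`, fix anything it reports, and then call `zip_dataset(Path("my_dataset"), Path("my_dataset.zip"))` to produce the archive to upload.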

Training runs on Northflank cloud infrastructure. You can monitor progress on the Models page: the status changes from Training to Deployed when the model is ready.


Step 5 — Copy your API key

Navigate to the API Keys tab in your project sidebar.

API key page

Click Copy next to your key. This key authenticates every inference request.

Keep your API key secret

Do not commit your API key to version control. Store it in an environment variable or a secrets manager.

# Example: store as an environment variable
export RAKU_API_KEY="your-api-key-here"
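
On the application side, the key can then be read from the environment instead of being hardcoded. This is a minimal sketch that reuses the `RAKU_API_KEY` variable name from the example above; the helper name `load_api_key` is hypothetical.

```python
import os


def load_api_key(env_var: str = "RAKU_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running")
    return key
```

Failing at startup with a clear message is usually easier to debug than an authentication error from the first inference request.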

Step 6 — Test or integrate

With your API key ready, you have two options:

Try it in the Playground — head to the Playground tab inside your project on sight.raku.so to upload images and run inference directly in the browser. No code needed — great for verifying your model is working before you ship anything.

Integrate into your app — if you're ready to call the API from your own code, see the API Integration section for the full reference and code examples in JavaScript, TypeScript, Python, .NET, Java, and Go.

Start with the Playground

It's the fastest way to confirm your trained model responds correctly before wiring up the API in production.


API quick start

The base URL for all inference requests is:

https://sightapi.raku.so/api/v1

Quick example — batch prediction

Send one or more images and get back a complete result:

Python

import json
import httpx

# Open the image in a context manager so the file handle is closed after the upload.
with open("photo.jpg", "rb") as image:
    response = httpx.post(
        "https://sightapi.raku.so/api/v1/inference/predict/batch",
        data={
            "api_key": "YOUR_API_KEY",
            "query": "find me the serial number",
            # JSON-encoded inference flags for the uploaded file
            "metadata": [json.dumps({"is_yolo": True, "is_llm": True})],
        },
        files=[("files", image)],
    )

response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
print(response.json())

JavaScript

// fileInput is an <input type="file"> element on your page
const form = new FormData();
form.append("api_key", "YOUR_API_KEY");
form.append("query", "find me the serial number");
form.append("metadata", JSON.stringify({ is_yolo: true, is_llm: true }));
form.append("files", fileInput.files[0]);

const res = await fetch(
  "https://sightapi.raku.so/api/v1/inference/predict/batch",
  { method: "POST", body: form }
);
if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
console.log(await res.json());

Go

// See the Batch Prediction page for the full multipart example
result, err := predictBatch("YOUR_API_KEY", "find me the serial number",
    []string{"photo.jpg"},
    []FileMetadata{{IsYolo: true, IsLlm: true}},
)

Quick example — streaming prediction

Receive results incrementally as each image is processed:

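Python

import json
import httpx

# Context managers close the file handle and the HTTP connection when done.
with open("photo.jpg", "rb") as image, httpx.Client() as client:
    with client.stream(
        "POST",
        "https://sightapi.raku.so/api/v1/inference/predict/stream",
        data={
            "api_key": "YOUR_API_KEY",
            "query": "find me the serial number",
            "metadata": [json.dumps({"is_yolo": True, "is_llm": True})],
        },
        files=[("files", image)],
    ) as response:
        response.raise_for_status()  # raise on 4xx/5xx before reading the stream
        # Each non-empty line is one JSON result.
        for line in response.iter_lines():
            if line:
                print(json.loads(line))

JavaScript

// fileInput is an <input type="file"> element on your page
const form = new FormData();
form.append("api_key", "YOUR_API_KEY");
form.append("query", "find me the serial number");
form.append("metadata", JSON.stringify({ is_yolo: true, is_llm: true }));
form.append("files", fileInput.files[0]);

const res = await fetch(
  "https://sightapi.raku.so/api/v1/inference/predict/stream",
  { method: "POST", body: form }
);
if (!res.ok) throw new Error(`Request failed with status ${res.status}`);

const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // Complete lines are parsed now; the last (possibly partial) piece stays buffered.
  const lines = buffer.split("\n");
  buffer = lines.pop();
  for (const line of lines) {
    if (line.trim()) console.log(JSON.parse(line));
  }
}

// Flush the decoder and parse a final line that did not end with a newline.
buffer += decoder.decode();
if (buffer.trim()) console.log(JSON.parse(buffer));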
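
The JavaScript example buffers raw chunks and splits them on newlines by hand. For readers who want the same control in Python (for instance when reading raw bytes rather than lines), the sketch below shows that buffering as a standalone generator. It assumes the stream is newline-delimited JSON, as the examples above imply; `iter_ndjson` is a hypothetical helper name, not part of any Raku Sight client.

```python
import json
from typing import Any, Iterable, Iterator


def iter_ndjson(chunks: Iterable[bytes]) -> Iterator[Any]:
    """Yield one parsed JSON value per line from arbitrarily split byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; keep the remainder.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                yield json.loads(line)
    # The final line may arrive without a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)
```

Because a chunk boundary can fall in the middle of a JSON object, only text before a newline is parsed and the remainder is carried into the next chunk. With httpx, the chunks could come from `response.iter_bytes()` inside a `client.stream(...)` block.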

Inference modes

Each file you send can run one or both inference engines independently:

| is_yolo | is_llm | What runs |
| --- | --- | --- |
| true | false | YOLO object detection only: fast, returns bounding boxes |
| false | true | LLM vision analysis only: slower, returns natural language answers |
| true | true | Both engines: most complete result |
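
When sending several files with different modes, the `metadata` field needs one JSON-encoded entry per file. The sketch below builds that list. It rests on two assumptions that the text above does not confirm: that entries are matched to files by position, and that disabling both engines for a file is not useful (the helper rejects it). `build_metadata` is a hypothetical name.

```python
import json


def build_metadata(modes: list[tuple[bool, bool]]) -> list[str]:
    """Return one JSON-encoded metadata entry per file, in upload order.

    Each tuple is (is_yolo, is_llm) for the file at the same position.
    """
    entries = []
    for index, (is_yolo, is_llm) in enumerate(modes):
        # Assumption: a file with both engines disabled is a caller mistake.
        if not (is_yolo or is_llm):
            raise ValueError(f"file {index}: enable is_yolo, is_llm, or both")
        entries.append(json.dumps({"is_yolo": is_yolo, "is_llm": is_llm}))
    return entries
```

For example, `build_metadata([(True, False), (True, True)])` describes two files: the first runs YOLO only and the second runs both engines. The returned list can be passed as the `metadata` value in the request's form data.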

Next steps