Fine-tune Qwen3 4B Instruct 2507 from your own dataset bucket. Compare measured GPU performance, open a preloaded Serverless Job, choose your dataset and output buckets, and start training in your Nebius account.
Per-GPU fine-tune throughput and time-per-step, measured on Forge GPUs. Pick a target before you start; cells still being measured show as in progress.
Benchmarks in progress
Forge is measuring per-GPU fine-tune throughput for this model on its own GPUs. Measured numbers will appear here once the first benchmarks land.
Serverless training
Open Nebius with the training image, GPU preset, and command preloaded. Then choose your dataset bucket and output bucket in your account.
Upload data
Put your JSONL training records in your Object Storage bucket.
Open job
Use the preloaded Serverless Job form.
Start training
Select your dataset and output bucket, then run it.
JSONL chat or prompt/completion training data
s3://my-bucket/llm-lora/train.jsonl
{"messages":[{"role":"user","content":"Summarize this ticket"},{"role":"assistant","content":"Short summary..."}]}
{"prompt":"Classify this support request","completion":"billing"}
Captions are optional for image LoRA. If filenames start with a custom token, the training command can infer it automatically.
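Before uploading, it can help to check that every line of the JSONL file matches one of the two record shapes shown above. The following is a minimal sketch of such a check, not the validation the training image performs; the function names are hypothetical, and the rule that each message needs string `role` and `content` fields is an assumption drawn from the example record.

```python
import json


def classify_record(line: str) -> str:
    """Return 'chat' or 'prompt_completion' for one JSONL line, else raise ValueError."""
    record = json.loads(line)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(record, dict):
        raise ValueError("record must be a JSON object")
    if "messages" in record:
        messages = record["messages"]
        if not isinstance(messages, list) or not messages:
            raise ValueError("'messages' must be a non-empty list")
        for message in messages:
            if not isinstance(message, dict):
                raise ValueError("each message must be an object")
            if not isinstance(message.get("role"), str) or not isinstance(message.get("content"), str):
                raise ValueError("each message needs string 'role' and 'content'")
        return "chat"
    if isinstance(record.get("prompt"), str) and isinstance(record.get("completion"), str):
        return "prompt_completion"
    raise ValueError("expected 'messages' or string 'prompt' and 'completion'")


def validate_jsonl(lines):
    """Count valid records by shape and collect (line_number, message) errors."""
    counts = {"chat": 0, "prompt_completion": 0}
    errors = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            counts[classify_record(line)] += 1
        except ValueError as exc:
            errors.append((number, str(exc)))
    return counts, errors
```

A typical use would be `counts, errors = validate_jsonl(open("train.jsonl", encoding="utf-8"))`, then fixing any reported line numbers before the upload.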
Serverless job URL
Status: ready. The Nebius Jobs create link is generated with the training image, GPU platform, preset, command, and dataset mount defaults.
Serverless endpoint URL
Status: verify after run. The endpoint create link preloads the serving image and output mount; after training, attach the produced adapter/checkpoint and run a health check plus a representative sample request.
Input data guidance
Status: ready. Dataset format, accepted input methods, and an example are present: JSONL chat or prompt/completion training data.
Agent handoff
Status: ready. Agent steps cover job creation, monitoring, output verification, endpoint smoke test, and user-facing closeout.
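The output verification step could look something like the sketch below, which takes a listing of object keys from the output bucket and reports which expected artifacts are absent. The function name is hypothetical, and the default file names are an assumption: they follow the common PEFT LoRA adapter layout, and the files this training image actually writes may differ, so pass your own `expected` names if so.

```python
def missing_artifacts(keys, prefix, expected=("adapter_config.json", "adapter_model.safetensors")):
    """Return the expected file names not found under prefix.

    keys: object keys listed from the output bucket.
    prefix: key prefix of the output path, e.g. 'outputs/'.
    expected: file names to look for (an assumed PEFT-style default).
    """
    # Compare by base name so artifacts in nested folders still count.
    present = {key.rsplit("/", 1)[-1] for key in keys if key.startswith(prefix)}
    return [name for name in expected if name not in present]
```

An empty result suggests the adapter files are in place and the endpoint can be created; anything else is a reason to inspect the job logs first.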
# Runs in YOUR Nebius account, on YOUR data. You own the weights and
# you pay for the GPUs. Forge does not run this job; this just starts it.
# Uses Nebius AI Jobs CLI (`nebius ai job create`).
# Fill in these customer-owned values before running:
#   FORGE_NEBIUS_PROJECT_ID: your Nebius project / parent ID.
#   FORGE_TRAIN_PLATFORM/FORGE_TRAIN_PRESET: pick GPU resources available in your project.
#   FORGE_TRAIN_DATASET_URI: point this at your bucket, e.g. s3://my-bucket/train.jsonl.
#   FORGE_TRAIN_OUTPUT_URI: bucket path where trained weights are written.
# Verify the command starts a user-data fine-tune, not a benchmark/probe.
# After completion: verify output artifacts, create the Serverless Endpoint,
# then run endpoint health and one representative sample request.

export FORGE_NEBIUS_PROJECT_ID="YOUR_PROJECT_ID"
export FORGE_TRAIN_PLATFORM="YOUR_GPU_PLATFORM"
export FORGE_TRAIN_PRESET="YOUR_GPU_PRESET"
export FORGE_TRAIN_JOB_NAME="forge-fine-tune"
export FORGE_TRAIN_DATASET_URI="s3://my-bucket/train.jsonl"
export FORGE_TRAIN_OUTPUT_URI="s3://my-bucket/outputs/"

# --dataset: point this at YOUR OWN bucket.
# --output: your bucket; you own the weights.
# These notes stay outside the string: a '#' inside it would comment out
# the rest of the training command when the shell runs it.
FORGE_TRAIN_COMMAND="python -m forge_finetune \
  --base-model Qwen/Qwen3-4B-Instruct-2507 \
  --method lora \
  --dataset $FORGE_TRAIN_DATASET_URI \
  --output $FORGE_TRAIN_OUTPUT_URI"

nebius ai job create \
  --parent-id "$FORGE_NEBIUS_PROJECT_ID" \
  --name "$FORGE_TRAIN_JOB_NAME" \
  --platform "$FORGE_TRAIN_PLATFORM" \
  --preset "$FORGE_TRAIN_PRESET" \
  --image 'cr.eu-north1.nebius.cloud/e00h91c5sa606xfwpj/forge-finetune:training-flop-util-74f0a06c@sha256:77640f8f47850193a9cb98678a1fb95056b9e75e46050d5c948c76d6bc14eaa3' \
  --volume "$FORGE_TRAIN_DATASET_URI":/workspace/dataset:ro \
  --volume "$FORGE_TRAIN_OUTPUT_URI":/workspace/output:rw \
  --container-command "/bin/sh" \
  --args "-lc \"$FORGE_TRAIN_COMMAND\""
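Since the script ships with placeholder values, a small pre-flight check before running it may catch an unfilled variable early. The sketch below is one possible approach under stated assumptions: the variable names and placeholder strings are taken from the script above, while `preflight_problems` is a hypothetical helper and the `s3://` prefix rule is an assumption based on the example URIs.

```python
# Placeholder values as they appear in the launch script.
PLACEHOLDERS = {
    "FORGE_NEBIUS_PROJECT_ID": "YOUR_PROJECT_ID",
    "FORGE_TRAIN_PLATFORM": "YOUR_GPU_PLATFORM",
    "FORGE_TRAIN_PRESET": "YOUR_GPU_PRESET",
}
URI_VARS = ("FORGE_TRAIN_DATASET_URI", "FORGE_TRAIN_OUTPUT_URI")


def preflight_problems(env):
    """Return human-readable problems found in a mapping of environment variables."""
    problems = []
    for name, placeholder in PLACEHOLDERS.items():
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value == placeholder:
            problems.append(f"{name} still holds the placeholder {placeholder}")
    for name in URI_VARS:
        # Assumption: dataset and output locations are given as s3:// URIs.
        if not env.get(name, "").startswith("s3://"):
            problems.append(f"{name} must be an s3:// URI")
    return problems
```

Calling `preflight_problems(os.environ)` after the `export` lines (with `import os`) returns an empty list when every value has been filled in.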