black-forest-labs-flux-1-schnell · General · apache-2.0

FLUX.1 Schnell

Fine-tune FLUX.1 Schnell as a LoRA from your own image bucket. Compare measured GPU performance, open a preloaded Serverless Job, choose your dataset and output buckets, and start training in your Nebius account.

Fine-tune workloads: flux-lora


Best throughput

1.51 img/s

Fastest measured GPU: B200.

Parameters

12.0B params

GPUs benchmarked

GPU types with results or in progress.

Training performance

Per-GPU fine-tune throughput and time-per-step, measured on Forge GPUs. Pick a target before you start; cells still being measured show as in progress.

GPU	Region	Workload	Status	Throughput	Time / step	FLOP util.
B200 (fastest)	us-central1	flux-lora	Measured	1.51 img/s	661 ms	5.0%
B300	uk-south1	flux-lora	Measured	1.46 img/s	683 ms	4.3%
H100	eu-north1	flux-lora	Measured
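
As a rough planning aid, the measured time per step can be turned into a wall-clock estimate. The sketch below assumes the 1600-step run and batch size of 1 from the default config in the Nebius AI Jobs CLI section, so one step processes one image. It ignores model download, latent caching, sample generation, and checkpoint saves, so a real run will take longer. The function names are illustrative, not part of any tool.

```shell
#!/bin/sh
# Rough estimates derived from the measured table above.

estimate_minutes() {
  # $1 = training steps, $2 = measured time per step in milliseconds
  awk -v steps="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", steps * ms / 1000 / 60 }'
}

images_per_second() {
  # $1 = measured time per step in milliseconds (assumes batch size 1)
  awk -v ms="$1" 'BEGIN { printf "%.2f\n", 1000 / ms }'
}

echo "B200: $(estimate_minutes 1600 661) min, $(images_per_second 661) img/s"
# B200: 17.6 min, 1.51 img/s
echo "B300: $(estimate_minutes 1600 683) min, $(images_per_second 683) img/s"
# B300: 18.2 min, 1.46 img/s
```

The computed throughput matches the table, which is consistent with the benchmark having used one image per step.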

Train in your account

Serverless training

Start a training job

Open Nebius with the training image, GPU preset, and command preloaded. Then choose your dataset bucket and output bucket in your account.


Upload data

Put your images and optional .txt captions in your Object Storage bucket.

Open job

Use the preloaded Serverless Job form.

Start training

Select your dataset and output bucket, then run it.

License: Apache-2.0. This is a distilled, fast model whose Hugging Face checkpoint is gated: accept the model terms on Hugging Face first, then pass HF_TOKEN to the job. For most new LoRAs, prefer FLUX.2 klein 4B Base.

Input data

Image folder with .jpg/.jpeg/.png files and optional same-name .txt captions

s3://my-bucket/flux-subject-lora/
  customtoken_001.jpg
  customtoken_001.txt  # "photo of customtoken person, studio portrait, natural skin texture"
  customtoken_002.png
  customtoken_002.txt  # "customtoken person in everyday clothing, outdoor natural light"
  customtoken_003.jpeg

Captions are optional for image LoRA. If filenames start with a custom token, the training command can infer it automatically.
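
To preview which trigger token a filename would produce, the inference can be run locally. This sketch mirrors the logic in the training command in the Nebius AI Jobs CLI section (strip the extension, drop a trailing number, lowercase, and replace other characters with hyphens). The wrapper name `infer_trigger_word` is illustrative and does not exist in the trainer.

```shell
#!/bin/sh
# Local preview of the trigger-token inference used by the training command.

sanitize_trigger_word() {
  # Lowercase, collapse anything outside [a-z0-9_-] into "-", trim edge hyphens.
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9_-]+/-/g; s/^-+|-+$//g'
}

infer_trigger_word() {
  # $1 = image path or filename
  stem="$(basename "$1")"
  stem="${stem%.*}"                                            # drop the extension
  stem="$(printf '%s' "$stem" | sed -E 's/([_-]?[0-9]+)$//')"  # drop a trailing index
  sanitize_trigger_word "$stem"
}

infer_trigger_word "customtoken_001.jpg"; echo   # customtoken
infer_trigger_word "My Dog-12.PNG"; echo         # my-dog
```

The training command applies this to the first image in sorted order only, and falls back to `subject` when the result is empty (for example, a file named `001.jpg`).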

Advanced details: CLI, image, tracking, agent checks

Accepted inputs

Object Storage bucket mounted as /workspace/dataset with image files.
Optional same-name .txt captions, one per image. Use filenames like customtoken_001.jpg so the trainer can infer the custom trigger token.
Optional FORGE_TRAIN_TRIGGER_WORD override when the token should not be inferred from filenames.
Optional W&B run tracking through WANDB_* environment variables.
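
Before uploading, it can help to check a local copy of the dataset folder. The sketch below counts accepted image files and lists images that have no same-name .txt caption. Captions are optional, so a missing caption is informational rather than an error. The function names and the default path `./flux-subject-lora` are hypothetical.

```shell
#!/bin/sh
# Pre-flight check for a local copy of the dataset folder.

count_images() {
  # $1 = dataset directory; counts .jpg/.jpeg/.png files (case-insensitive)
  find "$1" -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) | wc -l | tr -d ' '
}

missing_captions() {
  # $1 = dataset directory; prints each image without a same-name .txt file
  find "$1" -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) | sort |
  while IFS= read -r img; do
    [ -f "${img%.*}.txt" ] || printf '%s\n' "$img"
  done
}

DATASET_DIR="${1:-./flux-subject-lora}"   # hypothetical local path
if [ -d "$DATASET_DIR" ]; then
  echo "images: $(count_images "$DATASET_DIR")"
  missing_captions "$DATASET_DIR" | sed 's/^/no caption: /'
fi
```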

Outputs

LoRA adapter weights, usually .safetensors, in your output bucket.
Generated sample images at the configured sample interval.
Training config and logs; W&B run history when WANDB_* is configured.
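
To get a feel for what lands in the output bucket, the intervals in the default config (steps: 1600, save_every: 250, sample_every: 250, max_step_saves_to_keep: 4) can be expanded into concrete step numbers. This is plain arithmetic on those values, a sketch only: whether the trainer also samples before training, and how it writes the final weights at the last step, is trainer behavior not covered here.

```shell
#!/bin/sh
# Expand interval settings from the default config into step numbers.

interval_steps() {
  # $1 = total steps, $2 = interval; prints every multiple of the interval up to total
  awk -v total="$1" -v every="$2" 'BEGIN {
    out = ""
    for (s = every; s <= total; s += every) out = out (out == "" ? "" : " ") s
    print out
  }'
}

retained_saves() {
  # $1 = total steps, $2 = save interval, $3 = number of step saves to keep
  awk -v total="$1" -v every="$2" -v keep="$3" 'BEGIN {
    n = int(total / every)
    first = n - keep + 1; if (first < 1) first = 1
    out = ""
    for (i = first; i <= n; i++) out = out (out == "" ? "" : " ") (i * every)
    print out
  }'
}

echo "save/sample steps: $(interval_steps 1600 250)"
# save/sample steps: 250 500 750 1000 1250 1500
echo "step saves kept:   $(retained_saves 1600 250 4)"
# step saves kept:   750 1000 1250 1500
```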

Optional tracking

WANDB_API_KEY (secret): Optional W&B API key. Store it in MysteryBox and pass it with --env-secret.

WANDB_PROJECT: Optional W&B project name for training progress and sample tracking.

WANDB_RUN_NAME: Optional W&B run name, e.g. flux-klein-subject-lora.

FORGE_TRAIN_TRIGGER_WORD: Optional FLUX trigger token override. When unset, the trainer infers it from image filenames such as customtoken_001.jpg.

HF_TOKEN (secret): Required for gated BFL checkpoints after you accept the model terms on Hugging Face.
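
One way to keep these settings optional is to emit a flag only when its value is set, then paste the resulting flags into the `nebius ai job create` call. This is a sketch under assumptions: `WANDB_SECRET_REF` and `HF_SECRET_REF` are hypothetical variable names standing in for whatever MysteryBox secret reference your project uses, and the exact value syntax that `--env-secret` expects is not shown on this page, so check the CLI help before relying on it.

```shell
#!/bin/sh
# Print one optional flag per line, skipping anything that is unset or empty.

emit_flag() {
  # $1 = flag (--env or --env-secret), $2 = variable name, $3 = value (may be empty)
  if [ -n "$3" ]; then printf '%s %s=%s\n' "$1" "$2" "$3"; fi
}

optional_env_flags() {
  # *_SECRET_REF names are hypothetical placeholders for MysteryBox references.
  emit_flag --env-secret WANDB_API_KEY "${WANDB_SECRET_REF:-}"
  emit_flag --env WANDB_PROJECT "${WANDB_PROJECT:-}"
  emit_flag --env WANDB_RUN_NAME "${WANDB_RUN_NAME:-}"
  emit_flag --env FORGE_TRAIN_TRIGGER_WORD "${FORGE_TRAIN_TRIGGER_WORD:-}"
  emit_flag --env-secret HF_TOKEN "${HF_SECRET_REF:-}"
}

optional_env_flags
```

With only `WANDB_PROJECT` set, the function prints a single `--env WANDB_PROJECT=...` line and nothing for the unset secrets.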

Readiness checks

job -> output -> endpoint

Serverless job URL

ready

The Nebius Jobs create link is generated with the training image, GPU platform, preset, command, and dataset mount defaults.


Serverless endpoint URL

verify after run

The endpoint create link preloads the serving image and output mount. After training, attach the produced adapter/checkpoint, then run a health check and one representative sample request.


Input data guidance

ready

Dataset format, accepted input methods, and an example are present: Image folder with .jpg/.jpeg/.png files and optional same-name .txt captions.

Agent handoff

ready

Agent steps cover job creation, monitoring, output verification, endpoint smoke test, and user-facing closeout.

Full instructions

1. This fine-tune runs in YOUR Nebius account on YOUR data. You own the produced weights and you pay for the GPUs; Forge does not run or bill this job.
2. Authenticate the Nebius CLI to your account and project: `nebius iam whoami` to confirm, `nebius iam project list` to find your project/parent ID.
3. Put your training data in your own bucket. Expected format: image folder with .jpg/.jpeg/.png files and optional same-name .txt captions. Replace `s3://YOUR-BUCKET/your-training-data/` and `s3://YOUR-BUCKET/fine-tuned-output/` with bucket paths you own; Forge never sees your data or weights.
4. Submit the training job in your account: run the `command_template` below as a Nebius Jobs workload, e.g. `nebius ai job create --parent-id <YOUR_PROJECT_ID> --image docker.io/ostris/aitoolkit@sha256:220d85e443589c6b52521c594a2d9f052d733afe360966d24bb8a5fe853745f7 --volume <DATASET_S3>:/workspace/dataset:ro --volume <OUTPUT_S3>:/workspace/output:rw ...`, or open the console link below, which preloads the image, GPU preset, and fine-tune command.
5. The base model is `black-forest-labs/FLUX.1-schnell` (fine-tuned with the `flux-lora` workload). Pick a GPU/preset that matches the fine-tune performance shown for this model in the training catalog.
6. Optional W&B tracking: create a W&B key in MysteryBox and pass `WANDB_API_KEY` via `--env-secret`; pass `WANDB_PROJECT` and `WANDB_RUN_NAME` via `--env` if you want named runs.
7. Monitor the run in your account with `nebius ai job list --parent-id <YOUR_PROJECT_ID>` and `nebius ai job get <JOB_ID>`; the fine-tuned weights and samples land in your output bucket when it completes.
8. After the job completes, create a Serverless Endpoint from the endpoint console link when available, mount or attach the output adapter/checkpoint, and verify a health check plus one representative sample request before considering the model ready for the user.

Agent instructions

01 Fetch `/v1/training/models/{slug}` and use `jobs_handoff`; do not infer bucket paths, image refs, or commands from memory.
02 Open or validate `jobs_handoff.console_url`; it must point to the Nebius Serverless Job create page and preload the training image, GPU preset, and command.
03 Confirm the user supplied a dataset in their own bucket matching: image folder with .jpg/.jpeg/.png files and optional same-name .txt captions. Replace `s3://YOUR-BUCKET/your-training-data/` and `s3://YOUR-BUCKET/fine-tuned-output/` with user-owned bucket URIs.
04 Create the Nebius AI Job with image `docker.io/ostris/aitoolkit@sha256:220d85e443589c6b52521c594a2d9f052d733afe360966d24bb8a5fe853745f7`, base model `black-forest-labs/FLUX.1-schnell`, and workload `flux-lora`; pass optional W&B/HF secrets only through the user's approved secret mechanism.
05 Monitor `nebius ai job get` and `nebius ai logs` until the run completes; verify that adapter/checkpoint files, samples, and logs exist in the output bucket.
06 After the job writes weights to the output bucket, open `jobs_handoff.endpoint_console_url`, attach or mount that output, create the Serverless Endpoint in the user's project, and run a health check plus one representative sample request before telling the user it is ready.
07 Return concise user instructions: where their dataset should live, where outputs were written, the endpoint URL/status, and how to reproduce or tune the run.

Nebius AI Jobs CLI

# Runs in YOUR Nebius account, on YOUR data — you own the weights and
# you pay for the GPUs. Forge does not run this job; this just starts it.
# Uses Nebius AI Jobs CLI (`nebius ai job create`).
# Fill in these customer-owned values before running:
#   FORGE_NEBIUS_PROJECT_ID: your Nebius project / parent ID.
#   FORGE_TRAIN_PLATFORM/FORGE_TRAIN_PRESET: pick GPU resources available in your project.
#   FORGE_TRAIN_DATASET_URI: point this at the bucket prefix that holds your images, e.g. s3://my-bucket/flux-subject-lora/.
#   FORGE_TRAIN_OUTPUT_URI: bucket path where trained weights are written.
# Verify the command starts a user-data fine-tune, not a benchmark/probe.
# After completion: verify output artifacts, create the Serverless Endpoint,
# then run endpoint health and one representative sample request.
# Optional training environment:
#   --env-secret WANDB_API_KEY=... Optional W&B API key. Store it in MysteryBox and pass it with --env-secret.
#   --env WANDB_PROJECT=... Optional W&B project name for training progress and sample tracking.
#   --env WANDB_RUN_NAME=... Optional W&B run name, e.g. flux-klein-subject-lora.
#   --env FORGE_TRAIN_TRIGGER_WORD=... Optional FLUX trigger token override. When unset, the trainer infers it from image filenames such as customtoken_001.jpg.
#   --env-secret HF_TOKEN=... Required for gated BFL checkpoints after you accept the model terms on Hugging Face.
export FORGE_NEBIUS_PROJECT_ID="YOUR_PROJECT_ID"
export FORGE_TRAIN_PLATFORM="YOUR_GPU_PLATFORM"
export FORGE_TRAIN_PRESET="YOUR_GPU_PRESET"
export FORGE_TRAIN_JOB_NAME="forge-fine-tune"
export FORGE_TRAIN_DATASET_URI="s3://my-bucket/flux-subject-lora/"
export FORGE_TRAIN_OUTPUT_URI="s3://my-bucket/outputs/"
FORGE_TRAIN_COMMAND='set -eu
mkdir -p /workspace/config /workspace/dataset /workspace/output
# Dataset source: '"$FORGE_TRAIN_DATASET_URI"' (mounted at /workspace/dataset by the Jobs CLI).
# Output destination: '"$FORGE_TRAIN_OUTPUT_URI"' (mounted at /workspace/output by the Jobs CLI).
DATASET_IMAGE_ROOT="/workspace/dataset"
if [ -d /workspace/dataset/target ]; then DATASET_IMAGE_ROOT="/workspace/dataset/target"; fi
sanitize_trigger_word() {
  printf '\''%s'\'' "$1" | tr '\''[:upper:]'\'' '\''[:lower:]'\'' | sed -E '\''s/[^a-z0-9_-]+/-/g; s/^-+|-+$//g'\''
}
TRIGGER_WORD="$(sanitize_trigger_word "${FORGE_TRAIN_TRIGGER_WORD:-}")"
if [ -z "$TRIGGER_WORD" ]; then
  FIRST_IMAGE="$(find "$DATASET_IMAGE_ROOT" -maxdepth 2 -type f \( -iname '\''*.jpg'\'' -o -iname '\''*.jpeg'\'' -o -iname '\''*.png'\'' \) | sort | head -n 1 || true)"
  if [ -n "$FIRST_IMAGE" ]; then
    STEM="$(basename "$FIRST_IMAGE")"
    STEM="${STEM%.*}"
    STEM="$(printf '\''%s'\'' "$STEM" | sed -E '\''s/([_-]?[0-9]+)$//'\'')"
    TRIGGER_WORD="$(sanitize_trigger_word "$STEM")"
  fi
fi
if [ -z "$TRIGGER_WORD" ]; then TRIGGER_WORD="subject"; fi
export FORGE_TRAIN_TRIGGER_WORD="$TRIGGER_WORD"
echo "Using FLUX trigger token: ${FORGE_TRAIN_TRIGGER_WORD}"
for candidate in /app/ai-toolkit /workspace/ai-toolkit /root/ai-toolkit /ai-toolkit /app /workspace; do
  if [ -f "$candidate/run.py" ]; then cd "$candidate"; break; fi
done
test -f run.py
cat > /workspace/config/forge-flux-lora.yaml <<YAML
---
job: extension
config:
  name: "forge_black_forest_labs_flux_1_schnell_lora"
  process:
    - type: '\''sd_trainer'\''
      training_folder: "/workspace/output"
      performance_log_every: 50
      device: cuda:0
      trigger_word: "${FORGE_TRAIN_TRIGGER_WORD}"
      network:
        type: "lora"
        linear: 16
        linear_alpha: 16
      save:
        dtype: float16
        save_every: 250
        max_step_saves_to_keep: 4
        push_to_hub: false
      datasets:
        - folder_path: "${DATASET_IMAGE_ROOT}"
          caption_ext: "txt"
          caption_dropout_rate: 0.05
          shuffle_tokens: false
          cache_latents_to_disk: true
          resolution: [512, 768, 1024]
      train:
        batch_size: 1
        steps: 1600
        gradient_accumulation_steps: 1
        train_unet: true
        train_text_encoder: false
        gradient_checkpointing: true
        noise_scheduler: "flowmatch"
        optimizer: "adamw8bit"
        lr: 1e-4
        ema_config:
          use_ema: true
          ema_decay: 0.99
        dtype: bf16
      model:
        name_or_path: "black-forest-labs/FLUX.1-schnell"
        is_flux: true
        quantize: true
        assistant_lora_path: "ostris/FLUX.1-schnell-training-adapter"
      sample:
        sampler: "flowmatch"
        sample_every: 250
        width: 1024
        height: 1024
        prompts:
          - "photo of ${FORGE_TRAIN_TRIGGER_WORD} person, studio portrait, natural skin texture"
          - "${FORGE_TRAIN_TRIGGER_WORD} person in everyday clothing, outdoor natural light"
        neg: ""
        seed: 42
        walk_seed: true
        guidance_scale: 1
        sample_steps: 4
meta:
  name: "FLUX.1 schnell LoRA"
  version: '\''1.0'\''
YAML
python run.py /workspace/config/forge-flux-lora.yaml'

nebius ai job create \
  --parent-id "$FORGE_NEBIUS_PROJECT_ID" \
  --name "$FORGE_TRAIN_JOB_NAME" \
  --platform "$FORGE_TRAIN_PLATFORM" \
  --preset "$FORGE_TRAIN_PRESET" \
  --image 'docker.io/ostris/aitoolkit@sha256:220d85e443589c6b52521c594a2d9f052d733afe360966d24bb8a5fe853745f7' \
  --volume "$FORGE_TRAIN_DATASET_URI":/workspace/dataset:ro \
  --volume "$FORGE_TRAIN_OUTPUT_URI":/workspace/output:rw \
  --container-command "/bin/sh" \
  --args "-lc \"$FORGE_TRAIN_COMMAND\""

Training image

docker.io/ostris/aitoolkit@sha256:220d85e443589c6b52521c594a2d9f052d733afe360966d24bb8a5fe853745f7