How Fairen works, technically.

No black boxes here either.

Price Calculation

price = rate_per_second × duration

The only formula we use.

Model: Reel
Rate per second: $0.08
Duration: 5s
Total charged: $0.40
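
Want to check our math? Here it is as a few lines of Python. The function name and rate table are ours for illustration; only the Reel rate ($0.08/s) and the 5-second example come from the table above.

# the pricing formula, reproduced
# RATES is illustrative; only the Reel rate comes from this page
RATES = {"reel": 0.08}  # USD per second of generated video

def price(model: str, duration_s: int) -> float:
    """Total charge for a job, rounded to cents: rate_per_second x duration."""
    return round(RATES[model] * duration_s, 2)

print(price("reel", 5))  # 0.4, matching the $0.40 above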

The full pipeline.

Our model. Our infrastructure.

User input

Your idea becomes a generation request

You describe what you want — a scene, a mood, a motion. Your prompt is parsed, validated, and packaged into a structured generation request with all parameters locked in before anything runs.

Input format: Natural language + settings
Validation: Prompt safety + parameter check
Price lock: Before generation starts
// generation request created
{
  "prompt": "Golden hour over calm ocean waves",
  "model": "reel",
  "duration": 5,
  "resolution": "1080p",
  "status": "validated",
  "price_locked": 0.40
}
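
In code, that parse-validate-lock flow might look like the sketch below. The names (create_request, GenerationRequest) and the checks are illustrative assumptions, not Fairen's actual API; the values mirror the request above.

# illustrative sketch: parse, validate, lock the price before anything runs
# names and checks are assumptions, not Fairen's actual API
from dataclasses import dataclass

RATES = {"reel": 0.08}  # USD per second, per the pricing table above

@dataclass
class GenerationRequest:
    prompt: str
    model: str
    duration: int       # seconds
    resolution: str
    status: str = "pending"
    price_locked: float = 0.0

def create_request(prompt: str, model: str, duration: int, resolution: str) -> GenerationRequest:
    if not prompt.strip():
        raise ValueError("empty prompt")        # prompt safety check (simplified)
    if model not in RATES or duration <= 0:
        raise ValueError("bad parameters")      # parameter check
    req = GenerationRequest(prompt, model, duration, resolution)
    req.price_locked = round(RATES[model] * duration, 2)  # locked before generation starts
    req.status = "validated"
    return req

print(create_request("Golden hour over calm ocean waves", "reel", 5, "1080p"))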

Powered by N1

Orchestration OS

N1's orchestration layer acts as an operating system for the compute pool — dynamically allocating GPU resources across thousands of machines.

[Diagram: total physical GPU resources partitioned into hardware slices. GPU slices 1–3 each get their own GPU cores, memory partition, and L2 cache, and each runs an isolated workload: AI inference (instance 1), model rendering (instance 2), and frame assembly (instance 3).]

GPU Slicing

Physical GPUs are partitioned at the hardware level. Fairen gets precisely allocated cores and memory — no overprovisioning, no wasted compute.

Level: Hardware partition
Allocation: Cores + memory per slice
Waste: Near zero
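
As a mental model, a slice is just a fixed bundle of cores, memory, and cache carved out of one physical GPU. A toy Python sketch; the field names and numbers are made up for illustration, not N1's real interface.

# toy model of a hardware-level GPU slice; fields and sizes are assumptions
from dataclasses import dataclass

@dataclass(frozen=True)
class GpuSlice:
    gpu_id: int        # physical GPU this slice is carved from
    cores: int         # compute cores assigned to this slice
    memory_gb: int     # dedicated memory partition
    l2_cache_mb: int   # dedicated L2 cache partition

def carve(gpu_id: int, cores: int, mem_gb: int, l2_mb: int, n: int) -> list[GpuSlice]:
    """Split one physical GPU into n equal hardware slices."""
    return [GpuSlice(gpu_id, cores // n, mem_gb // n, l2_mb // n) for _ in range(n)]

for s in carve(gpu_id=0, cores=132, mem_gb=96, l2_mb=60, n=3):
    print(s)  # three slices, each with its own cores, memory, and cache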

Workload Scheduling

AI jobs are intelligently routed and prioritized across the GPU pool. The scheduler considers model size, queue depth, and cluster health in real time.

Routing: Real-time, adaptive
Priority: Per-job, dynamic
Failover: Automatic reroute
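
A toy version of that routing decision is below. The node fields and the tie-breaking rule are our assumptions, not N1's actual scheduler.

# toy scheduler sketch: pick the healthiest, least-loaded node that fits the
# model; fields and weights are illustrative, not N1's real scheduler
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_memory_gb: int   # capacity left on this node
    queue_depth: int      # jobs already waiting
    healthy: bool         # cluster-health signal

def route(model_size_gb: int, nodes: list[Node]) -> Node:
    candidates = [n for n in nodes if n.healthy and n.free_memory_gb >= model_size_gb]
    if not candidates:
        raise RuntimeError("no healthy node fits this model; reroute/failover")
    # prefer shallow queues; break ties with the most free memory
    return min(candidates, key=lambda n: (n.queue_depth, -n.free_memory_gb))

pool = [Node("a", 40, 3, True), Node("b", 80, 1, True), Node("c", 80, 0, False)]
print(route(model_size_gb=24, nodes=pool).name)  # "b": healthy, fits, shortest queue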

Multi-Tenant Isolation

Multiple clients run on shared hardware, but every workload is fully isolated. Fairen's data never touches another tenant's memory or storage.

Isolation: Memory + storage
Data crossover: Impossible by design
Compliance: SOC 2 ready
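
One way to picture the guarantee: every region is tagged with its owner, and any cross-tenant access fails closed. The sketch below is a toy illustration of that invariant; the real enforcement happens in hardware, at the memory and storage layer.

# toy illustration of fail-closed tenant isolation; not N1's actual mechanism
class IsolationError(Exception):
    pass

class TenantMemory:
    def __init__(self) -> None:
        self._regions: dict[str, tuple[str, bytes]] = {}  # region_id -> (owner, data)

    def write(self, tenant: str, region_id: str, data: bytes) -> None:
        owner, _ = self._regions.get(region_id, (tenant, b""))
        if owner != tenant:
            raise IsolationError(f"{tenant} cannot touch {owner}'s region")
        self._regions[region_id] = (tenant, data)

    def read(self, tenant: str, region_id: str) -> bytes:
        owner, data = self._regions[region_id]
        if owner != tenant:
            raise IsolationError(f"{tenant} cannot touch {owner}'s region")
        return data

mem = TenantMemory()
mem.write("fairen", "frames-001", b"...")
try:
    mem.read("other-tenant", "frames-001")
except IsolationError as e:
    print(e)  # other-tenant cannot touch fairen's region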

Distributed Execution

Heavy generation jobs are split and executed in parallel across the GPU pool. This is how we keep render times under 3 minutes even at peak load.

Parallelism: Cross-node
Job splitting: Automatic
Peak performance: Same as off-peak
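
Conceptually, the split looks like the sketch below: divide the frame range into chunks, render them in parallel, reassemble in order. Every name here is invented for illustration; render_chunk stands in for the real per-node work.

# illustrative fan-out of one render job: split frames into chunks, render in
# parallel, reassemble in order; not N1's actual execution engine
from concurrent.futures import ThreadPoolExecutor

def split(total_frames: int, workers: int) -> list[range]:
    """Divide [0, total_frames) into near-equal contiguous chunks."""
    base, extra = divmod(total_frames, workers)
    chunks, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks

def render_chunk(frames: range) -> list[str]:
    # stand-in for the real per-node render; returns one "frame" per index
    return [f"frame-{i:04d}" for i in frames]

def render_job(total_frames: int, workers: int) -> list[str]:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(render_chunk, split(total_frames, workers)))
    return [frame for part in parts for frame in part]  # reassembled in order

print(len(render_job(total_frames=120, workers=8)))  # 120 frames, 8 parallel chunks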