Sora 2 Video Generation

curl --request POST \
  --url https://api.mulerun.com/vendors/openai/v1/sora-2/generation \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "prompt": "A serene sunset over a calm ocean, with gentle waves lapping against the shore",
  "seconds": "8",
  "size": "1280x720"
}
'

{
  "task_info": {
    "id": "123e4567-e89b-12d3-a456-426614174000",
    "status": "pending",
    "created_at": "2025-09-21T00:00:00.000Z",
    "updated_at": "2025-09-21T00:00:00.000Z"
  }
}
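
The same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper name `build_generation_request` and the `MULERUN_TOKEN` environment variable are assumptions made for this example, while the URL, headers, and body fields are taken from the curl call above. The network call itself is left commented out.

```python
import json
import os
import urllib.request

API_URL = "https://api.mulerun.com/vendors/openai/v1/sora-2/generation"


def build_generation_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "prompt": "A serene sunset over a calm ocean, with gentle waves lapping against the shore",
    "seconds": "8",
    "size": "1280x720",
}

# MULERUN_TOKEN is a hypothetical variable name; use whatever holds your token.
request = build_generation_request(os.environ.get("MULERUN_TOKEN", "<token>"), payload)

# Uncomment to send. A 202 response carries a task_info object.
# with urllib.request.urlopen(request) as response:
#     task_info = json.load(response)["task_info"]
```

Note that `seconds` is sent as a string, matching the documented `enum<string>` type.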

Beta

This model is currently in public testing. Access is limited, and API requests may be unstable.

Overview

Sora 2 is OpenAI’s video generation model, capable of creating richly detailed, dynamic clips from natural language prompts or image references. Built on years of research into multimodal diffusion and trained on diverse visual data, Sora 2 brings a deep understanding of 3D space, motion, and scene continuity to video generation.

Key Features

Text-to-video and image-to-video generation
Fast generation speed, ideal for rapid iteration
Good quality results suitable for social media content and prototypes

Supported Resolutions

Size	Aspect Ratio	Use Case
720x1280	9:16	Vertical/Portrait (mobile, social media)
1280x720	16:9	Horizontal/Landscape (standard video)
1024x1792	4:7 (approx. 9:16)	Tall Portrait (extended vertical)
1792x1024	7:4 (approx. 16:9)	Wide Landscape (cinematic)
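
The exact ratio behind each size string can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `aspect_ratio` is made up for this example. It shows that the two 720p sizes reduce to exactly 9:16 and 16:9, while the two larger sizes reduce to 4:7 and 7:4, which are close to but not exactly 9:16 and 16:9.

```python
from math import gcd


def aspect_ratio(size: str) -> str:
    """Reduce a "WIDTHxHEIGHT" size string to its simplest width:height ratio."""
    width, height = (int(part) for part in size.split("x"))
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


for size in ("720x1280", "1280x720", "1024x1792", "1792x1024"):
    print(size, aspect_ratio(size))
```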

Duration Options

Videos can be generated in three durations:

4 seconds: Quick clips (default)
8 seconds: Standard duration
12 seconds: Extended clips

Effective Prompting

For best results, describe shot type, subject, action, setting, and lighting. For example:

“Wide shot of a child flying a red kite in a grassy park, golden hour sunlight, camera slowly pans upward.”
“Close-up of a steaming coffee cup on a wooden table, morning light through blinds, soft depth of field.”

This level of specificity helps the model produce consistent results without inventing unwanted details.
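
One way to keep prompts consistently structured is to assemble them from the components listed above. The sketch below is an assumption about how you might organize this on the client side; `build_prompt` and its parameters are hypothetical and not part of the API, which only receives the final string.

```python
def build_prompt(shot_type: str, subject: str, action: str, setting: str,
                 lighting: str, camera: str = "") -> str:
    """Join shot type, subject, action, setting, and lighting into one prompt."""
    scene = f"{shot_type} of {subject} {action} {setting}"
    parts = [scene, lighting]
    if camera:
        # Camera movement is optional and goes last.
        parts.append(camera)
    return ", ".join(parts) + "."


prompt = build_prompt(
    shot_type="Wide shot",
    subject="a child",
    action="flying a red kite",
    setting="in a grassy park",
    lighting="golden hour sunlight",
    camera="camera slowly pans upward",
)
print(prompt)
```

With these inputs the function reproduces the first example prompt above.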

Content Restrictions

The API enforces several content restrictions:

Only content suitable for audiences under 18
Copyrighted characters and copyrighted music will be rejected
Real people—including public figures—cannot be generated
Input images containing human faces are currently rejected

Example Requests

Text-to-Video

{
  "prompt": "A serene sunset over a calm ocean, with gentle waves lapping against the shore",
  "seconds": "8",
  "size": "1280x720"
}

Image-to-Video (with Reference)

{
  "prompt": "She turns around and smiles, then slowly walks out of the frame",
  "input_reference": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAA...",
  "seconds": "8",
  "size": "1280x720"
}

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

prompt (string, required)

Text description for the video. For best results, describe:

Shot type (wide shot, close-up, etc.)
Subject (what is the main focus)
Action (what is happening)
Setting (where the action takes place)
Lighting (time of day, mood)

Example: "Wide shot of a child flying a red kite in a grassy park, golden hour sunlight, camera slowly pans upward."

Maximum string length: 2000

input_reference (string)

Optional image reference that guides generation. Can be a URL or Base64-encoded data.

Format for Base64: data:image/jpeg;base64,{base64_data}

Supported formats: image/jpeg, image/png, image/webp
Image resolution must match the target video's size parameter
Max file size: 10MB
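
A Base64 `input_reference` value can be produced along the following lines. This is a hedged sketch: `to_input_reference` is a hypothetical helper, the 10MB limit is assumed to mean 10 * 1024 * 1024 bytes of raw image data, and using the same `data:` format for PNG and WebP is an assumption, since the documented format only shows `image/jpeg`. The resolution check is not covered because it would require decoding the image.

```python
import base64

SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png", "image/webp"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" measured on the raw file


def to_input_reference(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URI suitable for input_reference."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported image format: {mime_type}")
    if len(image_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 10MB limit")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The first four bytes of a typical JPEG file, used here as stand-in data.
reference = to_input_reference(b"\xff\xd8\xff\xe0")
print(reference)
```

In practice the bytes would come from reading the image file in binary mode, for example `open(path, "rb").read()`.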

seconds (enum<string>, default: 4)

Clip duration in seconds.

Available options: 4, 8, 12

size (enum<string>, default: 720x1280)

Output resolution formatted as width x height.

Available options: 720x1280, 1280x720, 1024x1792, 1792x1024
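
The body constraints documented above can be checked on the client before a request is sent. The following is a rough sketch rather than the API's own validation: `validate_payload` is a hypothetical helper, rejecting an empty prompt is an assumption, and the length check counts Python characters, which may not match how the server counts.

```python
ALLOWED_SECONDS = ("4", "8", "12")
ALLOWED_SIZES = ("720x1280", "1280x720", "1024x1792", "1792x1024")
MAX_PROMPT_LENGTH = 2000


def validate_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt:
        problems.append("prompt is required and must be a non-empty string")
    elif len(prompt) > MAX_PROMPT_LENGTH:
        problems.append(f"prompt exceeds {MAX_PROMPT_LENGTH} characters")

    # Omitted fields fall back to the documented defaults.
    seconds = payload.get("seconds", "4")
    if seconds not in ALLOWED_SECONDS:
        problems.append(f"seconds must be one of {ALLOWED_SECONDS} (as a string)")

    size = payload.get("size", "720x1280")
    if size not in ALLOWED_SIZES:
        problems.append(f"size must be one of {ALLOWED_SIZES}")

    return problems


print(validate_payload({"prompt": "A calm ocean at sunset", "seconds": "8", "size": "1280x720"}))
print(validate_payload({"prompt": "A calm ocean at sunset", "seconds": 8}))
```

The second call reports a problem because `seconds` is passed as the integer 8 rather than the string "8".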

Response

202 - application/json

Accepted - Task created successfully

task_info (object)
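
Reading the 202 response body might look like the sketch below. The field names (`id`, `status`, `created_at`, `updated_at`) are taken from the example response at the top of this page; the helper `parse_timestamp` is hypothetical, and status values other than `pending` are not documented here, so the code does not assume any.

```python
import json
from datetime import datetime

# Example response body copied from the top of this page.
raw = """
{
  "task_info": {
    "id": "123e4567-e89b-12d3-a456-426614174000",
    "status": "pending",
    "created_at": "2025-09-21T00:00:00.000Z",
    "updated_at": "2025-09-21T00:00:00.000Z"
  }
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing "Z" into an aware datetime."""
    # datetime.fromisoformat only accepts "Z" from Python 3.11, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


task_info = json.loads(raw)["task_info"]
created = parse_timestamp(task_info["created_at"])
print(task_info["id"], task_info["status"], created.isoformat())
```

Keeping the task `id` is presumably what lets you look the task up later, since generation is asynchronous and the initial status is `pending`.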
