Jun 12, 2026

A Video Production Pipeline Built Around a 15-Minute Constraint

How I designed a content system that produces videos quickly enough to be sustainable.

Goal

Create educational videos and shorts consistently without spending hours editing.

The original motivation was supporting content distribution for ClearFit, a career-fit project I was experimenting with. I wanted a workflow I could realistically sustain long-term.

Target:

less than 15 minutes of effort per video
good AI voice quality
good-quality AI-generated imagery
visually engaging output
inexpensive to run locally (I was curious how much quality I could get from roughly 16 GB of VRAM before reaching for paid APIs)

Approach

After testing several text-to-speech systems and image-generation workflows, I settled on a combination that produced acceptable quality on local hardware.

The central piece became a JSON specification describing the entire video.

Each file contains:

title
scenes
video thumbnail
narration
image prompts
animation instructions
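
To make the shape of such a file concrete, here is a minimal sketch of one possible spec and a loader for it. The post does not show the actual schema, so every field name (`thumbnail`, `image_prompt`, `animation`), the animation values, and the `Scene` and `VideoSpec` types are assumptions made for illustration only.

```python
import json
from dataclasses import dataclass, field

# Hypothetical spec: the real field names are not given in the post.
EXAMPLE_SPEC = """
{
  "title": "Example video",
  "thumbnail": {"image_prompt": "bold illustrated title card"},
  "scenes": [
    {
      "narration": "First scene narration.",
      "image_prompt": "flat illustration of a desk",
      "animation": "slow_zoom_in"
    },
    {
      "narration": "Second scene narration.",
      "image_prompt": "flat illustration of a road",
      "animation": "pan_left"
    }
  ]
}
"""


@dataclass
class Scene:
    narration: str           # text handed to the text-to-speech step
    image_prompt: str        # prompt handed to the image-generation step
    animation: str = "none"  # assumed: a named background animation


@dataclass
class VideoSpec:
    title: str
    thumbnail_prompt: str
    scenes: list[Scene] = field(default_factory=list)


def parse_spec(text: str) -> VideoSpec:
    """Parse the JSON text and fail early if required parts are missing."""
    raw = json.loads(text)
    scenes = [Scene(**scene) for scene in raw["scenes"]]
    if not scenes:
        raise ValueError("a spec needs at least one scene")
    return VideoSpec(
        title=raw["title"],
        thumbnail_prompt=raw["thumbnail"]["image_prompt"],
        scenes=scenes,
    )


spec = parse_spec(EXAMPLE_SPEC)
print(spec.title, len(spec.scenes))
```

Validating the file up front like this means a typo in a field name surfaces before any slow generation step starts.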

System

The JSON acts as the source of truth.

From there, local scripts:

generate narration
generate images
animate backgrounds
assemble scenes
render the final video
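
The steps above can be sketched as a small planner that walks the spec and lists the artifacts each stage would produce. This is only an outline under stated assumptions: the stage functions are placeholders that return file paths, the file naming is invented, and the actual text-to-speech, image, and animation tools are not named in the post.

```python
from pathlib import Path
from typing import Callable

# A stage takes (scene index, scene dict, work dir) and returns the path of
# the artifact it would write. The bodies are placeholders, not real tools.
Stage = Callable[[int, dict, Path], Path]


def generate_narration(i: int, scene: dict, work: Path) -> Path:
    return work / f"scene_{i:02d}.wav"


def generate_image(i: int, scene: dict, work: Path) -> Path:
    return work / f"scene_{i:02d}.png"


def animate_background(i: int, scene: dict, work: Path) -> Path:
    return work / f"scene_{i:02d}_bg.mp4"


# Per-scene stages, in the order they are listed in the text.
SCENE_STAGES: dict[str, Stage] = {
    "narration": generate_narration,
    "image": generate_image,
    "background": animate_background,
}


def plan_render(spec: dict, work: Path) -> dict:
    """Return the artifacts a full render would produce, scene by scene."""
    scenes = []
    for i, scene in enumerate(spec["scenes"]):
        artifacts = {name: stage(i, scene, work) for name, stage in SCENE_STAGES.items()}
        # Assembling a scene combines the artifacts above into one clip.
        artifacts["clip"] = work / f"scene_{i:02d}_clip.mp4"
        scenes.append(artifacts)
    # The final render concatenates the per-scene clips.
    return {"scenes": scenes, "final": work / "final.mp4"}


plan = plan_render({"scenes": [{}, {}]}, Path("build"))
for scene_artifacts in plan["scenes"]:
    print(scene_artifacts["clip"])
print(plan["final"])
```

Keeping each stage as a separate function with a file as its output makes it possible to rerun one step, for example regenerating a single image, without repeating the others.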

A single specification can produce both long-form videos and shorts.
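
The post does not say how the two formats are derived from one file. One plausible approach, sketched here purely as an assumption, is to render the same spec through different output profiles that vary frame size and how many scenes are kept. The `OutputProfile` type, the 1920x1080 and 1080x1920 sizes, and the three-scene limit are illustrative choices, not details from the original system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OutputProfile:
    name: str
    width: int
    height: int
    max_scenes: Optional[int]  # None keeps every scene


# Assumed profiles: landscape for long-form, portrait for shorts.
LONG_FORM = OutputProfile("long", 1920, 1080, None)
SHORT = OutputProfile("short", 1080, 1920, 3)


def select_scenes(spec: dict, profile: OutputProfile) -> list:
    """Pick the scenes a given profile would render from the shared spec."""
    scenes = spec["scenes"]
    if profile.max_scenes is None:
        return scenes
    return scenes[: profile.max_scenes]


example = {"scenes": [{"narration": f"Scene {n}"} for n in range(5)]}
for profile in (LONG_FORM, SHORT):
    chosen = select_scenes(example, profile)
    print(profile.name, f"{profile.width}x{profile.height}", len(chosen))
```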

Result

The workflow is now simple enough that a new video starts as a JSON file and can be rendered with a few commands.

More importantly, the production process fits the original constraint: creating content quickly, inexpensively, and with little ongoing effort.

SystemClarity
