How to Create a Video With the JSONClip API Using cURL and Hosted Images or Videos
A complete JSONClip API tutorial for developers who want to render videos with cURL and stable hosted media URLs, with clear examples for scenes, overlays, audio, captions, effects, responses, and troubleshooting.
This is the cleanest API path in JSONClip: send one JSON request to `POST https://api.jsonclip.com/render?sync=1`, reference downloadable HTTP(S) media URLs, and get back a final `movie_url`. If your images, videos, and audio are already hosted somewhere stable, this is usually the first API workflow you should ship.
Why start with hosted media URLs? Because it removes an entire class of complexity. You do not have to think about multipart uploads, basename matching, or temporary local files. You can focus on the actual render contract: format, scenes, overlays, audio, effects, captions, and the final result shape.
Tutorial map
These guides are meant to work together. Start with the article that matches your current workflow, then use the others when you move from manual setup into repeatable automation.
- Editor tutorial for the visual workflow.
- Hosted API tutorial for plain JSON and hosted URLs.
- Local upload tutorial for multipart uploads with files from your machine.
- n8n tutorial for workflow automation with the HTTP Request node.
- Make.com tutorial for scenario-driven automation.
- Zapier tutorial for Webhooks by Zapier flows.
Who this guide is for
- Developers who already have media in S3, Cloudinary, CDN storage, CMS storage, or stable object storage.
- Automation builders who can compute public URLs before rendering.
- Teams that want a predictable cURL example before they move to n8n, Make.com, or Zapier.
- Anyone who wants the lowest-friction JSONClip API workflow.
The render model in one minute
JSONClip works best when you think in layers, not in vague editor gestures. A render request has a format, a scene list, optional overlays, optional audio, optional effects, and optional captions. That separation matters because it keeps the workflow legible whether you are clicking in the editor, sending cURL, or calling the API from an automation tool.
| Layer | What it controls | Why it matters |
|---|---|---|
| Format | Width, height, FPS, background color | If format is unclear, everything downstream gets harder, especially captions and text fit. |
| Scenes | The base images or videos | Treat scenes as the backbone. If scene order is wrong, every overlay, effect, and audio cue inherits the mistake. |
| Overlays | Text, logos, sticker-like layers | Overlays carry the messaging. They should be positioned with intent, not added as a last-minute afterthought. |
| Audio | Voiceover, music, sound cues | Good video feels finished because the audio is managed carefully, not because the visuals are fancy. |
| Effects and transitions | Motion treatment and continuity | Effects are there to reinforce pacing, not to rescue weak structure. |
| Captions | Subtitle-style bottom text or inline cues | Captions should stay readable on mobile and should match the spoken pacing. |
The single rule that makes this guide work
Every `src` in JSON mode must be a downloadable URL. That means the JSONClip renderer has to be able to fetch it on its own. A local workstation path like `/Users/me/Desktop/video.mp4` is not a downloadable URL. A `file://` path is not a hosted media URL either. If the asset is only on your machine, use the multipart guide instead.
That sounds strict, but it keeps the request honest. If the renderer cannot fetch the media independently, the render cannot be reproduced later. Reproducibility matters more than convenience once you move beyond experiments.
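The rule above can be checked mechanically before a request is ever sent. The following is a minimal sketch, not part of JSONClip itself: `is_hosted_url` is a hypothetical helper name, and the check is purely syntactic, so it catches local paths and `file://` references but cannot prove that the renderer can actually fetch the asset.

```shell
# Returns 0 when the argument looks like an http(s) URL, 1 otherwise.
# Syntactic check only: it does not contact the server.
is_hosted_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try it against one hosted URL and the two local forms mentioned above.
for src in \
  "https://cdn.example.com/media/cover.jpg" \
  "/Users/me/Desktop/video.mp4" \
  "file:///Users/me/Desktop/video.mp4"; do
  if is_hosted_url "$src"; then
    echo "ok       $src"
  else
    echo "rejected $src"
  fi
done
```

Running a check like this over every `src` before calling the API keeps the "local path in JSON mode" mistake out of the render logs entirely.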
Your first successful request should be boring
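Start with one image, one short duration, and no fancy layers. You are not trying to prove how much JSONClip can do. You are trying to prove the request path, the API key, the render endpoint, and the fact that the final `movie_url` comes back as expected. The request below shows the full payload shape with every optional layer filled in; for the very first call, trim it down to a single image scene and add the other layers back one at a time.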
curl -sS -X POST 'https://api.jsonclip.com/render?sync=1' \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  --data @- <<'JSON'
{
  "env": "prod",
  "movie": {
    "format": { "width": 720, "height": 1280, "fps": 30, "background_color": "#000000" },
    "scenes": [
      { "type": "image", "src": "https://cdn.example.com/media/cover.jpg", "duration_ms": 2000 },
      { "type": "video", "src": "https://cdn.example.com/media/b-roll.mp4", "duration_ms": 3000 }
    ],
    "overlays": [
      {
        "type": "text",
        "text": "Your launch headline",
        "from_ms": 150,
        "to_ms": 2200,
        "position_px": { "x": 360, "y": 180 },
        "width_px": 560,
        "style": { "font": "Avenir Next", "size_px": 64, "bold": true, "align": "center", "color": "#ffffff" },
        "stroke": { "color": "#000000", "width_px": 5 }
      }
    ],
    "audio": [
      { "src": "https://cdn.example.com/audio/music.mp3", "role": "music", "from_ms": 0, "to_ms": 5000, "fade_out_ms": 350 }
    ],
    "effects": [
      { "type": "zoom_in", "from_ms": 0, "to_ms": 1800, "settings": { "strength": 1.1 } }
    ],
    "captions": {
      "style": "bold_center",
      "cues": [
        { "from_ms": 0, "to_ms": 1500, "text": "Launch faster" },
        { "from_ms": 1500, "to_ms": 3000, "text": "Stay on brand" }
      ]
    }
  }
}
JSON
Once the minimal request works, you can add more scenes, overlays, audio, captions, effects, and transitions. But do not skip the boring first step. If the first request already mixes three asset types, two effects, and a caption file, you will not know what broke if the response fails.
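Taken literally, the boring first request has one format, one image scene, and nothing else. Here is a minimal sketch of that request, assuming the optional layers can simply be omitted, as the render model describes them as optional. The `JSONCLIP_API_KEY` environment variable and the `IMAGE_URL` value are assumptions for illustration; when the key is not set, the script only prints the payload so it can be inspected without spending credits.

```shell
# Hypothetical placeholder; point this at an image your renderer can fetch.
IMAGE_URL="https://cdn.example.com/media/cover.jpg"

# Smallest payload: one format, one image scene, no optional layers.
PAYLOAD=$(cat <<JSON
{
  "env": "prod",
  "movie": {
    "format": { "width": 720, "height": 1280, "fps": 30, "background_color": "#000000" },
    "scenes": [
      { "type": "image", "src": "${IMAGE_URL}", "duration_ms": 2000 }
    ]
  }
}
JSON
)

# Send only when a key is present; otherwise do a dry run and print the body.
if [ -n "${JSONCLIP_API_KEY:-}" ]; then
  curl -sS -X POST 'https://api.jsonclip.com/render?sync=1' \
    -H "Content-Type: application/json" \
    -H "X-API-Key: ${JSONCLIP_API_KEY}" \
    --data "$PAYLOAD"
else
  printf '%s\n' "$PAYLOAD"
fi
```

If this version returns a `movie_url`, every later failure can be attributed to whichever layer was added last.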
What each part of the request is doing
| Part | Why it exists | What to verify |
|---|---|---|
| `env` | Selects the render environment | Use `prod` for the hosted API flow unless you intentionally target a local worker. |
| `format` | Defines size and FPS | Choose dimensions that match the actual destination platform. |
| `scenes` | Builds the visual backbone | Every `src` must be a real downloadable asset URL. |
| `overlays` | Carries message layers on top | Use readable widths and positions. |
| `audio` | Adds music or voiceover | Make sure each track has sensible timing. |
| `effects` | Shapes motion and emphasis | Use them to support pacing, not to hide weak structure. |
| `captions` | Controls subtitle-like bottom text | Keep lines readable and aligned with the spoken rhythm. |
A richer example with image, video, overlay, music, and captions
curl -sS -X POST 'https://api.jsonclip.com/render?sync=1' \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  --data @- <<'JSON'
{
  "env": "prod",
  "movie": {
    "format": { "width": 1080, "height": 1920, "fps": 30, "background_color": "#000000" },
    "scenes": [
      { "type": "image", "src": "https://cdn.example.com/media/opener.jpg", "duration_ms": 1700, "transition_out": { "type": "blur", "duration_ms": 260 } },
      { "type": "video", "src": "https://cdn.example.com/media/demo.mp4", "duration_ms": 3200 },
      { "type": "image", "src": "https://cdn.example.com/media/closer.jpg", "duration_ms": 1800 }
    ],
    "overlays": [
      {
        "type": "text",
        "text": "Ship a vertical demo in one request",
        "from_ms": 120,
        "to_ms": 2000,
        "position_px": { "x": 540, "y": 210 },
        "width_px": 860,
        "style": { "font": "Avenir Next", "size_px": 82, "bold": true, "align": "center", "color": "#ffffff" },
        "stroke": { "color": "#000000", "width_px": 5 }
      }
    ],
    "audio": [
      { "src": "https://cdn.example.com/audio/voice.mp3", "role": "voiceover", "from_ms": 0, "to_ms": 6700 },
      { "src": "https://cdn.example.com/audio/music.mp3", "role": "music", "from_ms": 0, "to_ms": 6700, "volume_db": -11, "duck_under_voice": true }
    ],
    "effects": [
      { "type": "zoom_in", "from_ms": 0, "to_ms": 1500, "settings": { "strength": 1.1 } },
      { "type": "fade_out", "from_ms": 6200, "to_ms": 6700 }
    ],
    "captions": {
      "style": "bold_center",
      "cues": [
        { "from_ms": 0, "to_ms": 1600, "text": "Your opener should say one thing clearly." },
        { "from_ms": 1600, "to_ms": 4200, "text": "Then let the product or idea do the work." },
        { "from_ms": 4200, "to_ms": 6400, "text": "Close with a simple call to action." }
      ]
    }
  }
}
JSON
How to think about hosted URLs before you call the API
The best hosted URLs are boring. They point straight to the asset, they return a stable content type, they do not require browser cookies, and they do not expire in a minute. A signed URL can work if it stays valid long enough for the render job, but a durable URL is better for repeatability.
If you are pulling media out of a CMS or marketing platform, test the URL directly with `curl -I` before you put it in the render request. That one small preflight step catches a surprising number of bad assumptions: HTML landing pages instead of media, redirects that require cookies, and content that is technically public but rate-limited in a way the renderer cannot tolerate.
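The `curl -I` preflight can be scripted so it runs for every asset. This is a rough sketch under a few assumptions: `ASSET_URL` is a hypothetical environment variable, the helper names are made up for this example, and the check treats any `image/*`, `video/*`, or `audio/*` content type as media. Some storage backends serve media as `application/octet-stream`, which this sketch would flag as suspicious, so adjust the pattern to match your host.

```shell
# Succeeds only for content types that look like media rather than a web page.
is_media_type() {
  case "$1" in
    image/*|video/*|audio/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sends a HEAD request, follows redirects, and prints the final Content-Type.
final_content_type() {
  curl -sSIL --max-time 15 -o /dev/null -w '%{content_type}' "$1"
}

# Usage: ASSET_URL=https://cdn.example.com/media/cover.jpg sh preflight.sh
if [ -n "${ASSET_URL:-}" ]; then
  ctype=$(final_content_type "$ASSET_URL")
  if is_media_type "$ctype"; then
    echo "ok: $ctype"
  else
    echo "suspicious content type: ${ctype:-none}" >&2
  fi
fi
```

A `text/html` result almost always means the URL points at a landing page or a login redirect instead of the media file itself.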
What a good response looks like
{
  "ok": true,
  "job_id": "01HXYZEXAMPLE",
  "movie_url": "https://renderer.jsonclip.com/jsonclip/movies/example.mp4",
  "duration_ms": 5000,
  "credits_used": 42
}
For the first integration, sync mode is the right default. It is simpler to debug because you get the outcome in one response. Once traffic grows, you can switch some flows to async, but that should happen after the base request is stable and observable.
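In a script, the one field you usually need from that response is `movie_url`. The sketch below pulls it out with `sed`, which keeps the example dependency-free but is fragile: it assumes the URL contains no escaped quotes. `extract_movie_url` is a hypothetical helper name, and the `RESPONSE` variable stands in for real `curl` output using the sample response from this guide.

```shell
# Prints the value of "movie_url" from a JSON response read on stdin.
# sed-based and therefore fragile; assumes the URL has no escaped quotes.
extract_movie_url() {
  sed -n 's/.*"movie_url"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response copied from this guide, standing in for real curl output.
RESPONSE='{
  "ok": true,
  "job_id": "01HXYZEXAMPLE",
  "movie_url": "https://renderer.jsonclip.com/jsonclip/movies/example.mp4",
  "duration_ms": 5000,
  "credits_used": 42
}'

MOVIE_URL=$(printf '%s\n' "$RESPONSE" | extract_movie_url)
if [ -n "$MOVIE_URL" ]; then
  echo "render finished: $MOVIE_URL"
else
  echo "no movie_url in response" >&2
fi
```

If `jq` is available on the machine, `jq -r .movie_url` is the sturdier choice, since it parses the JSON instead of pattern-matching it.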
Common design choices that make the first API result look better
- Keep the opener short and clear. A strong 1.5 to 2.0 second opener is better than a vague four-second opener.
- Use overlays for intentional messaging, not for everything. If the scene itself explains the point, let it breathe.
- Use one music track and optional voiceover before you layer additional sound cues.
- Use captions when the spoken content matters. Do not add bottom text just because every other short-form video does it.
- Keep the first version visually plain enough that you can debug structure fast.
When hosted URLs are the wrong choice
If your assets are not already reachable through stable URLs, do not force the hosted JSON flow. Use the local upload tutorial instead. Multipart upload exists for a reason. Trying to fake hosted URLs with local paths or unstable temporary links usually wastes more time than it saves.
The same rule applies when you are automating from tools like n8n, Make.com, or Zapier. If the upstream tool already knows a stable public URL, the JSON flow stays elegant. If the tool only has local or transient binary data, you need a different request strategy.
Preflight checklist before you call the render endpoint
| Check | Good answer | Bad answer |
|---|---|---|
| API key | Stored and passed in `X-API-Key` | Hard-coded into a random shell script you cannot rotate later |
| Scene URLs | Direct downloadable media URLs | Browser pages, `file://` paths, or URLs that need cookies |
| Text | Short, intentional, readable | Paragraph-sized overlay copy |
| Audio timing | Trimmed to the actual reel length | Full song dropped in without checking the exit |
| Sync behavior | Use `?sync=1` while debugging | Expecting a final `movie_url` from an async flow without polling |
| Project scope | Start with a minimal reel | Try to automate a whole campaign before the first request succeeds |
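The API key row is the easiest one to get right in a script. Below is a small sketch, assuming the key lives in an environment variable named `JSONCLIP_API_KEY` (the name is an assumption, not something the API requires). `require_api_key` and `render_file` are hypothetical helpers; the second one posts a JSON file to the sync endpoint used throughout this guide.

```shell
# Fails fast when the key is missing instead of sending an unauthenticated request.
require_api_key() {
  if [ -z "${JSONCLIP_API_KEY:-}" ]; then
    echo "JSONCLIP_API_KEY is not set" >&2
    return 1
  fi
}

# Renders the JSON file given as $1; the key never appears in the script itself.
render_file() {
  require_api_key || return 1
  curl -sS -X POST 'https://api.jsonclip.com/render?sync=1' \
    -H "Content-Type: application/json" \
    -H "X-API-Key: ${JSONCLIP_API_KEY}" \
    --data-binary @"$1"
}

# Usage: JSONCLIP_API_KEY=... ; render_file payload.json
```

Because the key is read at call time, rotating it means changing one environment value rather than hunting through shell scripts.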
Troubleshooting
Most first attempts fail for ordinary reasons, not exotic ones. The fix is usually to simplify the request, verify the media sources, and add complexity back in once the minimal version works.
| What you see | What it usually means | What to do |
|---|---|---|
| The API returns an error before rendering starts | Your JSON shape or media references are wrong | Validate the body, confirm your header is `X-API-Key`, and make sure every `src` is either a downloadable URL or a basename uploaded in multipart mode. |
| The final video renders but the pacing feels wrong | Scene durations, effect timing, or audio trim are off | Shorten the first version of the workflow. Get a clean five-second or eight-second result before you scale to a longer reel. |
| The video looks fine in one environment and wrong in another | Preview parity or unsupported media format issue | Stick to stable formats and verify with the final render, not only with a browser preview. |
| The output is technically correct but hard to read | Typography, caption size, or spacing is too aggressive | Reduce text density. Good automation usually starts with simpler copy than teams expect. |
| The API says the asset is missing or unreadable | The URL is not actually downloadable by the renderer | Test it independently with curl and confirm it returns the media bytes, not HTML. |
| The output video renders but some scenes are blank | One or more URLs redirected or expired during fetch | Use more stable media hosting or persist the assets earlier in the pipeline. |
| The response never returns the final video URL in time | Your workflow is too heavy for sync mode right now | Switch to async after you confirm the render itself is healthy. |
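When a request fails, it helps to see the HTTP status and the response body side by side, and to cap how long a sync call may hang. The following is one possible debugging wrapper, built only from standard `curl` options. The 120-second timeout, the `JSONCLIP_API_KEY` variable, and the helper name `split_status` are assumptions for illustration; this guide does not document which status codes or error bodies the API returns, so the sketch just prints whatever comes back.

```shell
# Splits "body<newline>STATUS" (as produced by curl -w '\n%{http_code}')
# into the HTTP_BODY and HTTP_STATUS variables.
split_status() {
  HTTP_STATUS=$(printf '%s\n' "$1" | tail -n 1)
  HTTP_BODY=$(printf '%s\n' "$1" | sed '$d')
}

# Usage: JSONCLIP_API_KEY=... sh debug_render.sh payload.json
if [ -n "${JSONCLIP_API_KEY:-}" ] && [ -f "${1:-}" ]; then
  raw=$(curl -sS --max-time 120 -X POST 'https://api.jsonclip.com/render?sync=1' \
    -H "Content-Type: application/json" \
    -H "X-API-Key: ${JSONCLIP_API_KEY}" \
    --data-binary @"$1" \
    -w '\n%{http_code}')
  split_status "$raw"
  echo "status: $HTTP_STATUS"
  echo "body:   $HTTP_BODY"
fi
```

A timeout from `curl` here points at the last table row (the job is too heavy for sync mode), while an immediate error status with a body points at the first row (payload shape or media references).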
How this guide connects to automation tools
This hosted URL pattern maps almost directly into n8n, Make.com, and Zapier. The reason is simple: those tools are best when they can build a JSON body and send it to one endpoint without also carrying file-upload complexity.
That is why many teams standardize their upstream process around generating or storing stable media URLs before they render. The video step stays simpler, easier to debug, and easier to scale.
FAQ
Can the hosted API read my local disk path? No. Use multipart upload if the media only exists locally.
Should I use sync mode in production? Use it while the workflow is small or when the caller truly needs the final URL immediately. Otherwise async can be cleaner at scale.
Can I mix images and videos in the same request? Yes. The important part is that each asset is reachable and the timing is intentional.
How to reason about a hosted API payload field by field
The fastest way to debug a hosted API request is to stop thinking about it as one giant blob. Treat it as a set of layers with separate responsibilities. Once you do that, most failures become ordinary: the format was wrong, a scene URL was bad, the overlay text was too dense, the audio timing was sloppy, or the effect window did not match the cut.
| Field group | What to verify first | What teams usually overcomplicate |
|---|---|---|
| Format | Width, height, FPS, and channel fit | Over-optimizing FPS before the composition works |
| Scenes | Each scene source resolves correctly in hosted JSON mode | Adding too many scenes before the first request proves the path |
| Overlays | Text width, contrast, and timing | Decorative styling before the copy is clear |
| Audio | Track role and trim window | Layering multiple tracks before one clean track works |
| Effects | Whether the effect has a job | Using motion as a substitute for structure |
| Captions | Cue timing and line readability | Treating captions as mandatory even when the video does not need them |
How to version a hosted API render contract
Once a hosted API request starts producing useful videos, give the template a name and a version. That can be as simple as `starter_vertical_v1`, `product_demo_v2`, or `quote_card_landscape_v1`. The naming convention matters because it lets your team discuss the template as a stable object instead of as a vague memory of the JSON.
This also makes change control less emotional. If a new layout is better, call it `v2` and keep `v1` available until the new one proves itself. Teams that overwrite a working request every time someone has a new idea create their own instability.
{
  "template_key": "starter_vertical_v1",
  "campaign_id": "campaign_2048",
  "channel": "reels",
  "request": {
    "...": "JSONClip movie payload here"
  }
}
How to adapt one hosted API tutorial into several channels
A strong template family usually changes by channel more than by brand idea. The hook, visual proof, and CTA can stay conceptually similar while format, text density, and end-card pacing shift. That means you do not need a brand-new request for every destination. You need controlled variants.
| Channel | Typical format | Typical edit change |
|---|---|---|
| Short-form vertical | 720x1280 or 1080x1920 | Keep the opener fast and the text large |
| Landscape explainer | 1280x720 or 1920x1080 | Give the layout more breathing room and wider text blocks |
| Square promo | 1080x1080 | Center the composition and reduce edge-hugging text |
| Story or ad variant | Vertical with clear CTA zone | Protect the closing frame for call-to-action legibility |
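One way to keep channel variants controlled is to derive the `format` object from a channel name instead of editing numbers by hand. Here is a minimal sketch using dimensions from the table above. The channel labels (`vertical`, `landscape`, `square`) and both function names are made up for this example and are not API values; the FPS and background color are carried over from the earlier requests.

```shell
# Prints "WIDTH HEIGHT" for a channel label; labels are illustrative, not API values.
format_for_channel() {
  case "$1" in
    vertical)  echo "1080 1920" ;;
    landscape) echo "1920 1080" ;;
    square)    echo "1080 1080" ;;
    *) echo "unknown channel: $1" >&2; return 1 ;;
  esac
}

# Prints the format object for the render payload of the given channel.
format_json() {
  dims=$(format_for_channel "$1") || return 1
  width=${dims% *}
  height=${dims#* }
  printf '{ "width": %s, "height": %s, "fps": 30, "background_color": "#000000" }\n' \
    "$width" "$height"
}

format_json square
```

Text size, overlay width, and end-card timing still need per-channel values; this only removes the most common copy-paste error, which is mismatched width and height.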
How to decide when to leave hosted API and use automation tooling
The plain hosted API workflow is the correct long-term solution when the caller already has the data and can assemble the request predictably. Move into automation only when a trigger, branch, schedule, or downstream business system genuinely needs orchestration.
That is why these guides connect. Start with hosted API. If the next problem is workflow coordination, step into n8n, Make.com, or Zapier based on where the rest of the business process lives.
How to review a hosted API video before you call it done
The easiest mistake in a hosted API workflow is to stop as soon as the render technically succeeds. A successful render is not the same thing as a useful video. Before you ship, review the video with boring discipline: can a person understand the opener instantly, does each scene stay on screen long enough to make sense, does the audio enter and exit cleanly, and does the close actually tell the viewer what to do next?
This matters even more in automation because the first video is rarely the final goal. The real goal is a repeatable pattern. If the first result works only because you manually tolerated a weak opening, awkward copy density, or a sloppy CTA, the system is not ready to scale. A reusable template needs stronger quality rules than a one-off experiment.
Review the first output at normal speed, then one more time with the sound off, and then once again by jumping through key moments on the timeline. Sound-off review tells you whether the visual structure is carrying its own weight. Scrub review tells you whether the transitions, text timing, and end card are landing where you think they are landing.
| Review pass | What to look for | What usually needs fixing |
|---|---|---|
| Normal playback | Overall rhythm and legibility | Scene durations that are slightly too long or slightly too short |
| Muted playback | Message clarity without audio support | Overlays doing too much work or not enough |
| Scrub review | Cut points, effect windows, caption timing | Transitions or text cues landing a little early or late |
| Mobile-size check | Phone readability | Text that technically fits but is tiring to read |
| Final export review | Parity between idea and delivered file | Subtle issues that were easy to ignore in the build flow |
How to turn one hosted API example into a repeatable template
The healthy way to reuse a hosted API project is to freeze the structure and vary only the data that actually changes. In plain terms, that means you decide which parts are template constants and which parts are runtime variables. Constants usually include format, text style, caption style, transition family, and effect intensity. Variables usually include scene source URLs, headline text, supporting copy, voiceover, music, or the closing CTA.
This distinction is operationally important because it keeps later edits cheap. If your structure and data are mixed together without a rule, every new campaign becomes a mini redesign. If they are separated early, one template can support many outputs with much less rework.
| Template layer | Keep stable when possible | Let it vary when needed |
|---|---|---|
| Canvas | Width, height, FPS, safe margins | Only change for a different destination channel |
| Typography | Font family, general weight, default alignment | Swap only when the brand system truly requires it |
| Motion language | Core transition and effect families | Change only when the creative intent changes |
| Content data | Nothing; never hard-code campaign-specific values into the template | Headlines, asset URLs, captions, and CTA text |
| Distribution | Delivery step shape | Destination channel, notification recipient, or storage path |
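The split between constants and variables can be expressed directly in a shell function: the heredoc holds the frozen structure, and the arguments carry the runtime data. This is a sketch under stated assumptions, not a complete templating approach. `json_escape` and `build_payload` are hypothetical helpers, the layout values are borrowed from the first example in this guide, and the escaping handles only backslashes and double quotes, not newlines or other control characters.

```shell
# Escapes backslashes and double quotes so a value can sit inside a JSON string.
# Minimal on purpose: control characters such as newlines are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Template constants live in the heredoc; only the scene URL ($1) and headline ($2) vary.
build_payload() {
  scene_url=$(json_escape "$1")
  headline=$(json_escape "$2")
  cat <<JSON
{
  "env": "prod",
  "movie": {
    "format": { "width": 720, "height": 1280, "fps": 30, "background_color": "#000000" },
    "scenes": [
      { "type": "image", "src": "${scene_url}", "duration_ms": 2000 }
    ],
    "overlays": [
      {
        "type": "text",
        "text": "${headline}",
        "from_ms": 150,
        "to_ms": 2000,
        "position_px": { "x": 360, "y": 180 },
        "width_px": 560,
        "style": { "font": "Avenir Next", "size_px": 64, "bold": true, "align": "center", "color": "#ffffff" }
      }
    ]
  }
}
JSON
}

build_payload "https://cdn.example.com/media/cover.jpg" 'Say "hello" to v2'
```

The output can be piped straight into `curl --data @-`. For anything beyond simple strings, a real JSON tool is a safer way to fill the variables than shell interpolation.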
What to log so debugging stays cheap
Every serious workflow needs enough logs to answer four questions later: what payload did we send, what assets did we reference, what result came back, and which business record did that result belong to? Teams often log too little and then start guessing. Guessing is expensive.
For JSONClip, the minimum useful log record is usually a request identifier, the project or business record identifier, the format, the main asset references, the final `movie_url`, and any credits or duration metadata returned by the render. If you can replay or inspect a failed run from that record, your observability is probably good enough for this stage.
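From a shell script, a record like this can be appended as one JSON object per line. The sketch below is one possible shape, with several assumptions: `log_render` and the `RENDER_LOG` variable are hypothetical names, the log path defaults to a file in the temp directory, and the arguments are assumed to be free of quotes and backslashes. It writes a reduced set of fields using the same field names as the sample record that follows.

```shell
# Where to append records; override RENDER_LOG to log somewhere durable.
RENDER_LOG="${RENDER_LOG:-${TMPDIR:-/tmp}/jsonclip_renders.jsonl}"

# Appends one JSON line per render.
# $1 template key, $2 source record id, $3 movie_url, $4 duration_ms, $5 credits_used
# Arguments are assumed to be already safe for JSON (no quotes or backslashes).
log_render() {
  printf '{"template_key":"%s","source_record_id":"%s","movie_url":"%s","duration_ms":%s,"credits_used":%s}\n' \
    "$1" "$2" "$3" "$4" "$5" >> "$RENDER_LOG"
}

log_render "starter_vertical_v1" "campaign_2048" \
  "https://renderer.jsonclip.com/jsonclip/movies/example.mp4" 6100 42
```

One line per render keeps the log easy to search with `grep` by `source_record_id` when someone asks which video belongs to which business record.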
{
  "template_key": "starter_vertical_v1",
  "source_record_id": "campaign_2048",
  "format": { "width": 720, "height": 1280, "fps": 30 },
  "primary_assets": [
    "cover.jpg",
    "demo.mp4",
    "voice.mp3"
  ],
  "movie_url": "https://renderer.jsonclip.com/jsonclip/movies/example.mp4",
  "duration_ms": 6100,
  "credits_used": 42
}
A practical shipping checklist
- The opener is readable in under a second.
- The text density matches the actual pace of the cut.
- No scene exists only because an asset was available.
- Music and voiceover timing make sense together.
- Effects and transitions reinforce pacing instead of hiding weak structure.
- The closing frame clearly tells the viewer what happens next.
- The request or project can be rerun without manual mystery steps.
- The workflow owner knows whether the next step is hosted JSON, multipart upload, or a workflow tool such as n8n, Make.com, or Zapier.
Conclusion
If your media is already hosted, this is the fastest clean JSONClip API workflow you can ship. One request, one clear payload, one clear result.
If your assets live on a laptop or arrive as uploaded files instead of stable URLs, go next to the multipart local upload tutorial. If the next step is automation, move to n8n, Make.com, or Zapier.
That is the practical bar for a good JSONClip workflow: easy to read, easy to rerun, easy to debug, and easy to hand off to the next person or the next automation layer.