DevLog 250517 — Full Stack Vision for Sovereign AI Creation

> Log Date: 250517

The goal today: architect a full-stack, open-source, Docker-powered video generation pipeline that lets me make art, animation, music, and narration — all from prompts — and upload it to IPFS and optionally mint as NFTs. Total sovereignty, total freedom.

It’s important to note that I am an artist, and I never use AI for my prints, canvases, or finished art pieces. However, in the interest of learning — and for fun — I want to explore what it would be like to train a machine on my own artwork and use that as a launch pad for videos, stories, and tutorial-style projects. This is a creative experiment in machine learning, not a replacement for my actual art practice.

This is the first time I’ve written out the full ecosystem from prompt to published. It’s more than a wishlist. It’s an execution map — one I’ll walk through piece by piece as I build the tools, train the models, and push every part to my local Forge machine. Each component is free, local, and modular. I’m keeping SaaS out. I’m keeping gatekeepers out.


Hardware Foundation

The T3600 workstation — affectionately named the Forge — is my platform. With a 2TB SSD, an RTX 3060 with 12GB of VRAM, and Ubuntu dual-booted, it’s built for the kind of edge computing I want to do. Once the BSOD issues are resolved, I’ll finish stacking the rest of the build.


Toolchain Layers

I’m organizing the pipeline into clear layers: generation, sound, assembly, and automation. Each tool serves a focused role, and they all integrate via Docker and n8n. No fluff. Just what’s needed to render visions and ship them to the world.


Prompt-to-Video Stack

Ollama           → LLM for scripts and story prompts  
Stable Diffusion → Art generation (A1111 or ComfyUI)  
AnimateDiff      → Frame-based animation  
TTS              → Coqui or Tortoise for narration  
Whisper          → Transcription  
LMMS + Magenta   → Soundtrack  
ffmpeg           → Assembly  
n8n + PostgreSQL → Orchestration  
Web3.Storage     → IPFS backup  
ethers.js        → Optional NFT minting  
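To make the first layer of the stack concrete, here’s a minimal sketch of calling Ollama’s local REST API to turn a prompt into a narration script. It assumes a default Ollama install listening on localhost:11434; the model name (`llama3`) and the prompt wording are placeholders, not decisions I’ve locked in.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def build_script_request(prompt: str, model: str = "llama3") -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": f"Write a short video narration script about: {prompt}",
        "stream": False,  # return one JSON object instead of a token stream
    }

def generate_script(prompt: str, model: str = "llama3") -> str:
    """POST the prompt to a locally running Ollama instance."""
    payload = json.dumps(build_script_request(prompt, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Swapping the model is a one-argument change, which is the point of keeping everything local and modular.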

Automation Pipeline

n8n is the backbone. Every stage will be callable from the terminal, a web form, or a webhook.
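Kicking off a run from the terminal could look like the sketch below: a small script that fires an n8n Webhook node to start the workflow. The webhook path (`forge-pipeline`) and the payload fields are hypothetical — they’ll be whatever the actual workflow defines.

```python
import json
import urllib.request

# Hypothetical n8n Webhook-node URL; the path is set by the workflow itself.
N8N_WEBHOOK = "http://localhost:5678/webhook/forge-pipeline"

def build_job(prompt: str, mint_nft: bool = False) -> dict:
    """Payload consumed by the first node of the n8n workflow."""
    return {"prompt": prompt, "mint_nft": mint_nft}

def submit_job(prompt: str, mint_nft: bool = False) -> None:
    """Fire the webhook to kick off the whole pipeline (requires n8n running)."""
    data = json.dumps(build_job(prompt, mint_nft)).encode()
    req = urllib.request.Request(
        N8N_WEBHOOK, data=data, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)
```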


1. Submit prompt
2. Script generated (Ollama)
3. Art rendered (txt2img → AnimateDiff)
4. Voice + subtitles generated
5. Music added (LMMS/Magenta)
6. Video stitched (ffmpeg)
7. Uploaded (YouTube + IPFS)
8. Optional NFT minted with CID metadata
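Step 6 is plain ffmpeg. A sketch of the stitch command, assuming numbered PNG frames from AnimateDiff and a single narration track; the paths, frame pattern, and frame rate are placeholders:

```python
def build_stitch_cmd(frames_dir: str, audio_path: str, out_path: str,
                     fps: int = 12) -> list[str]:
    """Assemble an ffmpeg argv that muxes numbered frames with an audio track."""
    return [
        "ffmpeg",
        "-framerate", str(fps),               # input rate for the image sequence
        "-i", f"{frames_dir}/frame_%04d.png",
        "-i", audio_path,
        "-c:v", "libx264",                    # widely compatible H.264 video
        "-pix_fmt", "yuv420p",                # required by most players
        "-c:a", "aac",
        "-shortest",                          # stop when the shorter stream ends
        out_path,
    ]

# Hand the list to subprocess.run(cmd, check=True) once ffmpeg is on the PATH.
```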

Next Steps

1. Resolve the Forge’s BSOD issues and finish stacking the build
2. Stand up each layer as a Docker container, starting with Ollama and Stable Diffusion
3. Train a model on my own artwork — the creative experiment described above
4. Wire every stage into n8n so the whole pipeline runs from a single prompt

This stack isn’t just functional — it’s a creative liberation engine. It’s built from scratch to ensure that no platform owns my output, no API rate limit slows me down, and no trend dictates my aesthetic.

Signed,
Lorelei Noble
