Applied AI · Media · 2025

An automated object-identification and metadata-tagging pipeline for enterprise media asset management.

A scalable microservice that watches video uploads, runs computer-vision models and an LLM to detect objects, and writes contextual descriptions back into the media asset management (MAM) system, turning terabytes of unlogged footage into searchable, reusable assets.

Client: Enterprise media operation
Vertical: Media asset management
Region: Remote engagement
Duration: Contract build
Team: Solo build
Role: AI Engineer & Backend Developer
Overview

What we walked into.

The client runs an enterprise media operation with terabytes of video footage living across Media Asset Management (MAM) platforms like Iconik and CatDV. Most of that footage was effectively invisible: searchable only by filename and by whatever a human had manually logged, which for the bulk of the archive was nothing.

Post-production teams couldn't find the footage they already owned. Manual logging doesn't scale to terabytes of uploads, so a large share of the archive went unused. The system had to identify what was actually in each video and make it discoverable — without adding work for editors, and without bottlenecking on a single synchronous pipeline.

Approach

What we built.

01

Asynchronous processing pipeline

A Python microservice, containerised with Docker and coordinated over RabbitMQ, consumes video uploads as they arrive and processes them asynchronously — so ingest never blocks on inference and the system scales horizontally under load.
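As a rough illustration of the consume-and-acknowledge loop described above, here is a minimal sketch. It assumes the `pika` client library, a JSON message with `asset_id` and `video_url` fields, and a caller-supplied `process` function. None of these come from the actual build. They are placeholders for whatever the real service uses.

```python
import json
from typing import Callable

# Hypothetical message schema; the real upload event may carry other fields.
REQUIRED_FIELDS = frozenset({"asset_id", "video_url"})


def parse_upload_event(body: bytes) -> dict:
    """Decode an upload message and check that the assumed fields are present."""
    event = json.loads(body)
    if not isinstance(event, dict):
        raise ValueError("upload event must be a JSON object")
    missing = REQUIRED_FIELDS.difference(event)
    if missing:
        raise ValueError(f"upload event missing fields: {sorted(missing)}")
    return event


def make_callback(process: Callable[[dict], None]):
    """Wrap `process` in a pika-style callback that acks on success only."""

    def on_message(ch, method, properties, body):
        try:
            process(parse_upload_event(body))
        except Exception:
            # Reject without requeue so a bad message cannot loop forever.
            ch.basic_nack(delivery_tag=method.delivery_tag, requeue=False)
        else:
            ch.basic_ack(delivery_tag=method.delivery_tag)

    return on_message


def run_consumer(amqp_url: str, queue_name: str, process: Callable[[dict], None]) -> None:
    """Block and consume uploads one at a time (requires `pip install pika`)."""
    import pika  # imported lazily so the helpers above work without it

    connection = pika.BlockingConnection(pika.URLParameters(amqp_url))
    channel = connection.channel()
    channel.queue_declare(queue=queue_name, durable=True)
    # One unacknowledged message per worker, so slow inference on one
    # video does not starve the other workers.
    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(queue=queue_name, on_message_callback=make_callback(process))
    channel.start_consuming()
```

Running several copies of this worker against the same queue is one way to get the horizontal scaling mentioned above, since RabbitMQ distributes messages across the consumers of a queue.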

02

Computer vision + LLM tagging

Custom computer-vision models (Roboflow) detect objects in the footage, and an LLM turns those detections into contextual, human-readable descriptions — the kind of metadata an editor would actually search for.
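The hand-off from detections to a description can be sketched as below. This is an assumption-laden illustration, not the production code: the `Detection` shape, the 0.5 confidence threshold, and the prompt wording are all hypothetical, and the LLM call itself is left out because the source does not name a provider.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Detection:
    """One detected object in one sampled frame (hypothetical shape)."""
    label: str
    confidence: float
    timestamp_s: float


# label -> (first seen, last seen, number of confident detections)
Spans = Dict[str, Tuple[float, float, int]]


def summarise_detections(detections: Iterable[Detection],
                         min_confidence: float = 0.5) -> Spans:
    """Drop low-confidence hits and collapse the rest into per-label time spans."""
    spans: Spans = {}
    for d in detections:
        if d.confidence < min_confidence:
            continue
        first, last, hits = spans.get(d.label, (d.timestamp_s, d.timestamp_s, 0))
        spans[d.label] = (min(first, d.timestamp_s), max(last, d.timestamp_s), hits + 1)
    return spans


def build_description_prompt(spans: Spans) -> str:
    """Turn the summary into plain text an LLM can describe."""
    lines = [
        f"- {label}: {hits} detections between {first:.1f}s and {last:.1f}s"
        for label, (first, last, hits) in sorted(spans.items())
    ]
    return (
        "Describe this video clip in two sentences for a footage search index.\n"
        "Detected objects:\n" + "\n".join(lines)
    )


if __name__ == "__main__":
    sample = [
        Detection("car", 0.9, 1.0),
        Detection("car", 0.8, 4.0),
        Detection("dog", 0.3, 2.0),  # below threshold, ignored
    ]
    print(build_description_prompt(summarise_detections(sample)))
```

Summarising before prompting keeps the prompt short for long clips and stops low-confidence detections from leaking into the description.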

03

Write-back to the MAM

Generated objects and descriptions are written straight back into the MAM via its API, so the tags live where the footage lives, inside Iconik or CatDV, with no separate tool to learn.
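One way to keep the write-back independent of which MAM sits behind it is a small adapter interface, sketched below. Everything here is hypothetical: `MamClient`, `write_metadata`, and the `ai_objects` / `ai_description` field names are invented for illustration and do not reflect the real Iconik or CatDV APIs, whose endpoints and metadata schemas would live inside each concrete adapter.

```python
from typing import Dict, Iterable, Protocol


class MamClient(Protocol):
    """Hypothetical adapter interface; one implementation per MAM."""

    def write_metadata(self, asset_id: str, fields: dict) -> None: ...


def build_metadata_fields(labels: Iterable[str], description: str) -> dict:
    """Normalise pipeline output into one payload (field names are assumptions)."""
    return {
        "ai_objects": sorted(set(labels)),  # de-duplicated, stable order
        "ai_description": description.strip(),
    }


class InMemoryMam:
    """Stand-in client for local testing; a real adapter would call the MAM's API."""

    def __init__(self) -> None:
        self.assets: Dict[str, dict] = {}

    def write_metadata(self, asset_id: str, fields: dict) -> None:
        # Merge into existing metadata so human-entered fields are left untouched.
        self.assets.setdefault(asset_id, {}).update(fields)


def write_back(client: MamClient, asset_id: str,
               labels: Iterable[str], description: str) -> dict:
    """Build the payload and hand it to whichever MAM adapter was supplied."""
    fields = build_metadata_fields(labels, description)
    client.write_metadata(asset_id, fields)
    return fields


if __name__ == "__main__":
    mam = InMemoryMam()
    write_back(mam, "a1", ["car", "dog", "car"], " A car passes a dog. ")
    print(mam.assets["a1"])
```

With this split, supporting a second MAM means adding another adapter class rather than touching the detection or description code.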

Outcome

What shipped.

Manual logging was eliminated. Terabytes of previously unsearchable footage became discoverable by what's actually on screen, letting post-production teams reuse assets they already had and measurably increasing asset utilisation across the archive.

TB+ · Footage made searchable
Async · Video processing
0 · Manual logging
2 · MAM integrations
Stack
  • Python
  • Computer Vision
  • Roboflow
  • LLMs
  • Docker
  • RabbitMQ
  • API Integration
