By Kees van Bemmel, Managing Director, Incentro
[Editor’s note: Today we hear from Incentro, a digital service provider and Google partner, which recently built a digital asset management solution on top of GCP. It combines machine learning services like Cloud Vision and Speech APIs to easily find and tag digital assets, plus Cloud Pub/Sub and Cloud Functions for an automated, serverless solution. Read on to learn how they did it.]
Here at Incentro, we have a large customer base among media and publishing companies. Recently, we noticed that our customers struggle with storing and searching for digital media assets in their archives. It’s a cumbersome process that involves a lot of manual tagging. As a result, the videos are often stored without being properly tagged, making it nearly impossible to find and reuse these assets afterwards.
To eliminate this sort of manual labour and to generate more business value from these expensive video assets, we sought to create a solution that would take care of mundane tasks like tagging photos and videos. This solution is called Segona Media (https://segona.io/media), and lets our customers store assets and tag and index their digital assets automatically.
Segona Media currently supports images, video and audio assets. For each of these asset types, Google Cloud provides specific managed APIs to extract relevant content from the asset without customers having to tag them manually or transcribe them.
The traditional way for developing a solution like this involves getting hardware running, determining and installing application servers, databases, storage nodes, etc. After developing and getting the solution into production you may then come across a variety of familiar challenges: the operating system needs to be updated or upgraded or databases don’t scale to cope with unexpected production data. We didn’t want any of this, so after careful consideration, decided on a completely managed solution, serverless architecture for Segona Media. That way we’d have no servers to maintain, we could leverage Google’s ongoing API improvements and our solution could scale to handle the largest archive we could find.
We wanted Segona Media to also be able to easily connect to common tools in the media and publishing industries. Adobe InDesign, Premiere, Photoshop and Digital Asset Management solutions must all be able to easily store and retrieve assets from Segona Media. We solved this by using GCP APIs that were already in place for storing assets in Google Cloud Storage and just take it from there. We retrieve assets using the managed Elasticsearch engine’s API that runs on GCP.
Each action that Segona Media performs is a separate Google Cloud Function, usually triggered mostly by a Cloud Pub/Sub queue. Using a Pub/Sub queue to trigger a Cloud Function is an easy and scalable way to publish new actions.
Here’s a high-level architecture view of Segona Media:
High Level Architecture
And here’s how the assets flow through Segona Media:
Now, let’s see how Segona Media handles different types of media assets.
Images have a lot of features on which you can search, which we do via a dedicated microservice processor.
Processing audio is pretty straightforward. We want to be able to search for spoken text in audio files, and we use Cloud Speech API to extract text from the audio. We then feed the transcription into the Elasticsearch index, making the audio file searchable by every word.
Video is basically the combination of everything we do with images and audio files. There are some minor differences though, so let’s see what microservices we invoke for these assets:
There you have it. A summary of how smart media tagging can be done in a completely serverless fashion, without all the OS updates when scaling up or out, and of course, infrastructure maintenance and support! This way we can focus on what we care about: bringing an innovative, scalable solution to our end customers. Any questions? Let us know! We love to talk about this stuff ;) Leave a comment below, email me at firstname.lastname@example.org, or find us on Twitter at @incentro_.