Automate Transcription with AI

Seeking time for redemption

At the moment I am on a break, a time slot during which I am trying to work on my private project and learn more about what I am doing.

Let’s call it “Video Reader - Retirement project or trying to live”

Roadmap

This document outlines the technical roadmap for the development of video-reader, a simple web application for translating and transcribing videos from English to Persian and vice versa. It provides a high-level overview of the key steps and technologies involved. The roadmap is designed to guide the development team in planning and executing the project effectively.

Video Reader - Make it more readable:

The project involves the use of Artificial Intelligence (AI) techniques to make video content more accessible. The web application will utilize AI-based techniques such as natural language processing (NLP) and optical character recognition (OCR) to read and transcribe text from videos. Additionally, it will use machine learning algorithms to translate the text into other languages, allowing for easier comprehension of videos by users with different language backgrounds. The web application will also feature a user-friendly interface, making it easy to navigate and use.

Audience:

Small businesses and individuals who need to have their videos transcribed.

Goal:

Deliver an MVP. The first version will be a dashboard for uploading and managing videos, and for transcribing an uploaded video from the input language (English or Persian) to the output language (English or Persian).
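
To make the goal concrete, here is a minimal sketch of how one transcription request could be modelled. Everything in it is an assumption for illustration, not a decided design: the names `Language`, `TranscriptionJob` and `make_job`, and the language codes "en" and "fa".

```python
from dataclasses import dataclass
from enum import Enum


class Language(Enum):
    """Languages the MVP is meant to support (codes are an assumption)."""
    ENGLISH = "en"
    PERSIAN = "fa"


@dataclass(frozen=True)
class TranscriptionJob:
    """One uploaded video plus the language pair requested for it."""
    video_name: str
    input_language: Language
    output_language: Language

    @property
    def needs_translation(self) -> bool:
        # A same-language job only needs transcription, not translation.
        return self.input_language != self.output_language


def make_job(video_name: str, input_code: str, output_code: str) -> TranscriptionJob:
    """Build a job from language codes, rejecting unsupported languages."""
    try:
        return TranscriptionJob(video_name, Language(input_code), Language(output_code))
    except ValueError as exc:
        raise ValueError(
            f"unsupported language pair: {input_code} -> {output_code}"
        ) from exc
```

For example, `make_job("talk.mp4", "en", "fa")` describes an English video that should come out as Persian text, while `make_job("talk.mp4", "en", "de")` is rejected because German is outside the MVP.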

Features:

Transcribe a video

1. Project Planning:

Define project goals and objectives
Conduct market research and competitor analysis
Create a project timeline with clear milestones and deliverables
Allocate resources and define project roles and responsibilities

2. Requirement Gathering:

A web server with a GPU is necessary for the efficient processing of videos by the web application.

3. Architecture and Design:

MVC with a REST API between the frontend and backend

No authentication in the base version

4. Front-end Development:

For the front-end development, I tried out multiple variations of VueJS setups during the day. Ultimately, I chose to start with a vanilla Vue project with TailwindCSS. This setup was simple and worked great for the project.

TypeScript

VueJS

TailwindCSS

5. Back-end Development:

I am not sure what would be the best choice. As you know, most of the code is written in Python, but writing a backend system in a Python framework like FastAPI, rather than NestJS or PHP/Laravel, would be too much for an MVP.

On the other hand, for implementing those AI models and NLP processes, Python is by far the most practical language to use.

Server side with Python
FastAPI

6. Database Development:

It is a good idea to start off with a NoSQL database to store all the meta information about video files, such as the duration and the transcript text. To get started, even storing the data in a JSON file would be enough.

MongoDB
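
The JSON file option could look roughly like the sketch below. It is a starting point under stated assumptions, not the final storage layer: the class name `JsonMetadataStore` and the field names `duration_seconds` and `transcript` are made up for illustration, and the whole file is rewritten on every save, which is only acceptable for a small MVP.

```python
import json
from pathlib import Path


class JsonMetadataStore:
    """Keeps video metadata in a single JSON file, keyed by video id."""

    def __init__(self, path):
        self.path = Path(path)

    def _load(self) -> dict:
        # A missing file simply means nothing has been stored yet.
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def save(self, video_id: str, metadata: dict) -> None:
        records = self._load()
        records[video_id] = metadata
        # ensure_ascii=False keeps Persian text readable inside the file.
        self.path.write_text(
            json.dumps(records, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )

    def get(self, video_id: str):
        # Returns None when the video id is unknown.
        return self._load().get(video_id)
```

A call such as `store.save("v1", {"duration_seconds": 12.5, "transcript": "..."})` followed by `store.get("v1")` returns the same dictionary. Because each record is already a plain document, moving to MongoDB later should mostly mean replacing the file reads and writes.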

7. File System Development:

As I’m trying to build each part of the service properly, I think that using a system like FTP or an S3 bucket for storing video files would be helpful. It seems to me that Minio is a good place to start for this part of the project.

Minio
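
Whichever S3-style store is used, the files need a predictable naming scheme inside the bucket. The sketch below shows one possible layout. The `videos/<id>/...` structure and both function names are assumptions for illustration, and no storage client is involved: the functions only build the key strings.

```python
from pathlib import PurePosixPath


def video_object_key(video_id: str, filename: str) -> str:
    """Key for the uploaded source video: videos/<video_id>/source<ext>."""
    # Only the extension of the original filename is kept, in lower case,
    # so spaces and other odd characters in the name never reach the bucket.
    extension = PurePosixPath(filename).suffix.lower()
    return f"videos/{video_id}/source{extension}"


def transcript_object_key(video_id: str, language_code: str) -> str:
    """Key for a transcript of the video in the given language."""
    return f"videos/{video_id}/transcript.{language_code}.json"
```

Grouping everything for one video under a single prefix means a video and its transcripts can be listed or deleted together.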

8. Deployment and Maintenance:

Deploy the web application with Docker, as a single project defined in one Docker Compose file.

If you require further information, please feel free to contact me. I am more than happy to assist you with any inquiries or concerns you may have.