In the age of artificial intelligence (AI), complex photo editing can be done with the click of a button. Whether you want to remove an object from the background, change the color of your shirt or make your face appear flawless, AI can transform your pictures with simple text-to-image commands.
Editing videos with those same AI commands isn’t quite as simple. However, UCF researchers in the College of Engineering and Computer Science aim to change that.
Professor Nazanin Rahnavard, Associate Professor Chen Chen, and UCF alumni Nazmul Karim ’20MS ’23PhD and Umar Khalid ’20MS ’23PhD have developed novel text-to-video AI technology that can dramatically change videos in minutes.
“Our system takes the ‘brain’ of an AI that’s already skilled at generating images from text and adapts it for video, without losing the creative power that makes it effective in the first place,” Rahnavard says. “Our breakthrough came from recognizing a fundamental inefficiency in existing text-to-video editing approaches. Current systems either require massive text-to-video datasets for training or rely on computationally expensive, per-video adaptations of text-to-image models. We believed there had to be a more efficient and elegant solution.”
With Rahnavard’s background in electrical engineering and Chen’s background in computer science, the team used linear algebra techniques to examine the numerical parameters of an AI model that are optimized and adjusted while learning a new task. They realized that instead of fine-tuning the entire parameter set, they could update only the singular values, preserving the AI model’s ability to generalize while speeding up its adaptation time.
“The key was learning which parts of the AI’s ‘memory’ to adjust, and which to preserve,” Rahnavard says. “By focusing only on the most essential elements and leaving the rest untouched, we created a method that adapts much faster and more efficiently while still producing high-quality, expressive results.”
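For readers curious about what updating only the singular values can look like in practice, here is a minimal sketch in Python with NumPy. It is not the researchers’ implementation. It assumes a single small weight matrix and a toy least-squares objective, and the function name `adapt_singular_values` and every variable in it are hypothetical. The matrix is factored as W = U · diag(s) · Vᵀ, the factors U and Vᵀ stay frozen, and gradient descent adjusts only the vector s.

```python
import numpy as np


def adapt_singular_values(W, X, Y, lr=0.1, steps=200):
    """Fit W @ X to Y by changing only the singular values of W.

    Hypothetical illustration: U and Vt are computed once and never updated.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    n = X.shape[1]
    losses = []
    for _ in range(steps):
        W_cur = (U * s) @ Vt                # same as U @ diag(s) @ Vt
        R = W_cur @ X - Y                   # residual on the toy task
        losses.append(0.5 * np.sum(R ** 2) / n)
        G = R @ X.T / n                     # gradient w.r.t. the full matrix
        # Chain rule: d(loss)/d(s_i) = u_i^T G v_i, one number per singular value
        grad_s = np.einsum("ij,ij->j", U, G @ Vt.T)
        s = s - lr * grad_s                 # only s moves
    return U, s, Vt, losses


# Toy setup (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))                 # stand-in for a pretrained weight matrix
U0, s0, Vt0 = np.linalg.svd(W, full_matrices=False)
s_target = s0 * np.array([1.5, 0.5, 1.2, 0.8])
X = rng.normal(size=(6, 256))               # toy inputs
Y = (U0 * s_target) @ Vt0 @ X               # outputs of the "new task"

U, s, Vt, losses = adapt_singular_values(W, X, Y)
print(f"trainable values: {s.size} of {W.size}")
print(f"loss went down: {losses[-1] < losses[0]}")
```

In this toy case only 4 numbers are trained instead of all 24 entries of the matrix, which illustrates why such an adaptation can be cheaper than full fine-tuning. A real text-to-image model has many layers and a very different training objective.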
The AI model works best on existing video clips and can edit them in minutes. It can change the colors of clothing, swap a cat for a dog or transform the clip into a cartoon. The more complex the commands, the longer the editing time. But Rahnavard says the process can still be completed in minutes, not hours.
The university was recently awarded a patent for the technology, which movie studios and social media companies could use.
“This technology has the potential to revolutionize video editing across a wide range of industries,” Rahnavard says. “Movie studios could use it for rapid scene modifications without the need for costly reshoots, while social media platforms could offer their users instant, highly sophisticated video filters far beyond what’s available today.”