A research team with the ΒιΆΉΣ³»­΄«Γ½β€™s Center for Research in Computer Vision recently won a competition to improve computer vision by creating technology that can automatically track behavior in long security videos.

The competition, called the Activities in Extended Video Challenge for 2020, was sponsored by the U.S. Department of Commerce’s National Institute of Standards and Technology and was held virtually in June as part of the Conference on Computer Vision and Pattern Recognition.

Top computer vision teams from around the world, including teams from IBM, Massachusetts Institute of Technology, Carnegie Mellon University, and Purdue University, competed in the challenge.

β€œVideo surveillance is of great importance for security, and manually watching surveillance videos is not only difficult but inefficient,” says Yogesh Rawat, an assistant professor at the center and team leader. β€œAlso, with so many closed-circuit television cameras all around, it is not possible to manually watch those videos. We need automatic analysis of these security videos to improve efficiency as well as accuracy.”

That need for β€œextra eyes” is why the ΒιΆΉΣ³»­΄«Γ½ computer vision team developed a deep-learning system, named Gabriella, that can detect multiple activities happening in a security video efficiently, at a speed of 100 frames per second.

β€œThis is a first step toward analyzing these security videos, and it will have a lot of applications in national security,” Rawat says.

The team also included ΒιΆΉΣ³»­΄«Γ½ Trustee Chair Professor of Computer Science and center director Mubarak Shah, who says the win is a big plus for the group.

β€œVideo activity recognition in unconstrained domains is a very important problem with applications in self-driving cars, video surveillance and monitoring, human-computer interfaces, and video search,” Shah says.

β€œOur submission was the fastest and most accurate, the two criteria for the Deep Intermodal Video Analytics program,” he says.

Participation in the challenge supports the ΒιΆΉΣ³»­΄«Γ½ team’s role in the Deep Intermodal Video Analytics program, which is funded by the U.S. Office of the Director of National Intelligence’s Intelligence Advanced Research Projects Activity program through a sub-contract from the University of Maryland.

The ΒιΆΉΣ³»­΄«Γ½ team was runner-up to Carnegie Mellon University in 2018 and 2019 in the institute’s similar Text Retrieval Conference Video Retrieval Evaluation. This year, the Carnegie Mellon team was runner-up.

The ΒιΆΉΣ³»­΄«Γ½ team won by developing an end-to-end approach to computer analysis of video footage.

End-to-end means the computer takes raw RGB video as input and directly generates the required output, without the intermediate processing that the systems developed by the other teams required.

Intermediate processing tasks such as object detection, optical-flow computation, and tracking make the whole process very complicated and difficult to train as well as test, Rawat says.

β€œThe end-to-end system avoids all of this and therefore is preferred,” he says.

The ΒιΆΉΣ³»­΄«Γ½ team’s system can monitor 37 different activities over the course of more than 250 hours of video, including activities such as β€œtheft” and β€œperson abandons package,” and the scalable machine-learning system can be trained to recognize more if the data are available.

Monitoring for these kinds of activities over hours of video is difficult for computer vision because activities vary in length, multiple activities can occur in the same frame, the same person can perform different activities, and the scale of activities varies, with those closer to the camera appearing larger.
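The end-to-end idea described above can be sketched in a few lines. The toy model below is purely illustrative and is not Gabriella’s actual architecture: it stands in for any network that maps raw RGB frames directly to per-frame activity scores, with no separate object detection, optical-flow, or tracking stages. Independent sigmoid scores (rather than a single softmax) reflect the point that multiple activities can occur in the same frame.

```python
import numpy as np

# Hypothetical stand-in for an end-to-end activity detector: raw RGB video
# in, per-frame activity scores out. No intermediate processing stages.
NUM_ACTIVITIES = 37  # the challenge's activity classes

def end_to_end_scores(video, weights):
    """video: (T, H, W, 3) raw RGB frames with values in [0, 1].
    weights: (H*W*3, NUM_ACTIVITIES) model parameters (random here,
    learned in a real system).
    Returns (T, NUM_ACTIVITIES) per-frame activity scores in (0, 1)."""
    t = video.shape[0]
    flat = video.reshape(t, -1)           # flatten each frame to a vector
    logits = flat @ weights               # one direct map: input -> output
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid: one independent score
                                          # per activity, so several can
                                          # fire in the same frame

rng = np.random.default_rng(0)
video = rng.random((8, 16, 16, 3))        # 8 tiny synthetic frames
weights = rng.standard_normal((16 * 16 * 3, NUM_ACTIVITIES)) * 0.01
scores = end_to_end_scores(video, weights)
print(scores.shape)  # (8, 37): one score per frame per activity
```

A real end-to-end system would replace the single linear map with a deep spatiotemporal network, but the interface is the same: raw pixels in, activity predictions out, trainable as one unit.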

The ΒιΆΉΣ³»­΄«Γ½ research team also includes ΒιΆΉΣ³»­΄«Γ½ Department of Computer Science doctoral students Praveen Tirupattur, Aayush Rana, Kevin Duarte, Ugur Demir and Ishan Dave; and ΒιΆΉΣ³»­΄«Γ½ Office of Research doctoral fellow Nayeem Rizve.

β€œWe worked very hard to get to the top position,” Rawat says.