A Deep Motion Vector Approach to Video Object Segmentation

Garg, Vineet

dc.contributor.advisor	Tian, Chao
dc.creator	Garg, Vineet
dc.date.accessioned	2020-04-23T19:56:51Z
dc.date.available	2021-05-01T12:34:40Z
dc.date.created	2019-05
dc.date.issued	2019-04-16
dc.date.submitted	May 2019
dc.identifier.uri	https://hdl.handle.net/1969.1/187959
dc.description.abstract	Video object segmentation is gaining research and commercial importance, with applications ranging from checkout-free Amazon Go stores to autonomous vehicles operating on roads. Efficient operation in such use cases requires segmentation inference in real time. Even though there has been significant research in image segmentation, both semantic and instance, there is still much scope for improvement in video segmentation. Video segmentation is a direct extension of image segmentation, except that there is a temporal relation between neighboring frames of a video. Exploiting this temporal relation efficiently is one of the most important challenges in video segmentation. This temporal relation involves a lot of redundancy, and many of the prevalent state-of-the-art techniques do not exploit it. Optical flow is one approach to exploiting temporal redundancy: intermediate feature maps of previous frames are interpolated using the flow information, and the rest of the segmentation operation is then performed. However, optical flow provides motion resolution at the pixel level, and there is not enough motion between consecutive frames to warrant motion estimation at that level. Instead, we can divide a frame into multiple blocks and estimate the movement of their centroids across consecutive video frames. Based on this idea, we present a motion vector approach to video semantic segmentation. Additionally, we propose an adaptive technique to select keyframes during inference. We show that our proposed algorithm can reduce the computational complexity during inference by as much as 50% with only a 2-3% drop in the accuracy metric. Our algorithm can operate at up to 136 frames per second, indicating that it can easily handle real-time inference.	en
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	Deep Learning	en
dc.subject	Video Segmentation	en
dc.subject	Motion Vector	en
dc.title	A Deep Motion Vector Approach to Video Object Segmentation	en
dc.type	Thesis	en
thesis.degree.department	Electrical and Computer Engineering	en
thesis.degree.discipline	Electrical Engineering	en
thesis.degree.grantor	Texas A&M University	en
thesis.degree.name	Master of Science	en
thesis.degree.level	Masters	en
dc.contributor.committeeMember	Jiang, Anxiao
dc.contributor.committeeMember	Braga-Neto, Ulisses
dc.contributor.committeeMember	Xiong, Zixiang
dc.type.material	text	en
dc.date.updated	2020-04-23T19:56:52Z
local.embargo.terms	2021-05-01
local.etdauthor.orcid	0000-0002-2020-3185

Files in this item

Name: GARG-THESIS-2019.pdf
Size: 1.548Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
