Video Parsing and Camera Pose Estimation for 2D to 3D Video Conversion

Author: Tianrui Liu, 劉天瑞
Publisher: Open Dissertation Press
ISBN: 9781361026236

Publication Date: 26 January 2017
Format: Paperback
Availability: Temporarily unavailable

The supplier advises that this item is temporarily unavailable. It will be ordered for you and placed on backorder. Once it does come back in stock, we will ship it out to you.

Our Price: $129.36

Overview

This dissertation, Video Parsing and Camera Pose Estimation for 2D to 3D Video Conversion by Tianrui Liu, 劉天瑞, was obtained from The University of Hong Kong (Pokfulam, Hong Kong) and is being sold pursuant to Creative Commons: Attribution 3.0 Hong Kong License. The content of this dissertation has not been altered in any way. We have altered the formatting in order to facilitate the ease of printing and reading of the dissertation. All rights not granted by the above license are retained by the author.

Abstract: The increasing demand for 3D video content drives the conversion of large amounts of 2D video into 3D formats. Because video content varies substantially, the performance of fully automatic conversion techniques is usually limited. It is therefore important to develop efficient semi-automatic techniques that ensure good conversion quality. The purpose of this thesis is to build a video analysis system suitable for use prior to the 2D to 3D conversion process. The system aims to summarize videos automatically in order to reduce the manual cost of 2D to 3D conversion, and possibly to facilitate depth assignment.

Firstly, a shot boundary detection method is proposed for the video analysis system to parse a video into shots, its basic units. Based on a novel structure-aware histogram scheme and an adaptive double-threshold scheme, the proposed algorithm improves upon conventional methods. The structure-aware scheme integrates the structural similarity measure with local color histograms and hence significantly reduces false alarms caused by motion disturbances. The adaptive double-threshold scheme makes the algorithm effective in detecting mixed types of shot boundaries. Once a video has been divided into shots, the keyframes of the shots are further summarized by grouping those with similar content.
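The double-threshold idea described above can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes plain gray-level histograms instead of the structure-aware histogram, and it assumes thresholds of the form mean plus a multiple of the standard deviation of the frame differences. The function names and the parameters `bins`, `k_high`, and `k_low` are hypothetical.

```python
import numpy as np

def histogram_differences(frames, bins=16):
    """Half the L1 distance between normalized gray-level histograms
    of consecutive frames; each value lies in [0, 1]."""
    hists = []
    for frame in frames:
        h, _ = np.histogram(frame, bins=bins, range=(0, 256))
        hists.append(h / h.sum())
    hists = np.asarray(hists)
    return 0.5 * np.abs(hists[1:] - hists[:-1]).sum(axis=1)

def detect_shot_boundaries(diffs, k_high=3.0, k_low=0.5):
    """Double-threshold detection (assumed thresholds, see lead-in).

    diffs[i] is the difference between frame i and frame i + 1.
    Returns (cuts, graduals): cuts holds the index of the first frame of
    each new shot; graduals holds (first, last) frame indices of runs
    whose accumulated difference exceeds the high threshold.
    """
    diffs = np.asarray(diffs, dtype=float)
    mu, sigma = diffs.mean(), diffs.std()
    if sigma == 0:  # no variation at all: nothing to detect
        return [], []
    t_high = mu + k_high * sigma  # adapts to the statistics of this video
    t_low = mu + k_low * sigma
    cuts, graduals = [], []
    i, n = 0, len(diffs)
    while i < n:
        if diffs[i] >= t_high:
            cuts.append(i + 1)  # abrupt cut
            i += 1
        elif diffs[i] >= t_low:
            start, acc = i, 0.0
            # accumulate a run of moderate differences
            while i < n and t_low <= diffs[i] < t_high:
                acc += diffs[i]
                i += 1
            if acc >= t_high and i - start > 1:
                graduals.append((start + 1, i))  # gradual transition
        else:
            i += 1
    return cuts, graduals
```

Calling `detect_shot_boundaries(histogram_differences(frames))` on a list of grayscale frames returns the candidate cuts and gradual transitions. The high threshold catches abrupt changes, while the low threshold opens a candidate window whose accumulated difference decides whether it counts as a gradual transition.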
By modeling the keyframes as an undirected graph, the normalized cuts algorithm is employed to recursively partition the graph into clusters.

Secondly, camera motion estimation is performed to examine the motion modality of the camera that captured each video shot. As the structure-from-motion (SfM) method for 3D reconstruction is generally restricted to videos containing translational camera motion, this part of the work contributes to automatically identifying the videos that fall within the regime of the SfM method. The camera estimation algorithm utilizes matched features and epipolar geometry constraints to incrementally compute the camera parameters for different views. Based on the camera estimation results, we propose a method to further explore the distinguishing properties of sequences taken by a translationally moving camera. Consequently, the motion modality of the camera can be identified to ensure that the video shots are suitable for the SfM method.

Last but not least, a semantic scene analysis approach that can simultaneously segment and recognize the objects contained in a scene is proposed. The proposed method uses a two-layer random forest (RF) framework. In the first layer, an RF labels the image by assigning object classes to superpixels. The structured RF in the second layer predicts local labels together with reliability scores, which are aggregated with the initial labeling results. The proposed method achieves higher accuracy because some of the inaccurate segmentations and implausible labelings are remedied in the second layer. The semantic analysis method can be used to differentiate static background regions from moving objects to assist depth propagation from keyframes. In this way, the semantic scene analysis approach can facilitate depth propagation from keyframes obtained, say, from a user interface.

Subjects: Image processing - Digital techniques; 3-D video (Three-dimensional imaging)
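The keyframe clustering step mentioned in the abstract (recursive normalized cuts on an undirected keyframe graph) can be sketched roughly as follows. The listing does not say how edge weights are defined or when the recursion stops, so this sketch assumes a Gaussian similarity between per-keyframe feature vectors and a stopping threshold `max_ncut` on the normalized-cut value; all function and parameter names are hypothetical.

```python
import numpy as np

def similarity_matrix(features, sigma=0.5):
    """Gaussian similarity between feature vectors (an assumed edge weight)."""
    diff = features[:, None, :] - features[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def ncut_bipartition(W):
    """Two-way split from the second-smallest eigenvector of the
    symmetric normalized Laplacian (the usual spectral relaxation)."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    fiedler = d_inv_sqrt * eigvecs[:, 1]
    return fiedler > 0

def ncut_value(W, mask):
    """Normalized-cut cost: cut/assoc(A, V) + cut/assoc(B, V)."""
    cut = W[mask][:, ~mask].sum()
    return cut / W[mask].sum() + cut / W[~mask].sum()

def recursive_ncut(W, indices=None, max_ncut=0.5):
    """Recursively bipartition until a split would cost more than max_ncut."""
    if indices is None:
        indices = np.arange(len(W))
    if len(indices) < 2:
        return [indices.tolist()]
    sub = W[np.ix_(indices, indices)]
    mask = ncut_bipartition(sub)
    # stop on a trivial split or when the two parts are too strongly connected
    if mask.all() or not mask.any() or ncut_value(sub, mask) > max_ncut:
        return [indices.tolist()]
    return (recursive_ncut(W, indices[mask], max_ncut)
            + recursive_ncut(W, indices[~mask], max_ncut))
```

Given an array `features` with one row per keyframe (for example a color histogram), `recursive_ncut(similarity_matrix(features))` returns lists of keyframe indices, one list per cluster. Thresholding the normalized-cut value keeps tightly connected groups of keyframes from being split further.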

Full Product Details

Author: Tianrui Liu, 劉天瑞
Publisher: Open Dissertation Press
Imprint: Open Dissertation Press
Dimensions: Width: 21.60cm, Height: 0.60cm, Length: 27.90cm
Weight: 0.286kg
ISBN: 9781361026236

ISBN 10: 1361026235
Publication Date: 26 January 2017
Audience: General/trade, General
Format: Paperback
Publisher's Status: Active

Countries Available

All regions
