Video Content Analysis Using Multimodal Information
http://www.fishpond.com.au/Books/Video-Content-Analysis-Using-Multimodal-Information-Ying-Li-C-CJay-Kuo/9781441953650
Video Content Analysis Using Multimodal Information For Movie Content Extraction, Indexing and Representation covers content-based multimedia analysis, indexing, representation and applications, with a focus on feature films. It presents state-of-the-art techniques in the video content analysis domain, as well as many novel ideas and algorithms for movie content analysis based on multimodal information. The authors employ multiple media cues such as audio, visual and face information to bridge the gap between low-level audiovisual features and high-level video semantics. Using audio and visual content processing such as video segmentation and audio classification, the original video is re-represented as a set of semantic video scenes or events, where each event is further classified as a 2-speaker dialog, a multiple-speaker dialog, or a hybrid event. In addition, desired speakers are identified from the video stream using either a supervised or an adaptive speaker identification scheme. All this information is then integrated to build the video's ToC (table of contents) as well as its index table. Finally, a video abstraction system, which can generate either a scene-based summary or an event-based skim, is presented; it exploits knowledge of both video semantics and video production rules. This monograph will be of great interest to research scientists and graduate-level students working in content-based multimedia analysis, indexing, representation and applications, as well as in related fields.
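To make the description above more concrete, here is a minimal Python sketch of how scenes, classified events and identified speakers might be combined into a ToC and an index table. It is not the authors' implementation: the names `EventType`, `Event`, `Scene`, `build_toc` and `build_speaker_index`, the field layout, and the text format of the ToC entries are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class EventType(Enum):
    """The three event classes named in the description."""
    TWO_SPEAKER_DIALOG = "2-speaker dialog"
    MULTI_SPEAKER_DIALOG = "multiple-speaker dialog"
    HYBRID = "hybrid event"


@dataclass
class Event:
    # Hypothetical fields: start/end times in seconds and speaker labels.
    start_s: float
    end_s: float
    kind: EventType
    speakers: List[str] = field(default_factory=list)


@dataclass
class Scene:
    # Hypothetical: a scene is a titled, ordered list of events.
    title: str
    events: List[Event] = field(default_factory=list)


def build_toc(scenes: List[Scene]) -> List[str]:
    """Return a two-level ToC: one entry per scene, one sub-entry per event."""
    toc = []
    for i, scene in enumerate(scenes, start=1):
        toc.append(f"{i}. {scene.title}")
        for j, ev in enumerate(scene.events, start=1):
            span = f"[{ev.start_s:.0f}s-{ev.end_s:.0f}s]"
            toc.append(f"  {i}.{j}. {ev.kind.value} {span}")
    return toc


def build_speaker_index(scenes: List[Scene]) -> Dict[str, List[Tuple[float, float]]]:
    """Map each speaker label to the time spans of the events they appear in."""
    index = defaultdict(list)
    for scene in scenes:
        for ev in scene.events:
            for name in ev.speakers:
                index[name].append((ev.start_s, ev.end_s))
    return dict(index)
```

Calling `build_toc` on a list of `Scene` objects yields printable ToC lines, while `build_speaker_index` supports lookups such as "where does this speaker appear?". In a real system the scenes, event classes and speaker labels would come from the analysis stages the book describes; here they are supplied by hand.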
Table of Contents
Dedication. List of Figures. List of Tables. Preface. Acknowledgments.
1: Introduction. 1. Audiovisual Content Analysis. 1.1. Audio Content Analysis. 1.2. Visual Content Analysis. 1.3. Audiovisual Content Analysis. 2. Video Indexing, Browsing and Abstraction. 3. MPEG-7 Standard. 4. Roadmap of The Book. 4.1. Video Segmentation. 4.2. Movie Content Analysis. 4.3. Movie Content Abstraction.
2: Background And Previous Work. 1. Visual Content Analysis. 1.1. Video Shot Detection. 1.2. Video Scene and Event Detection. 2. Audio Content Analysis. 2.1. Audio Segmentation and Classification. 2.2. Audio Analysis for Video Indexing. 3. Speaker Identification. 4. Video Abstraction. 4.1. Video Skimming. 4.2. Video Summarization. 5. Video Indexing and Retrieval.
3: Video Content Pre-Processing. 1. Shot Detection in Raw Data Domain. 1.1. YUV Color Space. 1.2. Metrics for Frame Differencing. 1.3. Camera Break Detection. 1.4. Gradual Transition Detection. 1.5. Camera Motion Detection. 1.6. Illumination Change Detection. 1.7. A Review of the Proposed System. 2. Shot Detection in Compressed Domain. 2.1. DC-image and DC-sequence. 3. Audio Feature Analysis. 4. Commercial Break Detection. 4.1. Features of A Commercial Break. 4.2. Feature Extraction. 4.3. The Proposed Detection Scheme. 5. Experimental Results. 5.1. Shot Detection Results. 5.2. Commercial Break Detection Results.
4: Content-Based Movie Scene And Event Extraction. 1. Movie Scene Extraction. 1.1. Sink-based Scene Construction. 1.2. Audiovisual-based Scene Refinement. 1.3. User Interaction. 2. Movie Event Extraction. 2.1. Sink Clustering and Categorization. 2.2. Event Extraction and Classification. 2.3. Integrating Speech and Face Information. 3. Experimental Results. 3.1. Scene Extraction Results. 3.2. Event Extraction Results.
5: Speaker Identification For Movies. 1. Supervised Speaker Identification for Movie Dialogs. 1.1. Feature Selection and Extraction. 1.2. Gaussian Mixture Model. 1.3. Likelihood Calculation and Score Normalization. 1.4. Speech Segment Isolation. 2. Adaptive Speaker Identification. 2.1. Face Detection, Recognition and Mouth Tracking. 2.2. Speech Segmentation and Clustering. 2.3. Initial Speaker Modeling. 2.4. Likelihood-based Speaker Identification. 2.5. Audiovisual Integration for Speaker Identification. 2.6. Unsupervised Speaker Model Adaptation. 3. Experimental Results. 3.1. Supervised Speaker Identification Results. 3.2. Adaptive Speaker Identification Results. 3.3. An Example of Movie Content Annotation.
6: Scene-Based Movie Summarization. 1. An Overview of the Proposed System. 2. Hierarchical Keyframe Extraction. 2.1. Scene Importance Computation. 2.2. Sink Importance Computation. 2.3. Shot Importance Computation. 2.4. Frame Importance Computation. 2.5. Keyframe Selection. 3. Scalable Movie Summarization and Navigation. 4. Experimental Results. 4.1. Keyframe Extraction Results. 4.2. User Study. 4.3. System Interface Design. 4.4. Applications.
7: Event-Based Movie Skimming. 1. Introduction. 2. An Overview of the Proposed System. 3. Extended Event Set Construction. 4. Extended Event Feature Extraction. 5. Video Skim Generation. 6. More Thoughts on the Video Skim. 6.1. When More Judging Rules Are Needed. 6.2. Sub-sampling the Video Skim. 6.3. Discovering the Story and Visual Structure. 7. Experimental Results.
8: Conclusion And Future Work. 1. Conclusion. 2. Future Work. 2.1. System Refinement. 2.2. New Research Topics.
References. Index.