KWTR: multimedia
From semanticweb.org
Contributors:
Main Contributors:
Stamatia Dasiopoulou - Informatics and Telematics Institute, Centre for Research and Technology Hellas (dasiop@iti.gr)
Ioannis Kompatsiaris - Informatics and Telematics Institute, Centre for Research and Technology Hellas (ikom@iti.gr)
- What is the state of the art of Semantic Web in your research field?
The Semantic Web relates to multimedia content in two central, partially intertwined contexts: content annotation (manual or (semi-)automatic), and content analysis and interpretation.
The first dimension concerns the definition of interoperable annotations, extended to encompass media-related aspects as well. The current state of the art includes a number of initiatives that translate the MPEG-7 standard into ontological representations, aiming to overcome ambiguities resulting from normative semantics and to enable integration with existing Semantic Web metadata (e.g., domain ontologies, core ontologies). The approaches taken can be roughly classified into those targeting an explicit translation that preserves the intended flexibility (Hunter 2001, Garcia et al. 2005), and those opting for a cleaner modelling that attempts to provide more intuitive conceptualisations (Bloehdorn et al. 2005, Oberle et al. 2006, Tsinaraki et al. 2005). Both automatic and manual approaches to translation have been reported. Related subjects include the definition of semantic profiles (as an extension to classical profiles) and work on fragment identification, the latter being mostly addressed from a syntactic perspective (Troncy et al. 2006).
Despite the advantages brought by these efforts to translate MPEG-7 into an ontology (improved retrieval, exchange and sharing of knowledge), open issues remain with respect to interoperability, highlighting the need for a common, well-defined multimedia annotation framework (MMSEM, Multimedia Ontology). Furthermore, within such a common framework, questions arise about how the different aspects of multimedia annotations should be integrated: generic upper-level ontologies, such as DOLCE and SUMO, have been proposed for harmonizing domain-specific and multimedia ontologies [hunter,simou,oberle], while the use of MPEG-7 as a core multimedia ontology, further specialized by attaching domain-specific ontologies, has also been investigated (Tsinaraki et al. 2005).
Efforts have also been reported towards formalizing the conceptualization of the different multimedia description types (decomposition, content annotation, media annotation, etc.) (Arndt et al. 2007).
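As a rough illustration of how the description types named above (decomposition, content annotation, media annotation) could sit side by side in one annotation, the following minimal sketch stores them as subject-predicate-object triples in plain Python. The `mm:`, `dom:` and `ex:` prefixes and all property names are hypothetical placeholders chosen for this example; they are not taken from MPEG-7 or from any of the ontologies cited above.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def annotate_region(image: str, region: str, concept: str, media_format: str) -> List[Triple]:
    """Build the triples for one annotated region of an image.

    All predicate names are hypothetical and only serve this sketch.
    """
    return [
        Triple(image, "mm:hasMediaFormat", media_format),  # media annotation
        Triple(image, "mm:hasSegment", region),            # decomposition
        Triple(region, "mm:depicts", concept),             # content annotation
    ]


def segments_depicting(triples: List[Triple], concept: str) -> List[Tuple[str, str]]:
    """Return (image, region) pairs whose region depicts the given domain concept."""
    depicting = {
        t.subject for t in triples
        if t.predicate == "mm:depicts" and t.obj == concept
    }
    return sorted(
        (t.subject, t.obj) for t in triples
        if t.predicate == "mm:hasSegment" and t.obj in depicting
    )


ANNOTATIONS = (
    annotate_region("ex:image42", "ex:image42#region1", "dom:Sea", "image/jpeg")
    + annotate_region("ex:image42", "ex:image42#region2", "dom:Sand", "image/jpeg")
)
```

Calling `segments_depicting(ANNOTATIONS, "dom:Sea")` joins the decomposition and content annotation triples, which is the kind of cross-aspect query a common annotation framework is meant to support.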
The second dimension refers to the use of SW technologies to represent the knowledge necessary in semantic multimedia analysis and to model or realize (part of) the process of obtaining the targeted content descriptions. Ontologies have been used to capture both domain-specific and feature-level (quantitative and qualitative) knowledge (Hunter et al. 2004, Dasiopoulou et al. 2005, Maillot et al. 2004), including spatial relations and constraints, while both rule-based and DL-based (Neumann et al. 2004, Schober 2004) reasoning have been explored for multimedia understanding. With respect to the latter, crisp as well as fuzzy implementations (Dasiopoulou et al. 2007) have been considered, while non-standard inferences, in particular abduction, are under investigation (Espinosa et al. 2007). The types of knowledge and inference services required by approaches that make use of explicit knowledge and entailed semantics give rise to further questions about the information and semantics a core multimedia conceptualization should support (e.g., annotation, decomposition, and analysis information, as well as domain and contextual information).
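To make the idea of fuzzy reasoning over analysis results and spatial relations more concrete, here is a minimal sketch in Python. It assumes the common min/max semantics for fuzzy conjunction and existential quantification; this is an assumption for illustration, not necessarily the semantics used in the cited works. The region identifiers, label degrees, the `ABOVE` relation and the "beach" rule are all hypothetical.

```python
from itertools import permutations
from typing import Dict, Tuple


def fuzzy_and(*degrees: float) -> float:
    """Fuzzy conjunction, assumed here to be the minimum t-norm."""
    return min(degrees)


def fuzzy_or(degrees) -> float:
    """Fuzzy disjunction / existential quantification, assumed to be the maximum."""
    return max(degrees, default=0.0)


# Hypothetical analysis output: degree to which each region is an instance of a concept.
REGION_LABELS: Dict[str, Dict[str, float]] = {
    "r1": {"Sky": 0.8, "Sea": 0.3},
    "r2": {"Sea": 0.7, "Sky": 0.4},
    "r3": {"Sand": 0.6},
}

# Hypothetical fuzzy spatial relation: degree to which the first region lies above the second.
ABOVE: Dict[Tuple[str, str], float] = {
    ("r1", "r2"): 1.0,
    ("r2", "r3"): 0.9,
    ("r1", "r3"): 1.0,
}


def beach_degree(labels: Dict[str, Dict[str, float]],
                 above: Dict[Tuple[str, str], float]) -> float:
    """Degree of the toy rule: some Sea region lies above some Sand region."""
    candidates = []
    for upper, lower in permutations(labels, 2):
        candidates.append(fuzzy_and(
            labels[upper].get("Sea", 0.0),
            labels[lower].get("Sand", 0.0),
            above.get((upper, lower), 0.0),
        ))
    return fuzzy_or(candidates)
```

With the sample data, the best-supported pair is `r2` above `r3`, so `beach_degree(REGION_LABELS, ABOVE)` evaluates to `0.6`, the weakest of the three degrees involved in that pair.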
- Provide references and short abstracts of three papers you consider as significant in your research field.
S. Dasiopoulou, J. Heinecke, C. Saathoff and M.G. Strintzis, "Multimedia Reasoning with Natural Language Support", 1st IEEE International Conference on Semantic Computing, Sept 17-19, Irvine, California, 2007.
Neumann, B. and Moller, R., "On scene interpretation with description logics", Tech. Rep. FBI-B-257/04, 2004.
- Please provide one or more examples (either business, or research, or both) in which semantic web has been used (if you can, add some references).
Relevant material (publications, presentations, demos) can be found on the websites of the various research projects that address semantic multimedia analysis and annotation.
- Are there existing tools or demos? Please indicate some of them.
The reader is referred to the relevant project sites and publications. For tools related specifically to manual semantic annotation, a list can be found in [1].
- What are the open problems in your Semantic Web research field? Why?
Most of the open problems relate to the inherent characteristics of multimedia content and to the challenge of moving from features that can be extracted automatically to descriptions that approximate human cognition, using generic rather than application-specific methodologies. Several factors pose very specific requirements on the usage of SW technologies: analysis results inherently constitute ‘cues’ rather than ‘evidence’; analysis techniques have limitations (segmentation, incompleteness of the obtained results); the extreme complexity of the content does not allow for constructing complete knowledge that covers all possible variations in appearance and accounts for similar appearances of semantically distinct entities; traditional analysis needs to be coupled with explicit knowledge under a constant interaction framework; and implementing the required control and reasoning strategies is an immense challenge. Briefly, these requirements, which constitute the challenges met, include: i) support for uncertainty handling, addressing both fuzzy and probabilistic aspects; ii) scalable and efficient implementations of SW inference tools (including repositories and reasoners); iii) coupling of rules and ontologies, as the different types of knowledge related to multimedia pose different semantics and expressivity requirements; iv) the evolution and integration of standard and non-standard inferences (multimedia analysis and understanding cannot be seen as a sequential process of entailments: assumptions need to be made with respect to missing or inconsistent descriptions, selection between possible interpretations is required, and uncertainty needs to be propagated along the generated entailments). Stronger datatype support, although desirable, is not stressed here as an additional requirement, since it is neither desirable nor meaningful to exploit SW technologies for representing and handling procedural knowledge and computations.
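Requirement iv) above can be illustrated with a deliberately simplified sketch of abduction-style interpretation: instead of deriving conclusions from complete observations, candidate interpretations are ranked by how many observations they explain and how many missing observations they would have to assume. The concept names, the `RULES` table and the ranking criterion are hypothetical and chosen only for this example; they do not reproduce any cited system.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical background knowledge: each candidate interpretation lists
# the observations it would entail if it were true.
RULES: Dict[str, Set[str]] = {
    "PoleVault": {"Person", "Pole", "HorizontalBar"},
    "HighJump": {"Person", "HorizontalBar"},
    "JavelinThrow": {"Person", "Javelin"},
}


def explain(observations: Set[str],
            rules: Dict[str, Set[str]]) -> List[Tuple[str, int, List[str]]]:
    """Rank interpretations as (name, number explained, assumptions needed).

    Preference (an assumption of this sketch): explain more observations
    first, then require fewer unobserved assumptions, then order by name.
    """
    ranked = []
    for hypothesis, entailed in rules.items():
        explained = observations & entailed   # observations the hypothesis accounts for
        assumed = entailed - observations     # what would have to be assumed as missing
        ranked.append((hypothesis, len(explained), sorted(assumed)))
    ranked.sort(key=lambda item: (-item[1], len(item[2]), item[0]))
    return ranked


RANKING = explain({"Person", "HorizontalBar"}, RULES)
```

For the observations `{"Person", "HorizontalBar"}`, `HighJump` ranks first because it explains both observations without further assumptions, while `PoleVault` explains the same observations only by assuming an undetected `Pole`. A realistic approach would additionally weight hypotheses by uncertainty, as discussed under requirement i).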
- Provide references and links of the most relevant Semantic Web research projects in your field.
Research projects focusing on multimedia annotation and analysis under a SW framework include, among others:
aceMedia ([2]), X-media ([3]), K-space ([4]), MeSH ([5]), Salero ([6]), BOEMIE ([7]), Aim@Shape ([8])
- What challenges do these projects try to overcome?
The common goal underlying these projects' efforts is to overcome the so-called semantic gap in multimedia analysis and retrieval through the utilisation of explicit knowledge and the entailed semantic inferences. Acknowledging the challenges involved in the automatic acquisition of semantic content descriptions, these projects investigate the use of fuzzy extensions in ontology languages, formal inference techniques, ontology engineering and evolution, and formal knowledge engineering, with the purpose of contributing towards well-defined common frameworks that enable multimedia analysis and annotation to benefit from precise and explicit knowledge. Given the current state of SW technologies and the relatively recent research carried out on SW frameworks for multimedia analysis, these projects have provided insight into a number of issues, including expressivity requirements, the integration and assessment of SW technologies in practical application scenarios (scalability, performance, etc.), and reasoning capabilities.
- What are their foreseen benefits (both in market and scientific community)?
The foreseen benefits include more intuitive and efficient multimedia content management, exploitation, delivery and consumption. With respect to the scientific community, of particular importance are the insight and future guidelines acquired on coupling explicit knowledge with the traditional learning and statistical approaches of the multimedia community.
- When, in your opinion, will projects’ results be ready for industry?
Given the current state, where mostly prototypical implementations are available and certain purely SW-oriented issues still need to be tackled (e.g., effective uncertainty support and stronger datatype support), it will take some time before the outcomes of the projects reach an adequately mature state. However, parts of the investigated technologies and lessons learnt are already considered to be of potential practical interest for industry.
- Do you think that it is important to invest (money and time) in these topics? Why?
Yes: multimedia content is omnipresent, traditional content-based approaches fail to scale and generalize satisfactorily, and explicit knowledge can contribute significantly as a complement to, and a useful pillar for, the unquestionably valuable machine-learning approaches.
- What are, in your opinion, the most relevant Semantic Web challenges that will be solved in the long term (10 years)? Why?
Within the SW multimedia analysis and annotation vision, and the SW challenge in general, the following problems are expected to be addressed within a time frame of ten years.
- Support for uncertainty, including fuzzy as well as probabilistic extensions. This is to be expected given the significant role that uncertainty plays in all kinds of interactions with knowledge and metadata.
- Support for non-standard inferences. Non-standard inferences hold significant potential, partly because of the need to handle uncertainty, particularly the ambiguity in reaching conclusions about what a multimedia document is about, and partly because deductive reasoning is not always appropriate for the tasks at hand (multimedia analysis is not merely model checking). Since the need for extending standard inference is becoming more and more evident in many aspects of the envisaged SW (e.g., service mediation, ontology matching, query answering, semantic retrieval), it is expected that within the next years proposed solutions will have reached a significant level of maturity.
- Support for more scalable approaches and technologies. Huge amounts of metadata associated with digital content are made available every day, calling for efficient and effective methods to process, store and access them. Recent interest in, and efforts put into, tractable knowledge representations and the respective reasoning services clearly indicate the significance of achieving scalability capable of supporting real-life applications.
- References
Arndt, R. et al., "Adding Formal Semantics to MPEG-7: Designing a Well-Founded Multimedia Ontology for the Web", Department of Computer Science, University of Koblenz. Technical Report. January 2007.
Bloehdorn, S. et al., "Semantic Annotation of Images and Videos for Multimedia Analysis", Proc. ESWC, Heraklion, Crete, Greece, May 29 - June 1, pp. 592-607, 2005.
Dasiopoulou, S. et al., "Knowledge-assisted semantic video object detection", IEEE CSVT, 15 (10), pp. 1210-1224, 2005.
Dasiopoulou, S. et al., "Multimedia Reasoning with Natural Language Support", 1st IEEE International Conference on Semantic Computing, Sept 17-19, Irvine, California, 2007.
Espinosa, S., "Multimedia Interpretation as Abduction", International Workshop on Description Logics (DL-2007), 2007.
Garcia, R., and Celma, O., "Semantic Integration and Retrieval of Multimedia Metadata", Proc. ISWC, Galway, Ireland, Nov. 6-10, 2005.
Hunter, J., "Adding Multimedia to the Semantic Web: Building an MPEG-7 Ontology", Proc. 1st Semantic Web Working Symposium, Stanford University, California, USA, July 30 - August 1, pp. 261-283, 2001.
Hunter, J. et al., "Realizing the hydrogen economy through Semantic Web technologies", IEEE Intelligent Systems 19 (1), pp. 40-47, 2004.
Maillot, N. et al., "Towards ontology-based cognitive vision", Mach. Vis. Appl. 16 (1), pp. 33-40, 2004.
Neumann, B. et al., "On scene interpretation with description logics", Tech. Rep. FBI-B-257/04, 2004.
Oberle, D. et al., "DOLCE ergo SUMO: On Foundational and Domain Models in SWIntO (SmartWeb Integrated Ontology)", Tech. Report, AIFB, University of Karlsruhe. July 2006.
Schober, J.P., "Content-based image retrieval by ontology-based object recognition", KI-2004 Workshop on Applications of Description Logics (ADL-2004), 2004.
Troncy, R. et al., "Enabling Multimedia Metadata Interoperability by Defining Formal Semantics of MPEG-7 Profiles", Proc. SAMT, Athens, Greece, December 6-8, 2006, pp. 41-55.
Tsinaraki, C. et al., "Ontology-Based Semantic Indexing for MPEG-7 and TVAnytime Audiovisual Content", Multimedia Tools Appl. 26(3), pp. 299-325, 2005.
