Ontology learning
Introduction
Manual construction of ontologies for the Semantic Web is a time-consuming task. The ontology learning field tries to help by automating the construction of new ontologies. The volume of data produced by the success of the Internet demands methodologies and tools that automatically extract unknown and potentially useful knowledge from it and generate structured representations of that knowledge. Although ontological engineering tools have matured over the last decade [Gómez-Pérez et al, 2003], manual ontology acquisition remains a tedious, time-consuming, error-prone, and complex task that can easily result in a knowledge acquisition bottleneck [Maedche 2002]. Moreover, as information needs grow, existing ontologies must be updated and enriched with new content.
Research in ontology learning has produced several approaches that partially automate the ontology construction process [Gómez-Pérez and Manzano-Macho, 2005; Buitelaar et al., 2005], with the aim of reducing the time and effort that ontology development requires. The methods and tools proposed in recent years speed up ontology building by drawing on different sources and techniques. Computational linguistics, information extraction, statistics, and machine learning are the most prominent paradigms applied so far. A great variety of information sources is also used for ontology learning. Although Web pages, dictionaries, knowledge bases, and semi-structured and structured sources can all be used to learn an ontology, most methods use only textual sources. All methods and tools are strongly tied to the type of processing they perform.
In summary, the ontology learning field brings together a number of research activities that focus on different types of knowledge and information sources but share the goal of a common domain conceptualisation. Ontology learning is a complex multidisciplinary field that draws on natural language processing, text and web data extraction, machine learning, and ontology engineering.
Methods for Ontology Learning
Several approaches for partially automating the knowledge acquisition process have appeared during the last decade. This automation can be carried out with natural language analysis and machine learning techniques. According to the main technique they use for learning, the methods can be grouped into those based on linguistics, those based on statistical approaches, and those based on machine learning.
Based on Linguistic Techniques
This group includes the methods that rely mainly on linguistic techniques such as linguistic patterns, pattern-based extraction, and semantic relatedness measures. One of the best-known approaches applies Hearst-style linguistic patterns to detect taxonomic relationships (e.g. hypernymy and hyponymy) between words or groups of words. Other approaches are also possible, such as those of Alfonseca and Manandhar (2002), Aussenac-Gilles (2005), and Hahn and Markó (2001), among others.
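As a rough illustration of pattern-based extraction, the sketch below applies a single "such as" pattern to one sentence. It is a minimal example rather than any of the cited methods: the function name `extract_hyponym_pairs`, the regular expression, and the sample sentence are assumptions made for illustration, and noun phrases are approximated by single words.

```python
import re

# One illustrative Hearst-style pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Noun phrases are approximated by single words, which is a strong simplification.
SUCH_AS = re.compile(r"(\w+),? such as ((?:\w+, )*\w+(?:,? (?:and|or) \w+)?)")


def extract_hyponym_pairs(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text.lower()):
        hypernym = match.group(1)
        # Split the enumeration on ", " and on " and " / " or ".
        hyponyms = re.split(r",? (?:and|or) |, ", match.group(2))
        pairs.extend((hyponym, hypernym) for hyponym in hyponyms if hyponym)
    return pairs


# Hypothetical input sentence.
sentence = "Instruments such as guitars, violins and cellos need regular tuning."
for hyponym, hypernym in extract_hyponym_pairs(sentence):
    print(f"{hyponym} is-a {hypernym}")
```

A realistic system would run such patterns over part-of-speech-tagged or chunked text so that multi-word noun phrases are captured, and would combine several patterns instead of one.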
Based on Statistical Approaches
This group includes the methods that rely mainly on computing statistical measures that help the ontologist detect new concepts or relations among them. These techniques are usually applied together with other techniques, mainly natural language processing. Examples include the methods proposed by Agirre et al. (2000) and Faatz and Steinmetz (2002), among others.
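One statistical measure that can serve this purpose is pointwise mutual information (PMI) over term co-occurrence. The sketch below is only an assumed example of such a measure, not the procedure of the cited works: the function name `pmi_scores`, the whitespace tokenisation, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sentences):
    """Score term pairs by pointwise mutual information over sentence-level co-occurrence."""
    n = len(sentences)
    term_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        # Count each term once per sentence; sorting gives pairs a canonical order.
        terms = sorted(set(sentence.lower().split()))
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))
    scores = {}
    for (a, b), joint in pair_counts.items():
        # PMI = log2( P(a, b) / (P(a) * P(b)) ), probabilities estimated per sentence.
        scores[(a, b)] = math.log2(joint * n / (term_counts[a] * term_counts[b]))
    return scores


# Hypothetical toy corpus, already reduced to content words.
corpus = ["guitar string", "guitar string tuning", "piano key", "piano tuning"]
for pair, score in sorted(pmi_scores(corpus).items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

Pairs with a high score co-occur more often than chance would predict, so they are candidates for a relation that the ontologist can then review. With a corpus this small the scores are unreliable; the example only shows the computation.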
Based on Machine Learning Techniques
This group includes the methods that rely mainly on learning algorithms to assist the ontologist in detecting new concepts or relations among them and in finding their correct place in the taxonomy. These techniques are usually applied together with other techniques, mainly natural language processing. This group includes, among others, Cimiano et al. (2006), Karoui et al. (2006), Khan and Luo (2002), Ruiz-Casado et al. (2007), and Velardi et al. (2002).
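Several of the works cited in this group use clustering to propose concept hierarchies. The sketch below shows plain average-linkage agglomerative clustering with cosine similarity, which is a generic technique and not the specific algorithm of any cited paper. The function names and the context vectors (counts of verbs each noun appears with) are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def agglomerate(vectors):
    """Repeatedly merge the two most similar clusters; return the merge history."""
    clusters = [(term,) for term in sorted(vectors)]
    merges = []

    def linkage(c1, c2):
        # Average linkage: mean pairwise similarity between members of the two clusters.
        sims = [cosine(vectors[a], vectors[b]) for a in c1 for b in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2))
    return merges


# Hypothetical context vectors: how often each noun occurs as the object of each verb.
contexts = {
    "car": {"drive": 4, "park": 2},
    "bus": {"drive": 3, "park": 1, "board": 1},
    "apple": {"eat": 5, "peel": 1},
    "pear": {"eat": 3, "peel": 2},
}
for left, right in agglomerate(contexts):
    print(left, "+", right)
```

Each merge can be read as a candidate parent node in a taxonomy. The clusters are unlabelled, so naming the resulting concepts and validating the hierarchy remain tasks for the ontologist.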
---
Tools for Learning Ontologies
A few tools and systems assist the ontology engineer in performing the knowledge acquisition task. Grouped according to their main aim, that is, according to which elements of the ontology they can learn, there are three main groups of ontology learning tools. The first group includes the tools that help to detect new relations (taxonomic or non-taxonomic) in the selected input, for example ASIUM [Faure et al., 2000], LTG Text Processing Workbench [Mikheev et al., 1997], and OntoLT [Buitelaar et al., 2004]. These tools are closely tied to the type of natural language processing required for the learning process. The second group covers the tools that assist the knowledge engineer in finding and setting up new concepts, among them Mo’K [Bisson et al., 2000], TERMINAE [Biebow et al., 1999], and Welkin [Alfonseca and Manandhar, 2002]. Finally, the last group comprises the tools that help the knowledge engineer build a taxonomy or enrich an existing one. It includes OntoLearn [Navigli et al., 2004], KAON, and Text2Onto [Cimiano et al., 2005], among others.
References
[Gómez-Pérez et al, 2003] Gómez-Pérez, A., Fernández-López, M., Corcho, O.: Ontological Engineering: With Examples from the Areas of Knowledge Management, E-Commerce and the Semantic Web. Advanced Information and Knowledge Processing. Springer Verlag (2003)
[Maedche 2002] Maedche, A.: Ontology Learning for the Semantic Web. Volume 665 of The Kluwer International Series in Engineering and Computer Science. Kluwer Academic Publishers (2002)
[Gómez-Pérez and Manzano-Macho, 2005] Gómez-Pérez, A., Manzano-Macho, D.: An overview of methods and tools for ontology learning from texts. Knowledge Engineering Review 19 (2005) 187–212
[Buitelaar et al., 2005] P. Buitelaar, P. Cimiano, and B. Magnini, editors. Ontology Learning from Text: Methods, Evaluation and Applications, volume 123 of Frontiers in Artificial Intelligence and Applications. IOS Press, Nieuwe Hemweg 6B, 1013 BG Amsterdam, The Netherlands, July 2005.
[Alfonseca and Manandhar, 2002] E. Alfonseca and S. Manandhar. Extending a lexical ontology by a combination of distributional semantics signatures. In Proceedings of the 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2002), pages 1–7, Berlin, 2002. Springer.
[Aussenac-Gilles, 2005] N. Aussenac-Gilles. Supervised text analyses for ontology and terminology engineering. In Proceedings of the Dagstuhl Seminar on Machine Learning for the Semantic Web, 2005.
[Agirre et al., 2000] E. Agirre, O. Ansa, E. Hovy, and D. Martinez. Enriching very large ontologies using the WWW. In Workshop on Ontology Construction of the European Conference of A.I. (ECAI-00), 2000.
[Faatz and Steinmetz, 2002] A. Faatz and R. Steinmetz. Ontology enrichment with texts from the WWW. In Semantic Web Mining 2nd Workshop at ECML/PKDD-2002, Helsinki, Finland, 2002.
[Hahn and Markó, 2001] U. Hahn and K. Markó. Joint knowledge capture for grammars and ontologies. In Y. Gil, M. Musen, and J. Shavlik, editors, Proceedings of the First International Conference on Knowledge Capture (K-CAP 2001), pages 68–75, Victoria, British Columbia, Canada, 2001. ACM Press.
[Cimiano et al., 2005] P. Cimiano and S. Staab. Learning concept hierarchies from text with a guided hierarchical clustering algorithm. In C. Biemann and G. Paas, editors, Proceedings of the ICML 2005 Workshop on Learning and Extending Lexical Ontologies with Machine Learning Methods, Bonn, Germany, 2005.
[Karoui et al., 2006] L. Karoui, M. Aufaure, and N. N. Bennacer. Context-based hierarchical clustering for the ontology learning. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006), pages 400–427, Hong Kong, China, 2006.
[Khan and Luo, 2002] L. Khan and F. Luo. Ontology construction for information selection. In Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence, pages 122–127, Washington, USA, 2002. IEEE Computer Society.
[Ruiz-Casado et al., 2007] M. Ruiz-Casado, E. Alfonseca, and P. Castells. Automatising the learning of lexical patterns: An application to the enrichment of WordNet by extracting semantic relationships from Wikipedia. Data and Knowledge Engineering, 61:484–499, 2007.
[Velardi et al., 2002] P. Velardi, R. Navigli, and M. Missikoff. Integrated approach for web ontology learning and engineering. IEEE Computer, 35(11):60–63, 2002.
[Faure et al., 2000] D. Faure and T. Poibeau. First experiments of using semantic knowledge learned by ASIUM for information extraction task using INTEX. In S. Staab, A. Maedche, C. Nedellec, and P. Wiemer-Hastings, editors, Proceedings of the Workshop on Ontology Learning, 14th European Conference on Artificial Intelligence (ECAI’00), Berlin, Germany, 2000.
[Mikheev et al., 1997] A. Mikheev and S. Finch. A workbench for finding structure in texts. In Proceedings of ANLP-97, Washington D.C., March 1997. ACL.
[Buitelaar et al., 2004] P. Buitelaar, D. Olejnik, and M. Sintek. A Protégé plug-in for ontology extraction from text based on linguistic analysis. In J. Davies, D. Fensel, C. Bussler, and R. Studer, editors, Proceedings of the 1st European Semantic Web Symposium (ESWS), Heraklion, Greece, May 2004.
[Bisson et al., 2000] G. Bisson, C. Nedellec, and D. Canamero. Designing clustering methods for ontology building: The Mo’K workbench. In S. Staab, A. Maedche, C. Nedellec, and P. Wiemer-Hastings, editors, Proceedings of the Workshop on Ontology Learning, 14th European Conference on Artificial Intelligence (ECAI’00), Berlin, Germany, 2000.
[Biebow et al., 1999] B. Biebow and S. Szulman. Terminae: a linguistic-based tool for the building of a domain ontology. In D. Fensel and R. Studer, editors, Proceedings of the 11th European Workshop on Knowledge Acquisition, Modelling and Management, volume 1621 of Lecture Notes in AI, pages 49–66, Germany, 1999. Springer-Verlag.
[Cimiano et al., 2005] P. Cimiano and J. Völker. Text2Onto - a framework for ontology learning and data-driven change discovery. In Proceedings of the 10th International Conference on Applications of Natural Language to Information Systems (NLDB’2005), 2005.
[Navigli et al., 2004] R. Navigli and P. Velardi. Learning domain ontologies from document warehouses and dedicated web sites. Computational Linguistics, 30(2):151–179, 2004.