Providing for successful ontology re-use
Along with the rapid development of the World Wide Web, the amount of information available online has increased exponentially. A lack of standardisation and common vocabulary results in heterogeneity, which hinders information exchange and communication. Ontologies, which capture the semantics of information from various sources and provide a concise and declarative description, have therefore gained unprecedented significance. The increasing awareness of the benefits of ontologies in information processing has led to monolithic ontologies for real-world domains.
Within the European project WONDERWEB, coordinated efforts have been devoted to the development of modularisation methods and tools for managing large domains containing thousands of concepts. This notion of modularisation comes from software engineering, where it refers to the design of software from well-defined components that can be handled and re-used independently. Researchers at the Vrije Universiteit in Amsterdam sought to address the scalability problems by means of a partitioning method based purely on the structure of the ontology. Such a method cannot capture important dependencies that could only be found by analysing the logical definitions of classes and concepts. Accepting this limitation was a practical consideration: a purely structure-based analysis can scale up to large ontologies that contain hundreds of thousands of concepts. This iterative partitioning has already been successfully applied to real-world ontologies available in OWL (Web Ontology Language) or RDF (Resource Description Framework) Schema, including the Suggested Upper Merged Ontology (SUMO). SUMO was developed within the IEEE Standard Upper Ontology Working Group as a standard ontology that will promote data interoperability, information search and retrieval, as well as natural language processing.
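To make the idea of partitioning on structure alone more concrete, the following is a minimal sketch and not the researchers' actual algorithm, whose details are not given here. It assumes that the only input is a list of (subclass, superclass) pairs, that the strength with which a concept depends on a neighbour is simply one divided by its number of neighbours, and that modules are the connected groups left after weak links are dropped. The function name partition_by_structure and the min_strength parameter are hypothetical.

```python
from collections import defaultdict


def partition_by_structure(subclass_edges, min_strength=0.6):
    """Group concepts into modules using only the subclass hierarchy.

    Illustrative assumption: a concept with few neighbours depends
    strongly on each of them (strength = 1 / number of neighbours).
    """
    # Build an undirected neighbourhood map from the hierarchy edges.
    neighbours = defaultdict(set)
    for child, parent in subclass_edges:
        neighbours[child].add(parent)
        neighbours[parent].add(child)

    def strength(concept):
        # Assumed measure: dependency is spread evenly over all neighbours.
        return 1.0 / len(neighbours[concept])

    # Keep a link only if at least one endpoint depends strongly on the other.
    strong = {concept: set() for concept in neighbours}
    for a, linked in neighbours.items():
        for b in linked:
            if max(strength(a), strength(b)) >= min_strength:
                strong[a].add(b)
                strong[b].add(a)

    # Each connected component of the remaining graph becomes one module.
    modules, seen = [], set()
    for start in sorted(strong):
        if start in seen:
            continue
        module, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in module:
                continue
            module.add(node)
            stack.extend(strong[node] - module)
        seen |= module
        modules.append(module)
    return modules


if __name__ == "__main__":
    # Toy hierarchy, invented purely for illustration.
    edges = [
        ("Dog", "Animal"), ("Cat", "Animal"), ("Animal", "Thing"),
        ("Car", "Vehicle"), ("Bike", "Vehicle"), ("Vehicle", "Thing"),
    ]
    for module in partition_by_structure(edges):
        print(sorted(module))
```

In this toy hierarchy the links from Animal and Vehicle up to Thing are weak under the assumed measure, so they are cut and Thing ends up in a module of its own. Because nothing beyond the edge list is inspected, the cost grows with the number of links rather than with the complexity of the logical definitions, which is the practical trade-off described above.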
In a second experiment, on the partitioning created for the administrative part of the National Cancer Institute (NCI) ontology, promising results were achieved with this minimal approach. It remains to be investigated whether using structural information in addition to the concept hierarchy produces better results. Another direction for future research is the method's performance on graphs that include dependencies resulting from user-defined relations.