Title: Semantic Web for the Working Ontologist
Author: Dean Allemang
Publisher: Ingram
Genre: Software
Series: ACM Books
ISBN: 9781450376167
It is well known to anyone with even a passing interest in politics that crafting good legislation is not an easy task and that choosing the words of a law or statute carefully is very important. The same flexibility of interpretation that makes natural language models so versatile also makes it difficult to control how the laws will be interpreted in the future. When someone else reads the text, they will have their own background and their own interests that will influence how they interpret any particular model. Readers of the previous paragraph in the third edition probably interpreted it very differently from readers of the first edition only a decade earlier, despite the fact that the text has not changed at all. This phenomenon is so widespread that most government systems include a process (usually involving a court magistrate and possibly a committee of citizens) whereby disputes over the interpretation of a law or its applicability can be resolved.
When a model relies on particulars of the context of its reader for interpretation of its meaning, as is the case in legislation, we say that the model is informal. That is, the model lacks a formalism whereby the meaning of terms in the model can be uniquely defined.
In the hypertext Web today, there are informal models that help people communicate about the organization of the information. It is common for commerce web sites to organize their wares in catalogs with category names like “webcams,” “Oxford shirts,” and “granola.” In such cases, the communication is primarily one way; the catalog designer wants to communicate to the buyers the information that will help them find what they want to buy. The interpretation of these words is up to the buyers. The effectiveness of such a model is measured by the degree to which this is successful. If enough people interpret the categories in a way similar enough to the intent of the cataloger, then they will find what they want to buy. There will be the occasional discrepancy like “Why wasn’t that item listed as a webcam?” or “That’s not granola, that’s just plain cereal!” But as long as the interpretation is close enough, the model is successful.
A more collaborative style of document modeling comes in the form of community tagging. A number of web sites have been successful by allowing users to provide meaningful symbolic descriptions of their content in the form of tags. A tag in this sense is simply a single word or short phrase that describes some aspect of the content. Early examples of this sort of tagging system include Flickr for photos and del.icio.us for Web bookmarks. In more modern systems, we see “hashtags” in social media like Twitter, LinkedIn, and Facebook playing a similar role. Users of content organization services like Slideshare for presentations and YouTube for videos use tags to help other users find and discover content. The idea of community tagging is that each individual who provides content will describe it using tags of their own choosing. If any two people use the same tag, this becomes a common organizing entity; anyone who is browsing for content can access information from both contributors under that tag. The tagging infrastructure shows which tags have been used by many people. Not only does this help browsers determine what tags to use in a search, but it also helps content providers to find commonly used tags that they might want to use to describe new content. Thus, a tagging system will have a certain self-organizing character, whereby popular tags become more popular and unpopular tags remain unpopular, something like evolution by artificial selection of tags. The resulting collection of tags and their relations is called a folksonomy, to reflect the fact that this is a categorization from and by the crowd.
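The mechanics of community tagging described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular tagging site works: the contributor names, item identifiers, tags, and the helper functions `build_tag_index` and `tag_popularity` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical contributions: (contributor, item, tags). All names are illustrative.
contributions = [
    ("alice", "photo-001", ["sunset", "beach"]),
    ("bob", "photo-002", ["beach", "surfing"]),
    ("carol", "photo-003", ["beach", "sunset"]),
]

def build_tag_index(contributions):
    """Map each tag to the set of (contributor, item) pairs that use it."""
    index = defaultdict(set)
    for contributor, item, tags in contributions:
        for tag in tags:
            index[tag].add((contributor, item))
    return index

def tag_popularity(index):
    """Count how many distinct contributors have used each tag."""
    return Counter(
        {tag: len({who for who, _ in uses}) for tag, uses in index.items()}
    )

index = build_tag_index(contributions)
popularity = tag_popularity(index)

# Browsing by a shared tag returns content from every contributor who used it.
beach_items = sorted(item for _, item in index["beach"])

# Listing the most-used tags is what lets popular tags attract further use.
top_tags = popularity.most_common(2)
```

Nothing in the index records what a contributor meant by a tag; it records only that the same string was used. That is what makes the organization informal.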
Tagging systems of this sort provide an informal organization to a large body of heterogeneous information. The organization is informal in the sense that the interpretation of the tags requires human processing in the context of the consumer. Just because a tag is popular doesn’t mean that everyone is using it in the same way. In fact, the community selection process actually selects tags that are used in several different ways, whether they are compatible or not. As more and more people provide content, the popular tags saturate with a wide variety of content, making them less and less useful as discriminators for people browsing for content. This sort of problem is inherent in information modeling systems; since there isn’t an objective description of the meaning of a symbol outside the context of the provider and consumer of the symbol, the communication power of that symbol degrades as it is used in more and more contexts.
When tags are used incompatibly, it is a challenge to both humans and machines to differentiate their meaning. For example, the Twitter hashtag “#rpi” is currently used for a university in the US, a British currency concept, the Spanish term for someone who has passed away, and a shorthand for the Raspberry Pi computer. While these would seem very different, when coupled with technology like search engines or social networks, the term becomes a challenge to differentiate—a tweet like “#rpi is up” could refer to the university leading in a sports event, the British economy doing well, or someone having attached the small computer to a tree in their backyard (lest you think this is far-fetched, this was a real tweet which was indeed about someone putting their Raspberry Pi into a treehouse).
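A small Python sketch makes the difficulty concrete. It assumes a hypothetical list of posts, each annotated with the sense its author intended (an annotation that a real system would not have), and a hypothetical `search_by_tag` helper that matches on the tag string alone.

```python
# Hypothetical posts. The "intended" field records what each author meant;
# a real tagging system stores only the text and has no such field.
posts = [
    {"text": "#rpi is up", "intended": "Raspberry Pi computer"},
    {"text": "#rpi is up", "intended": "university in the US"},
    {"text": "#rpi is up", "intended": "British currency concept"},
]

def search_by_tag(posts, tag):
    """Return posts whose text contains the tag as a whitespace-separated token."""
    return [p for p in posts if tag in p["text"].lower().split()]

hits = search_by_tag(posts, "#rpi")

# Every post matches the same tag, yet the intended senses all differ.
distinct_senses = {p["intended"] for p in hits}
```

The search returns all three posts because it compares characters, not meanings; the three senses can be told apart only by someone who already knows the context of each author.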
Formality of a model isn’t a black-and-white judgment; there can be degrees of formality. This is clear in legal systems, where it is common to have several layers of legislation, each one giving objective context for the next. A contract between two parties is usually governed by some regional law that provides standard definitions for terms in the contract. Regional laws are governed by national laws, which provide constraints and definitions for their terms. National laws have their own structure, in which a constitution or a body of case law provides a framework for new decisions and legislation. Even though all these models are expressed in natural language and fall back on human interpretation in the long run, they can be more formal than private agreements that rely almost entirely on the interpretation of the agreeing parties.
This layering of informal models sometimes results in a modeling style that is reminiscent of Talmudic scholarship. The content of the Talmud includes not only the original scripture but also interpretative comments on the scripture by authoritative sources (classical rabbis). Their comments have gained such respect that they are traditionally published along with the original scripture for comment by later rabbis, whose comments in turn have become part of the intellectual tradition. The original scripture, along with all the authoritative comments, is collectively called the Talmud, and it is the basis of a classical Jewish education to this day.
A similar effect happens with informal models. The original model is appropriate in some context, but as its use expands beyond that context, further models are required to provide common context to explicate the shared meaning. But if this further exposition is also informal, then there is the risk that its meaning will not be clear, so further modeling must be done to clarify that. This results in heavily layered models, in which the meaning of the terms is always subject to further interpretation. It is the inherent ambiguity of natural language at each level that makes the next layer of commentary necessary until the degree of ambiguity is “good enough” that no more levels are needed. When it is possible to choose words that are evocative and have considerable agreement, this process converges much more quickly.
Human communication, as a goal for modeling, allows a model to play a role in the ongoing collection of human knowledge. The levels of communication can be quite sophisticated, including the collection of information used to interpret other information. In this sense, human communication is the fundamental requirement for building a Semantic Web. It allows people to contribute to a growing body of knowledge and then draw from it. But communication is not enough; to empower a web of human knowledge, the information in a model needs to be organized in such a way that it can be useful to a wide range of consumers.
2.2 Explanation and Prediction
Models are used to organize human thought in the form of explanations.