Title: Linked Data Visualization
Author: Laura Po
Publisher: Ingram
Genre: Software
Series: Synthesis Lectures on the Semantic Web: Theory and Technology
ISBN: 9781681738345
Laura Po, Nikos Bikakis, Federico Desimoni, and George Papastefanatos
March 2020
Acknowledgments
Sincere thanks go to Dr. Jakub Klímek for his careful reading of the manuscript and his constructive remarks. The work carried out in this book was partially supported by the Networking on Linked Data project funded by the “Enzo Ferrari” Engineering Department of the University of Modena and Reggio Emilia within FAR2019 as well as by the VisualFacts project (#1614) funded by the 1st Call of the Hellenic Foundation for Research and Innovation Research Projects for the support of post-doctoral researchers.
CHAPTER 1
Introduction
Linked data provides the basis for knowledge to be distributed, networked, and shared. The term Linked Data (LD) refers to a set of best practices for publishing and interlinking structured data on the Web. Creating a connection between data and its context could lead to the development of intelligent search engines that explore the Web by moving from a keyword-based approach to a meaning-based approach. Searches can be more accurate when they exploit the relations between words. LD can benefit several research areas: in the medical field, for structuring the connections between illnesses and their cures; in the scientific literature, for structuring the citations among the millions of documents published online. The potential uses of LD are countless.
On the other hand, given the wide availability of LD sources, it is crucial to provide intuitive tools that enable users without a background in semantic technologies to explore, analyze, and interact with increasingly large datasets. Visual analytics integrates the analytic capabilities of the computer with the abilities of the human analyst, allowing novel discoveries and empowering individuals to take control of the analytical process. LD visualization provides graphical representations of datasets in order to facilitate their analysis and the generation of insights from complex interconnected information.
In this chapter, we explain why visualization is a powerful means for linked data exploration, present the principles and technologies on which LD is based, and describe the impact that LD can have in the real world.
In the next section, we start by illustrating how visualization is a good way of interacting with very large amounts of complex, interlinked, multi-dimensional data. The evolution of the Web from Web 1.0 to Web 4.0 is described in Section 1.2. We highlight the principles of LD in Section 1.3; after this, we describe the Linked Data Cloud (Section 1.4), which depicts the datasets that have been published according to those principles. Sections 1.5 and 1.6 are devoted to assessing the impact of LD on our lives and the opportunities it can generate. Finally, in Section 1.7, we introduce the theoretical basis of LD by describing the Semantic Web technologies.
1.1 THE POWER OF VISUALIZATION ON LINKED DATA
On any kind of data, visualization enables serendipity and exploration. On LD, it allows users to start making sense of previously unknown data, to form a mental picture of the dataset, or to drill into specific portions of the source. Moreover, visualization is probably the only way to enable users without technical skills to grasp the meaning of the content of LD sources. Domain experts can also benefit from a visual exploration of a dataset, since it saves time.
Following the idea of Tim Berners-Lee, each resource should have a unique name that starts with HTTP, i.e., an HTTP URI. This means that reality can be replicated over the Internet: each real-world resource can have its digital alter ego. Moreover, a pillar of linked data is that resources should be connected to other resources.
The simplest relationships are personal ones: John is a friend of Martin, Martin is the son of Peter, and therefore John is remotely connected to Peter. However, this can be extended to every field: Biology, Sociology, and Art are only a few of the areas in which LD can be deployed. LD has the power to express virtually anything. But how can everything be visualized? One option is to learn Semantic Web technologies, write SPARQL queries, and then analyze the results. Besides the difficulty of writing SPARQL queries, this approach works only when the results are limited in size, since the amount of information that can be displayed on a screen is limited. The other option is to exploit the power of visualization.
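The personal relations mentioned above (John is a friend of Martin, Martin is the son of Peter) can be written as subject-predicate-object triples, which is the basic shape of LD statements. The following is only a minimal sketch in plain Python: the predicate names `friendOf` and `sonOf` and the helper `find_path` are hypothetical, chosen for illustration, and no RDF library is involved. It shows how an indirect connection between two resources emerges by following links.

```python
from collections import deque

# Hypothetical triples (subject, predicate, object) for the example in the text.
TRIPLES = [
    ("John", "friendOf", "Martin"),
    ("Martin", "sonOf", "Peter"),
]


def find_path(triples, start, goal):
    """Breadth-first search over the triples, ignoring edge direction.

    Returns the alternating list [node, predicate, node, ...] of the
    shortest connection found, or None if the two nodes are unconnected.
    """
    neighbours = {}
    for s, p, o in triples:
        neighbours.setdefault(s, []).append((p, o))
        neighbours.setdefault(o, []).append((p, s))  # allow traversal both ways

    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [predicate, nxt]))
    return None


print(find_path(TRIPLES, "John", "Peter"))
# ['John', 'friendOf', 'Martin', 'sonOf', 'Peter']
```

With two triples the path is obvious by inspection; the same traversal on a dataset with millions of triples is exactly the kind of result that no longer fits on a screen as text.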
The first tests on graphic visualization date back to 1890, when Herman Hollerith revolutionized the world of data analysis with a creative and innovative idea: he used punch cards to collect and analyze the U.S. census data. Using punch cards saved two years and five million dollars compared with the manual tabulation techniques used in the previous census, while enabling a more thorough analysis of the data [Blodgett and Schultz, 1969]. We currently face an analogous development in the field of LD. Since 2006, many researchers have developed original solutions to the problem of LD visualization, and we can now exploit a variety of tools and visualization layouts.
Listing 1.1: Query for extracting relations between classes
For example, how can a user understand the content of the Wikipathways dataset? A user who wants to know what the dataset contains could formulate a SPARQL query, similar to Listing 1.1, to extract the classes and the relations among them, and then analyze the results, as shown in Figure 1.1.
Adopting a graphical visualization, instead, can greatly simplify the analysis of the results. For example, the same information can be obtained through one of the visualizations provided by the H-BOLD tool (Figure 1.2). As can be seen, when the same information is displayed as a graph, it is easier to understand the connections and paths among the classes.
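As a rough illustration of what extracting relations between classes involves, the sketch below runs the same logic over a toy in-memory dataset. Everything in it is an assumption made for illustration: the `ex:` names are invented and not taken from Wikipathways, the function `class_relations` is hypothetical, and the SPARQL string is one plausible way to phrase the question, not the text of Listing 1.1.

```python
RDF_TYPE = "rdf:type"

# Toy dataset; every identifier here is hypothetical.
TRIPLES = [
    ("ex:pathway1", RDF_TYPE, "ex:Pathway"),
    ("ex:gene1", RDF_TYPE, "ex:GeneProduct"),
    ("ex:gene2", RDF_TYPE, "ex:GeneProduct"),
    ("ex:gene1", "ex:isPartOf", "ex:pathway1"),
    ("ex:gene2", "ex:isPartOf", "ex:pathway1"),
    ("ex:pathway1", "ex:title", "Example pathway"),  # literal value, has no class
]

# One assumed way to ask the same question in SPARQL (not Listing 1.1 itself).
CLASS_RELATIONS_QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?sourceClass ?property ?targetClass
WHERE {
  ?s ?property ?o .
  ?s a ?sourceClass .
  ?o a ?targetClass .
  FILTER (?property != rdf:type)
}
"""


def class_relations(triples):
    """Return the distinct (source class, property, target class) triples."""
    # First pass: record the classes of every typed resource.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    # Second pass: lift each instance-level link to the class level.
    relations = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for source_class in types.get(s, ()):
            for target_class in types.get(o, ()):
                relations.add((source_class, p, target_class))
    return sorted(relations)


for row in class_relations(TRIPLES):
    print(row)
# ('ex:GeneProduct', 'ex:isPartOf', 'ex:Pathway')
```

Even on this tiny example the result is a table of rows; on a real dataset the table can run to hundreds of rows, which is where a tabular listing becomes hard to read.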
Figure 1.1: Results of the query in Listing 1.1.
Figure 1.2: H-BOLD schema visualization of the Wikipathways dataset.
A crucial and impressive aspect of LD is that information is interlinked across different sources. Therefore, starting from the URI of a resource, it is possible to display not only the information that describes the resource within the dataset, but also information from external datasets. Figure 1.3 depicts how an LD visualization tool is able to create a collage of information from disparate sources. In that example, LodView has been used to merge all the information about London. Starting from the URI of the resource London in DBpedia (http://dbpedia.org/resource/London), the tool looks at all the outgoing links and illustrates