Title: Handbook on Intelligent Healthcare Analytics
Author: Group of authors
Publisher: John Wiley & Sons Limited
Genre: Technical literature
isbn: 9781119792536
Healthcare big data is huge because it draws on many sources, such as electronic health records (EHRs), data from wearables and medical devices, genomic data, clinical research data, the Internet of Things (IoT), search engine data, and social media data. Because of its variety and volume, big medical data is very difficult to integrate with traditional information processing systems and databases. Big data tools and technologies, however, are designed for gathering, processing, and storing large volumes of diverse medical data. Big data plays an important role in the healthcare industry because of the availability and easy accessibility of vast amounts of medical data, the increase in healthcare costs, and the need for personalized patient care. Big data analytics effectively accumulates and analyzes medical data and extracts knowledge from it.
Big data knowledge systems are especially important in the healthcare industry because patient data is used to develop health recommender systems, clinical decision support systems, disease prediction systems, and knowledge discovery systems. A knowledge system assists in analyzing medical big data and extracting valuable knowledge from it. Healthcare professionals use the extracted knowledge in their clinical decisions. In this way, the big data knowledge system transforms healthcare big data into useful information. Using healthcare data analytics, the healthcare industry can make better clinical decisions and deliver quality patient care at lower cost.
The following sections present and discuss the meaning of big data, its various dimensions, the tools and technologies used in big data, the process of creating value from big data, big data analytics, the role of big data knowledge systems in healthcare, healthcare big data analytics, the applications of big data, and the challenges faced by big data in the healthcare industry.
3.2 Overview of Big Data
3.2.1 Big Data: Definition
Every day, organizations produce an enormous quantity of data. These huge volumes of data constitute “big data”. Because big data is complex in nature, it requires powerful technologies and advanced algorithms for its processing.
A formal definition of big data was given in [1]: “Big data is the information asset characterized by such a high volume, velocity, and variety to require specific technology and analytical methods for its transformation into value.”
Over the last decade, data has grown constantly and at an unprecedented rate owing to digitalization and advances in technology. In general, big data is generated from the following sources [14]:
Social data: Social data refers to social media data generated on platforms such as YouTube and Twitter. This data is mainly used in market analysis. For example, analysis of Facebook likes and comments and of tweets on Twitter reveals details of consumer behavior.
Machine data: Machine data refers to data generated by machines, such as wearables, sensor devices, web logs, and satellites.
Transactional data: Transactional data is generated as a result of transactions, which can be online or offline. Examples include delivery receipts, orders, and invoices.
Human-generated data: Human-generated data is extracted from emails, electronic medical reports, messages, etc.
Search engine data: Search engine data is generated from browsers.
All of the abovementioned data comes in diverse formats such as comments, videos, emails, and sensor data, most of which are unstructured. Big data refers to data sets of huge size that grow exponentially with time. Examples of big data include the Amazon product list, YouTube videos, Google search engine data, and jet engine data. Storing and processing such data is not practical with conventional databases, which are typically designed for gigabytes of data, whereas big data can run to several petabytes. Big data solutions address this problem with distributed storage and processing systems.
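The idea behind distributed storage and processing can be illustrated with a small sketch. It is not taken from the handbook: the record layout, the function names (`partition`, `count_by_source`, `distributed_count`), and the use of threads as stand-ins for cluster nodes are all assumptions made for illustration. The data is split into chunks, each chunk is processed independently, and the partial results are merged.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into n_parts chunks (hypothetical stand-ins for storage nodes)."""
    return [records[i::n_parts] for i in range(n_parts)]


def count_by_source(chunk):
    """Map step: count the records per source type within one chunk."""
    return Counter(rec["source"] for rec in chunk)


def distributed_count(records, n_parts=3):
    """Process every chunk in its own worker, then merge the partial counts."""
    chunks = partition(records, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(count_by_source, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Reduce step: add partial counts together
    return dict(total)


# Toy records, one or more per source type listed in the text (illustrative only).
records = [
    {"source": "social", "payload": "tweet"},
    {"source": "machine", "payload": "heart-rate reading"},
    {"source": "transactional", "payload": "invoice"},
    {"source": "machine", "payload": "web log line"},
    {"source": "human", "payload": "email"},
    {"source": "search", "payload": "query"},
    {"source": "machine", "payload": "satellite frame"},
]
counts = distributed_count(records)
```

In a real deployment the chunks would live on separate machines and the workers would run where the data is stored; the split, process, and merge pattern stays the same.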
3.2.2 Big Data: Characteristics
Big data is the massive set of information generated through the use of the latest technologies. This large set of data is used for individual and organizational purposes. Previously, information could be generated, stored, and processed easily because the sources of data were limited. The conventional database was the single source of data, and most of the data in it was in structured format. Today, data comes in a wide variety of formats, such as sensor data, email messages, images, audio, and video, and together these make up big data. Most of this data is unstructured. The five common characteristics of big data are volume, variety, veracity, velocity, and value. In addition to these five dimensions, many researchers have proposed further dimensions, including volatility, validity, visualization, and variability. Figure 3.1 and Table 3.1 present the features of big data in terms of these commonly adopted Vs [24].
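As a rough illustration, four of these dimensions can be approximated with simple measurements on a sample of records. The sketch below is not from the handbook: the record fields (`ts`, `format`, `value`), the `profile` function, and the chosen proxies (record count for volume, distinct formats for variety, records per second for velocity, share of non-missing values for veracity) are assumptions. Value is omitted because it depends on how the data is used and cannot be read off the records themselves.

```python
from datetime import datetime

# Toy sample; field names and contents are hypothetical.
records = [
    {"ts": "2021-01-01T00:00:00", "format": "sensor", "value": 72},
    {"ts": "2021-01-01T00:00:01", "format": "email", "value": "Follow-up visit"},
    {"ts": "2021-01-01T00:00:02", "format": "sensor", "value": None},
    {"ts": "2021-01-01T00:00:04", "format": "image", "value": "scan_001.png"},
]


def profile(records):
    """Return crude proxies for volume, variety, velocity, and veracity."""
    times = [datetime.fromisoformat(r["ts"]) for r in records]
    span = (max(times) - min(times)).total_seconds()
    return {
        # Volume: how many records were collected.
        "volume": len(records),
        # Variety: how many distinct data formats appear.
        "variety": len({r["format"] for r in records}),
        # Velocity: records per second over the observed time span.
        "velocity": len(records) / span if span > 0 else float("inf"),
        # Veracity: share of records whose value is not missing.
        "veracity": sum(r["value"] is not None for r in records) / len(records),
    }


summary = profile(records)
```

For this sample, `summary` reports 4 records in 3 formats, arriving at 1.0 record per second, with 0.75 of the values present.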
Figure 3.1 Dimensions of big data.
Table 3.1 Dimensions of big data.
Dimensions of big data | Description |
Volume | In general, volume refers to quantity or amount. In big data, volume refers to the gigantic size of the data sets. Big data is known for its voluminous size, and data is produced daily from many places. |
Variety | Variety means diversity. In big data, variety refers to data generated and collected from diverse sources. Big data usually takes a variety of forms, comprising structured, semi-structured, and unstructured data. |
Velocity | Velocity means speed. In the big data context, velocity denotes the speed of data creation. Big data arrives at a fast rate, like a stream of water. |
Veracity | Veracity is the credibility, reliability, and accuracy of big data and the quality of its data sources. |
Variability | The variability is not the same as variety. The variability means |