Title: Smarter Data Science
Author: Cole Stryker
Publisher: John Wiley & Sons Limited
Genre: Databases
ISBN: 9781119693420
ARCHITECTURE AND DESIGN
The difference between architecture and design is not immediately clear. Professor and author Philippe Kruchten has argued that all architecture is design. An example built on the themes of difficulty and cost can help create a mental model for telling the two apart.
A building has an external structure. Within the building, rooms are created, and furniture or other objects can be placed in each room. In this analogy, the external structure represents architecture, and the objects in a room represent design. The placement of the furniture, even if heavy, can be rearranged with minimal effort and cost. New elements can even be brought into the room over time, and other elements can be removed. The placement is designed.
The external walls may be immovable, especially if you just want to move the walls on the 50th story of a skyscraper. But, even if you could move the walls, the time, expense, and complexity can make the prospect inadvisable.
Elements within your designs that are anchor points and highly disruptive or expensive to change are architectural. Elements that can be reasonably changed over time are design elements. In an information architecture, the need to have an environment to support AI is architectural; the use of a machine learning library or the selection of features for use in a model is design.
Facilitating the Winds of Change: How Organized Data Facilitates Reaction Time
How much time an organization has to respond to a change varies and always depends on the circumstances. When the European Union introduced a law known as the General Data Protection Regulation (GDPR), all companies conducting business with individual citizens of the European Union and the European Economic Area were given a specific date by which they were required to comply with the changes the law introduced.
When the media company Netflix switched its core business model to subscription-based streaming from online DVD rentals, the company essentially gave notice to all brick-and-mortar DVD rental companies to switch their own existing business models or risk irrelevance. The response from the traditional DVD rental companies has proven to be overwhelmingly inadequate. So, while Netflix does have marketplace competition, the competition is not coming from the organizations that owned or operated the brick-and-mortar DVD rental facilities at the time Netflix switched its business model.
Sometimes transformations occur in a slow and progressive manner, while some companies can seemingly transform overnight. Some transformation needs can be sweeping (e.g., to comply with insider-trading rules). Adjustments might even blindside some employees in ways that they perceive as unwarranted.
Sweeping changes equivalent to eminent domain can be part and parcel of management prerogative. An internal IT department can be outsourced, a division can be sold, sales regions rearranged, and unsatisfactory deals made just to appease a self-imposed quota or sales mark. Some of these changes can be forced on an organization on a moment's notice or even appear to be made on a whim. Like eminent domain, sometimes change arrives at the organization swiftly and seemingly capriciously—but when it comes, reaction is not optional.
NOTE
Eminent domain is a government's right to expropriate private property. In 1625, Hugo Grotius (1583–1645) coined the term eminent domain to describe a taking away by those in authority. In general, eminent domain is the procurement of an individual's property by a state for public purposes. An individual's right to own a home and the land beneath it is viewed as part of the liberty extended to all Americans by the Constitution of the United States. Varying degrees of land ownership are also a liberty afforded to individuals in many other nations around the world.
Within a corporate culture, eminent domain represents the ability of senior leaders to maneuver around previously accepted controls and protocols.
MUTABLE
In computing, a mutable object is an object whose state can be modified after it is created. An immutable object is an object whose state cannot be modified after it is created. Designing solutions to be mutable will make it easier to address new needs. When it comes to managing data, adding mutability concepts into a design will make adding a variable, deleting a variable, and modifying a variable's use or characteristics easier and more cost effective.
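The distinction can be illustrated with a minimal Python sketch. The record fields (`customer_id`, `region`, `segment`) and the `FrozenRecord` class are hypothetical names chosen for illustration; they do not come from the book.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

# Mutable: a plain dict record can gain, lose, or change fields in place.
record = {"customer_id": 42, "region": "EMEA"}
record["segment"] = "retail"      # add a variable
del record["region"]              # delete a variable
record["customer_id"] = "C-42"    # change a variable's characteristics (int -> str)


# Immutable: a frozen dataclass rejects any change after creation.
@dataclass(frozen=True)
class FrozenRecord:  # hypothetical example type
    customer_id: int
    region: str


frozen = FrozenRecord(customer_id=42, region="EMEA")
try:
    frozen.region = "APAC"        # attempt to modify state in place
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

# A "change" to an immutable object means building a new object instead.
updated = replace(frozen, region="APAC")
```

In this sketch the dict absorbs all three kinds of change in place, while the frozen record forces a new object for every change. Which trade-off suits a given data design depends on how often its variables are expected to evolve.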
Quae Quaestio (Question Everything)
Different users might phrase comparable questions using different terminology, and even the same user from query to query might introduce nuances and various idiosyncrasies. Users are not always succinct or clear about their objectives or informational needs. Users may not necessarily know what to request.
Consequently, in business, there is a need to question everything to gain understanding. Although questioning everything might seem to stall progress in an endless loop (Figure 2-5), it actually opens up all possibilities to exploration, and this is where the aforementioned trust matrix can help guide the development of a line of inquiry. This is also why human salespeople, as a technique, will often engage a prospect in conversation about their overall needs, rather than outright asking them what they are looking for.
Figure 2-5: Recognizing that the ability to skillfully ask questions is the root to insight
In Douglas Adams' The Hitchhiker's Guide to the Galaxy, when the answer to the ultimate question was met with a tad bit of disdain, the computer said, “I think the problem, to be quite honest with you, is that you've never actually known what the question is” (New York: Harmony Books, 1980). The computer then surmised that unless you fully come to grips with what you are asking, you will not always understand the answer. Being able to appropriately phrase a question (or query) is a topic that should not be taken lightly.
Inserting AI into a process is going to be more effective when users know what they want and can also clearly articulate that want. Because AI systems vary in type and comprise many classes of algorithms, the basis for handling variations in the quality of questions is to first seek quality and organization in the data.
However, data quality and data organization can seem out-of-place topics if an AI system is built to leverage many of its answers from unstructured data. For unstructured data that is textual—versus image, video, or audio—the data is typically in the form of text from pages, documents, comments, surveys, social media, and so on. But even nontextual data can yield text in the form of metadata, annotations, or tags via transcribing (in the case of audio) or annotating/tagging words or objects found in an image, as well as any other derivative information such as location, object sizes, time, etc. All types of unstructured data can still yield structured data from parameters associated with the source and the data's inherent context.
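As a rough illustration of how unstructured text can still yield structured data, the following Python sketch derives a flat record from a piece of free text plus parameters associated with its source. The function name `structure_post`, the field names, and the sample post are all assumptions made for this example, not part of the book.

```python
import re
from datetime import datetime, timezone


def structure_post(text: str, author: str, posted_at: datetime) -> dict:
    """Derive a structured record from free text plus source parameters."""
    return {
        # Parameters associated with the source
        "author": author,
        "posted_at": posted_at.isoformat(),
        # Structure derived from the text itself
        "hashtags": re.findall(r"#(\w+)", text),
        "mentions": re.findall(r"@(\w+)", text),
        "word_count": len(text.split()),
    }


# Hypothetical sample post
post = structure_post(
    "Loving the new release from @acme #data #ai",
    author="jdoe",
    posted_at=datetime(2020, 1, 15, 9, 30, tzinfo=timezone.utc),
)
```

The same pattern would apply to nontextual sources once a transcript, tag list, or annotation has been produced: the derived text and the source's context become ordinary columns that can be queried and organized.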
Social media data, for example, requires various additional data points to describe users, their posts, relationships, time of posts,