Computation in Science (Second Edition). Konrad Hinsen
… English word in terms of bit sequences.

      It is less obvious, and perhaps even surprising to many readers, that most numbers cannot be represented as bit sequences. For natural numbers, there is no problem: any sequence of bits can be interpreted as a natural number in base 2 notation. Conversely, every natural number can be written in base 2 notation, and therefore as a sequence of bits. It is thus possible to define a one-to-one correspondence between natural numbers and bit sequences. In mathematical terminology, the set of natural numbers is isomorphic to the set of bit sequences. Since we can perform computations on sequences of bits, we can perform computations on natural numbers. In fact, any set of values that we wish to do computations on must be isomorphic to the set of bit sequences, or equivalently to the set of natural numbers, or to a subset of such a set. Such sets are called countable. All finite sets are countable: just write down the elements in some order, and then write a number below each element, starting from 1. Infinite sets that are countable are called countably infinite sets.
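      To make this correspondence concrete, here is a minimal sketch in Python (my illustration, not taken from the text) that converts natural numbers to bit sequences and back. The canonical base-2 encoding, without leading zeros, is what makes the mapping one-to-one:

```python
# A minimal sketch (illustration, not from the text) of the one-to-one
# correspondence between natural numbers and bit sequences, via base 2.

def to_bits(n):
    """Encode a natural number as its base-2 bit sequence (most significant bit first)."""
    if n == 0:
        return [0]
    bits = []
    while n > 0:
        bits.append(n % 2)
        n //= 2
    return bits[::-1]

def from_bits(bits):
    """Decode a bit sequence back into a natural number."""
    n = 0
    for b in bits:
        n = 2 * n + b
    return n

assert to_bits(42) == [1, 0, 1, 0, 1, 0]
assert from_bits(to_bits(42)) == 42  # the round trip shows the mapping is invertible
```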

      It is straightforward to see that integers are still countable: use one bit for the sign, and then a natural-number bit sequence for the absolute value. It takes a bit more effort to show that the rational numbers, i.e. the set of all quotients of integers, are countable. By the definition of countability, this requires the assignment of a unique natural number to each rational number. The standard procedure is based on a two-dimensional matrix-like arrangement of the rational numbers:

      $$\begin{matrix}
      \frac{1}{1} & \frac{2}{1} & \frac{3}{1} & \frac{4}{1} & \cdots \\
      \frac{1}{2} & \frac{2}{2} & \frac{3}{2} & \frac{4}{2} & \cdots \\
      \frac{1}{3} & \frac{2}{3} & \frac{3}{3} & \frac{4}{3} & \cdots \\
      \frac{1}{4} & \frac{2}{4} & \frac{3}{4} & \frac{4}{4} & \cdots \\
      \vdots & \vdots & \vdots & \vdots & \ddots
      \end{matrix}$$

      The entries of this infinite matrix can now be enumerated along diagonals:

      $$\begin{matrix}
      1 & 3 & 6 & 10 & 15 & \cdots \\
      2 & 5 & 9 & 14 & \cdots & \\
      4 & 8 & 13 & \cdots & & \\
      7 & 12 & \cdots & & & \\
      11 & \cdots & & & & \\
      \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
      \end{matrix}$$

      A more sophisticated enumeration scheme would skip over each number that is equal to one that already received an index earlier. For example, 2/2 would be skipped because it is equal to 1/1.
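      The diagonal enumeration, including the skipping of duplicates, is itself a simple computation. Here is a sketch in Python (my illustration, not from the text), using the standard library's Fraction type, which reduces every quotient to lowest terms and thus makes duplicates easy to detect:

```python
# A sketch (illustration, not from the text) of enumerating the positive
# rationals along the diagonals of the matrix above, skipping duplicates
# such as 2/2, which equals the already-indexed 1/1.
from fractions import Fraction

def enumerate_rationals(count):
    """Return the first `count` distinct positive rationals, in diagonal order."""
    seen = set()
    result = []
    d = 2  # numerator + denominator is constant along each diagonal
    while len(result) < count:
        for num in range(1, d):
            q = Fraction(num, d - num)  # automatically reduced to lowest terms
            if q not in seen:           # skip duplicates like 2/2 == 1/1
                seen.add(q)
                result.append(q)
                if len(result) == count:
                    break
        d += 1
    return result

print(enumerate_rationals(6))
# [Fraction(1, 1), Fraction(1, 2), Fraction(2, 1), Fraction(1, 3), Fraction(3, 1), Fraction(1, 4)]
```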

      The proof that the set of real numbers is not countable is more involved, and I will not reproduce it here. Like many other proofs concerning infinite sets, it goes back to Georg Cantor, a German mathematician who laid the foundations of set theory in the late 19th century, and actually provided the first rigorous definition of the real numbers. The complex numbers, being a superset of the real numbers, are also uncountable. There are, however, countable number sets larger than the rational numbers. A well-known one in mathematics is the set of algebraic numbers, defined as the roots of polynomials with integer coefficients. In the context of computation, the largest useful countable subset of the real numbers is the set of computable numbers, which was introduced by Alan Turing in the same 1936 publication as the Turing machine. I will come back to this subject in chapters 2 and 3, because it is of central importance for the use of computation in science.

      We can now write down a first definition of computation, which will be refined in chapter 3:

       Computation is the transformation of sequences of symbols according to precise rules.

      What will need refinement is the ‘precise rules’, which must be expressed so precisely that a machine can apply them unambiguously.

      Once we get rid of the idea that computation is about numbers, we can easily identify other operations that qualify as computations. One example is solving equations by algebraic manipulations. The steps leading from

      y + 2x = z

      to

      x = ½(z − y)

      are completely mechanical and can be formulated as an algorithm. The practical evidence is that computers can do the job. Software packages that implement such operations are called computer algebra systems, a name that emphasizes algebraic manipulation. However, computer algebra systems also implement other non-numerical algorithms, for example finding the derivative or integral of an elementary function. The algorithm for computing derivatives is simple and is taught in every calculus course. In contrast, the algorithm for computing indefinite integrals is very complicated [2] and was first implemented as a computer program only in 1987 [3].
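      As an illustration (mine, not the text's; I use the SymPy library as one example of a computer algebra system), the rearrangement above, as well as symbolic differentiation and integration, can be carried out mechanically:

```python
# A sketch using the SymPy computer algebra system (my choice of library,
# not one named in the text) for the manipulations described above.
import sympy as sp

x, y, z = sp.symbols('x y z')

# Solve y + 2x = z for x; the result is equal to (z - y)/2
print(sp.solve(sp.Eq(y + 2*x, z), x))

# Symbolic differentiation and integration of an elementary function
f = sp.sin(x) * sp.exp(x)
print(sp.diff(f, x))       # exp(x)*sin(x) + exp(x)*cos(x)
print(sp.integrate(f, x))  # exp(x)*sin(x)/2 - exp(x)*cos(x)/2
```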

      A perhaps more surprising use of computation in mathematics is the validation of proofs. A proof is a sequence of deduction steps, each of which establishes the truth of a statement based on other statements already known to be true and a finite set of rules of logical deduction. In textbooks and mathematical publications, proofs are written concisely for human readers who have prior knowledge of mathematics. But when all the details are spelled out, proofs consist of nothing but applications of a finite set of deduction rules to statements that must respect the rules of a formal language. The use of computers to check such detailed proofs is becoming more and more common, both in mathematical research and in industrial applications. This involves writing the proof in a formal language, as a sequence of symbols. The ‘proof checking’ computation transforms this sequence of symbols into a ‘true’ or ‘false’ statement about the proof’s validity.
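      As a minimal sketch of what a machine-checkable proof looks like (my illustration, written for the Lean proof assistant; the text does not name any particular system):

```lean
-- A machine-checkable proof in Lean 4 (an illustration; the text does not
-- name a specific proof assistant). The theorem and its proof are sequences
-- of symbols in a formal language; the proof checker's computation either
-- accepts the proof as valid or reports an error.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b  -- appeal to a statement already known to be true
```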

      Leaving the narrow scope of mathematics, we find a huge number of domains where computation is applied to textual data. Finding a word in a text is a computation: it transforms the input data (the text and the word to look for) into output data (the position in the text where the word occurs), both of which can be encoded as sequences of symbols. Likewise, aligning two DNA sequences, translating a sentence from English to French, or compiling a Fortran program, are computations in which the data being processed are text. In fact, numbers and text are the two kinds of elementary data from which almost everything else is made up by composition: images are represented as two-dimensional arrays of numbers, dictionaries as sets of word pairs, etc. However, this fundamental role of numbers and text is due to their importance to humans, not computers.
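      A trivial sketch in Python (my illustration): word search transforms one sequence of symbols, the text together with the search word, into another, a position that can itself be encoded as symbols:

```python
# Word search as a computation on symbol sequences (illustration):
# input (text, word) -> output (position of first occurrence, or -1).
text = "computation is the transformation of sequences of symbols"
word = "sequences"
print(text.find(word))  # 37
```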

      Another kind of data that is becoming increasingly important for scientific computation is the graph, in particular as a description of networks. Traffic flow, protein interactions in a cell, and molecular structures are all examples of what can be described by graphs. An example of a computation on graphs is a check for cycles, which transforms the input data (a graph) into output data (a list of all cycles in the graph). Encoding graphs, cycles, and lists as sequences of symbols is less obvious than encoding numbers and text. Humans represented numbers and text by sequences of symbols long before thinking about computers and computation, whereas the human representation of graphs takes the form of two-dimensional drawings. The precise techniques for representing graphs as sequences of symbols are beyond the scope of this book, but I will come back to the general question of information representation in computers in section 5.3.2.
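      As a sketch (my illustration, not from the text), the following Python code encodes a directed graph as an adjacency list, one possible reduction of a graph to sequences of symbols, and uses a depth-first search to detect whether any cycle exists. Listing all cycles, as mentioned above, requires a more elaborate algorithm:

```python
# A sketch (illustration) of a cycle check on a directed graph.
# The graph is encoded as an adjacency list: a symbolic representation
# that can itself be written out as a sequence of symbols.

def has_cycle(graph):
    """Return True if the directed graph contains at least one cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for neighbor in graph.get(node, ()):
            if color.get(neighbor, WHITE) == GRAY:   # back edge: cycle found
                return True
            if color.get(neighbor, WHITE) == WHITE and visit(neighbor):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# A small network with a feedback loop A -> B -> C -> A
network = {'A': ['B'], 'B': ['C'], 'C': ['A', 'D'], 'D': []}
print(has_cycle(network))  # True
```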

      Computation as a tool

      The most visible role of computation in scientific research is its use as a tool. Experimentalists process their raw data, for example to correct for artifacts of their equipment. Then they fit a theoretical model to their data by adjusting parameters. Theoreticians compute numerical predictions from a model, in order to compare them to experimental results. Both experimentalists and theoreticians make heavy use of computation for understanding their data, in particular using visualization techniques.

      It is worth making a distinction between computation as a tool and computers as a tool. Computers perform computations, but they also deal with other tasks. Scientists use computers for communicating with their peers, for looking up information in Wikipedia, and for controlling lab equipment. None of these tasks involves much visible computation.