Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP. Bhisham C. Gupta

contained in a data set to draw conclusions that do not go beyond the boundaries of the data set. Inferential statistics uses techniques that allow us to draw conclusions about a large body of data based on the information obtained by analyzing a small portion of these data. In this book, we study both descriptive statistics and inferential statistics. This chapter discusses the topics of descriptive statistics. Chapters 3 through 7 are devoted to building the tools needed to study inferential statistics, and the rest of the chapters are mostly dedicated to inferential statistics.

      2.1.2 Population and Sample in a Statistical Study

      In a very broad sense, statistics may be defined as the science of collecting and analyzing data. The tradition of collecting data is centuries old. In European countries, numerous government agencies started keeping records on births, deaths, and marriages about four centuries ago. However, scientific methods of analyzing such data are comparatively recent. Most of the advanced techniques of analyzing data were in fact developed only in the twentieth century, and routine use of these techniques became possible only after the invention of modern computers.

      Definition 2.1.1

      A population is a collection of all elements that possess a characteristic of interest.

      Populations can be finite or infinite. A population whose elements can all be easily counted may be considered finite, and a population whose elements cannot all be easily counted may be considered infinite. For example, a production batch of ball bearings may be considered a finite population, whereas all the ball bearings that may be produced from a certain manufacturing line are considered conceptually infinite.

      Definition 2.1.2

      A portion of a population selected for study is called a sample.

      Definition 2.1.3

      The target population is the population about which we want to make inferences based on the information contained in a sample.

      Definition 2.1.4

      The population from which a sample is being selected is called a sampled population.

      Usually, the sampled population and the target population coincide, since every effort should be made to ensure that the sampled population is the same as the target population. However, because of financial reasons, time constraints, part of the population not being easily accessible, the unexpected loss of part of the population, and so forth, we may have situations where the sampled population is not equivalent to the whole target population. In such cases, conclusions drawn about the sampled population usually do not apply to the target population.

      Definition 2.1.5

      A sample is called a simple random sample if each element of the population has the same chance of being included in the sample.

      There are several techniques for selecting a random sample, namely simple random sampling, systematic random sampling, stratified random sampling, and cluster random sampling, but the concept that each element of the population has the same chance of being included in a sample forms the basis of all of them. These four types of sampling schemes are usually referred to as sample designs.
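      The four sample designs named above can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions: the function names, the frame of 20 numbered units, and the strata and cluster groupings are all hypothetical and not taken from the text, and the systematic version assumes the frame holds at least as many units as the requested sample size.

```python
import random

def simple_random(frame, n, rng):
    """Each unit has the same chance of inclusion: draw n units without replacement."""
    return rng.sample(frame, n)

def systematic(frame, n, rng):
    """Pick a random start among the first k units, then take every k-th unit."""
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random start in 0..k-1
    return [frame[start + i * k] for i in range(n)]

def stratified(strata, n_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {label: rng.sample(units, n_per_stratum)
            for label, units in strata.items()}

def cluster(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every unit in the chosen clusters."""
    chosen = rng.sample(clusters, n_clusters)
    return [unit for members in chosen for unit in members]

# Hypothetical population: 20 units labeled 1..20
rng = random.Random(0)
frame = list(range(1, 21))
strata = {"shift_A": frame[:10], "shift_B": frame[10:]}   # assumed strata
clusters = [frame[i:i + 5] for i in range(0, 20, 5)]      # assumed clusters of 5

srs = simple_random(frame, 4, rng)
sys_sample = systematic(frame, 4, rng)
strat = stratified(strata, 2, rng)
clus = cluster(clusters, 2, rng)
```

      In this sketch, every unit has the same inclusion chance under each design (4/20 for the simple random and systematic samples, 2/10 within each stratum, 2/4 for each cluster), which is the common idea the paragraph describes.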

      Since collecting each data point costs time and money, it is important that in taking a sample, some balance be kept between the sample size and the resources available. Too small a sample may not provide much useful information, but too large a sample may result in a waste of resources. Thus, it is very important that in any sampling procedure, an appropriate sample design be selected. In this section, we will review, very briefly, the four sample designs mentioned previously.

      Before taking any sample, we need to divide the target population into nonoverlapping units, usually known as sampling units. It is important to recognize that the sampling units in a given population may not always be the same. Sampling units are in fact determined by the sample design chosen. For example, in sampling voters in a metropolitan area, the sampling units might be individual voters, all voters in a family, all voters living in a town block, or all voters in a town. Similarly, in sampling parts from a manufacturing plant, the sampling units might be an individual part or a box containing several parts.

      Definition 2.1.6

      A list of all sampling units is called the sampling frame.
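      To make the link between sampling units and the sampling frame concrete, here is a small Python sketch based on the manufacturing example. The box and part labels are hypothetical, as is the helper `units_are_nonoverlapping`. The same population yields a different frame depending on whether the sampling unit is an individual part or a box of parts.

```python
# Hypothetical plant output: three boxes, each holding four parts
boxes = {
    "box1": ["p01", "p02", "p03", "p04"],
    "box2": ["p05", "p06", "p07", "p08"],
    "box3": ["p09", "p10", "p11", "p12"],
}

# Sampling unit = individual part: the frame lists every part
frame_parts = [part for parts in boxes.values() for part in parts]

# Sampling unit = box: the frame lists every box
frame_boxes = list(boxes)

def units_are_nonoverlapping(units):
    """Return True if no element belongs to more than one sampling unit."""
    seen = set()
    for members in units:
        for member in members:
            if member in seen:
                return False
            seen.add(member)
    return True

boxes_ok = units_are_nonoverlapping(boxes.values())
```

      Here `frame_parts` has 12 entries and `frame_boxes` has 3, even though both describe the same 12 parts.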

      The most commonly used sample design is the simple random sampling design, which consists of selecting n (sample size) sampling units in such a way that each sampling unit has the same chance of being selected. If, however, the population is finite of size N, say, then the simple random sampling design may be defined as selecting n sampling units in such a way that each possible sample of size n has the same chance of being selected. The number of such samples of size n that may be formed from a finite population of size N is discussed in Section 3.4.3.
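      As a minimal sketch of simple random sampling from a finite population, writing n for the sample size and N for the population size, the Python standard library can draw the sample and count the possible samples. The frame of ten numbered parts and the function name `simple_random_sample` are assumptions made for illustration. The count assumes sampling without replacement with order ignored, in which case it is the binomial coefficient "N choose n".

```python
import math
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct sampling units so that every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: N = 10 parts labeled 1..10
frame = list(range(1, 11))
n = 3

sample = simple_random_sample(frame, n, seed=1)

# Number of distinct unordered samples of size n from a population of size N
num_samples = math.comb(len(frame), n)   # 10 choose 3 = 120
```

      With N = 10 and n = 3 there are 120 possible samples, so under this design each one is selected with probability 1/120.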
