End-to-end Data Analytics for Product Development. Chris Jones

Title: End-to-end Data Analytics for Product Development

Author: Chris Jones

Publisher: John Wiley & Sons Limited

Genre: Mathematics

ISBN: 9781119483700

sample standard deviation “S”

      to draw conclusions about the corresponding unknown quantities of the population, called parameters:

population mean “μ”, population proportion “π”, population standard deviation “σ”

      Note that it is standard to use Greek letters for certain parameters, such as μ to stand for a population mean, σ for a population standard deviation, σ² for a population variance, and π for a proportion of statistical units having a characteristic of interest.

      A statistic (mean, proportion, variance) describes a characteristic of the sample (central tendency, variability, shape of data) and is known.

      A parameter (mean, proportion, variance) describes a characteristic of the population (central tendency, variability, shape of data) and is unknown.

      Statistical inference uses sample data to draw conclusions about a population with a known level of risk. In general, statistical inference proceeds as follows:

      1 We are interested in a population.

      2 We identify parameters of that population that will help us understand it better.

      3 We take a random sample and compute sample statistics.

      4 Through inferential techniques, we use the sample statistics to infer facts about the population parameters of interest.
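
      The four steps above can be sketched with a simulated population. This is only an illustration under stated assumptions: the population values, its size, the sample size of 50, and the "score of 8 or more" characteristic are all made up, and the parameters are computable here only because the population is simulated. In a real study they stay unknown.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Step 1: the population of interest (hypothetical simulated scores).
population = [random.gauss(7.0, 1.5) for _ in range(100_000)]

# Step 2: parameters that describe the population.
# They can be computed here only because the population is simulated.
mu = statistics.fmean(population)        # population mean
sigma = statistics.pstdev(population)    # population standard deviation
pi = sum(1 for x in population if x >= 8) / len(population)  # population proportion

# Step 3: take a random sample and compute sample statistics.
sample = random.sample(population, 50)
x_bar = statistics.fmean(sample)         # sample mean
s = statistics.stdev(sample)             # sample standard deviation (n - 1 divisor)
p = sum(1 for x in sample if x >= 8) / len(sample)           # sample proportion

# Step 4: the statistics serve as estimates of the parameters.
print(f"mean:       parameter {mu:.2f}, statistic {x_bar:.2f}")
print(f"std. dev.:  parameter {sigma:.2f}, statistic {s:.2f}")
print(f"proportion: parameter {pi:.2f}, statistic {p:.2f}")
```

      The statistics will generally be close to, but not equal to, the parameters, and they change from one random sample to the next.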

      Stat Tool 1.13 Inferential Problems

       What is the stability of a new formulation?

       Which attributes of a product do consumers find most appealing?

       What is the performance of a new product compared with products currently on the market?

       What is causing high levels of variation and waste during processing?

       Can a process change reduce production time to get the product in stores more quickly?

      How can we use inferential techniques to answer these questions?

      Inferential problems are usually related to:

       Estimation of a population parameter: What is the stability of a new formulation? Which attributes of a product do consumers find most appealing?

       Comparison of a population parameter to a specified value or among groups: What is the performance of a new product compared with the industry standard or products currently on the market?

       Assessing relationships among variables: What is causing high levels of variation and waste during processing? Can a process change reduce production time to get the product in stores more quickly?

      We may use several inferential techniques to answer different questions:

       Estimation of a population parameter: Point estimates and confidence intervals

       Comparison among groups: Hypothesis testing (one‐sample tests; two‐sample tests; analysis of variance, ANOVA)

       Assessing relationships among variables: Regression models

      Let's introduce the problem of the estimation of a population parameter.

      Because it is often impractical or impossible to gather data on the entire population, we must estimate the population parameters using sample statistics.

      Statistics, such as the sample mean and standard deviation, are called point estimators.

       Point estimators:

sample mean x̄, sample proportion p, sample standard deviation S

       Population parameters:

population mean μ, population proportion π, population standard deviation σ
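
      For reference, the usual textbook definitions of these three point estimators are written out below. The notation is an assumption, since the excerpt does not define it: n is the sample size, x_i is the i-th observation, and k is the number of sampled units that have the characteristic of interest.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
p = \frac{k}{n} ,
\qquad
S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}
```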

      Point estimates, such as the sample mean or standard deviation, provide a lot of information, but they don't give us the full picture.

      It is highly unlikely that, for example, the sample mean and standard deviation we obtain are exactly equal to the population parameters. To get a better sense of the true population values, we can use confidence intervals.

      A confidence interval is a range of likely values for a population parameter, such as the population mean or standard deviation.

      Usually, a confidence interval is a range:

CI = point estimate ± margin of error

      Using confidence intervals, we can say that it is likely that the population parameter is somewhere within this range.
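
      One common way to build such a range for a population mean is to take the point estimate plus or minus a margin of error. The sketch below is a minimal illustration, not the book's method: it assumes a large-sample normal approximation (margin = z · S / √n), the function name `mean_confidence_interval` is hypothetical, and the scores are made-up data.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, point_estimate, upper) for a population mean.

    Assumption (not from the text): large-sample normal approximation,
    margin of error = z * S / sqrt(n).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = statistics.mean(sample)   # point estimate of the population mean
    s = statistics.stdev(sample)      # sample standard deviation S
    # Two-sided critical value of the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar, x_bar + margin


# Hypothetical satisfaction scores on a 0-10 scale (illustrative only).
scores = [7, 6, 8, 5, 7, 9, 6, 7, 8, 5, 6, 7, 8, 7, 6,
          9, 7, 6, 8, 7, 5, 7, 6, 8, 7, 6, 7, 8, 6, 7]

low, x_bar, high = mean_confidence_interval(scores)
print(f"point estimate = {x_bar:.2f}, 95% CI = ({low:.2f}; {high:.2f})")
```

      For small samples, a critical value from Student's t distribution is normally used in place of z. That requires a library outside the Python standard library, so it is left out of this sketch.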

       Example 1.3. To illustrate this point, suppose that a research team wants to know the mean satisfaction score (from 0: not at all satisfied, to 10: completely satisfied) for the population of people who use a new formulation of a product. From a random sample of consumers, the sample mean is 6.8, and the confidence interval is CI = (6.2; 7.4). Mean satisfaction score (population parameter): μ = ? So the true unknown population mean satisfaction score is likely to be somewhere between 6.2 and 7.4. The central point of the confidence interval is the sample mean: x̄ = 6.8 (point estimate of μ).
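
      The arithmetic in Example 1.3 can be checked from the interval's endpoints, given that the example states the interval is centered on the sample mean. The term "margin of error" for the half-width is an assumption here, since the example does not name it.

```latex
\bar{x} = \frac{6.2 + 7.4}{2} = 6.8 ,
\qquad
\text{margin of error} = \frac{7.4 - 6.2}{2} = 0.6 ,
\qquad
\text{CI} = 6.8 \pm 0.6 = (6.2;\ 7.4)
```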

There's