Step-by-Step Instructions

Arrange Your Data in Ascending Order

First, sort your dataset in ascending order. The test works with ordered observations, so each value's rank determines its expected value under normality. For example, the dataset {12, 15, 18, 20, 22} is already in ascending order and can be used as is.

Calculate the Mean and Standard Deviation

Next, calculate the mean (μ) and standard deviation (σ) of your dataset. The mean is the sum of all values divided by the number of values, n. The standard deviation used here is the sample standard deviation: divide the sum of squared differences from the mean by n - 1, then take the square root.
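The first two steps can be sketched in a few lines of Python. This is a minimal illustration using the example values from the text, shuffled here only so that the sort has something to do; the variable names are arbitrary choices, not part of any library.

```python
import math
import statistics

# Example values from the text, deliberately shuffled for illustration.
data = [22, 12, 20, 15, 18]

# Step 1: sort ascending so that position i corresponds to rank i.
ordered = sorted(data)

# Step 2: mean and sample standard deviation (divisor n - 1).
n = len(ordered)
mean = sum(ordered) / n
sample_sd = math.sqrt(sum((x - mean) ** 2 for x in ordered) / (n - 1))

# Cross-check the hand-written formulas against the standard library.
assert math.isclose(mean, statistics.mean(ordered))
assert math.isclose(sample_sd, statistics.stdev(ordered))

print("ordered:", ordered)
print("mean:", mean)
print("sample standard deviation:", sample_sd)
```

Note that `statistics.stdev` uses the n - 1 divisor, while `statistics.pstdev` divides by n; the steps here call for the former.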

Calculate the Expected Values Under Normality

Calculate the expected value of each ordered observation under normality using the inverse cumulative distribution function (CDF) of the standard normal distribution. A common approximation (Blom's formula) is E(x_i) = μ + σ * Φ^(-1)((i - 3/8) / (n + 1/4)), where i is the rank of the observation (1 for the smallest), n is the number of observations, and Φ^(-1) is the inverse CDF of the standard normal distribution.
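As a rough sketch of this step, the snippet below evaluates the formula with Python's standard-library `statistics.NormalDist`, whose `inv_cdf` method plays the role of Φ^(-1). The function name `expected_values` is a hypothetical helper introduced for illustration.

```python
from statistics import NormalDist, mean, stdev


def expected_values(ordered):
    """Approximate expected values under normality for already-sorted data."""
    n = len(ordered)
    mu = mean(ordered)
    sigma = stdev(ordered)        # sample standard deviation (n - 1 divisor)
    std_normal = NormalDist()     # mean 0, standard deviation 1
    result = []
    for i in range(1, n + 1):     # ranks run from 1 to n
        p = (i - 3 / 8) / (n + 1 / 4)   # plotting position for rank i
        z = std_normal.inv_cdf(p)       # standard normal quantile
        result.append(mu + sigma * z)
    return result


ordered = [12, 15, 18, 20, 22]
expected = expected_values(ordered)
for x, e in zip(ordered, expected):
    print(f"observed {x:>3}  expected {e:7.3f}")
```

Because the plotting positions are symmetric around 0.5, the expected values are symmetric around the mean, and the middle one equals the mean when n is odd.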

Calculate the W Statistic

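The W statistic compares how well the ordered data line up with what normality predicts against the ordinary spread of the data. It is the square of a weighted sum of the ordered observations, divided by the sum of squared differences between the observed values and the mean. The formula is W = (∑ a_i x_(i))^2 / (∑(x_i - μ)^2), where x_(i) is the i-th smallest observed value, μ is the mean, and the a_i are tabulated coefficients derived from the expected values and covariances of standard normal order statistics. If you use the approximate expected values E(x_i) from the previous step instead of the tabulated coefficients, the analogous quantity is the squared correlation between the ordered data and the expected values, W' = (∑(x_(i) - μ)(E(x_i) - μ))^2 / (∑(x_(i) - μ)^2 * ∑(E(x_i) - μ)^2), known as the Shapiro-Francia approximation. In both cases the statistic lies between 0 and 1, and values close to 1 indicate agreement with normality.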
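The sketch below computes the correlation-based approximation to W (the Shapiro-Francia form, using Blom's approximate normal scores) rather than the exact Shapiro-Wilk statistic, which needs tabulated coefficients that are not reproduced here. The function name `approx_w` is a hypothetical helper, and the second dataset is an invented, strongly skewed example used only for contrast.

```python
from statistics import NormalDist


def approx_w(data):
    """Squared correlation between sorted data and approximate normal scores.

    This is the Shapiro-Francia style approximation to W, not the exact
    Shapiro-Wilk statistic (which uses tabulated coefficients).
    """
    ordered = sorted(data)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 observations")
    std_normal = NormalDist()
    # Approximate expected standard normal order statistics (Blom's formula).
    scores = [std_normal.inv_cdf((i - 3 / 8) / (n + 1 / 4))
              for i in range(1, n + 1)]
    mean_x = sum(ordered) / n
    mean_m = sum(scores) / n
    sxm = sum((x - mean_x) * (m - mean_m) for x, m in zip(ordered, scores))
    sxx = sum((x - mean_x) ** 2 for x in ordered)
    smm = sum((m - mean_m) ** 2 for m in scores)
    if sxx == 0:
        raise ValueError("all observations are identical")
    return sxm ** 2 / (sxx * smm)


w_example = approx_w([12, 15, 18, 20, 22])
w_skewed = approx_w([1, 1, 1, 1, 1, 1, 1, 1, 1, 100])  # invented skewed data
print("example data:", round(w_example, 4))
print("skewed data: ", round(w_skewed, 4))
```

Shifting or rescaling the data does not change the result, so the standard normal scores can be used directly without first multiplying by σ and adding μ.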

Determine the p-Value

The p-value is determined using a table of critical values for the Shapiro-Wilk test or using software. Because departures from normality make W smaller, the p-value is the probability of observing a W statistic as small as, or smaller than, the one you computed, assuming that the data come from a normal distribution. If the p-value is less than your chosen significance level (usually 0.05), you reject the null hypothesis that the data are normally distributed.
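To make the meaning of the p-value concrete, here is a hedged sketch that estimates it by simulation instead of a table: it draws many normal samples of the same size, computes the statistic for each, and counts how often the simulated value is as small as the observed one. This is a Monte Carlo stand-in, not the method that tables or statistical packages use, and it is applied to the correlation-based approximation to W rather than the exact statistic. The helper names, the number of repetitions, and the seed are all arbitrary assumptions.

```python
import random
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected standard normal order statistics (Blom's formula)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 3 / 8) / (n + 1 / 4))
            for i in range(1, n + 1)]


def approx_w(data, scores):
    """Squared correlation between the sorted data and the given scores."""
    ordered = sorted(data)
    n = len(ordered)
    mean_x = sum(ordered) / n
    mean_m = sum(scores) / n
    sxm = sum((x - mean_x) * (m - mean_m) for x, m in zip(ordered, scores))
    sxx = sum((x - mean_x) ** 2 for x in ordered)
    smm = sum((m - mean_m) ** 2 for m in scores)
    return sxm ** 2 / (sxx * smm)


def monte_carlo_p_value(data, reps=2000, seed=0):
    """Estimate P(W <= observed W) under normality by simulation."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    n = len(data)
    scores = normal_scores(n)
    observed = approx_w(data, scores)
    as_small = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if approx_w(sample, scores) <= observed:
            as_small += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (as_small + 1) / (reps + 1)


alpha = 0.05
p_example = monte_carlo_p_value([12, 15, 18, 20, 22])
p_skewed = monte_carlo_p_value([1, 1, 1, 1, 1, 1, 1, 1, 1, 100])  # invented data
for label, p in [("example", p_example), ("skewed", p_skewed)]:
    decision = "reject normality" if p < alpha else "fail to reject normality"
    print(f"{label}: p is about {p:.4f} -> {decision}")
```

The simulated samples use a standard normal distribution because the statistic does not depend on the mean or standard deviation of the data. More repetitions give a more precise estimate at the cost of run time.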

Interpret the Results

Finally, interpret the results of the Shapiro-Wilk test. If the p-value is less than your chosen significance level, you conclude that the data are unlikely to come from a normal distribution. Otherwise, you fail to reject the null hypothesis, which means the test found no evidence against normality, not that normality has been proven. Note that the Shapiro-Wilk test is sensitive to sample size: with large samples it may reject the null hypothesis even when the data are approximately normally distributed.

The Shapiro-Wilk test is a statistical method used to determine if a dataset is normally distributed. It is commonly used in hypothesis testing to ensure that the data meets the assumptions of parametric tests. In this guide, we will walk you through the steps to perform a Shapiro-Wilk test manually and provide a worked example with real numbers.

Introduction to the Shapiro-Wilk Test

The Shapiro-Wilk test is based on the W statistic, which behaves like a squared correlation between the ordered observations and the values expected under normality: W lies between 0 and 1, and values near 1 indicate good agreement. The test also provides a p-value, which is the probability of obtaining a W statistic as small as, or smaller than, the observed one, assuming that the data are normally distributed.

Step-by-Step Calculation

To perform a Shapiro-Wilk test manually, follow these steps:

How to Perform a Shapiro-Wilk Normality Test: Step-by-Step Guide