The Kolmogorov-Smirnov test is a statistical test used to determine whether a dataset comes from a known distribution. In this guide, we will walk you through the steps to calculate the Kolmogorov-Smirnov statistic by hand.
Introduction to the Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov test is a non-parametric test that compares a sample with a fully specified continuous reference distribution; a common use is checking whether a dataset is consistent with a particular normal distribution. The test statistic is the largest gap between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution.
The Formula
The Kolmogorov-Smirnov statistic is calculated using the following formula: $D = \sup_x |F(x) - G(x)|$, where $D$ is the Kolmogorov-Smirnov statistic, $F(x)$ is the empirical distribution function of the sample, $G(x)$ is the cumulative distribution function of the known distribution, and $\sup_x$ denotes the supremum, taken over all x, of the absolute difference between $F(x)$ and $G(x)$.
Step-by-Step Calculation
To calculate the Kolmogorov-Smirnov statistic by hand, follow these steps:
Step 1: Arrange the Data in Ascending Order
First, arrange the dataset in ascending order. This is necessary to calculate the empirical distribution function of the sample.
Step 2: Calculate the Empirical Distribution Function
Next, calculate the empirical distribution function of the sample. Its value at x is the number of observations less than or equal to x, divided by the total number of observations n. At the i-th value of the sorted sample the function therefore jumps from (i - 1)/n to i/n.
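The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `ecdf` and the sample values are chosen here for the example.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return a function F with F(x) = (number of observations <= x) / n."""
    ordered = sorted(sample)  # Step 1: ascending order
    n = len(ordered)

    def F(x):
        # bisect_right returns how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F

F = ecdf([2, 4, 6, 8, 10])
print([F(x) for x in (2, 4, 6, 8, 10)])  # [0.2, 0.4, 0.6, 0.8, 1.0]
print(F(5))                              # 0.4, two of five values are <= 5
```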
Step 3: Calculate the Cumulative Distribution Function
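Evaluate the cumulative distribution function of the known distribution at each data point. The distribution must be fully specified in advance, including its parameters. For example, to test whether the data comes from a standard normal distribution (mean 0, standard deviation 1), use the cumulative distribution function of the standard normal distribution.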
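For a normal reference distribution, the cumulative distribution function can be evaluated with the error function from Python's standard library. The sketch below assumes the parameters `mu` and `sigma` are given in advance; the function name `normal_cdf` is illustrative.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

print(round(normal_cdf(-2), 4))  # 0.0228
print(round(normal_cdf(0), 4))   # 0.5
print(round(normal_cdf(2), 4))   # 0.9772
```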
Step 4: Calculate the Absolute Difference
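Calculate the absolute difference between the empirical distribution function and the cumulative distribution function at each data point. Because the empirical distribution function jumps at every observation, check both sides of the jump: for the i-th value x of the sorted sample of n observations, compute both |i/n - G(x)| and |(i - 1)/n - G(x)|.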
Step 5: Calculate the Supremum
Take the largest of the absolute differences calculated in the previous step. For a finite sample this maximum equals the supremum in the formula, and it is the Kolmogorov-Smirnov statistic D.
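Steps 1 to 5 can be combined into one short function. The sketch below is one way to do it, assuming a continuous reference distribution passed in as a Python callable; the names `ks_statistic` and `uniform_cdf` and the three-point sample are made up for illustration. It compares the reference CDF with the empirical distribution function on both sides of each jump, since the largest gap can occur just below a data point.

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic D for a fully specified cdf."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        g = cdf(x)
        gap_at = abs(i / n - g)           # empirical CDF at x is i/n
        gap_below = abs((i - 1) / n - g)  # just below x it is (i-1)/n
        d = max(d, gap_at, gap_below)
    return d

def uniform_cdf(x):
    # CDF of the uniform distribution on [0, 1]
    return min(max(x, 0.0), 1.0)

print(ks_statistic([0.5, 0.6, 0.9], uniform_cdf))  # 0.5
```

In this sample the gaps measured at the data points alone are all below 0.17, while the gap just below 0.5 is 0.5, which is why both sides are checked.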
Step 6: Determine the p-Value
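Finally, determine the p-value associated with the Kolmogorov-Smirnov statistic. Compare D with a table of critical values for the sample size n, or calculate the p-value using a software package. A small p-value is evidence against the hypothesis that the data comes from the known distribution.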
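When no table is at hand, a rough p-value can be obtained from the limiting Kolmogorov distribution, $p \approx 2 \sum_{k \ge 1} (-1)^{k-1} e^{-2 k^2 n D^2}$. The sketch below implements that large-sample approximation only; it is not the exact small-sample distribution that statistical packages may use, so treat its output as indicative for small n. The function name `ks_pvalue_asymptotic` and the cutoff for very small arguments are choices made for this example.

```python
import math

def ks_pvalue_asymptotic(d, n, terms=100):
    """Large-sample approximation to the p-value of a one-sample KS statistic."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The alternating series converges slowly here; the true value is ~1
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        total += sign * math.exp(-2.0 * k * k * lam * lam)
    # Clamp to [0, 1] to guard against rounding error
    return min(1.0, max(0.0, 2.0 * total))

# sqrt(n) * D = 1.36 is the familiar large-sample 5% critical value
print(ks_pvalue_asymptotic(0.136, 100))
```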
Worked Example
Suppose we have the following dataset: 2, 4, 6, 8, 10. We want to test whether this dataset comes from a standard normal distribution. First, we arrange the data in ascending order: 2, 4, 6, 8, 10. Next, we calculate the empirical distribution function:

| x | F(x) |
|---|---|
| 2 | 0.2 |
| 4 | 0.4 |
| 6 | 0.6 |
| 8 | 0.8 |
| 10 | 1.0 |

Then, we calculate the cumulative distribution function of the standard normal distribution at each data point, rounded to four decimal places:

| x | G(x) |
| --- | --- |
| 2 | 0.9772 |
| 4 | 1.0000 |
| 6 | 1.0000 |
| 8 | 1.0000 |
| 10 | 1.0000 |

We calculate the absolute difference between F and G on both sides of each jump. At the i-th data point F equals i/5, and just below it F equals (i - 1)/5:

| x | Difference at x | Difference just below x |
| --- | --- | --- |
| 2 | 0.7772 | 0.9772 |
| 4 | 0.6000 | 0.8000 |
| 6 | 0.4000 | 0.6000 |
| 8 | 0.2000 | 0.4000 |
| 10 | 0.0000 | 0.2000 |

Finally, we take the largest of these differences: D = 0.9772. It occurs just below x = 2, where F is still 0 while G has already reached 0.9772. A value this close to 1 shows that the sample is very far from a standard normal distribution.
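A hand calculation like this one is easy to cross-check in Python. The sketch below recomputes the example for the dataset 2, 4, 6, 8, 10 against the standard normal distribution, evaluating the normal CDF with `math.erf` and measuring the gap both at each data point and just below it. Variable names and the print layout are illustrative.

```python
import math

data = [2, 4, 6, 8, 10]
ordered = sorted(data)
n = len(ordered)

def G(x):
    # Standard normal CDF, the reference distribution in this example
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rows = []
for i, x in enumerate(ordered, start=1):
    f_at, f_below, g = i / n, (i - 1) / n, G(x)
    rows.append((x, f_at, g, abs(f_at - g), abs(f_below - g)))

print("x, F(x), G(x), difference at x, difference just below x")
for x, f_at, g, diff_at, diff_below in rows:
    print(f"{x}, {f_at:.1f}, {g:.4f}, {diff_at:.4f}, {diff_below:.4f}")

# The statistic is the largest gap on either side of any jump
D = max(max(diff_at, diff_below) for _, _, _, diff_at, diff_below in rows)
print(f"D = {D:.4f}")  # D = 0.9772
```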
Common Mistakes to Avoid
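One common mistake is calculating the empirical distribution function without first arranging the data in ascending order. Another is using the wrong cumulative distribution function for the known distribution, for example reading a normal table at -x instead of x. A third is comparing the two functions only at the data points: because the empirical distribution function jumps at each observation, the largest gap can occur just below a data point, so both sides of each jump must be checked.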
When to Use the Calculator
While it is possible to calculate the Kolmogorov-Smirnov statistic by hand, it is often more convenient to use a calculator or software package. This is especially true for large datasets, where calculating the empirical distribution function and cumulative distribution function can be time-consuming.