Unit 2: Measures of Central Tendency & Dispersion

1. Measures of Central Tendency (Averages)

A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set.

1. Arithmetic Mean (AM or Mean)

The sum of all observations divided by the number of observations.

Ungrouped Data: x̄ = (x1 + x2 + ... + xn) / n = (Σx) / n
Grouped Data: x̄ = (Σf * x) / (Σf) = (Σf * x) / N
(where x = midpoint of class, f = frequency, N = total frequency)
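As a quick check of both formulas, here is a minimal Python sketch. The function names and the sample numbers are made up for illustration and are not part of the notes.

```python
def mean_ungrouped(xs):
    """Arithmetic mean of raw observations: (Σx) / n."""
    return sum(xs) / len(xs)


def mean_grouped(midpoints, freqs):
    """Arithmetic mean of a frequency table: (Σ f*x) / N."""
    total = sum(freqs)  # N = total frequency
    return sum(f * x for x, f in zip(midpoints, freqs)) / total


# Made-up illustrative data.
observations = [4, 8, 15, 16, 23, 42]
midpoints = [5, 15, 25, 35]  # midpoints of classes 0-10, 10-20, 20-30, 30-40
freqs = [2, 3, 4, 1]

print(mean_ungrouped(observations))    # 18.0
print(mean_grouped(midpoints, freqs))  # 19.0
```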

2. Median (Md)

The middle value of a dataset that has been arranged in order (ascending or descending).

Ungrouped Data:
- If n is odd: Median = Value of the ((n+1)/2)-th item.
- If n is even: Median = Average of the (n/2)-th and ((n/2) + 1)-th items.

Grouped Data: Median = L + [ ( (N/2) - cf ) / f ] * h
(L = lower boundary of median class, N = total frequency, cf = cumulative frequency *before* median class, f = frequency of median class, h = class width)
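The median rules above can be sketched in Python as follows. This is an illustration only: the function names and the frequency table are assumptions, and the grouped version assumes equal-width classes listed in ascending order.

```python
def median_ungrouped(xs):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                 # the ((n+1)/2)-th item, 1-based
    return (s[mid - 1] + s[mid]) / 2  # (n/2)-th and ((n/2)+1)-th items


def median_grouped(lower_bounds, freqs, h):
    """Median = L + [((N/2) - cf) / f] * h for equal-width classes."""
    target = sum(freqs) / 2  # N/2
    cf = 0                   # cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cf + f >= target:  # first class whose cumulative freq reaches N/2
            return lower + ((target - cf) / f) * h
        cf += f
    raise ValueError("empty frequency table")


# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]

print(median_ungrouped([7, 3, 9, 1, 5]))        # 5
print(median_grouped(lower_bounds, freqs, 10))  # about 25.83
```

Here N = 40, so N/2 = 20 falls in the class 20-30 (cf = 13, f = 12), giving 20 + (7/12) * 10.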

3. Mode (Mo)

The value that appears most frequently in a dataset.

Grouped Data: Mode = L + [ (f1 - f0) / (2*f1 - f0 - f2) ] * h
(L = lower boundary of modal class, f1 = freq of modal class, f0 = freq of pre-modal class, f2 = freq of post-modal class, h = class width)
Empirical Relationship: For a moderately skewed distribution:
Mean - Mode ≈ 3 * (Mean - Median) or Mode ≈ 3*Median - 2*Mean
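A small Python sketch of the grouped-mode formula and the empirical relationship is given below. The names and data are illustrative assumptions. The sketch treats a missing neighbouring class as having frequency 0, and it does not handle the case where the denominator 2*f1 - f0 - f2 is zero.

```python
def mode_grouped(lower_bounds, freqs, h):
    """Mode = L + [(f1 - f0) / (2*f1 - f0 - f2)] * h for equal-width classes."""
    i = freqs.index(max(freqs))                      # modal class (first one if tied)
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                # pre-modal frequency
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # post-modal frequency
    return lower_bounds[i] + (f1 - f0) / (2 * f1 - f0 - f2) * h


def empirical_mode(mean, median):
    """Mode ≈ 3*Median - 2*Mean (moderately skewed distributions only)."""
    return 3 * median - 2 * mean


# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]

print(mode_grouped(lower_bounds, freqs, 10))  # about 26.67
print(empirical_mode(mean=50, median=48))     # 44
```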

4. Geometric Mean (GM)

The n-th root of the product of n positive observations. Used for averaging ratios, percentages, or growth rates.

Ungrouped Data: GM = (x1 * x2 * ... * xn)^(1/n)
Using Logs: log(GM) = (Σ log(x)) / n => GM = Antilog[ (Σ log(x)) / n ]
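The log form is also the safer way to compute the GM on a computer, since the raw product can overflow for long lists. A minimal sketch, assuming all observations are positive (the function name and growth factors are made up):

```python
import math


def geometric_mean(xs):
    """GM via logs: exp of the mean of the natural logs (requires x > 0)."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


# Hypothetical yearly growth factors: +10%, +20%, -10%.
factors = [1.10, 1.20, 0.90]
print(geometric_mean(factors))    # average growth factor per year
print(geometric_mean([1, 3, 9]))  # about 3.0, the cube root of 27
```

Natural logs are used here instead of base-10 logs; any base gives the same GM as long as the antilog uses the same base.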

5. Harmonic Mean (HM)

The reciprocal of the arithmetic mean of the reciprocals of the observations. Used for averaging rates and speeds.

Ungrouped Data: HM = n / (Σ (1/x))
Grouped Data: HM = N / (Σ (f/x))
Note: For any set of positive numbers: AM ≥ GM ≥ HM, with equality only when all the numbers are equal.
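The classic use case is an average speed over equal distances. The sketch below (illustrative names and numbers) computes the HM for raw and grouped data and checks the ordering of the three means on one example.

```python
import math


def harmonic_mean(xs):
    """HM = n / Σ(1/x), for positive observations."""
    return len(xs) / sum(1 / x for x in xs)


def harmonic_mean_grouped(values, freqs):
    """HM = N / Σ(f/x), where N is the total frequency."""
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


# Hypothetical trip: the same distance driven at 40 km/h and back at 60 km/h.
speeds = [40, 60]
am = sum(speeds) / len(speeds)
gm = math.sqrt(speeds[0] * speeds[1])
hm = harmonic_mean(speeds)

print(hm)              # about 48, the true average speed for the round trip
print(am >= gm >= hm)  # True
```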

2. Partition Values

Values that divide an ordered dataset into a number of equal parts. The Median is a partition value (it divides data into 2 parts).

1. Quartiles (Q)

Divide the data into 4 equal parts (Q1, Q2, Q3).

2. Deciles (D)

Divide the data into 10 equal parts (D1, D2, ... D9).

3. Percentiles (P)

Divide the data into 100 equal parts (P1, P2, ... P99).

Note: Q1 = P25, Q2 = D5 = P50 = Median, Q3 = P75.
Formula for Grouped Data (Percentile 'k'):
Pk = L + [ ( (k*N/100) - cf ) / f ] * h
(To find Q1, use k=25. To find Q3, use k=75. To find D4, use k=40, etc.)
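Because every partition value uses the same formula with a different k, one function covers quartiles, deciles, and percentiles. A minimal sketch, assuming equal-width classes in ascending order (the function name and table are hypothetical):

```python
def percentile_grouped(k, lower_bounds, freqs, h):
    """Pk = L + [((k*N/100) - cf) / f] * h for equal-width classes."""
    target = k * sum(freqs) / 100  # position k*N/100
    cf = 0                         # cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cf + f >= target:  # class containing the k-th percentile
            return lower + ((target - cf) / f) * h
        cf += f
    raise ValueError("k must be between 0 and 100 and the table non-empty")


# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40, 40-50 (N = 40).
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]

print(percentile_grouped(25, lower_bounds, freqs, 10))  # Q1 = 16.25
print(percentile_grouped(40, lower_bounds, freqs, 10))  # D4 = 22.5
print(percentile_grouped(75, lower_bounds, freqs, 10))  # Q3 = 35.0
```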

3. Measures of Dispersion (Variability)

Measures that describe the spread or scatter of data points in a dataset.

1. Range

The simplest measure. The difference between the largest and smallest observation.

Range = Largest Value (L) - Smallest Value (S)

2. Quartile Deviation (QD)

Also known as the Semi-Interquartile Range. It measures the spread of the middle 50% of the data.

QD = (Q3 - Q1) / 2
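Both measures are one-line calculations. The sketch below uses made-up observations and hypothetical quartile values purely to show the arithmetic.

```python
def data_range(xs):
    """Range = largest value - smallest value."""
    return max(xs) - min(xs)


def quartile_deviation(q1, q3):
    """QD = (Q3 - Q1) / 2, half the interquartile range."""
    return (q3 - q1) / 2


print(data_range([4, 8, 15, 16, 23, 42]))       # 38
print(quartile_deviation(q1=16.25, q3=35.0))    # 9.375
```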

3. Mean Deviation (MD)

The arithmetic mean of the absolute deviations of the observations from a measure of central tendency (usually the median or mean).

MD (from median): (Σ |x - Median|) / n
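A short Python sketch of the mean deviation follows. The function name and data are illustrative; the centre defaults to the median but any measure of central tendency can be passed in.

```python
import statistics


def mean_deviation(xs, center=None):
    """Average absolute deviation from a chosen centre (median by default)."""
    if center is None:
        center = statistics.median(xs)
    return sum(abs(x - center) for x in xs) / len(xs)


data = [1, 3, 5, 7, 9]  # made-up observations; median = 5
print(mean_deviation(data))                          # 2.4  (|deviations| sum to 12, n = 5)
print(mean_deviation(data, statistics.mean(data)))   # 2.4  (mean equals median here)
```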

4. Variance and Standard Deviation (SD)

The most important and widely used measures of dispersion.

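Variance (s²): The average of the squared deviations from the mean.
Sample Variance: s² = (Σ (x - x̄)²) / (n - 1)
(Dividing by n - 1 instead of n corrects the bias introduced by estimating the mean from the same sample; dividing by n gives the population variance.)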
Standard Deviation (s): The square root of the variance.
SD (s) = sqrt(Variance)
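The two formulas can be checked with a few lines of Python. The function names and data are made up for illustration; the standard library's statistics.variance and statistics.stdev compute the same sample quantities.

```python
import math


def sample_variance(xs):
    """s² = Σ(x - x̄)² / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_sd(xs):
    """s = sqrt(s²)."""
    return math.sqrt(sample_variance(xs))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean = 5, squared deviations sum to 32
print(sample_variance(data))     # 32 / 7, about 4.571
print(sample_sd(data))           # about 2.138
```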

4. Coefficient of Variation (Relative Dispersion)

The standard deviations of two datasets cannot be compared directly if the datasets have different units (e.g., heights in cm vs. weights in kg) or very different means.

In such cases we use a relative measure, the Coefficient of Variation (CV).

Coefficient of Variation (CV): The ratio of the standard deviation to the mean, expressed as a percentage.
CV = (Standard Deviation / Mean) * 100
CV = (s / x̄) * 100
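For example, the sketch below compares two hypothetical datasets measured in different units. The summary figures are invented for illustration only.

```python
def coefficient_of_variation(sd, mean):
    """CV = (s / x̄) * 100, a unit-free percentage (mean must be non-zero)."""
    return sd / mean * 100


# Hypothetical summary statistics.
cv_height = coefficient_of_variation(sd=8.5, mean=170)  # heights in cm
cv_weight = coefficient_of_variation(sd=6.5, mean=65)   # weights in kg

print(round(cv_height, 2))  # 5.0
print(round(cv_weight, 2))  # 10.0
```

Under these assumed figures the weights are relatively more variable (10% vs. 5%), even though their standard deviation is numerically smaller.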

5. Moments

Moments are statistical parameters used to describe the characteristics of a distribution (center, spread, shape).

1. Raw Moments (μ'r) - Moments about Origin (Zero)

The r-th raw moment is the arithmetic mean of the r-th power of the observations.

μ'r = (Σ x^r) / n

2. Central Moments (μr) - Moments about the Mean

The r-th central moment is the arithmetic mean of the r-th power of the deviations from the mean.

μr = (Σ (x - x̄)^r) / n
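Both kinds of moments can be sketched in a few lines of Python. The function names and data are illustrative assumptions. The last line checks the standard identity μ2 = μ'2 - (μ'1)², which links the two definitions.

```python
def raw_moment(xs, r):
    """r-th moment about the origin: Σ x^r / n."""
    return sum(x ** r for x in xs) / len(xs)


def central_moment(xs, r):
    """r-th moment about the mean: Σ (x - x̄)^r / n."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** r for x in xs) / len(xs)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations

print(raw_moment(data, 1))       # 5.0  (the mean)
print(raw_moment(data, 2))       # 29.0
print(central_moment(data, 1))   # 0.0  (always zero)
print(central_moment(data, 2))   # 4.0  (variance with divisor n)
print(central_moment(data, 2) == raw_moment(data, 2) - raw_moment(data, 1) ** 2)  # True
```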

6. Measures of Skewness and Kurtosis

Skewness (Shape - Asymmetry)

Measures the asymmetry or lack of symmetry of a distribution.

Moment-based Coefficient (β1 and γ1):

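β1 = (μ3)² / (μ2)³
γ1 = μ3 / (μ2)^(1.5)
(γ1 has the same sign as μ3, and |γ1| = sqrt(β1). β1 alone is never negative, so it cannot show the direction of skew.)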

(If γ1 > 0, positive skew. If γ1 < 0, negative skew. If γ1 = 0, symmetrical)

Kurtosis (Peakedness)

Measures the peakedness or flatness of a distribution compared to the Normal distribution, for which β2 = 3.

Measures of Kurtosis (β2 and γ2):

β2 = μ4 / (μ2)²
γ2 = β2 - 3 (This is "excess kurtosis")
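The four coefficients follow directly from the central moments μ2, μ3, and μ4. A minimal Python sketch is shown below; the helper names and data are assumptions for illustration, and the moments use divisor n as defined in the Moments section.

```python
def central_moment(xs, r):
    """r-th moment about the mean: Σ (x - x̄)^r / n."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** r for x in xs) / len(xs)


def skewness_kurtosis(xs):
    """Return (β1, γ1, β2, γ2) computed from the central moments."""
    mu2 = central_moment(xs, 2)
    mu3 = central_moment(xs, 3)
    mu4 = central_moment(xs, 4)
    beta1 = mu3 ** 2 / mu2 ** 3
    gamma1 = mu3 / mu2 ** 1.5   # keeps the sign of μ3
    beta2 = mu4 / mu2 ** 2
    gamma2 = beta2 - 3          # excess kurtosis
    return beta1, gamma1, beta2, gamma2


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations: μ2 = 4, μ3 = 5.25, μ4 = 44.5
beta1, gamma1, beta2, gamma2 = skewness_kurtosis(data)
print(gamma1)  # about 0.656: positive skew (longer right tail)
print(gamma2)  # about -0.219: slightly flatter than the Normal distribution
```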