The "Theory of Attributes" deals with qualitative data (see Unit 1), which cannot be measured numerically but can be classified based on the presence or absence of a characteristic (an "attribute").
The class frequencies are related. For example, (A) = (AB) + (Aβ), (B) = (AB) + (αB), and N = (A) + (α) = (B) + (β).
A 2×2 contingency table is the easiest way to organize the frequencies for two attributes:
| Attribute | B (Employed) | β (Unemployed) | Total | 
|---|---|---|---|
| A (Literate) | (AB) | (Aβ) | (A) | 
| α (Illiterate) | (αB) | (αβ) | (α) | 
| Total | (B) | (β) | N | 
Consistency: A set of data (class frequencies) is said to be consistent if no class frequency is negative.
Since a frequency represents a count of items, it cannot be less than zero. If any calculation results in a negative frequency (e.g., (AB) < 0), the data is inconsistent and likely contains errors in collection or transcription.
All frequencies of the highest order (the ultimate class frequencies) must be non-negative. For two attributes, this means: (AB) ≥ 0, (Aβ) ≥ 0, (αB) ≥ 0, and (αβ) ≥ 0.
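As a quick illustration of these conditions, here is a minimal Python sketch (the function names and figures are hypothetical, not from the text) that derives the ultimate class frequencies from N, (A), (B), and (AB) and checks that none is negative:

```python
def ultimate_frequencies(N, A, B, AB):
    """Derive the ultimate class frequencies from N, (A), (B), (AB).

    Uses the identities (Aβ) = (A) - (AB), (αB) = (B) - (AB),
    (αβ) = N - (A) - (B) + (AB).
    """
    return {
        "(AB)": AB,
        "(Aβ)": A - AB,
        "(αB)": B - AB,
        "(αβ)": N - A - B + AB,
    }


def is_consistent(N, A, B, AB):
    """Data are consistent if no ultimate class frequency is negative."""
    return all(f >= 0 for f in ultimate_frequencies(N, A, B, AB).values())


# Hypothetical figures: N = 100, (A) = 60, (B) = 50
print(is_consistent(100, 60, 50, 20))   # True: all four frequencies ≥ 0
print(is_consistent(100, 60, 50, 5))    # False: (αβ) = -5, so the data are inconsistent
```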
Independence: Two attributes A and B are independent if there is no relationship between them. The presence or absence of A has no effect on the presence or absence of B.
If they are independent, the proportion of 'A's among 'B's should be the same as the proportion of 'A's in the whole population.
Proportion of A's among B's = (AB) / (B)
Proportion of A's in total = (A) / N
A and B are independent if: (AB) = (A) * (B) / N
The observed frequency (AB) is compared to the expected frequency (A)*(B)/N.
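A minimal Python sketch of this comparison, using hypothetical figures for the literacy/employment table above:

```python
def expected_AB(A, B, N):
    """Expected value of (AB) if A and B are independent: (A)*(B)/N."""
    return A * B / N


# Hypothetical figures: 60 literate, 50 employed, out of N = 100
A, B, N = 60, 50, 100
observed_AB = 40

print(expected_AB(A, B, N), observed_AB)
# observed (40) > expected (30), so A and B are not independent
```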
If the attributes are not independent, they are associated: positively if (AB) > (A)*(B)/N, negatively if (AB) < (A)*(B)/N.
Coefficients of association are numerical measures of the *strength* and *direction* of the association.
Yule's coefficient of association (Q) is the most common measure. It ranges from -1 to +1: Q = [(AB)(αβ) - (Aβ)(αB)] / [(AB)(αβ) + (Aβ)(αB)].
The coefficient of colligation (Y) is another measure, also ranging from -1 to +1. It is related to Q by Q = 2Y / (1 + Y²).
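A short Python sketch of both coefficients for a hypothetical 2×2 table (the function names are illustrative only, not from any particular library):

```python
import math


def yules_q(AB, A_beta, alpha_B, alpha_beta):
    """Yule's coefficient of association Q."""
    return (AB * alpha_beta - A_beta * alpha_B) / (AB * alpha_beta + A_beta * alpha_B)


def colligation_y(AB, A_beta, alpha_B, alpha_beta):
    """Coefficient of colligation Y; satisfies Q = 2Y / (1 + Y**2)."""
    k = math.sqrt((A_beta * alpha_B) / (AB * alpha_beta))
    return (1 - k) / (1 + k)


# Hypothetical table: (AB) = 40, (Aβ) = 10, (αB) = 20, (αβ) = 30
q = yules_q(40, 10, 20, 30)           # ≈ 0.714 (positive association)
y = colligation_y(40, 10, 20, 30)     # ≈ 0.420
print(q, 2 * y / (1 + y**2))          # the two values agree, confirming the relation
```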
The "Principle of Least Squares" is a fundamental method used in regression and curve fitting. It helps us find the "best-fit" line or curve for a set of data points (x, y).
Principle: The best-fit curve is the one that minimizes the sum of the squares of the vertical errors (residuals) between the observed data points (y) and the values predicted by the curve (ŷ).
We use calculus (partial derivatives) to find the parameters (e.g., 'a' and 'b' in a line) that make this sum as small as possible. This process generates a set of "Normal Equations."
Using the Principle of Least Squares, we can derive the Normal Equations needed to fit specific curves to data.
Equation: y = a + bx
Here, 'a' is the y-intercept and 'b' is the slope. We need to find the values of 'a' and 'b' that minimize Σ(y - (a + bx))².
Normal Equations for a Straight Line:
- Σy = n*a + b*(Σx)
- Σxy = a*(Σx) + b*(Σx²)
How to solve:
1. From your data, calculate n, Σx, Σy, Σxy, and Σx².
2. Plug these five values into the two normal equations.
3. You now have two simultaneous linear equations in two unknowns (a, b). Solve for 'a' and 'b'.
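A minimal Python sketch of these steps, using NumPy to solve the two normal equations for some hypothetical data points:

```python
import numpy as np

# Hypothetical data points
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 7.9, 10.1])

# Step 1: the five sums
n = len(x)
Sx, Sy, Sxy, Sxx = x.sum(), y.sum(), (x * y).sum(), (x * x).sum()

# Steps 2-3: plug into the normal equations and solve the 2x2 linear system
#   Σy  = n*a  + b*Σx
#   Σxy = a*Σx + b*Σx²
a, b = np.linalg.solve([[n, Sx], [Sx, Sxx]], [Sy, Sxy])
print(f"y = {a:.3f} + {b:.3f}x")      # ≈ y = 0.240 + 1.960x
```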
Equation: y = a + bx + cx²
We need to find 'a', 'b', and 'c'.
Normal Equations for a Parabola:
- Σy = n*a + b*(Σx) + c*(Σx²)
- Σxy = a*(Σx) + b*(Σx²) + c*(Σx³)
- Σx²y = a*(Σx²) + b*(Σx³) + c*(Σx⁴)
How to solve:
1. From your data, calculate n, Σx, Σy, Σx², Σxy, Σx³, Σx²y, and Σx⁴.
2. Plug these values in to get three simultaneous equations.
3. Solve for 'a', 'b', and 'c'.
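The same approach as for the straight line, sketched in Python with hypothetical data (the three normal equations become a 3×3 linear system):

```python
import numpy as np

# Hypothetical data points
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([ 4.1,  1.2, 0.1, 0.9, 4.2])

# Step 1: the required sums
n = len(x)
Sx, Sxx, Sxxx, Sx4 = x.sum(), (x**2).sum(), (x**3).sum(), (x**4).sum()
Sy, Sxy, Sxxy = y.sum(), (x * y).sum(), (x**2 * y).sum()

# Steps 2-3: the three normal equations as a 3x3 linear system in (a, b, c)
M = [[n,   Sx,   Sxx ],
     [Sx,  Sxx,  Sxxx],
     [Sxx, Sxxx, Sx4 ]]
a, b, c = np.linalg.solve(M, [Sy, Sxy, Sxxy])
print(f"y = {a:.3f} + {b:.3f}x + {c:.3f}x²")   # ≈ y = 0.057 - 0.010x + 1.021x²
```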
Equation: y = a * bˣ
This is not a linear equation, so we can't use the normal equations directly. We must transform it into a linear form by taking the logarithm of both sides.
log(y) = log(a * bˣ)
log(y) = log(a) + log(bˣ)
log(y) = log(a) + x * log(b)
Now, let Y = log(y), A = log(a), and B = log(b).
The equation becomes: Y = A + Bx
This is just a straight line! We can use the normal equations for a straight line, replacing 'y' with Y (i.e., log(y)), 'a' with A, and 'b' with B.
Normal Equations for Exponential Curve:
- Σ(log y) = n*A + B*(Σx)
- Σ(x * log y) = A*(Σx) + B*(Σx²)
How to solve:
1. Create new columns in your data for Y = log(y) and x * log(y).
2. Calculate: n, Σx, Σx², Σ(log y), Σ(x * log y).
3. Solve the two normal equations for A and B.
4. Convert A and B back to 'a' and 'b': a = antilog(A) and b = antilog(B).
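A Python sketch of the full procedure, using common (base-10) logarithms and hypothetical data that roughly follow y = 2 * 1.5ˣ:

```python
import numpy as np

# Hypothetical data points
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.1, 4.4, 6.9, 10.0])

# Step 1: transform to Y = log(y)
Y = np.log10(y)

# Step 2: the required sums
n = len(x)
Sx, Sxx = x.sum(), (x * x).sum()
SY, SxY = Y.sum(), (x * Y).sum()

# Step 3: solve the normal equations for A = log(a), B = log(b)
A, B = np.linalg.solve([[n, Sx], [Sx, Sxx]], [SY, SxY])

# Step 4: convert back with antilogs
a, b = 10**A, 10**B
print(f"y = {a:.3f} * {b:.3f}^x")     # ≈ y = 2.022 * 1.495^x
```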