Reputation: 679
I am trying to understand how the global closest fit method calculates a missing attribute value. Specifically, I am looking at the example shown here on page 10, chapter 2.8 GLOBAL CLOSEST FIT.
I would like to understand how they computed the distance, for example between cases 1 and 3 shown in table 1.10.
I would be very grateful for any human-like explanation :).
Upvotes: 0
Views: 275
Reputation: 390
The distance between two cases is the sum of the distances between their attributes. The cases have three attributes: Temperature, Headache, and Nausea. We compare them one by one:
Temperature:
| Case 1 | Case 3 |
|--------|--------|
| high | ? |
Distance = 1.
Reason: One of the cases has a ?, so it falls under condition 2 of the distance(xi, yi) formula ("xi = ? or yi = ?").
Headache:
| Case 1 | Case 3 |
|--------|--------|
| ? | no |
Distance = 1.
Reason: One of the cases has a ? again.
Nausea:
| Case 1 | Case 3 |
|--------|--------|
| no | no |
Distance = 0.
Reason: Both are the same, so it falls under condition 1 ("xi = yi").
| Attribute | Case 1 | Case 3 | Distance |
|-------------|--------|--------|----------|
| Temperature | high | ? | 1 |
| Headache | ? | no | 1 |
| Nausea | no | no | 0 |
| Total | | | 2 |
Distance = 2
Reason: We sum up the distances between attributes, according to the formula at the top of page 10.
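If it helps to see the same steps as code, here is a minimal Python sketch of the calculation above. It only covers symbolic attributes like the ones in this example. The names `MISSING`, `attribute_distance`, and `case_distance` are made up for illustration. Two things are my assumptions rather than something worked through above: that two different known symbolic values count as distance 1, and that a `?` on either side is checked first (so two `?` values would also count as 1). Neither case occurs when comparing case 1 and case 3.

```python
MISSING = "?"  # marker for a missing attribute value, as in the table


def attribute_distance(a, b):
    """Distance between two symbolic attribute values (hypothetical helper)."""
    # Condition 2: at least one value is missing.
    # Assumption: this is checked first, so "?" vs "?" also gives 1.
    if a == MISSING or b == MISSING:
        return 1
    # Condition 1: both values are known and equal.
    if a == b:
        return 0
    # Assumption: two different known symbolic values are at distance 1.
    return 1


def case_distance(x, y):
    """Sum of the per-attribute distances between two cases."""
    return sum(attribute_distance(a, b) for a, b in zip(x, y))


# Attribute order: Temperature, Headache, Nausea
case_1 = ("high", "?", "no")
case_3 = ("?", "no", "no")

print(case_distance(case_1, case_3))  # 1 + 1 + 0 = 2
```

Numeric attributes would need their own rule in `attribute_distance`, so this sketch should not be used as-is on cases with numeric values.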
Upvotes: 1