Martin Nemeth

Reputation: 679

Global closest fit calculation of missing value

I am trying to understand how the global closest fit method calculates a missing attribute value, using the example shown here on page 10, chapter 2.8 GLOBAL CLOSEST FIT.

I would like to understand how they computed the distance between, for example, cases 1 and 3 shown in Table 1.10.

I would be very grateful for any human-like explanation :).

Upvotes: 0

Views: 275

Answers (1)

David Kretch

Reputation: 390

The distance between two cases is the sum of the distances between their attributes. The cases have three attributes: Temperature, Headache, and Nausea. We compare them one by one:

Temperature

| Case 1 | Case 3 |
|--------|--------|
| high   | ?      |

Distance = 1.

Reason: One of the cases has a ?, so it falls under condition 2 of the distance(xi, yi) formula ("xi = ? or yi = ?").

Headache

| Case 1 | Case 3 |
|--------|--------|
| ?      | no     |

Distance = 1.

Reason: One of the cases has a ? again.

Nausea

| Case 1 | Case 3 |
|--------|--------|
| no     | no     |

Distance = 0.

Reason: Both values are the same, so it falls under condition 1 ("xi = yi").

Conclusion

| Attribute   | Case 1 | Case 3 | Distance |
|-------------|--------|--------|----------|
| Temperature | high   | ?      | 1        |
| Headache    | ?      | no     | 1        |
| Nausea      | no     | no     | 0        |
| Total       |        |        | 2        |

Distance = 2

Reason: We sum up the distances between attributes, according to the formula at the top of page 10.
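If it helps to see the same steps as code, here is a minimal Python sketch of the calculation above. The function names and the dictionary layout are my own, not from the paper. It covers only symbolic attributes and the two conditions quoted in this answer. Two things are assumptions on my part: that the missing-value check is applied before the equality check, and that two different known symbolic values also count as distance 1. Numeric attributes are not handled here.

```python
MISSING = "?"

def attribute_distance(x, y):
    """Distance between two symbolic attribute values (hypothetical helper)."""
    # Condition 2 as quoted above: either value is missing.
    # Assumption: this check comes before the equality check.
    if x == MISSING or y == MISSING:
        return 1
    # Condition 1 as quoted above: both values are equal.
    if x == y:
        return 0
    # Assumption: two different known symbolic values also count as 1.
    return 1

def case_distance(case_a, case_b):
    """Sum the per-attribute distances between two cases."""
    return sum(attribute_distance(case_a[name], case_b[name]) for name in case_a)

# Cases 1 and 3, with the values used in the tables above.
case_1 = {"Temperature": "high", "Headache": "?", "Nausea": "no"}
case_3 = {"Temperature": "?", "Headache": "no", "Nausea": "no"}

per_attribute = {
    name: attribute_distance(case_1[name], case_3[name]) for name in case_1
}
total = case_distance(case_1, case_3)
print(per_attribute)
print(total)
```

Running it gives 1, 1, and 0 for Temperature, Headache, and Nausea, and a total of 2, matching the table above.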

Upvotes: 1
