Decision tree learning
Problem: decide whether to wait for a table at a restaurant, based on the following attributes:
- Alternate: is there an alternative restaurant nearby?
- Bar: is there a comfortable bar area to wait in?
- Fri/Sat: is today Friday or Saturday?
- Hungry: are we hungry?
- Patrons: number of people in the restaurant (None, Some, Full)
- Price: price range ($, $$, $$$)
- Raining: is it raining outside?
- Reservation: have we made a reservation?
- Type: kind of restaurant (French, Italian, Thai, Burger)
- WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60 minutes)
Aim: find a small tree consistent with the training examples.
Idea: (recursively) choose the "most significant" attribute as the root of each tree or subtree.
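A minimal Python sketch of this recursive scheme, in the style of AIMA's DECISION-TREE-LEARNING. The function and variable names, the toy dataset, and the simple misclassification-count IMPORTANCE used for the demo are illustrative assumptions, not from the notes:

```python
from collections import Counter

def plurality_value(examples):
    """Most common label among the examples (ties broken arbitrarily)."""
    return Counter(label for _, label in examples).most_common(1)[0][0]

def learn_tree(examples, attributes, importance, parent_examples=()):
    """Recursively build a tree; examples are (attribute_dict, label) pairs.
    Returns either a label (leaf) or a nested dict {attribute: {value: subtree}}."""
    if not examples:                       # no examples left: majority of parent
        return plurality_value(parent_examples)
    labels = {label for _, label in examples}
    if len(labels) == 1:                   # all examples share one class: leaf
        return labels.pop()
    if not attributes:                     # attributes exhausted: majority vote
        return plurality_value(examples)
    best = max(attributes, key=lambda a: importance(a, examples))
    tree = {best: {}}
    for v in {attrs[best] for attrs, _ in examples}:
        subset = [(attrs, l) for attrs, l in examples if attrs[best] == v]
        rest = [a for a in attributes if a != best]
        tree[best][v] = learn_tree(subset, rest, importance, examples)
    return tree

def classify(tree, attrs):
    """Follow branches until a leaf label is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[attrs[attribute]]
    return tree

def misclassification_importance(a, examples):
    """Toy IMPORTANCE: negated count of minority-class examples left after
    splitting on a (information gain would rank attributes similarly here)."""
    score = 0
    for v in {attrs[a] for attrs, _ in examples}:
        branch = Counter(l for attrs, l in examples if attrs[a] == v)
        score -= sum(branch.values()) - max(branch.values())
    return score

# Tiny made-up dataset using two of the restaurant attributes:
data = [
    ({"Patrons": "Some", "Hungry": "Yes"}, "Yes"),
    ({"Patrons": "None", "Hungry": "No"},  "No"),
    ({"Patrons": "Full", "Hungry": "Yes"}, "Yes"),
    ({"Patrons": "Full", "Hungry": "No"},  "No"),
    ({"Patrons": "Some", "Hungry": "No"},  "Yes"),
]
tree = learn_tree(data, ["Patrons", "Hungry"], misclassification_importance)
print(all(classify(tree, attrs) == label for attrs, label in data))  # True
```

Any scoring function can be plugged in as `importance`; the notes develop information gain for this role.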
Implement IMPORTANCE(a, examples):
- Idea: a good attribute splits the examples into subsets that are (ideally) "all positive" or "all negative".
- Information theory: use the notion of information gain.
- Entropy H(V): a measure of the uncertainty of a random variable; more information -> reduced entropy. H(V) = -sum_k P(vk) log2 P(vk)
- For a Boolean variable that is true with probability q: B(q) = -(q log2 q + (1-q) log2 (1-q))
- Remainder(A) = sum_k ((pk + nk)/(p + n)) B(pk/(pk + nk)), where splitting on A leaves pk positive and nk negative examples in the k-th subset.
- Gain(A) = B(p/(p + n)) - Remainder(A); choose the attribute with the largest gain.
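These formulas can be checked directly in code. The sketch below uses the standard 12-example restaurant dataset from AIMA (6 positive, 6 negative), where Patrons splits into None: 0+/2-, Some: 4+/0-, Full: 2+/4-, and Type splits into four equally mixed branches; the function names are my own:

```python
from math import log2

def B(q):
    """Entropy of a Boolean variable that is true with probability q."""
    if q in (0.0, 1.0):
        return 0.0                     # no uncertainty in a pure subset
    return -(q * log2(q) + (1 - q) * log2(1 - q))

def remainder(branches, p, n):
    """Expected entropy after a split; branches is a list of (pk, nk) counts."""
    return sum((pk + nk) / (p + n) * B(pk / (pk + nk)) for pk, nk in branches)

def gain(branches, p, n):
    """Information gain = entropy before the split minus expected entropy after."""
    return B(p / (p + n)) - remainder(branches, p, n)

# Patrons: None -> 0+/2-, Some -> 4+/0-, Full -> 2+/4-
print(round(gain([(0, 2), (4, 0), (2, 4)], 6, 6), 3))            # 0.541
# Type: every branch is half positive, half negative, so gain is 0
print(round(gain([(1, 1), (1, 1), (2, 2), (2, 2)], 6, 6), 3))    # 0.0
```

Patrons has the highest gain of all the attributes, which is why it ends up at the root of the learned tree.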
Neural networks
Feed-forward networks (no internal state; connections form a directed acyclic graph):
- single-layer perceptrons: can represent only linearly separable functions
- multilayer perceptrons: far more expressive; trained by gradient descent (error back-propagation)
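As a concrete sketch of the single-layer case, here is a perceptron trained with the classic perceptron learning rule on the linearly separable AND function. The names and the toy setup are illustrative assumptions, not from the notes:

```python
def step(z):
    """Threshold activation: output 1 iff the weighted input exceeds 0."""
    return 1 if z > 0 else 0

def predict(w, b, x):
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_perceptron(data, epochs=10, lr=1.0):
    """Perceptron learning rule: w <- w + lr * (target - output) * x."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in data:
            error = target - predict(w, b, x)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# AND is linearly separable, so the rule converges to a perfect classifier;
# XOR is not, which is why multilayer networks are needed for it.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
print([predict(w, b, x) for x, _ in AND])  # [0, 0, 0, 1]
```

Multilayer perceptrons generalize this by stacking layers of differentiable units and propagating the error gradient backwards through them.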