mouse click: add red data point
shift + mouse click: add green data point
'r': retrain



number of trees = 100 (set this high for a smoother and more regularized final prediction. More trees rarely hurt accuracy, but the gains diminish while training and prediction get slower.)

max depth = 4 (depth of each tree in the forest. Set this higher when more complicated decision boundaries are needed. The number of nodes, and so the running time, can grow exponentially with depth, and deep trees are more prone to overfitting if there are not enough trees. If you can afford many trees and the compute, you usually want to set this higher.)

hypotheses / node = 10 (number of random hypotheses considered at each node during training. Setting this too high puts you in danger of overfitting your data, because the trees tend to pick the same splits and the forest loses variety.)


A bit of explanation: A random forest is a collection of independently trained trees. Each tree is made up of nodes arranged in a tree structure. Every node receives data from its parent and splits it between its two children based on some very simple decision (such as whether the x-coordinate is greater than 3). To choose that decision, a few random splitting rules are generated at each node during training, and the "best" one is kept.
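
As a rough illustration of the idea, here is a minimal Python sketch of such a forest for 2D points with labels 0 (red) and 1 (green). It is not the code behind this demo. The function names, the axis-aligned threshold rules, the Gini impurity score used to decide which split is "best", and the absence of per-tree resampling are all assumptions made for the example. The three parameters mirror the settings listed above.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the list is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_random_split(points, labels, n_hypotheses, rng):
    """Generate random axis-aligned rules and keep the lowest-impurity one.

    Returns (dim, threshold), or None if no rule separated the data.
    """
    best = None
    for _ in range(n_hypotheses):
        dim = rng.randrange(2)  # 0 = x-coordinate, 1 = y-coordinate
        values = [p[dim] for p in points]
        thr = rng.uniform(min(values), max(values))
        left = [y for p, y in zip(points, labels) if p[dim] <= thr]
        right = [y for p, y in zip(points, labels) if p[dim] > thr]
        if not left or not right:
            continue  # this rule sends everything to one child; skip it
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, dim, thr)
    return None if best is None else (best[1], best[2])

def train_tree(points, labels, max_depth, n_hypotheses, rng, depth=0):
    """Build a tree as nested dicts; a leaf stores the fraction of label 1."""
    split = None
    mixed = 0 < sum(labels) < len(labels)
    if depth < max_depth and mixed:
        split = best_random_split(points, labels, n_hypotheses, rng)
    if split is None:
        return {"leaf": sum(labels) / len(labels)}
    dim, thr = split
    left = [(p, y) for p, y in zip(points, labels) if p[dim] <= thr]
    right = [(p, y) for p, y in zip(points, labels) if p[dim] > thr]
    return {
        "dim": dim,
        "thr": thr,
        "left": train_tree([p for p, _ in left], [y for _, y in left],
                           max_depth, n_hypotheses, rng, depth + 1),
        "right": train_tree([p for p, _ in right], [y for _, y in right],
                            max_depth, n_hypotheses, rng, depth + 1),
    }

def predict_tree(node, point):
    """Walk from the root to a leaf, following the rule stored at each node."""
    while "leaf" not in node:
        side = "left" if point[node["dim"]] <= node["thr"] else "right"
        node = node[side]
    return node["leaf"]

def train_forest(points, labels, n_trees=100, max_depth=4, n_hypotheses=10, seed=0):
    """Train n_trees trees; they differ only through their random rules."""
    rng = random.Random(seed)
    return [train_tree(points, labels, max_depth, n_hypotheses, rng)
            for _ in range(n_trees)]

def predict_forest(forest, point):
    """Average the per-tree outputs into an estimate of P(label == 1)."""
    return sum(predict_tree(tree, point) for tree in forest) / len(forest)

# Usage: three red points on the left, three green points on the right.
points = [(-2.0, 0.0), (-1.0, 1.0), (-1.5, -1.0), (1.0, 0.0), (2.0, 1.0), (1.5, -1.0)]
labels = [0, 0, 0, 1, 1, 1]
forest = train_forest(points, labels)
print(predict_forest(forest, (-3.0, 0.0)), predict_forest(forest, (3.0, 0.0)))
```

Averaging over many trees is what smooths the final prediction: each tree draws a blocky boundary from its own random rules, and the mean of many such boundaries is softer. Raising n_hypotheses makes every tree more likely to find the same near-optimal split, which is why the forest loses variety.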