A Review of Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow by Aurélien Géron
Summary
Decision Trees are a Machine Learning model that can perform both classification and regression in a human-readable way. They do this by subdividing the data at each step (or node) until the data in the bottom nodes (leaves) contain only the intended values (or as close to that as possible). In the figure above, the root node (the top node) tests petal length ≤ 2.45: any sample with petal length less than or equal to 2.45 goes to the orange node, and any sample with petal length greater than 2.45 goes to the white node. And since the orange node is a leaf (a terminal node), our tree has decided that such a sample is a Setosa!
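As a rough sketch of how that tree could be reproduced (assuming the iris dataset and the petal-length/petal-width features, as in the book's example), something like this finds the same 2.45 root split:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width only
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X, y)

# Print the learned splits as text; the root test comes out as
# petal length <= 2.45, matching the figure described above.
print(export_text(tree_clf, feature_names=["petal length", "petal width"]))
```

Here `max_depth=2` is just an assumption to keep the printed tree small; the root split is the same regardless.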
The Scikit-Learn implementation of Decision Trees that we learned in this chapter uses the CART algorithm, which only produces binary trees: at each node the data is split into exactly two branches. Other machine learning libraries can build trees where a node has more than two children.
Major Takeaways
- Decision Trees are extremely prone to overfitting. To help prevent this, we can aggregate lots of them into Random Forests, which helps the model generalize better.
- Compared to other Machine Learning methods, Decision Trees are extremely human readable: we can produce visualizations like the one above to show how the model made its decisions.
- The two major ways to calculate impurity are Gini impurity and entropy. The two generally give very similar results, although entropy tends to produce slightly more balanced trees.
- You can also use Decision Trees to run regressions. It works much like classification, except that instead of predicting a class, each leaf predicts a value (the average target of the training samples that fall into it).
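On the overfitting point: a minimal sketch of the single-tree-versus-forest contrast, assuming Scikit-Learn's `RandomForestClassifier` and a synthetic moons dataset (my choice of dataset, not the book's):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A single unconstrained tree tends to memorize the training noise...
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# ...while averaging many randomized trees usually generalizes better.
forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

print("tree test accuracy:  ", tree.score(X_test, y_test))
print("forest test accuracy:", forest.score(X_test, y_test))
```

The single tree typically scores perfectly on the training set while the forest wins on held-out data, which is the generalization gap the bullet describes.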
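Switching between the two impurity measures is a one-keyword change in Scikit-Learn; this small sketch (iris again, my choice of example) fits one tree per criterion so the resulting trees can be compared:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Gini impurity is the default; entropy is selected via the criterion parameter.
gini_tree = DecisionTreeClassifier(criterion="gini", random_state=42).fit(X, y)
entropy_tree = DecisionTreeClassifier(criterion="entropy", random_state=42).fit(X, y)

print("gini tree depth:   ", gini_tree.get_depth())
print("entropy tree depth:", entropy_tree.get_depth())
```

On a dataset this small the two criteria often produce near-identical trees, which matches the "generally very similar" claim.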
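And for regression, a hedged sketch using `DecisionTreeRegressor` on a noisy quadratic (a made-up dataset for illustration): each leaf predicts the mean target of its training samples rather than a class.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)
X = rng.rand(200, 1) - 0.5                  # one feature in [-0.5, 0.5]
y = X[:, 0] ** 2 + 0.025 * rng.randn(200)   # noisy quadratic target

# The tree splits the feature axis into regions; each leaf's prediction
# is the average y of the training samples that land in that leaf.
tree_reg = DecisionTreeRegressor(max_depth=2, random_state=42)
tree_reg.fit(X, y)

print(tree_reg.predict([[0.4]]))  # roughly the mean target near x = 0.4
```

Without a depth limit the same overfitting caveat from the first bullet applies: the regressor will chase every noisy point.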
Thanks for reading!
If you have any questions or feedback, please reach out to me on Twitter @wtothdev or leave a comment!
Additionally, I wanted to give a huge thanks to Aurélien Géron for writing such an excellent book. You can purchase said book here (non-affiliate).
Disclaimer: I don’t make any money from any of the services referenced and chose to read and review this book of my own free will.