By quantymacro in ML — Jul 23, 2024

It’s confirmed and guaranteed: Ridge Regression >> Tree Model

Source? It was revealed to me in a dream

Python code attached at the end.

Friends, welcome back.

Today is a big day for me as the Chief Ridge Officer. A lot of the fakers and haters out there have been claiming that tree models >> linear regressions. And guess what? Today that will stop once and for all.

> be me
> only comfortable w linear regression
> learning about trees (RF/XGB/LightGBM) but still feel uncomfortable with it
> learned how to represent any tree model with Ordinary Least Squares
> now I’m more comfortable w trees
— quantymacro (@quantymacro) July 17, 2024

Firstly, we will show that most tree algorithms can be represented as linear regressions. I promise you, the explanation is written so that a 5-year-old can understand it. You won't be left out.
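To make the claim concrete before the full write-up, here is a minimal sketch of the usual way to see it, not necessarily the construction used later in the post. A regression tree cuts the feature space into leaves and predicts the mean of y in each leaf. Regress y on one dummy column per leaf with no intercept, and OLS hands back exactly those leaf means. The split points in `thresholds` are hand-picked for illustration, not learned by any tree library, and the data is simulated.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
y = np.sin(4.0 * x) + rng.normal(scale=0.1, size=200)

# Hypothetical split points standing in for a fitted tree's splits (an assumption).
thresholds = np.array([0.3, 0.7])

# Leaf index per sample: 0 if x < 0.3, 1 if 0.3 <= x < 0.7, 2 otherwise.
leaf = np.searchsorted(thresholds, x, side="right")

# One dummy column per leaf; every row contains exactly one 1.
Z = np.eye(len(thresholds) + 1)[leaf]

# OLS of y on the leaf dummies, no intercept.
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)

# What a tree with these leaves predicts: the mean of y inside each leaf.
leaf_means = np.array([y[leaf == k].mean() for k in range(Z.shape[1])])

# The OLS coefficients and the leaf means agree up to floating-point error.
print(np.allclose(beta, leaf_means))
```

With a real fitted tree the only thing that changes is where `leaf` comes from. The regression step stays the same.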

We will also prove an interesting corollary: a Decision Tree is guaranteed to be suboptimal relative to Ridge Regression. Might be shocking to you, but not to me. As a bonus, we will look at a special interpretation of Ridge.
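One plausible reading of that corollary, offered as a sketch and not as the post's actual proof: if you run Ridge on one dummy column per leaf, the penalty λ = 0 reproduces the tree's leaf means, so the tree is just one point on the Ridge path and the best λ can never do worse than it on whatever criterion you use to pick λ. For λ > 0 each leaf value becomes sum(y in leaf) / (n_leaf + λ), which pulls small leaves towards zero harder than big ones. The leaf assignment and targets below are toy data, and `ridge_on_leaves` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy leaf assignment: 60 samples spread evenly over 4 leaves (an assumption).
leaf = np.arange(60) % 4
y = leaf + rng.normal(size=60)

# One dummy column per leaf.
Z = np.eye(4)[leaf]

def ridge_on_leaves(Z, y, lam):
    """Closed-form Ridge, no intercept: solve (Z'Z + lam * I) beta = Z'y."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)

counts = Z.sum(axis=0)   # samples per leaf
sums = Z.T @ y           # sum of y per leaf

beta_0 = ridge_on_leaves(Z, y, 0.0)   # lam = 0: plain leaf means, i.e. the tree
beta_5 = ridge_on_leaves(Z, y, 5.0)   # lam = 5: shrunken leaf values

print(np.allclose(beta_0, sums / counts))
print(np.allclose(beta_5, sums / (counts + 5.0)))
```

Because Z'Z is diagonal for leaf dummies, the solve is really just an elementwise division, which is why this kind of shrinkage is cheap.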

After that we will propose a slightly radical idea. We will argue that you shouldn't regularise your tree algorithm by tuning max_depth, min_samples_leaf, min_samples_split, etc. The alternative we propose is a blazingly fast and efficient regularisation that is guaranteed (Inshallah) to be better than using Optuna to tune all the parameters above.

This post is for paying subscribers only
