When I worked at CERN around 2010, boosted decision trees were the most popular classifier, exactly because of their (potential for) explainability along with their expressive power. We had a cultural aversion to neural networks back then, especially if the model was used directly in a physics analysis. Times have changed…
> Times have changed…

This makes me a little concerned -- the use of parameter-rich, opaque models in physics.

The Ptolemaic system achieved a far better fit to planetary motion than the Copernican system because it was a universal approximator. An epicyclic system is a form of Fourier analysis and can therefore fit any smooth periodic motion. But epicycles were not the right tool for working out the causal mechanics, despite being the better fit empirically.

In Physics we would want to do more than accurate curve fitting.
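The epicycle point can be made concrete: a truncated Fourier series (each harmonic playing the role of one epicycle) fits any smooth periodic signal arbitrarily well, with zero causal content. A minimal numpy sketch -- the test signal and harmonic counts are arbitrary choices of mine:

```python
import numpy as np

def fourier_design(t, n_harmonics, period=2 * np.pi):
    """Design matrix of sines/cosines -- each harmonic is one 'epicycle'."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.cos(2 * np.pi * k * t / period))
        cols.append(np.sin(2 * np.pi * k * t / period))
    return np.column_stack(cols)

# An arbitrary smooth periodic "orbit" that is NOT built from circles.
t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
signal = np.exp(np.sin(t)) + 0.3 * np.abs(np.cos(t)) ** 3

def fit_error(n_harmonics):
    """Least-squares fit of a truncated Fourier series; return worst-case error."""
    X = fourier_design(t, n_harmonics)
    coef, *_ = np.linalg.lstsq(X, signal, rcond=None)
    return np.max(np.abs(X @ coef - signal))

# Adding epicycles always improves the fit, regardless of the true mechanics.
errors = [fit_error(n) for n in (1, 3, 10)]
print(errors)
```

The fit improves monotonically as harmonics are added, which is exactly why goodness of fit alone cannot distinguish the right model from a universal approximator.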

If you sum up experimental physics in one heuristic, it is "avoid fooling yourself with assumptions." I left physics over a decade ago, but I feel confident that physicists still work hard to understand what they observe and don't let LLMs have all the fun. If there's one field of science where the scientists are legitimately allowed to go all the way back to basics, it's elementary particle physics.
In general I would agree. I think it holds true at the highest levels.

What worries me is the noticeable uptick in presentations of the sort "look ma, better fit ... deep neural nets." These are mostly by more junior folks, but not exclusively.

Add to that the uptick in research proposals funded by providers of infrastructure for such DNNs. I have been in the audience for many of both.

A charitable read could be that they just want the money and would still do the principled thing.

Are boosted decision trees the same as a boosted random forest?
short answer: No.

longer answer: Random forests average many trees trained so as to reduce the correlation between them (bagging plus randomized splits). Boosting trains trees sequentially, with each new tree fit to the residuals of the ensemble so far.
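To make the boosting side of that concrete, here is a from-scratch sketch of the sequential residual-fitting loop using decision stumps (single-split trees). The helper names, toy data, and learning rate are my own illustrative choices, not any library's API:

```python
import random

# Toy 1-D regression data: y = x^2 plus a little noise.
random.seed(0)
xs = [i / 50 for i in range(-50, 51)]
ys = [x * x + random.gauss(0, 0.05) for x in xs]

def fit_stump(xs, targets):
    """Best single-split tree (stump) under squared error."""
    best = None
    for thr in xs:
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((t - lmean) ** 2 for t in left)
               + sum((t - rmean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return lambda x: lmean if x <= thr else rmean

def boost(xs, ys, n_trees, lr=0.5):
    """Each stump is fit to the residuals of the ensemble built so far."""
    trees = []
    residuals = list(ys)
    for _ in range(n_trees):
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        residuals = [r - lr * stump(x) for x, r in zip(xs, residuals)]
    # Final prediction: sum the (shrunken) contributions of all trees.
    return lambda x: sum(lr * t(x) for t in trees)

model = boost(xs, ys, n_trees=30)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(round(mse, 4))
```

A bagging variant would instead fit every stump to a bootstrap resample of the original ys and average the results; the residual-passing line is the part that makes this boosting.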

I am assuming you meant boosted decision trees, sometimes called gradient-boosted decision trees, since boosting is usually applied to decision trees. I think XGBoost added boosted random forests, and in principle you can boost any supervised model, but it is not common.

The training process differs, but the resulting models differ only in data, not code -- in both cases you evaluate a bunch of trees and add up (or average) their outputs.

For better or for worse (usually for better), boosted decision trees work harder to optimize the tree structure for a given problem. Random forests rely on enough trees being good enough.

Ignoring tree split selection, one technique people sometimes use makes the two approaches more closely related -- in gradient boosting, once the splits are chosen, optimizing the weights/leaves is a sparse linear-algebra problem (iterative if your error is not MSE). That step would unify part of the training between the two model types.
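A toy illustration of that last step, assuming squared error: freeze the tree structures, encode each sample's leaf membership across all trees as an indicator design matrix, and the jointly optimal leaf values fall out of a single least-squares solve. The splits below are arbitrary placeholders of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
y = np.sin(3 * x) + rng.normal(0, 0.1, x.size)

# Suppose two stumps' structures are already fixed (splits at 0.0 and 0.5).
# Each column is an indicator: "this sample falls in this leaf".
# The matrix is mostly zeros -- hence a *sparse* linear-algebra problem.
leaves = np.column_stack([
    x <= 0.0, x > 0.0,   # leaves of tree 1
    x <= 0.5, x > 0.5,   # leaves of tree 2
]).astype(float)

# With squared error, the optimal leaf values for the whole ensemble
# come from one least-squares solve over the indicator matrix.
w, *_ = np.linalg.lstsq(leaves, y, rcond=None)
pred = leaves @ w
print(np.mean((pred - y) ** 2))
```

With a non-MSE loss this becomes an iterative reweighted solve rather than one shot, which is the caveat noted above.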