by Brandon Julian
For my own learning, and out of a general aversion to boredom, I will go cover to cover through Bayesian Statistics the Fun Way, by Will Kurt. I'm going to keep a record of the things I struggle with along the way, while also collecting use cases for why knowing Bayesian stats is useful for someone working in a data science capacity. There may be no necessity to know Bayesian statistics, or more than its fundamentals… But either way I'll learn something. The posts will be very informal and be more of a sounding board for myself than anything else.
"There may be no necessity to know Bayesian statistics." This was a dumb statement. There is definitely a need to understand probabilistic modeling. One example is probabilistic approaches to hyperparameter tuning of machine learning models. GridSearchCV and RandomizedSearchCV are both standard scikit-learn ways to tune models, but when searching gigantic parameter spaces they are slow. GridSearchCV tries every combination in your parameter grid and keeps track of the best-scoring one, so no stone goes unturned, at the cost of time. RandomizedSearchCV tries a specified number of randomly sampled combinations and returns the best of those; here you are at the mercy of chance, since there may have been a better parameter set it never tried. And if that explanation was terrible, here is a two sentence summary of their similarity: grid search and random search are both uninformed forms of optimization; they try and forget, on repeat, and store the results for you to look at later. They do not choose their next set of parameters with an educated guess, based on their past evaluations, of what might be a better set. And with how complex or large some models can get, this can become too computationally expensive a process.
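To make the "try and forget" idea concrete, here is a rough sketch of both searches in plain Python. It is not scikit-learn's implementation. The `score` function, the parameter names, and the grid values are all made up for illustration, with `score` standing in for a cross-validated model score.

```python
import itertools
import random

# Hypothetical stand-in for a cross-validated score (higher is better).
# In real use this would train and evaluate a model with these parameters.
def score(params):
    return -((params["max_depth"] - 7) ** 2) - 10 * (params["learning_rate"] - 0.1) ** 2

# Hypothetical parameter grid: a dict mapping each name to its candidate values.
param_grid = {
    "max_depth": list(range(1, 11)),
    "learning_rate": [0.01, 0.05, 0.1, 0.2, 0.3],
}

def grid_search(grid):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((score(params), params))
    best_score, best_params = max(results, key=lambda r: r[0])
    return best_params, best_score, len(results)

def random_search(grid, n_iter, seed=0):
    """Evaluate n_iter randomly drawn combinations and return the best one."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        # Each draw ignores all earlier results: the search is uninformed.
        # For simplicity this samples with replacement, so repeats can occur.
        params = {name: rng.choice(values) for name, values in grid.items()}
        results.append((score(params), params))
    best_score, best_params = max(results, key=lambda r: r[0])
    return best_params, best_score, len(results)

grid_best, grid_score, grid_evals = grid_search(param_grid)
rand_best, rand_score, rand_evals = random_search(param_grid, n_iter=10)

print("grid:  ", grid_best, grid_score, grid_evals)
print("random:", rand_best, rand_score, rand_evals)
```

The grid search pays for all 50 evaluations and is guaranteed to find the best point on the grid. The random search pays for only 10 and may or may not land on it. Neither one uses an earlier score to decide what to try next.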
Now Bayesian Optimization is pretty neat. And I will cover it later…
tags: statistics