There seems to be a general misconception that Bayesian methods are harder to implement than Frequentist ones. Sometimes this is true, but more often existing R and Python libraries can help simplify the process.

Simpler to implement ≠ throw in some data and see what sticks. (We already have machine learning for that. :P)

People make Bayesian methods sound more complex than they are, mostly because there’s a lot of jargon involved (e.g. weak priors, posterior predictive distributions) that isn’t intuitive unless you have previously worked with these methods.

Adding more to that misconception is a clash of ideologies — “Frequentist vs. Bayesian.” (If you weren’t aware, well, now you know.) The problem is people, especially statisticians, commonly quarrel over which methods are more powerful when in fact the answer is “it depends.”

Bayesian methods, like any others, are just tools at our disposal. They have advantages and disadvantages.

So, with some personal “hot takes” out of the way, let’s move on to the fun stuff and implement some Bayesian linear mixed models (LMMs)!

Here’s what I’ll cover (both in R and Python):

Practical methods to select priors (needed to define a Bayesian model)

A step-by-step guide on how to implement a Bayesian LMM using R and Python (with brms and pymc3, respectively)

Quick model diagnostics to help you catch potential problems early on in the process

Bayesian model comparison/evaluation methods aren’t covered in this article. (There are more ways to evaluate a model than RMSE.) I’ll publish a subsequent article covering these in more detail.

If you are unfamiliar with mixed models I recommend you first review some foundations covered here. Similarly, if you’re not very familiar with Bayesian inference I recommend Aerin Kim’s amazing article before moving forward.

Let’s dive back into the marketing example I covered in my previous post. Briefly, our dataset is composed of simulated website bounce times (i.e. the length of time customers spend on a website), and the overall goal is to find out whether younger people spend more time on a website than older ones.

The dataset has 613 observed “bounce times” (bounce_time, in seconds) collected across 8 locations (county), each with an associated age. However, the number of observations per location varies (e.g. one has 150 observations while another has only 7).
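Before modelling, it helps to see how unevenly the observations are spread across locations. Here is a minimal sketch of that check in Python with pandas. The tiny data frame is a made-up stand-in for the real dataset (the values and county labels are hypothetical); only the column names bounce_time, age and county come from the description above.

```python
import pandas as pd

# Toy stand-in for the real dataset: every value and county label here is
# invented for illustration. Only the column names match the article.
toy = pd.DataFrame({
    "bounce_time": [165.5, 167.6, 190.2, 201.3, 212.9, 180.4],
    "age": [16, 34, 52, 41, 29, 60],
    "county": ["county_a", "county_a", "county_a",
               "county_b", "county_b", "county_c"],
})

# Count observations per location and list the smallest groups first,
# since sparsely sampled groups are the ones to keep an eye on.
obs_per_county = toy.groupby("county").size().sort_values()
print(obs_per_county)
```

On the real data, the same groupby call would show the imbalance mentioned above, which is one of the reasons a mixed model is a reasonable choice here.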