If you work in applied data science long enough, you’ll eventually hear some version of this question:
“Should we switch from logistic regression to a tree-based model?”
Sometimes the answer is yes.
Very often, the answer is “not really.”
In a world filled with gradient boosting, deep learning, and AutoML tools, logistic regression can feel almost embarrassing to mention.
It’s old.
It’s simple.
It’s taught in every intro course.
Over the past decade, data scientist has become one of the most attractive titles in tech.
It promises impact, influence, and technical depth.
It suggests working on hard problems, building models, and shaping decisions with data.
Most experiment analyses start—and end—the same way.
You group by experiment variant.
You calculate averages.
You compare numbers.
You call it a day.
When people talk about feature engineering, SQL is often treated as a second-class citizen.
You’ll hear things like:
If you’ve worked in BI or analytics long enough, you’ve probably heard people talk about models as if they were something mysterious.
“Once we build a model, we can predict this.”
“The model says this user will churn.”
“We need a better model for this problem.”
If you’ve ever run an A/B test, you’ve probably seen this happen:
As a data analyst, you’re probably very comfortable working with SQL tables, CSV files, and Excel spreadsheets. But sooner or later, you’ll run into a situation like this:
If you work with data — in analytics, BI, or data engineering — you’ve probably heard the term dbt (pronounced “dee-bee-tee”). It has become one of the most popular tools in the modern data stack because it empowers analysts to build production-grade data pipelines using just SQL.
A/B testing (or split testing) is one of the most powerful tools in an analyst’s toolbox: it allows you to compare two (or more) versions of a web page, feature, or user experience — and determine which version truly performs better.