Truly understanding logistic regression - Part 1

Starting today, we are launching a series of articles on logistic regression, taking a progressive approach with a particular focus on the IRLS (iteratively reweighted least squares) algorithm. Our goal is to blend theoretical concepts with practical implementation (in C#), offering clear illustrations of the associated challenges.

In machine learning, classification tasks refer to the process of categorizing input data into predefined classes or categories based on their features. The goal is to train a model that can learn patterns and relationships in the data, allowing it to make predictions or assign labels to new, unseen instances.

Classification tasks are prevalent in various domains, such as spam detection (classifying emails as spam or not), image recognition (identifying objects or patterns in images, e.g., recognizing handwritten digits), and sentiment analysis (determining whether textual data expresses a positive, negative, or neutral sentiment).

Among the various algorithms employed for classification tasks, commonly used ones include decision trees, support vector machines, k-nearest neighbors, and neural networks. In this series, we will delve into classical logistic regression, highlighting the inherent mathematical intricacies associated with this technique.
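As a small preview of what the series will build up to, logistic regression turns a real-valued score into a probability via the logistic (sigmoid) function. The sketch below is only illustrative (the class and method names are ours, not the series' final implementation):

```csharp
using System;

class LogisticSketch
{
    // The logistic (sigmoid) function: sigma(z) = 1 / (1 + e^(-z)).
    // It maps any real-valued score z to a probability in (0, 1),
    // which is what lets logistic regression perform classification.
    static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    static void Main()
    {
        // A score of 0 is maximally uncertain: probability 0.5.
        Console.WriteLine(Sigmoid(0.0));

        // Large positive scores push the probability toward 1,
        // large negative scores push it toward 0.
        Console.WriteLine(Sigmoid(5.0));
        Console.WriteLine(Sigmoid(-5.0));
    }
}
```

Later articles in the series will show how the weights producing that score are fitted, which is where IRLS enters the picture.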

The following textbooks on this topic merit consultation. They extend beyond logistic regression and cover a broad range of general machine learning topics.

Pattern Recognition and Machine Learning (Bishop)

Machine Learning: An Algorithmic Perspective (Marsland)

Probabilistic Machine Learning: An Introduction (Murphy)

Probabilistic Machine Learning: Advanced Topics (Murphy)

Without further ado, let's begin, as usual, with a few prerequisites needed to correctly understand the underlying concepts. Continue here.