Reading Notes: Locally Weighted Regression, Probabilistic Interpretation, Logistic Regression, and Newton’s Method

Locally Weighted Regression (Loess/LOWESS)

Concepts:

Non-parametric regression method that fits a smooth curve to data points.
Does not assume a global function for the entire dataset but instead fits local functions to subsets of data points.
Captures local patterns; robust variants such as LOWESS additionally down-weight outliers through iterative reweighting.

Basic Algorithm:

For each target point, assign weights to the nearby data points based on their distance from the target, with higher weights for closer points.
Fit a weighted least-squares regression model to those neighbors using the weights.
Compute the estimated value at the target point using the fitted local model.
Repeat the three steps above for every target point to obtain a smooth curve.
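
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes 1-D inputs, a local linear model, and a Gaussian kernel with bandwidth `tau`. The notes do not fix a kernel or a bandwidth, and the names `lwr_predict` and `lwr_smooth` are hypothetical.

```python
import numpy as np

def lwr_predict(x_query, X, y, tau=0.5):
    """Locally weighted linear regression estimate at one query point (1-D inputs)."""
    # Assumption: Gaussian kernel; points near x_query get weights close to 1.
    w = np.exp(-((X - x_query) ** 2) / (2.0 * tau ** 2))
    # Design matrix with an intercept column.
    A = np.column_stack([np.ones_like(X), X])
    # Weighted least squares: scaling rows by sqrt(w) minimizes
    # sum_i w_i * (y_i - A_i @ theta)^2 with an ordinary least-squares solver.
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    # Evaluate the fitted local line at the query point.
    return theta[0] + theta[1] * x_query

def lwr_smooth(X, y, tau=0.5):
    """Repeat the local fit at every data point to get the smoothed curve."""
    return np.array([lwr_predict(x0, X, y, tau) for x0 in X])
```

A smaller `tau` makes the fit follow the data more closely, while a larger one approaches an ordinary global linear fit. Note that a separate least-squares problem is solved for every query point, which is why the training data must be kept around at prediction time.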

Use Cases:

Exploratory data analysis to identify trends and patterns.
Smoothing noisy data in time series analysis and signal processing.
Non-linear regression when the global function is unknown or complex.

Parametric and Non-parametric Learning Algorithms

Parametric Algorithms:

Assume a specific form or function for the underlying relationship between variables.
Require estimating a fixed number of parameters from the data.
Examples: Linear regression, logistic regression.

Non-parametric Algorithms:

Do not assume a specific form or function for the underlying relationship.
Let the model's complexity grow with the data instead of fixing the number of parameters in advance; many such methods must keep the training set available to make predictions.
Examples: Locally weighted regression, k-Nearest Neighbors, decision trees.
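
A small sketch may make the contrast concrete. It assumes 1-D inputs and uses hypothetical helper names (`fit_linear`, `knn_predict`): the parametric fit compresses the data into two numbers, while the k-nearest-neighbors prediction needs the training arrays every time it is called.

```python
import numpy as np

def fit_linear(X, y):
    """Parametric: after fitting, the model is just (slope, intercept)."""
    # np.polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(X, y, deg=1)
    return slope, intercept

def knn_predict(x_query, X, y, k=3):
    """Non-parametric: the prediction is computed from the stored training set."""
    # Indices of the k training points closest to the query.
    nearest = np.argsort(np.abs(X - x_query))[:k]
    # Average their targets.
    return y[nearest].mean()
```

Once `fit_linear` has run, the training data can be discarded. `knn_predict` has no fitting step at all, and its memory footprint grows with the number of training points.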

Probabilistic Interpretation

Concepts:

Provides a probability-based framework for understanding and optimizing learning algorithms.
Involves modeling the distribution of the target variable conditioned on the input features.
Enables quantifying uncertainty and making decisions under uncertainty.
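
One standard illustration of this framework is the link between least squares and maximum likelihood. The derivation below is a sketch under an assumed model that the notes above do not spell out: a linear predictor with independent Gaussian noise. The symbols θ (parameters), σ (noise standard deviation), ε (noise), and n (number of examples) are introduced here for illustration.

```latex
\begin{align*}
y^{(i)} &= \theta^\top x^{(i)} + \varepsilon^{(i)},
  \qquad \varepsilon^{(i)} \sim \mathcal{N}(0, \sigma^2) \\
p\left(y^{(i)} \mid x^{(i)}; \theta\right)
  &= \frac{1}{\sqrt{2\pi}\,\sigma}
     \exp\left(-\frac{\left(y^{(i)} - \theta^\top x^{(i)}\right)^2}{2\sigma^2}\right) \\
\ell(\theta)
  &= \sum_{i=1}^{n} \log p\left(y^{(i)} \mid x^{(i)}; \theta\right)
   = n \log \frac{1}{\sqrt{2\pi}\,\sigma}
     - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(y^{(i)} - \theta^\top x^{(i)}\right)^2
\end{align*}
```

The first term does not depend on θ, so maximizing the log-likelihood is the same as minimizing the sum of squared residuals. Under these assumptions, ordinary least squares is the maximum-likelihood estimate, and the result does not depend on the value of σ.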

Use Cases:

Bayesian inference for updating beliefs in light of new data.
Model evaluation and comparison using likelihood or Bayesian Information Criterion (BIC).
Probabilistic classification and regression, such as logistic regression or Gaussian processes.

Logistic Regression

Concepts:

Logistic regression is a parametric classification algorithm used to model the probability of a binary outcome.
Uses the logistic function (sigmoid function) to transform a linear combination of input features into a probability.
Suitable for binary classification tasks and can be extended to multi-class classification using techniques like one-vs-rest.

Basic Algorithm:

Model the probability of the positive class as P(y=1|x) = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn))).
Estimate the coefficients b0, b1, …, bn by maximizing the log-likelihood of the observed data.
Classify a new instance by calculating the estimated probability and comparing it to a threshold (e.g., 0.5).
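
The three steps above can be sketched as follows. This is a minimal illustration, not a production implementation: it assumes plain batch gradient ascent as the optimizer (the notes only say the log-likelihood is maximized), and the learning rate, iteration count, and function names are hypothetical choices.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=5000):
    """Estimate [b0, b1, ..., bn] by gradient ascent on the average log-likelihood."""
    # Prepend a column of ones so that b[0] plays the role of the intercept b0.
    A = np.column_stack([np.ones(len(X)), X])
    b = np.zeros(A.shape[1])
    for _ in range(n_iters):
        p = sigmoid(A @ b)              # current P(y=1|x) for every row
        grad = A.T @ (y - p) / len(y)   # gradient of the average log-likelihood
        b += lr * grad                  # ascent step (assumed fixed step size)
    return b

def predict_proba(b, X):
    """P(y=1|x) for each row of X under coefficients b."""
    A = np.column_stack([np.ones(len(X)), X])
    return sigmoid(A @ b)

def predict(b, X, threshold=0.5):
    """Classify by comparing the estimated probability to a threshold."""
    return (predict_proba(b, X) >= threshold).astype(int)
```

For example, calling `fit_logistic(X, y)` with `X` of shape `(m, n)` and a 0/1 vector `y` returns `n + 1` coefficients. Raising `threshold` above 0.5 trades recall for precision on the positive class.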

Use Cases:

Predicting customer churn based on usage patterns and demographics.
Diagnosing diseases based on medical test results and patient history.
Detecting spam emails based on textual features and sender information.

Newton’s Method

Concepts:

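Newton’s method is an iterative algorithm for finding the roots (zeros) of a differentiable real-valued function; applied to a function’s derivative, it becomes an optimization method.
Iteratively refines an initial guess using the function’s derivatives.
Converges quadratically when started close to a simple root (one where the derivative is nonzero), so it often needs far fewer iterations than gradient descent, although each iteration costs more.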

Basic Algorithm:

Initialize a guess for the root of the function.
Evaluate the function and its first derivative at the current guess (for optimization, evaluate the first and second derivatives instead, since the root being sought is that of f').
Update the guess by subtracting the ratio of the function value to its first derivative: x_new = x_old - f(x_old) / f'(x_old).
Repeat the evaluation and update steps until convergence or a maximum number of iterations.
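
The update rule above can be sketched for a scalar function as follows. This is a minimal illustration: the name `newton_root`, the tolerance, and the stopping test on |f(x)| are assumptions, not something the notes specify.

```python
def newton_root(f, f_prime, x0, tol=1e-10, max_iter=50):
    """Find a root of f by Newton's method, starting from the guess x0."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        # Assumed convergence test: stop once f(x) is close enough to zero.
        if abs(fx) < tol:
            return x
        dfx = f_prime(x)
        if dfx == 0:
            raise ZeroDivisionError("derivative is zero; Newton step undefined")
        # Newton update: x_new = x_old - f(x_old) / f'(x_old)
        x = x - fx / dfx
    # Iteration budget exhausted; return the last iterate.
    return x

# Root finding: solve x^2 - 2 = 0 starting from x0 = 1.
sqrt2 = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)

# Optimization: minimize g(x) = (x - 3)^2 + 1 by finding the root of g'(x),
# passing g' as the function and g'' as its derivative.
x_min = newton_root(lambda x: 2.0 * (x - 3.0), lambda x: 2.0, x0=0.0)
```

The second call shows how the same routine covers optimization: the second derivative enters only because the function whose root is sought is itself a first derivative. The same idea can be used to maximize a log-likelihood such as the one in logistic regression, where the multivariate version replaces the derivative ratio with the inverse Hessian times the gradient.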