Regression with Gaussian Likelihood

In a future post, I would like to implement a (deep) quantile-regression model since I have never tried that before. To warm up, let us start with a basic Gaussian regression.

The plot above represents a simple 1D dataset $\{(x_i, y_i)\}_{i=1}^N$. To build a predictive model, we could simply try to fit a regression of the type

$$y_i \approx F(x_i)$$

for some unknown function $F: \mathbb{R} \to \mathbb{R}$. To implement this, one could parametrize the function $F$ with some parameter $\theta \in \mathbb{R}^d$ (e.g. a neural net) and minimize the standard MSE loss,

$$\theta \;\mapsto\; \frac{1}{N} \sum_{i=1}^N |y_i - F_\theta(x_i)|^2.$$
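As a quick illustration, this objective takes a few lines of numpy — a minimal sketch in which a hypothetical linear model stands in for the neural net:

```python
import numpy as np

def mse_loss(theta, x, y, F):
    """Standard mean squared error of the regressor F_theta on the data (x, y)."""
    return np.mean((y - F(theta, x)) ** 2)

# Hypothetical linear model F_theta(x) = theta[0] * x + theta[1],
# standing in for the neural net.
F = lambda theta, x: theta[0] * x + theta[1]

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])
print(mse_loss(np.array([2.0, 1.0]), x, y, F))  # the line y = 2x + 1 fits exactly: 0.0
```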

This simple regression setting does not capture the fact that the uncertainty is much higher in some regions than in others. It would be better to use a model of the type

$$y_i \sim \mathcal{N}\big(\mu(x_i), \sigma(x_i)^2\big)$$

where $\mathcal{N}(\mu, \sigma^2)$ denotes a Gaussian distribution with mean $\mu$ and variance $\sigma^2$. Here $\mu(x)$ and $\sigma(x)$ are two unknown functions. In other words, one can make the variance term $\sigma^2(x)$ depend on the covariate $x$, which is the standard approach in this type of situation. Naturally, any other distribution (e.g. a Student's t-distribution) could be used instead. To implement this idea, one can use a neural net with parameter $\theta \in \mathbb{R}^d$ that takes $x \in \mathbb{R}$ as input and spits out the pair $(\mu(x), \log[\sigma(x)]) \in \mathbb{R} \times \mathbb{R}$. Maximum Likelihood Estimation boils down to minimizing

$$\theta \;\mapsto\; \frac{1}{N} \sum_{i=1}^N \frac{|y_i - \mu_\theta(x_i)|^2}{\sigma_\theta^2(x_i)} + \log[\sigma_\theta^2(x_i)].$$
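Per sample, this is just the Gaussian negative log-likelihood with its additive constant dropped. A minimal sketch (the function name `gaussian_nll` is mine), taking the $\log[\sigma]$ output of the network directly:

```python
import math

def gaussian_nll(y, mu, log_sigma):
    """Gaussian negative log-likelihood of a single sample, up to an
    additive constant; the model predicts the pair (mu, log_sigma)."""
    sigma2 = math.exp(2.0 * log_sigma)
    return (y - mu) ** 2 / sigma2 + math.log(sigma2)

print(gaussian_nll(2.0, 1.0, 0.0))  # unit variance: squared error 1.0 plus log(1) = 1.0
```

Predicting $\log[\sigma]$ rather than $\sigma$ keeps the variance positive without any constraint on the network output.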

Using a neural net with only one hidden layer and a basic SGD optimizer gives the following learning trajectory:

Not bad, although it is known that this simple approach can lead to optimization issues in slightly more complex situations [1, 2]. For example, it is hard to get out of the local minimum below.
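For reference, the whole setup can be sketched end-to-end in plain numpy — one tanh hidden layer shared by the $\mu$ and $\log\sigma$ heads, hand-derived gradients, and minibatch SGD. The toy data and all hyperparameters here are illustrative, not the ones behind the plots:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy data: noise level grows with |x|.
x = rng.uniform(-2.0, 2.0, size=200)
y = np.sin(2.0 * x) + rng.normal(0.0, 0.1 + 0.2 * np.abs(x))

H = 16                                            # hidden units
W1 = rng.normal(0.0, 1.0, H); b1 = np.zeros(H)    # shared hidden layer
w_mu = rng.normal(0.0, 0.1, H); b_mu = 0.0        # mean head
w_ls = rng.normal(0.0, 0.1, H); b_ls = 0.0        # log-sigma head

def forward(xb):
    h = np.tanh(np.outer(xb, W1) + b1)            # (B, H)
    return h, h @ w_mu + b_mu, h @ w_ls + b_ls    # hidden, mu, log_sigma

def nll(mu, log_sigma, yb):
    # Gaussian negative log-likelihood, up to an additive constant.
    return np.mean((yb - mu) ** 2 * np.exp(-2.0 * log_sigma) + 2.0 * log_sigma)

_, mu, ls = forward(x)
loss_before = nll(mu, ls, y)

lr = 1e-2
for step in range(2000):
    idx = rng.choice(len(x), size=32)             # minibatch SGD
    xb, yb = x[idx], y[idx]
    h, mu, ls = forward(xb)
    r, inv_var = yb - mu, np.exp(-2.0 * ls)
    # Gradients of the mean NLL w.r.t. the two heads.
    d_mu = -2.0 * r * inv_var / len(xb)
    d_ls = (2.0 - 2.0 * r ** 2 * inv_var) / len(xb)
    # Backprop through the linear heads into the shared tanh layer.
    d_h = np.outer(d_mu, w_mu) + np.outer(d_ls, w_ls)
    d_pre = d_h * (1.0 - h ** 2)                  # tanh derivative
    w_mu -= lr * (h.T @ d_mu); b_mu -= lr * d_mu.sum()
    w_ls -= lr * (h.T @ d_ls); b_ls -= lr * d_ls.sum()
    W1 -= lr * (d_pre.T @ xb); b1 -= lr * d_pre.sum(axis=0)

_, mu, ls = forward(x)
loss_after = nll(mu, ls, y)
print(loss_before, loss_after)  # the NLL drops over training
```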

In another post, I’ll try to implement a deep-quantile-regression model…


References

  1. Stirn, A., & Knowles, D. A. (2020). Variational variance: Simple, reliable, calibrated heteroscedastic noise variance parameterization. arXiv preprint arXiv:2006.04910.

  2. Skafte, N., Jørgensen, M., & Hauberg, S. (2019). Reliable training and estimation of variance networks. Advances in Neural Information Processing Systems, 32.



Date
November 19, 2022