Stack Overflow Asked by Benno on December 23, 2021
Is there any python package that allows the efficient computation of the PDF (probability density function) of a multivariate normal distribution?
It doesn’t seem to be included in Numpy/Scipy, and surprisingly a Google search didn’t turn up any useful thing.
Here I elaborate a bit more on how exactly to use the multivariate_normal() from the scipy package:
# Import packages
import numpy as np
from scipy.stats import multivariate_normal
# Prepare your data
x = np.linspace(-10, 10, 500)
y = np.linspace(-10, 10, 500)
X, Y = np.meshgrid(x,y)
# Get the multivariate normal distribution
mu_x = np.mean(x)
sigma_x = np.std(x)
mu_y = np.mean(y)
sigma_y = np.std(y)
rv = multivariate_normal([mu_x, mu_y], [[sigma_x, 0], [0, sigma_y]])
# Get the probability density
pos = np.empty(X.shape + (2,))
pos[:, :, 0] = X
pos[:, :, 1] = Y
pd = rv.pdf(pos)
Answered by tsveti_iko on December 23, 2021
You can easily compute using numpy. I have implemented as below for the purpose of machine learning course and would like to share, hope it helps to someone.
import numpy as np
X = np.array([[13.04681517, 14.74115241],[13.40852019, 13.7632696 ],[14.19591481, 15.85318113],[14.91470077, 16.17425987]])
def est_gaus_par(X):
mu = np.mean(X,axis=0)
sig = np.std(X,axis=0)
return mu,sig
mu,sigma = est_gaus_par(X)
def est_mult_gaus(X,mu,sigma):
m = len(mu)
sigma2 = np.diag(sigma)
X = X-mu.T
p = 1/((2*np.pi)**(m/2)*np.linalg.det(sigma2)**(0.5))*np.exp(-0.5*np.sum(X.dot(np.linalg.pinv(sigma2))*X,axis=1))
return p
p = est_mult_gaus(X, mu, sigma)
Answered by Freq on December 23, 2021
The following code helped me to solve,when given a vector what is the likelihood that vector is in a multivariate normal distribution.
import numpy as np
from scipy.stats import multivariate_normal
d= np.array([[1,2,1],[2,1,3],[4,5,4],[2,2,1]])
mean = sum(d,axis=0)/len(d)
OR
mean=np.average(d , axis=0)
mean.shape
cov = 0
for e in d:
cov += np.dot((e-mean).reshape(len(e), 1), (e-mean).reshape(1, len(e)))
cov /= len(d)
cov.shape
dist = multivariate_normal(mean,cov)
print(dist.pdf([1,2,3]))
3.050863384798471e-05
The above value gives the likelihood.
Answered by Shahad on December 23, 2021
If still needed, my implementation would be
import numpy as np
def pdf_multivariate_gauss(x, mu, cov):
'''
Caculate the multivariate normal density (pdf)
Keyword arguments:
x = numpy array of a "d x 1" sample vector
mu = numpy array of a "d x 1" mean vector
cov = "numpy array of a d x d" covariance matrix
'''
assert(mu.shape[0] > mu.shape[1]), 'mu must be a row vector'
assert(x.shape[0] > x.shape[1]), 'x must be a row vector'
assert(cov.shape[0] == cov.shape[1]), 'covariance matrix must be square'
assert(mu.shape[0] == cov.shape[0]), 'cov_mat and mu_vec must have the same dimensions'
assert(mu.shape[0] == x.shape[0]), 'mu and x must have the same dimensions'
part1 = 1 / ( ((2* np.pi)**(len(mu)/2)) * (np.linalg.det(cov)**(1/2)) )
part2 = (-1/2) * ((x-mu).T.dot(np.linalg.inv(cov))).dot((x-mu))
return float(part1 * np.exp(part2))
def test_gauss_pdf():
x = np.array([[0],[0]])
mu = np.array([[0],[0]])
cov = np.eye(2)
print(pdf_multivariate_gauss(x, mu, cov))
# prints 0.15915494309189535
if __name__ == '__main__':
test_gauss_pdf()
In case I make future changes, the code is here on GitHub
Answered by user2489252 on December 23, 2021
The multivariate normal is now available on SciPy 0.14.0.dev-16fc0af
:
from scipy.stats import multivariate_normal
var = multivariate_normal(mean=[0,0], cov=[[1,0],[0,1]])
var.pdf([1,0])
Answered by juliohm on December 23, 2021
I use the following code which calculates the logpdf value, which is preferable for larger dimensions. It also works for scipy.sparse matrices.
import numpy as np
import math
import scipy.sparse as sp
import scipy.sparse.linalg as spln
def lognormpdf(x,mu,S):
""" Calculate gaussian probability density of x, when x ~ N(mu,sigma) """
nx = len(S)
norm_coeff = nx*math.log(2*math.pi)+np.linalg.slogdet(S)[1]
err = x-mu
if (sp.issparse(S)):
numerator = spln.spsolve(S, err).T.dot(err)
else:
numerator = np.linalg.solve(S, err).T.dot(err)
return -0.5*(norm_coeff+numerator)
Code is from pyParticleEst, if you want the pdf value instead of the logpdf just take math.exp() on the returned value
Answered by ajn on December 23, 2021
I just made one for my purposes so I though I'd share. It's built using "the powers" of numpy, on the formula of the non degenerate case from http://en.wikipedia.org/wiki/Multivariate_normal_distribution and it aso validates the input.
Here is the code along with a sample run
from numpy import *
import math
# covariance matrix
sigma = matrix([[2.3, 0, 0, 0],
[0, 1.5, 0, 0],
[0, 0, 1.7, 0],
[0, 0, 0, 2]
])
# mean vector
mu = array([2,3,8,10])
# input
x = array([2.1,3.5,8, 9.5])
def norm_pdf_multivariate(x, mu, sigma):
size = len(x)
if size == len(mu) and (size, size) == sigma.shape:
det = linalg.det(sigma)
if det == 0:
raise NameError("The covariance matrix can't be singular")
norm_const = 1.0/ ( math.pow((2*pi),float(size)/2) * math.pow(det,1.0/2) )
x_mu = matrix(x - mu)
inv = sigma.I
result = math.pow(math.e, -0.5 * (x_mu * inv * x_mu.T))
return norm_const * result
else:
raise NameError("The dimensions of the input don't match")
print norm_pdf_multivariate(x, mu, sigma)
Answered by user692734 on December 23, 2021
The density can be computed in a pretty straightforward way using numpy functions and the formula on this page: http://en.wikipedia.org/wiki/Multivariate_normal_distribution. You may also want to use the likelihood function (log probability), which is less likely to underflow for large dimensions and is a little more straightforward to compute. Both just involve being able to compute the determinant and inverse of a matrix.
The CDF, on the other hand, is an entirely different animal...
Answered by Andrew Mao on December 23, 2021
I know of several python packages that use it internally, with different generality and for different uses, but I don't know if any of them are intended for users.
statsmodels, for example, has the following hidden function and class, but it's not used by statsmodels:
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/miscmodels/try_mlecov.py#L36
Essentially, if you need fast evaluation, rewrite it for your use case.
Answered by Josef on December 23, 2021
In the common case of a diagonal covariance matrix, the multivariate PDF can be obtained by simply multiplying the univariate PDF values returned by a scipy.stats.norm
instance. If you need the general case, you will probably have to code this yourself (which shouldn't be hard).
Answered by Sven Marnach on December 23, 2021
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP