
Yet Another Note On Skewness And Kurtosis

October 10, 2010

For a project I recently worked on, I wanted to know the relationship between skewness and kurtosis. After some Google searching, I was directed to Wilkins’s paper, A Note On Skewness And Kurtosis [PDF], in which he gave a new proof of the following inequality:

\displaystyle kurtosis \geq (skewness)^2+1.

However, he proved it only for random variables taking finitely many values. It is quite natural to extend his proof to any real-valued random variable. Here is the proof I give; it involves only fundamental concepts and definitions from probability theory, together with the quadratic-form argument that is the remarkable idea of Wilkins’s original proof.

Let X be a real-valued random variable defined on a probability space (\Omega,\mathcal{E},P), and assume that its first four moments are finite. Let \mu be the mean of X. Then,

\displaystyle \mu\equiv\int_\Omega X\, \text{d}P. \ \ \ \ \ (1)

Also, denote the i^{th} central moment of X by \upsilon_i. Then,

\displaystyle \upsilon_i\equiv\int_\Omega(X-\mu)^i\, \text{d}P. \ \ \ \ \ (2)

And, the standard deviation \sigma of X is defined as

\displaystyle \sigma \equiv \sqrt{\upsilon_2}. \ \ \ \ \ (3)

Define the i^{th} standard moment \lambda_i of X as

\displaystyle \lambda_i\equiv\frac{\upsilon_i}{\sigma^i}. \ \ \ \ \ (4)

Here, \lambda_3 is called the skewness and \lambda_4 the kurtosis. Note that

\upsilon_1=0, \lambda_1=0

and

\displaystyle \lambda_2=1. \ \ \ \ \ (5)
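
As a concrete illustration of definitions (2)–(4), here is a minimal NumPy sketch (the function name and the choice of test distribution are my own); note that \lambda_4 here is the plain kurtosis, not the excess kurtosis \lambda_4-3.

    import numpy as np

    def standard_moment(x, i):
        # lambda_i = E[(X - mu)^i] / sigma^i, estimated from a sample x
        # (plain moment estimators, no small-sample bias correction)
        mu = x.mean()
        sigma = np.sqrt(np.mean((x - mu) ** 2))  # sigma = sqrt(upsilon_2)
        return np.mean((x - mu) ** i) / sigma ** i

    x = np.random.default_rng(0).exponential(size=1_000_000)
    print(standard_moment(x, 3))  # skewness lambda_3: about 2 for the exponential
    print(standard_moment(x, 4))  # kurtosis lambda_4: about 9 for the exponential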

Now, consider the quadratic form

\displaystyle G(a,b,c)\equiv \int_{\Omega}\big(a+(X-\mu)b+(X-\mu)^2c\big)^2\,\text{d}P

\displaystyle =\int_{\Omega}\big(a^2 + (X-\mu)^2b^2+(X-\mu)^4c^2+2(X-\mu)ab+2(X-\mu)^2ac+2(X-\mu)^3bc\big)\,\text{d}P
\displaystyle = a^2\int_\Omega\,\text{d}P + b^2\int_\Omega(X-\mu)^2\,\text{d}P+c^2\int_\Omega(X-\mu)^4\,\text{d}P
\displaystyle + 2ab\int_\Omega(X-\mu)\,\text{d}P + 2ac\int_\Omega(X-\mu)^2\,\text{d}P + 2bc\int_\Omega(X-\mu)^3\,\text{d}P

\displaystyle = a^2+\upsilon_2b^2+\upsilon_4c^2+2\upsilon_1ab+2\upsilon_2ac+2\upsilon_3bc.

Since G(a,b,c)\geq 0 for all real a,b,c, the quadratic form G is positive semi-definite, and therefore the determinant of its matrix must be nonnegative. That is,

\begin{vmatrix}1&\upsilon_1&\upsilon_2\\ \upsilon_1&\upsilon_2&\upsilon_3\\ \upsilon_2&\upsilon_3&\upsilon_4\end{vmatrix}=\begin{vmatrix}1&0&\sigma^2\\ 0&\sigma^2&\sigma^3\lambda_3\\ \sigma^2&\sigma^3\lambda_3&\sigma^4\lambda_4\end{vmatrix}=\sigma^6\lambda_4 - \sigma^6-\sigma^6\lambda_3^2=\sigma^6(\lambda_4-\lambda_3^2-1) \geq 0.
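
Before drawing the conclusion, the determinant expansion above can be double-checked symbolically; the following SymPy snippet is a sketch of my own, not part of the original argument.

    import sympy as sp

    sigma, l3, l4 = sp.symbols('sigma lambda_3 lambda_4', positive=True)
    # matrix of the quadratic form G, written in standard moments
    M = sp.Matrix([[1,        0,             sigma**2     ],
                   [0,        sigma**2,      sigma**3 * l3],
                   [sigma**2, sigma**3 * l3, sigma**4 * l4]])
    print(sp.factor(M.det()))  # sigma**6*(lambda_4 - lambda_3**2 - 1)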

Hence, for any real-valued random variable whose standard deviation is not zero, we may divide through by \sigma^6 to obtain

\displaystyle \lambda_4 \geq \lambda_3^2+1. \ \ \ \ \ (6)

Note that the standard deviation \sigma=0 if and only if X is constant almost surely, in which case the standard moments are not defined.
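
As a numerical sanity check of (6), here is a short NumPy sketch of mine comparing \lambda_4 with \lambda_3^2+1 for a few distributions. For a two-point distribution such as the Bernoulli, a nonzero choice of (a,b,c) makes the integrand a+(X-\mu)b+(X-\mu)^2c vanish almost surely, so (6) holds with equality; for the other two the inequality is strict.

    import numpy as np

    rng = np.random.default_rng(1)
    samples = {
        'uniform':     rng.uniform(size=1_000_000),
        'exponential': rng.exponential(size=1_000_000),
        'bernoulli':   rng.binomial(1, 0.3, size=1_000_000).astype(float),
    }
    for name, x in samples.items():
        mu, sigma = x.mean(), x.std()             # x.std() matches sqrt(upsilon_2)
        l3 = np.mean((x - mu) ** 3) / sigma ** 3  # skewness
        l4 = np.mean((x - mu) ** 4) / sigma ** 4  # kurtosis
        print(name, l4, '>=', l3 ** 2 + 1)        # equality (up to noise) for bernoulli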
