This is the mail archive of the
gsl-discuss@sourceware.org
mailing list for the GSL project.
Sample skew and kurtosis
- From: Ben Klemens <klemens at hss dot caltech dot edu>
- To: gsl-discuss at sourceware dot org
- Date: Thu, 15 Mar 2007 15:39:56 -0800
- Subject: Sample skew and kurtosis
And while I'm writing in, I thought I'd mention a little anomaly in the
skew and kurtosis calculations. The documentation defines the kurtosis as
kurtosis = ((1/N) \sum ((x_i - \Hat\mu)/\Hat\sigma)^4) - 3,
and similarly for the skew.
This is inconsistent. \Hat\sigma and \Hat\mu are computed from a sample,
so \Hat\sigma^2 is the unbiased variance estimate, which involves
\sum(...)/(n-1), as opposed to the population variance, which involves
\sum(...)/n.
The same holds for the kurtosis and skew: if you have a sample and not a
population, then the unbiased estimate is of the form \sum(...)/(n-1). But
the documented formula starts with 1/N (N being the sample size n), meaning
we have a population kurtosis normalized by the sample variance squared.
If we have to choose only one kurtosis and skew function, it should
probably be the sample and not the population version. The fix is trivial:
just rescale the averaged term by n/(n-1.0) at the end of kurtosis_m_sd
(before the 3 is subtracted), and similarly for skew.
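To make the difference concrete, here is a minimal sketch in plain C. It is
not the GSL source; the names sample_mean, central_power_sum, kurtosis_doc
and kurtosis_rescaled are made up for illustration. It assumes the intended
normalization is the \sum(...)/(n-1) form described above, applied to the
fourth-power term before the 3 is subtracted, and it needs n >= 2 and
non-constant data.

```c
#include <stddef.h>

/* Arithmetic mean of x[0..n-1]. */
static double sample_mean(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double) n;
}

/* Sum of (x_i - mu)^p for a small positive integer power p. */
static double central_power_sum(const double *x, size_t n, double mu, int p)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = x[i] - mu;
        double term = 1.0;
        for (int k = 0; k < p; k++)
            term *= d;
        sum += term;
    }
    return sum;
}

/* Form given in the documentation:
   (1/n) * sum(((x_i - mu)/sigma)^4) - 3,
   where sigma^2 is the sample variance (divides by n - 1). */
double kurtosis_doc(const double *x, size_t n)
{
    double mu  = sample_mean(x, n);
    double var = central_power_sum(x, n, mu, 2) / (double) (n - 1);
    double m4  = central_power_sum(x, n, mu, 4) / (double) n;
    return m4 / (var * var) - 3.0;
}

/* Hypothetical rescaled form: identical, except that the fourth-power
   sum is divided by n - 1 instead of n (a factor of n/(n-1)). */
double kurtosis_rescaled(const double *x, size_t n)
{
    double mu  = sample_mean(x, n);
    double var = central_power_sum(x, n, mu, 2) / (double) (n - 1);
    double m4  = central_power_sum(x, n, mu, 4) / (double) (n - 1);
    return m4 / (var * var) - 3.0;
}
```

For the data {1, 2, 3, 4, 5}, kurtosis_doc gives -1.912 and
kurtosis_rescaled gives -1.64; the gap shrinks as n grows, since the two
fourth-power terms differ only by the factor n/(n-1).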
Regards,
BK