This is the mail archive of the gsl-discuss@sourceware.org mailing list for the GSL project.



Sample skew and kurtosis


And while I'm writing in, I thought I'd mention a little anomaly in the
skew and kurtosis calculations. The documentation defines the kurtosis as 
kurtosis = ((1/N) \sum ((x_i - \Hat\mu)/\Hat\sigma)^4)  - 3,
and similarly for the skew.

This is inconsistent. \Hat\sigma and \Hat\mu are computed from a sample,
so \Hat\sigma^2 is the unbiased variance estimate, which involves
\sum(...)/(n-1), as opposed to the population variance, which involves
\sum(...)/n.

The same holds for the kurtosis and skew: if you have a sample and not a
population, then the unbiased estimate is of the form \sum(...)/(n-1). But
the above starts with 1/n, meaning we have population kurtosis normalized
by sample variance squared.

If we have to choose only one kurtosis and skew function, it should
probably be the sample and not the population version. The fix is trivial:
just return kurtosis * n/(n-1.0) at the end of kurtosis_m_sd (applying the
factor to the averaged term, before the 3 is subtracted), and similarly for
skew.

Regards,

BK

