### Author Topic: Descriptive Statistics by Bruno Schaefer  (Read 1793 times)

#### The Librarian

• Moderator
##### Descriptive Statistics by Bruno Schaefer
« on: June 28, 2018, 06:41:36 AM »
Descriptive Statistics

Author: Bruno Schaefer (a.k.a BSpinoza), Losheim am See, Germany
Author contact: bup.schaefer (.at.) web.de
Source: Submission
Version: 2018-06-16
Tags: [maths] [statistics]

Description:
This program calculates basic descriptive statistics of univariate data:
n, Std.error, sum, standard error, mean, geometrical mean, variance,
standard deviation, coefficient of variation, minimum, 1st quartile, median,
2rd quartile, maximum,skewness, kurtosis, and excess kurtosis.
A dataset must have at least 4 values.

Remarks to kurtosis and skewness:
For kurtosis and skewness the same equation as SPSS, PAST and Excel is used.
Slightly different results may occur using other programs, especially for
small sample sizes.
kurtosis: peak shape  > 3 (excess > 0) leptokurtic: distribution with tapered peak and fat tails
= 3 (excess = 0) mesokurtic: similar to normal bell-curved distribution
< 3 (excess < 0) platykurtic: flat distribution with thin tails
skewness: symmetry    > 0 skewed right: its right tail is longer and most of the distribution is at the left.
= 0 symmetrical (not skewed)
< 0 skewed left: the left tail is longer and most of the distribution is at the right

Note that this program includes extended ASCII characters and may not copy/paste correctly. If the interface does not draw correctly, use the attached source listing.

Source code:
Code: [Select]
`'PROGRAM: descriptiveStatistics.bas'================= Descriptive Statistics  ================'        written by Bruno Schaefer, Losheim am See, Germany'                                       created: 15.12.2016'                                   last review: 16.06.2018'============================================================================================================' This programm calculates basic descriptive statistics of univariate data:' n, Std.error, sum, standard error, mean, geometrical mean, variance,' standard deviation, coefficient of variation, minimum, 1st quartile, median,' 2rd quartile, maximum,skewness, kurtosis, and excess kurtosis.' A dataset must have at least 4 values.' For kurtosis and skewness the same equation as SPSS, PAST and Excel is used.' Slightly different results may occur using other programs, especially for' small sample sizes.' kurtosis: peak shape  > 3 (excess > 0) leptokurtic: distribution with tapered peak and fat tails'                       = 3 (excess = 0) mesokurtic: similar to normal bell-curved distribution'                       < 3 (excess < 0) platykurtic: flat distribution with thin tails' skewness: symmetry    > 0 skewed right: its right tail is longer and most of the distribution is at the left.'                       = 0 symmetrical (not skewed)'                       < 0 skewed left: the left tail is longer and most of the distribution is at the right'===============================================================================================================_TITLE "descriptive statistics"SCREEN _NEWIMAGE(680, 520, 256)WEITER\$ = "y" 'loop variable_CLIPBOARD\$ = "" 'clears the clipboardCOMMON SHARED n AS INTEGEROPTION BASE 1DO    _LIMIT 30    DO        CLS , 14        COLOR 0, 14        PRINT " ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»   "        PRINT " º  DESCRIPTIVE STATISTICS OF UNIVARIATE DATA  º   "        PRINT " ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍŒ   "        PRINT "  number of values (n>3): ";        COLOR 9, 14        INPUT "", n 'input of the number of values    LOOP UNTIL n > 3    REDIM SHARED sample(n)    FOR I = 1 TO n        COLOR 0, 14        PRINT "  value no. " + STR\$(I) + ": ";        COLOR 12, 14        INPUT "", Wert#        sample(I) = Wert# '               fills the data array with values    NEXT I    ' ----- SORT of the values ----------    DO        ic = 0        FOR I = 1 TO n - 1            IF sample(I) > sample(I + 1) THEN                h = sample(I)                sample(I) = sample(I + 1)                sample(I + 1) = h                ic = 1            END IF        NEXT I    LOOP UNTIL ic = 0    ' -----------  calculations and output of the results ------------    CLS    COLOR 0, 14    PRINT    PRINT " =========================== RESULTS =================================="    COLOR 2, 14    PRINT "  n (number of values):          "; n    PRINT "  sum (sum of values):           "; sum#(sample())    PRINT "  standard error:                "; StdDev.s#(sample()) / SQR(n) ' stderr#(sample())    PRINT "  range (xmax - xmin):           "; sample(UBOUND(sample)) - sample(LBOUND(sample))    COLOR 12, 14    PRINT "  mean:                          "; mean#(sample())    PRINT "  geometrical mean:              "; geomean#(sample())    PRINT "  root mean square RMS:          "; rms#(sample())    PRINT "  variance (sample):             "; variance.s#(sample())    PRINT "  std.dev. (sample):             "; StdDev.s#(sample()); " = "; _ROUND((StdDev.s#(sample()) * 100 / mean#(sample())) * 100) / 100; " %"    PRINT "  coeff. of variation:           "; 100 * StdDev.s#(sample()) / mean#(sample())    COLOR 9, 14    PRINT "  variance (population):         "; variance.p#(sample())    PRINT "  std.dev. (population):         "; StdDev.p#(sample()); " = "; _ROUND((StdDev.p#(sample()) * 100 / mean#(sample())) * 100) / 100; " %"    PRINT "  coefficient of variation:      "; 100 * StdDev.p#(sample()) / mean#(sample())    COLOR 6, 14    PRINT "  minimum:                       "; sample(LBOUND(sample))    PRINT "  1st quartile (percentile 25%): "; quantile#(sample(), 0.25)    PRINT "  median (percentile 50%):       "; quantile#(sample(), 0.50)    PRINT "  standard error of the median:  "; variance.p#(sample()) / SQR(n)    PRINT "  3rd quartile (percentile 75%): "; quantile#(sample(), 0.75)    PRINT "  maximum:                       "; sample(UBOUND(sample))    PRINT "  interquartile range:           "; quantile#(sample(), 0.75) - quantile#(sample(), 0.25)    COLOR 9, 14    PRINT "  skewness (sample):             "; _ROUND(skew#(sample()) * 100000) / 100000    PRINT "  kurtosis (sample):             "; _ROUND(kurt#(sample()) * 100000) / 100000    PRINT "  excess kurtosis(sample):       "; _ROUND(kurt#(sample()) * 100000) / 100000 - 3    PRINT "  skewness (population):         "; _ROUND(skew#(sample()) * (n - 2) / SQR(n * (n - 1)) * 100000) / 100000    PRINT "  kurtosis (population):         "; _ROUND((kurt#(sample()) * (n - 2) * (n - 3) / (n - 1) - 6) / (n + 1) * 100000) / 100000    PRINT "  excess kurtosis (population):  "; _ROUND((kurt#(sample()) * (n - 2) * (n - 3) / (n - 1) - 6) / (n + 1) * 100000) / 100000 - 3    COLOR 0, 14    PRINT " ======================================================================"    DIM CrLf AS STRING * 2    CrLf = CHR\$(13) + CHR\$(10)    _CLIPBOARD\$ = _CLIPBOARD\$ + " ========================================= " + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " DESCRIPTIVE STATISTICS OF UNIVARIATE DATA      " + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " ========================================= " + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " sorted data:" + CrLf    FOR I = 1 TO n        _CLIPBOARD\$ = _CLIPBOARD\$ + "    " + STR\$(sample(I)) + CrLf    NEXT I    _CLIPBOARD\$ = _CLIPBOARD\$ + " ---------------------------------------------------------" + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " n (number of values):                  " + STR\$(n) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " sum (sum of values):                   " + STR\$(sum#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " standard error:                        " + STR\$(StdDev.s#(sample()) / SQR(n)) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " range (xmax - xmin):                   " + STR\$(sample(UBOUND(sample)) - sample(LBOUND(sample))) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " mean:                                  " + STR\$(mean#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " geometrical mean                       " + STR\$(geomean#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " root mean square RMS:                  " + STR\$(rms#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " variance (sample):                     " + STR\$(variance.s#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " standard deviation (sample):           " + STR\$(StdDev.s#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " standard deviation (sample) %:         " + STR\$(_ROUND((StdDev.s#(sample()) * 100 / mean#(sample())) * 100) / 100) + " %" + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " coefficient of variation (sample):     " + STR\$(100 * StdDev.s#(sample()) / mean#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " variance (population):                 " + STR\$(variance.p#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " standard deviation(population):        " + STR\$(StdDev.p#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " standard deviation (population) %:     " + STR\$(_ROUND((StdDev.p#(sample()) * 100 / mean#(sample())) * 100) / 100) + " %" + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " coefficient of variation (population): " + STR\$(100 * StdDev.p#(sample()) / mean#(sample())) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " minimum:                               " + STR\$(sample(LBOUND(sample))) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " 1st quartile (25% percentile):         " + STR\$(quantile#(sample(), 0.25)) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " median: 2nd quartile (50% percentile): " + STR\$(quantile#(sample(), 0.50)) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " standard error of the median:          " + STR\$(variance.p#(sample()) / SQR(n)) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " 3rd quartile (75%) :                   " + STR\$(quantile#(sample(), 0.75)) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " maximum:                               " + STR\$(sample(UBOUND(sample))) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " interquartile range:                   " + STR\$(quantile#(sample(), 0.75) - quantile#(sample(), 0.25)) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " skewness (sample):                     " + STR\$(_ROUND(skew#(sample()) * 100000) / 100000) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " kurtosis (sample):                     " + STR\$(_ROUND(kurt#(sample()) * 100000) / 100000) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " excess kurtosis (sample):              " + STR\$(_ROUND(kurt#(sample()) * 100000) / 100000 - 3) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " skewness (population):                 " + STR\$(_ROUND(skew#(sample()) * (n - 2) / SQR(n * (n - 1)) * 100000) / 100000) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " kurtosis (population):                 " + STR\$(_ROUND((kurt#(sample()) * (n - 2) * (n - 3) / (n - 1) - 6) / (n + 1) * 100000) / 100000) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " excess kurtosis (population):          " + STR\$(_ROUND((kurt#(sample()) * (n - 2) * (n - 3) / (n - 1) - 6) / (n + 1) * 100000) / 100000 - 3) + CrLf    _CLIPBOARD\$ = _CLIPBOARD\$ + " ---------------------------------------------------------" + CrLf    PRINT    PRINT " All results are stored in the clipboard!"    PRINT " Do you want to start a new statistical evaluation  [y/n]? ";    SLEEP    WEITER\$ = INKEY\$LOOP WHILE (WEITER\$ = "y") OR (WEITER\$ = "Y")COLOR 12, 14CLSLOCATE 10, 25: PRINT " E N D   O F   P R O G R A M "LOCATE 12, 25: PRINT "         - - - -"LOCATE 14, 25: PRINT "      Press any key ": PRINTSLEEPSYSTEMEND'FUNCTIONS'============= sum =========="FUNCTION sum# (x())    s# = 0    FOR i = 1 TO n        s# = s# + x(i)    NEXT i    sum# = s#END FUNCTION'============= mean =========="FUNCTION mean# (x())    mean# = sum#(x()) / nEND FUNCTION'========= variance (sample) =========="FUNCTION variance.s# (x())    m# = mean#(x())    s# = 0    FOR i = 1 TO n        s# = s# + (x(i) - mean#(x())) ^ 2    NEXT i    variance.s# = s# / (n - 1)END FUNCTION'========= variance population) =========="FUNCTION variance.p# (x())    m# = mean#(x())    s = 0    FOR i = 1 TO n        s# = s# + (x(i) - mean#(x())) ^ 2    NEXT i    variance.p# = s# / nEND FUNCTION'======= standard deviation (sample) ========"FUNCTION StdDev.s# (x())    StdDev.s# = SQR(variance.s#(x()))END FUNCTION'======= standard deviation (population) ========"FUNCTION StdDev.p# (x())    StdDev.p# = SQR(variance.p#(x()))END FUNCTION'============== median ====================="FUNCTION median# (x())    IF (n / 2) = INT(n / 2) THEN        'even        median# = (sample(n / 2) + sample((n / 2) + 1)) / 2    ELSE        'odd        median# = sample((n + 1) / 2)    END IFEND FUNCTION'============================ quantile ========================FUNCTION quantile# (x(), a)    rang# = a * (n - 1) + 1    index% = INT(rang#)    gewicht# = rang# - index%    quantile# = x(index%) + gewicht# * (x(index% + 1) - x(index%))END FUNCTION'============================ skewness ========================FUNCTION skew# (x())    m# = mean#(x())    s# = StdDev.s#(x())    sk# = 0    FOR J = 1 TO n        sk# = sk# + ((x(J) - m#) / s#) ^ 3    NEXT J    IF s# <> 0 THEN        skew# = sk# * (n / ((n - 1) * (n - 2)))    ELSE        skew# = 0    END IFEND FUNCTION'============================ kurtosis ========================FUNCTION kurt# (x())    m# = mean#(x())    s# = StdDev.s#(x())    krt# = 0    FOR j = 1 TO n        krt# = krt# + ((x(j) - m#) / s#) ^ 4    NEXT j    IF s# <> 0 THEN        kurt# = ((krt# * (n + 1) * n) / ((n - 1) * (n - 2) * (n - 3))) - ((3 * (n - 1) ^ 2) / ((n - 2) * (n - 3)))    ELSE        kurt# = 0    END IFEND FUNCTION'====================== geometrical mean ========================FUNCTION geomean# (x())    gm# = 1    FOR j = 1 TO n        gm# = gm# * x(j)    NEXT j    geomean# = gm# ^ (1 / n)END FUNCTION'============ mean square error ===================FUNCTION rms# (x())    ms# = 0    FOR j = 1 TO n        ms# = ms# + x(j) ^ 2    NEXT j    rms# = SQR(ms# / n)END FUNCTION`
« Last Edit: August 08, 2018, 11:15:28 PM by odin »