HP3000-L Archives

October 1999, Week 2

HP3000-L@RAVEN.UTC.EDU

Options: Use Monospaced Font
Show Text Part by Default
Show All Mail Headers

Message: [<< First] [< Prev] [Next >] [Last >>]
Topic: [<< First] [< Prev] [Next >] [Last >>]
Author: [<< First] [< Prev] [Next >] [Last >>]

Print Reply
Subject:
From:
Wirt Atmar <[log in to unmask]>
Reply To:
Date:
Mon, 11 Oct 1999 17:13:56 EDT
Content-Type:
text/plain
Parts/Attachments:
text/plain (94 lines)
I've been meaning to answer John Dunlop's question about how to code standard
deviation in Quiz for several days now, but just haven't had the time.
However, as long as this is official math day, let me finish up John's
question. In a previous posting, I gave the relevant equations as:

               mean = SUM[x(1),x(2),....,x(n)]/n

               pop. variance = SUM([x(i)-mean]^2)/n     for i = 1,...,n

              unbiased variance = n/(n-1) * [pop. variance above]

              std. dev. = sqr(unbiased variance)

These equations, at least in this form, suggest that you will require two
passes to calculate the standard deviation, the first to get the average, and
then the second to calculate the variance of each of the individual entries
against the previously calculated mean. However, that isn't true. If you
multiply the second equation out, substituting the first equation into the
second, you only need to accumulate three variables during your first and
only pass through the data:

              the accumulating count

              the accumulating sum

              the accumulating sum of the squares

While it has been 15 years since I programmed a report in Quiz, and thus I
can't be of any real help in that regard, I can give the equations the way we
programmed them up internally within QueryCalc.

A matrix of six real numbers R0[*] is defined to be a global variable in
QueryCalc. The matrix is composed of these elements:

      R0[1] = the accumulating sum of the qualified entries
      R0[2] = the accumulating count of the qualified entries
      R0[3] = the maximum value so far qualified
      R0[4] = the minimum value so far qualified
      R0[5] = the accum. sum of squares of qualified entries
      R0[6] = the accum. count of all IMAGE/KSAM/MPE records touched

These six variables allows us to calculate the average, sum, max, min,
variance, standard deviation, count, and percent qualified, which are part of
every QueryCalc query question. You should be able to do the same thing in
Quiz.

As each record in the query is touched and/or qualified, these particular
values are updated in QC's search routines. The BASIC code to do this on an
"instanteous" basis (i.e., once for every qualifying instance) is the
following:

 5150   R0[1]=R0[1]+V0[K]
 5160   R0[5]=R0[5]+V0[K]**2
 5170   IF V0[K]>R0[3] THEN DO
 5180     R0[3]=V0[K]
 5200   DOEND
 5210   IF V0[K]<R0[4] THEN DO
 5220     R0[4]=V0[K]
 5240   DOEND
 5270   R0[2]=R0[2]+1

(some lines were deleted in this listing because they are irrelevant to the
process at hand).

Once all of the records in the dataset(s) have been searched (either by using
a chain or a serial read) and we've come to the end of the dataset, the
following code is called to convert these raw accumulated sums into the
standard statistics for display purposes:

 1620     IF Q$[1,102;3]="SUM" THEN CONVERT R0[1] TO I$
 1630     IF Q$[1,102;3]="NUM" THEN CONVERT R0[2] TO I$
 1631     IF Q$[1,102;3]="FND" THEN CONVERT R0[2] TO I$
 1640     IF R0[2]=0 THEN A=0
 1650     ELSE A=R0[1]/R0[2]
 1660     IF Q$[1,102;3]="AVG" THEN CONVERT A TO I$
 1670     IF R0[2]<2 OR R0[5]/R0[2]-A**2<0 THEN V=0
 1680     ELSE V=R0[2]/(R0[2]-1)*(R0[5]/R0[2]-A**2)
 1690     IF Q$[1,102;3]="VAR" THEN CONVERT V TO I$
 1700     IF Q$[1,102;3]="DEV" THEN CONVERT SQR(V) TO I$
 1710     IF Q$[1,102;3]="MAX" THEN CONVERT R0[3] TO I$
 1720     IF Q$[1,102;3]="MIN" THEN CONVERT R0[4] TO I$
 1730     IF R0[6]=0 THEN P5=0
 1740     ELSE P5=R0[2]/R0[6]*100
 1750     IF Q$[1,102;3]="PCT" THEN CONVERT P5 TO I$

Some of the statistics (sum, max, min, count) require no further conversion.
They simply need to be converted to text for display. The others (average,
variance, std. dev., percent qualifying) require some final calculation. A
few limits need to be placed on some of the calculations. An average makes no
sense with less than 1 qualifying record; a std. deviation/variance makes no
sense with less than two entries.

Wirt Atmar

ATOM RSS1 RSS2