
Summarising data

 


Syntax

  1. BD>summary : D1 [,D2 ...]

  2. BD>summary :

where D1, D2, ... are the names of bases or data-carriers.


The SUMMARY: command is used to summarise the data carried by elements and data-elements. For the first form of the syntax, a list of bases, elements, and data-elements is supplied. Data is summarised for every data-carrier in the collection formed by these quantities. (Each base is broken into its constituents; if any of these are themselves bases, the process continues until all the constituents are data-carriers.) The second form of the syntax is used when we want to summarise all data. See §6.2 for details of specifying collections for the definition part.
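The expansion of bases into data-carriers described above can be pictured with a short Python sketch. This is only an illustration and not the program's own code: the function name `expand_collection`, the dictionary `bases`, and the names B, C, W are all hypothetical. It also assumes that each data-carrier is listed once and that no base contains itself.

```python
def expand_collection(names, bases):
    """Expand a list of names into a flat list of data-carriers.

    `bases` maps each base name to the list of its constituents. Any name
    that is not a key of `bases` is treated as a data-carrier.
    Assumption: duplicates are dropped and bases never contain themselves.
    """
    carriers = []
    for name in names:
        if name in bases:
            # A base: break it into its constituents and keep expanding.
            for carrier in expand_collection(bases[name], bases):
                if carrier not in carriers:
                    carriers.append(carrier)
        elif name not in carriers:
            # Already a data-carrier: keep it as it is.
            carriers.append(name)
    return carriers


# Hypothetical layout: base B holds carrier X and base C; C holds Y and Z.
bases = {"B": ["X", "C"], "C": ["Y", "Z"]}
expanded = expand_collection(["B", "W"], bases)
print(expanded)  # ['X', 'Y', 'Z', 'W']
```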

Summaries are given only for selected data. Data may be selected in one of two ways. Firstly, the autoselect control may be switched on beforehand (this is the default). In this case, the data selection consists of all possible observations on all the quantities for which a summary is required, and thus the data selection is made specially for this particular SUMMARY: command. Secondly, the autoselect control may be switched off beforehand. In this case, the data selection consists of the current overall data selection, as made using the SELECT: command (see §8.5), and so the data selection should have been made prior to this particular SUMMARY: command. In either case, there might be missing values for some or all of the quantities.
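The two selection modes can be sketched in Python as follows. The data layout (a dictionary of case labels to values, with `None` standing for a missing value), the function name `values_to_summarise`, and the example figures are assumptions made purely for illustration; they are not taken from the program.

```python
def values_to_summarise(data, quantity, autoselect, current_selection):
    """Return the non-missing values of `quantity` that a summary would use.

    data: dict mapping quantity name -> {case label: value, or None if missing}
    current_selection: set of case labels, consulted only if autoselect is off
    """
    observations = data[quantity]
    if autoselect:
        # Autoselect on: every possible observation on the quantity is selected.
        cases = list(observations)
    else:
        # Autoselect off: only cases in the current overall selection count.
        cases = [c for c in observations if c in current_selection]
    # Missing values are skipped in either mode.
    return [observations[c] for c in cases if observations[c] is not None]


# Hypothetical data: three cases on X, the second of which is missing.
data = {"X": {1: 2.0, 2: None, 3: 5.0}}
with_autoselect = values_to_summarise(data, "X", True, set())
without_autoselect = values_to_summarise(data, "X", False, {2, 3})
print(with_autoselect, without_autoselect)
```

With autoselect on, the current selection is ignored; with it off, only cases 2 and 3 are considered, and case 2 then drops out because its value is missing.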

The summary consists of, for each quantity for which a summary is required, the following values.

  1. The number n of observations available for the current selection. Suppose these observations to be $x_1, x_2, \ldots, x_n$.
  2. The smallest and largest of these n observations.
  3. The average of these n observations, $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$.
  4. The sample standard deviation of these n observations, $s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$.
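As a rough illustration of the four quantities listed above, the following Python sketch computes them for a list of values. The function name `summarise` is hypothetical, and the divisor n-1 in the standard deviation is an assumption (the usual convention for a sample standard deviation). The input values are the three X observations quoted in the example later in this section.

```python
import math


def summarise(values):
    """Return n, the smallest and largest value, the average, and the sample sd."""
    n = len(values)
    if n == 0:
        # Nothing selected: only the count is meaningful.
        return {"n": 0, "min": None, "max": None, "mean": None, "sd": None}
    mean = sum(values) / n
    if n > 1:
        # Assumption: divisor n - 1, the usual sample convention.
        sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    else:
        sd = None  # undefined for a single observation
    return {"n": n, "min": min(values), "max": max(values), "mean": mean, "sd": sd}


summary_x = summarise([15.3, 12.6, 11.2])
print(summary_x)
```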

Additionally, if the scorr and/or scov options are switched on, correlation and/or covariance output is given for all pairs of the quantities for which a summary is required. This output is given only if both of the following conditions hold: (1) n>1; (2) for every pair of quantities, an observation exists for the current selection.

As an example, consider the three data-carriers X,Y,Z and the three possible selections A,B,C shown in Table 8.2. Firstly suppose that the autoselect control is switched on. Then the current data selection is irrelevant, and a summary will be given which relates to three values on X, six values on Y, and four values on Z. No correlation output will be given, as the number of observations summarised differs between the quantities.

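Secondly suppose that the autoselect control is switched off, so that the current data selection applies. For selection A, the summary would relate to three values for X, four for Y, and four for Z. No correlation output would be available. Now suppose that selection B prevailed, and that we issued the command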

BD>summary : X, Y

Under this scenario, there are three selected matching observations: (15.3, 9.2), (12.6, 9.4), and (11.2, 9.7). Hence individual summaries for X and Y will be output, followed by a correlation based upon these three pairs if the scorr option is switched on, and a covariance matrix if the scov option is switched on.
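For readers who wish to check the figures, the covariance and correlation for these three pairs can be worked out with a few lines of Python. This is a sketch only: the helper name `cov_and_corr` is hypothetical, and the divisor n-1 is assumed for the covariance, so the program's own output may be laid out or rounded differently.

```python
import math


def cov_and_corr(pairs):
    """Return the sample covariance and correlation of matched (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two matched observations are needed")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    cov = sxy / (n - 1)  # assumption: divisor n - 1
    corr = sxy / math.sqrt(sxx * syy)
    return cov, corr


pairs = [(15.3, 9.2), (12.6, 9.4), (11.2, 9.7)]
cov_xy, corr_xy = cov_and_corr(pairs)
print(round(cov_xy, 4), round(corr_xy, 4))
```

Under these assumptions the covariance is about -0.50 and the correlation about -0.96, reflecting that Y rises as X falls across the three selected pairs.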



David Wooff
Wed Oct 21 15:14:31 BST 1998