|
PQL Procedures | Tabulate |
TABULATE
procedure selects and summarises data. It then produces cross-tabulations with summary data, statistics or percentages. See syntax.
A wide variety of cross-tabulations can be produced:
TABULATE FILENAME = filename HEADER = (expression) [STUB = (expression)] [WAFER = (expression)]
BOOLEAN = (boolean expression) SAMPLE = fraction WEIGHT = varname
TOTAL = varname ['label']... COUNT = varname ['label']... CSS = varname ['label']... CV = varname ['label']... CVERR = varname ['label']... MAXIMUM = varname ['label']... MEAN = varname ['label']... MEDIAN = varname ['label']... MINIMUM = varname ['label']... MISSING = varname ['label']... NORMALIZED = varname ['label'](n1)['label1'],(n2)['label2']... PERCENT = varname ['label']... QUANTILE = varname ['label'] n ['label1','label2'...]... QUANTILE = varname ['label'](n1)['label1'],(n2)['label2']... RANGE = varname (lo,hi) ['label']... STDERR = varname ['label']... STDEV = varname ['label']... TSTATISTIC = varname ['label']... USS = varname ['label']... ISDNUMBER = n
PRINTFORMATS= varlist (option)...
PAGETITLE = 'string' PAGELENGTH = n PAGEWIDTH = n COLLAPSE
HEADERWIDTH = n HEADERINDENTATION = n HEADERDIVIDER = 'character' NODIVIDERS NOHEADERCENTER
STUBTITLE = option 'string' STUBWIDTH = n STUBINDENTATION = n STUBCONTINUATION= n STUBHEIGHT = n STUBFILLER = 'character' STUBDIVIDER = 'character'
WAFERTITLE = option 'string' NOWAFERTITLE WAFERCENTER WAFERSTART = n WAFERINDENTATION = n WAFERCONTINUATION = n WAFERHEIGHT = n
HTML NOBOTTOMBORDER NOLEFTBORDER NORIGHTBORDER NOTES= 'string' FOOTNOTES= 'string' WAFERNOTES= 'string' NOZEROS ZEROCHARACTER= 'character' NOEMPTYROWS DELETEMISSING UPPERCASE SPANNERS DUMMY = varname 'label' ... LANGUAGE = DUTCH | GERMANThe
FILENAME
and the header expression are the only required clauses.
FILENAME
|
Specify the filename created by the procedure. If a filename is not specified on a subsequent TABULATE procedure, the output is appended to the last TABULATE output file.
If the filename is
|
HEADER
, STUB
& WAFER
) define the data in one of the dimensions of the table. Each expression consists of from one to n variables and/or
statistical keywords all enclosed in brackets. If multiple variables are specified on a single expression they are related by the keywords BY
or THEN
.Percentages can also be used in expressions.
The type of variable, Variable Modifiers, and statistics all effect the Cell contents.
HEADER = (expression)
| Defines the columns of the tabulation. This clause is required. |
STUB = (expression) | Defines the rows of the tabulation. |
WAFER = (expression)
| Defines the sections of the tabulation. Sections are a number of two dimensional sections specified by the stub and header. |
BY | Nest a variable within another thus producing combinations of the variable values within one dimension of the table. The asterisk (*) symbol may be used instead of the keyword. |
THEN | Appends a second variable or statistical keyword to the set of variables displayed within one dimension of the table. The plus (+) symbol may be used instead of the keyword. |
BY and THEN | Can be used in combination to create complex structures within a single dimension or expression. |
Errors may occur, not only if the table is misspecified, but also if it cannot fit the output page.
TABULATE
command has the general form:
TABULATE HEADER = (expression) [STUB = (expression) [WAFER = (expression)]]A table consists of up to three components or dimensions: header, stub and wafer. Each dimension is defined by an expression. Each expression is identical in format but applies to a different dimension of the table. These expressions name the variables to tabulate and their relation to each other in the table. The different keywords (
HEADER
, STUB
& WAFER
) define the dimension being specified. The HEADER
is required; WAFER
can only be specified if a STUB
is also specified. The sequence of the three expressions in the command is irrelevant.The header defines variables whose values correspond to columns; the stub defines variables whose values correspond to rows. The rows and columns define a two-dimensional array called the wafer. The wafer is a higher level categorisation resulting in multiple individual row/column combinations. The table description consists of one to three expressions.
Define an expression with variable names and keywords. At the most basic an expression consists of a single variable name. Multiple variables or statistics may be defined in a single expression separated with the keywords BY
and THEN
. BY
specifies a level of nesting within a single dimension; THEN
specifies multiple variables within a single dimension.
The structure of the table is defined by the values of the control variables specified in the various expressions. The values in the variables named in these expressions are used as the labels for the particular dimension being specified. The number of categories of any control variable is taken from the schema definition which contains information on variable ranges, valid values and value labels.
The body of the table consists of Cells. The content of a cell is a summary of data which applies to that particular combination of variables in the expressions. This might be a count of occurrences of data, a sum of
values, a percentage or some other statistic.
If the command has a single header expression, a table containing a single value for each column is produced. That is one line of data is produced with one entry per column. For example:
Nesting variables with the
If the tabulate statement contains a
HEADER = (SEX BY AGE BY INCOME)
When two variables are joined by
The next example concatenates
Other combinations might be:
Nesting and concatenation can also be used in the stub and wafer expressions.
TABULATE HEADER = (REGION)
TABULATE HEADER = (AGE BY SEX)
TABULATE HEADER = (SALARY BY (MIN THEN MEAN THEN MAX))
If the tabulate statement contains a STUB
expression, then a two-dimensional table is produced. The stub defines the rows of the table going down the page. The header defines the columns going across the page. For example,
TABULATE HEADER = (RACE) STUB = (REGION)
TABULATE HEADER = (RACE THEN SEX) STUB = (REGION)
TABULATE HEADER = (SALARY BY (MEAN THEN MEDIAN)) STUB = (REGION BY RACE)
By specifying variables in two expressions, a cross-tabulation is produced with one cell for each value of each variable. For example, if a cross tabulation for two variables is specified, where each variable has two values, four cells are produced. The more values, the more cells.BY
operator also produces a cross-tabulation within a single dimension. If the same variables are used as a Stub and header or as a header with a BY
clause, the same cells are produced but the table has a different shape.WAFER
expression, a set of two-dimensional tables is produced. There must be a stub expression for a WAFER
expression to be valid. When more than one wafer is generated a table of contents is also produced. If a particular wafer contains no data, it is not printed and is marked deleted in the table of contents. If only one wafer is produced, it is printed even if it is empty. For example:
TABULATE HEADER = (SEX) STUB = (AGE) WAFER = (REGION)
TABULATE HEADER = (SEX) STUB = (AGE) WAFER = (REGION BY RACE)
TABULATE HEADER = (AGE BY REGION) STUB = (RACE BY SEX) WAFER = (COUNT THEN PCT)
BY
Specifying BY
between two variables nests each of the values of the second variable in each of the values of the first variable. If a second BY
is specified, a third level of nesting takes place. BY
defines a structure within a particular dimension. Each variable specifies a further level of nesting. The number of cells produced is the product of the number of values for each variable. There is no limit to the number of levels allowed within one expression. The asterisk (*) symbol can be used instead of BY
.
Example: BY
HEADER = (AGE BY INCOME)
Young Old
Rich Poor Rich Poor
6 4 8 3
Male Female
Young Old Young Old
Rich Poor Rich Poor Rich Poor Rich Poor
4 1 6 2 2 3 2 1
THEN
Specifying THEN
between two variables or two clauses in effect concatenates two independent tables with all values of the first table followed by all values of the table. The plus (+) symbol can be used instead of THEN
.Example: THEN
The following header expression results in the independent tabulation of individual occurrences of the two values for each of the two variables:HEADER = (AGE THEN SEX)
Young Old Male Female
10 11 13 8 THEN
, each value of each variable defines a cell. The number of cells produced is the sum of the number of values for each variable.
Combining BY and THEN
BY
takes precedence over THEN
when the two operators are used in combination in an expression. Use parentheses to specify a different precedence.Example: BY and THEN
The following examples show various combinations of the same variables in a header expression. In the first example, since nesting takes precedence, SEX
is nested with INCOME
, and the result of that nesting operation is concatenated with AGE
.
HEADER = (AGE THEN INCOME BY SEX)
Young
Old Rich Poor
Male Female Male Female
10 11 10 4 3 4 AGE
and INCOME
and nests SEX
within both AGE
and INCOME
, using parentheses to cause the concatenation operator to be applied first.
HEADER = ((AGE THEN INCOME) BY SEX)
Young Old Young Old
Male Female Male Female Male Female Male Female
5 5 8 3 10 4 3 4
HEADER = (AGE BY INCOME THEN SEX)
Young Old
Male
Female
Rich Poor Rich Poor
6 4 8 3 13 8 HEADER = (AGE BY (INCOME THEN SEX))
Young Old
Rich Poor Male Female Rich Poor Male Female
6 4 5 5 8 3 8 3 Variable Types
Tabulate uses numeric variables only. This includes categorical variables, which are numeric codes representing discrete string values. In a VisualPQL program, string variables can be recoded into numeric categorical values. There are two types of numeric variables; variables with discrete categories, known as
CONTROL
variables and variables with continuous values, known as OBSERVATION
variables.
Tabulate determines the tabulation type of a variable from the information in the variable schema according to the following rules:
CATEGORICAL,
or have VALID VALUES
or have VALUE LABELS
are control variables. All other variables are observation variables.
CONTROL VARS
or OBSERVATION VARS
in the data dictionary.
GET VARS
maintain their tabulation type.
The CONTROL VARS
and OBSERVATION VARS
commands in a VisualPQL program assign an explicit tabulation type to local variables created in the program.
CONTROL
variables must have ranges specified.TABULATE
statement as a control variable.
If a control variable contains missing values, the record is normally excluded from the body of the tabulation. Categories can be generated to show cells for missing control variables. If an observation variable contains missing values, the record is normally included but the variable is not computed into
the cell contents. This can give rise to situations where overall totals do not agree with counts in the body of the table. Specify the DELETEMISSING
keyword to exclude records where any cross-tabulated variable has a missing or undefined value.
Control variables provide classifications which determine the structure of the table and the presence of a particular cell. Observation variables give the result that appears in the cell.
When two control variables are cross tabulated or nested, the content of the cell is a count of the number of occurrences of data which match that combination of values.
When an observation variable is cross tabulated or nested with a control variable, the total (sum) for the observation variable is the default. That is, the structure of the table is unaltered by the inclusion of an observation variable but, instead of a count of data appearing in the cell contents, a total of the observation variable is calculated and an appropriate header label printed. Other statistics can be specified for observation variables.
Two observation variables cannot be cross tabulated or nested within each other and therefore observation variables may be specified in one and only one expression (dimension) of the TABULATE
statement.
Different types of information are reported in the cell contents for Control and Observation variables. For Control variables, which have discrete values, counts or percentages are reported. For Observation variables, which have continuous values, a variety of cell statistics are reported, the default being a sum of the values. For any cell that cross tabulates a control and an observation variable, the observation variable statistics are reported.
SALARY
is
included in the expression. The number in any given cell is the sum of the
SALARY
values from each record having the combination of
values for the control variables for that particular cell. The placement of the
observation variable in the header expression determines the placement of the
label, but does not alter the content of the table. The observation variable
does not alter the number of cells. Observation variables and counts may be
specified in a single expression in various ways:
HEADER = (SALARY BY AGE BY SEX) STUB = (REGION)
SALARY | ||||
Young | Old | |||
Male | Female | Male | Female | |
North.......... South.......... | 2800 2200 | 5650 11700 | 20100 4950 |
3200 5000 |
HEADER = (SEX THEN (SALARY BY SEX) STUB = (AGE)
Male | Female | Current Monthly Salary | ||
Male | Female | |||
Young... Old..... | 5 8 | 5 3 | 13050 21900 | 14450 8200 |
HEADER = (AGE BY (SEX THEN SALARY)) STUB = (REGION)
Young | Old | |||||
Male | Female | Current Monthly Salary | Male | Female | Current Monthly Salary | |
North.... South.... | 3 2 | 2 3 | 13600 13900 | 6 2 | 1 2 | 20150 9950 |
variable[.modifier][.modifier]...Several modifiers can be appended to a single variable. The modifiers and their abbreviations are:
The
The
When a control variable has a missing or undefined value, it is normally
excluded from the detail cells in the table. Specify the
By default, the rows and columns of a table are labeled with value labels (if they exist) or with the "variable=value"
notation. The
The
There are two methods of specifying statistics.
The first method is to specify a pseudo-variable in the expression and to use one of the statistical option
clauses to specify the statistic and label to produce in the form:
The second method is to use a statistical keyword directly.
In effect, these are pre-defined, convenient pseudo-variables
with standard labels.
(This method cannot be used with
.CONTROL (.C)
.OBS (.O)
.FIRST (.F)
.TOTAL (.T)
.MISSING (.M)
.UNDEFINED (.U)
.SPAN (.S)
.NOSPAN (.NOS)
The .CONTROL
and .OBS
modifiers alter the
default tabulation type of a variable (i.e. control or observation). The
.CONTROL
modifier may only be specified for an observation
variable if valid ranges have been specified for it..FIRST
modifier is used in conjunction with the percent
base marker as described in the section on percentages. It specifies that the
first occurrence of the percent base marker is used..TOTAL
modifier automatically concatenates a
TOTAL
control variable as described in the section on
TOTAL
. The TOTAL
is displayed before
the variable being modified..MISSING
or .UNDEFINED
modifiers
to generate additional cells in the
table. The .UNDEFINED
modifier groups all missing and undefined values into a single cell. The
.MISSING
modifier displays undefined and each defined missing
value as separate cells. These modifiers override any DELETEMISSING
specification for that variable.Example: Undefined Modifier
HEADER = (AGE.T.U)
TOTAL Young Old AGE=
UNDEFINED
22 10 11 1 .SPAN
modifier generates a second level of heading for a variable. This heading is either the variable label (if it exists) or the variable name. This new heading "spans" all the categories
of the variable, hence the name .SPAN
.
Example: SPAN Modifier
The next two tables illustrate the effect of .SPAN
. The only
difference between them is the .SPAN
modifier in the second
table.
HEADER = (AGE BY SEX) STUB = (REGION)
Young Old
Male Female Male Female
North..........
South.......... 3
2 2
3 6
2 1
2
HEADER = (AGE.S BY SEX.S) STUB = (REGION.S)
AGE
Young Old
Gender
Male Female Male Female
REGION
North..........
South..........
3
2
2
3
6
2
1
2SPANNERS
option
can be used to turn on
spanning labels for all variables. The .NOSPAN
modifier turns
off spanning labels for a particular variable when the
SPANNERS
option is in effect. Statistics
Other than TOTAL,
statistics are used with observation variables.
Statistics are specified with the BY
keyword
to associate them to an observation variable.
keyword = pseudo-variable 'label'
The 'label' is optional but the variable name is required.NORMALIZED, PERCENT,
QUANTILE
and RANGE
, since additional data is
required on the specification.)
The statistical clauses are:
COUNT | A count of records in this category. |
CSS | Corrected Sum of Squares, where the cell = sum of squares - ((sum**2)/n). |
CV | Coefficient of Variance, where the cell = (standard deviation/mean) * 100. |
CVERR | Coefficient of Error, where the cell = (standard error/mean) * 100. |
MAXIMUM | Maximum value of the variable. |
MEAN | Mean value of the variable. |
MEDIAN | Median value of the variable. |
MINIMUM | Minimum value of the variable. |
MISSING | Count of missing values in the variable. |
NORMALIZED* | Normalized quantiles (by median value) where cell = (quantile/median) * 100. |
PERCENT* | Percentage, where cell = (cell/base) * 100. |
QUANTILE* | Quantile, equal or unequal quantiles can be specified. |
RANGE* | Range produces the range of the variable expressed as the difference between the lowest and the highest values encountered. The format of the range clause is:RANGE = variable (lo,hi) ['label'] The specification of the lo and hi values defines the outer limits of any values to be included in the calculation. This can be used, for example, to exclude negative or zero values or very high numbers used as some coding convention. |
STDERR | Standard Error where cell = standard deviation/sqrt(n). |
STDEV | Standard Deviation about the mean. |
TOTAL | Count of records. When cross tabulated with an observation variable, gives sum of values of the observation variable. |
TSTATISTIC | T-Statistic where cell = mean/standard error. |
USS | Uncorrected Sum of Squares where cell = sum of squares. |
TOTAL
clause specifies that the pseudo-variable N is a
total, with the label 'Valid Cases'. The MEAN
and
STDEV
clauses assign labels for these pseudo-variables to 'X
Bar' and 'Sigma'.
TABULATE HEADER = (N THEN SALARY BY (MEAN THEN STDEV)) STUB = (SEX) TOTAL = N 'Valid Cases' MEAN = MEAN 'X Bar' STDEV = STDEV 'Sigma'
Valid Cases | Current monthly salary | ||
X Bar | Sigma | ||
Male........... Female......... | 13 8 | 2688 2831 | 486 746 |
The pseudo-variables created can be used anywhere that is valid. If the
pseudo-variable is a TOTAL
it is a control variable, otherwise
it is an observation variable.
Multiple pseudo-variables can be created and assigned to the same keyword. In
the next example, both N
and SUM
are totals,
N
is the count of records and SUM
is the
total SALARY
.
TABULATE HEADER = (N THEN SALARY BY (SUM THEN MEAN THEN STDEV)) STUB = (SEX) TOTAL = N 'Valid Cases' SUM 'Sum' MEAN = MEAN 'X Bar' STDEV = STDEV 'Sigma'
Valid Cases | Current monthly salary | |||
Sum | X Bar | Sigma | ||
Male........... Female......... | 13 8 | 34950 22650 | 2688 2831 | 486 746 |
HEADER = (SALARY BY (MIN THEN MAX)) STUB = (SEX)
Current monthly salary | ||
MINIMUM | MAXIMUM | |
Male........... Female......... | 2000 1650 | 3600 4000 |
Keywords must be nested within an observation variable. Keywords may not be nested within each other; they may be concatenated. For example, SALARY BY MEAN BY STDEV
is not allowed; SALARY BY (MEAN THEN STDEV)
is allowed.
ISD
number
.
This number is initially set to 4 which means that the maximum relative error in a median or quantile estimate should not exceed 12.5%. The value is usually more accurate if the values are distributed smoothly. Set the ISD number explicitly by using the clause:
ISDNUMBER = n /that sets the
ISD
number to n. Increasing the ISD number by 1 cuts the maximum relative error in half while doubling the internal storage required to compute the medians or quantiles. The ISDNUMBER
clause must precede the clauses it is to affect. For example,
TABULATE HEADER = (SALARY BY (MEAN THEN Q)) STUB = (TOTAL THEN REGION) ISDNUMBER = 5 QUANTILES = Q 'Quantiles' 4
TOTAL
can be specified as an independent control variable
which is a count of records. It is specified in exactly the same way as other
statistics (using either pseudo-variables or a keyword). However, since it is a control variable rather than an observation variable, it operates in a different manner.
HEADER = (AGE THEN TOTAL)
Young | Old | TOTAL |
10 | 11 | 22 |
The control variable, AGE
, with two classifications is
concatenated with TOTAL
. The total column equals the count of
all occurrences at the appropriate level of nesting. (Note that records
with undefined values have still been counted.)
HEADER = (SALARY THEN TOTAL)
Current monthly salary | TOTAL |
57600 | 22 |
SALARY
is an observation variable and thus the sum is displayed by default. The keyword TOTAL
is a control variable and displays a count. Cross tabulating or nesting TOTAL
with either control or
observation variables is allowed since it is a control variable. As per the standard rules for cell contents, if TOTAL
is nested with a control variable, a count is produced; if it is nested with an observation
variable, a sum of the observation variable is produced.
SALARY
and the control variables AGE
and SEX
:
HEADER = (SALARY BY AGE BY (TOTAL THEN SEX))
Current Monthly Salary | |||||
Young | Old | ||||
TOTAL | Male | Female | TOTAL | Male | Female |
27500 | 13050 | 14450 | 30100 | 21900 | 8200 |
Since TOTAL
is nested within an observation variable, this column contains the total SALARY
for each AGE
group.
TOTAL
. Appending a single T to the variable name separated by a period specifies that a TOTAL
is produced before the variable. The following two expressions are identical:
TOTAL THEN AGE AGE.TA tabulate statement can be shortened as follows:
TABULATE HEADER = (SALARY BY AGE BY (TOTAL THEN SEX)) TABULATE HEADER = (SALARY BY AGE BY SEX.T)
In the first form, the quantity "n" specifies the number of equal-sized quantiles to be produced. Of these, n-1 are printed. Thus for n=4, TABULATE
prints output for the 25%, 50% and 75% quantiles. In
the second form a number of non-equal sized quantiles are produced and each n specifies the quantiles to produce. For example, specifying Ns of 15 and 85 produces the 15th and 85th percentage quantiles. Both forms allow specification
of labels. In the first form, when labels are used for individual quantiles, the commas separating 'label1', 'label2', etc. are required.
1) keyword = variable ['label'] n ['label1', ...] 2) keyword = variable ['label'] (n1) ['label'] (n2)...
TABULATE HEADER = (SALARY BY (MEAN THEN Q)) STUB = (REGION.T) QUANTILES = Q 'Quantiles' 4
Current monthly salary | ||||
MEAN | Quantiles | |||
QUANTILE- =25 | QUANTILE- =50 | QUANTILE- =75 | ||
TOTAL....... North........ South........ | 2743 2813 2650 |
2311 2238 2369 | 2678 2825 2538 | 3222 3315 2913 |
A specification in the first form , with multiple labels:
TABULATE HEADER = (SALARY BY (MEAN THEN Q)) STUB = (REGION.T) QUANTILES = Q 'Quantiles' 4 '25%', 'Median', '75%'
Current monthly salary | ||||
MEAN | Quantiles | |||
25% | Median | 75% | ||
TOTAL........ North........ South........ | 2743 2813 2650 |
2311 2238 2369 | 2678 2825 2538 | 3222 3315 2913 |
A specification in the second form:
TABULATE HEADER = (SALARY BY (MEAN THEN Q)) STUB = (REGION.T) QUANTILES = D 'Deciles' (10) '10th' (50) '50th' (90) '90th'
Current monthly salary | ||||
MEAN | Deciles | |||
10th | 50th | 90th | ||
TOTAL........ North........ South........ | 2743 2813 2650 |
2051 1973 2267 | 2678 2825 2538 | 3407 3647 3265 |
TABULATE
can calculate and display percents. Tables can be
specified which display just percentages or which also include the original
values.Tables can be specified which contain row percentages, column percentages or overall percentages. Multiple percentages can be displayed in a single table. A single table can have percentages computed from more than one base value. However a table cannot have any cell which is an intersect of two percentage calculations. This means that all percentage cells must appear in a single expression; there cannot be percentages in both a row and column expression.
To display percentages, specify where the percents appear and the base of each
percent calculation. Use the
A brief review of .T and %X notation: An expression such as
The basics of producing percentages in tables are covered in simple percentages. You can also get percentages with observation variables and tables with values and percentages. Tables can have percentages for row totals and can be displayed in different wafers. There are numerous examples of different tables using percentages in
percentage examples.
To add percentages to this table, define a pseudo-variable, say X, and reference this on
Decide where to place the percent variable X, ensuring that it is nested correctly. Specifying it in the header expression or in the stub expression only affects the labeling of the table.
The total (100%) and regional percentages for each age group (row) are displayed in the table. The table contains only percentages because the percent variable, X, is at the highest level of the header expression nesting. The remainder of
the header,
Moving X to the stub expression changes the labeling of the table but not its contents:
In the example, move the base marker,
As in the preceding table, the percent variable,
In this table, all percentages are measured using the first cell of the table as
the base. This cell is at the intersection of the stub
The corresponding percentage table is:
Moving the clause
The following has alternating columns of original values and percents. Note
only the cells nested within the percent variable X contain percents.
Instead of presenting the data in alternating columns, it could be grouped into
separate blocks of frequencies and percents by inverting the order of the header
expression.
An analogous set of tables can be designed for percentages and totals of an
observation variable. The following table displays alternating columns of total
salary and percentage.
This table groups all salary columns followed by all the percent columns.
The following shows total salary for both sexes and percentage by sex.
(On next page)
Suppose the table is a breakdown of males and females under and over forty. The basic table (without percentages) is easily produced with a table statement such as:
It is simple to add percentages to either the row totals, giving percentages of
males and females or the column totals giving percentages of age groups but not
both at once.
The solution is to produce two tables in the one run:
PERCENT
option to define a
pseudo-variable and then reference this in an expression to display the percent.
Specify the base variable to use to calculate the percentage by appending a
percentage symbol (%) and the same pseudo-variable name to a base variable.
The only valid base variables are TOTAL
control variables, typically cross tabulated against the observation variable
that is the percentage variable.
Shorthand Notation for Base Marker Specification
If a percent base marker is appended to a variable, it specifies that a TOTAL
precedes the variable and that this is the base. For example, the following
expressions are equivalent:TOTAL%X THEN REGION
REGION%X
This shorthand notation can simplify the tabulate expressions. For example, both
the following statements are equivalent:
TABULATE HEADER = (TOTAL%X THEN REGION) STUB = (X BY (TOTAL%X THEN AGE))
TABULATE HEADER = (REGION%X) STUB = (X BY AGE%X)
The table output is the same in both cases. With the shorthand form, the label of the created TOTAL
category is always "TOTAL
".AGE.T
, produces three rows or columns - TOTAL,Young
and Old
. An expression such as REGION%X
, produces three categories -
TOTAL
, North
and South
. It also specifies the TOTAL
column as the percent base. The sub-expression, X.T
, is the same as TOTAL THEN X
and produces two cells for each category. Percentages with Counts
The simplest percentage tables are produced with control variables and cell contents are therefore counts. Consider the following standard table:
HEADER = (TOTAL THEN REGION) STUB = (TOTAL THEN AGE)
TOTAL North South
TOTAL..........
Young..........
Old............ 22
10
11 12
5
7
9
5
4PERCENT
clause, for example:
PERCENT = X 'Percent'
Choose the base row or column for the percentage computations. Base cells are indicated within an expression by appending a percentage symbol (%) and the pseudo-variable to the base variable. For row percentages, there must be a total column for the base or 100% column. In the example, the TOTAL
variable in the header expression is the total column. Therefore, append %X
to TOTAL
to make it the base for percentage calculations.Row Percents
TABULATE HEADER = (X BY (TOTAL%X THEN REGION)) STUB = (TOTAL THEN AGE)
PERCENT = X 'Percent'
Percent
TOTAL North South
TOTAL..........
Young..........
Old............ 100
100
100
55
50
64 41
50
36(TOTAL THEN REGION)
, is nested within X. The percent marker, %X,
specifies the header column TOTAL
as the 'set' of base cells for this table's percent calculations. Each cell value for TOTAL
is the base for the percents in its row. For example, the calculations for the first row of the
table are:TOTAL = 2200/2200 * 100 = 100
North = 1200/2200 * 100 = 55
South = 800/2200 * 100 = 41
Note: The above illustrates the effect of missing values in a control variable. Some records had missing values in REGION
and thus the columns do not reflect these records. If required, specify the
MISSING
modifier on REGION
to generate an extra column for those records.
TABULATE HEADER = (TOTAL%X THEN REGION) STUB = (X BY (TOTAL THEN AGE))
PERCENT = X 'Percent'
TOTAL North South
Percent
TOTAL..........
Young..........
Old............
100
100
100
55
50
64
41
50
36Column Percents
To display column percentages instead of row percentages, use a total row
to be the base or 100% row.%X
, from TOTAL
in the header and append it
to TOTAL
in the stub:
TABULATE HEADER = (X BY (TOTAL THEN REGION)) STUB = (TOTAL%X THEN AGE)
PERCENT = X 'Percent'
Percent
TOTAL North South
TOTAL..........
Young..........
Old............ 100
45
50
100
42
58 100
56
44X
, is nested
with all cells of the table so that all cells contain percents. In this table,
however, the percent marker, %X
, specifies the
TOTAL
row in the stub expression as the set of base cells for
this table's percent calculations. For example, the calculations for the first
column of the table are:
TOTAL = 2200/2200 * 100 = 100
Young = 600/2200 * 100 = 45
Old = 1400/2200 * 100 = 50
Base Markers in Both the Header and the Stub
To get overall or grand total percentages, place the
percent base markers in both the stub and the header. By convention, the cell
(or cells) at the intersection of the base row and column are considered the
base cells for the table.
TABULATE HEADER = (X BY (TOTAL%X THEN REGION)) STUB = (TOTAL%X THEN AGE)
PERCENT = X 'Percent'
Percent
TOTAL North South
TOTAL..........
Young..........
Old............ 100
45
50
55
23
32 41
23
18TOTAL
row and the header TOTAL
column.
Percentages with Observation Variables
Percentages of observation variables can be produced but at least one control
variable must be nested with the percent variable.Example: Percentages for Observation Variables
The following shows salary by age group:TABULATE HEADER = (SALARY BY (TOTAL THEN AGE))
Monthly salary
TOTAL Young Old
57600 27500 30100
TABULATE HEADER = (X BY SALARY BY (TOTAL%X THEN AGE))
PERCENT = X 'Percent' /
Percent
Monthly salary
TOTAL Young Old
100 48 52
(TOTAL%X THEN AGE)
to the front of the
expression, results in the same data in the table with different labeling:
TABULATE HEADER = ((TOTAL%X THEN AGE) BY X BY SALARY)
PERCENT = X 'Percent'
TOTAL Young Old
Percent Percent
Monthly salary Monthly salary Monthly salary
100 48 52
Original Values and Percents
Original values can be included as well as the percents. In the examples that
follow, all of the manipulations are performed in the header expression.
However, they could equally be done in stub or wafer expressions.
TABULATE HEADER = (REGION%X BY X.T) STUB = (AGE.T)
PERCENT = X 'Pct' /
TOTAL North South
TOTAL Pct TOTAL Pct TOTAL Pct
TOTAL......
Young......
Old........ 22
10
11 100
100
10012
5
7 55
50
64 9
5
4 41
50
36
TABULATE HEADER = (X.T BY REGION%X) STUB = (AGE.T)
PERCENT = X 'Pct' /
TOTAL Pct
TOTAL North South TOTAL North South
TOTAL.....
Young.....
Old....... 22
10
11 12
5
7 9
5
4
100
100
100 55
50
64 41
50
36 Percents for Row Totals.
TABULATE HEADER = (TOTAL THEN (X BY REGION%X)) STUB = (AGE.T)
PERCENT = X 'Pct' /
TOTAL Pct
TOTAL North South
TOTAL..........
Young..........
Old............ 22
10
11
100
100
100 55
50
64 41
50
36
TABULATE HEADER = (SALARY BY SEX%X BY X.T STUB) = (AGE.T)
PERCENT = X 'Pct'
Current monthly salary
TOTAL Male Female
TOTAL Pct TOTAL Pct TOTAL Pct
TOTAL.....
Young.....
Old....... 57600
27500
30100 100
100
100
34950
13050
21900 61
47
73 22650
14450
8200 39
53
27
TABULATE HEADER = (SALARY BY X.T BY SEX%X) STUB = (AGE.T)
PERCENT = X 'Pct'
Current monthly salary
TOTAL Pct
TOTAL Male Female TOTAL Male Female
TOTAL.....
Young.....
Old....... 57600
27500
30100
34950
13050
21900 22650
14450
8200 100
100
100 61
47
73
39
53
27
TABULATE HEADER = (SALARY BY (TOTAL THEN (X BY SEX%X))) STUB = (AGE.T)
PERCENT = X 'Pct'
Current monthly salary
TOTAL Pct
TOTAL Male Female
TOTAL.....
Young.....
Old....... 57600
27500
30100 100
100
10061
47
73 39
53
27 Percentages in Wafers
It is possible to display frequencies and percentages in different wafers of the same table. For example:
TABULATE HEADER = (REGION%X) STUB = (AGE%X) WAFER = (X.T)
PERCENT = X 'Percent'
The wafer expression is X.T
which stands for TOTAL THEN X
. Thus the table consists of two wafers, the first wafer contains totals and the second contains percentages. Each wafer is a two-dimensional table of AGE
(in the stub) by REGION
(in the header). Since the base marker, %X
, appears in both the stub and the header, the percentage wafer contains overall percentages. The following two wafers are the output of the previous tabulate statement:HEADER = (REGION%X) STUB = (AGE%X) WAFER = (X.T)
Table of Contents:
TOTAL........................................ page 1
Percent...................................... page 2
HEADER = (REGION%X) STUB = (AGE%X) WAFER = (X.T)
TOTAL..............
TOTAL North South
TOTAL..........
Young..........
Old............
22
10
11
12
5
7
9
5
4HEADER = (REGION%X) STUB = (AGE%X) WAFER = (X.T)
Percent............
TOTAL North South
TOTAL..........
Young..........
Old............ 100
45
50
55
23
32 41
23
18 Example percentage tables
The following examples illustrate some of the different ways to display percentages. The examples are all based on the
following table of frequency counts.
TABULATE HEADER = (REGION.T) STUB = (AGE.T BY SEX.T)
TOTAL North South
TOTAL
TOTAL...........
Male............
Female..........
Young
TOTAL...........
Male............
Female..........
Old
TOTAL...........
Male............
Female..........
22
13
8
10
5
5
11
8
3
12
9
3
5
3
2
7
6
1
9
4
5
5
2
3
4
2
2Base marker appended to AGE
This table contains nine base (100%) cells. The TOTAL
for
AGE
is the base cell (AGE%X
). Since
SEX.T
is nested within AGE
, there are three base cell rows,
and since the header contains three columns (REGION.T
) there is a
total of nine base cells (three columns times three rows).TABULATE HEADER = (X BY REGION.T) STUB = (AGE%X BY SEX.T) PERCENT = X 'Percent'
Percent
TOTAL North South
TOTAL
TOTAL..........
Male...........
Female.........
Young
TOTAL..........
Male...........
Female.........
Old
TOTAL..........
Male...........
Female.........100
100
100
45
38
63
50
62
38100
100
100
42
33
67
58
67
33100
100
100
56
50
60
44
50
40Base marker appended to SEX
Note that the nine base cells now occur where TOTAL
for
SEX
appears.TABULATE HEADER = (X BY REGION.T) STUB = (AGE BY SEX%X) PERCENT = X 'Percent'
Percent
TOTAL North South
Young
TOTAL..........
Male...........
Female.........
Old
TOTAL..........
Male...........
Female.........
100
50
50
100
73
27
100
60
40
100
86
14
100
40
60
100
50
50
Use of FIRST
In the previous table, base rows appeared at every occurrence of the
TOTAL
variable for SEX
. The next table uses
.FIRST
to restrict the base row to the first occurrence
of TOTAL
for SEX
.
TABULATE HEADER = (X BY REGION.T) STUB = (AGE BY SEX%X.F)
PERCENT = X 'Percent'
Percent
TOTAL North South
Young
TOTAL..........
Male...........
Female.........
Old
TOTAL.......
...
Male...........
Female.........
100
50
50
110
80
30
100
60
40
140
120
20
100
40
60
80
40
40Multiple Base Markers
The next three tables are similar to the first three with the addition of a
second base marker appended to REGION
in the header. The base
cells occur at the intersections of the base rows and columns.
TABULATE HEADER = (X BY REGION%X) STUB = (AGE%X BY SEX.T)
PERCENT = X 'Percent'
Percent
TOTAL North South
TOTAL
TOTAL..........
Male...........
Female.........
Young
TOTAL.....
.....
Male...........
Female.........
Old
TOTAL..........
Male...........
Female.........
100
100
100
45
38
63
50
62
38
55
69
38
23
23
25
32
46
13
41
31
63
23
15
38
18
15
25
TABULATE HEADER = (X BY REGION%X) STUB = (AGE BY SEX%X)
PERCENT = X 'Percent'
Percent
TOTAL North South
Young
TOTAL..........
Male...........
Female.........
Old
TOTAL.......
...
Male...........
Female.........
100
50
50
100
73
27
50
30
20
64
55
9
50
20
30
36
18
18
TABULATE HEADER = (X BY REGION%X) STUB = (AGE BY SEX%X.F)
PERCENT = X 'Percent'
Percent
TOTAL North South
Young
TOTAL..........
Male...........
Female.........
Old
TOTAL.......
...
Male...........
Female.........
100
50
50
110
80
30
50
30
20
70
60
10
50
20
30
40
20
20Multiple Percent Variables Within a Table
More than one percent distribution can be displayed in a single table by
specifying more than one percent variable. For example, the following two
variables could be specified representing row and column percentages:PERCENT= PR 'Row Pct' PC 'Col Pct' /
Since PR
represents row percentages, the base marker associated
with PR
must be in the header expression. Similarly, since
PC
, represents column percentages, the base marker associated with
PC
must be in the stub expression. In the example below, the
results are presented in two tables concatenated together. The first table
contains row percentages, the second contains column percentages.
TABULATE HEADER = ((PR + PC) BY REGION%PR) STUB = (AGE%PC)
PERCENT = PR 'Row Pct' PC 'Col Pct'
Row Pct Col Pct
TOTAL North South TOTAL North South
TOTAL.....
Young.....
Old....... 100
100
100 55
50
64
41
50
36 100
45
50 100
42
58 100
56
44Multiple percents in Rows
The next example presents the same data in a more conventional form. Notice
(PR + PC
) is moved to the stub expression.
TABULATE HEADER = (REGION%PR) STUB = (AGE%PC BY (PR + PC))
PERCENT = PR 'Row Pct' /
PERCENT = PC 'Col Pct' /
TOTAL North South
TOTAL
Row Pct.........
Col Pct.........
Young
Row Pct.........
Col Pct.........
Old
Row Pct.........
Col Pct.........
100
100
100
45
100
50
55
100
50
42
64
58
41
100
50
56
36
44Example: Overall Percents
Overall percentages can be included in the same table by adding a third percent variable, PT
. Since PT
is to represent overall percentages, append the PT
base marker to both the stub and the header variables.
TABULATE HEADER = (REGION%PR%PT) STUB = (AGE%PC%PT BY (PR + PC + PT))
PERCENT = PR 'Row Pct'
PERCENT = PC 'Col Pct'
PERCENT = PT 'Tot Pct'
TOTAL North South
TOTAL
Row Pct.........
Col Pct.........
Tot Pct.........
Young
Row Pct.........
Col Pct.........
Tot Pct.........
Old
Row
Pct.........
Col Pct.........
Tot Pct.........
100
100
100
100
45
45
100
50
50
55
100
55
50
42
23
64
58
32
41
100
41
50
56
23
36
44
18Counts and Multiple Percents
Counts can be included by adding the TOTAL
keyword:
TABULATE HEADER = (REGION%PR%PT) STUB = (AGE%PC%PT BY (TOTAL + PR + PC + PT))
TOTAL = TOTAL 'Count'
PERCENT = PR 'Row Pct'
PERCENT = PC'Col Pct'
PERCENT = PT 'Tot Pct'
TOTAL North South
TOTAL
Count...........
Row Pct.........
Col Pct.........
Tot
Pct.........
Young
Count...........
Row Pct.........
Col
Pct.........
Tot Pct.........
Old
Count...........
Row
Pct.........
Col Pct.........
Tot Pct.........
22
100
100
100
10
100
45
45
11
100
50
50
12
55
100
55
5
50
42
23
7
64
58
32
9
41
100
41
5
50
56
23
4
36
44
18Percentages in both Columns and Rows
It is not possible to produce a table which has a column of percentages on the total and a row of percentages on the same total. The solution to this is to produce a second table on the same file with a single row - the final percentages.TABULATE HEADER = (AGEGROUP.T) STUB = (GENDER.T)
However it is impossible to produce both sets of percentages in the same table. The required output is the following:
Total Young and Old Under 40 40+ % BY AGE
Total Males and Females 20 6 14 100%
Male 12 2 10 60%
Female 8 4 4 40%
% BY GENDER 100% 30% 70%
TABULATE HEADER =(TOTY THEN AGEGROUP THEN X) STUB = (TOTX%X THEN GENDER)
FILENAME = A
TOTAL = TOTX 'Total Males and Females'
PERCENT = X '% BY AGE'
TABULATE HEADER = (TOTY%Y THEN AGEGROUP) STUB = (Y)
TOTAL = TOTY 'Total Young and Old'
PERCENT = Y '% BY GENDER'
Note that the second tabulate does not have a FILE
clause and thus writes the output to the same file. This second tabulate produces one row which is the required set of final percentages. Record Filtering
BOOLEAN = (logical expression)
SAMPLE = fraction
WEIGHT = varname
BOOLEAN
| Specifies which procedure table records are used by the procedure. The procedure table records for which the logical expression is true are used by the procedure. If this option is not specified, all procedure table records are used. |
SAMPLE
|
Specifies that a random sample of the procedure table records are
used by the procedure. The fraction specifies the percent of records used and is specified as a positive decimal number less than or equal to 1 (one). .25, for example specifies that a 25% sample be used. |
WEIGHT
|
Specifies an integer procedure variable used to weight the frequency
counts and aggregations. By default, each record adds a count of 1 to frequency
counts. Specifying a weight variable, adds the value of that variable rather
than 1. For example, if WEIGHT = FAMSIZE were specified,
then, in a table of RACE by REGION , a procedure table
record would contribute a count of 5 to the RACE=1 ,
REGION=2 cell of the table if it contained the data:RACE=1 REGION=2 FAMSIZE =5 |
PRINTFORMATS = var list (options,...) .../By default, cell contents are printed as integers. The
PRINTFORMATS
option is used to alter the defaults. The options
and their abbreviations are:
COMMAS (C)
| Separates groups of 3 digits by commas. |
DECIMAL=n (n)
| Specifies the number of decimal places to be printed. A decimal point is only printed when the number of decimal places is non-zero. |
DOLLAR
(D)
| Places a floating dollar sign before the number. |
PERCENT (P)
| Places a trailing percent sign after the number. |
DOLLAR, PERCENT
and COMMAS
options are not in effect and DECIMAL
= 0 (i.e. numbers are
printed as integers).
TABULATE HEADER = (SALARY BY (TOTAL THEN (X BY SEX%X))) STUB = (AGE.T) PERCENT = X 'Pct' / PRINTFORMATS = SALARY (D,C) X (P,2) /
Current monthly salary | ||||
TOTAL | Pct | |||
TOTAL | Male | Female | ||
TOTAL.......... Young.......... Old............ |
$57,600 $27,500 $30,100 | 100.00% 100.00% 100.00% |
60.68% 47.45% 72.76% | 39.32% 52.55% 27.24% |
TOTAL
under SALARY
is printed with dollar signs and
commas. However, TOTAL
under PCT
is printed with a
percent sign and two decimal places because of the precedence rules for
print formats.
PRINTFORMAT
are not considered in the precedence analysis.
PRINTFORMATS
for percentages have the highest
precedence. If a percent variable is not assigned an explicit
PRINTFORMAT
, it is printed as an integer.
PRINTFORMAT
takes
precedence.
MEAN, MAX
, etc.)
is nested with an observation variable, the observation variable's
PRINTFORMAT
is used. When nesting variables, the
PRINTFORMAT
of the variable lower in the nesting takes
precedence.
TABULATE HEADER = (SALARY BY REGION BY (MIN THEN MAX)) MINIMUM = MIN 'Lowest' MAXIMUM = MAX 'Highest' PRINTFORMATS = SALARY (D) MIN (2)MAX (2)The salaries are listed with 2 decimal points and no dollar sign. To print the salaries with dollar signs and no decimal points the expression is:
TABULATE HEADER = (REGION BY (MIN THEN MAX) BY SALARY)
PRINTFORMAT
of the lowest variable in the nesting takes
precedence.
PRINTFORMAT
precedence, the
header expression has highest precedence followed by the stub and then the
wafer.Observation variables are displayed in a summary cell labeled with the variable label. If there is no variable label, the variable name is used.
When not enough room is provided in a table for label or title information, the
label or title is broken into two or more segments so that it can fit in the
allotted space. The
VALUE LABELS
command defines value labels
for missing values including BLANK
and
UNDEFINED
.
Page Formatting
PAGETITLE = 'string'['string'['string']]
PAGELENGTH = n
PAGEWIDTH = n
COLLAPSE
The following options alter the default formatting of the page.
PAGETITLE
| Specifies the title printed left justified at the top of the page. By default, the page title is the string "SIR/XS Tabulate Procedure" and the page number is printed at the top right-hand side of the page. The page title may consist of up to 3 lines. The first line contains the page number. The second and third lines are printed if there are wafers and/or if the wafers must be broken into chunks (i.e. the wafer does not fit on a single page). |
PAGELENGTH
|
The maximum number of lines that can appear on a page. Specifying PAGELENGTH = NOEJECT turns off paging. The default PAGELENGTH value is the current output file page length. This must include space for:the page title and page number line(s), the wafer title, wafer label, and the blank lines between, the table (wafer) itself and any WAFERNOTES specified. |
PAGEWIDTH
|
Controls the width of the printed page. By default, tables are printed centred within PAGEWIDTH print positions. The default value is the current output file page width.
|
COLLAPSE
|
If a wafer exceeds either PAGELENGTH or PAGEWIDTH , an attempt is made to break the wafer into multiple 'chunks', each printed on a separate page. The chunks appear in the output in order from left to right, then top to bottom. The wafer and stub labels appear as needed. The COLLAPSE option allows chunks of broken up
wafers to be printed on the same page. No more than 2 chunks are printed per page. The default is not to collapse chunks into a single page.
|
HEADERWIDTH = n HEADERINDENTATION = n HEADERDIVIDER = 'character' NODIVIDERS NOHEADERCENTERThe following options alter the default formatting of the column header.
HEADERWIDTH
|
Specifies the width to be used in printing each column of the table. This width does not include the column divider. The default header width is 10 print positions. If HEADERWIDTH is set too small, there may be insufficient room to print a number in the table, in which case the field is filled with X s. Also, if HEADERWIDTH is too small, there may not be enough room for the header labels to be printed and they may be severely segmented. Conversely, if HEADERWIDTH is set too large, the table header may contain too many print positions for one page. When this occurs, the page is broken into chunks. Chunks are printed on successive pages left to right, top to bottom.
|
HEADERINDENTATION
| Specifies the number of print lines for each level of the header nesting. The default is 3 and it cannot be set smaller. |
HEADERDIVIDER
| Specifies the character used as the column divider. The default is a vertical bar (|). This character is used for both the label area of the header as well as between the columns of data. |
NODIVIDERS
| Sets the divider character between columns of data to blank. It does not affect the label area of the header. |
NOHEADERCENTER
| Left justifies header labels in a column. By default header labels are centred in the column. |
STUBTITLE = option 'string' STUBWIDTH = n STUBINDENTATION = n STUBCONTINUATION = n STUBHEIGHT = n STUBFILLER = 'character' STUBDIVIDER = 'character'The following options alter the default formatting of the stub:
STUBTITLE
|
Specifies a title which appears in the boxed in area to the left of the header and directly above the stub labels. The options allowed are CENTER, LEFT and RIGHT and control the justification of the stub label. If the option is omitted, it is assumed to be CENTER . If the stub title is too long, it is automatically segmented over two or more lines. There is no default stub title.
|
STUBWIDTH
|
Specifies the number of print positions for stub labels. The default
is 20 print positions. If "too much" stub label segmentation occurs, TABULATE aborts with a WAFER/STUB OUTPUT FORMATTING
ERROR. The solution is to increase the STUBWIDTH .
|
STUBINDENTATION
| Specifies the indentation for each level of nesting. The default is 2 print positions. |
STUBCONTINUATION
| Specifies the indentation for continuation lines. A label which exceeds the stub width is segmented into two or more lines. Continuation lines are indented by the parameter. The default indentation is 3 print positions. |
STUBHEIGHT
|
Specifies the number of print lines used for each stub label. The default is 1 line. Label segmentation works independently from the STUBHEIGHT parameter and, therefore, some stub labels may occupy more than STUBHEIGHT lines.
|
STUBFILLER
| Stub labels (on lines which contain data) are filled with the character to the full width of the stub. The default stub filler is a period (.). |
STUBDIVIDER
| Specifies the horizontal divider character. The default is the dash (-). |
TABULATE STUB = (AGE BY SEX) HEADER = (REGION.T) STUBTITLE = 'Age by Sex' STUBWIDTH = 20
Age by Sex | TOTAL | North | South |
Young Male............ Female.......... Old Male............ Female.......... | 5 5 8 3 | 3 2 6 1 | 2 3 2 2 |
TABULATE STUB = (AGE BY SEX) HEADER = (REGION.T STUBFILLER = ' '
TOTAL | North | South | |
Young Male Female Old Male Female |
5 5 8 3 |
3 2 6 1 | 2 3 2 2 |
WAFERTITLE = option 'string' NOWAFERTITLE WAFERCENTER WAFERSTART = n WAFERINDENTATION = n WAFERCONTINUATION = n WAFERHEIGHT = nWhen a tabulate statement contains a wafer expression, each wafer produced contains a wafer label in the upper left corner below the wafer title. If there is no wafer expression in the tabulate statement, a wafer label is not printed.
Wafer labels are formed in the same way that header and stub labels are produced.
WAFERTITLE
|
Specifies a wafer title printed above each wafer. By default, the tabulate statement is printed as the wafer title, centred and one line above the wafer (table). The option may be CENTER, LEFT or RIGHT to position the wafer title over the wafer.
|
NOWAFERTITLE
| Suppresses printing of the wafer title. |
WAFERCENTER
| Centres the wafer on the page. By default wafers are left justified on the page. |
WAFERSTART
|
Specifies the starting print position for the wafer. Does not apply when WAFERCENTER is specified.
|
WAFERINDENTATION
| Specifies the indentation used for each level of nesting in the wafer title. The default is 2 print positions. |
WAFERCONTINUATION
| Specifies the indentation for continuation lines when a wafer label is broken up into multiple lines. The default is 3 print positions. |
WAFERHEIGHT
| Specifies the number of print lines for the wafer label. The default is 1. This height is automatically adjusted if a label does not fit in the width of the wafer title and has to be segmented. |
TABULATE
reports a WAFER/STUB
OUTPUT FORMATTING ERROR.
The solution is to increase the size of
STUBWIDTH
.
TABULATE WAFER = (AGE BY REGION) STUB = (SEX) HEADER = (RACE) Under 21 North....... Under 21 South....... 21 and Over North....... 21 and Over South.......
HTML NOBOTTOMBORDER NOLEFTBORDER NORIGHTBORDER NOTES = 'string' FOOTNOTES = 'string' WAFERNOTES = 'string' NOZEROS ZEROCHARACTER = 'character' NOEMPTYROWS DELETEMISSING UPPERCASE SPANNERS DUMMY = variable 'label' ... LANGUAGE
When output is in html format, then the print formatting, page formatting and many of the general output control options can be specified but have no effect. There is no control over the detailed appearance of the table. The general shape of the table is dictated by the table expressions and the specifics of the table appearance depend on the software used to view and/or print the resulting html.
NOBOTTOMBORDER
, NOLEFTBORDER
and
NORIGHTBORDER
options alter the production
of the left, right and bottom borders of each wafer. They can be used in
conjunction with the HEADERDIVIDER
and
STUBDIVIDER
.
NOTES
specifies text printed
at the end of the Table of Contents which is produced if there are multiple
wafers or a NOTES
option specified.
FOOTNOTES
specifies text printed on a new page following all
wafers. WAFERNOTES
specifies text printed at the end of each
wafer or chunk of a wafer.
PRINTFORMAT
statement). Specify NOZEROS
to
print zeros as dashes (-). Specify ZEROCHARACTER
to print
another character for zero cells.
NOEMPTYROWS
suppresses the printing of empty rows.
DELETEMISSING
to exclude records
where any of the cross tabulation variables have missing or undefined values in
a variable.
UPPERCASE
to convert text to uppercase. By default,
both upper and lower case are used.
SPANNERS
option to produce spanner labels for all
control variables. Specify spanner labels on a per variable basis with the
.SPAN
modifier. When the SPANNERS
option is
in effect, suppress spanner labels for an individual variable with the
.NOSPAN
modifier.
DUMMY
clause to create label information that is
not related to any specific variable. The DUMMY
clause creates
a dummy variable whose sole purpose is to carry label information into the
table. Dummy spanner labels can be used to span concatenated expressions.Example: Dummy Spanner Labels
TABULATE HEADER = (J BY (SEX THEN REGION)) DUMMY = J 'Sex and region Data'
Sex and region Data | ||||
Male | Female | North | South | |
Dummy.......... | 13 | 8 | 12 | 9 |
'segment' option 'segment'Option can be a plus (+), blank or minus(-); the different characters specify where strings can be broken across lines:
Plus (+) simply concatenates two segments; blank forces a line break between the two segments; minus (-) specifies optional segmentation as per the following rules.
After the first line segment has been formed, all the remaining characters of the original label to the right of the break character are considered next. If the remaining segment fits in the allocated space, it is printed, otherwise, the first n characters of this portion of the label (where n is the space available) are also scanned from right to left and segmented as described above. This process continues until the entire label has been printed.
TABULATE
command
are detected, during compilation and during execution. Errors detected during
compilation cause error messages to be displayed and the VisualPQL program is not
executed. Errors detected during execution cause the program to terminate.
If during compilation, a CELL CONTENTS ERROR
error occurs,
check that:
WAFER/STUB OUTPUT FORMATTING ERROR
implies that there is a problem inserting the wafer and/or stub labels within their allocated
space in each wafer. Usually there are too many levels of nesting, within the
wafer or stub, for the width of the STUBWIDTH
.
If there is not enough room on a page to print an entire row or column,
TABULATE
breaks the wafer into + "chunks". If a
BY
clause has been specified, all of the rows or columns
beneath the highest level in a nesting must fit on a single page, otherwise a
CHUNK FORMATTING ERROR
is reported. This implies that a wafer is too large to print on a single page and that it cannot
be broken at any 'clean' place to produce multiple 'chunks' of output. A wafer
can be broken into chunks only along the highest variable nested within the stub
or header. The solutions are to:
TABULATE
statement, or