KSORT$ 3R2B -- Reentrant Internal Sort

KSORT$ 3R2B -- Reentrant Internal Sort

Go to Table of Contents

1. Introduction

KSORT$ is a reentrant assembly language subroutine for sorting serial, fixed-length, memory-resident records. It consists of three relocatable elements: KSORT$ (about 300 words), which generates and executes sort loops; KSORT$R (about 116 words), which is used only for executing previously generated sort loops; and KSORT$F (about 65 words), which allows KSORT$ to be called from FTN, Fortran V, or any language able to use Fortran library routines. Ordinarily, a program using KSORT$ requires at least the first two elements; however, a MASM proc element, KSORT$P, contains several procs that enable the user to set up sort packets, sort loops, and calls to KSORT$ at assembly time. This eliminates the need for the element KSORT$, thus reducing program size by nearly 300 words. (If the Fortran entry point is used, KSORT$ will always be required.)

Up to 63 sort fields may be specified -- more if the procs are used -- although this number may be limited in higher level languages depending on the number of subroutine arguments allowed. Because KSORT$ does not preserve sequence from one call to the next, each call to KSORT$ must specify all fields where sequence is desired.

A field consists of a whole or partial word of a record and may be sorted numerically or alphanumerically, in ascending or descending order. Long alphanumeric fields would thus be represented as several consecutive whole-word fields with partial-word fields at the beginning or end if necessary. Double precision numbers could be represented as two consecutive whole-word fields, the first being numeric and the second alphanumeric (to ignore the sign bit).

Unless the MASM procs are used, KSORT$ generates a tailor-made sort loop into a user-supplied buffer each time it is called, based on sort field descriptions and other items passed by the user in a sort information packet. There is also a method for reusing a previously generated sort loop if the same or similar records are to be sorted the same way more than once. Note, however, that this execution-time generation capability allows, with only a few changes, the same buffer to be used for many different types of sorts.

KSORT$ uses the Shell sort method, in which sort keys are compared and records exchanged across a progressively narrower gap. This gradually sifts the records into place, but with fewer comparisons and exchanges than by comparing only consecutive records (bubble sort).

Registers used: X11, A0-A3, R1. Any others used are saved and restored.

KSORT$ can operate in either third- or quarter-word mode (but not both within the same call). If the fraction codes XH1, T1-T3, or Q1-Q4 are used, they will be generated and sorted correctly as long as the user is in the proper mode both when generating and when sorting. (The sort loop generation element, KSORT$, must operate in third-word mode; therefore, it switches from the user's mode on entry, if necessary, and restores it after generating the loop.)

The relocatable KSORT$ routines must be collected below 65K. There is no such restriction on the sort packet and buffer since all references to them are made using index registers, including jump instructions within the sort buffer. The sort packet, buffer, and records must all be based when KSORT$ is called. KSORT$ operates in Basic Mode.

2. MASM Call [Top][Contents]

        LA,U      A0,SRTPKT . Address of sort information packet
        LMJ       X11,KSORT$ .
        ....................
The sort packet is set up in the user's dbank as follows:
SF1     FORM      6,6,6,18
SF2     FORM      12,6,18
SRTPKT  +LENGTH,RECADR .
        +NRECDS,BUFADR .
        SF1       NFLDS,FRACTION,CODE,KEY . Description of 1st (major) field
        SF2       FRACTION,CODE,KEY . Description of 2nd field
        SF2       FRACTION,CODE,KEY . Description of 3rd field
        Etc.
        ....................

Where:

LENGTH = length in words of one record.

RECADR = address of records to be sorted.

NRECDS = number of records. Maximum = 262,143. The highest record must not exceed 0777776, based on RECADR and LENGTH*NRECDS.

BUFADR = address of work buffer having a minimum length of 36+LENGTH+10*NFLDS.

Beginning with the 3rd word of the sort packet, there must be a word describing each field in order from major through minor. Note the first field description also contains the number of fields.

NFLDS = number of sort fields (maximum = 63).

FRACTION = j-designator code or mnemonic symbol for fraction of key word to sort on (0 for whole word). Note that numeric sorts may be affected if the code implies sign extension (T1, T2, T3, XH1, or XH2).

CODE = type of sort for this field:

             0 - Ascending numeric
             1 - Ascending alphanumeric
             2 - Descending numeric
             3 - Descending alphanumeric
             071 - Masked alphanumeric ascending
             073 - Masked alphanumeric descending

When the sort code is 071 or 073, this signals to KSORT$ that the user is supplying a 36-bit mask in the next word of the sort packet; and that this mask is to be used in a masked alphanumeric sort on the key word of the current field. Key words will be ANDed with the mask before being compared. (The next sort field, if any, would be described in the next sort packet word, immediately following the mask word.) This type of sort could be used, for instance, on fields with an irregular bit length, or on fields for which a signed numeric sort would not otherwise be available (e.g., do a masked descending sort on sign bit and then masked ascending on the entire field).

When 071 or 073 is used as a sort code, the fraction code must be 0.

KEY = relative word of record to sort on for this field

Upon completion, A0 will contain a status code; and A1 and A2 will contain the elapsed and CPU SUP times of the sort in microseconds, respectively, including loop generation. The possible status codes are:

0 - Normal completion

1 - Invalid number of records, or highest record would exceed 0777776.

2 - Invalid record length

3 - One or more invalid sort codes (neither 0 thru 3, 071, nor 073. If called from Fortran, any code greater than 3 is invalid.)

4 - One or more invalid keys (<0 or >LENGTH-1)

5 - One or more invalid fraction codes (>015)

6 - reuse error: buffer either has no sort loop or has been altered. (This determination is made based on a single word in the buffer which KSORT$ 'stamps' upon normal completion.)

Once a sort loop has been generated, it may be reused any number of times for the same or similar records. In order to do this, the work buffer must not have been altered since the last call to KSORT$, and the sort fields must remain the same. The only items in the sort packet that may be changed are the number of records (NRECDS) and the starting address of the records (RECADR).

To signal KSORT$ that there is already a sort loop in the work buffer that is to be reused, zero the number of fields before calling KSORT$. Example:

        SZ        A9 . Holds record count
NXTRCD  ... (Instructions to get next record & add to table)
        AA,U      A9,1 . Increment record count
        SA,H1     A9,SRTPKT+1 . Store new rcd count in packet
        LA,U      A0,SRTPKT .
        LMJ       X11,KSORT$ . Sort records
        SZ,S1     SRTPKT+2 . Flag to reuse sort loop
(Program does other things until ready to get next record.)
        J         NXTRCD
        ....................
SRTPKT  +5,RECADR, .
        +0,BUFADR .         (No. of records initially set to 0)
        SF1       3,0,0,0 .
        SF2       0,071,4
        +000777777000 .
        SF2       H2,1,2 .
BUFFER  RES       36+5+10*3 .
RECADR  RES       10000 .

3. Proc Calls [Top][Contents]

There are three MASM procedures available when using KSORT$:

A. Generate call to KSORT$:

K$SORT srtpkt[,*x,j] .

Generates: LA,U A0,SRTPKT[,*X,J] . ('U' assumed if j omitted)

LMJ X11,KSORT$ .

B. Generate call to KSORT$R:

K$SRT srtpkt[,*x,j] .

Generates: LA,U A0,SRTPKT[,*X,J] .

LMJ X11,KSRT$ .

This proc bypasses the element KSORT$ and may be used only when reusing a buffer or when using a buffer created by a call to K$SORTP.

The packet address may be omitted from either proc if it is already in A0.

C. Generate sort packet, buffer, and sort loop:


SRTPKT       K$SORTP[,*N]  RECADR,LENGTH,NRECDS  ;
                     KEY,CODE,FRACTION-OR-MASK ...  ...


  Generates: +LENGTH,RECADR .
             +NRECDS,BUFADR .
             [FIELD DESCRIPTION WORDS] . (Unless N=1)
             [BUFFER AND SORT LOOP]    . (Unless N=2)

This proc actually generates the assembler code for the sort loop into the buffer at assembly time, so it is ready to use at execution time by calling KSRT$.

Starting with the second field of the proc call, there must be a 3-part field describing each sort field (0 fractions and codes may be omitted). When this proc is used, there is no upper limit to the number of sort fields. If the sort code for a field is 071 or 073, the third subfield contains the user's mask (without parentheses), rather than the fraction code.

If any of the mode-dependent fraction codes are used (Q1 thru Q4, T1 thru T3, or XH1), they will be treated according to the mode in which the proc is called: quarter-word mode if in ASCII, third-word if in Fieldata. To have such a fraction code be processed in the opposite mode, precede it with an asterisk. In the following example, the Q3 would be treated properly, rather than as T2:

          $FDATA    . Assembly mode = Fieldata


SRTPKT    K$SORTP TABLEB,3,300  0,1  1,071,0770077000000  2,2,*Q3

(Note that in such a case as the above, the user will still be responsible for ensuring that quarter-word mode is set before calling KSORT$ with the above packet; the '*' only ensures that the proper code is generated into the sort loop.)

An EQUF name may be used as the first item in a sort field; its j-designator value will be retrieved as though it were in the third subfield.

If the optional n after the proc name is 1, only the first three words of the sort packet will be generated (i.e., no sort field description words); and the buffer will not be padded out to its normal length (36+LENGTH+10*NFLDS) if all words are not used. This can save some space, but it may be done only if the sort packet is not going to be altered or reused during the program to generate a new sort loop. The original buffer and sort loop, of course, may be reused as often as desired; and it is always permissible to change the starting address of the records or the number of records.

If n is 2, then only the complete sort packet will be generated, including the number of fields. The sort loop will not be generated, but the buffer will be reserved. The first use of this packet, then, must be through KSORT$.

If n is preceded by an asterisk, the octal code for the generated sort loop instructions will not be listed; only the sort packet and the last word of the buffer will appear. (If n is neither 1 nor 2, code *0.)

4. Fortran Call [Top][Contents]

        DIMENSION RECADR(10,500),BUFFER(70),ISTATS(3)
        CALL KSORTF(RECADR,LENGTH,NRECDS,BUFFER,NFLDS,
       * FFC,KEYWD, FFC,KEYWD,  ISTATS)

Where:

RECADR = name of array to be sorted.

LENGTH = length in words of one record (1st dimension of a two-dimensional array, or 1 if it is one-dimensional).

NRECDS = number of records (2nd dimension of a two-dimensional array, or the single dimension if it is one-dimensional). Maximum = 262,143.

BUFFER = name of a work array having at least 38+LENGTH+11*NFLDS words.

NFLDS = number of sort fields (maximum = 63).

Following NFLDS, there must be a pair of arguments for each sort field: an FFC code and a keywd.

FFC = a 1- to 3-digit code for this field, combining the fraction and sort type codes as described for the MASM call. The c portion is the one-digit sort code (0 thru 3 -- the 071 and 073 codes are not available in Fortran). To this is added the ff portion, which is the decimal fraction code multiplied by 10 (omitted for whole word).

The following list shows the available fraction codes, their decimal values, and the portion of a word they represent:

        H1 =  2 (left half)           S1 = 13 (left sixth)
        H2 =  1 (right half)          S2 = 12 (2nd sixth)
       XH1 = *4 (left half)           S3 = 11 (3rd sixth)
       XH2 = *3 (right half)          S4 = 10 (4th sixth)
        Q1 =  7 (left quarter)        S5 =  9 (5th sixth)
        Q2 =  4 (2nd quarter)         S6 =  8 (right sixth)
        Q3 =  6 (3rd quarter)         T1 = *7 (left third)
        Q4 =  5 (right quarter)       T2 = *6 (middle third)
             * = Sign-extended        T3 = *5 (right third)

The meanings of codes 4 thru 7 depend on the type of compiler used: ASCII Fortran (FTN) will interpret them as Q1 thru Q4; Fieldata Fortran (FOR) will interpret them as T1 thru T3 and XH1.

Example: the FFC for a descending numeric sort on the 2nd sixth of a word would be 12*10 + 2 = 122.

KEYWD = absolute word of record to sort on for this field (i.e., what the first subscript would be if referencing that word in the array). Example: if the sort field occurs in tbladr(3,n) of each record, then the keywd for this field is 3. For a one-dimensional array, keywd is always 1.

ISTATS = a 3-word integer array where the status and the elapsed and cpu sort times will be stored on return, as described for MASM.

When reusing sort loops in Fortran, the work buffer must not be altered between calls, the sort field arguments must remain the same, and the only other items that may change are the number of records and starting address for the records.

There are two ways of reusing a sort loop in Fortran. The first way would apply when a repeated call to KSORTF occurs in a loop and the same call statement is used both for the initial call and later calls. To signal reuse of the loop, set the number of fields to zero after the first call. Example:

        NFLDS = 2
  C FIRST CALL
    100 CALL KSORTF(RECADR,10,1000,BUFFER,NFLDS,
       * 0,1,20,2,ISTATS)
        NFLDS = 0
        (OTHER INSTRUCTIONS)
        GO TO 100

The second way of reusing a sort loop applies in the case of a loop generated in another part of the program. This type of call is allowed only when reusing a loop. It involves using a different entry point, KSORTR, and supplying only those 4 arguments that are necessary. In the following example, the starting address of the records is also changed to begin sorting on the 6th record and leave the first five untouched:

CALL KSORTR(RECADR(1,6),NRECDS-5,BUFFER,ISTATS)

Table of Contents

(Go to Top)

1. Introduction
2. MASM Call
3. Proc Calls
4. Fortran Call