SDFCOMP 2R9D -- SDF Element/File Compressor



1. Quick Reference: Compressing/Expanding for 2200-2200 Transfer

Needed: the absolute element SDFEXP on both ends, and the absolute element SDFCOMP on the host doing the compressing. (SDFCOMP can both compress and expand; SDFEXP is a smaller absolute that can only expand.)

When compressing, the G option yields the most compression but takes longer. The K option (or the absence of G) is quicker.


1.1. Single Source Element

Source host: @SDFCOMP,GST file.elt,CATFILE.

FTP CATFILE down to workstation.

If desired, PKZIP CATFILE for further compression.

FTP CATFILE to destination workstation.

Destination: PKUNZIP CATFILE if it was ZIPped.

FTP CATFILE up to 2200 host.

@SDFEXP CATFILE.,file.elt


1.2. Program File of Source Elements

Source host: @SDFCOMP,BGST file.,CATFILE.

(Use K instead of G for quicker compression.)

FTP CATFILE down to workstation.

If desired, PKZIP CATFILE for further compression.

FTP CATFILE to destination workstation.

Destination: PKUNZIP CATFILE if it was ZIPped.

FTP CATFILE up to 2200 host.

@SDFEXP CATFILE.,file.

(Either the destination site must already have SDFEXP, or SDFEXP must be included in a separate file. See also the section on the SRCEXP Source Expander.)

(See the V option for an example of how to split program files into symbolic and non-symbolic elements for the most efficient compression.)


1.3. Mixed Program or Data File

Source host: @SDFCOMP,VGST file.,CATFILE.

(Use K instead of G for quicker compression.)

FTP CATFILE down to workstation.

If desired, PKZIP CATFILE for further compression.

FTP CATFILE to destination workstation.

Destination: PKUNZIP CATFILE if it was ZIPped.

FTP CATFILE up to 2200 host.

@SDFEXP CATFILE.,file.


2. Introduction

2.1. Purpose

SDFCOMP is a processor for compressing and expanding source elements, SDF files, or any other Unisys files. Together with a modified SDFI, it forms a package which can allow large amounts of source code to be maintained in compressed format, yet still be accessible to system compilers and processors such as FURPUR, MASM, SSG, and ED. There is also an auxiliary processor, SDFEXP, that may be used solely for expanding previously compressed input.

SDFCOMP is supported on all current Unisys 2200 systems. It uses SYSLIB level 77R1.


2.2. Methods

SDFCOMP can compress input data in one of two ways: character mode or string mode. Character mode is faster and requires less memory when executing, but string mode typically produces output about 60-70% the size of what character mode would produce. The two methods are described below.


2.2.1. Character Mode

SDFCOMP makes two passes over the input. On the first pass, it scans each image, noting which characters are used and how often, and how frequently each leading-blank count occurs. It then sorts the character table and creates up to six character sets that will be used when compressing. If the input is a print file, the frequency of each line-spacing value is also recorded.
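The first pass can be pictured with a short Python sketch. It is illustrative only and not SDFCOMP's actual code: the function name survey, the set size of 28 characters, and the tie-breaking rule are all assumptions made for the example. Only the overall shape (count, sort by frequency, split into at most six sets) comes from the description above.

```python
from collections import Counter

def survey(images, set_size=28, max_sets=6):
    """Count characters and leading blanks, then split the ranked
    characters into at most max_sets character sets.

    set_size is a hypothetical value; the real set size is not
    stated in this document.
    """
    char_freq = Counter()
    lead_freq = Counter()
    for image in images:
        stripped = image.lstrip(" ")
        # Frequency of each leading-blank count
        lead_freq[len(image) - len(stripped)] += 1
        # Frequency of each character after the leading blanks
        char_freq.update(stripped)
    # Most frequent characters first; ties broken by character value
    ranked = sorted(char_freq, key=lambda c: (-char_freq[c], c))
    sets = [ranked[i:i + set_size] for i in range(0, len(ranked), set_size)]
    return sets[:max_sets], lead_freq
```

Putting the most frequent characters in the first set means the compressor seldom has to switch sets, which is presumably why the table is sorted before the sets are built.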

On the second pass, each character is converted to a five-bit byte from one of the character sets. Characters in the range of 0200-0777 are converted to two such bytes. These compressed bytes are put into large SDF records such that each bit in a word is used, and the records are written to the output. Control codes are used to signal switching between character sets, the start of new images (including line spacing and leading blank count, if any), and repeated characters.
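The bit packing in the second pass can be sketched as follows. This is a minimal illustration, assuming the 36-bit word of 2200-series systems and zero padding of the final word; the function name pack_codes is hypothetical, and control codes and the two-byte form for characters 0200-0777 are left out.

```python
WORD_BITS = 36  # assumption: 2200-series word size
CODE_BITS = 5   # five-bit bytes, as described above

def pack_codes(codes):
    """Pack 5-bit codes back to back into 36-bit words.

    Codes may straddle a word boundary, so no bits are wasted
    except zero padding at the end of the last word.
    """
    mask = (1 << WORD_BITS) - 1
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently in acc
    words = []
    for code in codes:
        if not 0 <= code < (1 << CODE_BITS):
            raise ValueError("code does not fit in five bits")
        acc = (acc << CODE_BITS) | code
        nbits += CODE_BITS
        while nbits >= WORD_BITS:
            nbits -= WORD_BITS
            words.append((acc >> nbits) & mask)
            acc &= (1 << nbits) - 1
    if nbits:
        # Left-justify the remaining bits in a final word
        words.append(acc << (WORD_BITS - nbits))
    return words
```

Since 36 is not a multiple of 5, seven codes fill 35 bits and the eighth code is split across two words; this is what using every bit in a word implies.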


2.2.2. String Mode

Only one pass is made over the input. Each image is scanned as if in character mode; in addition, strings of consecutive non-blank characters on the image are surveyed. The character types eligible to make up strings are determined by the subtype of the input element, the 3-letter subtype mnemonic coded on the processor call (MSM, COB, FOR, ...), an R option input image, or a default. Both strings consisting of these characters and "exception" strings made up of non-member characters are extracted. Frequently used tab columns (those whose occurrence count is at least about 8% of the total input image count) and strings followed by trailing periods, commas, or spaces are also recorded. Finally, the image currently being scanned is saved in an intermediate coded form in a scratch file.
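The tab-column test can be sketched in Python. This is a rough illustration under stated assumptions: a "tab column" is taken here to be the 1-based column at which a run of non-blank characters starts, and the function name frequent_tab_columns is hypothetical. Only the threshold of about 8% of the image count comes from the text above.

```python
import re
from collections import Counter

def frequent_tab_columns(images, threshold=0.08):
    """Return the start columns used by at least threshold * len(images)
    non-blank strings, in ascending order."""
    starts = Counter()
    for image in images:
        for match in re.finditer(r"\S+", image):
            starts[match.start() + 1] += 1  # 1-based column number
    cutoff = threshold * len(images)
    return sorted(col for col, n in starts.items() if n >= cutoff)
```

For example, with 95 images of the form "LABEL   OP" and 5 images of the form "  X", columns 1 and 9 qualify and column 3 does not.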

The strings themselves are saved in different DBANKs according to their length. There is a maximum of about 250000 strings; any more will cause an overflow. (Type "@SDFCOMP,HELP" to show the exact number.) If overflow occurs during bulk compression in nonbreakpointed demand, you will be asked if you want to recover to the last whole element scanned. A "Yes" reply may be assumed by calling SORTOC with the J option.

After the entire input has been processed, all accumulated strings are sorted by descending frequency of occurrence. Strings used at least twice are selected for the dictionary and assigned codes of one to four 5-bit bytes, with the most frequent strings receiving the shortest codes. The dictionary is written to the output as a single long image consisting of each eligible string, compressed as though in character mode. As with the number of strings, there is also an upper limit on the dictionary size, and overflow can occur.
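Dictionary selection can be sketched as below. The sketch assigns each eligible string a rank rather than an actual 1- to 4-byte code, because the exact code layout is not given in this document; the function name build_dictionary and the alphabetical tie-break are assumptions for illustration.

```python
from collections import Counter

def build_dictionary(strings):
    """Map each string used at least twice to a rank, where rank 0
    is the most frequent string (and would get the shortest code)."""
    counts = Counter(strings)
    eligible = [s for s, n in counts.items() if n >= 2]
    # Descending frequency; ties broken alphabetically for determinism
    eligible.sort(key=lambda s: (-counts[s], s))
    return {s: rank for rank, s in enumerate(eligible)}
```

Given the strings MOVE, MOVE, ADD, MOVE, ADD, ONCE, the result maps MOVE to rank 0 and ADD to rank 1, and ONCE is left out because it occurs only once.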

Finally, the coded input is retrieved and interpreted: each image is written as a series of 1- to 4-byte codes representing the strings which comprise it. SDFCOMP toggles back and forth between single-string and dictionary mode as needed to compress those strings which only occurred once.
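The toggling between dictionary and single-string mode can be pictured with this sketch. It emits symbolic tuples instead of real 5-bit codes; the tuple names, the function name encode_image, and the choice to start each image in dictionary mode are all assumptions, not details taken from SDFCOMP.

```python
def encode_image(strings, dictionary):
    """Encode one image as a list of symbolic tokens, inserting a
    switch token whenever the mode changes."""
    out = []
    mode = "dict"  # assumption: each image starts in dictionary mode
    for s in strings:
        wanted = "dict" if s in dictionary else "single"
        if wanted != mode:
            out.append(("switch", wanted))
            mode = wanted
        if wanted == "dict":
            out.append(("code", dictionary[s]))
        else:
            # A string that occurred only once is carried literally
            out.append(("literal", s))
    return out
```

With a dictionary of {"MOVE": 0, "ADD": 1}, the image MOVE X1 ADD becomes a code, a switch to single-string mode, the literal X1, a switch back, and a second code.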


2.2.3. General

At the head of the output is a 63-word SDF label image containing the character sets and other tables and flags used for this input.

When creating an output compressed element, SDFCOMP sets flag bit 27 in the element's table entry (the same one set for absolutes mapped with TYPE BLOCKSIZE64). This bit can be used for testing in such cases as examining a program file entry or the @PRT,TL listing of a modified FURPUR, where an element's type could be shown as: 'MSM SYMB -C'


Table of Contents

1. Quick Reference: Compressing/Expanding for 2200-2200 Transfer
1.1. Single Source Element
1.2. Program File of Source Elements
1.3. Mixed Program or Data File
2. Introduction
2.1. Purpose
2.2. Methods
2.2.1. Character Mode
2.2.2. String Mode
2.2.3. General