5-8  DATA COMPRESSION 
 *********************

 Data compression is a size reducing reversible transformation of data,
 used to save disk space, of course before the data is used it must be 
 decompressed.

 The best compression method is the Lempel-Ziv-Welch (LZW) algorithm 
 that is used by most compression software: UNIX/compress, 

 LZW compression is not simple, nice FORTRAN implementations can be 
 found in the FTP archive of Arne Vajhoej:

     ftp://ftp.hhs.dk/ftn/lzw.zip          (Plus a little VAX assembly)
     ftp://ftp.hhs.dk/ftn/splzw.zip        (DOS/Salford FORTRAN 77)



 RFC 959 (FTP) compression method
 -------------------------------- 
 The following method is from Postel & Reynolds RFC 959 October 1985,
 the specification of the File Transfer Protocol (FTP).


      3.4.3.  COMPRESSED MODE

         There are three kinds of information to be sent:  regular data,
         sent in a byte string; compressed data, consisting of
         replications or filler; and control information, sent in a
         two-byte escape sequence.  If n>0 bytes (up to 127) of regular
         data are sent, these n bytes are preceded by a byte with the
         left-most bit set to 0 and the right-most 7 bits containing the
         number n.

         Byte string:

             1       7                8                     8
            +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+
            |0|       n     | |    d(1)       | ... |      d(n)     |
            +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+
                                          ^             ^
                                          |---n bytes---|
                                              of data

            String of n data bytes d(1),..., d(n)
            Count n must be positive.

         To compress a string of n replications of the data byte d, the
         following 2 bytes are sent:

         Replicated Byte:

              2       6               8
            +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+
            |1 0|     n     | |       d       |
            +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+

         A string of n filler bytes can be compressed into a single
         byte, where the filler byte varies with the representation
         type.  If the type is ASCII or EBCDIC the filler byte is 
         (Space, ASCII code 32, EBCDIC code 64).  If the type is Image
         or Local byte the filler is a zero byte.

         Filler String:

              2       6
            +-+-+-+-+-+-+-+-+
            |1 1|     n     |
            +-+-+-+-+-+-+-+-+

         The escape sequence is a double byte, the first of which is the
         escape byte (all zeros) and the second of which contains
         descriptor codes as defined in Block mode.  The descriptor
         codes have the same meaning as in Block mode and apply to the
         succeeding string of bytes.

         Compressed mode is useful for obtaining increased bandwidth on
         very large network transmissions at a little extra CPU cost.
         It can be most effectively used to reduce the size of printer
         files such as those generated by RJE hosts.

Return to contents page