Merging upstream version 1.14.

Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 21:33:45 +01:00
commit 981b7e2738
parent a5f6fd65d6
29 changed files with 744 additions and 652 deletions
--- a/doc/lzlib.info
+++ b/doc/lzlib.info
@@ -11,7 +11,7 @@ File: lzlib.info,  Node: Top,  Next: Introduction,  Up: (dir)
 Lzlib Manual
 ************

-This manual is for Lzlib (version 1.13, 23 January 2022).
+This manual is for Lzlib (version 1.14, 20 January 2024).

 * Menu:

@@ -23,14 +23,14 @@ This manual is for Lzlib (version 1.13, 23 January 2022).
 * Decompression functions::  Descriptions of the decompression functions
 * Error codes::              Meaning of codes returned by functions
 * Error messages::           Error messages corresponding to error codes
-* Invoking minilzip::        Command line interface of the test program
+* Invoking minilzip::        Command-line interface of the test program
 * Data format::              Detailed format of the compressed data
 * Examples::                 A small tutorial with examples
 * Problems::                 Reporting bugs
 * Concept index::            Index of concepts


-   Copyright (C) 2009-2022 Antonio Diaz Diaz.
+   Copyright (C) 2009-2024 Antonio Diaz Diaz.

   This manual is free documentation: you have unlimited permission to copy,
 distribute, and modify it.
@@ -76,6 +76,13 @@ library are declared in the file 'lzlib.h'. Usage examples of the library
 are given in the files 'bbexample.c', 'ffexample.c', and 'minilzip.c' from
 the source distribution.

+   As 'lzlib.h' can be used by C and C++ programs, it must not impose a
+choice of system headers on the program by including one of them. Therefore
+it is the responsibility of the program using lzlib to include before
+'lzlib.h' some header that declares the type 'uint8_t'. There are at least
+four such headers in C and C++: 'stdint.h', 'cstdint', 'inttypes.h', and
+'cinttypes'.
+
   All the library functions are thread safe. The library does not install
 any signal handler. The decoder checks the consistency of the compressed
 data, so the library should never crash even in case of corrupted input.
@@ -86,21 +93,21 @@ This interface is safer and less error prone than the traditional zlib
 interface.

   Compression/decompression is done when the read function is called. This
-means the value returned by the position functions will not be updated until
-a read call, even if a lot of data are written. If you want the data to be
+means the value returned by the position functions is not updated until a
+read call, even if a lot of data are written. If you want the data to be
 compressed in advance, just call the read function with a SIZE equal to 0.

-   If all the data to be compressed are written in advance, lzlib will
-automatically adjust the header of the compressed data to use the largest
+   If all the data to be compressed are written in advance, lzlib
+automatically adjusts the header of the compressed data to use the largest
 dictionary size that does not exceed neither the data size nor the limit
 given to 'LZ_compress_open'. This feature reduces the amount of memory
-needed for decompression and allows minilzip to produce identical compressed
-output as lzip.
+needed for decompression and allows minilzip to produce identical
+compressed output as lzip.

-   Lzlib will correctly decompress a data stream which is the concatenation
-of two or more compressed data streams. The result is the concatenation of
-the corresponding decompressed data streams. Integrity testing of
-concatenated compressed data streams is also supported.
+   Lzlib correctly decompresses a data stream which is the concatenation of
+two or more compressed data streams. The result is the concatenation of the
+corresponding decompressed data streams. Integrity testing of concatenated
+compressed data streams is also supported.

   Lzlib is able to compress and decompress streams of unlimited size by
 automatically creating multimember output. The members so created are large,
@@ -111,22 +118,22 @@ concrete algorithm; it is more like "any algorithm using the LZMA coding
 scheme". For example, the option '-0' of lzip uses the scheme in almost the
 simplest way possible; issuing the longest match it can find, or a literal
 byte if it can't find a match. Inversely, a much more elaborated way of
-finding coding sequences of minimum size than the one currently used by lzip
-could be developed, and the resulting sequence could also be coded using the
-LZMA coding scheme.
+finding coding sequences of minimum size than the one currently used by
+lzip could be developed, and the resulting sequence could also be coded
+using the LZMA coding scheme.

   Lzlib currently implements two variants of the LZMA algorithm: fast
 (used by option '-0' of minilzip) and normal (used by all other compression
 levels).

   The high compression of LZMA comes from combining two basic, well-proven
-compression ideas: sliding dictionaries (LZ77/78) and markov models (the
-thing used by every compression algorithm that uses a range encoder or
-similar order-0 entropy coder as its last stage) with segregation of
-contexts according to what the bits are used for.
+compression ideas: sliding dictionaries (LZ77) and markov models (the thing
+used by every compression algorithm that uses a range encoder or similar
+order-0 entropy coder as its last stage) with segregation of contexts
+according to what the bits are used for.

   The ideas embodied in lzlib are due to (at least) the following people:
-Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for the
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
 definition of Markov chains), G.N.N. Martin (for the definition of range
 encoding), Igor Pavlov (for putting all the above together in LZMA), and
 Julian Seward (for bzip2's CLI).
@@ -150,7 +157,7 @@ of them are declared in 'lzlib.h'.

 -- Constant: LZ_API_VERSION
     This constant is defined in 'lzlib.h' and works as a version test
-     macro. The application should verify at compile time that
+     macro. The application should check at compile time that
     LZ_API_VERSION is greater than or equal to the version required by the
     application:

@@ -170,12 +177,13 @@ desire to have certain symbols and prototypes exposed.
 -- Function: int LZ_api_version ( void )
     If LZ_API_VERSION >= 1012, this function is declared in 'lzlib.h' (else
     it doesn't exist). It returns the LZ_API_VERSION of the library object
-     code being used. The application should verify at run time that the
+     code being used. The application should check at run time that the
     value returned by 'LZ_api_version' is greater than or equal to the
-     version required by the application. An application may be dinamically
+     version required by the application. An application may be dynamically
     linked at run time with a different version of lzlib than the one it
-     was compiled for, and this should not break the program as long as the
-     library used provides the functionality required by the application.
+     was compiled for, and this should not break the application as long as
+     the library used provides the functionality required by the
+     application.

          #if defined LZ_API_VERSION && LZ_API_VERSION >= 1012
            if( LZ_api_version() < 1012 )
@@ -258,7 +266,7 @@ File: lzlib.info,  Node: Compression functions,  Next: Decompression functions,

 These are the functions used to compress data. In case of error, all of
 them return -1 or 0, for signed and unsigned return values respectively,
-except 'LZ_compress_open' whose return value must be verified by calling
+except 'LZ_compress_open' whose return value must be checked by calling
 'LZ_compress_errno' before using it.

 -- Function: struct LZ_Encoder * LZ_compress_open ( const int
@@ -269,7 +277,7 @@ except 'LZ_compress_open' whose return value must be verified by calling
     LZ_compress functions, or a null pointer if the encoder could not be
     allocated.

-     The returned pointer must be verified by calling 'LZ_compress_errno'
+     The returned pointer must be checked by calling 'LZ_compress_errno'
     before using it. If 'LZ_compress_errno' does not return 'LZ_ok', the
     returned pointer must not be used and should be freed with
     'LZ_compress_close' to avoid memory leaks.
@@ -277,8 +285,8 @@ except 'LZ_compress_open' whose return value must be verified by calling
     DICTIONARY_SIZE sets the dictionary size to be used, in bytes. Valid
     values range from 4 KiB to 512 MiB. Note that dictionary sizes are
     quantized. If the size specified does not match one of the valid
-     sizes, it will be rounded upwards by adding up to
-     (DICTIONARY_SIZE / 8) to it.
+     sizes, it is rounded upwards by adding up to (DICTIONARY_SIZE / 8) to
+     it.

     MATCH_LEN_LIMIT sets the match length limit in bytes. Valid values
     range from 5 to 273. Larger values usually give better compression
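
The quantization of DICTIONARY_SIZE described in this hunk can be sketched
as follows. This is a minimal illustration, not lzlib's own code. It assumes
that the valid sizes are a power of two (2^12 to 2^29) minus zero to seven
sixteenths of that power, as the manual's 'coded-dict-size' section (not
shown in this diff) describes. 'round_dictionary_size' is a hypothetical
helper name.

```c
#include <assert.h>

enum { min_dict_size = 1 << 12, max_dict_size = 1 << 29 };

/* Hypothetical helper (not part of lzlib): returns the smallest valid
   dictionary size >= sz, or 0 if sz is outside 4 KiB .. 512 MiB.
   Assumes valid sizes are 2^n - k * 2^n / 16, with k in 0..7. */
unsigned round_dictionary_size( const unsigned sz )
  {
  unsigned base = min_dict_size;
  int i;
  if( sz < min_dict_size || sz > max_dict_size ) return 0;
  while( base < sz ) base <<= 1;	/* smallest power of two >= sz */
  assert( base >= sz && base <= max_dict_size );
  if( base == min_dict_size ) return base;
  for( i = 7; i >= 1; --i )		/* try the largest reduction first */
    if( base - i * ( base / 16 ) >= sz ) return base - i * ( base / 16 );
  return base;
  }
```

Under these assumptions, a request of 65535 bytes yields 65536 (64 KiB),
which agrees with the rounding the manual describes for the fast variant,
and a request of 5000 bytes yields 5120.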
@@ -286,15 +294,14 @@ except 'LZ_compress_open' whose return value must be verified by calling

     If DICTIONARY_SIZE is 65535 and MATCH_LEN_LIMIT is 16, the fast
     variant of LZMA is chosen, which produces identical compressed output
-     as 'lzip -0'. (The dictionary size used will be rounded upwards to
-     64 KiB).
+     as 'lzip -0'. (The dictionary size used is rounded upwards to 64 KiB).

     MEMBER_SIZE sets the member size limit in bytes. Valid values range
     from 4 KiB to 2 PiB. A small member size may degrade compression
     ratio, so use it only when needed. To produce a single-member data
     stream, give MEMBER_SIZE a value larger than the amount of data to be
-     produced. Values larger than 2 PiB will be reduced to 2 PiB to prevent
-     the uncompressed size of the member from overflowing.
+     produced. Values larger than 2 PiB are reduced to 2 PiB to prevent the
+     uncompressed size of the member from overflowing.

 -- Function: int LZ_compress_close ( struct LZ_Encoder * const ENCODER )
     Frees all dynamically allocated data structures for this stream. This
@@ -420,7 +427,7 @@ File: lzlib.info,  Node: Decompression functions,  Next: Error codes,  Prev: Com

 These are the functions used to decompress data. In case of error, all of
 them return -1 or 0, for signed and unsigned return values respectively,
-except 'LZ_decompress_open' whose return value must be verified by calling
+except 'LZ_decompress_open' whose return value must be checked by calling
 'LZ_decompress_errno' before using it.

 -- Function: struct LZ_Decoder * LZ_decompress_open ( void )
@@ -429,7 +436,7 @@ except 'LZ_decompress_open' whose return value must be verified by calling
     LZ_decompress functions, or a null pointer if the decoder could not be
     allocated.

-     The returned pointer must be verified by calling 'LZ_decompress_errno'
+     The returned pointer must be checked by calling 'LZ_decompress_errno'
     before using it. If 'LZ_decompress_errno' does not return 'LZ_ok', the
     returned pointer must not be used and should be freed with
     'LZ_decompress_close' to avoid memory leaks.
@@ -459,13 +466,13 @@ except 'LZ_decompress_open' whose return value must be verified by calling
     Resets the error state of DECODER and enters a search state that lasts
     until a new member header (or the end of the stream) is found. After a
     successful call to 'LZ_decompress_sync_to_member', data written with
-     'LZ_decompress_write' will be consumed and 'LZ_decompress_read' will
-     return 0 until a header is found.
+     'LZ_decompress_write' is consumed and 'LZ_decompress_read' returns 0
+     until a header is found.

-     This function is useful to discard any data preceding the first member,
-     or to discard the rest of the current member, for example in case of a
-     data error. If the decoder is already at the beginning of a member,
-     this function does nothing.
+     This function is useful to discard any data preceding the first
+     member, or to discard the rest of the current member, for example in
+     case of a data error. If the decoder is already at the beginning of a
+     member, this function does nothing.

 -- Function: int LZ_decompress_read ( struct LZ_Decoder * const DECODER,
          uint8_t * const BUFFER, const int SIZE )
@@ -571,7 +578,7 @@ File: lzlib.info,  Node: Error codes,  Next: Error messages,  Prev: Decompressio

 Most library functions return -1 to indicate that they have failed. But
 this return value only tells you that an error has occurred. To find out
-what kind of error it was, you need to verify the error code by calling
+what kind of error it was, you need to check the error code by calling
 'LZ_(de)compress_errno'.

   Library functions don't change the value returned by
@@ -639,19 +646,20 @@ File: lzlib.info,  Node: Invoking minilzip,  Next: Data format,  Prev: Error mes
 9 Invoking minilzip
 *******************

-Minilzip is a test program for the compression library lzlib, fully
-compatible with lzip 1.4 or newer.
+Minilzip is a test program for the compression library lzlib, compatible
+with lzip 1.4 or newer.

   Lzip is a lossless data compressor with a user interface similar to the
 one of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
-chain-Algorithm' (LZMA) stream format and provides a 3 factor integrity
-checking to maximize interoperability and optimize safety. Lzip can compress
-about as fast as gzip (lzip -0) or compress most files more than bzip2
-(lzip -9). Decompression speed is intermediate between gzip and bzip2. Lzip
-is better than gzip and bzip2 from a data recovery perspective. Lzip has
-been designed, written, and tested with great care to replace gzip and
-bzip2 as the standard general-purpose compressed format for unix-like
-systems.
+chain-Algorithm' (LZMA) stream format to maximize interoperability. The
+maximum dictionary size is 512 MiB so that any lzip file can be decompressed
+on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
+checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
+files more than bzip2 (lzip -9). Decompression speed is intermediate between
+gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
+perspective. Lzip has been designed, written, and tested with great care to
+replace gzip and bzip2 as the standard general-purpose compressed format for
+Unix-like systems.

 The format for running minilzip is:

@@ -660,7 +668,8 @@ The format for running minilzip is:
 If no file names are specified, minilzip compresses (or decompresses) from
 standard input to standard output. A hyphen '-' used as a FILE argument
 means standard input. It can be mixed with other FILES and is read just
-once, the first time it appears in the command line.
+once, the first time it appears in the command line. Remember to prepend
+'./' to any file name beginning with a hyphen, or use '--'.

   minilzip supports the following options: *Note Argument syntax:
 (arg_parser)Argument syntax.
@@ -696,17 +705,18 @@ once, the first time it appears in the command line.
     members). This option (or '-o') is needed when reading from a named
     pipe (fifo) or from a device. Use it also to recover as much of the
     decompressed data as possible when decompressing a corrupt file. '-c'
-     overrides '-o' and '-S'. '-c' has no effect when testing or listing.
+     overrides '-o' and '-S'. '-c' has no effect when testing.

 '-d'
 '--decompress'
-     Decompress the files specified. If a file does not exist, can't be
-     opened, or the destination file already exists and '--force' has not
-     been specified, minilzip continues decompressing the rest of the files
-     and exits with error status 1. If a file fails to decompress, or is a
-     terminal, minilzip exits immediately with error status 2 without
-     decompressing the rest of the files. A terminal is considered an
-     uncompressed file, and therefore invalid.
+     Decompress the files specified. The integrity of the files specified is
+     checked. If a file does not exist, can't be opened, or the destination
+     file already exists and '--force' has not been specified, minilzip
+     continues decompressing the rest of the files and exits with error
+     status 1. If a file fails to decompress, or is a terminal, minilzip
+     exits immediately with error status 2 without decompressing the rest
+     of the files. A terminal is considered an uncompressed file, and
+     therefore invalid.

 '-f'
 '--force'
@@ -725,17 +735,17 @@ once, the first time it appears in the command line.
 '--match-length=BYTES'
     When compressing, set the match length limit in bytes. After a match
     this long is found, the search is finished. Valid values range from 5
-     to 273. Larger values usually give better compression ratios but longer
-     compression times.
+     to 273. Larger values usually give better compression ratios but
+     longer compression times.

 '-o FILE'
 '--output=FILE'
-     If '-c' has not been also specified, write the (de)compressed output to
-     FILE; keep input files unchanged. If compressing several files, each
-     file is compressed independently. (The output consists of a sequence of
-     independently compressed members). This option (or '-c') is needed when
-     reading from a named pipe (fifo) or from a device. '-o -' is
-     equivalent to '-c'. '-o' has no effect when testing or listing.
+     If '-c' has not been also specified, write the (de)compressed output
+     to FILE; keep input files unchanged. If compressing several files,
+     each file is compressed independently. (The output consists of a
+     sequence of independently compressed members). This option (or '-c')
+     is needed when reading from a named pipe (fifo) or from a device.
+     '-o -' is equivalent to '-c'. '-o' has no effect when testing.

     When compressing and splitting the output in volumes, FILE is used as
     a prefix, and several files named 'FILE00001.lz', 'FILE00002.lz', etc,
@@ -748,13 +758,13 @@ once, the first time it appears in the command line.
 '-s BYTES'
 '--dictionary-size=BYTES'
     When compressing, set the dictionary size limit in bytes. Minilzip
-     will use for each file the largest dictionary size that does not
-     exceed neither the file size nor this limit. Valid values range from
-     4 KiB to 512 MiB. Values 12 to 29 are interpreted as powers of two,
-     meaning 2^12 to 2^29 bytes. Dictionary sizes are quantized so that
-     they can be coded in just one byte (*note coded-dict-size::). If the
-     size specified does not match one of the valid sizes, it will be
-     rounded upwards by adding up to (BYTES / 8) to it.
+     uses for each file the largest dictionary size that does not exceed
+     neither the file size nor this limit. Valid values range from 4 KiB to
+     512 MiB. Values 12 to 29 are interpreted as powers of two, meaning
+     2^12 to 2^29 bytes. Dictionary sizes are quantized so that they can be
+     coded in just one byte (*note coded-dict-size::). If the size
+     specified does not match one of the valid sizes, it is rounded upwards
+     by adding up to (BYTES / 8) to it.

     For maximum compression you should use a dictionary size limit as large
     as possible, but keep in mind that the decompression memory requirement
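
The way the argument of '-s' is interpreted in this hunk can be sketched as
follows. This is an illustration only, not minilzip's code. Both helper
names are hypothetical, quantization of the result is ignored, and the
4 KiB floor applied to small files is an assumption drawn from the stated
valid range.

```c
#include <assert.h>

enum { min_dict_bits = 12, max_dict_bits = 29 };

/* Hypothetical helper: maps the numeric argument of '-s' to a size limit
   in bytes. Values 12 to 29 mean 2^12 to 2^29. Returns 0 if the argument
   is outside both accepted ranges. */
unsigned long dict_limit_from_arg( const unsigned long arg )
  {
  if( arg >= min_dict_bits && arg <= max_dict_bits ) return 1UL << arg;
  if( arg >= ( 1UL << min_dict_bits ) && arg <= ( 1UL << max_dict_bits ) )
    return arg;
  return 0;
  }

/* Hypothetical helper: largest size that exceeds neither the file size
   nor the limit, but is never below the 4 KiB minimum (an assumption).
   Quantization is ignored here. */
unsigned long dict_size_for_file( const unsigned long limit,
                                  const unsigned long long file_size )
  {
  assert( limit >= ( 1UL << min_dict_bits ) &&
          limit <= ( 1UL << max_dict_bits ) );
  if( file_size >= limit ) return limit;
  if( file_size < ( 1UL << min_dict_bits ) ) return 1UL << min_dict_bits;
  return (unsigned long)file_size;
  }
```

With these helpers, '-s 16' and '-s 65536' give the same 64 KiB limit, and
a 100000-byte file compressed with an 8 MiB limit would use a dictionary of
about 100000 bytes before quantization.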
@@ -776,7 +786,7 @@ once, the first time it appears in the command line.
     really performs a trial decompression and throws away the result. Use
     it together with '-v' to see information about the files. If a file
     fails the test, does not exist, can't be opened, or is a terminal,
-     minilzip continues checking the rest of the files. A final diagnostic
+     minilzip continues testing the rest of the files. A final diagnostic
     is shown at verbosity level 1 or higher if any file fails the test
     when testing multiple files.

@@ -839,26 +849,29 @@ once, the first time it appears in the command line.
     defined). *Note Library version::.


-   Numbers given as arguments to options may be followed by a multiplier
-and an optional 'B' for "byte".
+   Numbers given as arguments to options may be expressed in decimal,
+hexadecimal, or octal (using the same syntax as integer constants in C++),
+and may be followed by a multiplier and an optional 'B' for "byte".

   Table of SI and binary prefixes (unit multipliers):

-Prefix   Value                     |   Prefix   Value
-k        kilobyte  (10^3 = 1000)   |   Ki       kibibyte (2^10 = 1024)
-M        megabyte  (10^6)          |   Mi       mebibyte (2^20)
-G        gigabyte  (10^9)          |   Gi       gibibyte (2^30)
-T        terabyte  (10^12)         |   Ti       tebibyte (2^40)
-P        petabyte  (10^15)         |   Pi       pebibyte (2^50)
-E        exabyte   (10^18)         |   Ei       exbibyte (2^60)
-Z        zettabyte (10^21)         |   Zi       zebibyte (2^70)
-Y        yottabyte (10^24)         |   Yi       yobibyte (2^80)
+Prefix   Value                      |   Prefix   Value
+k        kilobyte   (10^3 = 1000)   |   Ki       kibibyte  (2^10 = 1024)
+M        megabyte   (10^6)          |   Mi       mebibyte  (2^20)
+G        gigabyte   (10^9)          |   Gi       gibibyte  (2^30)
+T        terabyte   (10^12)         |   Ti       tebibyte  (2^40)
+P        petabyte   (10^15)         |   Pi       pebibyte  (2^50)
+E        exabyte    (10^18)         |   Ei       exbibyte  (2^60)
+Z        zettabyte  (10^21)         |   Zi       zebibyte  (2^70)
+Y        yottabyte  (10^24)         |   Yi       yobibyte  (2^80)
+R        ronnabyte  (10^27)         |   Ri       robibyte  (2^90)
+Q        quettabyte (10^30)         |   Qi       quebibyte (2^100)


   Exit status: 0 for a normal exit, 1 for environmental problems (file not
-found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid
-input file, 3 for an internal consistency error (e.g., bug) which caused
-minilzip to panic.
+found, invalid command-line options, I/O errors, etc), 2 to indicate a
+corrupt or invalid input file, 3 for an internal consistency error (e.g.,
+bug) which caused minilzip to panic.


 File: lzlib.info,  Node: Data format,  Next: Examples,  Prev: Invoking minilzip,  Up: Top
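
The number syntax added in this hunk (decimal, hexadecimal, or octal,
followed by an optional multiplier and an optional 'B') can be sketched
with the standard function 'strtoull' and base 0. This is an illustration,
not minilzip's parser. 'parse_size' is a hypothetical name, and the sketch
rejects the prefixes Z, Y, R, and Q because their values do not fit in a
64-bit integer.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses NUMBER[MULTIPLIER][B]. Returns 0 and stores
   the value in *result on success, or -1 on syntax error or overflow. */
int parse_size( const char * const arg, unsigned long long * const result )
  {
  char * tail;
  unsigned long long value;

  assert( arg != 0 && result != 0 );
  if( arg[0] < '0' || arg[0] > '9' ) return -1;	/* no sign, no spaces */
  errno = 0;
  value = strtoull( arg, &tail, 0 );	/* base 0 accepts 10, 0x10, 010 */
  if( tail == arg || errno != 0 ) return -1;
  if( *tail != 0 && *tail != 'B' )
    {
    const int binary = ( tail[1] == 'i' );	/* Ki, Mi, ... use 1024 */
    const char * const prefixes = binary ? "KMGTPE" : "kMGTPE";
    const char * const p = strchr( prefixes, *tail );
    const unsigned long long factor = binary ? 1024 : 1000;
    int exponent;
    if( !p ) return -1;
    for( exponent = (int)( p - prefixes ) + 1; exponent > 0; --exponent )
      {
      if( value > ULLONG_MAX / factor ) return -1;	/* overflow */
      value *= factor;
      }
    tail += binary ? 2 : 1;
    }
  if( *tail == 'B' ) ++tail;		/* optional 'B' for "byte" */
  if( *tail != 0 ) return -1;
  *result = value;
  return 0;
  }
```

Under this sketch '4Ki', '4096', '0x1000', and '010000' all denote the same
value, and '512MiB' denotes the maximum dictionary size.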
@@ -886,7 +899,7 @@ when there is no longer anything to take away.
   represents a variable number of bytes.


-   Lzip data consist of a series of independent "members" (compressed data
+   Lzip data consist of one or more independent "members" (compressed data
 sets). The members simply appear one after another in the data stream, with
 no additional information before, between, or after them. Each member can
 encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
@@ -933,10 +946,10 @@ size of a multimember data stream is unlimited.

 'Member size (8 bytes)'
     Total size of the member, including header and trailer. This field acts
-     as a distributed index, allows the verification of stream integrity,
-     and facilitates the safe recovery of undamaged members from
-     multimember files. Member size should be limited to 2 PiB to prevent
-     the data size field from overflowing.
+     as a distributed index, improves the checking of stream integrity, and
+     facilitates the safe recovery of undamaged members from multimember
+     files. Lzip limits the member size to 2 PiB to prevent the data size
+     field from overflowing.
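
The "distributed index" role of the member size field can be sketched by
walking a multimember stream backwards from its end. This is an
illustration under stated assumptions, not lzlib code. It assumes a 6-byte
header, a 20-byte trailer whose last 8 bytes hold the member size, and
little-endian byte order, as given in the rest of the Data format section
(not shown in this diff). 'count_members' is a hypothetical helper that
checks neither magic bytes nor CRCs.

```c
#include <assert.h>
#include <stdint.h>

enum { header_size = 6, trailer_size = 20 };	/* assumed field sizes */

/* Decodes the 8-byte little-endian member size that ends at 'end'. */
static unsigned long long trailer_member_size( const uint8_t * const end )
  {
  unsigned long long size = 0;
  int i;
  for( i = 1; i <= 8; ++i ) size = ( size << 8 ) | end[-i];
  return size;
  }

/* Hypothetical helper: counts the members in a buffer by jumping from each
   trailer to the end of the previous member. Returns -1 if a member size
   is inconsistent with the buffer. */
long count_members( const uint8_t * const buf, const unsigned long long size )
  {
  unsigned long long pos = size;	/* end of the member being examined */
  long members = 0;
  assert( buf != 0 );
  while( pos > 0 )
    {
    unsigned long long msize;
    if( pos < header_size + trailer_size ) return -1;
    msize = trailer_member_size( buf + pos );
    if( msize < header_size + trailer_size || msize > pos ) return -1;
    pos -= msize;			/* jump to the end of the previous member */
    ++members;
    }
  return members;
  }
```

Because each jump depends only on the trailer just read, a damaged member
stops the walk at that point while the members after it have already been
located.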



@@ -1234,7 +1247,7 @@ int ffrsdecompress( struct LZ_Decoder * const decoder,
      if( LZ_decompress_errno( decoder ) == LZ_header_error ||
          LZ_decompress_errno( decoder ) == LZ_data_error )
        { LZ_decompress_sync_to_member( decoder ); continue; }
-      else break;
+      break;
      }
    len = fwrite( buffer, 1, ret, outfile );
    if( len < ret ) break;
@@ -1293,27 +1306,27 @@ Concept index
 Tag Table:
 Node: Top215
 Node: Introduction1338
-Node: Library version6413
-Node: Buffering8957
-Node: Parameter limits10182
-Node: Compression functions11136
-Ref: member_size12946
-Ref: sync_flush14712
-Node: Decompression functions19400
-Node: Error codes26968
-Node: Error messages29259
-Node: Invoking minilzip29838
-Node: Data format39786
-Ref: coded-dict-size41232
-Node: Examples42641
-Node: Buffer compression43602
-Node: Buffer decompression45122
-Node: File compression46536
-Node: File decompression47519
-Node: File compression mm48523
-Node: Skipping data errors51552
-Node: Problems52862
-Node: Concept index53423
+Node: Library version6778
+Node: Buffering9329
+Node: Parameter limits10554
+Node: Compression functions11508
+Ref: member_size13301
+Ref: sync_flush15063
+Node: Decompression functions19751
+Node: Error codes27308
+Node: Error messages29598
+Node: Invoking minilzip30177
+Node: Data format40595
+Ref: coded-dict-size42041
+Node: Examples43446
+Node: Buffer compression44407
+Node: Buffer decompression45927
+Node: File compression47341
+Node: File decompression48324
+Node: File compression mm49328
+Node: Skipping data errors52357
+Node: Problems53662
+Node: Concept index54223

 End Tag Table