Merging upstream version 1.8.

Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-24 04:16:09 +01:00
commit 3ab3342c4f
parent 95e76700ee
21 changed files with 729 additions and 460 deletions
--- a/doc/plzip.info
+++ b/doc/plzip.info
@@ -2,7 +2,7 @@ This is plzip.info, produced by makeinfo version 4.13+ from plzip.texi.

 INFO-DIR-SECTION Data Compression
 START-INFO-DIR-ENTRY
-* Plzip: (plzip).               Parallel compressor compatible with lzip
+* Plzip: (plzip).               Massively parallel implementation of lzip
 END-INFO-DIR-ENTRY


@@ -11,7 +11,7 @@ File: plzip.info,  Node: Top,  Next: Introduction,  Up: (dir)
 Plzip Manual
 ************

-This manual is for Plzip (version 1.7, 7 February 2018).
+This manual is for Plzip (version 1.8, 5 January 2019).

 * Menu:

@@ -28,7 +28,7 @@ This manual is for Plzip (version 1.7, 7 February 2018).
 * Concept index::          Index of concepts


-   Copyright (C) 2009-2018 Antonio Diaz Diaz.
+   Copyright (C) 2009-2019 Antonio Diaz Diaz.

   This manual is free documentation: you have unlimited permission to
 copy, distribute and modify it.
@@ -39,20 +39,25 @@ File: plzip.info,  Node: Introduction,  Next: Output,  Prev: Top,  Up: Top
 1 Introduction
 **************

-Plzip is a massively parallel (multi-threaded) lossless data compressor
-based on the lzlib compression library, with a user interface similar to
-the one of lzip, bzip2 or gzip.
+Plzip is a massively parallel (multi-threaded) implementation of lzip,
+fully compatible with lzip 1.4 or newer. Plzip uses the lzlib
+compression library.
+
+   Lzip is a lossless data compressor with a user interface similar to
+the one of gzip or bzip2. Lzip can compress about as fast as gzip
+(lzip -0) or compress most files more than bzip2 (lzip -9).
+Decompression speed is intermediate between gzip and bzip2. Lzip is
+better than gzip and bzip2 from a data recovery perspective. Lzip has
+been designed, written and tested with great care to replace gzip and
+bzip2 as the standard general-purpose compressed format for unix-like
+systems.

   Plzip can compress/decompress large files on multiprocessor machines
 much faster than lzip, at the cost of a slightly reduced compression
 ratio (0.4 to 2 percent larger compressed files). Note that the number
 of usable threads is limited by file size; on files larger than a few GB
 plzip can use hundreds of processors, but on files of only a few MB
-plzip is no faster than lzip (*note Minimum file sizes::).
-
-   Plzip uses the lzip file format; the files produced by plzip are
-fully compatible with lzip-1.4 or newer, and can be rescued with
-lziprecover.
+plzip is no faster than lzip. *Note Minimum file sizes::.

   The lzip file format is designed for data sharing and long-term
 archiving, taking into account both data integrity and decoder
@@ -80,15 +85,16 @@ repair the nearer it is from the beginning of the file. Therefore, with
 the help of lziprecover, losing an entire archive just because of a
 corrupt byte near the beginning is a thing of the past.

-   Plzip uses the same well-defined exit status values used by lzip and
-bzip2, which makes it safer than compressors returning ambiguous warning
-values (like gzip) when it is used as a back end for other programs like
-tar or zutils.
+   Plzip uses the same well-defined exit status values used by lzip,
+which makes it safer than compressors returning ambiguous warning
+values (like gzip) when it is used as a back end for other programs
+like tar or zutils.

-   Plzip will automatically use the smallest possible dictionary size
-for each file without exceeding the given limit. Keep in mind that the
-decompression memory requirement is affected at compression time by the
-choice of dictionary size limit (*note Memory requirements::).
+   Plzip will automatically use for each file the largest dictionary
+size that does not exceed neither the file size nor the limit given.
+Keep in mind that the decompression memory requirement is affected at
+compression time by the choice of dictionary size limit. *Note Memory
+requirements::.

   When compressing, plzip replaces every file given in the command line
 with a compressed version of itself, with the name "original_name.lz".
@@ -101,7 +107,7 @@ anyothername   becomes   anyothername.out

   (De)compressing a file is much like copying or moving it; therefore
 plzip preserves the access and modification dates, permissions, and,
-when possible, ownership of the file just as "cp -p" does. (If the user
+when possible, ownership of the file just as 'cp -p' does. (If the user
 ID or the group ID can't be duplicated, the file permission bits
 S_ISUID and S_ISGID are cleared).

@@ -188,6 +194,7 @@ command line.
 '-V'
 '--version'
     Print the version number of plzip on the standard output and exit.
+     This version number should be included in all bug reports.

 '-a'
 '--trailing-error'
@@ -286,12 +293,14 @@ command line.
 '-s BYTES'
 '--dictionary-size=BYTES'
     When compressing, set the dictionary size limit in bytes. Plzip
-     will use the smallest possible dictionary size for each file
-     without exceeding this limit. Valid values range from 4 KiB to
-     512 MiB. Values 12 to 29 are interpreted as powers of two, meaning
-     2^12 to 2^29 bytes. Note that dictionary sizes are quantized. If
-     the specified size does not match one of the valid sizes, it will
-     be rounded upwards by adding up to (BYTES / 8) to it.
+     will use for each file the largest dictionary size that does not
+     exceed neither the file size nor this limit. Valid values range
+     from 4 KiB to 512 MiB. Values 12 to 29 are interpreted as powers
+     of two, meaning 2^12 to 2^29 bytes. Dictionary sizes are quantized
+     so that they can be coded in just one byte (*note
+     coded-dict-size::). If the specified size does not match one of
+     the valid sizes, it will be rounded upwards by adding up to
+     (BYTES / 8) to it.

     For maximum compression you should use a dictionary size limit as
     large as possible, but keep in mind that the decompression memory
@@ -320,27 +329,32 @@ command line.
     except for single-member files.

 '-0 .. -9'
-     Set the compression parameters (dictionary size and match length
-     limit) as shown in the table below. The default compression level
-     is '-6'.  Note that '-9' can be much slower than '-0'. These
-     options have no effect when decompressing, testing or listing.
+     Compression level. Set the compression parameters (dictionary size
+     and match length limit) as shown in the table below. The default
+     compression level is '-6', equivalent to '-s8MiB -m36'. Note that
+     '-9' can be much slower than '-0'. These options have no effect
+     when decompressing, testing or listing.

     The bidimensional parameter space of LZMA can't be mapped to a
     linear scale optimal for all files. If your files are large, very
     repetitive, etc, you may need to use the '--dictionary-size' and
     '--match-length' options directly to achieve optimal performance.

-     Level   Dictionary size   Match length limit
-     -0      64 KiB            16 bytes
-     -1      1 MiB             5 bytes
-     -2      1.5 MiB           6 bytes
-     -3      2 MiB             8 bytes
-     -4      3 MiB             12 bytes
-     -5      4 MiB             20 bytes
-     -6      8 MiB             36 bytes
-     -7      16 MiB            68 bytes
-     -8      24 MiB            132 bytes
-     -9      32 MiB            273 bytes
+     If several compression levels or '-s' or '-m' options are given,
+     the last setting is used. For example '-9 -s64MiB' is equivalent
+     to '-s64MiB -m273'
+
+     Level   Dictionary size (-s)   Match length limit (-m)
+     -0      64 KiB                 16 bytes
+     -1      1 MiB                  5 bytes
+     -2      1.5 MiB                6 bytes
+     -3      2 MiB                  8 bytes
+     -4      3 MiB                  12 bytes
+     -5      4 MiB                  20 bytes
+     -6      8 MiB                  36 bytes
+     -7      16 MiB                 68 bytes
+     -8      24 MiB                 132 bytes
+     -9      32 MiB                 273 bytes

 '--fast'
 '--best'
@@ -353,6 +367,18 @@ command line.
     if a file triggers a "corrupt header" error and the cause is not
     indeed a corrupt header.

+'--in-slots=N'
+     Number of 1 MiB input packets buffered per worker thread when
+     decompressing from non-seekable input. Increasing the number of
+     packets may increase decompression speed, but requires more
+     memory. Valid values range from 1 to 64. The default value is 4.
+
+'--out-slots=N'
+     Number of 1 MiB output packets buffered per worker thread when
+     decompressing to non-seekable output. Increasing the number of
+     packets may increase decompression speed, but requires more
+     memory. Valid values range from 1 to 1024. The default value is 64.
+

   Numbers given as arguments to options may be followed by a multiplier
 and an optional 'B' for "byte".
@@ -465,11 +491,11 @@ additional information before, between, or after them.

 'DS (coded dictionary size, 1 byte)'
     The dictionary size is calculated by taking a power of 2 (the base
-     size) and substracting from it a fraction between 0/16 and 7/16 of
+     size) and subtracting from it a fraction between 0/16 and 7/16 of
     the base size.
     Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
-     Bits 7-5 contain the numerator of the fraction (0 to 7) to
-     substract from the base size to obtain the dictionary size.
+     Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+     from the base size to obtain the dictionary size.
     Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
     Valid values for dictionary size range from 4 KiB to 512 MiB.

@@ -497,22 +523,25 @@ File: plzip.info,  Node: Memory requirements,  Next: Minimum file sizes,  Prev:
 6 Memory required to compress and decompress
 ********************************************

-The amount of memory required *per thread* for decompression or testing
-is approximately the following:
+The amount of memory required *per worker thread* for decompression or
+testing is approximately the following:

   * For decompression of a regular (seekable) file to another regular
     file, or for testing of a regular file; the dictionary size.

   * For testing of a non-seekable file or of standard input; the
-     dictionary size plus up to 5 MiB.
+     dictionary size plus 1 MiB plus up to the number of 1 MiB input
+     packets buffered (4 by default).

   * For decompression of a regular file to a non-seekable file or to
-     standard output; the dictionary size plus up to 32 MiB.
+     standard output; the dictionary size plus up to the number of 1 MiB
+     output packets buffered (64 by default).

   * For decompression of a non-seekable file or of standard input; the
-     dictionary size plus up to 35 MiB.
+     dictionary size plus 1 MiB plus up to the number of 1 MiB input
+     and output packets buffered (68 by default).

-The amount of memory required *per thread* for compression is
+The amount of memory required *per worker thread* for compression is
 approximately the following:

   * For compression at level -0; 1.5 MiB plus 3.375 times the data size
@@ -561,7 +590,7 @@ for full use of N processors at a given compression level, using the
 default data size for each level:

 Processors   2         4         8         16        64        256
------------------------------------------------------------------------- 
+------------------------------------------------------------------
 Level                                                          
 -0           2 MiB     4 MiB     8 MiB     16 MiB    64 MiB    256 MiB
 -1           4 MiB     8 MiB     16 MiB    32 MiB    128 MiB   512 MiB
@@ -633,7 +662,11 @@ compressed file (bugs in the system libraries, memory errors, etc).
 Therefore, if the data you are going to compress are important, give the
 '--keep' option to plzip and don't remove the original file until you
 verify the compressed file with a command like
-'plzip -cd file.lz | cmp file -'.
+'plzip -cd file.lz | cmp file -'. Most RAM errors happening during
+compression can only be detected by comparing the compressed file with
+the original because the corruption happens before plzip compresses the
+RAM contents, resulting in a valid compressed file containing wrong
+data.


 Example 1: Replace a regular file with its compressed version 'file.lz'
@@ -728,21 +761,22 @@ Concept index


 Tag Table:
-Node: Top221
+Node: Top222
 Node: Introduction1158
-Node: Output5134
-Node: Invoking plzip6614
-Ref: --trailing-error7177
-Ref: --data-size7420
-Node: Program design14938
-Node: File format17090
-Node: Memory requirements19522
-Node: Minimum file sizes20985
-Node: Trailing data23002
-Node: Examples25285
-Ref: concat-example26450
-Node: Problems27025
-Node: Concept index27553
+Node: Output5456
+Node: Invoking plzip6936
+Ref: --trailing-error7563
+Ref: --data-size7806
+Node: Program design16267
+Node: File format18419
+Ref: coded-dict-size19719
+Node: Memory requirements20849
+Node: Minimum file sizes22531
+Node: Trailing data24540
+Node: Examples26823
+Ref: concat-example28238
+Node: Problems28813
+Node: Concept index29341

 End Tag Table
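
The 'DS (coded dictionary size, 1 byte)' entry touched by this patch
describes how one byte encodes the dictionary size. The following is a
minimal sketch of that decoding in Python, written only from the
description in the manual text above; it is not taken from the plzip or
lzlib sources. The function name decode_dict_size is hypothetical, and
the rejection of results below 4 KiB is an assumption based on the
stated valid range (4 KiB to 512 MiB).

```python
def decode_dict_size(ds: int) -> int:
    """Decode a 1-byte coded dictionary size (DS) into a size in bytes.

    Illustrative sketch based on the manual's description; the name and
    the error handling are assumptions, not plzip's actual code.
    """
    if not 0 <= ds <= 0xFF:
        raise ValueError("DS must fit in one byte")

    log2_base = ds & 0x1F          # bits 4-0: base 2 logarithm of the base size
    numerator = (ds >> 5) & 0x07   # bits 7-5: numerator of the fraction (0 to 7)

    if not 12 <= log2_base <= 29:
        raise ValueError("base size exponent must be in the range 12 to 29")

    base_size = 1 << log2_base
    # Subtract numerator/16 of the base size from the base size.
    size = base_size - numerator * (base_size // 16)

    # Assumption: enforce the documented valid range of 4 KiB to 512 MiB.
    if size < 4 * 1024:
        raise ValueError("dictionary size below 4 KiB")
    return size


if __name__ == "__main__":
    # Example from the manual: 0xD3 = 2^19 - 6 * 2^15 = 320 KiB
    print(decode_dict_size(0xD3) // 1024, "KiB")
```

Because only eight fractions of each power of two can be represented,
this also illustrates why the '--dictionary-size' text says that sizes
are quantized and that a requested size may be rounded upwards.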