Merging upstream version 1.8.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-24 04:16:09 +01:00
parent 95e76700ee
commit 3ab3342c4f
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
21 changed files with 729 additions and 460 deletions

@@ -2,7 +2,7 @@ This is plzip.info, produced by makeinfo version 4.13+ from plzip.texi.
 INFO-DIR-SECTION Data Compression
 START-INFO-DIR-ENTRY
-* Plzip: (plzip). Parallel compressor compatible with lzip
+* Plzip: (plzip). Massively parallel implementation of lzip
 END-INFO-DIR-ENTRY

@@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
 Plzip Manual
 ************
-This manual is for Plzip (version 1.7, 7 February 2018).
+This manual is for Plzip (version 1.8, 5 January 2019).
 * Menu:
@@ -28,7 +28,7 @@ This manual is for Plzip (version 1.7, 7 February 2018).
 * Concept index:: Index of concepts
-Copyright (C) 2009-2018 Antonio Diaz Diaz.
+Copyright (C) 2009-2019 Antonio Diaz Diaz.
 This manual is free documentation: you have unlimited permission to
 copy, distribute and modify it.
@@ -39,20 +39,25 @@ File: plzip.info, Node: Introduction, Next: Output, Prev: Top, Up: Top
 1 Introduction
 **************
-Plzip is a massively parallel (multi-threaded) lossless data compressor
-based on the lzlib compression library, with a user interface similar to
-the one of lzip, bzip2 or gzip.
+Plzip is a massively parallel (multi-threaded) implementation of lzip,
+fully compatible with lzip 1.4 or newer. Plzip uses the lzlib
+compression library.
+Lzip is a lossless data compressor with a user interface similar to
+the one of gzip or bzip2. Lzip can compress about as fast as gzip
+(lzip -0) or compress most files more than bzip2 (lzip -9).
+Decompression speed is intermediate between gzip and bzip2. Lzip is
+better than gzip and bzip2 from a data recovery perspective. Lzip has
+been designed, written and tested with great care to replace gzip and
+bzip2 as the standard general-purpose compressed format for unix-like
+systems.
 Plzip can compress/decompress large files on multiprocessor machines
 much faster than lzip, at the cost of a slightly reduced compression
 ratio (0.4 to 2 percent larger compressed files). Note that the number
 of usable threads is limited by file size; on files larger than a few GB
 plzip can use hundreds of processors, but on files of only a few MB
-plzip is no faster than lzip (*note Minimum file sizes::).
-Plzip uses the lzip file format; the files produced by plzip are
-fully compatible with lzip-1.4 or newer, and can be rescued with
-lziprecover.
+plzip is no faster than lzip. *Note Minimum file sizes::.
 The lzip file format is designed for data sharing and long-term
 archiving, taking into account both data integrity and decoder
@@ -80,15 +85,16 @@ repair the nearer it is from the beginning of the file. Therefore, with
 the help of lziprecover, losing an entire archive just because of a
 corrupt byte near the beginning is a thing of the past.
-Plzip uses the same well-defined exit status values used by lzip and
-bzip2, which makes it safer than compressors returning ambiguous warning
-values (like gzip) when it is used as a back end for other programs like
-tar or zutils.
+Plzip uses the same well-defined exit status values used by lzip,
+which makes it safer than compressors returning ambiguous warning
+values (like gzip) when it is used as a back end for other programs
+like tar or zutils.
-Plzip will automatically use the smallest possible dictionary size
-for each file without exceeding the given limit. Keep in mind that the
-decompression memory requirement is affected at compression time by the
-choice of dictionary size limit (*note Memory requirements::).
+Plzip will automatically use for each file the largest dictionary
+size that does not exceed neither the file size nor the limit given.
+Keep in mind that the decompression memory requirement is affected at
+compression time by the choice of dictionary size limit. *Note Memory
+requirements::.
 When compressing, plzip replaces every file given in the command line
 with a compressed version of itself, with the name "original_name.lz".
@@ -101,7 +107,7 @@ anyothername becomes anyothername.out
 (De)compressing a file is much like copying or moving it; therefore
 plzip preserves the access and modification dates, permissions, and,
-when possible, ownership of the file just as "cp -p" does. (If the user
+when possible, ownership of the file just as 'cp -p' does. (If the user
 ID or the group ID can't be duplicated, the file permission bits
 S_ISUID and S_ISGID are cleared).
@@ -188,6 +194,7 @@ command line.
 '-V'
 '--version'
 Print the version number of plzip on the standard output and exit.
+This version number should be included in all bug reports.
 '-a'
 '--trailing-error'
@@ -286,12 +293,14 @@ command line.
 '-s BYTES'
 '--dictionary-size=BYTES'
 When compressing, set the dictionary size limit in bytes. Plzip
-will use the smallest possible dictionary size for each file
-without exceeding this limit. Valid values range from 4 KiB to
-512 MiB. Values 12 to 29 are interpreted as powers of two, meaning
-2^12 to 2^29 bytes. Note that dictionary sizes are quantized. If
-the specified size does not match one of the valid sizes, it will
-be rounded upwards by adding up to (BYTES / 8) to it.
+will use for each file the largest dictionary size that does not
+exceed neither the file size nor this limit. Valid values range
+from 4 KiB to 512 MiB. Values 12 to 29 are interpreted as powers
+of two, meaning 2^12 to 2^29 bytes. Dictionary sizes are quantized
+so that they can be coded in just one byte (*note
+coded-dict-size::). If the specified size does not match one of
+the valid sizes, it will be rounded upwards by adding up to
+(BYTES / 8) to it.
 For maximum compression you should use a dictionary size limit as
 large as possible, but keep in mind that the decompression memory
@@ -320,27 +329,32 @@ command line.
 except for single-member files.
 '-0 .. -9'
-Set the compression parameters (dictionary size and match length
-limit) as shown in the table below. The default compression level
-is '-6'. Note that '-9' can be much slower than '-0'. These
-options have no effect when decompressing, testing or listing.
+Compression level. Set the compression parameters (dictionary size
+and match length limit) as shown in the table below. The default
+compression level is '-6', equivalent to '-s8MiB -m36'. Note that
+'-9' can be much slower than '-0'. These options have no effect
+when decompressing, testing or listing.
 The bidimensional parameter space of LZMA can't be mapped to a
 linear scale optimal for all files. If your files are large, very
 repetitive, etc, you may need to use the '--dictionary-size' and
 '--match-length' options directly to achieve optimal performance.
-Level Dictionary size Match length limit
--0 64 KiB 16 bytes
--1 1 MiB 5 bytes
--2 1.5 MiB 6 bytes
--3 2 MiB 8 bytes
--4 3 MiB 12 bytes
--5 4 MiB 20 bytes
--6 8 MiB 36 bytes
--7 16 MiB 68 bytes
--8 24 MiB 132 bytes
--9 32 MiB 273 bytes
+If several compression levels or '-s' or '-m' options are given,
+the last setting is used. For example '-9 -s64MiB' is equivalent
+to '-s64MiB -m273'
+Level Dictionary size (-s) Match length limit (-m)
+-0 64 KiB 16 bytes
+-1 1 MiB 5 bytes
+-2 1.5 MiB 6 bytes
+-3 2 MiB 8 bytes
+-4 3 MiB 12 bytes
+-5 4 MiB 20 bytes
+-6 8 MiB 36 bytes
+-7 16 MiB 68 bytes
+-8 24 MiB 132 bytes
+-9 32 MiB 273 bytes
 '--fast'
 '--best'
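
Reviewer note: the level table and the new "last setting is used" rule in the hunk above can be sketched as follows. This is an illustration only, not plzip's code; the `resolve` function and its option tuples are hypothetical, and it assumes that a level option overwrites both parameters while '-s' and '-m' each overwrite one.

```python
KIB = 1024
MIB = 1024 * KIB

# Level table from the manual: level -> (dictionary size in bytes, match length limit).
LEVELS = {
    0: (64 * KIB, 16),
    1: (1 * MIB, 5),
    2: (3 * MIB // 2, 6),   # 1.5 MiB
    3: (2 * MIB, 8),
    4: (3 * MIB, 12),
    5: (4 * MIB, 20),
    6: (8 * MIB, 36),
    7: (16 * MIB, 68),
    8: (24 * MIB, 132),
    9: (32 * MIB, 273),
}


def resolve(options):
    """Return (dictionary_size, match_length) for options in command-line order.

    `options` is a hypothetical pre-parsed list of ("level", n), ("s", bytes)
    or ("m", bytes) tuples. Assumption: a level sets both parameters, and
    '-s' / '-m' each replace only their own parameter.
    """
    dict_size, match_len = LEVELS[6]  # documented default is -6
    for kind, value in options:
        if kind == "level":
            dict_size, match_len = LEVELS[value]
        elif kind == "s":
            dict_size = value
        elif kind == "m":
            match_len = value
        else:
            raise ValueError(f"unknown option kind: {kind!r}")
    return dict_size, match_len
```

Under these assumptions, `resolve([("level", 9), ("s", 64 * MIB)])` and `resolve([("s", 64 * MIB), ("m", 273)])` give the same pair, matching the manual's example that '-9 -s64MiB' is equivalent to '-s64MiB -m273'.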
@@ -353,6 +367,18 @@ command line.
 if a file triggers a "corrupt header" error and the cause is not
 indeed a corrupt header.
+'--in-slots=N'
+Number of 1 MiB input packets buffered per worker thread when
+decompressing from non-seekable input. Increasing the number of
+packets may increase decompression speed, but requires more
+memory. Valid values range from 1 to 64. The default value is 4.
+'--out-slots=N'
+Number of 1 MiB output packets buffered per worker thread when
+decompressing to non-seekable output. Increasing the number of
+packets may increase decompression speed, but requires more
+memory. Valid values range from 1 to 1024. The default value is 64.
 Numbers given as arguments to options may be followed by a multiplier
 and an optional 'B' for "byte".
@@ -465,11 +491,11 @@ additional information before, between, or after them.
 'DS (coded dictionary size, 1 byte)'
 The dictionary size is calculated by taking a power of 2 (the base
-size) and substracting from it a fraction between 0/16 and 7/16 of
+size) and subtracting from it a fraction between 0/16 and 7/16 of
 the base size.
 Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
-Bits 7-5 contain the numerator of the fraction (0 to 7) to
-substract from the base size to obtain the dictionary size.
+Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+from the base size to obtain the dictionary size.
 Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
 Valid values for dictionary size range from 4 KiB to 512 MiB.
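
Reviewer note: the DS byte layout described in the hunk above can be checked with a short decoder. This is a minimal sketch based only on the manual text; the function name is hypothetical, and rejecting results outside the stated 4 KiB to 512 MiB range is an assumption of the sketch (a real decoder may treat such bytes differently).

```python
MIN_DICT_SIZE = 1 << 12   # 4 KiB, lower bound stated in the manual
MAX_DICT_SIZE = 1 << 29   # 512 MiB, upper bound stated in the manual


def decode_dictionary_size(ds):
    """Decode the 1-byte coded dictionary size (DS) into a size in bytes."""
    if not 0 <= ds <= 0xFF:
        raise ValueError("DS must fit in one byte")
    log2_base = ds & 0x1F          # bits 4-0: base 2 logarithm of the base size
    numerator = ds >> 5            # bits 7-5: numerator of the fraction (0 to 7)
    if not 12 <= log2_base <= 29:
        raise ValueError("base size exponent out of range")
    base = 1 << log2_base
    size = base - numerator * (base // 16)   # subtract numerator/16 of the base
    # Assumption of this sketch: out-of-range results are treated as errors.
    if not MIN_DICT_SIZE <= size <= MAX_DICT_SIZE:
        raise ValueError("dictionary size out of range")
    return size
```

For the manual's example, 0xD3 has exponent 19 and numerator 6, so `decode_dictionary_size(0xD3)` returns 327680 bytes, that is 320 KiB.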
@@ -497,22 +523,25 @@ File: plzip.info, Node: Memory requirements, Next: Minimum file sizes, Prev:
 6 Memory required to compress and decompress
 ********************************************
-The amount of memory required *per thread* for decompression or testing
-is approximately the following:
+The amount of memory required *per worker thread* for decompression or
+testing is approximately the following:
 * For decompression of a regular (seekable) file to another regular
 file, or for testing of a regular file; the dictionary size.
 * For testing of a non-seekable file or of standard input; the
-dictionary size plus up to 5 MiB.
+dictionary size plus 1 MiB plus up to the number of 1 MiB input
+packets buffered (4 by default).
 * For decompression of a regular file to a non-seekable file or to
-standard output; the dictionary size plus up to 32 MiB.
+standard output; the dictionary size plus up to the number of 1 MiB
+output packets buffered (64 by default).
 * For decompression of a non-seekable file or of standard input; the
-dictionary size plus up to 35 MiB.
+dictionary size plus 1 MiB plus up to the number of 1 MiB input
+and output packets buffered (68 by default).
-The amount of memory required *per thread* for compression is
+The amount of memory required *per worker thread* for compression is
 approximately the following:
 * For compression at level -0; 1.5 MiB plus 3.375 times the data size
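
Reviewer note: the four decompression cases in the new text of the hunk above reduce to a small calculation. The sketch below is only a reading of the manual's approximate upper bounds, not plzip's accounting; the function and parameter names are hypothetical, with `in_slots` and `out_slots` standing for the values of '--in-slots' and '--out-slots'.

```python
MIB = 1 << 20


def worker_memory_upper_bound(dict_size, input_seekable, output_seekable=True,
                              testing=False, in_slots=4, out_slots=64):
    """Approximate upper bound, in bytes, of memory per worker thread.

    Follows the four cases listed in the manual for decompression and testing.
    """
    if input_seekable:
        if testing or output_seekable:
            # Regular file to regular file, or testing a regular file.
            return dict_size
        # Regular file to non-seekable output: output packets are buffered.
        return dict_size + out_slots * MIB
    # Non-seekable input: 1 MiB plus the buffered input packets.
    extra = MIB + in_slots * MIB
    if not testing:
        # Decompression from non-seekable input also counts output packets.
        extra += out_slots * MIB
    return dict_size + extra
```

With the default slot counts and an 8 MiB dictionary, this gives 8 MiB for file-to-file decompression, 13 MiB for testing standard input, 72 MiB for writing to standard output, and 77 MiB for decompressing standard input.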
@@ -561,7 +590,7 @@ for full use of N processors at a given compression level, using the
 default data size for each level:
 Processors 2 4 8 16 64 256
--------------------------------------------------------------------------
+------------------------------------------------------------------
 Level
 -0 2 MiB 4 MiB 8 MiB 16 MiB 64 MiB 256 MiB
 -1 4 MiB 8 MiB 16 MiB 32 MiB 128 MiB 512 MiB
@@ -633,7 +662,11 @@ compressed file (bugs in the system libraries, memory errors, etc).
 Therefore, if the data you are going to compress are important, give the
 '--keep' option to plzip and don't remove the original file until you
 verify the compressed file with a command like
-'plzip -cd file.lz | cmp file -'.
+'plzip -cd file.lz | cmp file -'. Most RAM errors happening during
+compression can only be detected by comparing the compressed file with
+the original because the corruption happens before plzip compresses the
+RAM contents, resulting in a valid compressed file containing wrong
+data.
 Example 1: Replace a regular file with its compressed version 'file.lz'
@@ -728,21 +761,22 @@ Concept index

 Tag Table:
-Node: Top221
+Node: Top222
 Node: Introduction1158
-Node: Output5134
-Node: Invoking plzip6614
-Ref: --trailing-error7177
-Ref: --data-size7420
-Node: Program design14938
-Node: File format17090
-Node: Memory requirements19522
-Node: Minimum file sizes20985
-Node: Trailing data23002
-Node: Examples25285
-Ref: concat-example26450
-Node: Problems27025
-Node: Concept index27553
+Node: Output5456
+Node: Invoking plzip6936
+Ref: --trailing-error7563
+Ref: --data-size7806
+Node: Program design16267
+Node: File format18419
+Ref: coded-dict-size19719
+Node: Memory requirements20849
+Node: Minimum file sizes22531
+Node: Trailing data24540
+Node: Examples26823
+Ref: concat-example28238
+Node: Problems28813
+Node: Concept index29341

 End Tag Table