Merging upstream version 1.8.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 95e76700ee
commit 3ab3342c4f
21 changed files with 729 additions and 460 deletions
166
doc/plzip.info
@@ -2,7 +2,7 @@ This is plzip.info, produced by makeinfo version 4.13+ from plzip.texi.
 
 INFO-DIR-SECTION Data Compression
 START-INFO-DIR-ENTRY
-* Plzip: (plzip). Parallel compressor compatible with lzip
+* Plzip: (plzip). Massively parallel implementation of lzip
 END-INFO-DIR-ENTRY
 
 
@@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
 Plzip Manual
 ************
 
-This manual is for Plzip (version 1.7, 7 February 2018).
+This manual is for Plzip (version 1.8, 5 January 2019).
 
 * Menu:
 
@@ -28,7 +28,7 @@ This manual is for Plzip (version 1.7, 7 February 2018).
 * Concept index:: Index of concepts
 
 
-Copyright (C) 2009-2018 Antonio Diaz Diaz.
+Copyright (C) 2009-2019 Antonio Diaz Diaz.
 
 This manual is free documentation: you have unlimited permission to
 copy, distribute and modify it.
@@ -39,20 +39,25 @@ File: plzip.info, Node: Introduction, Next: Output, Prev: Top, Up: Top
 1 Introduction
 **************
 
-Plzip is a massively parallel (multi-threaded) lossless data compressor
-based on the lzlib compression library, with a user interface similar to
-the one of lzip, bzip2 or gzip.
+Plzip is a massively parallel (multi-threaded) implementation of lzip,
+fully compatible with lzip 1.4 or newer. Plzip uses the lzlib
+compression library.
+
+Lzip is a lossless data compressor with a user interface similar to
+the one of gzip or bzip2. Lzip can compress about as fast as gzip
+(lzip -0) or compress most files more than bzip2 (lzip -9).
+Decompression speed is intermediate between gzip and bzip2. Lzip is
+better than gzip and bzip2 from a data recovery perspective. Lzip has
+been designed, written and tested with great care to replace gzip and
+bzip2 as the standard general-purpose compressed format for unix-like
+systems.
 
 Plzip can compress/decompress large files on multiprocessor machines
 much faster than lzip, at the cost of a slightly reduced compression
 ratio (0.4 to 2 percent larger compressed files). Note that the number
 of usable threads is limited by file size; on files larger than a few GB
 plzip can use hundreds of processors, but on files of only a few MB
-plzip is no faster than lzip (*note Minimum file sizes::).
-
-Plzip uses the lzip file format; the files produced by plzip are
-fully compatible with lzip-1.4 or newer, and can be rescued with
-lziprecover.
+plzip is no faster than lzip. *Note Minimum file sizes::.
 
 The lzip file format is designed for data sharing and long-term
 archiving, taking into account both data integrity and decoder
@@ -80,15 +85,16 @@ repair the nearer it is from the beginning of the file. Therefore, with
 the help of lziprecover, losing an entire archive just because of a
 corrupt byte near the beginning is a thing of the past.
 
-Plzip uses the same well-defined exit status values used by lzip and
-bzip2, which makes it safer than compressors returning ambiguous warning
-values (like gzip) when it is used as a back end for other programs like
-tar or zutils.
+Plzip uses the same well-defined exit status values used by lzip,
+which makes it safer than compressors returning ambiguous warning
+values (like gzip) when it is used as a back end for other programs
+like tar or zutils.
 
-Plzip will automatically use the smallest possible dictionary size
-for each file without exceeding the given limit. Keep in mind that the
-decompression memory requirement is affected at compression time by the
-choice of dictionary size limit (*note Memory requirements::).
+Plzip will automatically use for each file the largest dictionary
+size that does not exceed neither the file size nor the limit given.
+Keep in mind that the decompression memory requirement is affected at
+compression time by the choice of dictionary size limit. *Note Memory
+requirements::.
 
 When compressing, plzip replaces every file given in the command line
 with a compressed version of itself, with the name "original_name.lz".
@@ -101,7 +107,7 @@ anyothername becomes anyothername.out
 
 (De)compressing a file is much like copying or moving it; therefore
 plzip preserves the access and modification dates, permissions, and,
-when possible, ownership of the file just as "cp -p" does. (If the user
+when possible, ownership of the file just as 'cp -p' does. (If the user
 ID or the group ID can't be duplicated, the file permission bits
 S_ISUID and S_ISGID are cleared).
 
@@ -188,6 +194,7 @@ command line.
 '-V'
 '--version'
 Print the version number of plzip on the standard output and exit.
+This version number should be included in all bug reports.
 
 '-a'
 '--trailing-error'
@@ -286,12 +293,14 @@ command line.
 '-s BYTES'
 '--dictionary-size=BYTES'
 When compressing, set the dictionary size limit in bytes. Plzip
-will use the smallest possible dictionary size for each file
-without exceeding this limit. Valid values range from 4 KiB to
-512 MiB. Values 12 to 29 are interpreted as powers of two, meaning
-2^12 to 2^29 bytes. Note that dictionary sizes are quantized. If
-the specified size does not match one of the valid sizes, it will
-be rounded upwards by adding up to (BYTES / 8) to it.
+will use for each file the largest dictionary size that does not
+exceed neither the file size nor this limit. Valid values range
+from 4 KiB to 512 MiB. Values 12 to 29 are interpreted as powers
+of two, meaning 2^12 to 2^29 bytes. Dictionary sizes are quantized
+so that they can be coded in just one byte (*note
+coded-dict-size::). If the specified size does not match one of
+the valid sizes, it will be rounded upwards by adding up to
+(BYTES / 8) to it.
 
 For maximum compression you should use a dictionary size limit as
 large as possible, but keep in mind that the decompression memory
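Reviewer's sketch, not part of the commit: the quantization described in the '-s' text of this hunk can be illustrated in a few lines of Python. The function names are hypothetical and this is not plzip's code; it only assumes the rule stated in the manual's File format node (a power of two minus 0/16 to 7/16 of itself, within 4 KiB to 512 MiB) and picks the smallest valid size that is not below the request.

```python
MIN_DICT = 1 << 12  # 4 KiB, smallest valid dictionary size per the manual
MAX_DICT = 1 << 29  # 512 MiB, largest valid dictionary size per the manual


def interpret_size_argument(value):
    """Values 12 to 29 are read as powers of two (2^12 to 2^29 bytes)."""
    return 1 << value if 12 <= value <= 29 else value


def valid_dictionary_sizes():
    """Every size of the form 2^n - k * 2^n / 16 with n in 12..29, k in 0..7."""
    sizes = set()
    for log2_base in range(12, 30):
        base = 1 << log2_base
        for numerator in range(8):
            size = base - numerator * (base // 16)
            if MIN_DICT <= size <= MAX_DICT:
                sizes.add(size)
    return sorted(sizes)


def round_up_dictionary_size(requested):
    """Smallest valid size >= requested (hypothetical helper)."""
    if not MIN_DICT <= requested <= MAX_DICT:
        raise ValueError("dictionary size out of range")
    return min(s for s in valid_dictionary_sizes() if s >= requested)
```

Under these assumptions a request of 300 KiB lands on 320 KiB, and the amount added never exceeds BYTES / 8, which agrees with the bound quoted in the hunk.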
@@ -320,27 +329,32 @@ command line.
 except for single-member files.
 
 '-0 .. -9'
-Set the compression parameters (dictionary size and match length
-limit) as shown in the table below. The default compression level
-is '-6'. Note that '-9' can be much slower than '-0'. These
-options have no effect when decompressing, testing or listing.
+Compression level. Set the compression parameters (dictionary size
+and match length limit) as shown in the table below. The default
+compression level is '-6', equivalent to '-s8MiB -m36'. Note that
+'-9' can be much slower than '-0'. These options have no effect
+when decompressing, testing or listing.
 
 The bidimensional parameter space of LZMA can't be mapped to a
 linear scale optimal for all files. If your files are large, very
 repetitive, etc, you may need to use the '--dictionary-size' and
 '--match-length' options directly to achieve optimal performance.
 
-Level Dictionary size Match length limit
--0 64 KiB 16 bytes
--1 1 MiB 5 bytes
--2 1.5 MiB 6 bytes
--3 2 MiB 8 bytes
--4 3 MiB 12 bytes
--5 4 MiB 20 bytes
--6 8 MiB 36 bytes
--7 16 MiB 68 bytes
--8 24 MiB 132 bytes
--9 32 MiB 273 bytes
+If several compression levels or '-s' or '-m' options are given,
+the last setting is used. For example '-9 -s64MiB' is equivalent
+to '-s64MiB -m273'
+
+Level Dictionary size (-s) Match length limit (-m)
+-0 64 KiB 16 bytes
+-1 1 MiB 5 bytes
+-2 1.5 MiB 6 bytes
+-3 2 MiB 8 bytes
+-4 3 MiB 12 bytes
+-5 4 MiB 20 bytes
+-6 8 MiB 36 bytes
+-7 16 MiB 68 bytes
+-8 24 MiB 132 bytes
+-9 32 MiB 273 bytes
 
 '--fast'
 '--best'
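Reviewer's sketch, not part of the commit: the new "last setting is used" rule can be modelled as a left-to-right fold over the options. The table values are copied from the hunk; the function name, the option tuples, and the assumption that a level option resets both parameters are illustrative only and are not taken from plzip's source.

```python
KIB, MIB = 1 << 10, 1 << 20

# (dictionary size in bytes, match length limit) for levels -0 .. -9,
# copied from the table in the hunk above.
LEVELS = {
    0: (64 * KIB, 16),
    1: (1 * MIB, 5),
    2: (3 * MIB // 2, 6),
    3: (2 * MIB, 8),
    4: (3 * MIB, 12),
    5: (4 * MIB, 20),
    6: (8 * MIB, 36),
    7: (16 * MIB, 68),
    8: (24 * MIB, 132),
    9: (32 * MIB, 273),
}


def resolve_parameters(options):
    """Fold options in command-line order; later settings override earlier ones.

    `options` is a list of hypothetical tuples: ("level", 0..9),
    ("s", bytes) or ("m", bytes).
    """
    dictionary_size, match_length = LEVELS[6]  # default level is -6
    for name, value in options:
        if name == "level":
            dictionary_size, match_length = LEVELS[value]
        elif name == "s":
            dictionary_size = value
        elif name == "m":
            match_length = value
        else:
            raise ValueError(f"unknown option: {name}")
    return dictionary_size, match_length
```

With this model, `resolve_parameters([("level", 9), ("s", 64 * MIB)])` and `resolve_parameters([("s", 64 * MIB), ("m", 273)])` give the same pair, matching the example in the added text.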
@@ -353,6 +367,18 @@ command line.
 if a file triggers a "corrupt header" error and the cause is not
 indeed a corrupt header.
 
+'--in-slots=N'
+Number of 1 MiB input packets buffered per worker thread when
+decompressing from non-seekable input. Increasing the number of
+packets may increase decompression speed, but requires more
+memory. Valid values range from 1 to 64. The default value is 4.
+
+'--out-slots=N'
+Number of 1 MiB output packets buffered per worker thread when
+decompressing to non-seekable output. Increasing the number of
+packets may increase decompression speed, but requires more
+memory. Valid values range from 1 to 1024. The default value is 64.
+
 
 Numbers given as arguments to options may be followed by a multiplier
 and an optional 'B' for "byte".
@@ -465,11 +491,11 @@ additional information before, between, or after them.
 
 'DS (coded dictionary size, 1 byte)'
 The dictionary size is calculated by taking a power of 2 (the base
-size) and substracting from it a fraction between 0/16 and 7/16 of
+size) and subtracting from it a fraction between 0/16 and 7/16 of
 the base size.
 Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
-Bits 7-5 contain the numerator of the fraction (0 to 7) to
-substract from the base size to obtain the dictionary size.
+Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+from the base size to obtain the dictionary size.
 Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
 Valid values for dictionary size range from 4 KiB to 512 MiB.
 
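Reviewer's sketch, not part of the commit: the DS byte layout in this hunk is small enough to check by hand, and the Python below does the same arithmetic. The function name is hypothetical. Rejecting results outside the documented 4 KiB to 512 MiB range is a choice made for this sketch; a real decoder may handle such bytes differently.

```python
def decode_dictionary_size(ds):
    """Decode the 1-byte coded dictionary size described in the manual."""
    log2_base = ds & 0x1F         # bits 4-0: base 2 logarithm of the base size
    numerator = (ds >> 5) & 0x07  # bits 7-5: sixteenths of the base to subtract
    if not 12 <= log2_base <= 29:
        raise ValueError("base size logarithm out of range (12 to 29)")
    base = 1 << log2_base
    size = base - numerator * (base // 16)
    if not (1 << 12) <= size <= (1 << 29):
        # Assumption of this sketch: treat out-of-range results as errors.
        raise ValueError("dictionary size out of range (4 KiB to 512 MiB)")
    return size
```

For the byte used in the manual's example, `decode_dictionary_size(0xD3)` returns 327680, that is 320 KiB.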
@@ -497,22 +523,25 @@ File: plzip.info, Node: Memory requirements, Next: Minimum file sizes, Prev:
 6 Memory required to compress and decompress
 ********************************************
 
-The amount of memory required *per thread* for decompression or testing
-is approximately the following:
+The amount of memory required *per worker thread* for decompression or
+testing is approximately the following:
 
 * For decompression of a regular (seekable) file to another regular
 file, or for testing of a regular file; the dictionary size.
 
 * For testing of a non-seekable file or of standard input; the
-dictionary size plus up to 5 MiB.
+dictionary size plus 1 MiB plus up to the number of 1 MiB input
+packets buffered (4 by default).
 
 * For decompression of a regular file to a non-seekable file or to
-standard output; the dictionary size plus up to 32 MiB.
+standard output; the dictionary size plus up to the number of 1 MiB
+output packets buffered (64 by default).
 
 * For decompression of a non-seekable file or of standard input; the
-dictionary size plus up to 35 MiB.
+dictionary size plus 1 MiB plus up to the number of 1 MiB input
+and output packets buffered (68 by default).
 
-The amount of memory required *per thread* for compression is
+The amount of memory required *per worker thread* for compression is
 approximately the following:
 
 * For compression at level -0; 1.5 MiB plus 3.375 times the data size
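Reviewer's sketch, not part of the commit: the four decompression cases in the revised list can be folded into one upper-bound estimate. The function and parameter names are hypothetical. The defaults and valid ranges for the slot counts are the ones documented for '--in-slots' and '--out-slots' in this same commit, and the result is only the "up to" figure the manual gives, not a measurement.

```python
MIB = 1 << 20


def max_decompression_memory_per_worker(dictionary_size, input_seekable,
                                        output_seekable=True, testing=False,
                                        in_slots=4, out_slots=64):
    """Upper bound in bytes per worker thread, following the list above."""
    if not 1 <= in_slots <= 64:
        raise ValueError("in_slots must be in 1..64")
    if not 1 <= out_slots <= 1024:
        raise ValueError("out_slots must be in 1..1024")
    extra = 0
    packets = 0
    if not input_seekable:
        extra = MIB          # fixed 1 MiB for non-seekable input
        packets += in_slots  # buffered 1 MiB input packets
        if not testing:
            packets += out_slots  # input and output packets (68 by default)
    elif not testing and not output_seekable:
        packets += out_slots  # regular file to non-seekable output
    return dictionary_size + extra + packets * MIB
```

With an 8 MiB dictionary and default slot counts, the four cases come to 8, 13, 72 and 77 MiB respectively under this reading of the text.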
@@ -561,7 +590,7 @@ for full use of N processors at a given compression level, using the
 default data size for each level:
 
 Processors 2 4 8 16 64 256
--------------------------------------------------------------------------
+------------------------------------------------------------------
 Level
 -0 2 MiB 4 MiB 8 MiB 16 MiB 64 MiB 256 MiB
 -1 4 MiB 8 MiB 16 MiB 32 MiB 128 MiB 512 MiB
@@ -633,7 +662,11 @@ compressed file (bugs in the system libraries, memory errors, etc).
 Therefore, if the data you are going to compress are important, give the
 '--keep' option to plzip and don't remove the original file until you
 verify the compressed file with a command like
-'plzip -cd file.lz | cmp file -'.
+'plzip -cd file.lz | cmp file -'. Most RAM errors happening during
+compression can only be detected by comparing the compressed file with
+the original because the corruption happens before plzip compresses the
+RAM contents, resulting in a valid compressed file containing wrong
+data.
 
 
 Example 1: Replace a regular file with its compressed version 'file.lz'
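Reviewer's sketch, not part of the commit: the check 'plzip -cd file.lz | cmp file -' can also be scripted. The Python below streams the decompressor's standard output and compares it chunk by chunk with the original. The function name and the `decompressor` parameter are hypothetical; the only plzip options used are the '-c' and '-d' flags shown in the manual's own command, and plzip is assumed to be on the PATH.

```python
import subprocess


def matches_original(original_path, compressed_path,
                     decompressor=("plzip", "-cd"), chunk_size=1 << 16):
    """Return True if decompressing compressed_path reproduces original_path."""
    command = [*decompressor, compressed_path]
    with open(original_path, "rb") as original, \
            subprocess.Popen(command, stdout=subprocess.PIPE) as proc:
        same = True
        while True:
            expected = original.read(chunk_size)
            actual = proc.stdout.read(chunk_size)
            if expected != actual:
                same = False
                proc.kill()  # stop the decompressor early on a mismatch
                break
            if not expected:  # both streams ended at the same point
                break
    # Leaving the with block waits for the process, so returncode is set.
    return same and proc.returncode == 0
```

A caller would keep the original (as with '--keep') until `matches_original("file", "file.lz")` returns True.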
@@ -728,21 +761,22 @@ Concept index
 
 
 Tag Table:
-Node: Top221
+Node: Top222
 Node: Introduction1158
-Node: Output5134
-Node: Invoking plzip6614
-Ref: --trailing-error7177
-Ref: --data-size7420
-Node: Program design14938
-Node: File format17090
-Node: Memory requirements19522
-Node: Minimum file sizes20985
-Node: Trailing data23002
-Node: Examples25285
-Ref: concat-example26450
-Node: Problems27025
-Node: Concept index27553
+Node: Output5456
+Node: Invoking plzip6936
+Ref: --trailing-error7563
+Ref: --data-size7806
+Node: Program design16267
+Node: File format18419
+Ref: coded-dict-size19719
+Node: Memory requirements20849
+Node: Minimum file sizes22531
+Node: Trailing data24540
+Node: Examples26823
+Ref: concat-example28238
+Node: Problems28813
+Node: Concept index29341
 
 End Tag Table
 