Merging upstream version 1.11.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 2b58741015
commit 648618884e
21 changed files with 727 additions and 631 deletions
50
doc/plzip.1
|
@@ -1,24 +1,25 @@
|
|||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.16.
|
||||
.TH PLZIP "1" "January 2022" "plzip 1.10" "User Commands"
|
||||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2.
|
||||
.TH PLZIP "1" "January 2024" "plzip 1.11" "User Commands"
|
||||
.SH NAME
|
||||
plzip \- reduces the size of files
|
||||
.SH SYNOPSIS
|
||||
.B plzip
|
||||
[\fI\,options\/\fR] [\fI\,files\/\fR]
|
||||
.SH DESCRIPTION
|
||||
Plzip is a massively parallel (multi\-threaded) implementation of lzip, fully
|
||||
Plzip is a massively parallel (multi\-threaded) implementation of lzip,
|
||||
compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
|
||||
.PP
|
||||
Lzip is a lossless data compressor with a user interface similar to the one
|
||||
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel\-Ziv\-Markov
|
||||
chain\-Algorithm' (LZMA) stream format and provides a 3 factor integrity
|
||||
checking to maximize interoperability and optimize safety. Lzip can compress
|
||||
about as fast as gzip (lzip \fB\-0\fR) or compress most files more than bzip2
|
||||
(lzip \fB\-9\fR). Decompression speed is intermediate between gzip and bzip2.
|
||||
Lzip is better than gzip and bzip2 from a data recovery perspective. Lzip
|
||||
has been designed, written, and tested with great care to replace gzip and
|
||||
bzip2 as the standard general\-purpose compressed format for unix\-like
|
||||
systems.
|
||||
chain\-Algorithm' (LZMA) stream format to maximize interoperability. The
|
||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
||||
on 32\-bit machines. Lzip provides accurate and robust 3\-factor integrity
|
||||
checking. Lzip can compress about as fast as gzip (lzip \fB\-0\fR) or compress most
|
||||
files more than bzip2 (lzip \fB\-9\fR). Decompression speed is intermediate between
|
||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
||||
perspective. Lzip has been designed, written, and tested with great care to
|
||||
replace gzip and bzip2 as the standard general\-purpose compressed format for
|
||||
Unix\-like systems.
|
||||
.PP
|
||||
Plzip can compress/decompress large files on multiprocessor machines much
|
||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||
|
@@ -44,7 +45,7 @@ set size of input data blocks [2x8=16 MiB]
|
|||
write to standard output, keep input files
|
||||
.TP
|
||||
\fB\-d\fR, \fB\-\-decompress\fR
|
||||
decompress
|
||||
decompress, test compressed file integrity
|
||||
.TP
|
||||
\fB\-f\fR, \fB\-\-force\fR
|
||||
overwrite existing output files
|
||||
|
@@ -104,21 +105,21 @@ If no file names are given, or if a file is '\-', plzip compresses or
|
|||
decompresses from standard input to standard output.
|
||||
Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,
|
||||
Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...
|
||||
Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12
|
||||
to 2^29 bytes.
|
||||
Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12 to
|
||||
2^29 bytes.
|
||||
.PP
|
||||
The bidimensional parameter space of LZMA can't be mapped to a linear
|
||||
scale optimal for all files. If your files are large, very repetitive,
|
||||
etc, you may need to use the options \fB\-\-dictionary\-size\fR and \fB\-\-match\-length\fR
|
||||
directly to achieve optimal performance.
|
||||
The bidimensional parameter space of LZMA can't be mapped to a linear scale
|
||||
optimal for all files. If your files are large, very repetitive, etc, you
|
||||
may need to use the options \fB\-\-dictionary\-size\fR and \fB\-\-match\-length\fR directly
|
||||
to achieve optimal performance.
|
||||
.PP
|
||||
To extract all the files from archive 'foo.tar.lz', use the commands
|
||||
\&'tar \fB\-xf\fR foo.tar.lz' or 'plzip \fB\-cd\fR foo.tar.lz | tar \fB\-xf\fR \-'.
|
||||
.PP
|
||||
Exit status: 0 for a normal exit, 1 for environmental problems (file
|
||||
not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or
|
||||
invalid input file, 3 for an internal consistency error (e.g., bug) which
|
||||
caused plzip to panic.
|
||||
Exit status: 0 for a normal exit, 1 for environmental problems
|
||||
(file not found, invalid command\-line options, I/O errors, etc), 2 to
|
||||
indicate a corrupt or invalid input file, 3 for an internal consistency
|
||||
error (e.g., bug) which caused plzip to panic.
|
||||
.SH "REPORTING BUGS"
|
||||
Report bugs to lzip\-bug@nongnu.org
|
||||
.br
|
||||
|
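Reviewer note: the exit status paragraph in this hunk can be summarized as a small lookup. The following is a minimal sketch, not part of plzip; `EXIT_STATUS` and `describe_exit_status` are hypothetical names introduced only for illustration, and the wording of each label paraphrases the manual text above.

```python
# Hypothetical helper (not part of plzip): labels for the exit statuses
# documented in the man page hunk above.
EXIT_STATUS = {
    0: "normal exit",
    1: "environmental problem (file not found, invalid command-line options, I/O error)",
    2: "corrupt or invalid input file",
    3: "internal consistency error (plzip panicked)",
}


def describe_exit_status(code):
    """Return a short label for a plzip exit status."""
    return EXIT_STATUS.get(code, "unknown status %d" % code)
```

A caller could pass it the `returncode` of a `subprocess.run(["plzip", "-t", "file.lz"])` call, assuming plzip is installed.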
@@ -126,12 +127,13 @@ Plzip home page: http://www.nongnu.org/lzip/plzip.html
|
|||
.SH COPYRIGHT
|
||||
Copyright \(co 2009 Laszlo Ersek.
|
||||
.br
|
||||
Copyright \(co 2022 Antonio Diaz Diaz.
|
||||
Using lzlib 1.13
|
||||
Copyright \(co 2024 Antonio Diaz Diaz.
|
||||
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
|
||||
.br
|
||||
This is free software: you are free to change and redistribute it.
|
||||
There is NO WARRANTY, to the extent permitted by law.
|
||||
Using lzlib 1.14
|
||||
Using LZ_API_VERSION = 1014
|
||||
.SH "SEE ALSO"
|
||||
The full documentation for
|
||||
.B plzip
|
||||
|
|
234
doc/plzip.info
|
@@ -11,13 +11,13 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
|
|||
Plzip Manual
|
||||
************
|
||||
|
||||
This manual is for Plzip (version 1.10, 24 January 2022).
|
||||
This manual is for Plzip (version 1.11, 21 January 2024).
|
||||
|
||||
* Menu:
|
||||
|
||||
* Introduction:: Purpose and features of plzip
|
||||
* Output:: Meaning of plzip's output
|
||||
* Invoking plzip:: Command line interface
|
||||
* Invoking plzip:: Command-line interface
|
||||
* Program design:: Internal structure of plzip
|
||||
* Memory requirements:: Memory required to compress and decompress
|
||||
* Minimum file sizes:: Minimum file sizes required for full speed
|
||||
|
@@ -28,7 +28,7 @@ This manual is for Plzip (version 1.10, 24 January 2022).
|
|||
* Concept index:: Index of concepts
|
||||
|
||||
|
||||
Copyright (C) 2009-2022 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2024 Antonio Diaz Diaz.
|
||||
|
||||
This manual is free documentation: you have unlimited permission to copy,
|
||||
distribute, and modify it.
|
||||
|
@@ -39,19 +39,20 @@ File: plzip.info, Node: Introduction, Next: Output, Prev: Top, Up: Top
|
|||
1 Introduction
|
||||
**************
|
||||
|
||||
Plzip is a massively parallel (multi-threaded) implementation of lzip, fully
|
||||
Plzip is a massively parallel (multi-threaded) implementation of lzip,
|
||||
compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
|
||||
|
||||
Lzip is a lossless data compressor with a user interface similar to the
|
||||
one of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
|
||||
chain-Algorithm' (LZMA) stream format and provides a 3 factor integrity
|
||||
checking to maximize interoperability and optimize safety. Lzip can compress
|
||||
about as fast as gzip (lzip -0) or compress most files more than bzip2
|
||||
(lzip -9). Decompression speed is intermediate between gzip and bzip2. Lzip
|
||||
is better than gzip and bzip2 from a data recovery perspective. Lzip has
|
||||
been designed, written, and tested with great care to replace gzip and
|
||||
bzip2 as the standard general-purpose compressed format for unix-like
|
||||
systems.
|
||||
chain-Algorithm' (LZMA) stream format to maximize interoperability. The
|
||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
||||
on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
|
||||
checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
|
||||
files more than bzip2 (lzip -9). Decompression speed is intermediate between
|
||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
||||
perspective. Lzip has been designed, written, and tested with great care to
|
||||
replace gzip and bzip2 as the standard general-purpose compressed format for
|
||||
Unix-like systems.
|
||||
|
||||
Plzip can compress/decompress large files on multiprocessor machines much
|
||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||
|
@@ -94,10 +95,10 @@ byte near the beginning is a thing of the past.
|
|||
makes it safer than compressors returning ambiguous warning values (like
|
||||
gzip) when it is used as a back end for other programs like tar or zutils.
|
||||
|
||||
Plzip will automatically use for each file the largest dictionary size
|
||||
that does not exceed neither the file size nor the limit given. Keep in
|
||||
mind that the decompression memory requirement is affected at compression
|
||||
time by the choice of dictionary size limit. *Note Memory requirements::.
|
||||
Plzip automatically uses for each file the largest dictionary size that
|
||||
does not exceed neither the file size nor the limit given. Keep in mind
|
||||
that the decompression memory requirement is affected at compression time
|
||||
by the choice of dictionary size limit. *Note Memory requirements::.
|
||||
|
||||
When compressing, plzip replaces every file given in the command line
|
||||
with a compressed version of itself, with the name "original_name.lz". When
|
||||
|
@@ -109,22 +110,22 @@ filename.tlz becomes filename.tar
|
|||
anyothername becomes anyothername.out
|
||||
|
||||
(De)compressing a file is much like copying or moving it. Therefore plzip
|
||||
preserves the access and modification dates, permissions, and, when
|
||||
possible, ownership of the file just as 'cp -p' does. (If the user ID or
|
||||
the group ID can't be duplicated, the file permission bits S_ISUID and
|
||||
S_ISGID are cleared).
|
||||
preserves the access and modification dates, permissions, and, if you have
|
||||
appropriate privileges, ownership of the file just as 'cp -p' does. (If the
|
||||
user ID or the group ID can't be duplicated, the file permission bits
|
||||
S_ISUID and S_ISGID are cleared).
|
||||
|
||||
Plzip is able to read from some types of non-regular files if either the
|
||||
option '-c' or the option '-o' is specified.
|
||||
|
||||
Plzip will refuse to read compressed data from a terminal or write
|
||||
compressed data to a terminal, as this would be entirely incomprehensible
|
||||
and might leave the terminal in an abnormal state.
|
||||
Plzip refuses to read compressed data from a terminal or write compressed
|
||||
data to a terminal, as this would be entirely incomprehensible and might
|
||||
leave the terminal in an abnormal state.
|
||||
|
||||
Plzip will correctly decompress a file which is the concatenation of two
|
||||
or more compressed files. The result is the concatenation of the
|
||||
corresponding decompressed files. Integrity testing of concatenated
|
||||
compressed files is also supported.
|
||||
Plzip correctly decompresses a file which is the concatenation of two or
|
||||
more compressed files. The result is the concatenation of the corresponding
|
||||
decompressed files. Integrity testing of concatenated compressed files is
|
||||
also supported.
|
||||
|
||||
|
||||
File: plzip.info, Node: Output, Next: Invoking plzip, Prev: Introduction, Up: Top
|
||||
|
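Reviewer note: the renaming table that this hunk touches (its visible rows are "filename.tlz becomes filename.tar" and "anyothername becomes anyothername.out") can be sketched as follows. This is an illustration under assumptions, not plzip's code: `decompressed_name` is a hypothetical name, the ".lz" row comes from the manual's full table rather than from the lines shown here, and the rule that a bare ".lz" or ".tlz" falls through to ".out" is a guess.

```python
def decompressed_name(name):
    """Sketch of how the output name is derived when decompressing.

    Hypothetical helper; follows the manual's table:
      filename.lz   becomes filename
      filename.tlz  becomes filename.tar
      anyothername  becomes anyothername.out
    """
    if name.endswith(".lz") and len(name) > len(".lz"):
        return name[: -len(".lz")]
    if name.endswith(".tlz") and len(name) > len(".tlz"):
        return name[: -len(".tlz")] + ".tar"
    # Assumption: names without a recognized extension get ".out" appended.
    return name + ".out"
```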
@@ -185,7 +186,8 @@ The format for running plzip is:
|
|||
If no file names are specified, plzip compresses (or decompresses) from
|
||||
standard input to standard output. A hyphen '-' used as a FILE argument
|
||||
means standard input. It can be mixed with other FILES and is read just
|
||||
once, the first time it appears in the command line.
|
||||
once, the first time it appears in the command line. Remember to prepend
|
||||
'./' to any file name beginning with a hyphen, or use '--'.
|
||||
|
||||
plzip supports the following options: *Note Argument syntax:
|
||||
(arg_parser)Argument syntax.
|
||||
|
@@ -208,30 +210,32 @@ once, the first time it appears in the command line.
|
|||
'-B BYTES'
|
||||
'--data-size=BYTES'
|
||||
When compressing, set the size in bytes of the input data blocks. The
|
||||
input file will be divided in chunks of this size before compression is
|
||||
input file is divided in chunks of this size before compression is
|
||||
performed. Valid values range from 8 KiB to 1 GiB. Default value is
|
||||
two times the dictionary size, except for option '-0' where it
|
||||
defaults to 1 MiB. Plzip will reduce the dictionary size if it is
|
||||
larger than the data size specified. *Note Minimum file sizes::.
|
||||
defaults to 1 MiB. Plzip reduces the dictionary size if it is larger
|
||||
than the data size specified. *Note Minimum file sizes::.
|
||||
|
||||
'-c'
|
||||
'--stdout'
|
||||
Compress or decompress to standard output; keep input files unchanged.
|
||||
If compressing several files, each file is compressed independently.
|
||||
This option (or '-o') is needed when reading from a named pipe (fifo)
|
||||
or from a device. Use 'lziprecover -cd -i' to recover as much of the
|
||||
decompressed data as possible when decompressing a corrupt file. '-c'
|
||||
overrides '-o'. '-c' has no effect when testing or listing.
|
||||
(The output consists of a sequence of independently compressed
|
||||
members). This option (or '-o') is needed when reading from a named
|
||||
pipe (fifo) or from a device. Use 'lziprecover -cd -i' to recover as
|
||||
much of the decompressed data as possible when decompressing a corrupt
|
||||
file. '-c' overrides '-o'. '-c' has no effect when testing or listing.
|
||||
|
||||
'-d'
|
||||
'--decompress'
|
||||
Decompress the files specified. If a file does not exist, can't be
|
||||
opened, or the destination file already exists and '--force' has not
|
||||
been specified, plzip continues decompressing the rest of the files
|
||||
and exits with error status 1. If a file fails to decompress, or is a
|
||||
terminal, plzip exits immediately with error status 2 without
|
||||
decompressing the rest of the files. A terminal is considered an
|
||||
uncompressed file, and therefore invalid.
|
||||
Decompress the files specified. The integrity of the files specified is
|
||||
checked. If a file does not exist, can't be opened, or the destination
|
||||
file already exists and '--force' has not been specified, plzip
|
||||
continues decompressing the rest of the files and exits with error
|
||||
status 1. If a file fails to decompress, or is a terminal, plzip exits
|
||||
immediately with error status 2 without decompressing the rest of the
|
||||
files. A terminal is considered an uncompressed file, and therefore
|
||||
invalid.
|
||||
|
||||
'-f'
|
||||
'--force'
|
||||
|
@@ -258,18 +262,18 @@ once, the first time it appears in the command line.
|
|||
printed.
|
||||
|
||||
If any file is damaged, does not exist, can't be opened, or is not
|
||||
regular, the final exit status will be > 0. '-lq' can be used to verify
|
||||
regular, the final exit status is > 0. '-lq' can be used to check
|
||||
quickly (without decompressing) the structural integrity of the files
|
||||
specified. (Use '--test' to verify the data integrity). '-alq'
|
||||
additionally verifies that none of the files specified contain
|
||||
trailing data.
|
||||
specified. (Use '--test' to check the data integrity). '-alq'
|
||||
additionally checks that none of the files specified contain trailing
|
||||
data.
|
||||
|
||||
'-m BYTES'
|
||||
'--match-length=BYTES'
|
||||
When compressing, set the match length limit in bytes. After a match
|
||||
this long is found, the search is finished. Valid values range from 5
|
||||
to 273. Larger values usually give better compression ratios but longer
|
||||
compression times.
|
||||
to 273. Larger values usually give better compression ratios but
|
||||
longer compression times.
|
||||
|
||||
'-n N'
|
||||
'--threads=N'
|
||||
|
@@ -291,10 +295,12 @@ once, the first time it appears in the command line.
|
|||
|
||||
'-o FILE'
|
||||
'--output=FILE'
|
||||
If '-c' has not been also specified, write the (de)compressed output to
|
||||
FILE; keep input files unchanged. If compressing several files, each
|
||||
file is compressed independently. This option (or '-c') is needed when
|
||||
reading from a named pipe (fifo) or from a device. '-o -' is
|
||||
If '-c' has not been also specified, write the (de)compressed output
|
||||
to FILE, automatically creating any missing parent directories; keep
|
||||
input files unchanged. If compressing several files, each file is
|
||||
compressed independently. (The output consists of a sequence of
|
||||
independently compressed members). This option (or '-c') is needed
|
||||
when reading from a named pipe (fifo) or from a device. '-o -' is
|
||||
equivalent to '-c'. '-o' has no effect when testing or listing.
|
||||
|
||||
In order to keep backward compatibility with plzip versions prior to
|
||||
|
@@ -311,14 +317,14 @@ once, the first time it appears in the command line.
|
|||
|
||||
'-s BYTES'
|
||||
'--dictionary-size=BYTES'
|
||||
When compressing, set the dictionary size limit in bytes. Plzip will
|
||||
use for each file the largest dictionary size that does not exceed
|
||||
neither the file size nor this limit. Valid values range from 4 KiB to
|
||||
512 MiB. Values 12 to 29 are interpreted as powers of two, meaning
|
||||
2^12 to 2^29 bytes. Dictionary sizes are quantized so that they can be
|
||||
coded in just one byte (*note coded-dict-size::). If the size specified
|
||||
does not match one of the valid sizes, it will be rounded upwards by
|
||||
adding up to (BYTES / 8) to it.
|
||||
When compressing, set the dictionary size limit in bytes. Plzip uses
|
||||
for each file the largest dictionary size that does not exceed neither
|
||||
the file size nor this limit. Valid values range from 4 KiB to 512 MiB.
|
||||
Values 12 to 29 are interpreted as powers of two, meaning 2^12 to 2^29
|
||||
bytes. Dictionary sizes are quantized so that they can be coded in
|
||||
just one byte (*note coded-dict-size::). If the size specified does
|
||||
not match one of the valid sizes, it is rounded upwards by adding up
|
||||
to (BYTES / 8) to it.
|
||||
|
||||
For maximum compression you should use a dictionary size limit as large
|
||||
as possible, but keep in mind that the decompression memory requirement
|
||||
|
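Reviewer note: the '--dictionary-size' text in this hunk says that values 12 to 29 are read as powers of two and that valid sizes range from 4 KiB to 512 MiB. A minimal sketch of that interpretation follows. `dictionary_size_limit` is a hypothetical name, and the sketch deliberately leaves out the quantization step (rounding upwards to a size that can be coded in one byte), which the manual describes separately.

```python
MIN_DICTIONARY_SIZE = 1 << 12  # 4 KiB
MAX_DICTIONARY_SIZE = 1 << 29  # 512 MiB


def dictionary_size_limit(value):
    """Interpret the argument of '-s' as a size limit in bytes.

    Hypothetical helper: values 12 to 29 mean 2^12 to 2^29 bytes; any other
    value must already lie between 4 KiB and 512 MiB. Quantization of the
    result is not modelled here.
    """
    if 12 <= value <= 29:
        return 1 << value
    if MIN_DICTIONARY_SIZE <= value <= MAX_DICTIONARY_SIZE:
        return value
    raise ValueError("dictionary size out of range: %d" % value)
```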
@@ -330,7 +336,7 @@ once, the first time it appears in the command line.
|
|||
really performs a trial decompression and throws away the result. Use
|
||||
it together with '-v' to see information about the files. If a file
|
||||
fails the test, does not exist, can't be opened, or is a terminal,
|
||||
plzip continues checking the rest of the files. A final diagnostic is
|
||||
plzip continues testing the rest of the files. A final diagnostic is
|
||||
shown at verbosity level 1 or higher if any file fails the test when
|
||||
testing multiple files.
|
||||
|
||||
|
@@ -408,26 +414,29 @@ once, the first time it appears in the command line.
|
|||
(lzlib)Library version.
|
||||
|
||||
|
||||
Numbers given as arguments to options may be followed by a multiplier
|
||||
and an optional 'B' for "byte".
|
||||
Numbers given as arguments to options may be expressed in decimal,
|
||||
hexadecimal, or octal (using the same syntax as integer constants in C++),
|
||||
and may be followed by a multiplier and an optional 'B' for "byte".
|
||||
|
||||
Table of SI and binary prefixes (unit multipliers):
|
||||
|
||||
Prefix Value | Prefix Value
|
||||
k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024)
|
||||
M megabyte (10^6) | Mi mebibyte (2^20)
|
||||
G gigabyte (10^9) | Gi gibibyte (2^30)
|
||||
T terabyte (10^12) | Ti tebibyte (2^40)
|
||||
P petabyte (10^15) | Pi pebibyte (2^50)
|
||||
E exabyte (10^18) | Ei exbibyte (2^60)
|
||||
Z zettabyte (10^21) | Zi zebibyte (2^70)
|
||||
Y yottabyte (10^24) | Yi yobibyte (2^80)
|
||||
Prefix Value | Prefix Value
|
||||
k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024)
|
||||
M megabyte (10^6) | Mi mebibyte (2^20)
|
||||
G gigabyte (10^9) | Gi gibibyte (2^30)
|
||||
T terabyte (10^12) | Ti tebibyte (2^40)
|
||||
P petabyte (10^15) | Pi pebibyte (2^50)
|
||||
E exabyte (10^18) | Ei exbibyte (2^60)
|
||||
Z zettabyte (10^21) | Zi zebibyte (2^70)
|
||||
Y yottabyte (10^24) | Yi yobibyte (2^80)
|
||||
R ronnabyte (10^27) | Ri robibyte (2^90)
|
||||
Q quettabyte (10^30) | Qi quebibyte (2^100)
|
||||
|
||||
|
||||
Exit status: 0 for a normal exit, 1 for environmental problems (file not
|
||||
found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid
|
||||
input file, 3 for an internal consistency error (e.g., bug) which caused
|
||||
plzip to panic.
|
||||
found, invalid command-line options, I/O errors, etc), 2 to indicate a
|
||||
corrupt or invalid input file, 3 for an internal consistency error (e.g.,
|
||||
bug) which caused plzip to panic.
|
||||
|
||||
|
||||
File: plzip.info, Node: Program design, Next: Memory requirements, Prev: Invoking plzip, Up: Top
|
||||
|
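Reviewer note: version 1.11 documents that numeric arguments may be decimal, hexadecimal, or octal (C++ integer constant syntax) and extends the multiplier table with R/Ri and Q/Qi. The sketch below shows one way such an argument could be parsed. It is not plzip's parser: `parse_size` is a hypothetical name, and accepting a trailing 'B' with or without a multiplier is an assumption.

```python
import re

# Number in C++ integer constant syntax, optional SI or binary prefix,
# optional 'B'. Assumption: 'B' is accepted even without a multiplier.
_NUMBER = re.compile(
    r"(?P<digits>0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)"
    r"(?P<mult>[kMGTPEZYRQ]|[KMGTPEZYRQ]i)?B?"
)


def parse_size(text):
    """Parse strings such as '16Mi', '1k', '0x10' or '010' into an integer."""
    match = _NUMBER.fullmatch(text)
    if match is None:
        raise ValueError("bad number: %r" % text)
    digits = match.group("digits")
    if digits[:2].lower() == "0x":
        value = int(digits, 16)
    elif len(digits) > 1 and digits[0] == "0":
        value = int(digits, 8)  # a leading zero means octal, as in C++
    else:
        value = int(digits, 10)
    mult = match.group("mult")
    if mult is None:
        return value
    if mult.endswith("i"):
        # Binary prefixes: Ki = 2^10, Mi = 2^20, ..., Qi = 2^100
        return value * 2 ** (10 * ("KMGTPEZYRQ".index(mult[0]) + 1))
    # SI prefixes: k = 10^3, M = 10^6, ..., Q = 10^30
    return value * 10 ** (3 * ("kMGTPEZYRQ".index(mult) + 1))
```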
@@ -441,7 +450,7 @@ multimember compressed file. Each chunk is compressed in-place (using the
|
|||
same buffer for input and output), reducing the amount of RAM required.
|
||||
|
||||
When decompressing, plzip decompresses as many members simultaneously as
|
||||
worker threads are chosen. Files that were compressed with lzip will not be
|
||||
worker threads are chosen. Files that were compressed with lzip are not
|
||||
decompressed faster than using lzip (unless the option '-b' was used)
|
||||
because lzip usually produces single-member files, which can't be
|
||||
decompressed in parallel.
|
||||
|
@@ -535,10 +544,10 @@ multimember compressed file.
|
|||
For this to work as expected (and roughly multiply the compression speed
|
||||
by the number of available processors), the uncompressed file must be at
|
||||
least as large as the number of worker threads times the chunk size (*note
|
||||
--data-size::). Else some processors will not get any data to compress, and
|
||||
compression will be proportionally slower. The maximum speed increase
|
||||
achievable on a given file is limited by the ratio (file_size / data_size).
|
||||
For example, a tarball the size of gcc or linux will scale up to 10 or 14
|
||||
--data-size::). Else some processors do not get any data to compress, and
|
||||
compression is proportionally slower. The maximum speed increase achievable
|
||||
on a given file is limited by the ratio (file_size / data_size). For
|
||||
example, a tarball the size of gcc or linux scales up to 10 or 14
|
||||
processors at level -9.
|
||||
|
||||
The following table shows the minimum uncompressed file size needed for
|
||||
|
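Reviewer note: the paragraph changed in this hunk states two relations: full speed needs a file at least as large as the number of worker threads times the chunk size, and the achievable speedup is limited by the ratio (file_size / data_size). A minimal arithmetic sketch follows. Both function names are hypothetical, and rounding the ratio up to a whole number of chunks is an assumption made for illustration.

```python
def min_size_for_full_speed(threads, data_size):
    """Smallest uncompressed size that gives every worker thread one chunk."""
    return threads * data_size


def max_useful_threads(file_size, data_size):
    """Number of chunks the file splits into (ceiling division).

    Assumption: threads beyond this count get no data to compress.
    """
    return max(1, -(-file_size // data_size))
```

For example, with the 16 MiB data size shown in the man page ("[2x8=16 MiB]"), four threads need a file of at least 64 MiB to stay busy.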
@@ -585,7 +594,7 @@ when there is no longer anything to take away.
|
|||
represents a variable number of bytes.
|
||||
|
||||
|
||||
A lzip file consists of a series of independent "members" (compressed
|
||||
A lzip file consists of one or more independent "members" (compressed
|
||||
data sets). The members simply appear one after another in the file, with no
|
||||
additional information before, between, or after them. Each member can
|
||||
encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
|
||||
|
@@ -629,10 +638,10 @@ size of a multimember file is unlimited.
|
|||
|
||||
'Member size (8 bytes)'
|
||||
Total size of the member, including header and trailer. This field acts
|
||||
as a distributed index, allows the verification of stream integrity,
|
||||
and facilitates the safe recovery of undamaged members from
|
||||
multimember files. Member size should be limited to 2 PiB to prevent
|
||||
the data size field from overflowing.
|
||||
as a distributed index, improves the checking of stream integrity, and
|
||||
facilitates the safe recovery of undamaged members from multimember
|
||||
files. Lzip limits the member size to 2 PiB to prevent the data size
|
||||
field from overflowing.
|
||||
|
||||
|
||||
|
||||
|
@@ -648,12 +657,13 @@ member. Such trailing data may be:
|
|||
example when writing to a tape. It is safe to append any amount of
|
||||
padding zero bytes to a lzip file.
|
||||
|
||||
* Useful data added by the user; a cryptographically secure hash, a
|
||||
* Useful data added by the user; an "End Of File" string (to check that
|
||||
the file has not been truncated), a cryptographically secure hash, a
|
||||
description of file contents, etc. It is safe to append any amount of
|
||||
text to a lzip file as long as none of the first four bytes of the text
|
||||
match the corresponding byte in the string "LZIP", and the text does
|
||||
not contain any zero bytes (null characters). Nonzero bytes and zero
|
||||
bytes can't be safely mixed in trailing data.
|
||||
text to a lzip file as long as none of the first four bytes of the
|
||||
text matches the corresponding byte in the string "LZIP", and the text
|
||||
does not contain any zero bytes (null characters). Nonzero bytes and
|
||||
zero bytes can't be safely mixed in trailing data.
|
||||
|
||||
* Garbage added by some not totally successful copy operation.
|
||||
|
||||
|
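Reviewer note: the rule for user-added trailing text in this hunk (none of the first four bytes may match the corresponding byte of "LZIP", and the text may not contain zero bytes) can be checked mechanically. The following is a minimal sketch; `is_safe_trailing_text` is a hypothetical name and the function only encodes the two conditions quoted in the manual.

```python
def is_safe_trailing_text(data):
    """Check the two conditions the manual gives for trailing text.

    Hypothetical helper: `data` is the bytes object to be appended after the
    last member of a lzip file.
    """
    if 0 in data:
        return False  # zero bytes can't be mixed with nonzero bytes
    # None of the first four bytes may equal the byte at the same
    # position in the magic string "LZIP".
    return all(byte != magic for byte, magic in zip(data[:4], b"LZIP"))
```

The "End Of File" string mentioned in the new text passes this check, since 'E', 'n', 'd' and ' ' all differ from 'L', 'Z', 'I' and 'P' respectively.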
@@ -669,7 +679,7 @@ member. Such trailing data may be:
|
|||
discriminate trailing data from a corrupt header has a Hamming
|
||||
distance (HD) of 3, and the 3 bit flips must happen in different magic
|
||||
bytes for the test to fail. In any case, the option '--trailing-error'
|
||||
guarantees that any corrupt header will be detected.
|
||||
guarantees that any corrupt header is detected.
|
||||
|
||||
Trailing data are in no way part of the lzip file format, but tools
|
||||
reading lzip files are expected to behave as correctly and usefully as
|
||||
|
@@ -689,12 +699,12 @@ File: plzip.info, Node: Examples, Next: Problems, Prev: Trailing data, Up: T
|
|||
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
|
||||
compressed file (bugs in the system libraries, memory errors, etc).
|
||||
Therefore, if the data you are going to compress are important, give the
|
||||
option '--keep' to plzip and don't remove the original file until you
|
||||
verify the compressed file with a command like
|
||||
'plzip -cd file.lz | cmp file -'. Most RAM errors happening during
|
||||
compression can only be detected by comparing the compressed file with the
|
||||
original because the corruption happens before plzip compresses the RAM
|
||||
contents, resulting in a valid compressed file containing wrong data.
|
||||
option '--keep' to plzip and don't remove the original file until you check
|
||||
the compressed file with a command like 'plzip -cd file.lz | cmp file -'.
|
||||
Most RAM errors happening during compression can only be detected by
|
||||
comparing the compressed file with the original because the corruption
|
||||
happens before plzip compresses the RAM contents, resulting in a valid
|
||||
compressed file containing wrong data.
|
||||
|
||||
|
||||
Example 1: Extract all the files from archive 'foo.tar.lz'.
|
||||
|
@@ -722,7 +732,7 @@ the operation is successful, 'file.lz' is removed.
|
|||
plzip -d file.lz
|
||||
|
||||
|
||||
Example 5: Verify the integrity of the compressed file 'file.lz' and show
|
||||
Example 5: Check the integrity of the compressed file 'file.lz' and show
|
||||
status.
|
||||
|
||||
plzip -tv file.lz
|
||||
|
@@ -800,20 +810,20 @@ Concept index
|
|||
Tag Table:
|
||||
Node: Top217
|
||||
Node: Introduction1156
|
||||
Node: Output5829
|
||||
Node: Invoking plzip7392
|
||||
Ref: --trailing-error8187
|
||||
Ref: --data-size8425
|
||||
Node: Program design18819
|
||||
Node: Memory requirements21122
|
||||
Node: Minimum file sizes22807
|
||||
Node: File format24821
|
||||
Ref: coded-dict-size26260
|
||||
Node: Trailing data27514
|
||||
Node: Examples29775
|
||||
Ref: concat-example31210
|
||||
Node: Problems31967
|
||||
Node: Concept index32522
|
||||
Node: Output5934
|
||||
Node: Invoking plzip7497
|
||||
Ref: --trailing-error8372
|
||||
Ref: --data-size8610
|
||||
Node: Program design19519
|
||||
Node: Memory requirements21818
|
||||
Node: Minimum file sizes23503
|
||||
Node: File format25506
|
||||
Ref: coded-dict-size26945
|
||||
Node: Trailing data28195
|
||||
Node: Examples30531
|
||||
Ref: concat-example31964
|
||||
Node: Problems32721
|
||||
Node: Concept index33276
|
||||
|
||||
End Tag Table
|
||||
|
||||
|
|
248
doc/plzip.texi
|
@@ -6,8 +6,8 @@
|
|||
@finalout
|
||||
@c %**end of header
|
||||
|
||||
@set UPDATED 24 January 2022
|
||||
@set VERSION 1.10
|
||||
@set UPDATED 21 January 2024
|
||||
@set VERSION 1.11
|
||||
|
||||
@dircategory Compression
|
||||
@direntry
|
||||
|
@@ -38,7 +38,7 @@ This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
|
|||
@menu
|
||||
* Introduction:: Purpose and features of plzip
|
||||
* Output:: Meaning of plzip's output
|
||||
* Invoking plzip:: Command line interface
|
||||
* Invoking plzip:: Command-line interface
|
||||
* Program design:: Internal structure of plzip
|
||||
* Memory requirements:: Memory required to compress and decompress
|
||||
* Minimum file sizes:: Minimum file sizes required for full speed
|
||||
|
@@ -50,7 +50,7 @@ This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
|
|||
@end menu
|
||||
|
||||
@sp 1
|
||||
Copyright @copyright{} 2009-2022 Antonio Diaz Diaz.
|
||||
Copyright @copyright{} 2009-2024 Antonio Diaz Diaz.
|
||||
|
||||
This manual is free documentation: you have unlimited permission to copy,
|
||||
distribute, and modify it.
|
||||
|
@@ -62,21 +62,22 @@ distribute, and modify it.
|
|||
@cindex introduction
|
||||
|
||||
@uref{http://www.nongnu.org/lzip/plzip.html,,Plzip}
|
||||
is a massively parallel (multi-threaded) implementation of lzip, fully
|
||||
is a massively parallel (multi-threaded) implementation of lzip,
|
||||
compatible with lzip 1.4 or newer. Plzip uses the compression library
|
||||
@uref{http://www.nongnu.org/lzip/lzlib.html,,lzlib}.
|
||||
|
||||
@uref{http://www.nongnu.org/lzip/lzip.html,,Lzip}
|
||||
is a lossless data compressor with a user interface similar to the one
|
||||
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
|
||||
chain-Algorithm' (LZMA) stream format and provides a 3 factor integrity
|
||||
checking to maximize interoperability and optimize safety. Lzip can compress
|
||||
about as fast as gzip @w{(lzip -0)} or compress most files more than bzip2
|
||||
@w{(lzip -9)}. Decompression speed is intermediate between gzip and bzip2.
|
||||
Lzip is better than gzip and bzip2 from a data recovery perspective. Lzip
|
||||
has been designed, written, and tested with great care to replace gzip and
|
||||
bzip2 as the standard general-purpose compressed format for unix-like
|
||||
systems.
|
||||
chain-Algorithm' (LZMA) stream format to maximize interoperability. The
|
||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
||||
on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
|
||||
checking. Lzip can compress about as fast as gzip @w{(lzip -0)} or compress most
|
||||
files more than bzip2 @w{(lzip -9)}. Decompression speed is intermediate between
|
||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
||||
perspective. Lzip has been designed, written, and tested with great care to
|
||||
replace gzip and bzip2 as the standard general-purpose compressed format for
|
||||
Unix-like systems.
|
||||
|
||||
Plzip can compress/decompress large files on multiprocessor machines much
|
||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||
|
@@ -130,9 +131,9 @@ Plzip uses the same well-defined exit status values used by lzip, which
|
|||
makes it safer than compressors returning ambiguous warning values (like
|
||||
gzip) when it is used as a back end for other programs like tar or zutils.
|
||||
|
||||
Plzip will automatically use for each file the largest dictionary size that
|
||||
does not exceed neither the file size nor the limit given. Keep in mind that
|
||||
the decompression memory requirement is affected at compression time by the
|
||||
Plzip automatically uses for each file the largest dictionary size that does
|
||||
not exceed neither the file size nor the limit given. Keep in mind that the
|
||||
decompression memory requirement is affected at compression time by the
|
||||
choice of dictionary size limit. @xref{Memory requirements}.
|
||||
|
||||
When compressing, plzip replaces every file given in the command line
|
||||
@ -147,19 +148,19 @@ file from that of the compressed file as follows:
@end multitable

(De)compressing a file is much like copying or moving it. Therefore plzip
preserves the access and modification dates, permissions, and, when
possible, ownership of the file just as @w{@samp{cp -p}} does. (If the user ID or
the group ID can't be duplicated, the file permission bits S_ISUID and
S_ISGID are cleared).
preserves the access and modification dates, permissions, and, if you have
appropriate privileges, ownership of the file just as @w{@samp{cp -p}} does.
(If the user ID or the group ID can't be duplicated, the file permission
bits S_ISUID and S_ISGID are cleared).

Plzip is able to read from some types of non-regular files if either the
option @samp{-c} or the option @samp{-o} is specified.
option @option{-c} or the option @option{-o} is specified.

Plzip will refuse to read compressed data from a terminal or write compressed
Plzip refuses to read compressed data from a terminal or write compressed
data to a terminal, as this would be entirely incomprehensible and might
leave the terminal in an abnormal state.

Plzip will correctly decompress a file which is the concatenation of two or
Plzip correctly decompresses a file which is the concatenation of two or
more compressed files. The result is the concatenation of the corresponding
decompressed files. Integrity testing of concatenated compressed files is
also supported.
@ -231,7 +232,8 @@ plzip [@var{options}] [@var{files}]
If no file names are specified, plzip compresses (or decompresses) from
standard input to standard output. A hyphen @samp{-} used as a @var{file}
argument means standard input. It can be mixed with other @var{files} and is
read just once, the first time it appears in the command line.
read just once, the first time it appears in the command line. Remember to
prepend @file{./} to any file name beginning with a hyphen, or use @samp{--}.

plzip supports the following
@uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}:
@ -259,30 +261,32 @@ garbage that can be safely ignored. @xref{concat-example}.
@anchor{--data-size}
@item -B @var{bytes}
@itemx --data-size=@var{bytes}
When compressing, set the size in bytes of the input data blocks. The
input file will be divided in chunks of this size before compression is
performed. Valid values range from @w{8 KiB} to @w{1 GiB}. Default value
is two times the dictionary size, except for option @samp{-0} where it
defaults to @w{1 MiB}. Plzip will reduce the dictionary size if it is
larger than the data size specified. @xref{Minimum file sizes}.
When compressing, set the size in bytes of the input data blocks. The input
file is divided in chunks of this size before compression is performed.
Valid values range from @w{8 KiB} to @w{1 GiB}. Default value is two times
the dictionary size, except for option @option{-0} where it defaults to
@w{1 MiB}. Plzip reduces the dictionary size if it is larger than the data
size specified. @xref{Minimum file sizes}.

@item -c
@itemx --stdout
Compress or decompress to standard output; keep input files unchanged. If
compressing several files, each file is compressed independently. This
option (or @samp{-o}) is needed when reading from a named pipe (fifo) or
compressing several files, each file is compressed independently. (The
output consists of a sequence of independently compressed members). This
option (or @option{-o}) is needed when reading from a named pipe (fifo) or
from a device. Use @w{@samp{lziprecover -cd -i}} to recover as much of the
decompressed data as possible when decompressing a corrupt file. @samp{-c}
overrides @samp{-o}. @samp{-c} has no effect when testing or listing.
decompressed data as possible when decompressing a corrupt file. @option{-c}
overrides @option{-o}. @option{-c} has no effect when testing or listing.

@item -d
@itemx --decompress
Decompress the files specified. If a file does not exist, can't be opened,
or the destination file already exists and @samp{--force} has not been
specified, plzip continues decompressing the rest of the files and exits with
error status 1. If a file fails to decompress, or is a terminal, plzip exits
immediately with error status 2 without decompressing the rest of the files.
A terminal is considered an uncompressed file, and therefore invalid.
Decompress the files specified. The integrity of the files specified is
checked. If a file does not exist, can't be opened, or the destination file
already exists and @option{--force} has not been specified, plzip continues
decompressing the rest of the files and exits with error status 1. If a file
fails to decompress, or is a terminal, plzip exits immediately with error
status 2 without decompressing the rest of the files. A terminal is
considered an uncompressed file, and therefore invalid.

@item -f
@itemx --force
@ -302,23 +306,23 @@ Keep (don't delete) input files during compression or decompression.
Print the uncompressed size, compressed size, and percentage saved of the
files specified. Trailing data are ignored. The values produced are correct
even for multimember files. If more than one file is given, a final line
containing the cumulative sizes is printed. With @samp{-v}, the dictionary
containing the cumulative sizes is printed. With @option{-v}, the dictionary
size, the number of members in the file, and the amount of trailing data (if
any) are also printed. With @samp{-vv}, the positions and sizes of each
any) are also printed. With @option{-vv}, the positions and sizes of each
member in multimember files are also printed.

If any file is damaged, does not exist, can't be opened, or is not regular,
the final exit status will be @w{> 0}. @samp{-lq} can be used to verify
quickly (without decompressing) the structural integrity of the files
specified. (Use @samp{--test} to verify the data integrity). @samp{-alq}
additionally verifies that none of the files specified contain trailing data.
the final exit status is @w{> 0}. @option{-lq} can be used to check quickly
(without decompressing) the structural integrity of the files specified.
(Use @option{--test} to check the data integrity). @option{-alq}
additionally checks that none of the files specified contain trailing data.

@item -m @var{bytes}
@itemx --match-length=@var{bytes}
When compressing, set the match length limit in bytes. After a match
this long is found, the search is finished. Valid values range from 5 to
273. Larger values usually give better compression ratios but longer
compression times.
When compressing, set the match length limit in bytes. After a match this
long is found, the search is finished. Valid values range from 5 to 273.
Larger values usually give better compression ratios but longer compression
times.

@item -n @var{n}
@itemx --threads=@var{n}
@ -339,17 +343,19 @@ can find the number of members in a lzip file by running

@item -o @var{file}
@itemx --output=@var{file}
If @samp{-c} has not been also specified, write the (de)compressed output to
@var{file}; keep input files unchanged. If compressing several files, each
file is compressed independently. This option (or @samp{-c}) is needed when
reading from a named pipe (fifo) or from a device. @w{@samp{-o -}} is
equivalent to @samp{-c}. @samp{-o} has no effect when testing or listing.
If @option{-c} has not been also specified, write the (de)compressed output
to @var{file}, automatically creating any missing parent directories; keep
input files unchanged. If compressing several files, each file is compressed
independently. (The output consists of a sequence of independently
compressed members). This option (or @option{-c}) is needed when reading
from a named pipe (fifo) or from a device. @w{@option{-o -}} is equivalent
to @option{-c}. @option{-o} has no effect when testing or listing.

In order to keep backward compatibility with plzip versions prior to 1.9,
when compressing from standard input and no other file names are given, the
extension @samp{.lz} is appended to @var{file} unless it already ends in
@samp{.lz} or @samp{.tlz}. This feature will be removed in a future version
of plzip. Meanwhile, redirection may be used instead of @samp{-o} to write
of plzip. Meanwhile, redirection may be used instead of @option{-o} to write
the compressed output to a file without the extension @samp{.lz} in its
name: @w{@samp{plzip < file > foo}}.

@ -359,14 +365,14 @@ Quiet operation. Suppress all messages.

@item -s @var{bytes}
@itemx --dictionary-size=@var{bytes}
When compressing, set the dictionary size limit in bytes. Plzip will use
for each file the largest dictionary size that does not exceed neither
the file size nor this limit. Valid values range from @w{4 KiB} to
@w{512 MiB}. Values 12 to 29 are interpreted as powers of two, meaning
2^12 to 2^29 bytes. Dictionary sizes are quantized so that they can be
coded in just one byte (@pxref{coded-dict-size}). If the size specified
does not match one of the valid sizes, it will be rounded upwards by
adding up to @w{(@var{bytes} / 8)} to it.
When compressing, set the dictionary size limit in bytes. Plzip uses for
each file the largest dictionary size that does not exceed neither the file
size nor this limit. Valid values range from @w{4 KiB} to @w{512 MiB}.
Values 12 to 29 are interpreted as powers of two, meaning 2^12 to 2^29
bytes. Dictionary sizes are quantized so that they can be coded in just one
byte (@pxref{coded-dict-size}). If the size specified does not match one of
the valid sizes, it is rounded upwards by adding up to @w{(@var{bytes} / 8)}
to it.

For maximum compression you should use a dictionary size limit as large
as possible, but keep in mind that the decompression memory requirement
@ -376,11 +382,11 @@ is affected at compression time by the choice of dictionary size limit.
@itemx --test
Check integrity of the files specified, but don't decompress them. This
really performs a trial decompression and throws away the result. Use it
together with @samp{-v} to see information about the files. If a file
together with @option{-v} to see information about the files. If a file
fails the test, does not exist, can't be opened, or is a terminal, plzip
continues checking the rest of the files. A final diagnostic is shown at
verbosity level 1 or higher if any file fails the test when testing
multiple files.
continues testing the rest of the files. A final diagnostic is shown at
verbosity level 1 or higher if any file fails the test when testing multiple
files.

@item -v
@itemx --verbose
@ -390,24 +396,24 @@ processed.@*
When decompressing or testing, further -v's (up to 4) increase the
verbosity level, showing status, compression ratio, dictionary size,
decompressed size, and compressed size.@*
Two or more @samp{-v} options show the progress of (de)compression,
Two or more @option{-v} options show the progress of (de)compression,
except for single-member files.

@item -0 .. -9
Compression level. Set the compression parameters (dictionary size and
match length limit) as shown in the table below. The default compression
level is @samp{-6}, equivalent to @w{@samp{-s8MiB -m36}}. Note that
@samp{-9} can be much slower than @samp{-0}. These options have no
level is @option{-6}, equivalent to @w{@option{-s8MiB -m36}}. Note that
@option{-9} can be much slower than @option{-0}. These options have no
effect when decompressing, testing, or listing.

The bidimensional parameter space of LZMA can't be mapped to a linear
scale optimal for all files. If your files are large, very repetitive,
etc, you may need to use the options @samp{--dictionary-size} and
@samp{--match-length} directly to achieve optimal performance.
The bidimensional parameter space of LZMA can't be mapped to a linear scale
optimal for all files. If your files are large, very repetitive, etc, you
may need to use the options @option{--dictionary-size} and
@option{--match-length} directly to achieve optimal performance.

If several compression levels or @samp{-s} or @samp{-m} options are
given, the last setting is used. For example @w{@samp{-9 -s64MiB}} is
equivalent to @w{@samp{-s64MiB -m273}}
If several compression levels or @option{-s} or @option{-m} options are
given, the last setting is used. For example @w{@option{-9 -s64MiB}} is
equivalent to @w{@option{-s64MiB -m273}}

@multitable {Level} {Dictionary size (-s)} {Match length limit (-m)}
@item Level @tab Dictionary size (-s) @tab Match length limit (-m)
@ -461,28 +467,31 @@ and the value of LZ_API_VERSION (if defined).

@end table

Numbers given as arguments to options may be followed by a multiplier
and an optional @samp{B} for "byte".
Numbers given as arguments to options may be expressed in decimal,
hexadecimal, or octal (using the same syntax as integer constants in C++),
and may be followed by a multiplier and an optional @samp{B} for "byte".

Table of SI and binary prefixes (unit multipliers):

@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
@item Prefix @tab Value @tab | @tab Prefix @tab Value
@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
@item R @tab ronnabyte (10^27) @tab | @tab Ri @tab robibyte (2^90)
@item Q @tab quettabyte (10^30) @tab | @tab Qi @tab quebibyte (2^100)
@end multitable
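
As an illustration of the number syntax and the prefix table above, the following Python sketch turns an option argument such as @samp{8MiB} into a byte count. It is not plzip's parser: the helper name @code{parse_size} is hypothetical, and accepting a bare @samp{B} with no multiplier is an assumption of this sketch.

```python
import re

# Multipliers copied from the prefix table above (SI and binary).
SI = {p: 1000 ** (i + 1) for i, p in enumerate("kMGTPEZYRQ")}
BINARY = {p + "i": 1024 ** (i + 1) for i, p in enumerate("KMGTPEZYRQ")}
MULTIPLIERS = {**SI, **BINARY}


def parse_size(text):
    """Hypothetical helper: convert a string such as '8MiB' to bytes."""
    m = re.fullmatch(r"(0[xX][0-9a-fA-F]+|[0-9]+)([A-Za-z]*)", text)
    if m is None:
        raise ValueError("bad numerical argument: " + text)
    digits, suffix = m.groups()
    # C++-style integer constants: 0x.. is hexadecimal, a leading 0 is octal.
    if digits[:2].lower() == "0x":
        value = int(digits, 16)
    elif len(digits) > 1 and digits[0] == "0":
        value = int(digits, 8)
    else:
        value = int(digits, 10)
    # Optional trailing 'B' for "byte" (a bare 'B' is accepted here by assumption).
    if suffix.endswith("B"):
        suffix = suffix[:-1]
    if suffix:
        if suffix not in MULTIPLIERS:
            raise ValueError("bad multiplier in: " + text)
        value *= MULTIPLIERS[suffix]
    return value
```

With this sketch, @samp{8MiB} and @samp{0x800000} both give 8388608, while @samp{64k} gives 64000 because @samp{k} is the SI prefix.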

@sp 1
Exit status: 0 for a normal exit, 1 for environmental problems (file not
found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid
input file, 3 for an internal consistency error (e.g., bug) which caused
plzip to panic.
Exit status: 0 for a normal exit, 1 for environmental problems
(file not found, invalid command-line options, I/O errors, etc), 2 to
indicate a corrupt or invalid input file, 3 for an internal consistency
error (e.g., bug) which caused plzip to panic.
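
A program that uses plzip as a back end might interpret the exit status values listed above roughly as follows. This is only a sketch: the names @code{describe_exit_status} and @code{check_archive} are hypothetical, and @code{check_archive} assumes that a @samp{plzip} executable is available in the search path.

```python
import subprocess

# Meanings taken from the exit status paragraph above.
EXIT_STATUS = {
    0: "normal exit",
    1: "environmental problem",
    2: "corrupt or invalid input file",
    3: "internal consistency error",
}


def describe_exit_status(code):
    """Map a plzip exit status to the meaning documented above."""
    return EXIT_STATUS.get(code, "status not documented here")


def check_archive(path):
    """Hypothetical wrapper: test one file quietly and explain the result.

    Raises FileNotFoundError if no 'plzip' executable can be found.
    """
    result = subprocess.run(["plzip", "--test", "--quiet", path])
    return result.returncode, describe_exit_status(result.returncode)
```

Because the values are unambiguous, the caller can tell a damaged file (status 2) from a missing one (status 1) without parsing any messages.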


@node Program design
@ -495,8 +504,8 @@ multimember compressed file. Each chunk is compressed in-place (using the
same buffer for input and output), reducing the amount of RAM required.

When decompressing, plzip decompresses as many members simultaneously as
worker threads are chosen. Files that were compressed with lzip will not
be decompressed faster than using lzip (unless the option @samp{-b} was used)
worker threads are chosen. Files that were compressed with lzip are not
decompressed faster than using lzip (unless the option @option{-b} was used)
because lzip usually produces single-member files, which can't be
decompressed in parallel.

@ -600,14 +609,14 @@ When compressing, plzip divides the input file into chunks and
compresses as many chunks simultaneously as worker threads are chosen,
creating a multimember compressed file.

For this to work as expected (and roughly multiply the compression speed
by the number of available processors), the uncompressed file must be at
least as large as the number of worker threads times the chunk size
(@pxref{--data-size}). Else some processors will not get any data to
compress, and compression will be proportionally slower. The maximum
speed increase achievable on a given file is limited by the ratio
@w{(file_size / data_size)}. For example, a tarball the size of gcc or
linux will scale up to 10 or 14 processors at level -9.
For this to work as expected (and roughly multiply the compression speed by
the number of available processors), the uncompressed file must be at least
as large as the number of worker threads times the chunk size
(@pxref{--data-size}). Else some processors do not get any data to compress,
and compression is proportionally slower. The maximum speed increase
achievable on a given file is limited by the ratio
@w{(file_size / data_size)}. For example, a tarball the size of gcc or linux
scales up to 10 or 14 processors at level -9.
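
The arithmetic behind this limit can be sketched in a few lines of Python. The function names are hypothetical; the only facts used are that the default data size is two times the dictionary size and that a worker thread needs a chunk of its own to be useful.

```python
MIB = 1 << 20


def default_data_size(dictionary_size):
    """Default chunk size: two times the dictionary size (not valid for -0)."""
    return 2 * dictionary_size


def chunk_count(file_size, data_size):
    """Number of chunks the input is divided into (integer ceiling)."""
    return max(1, (file_size + data_size - 1) // data_size)


def min_file_size(threads, data_size):
    """Smallest input that gives every worker thread a full chunk."""
    return threads * data_size


# Level -6 uses an 8 MiB dictionary, so chunks are 16 MiB by default.
data_size = default_data_size(8 * MIB)
chunks = chunk_count(100 * MIB, data_size)
needed = min_file_size(4, data_size)
```

Here a 100 MiB file yields 7 chunks, so threads beyond the seventh get no data, and four threads need at least 64 MiB of input to be fully used.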

The following table shows the minimum uncompressed file size needed for
full use of N processors at a given compression level, using the default
@ -657,7 +666,7 @@ represents one byte; a box like this:
represents a variable number of bytes.

@sp 1
A lzip file consists of a series of independent "members" (compressed data
A lzip file consists of one or more independent "members" (compressed data
sets). The members simply appear one after another in the file, with no
additional information before, between, or after them. Each member can
encode in compressed form up to @w{16 EiB - 1 byte} of uncompressed data.
@ -711,10 +720,10 @@ Size of the original uncompressed data.

@item Member size (8 bytes)
Total size of the member, including header and trailer. This field acts
as a distributed index, allows the verification of stream integrity, and
as a distributed index, improves the checking of stream integrity, and
facilitates the safe recovery of undamaged members from multimember files.
Member size should be limited to @w{2 PiB} to prevent the data size field
from overflowing.
Lzip limits the member size to @w{2 PiB} to prevent the data size field from
overflowing.

@end table

@ -733,12 +742,13 @@ example when writing to a tape. It is safe to append any amount of
padding zero bytes to a lzip file.

@item
Useful data added by the user; a cryptographically secure hash, a
description of file contents, etc. It is safe to append any amount of
text to a lzip file as long as none of the first four bytes of the text
match the corresponding byte in the string "LZIP", and the text does not
contain any zero bytes (null characters). Nonzero bytes and zero bytes
can't be safely mixed in trailing data.
Useful data added by the user; an "End Of File" string (to check that the
file has not been truncated), a cryptographically secure hash, a description
of file contents, etc. It is safe to append any amount of text to a lzip
file as long as none of the first four bytes of the text matches the
corresponding byte in the string "LZIP", and the text does not contain any
zero bytes (null characters). Nonzero bytes and zero bytes can't be safely
mixed in trailing data.

@item
Garbage added by some not totally successful copy operation.
@ -756,8 +766,8 @@ integrity information itself. Therefore it can be considered to be below
the noise level. Additionally, the test used by plzip to discriminate
trailing data from a corrupt header has a Hamming distance (HD) of 3,
and the 3 bit flips must happen in different magic bytes for the test to
fail. In any case, the option @samp{--trailing-error} guarantees that
any corrupt header will be detected.
fail. In any case, the option @option{--trailing-error} guarantees that
any corrupt header is detected.
@end itemize
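
The rule for user-added text in the list above (no byte among the first four may equal the byte at the same position in "LZIP", and no zero bytes anywhere) can be checked before appending. The function below is a minimal sketch with a hypothetical name; it does not reproduce the test plzip itself applies to trailing data.

```python
MAGIC = b"LZIP"


def is_safe_trailing_text(data):
    """Return True if 'data' follows the rule for user-added trailing text."""
    # Zero bytes must not be mixed with nonzero bytes in trailing data.
    if b"\x00" in data:
        return False
    # Compare each of the first four bytes with the same position in "LZIP".
    return all(a != b for a, b in zip(data[:4], MAGIC))
```

For example, @samp{EOF} followed by a newline passes the check, while text starting with @samp{Last} fails because its first byte matches the @samp{L} of the magic string.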

Trailing data are in no way part of the lzip file format, but tools
@ -767,7 +777,7 @@ possible in the presence of trailing data.
Trailing data can be safely ignored in most cases. In some cases, like
that of user-added data, they are expected to be ignored. In those cases
where a file containing trailing data must be rejected, the option
@samp{--trailing-error} can be used. @xref{--trailing-error}.
@option{--trailing-error} can be used. @xref{--trailing-error}.


@node Examples
@ -777,8 +787,8 @@ where a file containing trailing data must be rejected, the option
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
compressed file (bugs in the system libraries, memory errors, etc).
Therefore, if the data you are going to compress are important, give the
option @samp{--keep} to plzip and don't remove the original file until you
verify the compressed file with a command like
option @option{--keep} to plzip and don't remove the original file until you
check the compressed file with a command like
@w{@samp{plzip -cd file.lz | cmp file -}}. Most RAM errors happening during
compression can only be detected by comparing the compressed file with the
original because the corruption happens before plzip compresses the RAM
@ -823,7 +833,7 @@ plzip -d file.lz

@sp 1
@noindent
Example 5: Verify the integrity of the compressed file @samp{file.lz} and
Example 5: Check the integrity of the compressed file @samp{file.lz} and
show status.

@example