Adding upstream version 1.8.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent da5ddefa70
commit b8ee6d8c5a
22 changed files with 614 additions and 336 deletions
20
doc/clzip.1
@@ -1,5 +1,5 @@
 .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.46.1.
-.TH CLZIP "1" "July 2015" "clzip 1.7" "User Commands"
+.TH CLZIP "1" "May 2016" "clzip 1.8" "User Commands"
 .SH NAME
 clzip \- reduces the size of files
 .SH SYNOPSIS
@@ -15,11 +15,14 @@ display this help and exit
 \fB\-V\fR, \fB\-\-version\fR
 output version information and exit
 .TP
+\fB\-a\fR, \fB\-\-trailing\-error\fR
+exit with error status if trailing data
+.TP
 \fB\-b\fR, \fB\-\-member\-size=\fR<bytes>
 set member size limit in bytes
 .TP
 \fB\-c\fR, \fB\-\-stdout\fR
-send output to standard output
+write to standard output, keep input files
 .TP
 \fB\-d\fR, \fB\-\-decompress\fR
 decompress
@@ -37,7 +40,7 @@ keep (don't delete) input files
 set match length limit in bytes [36]
 .TP
 \fB\-o\fR, \fB\-\-output=\fR<file>
-if reading stdin, place the output into <file>
+if reading standard input, write to <file>
 .TP
 \fB\-q\fR, \fB\-\-quiet\fR
 suppress all messages
@@ -63,13 +66,16 @@ alias for \fB\-0\fR
 \fB\-\-best\fR
 alias for \fB\-9\fR
 .PP
-If no file names are given, clzip compresses or decompresses
-from standard input to standard output.
+If no file names are given, or if a file is '\-', clzip compresses or
+decompresses from standard input to standard output.
 Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,
 Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...
+Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12
+to 2^29 bytes.
+.PP
 The bidimensional parameter space of LZMA can't be mapped to a linear
 scale optimal for all files. If your files are large, very repetitive,
-etc, you may need to use the \fB\-\-match\-length\fR and \fB\-\-dictionary\-size\fR
+etc, you may need to use the \fB\-\-dictionary\-size\fR and \fB\-\-match\-length\fR
 options directly to achieve optimal performance.
 .PP
 Exit status: 0 for a normal exit, 1 for environmental problems (file
@@ -81,7 +87,7 @@ Report bugs to lzip\-bug@nongnu.org
 .br
 Clzip home page: http://www.nongnu.org/lzip/clzip.html
 .SH COPYRIGHT
-Copyright \(co 2015 Antonio Diaz Diaz.
+Copyright \(co 2016 Antonio Diaz Diaz.
 License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
 .br
 This is free software: you are free to change and redistribute it.
189
doc/clzip.info
@@ -11,7 +11,7 @@ File: clzip.info, Node: Top, Next: Introduction, Up: (dir)
 Clzip Manual
 ************

-This manual is for Clzip (version 1.7, 7 July 2015).
+This manual is for Clzip (version 1.8, 13 May 2016).

 * Menu:

@@ -19,12 +19,13 @@ This manual is for Clzip (version 1.7, 7 July 2015).
 * Invoking clzip:: Command line interface
 * File format:: Detailed format of the compressed file
 * Algorithm:: How clzip compresses the data
+* Trailing data:: Extra data appended to the file
 * Examples:: A small tutorial with examples
 * Problems:: Reporting bugs
 * Concept index:: Index of concepts


-Copyright (C) 2010-2015 Antonio Diaz Diaz.
+Copyright (C) 2010-2016 Antonio Diaz Diaz.

 This manual is free documentation: you have unlimited permission to
 copy, distribute and modify it.
@@ -53,7 +54,7 @@ availability:
 recovery means. The lziprecover program can repair bit-flip errors
 (one of the most common forms of data corruption) in lzip files,
 and provides data recovery capabilities, including error-checked
-merging of damaged copies of a file. *note Data safety:
+merging of damaged copies of a file. *Note Data safety:
 (lziprecover)Data safety.

 * The lzip format is as simple as possible (but not simpler). The
@@ -73,15 +74,14 @@ corrupt byte near the beginning is a thing of the past.

 The member trailer stores the 32-bit CRC of the original data, the
 size of the original data and the size of the member. These values,
-together with the value remaining in the range decoder and the
-end-of-stream marker, provide a 4 factor integrity checking which
-guarantees that the decompressed version of the data is identical to
-the original. This guards against corruption of the compressed data,
-and against undetected bugs in clzip (hopefully very unlikely). The
-chances of data corruption going undetected are microscopic. Be aware,
-though, that the check occurs upon decompression, so it can only tell
-you that something is wrong. It can't help you recover the original
-uncompressed data.
+together with the end-of-stream marker, provide a 3 factor integrity
+checking which guarantees that the decompressed version of the data is
+identical to the original. This guards against corruption of the
+compressed data, and against undetected bugs in clzip (hopefully very
+unlikely). The chances of data corruption going undetected are
+microscopic. Be aware, though, that the check occurs upon
+decompression, so it can only tell you that something is wrong. It
+can't help you recover the original uncompressed data.

 Clzip uses the same well-defined exit status values used by lzip and
 bzip2, which makes it safer than compressors returning ambiguous warning
@@ -128,14 +128,14 @@ two or more compressed files. The result is the concatenation of the
 corresponding uncompressed files. Integrity testing of concatenated
 compressed files is also supported.

-Clzip can produce multi-member files and safely recover, with
+Clzip can produce multimember files and safely recover, with
 lziprecover, the undamaged members in case of file damage. Clzip can
 also split the compressed output in volumes of a given size, even when
 reading from standard input. This allows the direct creation of
 multivolume compressed tar archives.

 Clzip is able to compress and decompress streams of unlimited size by
-automatically creating multi-member output. The members so created are
+automatically creating multimember output. The members so created are
 large, about 2 PiB each.


@@ -148,6 +148,10 @@ The format for running clzip is:

 clzip [OPTIONS] [FILES]

+'-' used as a FILE argument means standard input. It can be mixed with
+other FILES and is read just once, the first time it appears in the
+command line.
+
 Clzip supports the following options:

 '-h'
@@ -158,6 +162,13 @@ The format for running clzip is:
 '--version'
 Print the version number of clzip on the standard output and exit.

+'-a'
+'--trailing-error'
+Exit with error status 2 if any remaining input is detected after
+decompressing the last member. Such remaining input is usually
+trailing garbage that can be safely ignored. *Note
+concat-example::.
+
 '-b BYTES'
 '--member-size=BYTES'
 Set the member size limit to BYTES. A small member size may
@@ -166,14 +177,19 @@ The format for running clzip is:

 '-c'
 '--stdout'
-Compress or decompress to standard output. Needed when reading
-from a named pipe (fifo) or from a device. Use it to recover as
-much of the uncompressed data as possible when decompressing a
-corrupt file.
+Compress or decompress to standard output; keep input files
+unchanged. If compressing several files, each file is compressed
+independently. This option is needed when reading from a named
+pipe (fifo) or from a device. Use it also to recover as much of
+the uncompressed data as possible when decompressing a corrupt
+file.

 '-d'
 '--decompress'
-Decompress.
+Decompress the specified file(s). If a file does not exist or
+can't be opened, clzip continues decompressing the rest of the
+files. If a file fails to decompress, clzip exits immediately
+without decompressing the rest of the files.

 '-f'
 '--force'
@@ -211,12 +227,13 @@ The format for running clzip is:

 '-s BYTES'
 '--dictionary-size=BYTES'
-Set the dictionary size limit in bytes. Valid values range from 4
-KiB to 512 MiB. Clzip will use the smallest possible dictionary
-size for each file without exceeding this limit. Note that
-dictionary sizes are quantized. If the specified size does not
-match one of the valid sizes, it will be rounded upwards by adding
-up to (BYTES / 16) to it.
+Set the dictionary size limit in bytes. Clzip will use the smallest
+possible dictionary size for each file without exceeding this
+limit. Valid values range from 4 KiB to 512 MiB. Values 12 to 29
+are interpreted as powers of two, meaning 2^12 to 2^29 bytes. Note
+that dictionary sizes are quantized. If the specified size does
+not match one of the valid sizes, it will be rounded upwards by
+adding up to (BYTES / 8) to it.

 For maximum compression you should use a dictionary size limit as
 large as possible, but keep in mind that the decompression memory
@@ -228,16 +245,17 @@ The format for running clzip is:
 Split the compressed output into several volume files with names
 'original_name00001.lz', 'original_name00002.lz', etc, and set the
 volume size limit to BYTES. Each volume is a complete, maybe
-multi-member, lzip file. A small volume size may degrade
-compression ratio, so use it only when needed. Valid values range
-from 100 kB to 4 EiB.
+multimember, lzip file. A small volume size may degrade compression
+ratio, so use it only when needed. Valid values range from 100 kB
+to 4 EiB.

 '-t'
 '--test'
 Check integrity of the specified file(s), but don't decompress
 them. This really performs a trial decompression and throws away
 the result. Use it together with '-v' to see information about
-the file.
+the file(s). If a file fails the test, clzip continues checking
+the rest of the files.

 '-v'
 '--verbose'
@@ -246,18 +264,19 @@ The format for running clzip is:
 processed. A second '-v' shows the progress of compression.
 When decompressing or testing, further -v's (up to 4) increase the
 verbosity level, showing status, compression ratio, dictionary
-size, and trailer contents (CRC, data size, member size).
+size, trailer contents (CRC, data size, member size), and up to 6
+bytes of trailing data (if any).

 '-0 .. -9'
 Set the compression parameters (dictionary size and match length
-limit) as shown in the table below. Note that '-9' can be much
-slower than '-0'. These options have no effect when decompressing.
+limit) as shown in the table below. The default compression level
+is '-6'. Note that '-9' can be much slower than '-0'. These
+options have no effect when decompressing.

 The bidimensional parameter space of LZMA can't be mapped to a
 linear scale optimal for all files. If your files are large, very
-repetitive, etc, you may need to use the '--match-length' and
-'--dictionary-size' options directly to achieve optimal
-performance.
+repetitive, etc, you may need to use the '--dictionary-size' and
+'--match-length' options directly to achieve optimal performance.

 Level Dictionary size Match length limit
 -0 64 KiB 16 bytes
@@ -327,12 +346,12 @@ additional information before, between, or after them.

 Each member has the following structure:
 +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-| ID string | VN | DS | Lzma stream | CRC32 | Data size | Member size |
+| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
 +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 All multibyte values are stored in little endian order.

-'ID string'
+'ID string (the "magic" bytes)'
 A four byte string, identifying the lzip format, with the value
 "LZIP" (0x4C, 0x5A, 0x49, 0x50).

@@ -350,8 +369,8 @@ additional information before, between, or after them.
 Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
 Valid values for dictionary size range from 4 KiB to 512 MiB.

-'Lzma stream'
-The lzma stream, finished by an end of stream marker. Uses default
+'LZMA stream'
+The LZMA stream, finished by an end of stream marker. Uses default
 values for encoder properties. *Note Stream format: (lzip)Stream
 format, for a complete description.

@@ -365,11 +384,11 @@ additional information before, between, or after them.
 Total size of the member, including header and trailer. This field
 acts as a distributed index, allows the verification of stream
 integrity, and facilitates safe recovery of undamaged members from
-multi-member files.
+multimember files.



-File: clzip.info, Node: Algorithm, Next: Examples, Prev: File format, Up: Top
+File: clzip.info, Node: Algorithm, Next: Trailing data, Prev: File format, Up: Top

 4 Algorithm
 ***********
@@ -435,15 +454,48 @@ range encoding), Igor Pavlov (for putting all the above together in
 LZMA), and Julian Seward (for bzip2's CLI).


-File: clzip.info, Node: Examples, Next: Problems, Prev: Algorithm, Up: Top
+File: clzip.info, Node: Trailing data, Next: Examples, Prev: Algorithm, Up: Top

-5 A small tutorial with examples
+5 Extra data appended to the file
+*********************************
+
+Sometimes extra data is found appended to a lzip file after the last
+member. Such trailing data may be:
+
+* Padding added to make the file size a multiple of some block size,
+for example when writing to a tape.
+
+* Garbage added by some not totally successful copy operation.
+
+* Useful data added by the user; a cryptographically secure hash, a
+description of file contents, etc.
+
+* Malicious data added to the file in order to make its total size
+and hash value (for a chosen hash) coincide with those of another
+file.
+
+* In very rare cases, trailing data could be the corrupt header of
+another member. In multimember or concatenated files the
+probability of corruption happening in the magic bytes is 5 times
+smaller than the probability of getting a false positive caused by
+the corruption of the integrity information itself. Therefore it
+can be considered to be below the noise level.
+
+Trailing data can be safely ignored in most cases. In some cases,
+like that of user-added data, it is expected to be ignored. In those
+cases where a file containing trailing data must be rejected, the option
+'--trailing-error' can be used. *Note --trailing-error::.
+
+
+File: clzip.info, Node: Examples, Next: Problems, Prev: Trailing data, Up: Top
+
+6 A small tutorial with examples
 ********************************

 WARNING! Even if clzip is bug-free, other causes may result in a corrupt
 compressed file (bugs in the system libraries, memory errors, etc).
 Therefore, if the data you are going to compress are important, give the
-'--keep' option to clzip and do not remove the original file until you
+'--keep' option to clzip and don't remove the original file until you
 verify the compressed file with a command like
 'clzip -cd file.lz | cmp file -'.

@@ -454,8 +506,8 @@ and show the compression ratio.
 clzip -v file


-Example 2: Like example 1 but the created 'file.lz' is multi-member
-with a member size of 1 MiB. The compression ratio is not shown.
+Example 2: Like example 1 but the created 'file.lz' is multimember with
+a member size of 1 MiB. The compression ratio is not shown.

 clzip -b 1MiB file

@@ -472,37 +524,46 @@ show status.
 clzip -tv file.lz


-Example 5: Compress a whole floppy in /dev/fd0 and send the output to
+Example 5: Compress a whole device in /dev/sdc and send the output to
 'file.lz'.

-clzip -c /dev/fd0 > file.lz
+clzip -c /dev/sdc > file.lz


-Example 6: Decompress 'file.lz' partially until 10 KiB of decompressed
+Example 6: The right way of concatenating compressed files. *Note
+Trailing data::.
+
+Don't do this
+cat file1.lz file2.lz file3.lz | clzip -d
+Do this instead
+clzip -cd file1.lz file2.lz file3.lz
+
+
+Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed
 data are produced.

 clzip -cd file.lz | dd bs=1024 count=10


-Example 7: Decompress 'file.lz' partially from decompressed byte 10000
+Example 8: Decompress 'file.lz' partially from decompressed byte 10000
 to decompressed byte 15000 (5000 bytes are produced).

 clzip -cd file.lz | dd bs=1000 skip=10 count=5


-Example 8: Create a multivolume compressed tar archive with a volume
+Example 9: Create a multivolume compressed tar archive with a volume
 size of 1440 KiB.

 tar -c some_directory | clzip -S 1440KiB -o volume_name


-Example 9: Extract a multivolume compressed tar archive.
+Example 10: Extract a multivolume compressed tar archive.

 clzip -cd volume_name*.lz | tar -xf -


-Example 10: Create a multivolume compressed backup of a large database
-file with a volume size of 650 MB, where each volume is a multi-member
+Example 11: Create a multivolume compressed backup of a large database
+file with a volume size of 650 MB, where each volume is a multimember
 file with a member size of 32 MiB.

 clzip -b 32MiB -S 650MB big_db
@@ -510,7 +571,7 @@ file with a member size of 32 MiB.

 File: clzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top

-6 Reporting bugs
+7 Reporting bugs
 ****************

 There are probably bugs in clzip. There are certainly errors and
@@ -539,6 +600,7 @@ Concept index
 * introduction: Introduction. (line 6)
 * invoking: Invoking clzip. (line 6)
 * options: Invoking clzip. (line 6)
+* trailing data: Trailing data. (line 6)
 * usage: Invoking clzip. (line 6)
 * version: Invoking clzip. (line 6)

@@ -546,13 +608,16 @@ Concept index

 Tag Table:
 Node: Top210
-Node: Introduction893
-Node: Invoking clzip6152
-Node: File format11705
-Node: Algorithm14108
-Node: Examples16933
-Node: Problems18900
-Node: Concept index19426
+Node: Introduction952
+Node: Invoking clzip6164
+Ref: --trailing-error6730
+Node: File format12728
+Node: Algorithm15150
+Node: Trailing data17980
+Node: Examples19355
+Ref: concat-example20537
+Node: Problems21544
+Node: Concept index22070

 End Tag Table

165
doc/clzip.texi
@@ -6,8 +6,8 @@
 @finalout
 @c %**end of header

-@set UPDATED 7 July 2015
-@set VERSION 1.7
+@set UPDATED 13 May 2016
+@set VERSION 1.8

 @dircategory Data Compression
 @direntry
@@ -39,13 +39,14 @@ This manual is for Clzip (version @value{VERSION}, @value{UPDATED}).
 * Invoking clzip:: Command line interface
 * File format:: Detailed format of the compressed file
 * Algorithm:: How clzip compresses the data
+* Trailing data:: Extra data appended to the file
 * Examples:: A small tutorial with examples
 * Problems:: Reporting bugs
 * Concept index:: Index of concepts
 @end menu

 @sp 1
-Copyright @copyright{} 2010-2015 Antonio Diaz Diaz.
+Copyright @copyright{} 2010-2016 Antonio Diaz Diaz.

 This manual is free documentation: you have unlimited permission
 to copy, distribute and modify it.
@@ -78,7 +79,7 @@ program can repair bit-flip errors (one of the most common forms of data
 corruption) in lzip files, and provides data recovery capabilities,
 including error-checked merging of damaged copies of a file.
 @ifnothtml
-@ref{Data safety,,,lziprecover}.
+@xref{Data safety,,,lziprecover}.
 @end ifnothtml

 @item
@@ -101,14 +102,14 @@ corrupt byte near the beginning is a thing of the past.

 The member trailer stores the 32-bit CRC of the original data, the size
 of the original data and the size of the member. These values, together
-with the value remaining in the range decoder and the end-of-stream
-marker, provide a 4 factor integrity checking which guarantees that the
-decompressed version of the data is identical to the original. This
-guards against corruption of the compressed data, and against undetected
-bugs in clzip (hopefully very unlikely). The chances of data corruption
-going undetected are microscopic. Be aware, though, that the check
-occurs upon decompression, so it can only tell you that something is
-wrong. It can't help you recover the original uncompressed data.
+with the end-of-stream marker, provide a 3 factor integrity checking
+which guarantees that the decompressed version of the data is identical
+to the original. This guards against corruption of the compressed data,
+and against undetected bugs in clzip (hopefully very unlikely). The
+chances of data corruption going undetected are microscopic. Be aware,
+though, that the check occurs upon decompression, so it can only tell
+you that something is wrong. It can't help you recover the original
+uncompressed data.

 Clzip uses the same well-defined exit status values used by lzip and
 bzip2, which makes it safer than compressors returning ambiguous warning
@@ -157,14 +158,14 @@ or more compressed files. The result is the concatenation of the
 corresponding uncompressed files. Integrity testing of concatenated
 compressed files is also supported.

-Clzip can produce multi-member files and safely recover, with
+Clzip can produce multimember files and safely recover, with
 lziprecover, the undamaged members in case of file damage. Clzip can
 also split the compressed output in volumes of a given size, even when
 reading from standard input. This allows the direct creation of
 multivolume compressed tar archives.

 Clzip is able to compress and decompress streams of unlimited size by
-automatically creating multi-member output. The members so created are
+automatically creating multimember output. The members so created are
 large, about 2 PiB each.


@@ -181,6 +182,11 @@ The format for running clzip is:
 clzip [@var{options}] [@var{files}]
 @end example

+@noindent
+@samp{-} used as a @var{file} argument means standard input. It can be
+mixed with other @var{files} and is read just once, the first time it
+appears in the command line.
+
 Clzip supports the following options:

 @table @code
@@ -192,6 +198,13 @@ Print an informative help message describing the options and exit.
 @itemx --version
 Print the version number of clzip on the standard output and exit.

+@anchor{--trailing-error}
+@item -a
+@itemx --trailing-error
+Exit with error status 2 if any remaining input is detected after
+decompressing the last member. Such remaining input is usually trailing
+garbage that can be safely ignored. @xref{concat-example}.
+
 @item -b @var{bytes}
 @itemx --member-size=@var{bytes}
 Set the member size limit to @var{bytes}. A small member size may
@@ -200,13 +213,18 @@ range from 100 kB to 2 PiB. Defaults to 2 PiB.

 @item -c
 @itemx --stdout
-Compress or decompress to standard output. Needed when reading from a
-named pipe (fifo) or from a device. Use it to recover as much of the
-uncompressed data as possible when decompressing a corrupt file.
+Compress or decompress to standard output; keep input files unchanged.
+If compressing several files, each file is compressed independently.
+This option is needed when reading from a named pipe (fifo) or from a
+device. Use it also to recover as much of the uncompressed data as
+possible when decompressing a corrupt file.

 @item -d
 @itemx --decompress
-Decompress.
+Decompress the specified file(s). If a file does not exist or can't be
+opened, clzip continues decompressing the rest of the files. If a file
+fails to decompress, clzip exits immediately without decompressing the
+rest of the files.

 @item -f
 @itemx --force
@@ -242,11 +260,13 @@ Quiet operation. Suppress all messages.

 @item -s @var{bytes}
 @itemx --dictionary-size=@var{bytes}
-Set the dictionary size limit in bytes. Valid values range from 4 KiB to
-512 MiB. Clzip will use the smallest possible dictionary size for each
-file without exceeding this limit. Note that dictionary sizes are
-quantized. If the specified size does not match one of the valid sizes,
-it will be rounded upwards by adding up to (@var{bytes} / 16) to it.
+Set the dictionary size limit in bytes. Clzip will use the smallest
+possible dictionary size for each file without exceeding this limit.
+Valid values range from 4 KiB to 512 MiB. Values 12 to 29 are
+interpreted as powers of two, meaning 2^12 to 2^29 bytes. Note that
+dictionary sizes are quantized. If the specified size does not match one
+of the valid sizes, it will be rounded upwards by adding up to
+@w{(@var{bytes} / 8)} to it.

 For maximum compression you should use a dictionary size limit as large
 as possible, but keep in mind that the decompression memory requirement
@@ -257,7 +277,7 @@ is affected at compression time by the choice of dictionary size limit.
 Split the compressed output into several volume files with names
 @samp{original_name00001.lz}, @samp{original_name00002.lz}, etc, and set
 the volume size limit to @var{bytes}. Each volume is a complete, maybe
-multi-member, lzip file. A small volume size may degrade compression
+multimember, lzip file. A small volume size may degrade compression
 ratio, so use it only when needed. Valid values range from 100 kB to 4
 EiB.

@@ -265,7 +285,8 @@ EiB.
 @itemx --test
 Check integrity of the specified file(s), but don't decompress them.
 This really performs a trial decompression and throws away the result.
-Use it together with @samp{-v} to see information about the file.
+Use it together with @samp{-v} to see information about the file(s). If
+a file fails the test, clzip continues checking the rest of the files.

 @item -v
 @itemx --verbose
@@ -274,18 +295,19 @@ When compressing, show the compression ratio for each file processed. A
 second @samp{-v} shows the progress of compression.@*
 When decompressing or testing, further -v's (up to 4) increase the
 verbosity level, showing status, compression ratio, dictionary size,
-and trailer contents (CRC, data size, member size).
+trailer contents (CRC, data size, member size), and up to 6 bytes of
+trailing data (if any).

 @item -0 .. -9
 Set the compression parameters (dictionary size and match length limit)
-as shown in the table below. Note that @samp{-9} can be much slower than
-@samp{-0}. These options have no effect when decompressing.
+as shown in the table below. The default compression level is @samp{-6}.
+Note that @samp{-9} can be much slower than @samp{-0}. These options
+have no effect when decompressing.

 The bidimensional parameter space of LZMA can't be mapped to a linear
 scale optimal for all files. If your files are large, very repetitive,
-etc, you may need to use the @samp{--match-length} and
-@samp{--dictionary-size} options directly to achieve optimal
-performance.
+etc, you may need to use the @samp{--dictionary-size} and
+@samp{--match-length} options directly to achieve optimal performance.

 @multitable {Level} {Dictionary size} {Match length limit}
 @item Level @tab Dictionary size @tab Match length limit
@@ -364,14 +386,14 @@ additional information before, between, or after them.
 Each member has the following structure:
 @verbatim
 +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-| ID string | VN | DS | Lzma stream | CRC32 | Data size | Member size |
+| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
 +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 @end verbatim

 All multibyte values are stored in little endian order.

 @table @samp
-@item ID string
+@item ID string (the "magic" bytes)
 A four byte string, identifying the lzip format, with the value "LZIP"
 (0x4C, 0x5A, 0x49, 0x50).

@@ -388,8 +410,8 @@ from the base size to obtain the dictionary size.@*
 Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
 Valid values for dictionary size range from 4 KiB to 512 MiB.

-@item Lzma stream
-The lzma stream, finished by an end of stream marker. Uses default
+@item LZMA stream
+The LZMA stream, finished by an end of stream marker. Uses default
 values for encoder properties.
 @ifnothtml
 @xref{Stream format,,,lzip},
@@ -409,7 +431,7 @@ Size of the uncompressed original data.
 @item Member size (8 bytes)
 Total size of the member, including header and trailer. This field acts
 as a distributed index, allows the verification of stream integrity, and
-facilitates safe recovery of undamaged members from multi-member files.
+facilitates safe recovery of undamaged members from multimember files.

 @end table

|
@ -480,6 +502,44 @@ range encoding), Igor Pavlov (for putting all the above together in
|
|||
LZMA), and Julian Seward (for bzip2's CLI).
|
||||
|
||||
|
||||
@node Trailing data
|
||||
@chapter Extra data appended to the file
|
||||
@cindex trailing data
|
||||
|
||||
Sometimes extra data is found appended to an lzip file after the last
|
||||
member. Such trailing data may be:
|
||||
|
||||
@itemize @bullet
|
||||
@item
|
||||
Padding added to make the file size a multiple of some block size, for
|
||||
example when writing to a tape.
|
||||
|
||||
@item
|
||||
Garbage added by a copy operation that did not fully succeed.
|
||||
|
||||
@item
|
||||
Useful data added by the user, such as a cryptographically secure hash
|
||||
or a description of the file contents.
|
||||
|
||||
@item
|
||||
Malicious data added to the file in order to make its total size and
|
||||
hash value (for a chosen hash) coincide with those of another file.
|
||||
|
||||
@item
|
||||
In very rare cases, trailing data could be the corrupt header of another
|
||||
member. In multimember or concatenated files the probability of
|
||||
corruption happening in the magic bytes is 5 times smaller than the
|
||||
probability of getting a false positive caused by the corruption of the
|
||||
integrity information itself. Therefore it can be considered to be below
|
||||
the noise level.
|
||||
@end itemize
|
||||
|
||||
Trailing data can be safely ignored in most cases. In some cases, like
|
||||
that of user-added data, it is expected to be ignored. In those cases
|
||||
where a file containing trailing data must be rejected, the option
|
||||
@samp{--trailing-error} can be used. @xref{--trailing-error}.
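
As an illustration of how trailing data can be told apart from the last member, here is a rough sketch that uses the Member size field of the trailer. It is a heuristic under stated assumptions, not what clzip does: the function name is hypothetical, only the magic bytes at the computed start of the last member are checked, and nothing is decoded or verified, so false positives are possible.

```python
import struct

MAGIC = b"LZIP"
# A member holds at least a 6-byte header and a 20-byte trailer.
MIN_MEMBER_SIZE = 26


def split_trailing_data(data: bytes):
    """Return (members, trailing) by guessing where the last member ends.

    Candidate end positions are tried from the end of the file backwards.
    A position is accepted when the 8 bytes before it, read as a little
    endian Member size, point back to a position holding the magic bytes.
    """
    for end in range(len(data), MIN_MEMBER_SIZE - 1, -1):
        (member_size,) = struct.unpack_from("<Q", data, end - 8)
        start = end - member_size
        if (member_size >= MIN_MEMBER_SIZE and start >= 0
                and data[start:start + 4] == MAGIC):
            return data[:end], data[end:]
    return b"", data  # no plausible member found
```

A caller that must reject files with trailing data would treat a non-empty second element as an error, which is roughly the behaviour that @samp{--trailing-error} requests from clzip itself.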
|
||||
|
||||
|
||||
@node Examples
|
||||
@chapter A small tutorial with examples
|
||||
@cindex examples
|
||||
|
@ -487,7 +547,7 @@ LZMA), and Julian Seward (for bzip2's CLI).
|
|||
WARNING! Even if clzip is bug-free, other causes may result in a corrupt
|
||||
compressed file (bugs in the system libraries, memory errors, etc.).
|
||||
Therefore, if the data you are going to compress are important, give the
|
||||
@samp{--keep} option to clzip and do not remove the original file until
|
||||
@samp{--keep} option to clzip and don't remove the original file until
|
||||
you verify the compressed file with a command like
|
||||
@w{@samp{clzip -cd file.lz | cmp file -}}.
|
||||
|
||||
|
@ -502,7 +562,7 @@ clzip -v file
|
|||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 2: Like example 1 but the created @samp{file.lz} is multi-member
|
||||
Example 2: Like example 1 but the created @samp{file.lz} is multimember
|
||||
with a member size of 1 MiB. The compression ratio is not shown.
|
||||
|
||||
@example
|
||||
|
@ -530,16 +590,29 @@ clzip -tv file.lz
|
|||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 5: Compress a whole floppy in /dev/fd0 and send the output to
|
||||
Example 5: Compress a whole device in /dev/sdc and send the output to
|
||||
@samp{file.lz}.
|
||||
|
||||
@example
|
||||
clzip -c /dev/fd0 > file.lz
|
||||
clzip -c /dev/sdc > file.lz
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@anchor{concat-example}
|
||||
@noindent
|
||||
Example 6: The right way of concatenating compressed files.
|
||||
@xref{Trailing data}.
|
||||
|
||||
@example
|
||||
Don't do this
|
||||
cat file1.lz file2.lz file3.lz | clzip -d
|
||||
Do this instead
|
||||
clzip -cd file1.lz file2.lz file3.lz
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 6: Decompress @samp{file.lz} partially until 10 KiB of
|
||||
Example 7: Decompress @samp{file.lz} partially until 10 KiB of
|
||||
decompressed data are produced.
|
||||
|
||||
@example
|
||||
|
@ -548,7 +621,7 @@ clzip -cd file.lz | dd bs=1024 count=10
|
|||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 7: Decompress @samp{file.lz} partially from decompressed byte
|
||||
Example 8: Decompress @samp{file.lz} partially from decompressed byte
|
||||
10000 to decompressed byte 15000 (5000 bytes are produced).
|
||||
|
||||
@example
|
||||
|
@ -557,7 +630,7 @@ clzip -cd file.lz | dd bs=1000 skip=10 count=5
|
|||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 8: Create a multivolume compressed tar archive with a volume
|
||||
Example 9: Create a multivolume compressed tar archive with a volume
|
||||
size of 1440 KiB.
|
||||
|
||||
@example
|
||||
|
@ -566,7 +639,7 @@ tar -c some_directory | clzip -S 1440KiB -o volume_name
|
|||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 9: Extract a multivolume compressed tar archive.
|
||||
Example 10: Extract a multivolume compressed tar archive.
|
||||
|
||||
@example
|
||||
clzip -cd volume_name*.lz | tar -xf -
|
||||
|
@ -574,8 +647,8 @@ clzip -cd volume_name*.lz | tar -xf -
|
|||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 10: Create a multivolume compressed backup of a large database
|
||||
file with a volume size of 650 MB, where each volume is a multi-member
|
||||
Example 11: Create a multivolume compressed backup of a large database
|
||||
file with a volume size of 650 MB, where each volume is a multimember
|
||||
file with a member size of 32 MiB.
|
||||
|
||||
@example
|
||||
|
|