Adding upstream version 1.6~pre2.
Signed-off-by: Daniel Baumann <daniel@debian.org>
This commit is contained in:
parent
41be15d0fe
commit
dda4168857
15 changed files with 364 additions and 296 deletions
10
doc/clzip.1
10
doc/clzip.1
|
@ -1,7 +1,7 @@
|
|||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.37.1.
|
||||
.TH CLZIP "1" "January 2014" "Clzip 1.6-pre1" "User Commands"
|
||||
.TH CLZIP "1" "May 2014" "clzip 1.6-pre2" "User Commands"
|
||||
.SH NAME
|
||||
Clzip \- reduces the size of files
|
||||
clzip \- reduces the size of files
|
||||
.SH SYNOPSIS
|
||||
.B clzip
|
||||
[\fIoptions\fR] [\fIfiles\fR]
|
||||
|
@ -89,13 +89,13 @@ This is free software: you are free to change and redistribute it.
|
|||
There is NO WARRANTY, to the extent permitted by law.
|
||||
.SH "SEE ALSO"
|
||||
The full documentation for
|
||||
.B Clzip
|
||||
.B clzip
|
||||
is maintained as a Texinfo manual. If the
|
||||
.B info
|
||||
and
|
||||
.B Clzip
|
||||
.B clzip
|
||||
programs are properly installed at your site, the command
|
||||
.IP
|
||||
.B info Clzip
|
||||
.B info clzip
|
||||
.PP
|
||||
should give you access to the complete manual.
|
||||
|
|
102
doc/clzip.info
102
doc/clzip.info
|
@ -11,7 +11,7 @@ File: clzip.info, Node: Top, Next: Introduction, Up: (dir)
|
|||
Clzip Manual
|
||||
************
|
||||
|
||||
This manual is for Clzip (version 1.6-pre1, 30 January 2014).
|
||||
This manual is for Clzip (version 1.6-pre2, 6 May 2014).
|
||||
|
||||
* Menu:
|
||||
|
||||
|
@ -39,20 +39,31 @@ Clzip is a lossless data compressor with a user interface similar to the
|
|||
one of gzip or bzip2. Clzip decompresses almost as fast as gzip,
|
||||
compresses most files more than bzip2, and is better than both from a
|
||||
data recovery perspective. Clzip is a clean implementation of the LZMA
|
||||
algorithm.
|
||||
(Lempel-Ziv-Markov chain-Algorithm) algorithm.
|
||||
|
||||
Clzip uses the lzip file format; the files produced by clzip are
|
||||
fully compatible with lzip-1.4 or newer, and can be rescued with
|
||||
lziprecover. Clzip is in fact a C language version of lzip, intended
|
||||
for embedded devices or systems lacking a C++ compiler.
|
||||
|
||||
The lzip file format is designed for long-term data archiving and
|
||||
provides very safe integrity checking. It is as simple as possible (but
|
||||
not simpler), so that with the only help of the lzip manual it would be
|
||||
possible for a digital archaeologist to extract the data from a lzip
|
||||
file long after quantum computers eventually render LZMA obsolete.
|
||||
Additionally lzip is copylefted, which guarantees that it will remain
|
||||
free forever.
|
||||
The lzip file format is designed for long-term data archiving, taking
|
||||
into account both data integrity and decoder availability:
|
||||
|
||||
* The lzip format provides very safe integrity checking and some data
|
||||
recovery means. The lziprecover program can repair bit-flip errors
|
||||
(one of the most common forms of data corruption) in lzip files,
|
||||
and provides data recovery capabilities, including error-checked
|
||||
merging of damaged copies of a file.
|
||||
|
||||
* The lzip format is as simple as possible (but not simpler). The
|
||||
lzip manual provides the code of a simple decompressor along with
|
||||
a detailed explanation of how it works, so that with the only help
|
||||
of the lzip manual it would be possible for a digital
|
||||
archaeologist to extract the data from a lzip file long after
|
||||
quantum computers eventually render LZMA obsolete.
|
||||
|
||||
* Additionally lzip is copylefted, which guarantees that it will
|
||||
remain free forever.
|
||||
|
||||
The member trailer stores the 32-bit CRC of the original data, the
|
||||
size of the original data and the size of the member. These values,
|
||||
|
@ -66,16 +77,21 @@ though, that the check occurs upon decompression, so it can only tell
|
|||
you that something is wrong. It can't help you recover the original
|
||||
uncompressed data.
|
||||
|
||||
If you ever need to recover data from a damaged lzip file, try the
|
||||
lziprecover program. Lziprecover makes lzip files resistant to bit-flip
|
||||
(one of the most common forms of data corruption), and provides data
|
||||
recovery capabilities, including error-checked merging of damaged copies
|
||||
of a file.
|
||||
|
||||
Clzip uses the same well-defined exit status values used by lzip and
|
||||
bzip2, which makes it safer than compressors returning ambiguous warning
|
||||
values (like gzip) when it is used as a back end for tar or zutils.
|
||||
|
||||
The amount of memory required for compression is about 1 or 2 times
|
||||
the dictionary size limit (1 if input file size is less than dictionary
|
||||
size limit, else 2) plus 9 times the dictionary size really used. The
|
||||
amount of memory required for decompression is about 46 kB larger than
|
||||
the dictionary size really used.
|
||||
|
||||
Clzip will automatically use the smallest possible dictionary size
|
||||
for each file without exceeding the given limit. Keep in mind that the
|
||||
decompression memory requirement is affected at compression time by the
|
||||
choice of dictionary size limit.
|
||||
|
||||
When compressing, clzip replaces every file given in the command line
|
||||
with a compressed version of itself, with the name "original_name.lz".
|
||||
When decompressing, clzip attempts to guess the name for the
|
||||
|
@ -114,30 +130,29 @@ multivolume compressed tar archives.
|
|||
automatically creating multi-member output. The members so created are
|
||||
large, about 64 PiB each.
|
||||
|
||||
The amount of memory required for compression is about 1 or 2 times
|
||||
the dictionary size limit (1 if input file size is less than dictionary
|
||||
size limit, else 2) plus 9 times the dictionary size really used. The
|
||||
amount of memory required for decompression is about 46 kB larger than
|
||||
the dictionary size really used.
|
||||
|
||||
Clzip will automatically use the smallest possible dictionary size
|
||||
without exceeding the given limit. Keep in mind that the decompression
|
||||
memory requirement is affected at compression time by the choice of
|
||||
dictionary size limit.
|
||||
|
||||
|
||||
File: clzip.info, Node: Algorithm, Next: Invoking clzip, Prev: Introduction, Up: Top
|
||||
|
||||
2 Algorithm
|
||||
***********
|
||||
|
||||
Clzip implements a simplified version of the LZMA (Lempel-Ziv-Markov
|
||||
chain-Algorithm) algorithm. The high compression of LZMA comes from
|
||||
combining two basic, well-proven compression ideas: sliding dictionaries
|
||||
(LZ77/78) and markov models (the thing used by every compression
|
||||
algorithm that uses a range encoder or similar order-0 entropy coder as
|
||||
its last stage) with segregation of contexts according to what the bits
|
||||
are used for.
|
||||
There is no such thing as a "LZMA algorithm"; it is more like a "LZMA
|
||||
coding scheme". For example, the option '-0' of lzip uses the scheme in
|
||||
almost the simplest way possible; issuing the longest match it can find,
|
||||
or a literal byte if it can't find a match. Inversely, a much more
|
||||
elaborated way of finding coding sequences of minimum price than the one
|
||||
currently used by lzip could be developed, and the resulting sequence
|
||||
could also be coded using the LZMA coding scheme.
|
||||
|
||||
Lzip currently implements two variants of the LZMA algorithm; fast
|
||||
(used by option -0) and normal (used by all other compression levels).
|
||||
Clzip just implements the "normal" variant.
|
||||
|
||||
The high compression of LZMA comes from combining two basic,
|
||||
well-proven compression ideas: sliding dictionaries (LZ77/78) and
|
||||
markov models (the thing used by every compression algorithm that uses
|
||||
a range encoder or similar order-0 entropy coder as its last stage)
|
||||
with segregation of contexts according to what the bits are used for.
|
||||
|
||||
Clzip is a two stage compressor. The first stage is a Lempel-Ziv
|
||||
coder, which reduces redundancy by translating chunks of data to their
|
||||
|
@ -145,11 +160,6 @@ corresponding distance-length pairs. The second stage is a range encoder
|
|||
that uses a different probability model for each type of data;
|
||||
distances, lengths, literal bytes, etc.
|
||||
|
||||
The match finder, part of the LZ coder, is the most important piece
|
||||
of the LZMA algorithm, as it is in many Lempel-Ziv based algorithms.
|
||||
Most of clzip's execution time is spent in the match finder, and it has
|
||||
the greatest influence on the compression ratio.
|
||||
|
||||
Here is how it works, step by step:
|
||||
|
||||
1) The member header is written to the output stream.
|
||||
|
@ -261,7 +271,7 @@ The format for running clzip is:
|
|||
'--dictionary-size=BYTES'
|
||||
Set the dictionary size limit in bytes. Valid values range from 4
|
||||
KiB to 512 MiB. Clzip will use the smallest possible dictionary
|
||||
size for each member without exceeding this limit. Note that
|
||||
size for each file without exceeding this limit. Note that
|
||||
dictionary sizes are quantized. If the specified size does not
|
||||
match one of the valid sizes, it will be rounded upwards by adding
|
||||
up to (BYTES / 16) to it.
|
||||
|
@ -530,13 +540,13 @@ Concept index
|
|||
|
||||
Tag Table:
|
||||
Node: Top210
|
||||
Node: Introduction921
|
||||
Node: Algorithm5557
|
||||
Node: Invoking clzip8057
|
||||
Node: File format13656
|
||||
Node: Examples16161
|
||||
Node: Problems18130
|
||||
Node: Concept index18656
|
||||
Node: Introduction916
|
||||
Node: Algorithm5823
|
||||
Node: Invoking clzip8629
|
||||
Node: File format14226
|
||||
Node: Examples16731
|
||||
Node: Problems18700
|
||||
Node: Concept index19226
|
||||
|
||||
End Tag Table
|
||||
|
||||
|
|
|
@ -6,8 +6,8 @@
|
|||
@finalout
|
||||
@c %**end of header
|
||||
|
||||
@set UPDATED 30 January 2014
|
||||
@set VERSION 1.6-pre1
|
||||
@set UPDATED 6 May 2014
|
||||
@set VERSION 1.6-pre2
|
||||
|
||||
@dircategory Data Compression
|
||||
@direntry
|
||||
|
@ -59,20 +59,36 @@ Clzip is a lossless data compressor with a user interface similar to the
|
|||
one of gzip or bzip2. Clzip decompresses almost as fast as gzip,
|
||||
compresses most files more than bzip2, and is better than both from a
|
||||
data recovery perspective. Clzip is a clean implementation of the LZMA
|
||||
algorithm.
|
||||
(Lempel-Ziv-Markov chain-Algorithm) algorithm.
|
||||
|
||||
Clzip uses the lzip file format; the files produced by clzip are fully
|
||||
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
|
||||
Clzip is in fact a C language version of lzip, intended for embedded
|
||||
devices or systems lacking a C++ compiler.
|
||||
|
||||
The lzip file format is designed for long-term data archiving and
|
||||
provides very safe integrity checking. It is as simple as possible (but
|
||||
not simpler), so that with the only help of the lzip manual it would be
|
||||
possible for a digital archaeologist to extract the data from a lzip
|
||||
file long after quantum computers eventually render LZMA obsolete.
|
||||
The lzip file format is designed for long-term data archiving, taking
|
||||
into account both data integrity and decoder availability:
|
||||
|
||||
@itemize @bullet
|
||||
@item
|
||||
The lzip format provides very safe integrity checking and some data
|
||||
recovery means. The lziprecover program can repair bit-flip errors (one
|
||||
of the most common forms of data corruption) in lzip files, and provides
|
||||
data recovery capabilities, including error-checked merging of damaged
|
||||
copies of a file.
|
||||
|
||||
@item
|
||||
The lzip format is as simple as possible (but not simpler). The lzip
|
||||
manual provides the code of a simple decompressor along with a detailed
|
||||
explanation of how it works, so that with the only help of the lzip
|
||||
manual it would be possible for a digital archaeologist to extract the
|
||||
data from a lzip file long after quantum computers eventually render
|
||||
LZMA obsolete.
|
||||
|
||||
@item
|
||||
Additionally lzip is copylefted, which guarantees that it will remain
|
||||
free forever.
|
||||
@end itemize
|
||||
|
||||
The member trailer stores the 32-bit CRC of the original data, the size
|
||||
of the original data and the size of the member. These values, together
|
||||
|
@ -85,16 +101,21 @@ going undetected are microscopic. Be aware, though, that the check
|
|||
occurs upon decompression, so it can only tell you that something is
|
||||
wrong. It can't help you recover the original uncompressed data.
|
||||
|
||||
If you ever need to recover data from a damaged lzip file, try the
|
||||
lziprecover program. Lziprecover makes lzip files resistant to bit-flip
|
||||
(one of the most common forms of data corruption), and provides data
|
||||
recovery capabilities, including error-checked merging of damaged copies
|
||||
of a file.
|
||||
|
||||
Clzip uses the same well-defined exit status values used by lzip and
|
||||
bzip2, which makes it safer than compressors returning ambiguous warning
|
||||
values (like gzip) when it is used as a back end for tar or zutils.
|
||||
|
||||
The amount of memory required for compression is about 1 or 2 times the
|
||||
dictionary size limit (1 if input file size is less than dictionary size
|
||||
limit, else 2) plus 9 times the dictionary size really used. The amount
|
||||
of memory required for decompression is about 46 kB larger than the
|
||||
dictionary size really used.
|
||||
|
||||
Clzip will automatically use the smallest possible dictionary size for
|
||||
each file without exceeding the given limit. Keep in mind that the
|
||||
decompression memory requirement is affected at compression time by the
|
||||
choice of dictionary size limit.
|
||||
|
||||
When compressing, clzip replaces every file given in the command line
|
||||
with a compressed version of itself, with the name "original_name.lz".
|
||||
When decompressing, clzip attempts to guess the name for the decompressed
|
||||
|
@ -135,29 +156,28 @@ Clzip is able to compress and decompress streams of unlimited size by
|
|||
automatically creating multi-member output. The members so created are
|
||||
large, about 64 PiB each.
|
||||
|
||||
The amount of memory required for compression is about 1 or 2 times the
|
||||
dictionary size limit (1 if input file size is less than dictionary size
|
||||
limit, else 2) plus 9 times the dictionary size really used. The amount
|
||||
of memory required for decompression is about 46 kB larger than the
|
||||
dictionary size really used.
|
||||
|
||||
Clzip will automatically use the smallest possible dictionary size
|
||||
without exceeding the given limit. Keep in mind that the decompression
|
||||
memory requirement is affected at compression time by the choice of
|
||||
dictionary size limit.
|
||||
|
||||
|
||||
@node Algorithm
|
||||
@chapter Algorithm
|
||||
@cindex algorithm
|
||||
|
||||
Clzip implements a simplified version of the LZMA (Lempel-Ziv-Markov
|
||||
chain-Algorithm) algorithm. The high compression of LZMA comes from
|
||||
combining two basic, well-proven compression ideas: sliding dictionaries
|
||||
(LZ77/78) and markov models (the thing used by every compression
|
||||
algorithm that uses a range encoder or similar order-0 entropy coder as
|
||||
its last stage) with segregation of contexts according to what the bits
|
||||
are used for.
|
||||
There is no such thing as a "LZMA algorithm"; it is more like a "LZMA
|
||||
coding scheme". For example, the option '-0' of lzip uses the scheme in
|
||||
almost the simplest way possible; issuing the longest match it can find,
|
||||
or a literal byte if it can't find a match. Inversely, a much more
|
||||
elaborated way of finding coding sequences of minimum price than the one
|
||||
currently used by lzip could be developed, and the resulting sequence
|
||||
could also be coded using the LZMA coding scheme.
|
||||
|
||||
Lzip currently implements two variants of the LZMA algorithm; fast (used
|
||||
by option -0) and normal (used by all other compression levels). Clzip
|
||||
just implements the "normal" variant.
|
||||
|
||||
The high compression of LZMA comes from combining two basic, well-proven
|
||||
compression ideas: sliding dictionaries (LZ77/78) and markov models (the
|
||||
thing used by every compression algorithm that uses a range encoder or
|
||||
similar order-0 entropy coder as its last stage) with segregation of
|
||||
contexts according to what the bits are used for.
|
||||
|
||||
Clzip is a two stage compressor. The first stage is a Lempel-Ziv coder,
|
||||
which reduces redundancy by translating chunks of data to their
|
||||
|
@ -165,11 +185,6 @@ corresponding distance-length pairs. The second stage is a range encoder
|
|||
that uses a different probability model for each type of data;
|
||||
distances, lengths, literal bytes, etc.
|
||||
|
||||
The match finder, part of the LZ coder, is the most important piece of
|
||||
the LZMA algorithm, as it is in many Lempel-Ziv based algorithms. Most
|
||||
of clzip's execution time is spent in the match finder, and it has the
|
||||
greatest influence on the compression ratio.
|
||||
|
||||
Here is how it works, step by step:
|
||||
|
||||
1) The member header is written to the output stream.
|
||||
|
@ -284,7 +299,7 @@ Quiet operation. Suppress all messages.
|
|||
@itemx --dictionary-size=@var{bytes}
|
||||
Set the dictionary size limit in bytes. Valid values range from 4 KiB to
|
||||
512 MiB. Clzip will use the smallest possible dictionary size for each
|
||||
member without exceeding this limit. Note that dictionary sizes are
|
||||
file without exceeding this limit. Note that dictionary sizes are
|
||||
quantized. If the specified size does not match one of the valid sizes,
|
||||
it will be rounded upwards by adding up to (@var{bytes} / 16) to it.
|
||||
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue