Merging upstream version 1.5~pre2.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 7220eb23eb
commit b8d132e6e9
15 changed files with 296 additions and 250 deletions
|
@ -1,3 +1,8 @@
|
||||||
|
2013-07-17 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
|
* Version 1.5-pre2 released.
|
||||||
|
* Show progress of compression at verbosity level 2 (-vv).
|
||||||
|
|
||||||
2013-05-13 Antonio Diaz Diaz <antonio@gnu.org>
|
2013-05-13 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
* Version 1.5-pre1 released.
|
* Version 1.5-pre1 released.
|
||||||
|
|
6
INSTALL
|
@ -1,7 +1,7 @@
|
||||||
Requirements
|
Requirements
|
||||||
------------
|
------------
|
||||||
You will need a C compiler.
|
You will need a C compiler.
|
||||||
I use gcc 4.8.0 and 3.3.6, but the code should compile with any
|
I use gcc 4.8.1 and 3.3.6, but the code should compile with any
|
||||||
standards compliant compiler.
|
standards compliant compiler.
|
||||||
Gcc is available at http://gcc.gnu.org.
|
Gcc is available at http://gcc.gnu.org.
|
||||||
|
|
||||||
|
@ -10,9 +10,9 @@ Procedure
|
||||||
---------
|
---------
|
||||||
1. Unpack the archive if you have not done so already:
|
1. Unpack the archive if you have not done so already:
|
||||||
|
|
||||||
lzip -cd clzip[version].tar.lz | tar -xf -
|
tar -xf clzip[version].tar.lz
|
||||||
or
|
or
|
||||||
gzip -cd clzip[version].tar.gz | tar -xf -
|
lzip -cd clzip[version].tar.lz | tar -xf -
|
||||||
|
|
||||||
This creates the directory ./clzip[version] containing the source from
|
This creates the directory ./clzip[version] containing the source from
|
||||||
the main archive.
|
the main archive.
|
||||||
|
|
4
NEWS
|
@ -1,5 +1,7 @@
|
||||||
Changes in version 1.5:
|
Changes in version 1.5:
|
||||||
|
|
||||||
|
Clzip now shows the progress of compression at verbosity level 2 (-vv).
|
||||||
|
|
||||||
Decompression time has been reduced by 1%.
|
Decompression time has been reduced by 1%.
|
||||||
|
|
||||||
File version is now shown only if verbosity >= 4.
|
File version is now shown only if verbosity >= 4.
|
||||||
|
@ -7,4 +9,4 @@ File version is now shown only if verbosity >= 4.
|
||||||
Option "-n, --threads" is now accepted and ignored for compatibility
|
Option "-n, --threads" is now accepted and ignored for compatibility
|
||||||
with plzip.
|
with plzip.
|
||||||
|
|
||||||
"configure" now accepts options with a separate argument.
|
The configure script now accepts options with a separate argument.
|
||||||
|
|
52
README
|
@ -1,22 +1,38 @@
|
||||||
Description
|
Description
|
||||||
|
|
||||||
Clzip is a lossless data compressor based on the LZMA algorithm, with
|
Clzip is a lossless data compressor with a user interface similar to the
|
||||||
very safe integrity checking and a user interface similar to the one of
|
one of gzip or bzip2. Clzip decompresses almost as fast as gzip and
|
||||||
gzip or bzip2. Clzip decompresses almost as fast as gzip and compresses
|
compresses more than bzip2, which makes it well suited for software
|
||||||
better than bzip2, which makes it well suited for software distribution
|
distribution and data archiving. Clzip is a clean implementation of the
|
||||||
and data archiving.
|
LZMA algorithm.
|
||||||
|
|
||||||
Clzip uses the same well-defined exit status values used by bzip2, which
|
Clzip uses the same well-defined exit status values used by lzip and
|
||||||
makes it safer when used in pipes or scripts than compressors returning
|
bzip2, which makes it safer when used in pipes or scripts than
|
||||||
ambiguous warning values, like gzip.
|
compressors returning ambiguous warning values, like gzip.
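The README does not list the exit status values themselves. As a minimal sketch, assuming the values documented in the lzip manual (0 normal exit, 1 environmental problem, 2 corrupt or invalid input, 3 internal consistency error), a calling program could classify them as below. The names `Exit_class`, `classify_exit` and `describe_exit` are hypothetical and are not part of clzip.

```c
/* Hypothetical classification of clzip/lzip exit status values.
   The numeric meanings are an assumption taken from the lzip manual. */
enum Exit_class
  {
  exit_ok = 0,              /* normal exit */
  exit_environment = 1,     /* file not found, invalid flags, I/O errors */
  exit_corrupt_input = 2,   /* corrupt or invalid input file */
  exit_internal_error = 3,  /* internal consistency error (bug) */
  exit_unknown = -1         /* anything else, e.g. killed by a signal */
  };

enum Exit_class classify_exit( const int status )
  {
  switch( status )
    {
    case 0: return exit_ok;
    case 1: return exit_environment;
    case 2: return exit_corrupt_input;
    case 3: return exit_internal_error;
    default: return exit_unknown;
    }
  }

/* Short human-readable text for a status value. */
const char * describe_exit( const int status )
  {
  switch( classify_exit( status ) )
    {
    case exit_ok: return "normal exit";
    case exit_environment: return "environmental problem";
    case exit_corrupt_input: return "corrupt or invalid input";
    case exit_internal_error: return "internal consistency error";
    default: return "unknown status";
    }
  }
```

Because every nonzero value is an error, a script or caller only needs to test for 0 before trusting the output; the finer classes matter mainly for reporting.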
|
||||||
|
|
||||||
Clzip uses the lzip file format; the files produced by clzip are fully
|
Clzip uses the lzip file format; the files produced by clzip are fully
|
||||||
compatible with lzip-1.4 or newer. Clzip is in fact a C language version
|
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
|
||||||
of lzip, intended for embedded devices or systems lacking a C++
|
Clzip is in fact a C language version of lzip, intended for embedded
|
||||||
compiler.
|
devices or systems lacking a C++ compiler.
|
||||||
|
|
||||||
|
The lzip file format is designed for long-term data archiving and
|
||||||
|
provides very safe integrity checking. The member trailer stores the
|
||||||
|
32-bit CRC of the original data, the size of the original data and the
|
||||||
|
size of the member. These values, together with the value remaining in
|
||||||
|
the range decoder and the end-of-stream marker, provide a 4 factor
|
||||||
|
integrity checking which guarantees that the decompressed version of the
|
||||||
|
data is identical to the original. This guards against corruption of the
|
||||||
|
compressed data, and against undetected bugs in clzip (hopefully very
|
||||||
|
unlikely). The chances of data corruption going undetected are
|
||||||
|
microscopic. Be aware, though, that the check occurs upon decompression,
|
||||||
|
so it can only tell you that something is wrong. It can't help you
|
||||||
|
recover the original uncompressed data.
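The three stored values described above can be illustrated with a short sketch. It assumes the version 1 lzip trailer layout of 20 little-endian bytes (4-byte CRC, 8-byte data size, 8-byte member size); the type `Trailer_fields` and the functions `read_le`, `parse_trailer` and `trailer_matches` are hypothetical names for illustration, not clzip's own API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the 20-byte member trailer (little-endian):
   bytes 0-3   CRC32 of the original data
   bytes 4-11  size of the original data
   bytes 12-19 size of the member (header + compressed data + trailer) */
enum { trailer_size = 20 };

struct Trailer_fields           /* hypothetical helper type */
  {
  uint32_t data_crc;
  uint64_t data_size;
  uint64_t member_size;
  };

/* Read 'n' little-endian bytes starting at 'p'. */
uint64_t read_le( const uint8_t * const p, const int n )
  {
  uint64_t v = 0;
  int i;
  for( i = n - 1; i >= 0; --i ) { v <<= 8; v += p[i]; }
  return v;
  }

/* Split a raw trailer buffer of 'trailer_size' bytes into its fields. */
struct Trailer_fields parse_trailer( const uint8_t * const buf )
  {
  struct Trailer_fields t;
  assert( buf != 0 );
  t.data_crc = (uint32_t)read_le( buf, 4 );
  t.data_size = read_le( buf + 4, 8 );
  t.member_size = read_le( buf + 12, 8 );
  return t;
  }

/* Return 1 if the stored values agree with what the decoder observed. */
int trailer_matches( const struct Trailer_fields * const t,
                     const uint32_t crc, const uint64_t data_size,
                     const uint64_t member_size )
  {
  return t->data_crc == crc && t->data_size == data_size &&
         t->member_size == member_size;
  }
```

A mismatch in any one field is enough to report the member as damaged. The two remaining factors (range decoder value and end-of-stream marker) are checked during decoding itself and are not part of this sketch.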
|
||||||
|
|
||||||
If you ever need to recover data from a damaged lzip file, try the
|
If you ever need to recover data from a damaged lzip file, try the
|
||||||
lziprecover program.
|
lziprecover program. Lziprecover makes lzip files resistant to bit-flip
|
||||||
|
(one of the most common forms of data corruption), and provides data
|
||||||
|
recovery capabilities, including error-checked merging of damaged copies
|
||||||
|
of a file.
|
||||||
|
|
||||||
Clzip replaces every file given in the command line with a compressed
|
Clzip replaces every file given in the command line with a compressed
|
||||||
version of itself, with the name "original_name.lz". Each compressed
|
version of itself, with the name "original_name.lz". Each compressed
|
||||||
|
@ -50,18 +66,6 @@ without exceeding the given limit. Keep in mind that the decompression
|
||||||
memory requirement is affected at compression time by the choice of
|
memory requirement is affected at compression time by the choice of
|
||||||
dictionary size limit.
|
dictionary size limit.
|
||||||
|
|
||||||
As a self-check for your protection, clzip stores in the member trailer
|
|
||||||
the 32-bit CRC of the original data, the size of the original data and
|
|
||||||
the size of the member. These values, together with the value remaining
|
|
||||||
in the range decoder and the end-of-stream marker, provide a very safe 4
|
|
||||||
factor integrity checking which guarantees that the decompressed version
|
|
||||||
of the data is identical to the original. This guards against corruption
|
|
||||||
of the compressed data, and against undetected bugs in clzip (hopefully
|
|
||||||
very unlikely). The chances of data corruption going undetected are
|
|
||||||
microscopic. Be aware, though, that the check occurs upon decompression,
|
|
||||||
so it can only tell you that something is wrong. It can't help you
|
|
||||||
recover the original uncompressed data.
|
|
||||||
|
|
||||||
Clzip implements a simplified version of the LZMA (Lempel-Ziv-Markov
|
Clzip implements a simplified version of the LZMA (Lempel-Ziv-Markov
|
||||||
chain-Algorithm) algorithm. The high compression of LZMA comes from
|
chain-Algorithm) algorithm. The high compression of LZMA comes from
|
||||||
combining two basic, well-proven compression ideas: sliding dictionaries
|
combining two basic, well-proven compression ideas: sliding dictionaries
|
||||||
|
|
28
configure
vendored
|
@ -1,14 +1,14 @@
|
||||||
#! /bin/sh
|
#! /bin/sh
|
||||||
# configure script for Clzip - Data compressor based on the LZMA algorithm
|
# configure script for Clzip - LZMA lossless data compressor
|
||||||
# Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
# Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
||||||
#
|
#
|
||||||
# This configure script is free software: you have unlimited permission
|
# This configure script is free software: you have unlimited permission
|
||||||
# to copy, distribute and modify it.
|
# to copy, distribute and modify it.
|
||||||
|
|
||||||
pkgname=clzip
|
pkgname=clzip
|
||||||
pkgversion=1.5-pre1
|
pkgversion=1.5-pre2
|
||||||
progname=clzip
|
progname=clzip
|
||||||
srctrigger=doc/clzip.texinfo
|
srctrigger=doc/${pkgname}.texinfo
|
||||||
|
|
||||||
# clear some things potentially inherited from environment.
|
# clear some things potentially inherited from environment.
|
||||||
LC_ALL=C
|
LC_ALL=C
|
||||||
|
@ -26,9 +26,8 @@ CFLAGS='-Wall -W -O2'
|
||||||
LDFLAGS=
|
LDFLAGS=
|
||||||
|
|
||||||
# checking whether we are using GNU C.
|
# checking whether we are using GNU C.
|
||||||
if [ ! -x /bin/gcc ] &&
|
${CC} --version > /dev/null 2>&1
|
||||||
[ ! -x /usr/bin/gcc ] &&
|
if [ $? != 0 ] ; then
|
||||||
[ ! -x /usr/local/bin/gcc ] ; then
|
|
||||||
CC=cc
|
CC=cc
|
||||||
CFLAGS='-W -O2'
|
CFLAGS='-W -O2'
|
||||||
fi
|
fi
|
||||||
|
@ -96,16 +95,19 @@ while [ $# != 0 ] ; do
|
||||||
CFLAGS=*) CFLAGS=${optarg} ;;
|
CFLAGS=*) CFLAGS=${optarg} ;;
|
||||||
LDFLAGS=*) LDFLAGS=${optarg} ;;
|
LDFLAGS=*) LDFLAGS=${optarg} ;;
|
||||||
|
|
||||||
--* | *=* | *-*-*) ;;
|
--*)
|
||||||
|
echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;;
|
||||||
|
*=* | *-*-*) ;;
|
||||||
*)
|
*)
|
||||||
echo "configure: Unrecognized option: \"${option}\"; use --help for usage." 1>&2
|
echo "configure: unrecognized option: '${option}'" 1>&2
|
||||||
|
echo "Try 'configure --help' for more information." 1>&2
|
||||||
exit 1 ;;
|
exit 1 ;;
|
||||||
esac
|
esac
|
||||||
|
|
||||||
# Check if the option took a separate argument
|
# Check if the option took a separate argument
|
||||||
if [ "${arg2}" = yes ] ; then
|
if [ "${arg2}" = yes ] ; then
|
||||||
if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
|
if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
|
||||||
else echo "configure: Missing argument to \"${option}\"" 1>&2
|
else echo "configure: Missing argument to '${option}'" 1>&2
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
@ -123,10 +125,8 @@ if [ -z "${srcdir}" ] ; then
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if [ ! -r "${srcdir}/${srctrigger}" ] ; then
|
if [ ! -r "${srcdir}/${srctrigger}" ] ; then
|
||||||
exec 1>&2
|
echo "configure: Can't find sources in ${srcdir} ${srcdirtext}" 1>&2
|
||||||
echo
|
echo "configure: (At least ${srctrigger} is missing)." 1>&2
|
||||||
echo "configure: Can't find sources in ${srcdir} ${srcdirtext}"
|
|
||||||
echo "configure: (At least ${srctrigger} is missing)."
|
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
@ -164,7 +164,7 @@ echo "CFLAGS = ${CFLAGS}"
|
||||||
echo "LDFLAGS = ${LDFLAGS}"
|
echo "LDFLAGS = ${LDFLAGS}"
|
||||||
rm -f Makefile
|
rm -f Makefile
|
||||||
cat > Makefile << EOF
|
cat > Makefile << EOF
|
||||||
# Makefile for Clzip - Data compressor based on the LZMA algorithm
|
# Makefile for Clzip - LZMA lossless data compressor
|
||||||
# Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
# Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
||||||
# This file was generated automatically by configure. Do not edit.
|
# This file was generated automatically by configure. Do not edit.
|
||||||
#
|
#
|
||||||
|
|
69
decoder.c
|
@ -1,4 +1,4 @@
|
||||||
/* Clzip - Data compressor based on the LZMA algorithm
|
/* Clzip - LZMA lossless data compressor
|
||||||
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
|
@ -34,7 +34,7 @@ CRC32 crc32;
|
||||||
|
|
||||||
void Pp_show_msg( struct Pretty_print * const pp, const char * const msg )
|
void Pp_show_msg( struct Pretty_print * const pp, const char * const msg )
|
||||||
{
|
{
|
||||||
if( pp->verbosity >= 0 )
|
if( verbosity >= 0 )
|
||||||
{
|
{
|
||||||
if( pp->first_post )
|
if( pp->first_post )
|
||||||
{
|
{
|
||||||
|
@ -122,26 +122,23 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
|
||||||
struct Pretty_print * const pp )
|
struct Pretty_print * const pp )
|
||||||
{
|
{
|
||||||
File_trailer trailer;
|
File_trailer trailer;
|
||||||
const int trailer_size = Ft_versioned_size( decoder->member_version );
|
|
||||||
const unsigned long long member_size =
|
const unsigned long long member_size =
|
||||||
Rd_member_position( decoder->rdec ) + trailer_size;
|
Rd_member_position( decoder->rdec ) + Ft_size;
|
||||||
bool error = false;
|
bool error = false;
|
||||||
|
|
||||||
int size = Rd_read_data( decoder->rdec, trailer, trailer_size );
|
int size = Rd_read_data( decoder->rdec, trailer, Ft_size );
|
||||||
if( size < trailer_size )
|
if( size < Ft_size )
|
||||||
{
|
{
|
||||||
error = true;
|
error = true;
|
||||||
if( pp->verbosity >= 0 )
|
if( verbosity >= 0 )
|
||||||
{
|
{
|
||||||
Pp_show_msg( pp, 0 );
|
Pp_show_msg( pp, 0 );
|
||||||
fprintf( stderr, "Trailer truncated at trailer position %d;"
|
fprintf( stderr, "Trailer truncated at trailer position %d;"
|
||||||
" some checks may fail.\n", size );
|
" some checks may fail.\n", size );
|
||||||
}
|
}
|
||||||
while( size < trailer_size ) trailer[size++] = 0;
|
while( size < Ft_size ) trailer[size++] = 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
if( decoder->member_version == 0 ) Ft_set_member_size( trailer, member_size );
|
|
||||||
|
|
||||||
if( decoder->rdec->code != 0 )
|
if( decoder->rdec->code != 0 )
|
||||||
{
|
{
|
||||||
error = true;
|
error = true;
|
||||||
|
@ -150,7 +147,7 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
|
||||||
if( Ft_get_data_crc( trailer ) != LZd_crc( decoder ) )
|
if( Ft_get_data_crc( trailer ) != LZd_crc( decoder ) )
|
||||||
{
|
{
|
||||||
error = true;
|
error = true;
|
||||||
if( pp->verbosity >= 0 )
|
if( verbosity >= 0 )
|
||||||
{
|
{
|
||||||
Pp_show_msg( pp, 0 );
|
Pp_show_msg( pp, 0 );
|
||||||
fprintf( stderr, "CRC mismatch; trailer says %08X, data CRC is %08X.\n",
|
fprintf( stderr, "CRC mismatch; trailer says %08X, data CRC is %08X.\n",
|
||||||
|
@ -160,7 +157,7 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
|
||||||
if( Ft_get_data_size( trailer ) != LZd_data_position( decoder ) )
|
if( Ft_get_data_size( trailer ) != LZd_data_position( decoder ) )
|
||||||
{
|
{
|
||||||
error = true;
|
error = true;
|
||||||
if( pp->verbosity >= 0 )
|
if( verbosity >= 0 )
|
||||||
{
|
{
|
||||||
Pp_show_msg( pp, 0 );
|
Pp_show_msg( pp, 0 );
|
||||||
fprintf( stderr, "Data size mismatch; trailer says %llu, data size is %llu (0x%llX).\n",
|
fprintf( stderr, "Data size mismatch; trailer says %llu, data size is %llu (0x%llX).\n",
|
||||||
|
@ -170,19 +167,19 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
|
||||||
if( Ft_get_member_size( trailer ) != member_size )
|
if( Ft_get_member_size( trailer ) != member_size )
|
||||||
{
|
{
|
||||||
error = true;
|
error = true;
|
||||||
if( pp->verbosity >= 0 )
|
if( verbosity >= 0 )
|
||||||
{
|
{
|
||||||
Pp_show_msg( pp, 0 );
|
Pp_show_msg( pp, 0 );
|
||||||
fprintf( stderr, "Member size mismatch; trailer says %llu, member size is %llu (0x%llX).\n",
|
fprintf( stderr, "Member size mismatch; trailer says %llu, member size is %llu (0x%llX).\n",
|
||||||
Ft_get_member_size( trailer ), member_size, member_size );
|
Ft_get_member_size( trailer ), member_size, member_size );
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if( !error && pp->verbosity >= 2 && LZd_data_position( decoder ) > 0 && member_size > 0 )
|
if( !error && verbosity >= 2 && LZd_data_position( decoder ) > 0 && member_size > 0 )
|
||||||
fprintf( stderr, "%6.3f:1, %6.3f bits/byte, %5.2f%% saved. ",
|
fprintf( stderr, "%6.3f:1, %6.3f bits/byte, %5.2f%% saved. ",
|
||||||
(double)LZd_data_position( decoder ) / member_size,
|
(double)LZd_data_position( decoder ) / member_size,
|
||||||
( 8.0 * member_size ) / LZd_data_position( decoder ),
|
( 8.0 * member_size ) / LZd_data_position( decoder ),
|
||||||
100.0 * ( 1.0 - ( (double)member_size / LZd_data_position( decoder ) ) ) );
|
100.0 * ( 1.0 - ( (double)member_size / LZd_data_position( decoder ) ) ) );
|
||||||
if( !error && pp->verbosity >= 4 )
|
if( !error && verbosity >= 4 )
|
||||||
fprintf( stderr, "data CRC %08X, data size %9llu, member size %8llu. ",
|
fprintf( stderr, "data CRC %08X, data size %9llu, member size %8llu. ",
|
||||||
Ft_get_data_crc( trailer ),
|
Ft_get_data_crc( trailer ),
|
||||||
Ft_get_data_size( trailer ), Ft_get_member_size( trailer ) );
|
Ft_get_data_size( trailer ), Ft_get_member_size( trailer ) );
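The three figures printed at verbosity level 2 in the hunk above follow directly from the data size and the member size. The sketch below restates that arithmetic as a standalone helper; `Ratio_stats`, `compute_stats` and `print_stats` are hypothetical names introduced here, and only the formulas and the format string are taken from the diff.

```c
#include <assert.h>
#include <stdio.h>

struct Ratio_stats            /* hypothetical helper type */
  {
  double ratio;               /* uncompressed size : compressed size */
  double bits_per_byte;       /* compressed bits per original byte */
  double percent_saved;       /* space saved, in percent */
  };

/* Same formulas as in LZd_verify_trailer; all zero if either size is 0. */
struct Ratio_stats compute_stats( const unsigned long long data_size,
                                  const unsigned long long member_size )
  {
  struct Ratio_stats s = { 0.0, 0.0, 0.0 };
  if( data_size > 0 && member_size > 0 )
    {
    s.ratio = (double)data_size / member_size;
    s.bits_per_byte = ( 8.0 * member_size ) / data_size;
    s.percent_saved = 100.0 * ( 1.0 - ( (double)member_size / data_size ) );
    }
  return s;
  }

/* Print the figures with the format string used in the diff. */
void print_stats( FILE * const f, const struct Ratio_stats s )
  {
  assert( f != NULL );
  fprintf( f, "%6.3f:1, %6.3f bits/byte, %5.2f%% saved. ",
           s.ratio, s.bits_per_byte, s.percent_saved );
  }
```

For example, 1000 bytes of data stored in a 250-byte member gives a ratio of 4, 2 bits per byte and 75% saved.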
|
||||||
|
@ -195,29 +192,30 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
|
||||||
int LZd_decode_member( struct LZ_decoder * const decoder,
|
int LZd_decode_member( struct LZ_decoder * const decoder,
|
||||||
struct Pretty_print * const pp )
|
struct Pretty_print * const pp )
|
||||||
{
|
{
|
||||||
|
struct Range_decoder * const rdec = decoder->rdec;
|
||||||
unsigned rep0 = 0; /* rep[0-3] latest four distances */
|
unsigned rep0 = 0; /* rep[0-3] latest four distances */
|
||||||
unsigned rep1 = 0; /* used for efficient coding of */
|
unsigned rep1 = 0; /* used for efficient coding of */
|
||||||
unsigned rep2 = 0; /* repeated distances */
|
unsigned rep2 = 0; /* repeated distances */
|
||||||
unsigned rep3 = 0;
|
unsigned rep3 = 0;
|
||||||
State state = 0;
|
State state = 0;
|
||||||
|
|
||||||
Rd_load( decoder->rdec );
|
Rd_load( rdec );
|
||||||
while( !Rd_finished( decoder->rdec ) )
|
while( !Rd_finished( rdec ) )
|
||||||
{
|
{
|
||||||
const int pos_state = LZd_data_position( decoder ) & pos_state_mask;
|
const int pos_state = LZd_data_position( decoder ) & pos_state_mask;
|
||||||
if( Rd_decode_bit( decoder->rdec, &decoder->bm_match[state][pos_state] ) == 0 ) /* 1st bit */
|
if( Rd_decode_bit( rdec, &decoder->bm_match[state][pos_state] ) == 0 ) /* 1st bit */
|
||||||
{
|
{
|
||||||
const uint8_t prev_byte = LZd_get_prev_byte( decoder );
|
const uint8_t prev_byte = LZd_get_prev_byte( decoder );
|
||||||
if( St_is_char( state ) )
|
if( St_is_char( state ) )
|
||||||
{
|
{
|
||||||
state -= ( state < 4 ) ? state : 3;
|
state -= ( state < 4 ) ? state : 3;
|
||||||
LZd_put_byte( decoder, Rd_decode_tree( decoder->rdec,
|
LZd_put_byte( decoder, Rd_decode_tree( rdec,
|
||||||
decoder->bm_literal[get_lit_state(prev_byte)], 8 ) );
|
decoder->bm_literal[get_lit_state(prev_byte)], 8 ) );
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
state -= ( state < 10 ) ? 3 : 6;
|
state -= ( state < 10 ) ? 3 : 6;
|
||||||
LZd_put_byte( decoder, Rd_decode_matched( decoder->rdec,
|
LZd_put_byte( decoder, Rd_decode_matched( rdec,
|
||||||
decoder->bm_literal[get_lit_state(prev_byte)],
|
decoder->bm_literal[get_lit_state(prev_byte)],
|
||||||
LZd_get_byte( decoder, rep0 ) ) );
|
LZd_get_byte( decoder, rep0 ) ) );
|
||||||
}
|
}
|
||||||
|
@ -225,22 +223,22 @@ int LZd_decode_member( struct LZ_decoder * const decoder,
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
int len;
|
int len;
|
||||||
if( Rd_decode_bit( decoder->rdec, &decoder->bm_rep[state] ) == 1 ) /* 2nd bit */
|
if( Rd_decode_bit( rdec, &decoder->bm_rep[state] ) == 1 ) /* 2nd bit */
|
||||||
{
|
{
|
||||||
if( Rd_decode_bit( decoder->rdec, &decoder->bm_rep0[state] ) == 0 ) /* 3rd bit */
|
if( Rd_decode_bit( rdec, &decoder->bm_rep0[state] ) == 0 ) /* 3rd bit */
|
||||||
{
|
{
|
||||||
if( Rd_decode_bit( decoder->rdec, &decoder->bm_len[state][pos_state] ) == 0 ) /* 4th bit */
|
if( Rd_decode_bit( rdec, &decoder->bm_len[state][pos_state] ) == 0 ) /* 4th bit */
|
||||||
{ state = St_set_short_rep( state );
|
{ state = St_set_short_rep( state );
|
||||||
LZd_put_byte( decoder, LZd_get_byte( decoder, rep0 ) ); continue; }
|
LZd_put_byte( decoder, LZd_get_byte( decoder, rep0 ) ); continue; }
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
unsigned distance;
|
unsigned distance;
|
||||||
if( Rd_decode_bit( decoder->rdec, &decoder->bm_rep1[state] ) == 0 ) /* 4th bit */
|
if( Rd_decode_bit( rdec, &decoder->bm_rep1[state] ) == 0 ) /* 4th bit */
|
||||||
distance = rep1;
|
distance = rep1;
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
if( Rd_decode_bit( decoder->rdec, &decoder->bm_rep2[state] ) == 0 ) /* 5th bit */
|
if( Rd_decode_bit( rdec, &decoder->bm_rep2[state] ) == 0 ) /* 5th bit */
|
||||||
distance = rep2;
|
distance = rep2;
|
||||||
else
|
else
|
||||||
{ distance = rep3; rep3 = rep2; }
|
{ distance = rep3; rep3 = rep2; }
|
||||||
|
@ -250,31 +248,30 @@ int LZd_decode_member( struct LZ_decoder * const decoder,
|
||||||
rep0 = distance;
|
rep0 = distance;
|
||||||
}
|
}
|
||||||
state = St_set_rep( state );
|
state = St_set_rep( state );
|
||||||
len = min_match_len + Rd_decode_len( decoder->rdec, &decoder->rep_len_model, pos_state );
|
len = min_match_len + Rd_decode_len( rdec, &decoder->rep_len_model, pos_state );
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
int dis_slot;
|
int dis_slot;
|
||||||
const unsigned rep0_saved = rep0;
|
const unsigned rep0_saved = rep0;
|
||||||
len = min_match_len + Rd_decode_len( decoder->rdec, &decoder->match_len_model, pos_state );
|
len = min_match_len + Rd_decode_len( rdec, &decoder->match_len_model, pos_state );
|
||||||
dis_slot = Rd_decode_tree6( decoder->rdec, decoder->bm_dis_slot[get_dis_state(len)] );
|
dis_slot = Rd_decode_tree6( rdec, decoder->bm_dis_slot[get_dis_state(len)] );
|
||||||
if( dis_slot < start_dis_model ) rep0 = dis_slot;
|
if( dis_slot < start_dis_model ) rep0 = dis_slot;
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
const int direct_bits = ( dis_slot >> 1 ) - 1;
|
const int direct_bits = ( dis_slot >> 1 ) - 1;
|
||||||
rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
|
rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
|
||||||
if( dis_slot < end_dis_model )
|
if( dis_slot < end_dis_model )
|
||||||
rep0 += Rd_decode_tree_reversed( decoder->rdec,
|
rep0 += Rd_decode_tree_reversed( rdec,
|
||||||
decoder->bm_dis + rep0 - dis_slot - 1,
|
decoder->bm_dis + rep0 - dis_slot - 1, direct_bits );
|
||||||
direct_bits );
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
rep0 += Rd_decode( decoder->rdec, direct_bits - dis_align_bits ) << dis_align_bits;
|
rep0 += Rd_decode( rdec, direct_bits - dis_align_bits ) << dis_align_bits;
|
||||||
rep0 += Rd_decode_tree_reversed4( decoder->rdec, decoder->bm_align );
|
rep0 += Rd_decode_tree_reversed4( rdec, decoder->bm_align );
|
||||||
if( rep0 == 0xFFFFFFFFU ) /* Marker found */
|
if( rep0 == 0xFFFFFFFFU ) /* Marker found */
|
||||||
{
|
{
|
||||||
rep0 = rep0_saved;
|
rep0 = rep0_saved;
|
||||||
Rd_normalize( decoder->rdec );
|
Rd_normalize( rdec );
|
||||||
LZd_flush_data( decoder );
|
LZd_flush_data( decoder );
|
||||||
if( len == min_match_len ) /* End Of Stream marker */
|
if( len == min_match_len ) /* End Of Stream marker */
|
||||||
{
|
{
|
||||||
|
@ -282,9 +279,9 @@ int LZd_decode_member( struct LZ_decoder * const decoder,
|
||||||
}
|
}
|
||||||
if( len == min_match_len + 1 ) /* Sync Flush marker */
|
if( len == min_match_len + 1 ) /* Sync Flush marker */
|
||||||
{
|
{
|
||||||
Rd_load( decoder->rdec ); continue;
|
Rd_load( rdec ); continue;
|
||||||
}
|
}
|
||||||
if( pp->verbosity >= 0 )
|
if( verbosity >= 0 )
|
||||||
{
|
{
|
||||||
Pp_show_msg( pp, 0 );
|
Pp_show_msg( pp, 0 );
|
||||||
fprintf( stderr, "Unsupported marker code '%d'.\n", len );
|
fprintf( stderr, "Unsupported marker code '%d'.\n", len );
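The distance decoding in LZd_decode_member above first turns a 6-bit distance slot into a base distance and a count of extra bits. The following sketch isolates that step, assuming start_dis_model is 4 as in lzip; `Slot_range` and `slot_range` are hypothetical names, and the shift uses an unsigned constant so that slot 63 does not overflow a signed int.

```c
#include <assert.h>

enum { start_dis_model = 4 };   /* assumed value, as in lzip */

struct Slot_range               /* hypothetical helper type */
  {
  unsigned base;                /* smallest distance coded by the slot */
  int direct_bits;              /* number of extra bits that follow */
  };

/* Map a distance slot (0..63) to its base distance and extra-bit count. */
struct Slot_range slot_range( const int dis_slot )
  {
  struct Slot_range r;
  assert( dis_slot >= 0 && dis_slot < 64 );
  if( dis_slot < start_dis_model )
    { r.base = (unsigned)dis_slot; r.direct_bits = 0; }
  else
    {
    r.direct_bits = ( dis_slot >> 1 ) - 1;
    r.base = ( 2U | ( (unsigned)dis_slot & 1 ) ) << r.direct_bits;
    }
  return r;
  }
```

Slot 63 has a base of 0xC0000000 and 30 extra bits, so the largest value it can produce is 0xFFFFFFFF. That is the value the decoder tests for when it looks for a marker, which is why a marker can be told apart from ordinary distances.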
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
/* Clzip - Data compressor based on the LZMA algorithm
|
/* Clzip - LZMA lossless data compressor
|
||||||
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
|
@ -237,7 +237,6 @@ struct LZ_decoder
|
||||||
int stream_pos; /* first byte not yet written to file */
|
int stream_pos; /* first byte not yet written to file */
|
||||||
uint32_t crc;
|
uint32_t crc;
|
||||||
int outfd; /* output file descriptor */
|
int outfd; /* output file descriptor */
|
||||||
int member_version;
|
|
||||||
|
|
||||||
Bit_model bm_literal[1<<literal_context_bits][0x300];
|
Bit_model bm_literal[1<<literal_context_bits][0x300];
|
||||||
Bit_model bm_match[states][pos_states];
|
Bit_model bm_match[states][pos_states];
|
||||||
|
@ -314,7 +313,6 @@ static inline bool LZd_init( struct LZ_decoder * const decoder,
|
||||||
decoder->stream_pos = 0;
|
decoder->stream_pos = 0;
|
||||||
decoder->crc = 0xFFFFFFFFU;
|
decoder->crc = 0xFFFFFFFFU;
|
||||||
decoder->outfd = ofd;
|
decoder->outfd = ofd;
|
||||||
decoder->member_version = Fh_version( header );
|
|
||||||
|
|
||||||
Bm_array_init( decoder->bm_literal[0], (1 << literal_context_bits) * 0x300 );
|
Bm_array_init( decoder->bm_literal[0], (1 << literal_context_bits) * 0x300 );
|
||||||
Bm_array_init( decoder->bm_match[0], states * pos_states );
|
Bm_array_init( decoder->bm_match[0], states * pos_states );
|
||||||
|
|
|
@ -1,12 +1,12 @@
|
||||||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.37.1.
|
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.37.1.
|
||||||
.TH CLZIP "1" "May 2013" "Clzip 1.5-pre1" "User Commands"
|
.TH CLZIP "1" "July 2013" "Clzip 1.5-pre2" "User Commands"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
Clzip \- reduces the size of files
|
Clzip \- reduces the size of files
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
.B clzip
|
.B clzip
|
||||||
[\fIoptions\fR] [\fIfiles\fR]
|
[\fIoptions\fR] [\fIfiles\fR]
|
||||||
.SH DESCRIPTION
|
.SH DESCRIPTION
|
||||||
Clzip \- Data compressor based on the LZMA algorithm.
|
Clzip \- LZMA lossless data compressor.
|
||||||
.SH OPTIONS
|
.SH OPTIONS
|
||||||
.TP
|
.TP
|
||||||
\fB\-h\fR, \fB\-\-help\fR
|
\fB\-h\fR, \fB\-\-help\fR
|
||||||
|
|
115
doc/clzip.info
|
@ -3,7 +3,7 @@ clzip.texinfo.
|
||||||
|
|
||||||
INFO-DIR-SECTION Data Compression
|
INFO-DIR-SECTION Data Compression
|
||||||
START-INFO-DIR-ENTRY
|
START-INFO-DIR-ENTRY
|
||||||
* Clzip: (clzip). Data compressor based on the LZMA algorithm
|
* Clzip: (clzip). LZMA lossless data compressor
|
||||||
END-INFO-DIR-ENTRY
|
END-INFO-DIR-ENTRY
|
||||||
|
|
||||||
|
|
||||||
|
@ -12,17 +12,17 @@ File: clzip.info, Node: Top, Next: Introduction, Up: (dir)
|
||||||
Clzip Manual
|
Clzip Manual
|
||||||
************
|
************
|
||||||
|
|
||||||
This manual is for Clzip (version 1.5-pre1, 13 May 2013).
|
This manual is for Clzip (version 1.5-pre2, 17 July 2013).
|
||||||
|
|
||||||
* Menu:
|
* Menu:
|
||||||
|
|
||||||
* Introduction:: Purpose and features of clzip
|
* Introduction:: Purpose and features of clzip
|
||||||
* Algorithm:: How clzip compresses the data
|
* Algorithm:: How clzip compresses the data
|
||||||
* Invoking Clzip:: Command line interface
|
* Invoking clzip:: Command line interface
|
||||||
* File Format:: Detailed format of the compressed file
|
* File format:: Detailed format of the compressed file
|
||||||
* Examples:: A small tutorial with examples
|
* Examples:: A small tutorial with examples
|
||||||
* Problems:: Reporting bugs
|
* Problems:: Reporting bugs
|
||||||
* Concept Index:: Index of concepts
|
* Concept index:: Index of concepts
|
||||||
|
|
||||||
|
|
||||||
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
|
||||||
|
@ -36,23 +36,39 @@ File: clzip.info, Node: Introduction, Next: Algorithm, Prev: Top, Up: Top
|
||||||
1 Introduction
|
1 Introduction
|
||||||
**************
|
**************
|
||||||
|
|
||||||
Clzip is a lossless data compressor based on the LZMA algorithm, with
|
Clzip is a lossless data compressor with a user interface similar to the
|
||||||
very safe integrity checking and a user interface similar to the one of
|
one of gzip or bzip2. Clzip decompresses almost as fast as gzip and
|
||||||
gzip or bzip2. Clzip decompresses almost as fast as gzip and compresses
|
compresses more than bzip2, which makes it well suited for software
|
||||||
better than bzip2, which makes it well suited for software distribution
|
distribution and data archiving. Clzip is a clean implementation of the
|
||||||
and data archiving.
|
LZMA algorithm.
|
||||||
|
|
||||||
Clzip uses the same well-defined exit status values used by bzip2,
|
Clzip uses the same well-defined exit status values used by lzip and
|
||||||
which makes it safer when used in pipes or scripts than compressors
|
bzip2, which makes it safer when used in pipes or scripts than
|
||||||
returning ambiguous warning values, like gzip.
|
compressors returning ambiguous warning values, like gzip.
|
||||||
|
|
||||||
Clzip uses the lzip file format; the files produced by clzip are
|
Clzip uses the lzip file format; the files produced by clzip are
|
||||||
fully compatible with lzip-1.4 or newer. Clzip is in fact a C language
|
fully compatible with lzip-1.4 or newer, and can be rescued with
|
||||||
version of lzip, intended for embedded devices or systems lacking a C++
|
lziprecover. Clzip is in fact a C language version of lzip, intended
|
||||||
compiler.
|
for embedded devices or systems lacking a C++ compiler.
|
||||||
|
|
||||||
|
The lzip file format is designed for long-term data archiving and
|
||||||
|
provides very safe integrity checking. The member trailer stores the
|
||||||
|
32-bit CRC of the original data, the size of the original data and the
|
||||||
|
size of the member. These values, together with the value remaining in
|
||||||
|
the range decoder and the end-of-stream marker, provide a 4 factor
|
||||||
|
integrity checking which guarantees that the decompressed version of the
|
||||||
|
data is identical to the original. This guards against corruption of the
|
||||||
|
compressed data, and against undetected bugs in clzip (hopefully very
|
||||||
|
unlikely). The chances of data corruption going undetected are
|
||||||
|
microscopic. Be aware, though, that the check occurs upon decompression,
|
||||||
|
so it can only tell you that something is wrong. It can't help you
|
||||||
|
recover the original uncompressed data.
|
||||||
|
|
||||||
If you ever need to recover data from a damaged lzip file, try the
|
If you ever need to recover data from a damaged lzip file, try the
|
||||||
lziprecover program.
|
lziprecover program. Lziprecover makes lzip files resistant to bit-flip
|
||||||
|
(one of the most common forms of data corruption), and provides data
|
||||||
|
recovery capabilities, including error-checked merging of damaged copies
|
||||||
|
of a file.
|
||||||
|
|
||||||
Clzip replaces every file given in the command line with a compressed
|
Clzip replaces every file given in the command line with a compressed
|
||||||
version of itself, with the name "original_name.lz". Each compressed
|
version of itself, with the name "original_name.lz". Each compressed
|
||||||
|
@ -99,20 +115,8 @@ filename.lz becomes filename
|
||||||
filename.tlz becomes filename.tar
|
filename.tlz becomes filename.tar
|
||||||
anyothername becomes anyothername.out
|
anyothername becomes anyothername.out
|
||||||
|
|
||||||
As a self-check for your protection, clzip stores in the member
|
|
||||||
trailer the 32-bit CRC of the original data, the size of the original
|
|
||||||
data and the size of the member. These values, together with the value
|
|
||||||
remaining in the range decoder and the end-of-stream marker, provide a
|
|
||||||
very safe 4 factor integrity checking which guarantees that the
|
|
||||||
decompressed version of the data is identical to the original. This
|
|
||||||
guards against corruption of the compressed data, and against
|
|
||||||
undetected bugs in clzip (hopefully very unlikely). The chances of data
|
|
||||||
corruption going undetected are microscopic. Be aware, though, that the
|
|
||||||
check occurs upon decompression, so it can only tell you that something
|
|
||||||
is wrong. It can't help you recover the original uncompressed data.
|
|
||||||
|
|
||||||
|
|
||||||
File: clzip.info, Node: Algorithm, Next: Invoking Clzip, Prev: Introduction, Up: Top
|
File: clzip.info, Node: Algorithm, Next: Invoking clzip, Prev: Introduction, Up: Top
|
||||||
|
|
||||||
2 Algorithm
|
2 Algorithm
|
||||||
***********
|
***********
|
||||||
|
@ -173,9 +177,9 @@ range encoding), Igor Pavlov (for putting all the above together in
|
||||||
LZMA), and Julian Seward (for bzip2's CLI and the idea of unzcrash).
|
LZMA), and Julian Seward (for bzip2's CLI and the idea of unzcrash).
|
||||||
|
|
||||||
|
|
||||||
File: clzip.info, Node: Invoking Clzip, Next: File Format, Prev: Algorithm, Up: Top
|
File: clzip.info, Node: Invoking clzip, Next: File format, Prev: Algorithm, Up: Top
|
||||||
|
|
||||||
3 Invoking Clzip
|
3 Invoking clzip
|
||||||
****************
|
****************
|
||||||
|
|
||||||
The format for running clzip is:
|
The format for running clzip is:
|
||||||
|
@ -278,10 +282,10 @@ The format for running clzip is:
|
||||||
`--verbose'
|
`--verbose'
|
||||||
Verbose mode.
|
Verbose mode.
|
||||||
When compressing, show the compression ratio for each file
|
When compressing, show the compression ratio for each file
|
||||||
processed.
|
processed. A second -v shows the progress of compression.
|
||||||
When decompressing or testing, further -v's (up to 4) increase the
|
When decompressing or testing, further -v's (up to 4) increase the
|
||||||
verbosity level, showing status, dictionary size, compression
|
verbosity level, showing status, compression ratio, dictionary
|
||||||
ratio, and trailer contents (CRC, data size, member size).
|
size, and trailer contents (CRC, data size, member size).
|
||||||
|
|
||||||
`-1 .. -9'
|
`-1 .. -9'
|
||||||
Set the compression parameters (dictionary size and match length
|
Set the compression parameters (dictionary size and match length
|
||||||
|
@ -333,9 +337,9 @@ invalid input file, 3 for an internal consistency error (eg, bug) which
|
||||||
caused clzip to panic.
|
caused clzip to panic.
|
||||||
|
|
||||||
|
|
||||||
File: clzip.info, Node: File Format, Next: Examples, Prev: Invoking Clzip, Up: Top
|
File: clzip.info, Node: File format, Next: Examples, Prev: Invoking clzip, Up: Top
|
||||||
|
|
||||||
4 File Format
|
4 File format
|
||||||
*************
|
*************
|
||||||
|
|
||||||
Perfection is reached, not when there is no longer anything to add, but
|
Perfection is reached, not when there is no longer anything to add, but
|
||||||
|
@ -389,7 +393,8 @@ additional information before, between, or after them.
|
||||||
|
|
||||||
`Lzma stream'
|
`Lzma stream'
|
||||||
The lzma stream, finished by an end of stream marker. Uses default
|
The lzma stream, finished by an end of stream marker. Uses default
|
||||||
values for encoder properties.
|
values for encoder properties. See the lzip manual for a full
|
||||||
|
description.
|
||||||
|
|
||||||
`CRC32 (4 bytes)'
|
`CRC32 (4 bytes)'
|
||||||
CRC of the uncompressed original data.
|
CRC of the uncompressed original data.
|
||||||
|
@ -405,7 +410,7 @@ additional information before, between, or after them.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
File: clzip.info, Node: Examples, Next: Problems, Prev: File Format, Up: Top
|
File: clzip.info, Node: Examples, Next: Problems, Prev: File format, Up: Top
|
||||||
|
|
||||||
5 A small tutorial with examples
|
5 A small tutorial with examples
|
||||||
********************************
|
********************************
|
||||||
|
@ -478,7 +483,7 @@ file with a member size of 32MiB.
|
||||||
clzip -b 32MiB -S 650MB big_db
|
clzip -b 32MiB -S 650MB big_db
|
||||||
|
|
||||||
|
|
||||||
File: clzip.info, Node: Problems, Next: Concept Index, Prev: Examples, Up: Top
|
File: clzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
|
||||||
|
|
||||||
6 Reporting Bugs
|
6 Reporting Bugs
|
||||||
****************
|
****************
|
||||||
|
@ -493,9 +498,9 @@ for all eternity, if not longer.
|
||||||
by running `clzip --version'.
|
by running `clzip --version'.
|
||||||
|
|
||||||
|
|
||||||
File: clzip.info, Node: Concept Index, Prev: Problems, Up: Top
|
File: clzip.info, Node: Concept index, Prev: Problems, Up: Top
|
||||||
|
|
||||||
Concept Index
|
Concept index
|
||||||
*************
|
*************
|
||||||
|
|
||||||
|