
Merging upstream version 1.2.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-20 15:06:26 +01:00
parent 270325f71d
commit 0d06ac25e1
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
20 changed files with 207 additions and 138 deletions


@ -1,7 +1,6 @
 Lzd was written by Antonio Diaz Diaz.
 The ideas embodied in lzd are due to (at least) the following people:
-Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
-the definition of Markov chains), G.N.N. Martin (for the definition of
-range encoding), and Igor Pavlov (for putting all the above together in
-LZMA).
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), and Igor Pavlov (for putting all the above together in LZMA).

26
COPYING

@ -1,17 +1,17 @
 Lzd - Educational decompressor for the lzip format
 Copyright (C) Antonio Diaz Diaz.
 This program is free software. Redistribution and use in source and
 binary forms, with or without modification, are permitted provided
 that the following conditions are met:
 1. Redistributions of source code must retain the above copyright
-notice, this list of conditions and the following disclaimer.
+notice, this list of conditions, and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
-notice, this list of conditions and the following disclaimer in the
+notice, this list of conditions, and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


@ -1,7 +1,16 @
+2021-01-04 Antonio Diaz Diaz <antonio@gnu.org>
+* Version 1.2 released.
+* lzd.cc (main): Verify also mismatches in member size.
+Accept and ignore the option '-d' for compatibility with zutils.
+Remove warning about "lzd not safe for real work".
+Print license notice.
+* testsuite: Add 10 new test files.
 2019-01-11 Antonio Diaz Diaz <antonio@gnu.org>
 * Version 1.1 released.
-* File_* renamed to Lzip_*.
+* Rename File_* to Lzip_*.
 * lzd.cc: Compile on DOS with DJGPP.
 * configure: Accept appending to CXXFLAGS, 'CXXFLAGS+=OPTIONS'.
@ -19,7 +28,7 @
 2016-01-23 Antonio Diaz Diaz <antonio@gnu.org>
 * Version 0.8 released.
-* Documented that lzip does not use 'literal_pos_state_bits'.
+* Document that lzip does not use 'literal_pos_state_bits'.
 2015-07-07 Antonio Diaz Diaz <antonio@gnu.org>
@ -39,7 +48,7 @
 2013-08-01 Antonio Diaz Diaz <antonio@gnu.org>
 * Version 0.4 released.
-* check.sh: Removed '/dev/full' from tests.
+* check.sh: Remove '/dev/full' from tests.
 2013-07-24 Antonio Diaz Diaz <antonio@gnu.org>
@ -49,15 +58,15 @
 2013-05-06 Antonio Diaz Diaz <antonio@gnu.org>
 * Version 0.2 released.
-* main.c: Added a missing '#include' for OS/2.
+* main.c: Add a missing '#include' for OS/2.
 2013-03-21 Antonio Diaz Diaz <ant_diaz@teleline.es>
 * Version 0.1 released.
-Copyright (C) 2013-2019 Antonio Diaz Diaz.
+Copyright (C) 2013-2021 Antonio Diaz Diaz.
 This file is a collection of facts, and thus it is not copyrightable,
-but just in case, you have unlimited permission to copy, distribute and
+but just in case, you have unlimited permission to copy, distribute, and
 modify it.

14
INSTALL

@ -1,7 +1,7 @
 Requirements
 ------------
-You will need a C++ compiler.
-I use gcc 5.3.0 and 4.1.2, but the code should compile with any standards
+You will need a C++11 compiler. (gcc 3.3.6 or newer is recommended).
+I use gcc 6.1.0 and 4.1.2, but the code should compile with any standards
 compliant compiler.
 Gcc is available at http://gcc.gnu.org.
@ -36,10 +36,10 @ the main archive.
 Another way
 -----------
 You can also compile lzd into a separate directory.
-To do this, you must use a version of 'make' that supports the 'VPATH'
-variable, such as GNU 'make'. 'cd' to the directory where you want the
+To do this, you must use a version of 'make' that supports the variable
+'VPATH', such as GNU 'make'. 'cd' to the directory where you want the
 object files and executables to go and run the 'configure' script.
-'configure' automatically checks for the source code in '.', in '..' and
+'configure' automatically checks for the source code in '.', in '..', and
 in the directory that 'configure' is in.
 'configure' recognizes the option '--srcdir=DIR' to control where to
@ -50,7 +50,7 @ After running 'configure', you can run 'make' and 'make install' as
 explained above.
-Copyright (C) 2013-2019 Antonio Diaz Diaz.
+Copyright (C) 2013-2021 Antonio Diaz Diaz.
 This file is free documentation: you have unlimited permission to copy,
-distribute and modify it.
+distribute, and modify it.


@ -63,7 +63,7 @@ install-info :
-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"* -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
$(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info" $(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info"
-if $(CAN_RUN_INSTALLINFO) ; then \ -if $(CAN_RUN_INSTALLINFO) ; then \
install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
fi fi
install-info-compress : install-info install-info-compress : install-info
@ -84,7 +84,7 @@ uninstall-bin :
uninstall-info : uninstall-info :
-if $(CAN_RUN_INSTALLINFO) ; then \ -if $(CAN_RUN_INSTALLINFO) ; then \
install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
fi fi
-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"* -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
@ -105,7 +105,10 @@ dist : doc
$(DISTNAME)/*.cc \ $(DISTNAME)/*.cc \
$(DISTNAME)/testsuite/check.sh \ $(DISTNAME)/testsuite/check.sh \
$(DISTNAME)/testsuite/test.txt \ $(DISTNAME)/testsuite/test.txt \
$(DISTNAME)/testsuite/test.txt.lz $(DISTNAME)/testsuite/fox.lz \
$(DISTNAME)/testsuite/fox_*.lz \
$(DISTNAME)/testsuite/test.txt.lz \
$(DISTNAME)/testsuite/test_em.txt.lz
rm -f $(DISTNAME) rm -f $(DISTNAME)
lzip -v -9 $(DISTNAME).tar lzip -v -9 $(DISTNAME).tar

18
NEWS

@ -1,8 +1,16 @
-Changes in version 1.1:
+Changes in version 1.2:
-All 'File_*' identifiers have been renamed to 'Lzip_*'.
+Mismatches in member size are now verified. Lzd is now compliant with the
+lzip specification; it verifies the 3 integrity factors.
-Lzd now should compile on DOS with DJGPP.
+Lzd now accepts (and ignores) the option '-d'. This allows it to be used as
+argument to the option '--lz' of the tools from the zutils package.
-The configure script now accepts appending options to CXXFLAGS using the
-syntax 'CXXFLAGS+=OPTIONS'.
+The warning about "lzd not safe for real work" has been removed.
+(Lzd is safe, just not very convenient to use).
+10 new test files have been added to the testsuite.
+The source code of lzd is now used as a reference in the description of the
+media type 'application/lzip'.
+See http://datatracker.ietf.org/doc/draft-diaz-lzip
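
The three integrity factors named in NEWS correspond to the three fields of the 20-byte member trailer that the lzd.cc change in this commit checks: the CRC32 (bytes 0-3), the data size (bytes 4-11), and the member size (bytes 12-19), each stored least significant byte first. The following is a minimal sketch of that check, not the code from lzd.cc; the names Trailer_fields, get_le, parse_trailer, and verify_trailer are hypothetical and were chosen only for illustration.

```cpp
#include <stdint.h>

// Hypothetical helper type (not in lzd.cc): the three trailer fields.
struct Trailer_fields
  {
  uint32_t crc;                     // bytes 0-3, CRC32 of the uncompressed data
  unsigned long long data_size;     // bytes 4-11, size of the uncompressed data
  unsigned long long member_size;   // bytes 12-19, size of the whole member
  };

// Read 'len' bytes starting at index 'first' as a little-endian integer.
unsigned long long get_le( const uint8_t * const buf, const int first,
                           const int len )
  {
  unsigned long long value = 0;
  for( int i = first + len - 1; i >= first; --i )
    value = ( value << 8 ) + buf[i];
  return value;
  }

// Split a 20-byte trailer into its three fields.
Trailer_fields parse_trailer( const uint8_t * const trailer )
  {
  Trailer_fields f;
  f.crc = (uint32_t)get_le( trailer, 0, 4 );
  f.data_size = get_le( trailer, 4, 8 );
  f.member_size = get_le( trailer, 12, 8 );
  return f;
  }

// Compare the stored fields with the values computed while decoding.
// Returns 0 if all three match, else 2 (the exit status lzd uses for
// corrupt or invalid input). All three checks always run, so a caller
// could report every mismatch instead of stopping at the first one.
int verify_trailer( const Trailer_fields & f, const uint32_t crc,
                    const unsigned long long data_pos,
                    const unsigned long long member_pos )
  {
  int retval = 0;
  if( f.crc != crc ) retval = 2;                  // CRC mismatch
  if( f.data_size != data_pos ) retval = 2;       // data size mismatch
  if( f.member_size != member_pos ) retval = 2;   // member size mismatch
  return retval;
  }
```

A caller would pass the CRC and the two positions accumulated during decoding. In lzd.cc those values come from decoder.crc(), decoder.data_position(), and decoder.member_position().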

64
README

@ -2,52 +2,60 @ Description
 Lzd is a simplified decompressor for the lzip format with an educational
 purpose. Studying its source is a good first step to understand how lzip
-works. It is not safe to use lzd for any real work.
+works.
-The source of lzd is used in the lzip manual as a reference decompressor
-in the description of the lzip file format. Reading the lzip manual will
-help you understand the source.
+The source of lzd is used in the lzip manual as a reference decompressor in
+the description of the lzip file format. Reading the lzip manual will help
+you understand the source. Lzd is compliant with the lzip specification; it
+verifies the 3 integrity factors.
-Lzd decompresses from standard input to standard output. Lzd will
-correctly decompress the concatenation of two or more compressed files.
-The result is the concatenation of the corresponding decompressed data.
-Integrity of such concatenated compressed input is also verified.
+The source of lzd is also used as a reference in the description of the
+media type 'application/lzip'.
+See http://datatracker.ietf.org/doc/draft-diaz-lzip
+Lzd decompresses from standard input to standard output. It accepts (and
+ignores) the option '-d' for compatibility with other lzip tools. In
+particular, accepting the option '-d' allows lzd to be used as argument to
+the option '--lz' of the tools from the zutils package.
+Lzd will correctly decompress the concatenation of two or more compressed
+files. The result is the concatenation of the corresponding decompressed
+data. Integrity of such concatenated compressed input is also verified.
 The lzip file format is designed for data sharing and long-term archiving,
 taking into account both data integrity and decoder availability:
 * The lzip format provides very safe integrity checking and some data
-recovery means. The lziprecover program can repair bit flip errors
-(one of the most common forms of data corruption) in lzip files,
-and provides data recovery capabilities, including error-checked
-merging of damaged copies of a file.
+recovery means. The program lziprecover can repair bit flip errors
+(one of the most common forms of data corruption) in lzip files, and
+provides data recovery capabilities, including error-checked merging
+of damaged copies of a file.
-* The lzip format is as simple as possible (but not simpler). The
-lzip manual provides the source code of a simple decompressor
-along with a detailed explanation of how it works, so that with
-the only help of the lzip manual it would be possible for a
-digital archaeologist to extract the data from a lzip file long
-after quantum computers eventually render LZMA obsolete.
+* The lzip format is as simple as possible (but not simpler). The lzip
+manual provides the source code of a simple decompressor along with a
+detailed explanation of how it works, so that with the only help of the
+lzip manual it would be possible for a digital archaeologist to extract
+the data from a lzip file long after quantum computers eventually
+render LZMA obsolete.
 * Additionally the lzip reference implementation is copylefted, which
 guarantees that it will remain free forever.
-A nice feature of the lzip format is that a corrupt byte is easier to
-repair the nearer it is from the beginning of the file. Therefore, with
-the help of lziprecover, losing an entire archive just because of a
-corrupt byte near the beginning is a thing of the past.
+A nice feature of the lzip format is that a corrupt byte is easier to repair
+the nearer it is from the beginning of the file. Therefore, with the help of
+lziprecover, losing an entire archive just because of a corrupt byte near
+the beginning is a thing of the past.
 The ideas embodied in lzd are due to (at least) the following people:
-Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
-the definition of Markov chains), G.N.N. Martin (for the definition of
-range encoding), and Igor Pavlov (for putting all the above together in
-LZMA).
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), and Igor Pavlov (for putting all the above together in LZMA).
-Copyright (C) 2013-2019 Antonio Diaz Diaz.
+Copyright (C) 2013-2021 Antonio Diaz Diaz.
 This file is free documentation: you have unlimited permission to copy,
-distribute and modify it.
+distribute, and modify it.
 The file Makefile.in is a data file used by configure to produce the
 Makefile. It has the same copyright owner and permissions that configure

25
configure vendored

@ -1,12 +1,12 @@
#! /bin/sh #! /bin/sh
# configure script for Lzd - Educational decompressor for the lzip format # configure script for Lzd - Educational decompressor for the lzip format
# Copyright (C) 2013-2019 Antonio Diaz Diaz. # Copyright (C) 2013-2021 Antonio Diaz Diaz.
# #
# This configure script is free software: you have unlimited permission # This configure script is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute, and modify it.
pkgname=lzd pkgname=lzd
pkgversion=1.1 pkgversion=1.2
progname=lzd progname=lzd
srctrigger=lzd.cc srctrigger=lzd.cc
@ -26,11 +26,7 @@ CXXFLAGS='-Wall -W -O2'
LDFLAGS= LDFLAGS=
# checking whether we are using GNU C++. # checking whether we are using GNU C++.
/bin/sh -c "${CXX} --version" > /dev/null 2>&1 || /bin/sh -c "${CXX} --version" > /dev/null 2>&1 || { CXX=c++ ; CXXFLAGS=-O2 ; }
{
CXX=c++
CXXFLAGS=-O2
}
# Loop over all args # Loop over all args
args= args=
@ -42,11 +38,12 @@ while [ $# != 0 ] ; do
shift shift
# Add the argument quoted to args # Add the argument quoted to args
args="${args} \"${option}\"" if [ -z "${args}" ] ; then args="\"${option}\""
else args="${args} \"${option}\"" ; fi
# Split out the argument for options that take them # Split out the argument for options that take them
case ${option} in case ${option} in
*=*) optarg=`echo ${option} | sed -e 's,^[^=]*=,,;s,/$,,'` ;; *=*) optarg=`echo "${option}" | sed -e 's,^[^=]*=,,;s,/$,,'` ;;
esac esac
# Process the options # Process the options
@ -125,7 +122,7 @@ if [ -z "${srcdir}" ] ; then
if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi
if [ ! -r "${srcdir}/${srctrigger}" ] ; then if [ ! -r "${srcdir}/${srctrigger}" ] ; then
## the sed command below emulates the dirname command ## the sed command below emulates the dirname command
srcdir=`echo $0 | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'` srcdir=`echo "$0" | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
fi fi
fi fi
@ -148,7 +145,7 @@ if [ -z "${no_create}" ] ; then
# Run this file to recreate the current configuration. # Run this file to recreate the current configuration.
# #
# This script is free software: you have unlimited permission # This script is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute, and modify it.
exec /bin/sh $0 ${args} --no-create exec /bin/sh $0 ${args} --no-create
EOF EOF
@ -170,11 +167,11 @@ echo "LDFLAGS = ${LDFLAGS}"
rm -f Makefile rm -f Makefile
cat > Makefile << EOF cat > Makefile << EOF
# Makefile for Lzd - Educational decompressor for the lzip format # Makefile for Lzd - Educational decompressor for the lzip format
# Copyright (C) 2013-2019 Antonio Diaz Diaz. # Copyright (C) 2013-2021 Antonio Diaz Diaz.
# This file was generated automatically by configure. Don't edit. # This file was generated automatically by configure. Don't edit.
# #
# This Makefile is free software: you have unlimited permission # This Makefile is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute, and modify it.
pkgname = ${pkgname} pkgname = ${pkgname}
pkgversion = ${pkgversion} pkgversion = ${pkgversion}

129
lzd.cc

@ -1,25 +1,25 @@
/* Lzd - Educational decompressor for the lzip format /* Lzd - Educational decompressor for the lzip format
Copyright (C) 2013-2019 Antonio Diaz Diaz. Copyright (C) 2013-2021 Antonio Diaz Diaz.
This program is free software. Redistribution and use in source and This program is free software. Redistribution and use in source and
binary forms, with or without modification, are permitted provided binary forms, with or without modification, are permitted provided
that the following conditions are met: that the following conditions are met:
1. Redistributions of source code must retain the above copyright 1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer. notice, this list of conditions, and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright 2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the notice, this list of conditions, and the following disclaimer in the
documentation and/or other materials provided with the distribution. documentation and/or other materials provided with the distribution.
This program is distributed in the hope that it will be useful, This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*/ */
/* /*
Exit status: 0 for a normal exit, 1 for environmental problems Exit status: 0 for a normal exit, 1 for environmental problems
(file not found, invalid flags, I/O errors, etc), 2 to indicate a (file not found, invalid flags, I/O errors, etc), 2 to indicate a
corrupt or invalid input file. corrupt or invalid input file.
*/ */
#include <algorithm> #include <algorithm>
@ -47,7 +47,7 @@ public:
void set_char() void set_char()
{ {
static const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 }; const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
st = next[st]; st = next[st];
} }
void set_match() { st = ( st < 7 ) ? 7 : 10; } void set_match() { st = ( st < 7 ) ? 7 : 10; }
@ -69,7 +69,7 @@ enum {
dis_slot_bits = 6, dis_slot_bits = 6,
start_dis_model = 4, start_dis_model = 4,
end_dis_model = 14, end_dis_model = 14,
modeled_distances = 1 << (end_dis_model / 2), // 128 modeled_distances = 1 << ( end_dis_model / 2 ), // 128
dis_align_bits = 4, dis_align_bits = 4,
dis_align_size = 1 << dis_align_bits, dis_align_size = 1 << dis_align_bits,
@ -130,8 +130,9 @@ public:
const CRC32 crc32; const CRC32 crc32;
typedef uint8_t Lzip_header[6]; // 0-3 magic, 4 version, 5 coded_dict_size typedef uint8_t Lzip_header[6]; // 0-3 magic bytes
// 4 version
// 5 coded dictionary size
typedef uint8_t Lzip_trailer[20]; typedef uint8_t Lzip_trailer[20];
// 0-3 CRC32 of the uncompressed data // 0-3 CRC32 of the uncompressed data
// 4-11 size of the uncompressed data // 4-11 size of the uncompressed data
@ -139,16 +140,18 @@ typedef uint8_t Lzip_trailer[20];
class Range_decoder class Range_decoder
{ {
unsigned long long member_pos;
uint32_t code; uint32_t code;
uint32_t range; uint32_t range;
public: public:
Range_decoder() : code( 0 ), range( 0xFFFFFFFFU ) Range_decoder() : member_pos( 6 ), code( 0 ), range( 0xFFFFFFFFU )
{ {
for( int i = 0; i < 5; ++i ) code = (code << 8) | get_byte(); for( int i = 0; i < 5; ++i ) code = ( code << 8 ) | get_byte();
} }
uint8_t get_byte() { return std::getc( stdin ); } uint8_t get_byte() { ++member_pos; return std::getc( stdin ); }
unsigned long long member_position() const { return member_pos; }
unsigned decode( const int num_bits ) unsigned decode( const int num_bits )
{ {
@ -159,7 +162,7 @@ public:
symbol <<= 1; symbol <<= 1;
if( code >= range ) { code -= range; symbol |= 1; } if( code >= range ) { code -= range; symbol |= 1; }
if( range <= 0x00FFFFFFU ) // normalize if( range <= 0x00FFFFFFU ) // normalize
{ range <<= 8; code = (code << 8) | get_byte(); } { range <<= 8; code = ( code << 8 ) | get_byte(); }
} }
return symbol; return symbol;
} }
@ -171,7 +174,8 @@ public:
if( code < bound ) if( code < bound )
{ {
range = bound; range = bound;
bm.probability += (bit_model_total - bm.probability) >> bit_model_move_bits; bm.probability +=
( bit_model_total - bm.probability ) >> bit_model_move_bits;
symbol = 0; symbol = 0;
} }
else else
@ -182,7 +186,7 @@ public:
symbol = 1; symbol = 1;
} }
if( range <= 0x00FFFFFFU ) // normalize if( range <= 0x00FFFFFFU ) // normalize
{ range <<= 8; code = (code << 8) | get_byte(); } { range <<= 8; code = ( code << 8 ) | get_byte(); }
return symbol; return symbol;
} }
@ -191,7 +195,7 @@ public:
unsigned symbol = 1; unsigned symbol = 1;
for( int i = 0; i < num_bits; ++i ) for( int i = 0; i < num_bits; ++i )
symbol = ( symbol << 1 ) | decode_bit( bm[symbol] ); symbol = ( symbol << 1 ) | decode_bit( bm[symbol] );
return symbol - (1 << num_bits); return symbol - ( 1 << num_bits );
} }
unsigned decode_tree_reversed( Bit_model bm[], const int num_bits ) unsigned decode_tree_reversed( Bit_model bm[], const int num_bits )
@ -278,7 +282,11 @@ public:
~LZ_decoder() { delete[] buffer; } ~LZ_decoder() { delete[] buffer; }
unsigned crc() const { return crc_ ^ 0xFFFFFFFFU; } unsigned crc() const { return crc_ ^ 0xFFFFFFFFU; }
unsigned long long data_position() const { return partial_data_pos + pos; } unsigned long long data_position() const
{ return partial_data_pos + pos; }
uint8_t get_byte() { return rdec.get_byte(); }
unsigned long long member_position() const
{ return rdec.member_position(); }
bool decode_member(); bool decode_member();
}; };
@ -290,7 +298,6 @@ void LZ_decoder::flush_data()
{ {
const unsigned size = pos - stream_pos; const unsigned size = pos - stream_pos;
crc32.update_buf( crc_, buffer + stream_pos, size ); crc32.update_buf( crc_, buffer + stream_pos, size );
errno = 0;
if( std::fwrite( buffer + stream_pos, 1, size, stdout ) != size ) if( std::fwrite( buffer + stream_pos, 1, size, stdout ) != size )
{ std::fprintf( stderr, "Write error: %s\n", std::strerror( errno ) ); { std::fprintf( stderr, "Write error: %s\n", std::strerror( errno ) );
std::exit( 1 ); } std::exit( 1 ); }
@ -301,7 +308,7 @@ void LZ_decoder::flush_data()
} }
bool LZ_decoder::decode_member() // Returns false if error bool LZ_decoder::decode_member() // Returns false if error
{ {
Bit_model bm_literal[1<<literal_context_bits][0x300]; Bit_model bm_literal[1<<literal_context_bits][0x300];
Bit_model bm_match[State::states][pos_states]; Bit_model bm_match[State::states][pos_states];
@ -381,7 +388,8 @@ bool LZ_decoder::decode_member() // Returns false if error
direct_bits ); direct_bits );
else else
{ {
rep0 += rdec.decode( direct_bits - dis_align_bits ) << dis_align_bits; rep0 +=
rdec.decode( direct_bits - dis_align_bits ) << dis_align_bits;
rep0 += rdec.decode_tree_reversed( bm_align, dis_align_bits ); rep0 += rdec.decode_tree_reversed( bm_align, dis_align_bits );
if( rep0 == 0xFFFFFFFFU ) // marker found if( rep0 == 0xFFFFFFFFU ) // marker found
{ {
@ -403,20 +411,21 @@ bool LZ_decoder::decode_member() // Returns false if error
int main( const int argc, const char * const argv[] ) int main( const int argc, const char * const argv[] )
{ {
if( argc > 1 ) if( argc > 2 || ( argc == 2 && std::strcmp( argv[1], "-d" ) != 0 ) )
{ {
std::printf( "Lzd %s - Educational decompressor for the lzip format.\n", std::printf(
PROGVERSION ); "Lzd %s - Educational decompressor for the lzip format.\n"
std::printf( "Study the source to learn how a lzip decompressor works.\n" "Study the source to learn how a lzip decompressor works.\n"
"See the lzip manual for an explanation of the code.\n" "See the lzip manual for an explanation of the code.\n"
"It is not safe to use lzd for any real work.\n" "\nUsage: %s [-d] < file.lz > file\n"
"\nUsage: %s < file.lz > file\n", argv[0] ); "Lzd decompresses from standard input to standard output.\n"
std::printf( "Lzd decompresses from standard input to standard output.\n" "\nCopyright (C) 2021 Antonio Diaz Diaz.\n"
"\nCopyright (C) 2019 Antonio Diaz Diaz.\n" "License 2-clause BSD.\n"
"This is free software: you are free to change and redistribute it.\n" "This is free software: you are free to change and redistribute it.\n"
"There is NO WARRANTY, to the extent permitted by law.\n" "There is NO WARRANTY, to the extent permitted by law.\n"
"Report bugs to lzip-bug@nongnu.org\n" "Report bugs to lzip-bug@nongnu.org\n"
"Lzd home page: http://www.nongnu.org/lzip/lzd.html\n" ); "Lzd home page: http://www.nongnu.org/lzip/lzd.html\n",
PROGVERSION, argv[0] );
return 0; return 0;
} }
@ -432,9 +441,9 @@ int main( const int argc, const char * const argv[] )
if( std::feof( stdin ) || std::memcmp( header, "LZIP\x01", 5 ) != 0 ) if( std::feof( stdin ) || std::memcmp( header, "LZIP\x01", 5 ) != 0 )
{ {
if( first_member ) if( first_member )
{ std::fputs( "Bad magic number (file not in lzip format).\n", stderr ); { std::fputs( "Bad magic number (file not in lzip format).\n",
return 2; } stderr ); return 2; }
break; break; // ignore trailing data
} }
unsigned dict_size = 1 << ( header[5] & 0x1F ); unsigned dict_size = 1 << ( header[5] & 0x1F );
dict_size -= ( dict_size / 16 ) * ( ( header[5] >> 5 ) & 7 ); dict_size -= ( dict_size / 16 ) * ( ( header[5] >> 5 ) & 7 );
@ -447,17 +456,29 @@ int main( const int argc, const char * const argv[] )
 { std::fputs( "Data error\n", stderr ); return 2; }
 Lzip_trailer trailer; // verify trailer
-for( int i = 0; i < 20; ++i ) trailer[i] = std::getc( stdin );
+for( int i = 0; i < 20; ++i ) trailer[i] = decoder.get_byte();
+int retval = 0;
 unsigned crc = 0;
-for( int i = 3; i >= 0; --i ) { crc <<= 8; crc += trailer[i]; }
+for( int i = 3; i >= 0; --i ) crc = ( crc << 8 ) + trailer[i];
+if( crc != decoder.crc() )
+{ std::fputs( "CRC mismatch\n", stderr ); retval = 2; }
 unsigned long long data_size = 0;
-for( int i = 11; i >= 4; --i ) { data_size <<= 8; data_size += trailer[i]; }
-if( crc != decoder.crc() || data_size != decoder.data_position() )
-{ std::fputs( "CRC error\n", stderr ); return 2; }
+for( int i = 11; i >= 4; --i )
+data_size = ( data_size << 8 ) + trailer[i];
+if( data_size != decoder.data_position() )
+{ std::fputs( "Data size mismatch\n", stderr ); retval = 2; }
+unsigned long long member_size = 0;
+for( int i = 19; i >= 12; --i )
+member_size = ( member_size << 8 ) + trailer[i];
+if( member_size != decoder.member_position() )
+{ std::fputs( "Member size mismatch\n", stderr ); retval = 2; }
+if( retval ) return retval;
 }
 if( std::fclose( stdout ) != 0 )
-{ std::fprintf( stderr, "Error closing stdout: %s\n", std::strerror( errno ) );
-return 1; }
+{ std::fprintf( stderr, "Error closing stdout: %s\n",
+std::strerror( errno ) ); return 1; }
 return 0;
 }


@ -1,9 +1,9 @@
#! /bin/sh #! /bin/sh
# check script for Lzd - Educational decompressor for the lzip format # check script for Lzd - Educational decompressor for the lzip format
# Copyright (C) 2013-2019 Antonio Diaz Diaz. # Copyright (C) 2013-2021 Antonio Diaz Diaz.
# #
# This script is free software: you have unlimited permission # This script is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute, and modify it.
LC_ALL=C LC_ALL=C
export LC_ALL export LC_ALL
@ -30,6 +30,8 @@ cd "${objdir}"/tmp || framework_failure
in="${testdir}"/test.txt in="${testdir}"/test.txt
in_lz="${testdir}"/test.txt.lz in_lz="${testdir}"/test.txt.lz
in_em="${testdir}"/test_em.txt.lz
fox_lz="${testdir}"/fox.lz
fail=0 fail=0
test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; } test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; }
@ -38,16 +40,38 @@ printf "testing lzd-%s..." "$2"
"${LZIP}" < "${in}" 2> /dev/null "${LZIP}" < "${in}" 2> /dev/null
[ $? = 2 ] || test_failed $LINENO [ $? = 2 ] || test_failed $LINENO
"${LZIP}" < "${in_lz}" > copy || test_failed $LINENO for i in "${in_lz}" "${in_em}" ; do
cmp "${in}" copy || test_failed $LINENO "${LZIP}" < "$i" > copy || test_failed $LINENO "$i"
cmp "${in}" copy || test_failed $LINENO "$i"
done
cat "${in}" "${in}" > in2 || framework_failure cat "${in}" "${in}" > in2 || framework_failure
cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
"${LZIP}" < in2.lz > copy2 || test_failed $LINENO "${LZIP}" < in2.lz > copy2 || test_failed $LINENO
cmp in2 copy2 || test_failed $LINENO cmp in2 copy2 || test_failed $LINENO
cat in2.lz > copy2.lz || framework_failure
printf "\ngarbage" >> copy2.lz || framework_failure
"${LZIP}" -d < copy2.lz > copy2 || test_failed $LINENO
cmp in2 copy2 || test_failed $LINENO
rm -f in2 copy2 copy2.lz || framework_failure
printf "\ntesting bad input..." printf "\ntesting bad input..."
for i in fox_bm.lz fox_v2.lz fox_s11.lz fox_de20.lz \
fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
"${LZIP}" < "${testdir}"/$i > /dev/null 2>&1
[ $? = 2 ] || test_failed $LINENO $i
done
"${LZIP}" < "${fox_lz}" > fox || test_failed $LINENO
for i in fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
"${LZIP}" < "${testdir}"/$i > out 2> /dev/null
[ $? = 2 ] || test_failed $LINENO $i
cmp fox out || test_failed $LINENO $i
done
rm -f fox out || framework_failure
cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null && if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null &&
[ -e trunc.lz ] && cmp in2.lz trunc.lz > /dev/null 2>&1 ; then [ -e trunc.lz ] && cmp in2.lz trunc.lz > /dev/null 2>&1 ; then

BIN
testsuite/fox.lz Normal file

Binary file not shown.

BIN
testsuite/fox_bcrc.lz Normal file

Binary file not shown.

BIN
testsuite/fox_bm.lz Normal file

Binary file not shown.

BIN
testsuite/fox_crc0.lz Normal file

Binary file not shown.

BIN
testsuite/fox_das46.lz Normal file

Binary file not shown.

BIN
testsuite/fox_de20.lz Normal file

Binary file not shown.

BIN
testsuite/fox_mes81.lz Normal file

Binary file not shown.

BIN
testsuite/fox_s11.lz Normal file

Binary file not shown.

BIN
testsuite/fox_v2.lz Normal file

Binary file not shown.

BIN
testsuite/test_em.txt.lz Normal file

Binary file not shown.