
Merging upstream version 1.0.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-20 15:01:50 +01:00
parent fa46c2beb5
commit 57a6b04c20
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
10 changed files with 124 additions and 74 deletions

17
COPYING Normal file

@@ -0,0 +1,17 @@
+Lzd - Educational decompressor for the lzip format
+Copyright (C) Antonio Diaz Diaz.
+This program is free software. Redistribution and use in source and
+binary forms, with or without modification, are permitted provided
+that the following conditions are met:
+1. Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+2. Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ChangeLog

@@ -1,3 +1,9 @@
+2017-05-02 Antonio Diaz Diaz <antonio@gnu.org>
+	* Version 1.0 released.
+	* lzd.cc: Minor code improvements.
+	* testsuite/check.sh: A POSIX shell is required to run the tests.
 2016-05-10 Antonio Diaz Diaz <antonio@gnu.org>
 	* Version 0.9 released.
@@ -43,7 +49,7 @@
 	* Version 0.1 released.
-Copyright (C) 2013-2016 Antonio Diaz Diaz.
+Copyright (C) 2013-2017 Antonio Diaz Diaz.
 This file is a collection of facts, and thus it is not copyrightable,
 but just in case, you have unlimited permission to copy, distribute and

INSTALL

@@ -50,7 +50,7 @@ After running 'configure', you can run 'make' and 'make install' as
 explained above.
-Copyright (C) 2013-2016 Antonio Diaz Diaz.
+Copyright (C) 2013-2017 Antonio Diaz Diaz.
 This file is free documentation: you have unlimited permission to copy,
 distribute and modify it.

Makefile.in

@@ -95,16 +95,17 @@ dist : doc
 	ln -sf $(VPATH) $(DISTNAME)
 	tar -Hustar --owner=root --group=root -cvf $(DISTNAME).tar \
 	  $(DISTNAME)/AUTHORS \
+	  $(DISTNAME)/COPYING \
 	  $(DISTNAME)/ChangeLog \
 	  $(DISTNAME)/INSTALL \
 	  $(DISTNAME)/Makefile.in \
 	  $(DISTNAME)/NEWS \
 	  $(DISTNAME)/README \
 	  $(DISTNAME)/configure \
+	  $(DISTNAME)/*.cc \
 	  $(DISTNAME)/testsuite/check.sh \
 	  $(DISTNAME)/testsuite/test.txt \
-	  $(DISTNAME)/testsuite/test.txt.lz \
-	  $(DISTNAME)/*.cc
+	  $(DISTNAME)/testsuite/test.txt.lz
 	rm -f $(DISTNAME)
 	lzip -v -9 $(DISTNAME).tar

7
NEWS

@@ -1,4 +1,5 @@
-Changes in version 0.9:
+Changes in version 1.0:
-A configure warning happening on some shells when testing for g++ has
-been fixed.
+Minor code improvements have been made.
+The tests have been improved.

12
README

@@ -24,11 +24,11 @@ availability:
     merging of damaged copies of a file.
   * The lzip format is as simple as possible (but not simpler). The
-    lzip manual provides the code of a simple decompressor along with a
-    detailed explanation of how it works, so that with the only help of
-    the lzip manual it would be possible for a digital archaeologist to
-    extract the data from a lzip file long after quantum computers
-    eventually render LZMA obsolete.
+    lzip manual provides the source code of a simple decompressor along
+    with a detailed explanation of how it works, so that with the only
+    help of the lzip manual it would be possible for a digital
+    archaeologist to extract the data from a lzip file long after
+    quantum computers eventually render LZMA obsolete.
   * Additionally the lzip reference implementation is copylefted, which
     guarantees that it will remain free forever.
@@ -45,7 +45,7 @@ range encoding), and Igor Pavlov (for putting all the above together in
 LZMA).
-Copyright (C) 2013-2016 Antonio Diaz Diaz.
+Copyright (C) 2013-2017 Antonio Diaz Diaz.
 This file is free documentation: you have unlimited permission to copy,
 distribute and modify it.

21
configure vendored

@@ -1,12 +1,12 @@
 #! /bin/sh
 # configure script for Lzd - Educational decompressor for the lzip format
-# Copyright (C) 2013-2016 Antonio Diaz Diaz.
+# Copyright (C) 2013-2017 Antonio Diaz Diaz.
 #
 # This configure script is free software: you have unlimited permission
 # to copy, distribute and modify it.
 pkgname=lzd
-pkgversion=0.9
+pkgversion=1.0
 progname=lzd
 srctrigger=lzd.cc
@@ -26,11 +26,11 @@ CXXFLAGS='-Wall -W -O2'
 LDFLAGS=
 # checking whether we are using GNU C++.
-if /bin/sh -c "${CXX} --version" > /dev/null 2>&1 ; then true
-else
+/bin/sh -c "${CXX} --version" > /dev/null 2>&1 ||
+{
 	CXX=c++
-	CXXFLAGS='-W -O2'
-fi
+	CXXFLAGS=-O2
+}
 # Loop over all args
 args=
@@ -52,9 +52,12 @@ while [ $# != 0 ] ; do
 	# Process the options
 	case ${option} in
 	--help | -h)
-		echo "Usage: configure [options]"
+		echo "Usage: $0 [OPTION]... [VAR=VALUE]..."
 		echo
-		echo "Options: [defaults in brackets]"
+		echo "To assign makefile variables (e.g., CXX, CXXFLAGS...), specify them as"
+		echo "arguments to configure in the form VAR=VALUE."
+		echo
+		echo "Options and variables: [defaults in brackets]"
 		echo " -h, --help display this help and exit"
 		echo " -V, --version output version information and exit"
 		echo " --srcdir=DIR find the sources in DIR [. or ..]"
@@ -165,7 +168,7 @@ echo "LDFLAGS = ${LDFLAGS}"
 rm -f Makefile
 cat > Makefile << EOF
 # Makefile for Lzd - Educational decompressor for the lzip format
-# Copyright (C) 2013-2016 Antonio Diaz Diaz.
+# Copyright (C) 2013-2017 Antonio Diaz Diaz.
 # This file was generated automatically by configure. Don't edit.
 #
 # This Makefile is free software: you have unlimited permission

78
lzd.cc

@@ -1,5 +1,5 @@
 /* Lzd - Educational decompressor for the lzip format
-   Copyright (C) 2013-2016 Antonio Diaz Diaz.
+   Copyright (C) 2013-2017 Antonio Diaz Diaz.
    This program is free software. Redistribution and use in source and
    binary forms, with or without modification, are permitted provided
@@ -150,10 +150,10 @@ public:
   uint8_t get_byte() { return std::getc( stdin ); }
-  int decode( const int num_bits )
+  unsigned decode( const int num_bits )
     {
-    int symbol = 0;
-    for( int i = 0; i < num_bits; ++i )
+    unsigned symbol = 0;
+    for( int i = num_bits; i > 0; --i )
       {
       range >>= 1;
       symbol <<= 1;
@@ -164,9 +164,9 @@ public:
     return symbol;
     }
-  int decode_bit( Bit_model & bm )
+  unsigned decode_bit( Bit_model & bm )
     {
-    int symbol;
+    unsigned symbol;
     const uint32_t bound = ( range >> bit_model_total_bits ) * bm.probability;
     if( code < bound )
       {
@@ -186,18 +186,18 @@ public:
     return symbol;
     }
-  int decode_tree( Bit_model bm[], const int num_bits )
+  unsigned decode_tree( Bit_model bm[], const int num_bits )
     {
-    int symbol = 1;
+    unsigned symbol = 1;
     for( int i = 0; i < num_bits; ++i )
       symbol = ( symbol << 1 ) | decode_bit( bm[symbol] );
     return symbol - (1 << num_bits);
     }
-  int decode_tree_reversed( Bit_model bm[], const int num_bits )
+  unsigned decode_tree_reversed( Bit_model bm[], const int num_bits )
     {
-    int symbol = decode_tree( bm, num_bits );
-    int reversed_symbol = 0;
+    unsigned symbol = decode_tree( bm, num_bits );
+    unsigned reversed_symbol = 0;
     for( int i = 0; i < num_bits; ++i )
       {
       reversed_symbol = ( reversed_symbol << 1 ) | ( symbol & 1 );
@@ -206,14 +206,13 @@ public:
     return reversed_symbol;
     }
-  int decode_matched( Bit_model bm[], const int match_byte )
+  unsigned decode_matched( Bit_model bm[], const unsigned match_byte )
     {
-    Bit_model * const bm1 = bm + 0x100;
-    int symbol = 1;
+    unsigned symbol = 1;
     for( int i = 7; i >= 0; --i )
       {
-      const int match_bit = ( match_byte >> i ) & 1;
-      const int bit = decode_bit( bm1[(match_bit<<8)+symbol] );
+      const unsigned match_bit = ( match_byte >> i ) & 1;
+      const unsigned bit = decode_bit( bm[symbol+(match_bit<<8)+0x100] );
       symbol = ( symbol << 1 ) | bit;
       if( match_bit != bit )
         {
@@ -225,7 +224,7 @@ public:
     return symbol & 0xFF;
     }
-  int decode_len( Len_model & lm, const int pos_state )
+  unsigned decode_len( Len_model & lm, const int pos_state )
     {
     if( decode_bit( lm.choice1 ) == 0 )
       return decode_tree( lm.bm_low[pos_state], len_low_bits );
@@ -253,9 +252,9 @@ class LZ_decoder
   uint8_t peek( const unsigned distance ) const
     {
-    unsigned i = pos - distance - 1;
-    if( pos <= distance ) i += dictionary_size;
-    return buffer[i];
+    if( pos > distance ) return buffer[pos - distance - 1];
+    if( pos_wrapped ) return buffer[dictionary_size + pos - distance - 1];
+    return 0; // prev_byte of first byte
     }
   void put_byte( const uint8_t b )
@@ -274,7 +273,7 @@ public:
     stream_pos( 0 ),
     crc_( 0xFFFFFFFFU ),
     pos_wrapped( false )
-    { buffer[dictionary_size-1] = 0; } // prev_byte of first byte
+    {}
   ~LZ_decoder() { delete[] buffer; }
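
Note on the peek() change: the new version handles the "previous byte of the first byte" case inside the lookup instead of pre-seeding buffer[dictionary_size-1] in the constructor. The following is a minimal sketch of that three-way lookup on a simplified circular buffer. The `Window` struct and its `put` method are hypothetical names used only for illustration and are not part of lzd.cc; the sketch also uses a zero-initialized std::vector, whereas lzd allocates its buffer with new[].

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the circular dictionary buffer of LZ_decoder.
struct Window
  {
  std::vector<uint8_t> buffer;
  unsigned pos;			// next write position
  bool pos_wrapped;		// true once the buffer has been filled once

  explicit Window( const unsigned dictionary_size )
    : buffer( dictionary_size ), pos( 0 ), pos_wrapped( false )
    { assert( dictionary_size > 0 ); }

  // Same three cases as the new peek() in the diff.
  // The caller must ensure distance < buffer.size().
  uint8_t peek( const unsigned distance ) const
    {
    if( pos > distance ) return buffer[pos - distance - 1];
    if( pos_wrapped ) return buffer[buffer.size() + pos - distance - 1];
    return 0;			// nothing has been written at that distance yet
    }

  // Simplified put_byte: store the byte and wrap at the end of the buffer.
  void put( const uint8_t b )
    {
    buffer[pos] = b;
    if( ++pos >= buffer.size() ) { pos = 0; pos_wrapped = true; }
    }
  };
```

With this layout, peek( 0 ) on an empty window returns 0 without depending on any particular buffer contents, which appears to be the point of dropping the constructor initialization.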
@@ -312,13 +311,13 @@ bool LZ_decoder::decode_member() // Returns false if error
   Bit_model bm_rep2[State::states];
   Bit_model bm_len[State::states][pos_states];
   Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
-  Bit_model bm_dis[modeled_distances-end_dis_model];
+  Bit_model bm_dis[modeled_distances-end_dis_model+1];
   Bit_model bm_align[dis_align_size];
   Len_model match_len_model;
   Len_model rep_len_model;
   unsigned rep0 = 0; // rep[0-3] latest four distances
   unsigned rep1 = 0; // used for efficient coding of
   unsigned rep2 = 0; // repeated distances
   unsigned rep3 = 0;
   State state;
@@ -341,7 +340,12 @@ bool LZ_decoder::decode_member() // Returns false if error
       int len;
       if( rdec.decode_bit( bm_rep[state()] ) != 0 ) // 2nd bit
         {
-        if( rdec.decode_bit( bm_rep0[state()] ) != 0 ) // 3rd bit
+        if( rdec.decode_bit( bm_rep0[state()] ) == 0 ) // 3rd bit
+          {
+          if( rdec.decode_bit( bm_len[state()][pos_state] ) == 0 ) // 4th bit
+            { state.set_short_rep(); put_byte( peek( rep0 ) ); continue; }
+          }
+        else
           {
           unsigned distance;
           if( rdec.decode_bit( bm_rep1[state()] ) == 0 ) // 4th bit
@@ -357,11 +361,6 @@ bool LZ_decoder::decode_member() // Returns false if error
           rep1 = rep0;
           rep0 = distance;
           }
-        else
-          {
-          if( rdec.decode_bit( bm_len[state()][pos_state] ) == 0 ) // 4th bit
-            { state.set_short_rep(); put_byte( peek( rep0 ) ); continue; }
-          }
         state.set_rep();
         len = min_match_len + rdec.decode_len( rep_len_model, pos_state );
         }
@@ -370,15 +369,14 @@ bool LZ_decoder::decode_member() // Returns false if error
         rep3 = rep2; rep2 = rep1; rep1 = rep0;
         len = min_match_len + rdec.decode_len( match_len_model, pos_state );
         const int len_state = std::min( len - min_match_len, len_states - 1 );
-        const int dis_slot =
-          rdec.decode_tree( bm_dis_slot[len_state], dis_slot_bits );
-        if( dis_slot < start_dis_model ) rep0 = dis_slot;
-        else
+        rep0 = rdec.decode_tree( bm_dis_slot[len_state], dis_slot_bits );
+        if( rep0 >= start_dis_model )
           {
+          const unsigned dis_slot = rep0;
           const int direct_bits = ( dis_slot >> 1 ) - 1;
           rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
           if( dis_slot < end_dis_model )
-            rep0 += rdec.decode_tree_reversed( bm_dis + rep0 - dis_slot - 1,
+            rep0 += rdec.decode_tree_reversed( bm_dis + ( rep0 - dis_slot ),
                                                direct_bits );
           else
             {
@@ -414,7 +412,7 @@ int main( const int argc, const char * const argv[] )
                  "It is not safe to use lzd for any real work.\n"
                  "\nUsage: %s < file.lz > file\n", argv[0] );
     std::printf( "Lzd decompresses from standard input to standard output.\n"
-                 "\nCopyright (C) 2016 Antonio Diaz Diaz.\n"
+                 "\nCopyright (C) 2017 Antonio Diaz Diaz.\n"
                  "This is free software: you are free to change and redistribute it.\n"
                  "There is NO WARRANTY, to the extent permitted by law.\n"
                  "Report bugs to lzip-bug@nongnu.org\n"
@@ -429,7 +427,7 @@ int main( const int argc, const char * const argv[] )
   for( bool first_member = true; ; first_member = false )
     {
-    File_header header;
+    File_header header; // verify header
     for( int i = 0; i < 6; ++i ) header[i] = std::getc( stdin );
     if( std::feof( stdin ) || std::memcmp( header, "LZIP\x01", 5 ) != 0 )
       {
@@ -444,11 +442,11 @@ int main( const int argc, const char * const argv[] )
       { std::fputs( "Invalid dictionary size in member header.\n", stderr );
         return 2; }
-    LZ_decoder decoder( dict_size );
+    LZ_decoder decoder( dict_size ); // decode LZMA stream
     if( !decoder.decode_member() )
       { std::fputs( "Data error\n", stderr ); return 2; }
-    File_trailer trailer;
+    File_trailer trailer; // verify trailer
     for( int i = 0; i < 20; ++i ) trailer[i] = std::getc( stdin );
     unsigned crc = 0;
     for( int i = 3; i >= 0; --i ) { crc <<= 8; crc += trailer[i]; }
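
Note on the bm_dis change: the array grows by one element and the base passed to decode_tree_reversed loses its "- 1". The sketch below tabulates which bm_dis elements each form touches. It assumes the usual lzip constants (start_dis_model = 4, end_dis_model = 14, modeled_distances = 128), which are not shown in this diff, and it relies on decode_tree indexing bm[symbol] with symbol starting at 1, as the hunks above show. `Index_range` and `bm_dis_range` are hypothetical helper names, not part of lzd.cc.

```cpp
#include <cassert>

// Assumed values of the lzip constants; they do not appear in this diff.
const int start_dis_model = 4;
const int end_dis_model = 14;
const int modeled_distances = 128;

struct Index_range { int min_base; int min_index; int max_index; };

// Range of bm_dis elements touched over all dis_slot values in
// [start_dis_model, end_dis_model). extra_offset is -1 for the old
// expression (bm_dis + rep0 - dis_slot - 1) and 0 for the new one.
Index_range bm_dis_range( const int extra_offset )
  {
  Index_range r = { modeled_distances, modeled_distances, -1 };
  for( int dis_slot = start_dis_model; dis_slot < end_dis_model; ++dis_slot )
    {
    const int direct_bits = ( dis_slot >> 1 ) - 1;
    const int rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
    const int base = rep0 - dis_slot + extra_offset;
    // decode_tree starts at symbol 1 and indexes at most 2^num_bits - 1
    const int lo = base + 1;
    const int hi = base + ( 1 << direct_bits ) - 1;
    if( base < r.min_base ) r.min_base = base;
    if( lo < r.min_index ) r.min_index = lo;
    if( hi > r.max_index ) r.max_index = hi;
    }
  return r;
  }
```

Under these assumptions the old form touches elements 0 to 113 but starts from a base of -1, a pointer one before the array. The new form starts from base 0 and touches elements 1 to 114, so it needs the extra element and leaves bm_dis[0] unused. Forming a pointer before the start of an array is undefined in C++ even if it is never dereferenced, which is presumably why the change was made.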

testsuite/check.sh

@@ -1,6 +1,6 @@
 #! /bin/sh
 # check script for Lzd - Educational decompressor for lzip files
-# Copyright (C) 2013-2016 Antonio Diaz Diaz.
+# Copyright (C) 2013-2017 Antonio Diaz Diaz.
 #
 # This script is free software: you have unlimited permission
 # to copy, distribute and modify it.
@@ -17,6 +17,13 @@ if [ ! -f "${LZIP}" ] || [ ! -x "${LZIP}" ] ; then
 	exit 1
 fi
+[ -e "${LZIP}" ] 2> /dev/null ||
+{
+	echo "$0: a POSIX shell is required to run the tests"
+	echo "Try bash -c \"$0 $1 $2\""
+	exit 1
+}
 if [ -d tmp ] ; then rm -rf tmp ; fi
 mkdir tmp
 cd "${objdir}"/tmp || framework_failure
@@ -24,24 +31,41 @@ cd "${objdir}"/tmp || framework_failure
 in="${testdir}"/test.txt
 in_lz="${testdir}"/test.txt.lz
 fail=0
+test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; }
 printf "testing lzd-%s..." "$2"
 "${LZIP}" < "${in}" 2> /dev/null
-if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
-dd if="${in_lz}" bs=1 count=6 2> /dev/null | "${LZIP}" 2> /dev/null
-if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
-dd if="${in_lz}" bs=1 count=20 2> /dev/null | "${LZIP}" > /dev/null 2>&1
-if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
+[ $? = 2 ] || test_failed $LINENO
-"${LZIP}" < "${in_lz}" > copy || fail=1
-cmp "${in}" copy || fail=1
-printf .
+"${LZIP}" < "${in_lz}" > copy || test_failed $LINENO
+cmp "${in}" copy || test_failed $LINENO
 cat "${in}" "${in}" > in2 || framework_failure
-cat "${in_lz}" "${in_lz}" | "${LZIP}" > copy2 || fail=1
-cmp in2 copy2 || fail=1
-printf .
+cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
+"${LZIP}" < in2.lz > copy2 || test_failed $LINENO
+cmp in2 copy2 || test_failed $LINENO
+printf "\ntesting bad input..."
+cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
+if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null &&
+   [ -e trunc.lz ] && cmp in2.lz trunc.lz > /dev/null 2>&1 ; then
+	# can't detect truncated header of non-first member
+	for i in 6 20 14734 14758 ; do
+		dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null
+		"${LZIP}" < trunc.lz > out 2> /dev/null
+		[ $? = 2 ] || test_failed $LINENO $i
+	done
+else
+	printf "\nwarning: skipping truncation test: 'dd' does not work on your system."
+fi
+cat "${in_lz}" > ingin.lz || framework_failure
+printf "g" >> ingin.lz || framework_failure
+cat "${in_lz}" >> ingin.lz || framework_failure
+"${LZIP}" < ingin.lz > copy || test_failed $LINENO
+cmp "${in}" copy || test_failed $LINENO
 echo
 if [ ${fail} = 0 ] ; then

Binary file not shown.