No commits in common. "9a45d2df811b4b214656c31b0dca10cc4316d1af" and "2902fcb6e554dd674f04b7dfcfe41960aaed339d" have entirely different histories.

22 changed files with 859 additions and 1070 deletions

@ -1,6 +1,7 @@
Lzd was written by Antonio Diaz Diaz. Lzd was written by Antonio Diaz Diaz.
The ideas embodied in lzd are due to (at least) the following people: The ideas embodied in lzd are due to (at least) the following people:
Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
definition of Markov chains), G.N.N. Martin (for the definition of range the definition of Markov chains), G.N.N. Martin (for the definition of
encoding), and Igor Pavlov (for putting all the above together in LZMA). range encoding), and Igor Pavlov (for putting all the above together in
LZMA).

17
COPYING

@ -1,17 +0,0 @@
Lzd - Educational decompressor for the lzip format
Copyright (C) Antonio Diaz Diaz.
This program is free software. Redistribution and use in source and
binary forms, with or without modification, are permitted provided
that the following conditions are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions, and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions, and the following disclaimer in the
documentation and/or other materials provided with the distribution.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@ -1,60 +1,3 @@
2025-01-02 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.5 released.
* lzd.cc: Reject empty members and nonzero first LZMA byte.
2024-01-02 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.4 released.
* lzd.cc: Use header_size and trailer_size instead of 6 and 20.
2022-10-24 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.3 released.
* lzd.cc (Range_decoder): Discard first LZMA byte explicitly.
2021-01-04 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.2 released.
* lzd.cc (main): Check also mismatches in member size.
Accept and ignore the option '-d' for compatibility with zutils.
Remove warning about "lzd not safe for real work".
Print license notice.
* testsuite: Add 10 new test files.
2019-01-11 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.1 released.
* Rename File_* to Lzip_*.
* lzd.cc: Compile on DOS with DJGPP.
* configure: Accept appending to CXXFLAGS; 'CXXFLAGS+=OPTIONS'.
2017-05-02 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.0 released.
* lzd.cc: Minor code improvements.
* check.sh: Require a POSIX shell.
2016-05-10 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.9 released.
* configure: Avoid warning on some shells when testing for g++.
2016-01-23 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.8 released.
* Document that lzip does not use 'literal_pos_state_bits'.
2015-07-07 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.7 released.
* Minor changes.
2014-08-25 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.6 released.
* Minor changes.
2013-09-17 Antonio Diaz Diaz <antonio@gnu.org> 2013-09-17 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.5 released. * Version 0.5 released.
@ -63,7 +6,7 @@
2013-08-01 Antonio Diaz Diaz <antonio@gnu.org> 2013-08-01 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.4 released. * Version 0.4 released.
* check.sh: Remove '/dev/full' from tests. * testsuite/check.sh: Removed '/dev/full' from tests.
2013-07-24 Antonio Diaz Diaz <antonio@gnu.org> 2013-07-24 Antonio Diaz Diaz <antonio@gnu.org>
@ -73,14 +16,15 @@
2013-05-06 Antonio Diaz Diaz <antonio@gnu.org> 2013-05-06 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.2 released. * Version 0.2 released.
* main.c: Add a missing '#include' for OS/2. * main.c: Added a missing '#include' for OS/2.
2013-03-21 Antonio Diaz Diaz <ant_diaz@teleline.es> 2013-03-21 Antonio Diaz Diaz <ant_diaz@teleline.es>
* Version 0.1 released. * Version 0.1 released.
Copyright (C) 2013-2025 Antonio Diaz Diaz. Copyright (C) 2013 Antonio Diaz Diaz.
This file is a collection of facts, and thus it is not copyrightable, but just This file is a collection of facts, and thus it is not copyrightable,
in case, you have unlimited permission to copy, distribute, and modify it. but just in case, you have unlimited permission to copy, distribute and
modify it.

41
INSTALL

@ -1,11 +1,9 @@
Requirements Requirements
------------ ------------
You will need a C++98 compiler with support for 'long long'. You will need a C++ compiler.
(gcc 3.3.6 or newer is recommended). I use gcc 4.8.1 and 3.3.6, but the code should compile with any
I use gcc 6.1.0 and 3.3.6, but the code should compile with any standards standards compliant compiler.
compliant compiler. Gcc is available at http://gcc.gnu.org.
Gcc is available at http://gcc.gnu.org
Lzip is available at http://www.nongnu.org/lzip/lzip.html
Procedure Procedure
@ -16,8 +14,8 @@ Procedure
or or
lzip -cd lzd[version].tar.lz | tar -xf - lzip -cd lzd[version].tar.lz | tar -xf -
This creates the directory ./lzd[version] containing the source code This creates the directory ./lzd[version] containing the source from
extracted from the archive. the main archive.
2. Change to lzd directory and run configure. 2. Change to lzd directory and run configure.
(Try 'configure --help' for usage instructions). (Try 'configure --help' for usage instructions).
@ -32,28 +30,31 @@ extracted from the archive.
4. Optionally, type 'make check' to run the tests that come with lzd. 4. Optionally, type 'make check' to run the tests that come with lzd.
5. Type 'make install' to install the program and any data files and 5. Type 'make install' to install the program and any data files and
documentation. You need root privileges to install into a prefix owned documentation.
by root.
You can install only the program, the info manual or the man page
typing 'make install-bin', 'make install-info' or 'make install-man'
respectively.
Another way Another way
----------- -----------
You can also compile lzd into a separate directory. You can also compile lzd into a separate directory. To do this, you
To do this, you must use a version of 'make' that supports the variable must use a version of 'make' that supports the 'VPATH' variable, such
'VPATH', such as GNU 'make'. 'cd' to the directory where you want the as GNU 'make'. 'cd' to the directory where you want the object files
object files and executables to go and run the 'configure' script. and executables to go and run the 'configure' script. 'configure'
'configure' automatically checks for the source code in '.', in '..', and automatically checks for the source code in '.', in '..' and in the
in the directory that 'configure' is in. directory that 'configure' is in.
'configure' recognizes the option '--srcdir=DIR' to control where to look 'configure' recognizes the option '--srcdir=DIR' to control where to
for the source code. Usually 'configure' can determine that directory look for the sources. Usually 'configure' can determine that directory
automatically. automatically.
After running 'configure', you can run 'make' and 'make install' as After running 'configure', you can run 'make' and 'make install' as
explained above. explained above.
Copyright (C) 2013-2025 Antonio Diaz Diaz. Copyright (C) 2013 Antonio Diaz Diaz.
This file is free documentation: you have unlimited permission to copy, This file is free documentation: you have unlimited permission to copy,
distribute, and modify it. distribute and modify it.

@ -1,47 +1,44 @@
DISTNAME = $(pkgname)-$(pkgversion) DISTNAME = $(pkgname)-$(pkgversion)
INSTALL = install INSTALL = install
INSTALL_PROGRAM = $(INSTALL) -m 755 INSTALL_PROGRAM = $(INSTALL) -p -m 755
INSTALL_DATA = $(INSTALL) -p -m 644
INSTALL_DIR = $(INSTALL) -d -m 755 INSTALL_DIR = $(INSTALL) -d -m 755
INSTALL_DATA = $(INSTALL) -m 644
SHELL = /bin/sh SHELL = /bin/sh
CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
objs = lzd.o objs = lzd.o
.PHONY : all install install-bin install-info install-man \ .PHONY : all install install-bin install-info install-man install-strip \
install-strip install-compress install-strip-compress \
install-bin-strip install-info-compress install-man-compress \
uninstall uninstall-bin uninstall-info uninstall-man \ uninstall uninstall-bin uninstall-info uninstall-man \
doc info man check dist clean distclean doc info man check dist clean distclean
all : $(progname) all : $(progname)
$(progname) : $(objs) $(progname) : $(objs)
$(CXX) $(CXXFLAGS) $(LDFLAGS) -o $@ $(objs) $(CXX) $(LDFLAGS) -o $@ $(objs)
$(progname)_profiled : $(objs)
$(CXX) $(LDFLAGS) -pg -o $@ $(objs)
%.o : %.cc %.o : %.cc
$(CXX) $(CPPFLAGS) $(CXXFLAGS) -DPROGVERSION=\"$(pkgversion)\" -c -o $@ $< $(CXX) $(CPPFLAGS) $(CXXFLAGS) -DPROGVERSION=\"$(pkgversion)\" -c -o $@ $<
# prevent 'make' from trying to remake source files
$(VPATH)/configure $(VPATH)/Makefile.in $(VPATH)/doc/$(pkgname).texi : ;
MAKEFLAGS += -r
.SUFFIXES :
$(objs) : Makefile $(objs) : Makefile
doc : doc :
info : $(VPATH)/doc/$(pkgname).info info : $(VPATH)/doc/$(pkgname).info
$(VPATH)/doc/$(pkgname).info : $(VPATH)/doc/$(pkgname).texi $(VPATH)/doc/$(pkgname).info : $(VPATH)/doc/$(pkgname).texinfo
cd $(VPATH)/doc && $(MAKEINFO) $(pkgname).texi cd $(VPATH)/doc && makeinfo $(pkgname).texinfo
man : $(VPATH)/doc/$(progname).1 man : $(VPATH)/doc/$(progname).1
$(VPATH)/doc/$(progname).1 : $(progname) $(VPATH)/doc/$(progname).1 : $(progname)
help2man -n 'educational decompressor for the lzip format' -o $@ --no-info ./$(progname) help2man -n 'educational decompressor for lzip files' \
-o $@ --no-info ./$(progname)
Makefile : $(VPATH)/configure $(VPATH)/Makefile.in Makefile : $(VPATH)/configure $(VPATH)/Makefile.in
./config.status ./config.status
@ -50,73 +47,54 @@ check : all
@$(VPATH)/testsuite/check.sh $(VPATH)/testsuite $(pkgversion) @$(VPATH)/testsuite/check.sh $(VPATH)/testsuite $(pkgversion)
install : install-bin install : install-bin
install-strip : install-bin-strip
install-compress : install-bin
install-strip-compress : install-bin-strip
install-bin : all install-bin : all
if [ ! -d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi if [ ! -d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi
$(INSTALL_PROGRAM) ./$(progname) "$(DESTDIR)$(bindir)/$(progname)" $(INSTALL_PROGRAM) ./$(progname) "$(DESTDIR)$(bindir)/$(progname)"
install-bin-strip : all
$(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install-bin
install-info : install-info :
if [ ! -d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi if [ ! -d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi
-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
$(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info" $(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info"
-if $(CAN_RUN_INSTALLINFO) ; then \ -install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info"
install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
fi
install-info-compress : install-info
lzip -v -9 "$(DESTDIR)$(infodir)/$(pkgname).info"
install-man : install-man :
if [ ! -d "$(DESTDIR)$(mandir)/man1" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1" ; fi if [ ! -d "$(DESTDIR)$(mandir)/man1" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1" ; fi
-rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"*
$(INSTALL_DATA) $(VPATH)/doc/$(progname).1 "$(DESTDIR)$(mandir)/man1/$(progname).1" $(INSTALL_DATA) $(VPATH)/doc/$(progname).1 "$(DESTDIR)$(mandir)/man1/$(progname).1"
install-man-compress : install-man install-strip : all
lzip -v -9 "$(DESTDIR)$(mandir)/man1/$(progname).1" $(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install
uninstall : uninstall-bin uninstall : uninstall-bin uninstall-info uninstall-man
uninstall-bin : uninstall-bin :
-rm -f "$(DESTDIR)$(bindir)/$(progname)" -rm -f "$(DESTDIR)$(bindir)/$(progname)"
uninstall-info : uninstall-info :
-if $(CAN_RUN_INSTALLINFO) ; then \ -install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info"
install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"
fi
-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
uninstall-man : uninstall-man :
-rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"* -rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"
dist : doc dist : doc
ln -sf $(VPATH) $(DISTNAME) ln -sf $(VPATH) $(DISTNAME)
tar -Hustar --owner=root --group=root -cvf $(DISTNAME).tar \ tar -cvf $(DISTNAME).tar \
$(DISTNAME)/AUTHORS \ $(DISTNAME)/AUTHORS \
$(DISTNAME)/COPYING \
$(DISTNAME)/ChangeLog \ $(DISTNAME)/ChangeLog \
$(DISTNAME)/INSTALL \ $(DISTNAME)/INSTALL \
$(DISTNAME)/Makefile.in \ $(DISTNAME)/Makefile.in \
$(DISTNAME)/NEWS \ $(DISTNAME)/NEWS \
$(DISTNAME)/README \ $(DISTNAME)/README \
$(DISTNAME)/configure \ $(DISTNAME)/configure \
$(DISTNAME)/*.cc \
$(DISTNAME)/testsuite/check.sh \ $(DISTNAME)/testsuite/check.sh \
$(DISTNAME)/testsuite/test.txt \ $(DISTNAME)/testsuite/test.txt \
$(DISTNAME)/testsuite/em.lz \ $(DISTNAME)/testsuite/test.txt.lz \
$(DISTNAME)/testsuite/fox.lz \ $(DISTNAME)/*.cc
$(DISTNAME)/testsuite/fox_*.lz \
$(DISTNAME)/testsuite/test.txt.lz
rm -f $(DISTNAME) rm -f $(DISTNAME)
lzip -v -9 $(DISTNAME).tar lzip -v -9 $(DISTNAME).tar
clean : clean :
-rm -f $(progname) $(objs) -rm -f $(progname) $(progname)_profiled $(objs)
distclean : clean distclean : clean
-rm -f Makefile config.status *.tar *.tar.lz -rm -f Makefile config.status *.tar *.tar.lz

8
NEWS

@ -1,7 +1,3 @@
Changes in version 1.5: Changes in version 0.5:
lzd now exits with error status 2 if any empty member is found in a Minor changes.
multimember file.
lzd now exits with error status 2 if the first byte of the LZMA stream is
not 0.

47
README

@ -1,39 +1,30 @@
See the file INSTALL for compilation and installation instructions.
Description Description
Lzd is a simplified decompressor for the lzip format with an educational Lzd is a simplified decompressor for lzip files with an educational
purpose. Studying its source code is a good first step to understand how purpose. Studying its source is a good first step to understand how lzip
lzip works. Lzd is written in C++. works. It is not safe to use lzd for any real work.
The source code of lzd is used in the lzip manual as a reference The source of lzd is used in the lzip manual as a reference decompressor
decompressor in the description of the lzip file format. Reading the lzip in the description of the lzip file format. Reading the lzip manual will
manual will help you understand the source code. Lzd is compliant with the help you understand the source.
lzip specification; it checks the 3 integrity factors.
The source code of lzd is also used as a reference in the description of the Lzd decompresses from standard input to standard output. Lzd will
media type 'application/lzip'. correctly decompress the concatenation of two or more compressed files.
See http://datatracker.ietf.org/doc/draft-diaz-lzip The result is the concatenation of the corresponding decompressed data.
Integrity of such concatenated compressed input is also verified.
Lzd decompresses from standard input to standard output. It accepts (and
ignores) the option '-d' for compatibility with other lzip tools. In
particular, accepting the option '-d' allows lzd to be used as argument to
the option '--lz' of the tools from the zutils package.
Lzd correctly decompresses the concatenation of two or more compressed
files. The result is the concatenation of the corresponding decompressed
data. Integrity of such concatenated compressed input is also checked.
The ideas embodied in lzd are due to (at least) the following people: The ideas embodied in lzd are due to (at least) the following people:
Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
definition of Markov chains), G.N.N. Martin (for the definition of range the definition of Markov chains), G.N.N. Martin (for the definition of
encoding), and Igor Pavlov (for putting all the above together in LZMA). range encoding), and Igor Pavlov (for putting all the above together in
LZMA).
Copyright (C) 2013-2025 Antonio Diaz Diaz. Copyright (C) 2013 Antonio Diaz Diaz.
This file is free documentation: you have unlimited permission to copy, This file is free documentation: you have unlimited permission to copy,
distribute, and modify it. distribute and modify it.
The file Makefile.in is a data file used by configure to produce the Makefile. The file Makefile.in is a data file used by configure to produce the
It has the same copyright owner and permissions that configure itself. Makefile. It has the same copyright owner and permissions that configure
itself.
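
Editorial sketch (not part of either commit): the README text above says that lzd checks "the 3 integrity factors". Presumably these are the three fields of the 20-byte member trailer that lzd.cc documents (CRC32 of the uncompressed data, size of the uncompressed data, member size). The helper names get_le, Trailer_fields, parse_trailer, and trailer_matches below are hypothetical and do not appear in lzd.cc. The sketch assumes the lzip convention that multibyte values are stored in little-endian order.

```cpp
#include <cassert>
#include <stdint.h>

enum { trailer_size = 20 };

// Hypothetical helper: read a little-endian value of 'size' bytes from p.
static unsigned long long get_le( const uint8_t * const p, const int size )
  {
  assert( size >= 1 && size <= 8 );
  unsigned long long value = 0;
  for( int i = size - 1; i >= 0; --i ) value = ( value << 8 ) | p[i];
  return value;
  }

// Hypothetical struct holding the three trailer fields.
struct Trailer_fields
  {
  uint32_t data_crc;                    // bytes 0-3
  unsigned long long data_size;         // bytes 4-11
  unsigned long long member_size;       // bytes 12-19
  };

// Split a raw trailer of trailer_size bytes into its three fields.
Trailer_fields parse_trailer( const uint8_t * const trailer )
  {
  Trailer_fields t;
  t.data_crc = (uint32_t)get_le( trailer, 4 );
  t.data_size = get_le( trailer + 4, 8 );
  t.member_size = get_le( trailer + 12, 8 );
  return t;
  }

// True only if all three stored values agree with the values that the
// decompressor computed while decoding the member.
bool trailer_matches( const Trailer_fields & t, const uint32_t crc,
                      const unsigned long long data_size,
                      const unsigned long long member_size )
  {
  return t.data_crc == crc && t.data_size == data_size &&
         t.member_size == member_size;
  }
```

A caller would compute the CRC and the two sizes while decoding, then compare them with the parsed trailer and exit with status 2 on any mismatch.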

73
configure vendored

@ -1,12 +1,12 @@
#! /bin/sh #! /bin/sh
# configure script for Lzd - Educational decompressor for the lzip format # configure script for Lzd - Educational decompressor for lzip files
# Copyright (C) 2013-2025 Antonio Diaz Diaz. # Copyright (C) 2013 Antonio Diaz Diaz.
# #
# This configure script is free software: you have unlimited permission # This configure script is free software: you have unlimited permission
# to copy, distribute, and modify it. # to copy, distribute and modify it.
pkgname=lzd pkgname=lzd
pkgversion=1.5 pkgversion=0.5
progname=lzd progname=lzd
srctrigger=lzd.cc srctrigger=lzd.cc
@ -24,10 +24,13 @@ CXX=g++
CPPFLAGS= CPPFLAGS=
CXXFLAGS='-Wall -W -O2' CXXFLAGS='-Wall -W -O2'
LDFLAGS= LDFLAGS=
MAKEINFO=makeinfo
# checking whether we are using GNU C++. # checking whether we are using GNU C++.
/bin/sh -c "${CXX} --version" > /dev/null 2>&1 || { CXX=c++ ; CXXFLAGS=-O2 ; } ${CXX} --version > /dev/null 2>&1
if [ $? != 0 ] ; then
CXX=c++
CXXFLAGS='-W -O2'
fi
# Loop over all args # Loop over all args
args= args=
@ -39,38 +42,32 @@ while [ $# != 0 ] ; do
shift shift
# Add the argument quoted to args # Add the argument quoted to args
if [ -z "${args}" ] ; then args="\"${option}\"" args="${args} \"${option}\""
else args="${args} \"${option}\"" ; fi
# Split out the argument for options that take them # Split out the argument for options that take them
case ${option} in case ${option} in
*=*) optarg=`echo "${option}" | sed -e 's,^[^=]*=,,;s,/$,,'` ;; *=*) optarg=`echo ${option} | sed -e 's,^[^=]*=,,;s,/$,,'` ;;
esac esac
# Process the options # Process the options
case ${option} in case ${option} in
--help | -h) --help | -h)
echo "Usage: $0 [OPTION]... [VAR=VALUE]..." echo "Usage: configure [options]"
echo echo
echo "To assign makefile variables (e.g., CXX, CXXFLAGS...), specify them as" echo "Options: [defaults in brackets]"
echo "arguments to configure in the form VAR=VALUE."
echo
echo "Options and variables: [defaults in brackets]"
echo " -h, --help display this help and exit" echo " -h, --help display this help and exit"
echo " -V, --version output version information and exit" echo " -V, --version output version information and exit"
echo " --srcdir=DIR find the source code in DIR [. or ..]" echo " --srcdir=DIR find the sources in DIR [. or ..]"
echo " --prefix=DIR install into DIR [${prefix}]" echo " --prefix=DIR install into DIR [${prefix}]"
echo " --exec-prefix=DIR base directory for arch-dependent files [${exec_prefix}]" echo " --exec-prefix=DIR base directory for arch-dependent files [${exec_prefix}]"
echo " --bindir=DIR user executables directory [${bindir}]" echo " --bindir=DIR user executables directory [${bindir}]"
echo " --datarootdir=DIR base directory for doc and data [${datarootdir}]" echo " --datarootdir=DIR base directory for doc and data [${datarootdir}]"
echo " --infodir=DIR info files directory [${infodir}]" echo " --infodir=DIR info files directory [${infodir}]"
echo " --mandir=DIR man pages directory [${mandir}]" echo " --mandir=DIR man pages directory [${mandir}]"
echo " CXX=COMPILER C++ compiler to use [${CXX}]" echo " CXX=COMPILER C++ compiler to use [g++]"
echo " CPPFLAGS=OPTIONS command-line options for the preprocessor [${CPPFLAGS}]" echo " CPPFLAGS=OPTIONS command line options for the preprocessor [${CPPFLAGS}]"
echo " CXXFLAGS=OPTIONS command-line options for the C++ compiler [${CXXFLAGS}]" echo " CXXFLAGS=OPTIONS command line options for the C++ compiler [${CXXFLAGS}]"
echo " CXXFLAGS+=OPTIONS append options to the current value of CXXFLAGS" echo " LDFLAGS=OPTIONS command line options for the linker [${LDFLAGS}]"
echo " LDFLAGS=OPTIONS command-line options for the linker [${LDFLAGS}]"
echo " MAKEINFO=NAME makeinfo program to use [${MAKEINFO}]"
echo echo
exit 0 ;; exit 0 ;;
--version | -V) --version | -V)
@ -93,12 +90,10 @@ while [ $# != 0 ] ; do
--mandir=*) mandir=${optarg} ;; --mandir=*) mandir=${optarg} ;;
--no-create) no_create=yes ;; --no-create) no_create=yes ;;
CXX=*) CXX=${optarg} ;; CXX=*) CXX=${optarg} ;;
CPPFLAGS=*) CPPFLAGS=${optarg} ;; CPPFLAGS=*) CPPFLAGS=${optarg} ;;
CXXFLAGS=*) CXXFLAGS=${optarg} ;; CXXFLAGS=*) CXXFLAGS=${optarg} ;;
CXXFLAGS+=*) CXXFLAGS="${CXXFLAGS} ${optarg}" ;; LDFLAGS=*) LDFLAGS=${optarg} ;;
LDFLAGS=*) LDFLAGS=${optarg} ;;
MAKEINFO=*) MAKEINFO=${optarg} ;;
--*) --*)
echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;; echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;;
@ -109,7 +104,7 @@ while [ $# != 0 ] ; do
exit 1 ;; exit 1 ;;
esac esac
# Check whether the option took a separate argument # Check if the option took a separate argument
if [ "${arg2}" = yes ] ; then if [ "${arg2}" = yes ] ; then
if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
else echo "configure: Missing argument to '${option}'" 1>&2 else echo "configure: Missing argument to '${option}'" 1>&2
@ -118,19 +113,19 @@ while [ $# != 0 ] ; do
fi fi
done done
# Find the source code, if location was not specified. # Find the source files, if location was not specified.
srcdirtext= srcdirtext=
if [ -z "${srcdir}" ] ; then if [ -z "${srcdir}" ] ; then
srcdirtext="or . or .." ; srcdir=. srcdirtext="or . or .." ; srcdir=.
if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi
if [ ! -r "${srcdir}/${srctrigger}" ] ; then if [ ! -r "${srcdir}/${srctrigger}" ] ; then
## the sed command below emulates the dirname command ## the sed command below emulates the dirname command
srcdir=`echo "$0" | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'` srcdir=`echo $0 | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
fi fi
fi fi
if [ ! -r "${srcdir}/${srctrigger}" ] ; then if [ ! -r "${srcdir}/${srctrigger}" ] ; then
echo "configure: Can't find source code in ${srcdir} ${srcdirtext}" 1>&2 echo "configure: Can't find sources in ${srcdir} ${srcdirtext}" 1>&2
echo "configure: (At least ${srctrigger} is missing)." 1>&2 echo "configure: (At least ${srctrigger} is missing)." 1>&2
exit 1 exit 1
fi fi
@ -144,13 +139,13 @@ if [ -z "${no_create}" ] ; then
rm -f config.status rm -f config.status
cat > config.status << EOF cat > config.status << EOF
#! /bin/sh #! /bin/sh
# This file was generated automatically by configure. Don't edit. # This file was generated automatically by configure. Do not edit.
# Run this file to recreate the current configuration. # Run this file to recreate the current configuration.
# #
# This script is free software: you have unlimited permission # This script is free software: you have unlimited permission
# to copy, distribute, and modify it. # to copy, distribute and modify it.
exec /bin/sh "$0" ${args} --no-create exec /bin/sh $0 ${args} --no-create
EOF EOF
chmod +x config.status chmod +x config.status
fi fi
@ -167,15 +162,14 @@ echo "CXX = ${CXX}"
echo "CPPFLAGS = ${CPPFLAGS}" echo "CPPFLAGS = ${CPPFLAGS}"
echo "CXXFLAGS = ${CXXFLAGS}" echo "CXXFLAGS = ${CXXFLAGS}"
echo "LDFLAGS = ${LDFLAGS}" echo "LDFLAGS = ${LDFLAGS}"
echo "MAKEINFO = ${MAKEINFO}"
rm -f Makefile rm -f Makefile
cat > Makefile << EOF cat > Makefile << EOF
# Makefile for Lzd - Educational decompressor for the lzip format # Makefile for Lzd - Educational decompressor for lzip files
# Copyright (C) 2013-2025 Antonio Diaz Diaz. # Copyright (C) 2013 Antonio Diaz Diaz.
# This file was generated automatically by configure. Don't edit. # This file was generated automatically by configure. Do not edit.
# #
# This Makefile is free software: you have unlimited permission # This Makefile is free software: you have unlimited permission
# to copy, distribute, and modify it. # to copy, distribute and modify it.
pkgname = ${pkgname} pkgname = ${pkgname}
pkgversion = ${pkgversion} pkgversion = ${pkgversion}
@ -191,7 +185,6 @@ CXX = ${CXX}
CPPFLAGS = ${CPPFLAGS} CPPFLAGS = ${CPPFLAGS}
CXXFLAGS = ${CXXFLAGS} CXXFLAGS = ${CXXFLAGS}
LDFLAGS = ${LDFLAGS} LDFLAGS = ${LDFLAGS}
MAKEINFO = ${MAKEINFO}
EOF EOF
cat "${srcdir}/Makefile.in" >> Makefile cat "${srcdir}/Makefile.in" >> Makefile

330
lzd.cc

@ -1,25 +1,17 @@
/* Lzd - Educational decompressor for the lzip format /* Lzd - Educational decompressor for lzip files
Copyright (C) 2013-2025 Antonio Diaz Diaz. Copyright (C) 2013 Antonio Diaz Diaz.
This program is free software. Redistribution and use in source and This program is free software: you have unlimited permission
binary forms, with or without modification, are permitted provided to copy, distribute and modify it.
that the following conditions are met:
1. Redistributions of source code must retain the above copyright This program is distributed in the hope that it will be useful,
notice, this list of conditions, and the following disclaimer. but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions, and the following disclaimer in the
documentation and/or other materials provided with the distribution.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*/ */
/* /*
Exit status: 0 for a normal exit, 1 for environmental problems Exit status: 0 for a normal exit, 1 for environmental problems
(file not found, invalid command-line options, I/O errors, etc), 2 to (file not found, invalid flags, I/O errors, etc), 2 to indicate a
indicate a corrupt or invalid input file. corrupt or invalid input file.
*/ */
#include <algorithm> #include <algorithm>
@ -29,7 +21,7 @@
#include <cstring> #include <cstring>
#include <stdint.h> #include <stdint.h>
#include <unistd.h> #include <unistd.h>
#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__ #if defined(__MSVCRT__) || defined(__OS2__)
#include <fcntl.h> #include <fcntl.h>
#include <io.h> #include <io.h>
#endif #endif
@ -47,7 +39,7 @@ public:
void set_char() void set_char()
{ {
const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 }; static const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
st = next[st]; st = next[st];
} }
void set_match() { st = ( st < 7 ) ? 7 : 10; } void set_match() { st = ( st < 7 ) ? 7 : 10; }
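
Editorial sketch (not part of either commit): the hunk above shows the transition table used by set_char() and the rule used by set_match(). The standalone class below reproduces only those two transitions so they can be traced by hand. The class name State_sketch and the accessor value() are hypothetical, and the initial state 0 is an assumption.

```cpp
#include <cassert>

class State_sketch                      // hypothetical name
  {
  int st;

public:
  enum { states = 12 };
  State_sketch() : st( 0 ) {}           // assumed initial state
  int value() const { return st; }      // hypothetical accessor

  // After a literal, move toward the "only literals seen" states (0-3).
  void set_char()
    {
    static const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
    assert( st >= 0 && st < states );
    st = next[st];
    }

  // After a match: 7 if the previous state was below 7, else 10.
  void set_match() { st = ( st < 7 ) ? 7 : 10; }
  };
```

Starting from state 0, a match followed by three literals visits the states 7, 4, 1, 0.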
@ -60,7 +52,6 @@ enum {
min_dictionary_size = 1 << 12, min_dictionary_size = 1 << 12,
max_dictionary_size = 1 << 29, max_dictionary_size = 1 << 29,
literal_context_bits = 3, literal_context_bits = 3,
literal_pos_state_bits = 0, // not used
pos_state_bits = 2, pos_state_bits = 2,
pos_states = 1 << pos_state_bits, pos_states = 1 << pos_state_bits,
pos_state_mask = pos_states - 1, pos_state_mask = pos_states - 1,
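
Editorial sketch (not part of either commit): the limits min_dictionary_size and max_dictionary_size above bound the dictionary size that a member header may request. The decoding rule below is taken from the lzip format description, not from the lines shown in this comparison: bits 4-0 of header byte 5 hold the base-2 logarithm of a base size, and bits 7-5 hold the number of sixteenths of that base size to subtract. The function name decode_dictionary_size and the convention of returning 0 for an invalid size are hypothetical.

```cpp
#include <cassert>
#include <stdint.h>

enum {
  min_dictionary_size = 1 << 12,
  max_dictionary_size = 1 << 29 };

// Hypothetical helper: decode header byte 5 into a dictionary size.
// Returns 0 if the result is outside the allowed range.
unsigned decode_dictionary_size( const uint8_t coded )
  {
  unsigned size = 1U << ( coded & 0x1F );               // base size
  size -= ( size / 16 ) * ( ( coded >> 5 ) & 7 );       // subtract wedges
  if( size < (unsigned)min_dictionary_size ||
      size > (unsigned)max_dictionary_size ) return 0;
  return size;
  }
```

For example, the coded byte 0xD3 has a base size of 1 << 19 and 6 wedges, so it decodes to 524288 - 6 * 32768 = 327680 bytes.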
@ -69,7 +60,7 @@ enum {
dis_slot_bits = 6, dis_slot_bits = 6,
start_dis_model = 4, start_dis_model = 4,
end_dis_model = 14, end_dis_model = 14,
modeled_distances = 1 << ( end_dis_model / 2 ), // 128 modeled_distances = 1 << (end_dis_model / 2), // 128
dis_align_bits = 4, dis_align_bits = 4,
dis_align_size = 1 << dis_align_bits, dis_align_size = 1 << dis_align_bits,
@ -130,82 +121,74 @@ public:
const CRC32 crc32; const CRC32 crc32;
enum { header_size = 6, trailer_size = 20 }; typedef uint8_t File_header[6]; // 0-3 magic, 4 version, 5 coded_dict_size
typedef uint8_t Lzip_header[header_size]; // 0-3 magic bytes
// 4 version typedef uint8_t File_trailer[20];
// 5 coded dictionary size
typedef uint8_t Lzip_trailer[trailer_size];
// 0-3 CRC32 of the uncompressed data // 0-3 CRC32 of the uncompressed data
// 4-11 size of the uncompressed data // 4-11 size of the uncompressed data
// 12-19 member size including header and trailer // 12-19 member size including header and trailer
class Range_decoder class Range_decoder
{ {
unsigned long long member_pos;
uint32_t code; uint32_t code;
uint32_t range; uint32_t range;
public: public:
Range_decoder() Range_decoder() : code( 0 ), range( 0xFFFFFFFFU )
: member_pos( header_size ), code( 0 ), range( 0xFFFFFFFFU )
{ {
if( get_byte() != 0 ) // check first LZMA byte for( int i = 0; i < 5; ++i ) code = (code << 8) | get_byte();
{ std::fputs( "Nonzero first LZMA byte.\n", stderr ); std::exit( 2 ); }
for( int i = 0; i < 4; ++i ) code = ( code << 8 ) | get_byte();
} }
uint8_t get_byte() { ++member_pos; return std::getc( stdin ); } uint8_t get_byte() { return std::getc( stdin ); }
unsigned long long member_position() const { return member_pos; }
unsigned decode( const int num_bits ) int decode( const int num_bits )
{ {
unsigned symbol = 0; int symbol = 0;
for( int i = num_bits; i > 0; --i ) for( int i = 0; i < num_bits; ++i )
{ {
range >>= 1; range >>= 1;
symbol <<= 1; symbol <<= 1;
if( code >= range ) { code -= range; symbol |= 1; } if( code >= range ) { code -= range; symbol |= 1; }
if( range <= 0x00FFFFFFU ) // normalize if( range <= 0x00FFFFFFU ) // normalize
{ range <<= 8; code = ( code << 8 ) | get_byte(); } { range <<= 8; code = (code << 8) | get_byte(); }
} }
return symbol; return symbol;
} }
bool decode_bit( Bit_model & bm ) int decode_bit( Bit_model & bm )
{ {
bool symbol; int symbol;
const uint32_t bound = ( range >> bit_model_total_bits ) * bm.probability; const uint32_t bound = ( range >> bit_model_total_bits ) * bm.probability;
if( code < bound ) if( code < bound )
{ {
range = bound; range = bound;
bm.probability += bm.probability += (bit_model_total - bm.probability) >> bit_model_move_bits;
( bit_model_total - bm.probability ) >> bit_model_move_bits;
symbol = 0; symbol = 0;
} }
else else
{ {
code -= bound;
range -= bound; range -= bound;
code -= bound;
bm.probability -= bm.probability >> bit_model_move_bits; bm.probability -= bm.probability >> bit_model_move_bits;
symbol = 1; symbol = 1;
} }
if( range <= 0x00FFFFFFU ) // normalize if( range <= 0x00FFFFFFU ) // normalize
{ range <<= 8; code = ( code << 8 ) | get_byte(); } { range <<= 8; code = (code << 8) | get_byte(); }
return symbol; return symbol;
} }
unsigned decode_tree( Bit_model bm[], const int num_bits ) int decode_tree( Bit_model bm[], const int num_bits )
{ {
unsigned symbol = 1; int symbol = 1;
for( int i = 0; i < num_bits; ++i ) for( int i = 0; i < num_bits; ++i )
symbol = ( symbol << 1 ) | decode_bit( bm[symbol] ); symbol = ( symbol << 1 ) | decode_bit( bm[symbol] );
return symbol - ( 1 << num_bits ); return symbol - (1 << num_bits);
} }
unsigned decode_tree_reversed( Bit_model bm[], const int num_bits ) int decode_tree_reversed( Bit_model bm[], const int num_bits )
{ {
unsigned symbol = decode_tree( bm, num_bits ); int symbol = decode_tree( bm, num_bits );
unsigned reversed_symbol = 0; int reversed_symbol = 0;
for( int i = 0; i < num_bits; ++i ) for( int i = 0; i < num_bits; ++i )
{ {
reversed_symbol = ( reversed_symbol << 1 ) | ( symbol & 1 ); reversed_symbol = ( reversed_symbol << 1 ) | ( symbol & 1 );
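
Editorial sketch (not part of either commit): the Range_decoder hunk above reads from stdin, which makes the direct-bit decode() loop awkward to trace. The class below runs the same loop over an in-memory byte vector. It follows the side of the diff that checks the first LZMA byte separately and then loads 4 bytes into code, but it records the result of that check instead of exiting. The class name Buffer_range_decoder, the method first_byte_is_zero(), and the choice to return 0 once the buffer is exhausted are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

class Buffer_range_decoder              // hypothetical name
  {
  std::vector< uint8_t > data;          // private copy of the input
  std::size_t pos;
  uint32_t code;
  uint32_t range;
  bool first_byte_ok;

  // Assumption: bytes past the end of the buffer read as 0.
  uint8_t get_byte() { return ( pos < data.size() ) ? data[pos++] : 0; }

public:
  explicit Buffer_range_decoder( const std::vector< uint8_t > & bytes )
    : data( bytes ), pos( 0 ), code( 0 ), range( 0xFFFFFFFFU ),
      first_byte_ok( false )
    {
    first_byte_ok = ( get_byte() == 0 );        // first LZMA byte
    for( int i = 0; i < 4; ++i ) code = ( code << 8 ) | get_byte();
    }

  bool first_byte_is_zero() const { return first_byte_ok; }

  // Decode num_bits equiprobable bits, most significant bit first.
  unsigned decode( const int num_bits )
    {
    assert( num_bits >= 0 && num_bits <= 32 );
    unsigned symbol = 0;
    for( int i = num_bits; i > 0; --i )
      {
      range >>= 1;
      symbol <<= 1;
      if( code >= range ) { code -= range; symbol |= 1; }
      if( range <= 0x00FFFFFFU )                // normalize
        { range <<= 8; code = ( code << 8 ) | get_byte(); }
      }
    return symbol;
    }
  };
```

With the input bytes 00 40 00 00 00, code starts at 0x40000000 and decode( 3 ) yields the bits 0, 1, 0, that is, the value 2.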
@ -214,13 +197,14 @@ public:
return reversed_symbol; return reversed_symbol;
} }
unsigned decode_matched( Bit_model bm[], const unsigned match_byte ) int decode_matched( Bit_model bm[], const int match_byte )
{ {
unsigned symbol = 1; Bit_model * const bm1 = bm + 0x100;
int symbol = 1;
for( int i = 7; i >= 0; --i ) for( int i = 7; i >= 0; --i )
{ {
const bool match_bit = ( match_byte >> i ) & 1; const int match_bit = ( match_byte >> i ) & 1;
const bool bit = decode_bit( bm[symbol+(match_bit<<8)+0x100] ); const int bit = decode_bit( bm1[(match_bit<<8)+symbol] );
symbol = ( symbol << 1 ) | bit; symbol = ( symbol << 1 ) | bit;
if( match_bit != bit ) if( match_bit != bit )
{ {
@ -229,18 +213,17 @@ public:
break; break;
} }
} }
return symbol & 0xFF; return symbol - 0x100;
} }
unsigned decode_len( Len_model & lm, const int pos_state ) int decode_len( Len_model & lm, const int pos_state )
{ {
if( decode_bit( lm.choice1 ) == 0 ) if( decode_bit( lm.choice1 ) == 0 )
return min_match_len + return decode_tree( lm.bm_low[pos_state], len_low_bits );
decode_tree( lm.bm_low[pos_state], len_low_bits );
if( decode_bit( lm.choice2 ) == 0 ) if( decode_bit( lm.choice2 ) == 0 )
return min_match_len + len_low_symbols + return len_low_symbols +
decode_tree( lm.bm_mid[pos_state], len_mid_bits ); decode_tree( lm.bm_mid[pos_state], len_mid_bits );
return min_match_len + len_low_symbols + len_mid_symbols + return len_low_symbols + len_mid_symbols +
decode_tree( lm.bm_high, len_high_bits ); decode_tree( lm.bm_high, len_high_bits );
} }
}; };
@ -255,15 +238,14 @@ class LZ_decoder
unsigned pos; // current pos in buffer unsigned pos; // current pos in buffer
unsigned stream_pos; // first byte not yet written to stdout unsigned stream_pos; // first byte not yet written to stdout
uint32_t crc_; uint32_t crc_;
bool pos_wrapped;
void flush_data(); void flush_data();
uint8_t peek( const unsigned distance ) const uint8_t get_byte( const unsigned distance ) const
{ {
if( pos > distance ) return buffer[pos - distance - 1]; unsigned i = pos - distance - 1;
if( pos_wrapped ) return buffer[dictionary_size + pos - distance - 1]; if( pos <= distance ) i += dictionary_size;
return 0; // prev_byte of first byte return buffer[i];
} }
void put_byte( const uint8_t b ) void put_byte( const uint8_t b )
@ -273,25 +255,20 @@ class LZ_decoder
} }
public: public:
explicit LZ_decoder( const unsigned dict_size ) LZ_decoder( const unsigned dict_size )
: :
partial_data_pos( 0 ), partial_data_pos( 0 ),
dictionary_size( dict_size ), dictionary_size( dict_size ),
buffer( new uint8_t[dictionary_size] ), buffer( new uint8_t[dictionary_size] ),
pos( 0 ), pos( 0 ),
stream_pos( 0 ), stream_pos( 0 ),
crc_( 0xFFFFFFFFU ), crc_( 0xFFFFFFFFU )
pos_wrapped( false ) { buffer[dictionary_size-1] = 0; } // prev_byte of first_byte
{}
~LZ_decoder() { delete[] buffer; } ~LZ_decoder() { delete[] buffer; }
unsigned crc() const { return crc_ ^ 0xFFFFFFFFU; } unsigned crc() const { return crc_ ^ 0xFFFFFFFFU; }
unsigned long long data_position() const unsigned long long data_position() const { return partial_data_pos + pos; }
{ return partial_data_pos + pos; }
uint8_t get_byte() { return rdec.get_byte(); }
unsigned long long member_position() const
{ return rdec.member_position(); }
bool decode_member(); bool decode_member();
}; };
@ -303,17 +280,17 @@ void LZ_decoder::flush_data()
{ {
const unsigned size = pos - stream_pos; const unsigned size = pos - stream_pos;
crc32.update_buf( crc_, buffer + stream_pos, size ); crc32.update_buf( crc_, buffer + stream_pos, size );
errno = 0;
if( std::fwrite( buffer + stream_pos, 1, size, stdout ) != size ) if( std::fwrite( buffer + stream_pos, 1, size, stdout ) != size )
{ std::fprintf( stderr, "Write error: %s\n", std::strerror( errno ) ); { std::fprintf( stderr, "Write error: %s\n", std::strerror( errno ) );
std::exit( 1 ); } std::exit( 1 ); }
if( pos >= dictionary_size ) if( pos >= dictionary_size ) { partial_data_pos += pos; pos = 0; }
{ partial_data_pos += pos; pos = 0; pos_wrapped = true; }
stream_pos = pos; stream_pos = pos;
} }
} }
bool LZ_decoder::decode_member() // Return false if error bool LZ_decoder::decode_member() // Returns false if error
{ {
Bit_model bm_literal[1<<literal_context_bits][0x300]; Bit_model bm_literal[1<<literal_context_bits][0x300];
Bit_model bm_match[State::states][pos_states]; Bit_model bm_match[State::states][pos_states];
@ -323,13 +300,13 @@ bool LZ_decoder::decode_member() // Return false if error
Bit_model bm_rep2[State::states]; Bit_model bm_rep2[State::states];
Bit_model bm_len[State::states][pos_states]; Bit_model bm_len[State::states][pos_states];
Bit_model bm_dis_slot[len_states][1<<dis_slot_bits]; Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
Bit_model bm_dis[modeled_distances-end_dis_model+1]; Bit_model bm_dis[modeled_distances-end_dis_model];
Bit_model bm_align[dis_align_size]; Bit_model bm_align[dis_align_size];
Len_model match_len_model; Len_model match_len_model;
Len_model rep_len_model; Len_model rep_len_model;
unsigned rep0 = 0; // rep[0-3] latest four distances unsigned rep0 = 0; // rep[0-3] latest four distances
unsigned rep1 = 0; // used for efficient coding of unsigned rep1 = 0; // used for efficient coding of
unsigned rep2 = 0; // repeated distances unsigned rep2 = 0; // repeated distances
unsigned rep3 = 0; unsigned rep3 = 0;
State state; State state;
@ -338,155 +315,146 @@ bool LZ_decoder::decode_member() // Return false if error
const int pos_state = data_position() & pos_state_mask; const int pos_state = data_position() & pos_state_mask;
if( rdec.decode_bit( bm_match[state()][pos_state] ) == 0 ) // 1st bit if( rdec.decode_bit( bm_match[state()][pos_state] ) == 0 ) // 1st bit
{ {
// literal byte const uint8_t prev_byte = get_byte( 0 );
const uint8_t prev_byte = peek( 0 );
const int literal_state = prev_byte >> ( 8 - literal_context_bits ); const int literal_state = prev_byte >> ( 8 - literal_context_bits );
Bit_model * const bm = bm_literal[literal_state]; Bit_model * const bm = bm_literal[literal_state];
if( state.is_char() ) if( state.is_char() )
put_byte( rdec.decode_tree( bm, 8 ) ); put_byte( rdec.decode_tree( bm, 8 ) );
else else
put_byte( rdec.decode_matched( bm, peek( rep0 ) ) ); put_byte( rdec.decode_matched( bm, get_byte( rep0 ) ) );
state.set_char(); state.set_char();
continue;
} }
// match or repeated match else
int len;
if( rdec.decode_bit( bm_rep[state()] ) != 0 ) // 2nd bit
{ {
if( rdec.decode_bit( bm_rep0[state()] ) == 0 ) // 3rd bit int len;
if( rdec.decode_bit( bm_rep[state()] ) != 0 ) // 2nd bit
{ {
if( rdec.decode_bit( bm_len[state()][pos_state] ) == 0 ) // 4th bit if( rdec.decode_bit( bm_rep0[state()] ) != 0 ) // 3rd bit
{ state.set_short_rep(); put_byte( peek( rep0 ) ); continue; } {
unsigned distance;
if( rdec.decode_bit( bm_rep1[state()] ) == 0 ) // 4th bit
distance = rep1;
else
{
if( rdec.decode_bit( bm_rep2[state()] ) == 0 ) // 5th bit
distance = rep2;
else
{ distance = rep3; rep3 = rep2; }
rep2 = rep1;
}
rep1 = rep0;
rep0 = distance;
}
else
{
if( rdec.decode_bit( bm_len[state()][pos_state] ) == 0 ) // 4th bit
{ state.set_short_rep(); put_byte( get_byte( rep0 ) ); continue; }
}
state.set_rep();
len = min_match_len + rdec.decode_len( rep_len_model, pos_state );
} }
else else
{ {
unsigned distance; rep3 = rep2; rep2 = rep1; rep1 = rep0;
if( rdec.decode_bit( bm_rep1[state()] ) == 0 ) // 4th bit len = min_match_len + rdec.decode_len( match_len_model, pos_state );
distance = rep1; const int len_state = std::min( len - min_match_len, len_states - 1 );
const int dis_slot =
rdec.decode_tree( bm_dis_slot[len_state], dis_slot_bits );
if( dis_slot < start_dis_model ) rep0 = dis_slot;
else else
{ {
if( rdec.decode_bit( bm_rep2[state()] ) == 0 ) // 5th bit const int direct_bits = ( dis_slot >> 1 ) - 1;
distance = rep2; rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
if( dis_slot < end_dis_model )
rep0 += rdec.decode_tree_reversed( bm_dis + rep0 - dis_slot - 1,
direct_bits );
else else
{ distance = rep3; rep3 = rep2; }
rep2 = rep1;
}
rep1 = rep0;
rep0 = distance;
}
state.set_rep();
len = rdec.decode_len( rep_len_model, pos_state );
}
else // match
{
rep3 = rep2; rep2 = rep1; rep1 = rep0;
len = rdec.decode_len( match_len_model, pos_state );
const int len_state = std::min( len - min_match_len, len_states - 1 );
rep0 = rdec.decode_tree( bm_dis_slot[len_state], dis_slot_bits );
if( rep0 >= start_dis_model )
{
const unsigned dis_slot = rep0;
const int direct_bits = ( dis_slot >> 1 ) - 1;
rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
if( dis_slot < end_dis_model )
rep0 += rdec.decode_tree_reversed( bm_dis + ( rep0 - dis_slot ),
direct_bits );
else
{
rep0 += rdec.decode( direct_bits-dis_align_bits ) << dis_align_bits;
rep0 += rdec.decode_tree_reversed( bm_align, dis_align_bits );
if( rep0 == 0xFFFFFFFFU ) // marker found
{ {
flush_data(); rep0 += rdec.decode( direct_bits - dis_align_bits ) << dis_align_bits;
return len == min_match_len; // End Of Stream marker rep0 += rdec.decode_tree_reversed( bm_align, dis_align_bits );
if( rep0 == 0xFFFFFFFFU ) // Marker found
{
flush_data();
return ( len == min_match_len ); // End Of Stream marker
}
} }
} }
state.set_match();
if( rep0 >= dictionary_size || ( rep0 >= pos && !partial_data_pos ) )
return false;
} }
state.set_match(); for( int i = 0; i < len; ++i )
if( rep0 >= dictionary_size || ( rep0 >= pos && !pos_wrapped ) ) put_byte( get_byte( rep0 ) );
{ flush_data(); return false; }
} }
for( int i = 0; i < len; ++i ) put_byte( peek( rep0 ) );
} }
flush_data();
return false; return false;
} }
int main( const int argc, const char * const argv[] ) int main( const int argc, const char * const argv[] )
{ {
if( argc > 2 || ( argc == 2 && std::strcmp( argv[1], "-d" ) != 0 ) ) if( argc > 1 )
{ {
std::printf( std::printf( "Lzd %s - Educational decompressor for lzip files.\n",
"Lzd %s - Educational decompressor for the lzip format.\n" PROGVERSION );
"Study the source code to learn how a lzip decompressor works.\n" std::printf( "Study the source to learn how a lzip decompressor works.\n"
"See the lzip manual for an explanation of the code.\n" "See the lzip manual for an explanation of the code.\n"
"\nUsage: %s [-d] < file.lz > file\n" "It is not safe to use lzd for any real work.\n"
"Lzd decompresses from standard input to standard output.\n" "\nUsage: %s < file.lz > file\n", argv[0] );
"\nCopyright (C) 2025 Antonio Diaz Diaz.\n" std::printf( "Lzd decompresses from standard input to standard output.\n"
"License 2-clause BSD.\n" "\nCopyright (C) 2013 Antonio Diaz Diaz.\n"
"This is free software: you are free to change and redistribute " "This is free software: you are free to change and redistribute it.\n"
"it.\nThere is NO WARRANTY, to the extent permitted by law.\n" "There is NO WARRANTY, to the extent permitted by law.\n"
"Report bugs to lzip-bug@nongnu.org\n" "Report bugs to lzip-bug@nongnu.org\n"
"Lzd home page: http://www.nongnu.org/lzip/lzd.html\n", "Lzd home page: http://www.nongnu.org/lzip/lzd.html\n" );
PROGVERSION, argv[0] );
return 0; return 0;
} }
#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__ #if defined(__MSVCRT__) || defined(__OS2__)
setmode( STDIN_FILENO, O_BINARY ); setmode( STDIN_FILENO, O_BINARY );
setmode( STDOUT_FILENO, O_BINARY ); setmode( STDOUT_FILENO, O_BINARY );
#endif #endif
bool empty = false, multi = false;
for( bool first_member = true; ; first_member = false ) for( bool first_member = true; ; first_member = false )
{ {
Lzip_header header; // check header File_header header;
for( int i = 0; i < header_size; ++i ) header[i] = std::getc( stdin ); for( int i = 0; i < 6; ++i )
if( std::feof( stdin ) || std::memcmp( header, "LZIP\x01", 5 ) != 0 ) header[i] = std::getc( stdin );
if( std::feof( stdin ) || std::memcmp( header, "LZIP", 4 ) != 0 )
{ {
if( first_member ) if( first_member )
{ std::fputs( "Bad magic number (file not in lzip format).\n", { std::fprintf( stderr, "Bad magic number (file not in lzip format)\n" );
stderr ); return 2; } return 2; }
break; // ignore trailing data break;
}
if( header[4] != 1 )
{
std::fprintf( stderr, "Version %d member format not supported.\n",
header[4] );
return 2;
} }
unsigned dict_size = 1 << ( header[5] & 0x1F ); unsigned dict_size = 1 << ( header[5] & 0x1F );
dict_size -= ( dict_size / 16 ) * ( ( header[5] >> 5 ) & 7 ); dict_size -= ( dict_size / 16 ) * ( ( header[5] >> 5 ) & 7 );
if( dict_size < min_dictionary_size || dict_size > max_dictionary_size ) if( dict_size < min_dictionary_size || dict_size > max_dictionary_size )
{ std::fputs( "Invalid dictionary size in member header.\n", { std::fprintf( stderr, "Invalid dictionary size in member header\n" );
stderr ); return 2; } return 2; }
LZ_decoder decoder( dict_size ); // decode LZMA stream LZ_decoder decoder( dict_size );
if( !decoder.decode_member() ) if( !decoder.decode_member() )
{ std::fputs( "Data error.\n", stderr ); return 2; } { std::fprintf( stderr, "Data error\n" ); return 2; }
Lzip_trailer trailer; // check trailer File_trailer trailer;
for( int i = 0; i < trailer_size; ++i ) trailer[i] = decoder.get_byte(); for( int i = 0; i < 20; ++i ) trailer[i] = std::getc( stdin );
int retval = 0;
unsigned crc = 0; unsigned crc = 0;
for( int i = 3; i >= 0; --i ) crc = ( crc << 8 ) + trailer[i]; for( int i = 3; i >= 0; --i ) { crc <<= 8; crc += trailer[i]; }
if( crc != decoder.crc() )
{ std::fputs( "CRC mismatch.\n", stderr ); retval = 2; }
unsigned long long data_size = 0; unsigned long long data_size = 0;
for( int i = 11; i >= 4; --i ) for( int i = 11; i >= 4; --i ) { data_size <<= 8; data_size += trailer[i]; }
data_size = ( data_size << 8 ) + trailer[i]; if( crc != decoder.crc() || data_size != decoder.data_position() )
if( data_size != decoder.data_position() ) { std::fprintf( stderr, "CRC error\n" ); return 2; }
{ std::fputs( "Data size mismatch.\n", stderr ); retval = 2; }
multi = !first_member; if( data_size == 0 ) empty = true;
unsigned long long member_size = 0;
for( int i = 19; i >= 12; --i )
member_size = ( member_size << 8 ) + trailer[i];
if( member_size != decoder.member_position() )
{ std::fputs( "Member size mismatch.\n", stderr ); retval = 2; }
if( retval ) return retval;
} }
if( std::fclose( stdout ) != 0 ) if( std::fclose( stdout ) != 0 )
{ std::fprintf( stderr, "Error closing stdout: %s\n", { std::fprintf( stderr, "Can't close stdout: %s\n", std::strerror( errno ) );
std::strerror( errno ) ); return 1; } return 1; }
if( empty && multi )
{ std::fputs( "Empty member not allowed.\n", stderr ); return 2; }
return 0; return 0;
} }
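
Note on the header and trailer parsing in main() above: the coded dictionary size in header byte 5 and the little-endian trailer fields (CRC at bytes 0-3, data size at bytes 4-11, member size at bytes 12-19) can be checked in isolation. The following is a minimal sketch, not code from lzd. The helper names decode_dict_size and read_le are hypothetical, and the size limits are assumed to be 4 KiB and 512 MiB as in the lzip format description.

```cpp
#include <stdint.h>

// Assumed limits (4 KiB to 512 MiB); the names mirror those used in the
// diff above but are redefined here so the sketch is self-contained.
const unsigned min_dictionary_size = 1u << 12;
const unsigned max_dictionary_size = 1u << 29;

// Hypothetical helper: decode the coded dictionary size in header byte 5.
// Bits 4-0 hold the base-2 logarithm of the base size; bits 7-5 hold the
// number of sixteenths of the base size to subtract from it.
// Returns 0 if the result is outside the assumed valid range.
unsigned decode_dict_size( const uint8_t coded )
  {
  unsigned size = 1u << ( coded & 0x1F );
  size -= ( size / 16 ) * ( ( coded >> 5 ) & 7 );
  if( size < min_dictionary_size || size > max_dictionary_size ) return 0;
  return size;
  }

// Hypothetical helper: read a little-endian unsigned field of 'len' bytes,
// using the same loop shape as the trailer checks in main().
unsigned long long read_le( const uint8_t * const buf, const int len )
  {
  unsigned long long value = 0;
  for( int i = len - 1; i >= 0; --i ) value = ( value << 8 ) + buf[i];
  return value;
  }
```

For example, decode_dict_size( 0xD3 ) yields 327680 (320 KiB): the base size is 2^19 and six sixteenths of it are subtracted. With such a helper, the three trailer fields would be read as read_le( trailer, 4 ), read_le( trailer + 4, 8 ), and read_le( trailer + 12, 8 ).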


@ -1,9 +1,9 @@
#! /bin/sh #! /bin/sh
# check script for Lzd - Educational decompressor for the lzip format # check script for Lzd - Educational decompressor for lzip files
# Copyright (C) 2013-2025 Antonio Diaz Diaz. # Copyright (C) 2013 Antonio Diaz Diaz.
# #
# This script is free software: you have unlimited permission # This script is free software: you have unlimited permission
# to copy, distribute, and modify it. # to copy, distribute and modify it.
LC_ALL=C LC_ALL=C
export LC_ALL export LC_ALL
@ -12,104 +12,36 @@ testdir=`cd "$1" ; pwd`
LZIP="${objdir}"/lzd LZIP="${objdir}"/lzd
framework_failure() { echo "failure in testing framework" ; exit 1 ; } framework_failure() { echo "failure in testing framework" ; exit 1 ; }
if [ ! -f "${LZIP}" ] || [ ! -x "${LZIP}" ] ; then if [ ! -x "${LZIP}" ] ; then
echo "${LZIP}: cannot execute" echo "${LZIP}: cannot execute"
exit 1 exit 1
fi fi
[ -e "${LZIP}" ] 2> /dev/null ||
{
echo "$0: a POSIX shell is required to run the tests"
echo "Try bash -c \"$0 $1 $2\""
exit 1
}
if [ -d tmp ] ; then rm -rf tmp ; fi if [ -d tmp ] ; then rm -rf tmp ; fi
mkdir tmp mkdir tmp
cd "${objdir}"/tmp || framework_failure cd "${objdir}"/tmp
in="${testdir}"/test.txt in="${testdir}"/test.txt
in_lz="${testdir}"/test.txt.lz in_lz="${testdir}"/test.txt.lz
em_lz="${testdir}"/em.lz
fox_lz="${testdir}"/fox.lz
fnz_lz="${testdir}"/fox_nz.lz
fail=0 fail=0
test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; }
printf "testing lzd-%s..." "$2" printf "testing lzd-%s..." "$2"
"${LZIP}" < "${in}" 2> /dev/null "${LZIP}" < "${in}" 2> /dev/null
[ $? = 2 ] || test_failed $LINENO if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
dd if="${in_lz}" bs=1 count=6 2> /dev/null | "${LZIP}" 2> /dev/null
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
dd if="${in_lz}" bs=1 count=20 2> /dev/null | "${LZIP}" 2> /dev/null
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" < "${in_lz}" > out || test_failed $LINENO "${LZIP}" < "${in_lz}" > copy || fail=1
cmp "${in}" out || test_failed $LINENO cmp "${in}" copy || fail=1
printf .
cat "${in}" "${in}" > in2 || framework_failure cat "${in}" "${in}" > in2 || framework_failure
cat "${in_lz}" "${in_lz}" | "${LZIP}" > out2 || test_failed $LINENO cat "${in_lz}" "${in_lz}" | "${LZIP}" > copy2 || fail=1
cmp in2 out2 || test_failed $LINENO cmp in2 copy2 || fail=1
rm -f out2 || framework_failure printf .
cat "${in_lz}" "${in_lz}" > out2.lz || framework_failure
printf "\ngarbage" >> out2.lz || framework_failure
"${LZIP}" -d < out2.lz > out2 || test_failed $LINENO
cmp in2 out2 || test_failed $LINENO
rm -f in2 out2 out2.lz || framework_failure
touch empty || framework_failure
"${LZIP}" -d < "${em_lz}" > em || test_failed $LINENO
cmp empty em || test_failed $LINENO
printf "\ntesting bad input..."
cat "${em_lz}" "${em_lz}" | "${LZIP}" -d > em 2> /dev/null
[ $? = 2 ] || test_failed $LINENO
cmp empty em || test_failed $LINENO
rm -f empty em || framework_failure
cat "${em_lz}" "${in_lz}" | "${LZIP}" -d > out 2> /dev/null
[ $? = 2 ] || test_failed $LINENO
cmp "${in}" out || test_failed $LINENO
cat "${in_lz}" "${em_lz}" | "${LZIP}" -d > out 2> /dev/null
[ $? = 2 ] || test_failed $LINENO
cmp "${in}" out || test_failed $LINENO
"${LZIP}" < "${fnz_lz}" 2> /dev/null
[ $? = 2 ] || test_failed $LINENO
for i in fox_v2.lz fox_s11.lz fox_de20.lz \
fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
"${LZIP}" < "${testdir}"/$i > /dev/null 2>&1
[ $? = 2 ] || test_failed $LINENO $i
done
"${LZIP}" < "${fox_lz}" > fox || test_failed $LINENO
for i in fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
"${LZIP}" < "${testdir}"/$i > out 2> /dev/null
[ $? = 2 ] || test_failed $LINENO $i
cmp fox out || test_failed $LINENO $i
done
rm -f fox || framework_failure
cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
if dd if=in3.lz of=trunc.lz bs=14682 count=1 2> /dev/null &&
[ -e trunc.lz ] && cmp in2.lz trunc.lz ; then
# can't detect truncated header of non-first member
for i in 6 20 14664 14688 ; do
dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null
"${LZIP}" < trunc.lz > /dev/null 2>&1
[ $? = 2 ] || test_failed $LINENO $i
done
else
printf "warning: skipping truncation test: 'dd' does not work on your system."
fi
rm -f in2.lz in3.lz trunc.lz || framework_failure
cp "${in_lz}" ingin.lz || framework_failure
printf "g" >> ingin.lz || framework_failure
cat "${in_lz}" >> ingin.lz || framework_failure
"${LZIP}" -d < ingin.lz > out || test_failed $LINENO
cmp "${in}" out || test_failed $LINENO
rm -f out ingin.lz || framework_failure
echo echo
if [ ${fail} = 0 ] ; then if [ ${fail} = 0 ] ; then

Binary file not shown.


File diff suppressed because it is too large.
