Compare commits

...

10 commits

Author SHA1 Message Date
9a45d2df81
Adding upstream version 1.5.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 15:09:22 +01:00
d9ee6fc0c5
Adding upstream version 1.4.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 15:08:42 +01:00
4ebeeeb191
Adding upstream version 1.3.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 15:08:05 +01:00
5fa8d2a83d
Adding upstream version 1.2.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 15:06:16 +01:00
6a0f9dafa8
Adding upstream version 1.1.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 15:04:05 +01:00
dcb57f45d5
Adding upstream version 1.0.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 15:01:40 +01:00
29dc774230
Adding upstream version 0.9.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 14:59:00 +01:00
4502486013
Adding upstream version 0.8.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 14:58:19 +01:00
2c6d5ecc7e
Adding upstream version 0.7.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 14:49:45 +01:00
64ab85d0eb
Adding upstream version 0.6.
Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-20 14:45:03 +01:00
22 changed files with 1090 additions and 879 deletions

@ -1,7 +1,6 @@
Lzd was written by Antonio Diaz Diaz. Lzd was written by Antonio Diaz Diaz.
The ideas embodied in lzd are due to (at least) the following people: The ideas embodied in lzd are due to (at least) the following people:
Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
the definition of Markov chains), G.N.N. Martin (for the definition of definition of Markov chains), G.N.N. Martin (for the definition of range
range encoding), and Igor Pavlov (for putting all the above together in encoding), and Igor Pavlov (for putting all the above together in LZMA).
LZMA).

17
COPYING Normal file

@ -0,0 +1,17 @@
Lzd - Educational decompressor for the lzip format
Copyright (C) Antonio Diaz Diaz.
This program is free software. Redistribution and use in source and
binary forms, with or without modification, are permitted provided
that the following conditions are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions, and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions, and the following disclaimer in the
documentation and/or other materials provided with the distribution.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@ -1,3 +1,60 @@
2025-01-02 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.5 released.
* lzd.cc: Reject empty members and nonzero first LZMA byte.
2024-01-02 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.4 released.
* lzd.cc: Use header_size and trailer_size instead of 6 and 20.
2022-10-24 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.3 released.
* lzd.cc (Range_decoder): Discard first LZMA byte explicitly.
2021-01-04 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.2 released.
* lzd.cc (main): Check also mismatches in member size.
Accept and ignore the option '-d' for compatibility with zutils.
Remove warning about "lzd not safe for real work".
Print license notice.
* testsuite: Add 10 new test files.
2019-01-11 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.1 released.
* Rename File_* to Lzip_*.
* lzd.cc: Compile on DOS with DJGPP.
* configure: Accept appending to CXXFLAGS; 'CXXFLAGS+=OPTIONS'.
2017-05-02 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.0 released.
* lzd.cc: Minor code improvements.
* check.sh: Require a POSIX shell.
2016-05-10 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.9 released.
* configure: Avoid warning on some shells when testing for g++.
2016-01-23 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.8 released.
* Document that lzip does not use 'literal_pos_state_bits'.
2015-07-07 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.7 released.
* Minor changes.
2014-08-25 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.6 released.
* Minor changes.
2013-09-17 Antonio Diaz Diaz <antonio@gnu.org> 2013-09-17 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.5 released. * Version 0.5 released.
@ -6,7 +63,7 @@
2013-08-01 Antonio Diaz Diaz <antonio@gnu.org> 2013-08-01 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.4 released. * Version 0.4 released.
* testsuite/check.sh: Removed '/dev/full' from tests. * check.sh: Remove '/dev/full' from tests.
2013-07-24 Antonio Diaz Diaz <antonio@gnu.org> 2013-07-24 Antonio Diaz Diaz <antonio@gnu.org>
@ -16,15 +73,14 @@
2013-05-06 Antonio Diaz Diaz <antonio@gnu.org> 2013-05-06 Antonio Diaz Diaz <antonio@gnu.org>
* Version 0.2 released. * Version 0.2 released.
* main.c: Added a missing '#include' for OS/2. * main.c: Add a missing '#include' for OS/2.
2013-03-21 Antonio Diaz Diaz <ant_diaz@teleline.es> 2013-03-21 Antonio Diaz Diaz <ant_diaz@teleline.es>
* Version 0.1 released. * Version 0.1 released.
Copyright (C) 2013 Antonio Diaz Diaz. Copyright (C) 2013-2025 Antonio Diaz Diaz.
This file is a collection of facts, and thus it is not copyrightable, This file is a collection of facts, and thus it is not copyrightable, but just
but just in case, you have unlimited permission to copy, distribute and in case, you have unlimited permission to copy, distribute, and modify it.
modify it.
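
The 1.4 entry above replaces the literal 6 with 'header_size'. As a minimal sketch of what those 6 bytes carry, the following C++ fragment validates a header and decodes its dictionary size. It is not the code from lzd.cc: the function name 'dictionary_size' and the convention of returning 0 on error are assumptions made for illustration. The magic string "LZIP", version 1, and the coded dictionary size layout (bits 4-0 hold the base-2 logarithm of the base size, bits 7-5 the number of sixteenths of the base size to subtract) are taken from the lzip format specification.

```cpp
#include <cstring>
#include <stdint.h>

enum { header_size = 6,
       min_dictionary_size = 1 << 12,
       max_dictionary_size = 1 << 29 };

typedef uint8_t Lzip_header[header_size];	// 0-3 magic bytes
						//   4 version
						//   5 coded dictionary size

// Hypothetical helper (not part of lzd): returns the dictionary size
// encoded in the header, or 0 if the header is invalid.
unsigned dictionary_size( const Lzip_header header )
  {
  // bytes 0-3 must be the magic string, byte 4 the format version
  if( std::memcmp( header, "LZIP", 4 ) != 0 || header[4] != 1 ) return 0;
  // bits 4-0 of byte 5: base-2 logarithm of the base size
  unsigned size = 1U << ( header[5] & 0x1F );
  // bits 7-5 of byte 5: number of sixteenths of the base size to subtract
  size -= ( size / 16 ) * ( ( header[5] >> 5 ) & 7 );
  if( size < (unsigned)min_dictionary_size ||
      size > (unsigned)max_dictionary_size ) return 0;
  return size;
  }
```

For example, a coded byte of 0x0C yields 4096 bytes (the minimum), and 0x34 yields 1 MiB minus one sixteenth, that is, 983040 bytes.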

41
INSTALL

@ -1,9 +1,11 @@
Requirements Requirements
------------ ------------
You will need a C++ compiler. You will need a C++98 compiler with support for 'long long'.
I use gcc 4.8.1 and 3.3.6, but the code should compile with any (gcc 3.3.6 or newer is recommended).
standards compliant compiler. I use gcc 6.1.0 and 3.3.6, but the code should compile with any standards
Gcc is available at http://gcc.gnu.org. compliant compiler.
Gcc is available at http://gcc.gnu.org
Lzip is available at http://www.nongnu.org/lzip/lzip.html
Procedure Procedure
@ -14,8 +16,8 @@ Procedure
or or
lzip -cd lzd[version].tar.lz | tar -xf - lzip -cd lzd[version].tar.lz | tar -xf -
This creates the directory ./lzd[version] containing the source from This creates the directory ./lzd[version] containing the source code
the main archive. extracted from the archive.
2. Change to lzd directory and run configure. 2. Change to lzd directory and run configure.
(Try 'configure --help' for usage instructions). (Try 'configure --help' for usage instructions).
@ -30,31 +32,28 @@ the main archive.
4. Optionally, type 'make check' to run the tests that come with lzd. 4. Optionally, type 'make check' to run the tests that come with lzd.
5. Type 'make install' to install the program and any data files and 5. Type 'make install' to install the program and any data files and
documentation. documentation. You need root privileges to install into a prefix owned
by root.
You can install only the program, the info manual or the man page
typing 'make install-bin', 'make install-info' or 'make install-man'
respectively.
Another way Another way
----------- -----------
You can also compile lzd into a separate directory. To do this, you You can also compile lzd into a separate directory.
must use a version of 'make' that supports the 'VPATH' variable, such To do this, you must use a version of 'make' that supports the variable
as GNU 'make'. 'cd' to the directory where you want the object files 'VPATH', such as GNU 'make'. 'cd' to the directory where you want the
and executables to go and run the 'configure' script. 'configure' object files and executables to go and run the 'configure' script.
automatically checks for the source code in '.', in '..' and in the 'configure' automatically checks for the source code in '.', in '..', and
directory that 'configure' is in. in the directory that 'configure' is in.
'configure' recognizes the option '--srcdir=DIR' to control where to 'configure' recognizes the option '--srcdir=DIR' to control where to look
look for the sources. Usually 'configure' can determine that directory for the source code. Usually 'configure' can determine that directory
automatically. automatically.
After running 'configure', you can run 'make' and 'make install' as After running 'configure', you can run 'make' and 'make install' as
explained above. explained above.
Copyright (C) 2013 Antonio Diaz Diaz. Copyright (C) 2013-2025 Antonio Diaz Diaz.
This file is free documentation: you have unlimited permission to copy, This file is free documentation: you have unlimited permission to copy,
distribute and modify it. distribute, and modify it.

@ -1,44 +1,47 @@
DISTNAME = $(pkgname)-$(pkgversion) DISTNAME = $(pkgname)-$(pkgversion)
INSTALL = install INSTALL = install
INSTALL_PROGRAM = $(INSTALL) -p -m 755 INSTALL_PROGRAM = $(INSTALL) -m 755
INSTALL_DATA = $(INSTALL) -p -m 644
INSTALL_DIR = $(INSTALL) -d -m 755 INSTALL_DIR = $(INSTALL) -d -m 755
INSTALL_DATA = $(INSTALL) -m 644
SHELL = /bin/sh SHELL = /bin/sh
CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
objs = lzd.o objs = lzd.o
.PHONY : all install install-bin install-info install-man install-strip \ .PHONY : all install install-bin install-info install-man \
install-strip install-compress install-strip-compress \
install-bin-strip install-info-compress install-man-compress \
uninstall uninstall-bin uninstall-info uninstall-man \ uninstall uninstall-bin uninstall-info uninstall-man \
doc info man check dist clean distclean doc info man check dist clean distclean
all : $(progname) all : $(progname)
$(progname) : $(objs) $(progname) : $(objs)
$(CXX) $(LDFLAGS) -o $@ $(objs) $(CXX) $(CXXFLAGS) $(LDFLAGS) -o $@ $(objs)
$(progname)_profiled : $(objs)
$(CXX) $(LDFLAGS) -pg -o $@ $(objs)
%.o : %.cc %.o : %.cc
$(CXX) $(CPPFLAGS) $(CXXFLAGS) -DPROGVERSION=\"$(pkgversion)\" -c -o $@ $< $(CXX) $(CPPFLAGS) $(CXXFLAGS) -DPROGVERSION=\"$(pkgversion)\" -c -o $@ $<
$(objs) : Makefile # prevent 'make' from trying to remake source files
$(VPATH)/configure $(VPATH)/Makefile.in $(VPATH)/doc/$(pkgname).texi : ;
MAKEFLAGS += -r
.SUFFIXES :
$(objs) : Makefile
doc : doc :
info : $(VPATH)/doc/$(pkgname).info info : $(VPATH)/doc/$(pkgname).info
$(VPATH)/doc/$(pkgname).info : $(VPATH)/doc/$(pkgname).texinfo $(VPATH)/doc/$(pkgname).info : $(VPATH)/doc/$(pkgname).texi
cd $(VPATH)/doc && makeinfo $(pkgname).texinfo cd $(VPATH)/doc && $(MAKEINFO) $(pkgname).texi
man : $(VPATH)/doc/$(progname).1 man : $(VPATH)/doc/$(progname).1
$(VPATH)/doc/$(progname).1 : $(progname) $(VPATH)/doc/$(progname).1 : $(progname)
help2man -n 'educational decompressor for lzip files' \ help2man -n 'educational decompressor for the lzip format' -o $@ --no-info ./$(progname)
-o $@ --no-info ./$(progname)
Makefile : $(VPATH)/configure $(VPATH)/Makefile.in Makefile : $(VPATH)/configure $(VPATH)/Makefile.in
./config.status ./config.status
@ -47,54 +50,73 @@ check : all
@$(VPATH)/testsuite/check.sh $(VPATH)/testsuite $(pkgversion) @$(VPATH)/testsuite/check.sh $(VPATH)/testsuite $(pkgversion)
install : install-bin install : install-bin
install-strip : install-bin-strip
install-compress : install-bin
install-strip-compress : install-bin-strip
install-bin : all install-bin : all
if [ ! -d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi if [ ! -d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi
$(INSTALL_PROGRAM) ./$(progname) "$(DESTDIR)$(bindir)/$(progname)" $(INSTALL_PROGRAM) ./$(progname) "$(DESTDIR)$(bindir)/$(progname)"
install-bin-strip : all
$(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install-bin
install-info : install-info :
if [ ! -d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi if [ ! -d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi
-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
$(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info" $(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info"
-install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" -if $(CAN_RUN_INSTALLINFO) ; then \
install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
fi
install-info-compress : install-info
lzip -v -9 "$(DESTDIR)$(infodir)/$(pkgname).info"
install-man : install-man :
if [ ! -d "$(DESTDIR)$(mandir)/man1" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1" ; fi if [ ! -d "$(DESTDIR)$(mandir)/man1" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1" ; fi
-rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"*
$(INSTALL_DATA) $(VPATH)/doc/$(progname).1 "$(DESTDIR)$(mandir)/man1/$(progname).1" $(INSTALL_DATA) $(VPATH)/doc/$(progname).1 "$(DESTDIR)$(mandir)/man1/$(progname).1"
install-strip : all install-man-compress : install-man
$(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install lzip -v -9 "$(DESTDIR)$(mandir)/man1/$(progname).1"
uninstall : uninstall-bin uninstall-info uninstall-man uninstall : uninstall-bin
uninstall-bin : uninstall-bin :
-rm -f "$(DESTDIR)$(bindir)/$(progname)" -rm -f "$(DESTDIR)$(bindir)/$(progname)"
uninstall-info : uninstall-info :
-install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" -if $(CAN_RUN_INSTALLINFO) ; then \
-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info" install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
fi
-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
uninstall-man : uninstall-man :
-rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1" -rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"*
dist : doc dist : doc
ln -sf $(VPATH) $(DISTNAME) ln -sf $(VPATH) $(DISTNAME)
tar -cvf $(DISTNAME).tar \ tar -Hustar --owner=root --group=root -cvf $(DISTNAME).tar \
$(DISTNAME)/AUTHORS \ $(DISTNAME)/AUTHORS \
$(DISTNAME)/COPYING \
$(DISTNAME)/ChangeLog \ $(DISTNAME)/ChangeLog \
$(DISTNAME)/INSTALL \ $(DISTNAME)/INSTALL \
$(DISTNAME)/Makefile.in \ $(DISTNAME)/Makefile.in \
$(DISTNAME)/NEWS \ $(DISTNAME)/NEWS \
$(DISTNAME)/README \ $(DISTNAME)/README \
$(DISTNAME)/configure \ $(DISTNAME)/configure \
$(DISTNAME)/*.cc \
$(DISTNAME)/testsuite/check.sh \ $(DISTNAME)/testsuite/check.sh \
$(DISTNAME)/testsuite/test.txt \ $(DISTNAME)/testsuite/test.txt \
$(DISTNAME)/testsuite/test.txt.lz \ $(DISTNAME)/testsuite/em.lz \
$(DISTNAME)/*.cc $(DISTNAME)/testsuite/fox.lz \
$(DISTNAME)/testsuite/fox_*.lz \
$(DISTNAME)/testsuite/test.txt.lz
rm -f $(DISTNAME) rm -f $(DISTNAME)
lzip -v -9 $(DISTNAME).tar lzip -v -9 $(DISTNAME).tar
clean : clean :
-rm -f $(progname) $(progname)_profiled $(objs) -rm -f $(progname) $(objs)
distclean : clean distclean : clean
-rm -f Makefile config.status *.tar *.tar.lz -rm -f Makefile config.status *.tar *.tar.lz

8
NEWS

@ -1,3 +1,7 @@
Changes in version 0.5: Changes in version 1.5:
Minor changes. lzd now exits with error status 2 if any empty member is found in a
multimember file.
lzd now exits with error status 2 if the first byte of the LZMA stream is
not 0.
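
The two checks announced for version 1.5 can be pictured with a minimal C++ sketch. It is one reading of the NEWS text, not the code from lzd.cc: the struct 'Member_info' and the function 'file_status' are hypothetical names, and the sketch assumes that the first LZMA byte and the decompressed size of each member have already been collected. Only the exit status values (0 for a normal exit, 2 for a corrupt or invalid input file) come from the source.

```cpp
#include <cstddef>
#include <stdint.h>
#include <vector>

// Exit status values documented in lzd.cc.
enum { status_ok = 0, status_invalid = 2 };

// Hypothetical summary of one decoded member (not a type from lzd).
struct Member_info
  {
  uint8_t first_lzma_byte;		// first byte of the LZMA stream
  unsigned long long data_size;		// size of the decompressed data
  };

// Hypothetical helper: exit status implied by the two checks for a file.
int file_status( const std::vector< Member_info > & members )
  {
  const bool multimember = members.size() > 1;
  for( std::size_t i = 0; i < members.size(); ++i )
    {
    // the first byte of every LZMA stream must be 0
    if( members[i].first_lzma_byte != 0 ) return status_invalid;
    // empty members are rejected only in multimember files
    if( multimember && members[i].data_size == 0 ) return status_invalid;
    }
  return status_ok;
  }
```

Under this reading, a file consisting of a single empty member is still accepted, while the same empty member appended to another member makes the whole file invalid.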

47
README

@ -1,30 +1,39 @@
See the file INSTALL for compilation and installation instructions.
Description Description
Lzd is a simplified decompressor for lzip files with an educational Lzd is a simplified decompressor for the lzip format with an educational
purpose. Studying its source is a good first step to understand how lzip purpose. Studying its source code is a good first step to understand how
works. It is not safe to use lzd for any real work. lzip works. Lzd is written in C++.
The source of lzd is used in the lzip manual as a reference decompressor The source code of lzd is used in the lzip manual as a reference
in the description of the lzip file format. Reading the lzip manual will decompressor in the description of the lzip file format. Reading the lzip
help you understand the source. manual will help you understand the source code. Lzd is compliant with the
lzip specification; it checks the 3 integrity factors.
Lzd decompresses from standard input to standard output. Lzd will The source code of lzd is also used as a reference in the description of the
correctly decompress the concatenation of two or more compressed files. media type 'application/lzip'.
The result is the concatenation of the corresponding decompressed data. See http://datatracker.ietf.org/doc/draft-diaz-lzip
Integrity of such concatenated compressed input is also verified.
Lzd decompresses from standard input to standard output. It accepts (and
ignores) the option '-d' for compatibility with other lzip tools. In
particular, accepting the option '-d' allows lzd to be used as argument to
the option '--lz' of the tools from the zutils package.
Lzd correctly decompresses the concatenation of two or more compressed
files. The result is the concatenation of the corresponding decompressed
data. Integrity of such concatenated compressed input is also checked.
The ideas embodied in lzd are due to (at least) the following people: The ideas embodied in lzd are due to (at least) the following people:
Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
the definition of Markov chains), G.N.N. Martin (for the definition of definition of Markov chains), G.N.N. Martin (for the definition of range
range encoding), and Igor Pavlov (for putting all the above together in encoding), and Igor Pavlov (for putting all the above together in LZMA).
LZMA).
Copyright (C) 2013 Antonio Diaz Diaz. Copyright (C) 2013-2025 Antonio Diaz Diaz.
This file is free documentation: you have unlimited permission to copy, This file is free documentation: you have unlimited permission to copy,
distribute and modify it. distribute, and modify it.
The file Makefile.in is a data file used by configure to produce the The file Makefile.in is a data file used by configure to produce the Makefile.
Makefile. It has the same copyright owner and permissions that configure It has the same copyright owner and permissions that configure itself.
itself.
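
The README says that lzd checks the 3 integrity factors. Assuming these are the three fields of the 20-byte member trailer (CRC32 of the uncompressed data, size of the uncompressed data, and member size), the check can be sketched in C++ as follows. The helper names 'get_le' and 'trailer_matches' are hypothetical and do not appear in lzd.cc; the little-endian byte order is taken from the lzip format specification.

```cpp
#include <stdint.h>

enum { trailer_size = 20 };

typedef uint8_t Lzip_trailer[trailer_size];
			//  0-3  CRC32 of the uncompressed data
			//  4-11 size of the uncompressed data
			// 12-19 member size including header and trailer

// Read 'size' bytes starting at 'buf' as a little-endian unsigned integer.
unsigned long long get_le( const uint8_t * const buf, const int size )
  {
  unsigned long long value = 0;
  for( int i = size - 1; i >= 0; --i ) value = ( value << 8 ) | buf[i];
  return value;
  }

// Hypothetical helper (not part of lzd): true if the three values stored
// in the trailer match the values computed while decoding the member.
bool trailer_matches( const Lzip_trailer trailer, const uint32_t crc,
                      const unsigned long long data_size,
                      const unsigned long long member_size )
  {
  return get_le( trailer, 4 ) == crc &&
         get_le( trailer + 4, 8 ) == data_size &&
         get_le( trailer + 12, 8 ) == member_size;
  }
```

A decompressor following this sketch would report a corrupt input (exit status 2 in lzd) whenever 'trailer_matches' returns false for any member.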

65
configure vendored

@ -1,12 +1,12 @@
#! /bin/sh #! /bin/sh
# configure script for Lzd - Educational decompressor for lzip files # configure script for Lzd - Educational decompressor for the lzip format
# Copyright (C) 2013 Antonio Diaz Diaz. # Copyright (C) 2013-2025 Antonio Diaz Diaz.
# #
# This configure script is free software: you have unlimited permission # This configure script is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute, and modify it.
pkgname=lzd pkgname=lzd
pkgversion=0.5 pkgversion=1.5
progname=lzd progname=lzd
srctrigger=lzd.cc srctrigger=lzd.cc
@ -24,13 +24,10 @@ CXX=g++
CPPFLAGS= CPPFLAGS=
CXXFLAGS='-Wall -W -O2' CXXFLAGS='-Wall -W -O2'
LDFLAGS= LDFLAGS=
MAKEINFO=makeinfo
# checking whether we are using GNU C++. # checking whether we are using GNU C++.
${CXX} --version > /dev/null 2>&1 /bin/sh -c "${CXX} --version" > /dev/null 2>&1 || { CXX=c++ ; CXXFLAGS=-O2 ; }
if [ $? != 0 ] ; then
CXX=c++
CXXFLAGS='-W -O2'
fi
# Loop over all args # Loop over all args
args= args=
@ -42,32 +39,38 @@ while [ $# != 0 ] ; do
shift shift
# Add the argument quoted to args # Add the argument quoted to args
args="${args} \"${option}\"" if [ -z "${args}" ] ; then args="\"${option}\""
else args="${args} \"${option}\"" ; fi
# Split out the argument for options that take them # Split out the argument for options that take them
case ${option} in case ${option} in
*=*) optarg=`echo ${option} | sed -e 's,^[^=]*=,,;s,/$,,'` ;; *=*) optarg=`echo "${option}" | sed -e 's,^[^=]*=,,;s,/$,,'` ;;
esac esac
# Process the options # Process the options
case ${option} in case ${option} in
--help | -h) --help | -h)
echo "Usage: configure [options]" echo "Usage: $0 [OPTION]... [VAR=VALUE]..."
echo echo
echo "Options: [defaults in brackets]" echo "To assign makefile variables (e.g., CXX, CXXFLAGS...), specify them as"
echo "arguments to configure in the form VAR=VALUE."
echo
echo "Options and variables: [defaults in brackets]"
echo " -h, --help display this help and exit" echo " -h, --help display this help and exit"
echo " -V, --version output version information and exit" echo " -V, --version output version information and exit"
echo " --srcdir=DIR find the sources in DIR [. or ..]" echo " --srcdir=DIR find the source code in DIR [. or ..]"
echo " --prefix=DIR install into DIR [${prefix}]" echo " --prefix=DIR install into DIR [${prefix}]"
echo " --exec-prefix=DIR base directory for arch-dependent files [${exec_prefix}]" echo " --exec-prefix=DIR base directory for arch-dependent files [${exec_prefix}]"
echo " --bindir=DIR user executables directory [${bindir}]" echo " --bindir=DIR user executables directory [${bindir}]"
echo " --datarootdir=DIR base directory for doc and data [${datarootdir}]" echo " --datarootdir=DIR base directory for doc and data [${datarootdir}]"
echo " --infodir=DIR info files directory [${infodir}]" echo " --infodir=DIR info files directory [${infodir}]"
echo " --mandir=DIR man pages directory [${mandir}]" echo " --mandir=DIR man pages directory [${mandir}]"
echo " CXX=COMPILER C++ compiler to use [g++]" echo " CXX=COMPILER C++ compiler to use [${CXX}]"
echo " CPPFLAGS=OPTIONS command line options for the preprocessor [${CPPFLAGS}]" echo " CPPFLAGS=OPTIONS command-line options for the preprocessor [${CPPFLAGS}]"
echo " CXXFLAGS=OPTIONS command line options for the C++ compiler [${CXXFLAGS}]" echo " CXXFLAGS=OPTIONS command-line options for the C++ compiler [${CXXFLAGS}]"
echo " LDFLAGS=OPTIONS command line options for the linker [${LDFLAGS}]" echo " CXXFLAGS+=OPTIONS append options to the current value of CXXFLAGS"
echo " LDFLAGS=OPTIONS command-line options for the linker [${LDFLAGS}]"
echo " MAKEINFO=NAME makeinfo program to use [${MAKEINFO}]"
echo echo
exit 0 ;; exit 0 ;;
--version | -V) --version | -V)
@ -93,7 +96,9 @@ while [ $# != 0 ] ; do
CXX=*) CXX=${optarg} ;; CXX=*) CXX=${optarg} ;;
CPPFLAGS=*) CPPFLAGS=${optarg} ;; CPPFLAGS=*) CPPFLAGS=${optarg} ;;
CXXFLAGS=*) CXXFLAGS=${optarg} ;; CXXFLAGS=*) CXXFLAGS=${optarg} ;;
CXXFLAGS+=*) CXXFLAGS="${CXXFLAGS} ${optarg}" ;;
LDFLAGS=*) LDFLAGS=${optarg} ;; LDFLAGS=*) LDFLAGS=${optarg} ;;
MAKEINFO=*) MAKEINFO=${optarg} ;;
--*) --*)
echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;; echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;;
@ -104,7 +109,7 @@ while [ $# != 0 ] ; do
exit 1 ;; exit 1 ;;
esac esac
# Check if the option took a separate argument # Check whether the option took a separate argument
if [ "${arg2}" = yes ] ; then if [ "${arg2}" = yes ] ; then
if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
else echo "configure: Missing argument to '${option}'" 1>&2 else echo "configure: Missing argument to '${option}'" 1>&2
@ -113,19 +118,19 @@ while [ $# != 0 ] ; do
fi fi
done done
# Find the source files, if location was not specified. # Find the source code, if location was not specified.
srcdirtext= srcdirtext=
if [ -z "${srcdir}" ] ; then if [ -z "${srcdir}" ] ; then
srcdirtext="or . or .." ; srcdir=. srcdirtext="or . or .." ; srcdir=.
if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi
if [ ! -r "${srcdir}/${srctrigger}" ] ; then if [ ! -r "${srcdir}/${srctrigger}" ] ; then
## the sed command below emulates the dirname command ## the sed command below emulates the dirname command
srcdir=`echo $0 | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'` srcdir=`echo "$0" | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
fi fi
fi fi
if [ ! -r "${srcdir}/${srctrigger}" ] ; then if [ ! -r "${srcdir}/${srctrigger}" ] ; then
echo "configure: Can't find sources in ${srcdir} ${srcdirtext}" 1>&2 echo "configure: Can't find source code in ${srcdir} ${srcdirtext}" 1>&2
echo "configure: (At least ${srctrigger} is missing)." 1>&2 echo "configure: (At least ${srctrigger} is missing)." 1>&2
exit 1 exit 1
fi fi
@ -139,13 +144,13 @@ if [ -z "${no_create}" ] ; then
rm -f config.status rm -f config.status
cat > config.status << EOF cat > config.status << EOF
#! /bin/sh #! /bin/sh
# This file was generated automatically by configure. Do not edit. # This file was generated automatically by configure. Don't edit.
# Run this file to recreate the current configuration. # Run this file to recreate the current configuration.
# #
# This script is free software: you have unlimited permission # This script is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute, and modify it.
exec /bin/sh $0 ${args} --no-create exec /bin/sh "$0" ${args} --no-create
EOF EOF
chmod +x config.status chmod +x config.status
fi fi
@ -162,14 +167,15 @@ echo "CXX = ${CXX}"
echo "CPPFLAGS = ${CPPFLAGS}" echo "CPPFLAGS = ${CPPFLAGS}"
echo "CXXFLAGS = ${CXXFLAGS}" echo "CXXFLAGS = ${CXXFLAGS}"
echo "LDFLAGS = ${LDFLAGS}" echo "LDFLAGS = ${LDFLAGS}"
echo "MAKEINFO = ${MAKEINFO}"
rm -f Makefile rm -f Makefile
cat > Makefile << EOF cat > Makefile << EOF
# Makefile for Lzd - Educational decompressor for lzip files # Makefile for Lzd - Educational decompressor for the lzip format
# Copyright (C) 2013 Antonio Diaz Diaz. # Copyright (C) 2013-2025 Antonio Diaz Diaz.
# This file was generated automatically by configure. Do not edit. # This file was generated automatically by configure. Don't edit.
# #
# This Makefile is free software: you have unlimited permission # This Makefile is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute, and modify it.
pkgname = ${pkgname} pkgname = ${pkgname}
pkgversion = ${pkgversion} pkgversion = ${pkgversion}
@ -185,6 +191,7 @@ CXX = ${CXX}
CPPFLAGS = ${CPPFLAGS} CPPFLAGS = ${CPPFLAGS}
CXXFLAGS = ${CXXFLAGS} CXXFLAGS = ${CXXFLAGS}
LDFLAGS = ${LDFLAGS} LDFLAGS = ${LDFLAGS}
MAKEINFO = ${MAKEINFO}
EOF EOF
cat "${srcdir}/Makefile.in" >> Makefile cat "${srcdir}/Makefile.in" >> Makefile

262
lzd.cc

@ -1,8 +1,16 @@
/* Lzd - Educational decompressor for lzip files /* Lzd - Educational decompressor for the lzip format
Copyright (C) 2013 Antonio Diaz Diaz. Copyright (C) 2013-2025 Antonio Diaz Diaz.
This program is free software: you have unlimited permission This program is free software. Redistribution and use in source and
to copy, distribute and modify it. binary forms, with or without modification, are permitted provided
that the following conditions are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions, and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions, and the following disclaimer in the
documentation and/or other materials provided with the distribution.
This program is distributed in the hope that it will be useful, This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of but WITHOUT ANY WARRANTY; without even the implied warranty of
@ -10,8 +18,8 @@
*/ */
/* /*
Exit status: 0 for a normal exit, 1 for environmental problems Exit status: 0 for a normal exit, 1 for environmental problems
(file not found, invalid flags, I/O errors, etc), 2 to indicate a (file not found, invalid command-line options, I/O errors, etc), 2 to
corrupt or invalid input file. indicate a corrupt or invalid input file.
*/ */
#include <algorithm> #include <algorithm>
@ -21,7 +29,7 @@
#include <cstring> #include <cstring>
#include <stdint.h> #include <stdint.h>
#include <unistd.h> #include <unistd.h>
#if defined(__MSVCRT__) || defined(__OS2__) #if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
#include <fcntl.h> #include <fcntl.h>
#include <io.h> #include <io.h>
#endif #endif
@ -39,7 +47,7 @@ public:
void set_char() void set_char()
{ {
static const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 }; const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
st = next[st]; st = next[st];
} }
void set_match() { st = ( st < 7 ) ? 7 : 10; } void set_match() { st = ( st < 7 ) ? 7 : 10; }
@ -52,6 +60,7 @@ enum {
min_dictionary_size = 1 << 12, min_dictionary_size = 1 << 12,
max_dictionary_size = 1 << 29, max_dictionary_size = 1 << 29,
literal_context_bits = 3, literal_context_bits = 3,
literal_pos_state_bits = 0, // not used
pos_state_bits = 2, pos_state_bits = 2,
pos_states = 1 << pos_state_bits, pos_states = 1 << pos_state_bits,
pos_state_mask = pos_states - 1, pos_state_mask = pos_states - 1,
@ -60,7 +69,7 @@ enum {
dis_slot_bits = 6, dis_slot_bits = 6,
start_dis_model = 4, start_dis_model = 4,
end_dis_model = 14, end_dis_model = 14,
modeled_distances = 1 << (end_dis_model / 2), // 128 modeled_distances = 1 << ( end_dis_model / 2 ), // 128
dis_align_bits = 4, dis_align_bits = 4,
dis_align_size = 1 << dis_align_bits, dis_align_size = 1 << dis_align_bits,
@ -121,74 +130,82 @@ public:
const CRC32 crc32; const CRC32 crc32;
typedef uint8_t File_header[6]; // 0-3 magic, 4 version, 5 coded_dict_size enum { header_size = 6, trailer_size = 20 };
typedef uint8_t Lzip_header[header_size]; // 0-3 magic bytes
typedef uint8_t File_trailer[20]; // 4 version
// 5 coded dictionary size
typedef uint8_t Lzip_trailer[trailer_size];
// 0-3 CRC32 of the uncompressed data // 0-3 CRC32 of the uncompressed data
// 4-11 size of the uncompressed data // 4-11 size of the uncompressed data
// 12-19 member size including header and trailer // 12-19 member size including header and trailer
class Range_decoder class Range_decoder
{ {
unsigned long long member_pos;
uint32_t code; uint32_t code;
uint32_t range; uint32_t range;
public: public:
Range_decoder() : code( 0 ), range( 0xFFFFFFFFU ) Range_decoder()
: member_pos( header_size ), code( 0 ), range( 0xFFFFFFFFU )
{ {
for( int i = 0; i < 5; ++i ) code = (code << 8) | get_byte(); if( get_byte() != 0 ) // check first LZMA byte
{ std::fputs( "Nonzero first LZMA byte.\n", stderr ); std::exit( 2 ); }
for( int i = 0; i < 4; ++i ) code = ( code << 8 ) | get_byte();
} }
uint8_t get_byte() { return std::getc( stdin ); } uint8_t get_byte() { ++member_pos; return std::getc( stdin ); }
unsigned long long member_position() const { return member_pos; }
int decode( const int num_bits ) unsigned decode( const int num_bits )
{ {
int symbol = 0; unsigned symbol = 0;
for( int i = 0; i < num_bits; ++i ) for( int i = num_bits; i > 0; --i )
{ {
range >>= 1; range >>= 1;
symbol <<= 1; symbol <<= 1;
if( code >= range ) { code -= range; symbol |= 1; } if( code >= range ) { code -= range; symbol |= 1; }
if( range <= 0x00FFFFFFU ) // normalize if( range <= 0x00FFFFFFU ) // normalize
{ range <<= 8; code = (code << 8) | get_byte(); } { range <<= 8; code = ( code << 8 ) | get_byte(); }
} }
return symbol; return symbol;
} }
int decode_bit( Bit_model & bm ) bool decode_bit( Bit_model & bm )
{ {
int symbol; bool symbol;
const uint32_t bound = ( range >> bit_model_total_bits ) * bm.probability; const uint32_t bound = ( range >> bit_model_total_bits ) * bm.probability;
if( code < bound ) if( code < bound )
{ {
range = bound; range = bound;
bm.probability += (bit_model_total - bm.probability) >> bit_model_move_bits; bm.probability +=
( bit_model_total - bm.probability ) >> bit_model_move_bits;
symbol = 0; symbol = 0;
} }
else else
{ {
range -= bound;
code -= bound; code -= bound;
range -= bound;
bm.probability -= bm.probability >> bit_model_move_bits; bm.probability -= bm.probability >> bit_model_move_bits;
symbol = 1; symbol = 1;
} }
if( range <= 0x00FFFFFFU ) // normalize if( range <= 0x00FFFFFFU ) // normalize
{ range <<= 8; code = (code << 8) | get_byte(); } { range <<= 8; code = ( code << 8 ) | get_byte(); }
return symbol; return symbol;
} }
int decode_tree( Bit_model bm[], const int num_bits ) unsigned decode_tree( Bit_model bm[], const int num_bits )
{ {
int symbol = 1; unsigned symbol = 1;
for( int i = 0; i < num_bits; ++i ) for( int i = 0; i < num_bits; ++i )
symbol = ( symbol << 1 ) | decode_bit( bm[symbol] ); symbol = ( symbol << 1 ) | decode_bit( bm[symbol] );
return symbol - (1 << num_bits); return symbol - ( 1 << num_bits );
} }
int decode_tree_reversed( Bit_model bm[], const int num_bits ) unsigned decode_tree_reversed( Bit_model bm[], const int num_bits )
{ {
int symbol = decode_tree( bm, num_bits ); unsigned symbol = decode_tree( bm, num_bits );
int reversed_symbol = 0; unsigned reversed_symbol = 0;
for( int i = 0; i < num_bits; ++i ) for( int i = 0; i < num_bits; ++i )
{ {
reversed_symbol = ( reversed_symbol << 1 ) | ( symbol & 1 ); reversed_symbol = ( reversed_symbol << 1 ) | ( symbol & 1 );
@ -197,14 +214,13 @@ public:
return reversed_symbol; return reversed_symbol;
} }
int decode_matched( Bit_model bm[], const int match_byte ) unsigned decode_matched( Bit_model bm[], const unsigned match_byte )
{ {
Bit_model * const bm1 = bm + 0x100; unsigned symbol = 1;
int symbol = 1;
for( int i = 7; i >= 0; --i ) for( int i = 7; i >= 0; --i )
{ {
const int match_bit = ( match_byte >> i ) & 1; const bool match_bit = ( match_byte >> i ) & 1;
const int bit = decode_bit( bm1[(match_bit<<8)+symbol] ); const bool bit = decode_bit( bm[symbol+(match_bit<<8)+0x100] );
symbol = ( symbol << 1 ) | bit; symbol = ( symbol << 1 ) | bit;
if( match_bit != bit ) if( match_bit != bit )
{ {
@ -213,17 +229,18 @@ public:
break; break;
} }
} }
return symbol - 0x100; return symbol & 0xFF;
} }
int decode_len( Len_model & lm, const int pos_state ) unsigned decode_len( Len_model & lm, const int pos_state )
{ {
if( decode_bit( lm.choice1 ) == 0 ) if( decode_bit( lm.choice1 ) == 0 )
return decode_tree( lm.bm_low[pos_state], len_low_bits ); return min_match_len +
decode_tree( lm.bm_low[pos_state], len_low_bits );
if( decode_bit( lm.choice2 ) == 0 ) if( decode_bit( lm.choice2 ) == 0 )
return len_low_symbols + return min_match_len + len_low_symbols +
decode_tree( lm.bm_mid[pos_state], len_mid_bits ); decode_tree( lm.bm_mid[pos_state], len_mid_bits );
return len_low_symbols + len_mid_symbols + return min_match_len + len_low_symbols + len_mid_symbols +
decode_tree( lm.bm_high, len_high_bits ); decode_tree( lm.bm_high, len_high_bits );
} }
}; };
@ -238,14 +255,15 @@ class LZ_decoder
unsigned pos; // current pos in buffer unsigned pos; // current pos in buffer
unsigned stream_pos; // first byte not yet written to stdout unsigned stream_pos; // first byte not yet written to stdout
uint32_t crc_; uint32_t crc_;
bool pos_wrapped;
void flush_data(); void flush_data();
uint8_t get_byte( const unsigned distance ) const uint8_t peek( const unsigned distance ) const
{ {
unsigned i = pos - distance - 1; if( pos > distance ) return buffer[pos - distance - 1];
if( pos <= distance ) i += dictionary_size; if( pos_wrapped ) return buffer[dictionary_size + pos - distance - 1];
return buffer[i]; return 0; // prev_byte of first byte
} }
void put_byte( const uint8_t b ) void put_byte( const uint8_t b )
@ -255,20 +273,25 @@ class LZ_decoder
} }
public: public:
LZ_decoder( const unsigned dict_size ) explicit LZ_decoder( const unsigned dict_size )
: :
partial_data_pos( 0 ), partial_data_pos( 0 ),
dictionary_size( dict_size ), dictionary_size( dict_size ),
buffer( new uint8_t[dictionary_size] ), buffer( new uint8_t[dictionary_size] ),
pos( 0 ), pos( 0 ),
stream_pos( 0 ), stream_pos( 0 ),
crc_( 0xFFFFFFFFU ) crc_( 0xFFFFFFFFU ),
{ buffer[dictionary_size-1] = 0; } // prev_byte of first_byte pos_wrapped( false )
{}
~LZ_decoder() { delete[] buffer; } ~LZ_decoder() { delete[] buffer; }
unsigned crc() const { return crc_ ^ 0xFFFFFFFFU; } unsigned crc() const { return crc_ ^ 0xFFFFFFFFU; }
unsigned long long data_position() const { return partial_data_pos + pos; } unsigned long long data_position() const
{ return partial_data_pos + pos; }
uint8_t get_byte() { return rdec.get_byte(); }
unsigned long long member_position() const
{ return rdec.member_position(); }
bool decode_member(); bool decode_member();
}; };
@ -280,17 +303,17 @@ void LZ_decoder::flush_data()
{ {
const unsigned size = pos - stream_pos; const unsigned size = pos - stream_pos;
crc32.update_buf( crc_, buffer + stream_pos, size ); crc32.update_buf( crc_, buffer + stream_pos, size );
errno = 0;
if( std::fwrite( buffer + stream_pos, 1, size, stdout ) != size ) if( std::fwrite( buffer + stream_pos, 1, size, stdout ) != size )
{ std::fprintf( stderr, "Write error: %s\n", std::strerror( errno ) ); { std::fprintf( stderr, "Write error: %s\n", std::strerror( errno ) );
std::exit( 1 ); } std::exit( 1 ); }
if( pos >= dictionary_size ) { partial_data_pos += pos; pos = 0; } if( pos >= dictionary_size )
{ partial_data_pos += pos; pos = 0; pos_wrapped = true; }
stream_pos = pos; stream_pos = pos;
} }
} }
bool LZ_decoder::decode_member() // Returns false if error bool LZ_decoder::decode_member() // Return false if error
{ {
Bit_model bm_literal[1<<literal_context_bits][0x300]; Bit_model bm_literal[1<<literal_context_bits][0x300];
Bit_model bm_match[State::states][pos_states]; Bit_model bm_match[State::states][pos_states];
@ -300,7 +323,7 @@ bool LZ_decoder::decode_member() // Returns false if error
Bit_model bm_rep2[State::states]; Bit_model bm_rep2[State::states];
Bit_model bm_len[State::states][pos_states]; Bit_model bm_len[State::states][pos_states];
Bit_model bm_dis_slot[len_states][1<<dis_slot_bits]; Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
Bit_model bm_dis[modeled_distances-end_dis_model]; Bit_model bm_dis[modeled_distances-end_dis_model+1];
Bit_model bm_align[dis_align_size]; Bit_model bm_align[dis_align_size];
Len_model match_len_model; Len_model match_len_model;
Len_model rep_len_model; Len_model rep_len_model;
@ -315,21 +338,27 @@ bool LZ_decoder::decode_member() // Returns false if error
const int pos_state = data_position() & pos_state_mask; const int pos_state = data_position() & pos_state_mask;
if( rdec.decode_bit( bm_match[state()][pos_state] ) == 0 ) // 1st bit if( rdec.decode_bit( bm_match[state()][pos_state] ) == 0 ) // 1st bit
{ {
const uint8_t prev_byte = get_byte( 0 ); // literal byte
const uint8_t prev_byte = peek( 0 );
const int literal_state = prev_byte >> ( 8 - literal_context_bits ); const int literal_state = prev_byte >> ( 8 - literal_context_bits );
Bit_model * const bm = bm_literal[literal_state]; Bit_model * const bm = bm_literal[literal_state];
if( state.is_char() ) if( state.is_char() )
put_byte( rdec.decode_tree( bm, 8 ) ); put_byte( rdec.decode_tree( bm, 8 ) );
else else
put_byte( rdec.decode_matched( bm, get_byte( rep0 ) ) ); put_byte( rdec.decode_matched( bm, peek( rep0 ) ) );
state.set_char(); state.set_char();
continue;
} }
else // match or repeated match
{
int len; int len;
if( rdec.decode_bit( bm_rep[state()] ) != 0 ) // 2nd bit if( rdec.decode_bit( bm_rep[state()] ) != 0 ) // 2nd bit
{ {
if( rdec.decode_bit( bm_rep0[state()] ) != 0 ) // 3rd bit if( rdec.decode_bit( bm_rep0[state()] ) == 0 ) // 3rd bit
{
if( rdec.decode_bit( bm_len[state()][pos_state] ) == 0 ) // 4th bit
{ state.set_short_rep(); put_byte( peek( rep0 ) ); continue; }
}
else
{ {
unsigned distance; unsigned distance;
if( rdec.decode_bit( bm_rep1[state()] ) == 0 ) // 4th bit if( rdec.decode_bit( bm_rep1[state()] ) == 0 ) // 4th bit
@ -345,116 +374,119 @@ bool LZ_decoder::decode_member() // Returns false if error
rep1 = rep0; rep1 = rep0;
rep0 = distance; rep0 = distance;
} }
else
{
if( rdec.decode_bit( bm_len[state()][pos_state] ) == 0 ) // 4th bit
{ state.set_short_rep(); put_byte( get_byte( rep0 ) ); continue; }
}
state.set_rep(); state.set_rep();
len = min_match_len + rdec.decode_len( rep_len_model, pos_state ); len = rdec.decode_len( rep_len_model, pos_state );
} }
else else // match
{ {
rep3 = rep2; rep2 = rep1; rep1 = rep0; rep3 = rep2; rep2 = rep1; rep1 = rep0;
len = min_match_len + rdec.decode_len( match_len_model, pos_state ); len = rdec.decode_len( match_len_model, pos_state );
const int len_state = std::min( len - min_match_len, len_states - 1 ); const int len_state = std::min( len - min_match_len, len_states - 1 );
const int dis_slot = rep0 = rdec.decode_tree( bm_dis_slot[len_state], dis_slot_bits );
rdec.decode_tree( bm_dis_slot[len_state], dis_slot_bits ); if( rep0 >= start_dis_model )
if( dis_slot < start_dis_model ) rep0 = dis_slot;
else
{ {
const unsigned dis_slot = rep0;
const int direct_bits = ( dis_slot >> 1 ) - 1; const int direct_bits = ( dis_slot >> 1 ) - 1;
rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits; rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
if( dis_slot < end_dis_model ) if( dis_slot < end_dis_model )
rep0 += rdec.decode_tree_reversed( bm_dis + rep0 - dis_slot - 1, rep0 += rdec.decode_tree_reversed( bm_dis + ( rep0 - dis_slot ),
direct_bits ); direct_bits );
else else
{ {
rep0 += rdec.decode( direct_bits - dis_align_bits ) << dis_align_bits; rep0 += rdec.decode( direct_bits-dis_align_bits ) << dis_align_bits;
rep0 += rdec.decode_tree_reversed( bm_align, dis_align_bits ); rep0 += rdec.decode_tree_reversed( bm_align, dis_align_bits );
if( rep0 == 0xFFFFFFFFU ) // Marker found if( rep0 == 0xFFFFFFFFU ) // marker found
{ {
flush_data(); flush_data();
return ( len == min_match_len ); // End Of Stream marker return len == min_match_len; // End Of Stream marker
} }
} }
} }
state.set_match(); state.set_match();
if( rep0 >= dictionary_size || ( rep0 >= pos && !partial_data_pos ) ) if( rep0 >= dictionary_size || ( rep0 >= pos && !pos_wrapped ) )
return false; { flush_data(); return false; }
}
for( int i = 0; i < len; ++i )
put_byte( get_byte( rep0 ) );
} }
for( int i = 0; i < len; ++i ) put_byte( peek( rep0 ) );
} }
flush_data();
return false; return false;
} }
int main( const int argc, const char * const argv[] ) int main( const int argc, const char * const argv[] )
{ {
if( argc > 1 ) if( argc > 2 || ( argc == 2 && std::strcmp( argv[1], "-d" ) != 0 ) )
{ {
std::printf( "Lzd %s - Educational decompressor for lzip files.\n", std::printf(
PROGVERSION ); "Lzd %s - Educational decompressor for the lzip format.\n"
std::printf( "Study the source to learn how a lzip decompressor works.\n" "Study the source code to learn how a lzip decompressor works.\n"
"See the lzip manual for an explanation of the code.\n" "See the lzip manual for an explanation of the code.\n"
"It is not safe to use lzd for any real work.\n" "\nUsage: %s [-d] < file.lz > file\n"
"\nUsage: %s < file.lz > file\n", argv[0] ); "Lzd decompresses from standard input to standard output.\n"
std::printf( "Lzd decompresses from standard input to standard output.\n" "\nCopyright (C) 2025 Antonio Diaz Diaz.\n"
"\nCopyright (C) 2013 Antonio Diaz Diaz.\n" "License 2-clause BSD.\n"
"This is free software: you are free to change and redistribute it.\n" "This is free software: you are free to change and redistribute "
"There is NO WARRANTY, to the extent permitted by law.\n" "it.\nThere is NO WARRANTY, to the extent permitted by law.\n"
"Report bugs to lzip-bug@nongnu.org\n" "Report bugs to lzip-bug@nongnu.org\n"
"Lzd home page: http://www.nongnu.org/lzip/lzd.html\n" ); "Lzd home page: http://www.nongnu.org/lzip/lzd.html\n",
PROGVERSION, argv[0] );
return 0; return 0;
} }
#if defined(__MSVCRT__) || defined(__OS2__) #if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
setmode( STDIN_FILENO, O_BINARY ); setmode( STDIN_FILENO, O_BINARY );
setmode( STDOUT_FILENO, O_BINARY ); setmode( STDOUT_FILENO, O_BINARY );
#endif #endif
bool empty = false, multi = false;
for( bool first_member = true; ; first_member = false ) for( bool first_member = true; ; first_member = false )
{ {
File_header header; Lzip_header header; // check header
for( int i = 0; i < 6; ++i ) for( int i = 0; i < header_size; ++i ) header[i] = std::getc( stdin );
header[i] = std::getc( stdin ); if( std::feof( stdin ) || std::memcmp( header, "LZIP\x01", 5 ) != 0 )
if( std::feof( stdin ) || std::memcmp( header, "LZIP", 4 ) != 0 )
{ {
if( first_member ) if( first_member )
{ std::fprintf( stderr, "Bad magic number (file not in lzip format)\n" ); { std::fputs( "Bad magic number (file not in lzip format).\n",
return 2; } stderr ); return 2; }
break; break; // ignore trailing data
}
if( header[4] != 1 )
{
std::fprintf( stderr, "Version %d member format not supported.\n",
header[4] );
return 2;
} }
unsigned dict_size = 1 << ( header[5] & 0x1F ); unsigned dict_size = 1 << ( header[5] & 0x1F );
dict_size -= ( dict_size / 16 ) * ( ( header[5] >> 5 ) & 7 ); dict_size -= ( dict_size / 16 ) * ( ( header[5] >> 5 ) & 7 );
if( dict_size < min_dictionary_size || dict_size > max_dictionary_size ) if( dict_size < min_dictionary_size || dict_size > max_dictionary_size )
{ std::fprintf( stderr, "Invalid dictionary size in member header\n" ); { std::fputs( "Invalid dictionary size in member header.\n",
return 2; } stderr ); return 2; }
LZ_decoder decoder( dict_size ); LZ_decoder decoder( dict_size ); // decode LZMA stream
if( !decoder.decode_member() ) if( !decoder.decode_member() )
{ std::fprintf( stderr, "Data error\n" ); return 2; } { std::fputs( "Data error.\n", stderr ); return 2; }
File_trailer trailer; Lzip_trailer trailer; // check trailer
for( int i = 0; i < 20; ++i ) trailer[i] = std::getc( stdin ); for( int i = 0; i < trailer_size; ++i ) trailer[i] = decoder.get_byte();
int retval = 0;
unsigned crc = 0; unsigned crc = 0;
for( int i = 3; i >= 0; --i ) { crc <<= 8; crc += trailer[i]; } for( int i = 3; i >= 0; --i ) crc = ( crc << 8 ) + trailer[i];
if( crc != decoder.crc() )
{ std::fputs( "CRC mismatch.\n", stderr ); retval = 2; }
unsigned long long data_size = 0; unsigned long long data_size = 0;
for( int i = 11; i >= 4; --i ) { data_size <<= 8; data_size += trailer[i]; } for( int i = 11; i >= 4; --i )
if( crc != decoder.crc() || data_size != decoder.data_position() ) data_size = ( data_size << 8 ) + trailer[i];
{ std::fprintf( stderr, "CRC error\n" ); return 2; } if( data_size != decoder.data_position() )
{ std::fputs( "Data size mismatch.\n", stderr ); retval = 2; }
multi = !first_member; if( data_size == 0 ) empty = true;
unsigned long long member_size = 0;
for( int i = 19; i >= 12; --i )
member_size = ( member_size << 8 ) + trailer[i];
if( member_size != decoder.member_position() )
{ std::fputs( "Member size mismatch.\n", stderr ); retval = 2; }
if( retval ) return retval;
} }
if( std::fclose( stdout ) != 0 ) if( std::fclose( stdout ) != 0 )
{ std::fprintf( stderr, "Can't close stdout: %s\n", std::strerror( errno ) ); { std::fprintf( stderr, "Error closing stdout: %s\n",
return 1; } std::strerror( errno ) ); return 1; }
if( empty && multi )
{ std::fputs( "Empty member not allowed.\n", stderr ); return 2; }
return 0; return 0;
} }
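
Note on the header and trailer handling in `main` above: the new code derives the dictionary size from byte 5 of the header and reads three little-endian fields from the 20-byte trailer (CRC in bytes 0-3, data size in bytes 4-11, member size in bytes 12-19). The following is a minimal sketch of those two computations in isolation. The helper names `decode_dict_size`, `read_le`, and `check_examples` are hypothetical and do not appear in lzd; only the arithmetic is taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of lzd): decode the coded dictionary size
// stored in byte 5 of a lzip member header, using the same arithmetic as
// the diff. Bits 4-0 give the base-2 logarithm of the base size; bits 7-5
// give the number of sixteenths of the base size to subtract.
unsigned decode_dict_size( const uint8_t coded )
  {
  unsigned size = 1U << ( coded & 0x1F );
  size -= ( size / 16 ) * ( ( coded >> 5 ) & 7 );
  return size;
  }

// Hypothetical helper (not part of lzd): read a little-endian unsigned
// field of 'bytes' bytes starting at 'p', most significant byte last.
unsigned long long read_le( const uint8_t * const p, const int bytes )
  {
  unsigned long long value = 0;
  for( int i = bytes - 1; i >= 0; --i ) value = ( value << 8 ) + p[i];
  return value;
  }

// Hypothetical self-check exercising both helpers on hand-made inputs.
void check_examples()
  {
  assert( decode_dict_size( 0x0C ) == 4096 );		// 2^12, nothing subtracted
  assert( decode_dict_size( 0xD3 ) == 327680 );		// 2^19 - 6 * 2^15
  const uint8_t field[4] = { 0x78, 0x56, 0x34, 0x12 };
  assert( read_le( field, 4 ) == 0x12345678ULL );
  }
```

With a trailer held in a `uint8_t` array, the three fields would be `read_le( trailer, 4 )`, `read_le( trailer + 4, 8 )`, and `read_le( trailer + 12, 8 )`, matching the loop bounds in the diff.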

@ -1,9 +1,9 @@
#! /bin/sh #! /bin/sh
# check script for Lzd - Educational decompressor for lzip files # check script for Lzd - Educational decompressor for the lzip format
# Copyright (C) 2013 Antonio Diaz Diaz. # Copyright (C) 2013-2025 Antonio Diaz Diaz.
# #
# This script is free software: you have unlimited permission # This script is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute, and modify it.
LC_ALL=C LC_ALL=C
export LC_ALL export LC_ALL
@ -12,36 +12,104 @@ testdir=`cd "$1" ; pwd`
LZIP="${objdir}"/lzd LZIP="${objdir}"/lzd
framework_failure() { echo "failure in testing framework" ; exit 1 ; } framework_failure() { echo "failure in testing framework" ; exit 1 ; }
if [ ! -x "${LZIP}" ] ; then if [ ! -f "${LZIP}" ] || [ ! -x "${LZIP}" ] ; then
echo "${LZIP}: cannot execute" echo "${LZIP}: cannot execute"
exit 1 exit 1
fi fi
[ -e "${LZIP}" ] 2> /dev/null ||
{
echo "$0: a POSIX shell is required to run the tests"
echo "Try bash -c \"$0 $1 $2\""
exit 1
}
if [ -d tmp ] ; then rm -rf tmp ; fi if [ -d tmp ] ; then rm -rf tmp ; fi
mkdir tmp mkdir tmp
cd "${objdir}"/tmp cd "${objdir}"/tmp || framework_failure
in="${testdir}"/test.txt in="${testdir}"/test.txt
in_lz="${testdir}"/test.txt.lz in_lz="${testdir}"/test.txt.lz
em_lz="${testdir}"/em.lz
fox_lz="${testdir}"/fox.lz
fnz_lz="${testdir}"/fox_nz.lz
fail=0 fail=0
test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; }
printf "testing lzd-%s..." "$2" printf "testing lzd-%s..." "$2"
"${LZIP}" < "${in}" 2> /dev/null "${LZIP}" < "${in}" 2> /dev/null
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi [ $? = 2 ] || test_failed $LINENO
dd if="${in_lz}" bs=1 count=6 2> /dev/null | "${LZIP}" 2> /dev/null
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
dd if="${in_lz}" bs=1 count=20 2> /dev/null | "${LZIP}" 2> /dev/null
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" < "${in_lz}" > copy || fail=1 "${LZIP}" < "${in_lz}" > out || test_failed $LINENO
cmp "${in}" copy || fail=1 cmp "${in}" out || test_failed $LINENO
printf .
cat "${in}" "${in}" > in2 || framework_failure cat "${in}" "${in}" > in2 || framework_failure
cat "${in_lz}" "${in_lz}" | "${LZIP}" > copy2 || fail=1 cat "${in_lz}" "${in_lz}" | "${LZIP}" > out2 || test_failed $LINENO
cmp in2 copy2 || fail=1 cmp in2 out2 || test_failed $LINENO
printf . rm -f out2 || framework_failure
cat "${in_lz}" "${in_lz}" > out2.lz || framework_failure
printf "\ngarbage" >> out2.lz || framework_failure
"${LZIP}" -d < out2.lz > out2 || test_failed $LINENO
cmp in2 out2 || test_failed $LINENO
rm -f in2 out2 out2.lz || framework_failure
touch empty || framework_failure
"${LZIP}" -d < "${em_lz}" > em || test_failed $LINENO
cmp empty em || test_failed $LINENO
printf "\ntesting bad input..."
cat "${em_lz}" "${em_lz}" | "${LZIP}" -d > em 2> /dev/null
[ $? = 2 ] || test_failed $LINENO
cmp empty em || test_failed $LINENO
rm -f empty em || framework_failure
cat "${em_lz}" "${in_lz}" | "${LZIP}" -d > out 2> /dev/null
[ $? = 2 ] || test_failed $LINENO
cmp "${in}" out || test_failed $LINENO
cat "${in_lz}" "${em_lz}" | "${LZIP}" -d > out 2> /dev/null
[ $? = 2 ] || test_failed $LINENO
cmp "${in}" out || test_failed $LINENO
"${LZIP}" < "${fnz_lz}" 2> /dev/null
[ $? = 2 ] || test_failed $LINENO
for i in fox_v2.lz fox_s11.lz fox_de20.lz \
fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
"${LZIP}" < "${testdir}"/$i > /dev/null 2>&1
[ $? = 2 ] || test_failed $LINENO $i
done
"${LZIP}" < "${fox_lz}" > fox || test_failed $LINENO
for i in fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
"${LZIP}" < "${testdir}"/$i > out 2> /dev/null
[ $? = 2 ] || test_failed $LINENO $i
cmp fox out || test_failed $LINENO $i
done
rm -f fox || framework_failure
cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
if dd if=in3.lz of=trunc.lz bs=14682 count=1 2> /dev/null &&
[ -e trunc.lz ] && cmp in2.lz trunc.lz ; then
# can't detect truncated header of non-first member
for i in 6 20 14664 14688 ; do
dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null
"${LZIP}" < trunc.lz > /dev/null 2>&1
[ $? = 2 ] || test_failed $LINENO $i
done
else
printf "warning: skipping truncation test: 'dd' does not work on your system."
fi
rm -f in2.lz in3.lz trunc.lz || framework_failure
cp "${in_lz}" ingin.lz || framework_failure
printf "g" >> ingin.lz || framework_failure
cat "${in_lz}" >> ingin.lz || framework_failure
"${LZIP}" -d < ingin.lz > out || test_failed $LINENO
cmp "${in}" out || test_failed $LINENO
rm -f out ingin.lz || framework_failure
echo echo
if [ ${fail} = 0 ] ; then if [ ${fail} = 0 ] ; then

BIN
testsuite/em.lz Normal file

Binary file not shown.

BIN
testsuite/fox.lz Normal file

Binary file not shown.

BIN
testsuite/fox_bcrc.lz Normal file

Binary file not shown.

BIN
testsuite/fox_crc0.lz Normal file

Binary file not shown.

BIN
testsuite/fox_das46.lz Normal file

Binary file not shown.

BIN
testsuite/fox_de20.lz Normal file

Binary file not shown.

BIN
testsuite/fox_mes81.lz Normal file

Binary file not shown.

BIN
testsuite/fox_nz.lz Normal file

Binary file not shown.

BIN
testsuite/fox_s11.lz Normal file

Binary file not shown.

BIN
testsuite/fox_v2.lz Normal file

Binary file not shown.

File diff suppressed because it is too large.

Binary file not shown.