Merging upstream version 1.12~rc1.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 4ddb634c25
commit cd6a248630
24 changed files with 874 additions and 719 deletions
3
COPYING
|
@ -1,8 +1,7 @@
|
||||||
GNU GENERAL PUBLIC LICENSE
|
GNU GENERAL PUBLIC LICENSE
|
||||||
Version 2, June 1991
|
Version 2, June 1991
|
||||||
|
|
||||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
|
Copyright (C) 1989, 1991 Free Software Foundation, Inc. <http://fsf.org/>
|
||||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
|
||||||
Everyone is permitted to copy and distribute verbatim copies
|
Everyone is permitted to copy and distribute verbatim copies
|
||||||
of this license document, but changing it is not allowed.
|
of this license document, but changing it is not allowed.
|
||||||
|
|
||||||
|
|
74
ChangeLog
|
@ -1,3 +1,13 @@
|
||||||
|
2024-11-19 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
|
* Version 1.12-rc1 released.
|
||||||
|
* decompress.cc (decompress), list.cc (list_files):
|
||||||
|
Return 2 if any empty member is found in a multimember file.
|
||||||
|
* dec_stdout.cc, dec_stream.cc:
|
||||||
|
Change 'deliver_packet' to 'deliver_packets'.
|
||||||
|
* plzip.texi: New chapter 'Syntax of command-line arguments'.
|
||||||
|
* check.sh: Use 'cp' instead of 'cat'.
|
||||||
|
|
||||||
2024-01-21 Antonio Diaz Diaz <antonio@gnu.org>
|
2024-01-21 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
* Version 1.11 released.
|
* Version 1.11 released.
|
||||||
|
@ -20,16 +30,17 @@
|
||||||
2021-01-03 Antonio Diaz Diaz <antonio@gnu.org>
|
2021-01-03 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
* Version 1.9 released.
|
* Version 1.9 released.
|
||||||
|
* New option '--check-lib'.
|
||||||
* main.cc (main): Report an error if a file name is empty.
|
* main.cc (main): Report an error if a file name is empty.
|
||||||
|
(main): Show final diagnostic when testing multiple files.
|
||||||
Make '-o' behave like '-c', but writing to file instead of stdout.
|
Make '-o' behave like '-c', but writing to file instead of stdout.
|
||||||
Make '-c' and '-o' check whether the output is a terminal only once.
|
Make '-c' and '-o' check whether the output is a terminal only once.
|
||||||
Do not open output if input is a terminal.
|
Do not open output if input is a terminal.
|
||||||
* main.cc: New option '--check-lib'.
|
Set a valid invocation_name even if argc == 0.
|
||||||
* Replace 'decompressed', 'compressed' with 'out', 'in' in output.
|
* Replace 'decompressed', 'compressed' with 'out', 'in' in output.
|
||||||
* decompress.cc, dec_stream.cc, dec_stdout.cc:
|
* decompress.cc, dec_stdout.cc, dec_stream.cc:
|
||||||
Continue testing if any input file fails the test.
|
Continue testing if any input file fails the test.
|
||||||
Show the largest dictionary size in a multimember file.
|
Show the largest dictionary size in a multimember file.
|
||||||
* main.cc: Show final diagnostic when testing multiple files.
|
|
||||||
* decompress.cc, dec_stream.cc [LZ_API_VERSION >= 1012]: Avoid
|
* decompress.cc, dec_stream.cc [LZ_API_VERSION >= 1012]: Avoid
|
||||||
copying decompressed data when testing with lzlib 1.12 or newer.
|
copying decompressed data when testing with lzlib 1.12 or newer.
|
||||||
* compress.cc, dec_stream.cc: Start only the worker threads required.
|
* compress.cc, dec_stream.cc: Start only the worker threads required.
|
||||||
|
@ -38,47 +49,46 @@
|
||||||
Use plain comparison instead of Boyer-Moore to search for headers.
|
Use plain comparison instead of Boyer-Moore to search for headers.
|
||||||
* lzip_index.cc: Improve messages for corruption in last header.
|
* lzip_index.cc: Improve messages for corruption in last header.
|
||||||
* decompress.cc: Shorten messages 'Data error' and 'Unexpected EOF'.
|
* decompress.cc: Shorten messages 'Data error' and 'Unexpected EOF'.
|
||||||
* main.cc: Set a valid invocation_name even if argc == 0.
|
|
||||||
* Document extraction from tar.lz in manual, '--help', and man page.
|
* Document extraction from tar.lz in manual, '--help', and man page.
|
||||||
* plzip.texi (Introduction): Mention tarlz as an alternative.
|
* plzip.texi (Introduction): Mention tarlz as an alternative.
|
||||||
* plzip.texi: Several fixes and improvements.
|
Several fixes and improvements.
|
||||||
* testsuite: Add 8 new test files.
|
* testsuite: Add 8 new test files.
|
||||||
|
|
||||||
2019-01-05 Antonio Diaz Diaz <antonio@gnu.org>
|
2019-01-05 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
* Version 1.8 released.
|
* Version 1.8 released.
|
||||||
* Rename File_* to Lzip_*.
|
* Rename File_* to Lzip_*.
|
||||||
* main.cc: New options '--in-slots' and '--out-slots'.
|
* New options '--in-slots' and '--out-slots'.
|
||||||
* main.cc: Increase default in_slots per worker from 2 to 4.
|
* main.cc (main): Increase default in_slots per worker from 2 to 4.
|
||||||
* main.cc: Increase default out_slots per worker from 32 to 64.
|
(main): Increase default out_slots per worker from 32 to 64.
|
||||||
|
(main): Check return value of close( infd ).
|
||||||
* lzip.h (Lzip_trailer): New function 'verify_consistency'.
|
* lzip.h (Lzip_trailer): New function 'verify_consistency'.
|
||||||
* lzip_index.cc: Detect some kinds of corrupt trailers.
|
* lzip_index.cc: Detect some kinds of corrupt trailers.
|
||||||
* main.cc (main): Check return value of close( infd ).
|
* plzip.texi: Improve descriptions of '-0..-9', '-m', and '-s'.
|
||||||
* plzip.texi: Improve description of '-0..-9', '-m', and '-s'.
|
|
||||||
* configure: New option '--with-mingw'.
|
* configure: New option '--with-mingw'.
|
||||||
* configure: Accept appending to CXXFLAGS; 'CXXFLAGS+=OPTIONS'.
|
Accept appending to CXXFLAGS; 'CXXFLAGS+=OPTIONS'.
|
||||||
* INSTALL: Document use of CXXFLAGS+='-D __USE_MINGW_ANSI_STDIO'.
|
* INSTALL: Document use of CXXFLAGS+='-D __USE_MINGW_ANSI_STDIO'.
|
||||||
|
|
||||||
2018-02-07 Antonio Diaz Diaz <antonio@gnu.org>
|
2018-02-07 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
* Version 1.7 released.
|
* Version 1.7 released.
|
||||||
|
* New option '--loose-trailing'.
|
||||||
* compress.cc: Use 'LZ_compress_restart_member' and replace input
|
* compress.cc: Use 'LZ_compress_restart_member' and replace input
|
||||||
packet queue by a circular buffer to reduce memory fragmentation.
|
packet queue by a circular buffer to reduce memory fragmentation.
|
||||||
* compress.cc: Return one empty packet at a time to reduce mem use.
|
Return one empty packet at a time to reduce memory use.
|
||||||
* main.cc: Reduce threads on 32 bit systems to use under 2.22 GiB.
|
* main.cc: Reduce threads on 32 bit systems to use under 2.22 GiB.
|
||||||
* main.cc: New option '--loose-trailing'.
|
(set_c_outname): Do not add a second '.lz' to the arg of '-o'.
|
||||||
|
(cleanup_and_fail): Suppress messages from other threads.
|
||||||
* Improve corrupt header detection to HD = 3 on seekable files.
|
* Improve corrupt header detection to HD = 3 on seekable files.
|
||||||
(On all files with lzlib 1.10 or newer).
|
(On all files with lzlib 1.10 or newer).
|
||||||
* Replace 'bits/byte' with inverse compression ratio in output.
|
* Replace 'bits/byte' with inverse compression ratio in output.
|
||||||
* Show progress of decompression at verbosity level 2 (-vv).
|
* Show progress of decompression at verbosity level 2 (-vv).
|
||||||
* Show progress of (de)compression only if stderr is a terminal.
|
* Show progress of (de)compression only if stderr is a terminal.
|
||||||
* main.cc: Do not add a second .lz extension to the arg of -o.
|
|
||||||
* Show dictionary size at verbosity level 4 (-vvvv).
|
* Show dictionary size at verbosity level 4 (-vvvv).
|
||||||
* main.cc (cleanup_and_fail): Suppress messages from other threads.
|
|
||||||
* list.cc: Add missing '#include <pthread.h>'.
|
* list.cc: Add missing '#include <pthread.h>'.
|
||||||
* plzip.texi: New chapter 'Output'.
|
* plzip.texi: New chapter 'Meaning of plzip's output'.
|
||||||
* plzip.texi (Memory requirements): Add table.
|
(Memory requirements): Add table.
|
||||||
* plzip.texi (Program design): Add a block diagram.
|
(Program design): Add a block diagram.
|
||||||
|
|
||||||
2017-04-12 Antonio Diaz Diaz <antonio@gnu.org>
|
2017-04-12 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
|
@ -92,14 +102,13 @@
|
||||||
2016-05-14 Antonio Diaz Diaz <antonio@gnu.org>
|
2016-05-14 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
* Version 1.5 released.
|
* Version 1.5 released.
|
||||||
* main.cc: New option '-a, --trailing-error'.
|
* New option '-a, --trailing-error'.
|
||||||
* main.cc (main): Delete '--output' file if infd is a terminal.
|
* main.cc (main): Delete '--output' file if infd is a terminal.
|
||||||
* main.cc (main): Don't use stdin more than once.
|
(main): Don't use stdin more than once.
|
||||||
* plzip.texi: New chapters 'Trailing data' and 'Examples'.
|
* plzip.texi: New chapters 'Trailing data' and 'Examples'.
|
||||||
* configure: Avoid warning on some shells when testing for g++.
|
* configure: Avoid warning on some shells when testing for g++.
|
||||||
* Makefile.in: Detect the existence of install-info.
|
* Makefile.in: Detect the existence of install-info.
|
||||||
* check.sh: A POSIX shell is required to run the tests.
|
* check.sh: Require a POSIX shell. Don't check error messages.
|
||||||
* check.sh: Don't check error messages.
|
|
||||||
|
|
||||||
2015-07-09 Antonio Diaz Diaz <antonio@gnu.org>
|
2015-07-09 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
|
@ -136,14 +145,14 @@
|
||||||
|
|
||||||
* Version 1.0 released.
|
* Version 1.0 released.
|
||||||
* compress.cc: Change 'deliver_packet' to 'deliver_packets'.
|
* compress.cc: Change 'deliver_packet' to 'deliver_packets'.
|
||||||
* Scalability of decompression from/to regular files has been
|
* Increase scalability of decompression from/to regular files by
|
||||||
increased by removing splitter and muxer when not needed.
|
removing splitter and muxer when not needed.
|
||||||
* The number of worker threads is now limited to the number of
|
* Limit the number of worker threads to the number of members when
|
||||||
members when decompressing from a regular file.
|
decompressing from a regular file.
|
||||||
* configure: Options now accept a separate argument.
|
* configure: Options now accept a separate argument.
|
||||||
* Makefile.in: New targets 'install-as-lzip' and 'install-bin'.
|
* Makefile.in: New targets 'install-as-lzip' and 'install-bin'.
|
||||||
* main.cc: Use 'setmode' instead of '_setmode' on Windows and OS/2.
|
|
||||||
* main.cc: Define 'strtoull' to 'std::strtoul' on Windows.
|
* main.cc: Define 'strtoull' to 'std::strtoul' on Windows.
|
||||||
|
(main): Use 'setmode' instead of '_setmode' on Windows and OS/2.
|
||||||
|
|
||||||
2012-03-01 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
2012-03-01 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
||||||
|
|
||||||
|
@ -154,13 +163,13 @@
|
||||||
2012-01-17 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
2012-01-17 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
||||||
|
|
||||||
* Version 0.8 released.
|
* Version 0.8 released.
|
||||||
* main.cc: New option '-F, --recompress'.
|
* New option '-F, --recompress'.
|
||||||
* decompress.cc (decompress): Show compression ratio.
|
* decompress.cc (decompress): Show compression ratio.
|
||||||
* main.cc (close_and_set_permissions): Inability to change output
|
* main.cc (close_and_set_permissions): Inability to change output
|
||||||
file attributes has been downgraded from error to warning.
|
file attributes has been downgraded from error to warning.
|
||||||
|
(main): Set stdin/stdout in binary mode on OS2.
|
||||||
* Small change in '--help' output and man page.
|
* Small change in '--help' output and man page.
|
||||||
* Change quote characters in messages as advised by GNU Standards.
|
* Change quote characters in messages as advised by GNU Standards.
|
||||||
* main.cc: Set stdin/stdout in binary mode on OS2.
|
|
||||||
* compress.cc: Reduce memory use of compressed packets.
|
* compress.cc: Reduce memory use of compressed packets.
|
||||||
* decompress.cc: Use Boyer-Moore algorithm to search for headers.
|
* decompress.cc: Use Boyer-Moore algorithm to search for headers.
|
||||||
|
|
||||||
|
@ -174,7 +183,7 @@
|
||||||
* main.cc (open_instream): Don't show the message
|
* main.cc (open_instream): Don't show the message
|
||||||
" and '--stdout' was not specified" for directories, etc.
|
" and '--stdout' was not specified" for directories, etc.
|
||||||
Exit with status 1 if any output file exists and is skipped.
|
Exit with status 1 if any output file exists and is skipped.
|
||||||
* main.cc: Fix warning about fchown return value being ignored.
|
Fix warning about fchown's return value being ignored.
|
||||||
* testsuite: Rename 'test1' to 'test.txt'. New tests.
|
* testsuite: Rename 'test1' to 'test.txt'. New tests.
|
||||||
|
|
||||||
2010-03-20 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
2010-03-20 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
||||||
|
@ -202,9 +211,8 @@
|
||||||
* Version 0.3 released.
|
* Version 0.3 released.
|
||||||
* New option '-B, --data-size'.
|
* New option '-B, --data-size'.
|
||||||
* Output file is now removed if plzip is interrupted.
|
* Output file is now removed if plzip is interrupted.
|
||||||
* This version automatically chooses the smallest possible
|
* Choose automatically the smallest possible dictionary size for
|
||||||
dictionary size for each member during compression, saving
|
each member during compression, saving memory during decompression.
|
||||||
memory during decompression.
|
|
||||||
* main.cc: New constant 'o_binary'.
|
* main.cc: New constant 'o_binary'.
|
||||||
|
|
||||||
2010-01-17 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
2010-01-17 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
||||||
|
|
|
@ -2,8 +2,8 @@
|
||||||
DISTNAME = $(pkgname)-$(pkgversion)
|
DISTNAME = $(pkgname)-$(pkgversion)
|
||||||
INSTALL = install
|
INSTALL = install
|
||||||
INSTALL_PROGRAM = $(INSTALL) -m 755
|
INSTALL_PROGRAM = $(INSTALL) -m 755
|
||||||
INSTALL_DATA = $(INSTALL) -m 644
|
|
||||||
INSTALL_DIR = $(INSTALL) -d -m 755
|
INSTALL_DIR = $(INSTALL) -d -m 755
|
||||||
|
INSTALL_DATA = $(INSTALL) -m 644
|
||||||
SHELL = /bin/sh
|
SHELL = /bin/sh
|
||||||
CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
|
CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
|
||||||
|
|
||||||
|
@ -34,7 +34,8 @@ main.o : main.cc
|
||||||
|
|
||||||
# prevent 'make' from trying to remake source files
|
# prevent 'make' from trying to remake source files
|
||||||
$(VPATH)/configure $(VPATH)/Makefile.in $(VPATH)/doc/$(pkgname).texi : ;
|
$(VPATH)/configure $(VPATH)/Makefile.in $(VPATH)/doc/$(pkgname).texi : ;
|
||||||
%.h %.cc : ;
|
MAKEFLAGS += -r
|
||||||
|
.SUFFIXES :
|
||||||
|
|
||||||
$(objs) : Makefile
|
$(objs) : Makefile
|
||||||
arg_parser.o : arg_parser.h
|
arg_parser.o : arg_parser.h
|
||||||
|
@ -133,8 +134,7 @@ dist : doc
|
||||||
$(DISTNAME)/testsuite/test.txt \
|
$(DISTNAME)/testsuite/test.txt \
|
||||||
$(DISTNAME)/testsuite/fox.lz \
|
$(DISTNAME)/testsuite/fox.lz \
|
||||||
$(DISTNAME)/testsuite/fox_*.lz \
|
$(DISTNAME)/testsuite/fox_*.lz \
|
||||||
$(DISTNAME)/testsuite/test.txt.lz \
|
$(DISTNAME)/testsuite/test.txt.lz
|
||||||
$(DISTNAME)/testsuite/test_em.txt.lz
|
|
||||||
rm -f $(DISTNAME)
|
rm -f $(DISTNAME)
|
||||||
lzip -v -9 $(DISTNAME).tar
|
lzip -v -9 $(DISTNAME).tar
|
||||||
|
|
||||||
|
|
16
NEWS
|
@ -1,14 +1,8 @@
|
||||||
Changes in version 1.11:
|
Changes in version 1.12:
|
||||||
|
|
||||||
File diagnostics have been reformatted as 'PROGRAM: FILE: MESSAGE'.
|
plzip now exits with error status 2 if any empty member is found in a
|
||||||
|
multimember file.
|
||||||
|
|
||||||
Diagnostics caused by invalid arguments to command-line options now show the
|
Scalability when decompressing to standard output has been increased.
|
||||||
argument and the name of the option.
|
|
||||||
|
|
||||||
The option '-o, --output' now preserves dates, permissions, and ownership of
|
The chapter 'Syntax of command-line arguments' has been added to the manual.
|
||||||
the file when (de)compressing exactly one file.
|
|
||||||
|
|
||||||
The option '-o, --output' now creates missing intermediate directories when
|
|
||||||
writing to a file.
|
|
||||||
|
|
||||||
The variable MAKEINFO has been added to configure and Makefile.in.
|
|
||||||
|
|
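
The first NEWS item (exit status 2 when an empty member is found in a multimember file) can be pictured with a small standalone sketch. This is not the code in decompress.cc or list.cc. The helper name `empty_member_status` and the idea of passing the data size of every member in a vector are assumptions made only for illustration.

```cpp
#include <vector>

// Hypothetical helper, not part of plzip: 'data_sizes' holds the
// uncompressed size of each member of one file, in order.
// Returns 2 if the file is multimember and any member is empty, else 0.
int empty_member_status( const std::vector< unsigned long long > & data_sizes )
  {
  if( data_sizes.size() < 2 ) return 0;		// not a multimember file
  for( unsigned long i = 0; i < data_sizes.size(); ++i )
    if( data_sizes[i] == 0 ) return 2;		// empty member found
  return 0;
  }
```

Under these assumptions a caller would combine the result with the exit status it already has, keeping the larger of the two values.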
28
README
|
@ -1,26 +1,26 @@
|
||||||
Description
|
Description
|
||||||
|
|
||||||
Plzip is a massively parallel (multi-threaded) implementation of lzip,
|
Plzip is a massively parallel (multi-threaded) implementation of lzip. Plzip
|
||||||
compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
|
uses the compression library lzlib.
|
||||||
|
|
||||||
Lzip is a lossless data compressor with a user interface similar to the one
|
Lzip is a lossless data compressor with a user interface similar to the one
|
||||||
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
|
of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov
|
||||||
chain-Algorithm' (LZMA) stream format to maximize interoperability. The
|
chain-Algorithm) designed to achieve complete interoperability between
|
||||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
implementations. The maximum dictionary size is 512 MiB so that any lzip
|
||||||
on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
|
file can be decompressed on 32-bit machines. Lzip provides accurate and
|
||||||
checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
|
robust 3-factor integrity checking. 'lzip -0' compresses about as fast as
|
||||||
files more than bzip2 (lzip -9). Decompression speed is intermediate between
|
gzip, while 'lzip -9' compresses most files more than bzip2. Decompression
|
||||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
speed is intermediate between gzip and bzip2. Lzip provides better data
|
||||||
perspective. Lzip has been designed, written, and tested with great care to
|
recovery capabilities than gzip and bzip2. Lzip has been designed, written,
|
||||||
replace gzip and bzip2 as the standard general-purpose compressed format for
|
and tested with great care to replace gzip and bzip2 as general-purpose
|
||||||
Unix-like systems.
|
compressed format for Unix-like systems.
|
||||||
|
|
||||||
Plzip can compress/decompress large files on multiprocessor machines much
|
Plzip can compress/decompress large files on multiprocessor machines much
|
||||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||||
to 2 percent larger compressed files). Note that the number of usable
|
to 2 percent larger compressed files). Note that the number of usable
|
||||||
threads is limited by file size; on files larger than a few GB plzip can use
|
threads is limited by file size; on files larger than a few GB plzip can use
|
||||||
hundreds of processors, but on files of only a few MB plzip is no faster
|
hundreds of processors, but on files smaller than 1 MiB plzip is no faster
|
||||||
than lzip.
|
than lzip (even at compression level -0).
|
||||||
|
|
||||||
For creation and manipulation of compressed tar archives tarlz can be more
|
For creation and manipulation of compressed tar archives tarlz can be more
|
||||||
efficient than using tar and plzip because tarlz is able to keep the
|
efficient than using tar and plzip because tarlz is able to keep the
|
||||||
|
|
|
@ -75,19 +75,19 @@ bool Arg_parser::parse_long_option( const char * const opt, const char * const a
|
||||||
error_ += "' requires an argument";
|
error_ += "' requires an argument";
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
data.back().argument = &opt[len+3];
|
data.back().argument = &opt[len+3]; // argument may be empty
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
if( options[index].has_arg == yes )
|
if( options[index].has_arg == yes || options[index].has_arg == yme )
|
||||||
{
|
{
|
||||||
if( !arg || !arg[0] )
|
if( !arg || ( options[index].has_arg == yes && !arg[0] ) )
|
||||||
{
|
{
|
||||||
error_ = "option '--"; error_ += options[index].long_name;
|
error_ = "option '--"; error_ += options[index].long_name;
|
||||||
error_ += "' requires an argument";
|
error_ += "' requires an argument";
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
++argind; data.back().argument = arg;
|
++argind; data.back().argument = arg; // argument may be empty
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -123,15 +123,16 @@ bool Arg_parser::parse_short_option( const char * const opt, const char * const
|
||||||
{
|
{
|
||||||
data.back().argument = &opt[cind]; ++argind; cind = 0;
|
data.back().argument = &opt[cind]; ++argind; cind = 0;
|
||||||
}
|
}
|
||||||
else if( options[index].has_arg == yes )
|
else if( options[index].has_arg == yes || options[index].has_arg == yme )
|
||||||
{
|
{
|
||||||
if( !arg || !arg[0] )
|
if( !arg || ( options[index].has_arg == yes && !arg[0] ) )
|
||||||
{
|
{
|
||||||
error_ = "option requires an argument -- '"; error_ += c;
|
error_ = "option requires an argument -- '"; error_ += c;
|
||||||
error_ += '\'';
|
error_ += '\'';
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
data.back().argument = arg; ++argind; cind = 0;
|
++argind; cind = 0;
|
||||||
|
data.back().argument = arg; // argument may be empty
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
return true;
|
return true;
|
||||||
|
|
10
arg_parser.h
|
@ -36,14 +36,18 @@
|
||||||
The argument '--' terminates all options; any following arguments are
|
The argument '--' terminates all options; any following arguments are
|
||||||
treated as non-option arguments, even if they begin with a hyphen.
|
treated as non-option arguments, even if they begin with a hyphen.
|
||||||
|
|
||||||
The syntax for optional option arguments is '-<short_option><argument>'
|
The syntax of options with an optional argument is
|
||||||
(without whitespace), or '--<long_option>=<argument>'.
|
'-<short_option><argument>' (without whitespace), or
|
||||||
|
'--<long_option>=<argument>'.
|
||||||
|
|
||||||
|
The syntax of options with an empty argument is '-<short_option> ""',
|
||||||
|
'--<long_option> ""', or '--<long_option>=""'.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
class Arg_parser
|
class Arg_parser
|
||||||
{
|
{
|
||||||
public:
|
public:
|
||||||
enum Has_arg { no, yes, maybe };
|
enum Has_arg { no, yes, maybe, yme }; // yme = yes but maybe empty
|
||||||
|
|
||||||
struct Option
|
struct Option
|
||||||
{
|
{
|
||||||
|
|
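
The new `yme` value of `Has_arg` reads as "argument required, but it may be the empty string". The acceptance test that the arg_parser.cc hunks apply to a separate argument can be isolated as the minimal sketch below. The helper name `argument_ok` is hypothetical and does not exist in arg_parser. Only the enum and the condition are taken from the diff.

```cpp
// Same enumerators as the enum in arg_parser.h after this commit.
enum Has_arg { no, yes, maybe, yme };	// yme = yes but maybe empty

// Hypothetical helper (not in arg_parser.cc): true if 'arg' is acceptable
// as the separate argument of an option declared with 'has_arg'.
// Only meaningful for 'yes' and 'yme', the two values that consume the
// next command-line word.
bool argument_ok( const Has_arg has_arg, const char * const arg )
  {
  if( !arg ) return false;			// there is no next word
  if( has_arg == yes && !arg[0] ) return false;	// 'yes' rejects ""
  return true;					// 'yme' accepts ""
  }
```

With this rule, an option declared `yme` accepts the forms listed in the header comment (`-<short_option> ""`, `--<long_option> ""`, `--<long_option>=""`), while an option declared `yes` still reports "requires an argument" for an empty string.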
83
compress.cc
|
@ -112,7 +112,6 @@ void xlock( pthread_mutex_t * const mutex )
|
||||||
{ show_error( "pthread_mutex_lock", errcode ); cleanup_and_fail(); }
|
{ show_error( "pthread_mutex_lock", errcode ); cleanup_and_fail(); }
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void xunlock( pthread_mutex_t * const mutex )
|
void xunlock( pthread_mutex_t * const mutex )
|
||||||
{
|
{
|
||||||
const int errcode = pthread_mutex_unlock( mutex );
|
const int errcode = pthread_mutex_unlock( mutex );
|
||||||
|
@ -158,7 +157,7 @@ struct Packet // data block with a serial number
|
||||||
int size; // number of bytes in data (if any)
|
int size; // number of bytes in data (if any)
|
||||||
unsigned id; // serial number assigned as received
|
unsigned id; // serial number assigned as received
|
||||||
Packet() : data( 0 ), size( 0 ), id( 0 ) {}
|
Packet() : data( 0 ), size( 0 ), id( 0 ) {}
|
||||||
void init( uint8_t * const d, const int s, const unsigned i )
|
void assign( uint8_t * const d, const int s, const unsigned i )
|
||||||
{ data = d; size = s; id = i; }
|
{ data = d; size = s; id = i; }
|
||||||
};
|
};
|
||||||
|
|
||||||
|
@ -176,7 +175,7 @@ private:
|
||||||
unsigned deliver_id; // id of next packet to be delivered
|
unsigned deliver_id; // id of next packet to be delivered
|
||||||
Slot_tally slot_tally; // limits the number of input packets
|
Slot_tally slot_tally; // limits the number of input packets
|
||||||
std::vector< Packet > circular_ibuffer;
|
std::vector< Packet > circular_ibuffer;
|
||||||
std::vector< const Packet * > circular_obuffer;
|
std::vector< const Packet * > circular_obuffer; // pointers to ibuffer
|
||||||
int num_working; // number of workers still running
|
int num_working; // number of workers still running
|
||||||
const int num_slots; // max packets in circulation
|
const int num_slots; // max packets in circulation
|
||||||
pthread_mutex_t imutex;
|
pthread_mutex_t imutex;
|
||||||
|
@ -212,7 +211,7 @@ public:
|
||||||
{
|
{
|
||||||
slot_tally.get_slot(); // wait for a free slot
|
slot_tally.get_slot(); // wait for a free slot
|
||||||
xlock( &imutex );
|
xlock( &imutex );
|
||||||
circular_ibuffer[receive_id % num_slots].init( data, size, receive_id );
|
circular_ibuffer[receive_id % num_slots].assign( data, size, receive_id );
|
||||||
++receive_id;
|
++receive_id;
|
||||||
xsignal( &iav_or_eof );
|
xsignal( &iav_or_eof );
|
||||||
xunlock( &imutex );
|
xunlock( &imutex );
|
||||||
|
@ -221,7 +220,6 @@ public:
|
||||||
// distribute a packet to a worker
|
// distribute a packet to a worker
|
||||||
Packet * distribute_packet()
|
Packet * distribute_packet()
|
||||||
{
|
{
|
||||||
Packet * ipacket = 0;
|
|
||||||
xlock( &imutex );
|
xlock( &imutex );
|
||||||
++icheck_counter;
|
++icheck_counter;
|
||||||
while( receive_id == distrib_id && !eof ) // no packets to distribute
|
while( receive_id == distrib_id && !eof ) // no packets to distribute
|
||||||
|
@ -230,15 +228,13 @@ public:
|
||||||
xwait( &iav_or_eof, &imutex );
|
xwait( &iav_or_eof, &imutex );
|
||||||
}
|
}
|
||||||
if( receive_id != distrib_id )
|
if( receive_id != distrib_id )
|
||||||
{ ipacket = &circular_ibuffer[distrib_id % num_slots]; ++distrib_id; }
|
{ Packet * ipacket = &circular_ibuffer[distrib_id % num_slots];
|
||||||
|
++distrib_id; xunlock( &imutex ); return ipacket; }
|
||||||
xunlock( &imutex );
|
xunlock( &imutex );
|
||||||
if( !ipacket ) // EOF
|
xlock( &omutex ); // notify muxer when last worker exits
|
||||||
{
|
if( --num_working == 0 ) xsignal( &oav_or_exit );
|
||||||
xlock( &omutex ); // notify muxer when last worker exits
|
xunlock( &omutex );
|
||||||
if( --num_working == 0 ) xsignal( &oav_or_exit );
|
return 0; // EOF
|
||||||
xunlock( &omutex );
|
|
||||||
}
|
|
||||||
return ipacket;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// collect a packet from a worker
|
// collect a packet from a worker
|
||||||
|
@ -307,30 +303,38 @@ public:
|
||||||
|
|
||||||
struct Worker_arg
|
struct Worker_arg
|
||||||
{
|
{
|
||||||
Packet_courier * courier;
|
Packet_courier & courier;
|
||||||
const Pretty_print * pp;
|
const Pretty_print & pp;
|
||||||
int dictionary_size;
|
const int dictionary_size;
|
||||||
int match_len_limit;
|
const int match_len_limit;
|
||||||
int offset;
|
const int offset;
|
||||||
|
Worker_arg( Packet_courier & co, const Pretty_print & pp_, const int dis,
|
||||||
|
const int mll, const int off )
|
||||||
|
: courier( co ), pp( pp_ ), dictionary_size( dis ),
|
||||||
|
match_len_limit( mll ), offset( off ) {}
|
||||||
};
|
};
|
||||||
|
|
||||||
struct Splitter_arg
|
struct Splitter_arg
|
||||||
{
|
{
|
||||||
struct Worker_arg worker_arg;
|
Worker_arg worker_arg;
|
||||||
pthread_t * worker_threads;
|
pthread_t * const worker_threads;
|
||||||
int infd;
|
const int data_size;
|
||||||
int data_size;
|
const int infd;
|
||||||
int num_workers; // returned by splitter to main thread
|
int num_workers; // returned by splitter to main thread
|
||||||
|
Splitter_arg( Packet_courier & co, const Pretty_print & pp_, const int dis,
|
||||||
|
const int mll, const int off, pthread_t * wt, const int das,
|
||||||
|
const int ifd, const int nw )
|
||||||
|
: worker_arg( co, pp_, dis, mll, off ), worker_threads( wt ),
|
||||||
|
data_size( das ), infd( ifd ), num_workers( nw ) {}
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
/* Get packets from courier, replace their contents, and return them to
|
// get packets from courier, replace their contents, and return them to courier
|
||||||
courier. */
|
|
||||||
extern "C" void * cworker( void * arg )
|
extern "C" void * cworker( void * arg )
|
||||||
{
|
{
|
||||||
const Worker_arg & tmp = *(const Worker_arg *)arg;
|
const Worker_arg & tmp = *(const Worker_arg *)arg;
|
||||||
Packet_courier & courier = *tmp.courier;
|
Packet_courier & courier = tmp.courier;
|
||||||
const Pretty_print & pp = *tmp.pp;
|
const Pretty_print & pp = tmp.pp;
|
||||||
const int dictionary_size = tmp.dictionary_size;
|
const int dictionary_size = tmp.dictionary_size;
|
||||||
const int match_len_limit = tmp.match_len_limit;
|
const int match_len_limit = tmp.match_len_limit;
|
||||||
const int offset = tmp.offset;
|
const int offset = tmp.offset;
|
||||||
|
@ -407,8 +411,8 @@ extern "C" void * cworker( void * arg )
|
||||||
extern "C" void * csplitter( void * arg )
|
extern "C" void * csplitter( void * arg )
|
||||||
{
|
{
|
||||||
Splitter_arg & tmp = *(Splitter_arg *)arg;
|
Splitter_arg & tmp = *(Splitter_arg *)arg;
|
||||||
Packet_courier & courier = *tmp.worker_arg.courier;
|
Packet_courier & courier = tmp.worker_arg.courier;
|
||||||
const Pretty_print & pp = *tmp.worker_arg.pp;
|
const Pretty_print & pp = tmp.worker_arg.pp;
|
||||||
pthread_t * const worker_threads = tmp.worker_threads;
|
pthread_t * const worker_threads = tmp.worker_threads;
|
||||||
const int offset = tmp.worker_arg.offset;
|
const int offset = tmp.worker_arg.offset;
|
||||||
const int infd = tmp.infd;
|
const int infd = tmp.infd;
|
||||||
|
@ -436,11 +440,7 @@ extern "C" void * csplitter( void * arg )
|
||||||
}
|
}
|
||||||
if( size < data_size ) break; // EOF
|
if( size < data_size ) break; // EOF
|
||||||
}
|
}
|
||||||
else
|
else { delete[] data; break; }
|
||||||
{
|
|
||||||
delete[] data;
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
courier.finish( tmp.num_workers - i ); // no more packets to send
|
courier.finish( tmp.num_workers - i ); // no more packets to send
|
||||||
tmp.num_workers = i;
|
tmp.num_workers = i;
|
||||||
|
@ -465,7 +465,7 @@ void muxer( Packet_courier & courier, const Pretty_print & pp, const int outfd )
|
||||||
out_size += opacket->size;
|
out_size += opacket->size;
|
||||||
|
|
||||||
if( writeblock( outfd, opacket->data, opacket->size ) != opacket->size )
|
if( writeblock( outfd, opacket->data, opacket->size ) != opacket->size )
|
||||||
{ pp(); show_error( "Write error", errno ); cleanup_and_fail(); }
|
{ pp(); show_error( write_error_msg, errno ); cleanup_and_fail(); }
|
||||||
delete[] opacket->data;
|
delete[] opacket->data;
|
||||||
courier.return_empty_packet();
|
courier.return_empty_packet();
|
||||||
}
|
}
|
||||||
|
@ -475,8 +475,7 @@ void muxer( Packet_courier & courier, const Pretty_print & pp, const int outfd )
|
||||||
} // end namespace
|
} // end namespace
|
||||||
|
|
||||||
|
|
||||||
/* Init the courier, then start the splitter and the workers and call the
|
// init the courier, then start the splitter and the workers and call the muxer
|
||||||
muxer. */
|
|
||||||
int compress( const unsigned long long cfile_size,
|
int compress( const unsigned long long cfile_size,
|
||||||
const int data_size, const int dictionary_size,
|
const int data_size, const int dictionary_size,
|
||||||
const int match_len_limit, const int num_workers,
|
const int match_len_limit, const int num_workers,
|
||||||
|
@ -496,16 +495,8 @@ int compress( const unsigned long long cfile_size,
|
||||||
pthread_t * worker_threads = new( std::nothrow ) pthread_t[num_workers];
|
pthread_t * worker_threads = new( std::nothrow ) pthread_t[num_workers];
|
||||||
if( !worker_threads ) { pp( mem_msg ); return 1; }
|
if( !worker_threads ) { pp( mem_msg ); return 1; }
|
||||||
|
|
||||||
Splitter_arg splitter_arg;
|
Splitter_arg splitter_arg( courier, pp, dictionary_size, match_len_limit,
|
||||||
splitter_arg.worker_arg.courier = &courier;
|
offset, worker_threads, data_size, infd, num_workers );
|
||||||
splitter_arg.worker_arg.pp = &pp;
|
|
||||||
splitter_arg.worker_arg.dictionary_size = dictionary_size;
|
|
||||||
splitter_arg.worker_arg.match_len_limit = match_len_limit;
|
|
||||||
splitter_arg.worker_arg.offset = offset;
|
|
||||||
splitter_arg.worker_threads = worker_threads;
|
|
||||||
splitter_arg.infd = infd;
|
|
||||||
splitter_arg.data_size = data_size;
|
|
||||||
splitter_arg.num_workers = num_workers;
|
|
||||||
|
|
||||||
pthread_t splitter_thread;
|
pthread_t splitter_thread;
|
||||||
int errcode = pthread_create( &splitter_thread, 0, csplitter, &splitter_arg );
|
int errcode = pthread_create( &splitter_thread, 0, csplitter, &splitter_arg );
|
||||||
|
|
4
configure
vendored
|
@ -6,7 +6,7 @@
|
||||||
# to copy, distribute, and modify it.
|
# to copy, distribute, and modify it.
|
||||||
|
|
||||||
pkgname=plzip
|
pkgname=plzip
|
||||||
pkgversion=1.11
|
pkgversion=1.12-rc1
|
||||||
progname=plzip
|
progname=plzip
|
||||||
with_mingw=
|
with_mingw=
|
||||||
srctrigger=doc/${pkgname}.texi
|
srctrigger=doc/${pkgname}.texi
|
||||||
|
@ -115,7 +115,7 @@ while [ $# != 0 ] ; do
|
||||||
exit 1 ;;
|
exit 1 ;;
|
||||||
esac
|
esac
|
||||||
|
|
||||||
# Check if the option took a separate argument
|
# Check whether the option took a separate argument
|
||||||
if [ "${arg2}" = yes ] ; then
|
if [ "${arg2}" = yes ] ; then
|
||||||
if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
|
if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
|
||||||
else echo "configure: Missing argument to '${option}'" 1>&2
|
else echo "configure: Missing argument to '${option}'" 1>&2
|
||||||
|
|
119
dec_stdout.cc
|
@ -46,10 +46,10 @@ struct Packet // data block
|
||||||
uint8_t * data; // data may be null if size == 0
|
uint8_t * data; // data may be null if size == 0
|
||||||
int size; // number of bytes in data (if any)
|
int size; // number of bytes in data (if any)
|
||||||
bool eom; // end of member
|
bool eom; // end of member
|
||||||
Packet() : data( 0 ), size( 0 ), eom( true ) {}
|
Packet() : data( 0 ), size( 0 ), eom( false ) {}
|
||||||
Packet( uint8_t * const d, const int s, const bool e )
|
Packet( uint8_t * const d, const int s, const bool e )
|
||||||
: data( d ), size( s ), eom ( e ) {}
|
: data( d ), size( s ), eom ( e ) {}
|
||||||
~Packet() { if( data ) delete[] data; }
|
void delete_data() { if( data ) { delete[] data; data = 0; } }
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
|
@ -59,8 +59,8 @@ public:
|
||||||
unsigned ocheck_counter;
|
unsigned ocheck_counter;
|
||||||
unsigned owait_counter;
|
unsigned owait_counter;
|
||||||
private:
|
private:
|
||||||
int deliver_worker_id; // worker queue currently delivering packets
|
int deliver_id; // worker queue currently delivering packets
|
||||||
std::vector< std::queue< const Packet * > > opacket_queues;
|
std::vector< std::queue< Packet > > opacket_queues;
|
||||||
int num_working; // number of workers still running
|
int num_working; // number of workers still running
|
||||||
const int num_workers; // number of workers
|
const int num_workers; // number of workers
|
||||||
const unsigned out_slots; // max output packets per queue
|
const unsigned out_slots; // max output packets per queue
|
||||||
|
@ -75,10 +75,9 @@ private:
|
||||||
public:
|
public:
|
||||||
Packet_courier( const Shared_retval & sh_ret, const int workers,
|
Packet_courier( const Shared_retval & sh_ret, const int workers,
|
||||||
const int slots )
|
const int slots )
|
||||||
: ocheck_counter( 0 ), owait_counter( 0 ), deliver_worker_id( 0 ),
|
: ocheck_counter( 0 ), owait_counter( 0 ), deliver_id( 0 ),
|
||||||
opacket_queues( workers ), num_working( workers ),
|
opacket_queues( workers ), num_working( workers ), num_workers( workers ),
|
||||||
num_workers( workers ), out_slots( slots ), slot_av( workers ),
|
out_slots( slots ), slot_av( workers ), shared_retval( sh_ret )
|
||||||
shared_retval( sh_ret )
|
|
||||||
{
|
{
|
||||||
xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
|
xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
|
||||||
for( unsigned i = 0; i < slot_av.size(); ++i ) xinit_cond( &slot_av[i] );
|
for( unsigned i = 0; i < slot_av.size(); ++i ) xinit_cond( &slot_av[i] );
|
||||||
|
@ -89,7 +88,7 @@ public:
|
||||||
if( shared_retval() ) // cleanup to avoid memory leaks
|
if( shared_retval() ) // cleanup to avoid memory leaks
|
||||||
for( int i = 0; i < num_workers; ++i )
|
for( int i = 0; i < num_workers; ++i )
|
||||||
while( !opacket_queues[i].empty() )
|
while( !opacket_queues[i].empty() )
|
||||||
{ delete opacket_queues[i].front(); opacket_queues[i].pop(); }
|
{ opacket_queues[i].front().delete_data(); opacket_queues[i].pop(); }
|
||||||
for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy_cond( &slot_av[i] );
|
for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy_cond( &slot_av[i] );
|
||||||
xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
|
xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
|
||||||
}
|
}
|
||||||
|
@ -102,49 +101,47 @@ public:
|
||||||
xunlock( &omutex );
|
xunlock( &omutex );
|
||||||
}
|
}
|
||||||
|
|
||||||
// collect a packet from a worker, discard packet on error
|
// make a packet with data received from a worker, discard data on error
|
||||||
void collect_packet( const Packet * const opacket, const int worker_id )
|
void collect_packet( const int worker_id, uint8_t * const data,
|
||||||
|
const int size, const bool eom )
|
||||||
{
|
{
|
||||||
|
Packet opacket( data, size, eom );
|
||||||
xlock( &omutex );
|
xlock( &omutex );
|
||||||
if( opacket->data )
|
if( data )
|
||||||
while( opacket_queues[worker_id].size() >= out_slots )
|
while( opacket_queues[worker_id].size() >= out_slots )
|
||||||
{
|
{
|
||||||
if( shared_retval() ) { delete opacket; goto done; }
|
if( shared_retval() ) { delete[] data; goto out; }
|
||||||
xwait( &slot_av[worker_id], &omutex );
|
xwait( &slot_av[worker_id], &omutex );
|
||||||
}
|
}
|
||||||
opacket_queues[worker_id].push( opacket );
|
opacket_queues[worker_id].push( opacket );
|
||||||
if( worker_id == deliver_worker_id ) xsignal( &oav_or_exit );
|
if( worker_id == deliver_id ) xsignal( &oav_or_exit );
|
||||||
done:
|
out: xunlock( &omutex );
|
||||||
xunlock( &omutex );
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/* deliver a packet to muxer
|
/* deliver packets to muxer
|
||||||
if packet->eom, move to next queue
|
if opacket.eom, move to next queue
|
||||||
if packet data == 0, wait again */
|
if opacket.data == 0, skip opacket */
|
||||||
const Packet * deliver_packet()
|
void deliver_packets( std::vector< Packet > & packet_vector )
|
||||||
{
|
{
|
||||||
const Packet * opacket = 0;
|
packet_vector.clear();
|
||||||
xlock( &omutex );
|
xlock( &omutex );
|
||||||
++ocheck_counter;
|
++ocheck_counter;
|
||||||
while( true )
|
do {
|
||||||
{
|
while( opacket_queues[deliver_id].empty() && num_working > 0 )
|
||||||
while( opacket_queues[deliver_worker_id].empty() && num_working > 0 )
|
{ ++owait_counter; xwait( &oav_or_exit, &omutex ); }
|
||||||
|
while( true )
|
||||||
{
|
{
|
||||||
++owait_counter;
|
if( opacket_queues[deliver_id].empty() ) break;
|
||||||
xwait( &oav_or_exit, &omutex );
|
Packet opacket = opacket_queues[deliver_id].front();
|
||||||
|
opacket_queues[deliver_id].pop();
|
||||||
|
if( opacket_queues[deliver_id].size() + 1 == out_slots )
|
||||||
|
xsignal( &slot_av[deliver_id] );
|
||||||
|
if( opacket.eom && ++deliver_id >= num_workers ) deliver_id = 0;
|
||||||
|
if( opacket.data ) packet_vector.push_back( opacket );
|
||||||
}
|
}
|
||||||
if( opacket_queues[deliver_worker_id].empty() ) break;
|
|
||||||
opacket = opacket_queues[deliver_worker_id].front();
|
|
||||||
opacket_queues[deliver_worker_id].pop();
|
|
||||||
if( opacket_queues[deliver_worker_id].size() + 1 == out_slots )
|
|
||||||
xsignal( &slot_av[deliver_worker_id] );
|
|
||||||
if( opacket->eom && ++deliver_worker_id >= num_workers )
|
|
||||||
deliver_worker_id = 0;
|
|
||||||
if( opacket->data ) break;
|
|
||||||
delete opacket; opacket = 0;
|
|
||||||
}
|
}
|
||||||
|
while( packet_vector.empty() && num_working > 0 );
|
||||||
xunlock( &omutex );
|
xunlock( &omutex );
|
||||||
return opacket;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
bool finished() // all packets delivered to muxer
|
bool finished() // all packets delivered to muxer
|
||||||
|
@ -163,9 +160,14 @@ struct Worker_arg
|
||||||
Packet_courier * courier;
|
Packet_courier * courier;
|
||||||
const Pretty_print * pp;
|
const Pretty_print * pp;
|
||||||
Shared_retval * shared_retval;
|
Shared_retval * shared_retval;
|
||||||
int worker_id;
|
|
||||||
int num_workers;
|
|
||||||
int infd;
|
int infd;
|
||||||
|
int num_workers;
|
||||||
|
int worker_id;
|
||||||
|
void assign( const Lzip_index & li, Packet_courier & co,
|
||||||
|
const Pretty_print & pp_, Shared_retval & sr,
|
||||||
|
const int ifd, const int nw, const int wi )
|
||||||
|
{ lzip_index = &li; courier = &co; pp = &pp_; shared_retval = &sr;
|
||||||
|
infd = ifd; num_workers = nw; worker_id = wi; }
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
|
@ -179,9 +181,9 @@ extern "C" void * dworker_o( void * arg )
|
||||||
Packet_courier & courier = *tmp.courier;
|
Packet_courier & courier = *tmp.courier;
|
||||||
const Pretty_print & pp = *tmp.pp;
|
const Pretty_print & pp = *tmp.pp;
|
||||||
Shared_retval & shared_retval = *tmp.shared_retval;
|
Shared_retval & shared_retval = *tmp.shared_retval;
|
||||||
const int worker_id = tmp.worker_id;
|
|
||||||
const int num_workers = tmp.num_workers;
|
|
||||||
const int infd = tmp.infd;
|
const int infd = tmp.infd;
|
||||||
|
const int num_workers = tmp.num_workers;
|
||||||
|
const int worker_id = tmp.worker_id;
|
||||||
const int buffer_size = 65536;
|
const int buffer_size = 65536;
|
||||||
|
|
||||||
int new_pos = 0;
|
int new_pos = 0;
|
||||||
|
@ -231,12 +233,11 @@ extern "C" void * dworker_o( void * arg )
|
||||||
const bool eom = LZ_decompress_finished( decoder ) == 1;
|
const bool eom = LZ_decompress_finished( decoder ) == 1;
|
||||||
if( new_pos == max_packet_size || eom ) // make data packet
|
if( new_pos == max_packet_size || eom ) // make data packet
|
||||||
{
|
{
|
||||||
const Packet * const opacket =
|
courier.collect_packet( worker_id, ( new_pos > 0 ) ? new_data : 0,
|
||||||
new Packet( ( new_pos > 0 ) ? new_data : 0, new_pos, eom );
|
new_pos, eom );
|
||||||
courier.collect_packet( opacket, worker_id );
|
|
||||||
if( new_pos > 0 ) { new_pos = 0; new_data = 0; }
|
if( new_pos > 0 ) { new_pos = 0; new_data = 0; }
|
||||||
if( eom )
|
if( eom )
|
||||||
{ LZ_decompress_reset( decoder ); // prepare for new member
|
{ LZ_decompress_reset( decoder ); // prepare for next member
|
||||||
break; }
|
break; }
|
||||||
}
|
}
|
||||||
if( rd == 0 ) break;
|
if( rd == 0 ) break;
|
||||||
|
@ -262,23 +263,28 @@ done:
|
||||||
void muxer( Packet_courier & courier, const Pretty_print & pp,
|
void muxer( Packet_courier & courier, const Pretty_print & pp,
|
||||||
Shared_retval & shared_retval, const int outfd )
|
Shared_retval & shared_retval, const int outfd )
|
||||||
{
|
{
|
||||||
|
std::vector< Packet > packet_vector;
|
||||||
while( true )
|
while( true )
|
||||||
{
|
{
|
||||||
const Packet * const opacket = courier.deliver_packet();
|
courier.deliver_packets( packet_vector );
|
||||||
if( !opacket ) break; // queue is empty. all workers exited
|
if( packet_vector.empty() ) break; // queue is empty. all workers exited
|
||||||
|
|
||||||
if( shared_retval() == 0 &&
|
for( unsigned i = 0; i < packet_vector.size(); ++i )
|
||||||
writeblock( outfd, opacket->data, opacket->size ) != opacket->size &&
|
{
|
||||||
shared_retval.set_value( 1 ) )
|
Packet & opacket = packet_vector[i];
|
||||||
{ pp(); show_error( "Write error", errno ); }
|
if( shared_retval() == 0 &&
|
||||||
delete opacket;
|
writeblock( outfd, opacket.data, opacket.size ) != opacket.size &&
|
||||||
|
shared_retval.set_value( 1 ) )
|
||||||
|
{ pp(); show_error( write_error_msg, errno ); }
|
||||||
|
opacket.delete_data();
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
} // end namespace
|
} // end namespace
|
||||||
|
|
||||||
|
|
||||||
// init the courier, then start the workers and call the muxer.
|
// init the courier, then start the workers and call the muxer
|
||||||
int dec_stdout( const int num_workers, const int infd, const int outfd,
|
int dec_stdout( const int num_workers, const int infd, const int outfd,
|
||||||
const Pretty_print & pp, const int debug_level,
|
const Pretty_print & pp, const int debug_level,
|
||||||
const int out_slots, const Lzip_index & lzip_index )
|
const int out_slots, const Lzip_index & lzip_index )
|
||||||
|
@ -294,13 +300,8 @@ int dec_stdout( const int num_workers, const int infd, const int outfd,
|
||||||
int i = 0; // number of workers started
|
int i = 0; // number of workers started
|
||||||
for( ; i < num_workers; ++i )
|
for( ; i < num_workers; ++i )
|
||||||
{
|
{
|
||||||
worker_args[i].lzip_index = &lzip_index;
|
worker_args[i].assign( lzip_index, courier, pp, shared_retval, infd,
|
||||||
worker_args[i].courier = &courier;
|
num_workers, i );
|
||||||
worker_args[i].pp = &pp;
|
|
||||||
worker_args[i].shared_retval = &shared_retval;
|
|
||||||
worker_args[i].worker_id = i;
|
|
||||||
worker_args[i].num_workers = num_workers;
|
|
||||||
worker_args[i].infd = infd;
|
|
||||||
const int errcode =
|
const int errcode =
|
||||||
pthread_create( &worker_threads[i], 0, dworker_o, &worker_args[i] );
|
pthread_create( &worker_threads[i], 0, dworker_o, &worker_args[i] );
|
||||||
if( errcode )
|
if( errcode )
|
||||||
|
|
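The dec_stdout.cc hunks above replace `deliver_packet`, which handed the muxer one packet per lock acquisition, with `deliver_packets`, which fills a vector with every packet that is ready. The sketch below illustrates only the draining order described in the new comment (advance to the next queue on `eom`, skip packets with no data). It is a simplified, single-threaded illustration and not the upstream code: the mutex, the condition variables and the `out_slots` signalling are left out, and `drain_ready` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <vector>

// Simplified stand-in for the Packet struct in the diff (assumption: the
// buffer is not owned here, so nothing is freed in this sketch).
struct Packet
  {
  const uint8_t * data;		// null marks a packet carrying only 'eom'
  int size;			// number of bytes in data (if any)
  bool eom;			// end of member
  Packet( const uint8_t * const d, const int s, const bool e )
    : data( d ), size( s ), eom( e ) {}
  };

/* Hypothetical helper: move every packet that is ready into 'out',
   following the member order. Returns the queue to resume from. */
int drain_ready( std::vector< std::queue< Packet > > & queues,
                 int deliver_id, std::vector< Packet > & out )
  {
  const int num_workers = static_cast< int >( queues.size() );
  assert( deliver_id >= 0 && deliver_id < num_workers );
  out.clear();
  while( !queues[deliver_id].empty() )
    {
    const Packet opacket = queues[deliver_id].front();
    queues[deliver_id].pop();
    // the end of a member switches delivery to the next worker queue
    if( opacket.eom && ++deliver_id >= num_workers ) deliver_id = 0;
    if( opacket.data ) out.push_back( opacket );	// skip empty packets
    }
  return deliver_id;
  }
```

In the real function the loop runs with `omutex` held, so one lock/unlock pair can now cover several packets instead of one.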
191
dec_stream.cc
|
@ -54,10 +54,10 @@ struct Packet // data block
|
||||||
uint8_t * data; // data may be null if size == 0
|
uint8_t * data; // data may be null if size == 0
|
||||||
int size; // number of bytes in data (if any)
|
int size; // number of bytes in data (if any)
|
||||||
bool eom; // end of member
|
bool eom; // end of member
|
||||||
Packet() : data( 0 ), size( 0 ), eom( true ) {}
|
Packet() : data( 0 ), size( 0 ), eom( false ) {}
|
||||||
Packet( uint8_t * const d, const int s, const bool e )
|
Packet( uint8_t * const d, const int s, const bool e )
|
||||||
: data( d ), size( s ), eom ( e ) {}
|
: data( d ), size( s ), eom ( e ) {}
|
||||||
~Packet() { if( data ) delete[] data; }
|
void delete_data() { if( data ) { delete[] data; data = 0; } }
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
|
@ -69,11 +69,11 @@ public:
|
||||||
unsigned ocheck_counter;
|
unsigned ocheck_counter;
|
||||||
unsigned owait_counter;
|
unsigned owait_counter;
|
||||||
private:
|
private:
|
||||||
int receive_worker_id; // worker queue currently receiving packets
|
int receive_id; // worker queue currently receiving packets
|
||||||
int deliver_worker_id; // worker queue currently delivering packets
|
int deliver_id; // worker queue currently delivering packets
|
||||||
Slot_tally slot_tally; // limits the number of input packets
|
Slot_tally slot_tally; // limits the number of input packets
|
||||||
std::vector< std::queue< const Packet * > > ipacket_queues;
|
std::vector< std::queue< Packet > > ipacket_queues;
|
||||||
std::vector< std::queue< const Packet * > > opacket_queues;
|
std::vector< std::queue< Packet > > opacket_queues;
|
||||||
int num_working; // number of workers still running
|
int num_working; // number of workers still running
|
||||||
const int num_workers; // number of workers
|
const int num_workers; // number of workers
|
||||||
const unsigned out_slots; // max output packets per queue
|
const unsigned out_slots; // max output packets per queue
|
||||||
|
@ -94,11 +94,11 @@ public:
|
||||||
const int in_slots, const int oslots )
|
const int in_slots, const int oslots )
|
||||||
: icheck_counter( 0 ), iwait_counter( 0 ),
|
: icheck_counter( 0 ), iwait_counter( 0 ),
|
||||||
ocheck_counter( 0 ), owait_counter( 0 ),
|
ocheck_counter( 0 ), owait_counter( 0 ),
|
||||||
receive_worker_id( 0 ), deliver_worker_id( 0 ),
|
receive_id( 0 ), deliver_id( 0 ), slot_tally( in_slots ),
|
||||||
slot_tally( in_slots ), ipacket_queues( workers ),
|
ipacket_queues( workers ), opacket_queues( workers ),
|
||||||
opacket_queues( workers ), num_working( workers ),
|
num_working( workers ), num_workers( workers ),
|
||||||
num_workers( workers ), out_slots( oslots ), slot_av( workers ),
|
out_slots( oslots ), slot_av( workers ), shared_retval( sh_ret ),
|
||||||
shared_retval( sh_ret ), eof( false ), trailing_data_found_( false )
|
eof( false ), trailing_data_found_( false )
|
||||||
{
|
{
|
||||||
xinit_mutex( &imutex ); xinit_cond( &iav_or_eof );
|
xinit_mutex( &imutex ); xinit_cond( &iav_or_eof );
|
||||||
xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
|
xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
|
||||||
|
@ -111,9 +111,9 @@ public:
|
||||||
for( int i = 0; i < num_workers; ++i )
|
for( int i = 0; i < num_workers; ++i )
|
||||||
{
|
{
|
||||||
while( !ipacket_queues[i].empty() )
|
while( !ipacket_queues[i].empty() )
|
||||||
{ delete ipacket_queues[i].front(); ipacket_queues[i].pop(); }
|
{ ipacket_queues[i].front().delete_data(); ipacket_queues[i].pop(); }
|
||||||
while( !opacket_queues[i].empty() )
|
while( !opacket_queues[i].empty() )
|
||||||
{ delete opacket_queues[i].front(); opacket_queues[i].pop(); }
|
{ opacket_queues[i].front().delete_data(); opacket_queues[i].pop(); }
|
||||||
}
|
}
|
||||||
for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy_cond( &slot_av[i] );
|
for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy_cond( &slot_av[i] );
|
||||||
xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
|
xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
|
||||||
|
@ -125,19 +125,18 @@ public:
|
||||||
void receive_packet( uint8_t * const data, const int size, const bool eom )
|
void receive_packet( uint8_t * const data, const int size, const bool eom )
|
||||||
{
|
{
|
||||||
if( shared_retval() ) { delete[] data; return; } // discard packet on error
|
if( shared_retval() ) { delete[] data; return; } // discard packet on error
|
||||||
const Packet * const ipacket = new Packet( data, size, eom );
|
const Packet ipacket( data, size, eom );
|
||||||
slot_tally.get_slot(); // wait for a free slot
|
slot_tally.get_slot(); // wait for a free slot
|
||||||
xlock( &imutex );
|
xlock( &imutex );
|
||||||
ipacket_queues[receive_worker_id].push( ipacket );
|
ipacket_queues[receive_id].push( ipacket );
|
||||||
xbroadcast( &iav_or_eof );
|
xbroadcast( &iav_or_eof );
|
||||||
xunlock( &imutex );
|
xunlock( &imutex );
|
||||||
if( eom && ++receive_worker_id >= num_workers ) receive_worker_id = 0;
|
if( eom && ++receive_id >= num_workers ) receive_id = 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
// distribute a packet to a worker
|
// distribute a packet to a worker
|
||||||
const Packet * distribute_packet( const int worker_id )
|
Packet distribute_packet( const int worker_id )
|
||||||
{
|
{
|
||||||
const Packet * ipacket = 0;
|
|
||||||
xlock( &imutex );
|
xlock( &imutex );
|
||||||
++icheck_counter;
|
++icheck_counter;
|
||||||
while( ipacket_queues[worker_id].empty() && !eof )
|
while( ipacket_queues[worker_id].empty() && !eof )
|
||||||
|
@ -147,63 +146,58 @@ public:
|
||||||
}
|
}
|
||||||
if( !ipacket_queues[worker_id].empty() )
|
if( !ipacket_queues[worker_id].empty() )
|
||||||
{
|
{
|
||||||
ipacket = ipacket_queues[worker_id].front();
|
const Packet ipacket = ipacket_queues[worker_id].front();
|
||||||
ipacket_queues[worker_id].pop();
|
ipacket_queues[worker_id].pop();
|
||||||
|
xunlock( &imutex ); slot_tally.leave_slot(); return ipacket;
|
||||||
}
|
}
|
||||||
xunlock( &imutex );
|
xunlock( &imutex ); // no more packets
|
||||||
if( ipacket ) slot_tally.leave_slot();
|
xlock( &omutex ); // notify muxer when last worker exits
|
||||||
else // no more packets
|
if( --num_working == 0 ) xsignal( &oav_or_exit );
|
||||||
{
|
xunlock( &omutex );
|
||||||
xlock( &omutex ); // notify muxer when last worker exits
|
return Packet();
|
||||||
if( --num_working == 0 ) xsignal( &oav_or_exit );
|
|
||||||
xunlock( &omutex );
|
|
||||||
}
|
|
||||||
return ipacket;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// collect a packet from a worker, discard packet on error
|
// make a packet with data received from a worker, discard data on error
|
||||||
void collect_packet( const Packet * const opacket, const int worker_id )
|
void collect_packet( const int worker_id, uint8_t * const data,
|
||||||
|
const int size, const bool eom )
|
||||||
{
|
{
|
||||||
|
Packet opacket( data, size, eom );
|
||||||
xlock( &omutex );
|
xlock( &omutex );
|
||||||
if( opacket->data )
|
if( data )
|
||||||
while( opacket_queues[worker_id].size() >= out_slots )
|
while( opacket_queues[worker_id].size() >= out_slots )
|
||||||
{
|
{
|
||||||
if( shared_retval() ) { delete opacket; goto done; }
|
if( shared_retval() ) { delete[] data; goto out; }
|
||||||
xwait( &slot_av[worker_id], &omutex );
|
xwait( &slot_av[worker_id], &omutex );
|
||||||
}
|
}
|
||||||
opacket_queues[worker_id].push( opacket );
|
opacket_queues[worker_id].push( opacket );
|
||||||
if( worker_id == deliver_worker_id ) xsignal( &oav_or_exit );
|
if( worker_id == deliver_id ) xsignal( &oav_or_exit );
|
||||||
done:
|
out: xunlock( &omutex );
|
||||||
xunlock( &omutex );
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/* deliver a packet to muxer
|
/* deliver packets to muxer
|
||||||
if packet->eom, move to next queue
|
if opacket.eom, move to next queue
|
||||||
if packet data == 0, wait again */
|
if opacket.data == 0, skip opacket */
|
||||||
const Packet * deliver_packet()
|
void deliver_packets( std::vector< Packet > & packet_vector )
|
||||||
{
|
{
|
||||||
const Packet * opacket = 0;
|
packet_vector.clear();
|
||||||
xlock( &omutex );
|
xlock( &omutex );
|
||||||
++ocheck_counter;
|
++ocheck_counter;
|
||||||
while( true )
|
do {
|
||||||
{
|
while( opacket_queues[deliver_id].empty() && num_working > 0 )
|
||||||
while( opacket_queues[deliver_worker_id].empty() && num_working > 0 )
|
{ ++owait_counter; xwait( &oav_or_exit, &omutex ); }
|
||||||
|
while( true )
|
||||||
{
|
{
|
||||||
++owait_counter;
|
if( opacket_queues[deliver_id].empty() ) break;
|
||||||
xwait( &oav_or_exit, &omutex );
|
Packet opacket = opacket_queues[deliver_id].front();
|
||||||
|
opacket_queues[deliver_id].pop();
|
||||||
|
if( opacket_queues[deliver_id].size() + 1 == out_slots )
|
||||||
|
xsignal( &slot_av[deliver_id] );
|
||||||
|
if( opacket.eom && ++deliver_id >= num_workers ) deliver_id = 0;
|
||||||
|
if( opacket.data ) packet_vector.push_back( opacket );
|
||||||
}
|
}
|
||||||
if( opacket_queues[deliver_worker_id].empty() ) break;
|
|
||||||
opacket = opacket_queues[deliver_worker_id].front();
|
|
||||||
opacket_queues[deliver_worker_id].pop();
|
|
||||||
if( opacket_queues[deliver_worker_id].size() + 1 == out_slots )
|
|
||||||
xsignal( &slot_av[deliver_worker_id] );
|
|
||||||
if( opacket->eom && ++deliver_worker_id >= num_workers )
|
|
||||||
deliver_worker_id = 0;
|
|
||||||
if( opacket->data ) break;
|
|
||||||
delete opacket; opacket = 0;
|
|
||||||
}
|
}
|
||||||
|
while( packet_vector.empty() && num_working > 0 );
|
||||||
xunlock( &omutex );
|
xunlock( &omutex );
|
||||||
return opacket;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
void add_sizes( const unsigned long long partial_in_size,
|
void add_sizes( const unsigned long long partial_in_size,
|
||||||
|
@ -252,17 +246,29 @@ struct Worker_arg
|
||||||
bool loose_trailing;
|
bool loose_trailing;
|
||||||
bool testing;
|
bool testing;
|
||||||
bool nocopy; // avoid copying decompressed data when testing
|
bool nocopy; // avoid copying decompressed data when testing
|
||||||
|
void assign( Packet_courier & co, const Pretty_print & pp_,
|
||||||
|
Shared_retval & sr, const bool it, const bool lt,
|
||||||
|
const bool t, const bool nc )
|
||||||
|
{ courier = &co; pp = &pp_; shared_retval = &sr; worker_id = 0;
|
||||||
|
ignore_trailing = it; loose_trailing = lt; testing = t; nocopy = nc; }
|
||||||
};
|
};
|
||||||
|
|
||||||
struct Splitter_arg
|
struct Splitter_arg
|
||||||
{
|
{
|
||||||
struct Worker_arg worker_arg;
|
Worker_arg worker_arg;
|
||||||
Worker_arg * worker_args;
|
Worker_arg * const worker_args;
|
||||||
pthread_t * worker_threads;
|
pthread_t * const worker_threads;
|
||||||
unsigned long long cfile_size;
|
const unsigned long long cfile_size;
|
||||||
int infd;
|
const int infd;
|
||||||
unsigned dictionary_size; // returned by splitter to main thread
|
unsigned dictionary_size; // returned by splitter to main thread
|
||||||
int num_workers; // returned by splitter to main thread
|
int num_workers; // returned by splitter to main thread
|
||||||
|
Splitter_arg( Packet_courier & co, const Pretty_print & pp_,
|
||||||
|
Shared_retval & sr, const bool it, const bool lt,
|
||||||
|
const bool t, const bool nc, Worker_arg * wa, pthread_t * wt,
|
||||||
|
const unsigned long long cfs, const int ifd, const int nw )
|
||||||
|
: worker_args( wa ), worker_threads( wt ), cfile_size( cfs ),
|
||||||
|
infd( ifd ), dictionary_size( 0 ), num_workers( nw )
|
||||||
|
{ worker_arg.assign( co, pp_, sr, it, lt, t, nc ); }
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
|
@ -291,22 +297,22 @@ extern "C" void * dworker_s( void * arg )
|
||||||
|
|
||||||
while( true )
|
while( true )
|
||||||
{
|
{
|
||||||
const Packet * const ipacket = courier.distribute_packet( worker_id );
|
Packet ipacket = courier.distribute_packet( worker_id );
|
||||||
if( !ipacket ) break; // no more packets to process
|
if( !ipacket.data ) break; // no more packets to process
|
||||||
|
|
||||||
int written = 0;
|
int written = 0;
|
||||||
while( !draining ) // else discard trailing data or drain queue
|
while( !draining ) // else discard trailing data or drain queue
|
||||||
{
|
{
|
||||||
if( LZ_decompress_write_size( decoder ) > 0 && written < ipacket->size )
|
if( LZ_decompress_write_size( decoder ) > 0 && written < ipacket.size )
|
||||||
{
|
{
|
||||||
const int wr = LZ_decompress_write( decoder, ipacket->data + written,
|
const int wr = LZ_decompress_write( decoder, ipacket.data + written,
|
||||||
ipacket->size - written );
|
ipacket.size - written );
|
||||||
if( wr < 0 ) internal_error( "library error (LZ_decompress_write)." );
|
if( wr < 0 ) internal_error( "library error (LZ_decompress_write)." );
|
||||||
written += wr;
|
written += wr;
|
||||||
if( written > ipacket->size )
|
if( written > ipacket.size )
|
||||||
internal_error( "ipacket size exceeded in worker." );
|
internal_error( "ipacket size exceeded in worker." );
|
||||||
}
|
}
|
||||||
if( ipacket->eom && written == ipacket->size )
|
if( ipacket.eom && written == ipacket.size )
|
||||||
LZ_decompress_finish( decoder );
|
LZ_decompress_finish( decoder );
|
||||||
unsigned long long total_in = 0; // detect empty member + corrupt header
|
unsigned long long total_in = 0; // detect empty member + corrupt header
|
||||||
while( !draining ) // read and pack decompressed data
|
while( !draining ) // read and pack decompressed data
|
||||||
|
@ -353,14 +359,13 @@ extern "C" void * dworker_s( void * arg )
|
||||||
{
|
{
|
||||||
if( !testing ) // make data packet
|
if( !testing ) // make data packet
|
||||||
{
|
{
|
||||||
const Packet * const opacket =
|
courier.collect_packet( worker_id, ( new_pos > 0 ) ? new_data : 0,
|
||||||
new Packet( ( new_pos > 0 ) ? new_data : 0, new_pos, eom );
|
new_pos, eom );
|
||||||
courier.collect_packet( opacket, worker_id );
|
|
||||||
if( new_pos > 0 ) new_data = 0;
|
if( new_pos > 0 ) new_data = 0;
|
||||||
}
|
}
|
||||||
new_pos = 0;
|
new_pos = 0;
|
||||||
if( eom )
|
if( eom )
|
||||||
{ LZ_decompress_reset( decoder ); // prepare for new member
|
{ LZ_decompress_reset( decoder ); // prepare for next member
|
||||||
break; }
|
break; }
|
||||||
}
|
}
|
||||||
if( rd == 0 )
|
if( rd == 0 )
|
||||||
|
@ -369,9 +374,9 @@ extern "C" void * dworker_s( void * arg )
|
||||||
if( total_in == size ) break; else total_in = size;
|
if( total_in == size ) break; else total_in = size;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if( !ipacket->data || written == ipacket->size ) break;
|
if( !ipacket.data || written == ipacket.size ) break;
|
||||||
}
|
}
|
||||||
delete ipacket;
|
ipacket.delete_data();
|
||||||
}
|
}
|
||||||
|
|
||||||
if( new_data ) delete[] new_data;
|
if( new_data ) delete[] new_data;
|
||||||
|
@ -404,7 +409,7 @@ bool start_worker( const Worker_arg & worker_arg,
|
||||||
packaging and distribution to workers.
|
packaging and distribution to workers.
|
||||||
Start a worker per member up to a maximum of num_workers.
|
Start a worker per member up to a maximum of num_workers.
|
||||||
*/
|
*/
|
||||||
extern "C" void * dsplitter_s( void * arg )
|
extern "C" void * dsplitter( void * arg )
|
||||||
{
|
{
|
||||||
Splitter_arg & tmp = *(Splitter_arg *)arg;
|
Splitter_arg & tmp = *(Splitter_arg *)arg;
|
||||||
const Worker_arg & worker_arg = tmp.worker_arg;
|
const Worker_arg & worker_arg = tmp.worker_arg;
|
||||||
|
@ -546,16 +551,21 @@ fail:
|
||||||
void muxer( Packet_courier & courier, const Pretty_print & pp,
|
void muxer( Packet_courier & courier, const Pretty_print & pp,
|
||||||
Shared_retval & shared_retval, const int outfd )
|
Shared_retval & shared_retval, const int outfd )
|
||||||
{
|
{
|
||||||
|
std::vector< Packet > packet_vector;
|
||||||
while( true )
|
while( true )
|
||||||
{
|
{
|
||||||
const Packet * const opacket = courier.deliver_packet();
|
courier.deliver_packets( packet_vector );
|
||||||
if( !opacket ) break; // queue is empty. all workers exited
|
if( packet_vector.empty() ) break; // queue is empty. all workers exited
|
||||||
|
|
||||||
if( shared_retval() == 0 &&
|
for( unsigned i = 0; i < packet_vector.size(); ++i )
|
||||||
writeblock( outfd, opacket->data, opacket->size ) != opacket->size &&
|
{
|
||||||
shared_retval.set_value( 1 ) )
|
Packet & opacket = packet_vector[i];
|
||||||
{ pp(); show_error( "Write error", errno ); }
|
if( shared_retval() == 0 &&
|
||||||
delete opacket;
|
writeblock( outfd, opacket.data, opacket.size ) != opacket.size &&
|
||||||
|
shared_retval.set_value( 1 ) )
|
||||||
|
{ pp(); show_error( write_error_msg, errno ); }
|
||||||
|
opacket.delete_data();
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -590,23 +600,12 @@ int dec_stream( const unsigned long long cfile_size, const int num_workers,
|
||||||
const bool nocopy = false;
|
const bool nocopy = false;
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
Splitter_arg splitter_arg;
|
Splitter_arg splitter_arg( courier, pp, shared_retval,
|
||||||
splitter_arg.worker_arg.courier = &courier;
|
cl_opts.ignore_trailing, cl_opts.loose_trailing, outfd < 0, nocopy,
|
||||||
splitter_arg.worker_arg.pp = &pp;
|
worker_args, worker_threads, cfile_size, infd, num_workers );
|
||||||
splitter_arg.worker_arg.shared_retval = &shared_retval;
|
|
||||||
splitter_arg.worker_arg.worker_id = 0;
|
|
||||||
splitter_arg.worker_arg.ignore_trailing = cl_opts.ignore_trailing;
|
|
||||||
splitter_arg.worker_arg.loose_trailing = cl_opts.loose_trailing;
|
|
||||||
splitter_arg.worker_arg.testing = ( outfd < 0 );
|
|
||||||
splitter_arg.worker_arg.nocopy = nocopy;
|
|
||||||
splitter_arg.worker_args = worker_args;
|
|
||||||
splitter_arg.worker_threads = worker_threads;
|
|
||||||
splitter_arg.cfile_size = cfile_size;
|
|
||||||
splitter_arg.infd = infd;
|
|
||||||
splitter_arg.num_workers = num_workers;
|
|
||||||
|
|
||||||
pthread_t splitter_thread;
|
pthread_t splitter_thread;
|
||||||
int errcode = pthread_create( &splitter_thread, 0, dsplitter_s, &splitter_arg );
|
int errcode = pthread_create( &splitter_thread, 0, dsplitter, &splitter_arg );
|
||||||
if( errcode )
|
if( errcode )
|
||||||
{ show_error( "Can't create splitter thread", errcode );
|
{ show_error( "Can't create splitter thread", errcode );
|
||||||
delete[] worker_threads; delete[] worker_args; return 1; }
|
delete[] worker_threads; delete[] worker_args; return 1; }
|
||||||
|
|
|
@ -115,7 +115,7 @@ int pwriteblock( const int fd, const uint8_t * const buf, const int size,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void decompress_error( struct LZ_Decoder * const decoder,
|
void decompress_error( LZ_Decoder * const decoder,
|
||||||
const Pretty_print & pp,
|
const Pretty_print & pp,
|
||||||
Shared_retval & shared_retval, const int worker_id )
|
Shared_retval & shared_retval, const int worker_id )
|
||||||
{
|
{
|
||||||
|
@ -158,11 +158,16 @@ struct Worker_arg
|
||||||
const Lzip_index * lzip_index;
|
const Lzip_index * lzip_index;
|
||||||
const Pretty_print * pp;
|
const Pretty_print * pp;
|
||||||
Shared_retval * shared_retval;
|
Shared_retval * shared_retval;
|
||||||
int worker_id;
|
|
||||||
int num_workers;
|
|
||||||
int infd;
|
int infd;
|
||||||
|
int num_workers;
|
||||||
int outfd;
|
int outfd;
|
||||||
|
int worker_id;
|
||||||
bool nocopy; // avoid copying decompressed data when testing
|
bool nocopy; // avoid copying decompressed data when testing
|
||||||
|
void assign( const Lzip_index & li, const Pretty_print & pp_,
|
||||||
|
Shared_retval & sr, const int ifd, const int nw,
|
||||||
|
const int ofd, const int wi, const bool nc )
|
||||||
|
{ lzip_index = &li; pp = &pp_; shared_retval = &sr; infd = ifd;
|
||||||
|
num_workers = nw; outfd = ofd; worker_id = wi; nocopy = nc; }
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
|
@ -243,7 +248,7 @@ extern "C" void * dworker( void * arg )
|
||||||
{
|
{
|
||||||
if( data_rest != 0 )
|
if( data_rest != 0 )
|
||||||
internal_error( "final data_rest is not zero." );
|
internal_error( "final data_rest is not zero." );
|
||||||
LZ_decompress_reset( decoder ); // prepare for new member
|
LZ_decompress_reset( decoder ); // prepare for next member
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
if( rd == 0 ) break;
|
if( rd == 0 ) break;
|
||||||
|
@ -264,11 +269,11 @@ done:
|
||||||
} // end namespace
|
} // end namespace
|
||||||
|
|
||||||
|
|
||||||
// start the workers and wait for them to finish.
|
// start the workers and wait for them to finish
|
||||||
int decompress( const unsigned long long cfile_size, int num_workers,
|
int decompress( const unsigned long long cfile_size, int num_workers,
|
||||||
const int infd, const int outfd, const Cl_options & cl_opts,
|
const int infd, const int outfd, const Cl_options & cl_opts,
|
||||||
const Pretty_print & pp, const int debug_level,
|
const Pretty_print & pp, const int debug_level,
|
||||||
const int in_slots, const int out_slots,
|
const int in_slots, const int out_slots, const bool from_stdin,
|
||||||
const bool infd_isreg, const bool one_to_one )
|
const bool infd_isreg, const bool one_to_one )
|
||||||
{
|
{
|
||||||
if( !infd_isreg )
|
if( !infd_isreg )
|
||||||
|
@ -284,11 +289,11 @@ int decompress( const unsigned long long cfile_size, int num_workers,
|
||||||
}
|
}
|
||||||
if( lzip_index.retval() != 0 ) // corrupt or invalid input file
|
if( lzip_index.retval() != 0 ) // corrupt or invalid input file
|
||||||
{
|
{
|
||||||
if( lzip_index.bad_magic() )
|
if( lzip_index.good_magic() ) pp( lzip_index.error().c_str() );
|
||||||
show_file_error( pp.name(), lzip_index.error().c_str() );
|
else show_file_error( pp.name(), lzip_index.error().c_str() );
|
||||||
else pp( lzip_index.error().c_str() );
|
|
||||||
return lzip_index.retval();
|
return lzip_index.retval();
|
||||||
}
|
}
|
||||||
|
const bool multi_empty = !from_stdin && lzip_index.multi_empty();
|
||||||
|
|
||||||
if( num_workers > lzip_index.members() ) num_workers = lzip_index.members();
|
if( num_workers > lzip_index.members() ) num_workers = lzip_index.members();
|
||||||
|
|
||||||
|
@ -301,8 +306,11 @@ int decompress( const unsigned long long cfile_size, int num_workers,
|
||||||
if( debug_level & 2 ) std::fputs( "decompress file to stdout.\n", stderr );
|
if( debug_level & 2 ) std::fputs( "decompress file to stdout.\n", stderr );
|
||||||
if( verbosity >= 1 ) pp();
|
if( verbosity >= 1 ) pp();
|
||||||
show_progress( 0, cfile_size, &pp ); // init
|
show_progress( 0, cfile_size, &pp ); // init
|
||||||
return dec_stdout( num_workers, infd, outfd, pp, debug_level, out_slots,
|
const int tmp = dec_stdout( num_workers, infd, outfd, pp, debug_level,
|
||||||
lzip_index );
|
out_slots, lzip_index );
|
||||||
|
if( tmp ) return tmp;
|
||||||
|
if( multi_empty ) { show_file_error( pp.name(), empty_msg ); return 2; }
|
||||||
|
return 0;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -325,14 +333,8 @@ int decompress( const unsigned long long cfile_size, int num_workers,
|
||||||
int i = 0; // number of workers started
|
int i = 0; // number of workers started
|
||||||
for( ; i < num_workers; ++i )
|
for( ; i < num_workers; ++i )
|
||||||
{
|
{
|
||||||
worker_args[i].lzip_index = &lzip_index;
|
worker_args[i].assign( lzip_index, pp, shared_retval, infd, num_workers,
|
||||||
worker_args[i].pp = &pp;
|
outfd, i, nocopy );
|
||||||
worker_args[i].shared_retval = &shared_retval;
|
|
||||||
worker_args[i].worker_id = i;
|
|
||||||
worker_args[i].num_workers = num_workers;
|
|
||||||
worker_args[i].infd = infd;
|
|
||||||
worker_args[i].outfd = outfd;
|
|
||||||
worker_args[i].nocopy = nocopy;
|
|
||||||
const int errcode =
|
const int errcode =
|
||||||
pthread_create( &worker_threads[i], 0, dworker, &worker_args[i] );
|
pthread_create( &worker_threads[i], 0, dworker, &worker_args[i] );
|
||||||
if( errcode )
|
if( errcode )
|
||||||
|
@ -359,5 +361,6 @@ int decompress( const unsigned long long cfile_size, int num_workers,
|
||||||
std::fprintf( stderr,
|
std::fprintf( stderr,
|
||||||
"workers started %8u\n", num_workers );
|
"workers started %8u\n", num_workers );
|
||||||
|
|
||||||
|
if( multi_empty ) { show_file_error( pp.name(), empty_msg ); return 2; }
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
35
doc/plzip.1
|
@ -1,32 +1,33 @@
|
||||||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2.
|
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2.
|
||||||
.TH PLZIP "1" "January 2024" "plzip 1.11" "User Commands"
|
.TH PLZIP "1" "November 2024" "plzip 1.12-rc1" "User Commands"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
plzip \- reduces the size of files
|
plzip \- reduces the size of files
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
.B plzip
|
.B plzip
|
||||||
[\fI\,options\/\fR] [\fI\,files\/\fR]
|
[\fI\,options\/\fR] [\fI\,files\/\fR]
|
||||||
.SH DESCRIPTION
|
.SH DESCRIPTION
|
||||||
Plzip is a massively parallel (multi\-threaded) implementation of lzip,
|
Plzip is a massively parallel (multi\-threaded) implementation of lzip. Plzip
|
||||||
compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
|
uses the compression library lzlib.
|
||||||
.PP
|
.PP
|
||||||
Lzip is a lossless data compressor with a user interface similar to the one
|
Lzip is a lossless data compressor with a user interface similar to the one
|
||||||
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel\-Ziv\-Markov
|
of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel\-Ziv\-Markov
|
||||||
chain\-Algorithm' (LZMA) stream format to maximize interoperability. The
|
chain\-Algorithm) designed to achieve complete interoperability between
|
||||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
implementations. The maximum dictionary size is 512 MiB so that any lzip
|
||||||
on 32\-bit machines. Lzip provides accurate and robust 3\-factor integrity
|
file can be decompressed on 32\-bit machines. Lzip provides accurate and
|
||||||
checking. Lzip can compress about as fast as gzip (lzip \fB\-0\fR) or compress most
|
robust 3\-factor integrity checking. 'lzip \fB\-0\fR' compresses about as fast as
|
||||||
files more than bzip2 (lzip \fB\-9\fR). Decompression speed is intermediate between
|
gzip, while 'lzip \fB\-9\fR' compresses most files more than bzip2. Decompression
|
||||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
speed is intermediate between gzip and bzip2. Lzip provides better data
|
||||||
perspective. Lzip has been designed, written, and tested with great care to
|
recovery capabilities than gzip and bzip2. Lzip has been designed, written,
|
||||||
replace gzip and bzip2 as the standard general\-purpose compressed format for
|
and tested with great care to replace gzip and bzip2 as general\-purpose
|
||||||
Unix\-like systems.
|
compressed format for Unix\-like systems.
|
||||||
.PP
|
.PP
|
||||||
Plzip can compress/decompress large files on multiprocessor machines much
|
Plzip can compress/decompress large files on multiprocessor machines much
|
||||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||||
to 2 percent larger compressed files). Note that the number of usable
|
to 2 percent larger compressed files). Note that the number of usable
|
||||||
threads is limited by file size; on files larger than a few GB plzip can use
|
threads is limited by file size; on files larger than a few GB plzip can use
|
||||||
hundreds of processors, but on files of only a few MB plzip is no faster
|
hundreds of processors, but on files smaller than 1 MiB plzip is no faster
|
||||||
than lzip.
|
than lzip (even at compression level \fB\-0\fR).
|
||||||
|
The number of threads defaults to the number of processors.
|
||||||
.SH OPTIONS
|
.SH OPTIONS
|
||||||
.TP
|
.TP
|
||||||
\fB\-h\fR, \fB\-\-help\fR
|
\fB\-h\fR, \fB\-\-help\fR
|
||||||
|
@ -132,8 +133,8 @@ License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
|
||||||
.br
|
.br
|
||||||
This is free software: you are free to change and redistribute it.
|
This is free software: you are free to change and redistribute it.
|
||||||
There is NO WARRANTY, to the extent permitted by law.
|
There is NO WARRANTY, to the extent permitted by law.
|
||||||
Using lzlib 1.14
|
Using lzlib 1.15\-rc1
|
||||||
Using LZ_API_VERSION = 1014
|
Using LZ_API_VERSION = 1015
|
||||||
.SH "SEE ALSO"
|
.SH "SEE ALSO"
|
||||||
The full documentation for
|
The full documentation for
|
||||||
.B plzip
|
.B plzip
|
||||||
|
|
347
doc/plzip.info
|
@ -11,21 +11,22 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
|
||||||
Plzip Manual
|
Plzip Manual
|
||||||
************
|
************
|
||||||
|
|
||||||
This manual is for Plzip (version 1.11, 21 January 2024).
|
This manual is for Plzip (version 1.12-rc1, 19 November 2024).
|
||||||
|
|
||||||
* Menu:
|
* Menu:
|
||||||
|
|
||||||
* Introduction:: Purpose and features of plzip
|
* Introduction:: Purpose and features of plzip
|
||||||
* Output:: Meaning of plzip's output
|
* Output:: Meaning of plzip's output
|
||||||
* Invoking plzip:: Command-line interface
|
* Invoking plzip:: Command-line interface
|
||||||
* Program design:: Internal structure of plzip
|
* Argument syntax:: By convention, options start with a hyphen
|
||||||
* Memory requirements:: Memory required to compress and decompress
|
* File format:: Detailed format of the compressed file
|
||||||
* Minimum file sizes:: Minimum file sizes required for full speed
|
* Program design:: Internal structure of plzip
|
||||||
* File format:: Detailed format of the compressed file
|
* Memory requirements:: Memory required to compress and decompress
|
||||||
* Trailing data:: Extra data appended to the file
|
* Minimum file sizes:: Minimum file sizes required for full speed
|
||||||
* Examples:: A small tutorial with examples
|
* Trailing data:: Extra data appended to the file
|
||||||
* Problems:: Reporting bugs
|
* Examples:: A small tutorial with examples
|
||||||
* Concept index:: Index of concepts
|
* Problems:: Reporting bugs
|
||||||
|
* Concept index:: Index of concepts
|
||||||
|
|
||||||
|
|
||||||
Copyright (C) 2009-2024 Antonio Diaz Diaz.
|
Copyright (C) 2009-2024 Antonio Diaz Diaz.
|
||||||
|
@ -39,27 +40,27 @@ File: plzip.info, Node: Introduction, Next: Output, Prev: Top, Up: Top
|
||||||
1 Introduction
|
1 Introduction
|
||||||
**************
|
**************
|
||||||
|
|
||||||
Plzip is a massively parallel (multi-threaded) implementation of lzip,
|
Plzip is a massively parallel (multi-threaded) implementation of lzip.
|
||||||
compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
|
Plzip uses the compression library lzlib.
|
||||||
|
|
||||||
Lzip is a lossless data compressor with a user interface similar to the
|
Lzip is a lossless data compressor with a user interface similar to the
|
||||||
one of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
|
one of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov
|
||||||
chain-Algorithm' (LZMA) stream format to maximize interoperability. The
|
chain-Algorithm) designed to achieve complete interoperability between
|
||||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
implementations. The maximum dictionary size is 512 MiB so that any lzip
|
||||||
on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
|
file can be decompressed on 32-bit machines. Lzip provides accurate and
|
||||||
checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
|
robust 3-factor integrity checking. 'lzip -0' compresses about as fast as
|
||||||
files more than bzip2 (lzip -9). Decompression speed is intermediate between
|
gzip, while 'lzip -9' compresses most files more than bzip2. Decompression
|
||||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
speed is intermediate between gzip and bzip2. Lzip provides better data
|
||||||
perspective. Lzip has been designed, written, and tested with great care to
|
recovery capabilities than gzip and bzip2. Lzip has been designed, written,
|
||||||
replace gzip and bzip2 as the standard general-purpose compressed format for
|
and tested with great care to replace gzip and bzip2 as general-purpose
|
||||||
Unix-like systems.
|
compressed format for Unix-like systems.
|
||||||
|
|
||||||
Plzip can compress/decompress large files on multiprocessor machines much
|
Plzip can compress/decompress large files on multiprocessor machines much
|
||||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||||
to 2 percent larger compressed files). Note that the number of usable
|
to 2 percent larger compressed files). Note that the number of usable
|
||||||
threads is limited by file size; on files larger than a few GB plzip can use
|
threads is limited by file size; on files larger than a few GB plzip can use
|
||||||
hundreds of processors, but on files of only a few MB plzip is no faster
|
hundreds of processors, but on files smaller than 1 MiB plzip is no faster
|
||||||
than lzip. *Note Minimum file sizes::.
|
than lzip (even at compression level -0). *Note Minimum file sizes::.
|
||||||
|
|
||||||
For creation and manipulation of compressed tar archives tarlz can be
|
For creation and manipulation of compressed tar archives tarlz can be
|
||||||
more efficient than using tar and plzip because tarlz is able to keep the
|
more efficient than using tar and plzip because tarlz is able to keep the
|
||||||
|
@ -96,9 +97,9 @@ makes it safer than compressors returning ambiguous warning values (like
|
||||||
gzip) when it is used as a back end for other programs like tar or zutils.
|
gzip) when it is used as a back end for other programs like tar or zutils.
|
||||||
|
|
||||||
Plzip automatically uses for each file the largest dictionary size that
|
Plzip automatically uses for each file the largest dictionary size that
|
||||||
does not exceed neither the file size nor the limit given. Keep in mind
|
does not exceed neither the file size nor the limit given. The dictionary
|
||||||
that the decompression memory requirement is affected at compression time
|
size used for decompression is the same dictionary size used for
|
||||||
by the choice of dictionary size limit. *Note Memory requirements::.
|
compression. *Note Memory requirements::.
|
||||||
|
|
||||||
When compressing, plzip replaces every file given in the command line
|
When compressing, plzip replaces every file given in the command line
|
||||||
with a compressed version of itself, with the name "original_name.lz". When
|
with a compressed version of itself, with the name "original_name.lz". When
|
||||||
|
@ -174,7 +175,7 @@ have been compressed. Decompressed is used to refer to data which have
|
||||||
undergone the process of decompression.
|
undergone the process of decompression.
|
||||||
|
|
||||||
|
|
||||||
File: plzip.info, Node: Invoking plzip, Next: Program design, Prev: Output, Up: Top
|
File: plzip.info, Node: Invoking plzip, Next: Argument syntax, Prev: Output, Up: Top
|
||||||
|
|
||||||
3 Invoking plzip
|
3 Invoking plzip
|
||||||
****************
|
****************
|
||||||
|
@ -189,8 +190,7 @@ means standard input. It can be mixed with other FILES and is read just
|
||||||
once, the first time it appears in the command line. Remember to prepend
|
once, the first time it appears in the command line. Remember to prepend
|
||||||
'./' to any file name beginning with a hyphen, or use '--'.
|
'./' to any file name beginning with a hyphen, or use '--'.
|
||||||
|
|
||||||
plzip supports the following options: *Note Argument syntax:
|
plzip supports the following options: *Note Argument syntax::.
|
||||||
(arg_parser)Argument syntax.
|
|
||||||
|
|
||||||
'-h'
|
'-h'
|
||||||
'--help'
|
'--help'
|
||||||
|
@ -235,7 +235,8 @@ once, the first time it appears in the command line. Remember to prepend
|
||||||
status 1. If a file fails to decompress, or is a terminal, plzip exits
|
status 1. If a file fails to decompress, or is a terminal, plzip exits
|
||||||
immediately with error status 2 without decompressing the rest of the
|
immediately with error status 2 without decompressing the rest of the
|
||||||
files. A terminal is considered an uncompressed file, and therefore
|
files. A terminal is considered an uncompressed file, and therefore
|
||||||
invalid.
|
invalid. A multimember file with one or more empty members is accepted
|
||||||
|
if redirected to standard input.
|
||||||
|
|
||||||
'-f'
|
'-f'
|
||||||
'--force'
|
'--force'
|
||||||
|
@ -259,7 +260,8 @@ once, the first time it appears in the command line. Remember to prepend
|
||||||
'-v', the dictionary size, the number of members in the file, and the
|
'-v', the dictionary size, the number of members in the file, and the
|
||||||
amount of trailing data (if any) are also printed. With '-vv', the
|
amount of trailing data (if any) are also printed. With '-vv', the
|
||||||
positions and sizes of each member in multimember files are also
|
positions and sizes of each member in multimember files are also
|
||||||
printed.
|
printed. A multimember file with one or more empty members is accepted
|
||||||
|
if redirected to standard input.
|
||||||
|
|
||||||
If any file is damaged, does not exist, can't be opened, or is not
|
If any file is damaged, does not exist, can't be opened, or is not
|
||||||
regular, the final exit status is > 0. '-lq' can be used to check
|
regular, the final exit status is > 0. '-lq' can be used to check
|
||||||
|
@ -278,8 +280,8 @@ once, the first time it appears in the command line. Remember to prepend
|
||||||
'-n N'
|
'-n N'
|
||||||
'--threads=N'
|
'--threads=N'
|
||||||
Set the maximum number of worker threads, overriding the system's
|
Set the maximum number of worker threads, overriding the system's
|
||||||
default. Valid values range from 1 to "as many as your system can
|
default. Valid values range from 1 to as many as your system can
|
||||||
support". If this option is not used, plzip tries to detect the number
|
support. If this option is not used, plzip tries to detect the number
|
||||||
of processors in the system and use it as default value. When
|
of processors in the system and use it as default value. When
|
||||||
compressing on a 32 bit system, plzip tries to limit the memory use to
|
compressing on a 32 bit system, plzip tries to limit the memory use to
|
||||||
under 2.22 GiB (4 worker threads at level -9) by reducing the number
|
under 2.22 GiB (4 worker threads at level -9) by reducing the number
|
||||||
|
@ -338,7 +340,8 @@ once, the first time it appears in the command line. Remember to prepend
|
||||||
fails the test, does not exist, can't be opened, or is a terminal,
|
fails the test, does not exist, can't be opened, or is a terminal,
|
||||||
plzip continues testing the rest of the files. A final diagnostic is
|
plzip continues testing the rest of the files. A final diagnostic is
|
||||||
shown at verbosity level 1 or higher if any file fails the test when
|
shown at verbosity level 1 or higher if any file fails the test when
|
||||||
testing multiple files.
|
testing multiple files. A multimember file with one or more empty
|
||||||
|
members is accepted if redirected to standard input.
|
||||||
|
|
||||||
'-v'
|
'-v'
|
||||||
'--verbose'
|
'--verbose'
|
||||||
|
@ -368,6 +371,7 @@ once, the first time it appears in the command line. Remember to prepend
|
||||||
'-s64MiB -m273'
|
'-s64MiB -m273'
|
||||||
|
|
||||||
Level Dictionary size (-s) Match length limit (-m)
|
Level Dictionary size (-s) Match length limit (-m)
|
||||||
|
------------------------------------------------------
|
||||||
-0 64 KiB 16 bytes
|
-0 64 KiB 16 bytes
|
||||||
-1 1 MiB 5 bytes
|
-1 1 MiB 5 bytes
|
||||||
-2 1.5 MiB 6 bytes
|
-2 1.5 MiB 6 bytes
|
||||||
|
@ -387,7 +391,7 @@ once, the first time it appears in the command line. Remember to prepend
|
||||||
When decompressing, testing, or listing, allow trailing data whose
|
When decompressing, testing, or listing, allow trailing data whose
|
||||||
first bytes are so similar to the magic bytes of a lzip header that
|
first bytes are so similar to the magic bytes of a lzip header that
|
||||||
they can be confused with a corrupt header. Use this option if a file
|
they can be confused with a corrupt header. Use this option if a file
|
||||||
triggers a "corrupt header" error and the cause is not indeed a
|
triggers a 'corrupt header' error and the cause is not indeed a
|
||||||
corrupt header.
|
corrupt header.
|
||||||
|
|
||||||
'--in-slots=N'
|
'--in-slots=N'
|
||||||
|
@ -421,6 +425,7 @@ and may be followed by a multiplier and an optional 'B' for "byte".
|
||||||
Table of SI and binary prefixes (unit multipliers):
|
Table of SI and binary prefixes (unit multipliers):
|
||||||
|
|
||||||
Prefix Value | Prefix Value
|
Prefix Value | Prefix Value
|
||||||
|
----------------------------------------------------------------------
|
||||||
k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024)
|
k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024)
|
||||||
M megabyte (10^6) | Mi mebibyte (2^20)
|
M megabyte (10^6) | Mi mebibyte (2^20)
|
||||||
G gigabyte (10^9) | Gi gibibyte (2^30)
|
G gigabyte (10^9) | Gi gibibyte (2^30)
|
||||||
|
@ -439,9 +444,131 @@ corrupt or invalid input file, 3 for an internal consistency error (e.g.,
|
||||||
bug) which caused plzip to panic.
|
bug) which caused plzip to panic.
|
||||||
|
|
||||||
|
|
||||||
File: plzip.info, Node: Program design, Next: Memory requirements, Prev: Invoking plzip, Up: Top
|
File: plzip.info, Node: Argument syntax, Next: File format, Prev: Invoking plzip, Up: Top
|
||||||
|
|
||||||
4 Internal structure of plzip
|
4 Syntax of command-line arguments
|
||||||
|
**********************************
|
||||||
|
|
||||||
|
POSIX recommends these conventions for command-line arguments.
|
||||||
|
|
||||||
|
* A command-line argument is an option if it begins with a hyphen ('-').
|
||||||
|
|
||||||
|
* Option names are single alphanumeric characters.
|
||||||
|
|
||||||
|
* Certain options require an argument.
|
||||||
|
|
||||||
|
* An option and its argument may or may not appear as separate tokens.
|
||||||
|
(In other words, the whitespace separating them is optional, unless the
|
||||||
|
argument is the empty string). Thus, '-o foo' and '-ofoo' are
|
||||||
|
equivalent.
|
||||||
|
|
||||||
|
* One or more options without arguments, followed by at most one option
|
||||||
|
that takes an argument, may follow a hyphen in a single token. Thus,
|
||||||
|
'-abc' is equivalent to '-a -b -c'.
|
||||||
|
|
||||||
|
* Options typically precede other non-option arguments.
|
||||||
|
|
||||||
|
* The argument '--' terminates all options; any following arguments are
|
||||||
|
treated as non-option arguments, even if they begin with a hyphen.
|
||||||
|
|
||||||
|
* A token consisting of a single hyphen character is interpreted as an
|
||||||
|
ordinary non-option argument. By convention, it is used to specify
|
||||||
|
standard input, standard output, or a file named '-'.
|
||||||
|
|
||||||
|
GNU adds "long options" to these conventions:
|
||||||
|
|
||||||
|
* A long option consists of two hyphens ('--') followed by a name made
|
||||||
|
of alphanumeric characters and hyphens. Option names are typically one
|
||||||
|
to three words long, with hyphens to separate words. Abbreviations can
|
||||||
|
be used for the long option names as long as the abbreviations are
|
||||||
|
unique.
|
||||||
|
|
||||||
|
* A long option and its argument may or may not appear as separate
|
||||||
|
tokens. In the latter case they must be separated by an equal sign '='.
|
||||||
|
Thus, '--foo bar' and '--foo=bar' are equivalent.
|
||||||
|
|
||||||
|
The syntax of options with an optional argument is
|
||||||
|
'-<short_option><argument>' (without whitespace), or
|
||||||
|
'--<long_option>=<argument>'.
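The conventions above can be sketched as a small token classifier. This is only an illustration of the rules as listed, not the parser plzip actually uses; the names `Token_kind`, `classify_token`, `split_long_option`, and `check_examples` are hypothetical. The classifier looks at one token in isolation, so the caller is assumed to stop classifying once '--' has been seen.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical token categories derived from the conventions listed above.
enum class Token_kind { operand, short_options, long_option, end_of_options };

// Classify a single command-line token without any context.
// A lone "-" (and the empty string) is an ordinary non-option argument.
inline Token_kind classify_token( const std::string & token )
  {
  if( token == "--" ) return Token_kind::end_of_options;
  if( token.size() >= 3 && token.compare( 0, 2, "--" ) == 0 )
    return Token_kind::long_option;
  if( token.size() >= 2 && token[0] == '-' ) return Token_kind::short_options;
  return Token_kind::operand;
  }

// Split "--name=argument" into ( "name", "argument" ).
// If there is no '=', the argument part is returned empty.
inline std::pair< std::string, std::string >
split_long_option( const std::string & token )
  {
  const std::string body = token.substr( 2 );		// drop the leading "--"
  const std::string::size_type eq = body.find( '=' );
  if( eq == std::string::npos ) return std::make_pair( body, std::string() );
  return std::make_pair( body.substr( 0, eq ), body.substr( eq + 1 ) );
  }

// Self-check using the examples given in the text.
inline void check_examples()
  {
  assert( classify_token( "-abc" ) == Token_kind::short_options );
  assert( classify_token( "--foo=bar" ) == Token_kind::long_option );
  assert( classify_token( "-" ) == Token_kind::operand );
  assert( split_long_option( "--foo=bar" ).second == "bar" );
  }
```

A real parser would additionally need a table of known options to decide whether the text after a short option (as in '-ofoo') is an argument or more clustered options; that table is left out of this sketch.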
|
||||||
|
|
||||||
|
|
||||||
|
File: plzip.info, Node: File format, Next: Program design, Prev: Argument syntax, Up: Top
|
||||||
|
|
||||||
|
5 File format
|
||||||
|
*************
|
||||||
|
|
||||||
|
Perfection is reached, not when there is no longer anything to add, but
|
||||||
|
when there is no longer anything to take away.
|
||||||
|
-- Antoine de Saint-Exupery
|
||||||
|
|
||||||
|
In the diagram below, a box like this:
|
||||||
|
|
||||||
|
+---+
|
||||||
|
| | <-- the vertical bars might be missing
|
||||||
|
+---+
|
||||||
|
|
||||||
|
represents one byte; a box like this:
|
||||||
|
|
||||||
|
+==============+
|
||||||
|
| |
|
||||||
|
+==============+
|
||||||
|
|
||||||
|
represents a variable number of bytes.
|
||||||
|
|
||||||
|
A lzip file consists of one or more independent "members" (compressed data
|
||||||
|
sets). The members simply appear one after another in the file, with no
|
||||||
|
additional information before, between, or after them. Each member can
|
||||||
|
encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
|
||||||
|
size of a multimember file is unlimited. Empty members (data size = 0) are
|
||||||
|
not allowed in multimember files.
|
||||||
|
|
||||||
|
Each member has the following structure:
|
||||||
|
|
||||||
|
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||||
|
| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
|
||||||
|
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||||
|
|
||||||
|
All multibyte values are stored in little endian order.
|
||||||
|
|
||||||
|
'ID string (the "magic" bytes)'
|
||||||
|
A four byte string, identifying the lzip format, with the value "LZIP"
|
||||||
|
(0x4C, 0x5A, 0x49, 0x50).
|
||||||
|
|
||||||
|
'VN (version number, 1 byte)'
|
||||||
|
Just in case something needs to be modified in the future. 1 for now.
|
||||||
|
|
||||||
|
'DS (coded dictionary size, 1 byte)'
|
||||||
|
The dictionary size is calculated by taking a power of 2 (the base
|
||||||
|
size) and subtracting from it a fraction between 0/16 and 7/16 of the
|
||||||
|
base size.
|
||||||
|
Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
|
||||||
|
Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
|
||||||
|
from the base size to obtain the dictionary size.
|
||||||
|
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
|
||||||
|
Valid values for dictionary size range from 4 KiB to 512 MiB.
|
||||||
|
|
||||||
|
'LZMA stream'
|
||||||
|
The LZMA stream, terminated by an 'End Of Stream' marker. Uses default
|
||||||
|
values for encoder properties. *Note Stream format: (lzip)Stream
|
||||||
|
format, for a complete description.
|
||||||
|
|
||||||
|
'CRC32 (4 bytes)'
|
||||||
|
Cyclic Redundancy Check (CRC) of the original uncompressed data.
|
||||||
|
|
||||||
|
'Data size (8 bytes)'
|
||||||
|
Size of the original uncompressed data.
|
||||||
|
|
||||||
|
'Member size (8 bytes)'
|
||||||
|
Total size of the member, including header and trailer. This field acts
|
||||||
|
as a distributed index, improves the checking of stream integrity, and
|
||||||
|
facilitates the safe recovery of undamaged members from multimember
|
||||||
|
files. Lzip limits the member size to 2 PiB to prevent the data size
|
||||||
|
field from overflowing.
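The DS byte and the trailer fields described above can be decoded with a few lines of code. The following is a minimal sketch based only on this description, not code taken from plzip or lzlib; the names `decode_dictionary_size`, `read_le`, `Trailer`, `parse_trailer`, and `check_examples` are hypothetical. Returning 0 for a size outside the documented 4 KiB to 512 MiB range is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Decode the coded dictionary size (DS) byte.
// Bits 4-0: base 2 logarithm of the base size (12 to 29).
// Bits 7-5: numerator (0 to 7) of the sixteenths subtracted from the base.
// Returns 0 if the result is outside 4 KiB .. 512 MiB (sketch assumption).
inline uint32_t decode_dictionary_size( const uint8_t ds )
  {
  const unsigned log2_base = ds & 0x1F;
  const unsigned numerator = ( ds >> 5 ) & 7;
  if( log2_base < 12 || log2_base > 29 ) return 0;
  const uint32_t base = uint32_t( 1 ) << log2_base;
  const uint32_t size = base - numerator * ( base / 16 );
  if( size < ( uint32_t( 1 ) << 12 ) ) return 0;
  return size;
  }

// Read a little endian value of 'bytes' bytes (at most 8) from 'buf'.
inline uint64_t read_le( const uint8_t * const buf, const int bytes )
  {
  uint64_t value = 0;
  for( int i = bytes - 1; i >= 0; --i ) value = ( value << 8 ) | buf[i];
  return value;
  }

// The 20-byte trailer: CRC32 (4), data size (8), member size (8).
struct Trailer
  {
  uint32_t crc;
  uint64_t data_size;
  uint64_t member_size;
  };

inline Trailer parse_trailer( const uint8_t * const buf )
  {
  Trailer t;
  t.crc = uint32_t( read_le( buf, 4 ) );
  t.data_size = read_le( buf + 4, 8 );
  t.member_size = read_le( buf + 12, 8 );
  return t;
  }

// Self-check using the example given in the text:
// 0xD3 = 2^19 - 6 * 2^15 = 320 KiB.
inline void check_examples()
  {
  assert( decode_dictionary_size( 0xD3 ) == 320 * 1024 );
  }
```

Because the member size is stored at the very end of each member, a reader can walk a multimember file backwards from the end of the file, one trailer at a time, which is what makes this field usable as a distributed index.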
|
||||||
|
|
||||||
|
|
||||||
|
File: plzip.info, Node: Program design, Next: Memory requirements, Prev: File format, Up: Top
|
||||||
|
|
||||||
|
6 Internal structure of plzip
|
||||||
*****************************
|
*****************************
|
||||||
|
|
||||||
When compressing, plzip divides the input file into chunks and compresses as
|
When compressing, plzip divides the input file into chunks and compresses as
|
||||||
|
@ -456,8 +583,8 @@ because lzip usually produces single-member files, which can't be
|
||||||
decompressed in parallel.
|
decompressed in parallel.
|
||||||
|
|
||||||
For each input file, a splitter thread and several worker threads are
|
For each input file, a splitter thread and several worker threads are
|
||||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
created, acting the main thread as muxer (multiplexer) thread. A 'packet
|
||||||
courier" takes care of data transfers among threads and limits the maximum
|
courier' takes care of data transfers among threads and limits the maximum
|
||||||
number of data blocks (packets) being processed simultaneously.
|
number of data blocks (packets) being processed simultaneously.
|
||||||
|
|
||||||
The splitter reads data blocks from the input file, and distributes them
|
The splitter reads data blocks from the input file, and distributes them
|
||||||
|
@ -486,7 +613,7 @@ only limited by the number of processors available and by I/O speed.
|
||||||
|
|
||||||
File: plzip.info, Node: Memory requirements, Next: Minimum file sizes, Prev: Program design, Up: Top
|
File: plzip.info, Node: Memory requirements, Next: Minimum file sizes, Prev: Program design, Up: Top
|
||||||
|
|
||||||
5 Memory required to compress and decompress
|
7 Memory required to compress and decompress
|
||||||
********************************************
|
********************************************
|
||||||
|
|
||||||
The amount of memory required *per worker thread* for decompression or
|
The amount of memory required *per worker thread* for decompression or
|
||||||
|
@ -520,6 +647,7 @@ The following table shows the memory required *per thread* for compression
|
||||||
at a given level, using the default data size for each level:
|
at a given level, using the default data size for each level:
|
||||||
|
|
||||||
Level Memory required
|
Level Memory required
|
||||||
|
------------------------
|
||||||
-0 4.875 MiB
|
-0 4.875 MiB
|
||||||
-1 17.75 MiB
|
-1 17.75 MiB
|
||||||
-2 26.625 MiB
|
-2 26.625 MiB
|
||||||
|
@ -532,9 +660,9 @@ Level Memory required
|
||||||
-9 568 MiB
|
-9 568 MiB
|
||||||
|
|
||||||
|
|
||||||
File: plzip.info, Node: Minimum file sizes, Next: File format, Prev: Memory requirements, Up: Top
|
File: plzip.info, Node: Minimum file sizes, Next: Trailing data, Prev: Memory requirements, Up: Top
|
||||||
|
|
||||||
6 Minimum file sizes required for full compression speed
|
8 Minimum file sizes required for full compression speed
|
||||||
********************************************************
|
********************************************************
|
||||||
|
|
||||||
When compressing, plzip divides the input file into chunks and compresses
|
When compressing, plzip divides the input file into chunks and compresses
|
||||||
|
@ -569,85 +697,9 @@ Level
|
||||||
-9 128 MiB 256 MiB 512 MiB 1 GiB 4 GiB 16 GiB
|
-9 128 MiB 256 MiB 512 MiB 1 GiB 4 GiB 16 GiB
|
||||||
|
|
||||||
|
|
||||||
File: plzip.info, Node: File format, Next: Trailing data, Prev: Minimum file sizes, Up: Top
|
File: plzip.info, Node: Trailing data, Next: Examples, Prev: Minimum file sizes, Up: Top
|
||||||
|
|
||||||
7 File format
|
9 Extra data appended to the file
|
||||||
*************
|
|
||||||
|
|
||||||
Perfection is reached, not when there is no longer anything to add, but
|
|
||||||
when there is no longer anything to take away.
|
|
||||||
-- Antoine de Saint-Exupery
|
|
||||||
|
|
||||||
|
|
||||||
In the diagram below, a box like this:
|
|
||||||
|
|
||||||
+---+
|
|
||||||
| | <-- the vertical bars might be missing
|
|
||||||
+---+
|
|
||||||
|
|
||||||
represents one byte; a box like this:
|
|
||||||
|
|
||||||
+==============+
|
|
||||||
| |
|
|
||||||
+==============+
|
|
||||||
|
|
||||||
represents a variable number of bytes.
|
|
||||||
|
|
||||||
|
|
||||||
A lzip file consists of one or more independent "members" (compressed
|
|
||||||
data sets). The members simply appear one after another in the file, with no
|
|
||||||
additional information before, between, or after them. Each member can
|
|
||||||
encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
|
|
||||||
size of a multimember file is unlimited.
|
|
||||||
|
|
||||||
Each member has the following structure:
|
|
||||||
|
|
||||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
||||||
| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
|
|
||||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
||||||
|
|
||||||
All multibyte values are stored in little endian order.
|
|
||||||
|
|
||||||
'ID string (the "magic" bytes)'
|
|
||||||
A four byte string, identifying the lzip format, with the value "LZIP"
|
|
||||||
(0x4C, 0x5A, 0x49, 0x50).
|
|
||||||
|
|
||||||
'VN (version number, 1 byte)'
|
|
||||||
Just in case something needs to be modified in the future. 1 for now.
|
|
||||||
|
|
||||||
'DS (coded dictionary size, 1 byte)'
|
|
||||||
The dictionary size is calculated by taking a power of 2 (the base
|
|
||||||
size) and subtracting from it a fraction between 0/16 and 7/16 of the
|
|
||||||
base size.
|
|
||||||
Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
|
|
||||||
Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
|
|
||||||
from the base size to obtain the dictionary size.
|
|
||||||
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
|
|
||||||
Valid values for dictionary size range from 4 KiB to 512 MiB.
|
|
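Annotation (not part of the upstream diff): the DS rule described in the removed text above can be checked with a short sketch. The function name `decode_dictionary_size` is hypothetical and is not taken from plzip or lzlib; only the bit layout and the valid range come from the manual text.

```python
def decode_dictionary_size(ds: int) -> int:
    """Decode the 1-byte coded dictionary size (DS) described in the manual."""
    if not 0 <= ds <= 0xFF:
        raise ValueError("DS must fit in one byte")
    log2_base = ds & 0x1F          # bits 4-0: base 2 logarithm of the base size
    numerator = (ds >> 5) & 0x07   # bits 7-5: numerator of the fraction (0 to 7)
    if not 12 <= log2_base <= 29:
        raise ValueError("base size exponent out of range (12 to 29)")
    base = 1 << log2_base
    size = base - numerator * (base // 16)   # subtract numerator/16 of the base
    # The manual states that valid sizes range from 4 KiB to 512 MiB.
    if not (4 * 1024) <= size <= (512 * 1024 * 1024):
        raise ValueError("dictionary size out of range")
    return size


if __name__ == "__main__":
    # The manual's own example: 0xD3 = 2^19 - 6 * 2^15 = 320 KiB
    print(decode_dictionary_size(0xD3) // 1024, "KiB")
```

The range check matters because some byte values with a valid exponent still fall below 4 KiB (for example exponent 12 with a non-zero numerator).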
||||||
|
|
||||||
'LZMA stream'
|
|
||||||
The LZMA stream, finished by an "End Of Stream" marker. Uses default
|
|
||||||
values for encoder properties. *Note Stream format: (lzip)Stream
|
|
||||||
format, for a complete description.
|
|
||||||
|
|
||||||
'CRC32 (4 bytes)'
|
|
||||||
Cyclic Redundancy Check (CRC) of the original uncompressed data.
|
|
||||||
|
|
||||||
'Data size (8 bytes)'
|
|
||||||
Size of the original uncompressed data.
|
|
||||||
|
|
||||||
'Member size (8 bytes)'
|
|
||||||
Total size of the member, including header and trailer. This field acts
|
|
||||||
as a distributed index, improves the checking of stream integrity, and
|
|
||||||
facilitates the safe recovery of undamaged members from multimember
|
|
||||||
files. Lzip limits the member size to 2 PiB to prevent the data size
|
|
||||||
field from overflowing.
|
|
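Annotation (not part of the upstream diff): the member layout described in the removed text above (6-byte header, 20-byte trailer, little endian fields) can be read with the standard `struct` module. This is a minimal sketch that assumes the buffer holds exactly one complete member; `parse_member_fields` is a hypothetical name, and the LZMA stream itself is neither decoded nor checked against the CRC.

```python
import struct

HEADER_SIZE = 6    # ID string (4 bytes) + VN (1 byte) + DS (1 byte)
TRAILER_SIZE = 20  # CRC32 (4 bytes) + data size (8 bytes) + member size (8 bytes)


def parse_member_fields(member: bytes) -> dict:
    """Extract header and trailer fields from one lzip member held in memory."""
    if len(member) < HEADER_SIZE + TRAILER_SIZE:
        raise ValueError("too short to be a lzip member")
    # '<' selects little endian with no padding between fields.
    magic, version, ds = struct.unpack_from("<4sBB", member, 0)
    if magic != b"LZIP":
        raise ValueError("bad ID string")
    crc32, data_size, member_size = struct.unpack_from(
        "<IQQ", member, len(member) - TRAILER_SIZE)
    return {
        "version": version,
        "coded_dict_size": ds,
        "crc32": crc32,
        "data_size": data_size,
        "member_size": member_size,
        # Member size covers header and trailer, so for a single-member
        # buffer it should equal the buffer length.
        "size_matches": member_size == len(member),
    }
```

In a multimember file the same trailer read can be repeated from the end of the file backwards, stepping by each member size, which is the sense in which the manual calls the field a distributed index.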
||||||
|
|
||||||
|
|
||||||
|
|
||||||
File: plzip.info, Node: Trailing data, Next: Examples, Prev: File format, Up: Top
|
|
||||||
|
|
||||||
8 Extra data appended to the file
|
|
||||||
*********************************
|
*********************************
|
||||||
|
|
||||||
Sometimes extra data are found appended to a lzip file after the last
|
Sometimes extra data are found appended to a lzip file after the last
|
||||||
|
@ -657,7 +709,7 @@ member. Such trailing data may be:
|
||||||
example when writing to a tape. It is safe to append any amount of
|
example when writing to a tape. It is safe to append any amount of
|
||||||
padding zero bytes to a lzip file.
|
padding zero bytes to a lzip file.
|
||||||
|
|
||||||
* Useful data added by the user; an "End Of File" string (to check that
|
* Useful data added by the user; an 'End Of File' string (to check that
|
||||||
the file has not been truncated), a cryptographically secure hash, a
|
the file has not been truncated), a cryptographically secure hash, a
|
||||||
description of file contents, etc. It is safe to append any amount of
|
description of file contents, etc. It is safe to append any amount of
|
||||||
text to a lzip file as long as none of the first four bytes of the
|
text to a lzip file as long as none of the first four bytes of the
|
||||||
|
@ -693,8 +745,8 @@ where a file containing trailing data must be rejected, the option
|
||||||
|
|
||||||
File: plzip.info, Node: Examples, Next: Problems, Prev: Trailing data, Up: Top
|
File: plzip.info, Node: Examples, Next: Problems, Prev: Trailing data, Up: Top
|
||||||
|
|
||||||
9 A small tutorial with examples
|
10 A small tutorial with examples
|
||||||
********************************
|
*********************************
|
||||||
|
|
||||||
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
|
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
|
||||||
compressed file (bugs in the system libraries, memory errors, etc).
|
compressed file (bugs in the system libraries, memory errors, etc).
|
||||||
|
@ -706,38 +758,32 @@ comparing the compressed file with the original because the corruption
|
||||||
happens before plzip compresses the RAM contents, resulting in a valid
|
happens before plzip compresses the RAM contents, resulting in a valid
|
||||||
compressed file containing wrong data.
|
compressed file containing wrong data.
|
||||||
|
|
||||||
|
|
||||||
Example 1: Extract all the files from archive 'foo.tar.lz'.
|
Example 1: Extract all the files from archive 'foo.tar.lz'.
|
||||||
|
|
||||||
tar -xf foo.tar.lz
|
tar -xf foo.tar.lz
|
||||||
or
|
or
|
||||||
plzip -cd foo.tar.lz | tar -xf -
|
plzip -cd foo.tar.lz | tar -xf -
|
||||||
|
|
||||||
|
|
||||||
Example 2: Replace a regular file with its compressed version 'file.lz' and
|
Example 2: Replace a regular file with its compressed version 'file.lz' and
|
||||||
show the compression ratio.
|
show the compression ratio.
|
||||||
|
|
||||||
plzip -v file
|
plzip -v file
|
||||||
|
|
||||||
|
|
||||||
Example 3: Like example 2 but the created 'file.lz' has a block size of
|
Example 3: Like example 2 but the created 'file.lz' has a block size of
|
||||||
1 MiB. The compression ratio is not shown.
|
1 MiB. The compression ratio is not shown.
|
||||||
|
|
||||||
plzip -B 1MiB file
|
plzip -B 1MiB file
|
||||||
|
|
||||||
|
|
||||||
Example 4: Restore a regular file from its compressed version 'file.lz'. If
|
Example 4: Restore a regular file from its compressed version 'file.lz'. If
|
||||||
the operation is successful, 'file.lz' is removed.
|
the operation is successful, 'file.lz' is removed.
|
||||||
|
|
||||||
plzip -d file.lz
|
plzip -d file.lz
|
||||||
|
|
||||||
|
|
||||||
Example 5: Check the integrity of the compressed file 'file.lz' and show
|
Example 5: Check the integrity of the compressed file 'file.lz' and show
|
||||||
status.
|
status.
|
||||||
|
|
||||||
plzip -tv file.lz
|
plzip -tv file.lz
|
||||||
|
|
||||||
|
|
||||||
Example 6: The right way of concatenating the decompressed output of two or
|
Example 6: The right way of concatenating the decompressed output of two or
|
||||||
more compressed files. *Note Trailing data::.
|
more compressed files. *Note Trailing data::.
|
||||||
|
|
||||||
|
@ -746,19 +792,16 @@ more compressed files. *Note Trailing data::.
|
||||||
Do this instead
|
Do this instead
|
||||||
plzip -cd file1.lz file2.lz file3.lz
|
plzip -cd file1.lz file2.lz file3.lz
|
||||||
|
|
||||||
|
|
||||||
Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed data
|
Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed data
|
||||||
are produced.
|
are produced.
|
||||||
|
|
||||||
plzip -cd file.lz | dd bs=1024 count=10
|
plzip -cd file.lz | dd bs=1024 count=10
|
||||||
|
|
||||||
|
|
||||||
Example 8: Decompress 'file.lz' partially from decompressed byte at offset
|
Example 8: Decompress 'file.lz' partially from decompressed byte at offset
|
||||||
10000 to decompressed byte at offset 14999 (5000 bytes are produced).
|
10000 to decompressed byte at offset 14999 (5000 bytes are produced).
|
||||||
|
|
||||||
plzip -cd file.lz | dd bs=1000 skip=10 count=5
|
plzip -cd file.lz | dd bs=1000 skip=10 count=5
|
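Annotation (not part of the upstream diff): Example 8 selects decompressed bytes 10000 to 14999 by skipping 10 blocks of 1000 bytes and copying 5 more. The same range can be taken from any non-seekable stream with the sketch below. `read_range` is a hypothetical helper; in real use the stream would presumably be the standard output pipe of `plzip -cd file.lz`, which is replaced here by an in-memory buffer so the sketch runs without plzip.

```python
import io
from typing import BinaryIO


def read_range(stream: BinaryIO, start: int, length: int,
               chunk: int = 65536) -> bytes:
    """Return up to `length` bytes beginning at offset `start` of a stream."""
    to_skip = start
    while to_skip > 0:
        # Discard data before the requested offset; reads may be short.
        block = stream.read(min(chunk, to_skip))
        if not block:
            return b""          # stream ended before reaching `start`
        to_skip -= len(block)
    out = bytearray()
    while len(out) < length:
        block = stream.read(min(chunk, length - len(out)))
        if not block:
            break               # stream ended inside the requested range
        out += block
    return bytes(out)


if __name__ == "__main__":
    data = bytes(i % 251 for i in range(20000))   # stand-in for decompressed data
    part = read_range(io.BytesIO(data), 10000, 5000)
    print(len(part))   # prints 5000
```

Counting bytes rather than blocks keeps the result independent of how the pipe happens to split its reads.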
||||||
|
|
||||||
|
|
||||||
Example 9: Compress a whole device in /dev/sdc and send the output to
|
Example 9: Compress a whole device in /dev/sdc and send the output to
|
||||||
'file.lz'.
|
'file.lz'.
|
||||||
|
|
||||||
|
@ -769,7 +812,7 @@ Example 9: Compress a whole device in /dev/sdc and send the output to
|
||||||
|
|
||||||
File: plzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
|
File: plzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
|
||||||
|
|
||||||
10 Reporting bugs
|
11 Reporting bugs
|
||||||
*****************
|
*****************
|
||||||
|
|
||||||
There are probably bugs in plzip. There are certainly errors and omissions
|
There are probably bugs in plzip. There are certainly errors and omissions
|
||||||
|
@ -790,6 +833,7 @@ Concept index
|
||||||
|