Merging upstream version 1.12~rc1.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 4ddb634c25
commit cd6a248630
24 changed files with 874 additions and 719 deletions
3
COPYING
@@ -1,8 +1,7 @@
 GNU GENERAL PUBLIC LICENSE
 Version 2, June 1991
 
-Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
-51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+Copyright (C) 1989, 1991 Free Software Foundation, Inc. <http://fsf.org/>
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.
 
74
ChangeLog
|
@@ -1,3 +1,13 @@
|
|||
2024-11-19 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.12-rc1 released.
|
||||
* decompress.cc (decompress), list.cc (list_files):
|
||||
Return 2 if any empty member is found in a multimember file.
|
||||
* dec_stdout.cc, dec_stream.cc:
|
||||
Change 'deliver_packet' to 'deliver_packets'.
|
||||
* plzip.texi: New chapter 'Syntax of command-line arguments'.
|
||||
* check.sh: Use 'cp' instead of 'cat'.
|
||||
|
||||
2024-01-21 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.11 released.
|
||||
|
@@ -20,16 +30,17 @@
|
|||
2021-01-03 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.9 released.
|
||||
* New option '--check-lib'.
|
||||
* main.cc (main): Report an error if a file name is empty.
|
||||
(main): Show final diagnostic when testing multiple files.
|
||||
Make '-o' behave like '-c', but writing to file instead of stdout.
|
||||
Make '-c' and '-o' check whether the output is a terminal only once.
|
||||
Do not open output if input is a terminal.
|
||||
* main.cc: New option '--check-lib'.
|
||||
Set a valid invocation_name even if argc == 0.
|
||||
* Replace 'decompressed', 'compressed' with 'out', 'in' in output.
|
||||
* decompress.cc, dec_stream.cc, dec_stdout.cc:
|
||||
* decompress.cc, dec_stdout.cc, dec_stream.cc:
|
||||
Continue testing if any input file fails the test.
|
||||
Show the largest dictionary size in a multimember file.
|
||||
* main.cc: Show final diagnostic when testing multiple files.
|
||||
* decompress.cc, dec_stream.cc [LZ_API_VERSION >= 1012]: Avoid
|
||||
copying decompressed data when testing with lzlib 1.12 or newer.
|
||||
* compress.cc, dec_stream.cc: Start only the worker threads required.
|
||||
|
@@ -38,47 +49,46 @@
|
|||
Use plain comparison instead of Boyer-Moore to search for headers.
|
||||
* lzip_index.cc: Improve messages for corruption in last header.
|
||||
* decompress.cc: Shorten messages 'Data error' and 'Unexpected EOF'.
|
||||
* main.cc: Set a valid invocation_name even if argc == 0.
|
||||
* Document extraction from tar.lz in manual, '--help', and man page.
|
||||
* plzip.texi (Introduction): Mention tarlz as an alternative.
|
||||
* plzip.texi: Several fixes and improvements.
|
||||
Several fixes and improvements.
|
||||
* testsuite: Add 8 new test files.
|
||||
|
||||
2019-01-05 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.8 released.
|
||||
* Rename File_* to Lzip_*.
|
||||
* main.cc: New options '--in-slots' and '--out-slots'.
|
||||
* main.cc: Increase default in_slots per worker from 2 to 4.
|
||||
* main.cc: Increase default out_slots per worker from 32 to 64.
|
||||
* New options '--in-slots' and '--out-slots'.
|
||||
* main.cc (main): Increase default in_slots per worker from 2 to 4.
|
||||
(main): Increase default out_slots per worker from 32 to 64.
|
||||
(main): Check return value of close( infd ).
|
||||
* lzip.h (Lzip_trailer): New function 'verify_consistency'.
|
||||
* lzip_index.cc: Detect some kinds of corrupt trailers.
|
||||
* main.cc (main): Check return value of close( infd ).
|
||||
* plzip.texi: Improve description of '-0..-9', '-m', and '-s'.
|
||||
* plzip.texi: Improve descriptions of '-0..-9', '-m', and '-s'.
|
||||
* configure: New option '--with-mingw'.
|
||||
* configure: Accept appending to CXXFLAGS; 'CXXFLAGS+=OPTIONS'.
|
||||
Accept appending to CXXFLAGS; 'CXXFLAGS+=OPTIONS'.
|
||||
* INSTALL: Document use of CXXFLAGS+='-D __USE_MINGW_ANSI_STDIO'.
|
||||
|
||||
2018-02-07 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.7 released.
|
||||
* New option '--loose-trailing'.
|
||||
* compress.cc: Use 'LZ_compress_restart_member' and replace input
|
||||
packet queue by a circular buffer to reduce memory fragmentation.
|
||||
* compress.cc: Return one empty packet at a time to reduce mem use.
|
||||
Return one empty packet at a time to reduce memory use.
|
||||
* main.cc: Reduce threads on 32 bit systems to use under 2.22 GiB.
|
||||
* main.cc: New option '--loose-trailing'.
|
||||
(set_c_outname): Do not add a second '.lz' to the arg of '-o'.
|
||||
(cleanup_and_fail): Suppress messages from other threads.
|
||||
* Improve corrupt header detection to HD = 3 on seekable files.
|
||||
(On all files with lzlib 1.10 or newer).
|
||||
* Replace 'bits/byte' with inverse compression ratio in output.
|
||||
* Show progress of decompression at verbosity level 2 (-vv).
|
||||
* Show progress of (de)compression only if stderr is a terminal.
|
||||
* main.cc: Do not add a second .lz extension to the arg of -o.
|
||||
* Show dictionary size at verbosity level 4 (-vvvv).
|
||||
* main.cc (cleanup_and_fail): Suppress messages from other threads.
|
||||
* list.cc: Add missing '#include <pthread.h>'.
|
||||
* plzip.texi: New chapter 'Output'.
|
||||
* plzip.texi (Memory requirements): Add table.
|
||||
* plzip.texi (Program design): Add a block diagram.
|
||||
* plzip.texi: New chapter 'Meaning of plzip's output'.
|
||||
(Memory requirements): Add table.
|
||||
(Program design): Add a block diagram.
|
||||
|
||||
2017-04-12 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
|
@@ -92,14 +102,13 @@
|
|||
2016-05-14 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.5 released.
|
||||
* main.cc: New option '-a, --trailing-error'.
|
||||
* New option '-a, --trailing-error'.
|
||||
* main.cc (main): Delete '--output' file if infd is a terminal.
|
||||
* main.cc (main): Don't use stdin more than once.
|
||||
(main): Don't use stdin more than once.
|
||||
* plzip.texi: New chapters 'Trailing data' and 'Examples'.
|
||||
* configure: Avoid warning on some shells when testing for g++.
|
||||
* Makefile.in: Detect the existence of install-info.
|
||||
* check.sh: A POSIX shell is required to run the tests.
|
||||
* check.sh: Don't check error messages.
|
||||
* check.sh: Require a POSIX shell. Don't check error messages.
|
||||
|
||||
2015-07-09 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
|
@@ -136,14 +145,14 @@
|
|||
|
||||
* Version 1.0 released.
|
||||
* compress.cc: Change 'deliver_packet' to 'deliver_packets'.
|
||||
* Scalability of decompression from/to regular files has been
|
||||
increased by removing splitter and muxer when not needed.
|
||||
* The number of worker threads is now limited to the number of
|
||||
members when decompressing from a regular file.
|
||||
* Increase scalability of decompression from/to regular files by
|
||||
removing splitter and muxer when not needed.
|
||||
* Limit the number of worker threads to the number of members when
|
||||
decompressing from a regular file.
|
||||
* configure: Options now accept a separate argument.
|
||||
* Makefile.in: New targets 'install-as-lzip' and 'install-bin'.
|
||||
* main.cc: Use 'setmode' instead of '_setmode' on Windows and OS/2.
|
||||
* main.cc: Define 'strtoull' to 'std::strtoul' on Windows.
|
||||
(main): Use 'setmode' instead of '_setmode' on Windows and OS/2.
|
||||
|
||||
2012-03-01 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
||||
|
||||
|
@@ -154,13 +163,13 @@
|
|||
2012-01-17 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
||||
|
||||
* Version 0.8 released.
|
||||
* main.cc: New option '-F, --recompress'.
|
||||
* New option '-F, --recompress'.
|
||||
* decompress.cc (decompress): Show compression ratio.
|
||||
* main.cc (close_and_set_permissions): Inability to change output
|
||||
file attributes has been downgraded from error to warning.
|
||||
(main): Set stdin/stdout in binary mode on OS2.
|
||||
* Small change in '--help' output and man page.
|
||||
* Change quote characters in messages as advised by GNU Standards.
|
||||
* main.cc: Set stdin/stdout in binary mode on OS2.
|
||||
* compress.cc: Reduce memory use of compressed packets.
|
||||
* decompress.cc: Use Boyer-Moore algorithm to search for headers.
|
||||
|
||||
|
@@ -174,7 +183,7 @@
|
|||
* main.cc (open_instream): Don't show the message
|
||||
" and '--stdout' was not specified" for directories, etc.
|
||||
Exit with status 1 if any output file exists and is skipped.
|
||||
* main.cc: Fix warning about fchown return value being ignored.
|
||||
Fix warning about fchown's return value being ignored.
|
||||
* testsuite: Rename 'test1' to 'test.txt'. New tests.
|
||||
|
||||
2010-03-20 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
||||
|
@@ -202,9 +211,8 @@
|
|||
* Version 0.3 released.
|
||||
* New option '-B, --data-size'.
|
||||
* Output file is now removed if plzip is interrupted.
|
||||
* This version automatically chooses the smallest possible
|
||||
dictionary size for each member during compression, saving
|
||||
memory during decompression.
|
||||
* Choose automatically the smallest possible dictionary size for
|
||||
each member during compression, saving memory during decompression.
|
||||
* main.cc: New constant 'o_binary'.
|
||||
|
||||
2010-01-17 Antonio Diaz Diaz <ant_diaz@teleline.es>
|
||||
|
|
|
@@ -2,8 +2,8 @@
|
|||
DISTNAME = $(pkgname)-$(pkgversion)
|
||||
INSTALL = install
|
||||
INSTALL_PROGRAM = $(INSTALL) -m 755
|
||||
INSTALL_DATA = $(INSTALL) -m 644
|
||||
INSTALL_DIR = $(INSTALL) -d -m 755
|
||||
INSTALL_DATA = $(INSTALL) -m 644
|
||||
SHELL = /bin/sh
|
||||
CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
|
||||
|
||||
|
@@ -34,7 +34,8 @@ main.o : main.cc
|
|||
|
||||
# prevent 'make' from trying to remake source files
|
||||
$(VPATH)/configure $(VPATH)/Makefile.in $(VPATH)/doc/$(pkgname).texi : ;
|
||||
%.h %.cc : ;
|
||||
MAKEFLAGS += -r
|
||||
.SUFFIXES :
|
||||
|
||||
$(objs) : Makefile
|
||||
arg_parser.o : arg_parser.h
|
||||
|
@@ -133,8 +134,7 @@ dist : doc
|
|||
$(DISTNAME)/testsuite/test.txt \
|
||||
$(DISTNAME)/testsuite/fox.lz \
|
||||
$(DISTNAME)/testsuite/fox_*.lz \
|
||||
$(DISTNAME)/testsuite/test.txt.lz \
|
||||
$(DISTNAME)/testsuite/test_em.txt.lz
|
||||
$(DISTNAME)/testsuite/test.txt.lz
|
||||
rm -f $(DISTNAME)
|
||||
lzip -v -9 $(DISTNAME).tar
|
||||
|
||||
|
|
16
NEWS
@@ -1,14 +1,8 @@
-Changes in version 1.11:
+Changes in version 1.12:
 
-File diagnostics have been reformatted as 'PROGRAM: FILE: MESSAGE'.
+plzip now exits with error status 2 if any empty member is found in a
+multimember file.
 
-Diagnostics caused by invalid arguments to command-line options now show the
-argument and the name of the option.
+Scalability when decompressing to standard output has been increased.
 
-The option '-o, --output' now preserves dates, permissions, and ownership of
-the file when (de)compressing exactly one file.
-
-The option '-o, --output' now creates missing intermediate directories when
-writing to a file.
-
-The variable MAKEINFO has been added to configure and Makefile.in.
+The chapter 'Syntax of command-line arguments' has been added to the manual.
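The NEWS entry about empty members can be read as a simple rule on the exit status. The sketch below is only an illustration of that rule, not plzip's code: the function name `empty_member_status` and the idea of passing in a list of per-member data sizes are assumptions made for the example. It also assumes that a single-member file with empty data is not affected, since the entry speaks only of multimember files.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of plzip): given the decompressed size of
// each member of a file, return 2 if the file is multimember and any of its
// members is empty, else return 0.
int empty_member_status( const std::vector< unsigned long long > & data_sizes )
  {
  if( data_sizes.size() > 1 )			// multimember file
    for( unsigned long i = 0; i < data_sizes.size(); ++i )
      if( data_sizes[i] == 0 ) return 2;	// empty member found
  return 0;
  }
```

In the lzip family, exit status 2 indicates a corrupt or invalid input file, so this change makes such files fail a test run instead of passing silently.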
28
README
@@ -1,26 +1,26 @@
 Description
 
-Plzip is a massively parallel (multi-threaded) implementation of lzip,
-compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
+Plzip is a massively parallel (multi-threaded) implementation of lzip. Plzip
+uses the compression library lzlib.
 
 Lzip is a lossless data compressor with a user interface similar to the one
-of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
-chain-Algorithm' (LZMA) stream format to maximize interoperability. The
-maximum dictionary size is 512 MiB so that any lzip file can be decompressed
-on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
-checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
-files more than bzip2 (lzip -9). Decompression speed is intermediate between
-gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
-perspective. Lzip has been designed, written, and tested with great care to
-replace gzip and bzip2 as the standard general-purpose compressed format for
-Unix-like systems.
+of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov
+chain-Algorithm) designed to achieve complete interoperability between
+implementations. The maximum dictionary size is 512 MiB so that any lzip
+file can be decompressed on 32-bit machines. Lzip provides accurate and
+robust 3-factor integrity checking. 'lzip -0' compresses about as fast as
+gzip, while 'lzip -9' compresses most files more than bzip2. Decompression
+speed is intermediate between gzip and bzip2. Lzip provides better data
+recovery capabilities than gzip and bzip2. Lzip has been designed, written,
+and tested with great care to replace gzip and bzip2 as general-purpose
+compressed format for Unix-like systems.
 
 Plzip can compress/decompress large files on multiprocessor machines much
 faster than lzip, at the cost of a slightly reduced compression ratio (0.4
 to 2 percent larger compressed files). Note that the number of usable
 threads is limited by file size; on files larger than a few GB plzip can use
-hundreds of processors, but on files of only a few MB plzip is no faster
-than lzip.
+hundreds of processors, but on files smaller than 1 MiB plzip is no faster
+than lzip (even at compression level -0).
 
 For creation and manipulation of compressed tar archives tarlz can be more
 efficient than using tar and plzip because tarlz is able to keep the
|
@@ -75,19 +75,19 @@ bool Arg_parser::parse_long_option( const char * const opt, const char * const a
|
|||
error_ += "' requires an argument";
|
||||
return false;
|
||||
}
|
||||
data.back().argument = &opt[len+3];
|
||||
data.back().argument = &opt[len+3]; // argument may be empty
|
||||
return true;
|
||||
}
|
||||
|
||||
if( options[index].has_arg == yes )
|
||||
if( options[index].has_arg == yes || options[index].has_arg == yme )
|
||||
{
|
||||
if( !arg || !arg[0] )
|
||||
if( !arg || ( options[index].has_arg == yes && !arg[0] ) )
|
||||
{
|
||||
error_ = "option '--"; error_ += options[index].long_name;
|
||||
error_ += "' requires an argument";
|
||||
return false;
|
||||
}
|
||||
++argind; data.back().argument = arg;
|
||||
++argind; data.back().argument = arg; // argument may be empty
|
||||
return true;
|
||||
}
|
||||
|
||||
|
@@ -123,15 +123,16 @@ bool Arg_parser::parse_short_option( const char * const opt, const char * const
|
|||
{
|
||||
data.back().argument = &opt[cind]; ++argind; cind = 0;
|
||||
}
|
||||
else if( options[index].has_arg == yes )
|
||||
else if( options[index].has_arg == yes || options[index].has_arg == yme )
|
||||
{
|
||||
if( !arg || !arg[0] )
|
||||
if( !arg || ( options[index].has_arg == yes && !arg[0] ) )
|
||||
{
|
||||
error_ = "option requires an argument -- '"; error_ += c;
|
||||
error_ += '\'';
|
||||
return false;
|
||||
}
|
||||
data.back().argument = arg; ++argind; cind = 0;
|
||||
++argind; cind = 0;
|
||||
data.back().argument = arg; // argument may be empty
|
||||
}
|
||||
}
|
||||
return true;
|
||||
|
|
10
arg_parser.h
@@ -36,14 +36,18 @@
 The argument '--' terminates all options; any following arguments are
 treated as non-option arguments, even if they begin with a hyphen.
 
-The syntax for optional option arguments is '-<short_option><argument>'
-(without whitespace), or '--<long_option>=<argument>'.
+The syntax of options with an optional argument is
+'-<short_option><argument>' (without whitespace), or
+'--<long_option>=<argument>'.
+
+The syntax of options with an empty argument is '-<short_option> ""',
+'--<long_option> ""', or '--<long_option>=""'.
 */
 
 class Arg_parser
 {
 public:
-enum Has_arg { no, yes, maybe };
+enum Has_arg { no, yes, maybe, yme }; // yme = yes but maybe empty
 
 struct Option
 {
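The new `yme` value changes only one condition in the parser: a missing argument is still an error, but an empty one is rejected only for `yes`. The following minimal sketch isolates that condition. The enum mirrors the one in arg_parser.h, while the helper `accept_separate_argument` and its error text are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Mirrors the enum in arg_parser.h; yme = yes but maybe empty.
enum Has_arg { no, yes, maybe, yme };

// Hypothetical helper (not part of Arg_parser): returns true if 'arg' is
// acceptable as the separate argument of an option declared with 'has_arg'.
// On failure it stores a message in 'error'.
bool accept_separate_argument( const Has_arg has_arg, const char * const arg,
                               std::string & error )
  {
  assert( has_arg == yes || has_arg == yme );	// options that take a value
  // A missing argument always fails; an empty one fails only for 'yes'.
  if( !arg || ( has_arg == yes && !arg[0] ) )
    { error = "option requires an argument"; return false; }
  error.clear();
  return true;
  }
```

With this rule, an invocation such as `--<long_option> ""` is accepted for a `yme` option and rejected for a `yes` option, which matches the syntax note added to the header comment.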
83
compress.cc
|
@@ -112,7 +112,6 @@ void xlock( pthread_mutex_t * const mutex )
|
|||
{ show_error( "pthread_mutex_lock", errcode ); cleanup_and_fail(); }
|
||||
}
|
||||
|
||||
|
||||
void xunlock( pthread_mutex_t * const mutex )
|
||||
{
|
||||
const int errcode = pthread_mutex_unlock( mutex );
|
||||
|
@@ -158,7 +157,7 @@ struct Packet // data block with a serial number
|
|||
int size; // number of bytes in data (if any)
|
||||
unsigned id; // serial number assigned as received
|
||||
Packet() : data( 0 ), size( 0 ), id( 0 ) {}
|
||||
void init( uint8_t * const d, const int s, const unsigned i )
|
||||
void assign( uint8_t * const d, const int s, const unsigned i )
|
||||
{ data = d; size = s; id = i; }
|
||||
};
|
||||
|
||||
|
@@ -176,7 +175,7 @@ private:
|
|||
unsigned deliver_id; // id of next packet to be delivered
|
||||
Slot_tally slot_tally; // limits the number of input packets
|
||||
std::vector< Packet > circular_ibuffer;
|
||||
std::vector< const Packet * > circular_obuffer;
|
||||
std::vector< const Packet * > circular_obuffer; // pointers to ibuffer
|
||||
int num_working; // number of workers still running
|
||||
const int num_slots; // max packets in circulation
|
||||
pthread_mutex_t imutex;
|
||||
|
@@ -212,7 +211,7 @@ public:
|
|||
{
|
||||
slot_tally.get_slot(); // wait for a free slot
|
||||
xlock( &imutex );
|
||||
circular_ibuffer[receive_id % num_slots].init( data, size, receive_id );
|
||||
circular_ibuffer[receive_id % num_slots].assign( data, size, receive_id );
|
||||
++receive_id;
|
||||
xsignal( &iav_or_eof );
|
||||
xunlock( &imutex );
|
||||
|
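The hunks above rename `Packet::init` to `Packet::assign` and note that `circular_obuffer` holds pointers into `circular_ibuffer`. The indexing scheme, slot = serial id modulo `num_slots`, can be sketched without any threading. The class `Input_ring` below is a hypothetical, single-threaded stand-in for the input side of the courier. As a simplification it frees a slot as soon as the packet is distributed, whereas the real code waits on `slot_tally` until the muxer returns the packet.

```cpp
#include <cassert>
#include <vector>
#include <stdint.h>

struct Packet			// data block with a serial number
  {
  uint8_t * data;
  int size;			// number of bytes in data (if any)
  unsigned id;			// serial number assigned as received
  Packet() : data( 0 ), size( 0 ), id( 0 ) {}
  void assign( uint8_t * const d, const int s, const unsigned i )
    { data = d; size = s; id = i; }
  };

// Hypothetical single-threaded model of the circular input buffer.
class Input_ring
  {
  std::vector< Packet > circular_ibuffer;
  unsigned receive_id;		// id assigned to the next packet received
  unsigned distrib_id;		// id of the next packet to be distributed
  const unsigned num_slots;	// max packets in circulation
public:
  explicit Input_ring( const unsigned slots )
    : circular_ibuffer( slots ), receive_id( 0 ), distrib_id( 0 ),
      num_slots( slots ) { assert( slots > 0 ); }

  bool receive( uint8_t * const data, const int size )
    {
    if( receive_id - distrib_id >= num_slots ) return false;	// ring is full
    circular_ibuffer[receive_id % num_slots].assign( data, size, receive_id );
    ++receive_id;
    return true;
    }

  Packet * distribute()		// returns 0 if there is nothing to distribute
    {
    if( receive_id == distrib_id ) return 0;
    return &circular_ibuffer[distrib_id++ % num_slots];
    }
  };
```

Because the slots are reused in place, no `Packet` object is allocated or freed per block, which is the memory-fragmentation benefit the 1.7 ChangeLog entry describes for the circular buffer.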
@@ -221,7 +220,6 @@ public:
|
|||
// distribute a packet to a worker
|
||||
Packet * distribute_packet()
|
||||
{
|
||||
Packet * ipacket = 0;
|
||||
xlock( &imutex );
|
||||
++icheck_counter;
|
||||
while( receive_id == distrib_id && !eof ) // no packets to distribute
|
||||
|
@@ -230,15 +228,13 @@ public:
|
|||
xwait( &iav_or_eof, &imutex );
|
||||
}
|
||||
if( receive_id != distrib_id )
|
||||
{ ipacket = &circular_ibuffer[distrib_id % num_slots]; ++distrib_id; }
|
||||
{ Packet * ipacket = &circular_ibuffer[distrib_id % num_slots];
|
||||
++distrib_id; xunlock( &imutex ); return ipacket; }
|
||||
xunlock( &imutex );
|
||||
if( !ipacket ) // EOF
|
||||
{
|
||||
xlock( &omutex ); // notify muxer when last worker exits
|
||||
if( --num_working == 0 ) xsignal( &oav_or_exit );
|
||||
xunlock( &omutex );
|
||||
}
|
||||
return ipacket;
|
||||
xlock( &omutex ); // notify muxer when last worker exits
|
||||
if( --num_working == 0 ) xsignal( &oav_or_exit );
|
||||
xunlock( &omutex );
|
||||
return 0; // EOF
|
||||
}
|
||||
|
||||
// collect a packet from a worker
|
||||
|
@@ -307,30 +303,38 @@ public:
|
|||
|
||||
struct Worker_arg
|
||||
{
|
||||
Packet_courier * courier;
|
||||
const Pretty_print * pp;
|
||||
int dictionary_size;
|
||||
int match_len_limit;
|
||||
int offset;
|
||||
Packet_courier & courier;
|
||||
const Pretty_print & pp;
|
||||
const int dictionary_size;
|
||||
const int match_len_limit;
|
||||
const int offset;
|
||||
Worker_arg( Packet_courier & co, const Pretty_print & pp_, const int dis,
|
||||
const int mll, const int off )
|
||||
: courier( co ), pp( pp_ ), dictionary_size( dis ),
|
||||
match_len_limit( mll ), offset( off ) {}
|
||||
};
|
||||
|
||||
struct Splitter_arg
|
||||
{
|
||||
struct Worker_arg worker_arg;
|
||||
pthread_t * worker_threads;
|
||||
int infd;
|
||||
int data_size;
|
||||
Worker_arg worker_arg;
|
||||
pthread_t * const worker_threads;
|
||||
const int data_size;
|
||||
const int infd;
|
||||
int num_workers; // returned by splitter to main thread
|
||||
Splitter_arg( Packet_courier & co, const Pretty_print & pp_, const int dis,
|
||||
const int mll, const int off, pthread_t * wt, const int das,
|
||||
const int ifd, const int nw )
|
||||
: worker_arg( co, pp_, dis, mll, off ), worker_threads( wt ),
|
||||
data_size( das ), infd( ifd ), num_workers( nw ) {}
|
||||
};
|
||||
|
||||
|
||||
/* Get packets from courier, replace their contents, and return them to
|
||||
courier. */
|
||||
// get packets from courier, replace their contents, and return them to courier
|
||||
extern "C" void * cworker( void * arg )
|
||||
{
|
||||
const Worker_arg & tmp = *(const Worker_arg *)arg;
|
||||
Packet_courier & courier = *tmp.courier;
|
||||
const Pretty_print & pp = *tmp.pp;
|
||||
Packet_courier & courier = tmp.courier;
|
||||
const Pretty_print & pp = tmp.pp;
|
||||
const int dictionary_size = tmp.dictionary_size;
|
||||
const int match_len_limit = tmp.match_len_limit;
|
||||
const int offset = tmp.offset;
|
||||
|
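The hunk above turns the pointer members of `Worker_arg` into references and makes the numeric members `const`. Both kinds of member must be initialized in a constructor's initializer list, which is why the field-by-field assignments in `compress` are replaced by a single constructor call later in this diff. A reduced sketch of the pattern follows. `Courier` is a stand-in type invented for the example, and `worker` omits the `extern "C"` linkage that the real thread entry points use.

```cpp
#include <cassert>

struct Courier			// stand-in for Packet_courier
  {
  int packets;
  Courier() : packets( 0 ) {}
  };

struct Worker_arg
  {
  Courier & courier;		// a reference must be bound at construction
  const int dictionary_size;	// const members cannot be assigned later
  const int match_len_limit;
  Worker_arg( Courier & co, const int dis, const int mll )
    : courier( co ), dictionary_size( dis ), match_len_limit( mll ) {}
  };

// Thread-style entry point: receives the argument block as 'void *'.
void * worker( void * arg )
  {
  assert( arg );
  const Worker_arg & tmp = *(const Worker_arg *)arg;
  Courier & courier = tmp.courier;	// no dereference needed any more
  ++courier.packets;
  return 0;
  }
```

A side effect of this design is that a half-initialized argument block can no longer be passed to a thread, because the compiler rejects a `Worker_arg` that is declared without all of its arguments.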
@@ -407,8 +411,8 @@ extern "C" void * cworker( void * arg )
|
|||
extern "C" void * csplitter( void * arg )
|
||||
{
|
||||
Splitter_arg & tmp = *(Splitter_arg *)arg;
|
||||
Packet_courier & courier = *tmp.worker_arg.courier;
|
||||
const Pretty_print & pp = *tmp.worker_arg.pp;
|
||||
Packet_courier & courier = tmp.worker_arg.courier;
|
||||
const Pretty_print & pp = tmp.worker_arg.pp;
|
||||
pthread_t * const worker_threads = tmp.worker_threads;
|
||||
const int offset = tmp.worker_arg.offset;
|
||||
const int infd = tmp.infd;
|
||||
|
@@ -436,11 +440,7 @@ extern "C" void * csplitter( void * arg )
|
|||
}
|
||||
if( size < data_size ) break; // EOF
|
||||
}
|
||||
else
|
||||
{
|
||||
delete[] data;
|
||||
break;
|
||||
}
|
||||
else { delete[] data; break; }
|
||||
}
|
||||
courier.finish( tmp.num_workers - i ); // no more packets to send
|
||||
tmp.num_workers = i;
|
||||
|
@@ -465,7 +465,7 @@ void muxer( Packet_courier & courier, const Pretty_print & pp, const int outfd )
|
|||
out_size += opacket->size;
|
||||
|
||||
if( writeblock( outfd, opacket->data, opacket->size ) != opacket->size )
|
||||
{ pp(); show_error( "Write error", errno ); cleanup_and_fail(); }
|
||||
{ pp(); show_error( write_error_msg, errno ); cleanup_and_fail(); }
|
||||
delete[] opacket->data;
|
||||
courier.return_empty_packet();
|
||||
}
|
||||
|
@@ -475,8 +475,7 @@ void muxer( Packet_courier & courier, const Pretty_print & pp, const int outfd )
|
|||
} // end namespace
|
||||
|
||||
|
||||
/* Init the courier, then start the splitter and the workers and call the
|
||||
muxer. */
|
||||
// init the courier, then start the splitter and the workers and call the muxer
|
||||
int compress( const unsigned long long cfile_size,
|
||||
const int data_size, const int dictionary_size,
|
||||
const int match_len_limit, const int num_workers,
|
||||
|
@@ -496,16 +495,8 @@ int compress( const unsigned long long cfile_size,
|
|||
pthread_t * worker_threads = new( std::nothrow ) pthread_t[num_workers];
|
||||
if( !worker_threads ) { pp( mem_msg ); return 1; }
|
||||
|
||||
Splitter_arg splitter_arg;
|
||||
splitter_arg.worker_arg.courier = &courier;
|
||||
splitter_arg.worker_arg.pp = &pp;
|
||||
splitter_arg.worker_arg.dictionary_size = dictionary_size;
|
||||
splitter_arg.worker_arg.match_len_limit = match_len_limit;
|
||||
splitter_arg.worker_arg.offset = offset;
|
||||
splitter_arg.worker_threads = worker_threads;
|
||||
splitter_arg.infd = infd;
|
||||
splitter_arg.data_size = data_size;
|
||||
splitter_arg.num_workers = num_workers;
|
||||
Splitter_arg splitter_arg( courier, pp, dictionary_size, match_len_limit,
|
||||
offset, worker_threads, data_size, infd, num_workers );
|
||||
|
||||
pthread_t splitter_thread;
|
||||
int errcode = pthread_create( &splitter_thread, 0, csplitter, &splitter_arg );
|
||||
|
|
4
configure
vendored
@@ -6,7 +6,7 @@
 # to copy, distribute, and modify it.
 
 pkgname=plzip
-pkgversion=1.11
+pkgversion=1.12-rc1
 progname=plzip
 with_mingw=
 srctrigger=doc/${pkgname}.texi
@@ -115,7 +115,7 @@ while [ $# != 0 ] ; do
 exit 1 ;;
 esac
 
-# Check if the option took a separate argument
+# Check whether the option took a separate argument
 if [ "${arg2}" = yes ] ; then
 if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
 else echo "configure: Missing argument to '${option}'" 1>&2
119
dec_stdout.cc
|
@@ -46,10 +46,10 @@ struct Packet // data block
|
|||
uint8_t * data; // data may be null if size == 0
|
||||
int size; // number of bytes in data (if any)
|
||||
bool eom; // end of member
|
||||
Packet() : data( 0 ), size( 0 ), eom( true ) {}
|
||||
Packet() : data( 0 ), size( 0 ), eom( false ) {}
|
||||
Packet( uint8_t * const d, const int s, const bool e )
|
||||
: data( d ), size( s ), eom ( e ) {}
|
||||
~Packet() { if( data ) delete[] data; }
|
||||
void delete_data() { if( data ) { delete[] data; data = 0; } }
|
||||
};
|
||||
|
||||
|
||||
|
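The hunk above replaces the `Packet` destructor with an explicit `delete_data()`. A plausible reason, given that the queues in the following hunks change from `std::queue< const Packet * >` to `std::queue< Packet >`, is that packets are now copied by value: with a freeing destructor, every copy would try to delete the same buffer. The sketch below shows the resulting ownership rule. The helper `drain` is hypothetical and is not taken from dec_stdout.cc.

```cpp
#include <cassert>
#include <queue>
#include <stdint.h>

struct Packet			// data block
  {
  uint8_t * data;		// data may be null if size == 0
  int size;			// number of bytes in data (if any)
  bool eom;			// end of member
  Packet() : data( 0 ), size( 0 ), eom( false ) {}
  Packet( uint8_t * const d, const int s, const bool e )
    : data( d ), size( s ), eom( e ) {}
  // No destructor: copies share 'data', so exactly one holder must call this.
  void delete_data() { if( data ) { delete[] data; data = 0; } }
  };

// Hypothetical helper: release every packet in a queue that holds packets by
// value, and return the total number of payload bytes released.
long drain( std::queue< Packet > & q )
  {
  long total = 0;
  while( !q.empty() )
    {
    assert( q.front().size >= 0 );
    total += q.front().size;
    q.front().delete_data();	// free the buffer through the queued copy
    q.pop();			// destroying the Packet itself frees nothing
    }
  return total;
  }
```

This is the same shape as the cleanup loop in the courier's destructor, which calls `front().delete_data()` before `pop()`.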
@@ -59,8 +59,8 @@ public:
|
|||
unsigned ocheck_counter;
|
||||
unsigned owait_counter;
|
||||
private:
|
||||
int deliver_worker_id; // worker queue currently delivering packets
|
||||
std::vector< std::queue< const Packet * > > opacket_queues;
|
||||
int deliver_id; // worker queue currently delivering packets
|
||||
std::vector< std::queue< Packet > > opacket_queues;
|
||||
int num_working; // number of workers still running
|
||||
const int num_workers; // number of workers
|
||||
const unsigned out_slots; // max output packets per queue
|
||||
|
@@ -75,10 +75,9 @@ private:
|
|||
public:
|
||||
Packet_courier( const Shared_retval & sh_ret, const int workers,
|
||||
const int slots )
|
||||
: ocheck_counter( 0 ), owait_counter( 0 ), deliver_worker_id( 0 ),
|
||||
opacket_queues( workers ), num_working( workers ),
|
||||
num_workers( workers ), out_slots( slots ), slot_av( workers ),
|
||||
shared_retval( sh_ret )
|
||||
: ocheck_counter( 0 ), owait_counter( 0 ), deliver_id( 0 ),
|
||||
opacket_queues( workers ), num_working( workers ), num_workers( workers ),
|
||||
out_slots( slots ), slot_av( workers ), shared_retval( sh_ret )
|
||||
{
|
||||
xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
|
||||
for( unsigned i = 0; i < slot_av.size(); ++i ) xinit_cond( &slot_av[i] );
|
||||
|
@@ -89,7 +88,7 @@ public:
|
|||
if( shared_retval() ) // cleanup to avoid memory leaks
|
||||
for( int i = 0; i < num_workers; ++i )
|
||||
while( !opacket_queues[i].empty() )
|
||||
{ delete opacket_queues[i].front(); opacket_queues[i].pop(); }
|
||||
{ opacket_queues[i].front().delete_data(); opacket_queues[i].pop(); }
|
||||
for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy_cond( &slot_av[i] );
|
||||
xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
|
||||
}
|
||||
|
@@ -102,49 +101,47 @@ public:
|
|||
xunlock( &omutex );
|
||||
}
|
||||
|
||||
// collect a packet from a worker, discard packet on error
|
||||
void collect_packet( const Packet * const opacket, const int worker_id )
|
||||
// make a packet with data received from a worker, discard data on error
|
||||
void collect_packet( const int worker_id, uint8_t * const data,
|
||||
const int size, const bool eom )
|
||||
{
|
||||
Packet opacket( data, size, eom );
|
||||
xlock( &omutex );
|
||||
if( opacket->data )
|
||||
if( data )
|
||||
while( opacket_queues[worker_id].size() >= out_slots )
|
||||
{
|
||||
if( shared_retval() ) { delete opacket; goto done; }
|
||||
if( shared_retval() ) { delete[] data; goto out; }
|
||||
xwait( &slot_av[worker_id], &omutex );
|
||||
}
|
||||
opacket_queues[worker_id].push( opacket );
|
||||
if( worker_id == deliver_worker_id ) xsignal( &oav_or_exit );
|
||||
done:
|
||||
xunlock( &omutex );
|
||||
if( worker_id == deliver_id ) xsignal( &oav_or_exit );
|
||||
out: xunlock( &omutex );
|
||||
}
|
||||
|
||||
/* deliver a packet to muxer
|
||||
if packet->eom, move to next queue
|
||||
if packet data == 0, wait again */
|
||||
const Packet * deliver_packet()
|
||||
/* deliver packets to muxer
|
||||
if opacket.eom, move to next queue
|
||||
if opacket.data == 0, skip opacket */
|
||||
void deliver_packets( std::vector< Packet > & packet_vector )
|
||||
{
|
||||
const Packet * opacket = 0;
|
||||
packet_vector.clear();
|
||||
xlock( &omutex );
|
||||
++ocheck_counter;
|
||||
while( true )
|
||||
{
|
||||
while( opacket_queues[deliver_worker_id].empty() && num_working > 0 )
|
||||
do {
|
||||
while( opacket_queues[deliver_id].empty() && num_working > 0 )
|
||||
{ ++owait_counter; xwait( &oav_or_exit, &omutex ); }
|
||||
while( true )
|
||||
{
|
||||
++owait_counter;
|
||||
xwait( &oav_or_exit, &omutex );
|
||||
if( opacket_queues[deliver_id].empty() ) break;
|
||||
Packet opacket = opacket_queues[deliver_id].front();
|
||||
opacket_queues[deliver_id].pop();
|
||||
if( opacket_queues[deliver_id].size() + 1 == out_slots )
|
||||
xsignal( &slot_av[deliver_id] );
|
||||
if( opacket.eom && ++deliver_id >= num_workers ) deliver_id = 0;
|
||||
if( opacket.data ) packet_vector.push_back( opacket );
|
||||
}
|
||||
if( opacket_queues[deliver_worker_id].empty() ) break;
|
||||
opacket = opacket_queues[deliver_worker_id].front();
|
||||
opacket_queues[deliver_worker_id].pop();
|
||||
if( opacket_queues[deliver_worker_id].size() + 1 == out_slots )
|
||||
xsignal( &slot_av[deliver_worker_id] );
|
||||
if( opacket->eom && ++deliver_worker_id >= num_workers )
|
||||
deliver_worker_id = 0;
|
||||
if( opacket->data ) break;
|
||||
delete opacket; opacket = 0;
|
||||
}
|
||||
while( packet_vector.empty() && num_working > 0 );
|
||||
xunlock( &omutex );
|
||||
return opacket;
|
||||
}
|
||||
|
||||
bool finished() // all packets delivered to muxer
|
||||
|
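The change from `deliver_packet` to `deliver_packets` lets the muxer take every packet that is ready in one pass under the mutex instead of locking once per packet. The ordering rule is unchanged: packets are taken from the queue of worker `deliver_id`, and an end-of-member packet moves delivery on to the next worker's queue. The sketch below models only that ordering, with no locks or condition variables. The reduced `Packet` (an integer payload plus a `has_data` flag) and the free function signature are simplifications made for this illustration.

```cpp
#include <cassert>
#include <queue>
#include <vector>

struct Packet { int value; bool has_data; bool eom; };	// simplified stand-in

// Hypothetical lock-free sketch of the batching loop: move every packet that
// is ready, in member order, from the per-worker queues to 'packet_vector'.
void deliver_packets( std::vector< std::queue< Packet > > & opacket_queues,
                      int & deliver_id, std::vector< Packet > & packet_vector )
  {
  assert( !opacket_queues.empty() );
  const int num_workers = (int)opacket_queues.size();
  packet_vector.clear();
  while( !opacket_queues[deliver_id].empty() )
    {
    const Packet opacket = opacket_queues[deliver_id].front();
    opacket_queues[deliver_id].pop();
    // end of member: the next member comes from the next worker's queue
    if( opacket.eom && ++deliver_id >= num_workers ) deliver_id = 0;
    if( opacket.has_data ) packet_vector.push_back( opacket );	// skip empty
    }
  }
```

In the real function the outer `do ... while` loop additionally waits on `oav_or_exit` while the batch is empty and workers are still running, so an empty vector tells the muxer that all workers have exited.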
@@ -163,9 +160,14 @@ struct Worker_arg
|
|||
Packet_courier * courier;
|
||||
const Pretty_print * pp;
|
||||
Shared_retval * shared_retval;
|
||||
int worker_id;
|
||||
int num_workers;
|
||||
int infd;
|
||||
int num_workers;
|
||||
int worker_id;
|
||||
void assign( const Lzip_index & li, Packet_courier & co,
|
||||
const Pretty_print & pp_, Shared_retval & sr,
|
||||
const int ifd, const int nw, const int wi )
|
||||
{ lzip_index = &li; courier = &co; pp = &pp_; shared_retval = &sr;
|
||||
infd = ifd; num_workers = nw; worker_id = wi; }
|
||||
};
|
||||
|
||||
|
||||
|
@@ -179,9 +181,9 @@ extern "C" void * dworker_o( void * arg )
|
|||
Packet_courier & courier = *tmp.courier;
|
||||
const Pretty_print & pp = *tmp.pp;
|
||||
Shared_retval & shared_retval = *tmp.shared_retval;
|
||||
const int worker_id = tmp.worker_id;
|
||||
const int num_workers = tmp.num_workers;
|
||||
const int infd = tmp.infd;
|
||||
const int num_workers = tmp.num_workers;
|
||||
const int worker_id = tmp.worker_id;
|
||||
const int buffer_size = 65536;
|
||||
|
||||
int new_pos = 0;
|
||||
|
@@ -231,12 +233,11 @@ extern "C" void * dworker_o( void * arg )
|
|||
const bool eom = LZ_decompress_finished( decoder ) == 1;
|
||||
if( new_pos == max_packet_size || eom ) // make data packet
|
||||
{
|
||||
const Packet * const opacket =
|
||||
new Packet( ( new_pos > 0 ) ? new_data : 0, new_pos, eom );
|
||||
courier.collect_packet( opacket, worker_id );
|
||||
courier.collect_packet( worker_id, ( new_pos > 0 ) ? new_data : 0,
|
||||
new_pos, eom );
|
||||
if( new_pos > 0 ) { new_pos = 0; new_data = 0; }
|
||||
if( eom )
|
||||
{ LZ_decompress_reset( decoder ); // prepare for new member
|
||||
{ LZ_decompress_reset( decoder ); // prepare for next member
|
||||
break; }
|
||||
}
|
||||
if( rd == 0 ) break;
|
||||
|
@@ -262,23 +263,28 @@ done:
|
|||
void muxer( Packet_courier & courier, const Pretty_print & pp,
|
||||
Shared_retval & shared_retval, const int outfd )
|
||||
{
|
||||
std::vector< Packet > packet_vector;
|
||||
while( true )
|
||||
{
|
||||
const Packet * const opacket = courier.deliver_packet();
|
||||
if( !opacket ) break; // queue is empty. all workers exited
|
||||
courier.deliver_packets( packet_vector );
|
||||
if( packet_vector.empty() ) break; // queue is empty. all workers exited
|
||||
|
||||
if( shared_retval() == 0 &&
|
||||
writeblock( outfd, opacket->data, opacket->size ) != opacket->size &&
|
||||
shared_retval.set_value( 1 ) )
|
||||
{ pp(); show_error( "Write error", errno ); }
|
||||
delete opacket;
|
||||
for( unsigned i = 0; i < packet_vector.size(); ++i )
|
||||
{
|
||||
Packet & opacket = packet_vector[i];
|
||||
if( shared_retval() == 0 &&
|
||||
writeblock( outfd, opacket.data, opacket.size ) != opacket.size &&
|
||||
shared_retval.set_value( 1 ) )
|
||||
{ pp(); show_error( write_error_msg, errno ); }
|
||||
opacket.delete_data();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
} // end namespace
|
||||
|
||||
|
||||
// init the courier, then start the workers and call the muxer.
|
||||
// init the courier, then start the workers and call the muxer
|
||||
int dec_stdout( const int num_workers, const int infd, const int outfd,
|
||||
const Pretty_print & pp, const int debug_level,
|
||||
const int out_slots, const Lzip_index & lzip_index )
|
||||
|
@ -294,13 +300,8 @@ int dec_stdout( const int num_workers, const int infd, const int outfd,
|
|||
int i = 0; // number of workers started
|
||||
for( ; i < num_workers; ++i )
|
||||
{
|
||||
worker_args[i].lzip_index = &lzip_index;
|
||||
worker_args[i].courier = &courier;
|
||||
worker_args[i].pp = &pp;
|
||||
worker_args[i].shared_retval = &shared_retval;
|
||||
worker_args[i].worker_id = i;
|
||||
worker_args[i].num_workers = num_workers;
|
||||
worker_args[i].infd = infd;
|
||||
worker_args[i].assign( lzip_index, courier, pp, shared_retval, infd,
|
||||
num_workers, i );
|
||||
const int errcode =
|
||||
pthread_create( &worker_threads[i], 0, dworker_o, &worker_args[i] );
|
||||
if( errcode )
|
||||
|
|
191
dec_stream.cc
|
@ -54,10 +54,10 @@ struct Packet // data block
|
|||
uint8_t * data; // data may be null if size == 0
|
||||
int size; // number of bytes in data (if any)
|
||||
bool eom; // end of member
|
||||
Packet() : data( 0 ), size( 0 ), eom( true ) {}
|
||||
Packet() : data( 0 ), size( 0 ), eom( false ) {}
|
||||
Packet( uint8_t * const d, const int s, const bool e )
|
||||
: data( d ), size( s ), eom ( e ) {}
|
||||
~Packet() { if( data ) delete[] data; }
|
||||
void delete_data() { if( data ) { delete[] data; data = 0; } }
|
||||
};
|
||||
|
||||
|
||||
|
@ -69,11 +69,11 @@ public:
|
|||
unsigned ocheck_counter;
|
||||
unsigned owait_counter;
|
||||
private:
|
||||
int receive_worker_id; // worker queue currently receiving packets
|
||||
int deliver_worker_id; // worker queue currently delivering packets
|
||||
int receive_id; // worker queue currently receiving packets
|
||||
int deliver_id; // worker queue currently delivering packets
|
||||
Slot_tally slot_tally; // limits the number of input packets
|
||||
std::vector< std::queue< const Packet * > > ipacket_queues;
|
||||
std::vector< std::queue< const Packet * > > opacket_queues;
|
||||
std::vector< std::queue< Packet > > ipacket_queues;
|
||||
std::vector< std::queue< Packet > > opacket_queues;
|
||||
int num_working; // number of workers still running
|
||||
const int num_workers; // number of workers
|
||||
const unsigned out_slots; // max output packets per queue
|
||||
|
@ -94,11 +94,11 @@ public:
|
|||
const int in_slots, const int oslots )
|
||||
: icheck_counter( 0 ), iwait_counter( 0 ),
|
||||
ocheck_counter( 0 ), owait_counter( 0 ),
|
||||
receive_worker_id( 0 ), deliver_worker_id( 0 ),
|
||||
slot_tally( in_slots ), ipacket_queues( workers ),
|
||||
opacket_queues( workers ), num_working( workers ),
|
||||
num_workers( workers ), out_slots( oslots ), slot_av( workers ),
|
||||
shared_retval( sh_ret ), eof( false ), trailing_data_found_( false )
|
||||
receive_id( 0 ), deliver_id( 0 ), slot_tally( in_slots ),
|
||||
ipacket_queues( workers ), opacket_queues( workers ),
|
||||
num_working( workers ), num_workers( workers ),
|
||||
out_slots( oslots ), slot_av( workers ), shared_retval( sh_ret ),
|
||||
eof( false ), trailing_data_found_( false )
|
||||
{
|
||||
xinit_mutex( &imutex ); xinit_cond( &iav_or_eof );
|
||||
xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
|
||||
|
@ -111,9 +111,9 @@ public:
|
|||
for( int i = 0; i < num_workers; ++i )
|
||||
{
|
||||
while( !ipacket_queues[i].empty() )
|
||||
{ delete ipacket_queues[i].front(); ipacket_queues[i].pop(); }
|
||||
{ ipacket_queues[i].front().delete_data(); ipacket_queues[i].pop(); }
|
||||
while( !opacket_queues[i].empty() )
|
||||
{ delete opacket_queues[i].front(); opacket_queues[i].pop(); }
|
||||
{ opacket_queues[i].front().delete_data(); opacket_queues[i].pop(); }
|
||||
}
|
||||
for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy_cond( &slot_av[i] );
|
||||
xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
|
||||
|
@ -125,19 +125,18 @@ public:
|
|||
void receive_packet( uint8_t * const data, const int size, const bool eom )
|
||||
{
|
||||
if( shared_retval() ) { delete[] data; return; } // discard packet on error
|
||||
const Packet * const ipacket = new Packet( data, size, eom );
|
||||
const Packet ipacket( data, size, eom );
|
||||
slot_tally.get_slot(); // wait for a free slot
|
||||
xlock( &imutex );
|
||||
ipacket_queues[receive_worker_id].push( ipacket );
|
||||
ipacket_queues[receive_id].push( ipacket );
|
||||
xbroadcast( &iav_or_eof );
|
||||
xunlock( &imutex );
|
||||
if( eom && ++receive_worker_id >= num_workers ) receive_worker_id = 0;
|
||||
if( eom && ++receive_id >= num_workers ) receive_id = 0;
|
||||
}
|
||||
|
||||
// distribute a packet to a worker
|
||||
const Packet * distribute_packet( const int worker_id )
|
||||
Packet distribute_packet( const int worker_id )
|
||||
{
|
||||
const Packet * ipacket = 0;
|
||||
xlock( &imutex );
|
||||
++icheck_counter;
|
||||
while( ipacket_queues[worker_id].empty() && !eof )
|
||||
|
@ -147,63 +146,58 @@ public:
|
|||
}
|
||||
if( !ipacket_queues[worker_id].empty() )
|
||||
{
|
||||
ipacket = ipacket_queues[worker_id].front();
|
||||
const Packet ipacket = ipacket_queues[worker_id].front();
|
||||
ipacket_queues[worker_id].pop();
|
||||
xunlock( &imutex ); slot_tally.leave_slot(); return ipacket;
|
||||
}
|
||||
xunlock( &imutex );
|
||||
if( ipacket ) slot_tally.leave_slot();
|
||||
else // no more packets
|
||||
{
|
||||
xlock( &omutex ); // notify muxer when last worker exits
|
||||
if( --num_working == 0 ) xsignal( &oav_or_exit );
|
||||
xunlock( &omutex );
|
||||
}
|
||||
return ipacket;
|
||||
xunlock( &imutex ); // no more packets
|
||||
xlock( &omutex ); // notify muxer when last worker exits
|
||||
if( --num_working == 0 ) xsignal( &oav_or_exit );
|
||||
xunlock( &omutex );
|
||||
return Packet();
|
||||
}
|
||||
|
||||
// collect a packet from a worker, discard packet on error
|
||||
void collect_packet( const Packet * const opacket, const int worker_id )
|
||||
// make a packet with data received from a worker, discard data on error
|
||||
void collect_packet( const int worker_id, uint8_t * const data,
|
||||
const int size, const bool eom )
|
||||
{
|
||||
Packet opacket( data, size, eom );
|
||||
xlock( &omutex );
|
||||
if( opacket->data )
|
||||
if( data )
|
||||
while( opacket_queues[worker_id].size() >= out_slots )
|
||||
{
|
||||
if( shared_retval() ) { delete opacket; goto done; }
|
||||
if( shared_retval() ) { delete[] data; goto out; }
|
||||
xwait( &slot_av[worker_id], &omutex );
|
||||
}
|
||||
opacket_queues[worker_id].push( opacket );
|
||||
if( worker_id == deliver_worker_id ) xsignal( &oav_or_exit );
|
||||
done:
|
||||
xunlock( &omutex );
|
||||
if( worker_id == deliver_id ) xsignal( &oav_or_exit );
|
||||
out: xunlock( &omutex );
|
||||
}
|
||||
|
||||
/* deliver a packet to muxer
|
||||
if packet->eom, move to next queue
|
||||
if packet data == 0, wait again */
|
||||
const Packet * deliver_packet()
|
||||
/* deliver packets to muxer
|
||||
if opacket.eom, move to next queue
|
||||
if opacket.data == 0, skip opacket */
|
||||
void deliver_packets( std::vector< Packet > & packet_vector )
|
||||
{
|
||||
const Packet * opacket = 0;
|
||||
packet_vector.clear();
|
||||
xlock( &omutex );
|
||||
++ocheck_counter;
|
||||
while( true )
|
||||
{
|
||||
while( opacket_queues[deliver_worker_id].empty() && num_working > 0 )
|
||||
do {
|
||||
while( opacket_queues[deliver_id].empty() && num_working > 0 )
|
||||
{ ++owait_counter; xwait( &oav_or_exit, &omutex ); }
|
||||
while( true )
|
||||
{
|
||||
++owait_counter;
|
||||
xwait( &oav_or_exit, &omutex );
|
||||
if( opacket_queues[deliver_id].empty() ) break;
|
||||
Packet opacket = opacket_queues[deliver_id].front();
|
||||
opacket_queues[deliver_id].pop();
|
||||
if( opacket_queues[deliver_id].size() + 1 == out_slots )
|
||||
xsignal( &slot_av[deliver_id] );
|
||||
if( opacket.eom && ++deliver_id >= num_workers ) deliver_id = 0;
|
||||
if( opacket.data ) packet_vector.push_back( opacket );
|
||||
}
|
||||
if( opacket_queues[deliver_worker_id].empty() ) break;
|
||||
opacket = opacket_queues[deliver_worker_id].front();
|
||||
opacket_queues[deliver_worker_id].pop();
|
||||
if( opacket_queues[deliver_worker_id].size() + 1 == out_slots )
|
||||
xsignal( &slot_av[deliver_worker_id] );
|
||||
if( opacket->eom && ++deliver_worker_id >= num_workers )
|
||||
deliver_worker_id = 0;
|
||||
if( opacket->data ) break;
|
||||
delete opacket; opacket = 0;
|
||||
}
|
||||
while( packet_vector.empty() && num_working > 0 );
|
||||
xunlock( &omutex );
|
||||
return opacket;
|
||||
}
|
||||
|
||||
void add_sizes( const unsigned long long partial_in_size,
|
||||
|
@ -252,17 +246,29 @@ struct Worker_arg
|
|||
bool loose_trailing;
|
||||
bool testing;
|
||||
bool nocopy; // avoid copying decompressed data when testing
|
||||
void assign( Packet_courier & co, const Pretty_print & pp_,
|
||||
Shared_retval & sr, const bool it, const bool lt,
|
||||
const bool t, const bool nc )
|
||||
{ courier = &co; pp = &pp_; shared_retval = &sr; worker_id = 0;
|
||||
ignore_trailing = it; loose_trailing = lt; testing = t; nocopy = nc; }
|
||||
};
|
||||
|
||||
struct Splitter_arg
|
||||
{
|
||||
struct Worker_arg worker_arg;
|
||||
Worker_arg * worker_args;
|
||||
pthread_t * worker_threads;
|
||||
unsigned long long cfile_size;
|
||||
int infd;
|
||||
Worker_arg worker_arg;
|
||||
Worker_arg * const worker_args;
|
||||
pthread_t * const worker_threads;
|
||||
const unsigned long long cfile_size;
|
||||
const int infd;
|
||||
unsigned dictionary_size; // returned by splitter to main thread
|
||||
int num_workers; // returned by splitter to main thread
|
||||
Splitter_arg( Packet_courier & co, const Pretty_print & pp_,
|
||||
Shared_retval & sr, const bool it, const bool lt,
|
||||
const bool t, const bool nc, Worker_arg * wa, pthread_t * wt,
|
||||
const unsigned long long cfs, const int ifd, const int nw )
|
||||
: worker_args( wa ), worker_threads( wt ), cfile_size( cfs ),
|
||||
infd( ifd ), dictionary_size( 0 ), num_workers( nw )
|
||||
{ worker_arg.assign( co, pp_, sr, it, lt, t, nc ); }
|
||||
};
|
||||
|
||||
|
||||
|
@ -291,22 +297,22 @@ extern "C" void * dworker_s( void * arg )
|
|||
|
||||
while( true )
|
||||
{
|
||||
const Packet * const ipacket = courier.distribute_packet( worker_id );
|
||||
if( !ipacket ) break; // no more packets to process
|
||||
Packet ipacket = courier.distribute_packet( worker_id );
|
||||
if( !ipacket.data ) break; // no more packets to process
|
||||
|
||||
int written = 0;
|
||||
while( !draining ) // else discard trailing data or drain queue
|
||||
{
|
||||
if( LZ_decompress_write_size( decoder ) > 0 && written < ipacket->size )
|
||||
if( LZ_decompress_write_size( decoder ) > 0 && written < ipacket.size )
|
||||
{
|
||||
const int wr = LZ_decompress_write( decoder, ipacket->data + written,
|
||||
ipacket->size - written );
|
||||
const int wr = LZ_decompress_write( decoder, ipacket.data + written,
|
||||
ipacket.size - written );
|
||||
if( wr < 0 ) internal_error( "library error (LZ_decompress_write)." );
|
||||
written += wr;
|
||||
if( written > ipacket->size )
|
||||
if( written > ipacket.size )
|
||||
internal_error( "ipacket size exceeded in worker." );
|
||||
}
|
||||
if( ipacket->eom && written == ipacket->size )
|
||||
if( ipacket.eom && written == ipacket.size )
|
||||
LZ_decompress_finish( decoder );
|
||||
unsigned long long total_in = 0; // detect empty member + corrupt header
|
||||
while( !draining ) // read and pack decompressed data
|
||||
|
@ -353,14 +359,13 @@ extern "C" void * dworker_s( void * arg )
|
|||
{
|
||||
if( !testing ) // make data packet
|
||||
{
|
||||
const Packet * const opacket =
|
||||
new Packet( ( new_pos > 0 ) ? new_data : 0, new_pos, eom );
|
||||
courier.collect_packet( opacket, worker_id );
|
||||
courier.collect_packet( worker_id, ( new_pos > 0 ) ? new_data : 0,
|
||||
new_pos, eom );
|
||||
if( new_pos > 0 ) new_data = 0;
|
||||
}
|
||||
new_pos = 0;
|
||||
if( eom )
|
||||
{ LZ_decompress_reset( decoder ); // prepare for new member
|
||||
{ LZ_decompress_reset( decoder ); // prepare for next member
|
||||
break; }
|
||||
}
|
||||
if( rd == 0 )
|
||||
|
@ -369,9 +374,9 @@ extern "C" void * dworker_s( void * arg )
|
|||
if( total_in == size ) break; else total_in = size;
|
||||
}
|
||||
}
|
||||
if( !ipacket->data || written == ipacket->size ) break;
|
||||
if( !ipacket.data || written == ipacket.size ) break;
|
||||
}
|
||||
delete ipacket;
|
||||
ipacket.delete_data();
|
||||
}
|
||||
|
||||
if( new_data ) delete[] new_data;
|
||||
|
@ -404,7 +409,7 @@ bool start_worker( const Worker_arg & worker_arg,
|
|||
packaging and distribution to workers.
|
||||
Start a worker per member up to a maximum of num_workers.
|
||||
*/
|
||||
extern "C" void * dsplitter_s( void * arg )
|
||||
extern "C" void * dsplitter( void * arg )
|
||||
{
|
||||
Splitter_arg & tmp = *(Splitter_arg *)arg;
|
||||
const Worker_arg & worker_arg = tmp.worker_arg;
|
||||
|
@ -546,16 +551,21 @@ fail:
|
|||
void muxer( Packet_courier & courier, const Pretty_print & pp,
|
||||
Shared_retval & shared_retval, const int outfd )
|
||||
{
|
||||
std::vector< Packet > packet_vector;
|
||||
while( true )
|
||||
{
|
||||
const Packet * const opacket = courier.deliver_packet();
|
||||
if( !opacket ) break; // queue is empty. all workers exited
|
||||
courier.deliver_packets( packet_vector );
|
||||
if( packet_vector.empty() ) break; // queue is empty. all workers exited
|
||||
|
||||
if( shared_retval() == 0 &&
|
||||
writeblock( outfd, opacket->data, opacket->size ) != opacket->size &&
|
||||
shared_retval.set_value( 1 ) )
|
||||
{ pp(); show_error( "Write error", errno ); }
|
||||
delete opacket;
|
||||
for( unsigned i = 0; i < packet_vector.size(); ++i )
|
||||
{
|
||||
Packet & opacket = packet_vector[i];
|
||||
if( shared_retval() == 0 &&
|
||||
writeblock( outfd, opacket.data, opacket.size ) != opacket.size &&
|
||||
shared_retval.set_value( 1 ) )
|
||||
{ pp(); show_error( write_error_msg, errno ); }
|
||||
opacket.delete_data();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -590,23 +600,12 @@ int dec_stream( const unsigned long long cfile_size, const int num_workers,
|
|||
const bool nocopy = false;
|
||||
#endif
|
||||
|
||||
Splitter_arg splitter_arg;
|
||||
splitter_arg.worker_arg.courier = &courier;
|
||||
splitter_arg.worker_arg.pp = &pp;
|
||||
splitter_arg.worker_arg.shared_retval = &shared_retval;
|
||||
splitter_arg.worker_arg.worker_id = 0;
|
||||
splitter_arg.worker_arg.ignore_trailing = cl_opts.ignore_trailing;
|
||||
splitter_arg.worker_arg.loose_trailing = cl_opts.loose_trailing;
|
||||
splitter_arg.worker_arg.testing = ( outfd < 0 );
|
||||
splitter_arg.worker_arg.nocopy = nocopy;
|
||||
splitter_arg.worker_args = worker_args;
|
||||
splitter_arg.worker_threads = worker_threads;
|
||||
splitter_arg.cfile_size = cfile_size;
|
||||
splitter_arg.infd = infd;
|
||||
splitter_arg.num_workers = num_workers;
|
||||
Splitter_arg splitter_arg( courier, pp, shared_retval,
|
||||
cl_opts.ignore_trailing, cl_opts.loose_trailing, outfd < 0, nocopy,
|
||||
worker_args, worker_threads, cfile_size, infd, num_workers );
|
||||
|
||||
pthread_t splitter_thread;
|
||||
int errcode = pthread_create( &splitter_thread, 0, dsplitter_s, &splitter_arg );
|
||||
int errcode = pthread_create( &splitter_thread, 0, dsplitter, &splitter_arg );
|
||||
if( errcode )
|
||||
{ show_error( "Can't create splitter thread", errcode );
|
||||
delete[] worker_threads; delete[] worker_args; return 1; }
|
||||
|
|
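The dec_stream.cc changes above replace heap-allocated `const Packet *` queue entries with `Packet` values, drop the destructor in favour of an explicit `delete_data()`, and hand packets to the muxer in batches through a `std::vector< Packet >`. The following is a minimal single-threaded sketch of that ownership pattern, not the plzip implementation: the mutexes, condition variables, and slot accounting are omitted, and the helpers `drain_queue`, `consume`, and `demo` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <queue>
#include <vector>

// Simplified stand-in for the Packet in the diff. It has no destructor, so
// the copies made by std::queue and std::vector never free the buffer.
struct Packet
  {
  uint8_t * data;	// may be null if size == 0
  int size;
  bool eom;		// end of member
  Packet() : data( 0 ), size( 0 ), eom( false ) {}
  Packet( uint8_t * const d, const int s, const bool e )
    : data( d ), size( s ), eom( e ) {}
  void delete_data() { if( data ) { delete[] data; data = 0; } }
  };

// Hypothetical helper: copy every queued packet that carries data into
// packet_vector, skipping empty packets (assumed here to mark end of member).
void drain_queue( std::queue< Packet > & q,
                  std::vector< Packet > & packet_vector )
  {
  packet_vector.clear();
  while( !q.empty() )
    {
    const Packet p = q.front(); q.pop();
    if( p.data ) packet_vector.push_back( p );
    }
  }

// Hypothetical consumer: add up the payload sizes and free each buffer
// exactly once through the explicit delete_data call.
int consume( std::vector< Packet > & packet_vector )
  {
  int total = 0;
  for( unsigned i = 0; i < packet_vector.size(); ++i )
    {
    Packet & opacket = packet_vector[i];
    assert( opacket.data != 0 );	// empty packets were skipped
    total += opacket.size;
    opacket.delete_data();
    }
  return total;
  }

// Hypothetical demo: two data packets and one empty end-of-member packet.
int demo()
  {
  std::queue< Packet > q;
  uint8_t * const a = new uint8_t[4]; std::memset( a, 0, 4 );
  uint8_t * const b = new uint8_t[6]; std::memset( b, 0, 6 );
  q.push( Packet( a, 4, false ) );
  q.push( Packet( 0, 0, true ) );
  q.push( Packet( b, 6, true ) );
  std::vector< Packet > packet_vector;
  drain_queue( q, packet_vector );
  return consume( packet_vector );
  }
```

Because a `Packet` is now copied freely, a destructor that called `delete[]` would free the same buffer once per copy. Keeping the release explicit is presumably why the diff removes `~Packet()` and adds `delete_data()`.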
|
@ -115,7 +115,7 @@ int pwriteblock( const int fd, const uint8_t * const buf, const int size,
|
|||
}
|
||||
|
||||
|
||||
void decompress_error( struct LZ_Decoder * const decoder,
|
||||
void decompress_error( LZ_Decoder * const decoder,
|
||||
const Pretty_print & pp,
|
||||
Shared_retval & shared_retval, const int worker_id )
|
||||
{
|
||||
|
@ -158,11 +158,16 @@ struct Worker_arg
|
|||
const Lzip_index * lzip_index;
|
||||
const Pretty_print * pp;
|
||||
Shared_retval * shared_retval;
|
||||
int worker_id;
|
||||
int num_workers;
|
||||
int infd;
|
||||
int num_workers;
|
||||
int outfd;
|
||||
int worker_id;
|
||||
bool nocopy; // avoid copying decompressed data when testing
|
||||
void assign( const Lzip_index & li, const Pretty_print & pp_,
|
||||
Shared_retval & sr, const int ifd, const int nw,
|
||||
const int ofd, const int wi, const bool nc )
|
||||
{ lzip_index = &li; pp = &pp_; shared_retval = &sr; infd = ifd;
|
||||
num_workers = nw; outfd = ofd; worker_id = wi; nocopy = nc; }
|
||||
};
|
||||
|
||||
|
||||
|
@ -243,7 +248,7 @@ extern "C" void * dworker( void * arg )
|
|||
{
|
||||
if( data_rest != 0 )
|
||||
internal_error( "final data_rest is not zero." );
|
||||
LZ_decompress_reset( decoder ); // prepare for new member
|
||||
LZ_decompress_reset( decoder ); // prepare for next member
|
||||
break;
|
||||
}
|
||||
if( rd == 0 ) break;
|
||||
|
@ -264,11 +269,11 @@ done:
|
|||
} // end namespace
|
||||
|
||||
|
||||
// start the workers and wait for them to finish.
|
||||
// start the workers and wait for them to finish
|
||||
int decompress( const unsigned long long cfile_size, int num_workers,
|
||||
const int infd, const int outfd, const Cl_options & cl_opts,
|
||||
const Pretty_print & pp, const int debug_level,
|
||||
const int in_slots, const int out_slots,
|
||||
const int in_slots, const int out_slots, const bool from_stdin,
|
||||
const bool infd_isreg, const bool one_to_one )
|
||||
{
|
||||
if( !infd_isreg )
|
||||
|
@ -284,11 +289,11 @@ int decompress( const unsigned long long cfile_size, int num_workers,
|
|||
}
|
||||
if( lzip_index.retval() != 0 ) // corrupt or invalid input file
|
||||
{
|
||||
if( lzip_index.bad_magic() )
|
||||
show_file_error( pp.name(), lzip_index.error().c_str() );
|
||||
else pp( lzip_index.error().c_str() );
|
||||
if( lzip_index.good_magic() ) pp( lzip_index.error().c_str() );
|
||||
else show_file_error( pp.name(), lzip_index.error().c_str() );
|
||||
return lzip_index.retval();
|
||||
}
|
||||
const bool multi_empty = !from_stdin && lzip_index.multi_empty();
|
||||
|
||||
if( num_workers > lzip_index.members() ) num_workers = lzip_index.members();
|
||||
|
||||
|
@ -301,8 +306,11 @@ int decompress( const unsigned long long cfile_size, int num_workers,
|
|||
if( debug_level & 2 ) std::fputs( "decompress file to stdout.\n", stderr );
|
||||
if( verbosity >= 1 ) pp();
|
||||
show_progress( 0, cfile_size, &pp ); // init
|
||||
return dec_stdout( num_workers, infd, outfd, pp, debug_level, out_slots,
|
||||
lzip_index );
|
||||
const int tmp = dec_stdout( num_workers, infd, outfd, pp, debug_level,
|
||||
out_slots, lzip_index );
|
||||
if( tmp ) return tmp;
|
||||
if( multi_empty ) { show_file_error( pp.name(), empty_msg ); return 2; }
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -325,14 +333,8 @@ int decompress( const unsigned long long cfile_size, int num_workers,
|
|||
int i = 0; // number of workers started
|
||||
for( ; i < num_workers; ++i )
|
||||
{
|
||||
worker_args[i].lzip_index = &lzip_index;
|
||||
worker_args[i].pp = &pp;
|
||||
worker_args[i].shared_retval = &shared_retval;
|
||||
worker_args[i].worker_id = i;
|
||||
worker_args[i].num_workers = num_workers;
|
||||
worker_args[i].infd = infd;
|
||||
worker_args[i].outfd = outfd;
|
||||
worker_args[i].nocopy = nocopy;
|
||||
worker_args[i].assign( lzip_index, pp, shared_retval, infd, num_workers,
|
||||
outfd, i, nocopy );
|
||||
const int errcode =
|
||||
pthread_create( &worker_threads[i], 0, dworker, &worker_args[i] );
|
||||
if( errcode )
|
||||
|
@ -359,5 +361,6 @@ int decompress( const unsigned long long cfile_size, int num_workers,
|
|||
std::fprintf( stderr,
|
||||
"workers started %8u\n", num_workers );
|
||||
|
||||
if( multi_empty ) { show_file_error( pp.name(), empty_msg ); return 2; }
|
||||
return 0;
|
||||
}
|
||||
|
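The decompress.cc hunks above implement the ChangeLog entry "Return 2 if any empty member is found in a multimember file", except when the file is read from standard input. The sketch below isolates that decision as a small function, assuming the worker result and the index flag are already known. The name `final_status` and its parameters are hypothetical and do not appear in plzip.

```cpp
#include <cassert>

// Hypothetical condensation of the exit-status logic shown in the diff:
// an earlier error wins, otherwise an empty member in a multimember file
// yields 2 unless the input came from standard input.
int final_status( const int worker_retval, const bool from_stdin,
                  const bool index_multi_empty )
  {
  assert( worker_retval >= 0 );		// assumption: statuses are not negative
  const bool multi_empty = !from_stdin && index_multi_empty;
  if( worker_retval ) return worker_retval;
  if( multi_empty ) return 2;
  return 0;
  }
```

For example, under these assumptions `final_status( 0, false, true )` yields 2, while the same file read from standard input, `final_status( 0, true, true )`, yields 0.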
|
35
doc/plzip.1
|
@ -1,32 +1,33 @@
|
|||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2.
|
||||
.TH PLZIP "1" "January 2024" "plzip 1.11" "User Commands"
|
||||
.TH PLZIP "1" "November 2024" "plzip 1.12-rc1" "User Commands"
|
||||
.SH NAME
|
||||
plzip \- reduces the size of files
|
||||
.SH SYNOPSIS
|
||||
.B plzip
|
||||
[\fI\,options\/\fR] [\fI\,files\/\fR]
|
||||
.SH DESCRIPTION
|
||||
Plzip is a massively parallel (multi\-threaded) implementation of lzip,
|
||||
compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
|
||||
Plzip is a massively parallel (multi\-threaded) implementation of lzip. Plzip
|
||||
uses the compression library lzlib.
|
||||
.PP
|
||||
Lzip is a lossless data compressor with a user interface similar to the one
|
||||
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel\-Ziv\-Markov
|
||||
chain\-Algorithm' (LZMA) stream format to maximize interoperability. The
|
||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
||||
on 32\-bit machines. Lzip provides accurate and robust 3\-factor integrity
|
||||
checking. Lzip can compress about as fast as gzip (lzip \fB\-0\fR) or compress most
|
||||
files more than bzip2 (lzip \fB\-9\fR). Decompression speed is intermediate between
|
||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
||||
perspective. Lzip has been designed, written, and tested with great care to
|
||||
replace gzip and bzip2 as the standard general\-purpose compressed format for
|
||||
Unix\-like systems.
|
||||
of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel\-Ziv\-Markov
|
||||
chain\-Algorithm) designed to achieve complete interoperability between
|
||||
implementations. The maximum dictionary size is 512 MiB so that any lzip
|
||||
file can be decompressed on 32\-bit machines. Lzip provides accurate and
|
||||
robust 3\-factor integrity checking. 'lzip \fB\-0\fR' compresses about as fast as
|
||||
gzip, while 'lzip \fB\-9\fR' compresses most files more than bzip2. Decompression
|
||||
speed is intermediate between gzip and bzip2. Lzip provides better data
|
||||
recovery capabilities than gzip and bzip2. Lzip has been designed, written,
|
||||
and tested with great care to replace gzip and bzip2 as general\-purpose
|
||||
compressed format for Unix\-like systems.
|
||||
.PP
|
||||
Plzip can compress/decompress large files on multiprocessor machines much
|
||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||
to 2 percent larger compressed files). Note that the number of usable
|
||||
threads is limited by file size; on files larger than a few GB plzip can use
|
||||
hundreds of processors, but on files of only a few MB plzip is no faster
|
||||
than lzip.
|
||||
hundreds of processors, but on files smaller than 1 MiB plzip is no faster
|
||||
than lzip (even at compression level \fB\-0\fR).
|
||||
The number of threads defaults to the number of processors.
|
||||
.SH OPTIONS
|
||||
.TP
|
||||
\fB\-h\fR, \fB\-\-help\fR
|
||||
|
@ -132,8 +133,8 @@ License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
|
|||
.br
|
||||
This is free software: you are free to change and redistribute it.
|
||||
There is NO WARRANTY, to the extent permitted by law.
|
||||
Using lzlib 1.14
|
||||
Using LZ_API_VERSION = 1014
|
||||
Using lzlib 1.15\-rc1
|
||||
Using LZ_API_VERSION = 1015
|
||||
.SH "SEE ALSO"
|
||||
The full documentation for
|
||||
.B plzip
|
||||
|
|
347
doc/plzip.info
|
@ -11,21 +11,22 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
|
|||
Plzip Manual
|
||||
************
|
||||
|
||||
This manual is for Plzip (version 1.11, 21 January 2024).
|
||||
This manual is for Plzip (version 1.12-rc1, 19 November 2024).
|
||||
|
||||
* Menu:
|
||||
|
||||
* Introduction:: Purpose and features of plzip
|
||||
* Output:: Meaning of plzip's output
|
||||
* Invoking plzip:: Command-line interface
|
||||
* Program design:: Internal structure of plzip
|
||||
* Memory requirements:: Memory required to compress and decompress
|
||||
* Minimum file sizes:: Minimum file sizes required for full speed
|
||||
* File format:: Detailed format of the compressed file
|
||||
* Trailing data:: Extra data appended to the file
|
||||
* Examples:: A small tutorial with examples
|
||||
* Problems:: Reporting bugs
|
||||
* Concept index:: Index of concepts
|
||||
* Introduction:: Purpose and features of plzip
|
||||
* Output:: Meaning of plzip's output
|
||||
* Invoking plzip:: Command-line interface
|
||||
* Argument syntax:: By convention, options start with a hyphen
|
||||
* File format:: Detailed format of the compressed file
|
||||
* Program design:: Internal structure of plzip
|
||||
* Memory requirements:: Memory required to compress and decompress
|
||||
* Minimum file sizes:: Minimum file sizes required for full speed
|
||||
* Trailing data:: Extra data appended to the file
|
||||
* Examples:: A small tutorial with examples
|
||||
* Problems:: Reporting bugs
|
||||
* Concept index:: Index of concepts
|
||||
|
||||
|
||||
Copyright (C) 2009-2024 Antonio Diaz Diaz.
|
||||
|
@ -39,27 +40,27 @@ File: plzip.info, Node: Introduction, Next: Output, Prev: Top, Up: Top
|
|||
1 Introduction
|
||||
**************
|
||||
|
||||
Plzip is a massively parallel (multi-threaded) implementation of lzip,
|
||||
compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
|
||||
Plzip is a massively parallel (multi-threaded) implementation of lzip.
|
||||
Plzip uses the compression library lzlib.
|
||||
|
||||
Lzip is a lossless data compressor with a user interface similar to the
|
||||
one of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
|
||||
chain-Algorithm' (LZMA) stream format to maximize interoperability. The
|
||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
||||
on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
|
||||
checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
|
||||
files more than bzip2 (lzip -9). Decompression speed is intermediate between
|
||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
||||
perspective. Lzip has been designed, written, and tested with great care to
|
||||
replace gzip and bzip2 as the standard general-purpose compressed format for
|
||||
Unix-like systems.
|
||||
one of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov
|
||||
chain-Algorithm) designed to achieve complete interoperability between
|
||||
implementations. The maximum dictionary size is 512 MiB so that any lzip
|
||||
file can be decompressed on 32-bit machines. Lzip provides accurate and
|
||||
robust 3-factor integrity checking. 'lzip -0' compresses about as fast as
|
||||
gzip, while 'lzip -9' compresses most files more than bzip2. Decompression
|
||||
speed is intermediate between gzip and bzip2. Lzip provides better data
|
||||
recovery capabilities than gzip and bzip2. Lzip has been designed, written,
|
||||
and tested with great care to replace gzip and bzip2 as general-purpose
|
||||
compressed format for Unix-like systems.
|
||||
|
||||
Plzip can compress/decompress large files on multiprocessor machines much
|
||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||
to 2 percent larger compressed files). Note that the number of usable
|
||||
threads is limited by file size; on files larger than a few GB plzip can use
|
||||
hundreds of processors, but on files of only a few MB plzip is no faster
|
||||
than lzip. *Note Minimum file sizes::.
|
||||
hundreds of processors, but on files smaller than 1 MiB plzip is no faster
|
||||
than lzip (even at compression level -0). *Note Minimum file sizes::.
|
||||
|
||||
For creation and manipulation of compressed tar archives tarlz can be
|
||||
more efficient than using tar and plzip because tarlz is able to keep the
|
||||
|
@ -96,9 +97,9 @@ makes it safer than compressors returning ambiguous warning values (like
|
|||
gzip) when it is used as a back end for other programs like tar or zutils.
|
||||
|
||||
Plzip automatically uses for each file the largest dictionary size that
|
||||
does not exceed neither the file size nor the limit given. Keep in mind
|
||||
that the decompression memory requirement is affected at compression time
|
||||
by the choice of dictionary size limit. *Note Memory requirements::.
|
||||
does not exceed neither the file size nor the limit given. The dictionary
|
||||
size used for decompression is the same dictionary size used for
|
||||
compression. *Note Memory requirements::.
|
||||
|
||||
When compressing, plzip replaces every file given in the command line
|
||||
with a compressed version of itself, with the name "original_name.lz". When
|
||||
|
@ -174,7 +175,7 @@ have been compressed. Decompressed is used to refer to data which have
|
|||
undergone the process of decompression.
|
||||
|
||||
|
||||
File: plzip.info, Node: Invoking plzip, Next: Program design, Prev: Output, Up: Top
|
||||
File: plzip.info, Node: Invoking plzip, Next: Argument syntax, Prev: Output, Up: Top
|
||||
|
||||
3 Invoking plzip
|
||||
****************
|
||||
|
@ -189,8 +190,7 @@ means standard input. It can be mixed with other FILES and is read just
|
|||
once, the first time it appears in the command line. Remember to prepend
|
||||
'./' to any file name beginning with a hyphen, or use '--'.
|
||||
|
||||
plzip supports the following options: *Note Argument syntax:
|
||||
(arg_parser)Argument syntax.
|
||||
plzip supports the following options: *Note Argument syntax::.
|
||||
|
||||
'-h'
|
||||
'--help'
|
||||
|
@ -235,7 +235,8 @@ once, the first time it appears in the command line. Remember to prepend
|
|||
status 1. If a file fails to decompress, or is a terminal, plzip exits
|
||||
immediately with error status 2 without decompressing the rest of the
|
||||
files. A terminal is considered an uncompressed file, and therefore
|
||||
invalid.
|
||||
invalid. A multimember file with one or more empty members is accepted
|
||||
if redirected to standard input.
|
||||
|
||||
'-f'
|
||||
'--force'
|
||||
|
@ -259,7 +260,8 @@ once, the first time it appears in the command line. Remember to prepend
|
|||
'-v', the dictionary size, the number of members in the file, and the
|
||||
amount of trailing data (if any) are also printed. With '-vv', the
|
||||
positions and sizes of each member in multimember files are also
|
||||
printed.
|
||||
printed. A multimember file with one or more empty members is accepted
|
||||
if redirected to standard input.
|
||||
|
||||
If any file is damaged, does not exist, can't be opened, or is not
|
||||
regular, the final exit status is > 0. '-lq' can be used to check
|
||||
|
@ -278,8 +280,8 @@ once, the first time it appears in the command line. Remember to prepend
|
|||
'-n N'
|
||||
'--threads=N'
|
||||
Set the maximum number of worker threads, overriding the system's
|
||||
default. Valid values range from 1 to "as many as your system can
|
||||
support". If this option is not used, plzip tries to detect the number
|
||||
default. Valid values range from 1 to as many as your system can
|
||||
support. If this option is not used, plzip tries to detect the number
|
||||
of processors in the system and use it as default value. When
|
||||
compressing on a 32 bit system, plzip tries to limit the memory use to
|
||||
under 2.22 GiB (4 worker threads at level -9) by reducing the number
|
||||
|
@ -338,7 +340,8 @@ once, the first time it appears in the command line. Remember to prepend
|
|||
fails the test, does not exist, can't be opened, or is a terminal,
|
||||
plzip continues testing the rest of the files. A final diagnostic is
|
||||
shown at verbosity level 1 or higher if any file fails the test when
|
||||
testing multiple files.
|
||||
testing multiple files. A multimember file with one or more empty
|
||||
members is accepted if redirected to standard input.
|
||||
|
||||
'-v'
|
||||
'--verbose'
|
||||
|
@ -368,6 +371,7 @@ once, the first time it appears in the command line. Remember to prepend
|
|||
'-s64MiB -m273'
|
||||
|
||||
Level Dictionary size (-s) Match length limit (-m)
|
||||
------------------------------------------------------
|
||||
-0 64 KiB 16 bytes
|
||||
-1 1 MiB 5 bytes
|
||||
-2 1.5 MiB 6 bytes
|
||||
|
@ -387,7 +391,7 @@ once, the first time it appears in the command line. Remember to prepend
|
|||
When decompressing, testing, or listing, allow trailing data whose
|
||||
first bytes are so similar to the magic bytes of a lzip header that
|
||||
they can be confused with a corrupt header. Use this option if a file
|
||||
triggers a "corrupt header" error and the cause is not indeed a
|
||||
triggers a 'corrupt header' error and the cause is not indeed a
|
||||
corrupt header.
|
||||
|
||||
'--in-slots=N'
|
||||
|
@ -421,6 +425,7 @@ and may be followed by a multiplier and an optional 'B' for "byte".
|
|||
Table of SI and binary prefixes (unit multipliers):
|
||||
|
||||
Prefix Value | Prefix Value
|
||||
----------------------------------------------------------------------
|
||||
k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024)
|
||||
M megabyte (10^6) | Mi mebibyte (2^20)
|
||||
G gigabyte (10^9) | Gi gibibyte (2^30)
|
||||
|
@ -439,9 +444,131 @@ corrupt or invalid input file, 3 for an internal consistency error (e.g.,
|
|||
bug) which caused plzip to panic.
|
||||
|
||||
|
||||
File: plzip.info, Node: Program design, Next: Memory requirements, Prev: Invoking plzip, Up: Top
|
||||
File: plzip.info, Node: Argument syntax, Next: File format, Prev: Invoking plzip, Up: Top
|
||||
|
||||
4 Internal structure of plzip
|
||||
4 Syntax of command-line arguments
|
||||
**********************************
|
||||
|
||||
POSIX recommends these conventions for command-line arguments.
|
||||
|
||||
* A command-line argument is an option if it begins with a hyphen ('-').
|
||||
|
||||
* Option names are single alphanumeric characters.
|
||||
|
||||
* Certain options require an argument.
|
||||
|
||||
* An option and its argument may or may not appear as separate tokens.
|
||||
(In other words, the whitespace separating them is optional, unless the
|
||||
argument is the empty string). Thus, '-o foo' and '-ofoo' are
|
||||
equivalent.
|
||||
|
||||
* One or more options without arguments, followed by at most one option
|
||||
that takes an argument, may follow a hyphen in a single token. Thus,
|
||||
'-abc' is equivalent to '-a -b -c'.
|
||||
|
||||
* Options typically precede other non-option arguments.
|
||||
|
||||
* The argument '--' terminates all options; any following arguments are
|
||||
treated as non-option arguments, even if they begin with a hyphen.
|
||||
|
||||
* A token consisting of a single hyphen character is interpreted as an
|
||||
ordinary non-option argument. By convention, it is used to specify
|
||||
standard input, standard output, or a file named '-'.
|
||||
|
||||
GNU adds "long options" to these conventions:
|
||||
|
||||
* A long option consists of two hyphens ('--') followed by a name made
|
||||
of alphanumeric characters and hyphens. Option names are typically one
|
||||
to three words long, with hyphens to separate words. Abbreviations can
|
||||
be used for the long option names as long as the abbreviations are
|
||||
unique.
|
||||
|
||||
* A long option and its argument may or may not appear as separate
|
||||
tokens. In the latter case they must be separated by an equal sign '='.
|
||||
Thus, '--foo bar' and '--foo=bar' are equivalent.
|
||||
|
||||
The syntax of options with an optional argument is
|
||||
'-<short_option><argument>' (without whitespace), or
|
||||
'--<long_option>=<argument>'.
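
The conventions listed in this chapter can be tried out with any parser that follows them. The sketch below uses Python's standard 'getopt' module as a stand-in; it is not plzip's own argument parser, and the option letters and names (-a, -b, -c, -o, --foo, --force) are hypothetical, chosen only to mirror the examples in the text.

```python
import getopt

# Hypothetical option set for illustration only; these are not plzip's options.
SHORT = "abco:"            # -a, -b, -c take no argument; -o requires one
LONG = ["foo=", "force"]   # --foo requires an argument; --force takes none


def parse(argv):
    """Return (options, operands) as parsed by Python's getopt module."""
    return getopt.getopt(argv, SHORT, LONG)


# A short option and its argument may be joined or separate tokens.
joined = parse(["-ofoo"])
separate = parse(["-o", "foo"])

# Argument-less options may share one hyphen, optionally followed by
# one option that takes an argument.
grouped = parse(["-abc"])
spelled_out = parse(["-a", "-b", "-c"])
grouped_with_arg = parse(["-abofoo"])

# '--' ends option parsing; a lone '-' is an ordinary non-option argument.
after_terminator = parse(["-a", "--", "-b"])
lone_hyphen = parse(["-a", "-"])

# Long options accept '=' or a separate token, and unique abbreviations.
long_equal = parse(["--foo=bar"])
long_separate = parse(["--foo", "bar"])
abbreviated = parse(["--forc"])      # unique prefix of --force

# An ambiguous abbreviation ('--fo' matches both --foo and --force) is rejected.
try:
    parse(["--fo"])
    ambiguous_rejected = False
except getopt.GetoptError:
    ambiguous_rejected = True
```

Python's 'getopt' stops at the first non-option argument, which matches the convention that options typically precede other arguments. Parsers that allow options and file names to be interleaved behave differently on that point.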
|
||||
|
||||
|
||||
File: plzip.info, Node: File format, Next: Program design, Prev: Argument syntax, Up: Top
|
||||
|
||||
5 File format
|
||||
*************
|
||||
|
||||
Perfection is reached, not when there is no longer anything to add, but
|
||||
when there is no longer anything to take away.
|
||||
-- Antoine de Saint-Exupery
|
||||
|
||||
In the diagram below, a box like this:
|
||||
|
||||
+---+
|
||||
| | <-- the vertical bars might be missing
|
||||
+---+
|
||||
|
||||
represents one byte; a box like this:
|
||||
|
||||
+==============+
|
||||
| |
|
||||
+==============+
|
||||
|
||||
represents a variable number of bytes.
|
||||
|
||||
A lzip file consists of one or more independent "members" (compressed data
|
||||
sets). The members simply appear one after another in the file, with no
|
||||
additional information before, between, or after them. Each member can
|
||||
encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
|
||||
size of a multimember file is unlimited. Empty members (data size = 0) are
|
||||
not allowed in multimember files.
|
||||
|
||||
Each member has the following structure:
|
||||
|
||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
|
||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
|
||||
All multibyte values are stored in little endian order.
|
||||
|
||||
'ID string (the "magic" bytes)'
|
||||
A four byte string, identifying the lzip format, with the value "LZIP"
|
||||
(0x4C, 0x5A, 0x49, 0x50).
|
||||
|
||||
'VN (version number, 1 byte)'
|
||||
Just in case something needs to be modified in the future. 1 for now.
|
||||
|
||||
'DS (coded dictionary size, 1 byte)'
|
||||
The dictionary size is calculated by taking a power of 2 (the base
|
||||
size) and subtracting from it a fraction between 0/16 and 7/16 of the
|
||||
base size.
|
||||
Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
|
||||
Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
|
||||
from the base size to obtain the dictionary size.
|
||||
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
|
||||
Valid values for dictionary size range from 4 KiB to 512 MiB.
|
||||
|
||||
'LZMA stream'
|
||||
The LZMA stream, terminated by an 'End Of Stream' marker. Uses default
|
||||
values for encoder properties. *Note Stream format: (lzip)Stream
|
||||
format, for a complete description.
|
||||
|
||||
'CRC32 (4 bytes)'
|
||||
Cyclic Redundancy Check (CRC) of the original uncompressed data.
|
||||
|
||||
'Data size (8 bytes)'
|
||||
Size of the original uncompressed data.
|
||||
|
||||
'Member size (8 bytes)'
|
||||
Total size of the member, including header and trailer. This field acts
|
||||
as a distributed index, improves the checking of stream integrity, and
|
||||
facilitates the safe recovery of undamaged members from multimember
|
||||
files. Lzip limits the member size to 2 PiB to prevent the data size
|
||||
field from overflowing.
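
The member layout described above can be read with a few lines of code. The following is a minimal sketch in Python, not code from plzip or lzlib: the function names are hypothetical, the sample member is synthetic (its 'LZMA stream' bytes and CRC value are placeholders, not real compressed data), and the range check on the decoded dictionary size simply applies the 4 KiB to 512 MiB limits stated in the text.

```python
import struct

HEADER_SIZE = 6     # ID string (4 bytes) + VN (1 byte) + DS (1 byte)
TRAILER_SIZE = 20   # CRC32 (4 bytes) + data size (8) + member size (8)


def decode_dict_size(ds):
    """Decode the DS byte into a dictionary size in bytes."""
    base_log2 = ds & 0x1F        # bits 4-0: base 2 logarithm of the base size
    numerator = ds >> 5          # bits 7-5: sixteenths of the base to subtract
    if not 12 <= base_log2 <= 29:
        raise ValueError("base size exponent out of range (12 to 29)")
    base = 1 << base_log2
    size = base - numerator * (base // 16)
    if size < 4 * 1024:          # assumption: enforce the documented minimum
        raise ValueError("dictionary size below 4 KiB")
    return size


def parse_member(member):
    """Split one complete member into its header and trailer fields."""
    if len(member) < HEADER_SIZE + TRAILER_SIZE:
        raise ValueError("member too short")
    # All multibyte values are little endian, hence the '<' prefix.
    magic, version, ds = struct.unpack_from("<4sBB", member, 0)
    if magic != b"LZIP":
        raise ValueError("bad ID string")
    crc32, data_size, member_size = struct.unpack_from(
        "<IQQ", member, len(member) - TRAILER_SIZE)
    return {
        "version": version,
        "dictionary_size": decode_dict_size(ds),
        "crc32": crc32,
        "data_size": data_size,
        "member_size": member_size,
    }


# Synthetic member: valid header and trailer around 5 placeholder bytes.
placeholder_stream = b"\x00" * 5
sample = (b"LZIP" + bytes([1, 0xD3]) + placeholder_stream
          + struct.pack("<IQQ", 0x12345678, 5,
                        HEADER_SIZE + len(placeholder_stream) + TRAILER_SIZE))
fields = parse_member(sample)
```

Because the member size is stored in the trailer, a reader can start at the end of a multimember file and step backwards from member to member, which is the 'distributed index' role mentioned in the description of that field.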
|
||||
|
||||
|
||||
File: plzip.info, Node: Program design, Next: Memory requirements, Prev: File format, Up: Top
|
||||
|
||||
6 Internal structure of plzip
|
||||
*****************************
|
||||
|
||||
When compressing, plzip divides the input file into chunks and compresses as
|
||||
|
@ -456,8 +583,8 @@ because lzip usually produces single-member files, which can't be
|
|||
decompressed in parallel.
|
||||
|
||||
For each input file, a splitter thread and several worker threads are
|
||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
||||
courier" takes care of data transfers among threads and limits the maximum
|
||||
created, acting the main thread as muxer (multiplexer) thread. A 'packet
|
||||
courier' takes care of data transfers among threads and limits the maximum
|
||||
number of data blocks (packets) being processed simultaneously.
|
||||
|
||||
The splitter reads data blocks from the input file, and distributes them
|
||||
|
@ -486,7 +613,7 @@ only limited by the number of processors available and by I/O speed.
|
|||
|
||||
File: plzip.info, Node: Memory requirements, Next: Minimum file sizes, Prev: Program design, Up: Top
|
||||
|
||||
5 Memory required to compress and decompress
|
||||
7 Memory required to compress and decompress
|
||||
********************************************
|
||||
|
||||
The amount of memory required *per worker thread* for decompression or
|
||||
|
@ -520,6 +647,7 @@ The following table shows the memory required *per thread* for compression
|
|||
at a given level, using the default data size for each level:
|
||||
|
||||
Level Memory required
|
||||
------------------------
|
||||
-0 4.875 MiB
|
||||
-1 17.75 MiB
|
||||
-2 26.625 MiB
|
||||
|
@ -532,9 +660,9 @@ Level Memory required
|
|||
-9 568 MiB
|
||||
|
||||
|
||||
File: plzip.info, Node: Minimum file sizes, Next: File format, Prev: Memory requirements, Up: Top
|
||||
File: plzip.info, Node: Minimum file sizes, Next: Trailing data, Prev: Memory requirements, Up: Top
|
||||
|
||||
6 Minimum file sizes required for full compression speed
|
||||
8 Minimum file sizes required for full compression speed
|
||||
********************************************************
|
||||
|
||||
When compressing, plzip divides the input file into chunks and compresses
|
||||
|
@ -569,85 +697,9 @@ Level
|
|||
-9 128 MiB 256 MiB 512 MiB 1 GiB 4 GiB 16 GiB
|
||||
|
||||
|
||||
File: plzip.info, Node: File format, Next: Trailing data, Prev: Minimum file sizes, Up: Top
|
||||
File: plzip.info, Node: Trailing data, Next: Examples, Prev: Minimum file sizes, Up: Top
|
||||
|
||||
7 File format
|
||||
*************
|
||||
|
||||
Perfection is reached, not when there is no longer anything to add, but
|
||||
when there is no longer anything to take away.
|
||||
-- Antoine de Saint-Exupery
|
||||
|
||||
|
||||
In the diagram below, a box like this:
|
||||
|
||||
+---+
|
||||
| | <-- the vertical bars might be missing
|
||||
+---+
|
||||
|
||||
represents one byte; a box like this:
|
||||
|
||||
+==============+
|
||||
| |
|
||||
+==============+
|
||||
|
||||
represents a variable number of bytes.
|
||||
|
||||
|
||||
A lzip file consists of one or more independent "members" (compressed
|
||||
data sets). The members simply appear one after another in the file, with no
|
||||
additional information before, between, or after them. Each member can
|
||||
encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
|
||||
size of a multimember file is unlimited.
|
||||
|
||||
Each member has the following structure:
|
||||
|
||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
|
||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
|
||||
All multibyte values are stored in little endian order.
|
||||
|
||||
'ID string (the "magic" bytes)'
|
||||
A four byte string, identifying the lzip format, with the value "LZIP"
|
||||
(0x4C, 0x5A, 0x49, 0x50).
|
||||
|
||||
'VN (version number, 1 byte)'
|
||||
Just in case something needs to be modified in the future. 1 for now.
|
||||
|
||||
'DS (coded dictionary size, 1 byte)'
|
||||
The dictionary size is calculated by taking a power of 2 (the base
|
||||
size) and subtracting from it a fraction between 0/16 and 7/16 of the
|
||||
base size.
|
||||
Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
|
||||
Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
|
||||
from the base size to obtain the dictionary size.
|
||||
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
|
||||
Valid values for dictionary size range from 4 KiB to 512 MiB.
|
||||
|
||||
'LZMA stream'
|
||||
The LZMA stream, finished by an "End Of Stream" marker. Uses default
|
||||
values for encoder properties. *Note Stream format: (lzip)Stream
|
||||
format, for a complete description.
|
||||
|
||||
'CRC32 (4 bytes)'
|
||||
Cyclic Redundancy Check (CRC) of the original uncompressed data.
|
||||
|
||||
'Data size (8 bytes)'
|
||||
Size of the original uncompressed data.
|
||||
|
||||
'Member size (8 bytes)'
|
||||
Total size of the member, including header and trailer. This field acts
|
||||
as a distributed index, improves the checking of stream integrity, and
|
||||
facilitates the safe recovery of undamaged members from multimember
|
||||
files. Lzip limits the member size to 2 PiB to prevent the data size
|
||||
field from overflowing.
|
||||
|
||||
|
||||
|
||||
File: plzip.info, Node: Trailing data, Next: Examples, Prev: File format, Up: Top
|
||||
|
||||
8 Extra data appended to the file
|
||||
9 Extra data appended to the file
|
||||
*********************************
|
||||
|
||||
Sometimes extra data are found appended to a lzip file after the last
|
||||
|
@ -657,7 +709,7 @@ member. Such trailing data may be:
|
|||
example when writing to a tape. It is safe to append any amount of
|
||||
padding zero bytes to a lzip file.
|
||||
|
||||
* Useful data added by the user; an "End Of File" string (to check that
|
||||
* Useful data added by the user; an 'End Of File' string (to check that
|
||||
the file has not been truncated), a cryptographically secure hash, a
|
||||
description of file contents, etc. It is safe to append any amount of
|
||||
text to a lzip file as long as none of the first four bytes of the
|
||||
|
@ -693,8 +745,8 @@ where a file containing trailing data must be rejected, the option
|
|||
|
||||
File: plzip.info, Node: Examples, Next: Problems, Prev: Trailing data, Up: Top
|
||||
|
||||
9 A small tutorial with examples
|
||||
********************************
|
||||
10 A small tutorial with examples
|
||||
*********************************
|
||||
|
||||
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
|
||||
compressed file (bugs in the system libraries, memory errors, etc).
|
||||
|
@ -706,38 +758,32 @@ comparing the compressed file with the original because the corruption
|
|||
happens before plzip compresses the RAM contents, resulting in a valid
|
||||
compressed file containing wrong data.
|
||||
|
||||
|
||||
Example 1: Extract all the files from archive 'foo.tar.lz'.
|
||||
|
||||
tar -xf foo.tar.lz
|
||||
or
|
||||
plzip -cd foo.tar.lz | tar -xf -
|
||||
|
||||
|
||||
Example 2: Replace a regular file with its compressed version 'file.lz' and
|
||||
show the compression ratio.
|
||||
|
||||
plzip -v file
|
||||
|
||||
|
||||
Example 3: Like example 2 but the created 'file.lz' has a block size of
|
||||
1 MiB. The compression ratio is not shown.
|
||||
|
||||
plzip -B 1MiB file
|
||||
|
||||
|
||||
Example 4: Restore a regular file from its compressed version 'file.lz'. If
|
||||
the operation is successful, 'file.lz' is removed.
|
||||
|
||||
plzip -d file.lz
|
||||
|
||||
|
||||
Example 5: Check the integrity of the compressed file 'file.lz' and show
|
||||
status.
|
||||
|
||||
plzip -tv file.lz
|
||||
|
||||
|
||||
Example 6: The right way of concatenating the decompressed output of two or
|
||||
more compressed files. *Note Trailing data::.
|
||||
|
||||
|
@ -746,19 +792,16 @@ more compressed files. *Note Trailing data::.
|
|||
Do this instead
|
||||
plzip -cd file1.lz file2.lz file3.lz
|
||||
|
||||
|
||||
Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed data
|
||||
are produced.
|
||||
|
||||
plzip -cd file.lz | dd bs=1024 count=10
|
||||
|
||||
|
||||
Example 8: Decompress 'file.lz' partially from decompressed byte at offset
|
||||
10000 to decompressed byte at offset 14999 (5000 bytes are produced).
|
||||
|
||||
plzip -cd file.lz | dd bs=1000 skip=10 count=5
|
||||
|
||||
|
||||
Example 9: Compress a whole device in /dev/sdc and send the output to
|
||||
'file.lz'.
|
||||
|
||||
|
@ -769,7 +812,7 @@ Example 9: Compress a whole device in /dev/sdc and send the output to
|
|||
|
||||
File: plzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
|
||||
|
||||
10 Reporting bugs
|
||||
11 Reporting bugs
|
||||
*****************
|
||||
|
||||
There are probably bugs in plzip. There are certainly errors and omissions
|
||||
|
@ -790,6 +833,7 @@ Concept index
|
|||
[index]
|
||||
* Menu:
|
||||
|
||||
* argument syntax: Argument syntax. (line 6)
|
||||
* bugs: Problems. (line 6)
|
||||
* examples: Examples. (line 6)
|
||||
* file format: File format. (line 6)
|
||||
|
@ -809,21 +853,22 @@ Concept index
|
|||
|
||||
Tag Table:
|
||||
Node: Top217
|
||||
Node: Introduction1156
|
||||
Node: Output5934
|
||||
Node: Invoking plzip7497
|
||||
Ref: --trailing-error8372
|
||||
Ref: --data-size8610
|
||||
Node: Program design19519
|
||||
Node: Memory requirements21818
|
||||
Node: Minimum file sizes23503
|
||||
Node: File format25506
|
||||
Ref: coded-dict-size26945
|
||||
Node: Trailing data28195
|
||||
Node: Examples30531
|
||||
Ref: concat-example31964
|
||||
Node: Problems32721
|
||||
Node: Concept index33276
|
||||
Node: Introduction1207
|
||||
Node: Output5956
|
||||
Node: Invoking plzip7519
|
||||
Ref: --trailing-error8365
|
||||
Ref: --data-size8603
|
||||
Node: Argument syntax19941
|
||||
Node: File format21886
|
||||
Ref: coded-dict-size23386
|
||||
Node: Program design24637
|
||||
Node: Memory requirements26933
|
||||
Node: Minimum file sizes28643
|
||||
Node: Trailing data30648
|
||||
Node: Examples32991
|
||||
Ref: concat-example34420
|
||||
Node: Problems35174
|
||||
Node: Concept index35729
|
||||
|
||||
End Tag Table
|
||||
|
||||
|
|
359
doc/plzip.texi
359
doc/plzip.texi
|
@ -6,8 +6,8 @@
|
|||
@finalout
|
||||
@c %**end of header
|
||||
|
||||
@set UPDATED 21 January 2024
|
||||
@set VERSION 1.11
|
||||
@set UPDATED 19 November 2024
|
||||
@set VERSION 1.12-rc1
|
||||
|
||||
@dircategory Compression
|
||||
@direntry
|
||||
|
@ -36,17 +36,18 @@
|
|||
This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
|
||||
|
||||
@menu
|
||||
* Introduction:: Purpose and features of plzip
|
||||
* Output:: Meaning of plzip's output
|
||||
* Invoking plzip:: Command-line interface
|
||||
* Program design:: Internal structure of plzip
|
||||
* Memory requirements:: Memory required to compress and decompress
|
||||
* Minimum file sizes:: Minimum file sizes required for full speed
|
||||
* File format:: Detailed format of the compressed file
|
||||
* Trailing data:: Extra data appended to the file
|
||||
* Examples:: A small tutorial with examples
|
||||
* Problems:: Reporting bugs
|
||||
* Concept index:: Index of concepts
|
||||
* Introduction:: Purpose and features of plzip
|
||||
* Output:: Meaning of plzip's output
|
||||
* Invoking plzip:: Command-line interface
|
||||
* Argument syntax:: By convention, options start with a hyphen
|
||||
* File format:: Detailed format of the compressed file
|
||||
* Program design:: Internal structure of plzip
|
||||
* Memory requirements:: Memory required to compress and decompress
|
||||
* Minimum file sizes:: Minimum file sizes required for full speed
|
||||
* Trailing data:: Extra data appended to the file
|
||||
* Examples:: A small tutorial with examples
|
||||
* Problems:: Reporting bugs
|
||||
* Concept index:: Index of concepts
|
||||
@end menu
|
||||
|
||||
@sp 1
|
||||
|
@ -61,30 +62,29 @@ distribute, and modify it.
|
|||
@chapter Introduction
|
||||
@cindex introduction
|
||||
|
||||
@uref{http://www.nongnu.org/lzip/plzip.html,,Plzip}
|
||||
is a massively parallel (multi-threaded) implementation of lzip,
|
||||
compatible with lzip 1.4 or newer. Plzip uses the compression library
|
||||
@uref{http://www.nongnu.org/lzip/plzip.html,,Plzip} is a massively parallel
|
||||
(multi-threaded) implementation of lzip. Plzip uses the compression library
|
||||
@uref{http://www.nongnu.org/lzip/lzlib.html,,lzlib}.
|
||||
|
||||
@uref{http://www.nongnu.org/lzip/lzip.html,,Lzip}
|
||||
is a lossless data compressor with a user interface similar to the one
|
||||
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
|
||||
chain-Algorithm' (LZMA) stream format to maximize interoperability. The
|
||||
maximum dictionary size is 512 MiB so that any lzip file can be decompressed
|
||||
on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
|
||||
checking. Lzip can compress about as fast as gzip @w{(lzip -0)} or compress most
|
||||
files more than bzip2 @w{(lzip -9)}. Decompression speed is intermediate between
|
||||
gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
|
||||
perspective. Lzip has been designed, written, and tested with great care to
|
||||
replace gzip and bzip2 as the standard general-purpose compressed format for
|
||||
Unix-like systems.
|
||||
of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov
|
||||
chain-Algorithm) designed to achieve complete interoperability between
|
||||
implementations. The maximum dictionary size is 512 MiB so that any lzip
|
||||
file can be decompressed on 32-bit machines. Lzip provides accurate and
|
||||
robust 3-factor integrity checking. @w{@samp{lzip -0}} compresses about as fast as
|
||||
gzip, while @w{@samp{lzip -9}} compresses most files more than bzip2. Decompression
|
||||
speed is intermediate between gzip and bzip2. Lzip provides better data
|
||||
recovery capabilities than gzip and bzip2. Lzip has been designed, written,
|
||||
and tested with great care to replace gzip and bzip2 as general-purpose
|
||||
compressed format for Unix-like systems.
|
||||
|
||||
Plzip can compress/decompress large files on multiprocessor machines much
|
||||
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
|
||||
to 2 percent larger compressed files). Note that the number of usable
|
||||
threads is limited by file size; on files larger than a few GB plzip can use
|
||||
hundreds of processors, but on files of only a few MB plzip is no faster
|
||||
than lzip. @xref{Minimum file sizes}.
|
||||
hundreds of processors, but on files smaller than @w{1 MiB} plzip is no faster
|
||||
than lzip (even at compression level -0). @xref{Minimum file sizes}.
|
||||
|
||||
For creation and manipulation of compressed tar archives
|
||||
@uref{http://www.nongnu.org/lzip/manual/tarlz_manual.html,,tarlz} can be more
|
||||
|
@ -132,9 +132,9 @@ makes it safer than compressors returning ambiguous warning values (like
|
|||
gzip) when it is used as a back end for other programs like tar or zutils.
|
||||
|
||||
Plzip automatically uses for each file the largest dictionary size that does
|
||||
not exceed neither the file size nor the limit given. Keep in mind that the
|
||||
decompression memory requirement is affected at compression time by the
|
||||
choice of dictionary size limit. @xref{Memory requirements}.
|
||||
not exceed neither the file size nor the limit given. The dictionary size
|
||||
used for decompression is the same dictionary size used for compression.
|
||||
@xref{Memory requirements}.
|
||||
|
||||
When compressing, plzip replaces every file given in the command line
|
||||
with a compressed version of itself, with the name "original_name.lz".
|
||||
|
@ -235,11 +235,8 @@ argument means standard input. It can be mixed with other @var{files} and is
|
|||
read just once, the first time it appears in the command line. Remember to
|
||||
prepend @file{./} to any file name beginning with a hyphen, or use @samp{--}.
|
||||
|
||||
plzip supports the following
|
||||
@uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}:
|
||||
@ifnothtml
|
||||
@xref{Argument syntax,,,arg_parser}.
|
||||
@end ifnothtml
|
||||
@noindent
|
||||
plzip supports the following options: @xref{Argument syntax}.
|
||||
|
||||
@table @code
|
||||
@item -h
|
||||
|
@ -286,7 +283,8 @@ already exists and @option{--force} has not been specified, plzip continues
|
|||
decompressing the rest of the files and exits with error status 1. If a file
|
||||
fails to decompress, or is a terminal, plzip exits immediately with error
|
||||
status 2 without decompressing the rest of the files. A terminal is
|
||||
considered an uncompressed file, and therefore invalid.
|
||||
considered an uncompressed file, and therefore invalid. A multimember file
|
||||
with one or more empty members is accepted if redirected to standard input.
|
||||
|
||||
@item -f
|
||||
@itemx --force
|
||||
|
@ -295,7 +293,7 @@ Force overwrite of output files.
|
|||
@item -F
|
||||
@itemx --recompress
|
||||
When compressing, force re-compression of files whose name already has
|
||||
the @samp{.lz} or @samp{.tlz} suffix.
|
||||
the @file{.lz} or @file{.tlz} suffix.
|
||||
|
||||
@item -k
|
||||
@itemx --keep
|
||||
|
@ -309,7 +307,8 @@ even for multimember files. If more than one file is given, a final line
|
|||
containing the cumulative sizes is printed. With @option{-v}, the dictionary
|
||||
size, the number of members in the file, and the amount of trailing data (if
|
||||
any) are also printed. With @option{-vv}, the positions and sizes of each
|
||||
member in multimember files are also printed.
|
||||
member in multimember files are also printed. A multimember file with one or
|
||||
more empty members is accepted if redirected to standard input.
|
||||
|
||||
If any file is damaged, does not exist, can't be opened, or is not regular,
|
||||
the final exit status is @w{> 0}. @option{-lq} can be used to check quickly
|
||||
|
@ -327,7 +326,7 @@ times.
|
|||
@item -n @var{n}
|
||||
@itemx --threads=@var{n}
|
||||
Set the maximum number of worker threads, overriding the system's default.
|
||||
Valid values range from 1 to "as many as your system can support". If this
|
||||
Valid values range from 1 to as many as your system can support. If this
|
||||
option is not used, plzip tries to detect the number of processors in the
|
||||
system and use it as default value. When compressing on a @w{32 bit} system,
|
||||
plzip tries to limit the memory use to under @w{2.22 GiB} (4 worker threads
|
||||
|
@ -353,10 +352,10 @@ to @option{-c}. @option{-o} has no effect when testing or listing.
|
|||
|
||||
In order to keep backward compatibility with plzip versions prior to 1.9,
|
||||
when compressing from standard input and no other file names are given, the
|
||||
extension @samp{.lz} is appended to @var{file} unless it already ends in
|
||||
@samp{.lz} or @samp{.tlz}. This feature will be removed in a future version
|
||||
extension @file{.lz} is appended to @var{file} unless it already ends in
|
||||
@file{.lz} or @file{.tlz}. This feature will be removed in a future version
|
||||
of plzip. Meanwhile, redirection may be used instead of @option{-o} to write
|
||||
the compressed output to a file without the extension @samp{.lz} in its
|
||||
the compressed output to a file without the extension @file{.lz} in its
|
||||
name: @w{@samp{plzip < file > foo}}.
|
||||
|
||||
@item -q
|
||||
|
@ -386,7 +385,8 @@ together with @option{-v} to see information about the files. If a file
|
|||
fails the test, does not exist, can't be opened, or is a terminal, plzip
|
||||
continues testing the rest of the files. A final diagnostic is shown at
|
||||
verbosity level 1 or higher if any file fails the test when testing multiple
|
||||
files.
|
||||
files. A multimember file with one or more empty members is accepted if
|
||||
redirected to standard input.
|
||||
|
||||
@item -v
|
||||
@itemx --verbose
|
||||
|
@ -416,7 +416,7 @@ given, the last setting is used. For example @w{@option{-9 -s64MiB}} is
|
|||
equivalent to @w{@option{-s64MiB -m273}}
|
||||
|
||||
@multitable {Level} {Dictionary size (-s)} {Match length limit (-m)}
|
||||
@item Level @tab Dictionary size (-s) @tab Match length limit (-m)
|
||||
@headitem Level @tab Dictionary size (-s) @tab Match length limit (-m)
|
||||
@item -0 @tab 64 KiB @tab 16 bytes
|
||||
@item -1 @tab 1 MiB @tab 5 bytes
|
||||
@item -2 @tab 1.5 MiB @tab 6 bytes
|
||||
|
@ -437,7 +437,7 @@ Aliases for GNU gzip compatibility.
|
|||
When decompressing, testing, or listing, allow trailing data whose first
|
||||
bytes are so similar to the magic bytes of a lzip header that they can
|
||||
be confused with a corrupt header. Use this option if a file triggers a
|
||||
"corrupt header" error and the cause is not indeed a corrupt header.
|
||||
'corrupt header' error and the cause is not indeed a corrupt header.
|
||||
|
||||
@item --in-slots=@var{n}
|
||||
Number of @w{1 MiB} input packets buffered per worker thread when
|
||||
|
@ -474,7 +474,7 @@ and may be followed by a multiplier and an optional @samp{B} for "byte".
|
|||
Table of SI and binary prefixes (unit multipliers):
|
||||
|
||||
@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
|
||||
@item Prefix @tab Value @tab | @tab Prefix @tab Value
|
||||
@headitem Prefix @tab Value @tab | @tab Prefix @tab Value
|
||||
@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
|
||||
@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
|
||||
@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
|
||||
|
@ -494,6 +494,148 @@ indicate a corrupt or invalid input file, 3 for an internal consistency
|
|||
error (e.g., bug) which caused plzip to panic.
|
||||
|
||||
|
||||
@node Argument syntax
|
||||
@chapter Syntax of command-line arguments
|
||||
@cindex argument syntax
|
||||
|
||||
POSIX recommends these conventions for command-line arguments.
|
||||
|
||||
@itemize @bullet
|
||||
@item A command-line argument is an option if it begins with a hyphen
|
||||
(@samp{-}).
|
||||
|
||||
@item Option names are single alphanumeric characters.
|
||||
|
||||
@item Certain options require an argument.
|
||||
|
||||
@item An option and its argument may or may not appear as separate tokens.
|
||||
(In other words, the whitespace separating them is optional, unless the
|
||||
argument is the empty string).
|
||||
Thus, @w{@option{-o foo}} and @option{-ofoo} are equivalent.
|
||||
|
||||
@item One or more options without arguments, followed by at most one option
|
||||
that takes an argument, may follow a hyphen in a single token.
|
||||
Thus, @option{-abc} is equivalent to @w{@option{-a -b -c}}.
|
||||
|
||||
@item Options typically precede other non-option arguments.
|
||||
|
||||
@item The argument @samp{--} terminates all options; any following arguments
|
||||
are treated as non-option arguments, even if they begin with a hyphen.
|
||||
|
||||
@item A token consisting of a single hyphen character is interpreted as an
|
||||
ordinary non-option argument. By convention, it is used to specify standard
|
||||
input, standard output, or a file named @samp{-}.
|
||||
@end itemize
|
||||
|
||||
@noindent
|
||||
GNU adds @dfn{long options} to these conventions:
|
||||
|
||||
@itemize @bullet
|
||||
@item A long option consists of two hyphens (@samp{--}) followed by a name
|
||||
made of alphanumeric characters and hyphens. Option names are typically one
|
||||
to three words long, with hyphens to separate words. Abbreviations can be
|
||||
used for the long option names as long as the abbreviations are unique.
|
||||
|
||||
@item A long option and its argument may or may not appear as separate
|
||||
tokens. In the latter case they must be separated by an equal sign @samp{=}.
|
||||
Thus, @w{@option{--foo bar}} and @option{--foo=bar} are equivalent.
|
||||
@end itemize
|
||||
|
||||
@noindent
|
||||
The syntax of options with an optional argument is
|
||||
@option{-<short_option><argument>} (without whitespace), or
|
||||
@option{--<long_option>=<argument>}.
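
The conventions above can be illustrated with a minimal sketch that classifies a single command-line token. The names Token_kind and classify are hypothetical and are not part of plzip or of its argument parser. The sketch looks at one token in isolation, so it does not model the rule that every token following the first -- is a non-option argument.

```cpp
// Hypothetical illustration of the token conventions; not plzip code.
enum class Token_kind { non_option, short_option, long_option, end_of_options };

// Classify one token according to its leading hyphens.
constexpr Token_kind classify( const char * const arg )
  {
  // "foo", the empty string, and a lone "-" are ordinary arguments
  if( arg[0] != '-' || arg[1] == 0 ) return Token_kind::non_option;
  // "-v", "-abc", "-ofoo": one or more short options in a single token
  if( arg[1] != '-' ) return Token_kind::short_option;
  // exactly "--" terminates option processing
  if( arg[2] == 0 ) return Token_kind::end_of_options;
  // "--foo" or "--foo=bar"
  return Token_kind::long_option;
  }

static_assert( classify( "-" ) == Token_kind::non_option, "lone hyphen" );
static_assert( classify( "-ofoo" ) == Token_kind::short_option, "short" );
static_assert( classify( "--" ) == Token_kind::end_of_options, "terminator" );
static_assert( classify( "--foo=bar" ) == Token_kind::long_option, "long" );
```

A real parser would additionally need the table of known options to decide whether the text after a short option letter is its argument or further option letters.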
|
||||
|
||||
|
||||
@node File format
|
||||
@chapter File format
|
||||
@cindex file format
|
||||
|
||||
Perfection is reached, not when there is no longer anything to add, but
|
||||
when there is no longer anything to take away.@*
|
||||
--- Antoine de Saint-Exupery
|
||||
|
||||
In the diagram below, a box like this:
|
||||
|
||||
@verbatim
|
||||
+---+
|
||||
| | <-- the vertical bars might be missing
|
||||
+---+
|
||||
@end verbatim
|
||||
|
||||
represents one byte; a box like this:
|
||||
|
||||
@verbatim
|
||||
+==============+
|
||||
| |
|
||||
+==============+
|
||||
@end verbatim
|
||||
|
||||
represents a variable number of bytes.
|
||||
|
||||
@noindent
|
||||
A lzip file consists of one or more independent "members" (compressed data
|
||||
sets). The members simply appear one after another in the file, with no
|
||||
additional information before, between, or after them. Each member can
|
||||
encode in compressed form up to @w{16 EiB - 1 byte} of uncompressed data.
|
||||
The size of a multimember file is unlimited. Empty members (data size = 0)
|
||||
are not allowed in multimember files.
|
||||
|
||||
Each member has the following structure:
|
||||
|
||||
@verbatim
|
||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
|
||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
@end verbatim
|
||||
|
||||
All multibyte values are stored in little endian order.
|
||||
|
||||
@table @samp
|
||||
@item ID string (the "magic" bytes)
|
||||
A four byte string, identifying the lzip format, with the value "LZIP"
|
||||
(0x4C, 0x5A, 0x49, 0x50).
|
||||
|
||||
@item VN (version number, 1 byte)
|
||||
Just in case something needs to be modified in the future. 1 for now.
|
||||
|
||||
@anchor{coded-dict-size}
|
||||
@item DS (coded dictionary size, 1 byte)
|
||||
The dictionary size is calculated by taking a power of 2 (the base size)
|
||||
and subtracting from it a fraction between 0/16 and 7/16 of the base size.@*
|
||||
Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).@*
|
||||
Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
|
||||
from the base size to obtain the dictionary size.@*
|
||||
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
|
||||
Valid values for dictionary size range from 4 KiB to 512 MiB.
|
||||
|
||||
@item LZMA stream
|
||||
The LZMA stream, terminated by an 'End Of Stream' marker. Uses default values
|
||||
for encoder properties.
|
||||
@ifnothtml
|
||||
@xref{Stream format,,,lzip},
|
||||
@end ifnothtml
|
||||
@ifhtml
|
||||
See
|
||||
@uref{http://www.nongnu.org/lzip/manual/lzip_manual.html#Stream-format,,Stream format}
|
||||
@end ifhtml
|
||||
for a complete description.
|
||||
|
||||
@item CRC32 (4 bytes)
|
||||
Cyclic Redundancy Check (CRC) of the original uncompressed data.
|
||||
|
||||
@item Data size (8 bytes)
|
||||
Size of the original uncompressed data.
|
||||
|
||||
@item Member size (8 bytes)
|
||||
Total size of the member, including header and trailer. This field acts
|
||||
as a distributed index, improves the checking of stream integrity, and
|
||||
facilitates the safe recovery of undamaged members from multimember files.
|
||||
Lzip limits the member size to @w{2 PiB} to prevent the data size field from
|
||||
overflowing.
|
||||
@end table
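
As a minimal sketch of the field descriptions above, the following code decodes the coded dictionary size byte (DS) and reads a little endian multibyte field such as CRC32, data size, or member size. The names decode_dict_size, read_le, and sample_field are hypothetical and are not taken from plzip or lzlib. Returning 0 for a result outside the range 4 KiB to 512 MiB is a choice of this sketch; the real decoders may treat such values differently.

```cpp
#include <cstdint>

// Decode the coded dictionary size (DS) byte of a member header.
// Returns 0 if the result is outside 4 KiB .. 512 MiB (sketch's choice).
constexpr std::uint32_t decode_dict_size( const std::uint8_t ds )
  {
  const unsigned log2_base = ds & 0x1F;		// bits 4-0: log2 of base size
  const unsigned numerator = ds >> 5;		// bits 7-5: sixteenths to subtract
  if( log2_base < 12 || log2_base > 29 ) return 0;
  const std::uint32_t base = std::uint32_t( 1 ) << log2_base;
  const std::uint32_t size = base - numerator * ( base / 16 );
  return ( size >= 4096 ) ? size : 0;
  }

// Read a little endian unsigned value of 'bytes' bytes (1 to 8).
constexpr std::uint64_t read_le( const std::uint8_t * const p, const int bytes )
  {
  std::uint64_t value = 0;
  for( int i = bytes - 1; i >= 0; --i ) value = ( value << 8 ) | p[i];
  return value;
  }

// The example from the text: 0xD3 = 2^19 - 6 * 2^15 = 320 KiB
static_assert( decode_dict_size( 0xD3 ) == 320 * 1024, "0xD3 is 320 KiB" );

// Hypothetical 4-byte field as it would be stored in the file
constexpr std::uint8_t sample_field[4] = { 0x78, 0x56, 0x34, 0x12 };
static_assert( read_le( sample_field, 4 ) == 0x12345678, "little endian" );
```

The trailer fields would be read the same way, with 4 bytes for CRC32 and 8 bytes each for data size and member size.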
|
||||
|
||||
|
||||
@node Program design
|
||||
@chapter Internal structure of plzip
|
||||
@cindex program design
|
||||
|
@ -510,8 +652,8 @@ because lzip usually produces single-member files, which can't be
|
|||
decompressed in parallel.
|
||||
|
||||
For each input file, a splitter thread and several worker threads are
|
||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
||||
courier" takes care of data transfers among threads and limits the
|
||||
created, acting the main thread as muxer (multiplexer) thread. A 'packet
|
||||
courier' takes care of data transfers among threads and limits the
|
||||
maximum number of data blocks (packets) being processed simultaneously.
|
||||
|
||||
The splitter reads data blocks from the input file, and distributes them
|
||||
|
@ -587,7 +729,7 @@ The following table shows the memory required @strong{per thread} for
|
|||
compression at a given level, using the default data size for each level:
|
||||
|
||||
@multitable {Level} {Memory required}
|
||||
@item Level @tab Memory required
|
||||
@headitem Level @tab Memory required
|
||||
@item -0 @tab 4.875 MiB
|
||||
@item -1 @tab 17.75 MiB
|
||||
@item -2 @tab 26.625 MiB
|
||||
|
@ -638,96 +780,6 @@ data size for each level:
|
|||
@end multitable
|
||||
|
||||
|
||||
@node File format
|
||||
@chapter File format
|
||||
@cindex file format
|
||||
|
||||
Perfection is reached, not when there is no longer anything to add, but
|
||||
when there is no longer anything to take away.@*
|
||||
--- Antoine de Saint-Exupery
|
||||
|
||||
@sp 1
|
||||
In the diagram below, a box like this:
|
||||
|
||||
@verbatim
|
||||
+---+
|
||||
| | <-- the vertical bars might be missing
|
||||
+---+
|
||||
@end verbatim
|
||||
|
||||
represents one byte; a box like this:
|
||||
|
||||
@verbatim
|
||||
+==============+
|
||||
| |
|
||||
+==============+
|
||||
@end verbatim
|
||||
|
||||
represents a variable number of bytes.
|
||||
|
||||
@sp 1
|
||||
A lzip file consists of one or more independent "members" (compressed data
|
||||
sets). The members simply appear one after another in the file, with no
|
||||
additional information before, between, or after them. Each member can
|
||||
encode in compressed form up to @w{16 EiB - 1 byte} of uncompressed data.
|
||||
The size of a multimember file is unlimited.
|
||||
|
||||
Each member has the following structure:
|
||||
|
||||
@verbatim
|
||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
|
||||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
@end verbatim
|
||||
|
||||
All multibyte values are stored in little endian order.
|
||||
|
||||
@table @samp
|
||||
@item ID string (the "magic" bytes)
|
||||
A four byte string, identifying the lzip format, with the value "LZIP"
|
||||
(0x4C, 0x5A, 0x49, 0x50).
|
||||
|
||||
@item VN (version number, 1 byte)
|
||||
Just in case something needs to be modified in the future. 1 for now.
|
||||
|
||||
@anchor{coded-dict-size}
|
||||
@item DS (coded dictionary size, 1 byte)
|
||||
The dictionary size is calculated by taking a power of 2 (the base size)
|
||||
and subtracting from it a fraction between 0/16 and 7/16 of the base size.@*
|
||||
Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).@*
|
||||
Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
|
||||
from the base size to obtain the dictionary size.@*
|
||||
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
|
||||
Valid values for dictionary size range from 4 KiB to 512 MiB.
|
||||
|
||||
@item LZMA stream
|
||||
The LZMA stream, finished by an "End Of Stream" marker. Uses default values
|
||||
for encoder properties.
|
||||
@ifnothtml
|
||||
@xref{Stream format,,,lzip},
|
||||
@end ifnothtml
|
||||
@ifhtml
|
||||
See
|
||||
@uref{http://www.nongnu.org/lzip/manual/lzip_manual.html#Stream-format,,Stream format}
|
||||
@end ifhtml
|
||||
for a complete description.
|
||||
|
||||
@item CRC32 (4 bytes)
|
||||
Cyclic Redundancy Check (CRC) of the original uncompressed data.
|
||||
|
||||
@item Data size (8 bytes)
|
||||
Size of the original uncompressed data.
|
||||
|
||||
@item Member size (8 bytes)
|
||||
Total size of the member, including header and trailer. This field acts
|
||||
as a distributed index, improves the checking of stream integrity, and
|
||||
facilitates the safe recovery of undamaged members from multimember files.
|
||||
Lzip limits the member size to @w{2 PiB} to prevent the data size field from
|
||||
overflowing.
|
||||
|
||||
@end table
|
||||
|
||||
|
||||
@node Trailing data
|
||||
@chapter Extra data appended to the file
|
||||
@cindex trailing data
|
||||
|
@ -742,7 +794,7 @@ example when writing to a tape. It is safe to append any amount of
|
|||
padding zero bytes to a lzip file.
|
||||
|
||||
@item
|
||||
Useful data added by the user; an "End Of File" string (to check that the
|
||||
Useful data added by the user; an 'End Of File' string (to check that the
|
||||
file has not been truncated), a cryptographically secure hash, a description
|
||||
of file contents, etc. It is safe to append any amount of text to a lzip
|
||||
file as long as none of the first four bytes of the text matches the
|
||||
|
@ -794,9 +846,8 @@ compression can only be detected by comparing the compressed file with the
|
|||
original because the corruption happens before plzip compresses the RAM
|
||||
contents, resulting in a valid compressed file containing wrong data.
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 1: Extract all the files from archive @samp{foo.tar.lz}.
|
||||
Example 1: Extract all the files from archive @file{foo.tar.lz}.
|
||||
|
||||
@example
|
||||
tar -xf foo.tar.lz
|
||||
|
@ -804,43 +855,38 @@ or
|
|||
plzip -cd foo.tar.lz | tar -xf -
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 2: Replace a regular file with its compressed version @samp{file.lz}
|
||||
Example 2: Replace a regular file with its compressed version @file{file.lz}
|
||||
and show the compression ratio.
|
||||
|
||||
@example
|
||||
plzip -v file
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 3: Like example 2 but the created @samp{file.lz} has a block size of
|
||||
Example 3: Like example 2 but the created @file{file.lz} has a block size of
|
||||
@w{1 MiB}. The compression ratio is not shown.
|
||||
|
||||
@example
|
||||
plzip -B 1MiB file
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 4: Restore a regular file from its compressed version
|
||||
@samp{file.lz}. If the operation is successful, @samp{file.lz} is removed.
|
||||
@file{file.lz}. If the operation is successful, @file{file.lz} is removed.
|
||||
|
||||
@example
|
||||
plzip -d file.lz
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 5: Check the integrity of the compressed file @samp{file.lz} and
|
||||
Example 5: Check the integrity of the compressed file @file{file.lz} and
|
||||
show status.
|
||||
|
||||
@example
|
||||
plzip -tv file.lz
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@anchor{concat-example}
|
||||
@noindent
|
||||
Example 6: The right way of concatenating the decompressed output of two or
|
||||
|
@ -853,28 +899,25 @@ Do this instead
|
|||
plzip -cd file1.lz file2.lz file3.lz
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 7: Decompress @samp{file.lz} partially until @w{10 KiB} of
|
||||
Example 7: Decompress @file{file.lz} partially until @w{10 KiB} of
|
||||
decompressed data are produced.
|
||||
|
||||
@example
|
||||
plzip -cd file.lz | dd bs=1024 count=10
|
||||
@end example
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 8: Decompress @samp{file.lz} partially from decompressed byte at
|
||||
Example 8: Decompress @file{file.lz} partially from decompressed byte at
|
||||
offset 10000 to decompressed byte at offset 14999 (5000 bytes are produced).
|
||||
|
||||
@example
|
||||
plzip -cd file.lz | dd bs=1000 skip=10 count=5
|
||||
@end example
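
The offsets quoted in examples 7 and 8 follow from the dd parameters. The sketch below works through that arithmetic, assuming that every read performed by dd returns a full block of bs bytes and that count is greater than zero. Byte_range and dd_range are hypothetical names used only for illustration.

```cpp
// Inclusive range of decompressed byte offsets selected by
// 'dd bs=<bs> skip=<skip> count=<count>' (assumes full-block reads).
struct Byte_range { unsigned long long first, last; };

constexpr Byte_range dd_range( const unsigned long long bs,
                               const unsigned long long skip,
                               const unsigned long long count )
  { return Byte_range{ bs * skip, bs * ( skip + count ) - 1 }; }

// Example 8: bs=1000 skip=10 count=5 selects offsets 10000 to 14999
static_assert( dd_range( 1000, 10, 5 ).first == 10000, "first offset" );
static_assert( dd_range( 1000, 10, 5 ).last == 14999, "last offset" );

// Example 7: bs=1024 count=10 selects the first 10 KiB
static_assert( dd_range( 1024, 0, 10 ).last == 10 * 1024 - 1, "10 KiB" );
```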
|
||||
|
||||
@sp 1
|
||||
@noindent
|
||||
Example 9: Compress a whole device in /dev/sdc and send the output to
|
||||
@samp{file.lz}.
|
||||
@file{file.lz}.
|
||||
|
||||
@example
|
||||
plzip -c /dev/sdc > file.lz
|
||||
|
|
15
list.cc
|
@ -17,6 +17,7 @@
|
|||
|
||||
#define _FILE_OFFSET_BITS 64
|
||||
|
||||
#include <cerrno>
|
||||
#include <cstdio>
|
||||
#include <cstring>
|
||||
#include <string>
|
||||
|
@ -54,10 +55,10 @@ int list_files( const std::vector< std::string > & filenames,
|
|||
int files = 0, retval = 0;
|
||||
bool first_post = true;
|
||||
bool stdin_used = false;
|
||||
|
||||
|
||||
for( unsigned i = 0; i < filenames.size(); ++i )
|
||||
{
|
||||
const bool from_stdin = ( filenames[i] == "-" );
|
||||
const bool from_stdin = filenames[i] == "-";
|
||||
if( from_stdin ) { if( stdin_used ) continue; else stdin_used = true; }
|
||||
const char * const input_filename =
|
||||
from_stdin ? "(stdin)" : filenames[i].c_str();
|
||||
|
@ -74,6 +75,8 @@ int list_files( const std::vector< std::string > & filenames,
|
|||
set_retval( retval, lzip_index.retval() );
|
||||
continue;
|
||||
}
|
||||
const bool multi_empty = !from_stdin && lzip_index.multi_empty();
|
||||
if( multi_empty ) set_retval( retval, 2 );
|
||||
if( verbosity < 0 ) continue;
|
||||
const unsigned long long udata_size = lzip_index.udata_size();
|
||||
const unsigned long long cdata_size = lzip_index.cdata_size();
|
||||
|
@ -85,6 +88,8 @@ int list_files( const std::vector< std::string > & filenames,
|
|||
if( verbosity >= 1 ) std::fputs( " dict memb trail ", stdout );
|
||||
std::fputs( " uncompressed compressed saved name\n", stdout );
|
||||
}
|
||||
if( multi_empty )
|
||||
{ std::fflush( stdout ); show_file_error( input_filename, empty_msg ); }
|
||||
if( verbosity >= 1 )
|
||||
std::printf( "%s %5ld %6lld ", format_ds( lzip_index.dictionary_size() ),
|
||||
members, lzip_index.file_size() - cdata_size );
|
||||
|
@ -103,12 +108,16 @@ int list_files( const std::vector< std::string > & filenames,
|
|||
first_post = true; // reprint heading after list of members
|
||||
}
|
||||
std::fflush( stdout );
|
||||
if( std::ferror( stdout ) ) break;
|
||||
}
|
||||
if( verbosity >= 0 && files > 1 )
|
||||
if( verbosity >= 0 && files > 1 && !std::ferror( stdout ) )
|
||||
{
|
||||
if( verbosity >= 1 ) std::fputs( " ", stdout );
|
||||
list_line( total_uncomp, total_comp, "(totals)" );
|
||||
std::fflush( stdout );
|
||||
}
|
||||
if( verbosity >= 0 && ( std::ferror( stdout ) || std::fclose( stdout ) != 0 ) )
|
||||
{ show_file_error( "(stdout)", write_error_msg, errno );
|
||||
set_retval( retval, 1 ); }
|
||||
return retval;
|
||||
}
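
The change to list_files above reports exit status 2 when a multimember file contains an empty member. The following is a minimal sketch of that logic under two assumptions: that set_retval keeps the larger of the two status values (its body is not shown in this diff), and that the check mirrors the multi_empty member function added to lzip_index.h. The names merge_retval, has_multi_empty, and the sample arrays are hypothetical.

```cpp
// Assumed behaviour of set_retval: keep the most severe status seen so far.
constexpr int merge_retval( const int retval, const int new_val )
  { return ( retval < new_val ) ? new_val : retval; }

// True if a file with more than one member has a member of data size 0.
// A single empty member (an empty file compressed alone) is accepted.
constexpr bool has_multi_empty( const unsigned long long * const data_sizes,
                                const int members )
  {
  if( members > 1 )
    for( int i = 0; i < members; ++i )
      if( data_sizes[i] == 0 ) return true;
  return false;
  }

// Hypothetical uncompressed sizes of the members of two files
constexpr unsigned long long one_empty_member[] = { 0 };
constexpr unsigned long long three_members[] = { 4096, 0, 512 };

static_assert( !has_multi_empty( one_empty_member, 1 ), "single member is ok" );
static_assert( has_multi_empty( three_members, 3 ), "empty member found" );
static_assert( merge_retval( 1, 2 ) == 2, "status is raised to 2" );
static_assert( merge_retval( 2, 1 ) == 2, "status is never lowered" );
```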
|
||||
|
|
6
lzip.h
|
@ -207,8 +207,10 @@ inline void set_retval( int & retval, const int new_val )
|
|||
const char * const bad_magic_msg = "Bad magic number (file not in lzip format).";
|
||||
const char * const bad_dict_msg = "Invalid dictionary size in member header.";
|
||||
const char * const corrupt_mm_msg = "Corrupt header in multimember file.";
|
||||
const char * const trailing_msg = "Trailing data not allowed.";
|
||||
const char * const empty_msg = "Empty member not allowed.";
|
||||
const char * const mem_msg = "Not enough memory.";
|
||||
const char * const trailing_msg = "Trailing data not allowed.";
|
||||
const char * const write_error_msg = "Write error";
|
||||
|
||||
// defined in compress.cc
|
||||
int readblock( const int fd, uint8_t * const buf, const int size );
|
||||
|
@ -255,7 +257,7 @@ void show_results( const unsigned long long in_size,
|
|||
int decompress( const unsigned long long cfile_size, int num_workers,
|
||||
const int infd, const int outfd, const Cl_options & cl_opts,
|
||||
const Pretty_print & pp, const int debug_level,
|
||||
const int in_slots, const int out_slots,
|
||||
const int in_slots, const int out_slots, const bool from_stdin,
|
||||
const bool infd_isreg, const bool one_to_one );
|
||||
|
||||
// defined in list.cc
|
||||
|
|
|
@ -45,9 +45,8 @@ int seek_read( const int fd, uint8_t * const buf, const int size,
|
|||
|
||||
bool Lzip_index::check_header( const Lzip_header & header, const bool first )
|
||||
{
|
||||
if( !header.check_magic() )
|
||||
{ error_ = bad_magic_msg; retval_ = 2; if( first ) bad_magic_ = true;
|
||||
return false; }
|
||||
if( header.check_magic() ) { if( first ) good_magic_ = true; }
|
||||
else { error_ = bad_magic_msg; retval_ = 2; return false; }
|
||||
if( !header.check_version() )
|
||||
{ error_ = bad_version( header.version() ); retval_ = 2; return false; }
|
||||
if( !isvalid_ds( header.dictionary_size() ) )
|
||||
|
@ -145,20 +144,20 @@ bool Lzip_index::skip_trailing_data( const int fd, unsigned long long & pos,
|
|||
|
||||
Lzip_index::Lzip_index( const int infd, const Cl_options & cl_opts )
|
||||
: insize( lseek( infd, 0, SEEK_END ) ), retval_( 0 ), dictionary_size_( 0 ),
|
||||
bad_magic_( false )
|
||||
good_magic_( false )
|
||||
{
|
||||
if( insize < 0 )
|
||||
{ set_errno_error( "Input file is not seekable: " ); return; }
|
||||
Lzip_header header;
|
||||
if( insize >= header.size &&
|
||||
( !read_header( infd, header, 0 ) ||
|
||||
!check_header( header, true ) ) ) return;
|
||||
if( insize < min_member_size )
|
||||
{ error_ = "Input file is too short."; retval_ = 2; return; }
|
||||
if( insize > INT64_MAX )
|
||||
{ error_ = "Input file is too long (2^63 bytes or more).";
|
||||
retval_ = 2; return; }
|
||||
|
||||
Lzip_header header;
|
||||
if( !read_header( infd, header, 0 ) ||
|
||||
!check_header( header, true ) ) return;
|
||||
|
||||
unsigned long long pos = insize; // always points to a header or to EOF
|
||||
while( pos >= min_member_size )
|
||||
{
|
||||
|
|
12
lzip_index.h
|
@ -55,7 +55,7 @@ class Lzip_index
|
|||
const long long insize;
|
||||
int retval_;
|
||||
unsigned dictionary_size_; // largest dictionary size in the file
|
||||
bool bad_magic_; // bad magic in first header
|
||||
bool good_magic_; // good magic in first header
|
||||
|
||||
bool check_header( const Lzip_header & header, const bool first );
|
||||
void set_errno_error( const char * const msg );
|
||||
|
@ -71,7 +71,15 @@ public:
|
|||
const std::string & error() const { return error_; }
|
||||
int retval() const { return retval_; }
|
||||
unsigned dictionary_size() const { return dictionary_size_; }
|
||||
bool bad_magic() const { return bad_magic_; }
|
||||
bool good_magic() const { return good_magic_; }
|
||||
|
||||
bool multi_empty() const // multimember file with empty member(s)
|
||||
{
|
||||
if( member_vector.size() > 1 )
|
||||
for( unsigned long i = 0; i < member_vector.size(); ++i )
|
||||
if( member_vector[i].dblock.size() == 0 ) return true;
|
||||
return false;
|
||||
}
|
||||
|
||||
long long udata_size() const
|
||||
{ if( member_vector.empty() ) return 0;
|
||||
|
|
72
main.cc
|
@ -26,7 +26,7 @@
|
|||
|
||||
#include <algorithm>
|
||||
#include <cerrno>
|
||||
#include <climits> // SSIZE_MAX
|
||||
#include <climits> // CHAR_BIT, SSIZE_MAX
|
||||
#include <csignal>
|
||||
#include <cstdio>
|
||||
#include <cstdlib>
|
||||
|
@ -42,8 +42,10 @@
|
|||
#if defined __MSVCRT__ || defined __OS2__
|
||||
#include <io.h>
|
||||
#if defined __MSVCRT__
|
||||
#include <direct.h>
|
||||
#define fchmod(x,y) 0
|
||||
#define fchown(x,y,z) 0
|
||||
#define mkdir(name,mode) _mkdir(name)
|
||||
#define strtoull std::strtoul
|
||||
#define SIGHUP SIGTERM
|
||||
#define S_ISSOCK(x) 0
|
||||
|
@ -102,25 +104,26 @@ bool delete_output_on_interrupt = false;
|
|||
|
||||
void show_help( const long num_online )
|
||||
{
|
||||
std::printf( "Plzip is a massively parallel (multi-threaded) implementation of lzip,\n"
|
||||
"compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.\n"
|
||||
std::printf( "Plzip is a massively parallel (multi-threaded) implementation of lzip. Plzip\n"
|
||||
"uses the compression library lzlib.\n"
|
||||
"\nLzip is a lossless data compressor with a user interface similar to the one\n"
|
||||
"of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov\n"
|
||||
"chain-Algorithm' (LZMA) stream format to maximize interoperability. The\n"
|
||||
"maximum dictionary size is 512 MiB so that any lzip file can be decompressed\n"
|
||||
"on 32-bit machines. Lzip provides accurate and robust 3-factor integrity\n"
|
||||
"checking. Lzip can compress about as fast as gzip (lzip -0) or compress most\n"
|
||||
"files more than bzip2 (lzip -9). Decompression speed is intermediate between\n"
|
||||
"gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery\n"
|
||||
"perspective. Lzip has been designed, written, and tested with great care to\n"
|
||||
"replace gzip and bzip2 as the standard general-purpose compressed format for\n"
|
||||
"Unix-like systems.\n"
|
||||
"of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov\n"
|
||||
"chain-Algorithm) designed to achieve complete interoperability between\n"
|
||||
"implementations. The maximum dictionary size is 512 MiB so that any lzip\n"
|
||||
"file can be decompressed on 32-bit machines. Lzip provides accurate and\n"
|
||||
"robust 3-factor integrity checking. 'lzip -0' compresses about as fast as\n"
|
||||
"gzip, while 'lzip -9' compresses most files more than bzip2. Decompression\n"
|
||||
"speed is intermediate between gzip and bzip2. Lzip provides better data\n"
|
||||
"recovery capabilities than gzip and bzip2. Lzip has been designed, written,\n"
|
||||
"and tested with great care to replace gzip and bzip2 as general-purpose\n"
|
||||
"compressed format for Unix-like systems.\n"
|
||||
"\nPlzip can compress/decompress large files on multiprocessor machines much\n"
|
||||
"faster than lzip, at the cost of a slightly reduced compression ratio (0.4\n"
|
||||
"to 2 percent larger compressed files). Note that the number of usable\n"
|
||||
"threads is limited by file size; on files larger than a few GB plzip can use\n"
|
||||
"hundreds of processors, but on files of only a few MB plzip is no faster\n"
|
||||
"than lzip.\n"
|
||||
"hundreds of processors, but on files smaller than 1 MiB plzip is no faster\n"
|
||||
"than lzip (even at compression level -0).\n"
|
||||
"The number of threads defaults to the number of processors.\n"
|
||||
"\nUsage: %s [options] [files]\n", invocation_name );
|
||||
std::printf( "\nOptions:\n"
|
||||
" -h, --help display this help and exit\n"
|
||||
|
@ -277,7 +280,7 @@ const char * format_ds( const unsigned dictionary_size )
|
|||
const char * p = "";
|
||||
const char * np = " ";
|
||||
unsigned num = dictionary_size;
|
||||
bool exact = ( num % factor == 0 );
|
||||
bool exact = num % factor == 0;
|
||||
|
||||
for( int i = 0; i < n && ( num > 9999 || ( exact && num >= factor ) ); ++i )
|
||||
{ num /= factor; if( num % factor != 0 ) exact = false;
|
||||
|
@ -294,7 +297,7 @@ void show_header( const unsigned dictionary_size )
|
|||
|
||||
namespace {
|
||||
|
||||
// separate numbers of 5 or more digits in groups of 3 digits using '_'
|
||||
// separate numbers of 6 or more digits in groups of 3 digits using '_'
|
||||
const char * format_num3( unsigned long long num )
|
||||
{
|
||||
enum { buffers = 8, bufsize = 4 * sizeof num, n = 10 };
|
||||
|
@ -306,7 +309,7 @@ const char * format_num3( unsigned long long num )
|
|||
char * const buf = buffer[current++]; current %= buffers;
|
||||
char * p = buf + bufsize - 1; // fill the buffer backwards
|
||||
*p = 0; // terminator
|
||||
if( num > 1024 )
|
||||
if( num > 9999 )
|
||||
{
|
||||
char prefix = 0; // try binary first, then si
|
||||
for( int i = 0; i < n && num != 0 && num % 1024 == 0; ++i )
|
||||
|
@ -317,7 +320,7 @@ const char * format_num3( unsigned long long num )
|
|||
{ num /= 1000; prefix = si_prefix[i]; }
|
||||
if( prefix ) *(--p) = prefix;
|
||||
}
|
||||
const bool split = num >= 10000;
|
||||
const bool split = num >= 100000;
|
||||
|
||||
for( int i = 0; ; )
|
||||
{
|
||||
|
@ -352,7 +355,7 @@ unsigned long long getnum( const char * const arg,
|
|||
|
||||
if( !errno && tail[0] )
|
||||
{
|
||||
const unsigned factor = ( tail[1] == 'i' ) ? 1024 : 1000;
|
||||
const unsigned factor = (tail[1] == 'i') ? 1024 : 1000;
|
||||
int exponent = 0; // 0 = bad multiplier
|
||||
switch( tail[0] )
|
||||
{
|
||||
|
@ -470,9 +473,9 @@ int open_instream( const char * const name, struct stat * const in_statsp,
|
|||
{
|
||||
const int i = fstat( infd, in_statsp );
|
||||
const mode_t mode = in_statsp->st_mode;
|
||||
const bool can_read = ( i == 0 && !reg_only &&
|
||||
( S_ISBLK( mode ) || S_ISCHR( mode ) ||
|
||||
S_ISFIFO( mode ) || S_ISSOCK( mode ) ) );
|
||||
const bool can_read = i == 0 && !reg_only &&
|
||||
( S_ISBLK( mode ) || S_ISCHR( mode ) ||
|
||||
S_ISFIFO( mode ) || S_ISSOCK( mode ) );
|
||||
if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || one_to_one ) ) )
|
||||
{
|
||||
if( verbosity >= 0 )
|
||||
|
@ -495,7 +498,7 @@ int open_instream2( const char * const name, struct stat * const in_statsp,
|
|||
if( program_mode == m_compress && !recompress && eindex >= 0 )
|
||||
{
|
||||
if( verbosity >= 0 )
|
||||
std::fprintf( stderr, "%s: %s: Input file already has '%s' suffix.\n",
|
||||
std::fprintf( stderr, "%s: %s: Input file already has '%s' suffix, ignored.\n",
|
||||
program_name, name, known_extensions[eindex].from );
|
||||
return -1;
|
||||
}
|
||||
|
@ -539,8 +542,8 @@ bool open_outstream( const bool force, const bool protect )
|
|||
if( force ) flags |= O_TRUNC; else flags |= O_EXCL;
|
||||
|
||||
outfd = -1;
|
||||
if( output_filename.size() &&
|
||||
output_filename[output_filename.size()-1] == '/' ) errno = EISDIR;
|
||||
if( output_filename.size() && output_filename.end()[-1] == '/' )
|
||||
errno = EISDIR;
|
||||
else {
|
||||
if( !protect && !make_dirs( output_filename ) )
|
||||
{ show_file_error( output_filename.c_str(),
|
||||
|
@ -810,7 +813,7 @@ int main( const int argc, const char * const argv[] )
|
|||
{ opt_in, "in-slots", Arg_parser::yes },
|
||||
{ opt_lt, "loose-trailing", Arg_parser::no },
|
||||
{ opt_out, "out-slots", Arg_parser::yes },
|
||||
{ 0, 0, Arg_parser::no } };
|
||||
{ 0, 0, Arg_parser::no } };
|
||||
|
||||
const Arg_parser parser( argc, argv, options );
|
||||
if( parser.error().size() ) // bad option
|
||||
|
@ -831,11 +834,11 @@ int main( const int argc, const char * const argv[] )
|
|||
const char * const arg = sarg.c_str();
|
||||
switch( code )
|
||||
{
|
||||
case '0': case '1': case '2': case '3': case '4':
|
||||
case '5': case '6': case '7': case '8': case '9':
|
||||
case '0': case '1': case '2': case '3': case '4': case '5':
|
||||
case '6': case '7': case '8': case '9':
|
||||
encoder_options = option_mapping[code-'0']; break;
|
||||
case 'a': cl_opts.ignore_trailing = false; break;
|
||||
case 'b': break;
|
||||
case 'b': break; // ignored
|
||||
case 'B': data_size = getnum( arg, pn, 2 * LZ_min_dictionary_size(),
|
||||
2 * LZ_max_dictionary_size() ); break;
|
||||
case 'c': to_stdout = true; break;
|
||||
|
@ -854,7 +857,7 @@ int main( const int argc, const char * const argv[] )
|
|||
case 'q': verbosity = -1; break;
|
||||
case 's': encoder_options.dictionary_size = get_dict_size( arg, pn );
|
||||
break;
|
||||
case 'S': break;
|
||||
case 'S': break; // ignored
|
||||
case 't': set_mode( program_mode, m_test ); break;
|
||||
case 'v': if( verbosity < 4 ) ++verbosity; break;
|
||||
case 'V': show_version(); return 0;
|
||||
|
@ -935,9 +938,10 @@ int main( const int argc, const char * const argv[] )
|
|||
{
|
||||
std::string input_filename;
|
||||
int infd;
|
||||
const bool from_stdin = filenames[i] == "-";
|
||||
|
||||
pp.set_name( filenames[i] );
|
||||
if( filenames[i] == "-" )
|
||||
if( from_stdin )
|
||||
{
|
||||
if( stdin_used ) continue; else stdin_used = true;
|
||||
infd = STDIN_FILENO;
|
||||
|
@ -985,8 +989,8 @@ int main( const int argc, const char * const argv[] )
|
|||
infd, outfd, pp, debug_level );
|
||||
else
|
||||
tmp = decompress( cfile_size, num_workers, infd, outfd, cl_opts, pp,
|
||||
debug_level, in_slots, out_slots, infd_isreg,
|
||||
one_to_one );
|
||||
debug_level, in_slots, out_slots, from_stdin,
|
||||
infd_isreg, one_to_one );
|
||||
if( close( infd ) != 0 )
|
||||
{ show_file_error( pp.name(), "Error closing input file", errno );
|
||||
set_retval( tmp, 1 ); }
|
||||
|
|
|
@ -28,9 +28,8 @@ if [ -d tmp ] ; then rm -rf tmp ; fi
|
|||
mkdir tmp
|
||||
cd "${objdir}"/tmp || framework_failure
|
||||
|
||||
cat "${testdir}"/test.txt > in || framework_failure
|
||||
cp "${testdir}"/test.txt in || framework_failure
|
||||
in_lz="${testdir}"/test.txt.lz
|
||||
in_em="${testdir}"/test_em.txt.lz
|
||||
fox_lz="${testdir}"/fox.lz
|
||||
fail=0
|
||||
lwarn8=0
|
||||
|
@ -112,33 +111,25 @@ printf "LZIP\001+.............................." | "${LZIP}" -t 2> /dev/null
|
|||
|
||||
printf "\ntesting decompression..."
|
||||
|
||||
for i in "${in_lz}" "${in_em}" ; do
|
||||
"${LZIP}" -lq "$i" || test_failed $LINENO "$i"
|
||||
"${LZIP}" -t "$i" || test_failed $LINENO "$i"
|
||||
"${LZIP}" -d "$i" -o out || test_failed $LINENO "$i"
|
||||
cmp in out || test_failed $LINENO "$i"
|
||||
"${LZIP}" -cd "$i" > out || test_failed $LINENO "$i"
|
||||
cmp in out || test_failed $LINENO "$i"
|
||||
"${LZIP}" -d "$i" -o - > out || test_failed $LINENO "$i"
|
||||
cmp in out || test_failed $LINENO "$i"
|
||||
"${LZIP}" -d < "$i" > out || test_failed $LINENO "$i"
|
||||
cmp in out || test_failed $LINENO "$i"
|
||||
rm -f out || framework_failure
|
||||
done
|
||||
"${LZIP}" -l "${in_lz}" > /dev/null || test_failed $LINENO
|
||||
"${LZIP}" -t "${in_lz}" || test_failed $LINENO
|
||||
"${LZIP}" -d "${in_lz}" -o out || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
"${LZIP}" -cd "${in_lz}" > out || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
"${LZIP}" -d "${in_lz}" -o - > out || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
"${LZIP}" -d < "${in_lz}" > out || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
rm -f out || framework_failure
|
||||
|
||||
lines=`"${LZIP}" -tvv "${in_em}" 2>&1 | wc -l` || test_failed $LINENO
|
||||
[ "${lines}" -eq 1 ] || test_failed $LINENO "${lines}"
|
||||
|
||||
lines=`"${LZIP}" -lvv "${in_em}" | wc -l` || test_failed $LINENO
|
||||
[ "${lines}" -eq 11 ] || test_failed $LINENO "${lines}"
|
||||
|
||||
cat "${in_lz}" > out.lz || framework_failure
|
||||
cp "${in_lz}" out.lz || framework_failure
|
||||
"${LZIP}" -dk out.lz || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
rm -f out || framework_failure
|
||||
"${LZIP}" -cd "${fox_lz}" > fox || test_failed $LINENO
|
||||
cat fox > copy || framework_failure
|
||||
cat "${in_lz}" > copy.lz || framework_failure
|
||||
cp fox copy || framework_failure
|
||||
cp "${in_lz}" copy.lz || framework_failure
|
||||
"${LZIP}" -d copy.lz out.lz 2> /dev/null # skip copy, decompress out
|
||||
[ $? = 1 ] || test_failed $LINENO
|
||||
[ ! -e out.lz ] || test_failed $LINENO
|
||||
|
@ -152,7 +143,6 @@ rm -f copy out || framework_failure
|
|||
printf "to be overwritten" > out || framework_failure
|
||||
"${LZIP}" -df -o out < "${in_lz}" || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
rm -f out || framework_failure
|
||||
"${LZIP}" -d -o ./- "${in_lz}" || test_failed $LINENO
|
||||
cmp in ./- || test_failed $LINENO
|
||||
rm -f ./- || framework_failure
|
||||
|
@ -160,12 +150,12 @@ rm -f ./- || framework_failure
|
|||
cmp in ./- || test_failed $LINENO
|
||||
rm -f ./- || framework_failure
|
||||
|
||||
cat "${in_lz}" > anyothername || framework_failure
|
||||
cp "${in_lz}" anyothername || framework_failure
|
||||
"${LZIP}" -dv - anyothername - < "${in_lz}" > out 2> /dev/null ||
|
||||
test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
cmp in anyothername.out || test_failed $LINENO
|
||||
rm -f out anyothername.out || framework_failure
|
||||
rm -f anyothername.out || framework_failure
|
||||
|
||||
"${LZIP}" -lq in "${in_lz}"
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
|
@ -182,7 +172,7 @@ cat out in | cmp in - || test_failed $LINENO # out must be empty
|
|||
[ $? = 1 ] || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
rm -f out || framework_failure
|
||||
cat "${in_lz}" > out.lz || framework_failure
|
||||
cp "${in_lz}" out.lz || framework_failure
|
||||
for i in 1 2 3 4 5 6 7 ; do
|
||||
printf "g" >> out.lz || framework_failure
|
||||
"${LZIP}" -alvv out.lz "${in_lz}" > /dev/null 2>&1
|
||||
|
@ -203,7 +193,7 @@ cmp in out || test_failed $LINENO
|
|||
rm -f out || framework_failure
|
||||
|
||||
cat in in > in2 || framework_failure
|
||||
"${LZIP}" -lq "${in_lz}" "${in_lz}" || test_failed $LINENO
|
||||
"${LZIP}" -l "${in_lz}" "${in_lz}" > /dev/null || test_failed $LINENO
|
||||
"${LZIP}" -t "${in_lz}" "${in_lz}" || test_failed $LINENO
|
||||
"${LZIP}" -cd "${in_lz}" "${in_lz}" -o out > out2 || test_failed $LINENO
|
||||
[ ! -e out ] || test_failed $LINENO # override -o
|
||||
|
@ -214,6 +204,11 @@ cmp in2 out2 || test_failed $LINENO
|
|||
rm -f out2 || framework_failure
|
||||
|
||||
cat "${in_lz}" "${in_lz}" > out2.lz || framework_failure
|
||||
lines=`"${LZIP}" -tvv out2.lz 2>&1 | wc -l` || test_failed $LINENO
|
||||
[ "${lines}" -eq 1 ] || test_failed $LINENO "${lines}"
|
||||
lines=`"${LZIP}" -lvv out2.lz | wc -l` || test_failed $LINENO
|
||||
[ "${lines}" -eq 5 ] || test_failed $LINENO "${lines}"
|
||||
|
||||
printf "\ngarbage" >> out2.lz || framework_failure
|
||||
"${LZIP}" -tvvvv out2.lz 2> /dev/null || test_failed $LINENO
|
||||
"${LZIP}" -alq out2.lz
|
||||
|
@ -243,6 +238,21 @@ rm -rf a || framework_failure
|
|||
[ $? = 1 ] || test_failed $LINENO
|
||||
[ ! -e a ] || test_failed $LINENO
|
||||
|
||||
touch empty em || framework_failure
|
||||
"${LZIP}" -0 em || test_failed $LINENO
|
||||
"${LZIP}" -l em.lz > /dev/null || test_failed $LINENO
|
||||
"${LZIP}" -dk em.lz || test_failed $LINENO
|
||||
cmp empty em || test_failed $LINENO
|
||||
cat em.lz em.lz | "${LZIP}" -t || test_failed $LINENO
|
||||
cat em.lz em.lz | "${LZIP}" -d > em || test_failed $LINENO
|
||||
cmp empty em || test_failed $LINENO
|
||||
cat em.lz "${in_lz}" | "${LZIP}" -t || test_failed $LINENO
|
||||
cat em.lz "${in_lz}" | "${LZIP}" -d > out || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
cat "${in_lz}" em.lz | "${LZIP}" -t || test_failed $LINENO
|
||||
cat "${in_lz}" em.lz | "${LZIP}" -d > out || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
|
||||
printf "\ntesting compression..."
|
||||
|
||||
"${LZIP}" -c -0 in in in -o out3.lz > copy2.lz || test_failed $LINENO
|
||||
|
@ -251,7 +261,7 @@ printf "\ntesting compression..."
|
|||
"${LZIP}" -d copy2.lz -o out2 || test_failed $LINENO
|
||||
[ -e copy2.lz ] || test_failed $LINENO
|
||||
cmp in2 out2 || test_failed $LINENO
|
||||
rm -f in2 out2 copy2.lz || framework_failure
|
||||
rm -f copy2.lz || framework_failure
|
||||
|
||||
"${LZIP}" -cf "${in_lz}" > lzlz 2> /dev/null # /dev/null is a tty on OS/2
|
||||
[ $? = 1 ] || test_failed $LINENO
|
||||
|
@ -332,11 +342,44 @@ rm -rf a fox || framework_failure
|
|||
|
||||
printf "\ntesting bad input..."
|
||||
|
||||
cat em.lz em.lz > ee.lz || framework_failure
|
||||
"${LZIP}" -l < ee.lz > /dev/null || test_failed $LINENO
|
||||
"${LZIP}" -t < ee.lz || test_failed $LINENO
|
||||
"${LZIP}" -d < ee.lz > em || test_failed $LINENO
|
||||
cmp empty em || test_failed $LINENO
|
||||
"${LZIP}" -lq ee.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -tq ee.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -dq ee.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
[ ! -e ee ] || test_failed $LINENO
|
||||
"${LZIP}" -cdq ee.lz > em
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
cmp empty em || test_failed $LINENO
|
||||
rm -f empty em || framework_failure
|
||||
cat "${in_lz}" em.lz "${in_lz}" > inein.lz || framework_failure
|
||||
"${LZIP}" -l < inein.lz > /dev/null || test_failed $LINENO
|
||||
"${LZIP}" -t < inein.lz || test_failed $LINENO
|
||||
"${LZIP}" -d < inein.lz > out2 || test_failed $LINENO
|
||||
cmp in2 out2 || test_failed $LINENO
|
||||
"${LZIP}" -lq inein.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -tq inein.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -dq inein.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
[ ! -e inein ] || test_failed $LINENO
|
||||
"${LZIP}" -cdq inein.lz > out2
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
cmp in2 out2 || test_failed $LINENO
|
||||
rm -f in2 out2 inein.lz em.lz || framework_failure
|
||||
|
||||
headers='LZIp LZiP LZip LzIP LzIp LziP lZIP lZIp lZiP lzIP'
|
||||
body='\001\014\000\203\377\373\377\377\300\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000$\000\000\000\000\000\000\000'
|
||||
cat "${in_lz}" > int.lz || framework_failure
|
||||
body='\001\014\000\000\101\376\367\377\377\340\000\200\000\215\357\002\322\001\000\000\000\000\000\000\000\045\000\000\000\000\000\000\000'
|
||||
cp "${in_lz}" int.lz || framework_failure
|
||||
printf "LZIP${body}" >> int.lz || framework_failure
|
||||
if "${LZIP}" -tq int.lz ; then
|
||||
if "${LZIP}" -t int.lz ; then
|
||||
for header in ${headers} ; do
|
||||
printf "${header}${body}" > int.lz || framework_failure
|
||||
"${LZIP}" -lq int.lz # first member
|
||||
|
@ -355,7 +398,7 @@ if "${LZIP}" -tq int.lz ; then
|
|||
[ $? = 2 ] || test_failed $LINENO ${header}
|
||||
"${LZIP}" -cdq --loose-trailing int.lz > /dev/null
|
||||
[ $? = 2 ] || test_failed $LINENO ${header}
|
||||
cat "${in_lz}" > int.lz || framework_failure
|
||||
cp "${in_lz}" int.lz || framework_failure
|
||||
printf "${header}${body}" >> int.lz || framework_failure
|
||||
"${LZIP}" -lq int.lz # trailing data
|
||||
[ $? = 2 ] || test_failed $LINENO ${header}
|
||||
|
@ -365,7 +408,7 @@ if "${LZIP}" -tq int.lz ; then
|
|||
[ $? = 2 ] || lzlib_1_10 # requires lzlib 1.10
|
||||
"${LZIP}" -cdq int.lz > /dev/null
|
||||
[ $? = 2 ] || test_failed $LINENO ${header}
|
||||
"${LZIP}" -lq --loose-trailing int.lz ||
|
||||
"${LZIP}" -l --loose-trailing int.lz > /dev/null ||
|
||||
test_failed $LINENO ${header}
|
||||
"${LZIP}" -t --loose-trailing int.lz ||
|
||||
test_failed $LINENO ${header}
|
||||
|
@ -383,7 +426,7 @@ if "${LZIP}" -tq int.lz ; then
|
|||
[ $? = 2 ] || test_failed $LINENO ${header}
|
||||
done
|
||||
else
|
||||
printf "\nwarning: skipping header test: 'printf' does not work on your system."
|
||||
printf "warning: skipping header test: 'printf' does not work on your system."
|
||||
fi
|
||||
rm -f int.lz || framework_failure
|
||||
|
||||
|
@ -395,9 +438,9 @@ done
|
|||
|
||||
cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
|
||||
cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
|
||||
if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null &&
|
||||
[ -e trunc.lz ] && cmp in2.lz trunc.lz > /dev/null 2>&1 ; then
|
||||
for i in 6 20 14734 14753 14754 14755 14756 14757 14758 ; do
|
||||
if dd if=in3.lz of=trunc.lz bs=14682 count=1 2> /dev/null &&
|
||||
[ -e trunc.lz ] && cmp in2.lz trunc.lz ; then
|
||||
for i in 6 20 14664 14683 14684 14685 14686 14687 14688 ; do
|
||||
dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null
|
||||
"${LZIP}" -lq trunc.lz
|
||||
[ $? = 2 ] || test_failed $LINENO $i
|
||||
|
@ -411,11 +454,11 @@ if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null &&
|
|||
[ $? = 2 ] || lzlib_1_8 # requires lzlib 1.8
|
||||
done
|
||||
else
|
||||
printf "\nwarning: skipping truncation test: 'dd' does not work on your system."
|
||||
printf "warning: skipping truncation test: 'dd' does not work on your system."
|
||||
fi
|
||||
rm -f in2.lz in3.lz trunc.lz || framework_failure
|
||||
|
||||
cat "${in_lz}" > ingin.lz || framework_failure
|
||||
cp "${in_lz}" ingin.lz || framework_failure
|
||||
printf "g" >> ingin.lz || framework_failure
|
||||
cat "${in_lz}" >> ingin.lz || framework_failure
|
||||
"${LZIP}" -lq ingin.lz
|
||||
|
@ -424,14 +467,17 @@ cat "${in_lz}" >> ingin.lz || framework_failure
|
|||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -atq < ingin.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -acdq ingin.lz > /dev/null
|
||||
"${LZIP}" -acdq ingin.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -adq < ingin.lz > /dev/null
|
||||
"${LZIP}" -adq < ingin.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -tq ingin.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -t < ingin.lz || test_failed $LINENO
|
||||
"${LZIP}" -cdq ingin.lz > out
|
||||
"${LZIP}" -dq ingin.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
[ ! -e ingin ] || test_failed $LINENO
|
||||
"${LZIP}" -cdq ingin.lz
|
||||
[ $? = 2 ] || test_failed $LINENO
|
||||
"${LZIP}" -d < ingin.lz > out || test_failed $LINENO
|
||||
cmp in out || test_failed $LINENO
|
||||
|
|
|
@ -1,8 +1,7 @@
|
|||
GNU GENERAL PUBLIC LICENSE
|
||||
Version 2, June 1991
|
||||
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
|
||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc. <http://fsf.org/>
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
|
||||
|
@ -339,8 +338,7 @@ Public License instead of this License.
|
|||
GNU GENERAL PUBLIC LICENSE
|
||||
Version 2, June 1991
|
||||
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
|
||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc. <http://fsf.org/>
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
|
||||
|
|
Binary file not shown.
Binary file not shown.