
Adding upstream version 1.1~pre1.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-24 04:01:20 +01:00
parent 4bce01c02a
commit e7c68f81ff
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
17 changed files with 356 additions and 277 deletions


@ -1,3 +1,9 @@
2013-07-20 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.1-pre1 released.
* Show progress of compression at verbosity level 2 (-vv).
* SIGUSR1 and SIGUSR2 are no more used to signal a fatal error.
2013-05-29 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.0 released.


@ -1,7 +1,7 @@
Requirements
------------
You will need a C++ compiler and the lzlib compression library installed.
I use gcc 4.8.0 and 3.3.6, but the code should compile with any
I use gcc 4.8.1 and 3.3.6, but the code should compile with any
standards compliant compiler.
Lzlib must be version 1.0 or newer.
Gcc is available at http://gcc.gnu.org.
@ -12,9 +12,9 @@ Procedure
---------
1. Unpack the archive if you have not done so already:
lzip -cd plzip[version].tar.lz | tar -xf -
tar -xf plzip[version].tar.lz
or
gzip -cd plzip[version].tar.gz | tar -xf -
lzip -cd plzip[version].tar.lz | tar -xf -
This creates the directory ./plzip[version] containing the source from
the main archive.

NEWS

@ -1,15 +1,5 @@
Changes in version 1.0:
Changes in version 1.1:
Scalability of compression (max number of useful worker threads) has
been increased.
Plzip now shows the progress of compression at verbosity level 2 (-vv).
Scalability when decompressing from/to regular files has been increased.
The number of worker threads is now limited to the number of members in
the input file when decompressing from a regular file.
"configure" now accepts options with a separate argument.
The target "install-as-lzip" has been added to the Makefile.
The target "install-bin" has been added to the Makefile.
Signals "SIGUSR1" and "SIGUSR2" are no more used to signal a fatal error.

README

@ -1,21 +1,40 @@
Description
Plzip is a massively parallel (multi-threaded), lossless data compressor
based on the lzlib compression library, with very safe integrity
checking and a user interface similar to the one of bzip2, gzip or lzip.
based on the lzlib compression library, with a user interface similar to
the one of lzip, bzip2 or gzip.
Plzip is intended for faster compression/decompression of big files on
multiprocessor machines, which makes it specially well suited for
distribution of big software files and large scale data archiving. On
files big enough (several GB), plzip can use hundreds of processors.
Plzip uses the lzip file format; the files produced by plzip are fully
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
Plzip can compress/decompress large files on multiprocessor machines
much faster than lzip, at the cost of a slightly reduced compression
ratio. On files large enough (several GB), plzip can use hundreds of
processors. On files of only a few MB it is better to use lzip.
Plzip uses the same well-defined exit status values used by lzip and
bzip2, which makes it safer when used in pipes or scripts than
compressors returning ambiguous warning values, like gzip.
Plzip uses the lzip file format; the files produced by plzip are fully
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
The lzip file format is designed for long-term data archiving and
provides very safe integrity checking. The member trailer stores the
32-bit CRC of the original data, the size of the original data and the
size of the member. These values, together with the value remaining in
the range decoder and the end-of-stream marker, provide a 4 factor
integrity checking which guarantees that the decompressed version of the
data is identical to the original. This guards against corruption of the
compressed data, and against undetected bugs in plzip (hopefully very
unlikely). The chances of data corruption going undetected are
microscopic. Be aware, though, that the check occurs upon decompression,
so it can only tell you that something is wrong. It can't help you
recover the original uncompressed data.
If you ever need to recover data from a damaged lzip file, try the
lziprecover program. Lziprecover makes lzip files resistant to bit-flip
(one of the most common forms of data corruption), and provides data
recovery capabilities, including error-checked merging of damaged copies
of a file.
Plzip replaces every file given in the command line with a compressed
version of itself, with the name "original_name.lz". Each compressed
file has the same modification date, permissions, and, when possible,
@ -33,18 +52,6 @@ or more compressed files. The result is the concatenation of the
corresponding uncompressed files. Integrity testing of concatenated
compressed files is also supported.
As a self-check for your protection, plzip stores in the member trailer
the 32-bit CRC of the original data, the size of the original data and
the size of the member. These values, together with the value remaining
in the range decoder and the end-of-stream marker, provide a very safe 4
factor integrity checking which guarantees that the decompressed version
of the data is identical to the original. This guards against corruption
of the compressed data, and against undetected bugs in plzip (hopefully
very unlikely). The chances of data corruption going undetected are
microscopic. Be aware, though, that the check occurs upon decompression,
so it can only tell you that something is wrong. It can't help you
recover the original uncompressed data.
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
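
Review note: the integrity paragraph moved in this hunk lists a 32-bit CRC of the original data as one of the four checks. The following is a minimal sketch of that single factor, assuming the CRC is the common reflected CRC-32 (polynomial 0xEDB88320, the one used by Ethernet and zlib) and that it is stored little-endian in the first 4 bytes of the member trailer. The names crc32_of and trailer_crc_matches are illustrative and are not taken from plzip.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Bitwise reflected CRC-32 (polynomial 0xEDB88320); slow but table-free.
uint32_t crc32_of( const uint8_t * const data, const std::size_t size )
  {
  uint32_t crc = 0xFFFFFFFFU;
  for( std::size_t i = 0; i < size; ++i )
    {
    crc ^= data[i];
    for( int k = 0; k < 8; ++k )
      crc = ( crc & 1 ) ? ( crc >> 1 ) ^ 0xEDB88320U : crc >> 1;
    }
  return crc ^ 0xFFFFFFFFU;
  }

// Compare the CRC of the decompressed data with the little-endian value
// assumed to sit in the first 4 bytes of a (hypothetical) trailer buffer.
bool trailer_crc_matches( const uint8_t * const trailer,
                          const uint8_t * const data, const std::size_t size )
  {
  uint32_t stored = 0;
  for( int i = 3; i >= 0; --i ) { stored <<= 8; stored += trailer[i]; }
  return stored == crc32_of( data, size );
  }
```

As the README says, a mismatch here only reports that something is wrong; recovering the data is left to a tool such as lziprecover.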


@ -1,4 +1,4 @@
/* Plzip - A parallel compressor compatible with lzip
/* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009 Laszlo Ersek.
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
@ -80,61 +80,70 @@ int writeblock( const int fd, const uint8_t * const buf, const int size )
void xinit( pthread_mutex_t * const mutex )
{
const int errcode = pthread_mutex_init( mutex, 0 );
if( errcode ) { show_error( "pthread_mutex_init", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_mutex_init", errcode ); cleanup_and_fail(); }
}
void xinit( pthread_cond_t * const cond )
{
const int errcode = pthread_cond_init( cond, 0 );
if( errcode ) { show_error( "pthread_cond_init", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_cond_init", errcode ); cleanup_and_fail(); }
}
void xdestroy( pthread_mutex_t * const mutex )
{
const int errcode = pthread_mutex_destroy( mutex );
if( errcode ) { show_error( "pthread_mutex_destroy", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_mutex_destroy", errcode ); cleanup_and_fail(); }
}
void xdestroy( pthread_cond_t * const cond )
{
const int errcode = pthread_cond_destroy( cond );
if( errcode ) { show_error( "pthread_cond_destroy", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_cond_destroy", errcode ); cleanup_and_fail(); }
}
void xlock( pthread_mutex_t * const mutex )
{
const int errcode = pthread_mutex_lock( mutex );
if( errcode ) { show_error( "pthread_mutex_lock", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_mutex_lock", errcode ); cleanup_and_fail(); }
}
void xunlock( pthread_mutex_t * const mutex )
{
const int errcode = pthread_mutex_unlock( mutex );
if( errcode ) { show_error( "pthread_mutex_unlock", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_mutex_unlock", errcode ); cleanup_and_fail(); }
}
void xwait( pthread_cond_t * const cond, pthread_mutex_t * const mutex )
{
const int errcode = pthread_cond_wait( cond, mutex );
if( errcode ) { show_error( "pthread_cond_wait", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_cond_wait", errcode ); cleanup_and_fail(); }
}
void xsignal( pthread_cond_t * const cond )
{
const int errcode = pthread_cond_signal( cond );
if( errcode ) { show_error( "pthread_cond_signal", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_cond_signal", errcode ); cleanup_and_fail(); }
}
void xbroadcast( pthread_cond_t * const cond )
{
const int errcode = pthread_cond_broadcast( cond );
if( errcode ) { show_error( "pthread_cond_broadcast", errcode ); fatal(); }
if( errcode )
{ show_error( "pthread_cond_broadcast", errcode ); cleanup_and_fail(); }
}
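
Review note: every pthread wrapper in this hunk now ends in cleanup_and_fail() instead of fatal(), and the ChangeLog states that SIGUSR1 and SIGUSR2 are no longer used to signal a fatal error. The body of cleanup_and_fail() is not part of this excerpt, so what follows is only a sketch of one way a failing worker thread could end the whole process without signalling the main thread. The helper enter_failure_path, the cleanup stand-in and the cleanup_calls counter are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <pthread.h>
#include <unistd.h>

// Hypothetical counter, present only so the sketch can be observed.
int cleanup_calls = 0;

// Hypothetical stand-in for "close and delete the partial output file".
void cleanup() { ++cleanup_calls; }

// Returns true for the first caller only. The mutex serializes the
// decision, so two failing threads can't both run the cleanup.
bool enter_failure_path()
  {
  static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
  static bool taken = false;
  pthread_mutex_lock( &mutex );         // errors ignored: already failing
  const bool first = !taken;
  taken = true;
  pthread_mutex_unlock( &mutex );
  return first;
  }

// Assumed shape of the new helper; not the code from this commit.
void cleanup_and_fail( const int retval = 1 )
  {
  if( enter_failure_path() ) { cleanup(); std::exit( retval ); }
  while( true ) pause();                // later callers wait for the exit
  }
```

The design point is that the first failing thread performs the cleanup and exits the process itself, while any thread that fails afterwards simply waits, so no signal handler is needed to relay the error.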
@ -317,10 +326,10 @@ extern "C" void * csplitter( void * arg )
for( bool first_post = true; ; first_post = false )
{
uint8_t * const data = new( std::nothrow ) uint8_t[data_size];
if( !data ) { pp( mem_msg ); fatal(); }
if( !data ) { pp( mem_msg ); cleanup_and_fail(); }
const int size = readblock( infd, data, data_size );
if( size != data_size && errno )
{ pp(); show_error( "Read error", errno ); fatal(); }
{ pp(); show_error( "Read error", errno ); cleanup_and_fail(); }
if( size > 0 || first_post ) // first packet may be empty
{
@ -365,7 +374,7 @@ extern "C" void * cworker( void * arg )
const int max_compr_size = 42 + packet->size + ( ( packet->size + 7 ) / 8 );
uint8_t * const new_data = new( std::nothrow ) uint8_t[max_compr_size];
if( !new_data ) { pp( mem_msg ); fatal(); }
if( !new_data ) { pp( mem_msg ); cleanup_and_fail(); }
const int dict_size = std::max( LZ_min_dictionary_size(),
std::min( dictionary_size, packet->size ) );
LZ_Encoder * const encoder =
@ -376,7 +385,7 @@ extern "C" void * cworker( void * arg )
pp( mem_msg );
else
internal_error( "invalid argument to encoder" );
fatal();
cleanup_and_fail();
}
int written = 0;
@ -403,7 +412,7 @@ extern "C" void * cworker( void * arg )
if( verbosity >= 0 )
std::fprintf( stderr, "LZ_compress_read error: %s.\n",
LZ_strerror( LZ_compress_errno( encoder ) ) );
fatal();
cleanup_and_fail();
}
new_pos += rd;
if( new_pos > max_compr_size )
@ -412,8 +421,9 @@ extern "C" void * cworker( void * arg )
}
if( LZ_compress_close( encoder ) < 0 )
{ pp( "LZ_compress_close failed" ); fatal(); }
{ pp( "LZ_compress_close failed" ); cleanup_and_fail(); }
if( verbosity >= 2 && packet->size > 0 ) show_progress( packet->size );
packet->data = new_data;
packet->size = new_pos;
courier.collect_packet( packet );
@ -441,7 +451,7 @@ void muxer( Packet_courier & courier, const Pretty_print & pp, const int outfd )
{
const int wr = writeblock( outfd, opacket->data, opacket->size );
if( wr != opacket->size )
{ pp(); show_error( "Write error", errno ); fatal(); }
{ pp(); show_error( "Write error", errno ); cleanup_and_fail(); }
}
delete[] opacket->data;
delete opacket;
@ -475,7 +485,7 @@ int compress( const int data_size, const int dictionary_size,
pthread_t splitter_thread;
int errcode = pthread_create( &splitter_thread, 0, csplitter, &splitter_arg );
if( errcode )
{ show_error( "Can't create splitter thread", errcode ); fatal(); }
{ show_error( "Can't create splitter thread", errcode ); cleanup_and_fail(); }
Worker_arg worker_arg;
worker_arg.courier = &courier;
@ -484,12 +494,12 @@ int compress( const int data_size, const int dictionary_size,
worker_arg.match_len_limit = match_len_limit;
pthread_t * worker_threads = new( std::nothrow ) pthread_t[num_workers];
if( !worker_threads ) { pp( mem_msg ); fatal(); }
if( !worker_threads ) { pp( mem_msg ); cleanup_and_fail(); }
for( int i = 0; i < num_workers; ++i )
{
errcode = pthread_create( worker_threads + i, 0, cworker, &worker_arg );
if( errcode )
{ show_error( "Can't create worker threads", errcode ); fatal(); }
{ show_error( "Can't create worker threads", errcode ); cleanup_and_fail(); }
}
muxer( courier, pp, outfd );
@ -498,13 +508,13 @@ int compress( const int data_size, const int dictionary_size,
{
errcode = pthread_join( worker_threads[i], 0 );
if( errcode )
{ show_error( "Can't join worker threads", errcode ); fatal(); }
{ show_error( "Can't join worker threads", errcode ); cleanup_and_fail(); }
}
delete[] worker_threads;
errcode = pthread_join( splitter_thread, 0 );
if( errcode )
{ show_error( "Can't join splitter thread", errcode ); fatal(); }
{ show_error( "Can't join splitter thread", errcode ); cleanup_and_fail(); }
if( verbosity >= 1 )
{

configure

@ -1,14 +1,14 @@
#! /bin/sh
# configure script for Plzip - A parallel compressor compatible with lzip
# configure script for Plzip - Parallel compressor compatible with lzip
# Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
#
# This configure script is free software: you have unlimited permission
# to copy, distribute and modify it.
pkgname=plzip
pkgversion=1.0
pkgversion=1.1-pre1
progname=plzip
srctrigger=doc/plzip.texinfo
srctrigger=doc/${pkgname}.texinfo
# clear some things potentially inherited from environment.
LC_ALL=C
@ -100,14 +100,14 @@ while [ $# != 0 ] ; do
*=* | *-*-*) ;;
*)
echo "configure: unrecognized option: '${option}'" 1>&2
echo "Try 'configure --help' for more information."
echo "Try 'configure --help' for more information." 1>&2
exit 1 ;;
esac
# Check if the option took a separate argument
if [ "${arg2}" = yes ] ; then
if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
else echo "configure: Missing argument to \"${option}\"" 1>&2
else echo "configure: Missing argument to '${option}'" 1>&2
exit 1
fi
fi
@ -125,10 +125,8 @@ if [ -z "${srcdir}" ] ; then
fi
if [ ! -r "${srcdir}/${srctrigger}" ] ; then
exec 1>&2
echo
echo "configure: Can't find sources in ${srcdir} ${srcdirtext}"
echo "configure: (At least ${srctrigger} is missing)."
echo "configure: Can't find sources in ${srcdir} ${srcdirtext}" 1>&2
echo "configure: (At least ${srctrigger} is missing)." 1>&2
exit 1
fi
@ -166,7 +164,7 @@ echo "CXXFLAGS = ${CXXFLAGS}"
echo "LDFLAGS = ${LDFLAGS}"
rm -f Makefile
cat > Makefile << EOF
# Makefile for Plzip - A parallel compressor compatible with lzip
# Makefile for Plzip - Parallel compressor compatible with lzip
# Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
# This file was generated automatically by configure. Do not edit.
#


@ -1,4 +1,4 @@
/* Plzip - A parallel compressor compatible with lzip
/* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009 Laszlo Ersek.
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
@ -171,7 +171,7 @@ extern "C" void * dworker_o( void * arg )
LZ_Decoder * const decoder = LZ_decompress_open();
if( !new_data || !ibuffer || !decoder ||
LZ_decompress_errno( decoder ) != LZ_ok )
{ pp( "Not enough memory" ); fatal(); }
{ pp( "Not enough memory" ); cleanup_and_fail(); }
int new_pos = 0;
for( int i = worker_id; i < file_index.members(); i += num_workers )
@ -188,7 +188,7 @@ extern "C" void * dworker_o( void * arg )
if( size > 0 )
{
if( preadblock( infd, ibuffer, size, member_pos ) != size )
{ pp(); show_error( "Read error", errno ); fatal(); }
{ pp(); show_error( "Read error", errno ); cleanup_and_fail(); }
member_pos += size;
member_rest -= size;
if( LZ_decompress_write( decoder, ibuffer, size ) != size )
@ -201,7 +201,7 @@ extern "C" void * dworker_o( void * arg )
const int rd = LZ_decompress_read( decoder, new_data + new_pos,
max_packet_size - new_pos );
if( rd < 0 )
fatal( decompress_read_error( decoder, pp, worker_id ) );
cleanup_and_fail( decompress_read_error( decoder, pp, worker_id ) );
new_pos += rd;
if( new_pos > max_packet_size )
internal_error( "opacket size exceeded in worker" );
@ -216,7 +216,7 @@ extern "C" void * dworker_o( void * arg )
courier.collect_packet( opacket, worker_id );
new_pos = 0;
new_data = new( std::nothrow ) uint8_t[max_packet_size];
if( !new_data ) { pp( "Not enough memory" ); fatal(); }
if( !new_data ) { pp( "Not enough memory" ); cleanup_and_fail(); }
}
if( LZ_decompress_finished( decoder ) == 1 )
{
@ -235,9 +235,9 @@ extern "C" void * dworker_o( void * arg )
delete[] ibuffer; delete[] new_data;
if( LZ_decompress_member_position( decoder ) != 0 )
{ pp( "Error, some data remains in decoder" ); fatal(); }
{ pp( "Error, some data remains in decoder" ); cleanup_and_fail(); }
if( LZ_decompress_close( decoder ) < 0 )
{ pp( "LZ_decompress_close failed" ); fatal(); }
{ pp( "LZ_decompress_close failed" ); cleanup_and_fail(); }
courier.worker_finished();
return 0;
}
@ -256,7 +256,7 @@ void muxer( Packet_courier & courier, const Pretty_print & pp, const int outfd )
{
const int wr = writeblock( outfd, opacket->data, opacket->size );
if( wr != opacket->size )
{ pp(); show_error( "Write error", errno ); fatal(); }
{ pp(); show_error( "Write error", errno ); cleanup_and_fail(); }
}
delete[] opacket->data;
delete opacket;
@ -280,7 +280,7 @@ int dec_stdout( const int num_workers, const int infd, const int outfd,
Worker_arg * worker_args = new( std::nothrow ) Worker_arg[num_workers];
pthread_t * worker_threads = new( std::nothrow ) pthread_t[num_workers];
if( !worker_args || !worker_threads )
{ pp( "Not enough memory" ); fatal(); }
{ pp( "Not enough memory" ); cleanup_and_fail(); }
for( int i = 0; i < num_workers; ++i )
{
worker_args[i].file_index = &file_index;
@ -292,7 +292,7 @@ int dec_stdout( const int num_workers, const int infd, const int outfd,
const int errcode =
pthread_create( &worker_threads[i], 0, dworker_o, &worker_args[i] );
if( errcode )
{ show_error( "Can't create worker threads", errcode ); fatal(); }
{ show_error( "Can't create worker threads", errcode ); cleanup_and_fail(); }
}
muxer( courier, pp, outfd );
@ -301,7 +301,7 @@ int dec_stdout( const int num_workers, const int infd, const int outfd,
{
const int errcode = pthread_join( worker_threads[i], 0 );
if( errcode )
{ show_error( "Can't join worker threads", errcode ); fatal(); }
{ show_error( "Can't join worker threads", errcode ); cleanup_and_fail(); }
}
delete[] worker_threads;
delete[] worker_args;


@ -1,4 +1,4 @@
/* Plzip - A parallel compressor compatible with lzip
/* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009 Laszlo Ersek.
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
@ -248,22 +248,31 @@ extern "C" void * dsplitter_s( void * arg )
Packet_courier & courier = *tmp.courier;
const Pretty_print & pp = *tmp.pp;
const int infd = tmp.infd;
const int hsize = 6; // header size
const int tsize = 20; // trailer size
const int hsize = File_header::size;
const int tsize = File_trailer::size;
const int buffer_size = max_packet_size;
const int base_buffer_size = tsize + buffer_size + hsize;
uint8_t * const base_buffer = new( std::nothrow ) uint8_t[base_buffer_size];
if( !base_buffer ) { pp( "Not enough memory" ); fatal(); }
if( !base_buffer ) { pp( "Not enough memory" ); cleanup_and_fail(); }
uint8_t * const buffer = base_buffer + tsize;
int size = readblock( infd, buffer, buffer_size + hsize ) - hsize;
bool at_stream_end = ( size < buffer_size );
if( size != buffer_size && errno )
{ pp(); show_error( "Read error", errno ); fatal(); }
if( size <= tsize )
{ pp( "Error reading member header" ); fatal(); }
if( find_magic( buffer, 0, 4 ) != 0 )
{ pp( "Bad magic number (file not in lzip format)" ); fatal(); }
{ pp(); show_error( "Read error", errno ); cleanup_and_fail(); }
if( size + hsize < min_member_size )
{ pp( "Input file is too short" ); cleanup_and_fail( 2 ); }
const File_header & header = *(File_header *)buffer;
if( !header.verify_magic() )
{ pp( "Bad magic number (file not in lzip format)" ); cleanup_and_fail( 2 ); }
if( !header.verify_version() )
{
if( verbosity >= 0 )
{ pp();
std::fprintf( stderr, "Version %d member format not supported.\n",
header.version() ); }
cleanup_and_fail( 2 );
}
unsigned long long partial_member_size = 0;
while( true )
@ -274,13 +283,21 @@ extern "C" void * dsplitter_s( void * arg )
newpos = find_magic( buffer, newpos, size + 4 - newpos );
if( newpos <= size )
{
unsigned long long member_size = 0;
for( int i = 1; i <= 8; ++i )
{ member_size <<= 8; member_size += base_buffer[tsize+newpos-i]; }
const File_trailer & trailer = *(File_trailer *)(buffer + newpos - tsize);
const unsigned long long member_size = trailer.member_size();
if( partial_member_size + newpos - pos == member_size )
{ // header found
const File_header & header = *(File_header *)(buffer + newpos);
if( !header.verify_version() )
{
if( verbosity >= 0 )
{ pp();
std::fprintf( stderr, "Version %d member format not supported.\n",
header.version() ); }
cleanup_and_fail( 2 );
}
uint8_t * const data = new( std::nothrow ) uint8_t[newpos - pos];
if( !data ) { pp( "Not enough memory" ); fatal(); }
if( !data ) { pp( "Not enough memory" ); cleanup_and_fail(); }
std::memcpy( data, buffer + pos, newpos - pos );
courier.receive_packet( data, newpos - pos );
courier.receive_packet( 0, 0 ); // end of member token
@ -293,7 +310,7 @@ extern "C" void * dsplitter_s( void * arg )
if( at_stream_end )
{
uint8_t * data = new( std::nothrow ) uint8_t[size + hsize - pos];
if( !data ) { pp( "Not enough memory" ); fatal(); }
if( !data ) { pp( "Not enough memory" ); cleanup_and_fail(); }
std::memcpy( data, buffer + pos, size + hsize - pos );
courier.receive_packet( data, size + hsize - pos );
courier.receive_packet( 0, 0 ); // end of member token
@ -303,7 +320,7 @@ extern "C" void * dsplitter_s( void * arg )
{
partial_member_size += buffer_size - pos;
uint8_t * data = new( std::nothrow ) uint8_t[buffer_size - pos];
if( !data ) { pp( "Not enough memory" ); fatal(); }
if( !data ) { pp( "Not enough memory" ); cleanup_and_fail(); }
std::memcpy( data, buffer + pos, buffer_size - pos );
courier.receive_packet( data, buffer_size - pos );
}
@ -311,7 +328,7 @@ extern "C" void * dsplitter_s( void * arg )
size = readblock( infd, buffer + hsize, buffer_size );
at_stream_end = ( size < buffer_size );
if( size != buffer_size && errno )
{ pp(); show_error( "Read error", errno ); fatal(); }
{ pp(); show_error( "Read error", errno ); cleanup_and_fail(); }
}
delete[] base_buffer;
courier.finish(); // no more packets to send
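
Review note: the hunks above replace the literal sizes 6 and 20 and the hand-written byte loop with File_header and File_trailer accessors whose definitions are not shown in this diff. The sketch below shows what such types might look like. It assumes a 6-byte header (the 4 magic bytes "LZIP", a version byte and a coded dictionary size byte) and a 20-byte trailer whose last 8 bytes hold the member size in little-endian order, which is what the removed loop decoded. Accepting only version 1 is an assumption.

```cpp
#include <cassert>
#include <cstring>
#include <stdint.h>

// Sketch only: layout assumed from the sizes used in this diff.
struct File_header
  {
  uint8_t data[6];              // 0-3 magic, 4 version, 5 coded dict size
  enum { size = 6 };

  bool verify_magic() const
    { return std::memcmp( data, "LZIP", 4 ) == 0; }
  uint8_t version() const { return data[4]; }
  bool verify_version() const { return data[4] == 1; }  // assumption
  };

struct File_trailer
  {
  uint8_t data[20];             // 0-3 CRC32, 4-11 data size, 12-19 member size
  enum { size = 20 };

  // Same result as the removed loop: walk the last 8 bytes from the
  // highest offset down, i.e. decode a little-endian 64-bit value.
  unsigned long long member_size() const
    {
    unsigned long long tmp = 0;
    for( int i = 19; i >= 12; --i ) { tmp <<= 8; tmp += data[i]; }
    return tmp;
    }
  };
```

Because both structs contain only a byte array, they have no alignment requirement beyond 1, which is what lets code like the cast of buffer + newpos in this hunk overlay them on an arbitrary position of the read buffer.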
@ -339,7 +356,7 @@ extern "C" void * dworker_s( void * arg )
uint8_t * new_data = new( std::nothrow ) uint8_t[max_packet_size];
LZ_Decoder * const decoder = LZ_decompress_open();
if( !new_data || !decoder || LZ_decompress_errno( decoder ) != LZ_ok )
{ pp( "Not enough memory" ); fatal(); }
{ pp( "Not enough memory" ); cleanup_and_fail(); }
int new_pos = 0;
bool trailing_garbage_found = false;
@ -370,7 +387,7 @@ extern "C" void * dworker_s( void * arg )
if( LZ_decompress_errno( decoder ) == LZ_header_error )
trailing_garbage_found = true;
else
fatal( decompress_read_error( decoder, pp, worker_id ) );
cleanup_and_fail( decompress_read_error( decoder, pp, worker_id ) );
}
else new_pos += rd;
if( new_pos > max_packet_size )
@ -386,7 +403,7 @@ extern "C" void * dworker_s( void * arg )
courier.collect_packet( opacket, worker_id );
new_pos = 0;
new_data = new( std::nothrow ) uint8_t[max_packet_size];
if( !new_data ) { pp( "Not enough memory" ); fatal(); }
if( !new_data ) { pp( "Not enough memory" ); cleanup_and_fail(); }
}
if( trailing_garbage_found ||
LZ_decompress_finished( decoder ) == 1 )
@ -409,9 +426,9 @@ extern "C" void * dworker_s( void * arg )
delete[] new_data;
if( LZ_decompress_member_position( decoder ) != 0 )
{ pp( "Error, some data remains in decoder" ); fatal(); }
{ pp( "Error, some data remains in decoder" ); cleanup_and_fail(); }
if( LZ_decompress_close( decoder ) < 0 )
{ pp( "LZ_decompress_close failed" ); fatal(); }
{ pp( "LZ_decompress_close failed" ); cleanup_and_fail(); }
return 0;
}
@ -431,7 +448,7 @@ void muxer( Packet_courier & courier, const Pretty_print & pp, const int outfd )
{
const int wr = writeblock( outfd, opacket->data, opacket->size );
if( wr != opacket->size )
{ pp(); show_error( "Write error", errno ); fatal(); }
{ pp(); show_error( "Write error", errno ); cleanup_and_fail(); }
}
delete[] opacket->data;
delete opacket;
@ -462,12 +479,12 @@ int dec_stream( const int num_workers, const int infd, const int outfd,
pthread_t splitter_thread;
int errcode = pthread_create( &splitter_thread, 0, dsplitter_s, &splitter_arg );
if( errcode )
{ show_error( "Can't create splitter thread", errcode ); fatal(); }
{ show_error( "Can't create splitter thread", errcode ); cleanup_and_fail(); }
Worker_arg * worker_args = new( std::nothrow ) Worker_arg[num_workers];
pthread_t * worker_threads = new( std::nothrow ) pthread_t[num_workers];
if( !worker_args || !worker_threads )
{ pp( "Not enough memory" ); fatal(); }
{ pp( "Not enough memory" ); cleanup_and_fail(); }
for( int i = 0; i < num_workers; ++i )
{
worker_args[i].courier = &courier;
@ -475,7 +492,7 @@ int dec_stream( const int num_workers, const int infd, const int outfd,
worker_args[i].worker_id = i;
errcode = pthread_create( &worker_threads[i], 0, dworker_s, &worker_args[i] );
if( errcode )
{ show_error( "Can't create worker threads", errcode ); fatal(); }
{ show_error( "Can't create worker threads", errcode ); cleanup_and_fail(); }
}
muxer( courier, pp, outfd );
@ -484,14 +501,14 @@ int dec_stream( const int num_workers, const int infd, const int outfd,
{
errcode = pthread_join( worker_threads[i], 0 );
if( errcode )
{ show_error( "Can't join worker threads", errcode ); fatal(); }
{ show_error( "Can't join worker threads", errcode ); cleanup_and_fail(); }
}
delete[] worker_threads;
delete[] worker_args;
errcode = pthread_join( splitter_thread, 0 );
if( errcode )
{ show_error( "Can't join splitter thread", errcode ); fatal(); }
{ show_error( "Can't join splitter thread", errcode ); cleanup_and_fail(); }
if( verbosity >= 2 && out_size > 0 && in_size > 0 )
std::fprintf( stderr, "%6.3f:1, %6.3f bits/byte, %5.2f%% saved. ",


@ -1,4 +1,4 @@
/* Plzip - A parallel compressor compatible with lzip
/* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009 Laszlo Ersek.
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
@ -122,7 +122,7 @@ extern "C" void * dworker( void * arg )
LZ_Decoder * const decoder = LZ_decompress_open();
if( !ibuffer || !obuffer || !decoder ||
LZ_decompress_errno( decoder ) != LZ_ok )
{ pp( "Not enough memory" ); fatal(); }
{ pp( "Not enough memory" ); cleanup_and_fail(); }
for( int i = worker_id; i < file_index.members(); i += num_workers )
{
@ -140,7 +140,7 @@ extern "C" void * dworker( void * arg )
if( size > 0 )
{
if( preadblock( infd, ibuffer, size, member_pos ) != size )
{ pp(); show_error( "Read error", errno ); fatal(); }
{ pp(); show_error( "Read error", errno ); cleanup_and_fail(); }
member_pos += size;
member_rest -= size;
if( LZ_decompress_write( decoder, ibuffer, size ) != size )
@ -152,7 +152,7 @@ extern "C" void * dworker( void * arg )
{
const int rd = LZ_decompress_read( decoder, obuffer, buffer_size );
if( rd < 0 )
fatal( decompress_read_error( decoder, pp, worker_id ) );
cleanup_and_fail( decompress_read_error( decoder, pp, worker_id ) );
if( rd > 0 && outfd >= 0 )
{
const int wr = pwriteblock( outfd, obuffer, rd, data_pos );
@ -162,7 +162,7 @@ extern "C" void * dworker( void * arg )
if( verbosity >= 0 )
std::fprintf( stderr, "Write error in worker %d: %s\n",
worker_id, std::strerror( errno ) );
fatal();
cleanup_and_fail();
}
}
if( rd > 0 )
@ -184,9 +184,9 @@ extern "C" void * dworker( void * arg )
delete[] obuffer; delete[] ibuffer;
if( LZ_decompress_member_position( decoder ) != 0 )
{ pp( "Error, some data remains in decoder" ); fatal(); }
{ pp( "Error, some data remains in decoder" ); cleanup_and_fail(); }
if( LZ_decompress_close( decoder ) < 0 )
{ pp( "LZ_decompress_close failed" ); fatal(); }
{ pp( "LZ_decompress_close failed" ); cleanup_and_fail(); }
return 0;
}
@ -208,7 +208,7 @@ int decompress( int num_workers, const int infd, const int outfd,
return dec_stream( num_workers, infd, outfd, pp, debug_level, testing );
}
if( file_index.retval() != 0 )
{ show_error( file_index.error().c_str() ); return file_index.retval(); }
{ pp( file_index.error().c_str() ); return file_index.retval(); }
if( num_workers > file_index.members() )
num_workers = file_index.members();
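
Review note: the clamp above pairs with the worker loop earlier in this file, for( int i = worker_id; i < file_index.members(); i += num_workers ). Members are dealt out round-robin, so a worker whose id is not below the member count would have nothing to decode. A small sketch of that split follows; effective_workers and members_for_worker are hypothetical helpers and do not exist in plzip.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Never start more workers than there are members to decode.
int effective_workers( const int requested, const int members )
  { return std::min( requested, members ); }

// Indexes of the members handled by one worker (round-robin split).
std::vector< int > members_for_worker( const int worker_id,
                                       const int num_workers,
                                       const int members )
  {
  std::vector< int > v;
  for( int i = worker_id; i < members; i += num_workers ) v.push_back( i );
  return v;
  }
```

With 8 requested workers and a 3-member file, effective_workers yields 3, and each of the 3 workers gets exactly one member. Without the clamp, workers 3 to 7 would be created only to find an empty list.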
@ -224,7 +224,7 @@ int decompress( int num_workers, const int infd, const int outfd,
Worker_arg * worker_args = new( std::nothrow ) Worker_arg[num_workers];
pthread_t * worker_threads = new( std::nothrow ) pthread_t[num_workers];
if( !worker_args || !worker_threads )
{ pp( "Not enough memory" ); fatal(); }
{ pp( "Not enough memory" ); cleanup_and_fail(); }
for( int i = 0; i < num_workers; ++i )
{
worker_args[i].file_index = &file_index;
@ -236,14 +236,14 @@ int decompress( int num_workers, const int infd, const int outfd,
const int errcode =
pthread_create( &worker_threads[i], 0, dworker, &worker_args[i] );
if( errcode )
{ show_error( "Can't create worker threads", errcode ); fatal(); }
{ show_error( "Can't create worker threads", errcode ); cleanup_and_fail(); }
}
for( int i = num_workers - 1; i >= 0; --i )
{
const int errcode = pthread_join( worker_threads[i], 0 );
if( errcode )
{ show_error( "Can't join worker threads", errcode ); fatal(); }
{ show_error( "Can't join worker threads", errcode ); cleanup_and_fail(); }
}
delete[] worker_threads;
delete[] worker_args;


@ -1,12 +1,12 @@
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.37.1.
.TH PLZIP "1" "May 2013" "Plzip 1.0" "User Commands"
.TH PLZIP "1" "July 2013" "Plzip 1.1-pre1" "User Commands"
.SH NAME
Plzip \- reduces the size of files
.SH SYNOPSIS
.B plzip
[\fIoptions\fR] [\fIfiles\fR]
.SH DESCRIPTION
Plzip \- A parallel compressor compatible with lzip.
Plzip \- Parallel compressor compatible with lzip.
.SH OPTIONS
.TP
\fB\-h\fR, \fB\-\-help\fR


@ -12,16 +12,16 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
Plzip Manual
************
This manual is for Plzip (version 1.0, 29 May 2013).
This manual is for Plzip (version 1.1-pre1, 20 July 2013).
* Menu:
* Introduction:: Purpose and features of plzip
* Program Design:: Internal structure of plzip
* Invoking Plzip:: Command line interface
* File Format:: Detailed format of the compressed file
* Program design:: Internal structure of plzip
* Invoking plzip:: Command line interface
* File format:: Detailed format of the compressed file
* Problems:: Reporting bugs
* Concept Index:: Index of concepts
* Concept index:: Index of concepts
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
@ -30,27 +30,46 @@ This manual is for Plzip (version 1.0, 29 May 2013).
copy, distribute and modify it.

File: plzip.info, Node: Introduction, Next: Program Design, Prev: Top, Up: Top
File: plzip.info, Node: Introduction, Next: Program design, Prev: Top, Up: Top
1 Introduction
**************
Plzip is a massively parallel (multi-threaded), lossless data compressor
based on the lzlib compression library, with very safe integrity
checking and a user interface similar to the one of bzip2, gzip or lzip.
based on the lzlib compression library, with a user interface similar to
the one of lzip, bzip2 or gzip.
Plzip is intended for faster compression/decompression of big files
on multiprocessor machines, which makes it specially well suited for
distribution of big software files and large scale data archiving. On
files big enough (several GB), plzip can use hundreds of processors.
Plzip can compress/decompress large files on multiprocessor machines
much faster than lzip, at the cost of a slightly reduced compression
ratio. On files large enough (several GB), plzip can use hundreds of
processors. On files of only a few MB it is better to use lzip.
Plzip uses the same well-defined exit status values used by lzip and
bzip2, which makes it safer when used in pipes or scripts than
compressors returning ambiguous warning values, like gzip.
Plzip uses the lzip file format; the files produced by plzip are
fully compatible with lzip-1.4 or newer, and can be rescued with
lziprecover.
Plzip uses the same well-defined exit status values used by lzip and
bzip2, which makes it safer when used in pipes or scripts than
compressors returning ambiguous warning values, like gzip.
The lzip file format is designed for long-term data archiving and
provides very safe integrity checking. The member trailer stores the
32-bit CRC of the original data, the size of the original data and the
size of the member. These values, together with the value remaining in
the range decoder and the end-of-stream marker, provide a 4 factor
integrity checking which guarantees that the decompressed version of the
data is identical to the original. This guards against corruption of the
compressed data, and against undetected bugs in plzip (hopefully very
unlikely). The chances of data corruption going undetected are
microscopic. Be aware, though, that the check occurs upon decompression,
so it can only tell you that something is wrong. It can't help you
recover the original uncompressed data.
If you ever need to recover data from a damaged lzip file, try the
lziprecover program. Lziprecover makes lzip files resistant to bit-flip
(one of the most common forms of data corruption), and provides data
recovery capabilities, including error-checked merging of damaged copies
of a file.
Plzip replaces every file given in the command line with a compressed
version of itself, with the name "original_name.lz". Each compressed
@ -76,18 +95,6 @@ filename.lz becomes filename
filename.tlz becomes filename.tar
anyothername becomes anyothername.out
As a self-check for your protection, plzip stores in the member
trailer the 32-bit CRC of the original data, the size of the original
data and the size of the member. These values, together with the value
remaining in the range decoder and the end-of-stream marker, provide a
very safe 4 factor integrity checking which guarantees that the
decompressed version of the data is identical to the original. This
guards against corruption of the compressed data, and against
undetected bugs in plzip (hopefully very unlikely). The chances of data
corruption going undetected are microscopic. Be aware, though, that the
check occurs upon decompression, so it can only tell you that something
is wrong. It can't help you recover the original uncompressed data.
WARNING! Even if plzip is bug-free, other causes may result in a
corrupt compressed file (bugs in the system libraries, memory errors,
etc). Therefore, if the data you are going to compress is important,
@ -96,9 +103,9 @@ until you verify the compressed file with a command like
`plzip -cd file.lz | cmp file -'.

File: plzip.info, Node: Program Design, Next: Invoking Plzip, Prev: Introduction, Up: Top
File: plzip.info, Node: Program design, Next: Invoking plzip, Prev: Introduction, Up: Top
2 Program Design
2 Program design
****************
For each input file, a splitter thread and several worker threads are
@ -119,9 +126,9 @@ speed of large files with many members is only limited by the number of
processors available and by I/O speed.

File: plzip.info, Node: Invoking Plzip, Next: File Format, Prev: Program Design, Up: Top
File: plzip.info, Node: Invoking plzip, Next: File format, Prev: Program design, Up: Top
3 Invoking Plzip
3 Invoking plzip
****************
The format for running plzip is:
@ -220,7 +227,7 @@ The format for running plzip is:
`--verbose'
Verbose mode.
When compressing, show the compression ratio for each file
processed.
processed. A second -v shows the progress of compression.
When decompressing or testing, further -v's (up to 4) increase the
verbosity level, showing status, compression ratio, decompressed
size, and compressed size.
@ -275,9 +282,9 @@ invalid input file, 3 for an internal consistency error (eg, bug) which
caused plzip to panic.

File: plzip.info, Node: File Format, Next: Problems, Prev: Invoking Plzip, Up: Top
File: plzip.info, Node: File format, Next: Problems, Prev: Invoking plzip, Up: Top
4 File Format
4 File format
*************
Perfection is reached, not when there is no longer anything to add, but
@ -348,7 +355,7 @@ additional information before, between, or after them.

File: plzip.info, Node: Problems, Next: Concept Index, Prev: File Format, Up: Top
File: plzip.info, Node: Problems, Next: Concept index, Prev: File format, Up: Top
5 Reporting Bugs
****************
@ -363,34 +370,34 @@ for all eternity, if not longer.
by running `plzip --version'.

File: plzip.info, Node: Concept Index, Prev: Problems, Up: Top
File: plzip.info, Node: Concept index, Prev: Problems, Up: Top
Concept Index
Concept index
*************
[index]
* Menu:
* bugs: Problems. (line 6)
* file format: File Format. (line 6)
* file format: File format. (line 6)
* getting help: Problems. (line 6)
* introduction: Introduction. (line 6)
* invoking: Invoking Plzip. (line 6)
* options: Invoking Plzip. (line 6)
* program design: Program Design. (line 6)
* usage: Invoking Plzip. (line 6)
* version: Invoking Plzip. (line 6)
* invoking: Invoking plzip. (line 6)
* options: Invoking plzip. (line 6)
* program design: Program design. (line 6)
* usage: Invoking plzip. (line 6)
* version: Invoking plzip. (line 6)

Tag Table:
Node: Top223
Node: Introduction865
Node: Program Design4113
Node: Invoking Plzip5167
Node: File Format10416
Node: Problems12895
Node: Concept Index13424
Node: Introduction871
Node: Program design4426
Node: Invoking plzip5480
Node: File format10776
Node: Problems13255
Node: Concept index13784

End Tag Table
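
Note on the integrity check described in the new Introduction text: the member trailer stores a 32-bit CRC and the size of the original data, and decompression compares them against what was actually decoded. A minimal sketch of that comparison follows. It assumes the CRC is the common CRC-32 with reflected polynomial 0xEDB88320 (the diff does not state the polynomial), and the names `crc32`, `Trailer_values` and `matches_trailer` are hypothetical, not taken from plzip.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320, assumed here).
std::uint32_t crc32( const std::uint8_t * const data, const std::size_t size )
  {
  assert( data != 0 || size == 0 );
  std::uint32_t crc = 0xFFFFFFFFU;
  for( std::size_t i = 0; i < size; ++i )
    {
    crc ^= data[i];
    for( int k = 0; k < 8; ++k )
      crc = ( crc & 1 ) ? ( crc >> 1 ) ^ 0xEDB88320U : ( crc >> 1 );
    }
  return crc ^ 0xFFFFFFFFU;
  }

// Hypothetical holder for two of the values stored in a member trailer.
struct Trailer_values
  {
  std::uint32_t data_crc;	// CRC of the original data
  std::uint64_t data_size;	// size of the original data
  };

// True if the decoded data agrees with both stored values.
bool matches_trailer( const std::string & decoded, const Trailer_values & t )
  {
  const std::uint8_t * const p =
    reinterpret_cast< const std::uint8_t * >( decoded.data() );
  return decoded.size() == t.data_size &&
         crc32( p, decoded.size() ) == t.data_crc;
  }
```

As the manual text says, a mismatch only reports that something is wrong. Recovering the data is a job for lziprecover.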

@ -6,8 +6,8 @@
@finalout
@c %**end of header
@set UPDATED 29 May 2013
@set VERSION 1.0
@set UPDATED 20 July 2013
@set VERSION 1.1-pre1
@dircategory Data Compression
@direntry
@ -36,11 +36,11 @@ This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
@menu
* Introduction:: Purpose and features of plzip
* Program Design:: Internal structure of plzip
* Invoking Plzip:: Command line interface
* File Format:: Detailed format of the compressed file
* Program design:: Internal structure of plzip
* Invoking plzip:: Command line interface
* File format:: Detailed format of the compressed file
* Problems:: Reporting bugs
* Concept Index:: Index of concepts
* Concept index:: Index of concepts
@end menu
@sp 1
@ -55,21 +55,40 @@ to copy, distribute and modify it.
@cindex introduction
Plzip is a massively parallel (multi-threaded), lossless data compressor
based on the lzlib compression library, with very safe integrity
checking and a user interface similar to the one of bzip2, gzip or lzip.
based on the lzlib compression library, with a user interface similar to
the one of lzip, bzip2 or gzip.
Plzip is intended for faster compression/decompression of big files on
multiprocessor machines, which makes it specially well suited for
distribution of big software files and large scale data archiving. On
files big enough (several GB), plzip can use hundreds of processors.
Plzip uses the lzip file format; the files produced by plzip are fully
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
Plzip can compress/decompress large files on multiprocessor machines
much faster than lzip, at the cost of a slightly reduced compression
ratio. On files large enough (several GB), plzip can use hundreds of
processors. On files of only a few MB it is better to use lzip.
Plzip uses the same well-defined exit status values used by lzip and
bzip2, which makes it safer when used in pipes or scripts than
compressors returning ambiguous warning values, like gzip.
Plzip uses the lzip file format; the files produced by plzip are fully
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
The lzip file format is designed for long-term data archiving and
provides very safe integrity checking. The member trailer stores the
32-bit CRC of the original data, the size of the original data and the
size of the member. These values, together with the value remaining in
the range decoder and the end-of-stream marker, provide a 4 factor
integrity checking which guarantees that the decompressed version of the
data is identical to the original. This guards against corruption of the
compressed data, and against undetected bugs in plzip (hopefully very
unlikely). The chances of data corruption going undetected are
microscopic. Be aware, though, that the check occurs upon decompression,
so it can only tell you that something is wrong. It can't help you
recover the original uncompressed data.
If you ever need to recover data from a damaged lzip file, try the
lziprecover program. Lziprecover makes lzip files resistant to bit-flip
(one of the most common forms of data corruption), and provides data
recovery capabilities, including error-checked merging of damaged copies
of a file.
Plzip replaces every file given in the command line with a compressed
version of itself, with the name "original_name.lz". Each compressed
file has the same modification date, permissions, and, when possible,
@ -96,18 +115,6 @@ file from that of the compressed file as follows:
@item anyothername @tab becomes @tab anyothername.out
@end multitable
As a self-check for your protection, plzip stores in the member trailer
the 32-bit CRC of the original data, the size of the original data and
the size of the member. These values, together with the value remaining
in the range decoder and the end-of-stream marker, provide a very safe 4
factor integrity checking which guarantees that the decompressed version
of the data is identical to the original. This guards against corruption
of the compressed data, and against undetected bugs in plzip (hopefully
very unlikely). The chances of data corruption going undetected are
microscopic. Be aware, though, that the check occurs upon decompression,
so it can only tell you that something is wrong. It can't help you
recover the original uncompressed data.
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
compressed file (bugs in the system libraries, memory errors, etc).
Therefore, if the data you are going to compress is important, give the
@ -116,8 +123,8 @@ you verify the compressed file with a command like
@w{@samp{plzip -cd file.lz | cmp file -}}.
@node Program Design
@chapter Program Design
@node Program design
@chapter Program design
@cindex program design
For each input file, a splitter thread and several worker threads are
@ -138,8 +145,8 @@ large files with many members is only limited by the number of
processors available and by I/O speed.
@node Invoking Plzip
@chapter Invoking Plzip
@node Invoking plzip
@chapter Invoking plzip
@cindex invoking
@cindex options
@cindex usage
@ -237,7 +244,8 @@ Use it together with @samp{-v} to see information about the file.
@item -v
@itemx --verbose
Verbose mode.@*
When compressing, show the compression ratio for each file processed.@*
When compressing, show the compression ratio for each file processed. A
second -v shows the progress of compression.@*
When decompressing or testing, further -v's (up to 4) increase the
verbosity level, showing status, compression ratio, decompressed size,
and compressed size.
@ -297,8 +305,8 @@ invalid input file, 3 for an internal consistency error (eg, bug) which
caused plzip to panic.
@node File Format
@chapter File Format
@node File format
@chapter File format
@cindex file format
Perfection is reached, not when there is no longer anything to add, but
@ -387,8 +395,8 @@ If you find a bug in plzip, please send electronic mail to
find by running @w{@samp{plzip --version}}.
@node Concept Index
@unnumbered Concept Index
@node Concept index
@unnumbered Concept index
@printindex cp

@ -1,4 +1,4 @@
/* Plzip - A parallel compressor compatible with lzip
/* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify
@ -68,23 +68,23 @@ File_index::File_index( const int infd ) : retval_( 0 )
{ error_ = "Input file is not seekable :";
error_ += std::strerror( errno ); retval_ = 1; return; }
if( isize > INT64_MAX )
{ error_ = "Input file is too long (2^63 bytes or more).";
{ error_ = "Input file is too long (2^63 bytes or more)";
retval_ = 2; return; }
long long pos = isize; // always points to a header or EOF
File_header header;
File_trailer trailer;
if( isize < min_member_size )
{ error_ = "Input file is too short."; retval_ = 2; return; }
{ error_ = "Input file is too short"; retval_ = 2; return; }
if( seek_read( infd, header.data, File_header::size, 0 ) != File_header::size )
{ error_ = "Error reading member header :";
error_ += std::strerror( errno ); retval_ = 1; return; }
if( !header.verify_magic() )
{ error_ = "Bad magic number (file not in lzip format).";
{ error_ = "Bad magic number (file not in lzip format)";
retval_ = 2; return; }
if( !header.verify_version() )
{ error_ = "Version "; error_ += format_num( header.version() );
error_ += "member format not supported."; retval_ = 2; return; }
error_ += "member format not supported"; retval_ = 2; return; }
while( pos >= min_member_size )
{
@ -114,9 +114,9 @@ File_index::File_index( const int infd ) : retval_( 0 )
if( member_vector.size() == 0 && isize - pos > File_header::size &&
seek_read( infd, header.data, File_header::size, pos ) == File_header::size &&
header.verify_magic() && header.verify_version() )
{ // last trailer is corrupt
error_ = "Member size in trailer is corrupt at pos ";
error_ += format_num( isize - 8 ); retval_ = 2; break;
{
error_ = "Last member in input file is truncated or corrupt";
retval_ = 2; break;
}
pos -= member_size;
member_vector.push_back( Member( 0, trailer.data_size(),
@ -125,7 +125,7 @@ File_index::File_index( const int infd ) : retval_( 0 )
if( pos != 0 || member_vector.size() == 0 )
{
member_vector.clear();
if( retval_ == 0 ) { error_ = "Can't create file index."; retval_ = 2; }
if( retval_ == 0 ) { error_ = "Can't create file index"; retval_ = 2; }
return;
}
std::reverse( member_vector.begin(), member_vector.end() );
@ -135,7 +135,7 @@ File_index::File_index( const int infd ) : retval_( 0 )
if( end < 0 || end > INT64_MAX )
{
member_vector.clear();
error_ = "Data in input file is too long (2^63 bytes or more).";
error_ = "Data in input file is too long (2^63 bytes or more)";
retval_ = 2; return;
}
member_vector[i+1].dblock.pos( end );
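
Note on the hunks above: `File_index` walks the file backwards, reading each member trailer to learn the member size and the data size. The sketch below shows how the three trailer values could be decoded from a raw buffer. It assumes the lzip version 1 trailer layout of 20 bytes (4-byte CRC32, 8-byte data size, 8-byte member size, all little-endian), which this diff does not itself show. `Trailer_fields`, `get_le` and `parse_trailer` are hypothetical names, not plzip's own.

```cpp
#include <cassert>
#include <cstdint>

const int trailer_size = 20;	// assumed size of a version 1 member trailer

// Hypothetical decoded view of a member trailer.
struct Trailer_fields
  {
  std::uint32_t data_crc;	// CRC32 of the uncompressed data
  std::uint64_t data_size;	// size of the uncompressed data
  std::uint64_t member_size;	// header + compressed data + trailer
  };

// Decode an n-byte little-endian unsigned integer (n <= 8).
std::uint64_t get_le( const std::uint8_t * const p, const int n )
  {
  assert( n >= 0 && n <= 8 );
  std::uint64_t value = 0;
  for( int i = n - 1; i >= 0; --i ) { value <<= 8; value += p[i]; }
  return value;
  }

Trailer_fields parse_trailer( const std::uint8_t (&buf)[trailer_size] )
  {
  Trailer_fields t;
  t.data_crc = static_cast< std::uint32_t >( get_le( buf, 4 ) );
  t.data_size = get_le( buf + 4, 8 );
  t.member_size = get_le( buf + 12, 8 );
  return t;
  }
```

A member size smaller than the minimum member size, or larger than the current position in the file, is the kind of inconsistency that the new "Last member in input file is truncated or corrupt" message reports.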

@ -1,4 +1,4 @@
/* Plzip - A parallel compressor compatible with lzip
/* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify

7
lzip.h
@ -1,4 +1,4 @@
/* Plzip - A parallel compressor compatible with lzip
/* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify
@ -196,10 +196,13 @@ int decompress( int num_workers, const int infd, const int outfd,
// defined in main.cc
extern int verbosity;
void fatal( const int retval = 1 ); // terminate the program
void cleanup_and_fail( const int retval = 1 ); // terminate the program
void show_error( const char * const msg, const int errcode = 0,
const bool help = false );
void internal_error( const char * const msg );
void show_progress( const int packet_size,
const Pretty_print * const p = 0,
const struct stat * const in_statsp = 0 );
class Slot_tally
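
Note on the new `show_progress` declaration: the definition added in main.cc keeps one hundredth of the input size and divides the running position by it to get a percentage, and it also prints the position in MB (10^6 bytes). The arithmetic can be sketched in isolation as below. `format_progress` is a hypothetical helper written for this note, and the exact spacing of the real output may differ.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: build the progress text for 'pos' bytes read out of
// 'file_size' bytes. A 'file_size' of 0 means the size is unknown
// (for example, input from a pipe).
std::string format_progress( const unsigned long long pos,
                             const unsigned long long file_size )
  {
  char buf[64];
  std::string text;
  const unsigned long long cfile_size = file_size / 100;	// 1% of the file
  if( cfile_size > 0 )		// skipped for unknown or tiny (< 100 byte) files
    {
    std::snprintf( buf, sizeof buf, "%4llu%%", pos / cfile_size );
    text += buf;
    }
  std::snprintf( buf, sizeof buf, " %.1f MB", pos / 1000000.0 );
  text += buf;
  return text;
  }
```

Dividing by a precomputed `file_size / 100` avoids multiplying `pos` by 100, at the cost of showing no percentage for inputs shorter than 100 bytes.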

104
main.cc
@ -1,4 +1,4 @@
/* Plzip - A parallel compressor compatible with lzip
/* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009 Laszlo Ersek.
Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
@ -96,13 +96,11 @@ const mode_t usr_rw = S_IRUSR | S_IWUSR;
const mode_t all_rw = usr_rw | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
mode_t outfd_mode = usr_rw;
bool delete_output_on_interrupt = false;
pthread_t main_thread;
pid_t main_thread_pid;
void show_help( const long num_online )
{
std::printf( "%s - A parallel compressor compatible with lzip.\n", Program_name );
std::printf( "%s - Parallel compressor compatible with lzip.\n", Program_name );
std::printf( "\nUsage: %s [options] [files]\n", invocation_name );
std::printf( "\nOptions:\n"
" -h, --help display this help and exit\n"
@ -262,12 +260,13 @@ int open_instream( const char * const name, struct stat * const in_statsp,
const bool can_read = ( i == 0 &&
( S_ISBLK( mode ) || S_ISCHR( mode ) ||
S_ISFIFO( mode ) || S_ISSOCK( mode ) ) );
if( i != 0 || ( !S_ISREG( mode ) && ( !to_stdout || !can_read ) ) )
const bool no_ofile = to_stdout || ( program_mode == m_test );
if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || !no_ofile ) ) )
{
if( verbosity >= 0 )
std::fprintf( stderr, "%s: Input file '%s' is not a regular file%s.\n",
program_name, name,
( can_read && !to_stdout ) ?
( can_read && !no_ofile ) ?
" and '--stdout' was not specified" : "" );
close( infd );
infd = -1;
@ -340,22 +339,6 @@ bool check_tty( const int infd, const Mode program_mode )
}
void cleanup_and_fail( const int retval )
{
if( delete_output_on_interrupt )
{
delete_output_on_interrupt = false;
if( verbosity >= 0 )
std::fprintf( stderr, "%s: Deleting output file '%s', if it exists.\n",
program_name, output_filename.c_str() );
if( outfd >= 0 ) { close( outfd ); outfd = -1; }
if( std::remove( output_filename.c_str() ) != 0 && errno != ENOENT )
show_error( "WARNING: deletion of output file (apparently) failed." );
}
std::exit( retval );
}
// Set permissions, owner and times.
void close_and_set_permissions( const struct stat * const in_statsp )
{
@ -382,13 +365,10 @@ void close_and_set_permissions( const struct stat * const in_statsp )
}
extern "C" void signal_handler( int sig )
extern "C" void signal_handler( int )
{
if( !pthread_equal( pthread_self(), main_thread ) )
kill( main_thread_pid, sig );
if( sig != SIGUSR1 && sig != SIGUSR2 )
show_error( "Control-C or similar caught, quitting." );
cleanup_and_fail( ( sig != SIGUSR2 ) ? 1 : 2 );
cleanup_and_fail( 1 );
}
@ -405,14 +385,6 @@ void set_signals()
int verbosity = 0;
// This can be called from any thread, main thread or sub-threads alike,
// since they all call common helper functions that call fatal() in case
// of an error.
//
void fatal( const int retval )
{ signal_handler( ( retval != 2 ) ? SIGUSR1 : SIGUSR2 ); }
void Pretty_print::operator()( const char * const msg ) const
{
if( verbosity >= 0 )
@ -456,6 +428,60 @@ void internal_error( const char * const msg )
}
// This can be called from any thread, main thread or sub-threads alike,
// since they all call common helper functions that call cleanup_and_fail()
// in case of an error.
//
void cleanup_and_fail( const int retval )
{
// only one thread can delete and exit
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_lock( &mutex ); // ignore errors to avoid loop
if( delete_output_on_interrupt )
{
delete_output_on_interrupt = false;
if( verbosity >= 0 )
std::fprintf( stderr, "%s: Deleting output file '%s', if it exists.\n",
program_name, output_filename.c_str() );
if( outfd >= 0 ) { close( outfd ); outfd = -1; }
if( std::remove( output_filename.c_str() ) != 0 && errno != ENOENT )
show_error( "WARNING: deletion of output file (apparently) failed." );
}
std::exit( retval );
}
void show_progress( const int packet_size,
const Pretty_print * const p,
const struct stat * const in_statsp )
{
static unsigned long long cfile_size = 0; // file_size / 100
static unsigned long long pos = 0;
static const Pretty_print * pp = 0;
static pthread_mutex_t mutex;
if( p ) // initialize static vars
{
if( !pp ) xinit( &mutex ); // init mutex only once
pos = 0; pp = p;
cfile_size = ( in_statsp && S_ISREG( in_statsp->st_mode ) ) ?
in_statsp->st_size / 100 : 0;
return;
}
if( pp )
{
xlock( &mutex );
pos += packet_size;
if( cfile_size > 0 )
std::fprintf( stderr, "%4llu%%", pos / cfile_size );
std::fprintf( stderr, " %.1f MB\r", pos / 1000000.0 );
pp->reset(); (*pp)(); // restore cursor position
xunlock( &mutex );
}
}
int main( const int argc, const char * const argv[] )
{
// Mapping from gzip/bzip2 style 1..9 compression modes
@ -486,8 +512,6 @@ int main( const int argc, const char * const argv[] )
bool recompress = false;
bool to_stdout = false;
invocation_name = argv[0];
main_thread = pthread_self();
main_thread_pid = getpid();
if( LZ_version()[0] != LZ_version_string[0] )
internal_error( "bad library version" );
@ -598,8 +622,6 @@ int main( const int argc, const char * const argv[] )
if( !to_stdout && program_mode != m_test &&
( filenames_given || default_output_filename.size() ) )
set_signals();
std::signal( SIGUSR1, signal_handler );
std::signal( SIGUSR2, signal_handler );
Pretty_print pp( filenames );
@ -668,9 +690,13 @@ int main( const int argc, const char * const argv[] )
if( verbosity >= 1 ) pp();
int tmp;
if( program_mode == m_compress )
{
show_progress( 0, &pp, in_statsp ); // initialize static vars
if( verbosity >= 2 ) show_progress( 0 ); // show initial zero size
tmp = compress( data_size, encoder_options.dictionary_size,
encoder_options.match_len_limit,
num_workers, infd, outfd, pp, debug_level );
}
else
tmp = decompress( num_workers, infd, outfd, pp, debug_level,
program_mode == m_test, infd_isreg );
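
Note on the new `cleanup_and_fail`: it locks a static mutex and never unlocks it, so only the first thread to fail deletes the output file and calls `std::exit`, while later callers block until the process ends. The sketch below shows the same "first caller wins" idea with a different, non-blocking technique (`std::atomic_flag`, which needs C++11, whereas the diff uses a pthread mutex). `first_to_fail` is a hypothetical name, not part of plzip.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: returns true only for the first caller, and false
// for every later caller, from whichever thread the calls come.
bool first_to_fail()
  {
  static std::atomic_flag taken = ATOMIC_FLAG_INIT;
  return !taken.test_and_set();	// test_and_set returns the previous state
  }
```

A caller that gets `true` would do the cleanup and exit. A caller that gets `false` would have to wait instead of returning into code that assumes the error never happened, which is the part the blocking mutex in the diff handles implicitly.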

@ -1,5 +1,5 @@
#! /bin/sh
# check script for Plzip - A parallel compressor compatible with lzip
# check script for Plzip - Parallel compressor compatible with lzip
# Copyright (C) 2009, 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
#
# This script is free software: you have unlimited permission
@ -28,13 +28,21 @@ fail=0
printf "testing plzip-%s..." "$2"
"${LZIP}" -cqs-1 in > /dev/null
if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cqs0 in > /dev/null
if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cqs4095 in > /dev/null
if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cqm274 in > /dev/null
if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -tq in
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -tq < in
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cdq in
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cdq < in
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -t "${in_lz}" || fail=1
"${LZIP}" -cd "${in_lz}" > copy || fail=1
@ -42,7 +50,7 @@ cmp in copy || fail=1
printf .
"${LZIP}" -cfq "${in_lz}" > out
if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cF "${in_lz}" > out || fail=1
"${LZIP}" -cd out | "${LZIP}" -d > copy || fail=1
cmp in copy || fail=1
@ -54,32 +62,32 @@ for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
printf "garbage" >> copy.lz || fail=1
"${LZIP}" -df copy.lz || fail=1
cmp in copy || fail=1
printf .
done
printf .
for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
"${LZIP}" -c -$i in > out || fail=1
printf "g" >> out || fail=1
"${LZIP}" -cd out > copy || fail=1
cmp in copy || fail=1
printf .
done
printf .
for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
"${LZIP}" -$i < in > out || fail=1
printf "garbage" >> out || fail=1
"${LZIP}" -d < out > copy || fail=1
cmp in copy || fail=1
printf .
done
printf .
for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
"${LZIP}" -f -$i -o out < in || fail=1
printf "g" >> out.lz || fail=1
"${LZIP}" -df -o copy < out.lz || fail=1
cmp in copy || fail=1
printf .
done
printf .
"${LZIP}" < in > anyothername || fail=1
"${LZIP}" -d anyothername || fail=1
@ -95,39 +103,38 @@ for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ; do
"${LZIP}" -d -n$i out4.lz || fail=1
cmp in4 out4 || fail=1
rm -f out4
printf .
done
printf .
for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ; do
"${LZIP}" -s4Ki -B8Ki -n$i < in4 > out4 || fail=1
printf "g" >> out4 || fail=1
"${LZIP}" -d -n$i < out4 > copy4 || fail=1
cmp in4 copy4 || fail=1
printf .
done
printf .
cat "${in_lz}" > ingin.lz || framework_failure
printf "g" >> ingin.lz || framework_failure
cat "${in_lz}" >> ingin.lz || framework_failure
"${LZIP}" -tq ingin.lz
if [ $? != 2 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cdq ingin.lz > out
if [ $? != 2 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -t < ingin.lz || fail=1
printf .
"${LZIP}" -d < ingin.lz > copy || fail=1
cmp in copy || fail=1
printf .
dd if="${in_lz}" bs=1024 count=10 > trunc.lz 2> /dev/null || framework_failure
"${LZIP}" -tq trunc.lz
if [ $? != 2 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cdq trunc.lz > out
if [ $? != 2 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -tq < trunc.lz
if [ $? != 2 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -dq < trunc.lz > out
if [ $? != 2 ] ; then fail=1 ; printf - ; else printf . ; fi
if [ $? = 2 ] ; then printf . ; else fail=1 ; printf - ; fi
echo
if [ ${fail} = 0 ] ; then