
Merging upstream version 1.5.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-24 04:12:55 +01:00
parent 5e1f92d2a0
commit 66060d80f9
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
20 changed files with 632 additions and 272 deletions


@@ -1,3 +1,15 @@
+2016-05-14 Antonio Diaz Diaz <antonio@gnu.org>
+* Version 1.5 released.
+* main.cc: Added new option '-a, --trailing-error'.
+* main.cc (main): Delete '--output' file if infd is a terminal.
+* main.cc (main): Don't use stdin more than once.
+* lzip.texi: Added chapters 'Trailing data' and 'Examples'.
+* configure: Avoid warning on some shells when testing for g++.
+* Makefile.in: Detect the existence of install-info.
+* testsuite/check.sh: A POSIX shell is required to run the tests.
+* testsuite/check.sh: Don't check error messages.
 2015-07-09 Antonio Diaz Diaz <antonio@gnu.org>
 * Version 1.4 released.
@@ -6,11 +18,11 @@
 2015-01-22 Antonio Diaz Diaz <antonio@gnu.org>
 * Version 1.3 released.
-* dec_stream.cc: Do not use output packets or muxer when testing.
+* dec_stream.cc: Don't use output packets or muxer when testing.
 * Make '-dvvv' and '-tvvv' show dictionary size like lzip.
 * lzip.h: Added missing 'const' to the declaration of 'compress'.
-* Added chapters 'Memory requirements' and 'Minimum file sizes'
-to manual.
+* lzip.texi: Added chapters 'Memory requirements' and
+'Minimum file sizes'.
 * Makefile.in: Added new targets 'install*-compress'.
 2014-08-29 Antonio Diaz Diaz <antonio@gnu.org>
@@ -27,7 +39,7 @@
 * Version 1.1 released.
 * Show progress of compression at verbosity level 2 (-vv).
-* SIGUSR1 and SIGUSR2 are no more used to signal a fatal error.
+* SIGUSR1 and SIGUSR2 are no longer used to signal a fatal error.
 2013-05-29 Antonio Diaz Diaz <antonio@gnu.org>
@@ -69,7 +81,7 @@
 to match those of lzip 1.11.
 * decompress.cc: A limit has been set on the number of packets
 produced by workers to limit the amount of memory used.
-* main.cc (open_instream): Do not show the message
+* main.cc (open_instream): Don't show the message
 " and '--stdout' was not specified" for directories, etc.
 * main.cc: Fixed warning about fchown return value being ignored.
 * testsuite: 'test1' renamed to 'test.txt'. Added new tests.
@@ -78,8 +90,8 @@
 * Version 0.6 released.
 * Small portability fixes.
-* Added chapter 'Program Design' and description of option
-'--threads' to manual.
+* lzip.texinfo: Added chapter 'Program Design' and description
+of option '--threads'.
 * Debug stats have been fixed.
 2010-02-10 Antonio Diaz Diaz <ant_diaz@teleline.es>
@@ -127,7 +139,7 @@
 until something better appears on the net.
-Copyright (C) 2009-2015 Antonio Diaz Diaz.
+Copyright (C) 2009-2016 Antonio Diaz Diaz.
 This file is a collection of facts, and thus it is not copyrightable,
 but just in case, you have unlimited permission to copy, distribute and


@@ -1,7 +1,7 @@
 Requirements
 ------------
 You will need a C++ compiler and the lzlib compression library installed.
-I use gcc 4.9.1 and 4.1.2, but the code should compile with any
+I use gcc 5.3.0 and 4.1.2, but the code should compile with any
 standards compliant compiler.
 Lzlib must be version 1.0 or newer, but the fast encoder is only
 available in lzlib 1.7 or newer.
@@ -65,7 +65,7 @@ After running 'configure', you can run 'make' and 'make install' as
 explained above.
-Copyright (C) 2009-2015 Antonio Diaz Diaz.
+Copyright (C) 2009-2016 Antonio Diaz Diaz.
 This file is free documentation: you have unlimited permission to copy,
 distribute and modify it.


@@ -6,6 +6,7 @@ INSTALL_DATA = $(INSTALL) -m 644
 INSTALL_DIR = $(INSTALL) -d -m 755
 LIBS = -llz -lpthread
 SHELL = /bin/sh
+CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
 objs = arg_parser.o file_index.o compress.o dec_stdout.o dec_stream.o \
        decompress.o main.o
@@ -72,7 +73,9 @@ install-info :
 	if [ ! -d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi
 	-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
 	$(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info"
-	-install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info"
+	-if $(CAN_RUN_INSTALLINFO) ; then \
+	  install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
+	fi
 install-info-compress : install-info
 	lzip -v -9 "$(DESTDIR)$(infodir)/$(pkgname).info"
@@ -95,7 +98,9 @@ uninstall-bin :
 	-rm -f "$(DESTDIR)$(bindir)/$(progname)"
 uninstall-info :
-	-install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info"
+	-if $(CAN_RUN_INSTALLINFO) ; then \
+	  install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
+	fi
 	-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
 uninstall-man :

NEWS

@@ -1,5 +1,14 @@
-Changes in version 1.4:
+Changes in version 1.5:
 
-The option "-0" has been modified to use the new fast encoder of lzlib
-1.7, which achieves a compression speed and ratio comparable to those of
-pigz's default compression level.
+The option "-a, --trailing-error", which makes plzip exit with error
+status 2 if any remaining input is detected after decompressing the last
+member, has been added.
+
+When decompressing, the file specified with the '--output' option is now
+deleted if the input is a terminal.
+
+The new chapters "Trailing data" and "Examples" have been added to the
+manual.
+
+A harmless check failure on Windows, caused by the failed comparison of
+a message in text mode, has been fixed.

README

@@ -13,7 +13,7 @@ plzip is no faster than lzip.
 When compressing, plzip divides the input file into chunks and
 compresses as many chunks simultaneously as worker threads are chosen,
-creating a multi-member compressed file.
+creating a multimember compressed file.
 When decompressing, plzip decompresses as many members simultaneously as
 worker threads are chosen. Files that were compressed with lzip will not
@@ -88,7 +88,7 @@ corresponding uncompressed files. Integrity testing of concatenated
 compressed files is also supported.
-Copyright (C) 2009-2015 Antonio Diaz Diaz.
+Copyright (C) 2009-2016 Antonio Diaz Diaz.
 This file is free documentation: you have unlimited permission to copy,
 distribute and modify it.


@@ -1,5 +1,5 @@
 /* Arg_parser - POSIX/GNU command line argument parser. (C++ version)
-Copyright (C) 2006-2015 Antonio Diaz Diaz.
+Copyright (C) 2006-2016 Antonio Diaz Diaz.
 This library is free software. Redistribution and use in source and
 binary forms, with or without modification, are permitted provided


@@ -1,5 +1,5 @@
 /* Arg_parser - POSIX/GNU command line argument parser. (C++ version)
-Copyright (C) 2006-2015 Antonio Diaz Diaz.
+Copyright (C) 2006-2016 Antonio Diaz Diaz.
 This library is free software. Redistribution and use in source and
 binary forms, with or without modification, are permitted provided


@@ -1,6 +1,6 @@
 /* Plzip - Parallel compressor compatible with lzip
 Copyright (C) 2009 Laszlo Ersek.
-Copyright (C) 2009-2015 Antonio Diaz Diaz.
+Copyright (C) 2009-2016 Antonio Diaz Diaz.
 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
@@ -77,14 +77,14 @@ int writeblock( const int fd, const uint8_t * const buf, const int size )
   }
-void xinit( pthread_mutex_t * const mutex )
+void xinit_mutex( pthread_mutex_t * const mutex )
   {
   const int errcode = pthread_mutex_init( mutex, 0 );
   if( errcode )
     { show_error( "pthread_mutex_init", errcode ); cleanup_and_fail(); }
   }
-void xinit( pthread_cond_t * const cond )
+void xinit_cond( pthread_cond_t * const cond )
   {
   const int errcode = pthread_cond_init( cond, 0 );
   if( errcode )
@@ -92,14 +92,14 @@ void xinit( pthread_cond_t * const cond )
   }
-void xdestroy( pthread_mutex_t * const mutex )
+void xdestroy_mutex( pthread_mutex_t * const mutex )
   {
   const int errcode = pthread_mutex_destroy( mutex );
   if( errcode )
     { show_error( "pthread_mutex_destroy", errcode ); cleanup_and_fail(); }
   }
-void xdestroy( pthread_cond_t * const cond )
+void xdestroy_cond( pthread_cond_t * const cond )
   {
   const int errcode = pthread_cond_destroy( cond );
   if( errcode )
@@ -196,14 +196,14 @@ public:
       slot_tally( slots ), circular_buffer( slots, (Packet *) 0 ),
       num_working( workers ), num_slots( slots ), eof( false )
     {
-    xinit( &imutex ); xinit( &iav_or_eof );
-    xinit( &omutex ); xinit( &oav_or_exit );
+    xinit_mutex( &imutex ); xinit_cond( &iav_or_eof );
+    xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
     }
   ~Packet_courier()
     {
-    xdestroy( &oav_or_exit ); xdestroy( &omutex );
-    xdestroy( &iav_or_eof ); xdestroy( &imutex );
+    xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
+    xdestroy_cond( &iav_or_eof ); xdestroy_mutex( &imutex );
     }
   // make a packet with data received from splitter
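
Note on the renames in this file: the diff shows the overloaded xinit and xdestroy becoming xinit_mutex, xinit_cond, xdestroy_mutex and xdestroy_cond, but it does not state the reason. The wrapper pattern itself can be sketched as below. This is a minimal sketch, not plzip's code: `fail` is an assumed stand-in for show_error plus cleanup_and_fail (neither is defined in this diff), and `Sync_pair` and `sync_pair_works` are hypothetical names that only mimic how Packet_courier pairs a mutex with a condition variable.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <pthread.h>

// Assumed stand-in for show_error() followed by cleanup_and_fail().
void fail( const char * const msg, const int errcode )
  {
  std::fprintf( stderr, "%s: %s\n", msg, std::strerror( errcode ) );
  std::exit( 1 );
  }

// One distinctly named wrapper per primitive, as in the new side of the diff.
void xinit_mutex( pthread_mutex_t * const mutex )
  {
  const int errcode = pthread_mutex_init( mutex, 0 );
  if( errcode ) fail( "pthread_mutex_init", errcode );
  }

void xinit_cond( pthread_cond_t * const cond )
  {
  const int errcode = pthread_cond_init( cond, 0 );
  if( errcode ) fail( "pthread_cond_init", errcode );
  }

void xdestroy_mutex( pthread_mutex_t * const mutex )
  {
  const int errcode = pthread_mutex_destroy( mutex );
  if( errcode ) fail( "pthread_mutex_destroy", errcode );
  }

void xdestroy_cond( pthread_cond_t * const cond )
  {
  const int errcode = pthread_cond_destroy( cond );
  if( errcode ) fail( "pthread_cond_destroy", errcode );
  }

// Hypothetical holder: initialize mutex then cond, destroy in reverse order.
struct Sync_pair
  {
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  Sync_pair() { xinit_mutex( &mutex ); xinit_cond( &cond ); }
  ~Sync_pair() { xdestroy_cond( &cond ); xdestroy_mutex( &mutex ); }
  };

// Locks and unlocks once; returns true if both calls succeed.
bool sync_pair_works()
  {
  Sync_pair sp;
  const bool locked = ( pthread_mutex_lock( &sp.mutex ) == 0 );
  const bool unlocked = ( pthread_mutex_unlock( &sp.mutex ) == 0 );
  return locked && unlocked;
  }
// Usage: assert( sync_pair_works() );
```

With distinct names, each call site says which primitive it touches without the reader having to resolve an overload from the argument type.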

configure

@@ -1,12 +1,12 @@
 #! /bin/sh
 # configure script for Plzip - Parallel compressor compatible with lzip
-# Copyright (C) 2009-2015 Antonio Diaz Diaz.
+# Copyright (C) 2009-2016 Antonio Diaz Diaz.
 #
 # This configure script is free software: you have unlimited permission
 # to copy, distribute and modify it.
 pkgname=plzip
-pkgversion=1.4
+pkgversion=1.5
 progname=plzip
 srctrigger=doc/${pkgname}.texi
@@ -26,8 +26,8 @@ CXXFLAGS='-Wall -W -O2'
 LDFLAGS=
 # checking whether we are using GNU C++.
-${CXX} --version > /dev/null 2>&1
-if [ $? != 0 ] ; then
+if /bin/sh -c "${CXX} --version" > /dev/null 2>&1 ; then true
+else
   CXX=c++
   CXXFLAGS='-W -O2'
 fi
@@ -139,7 +139,7 @@ if [ -z "${no_create}" ] ; then
 rm -f config.status
 cat > config.status << EOF
 #! /bin/sh
-# This file was generated automatically by configure. Do not edit.
+# This file was generated automatically by configure. Don't edit.
 # Run this file to recreate the current configuration.
 #
 # This script is free software: you have unlimited permission
@@ -165,8 +165,8 @@ echo "LDFLAGS = ${LDFLAGS}"
 rm -f Makefile
 cat > Makefile << EOF
 # Makefile for Plzip - Parallel compressor compatible with lzip
-# Copyright (C) 2009-2015 Antonio Diaz Diaz.
-# This file was generated automatically by configure. Do not edit.
+# Copyright (C) 2009-2016 Antonio Diaz Diaz.
+# This file was generated automatically by configure. Don't edit.
 #
 # This Makefile is free software: you have unlimited permission
 # to copy, distribute and modify it.


@@ -1,6 +1,6 @@
 /* Plzip - Parallel compressor compatible with lzip
 Copyright (C) 2009 Laszlo Ersek.
-Copyright (C) 2009-2015 Antonio Diaz Diaz.
+Copyright (C) 2009-2016 Antonio Diaz Diaz.
 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
@@ -76,14 +76,14 @@ public:
       opacket_queues( workers ), num_working( workers ),
       num_workers( workers ), out_slots( slots ), slot_av( workers )
     {
-    xinit( &omutex ); xinit( &oav_or_exit );
-    for( unsigned i = 0; i < slot_av.size(); ++i ) xinit( &slot_av[i] );
+    xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
+    for( unsigned i = 0; i < slot_av.size(); ++i ) xinit_cond( &slot_av[i] );
     }
   ~Packet_courier()
     {
-    for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy( &slot_av[i] );
-    xdestroy( &oav_or_exit ); xdestroy( &omutex );
+    for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy_cond( &slot_av[i] );
+    xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
     }
   void worker_finished()


@@ -1,6 +1,6 @@
 /* Plzip - Parallel compressor compatible with lzip
 Copyright (C) 2009 Laszlo Ersek.
-Copyright (C) 2009-2015 Antonio Diaz Diaz.
+Copyright (C) 2009-2016 Antonio Diaz Diaz.
 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
@@ -88,16 +88,16 @@ public:
       num_workers( workers ), out_slots( oslots ), slot_av( workers ),
       eof( false )
     {
-    xinit( &imutex ); xinit( &iav_or_eof );
-    xinit( &omutex ); xinit( &oav_or_exit );
-    for( unsigned i = 0; i < slot_av.size(); ++i ) xinit( &slot_av[i] );
+    xinit_mutex( &imutex ); xinit_cond( &iav_or_eof );
+    xinit_mutex( &omutex ); xinit_cond( &oav_or_exit );
+    for( unsigned i = 0; i < slot_av.size(); ++i ) xinit_cond( &slot_av[i] );
     }
   ~Packet_courier()
     {
-    for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy( &slot_av[i] );
-    xdestroy( &oav_or_exit ); xdestroy( &omutex );
-    xdestroy( &iav_or_eof ); xdestroy( &imutex );
+    for( unsigned i = 0; i < slot_av.size(); ++i ) xdestroy_cond( &slot_av[i] );
+    xdestroy_cond( &oav_or_exit ); xdestroy_mutex( &omutex );
+    xdestroy_cond( &iav_or_eof ); xdestroy_mutex( &imutex );
     }
   // make a packet with data received from splitter
@@ -345,6 +345,7 @@ struct Worker_arg
   Packet_courier * courier;
   const Pretty_print * pp;
   int worker_id;
+  bool ignore_trailing;
   bool testing;
   };
@@ -357,6 +358,7 @@ extern "C" void * dworker_s( void * arg )
   Packet_courier & courier = *tmp.courier;
   const Pretty_print & pp = *tmp.pp;
   const int worker_id = tmp.worker_id;
+  const bool ignore_trailing = tmp.ignore_trailing;
   const bool testing = tmp.testing;
   uint8_t * new_data = new( std::nothrow ) uint8_t[max_packet_size];
@@ -392,7 +394,11 @@ extern "C" void * dworker_s( void * arg )
   if( rd < 0 )
     {
     if( LZ_decompress_errno( decoder ) == LZ_header_error )
+      {
       trailing_garbage_found = true;
+      if( !ignore_trailing )
+        { pp( "Trailing data not allowed." ); cleanup_and_fail( 2 ); }
+      }
     else
       cleanup_and_fail( decompress_read_error( decoder, pp, worker_id ) );
     }
@@ -461,7 +467,8 @@ void muxer( Packet_courier & courier, const Pretty_print & pp, const int outfd )
 // init the courier, then start the splitter and the workers and,
 // if not testing, call the muxer.
 int dec_stream( const int num_workers, const int infd, const int outfd,
-                const Pretty_print & pp, const int debug_level )
+                const Pretty_print & pp, const int debug_level,
+                const bool ignore_trailing )
   {
   const int in_slots_per_worker = 2;
   const int out_slots = 32;
@@ -490,6 +497,7 @@ int dec_stream( const int num_workers, const int infd, const int outfd,
     worker_args[i].courier = &courier;
     worker_args[i].pp = &pp;
     worker_args[i].worker_id = i;
+    worker_args[i].ignore_trailing = ignore_trailing;
     worker_args[i].testing = ( outfd < 0 );
     errcode = pthread_create( &worker_threads[i], 0, dworker_s, &worker_args[i] );
     if( errcode )
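
Note on the trailing-data change in this file: the branch added to dworker_s can be read as a small decision table. The sketch below models only that decision, under stated assumptions. It does not call lzlib, and `Read_result`, `Action` and `after_read` are illustrative names that appear nowhere in plzip. The diff shows neither how '-a' reaches ignore_trailing nor what the worker does after tolerating trailing data, so both are left out.

```cpp
#include <cassert>

// Illustrative names only; none of these are plzip or lzlib identifiers.
enum Read_result { read_ok, header_error, other_error };
enum Action { keep_going, tolerate_trailing_data, fail_trailing_data,
              fail_read_error };

// Exit status used for trailing data, as stated in NEWS for this release.
const int trailing_data_status = 2;

// Decide what a worker does after one read attempt on the decoder.
Action after_read( const Read_result result, const bool ignore_trailing )
  {
  if( result == read_ok ) return keep_going;
  if( result != header_error ) return fail_read_error;
  // A header error at this point is treated as data after the last member.
  return ignore_trailing ? tolerate_trailing_data : fail_trailing_data;
  }
// Usage: assert( after_read( header_error, false ) == fail_trailing_data );
```

In the real hunk the fail_trailing_data case prints "Trailing data not allowed." and calls cleanup_and_fail( 2 ), which matches the exit status given in NEWS.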


@@ -1,6 +1,6 @@
 /* Plzip - Parallel compressor compatible with lzip
 Copyright (C) 2009 Laszlo Ersek.
-Copyright (C) 2009-2015 Antonio Diaz Diaz.
+Copyright (C) 2009-2016 Antonio Diaz Diaz.
 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
@@ -45,7 +45,7 @@ void Pretty_print::operator()( const char * const msg ) const
     {
     first_post = false;
     std::fprintf( stderr, " %s: ", name_.c_str() );
-    for( unsigned i = 0; i < longest_name - name_.size(); ++i )
+    for( unsigned i = name_.size(); i < longest_name; ++i )
       std::fputc( ' ', stderr );
     if( !msg ) std::fflush( stderr );
     }
@@ -213,16 +213,16 @@ extern "C" void * dworker( void * arg )
 // start the workers and wait for them to finish.
 int decompress( int num_workers, const int infd, const int outfd,
                 const Pretty_print & pp, const int debug_level,
-                const bool infd_isreg )
+                const bool ignore_trailing, const bool infd_isreg )
   {
   if( !infd_isreg )
-    return dec_stream( num_workers, infd, outfd, pp, debug_level );
+    return dec_stream( num_workers, infd, outfd, pp, debug_level, ignore_trailing );
-  const File_index file_index( infd );
+  const File_index file_index( infd, ignore_trailing );
   if( file_index.retval() == 1 )
     {
     lseek( infd, 0, SEEK_SET );
-    return dec_stream( num_workers, infd, outfd, pp, debug_level );
+    return dec_stream( num_workers, infd, outfd, pp, debug_level, ignore_trailing );
     }
   if( file_index.retval() != 0 )
     { pp( file_index.error().c_str() ); return file_index.retval(); }
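
Note on the padding loop in Pretty_print::operator(): the old bound subtracts two unsigned values, so it wraps to a very large number whenever the name is longer than longest_name. The new form compares directly and simply runs zero times in that case. Whether that situation can arise in plzip is not visible in this diff, so the sketch below only illustrates the arithmetic. `old_loop_bound` and `new_padding` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Bound of the old loop: 'for( i = 0; i < longest_name - name_size; ++i )'.
// The subtraction is done in an unsigned type, so it wraps around
// instead of going negative when name_size > longest_name.
std::size_t old_loop_bound( const unsigned longest_name,
                            const std::size_t name_size )
  { return longest_name - name_size; }

// Iteration count of the new loop:
// 'for( unsigned i = name_size; i < longest_name; ++i )'.
unsigned new_padding( const unsigned longest_name,
                      const std::size_t name_size )
  {
  unsigned count = 0;
  for( unsigned i = name_size; i < longest_name; ++i ) ++count;
  return count;
  }
// Usage: assert( new_padding( 5, 8 ) == 0 );
```

Both forms agree when the name fits, for example 3 padding spaces for a 5-character name and a longest name of 8.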


@@ -1,5 +1,5 @@
 .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.46.1.
-.TH PLZIP "1" "July 2015" "plzip 1.4" "User Commands"
+.TH PLZIP "1" "May 2016" "plzip 1.5" "User Commands"
 .SH NAME
 plzip \- reduces the size of files
 .SH SYNOPSIS
@@ -15,11 +15,14 @@ display this help and exit
 \fB\-V\fR, \fB\-\-version\fR
 output version information and exit
 .TP
+\fB\-a\fR, \fB\-\-trailing\-error\fR
+exit with error status if trailing data
+.TP
 \fB\-B\fR, \fB\-\-data\-size=\fR<bytes>
 set size of input data blocks [2x8=16 MiB]
 .TP
 \fB\-c\fR, \fB\-\-stdout\fR
-send output to standard output
+write to standard output, keep input files
 .TP
 \fB\-d\fR, \fB\-\-decompress\fR
 decompress
@@ -40,7 +43,7 @@ set match length limit in bytes [36]
 set number of (de)compression threads [2]
 .TP
 \fB\-o\fR, \fB\-\-output=\fR<file>
-if reading stdin, place the output into <file>
+if reading standard input, write to <file>
 .TP
 \fB\-q\fR, \fB\-\-quiet\fR
 suppress all messages
@@ -63,13 +66,16 @@ alias for \fB\-0\fR
 \fB\-\-best\fR
 alias for \fB\-9\fR
 .PP
-If no file names are given, plzip compresses or decompresses
-from standard input to standard output.
+If no file names are given, or if a file is '\-', plzip compresses or
+decompresses from standard input to standard output.
 Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,
 Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...
+Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12
+to 2^29 bytes.
+.PP
 The bidimensional parameter space of LZMA can't be mapped to a linear
 scale optimal for all files. If your files are large, very repetitive,
-etc, you may need to use the \fB\-\-match\-length\fR and \fB\-\-dictionary\-size\fR
+etc, you may need to use the \fB\-\-dictionary\-size\fR and \fB\-\-match\-length\fR
 options directly to achieve optimal performance.
 .PP
 Exit status: 0 for a normal exit, 1 for environmental problems (file
@@ -83,8 +89,8 @@ Plzip home page: http://www.nongnu.org/lzip/plzip.html
 .SH COPYRIGHT
 Copyright \(co 2009 Laszlo Ersek.
 .br
-Copyright \(co 2015 Antonio Diaz Diaz.
-Using lzlib 1.7
+Copyright \(co 2016 Antonio Diaz Diaz.
+Using lzlib 1.8
 License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
 .br
 This is free software: you are free to change and redistribute it.
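
Note on the numeric arguments described in this man page: the multiplier suffixes and the new rule for dictionary sizes 12 to 29 can be illustrated as follows. This is a sketch of the documented rules only, not plzip's argument parser. `multiplier` and `dictionary_bytes` are hypothetical helpers, only the suffixes the man page spells out are handled, and the rounding of dictionary sizes to valid quantized values is not modeled.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: value of a multiplier suffix listed in the man page.
// Returns 0 for any suffix this sketch does not know.
unsigned long long multiplier( const std::string & suffix )
  {
  if( suffix.empty() ) return 1;
  if( suffix == "k" || suffix == "kB" ) return 1000ULL;		// 10^3
  if( suffix == "Ki" || suffix == "KiB" ) return 1ULL << 10;	// 2^10
  if( suffix == "M" ) return 1000000ULL;			// 10^6
  if( suffix == "Mi" ) return 1ULL << 20;			// 2^20
  if( suffix == "G" ) return 1000000000ULL;			// 10^9
  if( suffix == "Gi" ) return 1ULL << 30;			// 2^30
  return 0;
  }

// Hypothetical helper: values 12 to 29 are exponents of two,
// any other value is taken as a size in bytes.
unsigned long long dictionary_bytes( const unsigned long long value )
  {
  if( value >= 12 && value <= 29 ) return 1ULL << value;
  return value;
  }
// Usage: assert( dictionary_bytes( 12 ) == 4 * multiplier( "Ki" ) );
```

Under these rules '-s 12' and '-s 4Ki' name the same 4 KiB limit, and '-s 29' names the 512 MiB maximum given in the manual.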


@@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
 Plzip Manual
 ************
-This manual is for Plzip (version 1.4, 9 July 2015).
+This manual is for Plzip (version 1.5, 14 May 2016).
 * Menu:
@@ -21,11 +21,13 @@ This manual is for Plzip (version 1.4, 9 July 2015).
 * File format:: Detailed format of the compressed file
 * Memory requirements:: Memory required to compress and decompress
 * Minimum file sizes:: Minimum file sizes required for full speed
+* Trailing data:: Extra data appended to the file
+* Examples:: A small tutorial with examples
 * Problems:: Reporting bugs
 * Concept index:: Index of concepts
-Copyright (C) 2009-2015 Antonio Diaz Diaz.
+Copyright (C) 2009-2016 Antonio Diaz Diaz.
 This manual is free documentation: you have unlimited permission to
 copy, distribute and modify it.
@@ -59,7 +61,7 @@ availability:
 recovery means. The lziprecover program can repair bit-flip errors
 (one of the most common forms of data corruption) in lzip files,
 and provides data recovery capabilities, including error-checked
-merging of damaged copies of a file. *note Data safety:
+merging of damaged copies of a file. *Note Data safety:
 (lziprecover)Data safety.
 * The lzip format is as simple as possible (but not simpler). The
@@ -115,13 +117,6 @@ two or more compressed files. The result is the concatenation of the
 corresponding uncompressed files. Integrity testing of concatenated
 compressed files is also supported.
-WARNING! Even if plzip is bug-free, other causes may result in a
-corrupt compressed file (bugs in the system libraries, memory errors,
-etc). Therefore, if the data you are going to compress are important,
-give the '--keep' option to plzip and do not remove the original file
-until you verify the compressed file with a command like
-'plzip -cd file.lz | cmp file -'.
 
 File: plzip.info, Node: Invoking plzip, Next: Program design, Prev: Introduction, Up: Top
@@ -132,6 +127,10 @@ The format for running plzip is:
 plzip [OPTIONS] [FILES]
+'-' used as a FILE argument means standard input. It can be mixed with
+other FILES and is read just once, the first time it appears in the
+command line.
 Plzip supports the following options:
 '-h'
@@ -142,6 +141,13 @@ The format for running plzip is:
 '--version'
 Print the version number of plzip on the standard output and exit.
+'-a'
+'--trailing-error'
+Exit with error status 2 if any remaining input is detected after
+decompressing the last member. Such remaining input is usually
+trailing garbage that can be safely ignored. *Note
+concat-example::.
 '-B BYTES'
 '--data-size=BYTES'
 Set the size of the input data blocks, in bytes. The input file
@@ -153,12 +159,17 @@ The format for running plzip is:
 '-c'
 '--stdout'
-Compress or decompress to standard output. Needed when reading
-from a named pipe (fifo) or from a device.
+Compress or decompress to standard output; keep input files
+unchanged. If compressing several files, each file is compressed
+independently. This option is needed when reading from a named
+pipe (fifo) or from a device.
 '-d'
 '--decompress'
-Decompress.
+Decompress the specified file(s). If a file does not exist or
+can't be opened, plzip continues decompressing the rest of the
+files. If a file fails to decompress, plzip exits immediately
+without decompressing the rest of the files.
 '-f'
 '--force'
@@ -207,12 +218,13 @@ The format for running plzip is:
 '-s BYTES'
 '--dictionary-size=BYTES'
-Set the dictionary size limit in bytes. Valid values range from 4
-KiB to 512 MiB. Plzip will use the smallest possible dictionary
-size for each file without exceeding this limit. Note that
-dictionary sizes are quantized. If the specified size does not
-match one of the valid sizes, it will be rounded upwards by adding
-up to (BYTES / 16) to it.
+Set the dictionary size limit in bytes. Plzip will use the smallest
+possible dictionary size for each file without exceeding this
+limit. Valid values range from 4 KiB to 512 MiB. Values 12 to 29
+are interpreted as powers of two, meaning 2^12 to 2^29 bytes. Note
+that dictionary sizes are quantized. If the specified size does
+not match one of the valid sizes, it will be rounded upwards by
+adding up to (BYTES / 8) to it.
 For maximum compression you should use a dictionary size limit as
 large as possible, but keep in mind that the decompression memory
@@ -224,7 +236,8 @@ The format for running plzip is:
 Check integrity of the specified file(s), but don't decompress
 them. This really performs a trial decompression and throws away
 the result. Use it together with '-v' to see information about
-the file.
+the file(s). If a file fails the test, plzip may be unable to
+check the rest of the files.
 '-v'
 '--verbose'
@@ -237,14 +250,14 @@ The format for running plzip is:
 '-0 .. -9'
 Set the compression parameters (dictionary size and match length
-limit) as shown in the table below. Note that '-9' can be much
-slower than '-0'. These options have no effect when decompressing.
+limit) as shown in the table below. The default compression level
+is '-6'. Note that '-9' can be much slower than '-0'. These
+options have no effect when decompressing.
 The bidimensional parameter space of LZMA can't be mapped to a
 linear scale optimal for all files. If your files are large, very
-repetitive, etc, you may need to use the '--match-length' and
-'--dictionary-size' options directly to achieve optimal
-performance.
+repetitive, etc, you may need to use the '--dictionary-size' and
+'--match-length' options directly to achieve optimal performance.
 Level Dictionary size Match length limit
 -0 64 KiB 16 bytes
@@ -292,7 +305,7 @@ File: plzip.info, Node: Program design, Next: File format, Prev: Invoking plz
 When compressing, plzip divides the input file into chunks and
 compresses as many chunks simultaneously as worker threads are chosen,
-creating a multi-member compressed file.
+creating a multimember compressed file.
 When decompressing, plzip decompresses as many members
 simultaneously as worker threads are chosen. Files that were compressed
@@ -348,12 +361,12 @@ additional information before, between, or after them.
 Each member has the following structure:
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID string | VN | DS | Lzma stream | CRC32 | Data size | Member size | | ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
All multibyte values are stored in little endian order. All multibyte values are stored in little endian order.
'ID string' 'ID string (the "magic" bytes)'
A four byte string, identifying the lzip format, with the value A four byte string, identifying the lzip format, with the value
"LZIP" (0x4C, 0x5A, 0x49, 0x50). "LZIP" (0x4C, 0x5A, 0x49, 0x50).
@ -371,8 +384,8 @@ additional information before, between, or after them.
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
Valid values for dictionary size range from 4 KiB to 512 MiB. Valid values for dictionary size range from 4 KiB to 512 MiB.
'Lzma stream' 'LZMA stream'
The lzma stream, finished by an end of stream marker. Uses default The LZMA stream, finished by an end of stream marker. Uses default
values for encoder properties. *Note Stream format: (lzip)Stream values for encoder properties. *Note Stream format: (lzip)Stream
format, for a complete description. format, for a complete description.
@ -386,7 +399,7 @@ additional information before, between, or after them.
Total size of the member, including header and trailer. This field Total size of the member, including header and trailer. This field
acts as a distributed index, allows the verification of stream acts as a distributed index, allows the verification of stream
integrity, and facilitates safe recovery of undamaged members from integrity, and facilitates safe recovery of undamaged members from
multi-member files. multimember files.
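
As an illustration of the trailer layout and the little endian rule, the three trailer fields can be read from a 20-byte buffer as sketched below. The field widths (4, 8 and 8 bytes) follow the diagram above; 'Trailer_fields', 'read_le' and 'parse_trailer' are hypothetical names used only for this sketch.

```cpp
#include <cstdint>

struct Trailer_fields                 // hypothetical helper type
  {
  uint32_t crc32;                     // CRC32 of the uncompressed data
  uint64_t data_size;                 // size of the uncompressed data
  uint64_t member_size;               // header + LZMA stream + trailer
  };

// Read 'bytes' bytes starting at p as one little endian value.
inline uint64_t read_le( const uint8_t * const p, const int bytes )
  {
  uint64_t value = 0;
  for( int i = bytes - 1; i >= 0; --i ) value = ( value << 8 ) | p[i];
  return value;
  }

// Split a 20-byte trailer into its three fields.
inline Trailer_fields parse_trailer( const uint8_t * const t )
  {
  Trailer_fields f;
  f.crc32 = static_cast< uint32_t >( read_le( t, 4 ) );
  f.data_size = read_le( t + 4, 8 );
  f.member_size = read_le( t + 12, 8 );
  return f;
  }
```

Because 'member_size' covers the whole member, a reader positioned at the end of a member can subtract it to find where that member starts, which is what makes the field usable as a distributed index.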
 
@ -408,7 +421,7 @@ following:
file, or for testing of a regular file; the dictionary size. file, or for testing of a regular file; the dictionary size.
(Note that regular files with more than 1024 bytes of trailing (Note that regular files with more than 1024 bytes of trailing
garbage are treated as non-seekable). data are treated as non-seekable).
* For testing of a non-seekable file or of standard input; the * For testing of a non-seekable file or of standard input; the
dictionary size plus up to 5 MiB. dictionary size plus up to 5 MiB.
@ -420,14 +433,14 @@ following:
dictionary size plus up to 35 MiB. dictionary size plus up to 35 MiB.
 
File: plzip.info, Node: Minimum file sizes, Next: Problems, Prev: Memory requirements, Up: Top File: plzip.info, Node: Minimum file sizes, Next: Trailing data, Prev: Memory requirements, Up: Top
6 Minimum file sizes required for full compression speed 6 Minimum file sizes required for full compression speed
******************************************************** ********************************************************
When compressing, plzip divides the input file into chunks and When compressing, plzip divides the input file into chunks and
compresses as many chunks simultaneously as worker threads are chosen, compresses as many chunks simultaneously as worker threads are chosen,
creating a multi-member compressed file. creating a multimember compressed file.
For this to work as expected (and roughly multiply the compression For this to work as expected (and roughly multiply the compression
speed by the number of available processors), the uncompressed file speed by the number of available processors), the uncompressed file
@ -456,9 +469,106 @@ Level
-9 128 MiB 256 MiB 512 MiB 1 GiB 4 GiB 16 GiB -9 128 MiB 256 MiB 512 MiB 1 GiB 4 GiB 16 GiB
 
File: plzip.info, Node: Problems, Next: Concept index, Prev: Minimum file sizes, Up: Top File: plzip.info, Node: Trailing data, Next: Examples, Prev: Minimum file sizes, Up: Top
7 Reporting bugs 7 Extra data appended to the file
*********************************
Sometimes extra data is found appended to a lzip file after the last
member. Such trailing data may be:
* Padding added to make the file size a multiple of some block size,
for example when writing to a tape.
* Garbage added by some not totally successful copy operation.
* Useful data added by the user; a cryptographically secure hash, a
description of file contents, etc.
* Malicious data added to the file in order to make its total size
and hash value (for a chosen hash) coincide with those of another
file.
* In very rare cases, trailing data could be the corrupt header of
another member. In multimember or concatenated files the
probability of corruption happening in the magic bytes is 5 times
smaller than the probability of getting a false positive caused by
the corruption of the integrity information itself. Therefore it
can be considered to be below the noise level.
Trailing data can be safely ignored in most cases. In some cases,
like that of user-added data, it is expected to be ignored. In those
cases where a file containing trailing data must be rejected, the option
'--trailing-error' can be used. *Note --trailing-error::.

File: plzip.info, Node: Examples, Next: Problems, Prev: Trailing data, Up: Top
8 A small tutorial with examples
********************************
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
compressed file (bugs in the system libraries, memory errors, etc).
Therefore, if the data you are going to compress are important, give the
'--keep' option to plzip and don't remove the original file until you
verify the compressed file with a command like
'plzip -cd file.lz | cmp file -'.
Example 1: Replace a regular file with its compressed version 'file.lz'
and show the compression ratio.
plzip -v file
Example 2: Like example 1 but the created 'file.lz' has a block size of
1 MiB. The compression ratio is not shown.
plzip -B 1MiB file
Example 3: Restore a regular file from its compressed version
'file.lz'. If the operation is successful, 'file.lz' is removed.
plzip -d file.lz
Example 4: Verify the integrity of the compressed file 'file.lz' and
show status.
plzip -tv file.lz
Example 5: Compress a whole device in /dev/sdc and send the output to
'file.lz'.
plzip -c /dev/sdc > file.lz
Example 6: The right way of concatenating compressed files. *Note
Trailing data::.
Don't do this
cat file1.lz file2.lz file3.lz | plzip -d
Do this instead
plzip -cd file1.lz file2.lz file3.lz
Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed
data are produced.
plzip -cd file.lz | dd bs=1024 count=10
Example 8: Decompress 'file.lz' partially from decompressed byte 10000
to decompressed byte 15000 (5000 bytes are produced).
plzip -cd file.lz | dd bs=1000 skip=10 count=5

File: plzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
9 Reporting bugs
**************** ****************
There are probably bugs in plzip. There are certainly errors and There are probably bugs in plzip. There are certainly errors and
@ -480,6 +590,7 @@ Concept index
* Menu: * Menu:
* bugs: Problems. (line 6) * bugs: Problems. (line 6)
* examples: Examples. (line 6)
* file format: File format. (line 6) * file format: File format. (line 6)
* getting help: Problems. (line 6) * getting help: Problems. (line 6)
* introduction: Introduction. (line 6) * introduction: Introduction. (line 6)
@ -488,6 +599,7 @@ Concept index
* minimum file sizes: Minimum file sizes. (line 6) * minimum file sizes: Minimum file sizes. (line 6)
* options: Invoking plzip. (line 6) * options: Invoking plzip. (line 6)
* program design: Program design. (line 6) * program design: Program design. (line 6)
* trailing data: Trailing data. (line 6)
* usage: Invoking plzip. (line 6) * usage: Invoking plzip. (line 6)
* version: Invoking plzip. (line 6) * version: Invoking plzip. (line 6)
@ -495,15 +607,19 @@ Concept index
 
Tag Table: Tag Table:
Node: Top221 Node: Top221
Node: Introduction984 Node: Introduction1101
Node: Invoking plzip5332 Node: Invoking plzip5078
Ref: --data-size5747 Ref: --trailing-error5647
Node: Program design10972 Ref: --data-size5890
Node: File format12560 Node: Program design11683
Node: Memory requirements14973 Node: File format13270
Node: Minimum file sizes16085 Node: Memory requirements15702
Node: Problems18007 Node: Minimum file sizes16811
Node: Concept index18543 Node: Trailing data18737
Node: Examples20121
Ref: concat-example21286
Node: Problems21823
Node: Concept index22349
 
End Tag Table End Tag Table


@ -6,8 +6,8 @@
@finalout @finalout
@c %**end of header @c %**end of header
@set UPDATED 9 July 2015 @set UPDATED 14 May 2016
@set VERSION 1.4 @set VERSION 1.5
@dircategory Data Compression @dircategory Data Compression
@direntry @direntry
@ -41,12 +41,14 @@ This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
* File format:: Detailed format of the compressed file * File format:: Detailed format of the compressed file
* Memory requirements:: Memory required to compress and decompress * Memory requirements:: Memory required to compress and decompress
* Minimum file sizes:: Minimum file sizes required for full speed * Minimum file sizes:: Minimum file sizes required for full speed
* Trailing data:: Extra data appended to the file
* Examples:: A small tutorial with examples
* Problems:: Reporting bugs * Problems:: Reporting bugs
* Concept index:: Index of concepts * Concept index:: Index of concepts
@end menu @end menu
@sp 1 @sp 1
Copyright @copyright{} 2009-2015 Antonio Diaz Diaz. Copyright @copyright{} 2009-2016 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission This manual is free documentation: you have unlimited permission
to copy, distribute and modify it. to copy, distribute and modify it.
@ -83,7 +85,7 @@ program can repair bit-flip errors (one of the most common forms of data
corruption) in lzip files, and provides data recovery capabilities, corruption) in lzip files, and provides data recovery capabilities,
including error-checked merging of damaged copies of a file. including error-checked merging of damaged copies of a file.
@ifnothtml @ifnothtml
@ref{Data safety,,,lziprecover}. @xref{Data safety,,,lziprecover}.
@end ifnothtml @end ifnothtml
@item @item
@ -144,13 +146,6 @@ or more compressed files. The result is the concatenation of the
corresponding uncompressed files. Integrity testing of concatenated corresponding uncompressed files. Integrity testing of concatenated
compressed files is also supported. compressed files is also supported.
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
compressed file (bugs in the system libraries, memory errors, etc).
Therefore, if the data you are going to compress are important, give the
@samp{--keep} option to plzip and do not remove the original file until
you verify the compressed file with a command like
@w{@samp{plzip -cd file.lz | cmp file -}}.
@node Invoking plzip @node Invoking plzip
@chapter Invoking plzip @chapter Invoking plzip
@ -165,6 +160,11 @@ The format for running plzip is:
plzip [@var{options}] [@var{files}] plzip [@var{options}] [@var{files}]
@end example @end example
@noindent
@samp{-} used as a @var{file} argument means standard input. It can be
mixed with other @var{files} and is read just once, the first time it
appears in the command line.
Plzip supports the following options: Plzip supports the following options:
@table @code @table @code
@ -176,6 +176,13 @@ Print an informative help message describing the options and exit.
@itemx --version @itemx --version
Print the version number of plzip on the standard output and exit. Print the version number of plzip on the standard output and exit.
@anchor{--trailing-error}
@item -a
@itemx --trailing-error
Exit with error status 2 if any remaining input is detected after
decompressing the last member. Such remaining input is usually trailing
garbage that can be safely ignored. @xref{concat-example}.
@anchor{--data-size} @anchor{--data-size}
@item -B @var{bytes} @item -B @var{bytes}
@itemx --data-size=@var{bytes} @itemx --data-size=@var{bytes}
@ -188,12 +195,17 @@ data size.
@item -c @item -c
@itemx --stdout @itemx --stdout
Compress or decompress to standard output. Needed when reading from a Compress or decompress to standard output; keep input files unchanged.
named pipe (fifo) or from a device. If compressing several files, each file is compressed independently.
This option is needed when reading from a named pipe (fifo) or from a
device.
@item -d @item -d
@itemx --decompress @itemx --decompress
Decompress. Decompress the specified file(s). If a file does not exist or can't be
opened, plzip continues decompressing the rest of the files. If a file
fails to decompress, plzip exits immediately without decompressing the
rest of the files.
@item -f @item -f
@itemx --force @itemx --force
@ -238,11 +250,13 @@ Quiet operation. Suppress all messages.
@item -s @var{bytes} @item -s @var{bytes}
@itemx --dictionary-size=@var{bytes} @itemx --dictionary-size=@var{bytes}
Set the dictionary size limit in bytes. Valid values range from 4 KiB to Set the dictionary size limit in bytes. Plzip will use the smallest
512 MiB. Plzip will use the smallest possible dictionary size for each possible dictionary size for each file without exceeding this limit.
file without exceeding this limit. Note that dictionary sizes are Valid values range from 4 KiB to 512 MiB. Values 12 to 29 are
quantized. If the specified size does not match one of the valid sizes, interpreted as powers of two, meaning 2^12 to 2^29 bytes. Note that
it will be rounded upwards by adding up to (@var{bytes} / 16) to it. dictionary sizes are quantized. If the specified size does not match one
of the valid sizes, it will be rounded upwards by adding up to
@w{(@var{bytes} / 8)} to it.
For maximum compression you should use a dictionary size limit as large For maximum compression you should use a dictionary size limit as large
as possible, but keep in mind that the decompression memory requirement as possible, but keep in mind that the decompression memory requirement
@ -252,7 +266,9 @@ is affected at compression time by the choice of dictionary size limit.
@itemx --test @itemx --test
Check integrity of the specified file(s), but don't decompress them. Check integrity of the specified file(s), but don't decompress them.
This really performs a trial decompression and throws away the result. This really performs a trial decompression and throws away the result.
Use it together with @samp{-v} to see information about the file. Use it together with @samp{-v} to see information about the file(s). If
a file fails the test, plzip may be unable to check the rest of the
files.
@item -v @item -v
@itemx --verbose @itemx --verbose
@ -265,14 +281,14 @@ decompressed size, and compressed size.
@item -0 .. -9 @item -0 .. -9
Set the compression parameters (dictionary size and match length limit) Set the compression parameters (dictionary size and match length limit)
as shown in the table below. Note that @samp{-9} can be much slower than as shown in the table below. The default compression level is @samp{-6}.
@samp{-0}. These options have no effect when decompressing. Note that @samp{-9} can be much slower than @samp{-0}. These options
have no effect when decompressing.
The bidimensional parameter space of LZMA can't be mapped to a linear The bidimensional parameter space of LZMA can't be mapped to a linear
scale optimal for all files. If your files are large, very repetitive, scale optimal for all files. If your files are large, very repetitive,
etc, you may need to use the @samp{--match-length} and etc, you may need to use the @samp{--dictionary-size} and
@samp{--dictionary-size} options directly to achieve optimal @samp{--match-length} options directly to achieve optimal performance.
performance.
@multitable {Level} {Dictionary size} {Match length limit} @multitable {Level} {Dictionary size} {Match length limit}
@item Level @tab Dictionary size @tab Match length limit @item Level @tab Dictionary size @tab Match length limit
@ -324,7 +340,7 @@ caused plzip to panic.
When compressing, plzip divides the input file into chunks and When compressing, plzip divides the input file into chunks and
compresses as many chunks simultaneously as worker threads are chosen, compresses as many chunks simultaneously as worker threads are chosen,
creating a multi-member compressed file. creating a multimember compressed file.
When decompressing, plzip decompresses as many members simultaneously as When decompressing, plzip decompresses as many members simultaneously as
worker threads are chosen. Files that were compressed with lzip will not worker threads are chosen. Files that were compressed with lzip will not
@ -383,14 +399,14 @@ additional information before, between, or after them.
Each member has the following structure: Each member has the following structure:
@verbatim @verbatim
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID string | VN | DS | Lzma stream | CRC32 | Data size | Member size | | ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
@end verbatim @end verbatim
All multibyte values are stored in little endian order. All multibyte values are stored in little endian order.
@table @samp @table @samp
@item ID string @item ID string (the "magic" bytes)
A four byte string, identifying the lzip format, with the value "LZIP" A four byte string, identifying the lzip format, with the value "LZIP"
(0x4C, 0x5A, 0x49, 0x50). (0x4C, 0x5A, 0x49, 0x50).
@ -407,8 +423,8 @@ from the base size to obtain the dictionary size.@*
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@* Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
Valid values for dictionary size range from 4 KiB to 512 MiB. Valid values for dictionary size range from 4 KiB to 512 MiB.
@item Lzma stream @item LZMA stream
The lzma stream, finished by an end of stream marker. Uses default The LZMA stream, finished by an end of stream marker. Uses default
values for encoder properties. values for encoder properties.
@ifnothtml @ifnothtml
@xref{Stream format,,,lzip}, @xref{Stream format,,,lzip},
@ -428,7 +444,7 @@ Size of the uncompressed original data.
@item Member size (8 bytes) @item Member size (8 bytes)
Total size of the member, including header and trailer. This field acts Total size of the member, including header and trailer. This field acts
as a distributed index, allows the verification of stream integrity, and as a distributed index, allows the verification of stream integrity, and
facilitates safe recovery of undamaged members from multi-member files. facilitates safe recovery of undamaged members from multimember files.
@end table @end table
@ -453,8 +469,8 @@ times the data size. Default is 136 MiB.
For decompression of a regular (seekable) file to another regular file, For decompression of a regular (seekable) file to another regular file,
or for testing of a regular file; the dictionary size. or for testing of a regular file; the dictionary size.
(Note that regular files with more than 1024 bytes of trailing garbage (Note that regular files with more than 1024 bytes of trailing data are
are treated as non-seekable). treated as non-seekable).
@item @item
For testing of a non-seekable file or of standard input; the dictionary For testing of a non-seekable file or of standard input; the dictionary
@ -476,7 +492,7 @@ dictionary size plus up to 35 MiB.
When compressing, plzip divides the input file into chunks and When compressing, plzip divides the input file into chunks and
compresses as many chunks simultaneously as worker threads are chosen, compresses as many chunks simultaneously as worker threads are chosen,
creating a multi-member compressed file. creating a multimember compressed file.
For this to work as expected (and roughly multiply the compression speed For this to work as expected (and roughly multiply the compression speed
by the number of available processors), the uncompressed file must be at by the number of available processors), the uncompressed file must be at
@ -506,6 +522,133 @@ data size for each level:
@end multitable @end multitable
@node Trailing data
@chapter Extra data appended to the file
@cindex trailing data
Sometimes extra data is found appended to a lzip file after the last
member. Such trailing data may be:
@itemize @bullet
@item
Padding added to make the file size a multiple of some block size, for
example when writing to a tape.
@item
Garbage added by some not totally successful copy operation.
@item
Useful data added by the user; a cryptographically secure hash, a
description of file contents, etc.
@item
Malicious data added to the file in order to make its total size and
hash value (for a chosen hash) coincide with those of another file.
@item
In very rare cases, trailing data could be the corrupt header of another
member. In multimember or concatenated files the probability of
corruption happening in the magic bytes is 5 times smaller than the
probability of getting a false positive caused by the corruption of the
integrity information itself. Therefore it can be considered to be below
the noise level.
@end itemize
Trailing data can be safely ignored in most cases. In some cases, like
that of user-added data, it is expected to be ignored. In those cases
where a file containing trailing data must be rejected, the option
@samp{--trailing-error} can be used. @xref{--trailing-error}.
@node Examples
@chapter A small tutorial with examples
@cindex examples
WARNING! Even if plzip is bug-free, other causes may result in a corrupt
compressed file (bugs in the system libraries, memory errors, etc).
Therefore, if the data you are going to compress are important, give the
@samp{--keep} option to plzip and don't remove the original file until
you verify the compressed file with a command like
@w{@samp{plzip -cd file.lz | cmp file -}}.
@sp 1
@noindent
Example 1: Replace a regular file with its compressed version
@samp{file.lz} and show the compression ratio.
@example
plzip -v file
@end example
@sp 1
@noindent
Example 2: Like example 1 but the created @samp{file.lz} has a block
size of 1 MiB. The compression ratio is not shown.
@example
plzip -B 1MiB file
@end example
@sp 1
@noindent
Example 3: Restore a regular file from its compressed version
@samp{file.lz}. If the operation is successful, @samp{file.lz} is
removed.
@example
plzip -d file.lz
@end example
@sp 1
@noindent
Example 4: Verify the integrity of the compressed file @samp{file.lz}
and show status.
@example
plzip -tv file.lz
@end example
@sp 1
@noindent
Example 5: Compress a whole device in /dev/sdc and send the output to
@samp{file.lz}.
@example
plzip -c /dev/sdc > file.lz
@end example
@sp 1
@anchor{concat-example}
@noindent
Example 6: The right way of concatenating compressed files.
@xref{Trailing data}.
@example
Don't do this
cat file1.lz file2.lz file3.lz | plzip -d
Do this instead
plzip -cd file1.lz file2.lz file3.lz
@end example
@sp 1
@noindent
Example 7: Decompress @samp{file.lz} partially until 10 KiB of
decompressed data are produced.
@example
plzip -cd file.lz | dd bs=1024 count=10
@end example
@sp 1
@noindent
Example 8: Decompress @samp{file.lz} partially from decompressed byte
10000 to decompressed byte 15000 (5000 bytes are produced).
@example
plzip -cd file.lz | dd bs=1000 skip=10 count=5
@end example
@node Problems @node Problems
@chapter Reporting bugs @chapter Reporting bugs
@cindex bugs @cindex bugs


@ -1,5 +1,5 @@
/* Plzip - Parallel compressor compatible with lzip /* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009-2015 Antonio Diaz Diaz. Copyright (C) 2009-2016 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by it under the terms of the GNU General Public License as published by
@ -50,11 +50,11 @@ void File_index::set_num_error( const char * const msg1, unsigned long long num,
char buf[80]; char buf[80];
snprintf( buf, sizeof buf, "%s%llu%s", msg1, num, msg2 ); snprintf( buf, sizeof buf, "%s%llu%s", msg1, num, msg2 );
error_ = buf; error_ = buf;
retval_ = member_vector.empty() ? 1 : 2; // maybe trailing garbage retval_ = member_vector.empty() ? 1 : 2; // maybe trailing data
} }
File_index::File_index( const int infd ) File_index::File_index( const int infd, const bool ignore_garbage )
: retval_( 0 ) : retval_( 0 )
{ {
const long long isize = lseek( infd, 0, SEEK_END ); const long long isize = lseek( infd, 0, SEEK_END );
@ -88,7 +88,10 @@ File_index::File_index( const int infd )
if( member_size < min_member_size || member_size > pos ) if( member_size < min_member_size || member_size > pos )
{ {
if( member_vector.empty() && isize - pos < max_garbage ) if( member_vector.empty() && isize - pos < max_garbage )
{ --pos; continue; } // maybe trailing garbage {
if( ignore_garbage ) { --pos; continue; } // maybe trailing data
error_ = "Trailing data not allowed."; retval_ = 2; return;
}
set_num_error( "Member size in trailer is corrupt at pos ", pos - 8 ); set_num_error( "Member size in trailer is corrupt at pos ", pos - 8 );
break; break;
} }
@ -98,7 +101,10 @@ File_index::File_index( const int infd )
if( !header.verify_magic() || !header.verify_version() ) if( !header.verify_magic() || !header.verify_version() )
{ {
if( member_vector.empty() && isize - pos < max_garbage ) if( member_vector.empty() && isize - pos < max_garbage )
{ --pos; continue; } // maybe trailing garbage {
if( ignore_garbage ) { --pos; continue; } // maybe trailing data
error_ = "Trailing data not allowed."; retval_ = 2; return;
}
set_num_error( "Bad header at pos ", pos - member_size ); set_num_error( "Bad header at pos ", pos - member_size );
break; break;
} }
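
The effect of the new 'ignore_garbage' parameter in the hunks above can be illustrated with a simplified, self-contained sketch. It is not the File_index code: it works on an in-memory buffer, looks only for the last member, checks only the magic bytes, and omits the 'max_garbage' limit. The name 'last_member_end' and the minimum member size of 36 bytes are assumptions made for the example.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Return the offset just past the last member in 'file', or -1 if no member
// is found or if trailing data is present and 'ignore_trailing' is false.
inline long long last_member_end( const std::vector< uint8_t > & file,
                                  const bool ignore_trailing )
  {
  const unsigned long long min_member_size = 36;    // assumed minimum
  unsigned long long pos = file.size();
  while( pos >= min_member_size )
    {
    // The last 8 bytes before pos are a candidate little endian member size.
    unsigned long long member_size = 0;
    for( int i = 1; i <= 8; ++i )
      member_size = ( member_size << 8 ) | file[pos-i];
    if( member_size >= min_member_size && member_size <= pos &&
        std::memcmp( &file[pos-member_size], "LZIP", 4 ) == 0 )
      return pos;                                   // plausible member found
    if( !ignore_trailing ) return -1;               // trailing data not allowed
    --pos;                                          // maybe trailing data
    }
  return -1;
  }
```

With 'ignore_trailing' set, the scan steps back one byte at a time past any trailing data, as the '--pos; continue;' branch does. Without it, the first mismatch is reported as an error, which corresponds to the "Trailing data not allowed." path.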


@ -1,5 +1,5 @@
/* Plzip - Parallel compressor compatible with lzip /* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009-2015 Antonio Diaz Diaz. Copyright (C) 2009-2016 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by it under the terms of the GNU General Public License as published by
@ -57,7 +57,7 @@ class File_index
const char * const msg2 = "" ); const char * const msg2 = "" );
public: public:
explicit File_index( const int infd ); File_index( const int infd, const bool ignore_garbage );
long members() const { return member_vector.size(); } long members() const { return member_vector.size(); }
const std::string & error() const { return error_; } const std::string & error() const { return error_; }

lzip.h

@ -1,5 +1,5 @@
/* Plzip - Parallel compressor compatible with lzip /* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009-2015 Antonio Diaz Diaz. Copyright (C) 2009-2016 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by it under the terms of the GNU General Public License as published by
@ -15,6 +15,10 @@
along with this program. If not, see <http://www.gnu.org/licenses/>. along with this program. If not, see <http://www.gnu.org/licenses/>.
*/ */
#ifndef LZ_API_VERSION
#define LZ_API_VERSION 1
#endif
enum { enum {
min_dictionary_bits = 12, min_dictionary_bits = 12,
min_dictionary_size = 1 << min_dictionary_bits, min_dictionary_size = 1 << min_dictionary_bits,
@ -31,9 +35,11 @@ class Pretty_print
mutable bool first_post; mutable bool first_post;
public: public:
explicit Pretty_print( const std::vector< std::string > & filenames ) Pretty_print( const std::vector< std::string > & filenames,
const int verbosity )
: stdin_name( "(stdin)" ), longest_name( 0 ), first_post( false ) : stdin_name( "(stdin)" ), longest_name( 0 ), first_post( false )
{ {
if( verbosity <= 0 ) return;
const unsigned stdin_name_len = std::strlen( stdin_name ); const unsigned stdin_name_len = std::strlen( stdin_name );
for( unsigned i = 0; i < filenames.size(); ++i ) for( unsigned i = 0; i < filenames.size(); ++i )
{ {
@ -57,6 +63,11 @@ public:
}; };
inline bool isvalid_ds( const unsigned dictionary_size )
{ return ( dictionary_size >= min_dictionary_size &&
dictionary_size <= max_dictionary_size ); }
inline int real_bits( unsigned value ) inline int real_bits( unsigned value )
{ {
int bits = 0; int bits = 0;
@ -91,8 +102,7 @@ struct File_header
bool dictionary_size( const unsigned sz ) bool dictionary_size( const unsigned sz )
{ {
if( sz >= min_dictionary_size && sz <= max_dictionary_size ) if( !isvalid_ds( sz ) ) return false;
{
data[5] = real_bits( sz - 1 ); data[5] = real_bits( sz - 1 );
if( sz > min_dictionary_size ) if( sz > min_dictionary_size )
{ {
@ -104,8 +114,6 @@ struct File_header
} }
return true; return true;
} }
return false;
}
}; };
@ -152,10 +160,10 @@ struct File_trailer
// defined in compress.cc // defined in compress.cc
int readblock( const int fd, uint8_t * const buf, const int size ); int readblock( const int fd, uint8_t * const buf, const int size );
int writeblock( const int fd, const uint8_t * const buf, const int size ); int writeblock( const int fd, const uint8_t * const buf, const int size );
void xinit( pthread_mutex_t * const mutex ); void xinit_mutex( pthread_mutex_t * const mutex );
void xinit( pthread_cond_t * const cond ); void xinit_cond( pthread_cond_t * const cond );
void xdestroy( pthread_mutex_t * const mutex ); void xdestroy_mutex( pthread_mutex_t * const mutex );
void xdestroy( pthread_cond_t * const cond ); void xdestroy_cond( pthread_cond_t * const cond );
void xlock( pthread_mutex_t * const mutex ); void xlock( pthread_mutex_t * const mutex );
void xunlock( pthread_mutex_t * const mutex ); void xunlock( pthread_mutex_t * const mutex );
void xwait( pthread_cond_t * const cond, pthread_mutex_t * const mutex ); void xwait( pthread_cond_t * const cond, pthread_mutex_t * const mutex );
@ -176,7 +184,8 @@ int dec_stdout( const int num_workers, const int infd, const int outfd,
// defined in dec_stream.cc // defined in dec_stream.cc
int dec_stream( const int num_workers, const int infd, const int outfd, int dec_stream( const int num_workers, const int infd, const int outfd,
const Pretty_print & pp, const int debug_level ); const Pretty_print & pp, const int debug_level,
const bool ignore_garbage );
// defined in decompress.cc // defined in decompress.cc
int preadblock( const int fd, uint8_t * const buf, const int size, int preadblock( const int fd, uint8_t * const buf, const int size,
@ -187,7 +196,7 @@ int decompress_read_error( struct LZ_Decoder * const decoder,
const Pretty_print & pp, const int worker_id ); const Pretty_print & pp, const int worker_id );
int decompress( int num_workers, const int infd, const int outfd, int decompress( int num_workers, const int infd, const int outfd,
const Pretty_print & pp, const int debug_level, const Pretty_print & pp, const int debug_level,
const bool infd_isreg ); const bool ignore_garbage, const bool infd_isreg );
// defined in main.cc // defined in main.cc
extern int verbosity; extern int verbosity;
@ -214,9 +223,9 @@ class Slot_tally
public: public:
explicit Slot_tally( const int slots ) explicit Slot_tally( const int slots )
: num_slots( slots ), num_free( slots ) : num_slots( slots ), num_free( slots )
{ xinit( &mutex ); xinit( &slot_av ); } { xinit_mutex( &mutex ); xinit_cond( &slot_av ); }
~Slot_tally() { xdestroy( &slot_av ); xdestroy( &mutex ); } ~Slot_tally() { xdestroy_cond( &slot_av ); xdestroy_mutex( &mutex ); }
bool all_free() { return ( num_free == num_slots ); } bool all_free() { return ( num_free == num_slots ); }

main.cc

@ -1,6 +1,6 @@
/* Plzip - Parallel compressor compatible with lzip /* Plzip - Parallel compressor compatible with lzip
Copyright (C) 2009 Laszlo Ersek. Copyright (C) 2009 Laszlo Ersek.
Copyright (C) 2009-2015 Antonio Diaz Diaz. Copyright (C) 2009-2016 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by it under the terms of the GNU General Public License as published by
@ -67,12 +67,13 @@
#error "Environments where CHAR_BIT != 8 are not supported." #error "Environments where CHAR_BIT != 8 are not supported."
#endif #endif
int verbosity = 0;
namespace { namespace {
const char * const Program_name = "Plzip"; const char * const Program_name = "Plzip";
const char * const program_name = "plzip"; const char * const program_name = "plzip";
const char * const program_year = "2015"; const char * const program_year = "2016";
const char * invocation_name = 0; const char * invocation_name = 0;
struct { const char * from; const char * to; } const known_extensions[] = { struct { const char * from; const char * to; } const known_extensions[] = {
@ -90,9 +91,6 @@ enum Mode { m_compress, m_decompress, m_test };
std::string output_filename; std::string output_filename;
int outfd = -1; int outfd = -1;
const mode_t usr_rw = S_IRUSR | S_IWUSR;
const mode_t all_rw = usr_rw | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
mode_t outfd_mode = usr_rw;
bool delete_output_on_interrupt = false; bool delete_output_on_interrupt = false;
@ -103,15 +101,16 @@ void show_help( const long num_online )
std::printf( "\nOptions:\n" std::printf( "\nOptions:\n"
" -h, --help display this help and exit\n" " -h, --help display this help and exit\n"
" -V, --version output version information and exit\n" " -V, --version output version information and exit\n"
" -a, --trailing-error exit with error status if trailing data\n"
" -B, --data-size=<bytes> set size of input data blocks [2x8=16 MiB]\n" " -B, --data-size=<bytes> set size of input data blocks [2x8=16 MiB]\n"
" -c, --stdout send output to standard output\n" " -c, --stdout write to standard output, keep input files\n"
" -d, --decompress decompress\n" " -d, --decompress decompress\n"
" -f, --force overwrite existing output files\n" " -f, --force overwrite existing output files\n"
" -F, --recompress force re-compression of compressed files\n" " -F, --recompress force re-compression of compressed files\n"
" -k, --keep keep (don't delete) input files\n" " -k, --keep keep (don't delete) input files\n"
" -m, --match-length=<bytes> set match length limit in bytes [36]\n" " -m, --match-length=<bytes> set match length limit in bytes [36]\n"
" -n, --threads=<n> set number of (de)compression threads [%ld]\n" " -n, --threads=<n> set number of (de)compression threads [%ld]\n"
" -o, --output=<file> if reading stdin, place the output into <file>\n" " -o, --output=<file> if reading standard input, write to <file>\n"
" -q, --quiet suppress all messages\n" " -q, --quiet suppress all messages\n"
" -s, --dictionary-size=<bytes> set dictionary size limit in bytes [8 MiB]\n" " -s, --dictionary-size=<bytes> set dictionary size limit in bytes [8 MiB]\n"
" -t, --test test compressed file integrity\n" " -t, --test test compressed file integrity\n"
@ -123,13 +122,15 @@ void show_help( const long num_online )
{ {
std::printf( " -D, --debug=<level> (0-1) print debug statistics to stderr\n" ); std::printf( " -D, --debug=<level> (0-1) print debug statistics to stderr\n" );
} }
std::printf( "If no file names are given, plzip compresses or decompresses\n" std::printf( "If no file names are given, or if a file is '-', plzip compresses or\n"
"from standard input to standard output.\n" "decompresses from standard input to standard output.\n"
"Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,\n" "Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,\n"
"Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...\n" "Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...\n"
"The bidimensional parameter space of LZMA can't be mapped to a linear\n" "Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12\n"
"to 2^29 bytes.\n"
"\nThe bidimensional parameter space of LZMA can't be mapped to a linear\n"
"scale optimal for all files. If your files are large, very repetitive,\n" "scale optimal for all files. If your files are large, very repetitive,\n"
"etc, you may need to use the --match-length and --dictionary-size\n" "etc, you may need to use the --dictionary-size and --match-length\n"
"options directly to achieve optimal performance.\n" "options directly to achieve optimal performance.\n"
"\nExit status: 0 for a normal exit, 1 for environmental problems (file\n" "\nExit status: 0 for a normal exit, 1 for environmental problems (file\n"
"not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or\n" "not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or\n"
@ -190,11 +191,9 @@ unsigned long long getnum( const char * const ptr,
if( !errno && tail[0] ) if( !errno && tail[0] )
{ {
const int factor = ( tail[1] == 'i' ) ? 1024 : 1000; const int factor = ( tail[1] == 'i' ) ? 1024 : 1000;
int exponent = 0; int exponent = 0; // 0 = bad multiplier
bool bad_multiplier = false;
switch( tail[0] ) switch( tail[0] )
{ {
case ' ': break;
case 'Y': exponent = 8; break; case 'Y': exponent = 8; break;
case 'Z': exponent = 7; break; case 'Z': exponent = 7; break;
case 'E': exponent = 6; break; case 'E': exponent = 6; break;
@ -202,13 +201,10 @@ unsigned long long getnum( const char * const ptr,
case 'T': exponent = 4; break; case 'T': exponent = 4; break;
case 'G': exponent = 3; break; case 'G': exponent = 3; break;
case 'M': exponent = 2; break; case 'M': exponent = 2; break;
case 'K': if( factor == 1024 ) exponent = 1; else bad_multiplier = true; case 'K': if( factor == 1024 ) exponent = 1; break;
break; case 'k': if( factor == 1000 ) exponent = 1; break;
case 'k': if( factor == 1000 ) exponent = 1; else bad_multiplier = true;
break;
default : bad_multiplier = true;
} }
if( bad_multiplier ) if( exponent <= 0 )
{ {
show_error( "Bad multiplier in numerical argument.", 0, true ); show_error( "Bad multiplier in numerical argument.", 0, true );
std::exit( 1 ); std::exit( 1 );
@ -238,7 +234,7 @@ int get_dict_size( const char * const arg )
return ( 1 << bits ); return ( 1 << bits );
int dictionary_size = getnum( arg, LZ_min_dictionary_size(), int dictionary_size = getnum( arg, LZ_min_dictionary_size(),
LZ_max_dictionary_size() ); LZ_max_dictionary_size() );
if( dictionary_size == 65535 ) ++dictionary_size; if( dictionary_size == 65535 ) ++dictionary_size; // no fast encoder
return dictionary_size; return dictionary_size;
} }
@ -283,7 +279,7 @@ int open_instream( const char * const name, struct stat * const in_statsp,
const bool can_read = ( i == 0 && const bool can_read = ( i == 0 &&
( S_ISBLK( mode ) || S_ISCHR( mode ) || ( S_ISBLK( mode ) || S_ISCHR( mode ) ||
S_ISFIFO( mode ) || S_ISSOCK( mode ) ) ); S_ISFIFO( mode ) || S_ISSOCK( mode ) ) );
const bool no_ofile = to_stdout || program_mode == m_test; const bool no_ofile = ( to_stdout || program_mode == m_test );
if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || !no_ofile ) ) ) if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || !no_ofile ) ) )
{ {
if( verbosity >= 0 ) if( verbosity >= 0 )
@ -326,13 +322,17 @@ void set_d_outname( const std::string & name, const int i )
} }
bool open_outstream( const bool force ) bool open_outstream( const bool force, const bool from_stdin )
{ {
const mode_t usr_rw = S_IRUSR | S_IWUSR;
const mode_t all_rw = usr_rw | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
const mode_t outfd_mode = from_stdin ? all_rw : usr_rw;
int flags = O_CREAT | O_WRONLY | O_BINARY; int flags = O_CREAT | O_WRONLY | O_BINARY;
if( force ) flags |= O_TRUNC; else flags |= O_EXCL; if( force ) flags |= O_TRUNC; else flags |= O_EXCL;
outfd = open( output_filename.c_str(), flags, outfd_mode ); outfd = open( output_filename.c_str(), flags, outfd_mode );
if( outfd < 0 && verbosity >= 0 ) if( outfd >= 0 ) delete_output_on_interrupt = true;
else if( verbosity >= 0 )
{ {
if( errno == EEXIST ) if( errno == EEXIST )
std::fprintf( stderr, "%s: Output file '%s' already exists, skipping.\n", std::fprintf( stderr, "%s: Output file '%s' already exists, skipping.\n",
@ -403,7 +403,11 @@ void close_and_set_permissions( const struct stat * const in_statsp )
fchmod( outfd, mode & ~( S_ISUID | S_ISGID | S_ISVTX ) ) != 0 ) fchmod( outfd, mode & ~( S_ISUID | S_ISGID | S_ISVTX ) ) != 0 )
warning = true; warning = true;
} }
if( close( outfd ) != 0 ) cleanup_and_fail( 1 ); if( close( outfd ) != 0 )
{
show_error( "Error closing output file", errno );
cleanup_and_fail( 1 );
}
outfd = -1; outfd = -1;
delete_output_on_interrupt = false; delete_output_on_interrupt = false;
if( in_statsp ) if( in_statsp )
@ -435,25 +439,19 @@ void set_signals()
} // end namespace } // end namespace
int verbosity = 0;
void show_error( const char * const msg, const int errcode, const bool help ) void show_error( const char * const msg, const int errcode, const bool help )
{ {
if( verbosity >= 0 ) if( verbosity < 0 ) return;
{
if( msg && msg[0] ) if( msg && msg[0] )
{ {
std::fprintf( stderr, "%s: %s", program_name, msg ); std::fprintf( stderr, "%s: %s", program_name, msg );
if( errcode > 0 ) if( errcode > 0 ) std::fprintf( stderr, ": %s", std::strerror( errcode ) );
std::fprintf( stderr, ": %s", std::strerror( errcode ) );
std::fputc( '\n', stderr ); std::fputc( '\n', stderr );
} }
if( help ) if( help )
std::fprintf( stderr, "Try '%s --help' for more information.\n", std::fprintf( stderr, "Try '%s --help' for more information.\n",
invocation_name ); invocation_name );
} }
}
void internal_error( const char * const msg ) void internal_error( const char * const msg )
@ -473,8 +471,7 @@ void show_progress( const int packet_size,
static const Pretty_print * pp = 0; static const Pretty_print * pp = 0;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
if( verbosity >= 2 ) if( verbosity < 2 ) return;
{
if( p ) // initialize static vars if( p ) // initialize static vars
{ csize = cfile_size; pos = 0; pp = p; } { csize = cfile_size; pos = 0; pp = p; }
if( pp ) if( pp )
@ -488,7 +485,6 @@ void show_progress( const int packet_size,
xunlock( &mutex ); xunlock( &mutex );
} }
} }
}
int main( const int argc, const char * const argv[] ) int main( const int argc, const char * const argv[] )
@ -497,7 +493,7 @@ int main( const int argc, const char * const argv[] )
to the corresponding LZMA compression modes. */ to the corresponding LZMA compression modes. */
const Lzma_options option_mapping[] = const Lzma_options option_mapping[] =
{ {
{ 65535, 16 }, // -0 { 65535, 16 }, // -0 (65535,16 chooses fast encoder)
{ 1 << 20, 5 }, // -1 { 1 << 20, 5 }, // -1
{ 3 << 19, 6 }, // -2 { 3 << 19, 6 }, // -2
{ 1 << 21, 8 }, // -3 { 1 << 21, 8 }, // -3
@ -517,13 +513,15 @@ int main( const int argc, const char * const argv[] )
int num_workers = 0; // start this many worker threads int num_workers = 0; // start this many worker threads
Mode program_mode = m_compress; Mode program_mode = m_compress;
bool force = false; bool force = false;
bool ignore_trailing = true;
bool keep_input_files = false; bool keep_input_files = false;
bool recompress = false; bool recompress = false;
bool to_stdout = false; bool to_stdout = false;
invocation_name = argv[0]; invocation_name = argv[0];
if( LZ_version()[0] != LZ_version_string[0] ) if( LZ_version()[0] < '1' )
internal_error( "bad library version." ); { show_error( "Bad library version. At least lzlib 1.0 is required." );
return 1; }
const long num_online = std::max( 1L, sysconf( _SC_NPROCESSORS_ONLN ) ); const long num_online = std::max( 1L, sysconf( _SC_NPROCESSORS_ONLN ) );
long max_workers = sysconf( _SC_THREAD_THREADS_MAX ); long max_workers = sysconf( _SC_THREAD_THREADS_MAX );
@ -542,6 +540,7 @@ int main( const int argc, const char * const argv[] )
{ '7', 0, Arg_parser::no }, { '7', 0, Arg_parser::no },
{ '8', 0, Arg_parser::no }, { '8', 0, Arg_parser::no },
{ '9', "best", Arg_parser::no }, { '9', "best", Arg_parser::no },
{ 'a', "trailing-error", Arg_parser::no },
{ 'b', "member-size", Arg_parser::yes }, { 'b', "member-size", Arg_parser::yes },
{ 'B', "data-size", Arg_parser::yes }, { 'B', "data-size", Arg_parser::yes },
{ 'c', "stdout", Arg_parser::no }, { 'c', "stdout", Arg_parser::no },
@ -572,28 +571,30 @@ int main( const int argc, const char * const argv[] )
const int code = parser.code( argind ); const int code = parser.code( argind );
if( !code ) break; // no more options if( !code ) break; // no more options
const std::string & arg = parser.argument( argind ); const std::string & arg = parser.argument( argind );
const char * const ptr = arg.c_str();
switch( code ) switch( code )
{ {
case '0': case '1': case '2': case '3': case '4': case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9': case '5': case '6': case '7': case '8': case '9':
encoder_options = option_mapping[code-'0']; break; encoder_options = option_mapping[code-'0']; break;
case 'a': ignore_trailing = false; break;
case 'b': break; case 'b': break;
case 'B': data_size = getnum( arg.c_str(), 2 * LZ_min_dictionary_size(), case 'B': data_size = getnum( ptr, 2 * LZ_min_dictionary_size(),
2 * LZ_max_dictionary_size() ); break; 2 * LZ_max_dictionary_size() ); break;
case 'c': to_stdout = true; break; case 'c': to_stdout = true; break;
case 'd': program_mode = m_decompress; break; case 'd': program_mode = m_decompress; break;
case 'D': debug_level = getnum( arg.c_str(), 0, 3 ); break; case 'D': debug_level = getnum( ptr, 0, 3 ); break;
case 'f': force = true; break; case 'f': force = true; break;
case 'F': recompress = true; break; case 'F': recompress = true; break;
case 'h': show_help( num_online ); return 0; case 'h': show_help( num_online ); return 0;
case 'k': keep_input_files = true; break; case 'k': keep_input_files = true; break;
case 'm': encoder_options.match_len_limit = case 'm': encoder_options.match_len_limit =
getnum( arg.c_str(), LZ_min_match_len_limit(), getnum( ptr, LZ_min_match_len_limit(),
LZ_max_match_len_limit() ); break; LZ_max_match_len_limit() ); break;
case 'n': num_workers = getnum( arg.c_str(), 1, max_workers ); break; case 'n': num_workers = getnum( ptr, 1, max_workers ); break;
case 'o': default_output_filename = arg; break; case 'o': default_output_filename = arg; break;
case 'q': verbosity = -1; break; case 'q': verbosity = -1; break;
case 's': encoder_options.dictionary_size = get_dict_size( arg.c_str() ); case 's': encoder_options.dictionary_size = get_dict_size( ptr );
break; break;
case 'S': break; case 'S': break;
case 't': program_mode = m_test; break; case 't': program_mode = m_test; break;
@ -637,9 +638,10 @@ int main( const int argc, const char * const argv[] )
( filenames_given || default_output_filename.size() ) ) ( filenames_given || default_output_filename.size() ) )
set_signals(); set_signals();
Pretty_print pp( filenames ); Pretty_print pp( filenames, verbosity );
int retval = 0; int retval = 0;
bool stdin_used = false;
for( unsigned i = 0; i < filenames.size(); ++i ) for( unsigned i = 0; i < filenames.size(); ++i )
{ {
struct stat in_stats; struct stat in_stats;
@ -647,6 +649,7 @@ int main( const int argc, const char * const argv[] )
if( filenames[i].empty() || filenames[i] == "-" ) if( filenames[i].empty() || filenames[i] == "-" )
{ {
if( stdin_used ) continue; else stdin_used = true;
input_filename.clear(); input_filename.clear();
infd = STDIN_FILENO; infd = STDIN_FILENO;
if( program_mode != m_test ) if( program_mode != m_test )
@ -658,8 +661,7 @@ int main( const int argc, const char * const argv[] )
if( program_mode == m_compress ) if( program_mode == m_compress )
set_c_outname( default_output_filename ); set_c_outname( default_output_filename );
else output_filename = default_output_filename; else output_filename = default_output_filename;
outfd_mode = all_rw; if( !open_outstream( force, true ) )
if( !open_outstream( force ) )
{ {
if( retval < 1 ) retval = 1; if( retval < 1 ) retval = 1;
close( infd ); infd = -1; close( infd ); infd = -1;
@ -683,8 +685,7 @@ int main( const int argc, const char * const argv[] )
if( program_mode == m_compress ) if( program_mode == m_compress )
set_c_outname( input_filename ); set_c_outname( input_filename );
else set_d_outname( input_filename, eindex ); else set_d_outname( input_filename, eindex );
outfd_mode = usr_rw; if( !open_outstream( force, false ) )
if( !open_outstream( force ) )
{ {
if( retval < 1 ) retval = 1; if( retval < 1 ) retval = 1;
close( infd ); infd = -1; close( infd ); infd = -1;
@ -694,10 +695,12 @@ int main( const int argc, const char * const argv[] )
} }
} }
if( !check_tty( infd, program_mode ) ) return 1; if( !check_tty( infd, program_mode ) )
{
if( retval < 1 ) retval = 1;
cleanup_and_fail( retval );
}
if( output_filename.size() && !to_stdout && program_mode != m_test )
delete_output_on_interrupt = true;
const struct stat * const in_statsp = input_filename.size() ? &in_stats : 0; const struct stat * const in_statsp = input_filename.size() ? &in_stats : 0;
const bool infd_isreg = in_statsp && S_ISREG( in_statsp->st_mode ); const bool infd_isreg = in_statsp && S_ISREG( in_statsp->st_mode );
pp.set_name( input_filename ); pp.set_name( input_filename );
@ -711,7 +714,8 @@ int main( const int argc, const char * const argv[] )
num_workers, infd, outfd, pp, debug_level ); num_workers, infd, outfd, pp, debug_level );
} }
else else
tmp = decompress( num_workers, infd, outfd, pp, debug_level, infd_isreg ); tmp = decompress( num_workers, infd, outfd, pp, debug_level,
ignore_trailing, infd_isreg );
if( tmp > retval ) retval = tmp; if( tmp > retval ) retval = tmp;
if( tmp && program_mode != m_test ) cleanup_and_fail( retval ); if( tmp && program_mode != m_test ) cleanup_and_fail( retval );
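
The `getnum` hunk above replaces the separate `bad_multiplier` flag with a single convention: `exponent` stays 0 unless a valid suffix is recognized. A minimal sketch of that convention follows, assuming a hypothetical helper named `multiplier` that returns the scale factor for a suffix, or 0 for a bad one. It handles only the suffixes up to `T` so the result fits in 64 bits, and it is not the upstream function, which also parses the number and checks it against limits.

```cpp
#include <cassert>

// Hypothetical helper modelled on the switch in getnum.
// Returns the value a suffix multiplies by, 1 for no suffix,
// or 0 for a bad multiplier.
inline unsigned long long multiplier( const char * const tail )
  {
  if( !tail || !tail[0] ) return 1;            // no suffix at all
  // tail[0] is non-zero here, so reading tail[1] is safe (at worst NUL)
  const int factor = ( tail[1] == 'i' ) ? 1024 : 1000;
  int exponent = 0;                            // 0 = bad multiplier
  switch( tail[0] )
    {
    case 'T': exponent = 4; break;
    case 'G': exponent = 3; break;
    case 'M': exponent = 2; break;
    case 'K': if( factor == 1024 ) exponent = 1; break;   // only 'Ki'
    case 'k': if( factor == 1000 ) exponent = 1; break;   // only 'k'
    }
  if( exponent <= 0 ) return 0;
  unsigned long long result = 1;
  for( int i = 0; i < exponent; ++i ) result *= factor;
  return result;
  }

// Hypothetical self-check of the cases listed in the help text.
inline void check_multipliers()
  {
  assert( multiplier( "k" ) == 1000 );
  assert( multiplier( "Ki" ) == 1024 );
  assert( multiplier( "K" ) == 0 );            // 'K' requires the 'i'
  assert( multiplier( " " ) == 0 );            // a blank is no longer accepted
  }
```

Using the sentinel value removes one variable and the `default` branch, because any unrecognized character simply leaves `exponent` at 0.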


@ -1,6 +1,6 @@
#! /bin/sh #! /bin/sh
# check script for Plzip - Parallel compressor compatible with lzip # check script for Plzip - Parallel compressor compatible with lzip
# Copyright (C) 2009-2015 Antonio Diaz Diaz. # Copyright (C) 2009-2016 Antonio Diaz Diaz.
# #
# This script is free software: you have unlimited permission # This script is free software: you have unlimited permission
# to copy, distribute and modify it. # to copy, distribute and modify it.
@ -17,9 +17,16 @@ if [ ! -f "${LZIP}" ] || [ ! -x "${LZIP}" ] ; then
exit 1 exit 1
fi fi
if [ -e "${LZIP}" ] 2> /dev/null ; then true
else
echo "$0: a POSIX shell is required to run the tests"
echo "Try bash -c \"$0 $1 $2\""
exit 1
fi
if [ -d tmp ] ; then rm -rf tmp ; fi if [ -d tmp ] ; then rm -rf tmp ; fi
mkdir tmp mkdir tmp
cd "${objdir}"/tmp cd "${objdir}"/tmp || framework_failure
cat "${testdir}"/test.txt > in || framework_failure cat "${testdir}"/test.txt > in || framework_failure
in_lz="${testdir}"/test.txt.lz in_lz="${testdir}"/test.txt.lz
@ -27,25 +34,22 @@ fail=0
printf "testing plzip-%s..." "$2" printf "testing plzip-%s..." "$2"
"${LZIP}" -cqm4 in > /dev/null "${LZIP}" -fkqm4 in
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cqm274 in > /dev/null "${LZIP}" -fkqm274 in
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cqs-1 in > /dev/null "${LZIP}" -fkqs-1 in
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cqs0 in > /dev/null "${LZIP}" -fkqs0 in
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cqs4095 in > /dev/null "${LZIP}" -fkqs4095 in
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cqs513MiB in > /dev/null "${LZIP}" -fkqs513MiB in
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi
printf " in: Bad magic number (file not in lzip format).\n" > msg "${LZIP}" -tq in
"${LZIP}" -t in 2> out if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
if [ $? = 2 ] && cmp out msg ; then printf . ; else printf - ; fail=1 ; fi "${LZIP}" -tq < in
printf " (stdin): Bad magic number (file not in lzip format).\n" > msg if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -t < in 2> out
if [ $? = 2 ] && cmp out msg ; then printf . ; else printf - ; fail=1 ; fi
rm -f out msg
"${LZIP}" -cdq in "${LZIP}" -cdq in
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cdq < in "${LZIP}" -cdq < in
@ -55,26 +59,53 @@ if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
dd if="${in_lz}" bs=1 count=20 2> /dev/null | "${LZIP}" -tq dd if="${in_lz}" bs=1 count=20 2> /dev/null | "${LZIP}" -tq
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -t "${in_lz}" || fail=1 printf "\ntesting decompression..."
"${LZIP}" -t "${in_lz}"
if [ $? = 0 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cd "${in_lz}" > copy || fail=1 "${LZIP}" -cd "${in_lz}" > copy || fail=1
cmp in copy || fail=1 cmp in copy || fail=1
printf . printf .
rm -f copy
cat "${in_lz}" > copy.lz || framework_failure cat "${in_lz}" > copy.lz || framework_failure
printf "to be overwritten" > copy || framework_failure "${LZIP}" -dk copy.lz || fail=1
"${LZIP}" -df copy.lz || fail=1
cmp in copy || fail=1 cmp in copy || fail=1
printf . printf "to be overwritten" > copy || framework_failure
"${LZIP}" -dq copy.lz
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -df copy.lz
if [ $? = 0 ] && [ ! -e copy.lz ] && cmp in copy ; then
printf . ; else printf - ; fail=1 ; fi
printf "to be overwritten" > copy || framework_failure printf "to be overwritten" > copy || framework_failure
"${LZIP}" -df -o copy < "${in_lz}" || fail=1 "${LZIP}" -df -o copy < "${in_lz}" || fail=1
cmp in copy || fail=1 cmp in copy || fail=1
printf . printf .
rm -f copy
"${LZIP}" < in > anyothername || fail=1 "${LZIP}" < in > anyothername || fail=1
"${LZIP}" -d anyothername || fail=1 "${LZIP}" -d -o copy - anyothername - < "${in_lz}"
cmp in anyothername.out || fail=1 if [ $? = 0 ] && cmp in copy && cmp in anyothername.out ; then
printf . printf . ; else printf - ; fail=1 ; fi
rm -f copy anyothername.out
"${LZIP}" -tq in "${in_lz}"
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -tq foo.lz "${in_lz}"
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cdq in "${in_lz}" > copy
if [ $? = 2 ] && cat copy in | cmp in - ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cdq foo.lz "${in_lz}" > copy
if [ $? = 1 ] && cmp in copy ; then printf . ; else printf - ; fail=1 ; fi
rm -f copy
cat "${in_lz}" > copy.lz || framework_failure
"${LZIP}" -dq in copy.lz
if [ $? = 2 ] && [ -e copy.lz ] && [ ! -e copy ] && [ ! -e in.out ] ; then
printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -dq foo.lz copy.lz
if [ $? = 1 ] && [ ! -e copy.lz ] && [ ! -e foo ] && cmp in copy ; then
printf . ; else printf - ; fail=1 ; fi
cat in in > in2 || framework_failure cat in in > in2 || framework_failure
"${LZIP}" -o copy2 < in2 || fail=1 "${LZIP}" -o copy2 < in2 || fail=1
@ -84,12 +115,23 @@ cmp in2 copy2 || fail=1
printf . printf .
printf "garbage" >> copy2.lz || framework_failure printf "garbage" >> copy2.lz || framework_failure
rm -f copy2
"${LZIP}" -atq copy2.lz
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -atq < copy2.lz
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -adkq copy2.lz
if [ $? = 2 ] && [ ! -e copy2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -adkq -o copy2 < copy2.lz
if [ $? = 2 ] && [ ! -e copy2 ] ; then printf . ; else printf - ; fail=1 ; fi
printf "to be overwritten" > copy2 || framework_failure printf "to be overwritten" > copy2 || framework_failure
"${LZIP}" -df copy2.lz || fail=1 "${LZIP}" -df copy2.lz || fail=1
cmp in2 copy2 || fail=1 cmp in2 copy2 || fail=1
printf . printf .
"${LZIP}" -cfq "${in_lz}" > out printf "\ntesting compression..."
"${LZIP}" -cfq "${in_lz}" > out # /dev/null is a tty on OS/2
if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cF "${in_lz}" > out || fail=1 "${LZIP}" -cF "${in_lz}" > out || fail=1
"${LZIP}" -cd out | "${LZIP}" -d > copy || fail=1 "${LZIP}" -cd out | "${LZIP}" -d > copy || fail=1
@ -162,10 +204,10 @@ printf .
dd if="${in_lz}" bs=1024 count=6 > trunc.lz 2> /dev/null || framework_failure dd if="${in_lz}" bs=1024 count=6 > trunc.lz 2> /dev/null || framework_failure
"${LZIP}" -tq trunc.lz "${LZIP}" -tq trunc.lz
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cdq trunc.lz > out
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -tq < trunc.lz "${LZIP}" -tq < trunc.lz
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -cdq trunc.lz > out
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi
"${LZIP}" -dq < trunc.lz > out "${LZIP}" -dq < trunc.lz > out
if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi