Merging upstream version 1.25~rc1.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 1d67e88e3c
commit b8e73cb85f
39 changed files with 978 additions and 742 deletions
51
ChangeLog
@@ -1,3 +1,15 @@
+2024-11-18 Antonio Diaz Diaz <antonio@gnu.org>
+
+* Version 1.25-rc1 released.
+* byte_repair.cc: Repair a nonzero first LZMA byte.
+* Integrate options '--ignore-empty' and '--ignore-nonzero' into
+'-i, --ignore-errors'.
+* merge.cc (copy_file): Add name arguments, use 'show_file_error'.
+* lziprecover.texi: New chapter 'Syntax of command-line arguments'.
+* check.sh: Use 'cp' instead of 'cat'.
+* testsuite: Add fox_nz.lz, fox6_b1nz.lz.
+Remove fox6.lz, fox6_nz.lz, test_em.txt.lz, test_3m.txt.lz.md5.
+
 2024-10-01 Antonio Diaz Diaz <antonio@gnu.org>
 
 * Version 1.25-pre1 released.
@@ -7,7 +19,7 @@
 * New options '--ignore-empty' and '--ignore-nonzero'.
 * Rename option '--clear-marking' to '--nonzero-repair'.
 * Remove options '--empty-error' and '--marking-error'.
-* Remove decompression support for Sync Flush marker.
+* decoder.cc (decode_member): Remove support for Sync Flush marker.
 * testsuite: Require lzip/clzip. Add fox6_nz.lz. Remove fox6_mark.lz.
 
 2024-01-20 Antonio Diaz Diaz <antonio@gnu.org>
@@ -134,11 +146,10 @@
 * repair.cc: Repair a damaged dictionary size in the header.
 * repair.cc: Try bytes at offsets 7 to 11 first.
 * Decompression time has been reduced by 2%.
-* main.cc (decompress): Print up to 6 bytes of trailing data when
-'-tvvvv' is specified.
-* decoder.cc (verify_trailer): Remove test of final code.
 * main.cc (main): Delete '--output' file if infd is a terminal.
-* main.cc (main): Don't use stdin more than once.
+(main): Don't use stdin more than once.
+(decompress): Print 6 bytes of trailing data at verbosity level 4.
+* decoder.cc (verify_trailer): Remove test of final code.
 * Use 'close_and_set_permissions' and 'set_signals' in all modes.
 * range_dec.cc (list_file): Show dictionary size and size of
 trailing data (if any) with '-lv'.
@@ -154,8 +165,7 @@
 * lziprecover.texi: New chapter 'Trailing data'.
 * configure: Avoid warning on some shells when testing for g++.
 * Makefile.in: Detect the existence of install-info.
-* check.sh: Don't check error messages.
-* check.sh: A POSIX shell is required to run the tests.
+* check.sh: Require a POSIX shell. Don't check error messages.
 
 2015-05-28 Antonio Diaz Diaz <antonio@gnu.org>
 
@@ -192,11 +202,11 @@
 * Option '-l, --list' now accepts more than one file.
 * Decompression time has been reduced by 12%.
 * split.cc: Use as few digits as possible in file names.
-* split.cc: In verbose mode show names of files being created.
+In verbose mode show names of files being created.
 * main.cc (show_header): Show header version if verbosity >= 4.
+(main): Use 'setmode' instead of '_setmode' on Windows and OS/2.
 * configure: Options now accept a separate argument.
 * Makefile.in: New targets 'install-as-lzip' and 'install-bin'.
-* main.cc: Use 'setmode' instead of '_setmode' on Windows and OS/2.
 
 2012-02-24 Antonio Diaz Diaz <ant_diaz@teleline.es>
 
@@ -207,8 +217,7 @@
 * lziprecover.cc: Rename to main.cc.
 * New files merge.cc, repair.cc, split.cc, and range_dec.cc.
 * main.cc: Add decompressor options (-c, -d, -k, -t) so that an
-external decompressor is not needed for recovery nor for
-"make check".
+external decompressor is not needed for recovery and 'make check'.
 * New option '-D, --range-decompress', which extracts a range of
 bytes decompressing only the members containing the desired data.
 * New option '-l, --list', which prints correct total file sizes
@@ -223,25 +232,23 @@
 * Version 1.12 released.
 * lziprecover.cc: If '-v' is not specified show errors only.
 * unzcrash.cc: Use Arg_parser.
-* unzcrash.cc: New options '-b, --bits', '-p, --position', and
-'-s, --size'.
+New options '-b, --bits', '-p, --position', and '-s, --size'.
 
 2010-09-16 Antonio Diaz Diaz <ant_diaz@teleline.es>
 
 * Version 1.11 released.
-* lziprecover.cc: New option '-m, --merge', which tries to produce a
-correct file by merging the good parts of two or more damaged copies.
-* lziprecover.cc: New option '-R, --repair' for repairing a
-1-byte error in single-member files.
 * decoder.cc (decode_member): Detect file errors earlier to improve
 efficiency of lziprecover's new repair capability.
 This change also prevents (harmless) access to uninitialized
 memory when decompressing a corrupt file.
-* lziprecover.cc: New options '-f, --force' and '-o, --output'.
-* lziprecover.cc: New option '-s, --split' to select the until now
-only operation of splitting multimember files.
-* lziprecover.cc: If no operation is specified, warn the user and do
-nothing.
+* lziprecover.cc: New option '-m, --merge', which tries to produce a
+correct file by merging the good parts of two or more damaged copies.
+New option '-R, --repair' for repairing a 1-byte error in
+single-member files.
+New options '-f, --force' and '-o, --output'.
+New option '-s, --split' to select the until now only operation of
+splitting multimember files.
+If no operation is specified, warn the user and do nothing.
 
 2009-06-22 Antonio Diaz Diaz <ant_diaz@teleline.es>
 
@@ -2,8 +2,8 @@
 DISTNAME = $(pkgname)-$(pkgversion)
 INSTALL = install
 INSTALL_PROGRAM = $(INSTALL) -m 755
-INSTALL_DATA = $(INSTALL) -m 644
 INSTALL_DIR = $(INSTALL) -d -m 755
+INSTALL_DATA = $(INSTALL) -m 644
 SHELL = /bin/sh
 CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
 
@@ -150,11 +150,9 @@ dist : doc
 $(DISTNAME)/testsuite/test.txt \
 $(DISTNAME)/testsuite/test21636.txt \
 $(DISTNAME)/testsuite/test_bad[6-9].txt \
-$(DISTNAME)/testsuite/test_3m.txt.lz.md5 \
 $(DISTNAME)/testsuite/fox.lz \
 $(DISTNAME)/testsuite/fox_*.lz \
-$(DISTNAME)/testsuite/fox6.lz \
-$(DISTNAME)/testsuite/fox6_nz.lz \
+$(DISTNAME)/testsuite/fox6_b1nz.lz \
 $(DISTNAME)/testsuite/fox6_sc[1-6].lz \
 $(DISTNAME)/testsuite/fox6_bad[1-6].lz \
 $(DISTNAME)/testsuite/numbers.lz \
@@ -162,7 +160,6 @@ dist : doc
 $(DISTNAME)/testsuite/test.txt.lz \
 $(DISTNAME)/testsuite/test.txt.lzma \
 $(DISTNAME)/testsuite/test_bad[1-9].lz \
-$(DISTNAME)/testsuite/test_em.txt.lz \
 $(DISTNAME)/testsuite/test.txt.lz.fec \
 $(DISTNAME)/testsuite/test.txt.lz.fec16
 rm -f $(DISTNAME)
19
NEWS
@@ -12,20 +12,21 @@ The option '--fec-file', which sets the fec file to be used, has been added.
 The options '-r, --recursive' and '-R, --dereference-recursive' have been
 added for recursive creation and reading of fec files.
 
-The short name of option '--byte-repair' has been changed to "-B".
+The short name of option '--byte-repair' has been changed to '-B'.
 
-The option '--ignore-empty', which makes lziprecover ignore empty members in
-multimember files when decompressing, testing, or listing, has been added.
-By default lziprecover now exits with error status 2 if any empty member is
-found in a multimember file.
+The option '--byte-repair' now repairs a nonzero first LZMA byte.
 
-The option '--ignore-nonzero', which makes lziprecover ignore a nonzero
-first byte in the LZMA stream when decompressing or testing, has been added.
-By default lziprecover now exits with error status 2 if the first LZMA byte
-is nonzero in any member of the input files.
+When decompressing, testing, or listing, lziprecover now exits with error
+status 2 if any empty member is found in a regular multimember file unless
+'-i' is given.
+
+When decompressing or testing, lziprecover now exits with error status 2 if
+the first byte of the LZMA stream is not 0 unless '-i' is given.
 
 The option '--clear-marking' has been renamed to '--nonzero-repair'.
 
 Options '--empty-error' and '--marking-error' have been removed.
 
+The chapter 'Syntax of command-line arguments' has been added to the manual.
+
 Lzip 1.16 (or clzip 1.6) or newer is required to run the tests.
@@ -52,7 +52,7 @@ uint8_t * read_file( const int infd, long * const file_sizep,
 if( buffer_size >= LONG_MAX )
 { show_file_error( filename, large_file_msg );
 std::free( buffer ); return 0; }
-buffer_size = ( buffer_size <= LONG_MAX / 2 ) ? 2 * buffer_size : LONG_MAX;
+buffer_size = (buffer_size <= LONG_MAX / 2) ? 2 * buffer_size : LONG_MAX;
 uint8_t * const tmp = (uint8_t *)std::realloc( buffer, buffer_size );
 if( !tmp ) { std::free( buffer ); throw std::bad_alloc(); }
 buffer = tmp;
@@ -143,8 +143,8 @@ int alone_to_lz( const int infd, const Pretty_print & pp )
 { pp( "conversion failed" ); std::free( buffer ); return 2; }
 if( writeblock( outfd, buffer + offset, lzip_size ) != lzip_size )
 {
-show_error( "Error writing output file", errno );
-std::free( buffer ); return 1;
+show_file_error( printable_name( output_filename, false ), write_error_msg,
+errno ); std::free( buffer ); return 1;
 }
 std::free( buffer );
 if( verbosity >= 1 ) std::fputs( "done\n", stderr );
@@ -58,6 +58,19 @@ bool gross_damage( const uint8_t * const mbuffer, const long msize )
 }
 
 
+// Return value: 0 = errors remain, 6 = repaired pos
+int repair_nonzero( uint8_t * const mbuffer, const long msize )
+{
+mbuffer[6] = 0;
+const Lzip_header & header = *(Lzip_header *)mbuffer;
+const unsigned dictionary_size = header.dictionary_size();
+if( !isvalid_ds( dictionary_size ) ) return 0;
+LZ_mtester mtester( mbuffer, msize, dictionary_size );
+if( mtester.test_member() == 0 ) return 6;
+return 0;
+}
+
+
 // Return value: 0 = no change, 5 = repaired pos
 int repair_dictionary_size( uint8_t * const mbuffer, const long msize )
 {
@@ -155,6 +168,15 @@ long repair_member( uint8_t * const mbuffer, const long long mpos,
 } // end namespace
 
 
+bool safe_seek( const int fd, const long long pos,
+const std::string & filename )
+{
+if( lseek( fd, pos, SEEK_SET ) == pos ) return true;
+show_file_error( filename.c_str(), "Seek error", errno );
+return false;
+}
+
+
 long seek_write( const int fd, const uint8_t * const buf, const long size,
 const long long pos )
 {
@@ -165,16 +187,16 @@ long seek_write( const int fd, const uint8_t * const buf, const long size,
 
 
 uint8_t * read_member( const int infd, const long long mpos,
-const long long msize, const char * const filename )
+const long long msize, const std::string & filename )
 {
 if( msize <= 0 || msize > LONG_MAX )
-{ show_file_error( filename,
+{ show_file_error( filename.c_str(),
 "Input file contains member larger than LONG_MAX." ); return 0; }
 if( !safe_seek( infd, mpos, filename ) ) return 0;
 uint8_t * const buffer = new uint8_t[msize];
 
 if( readblock( infd, buffer, msize ) != msize )
-{ show_file_error( filename, read_error_msg, errno );
+{ show_file_error( filename.c_str(), read_error_msg, errno );
 delete[] buffer; return 0; }
 return buffer;
 }
@@ -206,8 +228,10 @@ int byte_repair( const std::string & input_filename,
 const long long msize = lzip_index.mblock( i ).size();
 if( !safe_seek( infd, mpos, filename ) ) cleanup_and_fail( 1 );
 long long failure_pos = 0;
-if( test_member_from_file( infd, msize, &failure_pos ) == 0 ) continue;
-if( failure_pos < Lzip_header::size ) // End Of File
+bool nonzero = false;
+const int ret = test_member_from_file( infd, msize, &failure_pos, &nonzero );
+if( ret == 0 && !nonzero ) continue;
+if( ret != 0 && failure_pos < Lzip_header::size ) // End Of File
 { show_error( "Can't repair error in input file." );
 cleanup_and_fail( 2 ); }
 if( failure_pos >= msize - 8 ) failure_pos = msize - 8 - 1;
@@ -218,14 +242,17 @@ int byte_repair( const std::string & input_filename,
 i + 1, lzip_index.members(), mpos + failure_pos );
 std::fflush( stdout );
 }
-uint8_t * const mbuffer = read_member( infd, mpos, msize, filename );
+uint8_t * const mbuffer = read_member( infd, mpos, msize, input_filename );
 if( !mbuffer ) cleanup_and_fail( 1 );
 const Lzip_header & header = *(const Lzip_header *)mbuffer;
 const unsigned dictionary_size = header.dictionary_size();
 long pos = 0;
+if( !nonzero && mbuffer[6] != 0 ) nonzero = true; // bad DS
 if( !gross_damage( mbuffer, msize ) )
 {
-pos = repair_dictionary_size( mbuffer, msize );
+if( nonzero ) pos = repair_nonzero( mbuffer, msize );
+if( pos == 0 )
+pos = repair_dictionary_size( mbuffer, msize );
 if( pos == 0 )
 pos = repair_member( mbuffer, mpos, msize, header.size + 1,
 header.size + 6, dictionary_size, terminator );
@@ -243,12 +270,14 @@ int byte_repair( const std::string & input_filename,
 if( !safe_seek( infd, 0, filename ) ) return 1;
 set_signal_handler();
 if( !open_outstream( true, true, false, true, to_file ) ) return 1;
-if( !copy_file( infd, outfd ) ) // copy whole file
-cleanup_and_fail( 1 );
+if( !copy_file( infd, outfd, input_filename, output_filename ) )
+cleanup_and_fail( 1 ); // copy whole file
 }
-if( seek_write( outfd, mbuffer + pos, 1, mpos + pos ) != 1 )
-{ show_error( "Error writing output file", errno );
-cleanup_and_fail( 1 ); }
+if( ( nonzero && pos != 6 &&
+seek_write( outfd, mbuffer + 6, 1, mpos + 6 ) != 1 ) ||
+seek_write( outfd, mbuffer + pos, 1, mpos + pos ) != 1 )
+{ show_file_error( printable_name( output_filename, false ),
+write_error_msg, errno ); cleanup_and_fail( 1 ); }
 }
 delete[] mbuffer;
 if( pos == 0 )
@@ -272,24 +301,24 @@ int byte_repair( const std::string & input_filename,
 }
 
 
-int debug_delay( const char * const input_filename,
+int debug_delay( const std::string & input_filename,
 const Cl_options & cl_opts, Block range,
 const char terminator )
 {
+const char * const filename = input_filename.c_str();
 struct stat in_stats; // not used
-const int infd = open_instream( input_filename, &in_stats, false, true );
+const int infd = open_instream( filename, &in_stats, false, true );
 if( infd < 0 ) return 1;
 
 const Lzip_index lzip_index( infd, cl_opts );
 if( lzip_index.retval() != 0 )
-{ show_file_error( input_filename, lzip_index.error().c_str() );
+{ show_file_error( filename, lzip_index.error().c_str() );
 return lzip_index.retval(); }
 
 if( range.end() > lzip_index.cdata_size() )
 range.size( std::max( 0LL, lzip_index.cdata_size() - range.pos() ) );
 if( range.size() <= 0 )
-{ show_file_error( input_filename, "Nothing to do; range is empty." );
-return 0; }
+{ show_file_error( filename, "Nothing to do; range is empty." ); return 0; }
 
 for( long i = 0; i < lzip_index.members(); ++i )
 {
@@ -355,24 +384,25 @@ int debug_delay( const char * const input_filename,
 }
 
 
-int debug_byte_repair( const char * const input_filename,
+int debug_byte_repair( const std::string & input_filename,
 const Cl_options & cl_opts, const Bad_byte & bad_byte,
 const char terminator )
 {
+const char * const filename = input_filename.c_str();
 struct stat in_stats; // not used
-const int infd = open_instream( input_filename, &in_stats, false, true );
+const int infd = open_instream( filename, &in_stats, false, true );
 if( infd < 0 ) return 1;
 
 const Lzip_index lzip_index( infd, cl_opts );
 if( lzip_index.retval() != 0 )
-{ show_file_error( input_filename, lzip_index.error().c_str() );
+{ show_file_error( filename, lzip_index.error().c_str() );
 return lzip_index.retval(); }
 
 long idx = 0;
 for( ; idx < lzip_index.members(); ++idx )
 if( lzip_index.mblock( idx ).includes( bad_byte.pos ) ) break;
 if( idx >= lzip_index.members() )
-{ show_file_error( input_filename, "Nothing to do; byte is beyond EOF." );
+{ show_file_error( filename, "Nothing to do; byte is beyond EOF." );
 return 0; }
 
 const long long mpos = lzip_index.mblock( idx ).pos();
@@ -392,11 +422,12 @@ int debug_byte_repair( const char * const input_filename,
 if( !mbuffer ) return 1;
 const Lzip_header & header = *(const Lzip_header *)mbuffer;
 const unsigned dictionary_size = header.dictionary_size();
-const uint8_t good_value = mbuffer[bad_byte.pos-mpos];
+const long long bad_pos = bad_byte.pos - mpos;
+const uint8_t good_value = mbuffer[bad_pos];
 const uint8_t bad_value = bad_byte( good_value );
-mbuffer[bad_byte.pos-mpos] = bad_value;
+mbuffer[bad_pos] = bad_value;
 long failure_pos = 0;
-if( bad_byte.pos != 5 || isvalid_ds( header.dictionary_size() ) )
+if( bad_pos != 5 || isvalid_ds( header.dictionary_size() ) )
 {
 LZ_mtester mtester( mbuffer, msize, header.dictionary_size() );
 if( mtester.test_member() == 0 && mtester.finished() )
@@ -419,6 +450,8 @@ int debug_byte_repair( const char * const input_filename,
 }
 if( failure_pos >= msize ) failure_pos = msize - 1;
 long pos = repair_dictionary_size( mbuffer, msize );
 if( pos == 0 )
+if( mbuffer[6] != 0 ) pos = repair_nonzero( mbuffer, msize );
+if( pos == 0 )
 pos = repair_member( mbuffer, mpos, msize, header.size + 1,
 header.size + 6, dictionary_size, terminator );
@@ -441,21 +474,21 @@ int debug_byte_repair( const char * const input_filename,
 (Packet sizes are a fractionary number of bytes. The packet and marker
 sizes shown by option -X are the number of extra bytes required to decode
 the packet, not counting the data present in the range decoder before and
-after the decoding. The max marker size of a 'Sync Flush marker' does not
-include the 5 bytes read by rdec.load).
+after the decoding.
 if bad_byte.pos >= cdata_size, bad_byte is ignored.
 */
-int debug_decompress( const char * const input_filename,
+int debug_decompress( const std::string & input_filename,
 const Cl_options & cl_opts, const Bad_byte & bad_byte,
 const bool show_packets )
 {
+const char * const filename = input_filename.c_str();
 struct stat in_stats;
-const int infd = open_instream( input_filename, &in_stats, false, true );
+const int infd = open_instream( filename, &in_stats, false, true );
 if( infd < 0 ) return 1;
 
 const Lzip_index lzip_index( infd, cl_opts );
 if( lzip_index.retval() != 0 )
-{ show_file_error( input_filename, lzip_index.error().c_str() );
+{ show_file_error( filename, lzip_index.error().c_str() );
 return lzip_index.retval(); }
 
 outfd = show_packets ? -1 : STDOUT_FILENO;
2
configure
vendored
@@ -6,7 +6,7 @@
 # to copy, distribute, and modify it.
 
 pkgname=lziprecover
-pkgversion=1.25-pre1
+pkgversion=1.25-rc1
 progname=lziprecover
 srctrigger=doc/${pkgname}.texi
 
28
decoder.cc
@@ -77,7 +77,7 @@ bool Range_decoder::read_block()
 {
 stream_pos = readblock( infd, buffer, buffer_size );
 if( stream_pos != buffer_size && errno ) throw Error( read_error_msg );
-at_stream_end = ( stream_pos < buffer_size );
+at_stream_end = stream_pos < buffer_size;
 partial_member_pos += pos;
 pos = 0;
 show_dprogress();
@@ -99,7 +99,7 @@ void LZ_decoder::flush_data()
 const long long s =
 std::min( positive_diff( outend, sp ), (unsigned long long)size ) - i;
 if( s > 0 && writeblock( outfd, buffer + stream_pos + i, s ) != s )
-throw Error( "Write error" );
+throw Error( write_error_msg );
 }
 if( pos >= dictionary_size )
 { partial_data_pos += pos; pos = 0; pos_wrapped = true; }
@@ -180,7 +180,8 @@ bool LZ_decoder::check_trailer( const Pretty_print & pp ) const
 /* Return value: 0 = OK, 1 = decoder error, 2 = unexpected EOF,
 3 = trailer error, 4 = unknown marker found,
 5 = nonzero first LZMA byte found. */
-int LZ_decoder::decode_member( const Pretty_print & pp, const bool ignore_nonzero )
+int LZ_decoder::decode_member( const Pretty_print & pp,
+const bool ignore_nonzero )
 {
 Bit_model bm_literal[1<<literal_context_bits][0x300];
 Bit_model bm_match[State::states][pos_states];
@@ -244,22 +245,22 @@ int LZ_decoder::decode_member( const Pretty_print & pp, const bool ignore_nonzer
 }
 else // match
 {
+rep3 = rep2; rep2 = rep1; rep1 = rep0;
 len = rdec.decode_len( match_len_model, pos_state );
-unsigned distance = rdec.decode_tree6( bm_dis_slot[get_len_state(len)] );
-if( distance >= start_dis_model )
+rep0 = rdec.decode_tree6( bm_dis_slot[get_len_state(len)] );
+if( rep0 >= start_dis_model )
 {
-const unsigned dis_slot = distance;
+const unsigned dis_slot = rep0;
 const int direct_bits = ( dis_slot >> 1 ) - 1;
-distance = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
 if( dis_slot < end_dis_model )
-distance += rdec.decode_tree_reversed(
-bm_dis + ( distance - dis_slot ), direct_bits );
+rep0 += rdec.decode_tree_reversed( bm_dis + ( rep0 - dis_slot ),
+direct_bits );
 else
 {
-distance +=
-rdec.decode( direct_bits - dis_align_bits ) << dis_align_bits;
-distance += rdec.decode_tree_reversed4( bm_align );
-if( distance == 0xFFFFFFFFU ) // marker found
+rep0 += rdec.decode( direct_bits - dis_align_bits ) << dis_align_bits;
+rep0 += rdec.decode_tree_reversed4( bm_align );
+if( rep0 == 0xFFFFFFFFU ) // marker found
 {
 rdec.normalize();
 flush_data();
@@ -271,7 +272,6 @@ int LZ_decoder::decode_member( const Pretty_print & pp, const bool ignore_nonzer
 }
 }
 }
-rep3 = rep2; rep2 = rep1; rep1 = rep0; rep0 = distance;
 state.set_match();
 if( rep0 >= dictionary_size || ( rep0 >= pos && !pos_wrapped ) )
 { flush_data(); return 1; }
20
decoder.h
@@ -26,6 +26,7 @@ class Range_decoder
 uint32_t range;
 const int infd; // input file descriptor
 bool at_stream_end;
+bool nonzero_;
 
 bool read_block();
 
@@ -42,11 +43,12 @@ public:
 code( 0 ),
 range( 0xFFFFFFFFU ),
 infd( ifd ),
-at_stream_end( false )
+at_stream_end( false ), nonzero_( false )
 {}
 
 ~Range_decoder() { delete[] buffer; }
 
+bool nonzero() const { return nonzero_; }
 unsigned get_code() const { return code; }
 bool finished() { return pos >= stream_pos && !read_block(); }
 
@@ -110,8 +112,10 @@ public:
 {
 code = 0;
 range = 0xFFFFFFFFU;
-// check first byte of the LZMA stream
-if( get_byte() != 0 && !ignore_nonzero ) return false;
+// check first byte of the LZMA stream without reading it
+nonzero_ = buffer[pos] != 0;
+if( nonzero_ && !ignore_nonzero ) return false;
+get_byte(); // discard first byte of the LZMA stream
 for( int i = 0; i < 4; ++i ) code = ( code << 8 ) | get_byte();
 return true;
 }
@@ -131,7 +135,7 @@ public:
 range >>= 1;
 // symbol <<= 1;
 // if( code >= range ) { code -= range; symbol |= 1; }
-const bool bit = ( code >= range );
+const bool bit = code >= range;
 symbol <<= 1; symbol += bit;
 code -= range & ( 0U - bit );
 }
@@ -329,14 +333,14 @@ class LZ_decoder
 bool fast, fast2;
 if( lpos > distance )
 {
-fast = ( len < dictionary_size - lpos );
-fast2 = ( fast && len <= lpos - i );
+fast = len < dictionary_size - lpos;
+fast2 = fast && len <= lpos - i;
 }
 else
 {
 i += dictionary_size;
-fast = ( len < dictionary_size - i ); // (i == pos) may happen
-fast2 = ( fast && len <= i - lpos );
+fast = len < dictionary_size - i; // (i == pos) may happen
+fast2 = fast && len <= i - lpos;
 }
 if( fast ) // no wrap
 {
@@ -1,5 +1,5 @@
 .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2.
-.TH LZIPRECOVER "1" "October 2024" "lziprecover 1.25-pre1" "User Commands"
+.TH LZIPRECOVER "1" "November 2024" "lziprecover 1.25-rc1" "User Commands"
 .SH NAME
 lziprecover \- recovers data from damaged lzip files
 .SH SYNOPSIS
@@ -119,12 +119,6 @@ remove members, tdata from files in place
 \fB\-\-strip=\fR<list>:d:e:t
 copy files to stdout stripping members given
 .TP
-\fB\-\-ignore\-empty\fR
-ignore empty members in multimember files
-.TP
-\fB\-\-ignore\-nonzero\fR
-ignore a nonzero first LZMA byte
-.TP
 \fB\-\-loose\-trailing\fR
 allow trailing data seeming corrupt header
 .TP
@@ -12,12 +12,13 @@ File: lziprecover.info, Node: Top, Next: Introduction, Up: (dir)
 Lziprecover Manual
 ******************
 
-This manual is for Lziprecover (version 1.25-pre1, 1 October 2024).
+This manual is for Lziprecover (version 1.25-rc1, 18 November 2024).
 
 * Menu:
 
 * Introduction:: Purpose and features of lziprecover
 * Invoking lziprecover:: Command-line interface
+* Argument syntax:: By convention, options start with a hyphen
 * File format:: Detailed format of the compressed file
 * Data safety:: Protecting data from accidental loss
 * Fec files:: Forward Error Correction
@@ -112,8 +113,9 @@ pdlzip.
 
 If the cause of file corruption is a damaged medium, the combination
 GNU ddrescue + lziprecover is the recommended option for recovering data
-from damaged lzip files. *Note ddrescue-example::, and *note
-ddrescue-example2::, for examples.
+from damaged files. *Note ddrescue-example::, *note ddrescue-example2::, and
+*note ddrescue-example3::, for examples. *Note GNU ddrescue manual:
+(ddrescue)Top, for details about ddrescue.
 
 If a file is too damaged for lziprecover to repair it, all the
 recoverable data in all members of the file can be extracted with the
@@ -135,7 +137,7 @@ have been compressed. Decompressed is used to refer to data which have
 undergone the process of decompression.
 
 
-File: lziprecover.info, Node: Invoking lziprecover, Next: File format, Prev: Introduction, Up: Top
+File: lziprecover.info, Node: Invoking lziprecover, Next: Argument syntax, Prev: Introduction, Up: Top
 
 2 Invoking lziprecover
 **********************
@@ -150,8 +152,7 @@ first time it appears in the command line. If no file names are specified,
 lziprecover decompresses from standard input to standard output. Remember
 to prepend './' to any file name beginning with a hyphen, or use '--'.
 
-lziprecover supports the following options: *Note Argument syntax:
-(arg_parser)Argument syntax.
+lziprecover supports the following options: *Note Argument syntax::.
 
 '-h'
 '--help'
@@ -175,7 +176,7 @@ lziprecover supports the following options: *Note Argument syntax:
 dictionary size of the resulting file (and therefore the amount of
 memory required to decompress it). Only streamed files with default
 LZMA properties can be converted; non-streamed lzma-alone files lack
-the "End Of Stream" marker required in lzip files.
+the 'End Of Stream' marker required in lzip files.
 
 The name of the converted lzip file is derived from that of the
 original lzma-alone file as follows:
@@ -215,23 +216,24 @@ lziprecover supports the following options: *Note Argument syntax:
 status 1. If a file fails to decompress, or is a terminal, lziprecover
 exits immediately with error status 2 without decompressing the rest
 of the files. A terminal is considered an uncompressed file, and
-therefore invalid.
+therefore invalid. A multimember file with one or more empty members
+is accepted if redirected to standard input or if '-i' is given.
 
 '-D RANGE'
 '--range-decompress=RANGE'
 Decompress only a range of bytes starting at decompressed byte position
-BEGIN and up to byte position END - 1. Byte positions start at 0. This
-option provides random access to the data in multimember files; it
-only decompresses the members containing the desired data. In order to
-guarantee the correctness of the data produced, all members containing
-any part of the desired data are decompressed and their integrity is
-checked.
+BEGIN and up to byte position END - 1. Byte positions start at 0. The
+bytes produced are sent to standard output unless the option '-o' is
+used. This option provides random access to the data in multimember
+files; it only decompresses the members containing the desired data.
+In order to guarantee the correctness of the data produced, all
+members containing any part of the desired data are decompressed and
+their integrity is checked.
 
 Four formats of RANGE are recognized, 'BEGIN', 'BEGIN-END',
 'BEGIN,SIZE', and ',SIZE'. If only BEGIN is specified, END is taken as
 the end of the file. If only SIZE is specified, BEGIN is taken as the
-beginning of the file. The bytes produced are sent to standard output
-unless the option '--output' is used.
+beginning of the file.
 
 '-e'
 '--reproduce'
@@ -325,7 +327,8 @@ lziprecover supports the following options: *Note Argument syntax:
 
 '-k'
 '--keep'
-Keep (don't delete) input files during decompression.
+Keep (don't delete) input files during decompression or conversion from
+lzma-alone.
 
 '-l'
 '--list'
@@ -336,9 +339,11 @@ lziprecover supports the following options: *Note Argument syntax:
 '-v', the dictionary size, the number of members in the file, and the
 amount of trailing data (if any) are also printed. With '-vv', the
 positions and sizes of each member in multimember files are also
-printed. With '-i', format errors are ignored, and with '-ivv', gaps
-between members are shown. The member numbers shown coincide with the
-file numbers produced by '--split'.
+printed. A multimember file with one or more empty members is accepted
+if redirected to standard input or if '-i' is given. With '-i', format
+errors are ignored, and with '-ivv', gaps between members are shown.
+The member numbers start at 1 and coincide with the file numbers
+produced by '--split'.
 
 If any file is damaged, does not exist, can't be opened, or is not
 regular, the final exit status is > 0. '-lq' can be used to check
@ -358,8 +363,8 @@ lziprecover supports the following options: *Note Argument syntax:
|
|||
'-n N'
|
||||
'--threads=N'
|
||||
Set the maximum number of worker threads for '--fec=create',
|
||||
overriding the system's default. Valid values range from 1 to "as many
|
||||
as your system can support". If this option is not used, lziprecover
|
||||
overriding the system's default. Valid values range from 1 to as many
|
||||
as your system can support. If this option is not used, lziprecover
|
||||
tries to detect the number of processors in the system and use it as
|
||||
default value. 'lziprecover --help' shows the system's default value.
|
||||
|
||||
|
@ -367,7 +372,7 @@ lziprecover supports the following options: *Note Argument syntax:
|
|||
'--output=FILE[/]'
|
||||
If repairing, place the repaired output into FILE instead of into
|
||||
FILE_fixed.lz. If splitting, the names of the files produced are in
|
||||
the form 'rec01FILE', 'rec02FILE', etc.
|
||||
the form 'rec1FILE', 'rec2FILE', etc.
|
||||
|
||||
If creating FEC data and '-c' has not been also specified, write the
|
||||
FEC data to FILE. If FILE ends with a slash, it is interpreted as the
|
||||
|
@ -415,8 +420,8 @@ lziprecover supports the following options: *Note Argument syntax:
|
|||
headers or trailers, try to split FILE and then work on each member
|
||||
individually.
|
||||
|
||||
The names of the files produced are in the form 'rec01FILE',
|
||||
'rec02FILE', etc, and are designed so that the use of wildcards in
|
||||
The names of the files produced are in the form 'rec1FILE',
|
||||
'rec2FILE', etc, and are designed so that the use of wildcards in
|
||||
subsequent processing, for example,
|
||||
'lziprecover -cd rec*FILE > recovered_data', processes the files in
|
||||
the correct order. The number of digits used in the names varies
|
||||
|
@ -430,7 +435,9 @@ lziprecover supports the following options: *Note Argument syntax:
|
|||
fails the test, does not exist, can't be opened, or is a terminal,
|
||||
lziprecover continues testing the rest of the files. A final
|
||||
diagnostic is shown at verbosity level 1 or higher if any file fails
|
||||
the test when testing multiple files.
|
||||
the test when testing multiple files. A multimember file with one or
|
||||
more empty members is accepted if redirected to standard input or if
|
||||
'-i' is given.
|
||||
|
||||
'-v'
|
||||
'--verbose'
|
||||
|
@ -448,14 +455,13 @@ lziprecover supports the following options: *Note Argument syntax:
|
|||
'--dump=[MEMBER_LIST][:damaged][:empty][:tdata]'
|
||||
Dump the members listed, the damaged members (if any), the empty
|
||||
members (if any), or the trailing data (if any) of one or more regular
|
||||
multimember files to standard output, or to a file if the option
|
||||
'--output' is used. If more than one file is given, the elements
|
||||
dumped from all the files are concatenated. If a file does not exist,
|
||||
can't be opened, or is not regular, lziprecover continues processing
|
||||
the rest of the files. If the dump fails in one file, lziprecover
|
||||
exits immediately without processing the rest of the files. Only
|
||||
'--dump=tdata' can write to a terminal. '--dump=damaged' implies
|
||||
'--ignore-errors'.
|
||||
multimember files to standard output, or to a file if the option '-o'
|
||||
is used. If more than one file is given, the elements dumped from all
|
||||
the files are concatenated. If a file does not exist, can't be opened,
|
||||
or is not regular, lziprecover continues processing the rest of the
|
||||
files. If the dump fails in one file, lziprecover exits immediately
|
||||
without processing the rest of the files. Only '--dump=tdata' can
|
||||
write to a terminal. '--dump=damaged' implies '--ignore-errors'.
|
||||
|
||||
The argument to '--dump' is a colon-separated list of the following
|
||||
element specifiers; a member list (1,3-6), a reverse member list
|
||||
|
@ -509,35 +515,23 @@ lziprecover supports the following options: *Note Argument syntax:
|
|||
|
||||
'--strip=[MEMBER_LIST][:damaged][:empty][:tdata]'
|
||||
Copy one or more regular multimember files to standard output (or to a
|
||||
file if the option '--output' is used), stripping the members listed,
|
||||
the damaged members (if any), the empty members (if any), or the
|
||||
trailing data (if any) from each file. If all members in a file are
|
||||
selected to be stripped, the trailing data (if any) are also stripped
|
||||
even if 'tdata' is not specified. If more than one file is given, the
|
||||
files are concatenated. In this case the trailing data are also
|
||||
stripped from all but the last file even if 'tdata' is not specified.
|
||||
If a file does not exist, can't be opened, or is not regular,
|
||||
lziprecover continues processing the rest of the files. If a file
|
||||
fails to copy, lziprecover exits immediately without processing the
|
||||
rest of the files. See '--dump' above for a description of the
|
||||
argument.
|
||||
|
||||
'--ignore-empty'
|
||||
When decompressing, testing, or listing, ignore empty members in
|
||||
multimember files. By default lziprecover exits with error status 2 if
|
||||
any empty member is found in a multimember file.
|
||||
|
||||
'--ignore-nonzero'
|
||||
When decompressing or testing, ignore a nonzero first byte in the LZMA
|
||||
stream. By default lziprecover exits with error status 2 if the first
|
||||
LZMA byte is nonzero in any member of the input files. Use
|
||||
'lziprecover --nonzero-repair' to repair any such nonzero bytes.
|
||||
file if the option '-o' is used), stripping the members listed, the
|
||||
damaged members (if any), the empty members (if any), or the trailing
|
||||
data (if any) from each file. If all members in a file are selected to
|
||||
be stripped, the trailing data (if any) are also stripped even if
|
||||
'tdata' is not specified. If more than one file is given, the files are
|
||||
concatenated. In this case the trailing data are also stripped from
|
||||
all but the last file even if 'tdata' is not specified. If a file does
|
||||
not exist, can't be opened, or is not regular, lziprecover continues
|
||||
processing the rest of the files. If a file fails to copy, lziprecover
|
||||
exits immediately without processing the rest of the files. See
|
||||
'--dump' above for a description of the argument.
|
||||
|
||||
'--loose-trailing'
|
||||
When decompressing, testing, or listing, allow trailing data whose
|
||||
first bytes are so similar to the magic bytes of a lzip header that
|
||||
they can be confused with a corrupt header. Use this option if a file
|
||||
triggers a "corrupt header" error and the cause is not indeed a
|
||||
triggers a 'corrupt header' error and the cause is not actually a
|
||||
corrupt header.
|
||||
|
||||
'--nonzero-repair'
|
||||
|
@ -625,14 +619,15 @@ lziprecover also supports the following debug options (for experts):
|
|||
Load the compressed FILE into memory, set the byte at POSITION to
|
||||
VALUE, and decompress the modified compressed data to standard output.
|
||||
If the damaged member can be decompressed to the end (just fails with
|
||||
a CRC mismatch), the members following it are also decompressed.
|
||||
a CRC mismatch), the members following it are also decompressed. *Note
|
||||
--set-byte::, for a description of VALUE.
|
||||
|
||||
'-X[POSITION,VALUE]'
|
||||
'--show-packets[=POSITION,VALUE]'
|
||||
Load the compressed FILE into memory, optionally set the byte at
|
||||
POSITION to VALUE, decompress the modified compressed data (discarding
|
||||
the output), and print to standard output descriptions of the LZMA
|
||||
packets being decoded.
|
||||
packets being decoded. *Note --set-byte::, for a description of VALUE.
|
||||
|
||||
'-Y RANGE'
|
||||
'--debug-delay=RANGE'
|
||||
|
@ -649,6 +644,7 @@ lziprecover also supports the following debug options (for experts):
|
|||
'--debug-byte-repair=POSITION,VALUE'
|
||||
Load the compressed FILE into memory, set the byte at POSITION to
|
||||
VALUE, and then try to repair the byte error. *Note --byte-repair::.
|
||||
*Note --set-byte::, for a description of VALUE.
|
||||
|
||||
'--gf16'
|
||||
Forces the use of GF(2^16) when creating FEC blocks even if the number
|
||||
|
@ -681,9 +677,57 @@ corrupt or invalid input file, 3 for an internal consistency error (e.g.,
|
|||
bug) which caused lziprecover to panic.
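
The four RANGE formats described earlier in this chapter ('BEGIN',
'BEGIN-END', 'BEGIN,SIZE', and ',SIZE') can be illustrated with a short
Python sketch. This is not lziprecover's own parser: the function name
'parse_range' and the parameter 'file_size' are hypothetical, only plain
decimal numbers are assumed, and END is treated as exclusive (the last byte
produced is END - 1), as the description states.

```python
def parse_range(text, file_size):
    """Return (begin, end), with end exclusive, for a RANGE string.

    Hypothetical helper for illustration; it accepts only plain decimal
    numbers and does not reproduce lziprecover's real parsing rules.
    """
    if text.startswith(','):            # ',SIZE': BEGIN is the start of file
        begin, end = 0, int(text[1:])
    elif ',' in text:                   # 'BEGIN,SIZE'
        first, size = text.split(',', 1)
        begin = int(first)
        end = begin + int(size)
    elif '-' in text:                   # 'BEGIN-END'
        first, last = text.split('-', 1)
        begin, end = int(first), int(last)
    else:                               # 'BEGIN': END is the end of file
        begin, end = int(text), file_size
    if begin < 0 or end <= begin:
        raise ValueError("empty or invalid range: " + text)
    return begin, end
```

With a 1000-byte file, '100' selects bytes 100 to 999, while '100-200' and
'100,100' both select bytes 100 to 199.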
|
||||
|
||||
|
||||
File: lziprecover.info, Node: File format, Next: Data safety, Prev: Invoking lziprecover, Up: Top
|
||||
File: lziprecover.info, Node: Argument syntax, Next: File format, Prev: Invoking lziprecover, Up: Top
|
||||
|
||||
3 File format
|
||||
3 Syntax of command-line arguments
|
||||
**********************************
|
||||
|
||||
POSIX recommends these conventions for command-line arguments.
|
||||
|
||||
* A command-line argument is an option if it begins with a hyphen ('-').
|
||||
|
||||
* Option names are single alphanumeric characters.
|
||||
|
||||
* Certain options require an argument.
|
||||
|
||||
* An option and its argument may or may not appear as separate tokens.
|
||||
(In other words, the whitespace separating them is optional, unless the
|
||||
argument is the empty string). Thus, '-o foo' and '-ofoo' are
|
||||
equivalent.
|
||||
|
||||
* One or more options without arguments, followed by at most one option
|
||||
that takes an argument, may follow a hyphen in a single token. Thus,
|
||||
'-abc' is equivalent to '-a -b -c'.
|
||||
|
||||
* Options typically precede other non-option arguments.
|
||||
|
||||
* The argument '--' terminates all options; any following arguments are
|
||||
treated as non-option arguments, even if they begin with a hyphen.
|
||||
|
||||
* A token consisting of a single hyphen character is interpreted as an
|
||||
ordinary non-option argument. By convention, it is used to specify
|
||||
standard input, standard output, or a file named '-'.
|
||||
|
||||
GNU adds "long options" to these conventions:
|
||||
|
||||
* A long option consists of two hyphens ('--') followed by a name made
|
||||
of alphanumeric characters and hyphens. Option names are typically one
|
||||
to three words long, with hyphens to separate words. Abbreviations can
|
||||
be used for the long option names as long as the abbreviations are
|
||||
unique.
|
||||
|
||||
* A long option and its argument may or may not appear as separate
|
||||
tokens. In the latter case they must be separated by an equal sign '='.
|
||||
Thus, '--foo bar' and '--foo=bar' are equivalent.
|
||||
|
||||
The syntax of options with an optional argument is
|
||||
'-<short_option><argument>' (without whitespace), or
|
||||
'--<long_option>=<argument>'.
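
The conventions above can be tried out with a small Python sketch.
Python's standard 'argparse' module happens to follow the same POSIX and GNU
rules; it is used here only as a convenient stand-in and is not the parser
used by lziprecover. The options '-a', '-b', '-c', and '-o/--output' are
hypothetical and exist only for this illustration.

```python
import argparse


def build_parser():
    # Hypothetical options, defined only to exercise the conventions.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("-a", action="store_true")
    parser.add_argument("-b", action="store_true")
    parser.add_argument("-c", action="store_true")
    parser.add_argument("-o", "--output")
    parser.add_argument("files", nargs="*")
    return parser


parser = build_parser()

# '-o foo' and '-ofoo' are equivalent.
separate = parser.parse_args(["-o", "foo"])
joined = parser.parse_args(["-ofoo"])

# '-abc' is equivalent to '-a -b -c'.
grouped = parser.parse_args(["-abc"])

# '--output bar', '--output=bar', and the unique abbreviation '--out=bar'.
long_separate = parser.parse_args(["--output", "bar"])
long_joined = parser.parse_args(["--output=bar"])
long_abbrev = parser.parse_args(["--out=bar"])

# '--' terminates the options; a lone '-' is an ordinary argument.
after_terminator = parser.parse_args(["--", "-a"])
lone_hyphen = parser.parse_args(["-"])
```

Each pair of calls produces the same parsed values, and the arguments after
'--' end up in 'files' instead of being treated as options.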
|
||||
|
||||
|
||||
File: lziprecover.info, Node: File format, Next: Data safety, Prev: Argument syntax, Up: Top
|
||||
|
||||
4 File format
|
||||
*************
|
||||
|
||||
Perfection is reached, not when there is no longer anything to add, but
|
||||
|
@ -737,7 +781,7 @@ not allowed in multimember files.
|
|||
Valid values for dictionary size range from 4 KiB to 512 MiB.
|
||||
|
||||
'LZMA stream'
|
||||
The LZMA stream, finished by an "End Of Stream" marker. Uses default
|
||||
The LZMA stream, terminated by an 'End Of Stream' marker. Uses default
|
||||
values for encoder properties. *Note Stream format: (lzip)Stream
|
||||
format, for a complete description.
|
||||
|
||||
|
@ -757,7 +801,7 @@ not allowed in multimember files.
|
|||
|
||||
File: lziprecover.info, Node: Data safety, Next: Fec files, Prev: File format, Up: Top
|
||||
|
||||
4 Protecting data from accidental loss
|
||||
5 Protecting data from accidental loss
|
||||
**************************************
|
||||
|
||||
It is a fact of life that sometimes data becomes corrupt. Software has
|
||||
|
@ -803,7 +847,7 @@ with gzip and bzip2 with respect to data safety:
|
|||
|
||||
File: lziprecover.info, Node: Merging with a backup, Next: Reproducing a mailbox, Up: Data safety
|
||||
|
||||
4.1 Recovering a file using a damaged backup
|
||||
5.1 Recovering a file using a damaged backup
|
||||
============================================
|
||||
|
||||
Let's suppose that you made a compressed backup of your valuable scientific
|
||||
|
@ -830,7 +874,7 @@ possible to recover a file with thousands of errors.
|
|||
|
||||
File: lziprecover.info, Node: Reproducing a mailbox, Prev: Merging with a backup, Up: Data safety
|
||||
|
||||
4.2 Recovering new messages using an old backup
|
||||
5.2 Recovering new messages using an old backup
|
||||
===============================================
|
||||
|
||||
Let's suppose that you make periodic backups of your email messages stored
|
||||
|
@ -876,15 +920,14 @@ identical backups (*note performance-of-merge::).
|
|||
|
||||
File: lziprecover.info, Node: Fec files, Next: Repairing one byte, Prev: Data safety, Up: Top
|
||||
|
||||
5 Forward Error Correction
|
||||
6 Forward Error Correction
|
||||
**************************
|
||||
|
||||
"Forward Error Correction" (FEC) is any way of protecting data from
|
||||
corruption by creating redundant data that can be used later to repair
|
||||
errors in the protected data. Lziprecover uses a Hilbert-based Reed-Solomon
|
||||
code to create one fec file (with extension '.fec') for each file that
|
||||
needs to be protected. The fec files created by lziprecover are
|
||||
reproducible.
|
||||
Forward Error Correction (FEC) is any way of protecting data from corruption
|
||||
by creating redundant data that can be used later to repair errors in the
|
||||
protected data. Lziprecover uses a Hilbert-based Reed-Solomon code to create
|
||||
one fec file (with extension '.fec') for each file that needs to be
|
||||
protected. The fec files created by lziprecover are reproducible.
|
||||
|
||||
Reed-Solomon is the most space-efficient Error Correcting Code (ECC) for
|
||||
data stored in block devices. It creates redundant FEC blocks in such a way
|
||||
|
@ -892,8 +935,7 @@ that X FEC blocks allow the recuperation of any combination of up to X lost
|
|||
data blocks. All the blocks (data and FEC) are of the same size, which in
|
||||
fec files must be a multiple of 512 bytes. Reed-Solomon is not optimum for
|
||||
corruption affecting random single bits in a file because each corrupt bit
|
||||
invalidates the whole block containing it. But in block devices, scattered
|
||||
bit flips should not happen.
|
||||
invalidates the whole block containing it.
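
The block-level accounting described above can be sketched in Python. The
function names and the list of corrupt byte offsets are assumptions made for
this illustration; the sketch also assumes that the FEC blocks themselves
are intact. It only counts damaged data blocks and compares the count with
the number of FEC blocks available.

```python
def damaged_blocks(corrupt_offsets, block_size):
    """Return the sorted indexes of the data blocks hit by corruption.

    One corrupt byte (or bit) is enough to invalidate its whole block.
    """
    if block_size <= 0 or block_size % 512 != 0:
        raise ValueError("block size must be a positive multiple of 512")
    return sorted({offset // block_size for offset in corrupt_offsets})


def is_recoverable(corrupt_offsets, block_size, fec_blocks):
    """True if the number of damaged data blocks does not exceed the
    number of FEC blocks (X FEC blocks repair up to X lost blocks)."""
    return len(damaged_blocks(corrupt_offsets, block_size)) <= fec_blocks
```

For example, three corrupt bytes at offsets 0, 1, and 511 damage a single
512-byte block, while offsets 0, 512, and 5000 damage three blocks and need
at least three FEC blocks.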
|
||||
|
||||
Usually, a corrupt file does not provide an indication of where the
|
||||
corruption is located. Therefore, each fec file stores one or two arrays of
|
||||
|
@ -921,7 +963,7 @@ must be intact to provide 'prodata_size', 'prodata_md5', and 'gf16'.
|
|||
|
||||
File: lziprecover.info, Node: How Reed-Solomon works, Next: Implementation details, Up: Fec files
|
||||
|
||||
5.1 How Reed-Solomon works
|
||||
6.1 How Reed-Solomon works
|
||||
==========================
|
||||
|
||||
To illustrate how Reed-Solomon works on the BEC, we will use an example with
|
||||
|
@ -944,8 +986,8 @@ p, q, and r can be computed from the values of x, y, and z:
|
|||
|
||||
Now, if the values of x and y are lost because of data corruption, they
|
||||
can be recomputed by using any two of the three equations above. For
|
||||
example, if we replace the known values of z, p, q, and r in equations (1)
|
||||
and (2) we get:
|
||||
example, if we replace the known values of z, p, and q in equations (1) and
|
||||
(2) we get:
|
||||
|
||||
x + y + 3 = 6 (1b)
|
||||
x + 2y + 9 = 14 (2b)
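
As a quick check of equations (1b) and (2b), the following Python sketch
solves the two-unknown system with ordinary rational arithmetic. The helper
'solve_2x2' is hypothetical and uses Cramer's rule; lziprecover itself works
in a Galois Field, not with fractions.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system: the equations are not independent")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - b1 * a21, det)
    return x, y


# (1b): x +  y + 3 = 6   becomes  x +  y = 3
# (2b): x + 2y + 9 = 14  becomes  x + 2y = 5
x, y = solve_2x2(1, 1, 6 - 3, 1, 2, 14 - 9)
```

The result is x = 1 and y = 2, which satisfies both (1b) and (2b).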
|
||||
|
@ -982,7 +1024,7 @@ obtain the values of x and y (D = A^-1 * F):
|
|||
|
||||
File: lziprecover.info, Node: Implementation details, Next: Creating fec files, Prev: How Reed-Solomon works, Up: Fec files
|
||||
|
||||
5.2 How lziprecover implements Reed-Solomon
|
||||
6.2 How lziprecover implements Reed-Solomon
|
||||
===========================================
|
||||
|
||||
Lziprecover's implementation of Reed-Solomon can manage up to 128 data
|
||||
|
@ -1011,17 +1053,17 @@ blocks.
|
|||
Lziprecover implements GF(2^8) with polynomial 0x11D and GF(2^16) with
|
||||
polynomial 0x1100B.
|
||||
|
||||
A Hilbert matrix is defined as 'A[i][j] = 1 / (i + j + 1)' for i and j
|
||||
>= 0. But as in a Galois Field addition is exclusive or, applying the
|
||||
Hilbert definition produces a singular (non invertible) matrix. To avoid
|
||||
this problem, lziprecover uses a Hilbert matrix starting at row
|
||||
'gf_size / 2'. I.e., 'A[i][j] = 1 / (i + gf_size / 2 + j)' for
|
||||
'0 <= i,j < gf_size / 2'. (gf_size is the size of the Galois Field).
|
||||
A Hilbert matrix is defined as A[i][j] = 1 / (i + j + 1) for i,j >= 0.
|
||||
But because addition in a Galois Field is the exclusive-or operation,
|
||||
applying the Hilbert definition produces a singular (non-invertible)
|
||||
matrix. To avoid this problem, lziprecover uses a Hilbert matrix starting
|
||||
at row r0 = gf_size / 2. I.e., A[i][j] = 1 / (i + j + r0) for
|
||||
0 <= i,j < r0. ('gf_size' is the size of the Galois Field).
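
The shifted definition can be explored with a Python sketch for GF(2^8)
and polynomial 0x11D. This is not lziprecover's code: the function names are
hypothetical, and reading the '+' in the formula as the field addition
(exclusive or) of the indexes is an assumption of this sketch. Under that
reading the denominator i + j + r0 always has its top bit set for
0 <= i,j < r0, so it is never zero and every element has an inverse.

```python
GF_SIZE = 256          # size of GF(2^8)
POLY = 0x11D           # polynomial stated in the text for GF(2^8)
R0 = GF_SIZE // 2      # first row of the shifted Hilbert matrix


def gf_mul(a, b):
    """Multiply two elements of GF(2^8) (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse: a^254, because a^255 = 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in a Galois Field")
    result, base, exponent = 1, a, GF_SIZE - 2
    while exponent:
        if exponent & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exponent >>= 1
    return result


def hilbert_element(i, j):
    """A[i][j] = 1 / (i + j + r0), with '+' taken as exclusive or."""
    if not (0 <= i < R0 and 0 <= j < R0):
        raise ValueError("indexes must satisfy 0 <= i,j < r0")
    return gf_inv(i ^ j ^ R0)
```

Under the same assumption the matrix is a Cauchy matrix, which is why any
square submatrix of it can be inverted.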
|
||||
|
||||
|
||||
File: lziprecover.info, Node: Creating fec files, Next: Testing with fec files, Prev: Implementation details, Up: Fec files
|
||||
|
||||
5.3 How to create fec files
|
||||
6.3 How to create fec files
|
||||
===========================
|
||||
|
||||
Example 1: Create the fec file 'archive.tar.lz.fec' and store it in the
|
||||
|
@ -1039,10 +1081,15 @@ Example 3: Create recursively one fec file for each file in the directory
|
|||
|
||||
lziprecover -v -r -Fc -o fec/ datadir
|
||||
|
||||
Example 4: Create fec files for a collection of photos stored in directory
|
||||
'photos' and store them in the directory 'photos-fec'.
|
||||
|
||||
lziprecover -v -Fc -o photos-fec/ photos/*
|
||||
|
||||
|
||||
File: lziprecover.info, Node: Testing with fec files, Next: Repairing with fec files, Prev: Creating fec files, Up: Fec files
|
||||
|
||||
5.4 How to test files using fec files
|
||||
6.4 How to test files using fec files
|
||||
=====================================
|
||||
|
||||
Example 1: Test the integrity of 'archive.tar.lz' using the fec file
|
||||
|
@ -1061,10 +1108,15 @@ directory 'fec'.
|
|||
|
||||
lziprecover -v -r -Ft --fec-file=fec/ datadir
|
||||
|
||||
Example 4: Test the integrity of a collection of photos stored in directory
|
||||
'photos' using fec files from directory 'photos-fec'.
|
||||
|
||||
lziprecover -v -Ft --fec-file=photos-fec/ photos/*
|
||||
|
||||
|
||||
File: lziprecover.info, Node: Repairing with fec files, Next: Fec file format, Prev: Testing with fec files, Up: Fec files
|
||||
|
||||
5.5 How to repair files using fec files
|
||||
6.5 How to repair files using fec files
|
||||
=======================================
|
||||
|
||||
Example 1: Repair the file 'archive.tar.lz' using the fec file
|
||||
|
@ -1084,10 +1136,22 @@ directory 'fec'.
|
|||
|
||||
lziprecover -v -r -Fr --fec-file=fec/ datadir
|
||||
|
||||
Example 4: Recover a collection of photos from a damaged external drive
|
||||
('/dev/sdc1'). The photos are in directory 'photos', and the fec files are
|
||||
in directory 'photos-fec'.
|
||||
|
||||
ddrescue -b4096 -r10 /dev/sdc1 hdimage mapfile
|
||||
mount -o loop,ro hdimage /mnt/hdimage
|
||||
cp -a /mnt/hdimage/photos photos
|
||||
cp -a /mnt/hdimage/photos-fec photos-fec
|
||||
umount /mnt/hdimage
|
||||
lziprecover -v -Fr --fec-file=photos-fec/ photos/*
|
||||
(Check and rename repaired files. They are named 'photos/*_fixed')
|
||||
|
||||
|
||||
File: lziprecover.info, Node: Fec file format, Prev: Repairing with fec files, Up: Fec files
|
||||
|
||||
5.6 Fec file format
|
||||
6.6 Fec file format
|
||||
===================
|
||||
|
||||
A fec file consists of one chksum packet, one or more fec packets, and one
|
||||
|
@ -1127,7 +1191,7 @@ achieved by a careful design, without adding any padding bytes.
|
|||
The fec file format has an overhead of 8 bytes per protected data block,
|
||||
plus 16 bytes per FEC block, plus 80 bytes.
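
The overhead formula can be written as a small Python sketch. The function
names are hypothetical. The second function additionally assumes that the
size of a fec file is the overhead plus the FEC blocks themselves; that
assumption is not stated in the text above.

```python
def fec_overhead(data_blocks, fec_blocks):
    """Overhead in bytes: 8 per protected data block, 16 per FEC block,
    plus a fixed 80 bytes."""
    return 8 * data_blocks + 16 * fec_blocks + 80


def fec_file_size(data_blocks, fec_blocks, block_size):
    """Estimated fec file size, assuming (hypothetically) that the file
    holds only the overhead plus 'fec_blocks' blocks of 'block_size'."""
    return fec_overhead(data_blocks, fec_blocks) + fec_blocks * block_size
```

For a file split into 100 data blocks and protected by 8 FEC blocks, the
overhead is 800 + 128 + 80 = 1008 bytes.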
|
||||
|
||||
5.6.1 Chksum packet
|
||||
6.6.1 Chksum packet
|
||||
-------------------
|
||||
|
||||
A chksum packet contains one CRC for each of the N data blocks in the
|
||||
|
@ -1179,7 +1243,7 @@ payload_crc 36 + 4N 4
|
|||
present) contains an array of CRC32-Cs.
|
||||
|
||||
For the expected thousands of bit flips caused by a zeroed sector, a
|
||||
"symmetric" CRC like CRC32 is probably better than CRC32-C, which
|
||||
symmetric CRC like CRC32 is probably better than CRC32-C, which
|
||||
detects all the errors with an odd number of bit flips at the expense
|
||||
of a larger number of undetected errors with an even number of bit
|
||||
flips.
|
||||
|
@ -1187,7 +1251,7 @@ payload_crc 36 + 4N 4
|
|||
'payload_crc'
|
||||
CRC32 of the crc_array.
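
The two checksums named in the fields above differ only in their generator
polynomial. The following Python sketch computes both with a bitwise
(slow but simple) algorithm. It assumes the common reflected forms of the
polynomials, 0xEDB88320 for CRC32 and 0x82F63B78 for CRC32-C, and the usual
initial value and final inversion; the function name is hypothetical and the
code is not taken from lziprecover.

```python
import zlib

CRC32_POLY = 0xEDB88320     # reflected form of the CRC32 polynomial
CRC32C_POLY = 0x82F63B78    # reflected form of the CRC32-C polynomial


def crc_reflected(data, poly):
    """Bitwise reflected CRC with initial value and final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


check = b"123456789"
crc32_value = crc_reflected(check, CRC32_POLY)
crc32c_value = crc_reflected(check, CRC32C_POLY)
```

The CRC32 result can be cross-checked against 'zlib.crc32', which
implements the same CRC32; Python's standard library has no CRC32-C
function, so that variant is computed only by the sketch.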
|
||||
|
||||
5.6.2 Fec packet
|
||||
6.6.2 Fec packet
|
||||
----------------
|
||||
|
||||
A fec packet contains one FEC block and is structured as shown in the
|
||||
|
@ -1224,7 +1288,7 @@ payload_crc 12 + fbs 4
|
|||
|
||||
File: lziprecover.info, Node: Repairing one byte, Next: Merging files, Prev: Fec files, Up: Top
|
||||
|
||||
6 Repairing one byte
|
||||
7 Repairing one byte
|
||||
********************
|
||||
|
||||
Lziprecover can perfectly repair most files with small errors (up to one
|
||||
|
@ -1238,11 +1302,11 @@ most common forms of data corruption.
|
|||
is limited to 2 GiB on 32-bit systems.
|
||||
|
||||
The error may be located anywhere in the file except in the first 5
|
||||
bytes of each member header or in the 'Member size' field of the trailer
|
||||
(last 8 bytes of each member). If the error is in the header it can be
|
||||
easily repaired with a text editor like GNU Moe (*note File format::). If
|
||||
the error is in the member size, it is enough to ignore the message about
|
||||
'bad member size' when decompressing.
|
||||
bytes of each member header (magic and version) or in the 'Member size'
|
||||
field of the trailer (last 8 bytes of each member). If the error is in the
|
||||
header it can be easily repaired with a text editor like GNU Moe (*note
|
||||
File format::). If the error is in the member size, it is enough to ignore
|
||||
the message about 'bad member size' when decompressing.
|
||||
|
||||
A bit flip happens when one bit in the file is changed from 0 to 1 or vice
|
||||
versa. It may be caused by bad RAM or even by natural radiation. I have
|
||||
|
@ -1252,7 +1316,7 @@ seen a case of bit flip in a file stored on an USB flash drive.
|
|||
transmission errors or I/O errors just affect one byte, or even one bit, of
|
||||
the file. Also, unlike magnetic media, where errors usually affect a whole
|
||||
sector, solid-state storage devices tend to produce single-byte errors,
|
||||
making of lzip the perfect format for data stored on such devices.
|
||||
which lziprecover can repair.
|
||||
|
||||
Repairing a file can take some time. Small files or files with the error
|
||||
located near the beginning can be repaired in a few seconds. But repairing
|
||||
|
@ -1266,7 +1330,7 @@ repairs more efficiently the worst errors.
|
|||
|
||||
File: lziprecover.info, Node: Merging files, Next: Reproducing one sector, Prev: Repairing one byte, Up: Top
|
||||
|
||||
7 Merging files
|
||||
8 Merging files
|
||||
***************
|
||||
|
||||
If you have several copies of a file but all of them are too damaged to
|
||||
|
@ -1320,10 +1384,8 @@ identical to the original, in just 5 seconds:
|
|||
than the number of corrupt bytes (3104) because contiguous corrupt bytes
|
||||
are counted as a single multibyte error.
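
The way contiguous corrupt bytes collapse into one multibyte error can be
sketched in Python. The function name and the list of corrupt byte positions
are assumptions made for this illustration; the sketch only counts runs of
consecutive positions and is not lziprecover's merge code.

```python
def count_errors(corrupt_positions):
    """Count errors, treating each run of contiguous corrupt bytes as one
    multibyte error."""
    errors = 0
    previous = None
    for position in sorted(set(corrupt_positions)):
        if previous is None or position != previous + 1:
            errors += 1          # a new run starts here
        previous = position
    return errors
```

For example, corrupt bytes at positions 5, 6, 7, 20, 21, and 40 are six
corrupt bytes but only three errors.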
|
||||
|
||||
|
||||
Example 1: Recover a compressed backup from two copies on CD-ROM with
|
||||
error-checked merging of copies. *Note GNU ddrescue manual: (ddrescue)Top,
|
||||
for details about ddrescue.
|
||||
error-checked merging of copies.
|
||||
|
||||
ddrescue -d -r1 -b2048 /dev/cdrom cdimage1 mapfile1
|
||||
mount -t iso9660 -o loop,ro cdimage1 /mnt/cdimage
|
||||
|
@ -1339,7 +1401,6 @@ for details about ddrescue.
|
|||
lziprecover -tv backup.tar.lz
|
||||
backup.tar.lz: ok
|
||||
|
||||
|
||||
Example 2: Recover the first volume of those created with the command
|
||||
'lzip -b 32MiB -S 650MB big_db' from two copies, 'big_db1_00001.lz' and
|
||||
'big_db2_00001.lz', with member 07 damaged in the first copy, member 18
|
||||
|
@ -1354,7 +1415,7 @@ correct file produced is saved in 'big_db_00001.lz'.
|
|||
|
||||
File: lziprecover.info, Node: Reproducing one sector, Next: Tarlz, Prev: Merging files, Up: Top
|
||||
|
||||
8 Reproducing one sector
|
||||
9 Reproducing one sector
|
||||
************************
|
||||
|
||||
Lziprecover can recover a zeroed sector in a lzip file by concatenating the
|
||||
|
@ -1430,7 +1491,7 @@ header, and that the archive can be reproduced. The tarlz format has minimum
|
|||
overhead. It uses basic ustar headers, and only adds extended pax headers
|
||||
when they are required.
|
||||
|
||||
8.1 Performance of '--reproduce'
|
||||
9.1 Performance of '--reproduce'
|
||||
================================
|
||||
|
||||
Reproduce mode is especially useful when recovering a corrupt backup (or a
|
||||
|
@ -1483,7 +1544,6 @@ for a different version of the software.
|
|||
Member reproduced successfully.
|
||||
Copy of input file reproduced successfully.
|
||||
|
||||
|
||||
Example 2: Recover a damaged backup with a zeroed sector of 4096 bytes at
|
||||
file position 1019904, using as reference a previous backup. The damaged
|
||||
backup comes from a damaged partition copied with ddrescue.
|
||||
|
@ -1505,7 +1565,6 @@ backup comes from a damaged partition copied with ddrescue.
|
|||
Member reproduced successfully.
|
||||
Copy of input file reproduced successfully.
|
||||
|
||||
|
||||
Example 3: Recover a damaged backup with a zeroed sector of 4096 bytes at
|
||||
file position 1019904, using as reference a file from the filesystem. (If
|
||||
the zeroed sector encodes (part of) a tar header, the tarball can't be
|
||||
|
@ -1541,8 +1600,8 @@ has been renamed.
|
|||
|
||||
File: lziprecover.info, Node: Tarlz, Next: File names, Prev: Reproducing one sector, Up: Top
|
||||
|
||||
9 Options supporting the tar.lz format
|
||||
**************************************
|
||||
10 Options supporting the tar.lz format
|
||||
***************************************
|
||||
|
||||
Tarlz is a massively parallel (multi-threaded) combined implementation of
|
||||
the tar archiver and the lzip compressor.
|
||||
|
@ -1562,8 +1621,8 @@ alignment between tar members and lzip members minimizes the amount of data
|
|||
lost in case of corruption. In this chapter we'll explain the ways in which
|
||||
lziprecover can recover and process multimember tar.lz archives.
|
||||
|
||||
9.1 Recovering damaged multimember tar.lz archives
|
||||
==================================================
|
||||
10.1 Recovering damaged multimember tar.lz archives
|
||||
===================================================
|
||||
|
||||
If you have several copies of the damaged archive, try merging them first
|
||||
because merging has a high probability of success. *Note Merging files::. If
|
||||
|
@ -1604,8 +1663,8 @@ possible from each damaged member in 'bad_members.tar.lz':
|
|||
cd tmp
|
||||
tarlz --keep-damaged -xvf ../bad_members.tar.lz
|
||||
|
||||
9.2 Processing multimember tar.lz archives
|
||||
==========================================
|
||||
10.2 Processing multimember tar.lz archives
|
||||
===========================================
|
||||
|
||||
Lziprecover is able to copy a list of members from a file to another. For
|
||||
example the command
|
||||
|
@ -1618,7 +1677,7 @@ end-of-file blocks.
|
|||
|
||||
File: lziprecover.info, Node: File names, Next: Trailing data, Prev: Tarlz, Up: Top
|
||||
|
||||
10 Names of the files produced by lziprecover
|
||||
11 Names of the files produced by lziprecover
|
||||
*********************************************
|
||||
|
||||
The name of the fixed file produced by '--byte-repair' and '--merge' is
|
||||
|
@ -1634,7 +1693,7 @@ string '_fixed' is inserted before the extension.
|
|||
|
||||
File: lziprecover.info, Node: Trailing data, Next: Examples, Prev: File names, Up: Top
|
||||
|
||||
11 Extra data appended to the file
|
||||
12 Extra data appended to the file
|
||||
**********************************
|
||||
|
||||
Sometimes extra data are found appended to a lzip file after the last
|
||||
|
@ -1644,7 +1703,7 @@ member. Such trailing data may be:
|
|||
example when writing to a tape. It is safe to append any amount of
|
||||
padding zero bytes to a lzip file.
|
||||
|
||||
* Useful data added by the user; an "End Of File" string (to check that
|
||||
* Useful data added by the user; an 'End Of File' string (to check that
|
||||
the file has not been truncated), a cryptographically secure hash, a
|
||||
description of file contents, etc. It is safe to append any amount of
|
||||
text to a lzip file as long as none of the first four bytes of the
|
||||
|
@ -1691,7 +1750,6 @@ Example 1: Add a comment or description to a compressed file.
|
|||
# This command removes the comment from file.lz
|
||||
lziprecover --remove=tdata file.lz
|
||||
|
||||
|
||||
Example 2: Add and check a cryptographically secure hash. (This may be
|
||||
convenient, but a separate copy of the hash must be kept in a safe place to
|
||||
guarantee that both file and hash have not been maliciously replaced).
|
||||
|
@ -1703,7 +1761,7 @@ guarantee that both file and hash have not been maliciously replaced).
|
|||
|
||||
File: lziprecover.info, Node: Examples, Next: Unzcrash, Prev: Trailing data, Up: Top
|
||||
|
||||
12 A small tutorial with examples
|
||||
13 A small tutorial with examples
|
||||
*********************************
|
||||
|
||||
Example 1: Extract all the files from archive 'foo.tar.lz'.
|
||||
|
@ -1763,7 +1821,7 @@ integrity of the resulting files.
|
|||
|
||||
File: lziprecover.info, Node: Unzcrash, Next: Problems, Prev: Examples, Up: Top
|
||||
|
||||
13 Testing the robustness of decompressors
|
||||
14 Testing the robustness of decompressors
|
||||
******************************************
|
||||
|
||||
*Note --unzcrash::, for a faster way of testing the robustness of lzip.
|
||||
|
@ -1849,10 +1907,11 @@ unzcrash supports the following options:
|
|||
|
||||
'-B[SIZE][,VALUE]'
|
||||
'--block[=SIZE][,VALUE]'
|
||||
Test block errors of given SIZE, simulating a whole sector I/O error.
|
||||
SIZE defaults to 512 bytes. VALUE defaults to 0. By default, only
|
||||
contiguous, non-overlapping blocks are tested, but this may be changed
|
||||
with the option '--delta'.
|
||||
Test block errors of given SIZE, simulating a whole sector I/O error
|
||||
by setting all the bytes in the block to VALUE before attempting
|
||||
decompression. SIZE defaults to 512 bytes. VALUE defaults to 0. By
|
||||
default, only contiguous, non-overlapping blocks are tested, but this
|
||||
may be changed with the option '--delta'.
|
||||
|
||||
'-d N'
|
||||
'--delta=N'
|
||||
|
@ -1918,7 +1977,7 @@ bug) which caused unzcrash to panic.
|
|||
|
||||
File: lziprecover.info, Node: Problems, Next: Concept index, Prev: Unzcrash, Up: Top
|
||||
|
||||
14 Reporting bugs
|
||||
15 Reporting bugs
|
||||
*****************
|
||||
|
||||
There are probably bugs in lziprecover. There are certainly errors and
|
||||
|
@ -1939,6 +1998,7 @@ Concept index
|
|||
|