Merging upstream version 0.19.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 84460224b0
commit b53e340348
28 changed files with 926 additions and 616 deletions
27
doc/tarlz.1
|
@@ -1,5 +1,5 @@
|
|||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.46.1.
|
||||
.TH TARLZ "1" "July 2020" "tarlz 0.17" "User Commands"
|
||||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.16.
|
||||
.TH TARLZ "1" "January 2021" "tarlz 0.19" "User Commands"
|
||||
.SH NAME
|
||||
tarlz \- creates tar archives with multimember lzip compression
|
||||
.SH SYNOPSIS
|
||||
|
@@ -7,13 +7,15 @@ tarlz \- creates tar archives with multimember lzip compression
|
|||
[\fI\,options\/\fR] [\fI\,files\/\fR]
|
||||
.SH DESCRIPTION
|
||||
Tarlz is a massively parallel (multi\-threaded) combined implementation of
|
||||
the tar archiver and the lzip compressor. Tarlz creates, lists and extracts
|
||||
archives in a simplified and safer variant of the POSIX pax format
|
||||
compressed with lzip, keeping the alignment between tar members and lzip
|
||||
members. The resulting multimember tar.lz archive is fully backward
|
||||
compatible with standard tar tools like GNU tar, which treat it like any
|
||||
other tar.lz archive. Tarlz can append files to the end of such compressed
|
||||
archives.
|
||||
the tar archiver and the lzip compressor. Tarlz uses the compression library
|
||||
lzlib.
|
||||
.PP
|
||||
Tarlz creates, lists, and extracts archives in a simplified and safer
|
||||
variant of the POSIX pax format compressed in lzip format, keeping the
|
||||
alignment between tar members and lzip members. The resulting multimember
|
||||
tar.lz archive is fully backward compatible with standard tar tools like GNU
|
||||
tar, which treat it like any other tar.lz archive. Tarlz can append files to
|
||||
the end of such compressed archives.
|
||||
.PP
|
||||
Keeping the alignment between tar members and lzip members has two
|
||||
advantages. It adds an indexed lzip layer on top of the tar archive, making
|
||||
|
@@ -126,6 +128,9 @@ exit with error status if missing extended CRC
|
|||
.TP
|
||||
\fB\-\-out\-slots=\fR<n>
|
||||
number of 1 MiB output packets buffered [64]
|
||||
.TP
|
||||
\fB\-\-check\-lib\fR
|
||||
compare version of lzlib.h with liblz.{a,so}
|
||||
.PP
|
||||
Exit status: 0 for a normal exit, 1 for environmental problems (file not
|
||||
found, files differ, invalid flags, I/O errors, etc), 2 to indicate a
|
||||
|
@@ -136,8 +141,8 @@ Report bugs to lzip\-bug@nongnu.org
|
|||
.br
|
||||
Tarlz home page: http://www.nongnu.org/lzip/tarlz.html
|
||||
.SH COPYRIGHT
|
||||
Copyright \(co 2020 Antonio Diaz Diaz.
|
||||
Using lzlib 1.12\-rc1a
|
||||
Copyright \(co 2021 Antonio Diaz Diaz.
|
||||
Using lzlib 1.12
|
||||
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
|
||||
.br
|
||||
This is free software: you are free to change and redistribute it.
|
||||
|
|
129
doc/tarlz.info
|
@@ -11,7 +11,7 @@ File: tarlz.info, Node: Top, Next: Introduction, Up: (dir)
|
|||
Tarlz Manual
|
||||
************
|
||||
|
||||
This manual is for Tarlz (version 0.17, 30 July 2020).
|
||||
This manual is for Tarlz (version 0.19, 8 January 2021).
|
||||
|
||||
* Menu:
|
||||
|
||||
|
@@ -28,10 +28,10 @@ This manual is for Tarlz (version 0.17, 30 July 2020).
|
|||
* Concept index:: Index of concepts
|
||||
|
||||
|
||||
Copyright (C) 2013-2020 Antonio Diaz Diaz.
|
||||
Copyright (C) 2013-2021 Antonio Diaz Diaz.
|
||||
|
||||
This manual is free documentation: you have unlimited permission to
|
||||
copy, distribute, and modify it.
|
||||
This manual is free documentation: you have unlimited permission to copy,
|
||||
distribute, and modify it.
|
||||
|
||||
|
||||
File: tarlz.info, Node: Introduction, Next: Invoking tarlz, Prev: Top, Up: Top
|
||||
|
@@ -40,13 +40,15 @@ File: tarlz.info, Node: Introduction, Next: Invoking tarlz, Prev: Top, Up: T
|
|||
**************
|
||||
|
||||
Tarlz is a massively parallel (multi-threaded) combined implementation of
|
||||
the tar archiver and the lzip compressor. Tarlz creates, lists and extracts
|
||||
archives in a simplified and safer variant of the POSIX pax format
|
||||
compressed with lzip, keeping the alignment between tar members and lzip
|
||||
members. The resulting multimember tar.lz archive is fully backward
|
||||
compatible with standard tar tools like GNU tar, which treat it like any
|
||||
other tar.lz archive. Tarlz can append files to the end of such compressed
|
||||
archives.
|
||||
the tar archiver and the lzip compressor. Tarlz uses the compression
|
||||
library lzlib.
|
||||
|
||||
Tarlz creates tar archives using a simplified and safer variant of the
|
||||
POSIX pax format compressed in lzip format, keeping the alignment between
|
||||
tar members and lzip members. The resulting multimember tar.lz archive is
|
||||
fully backward compatible with standard tar tools like GNU tar, which treat
|
||||
it like any other tar.lz archive. Tarlz can append files to the end of such
|
||||
compressed archives.
|
||||
|
||||
Keeping the alignment between tar members and lzip members has two
|
||||
advantages. It adds an indexed lzip layer on top of the tar archive, making
|
||||
|
@@ -56,7 +58,7 @@ plzip may even double the amount of files lost for each lzip member damaged
|
|||
because it does not keep the members aligned.
|
||||
|
||||
Tarlz can create tar archives with five levels of compression
|
||||
granularity; per file (--no-solid), per block (--bsolid, default), per
|
||||
granularity: per file (--no-solid), per block (--bsolid, default), per
|
||||
directory (--dsolid), appendable solid (--asolid), and solid (--solid). It
|
||||
can also create uncompressed tar archives.
|
||||
|
||||
|
@@ -79,8 +81,8 @@ archive, but it has the following advantages:
|
|||
lziprecover can be used to recover some of the damaged members.
|
||||
|
||||
* A multimember tar.lz archive is usually smaller than the corresponding
|
||||
solidly compressed tar.gz archive, except when compressing files
|
||||
smaller than about 32 KiB individually.
|
||||
solidly compressed tar.gz archive, except when individually
|
||||
compressing files smaller than about 32 KiB.
|
||||
|
||||
Tarlz protects the extended records with a Cyclic Redundancy Check (CRC)
|
||||
in a way compatible with standard tar tools. *Note crc32::.
|
||||
|
@@ -240,8 +242,7 @@ to '-1 --solid'
|
|||
not used, tarlz tries to detect the number of processors in the system
|
||||
and use it as default value. 'tarlz --help' shows the system's default
|
||||
value. See the note about multi-threaded archive creation in the
|
||||
option '-C' above. Multi-threaded extraction of files from an archive
|
||||
is not yet implemented. *Note Multi-threaded decoding::.
|
||||
option '-C' above.
|
||||
|
||||
Note that the number of usable threads is limited during compression to
|
||||
ceil( uncompressed_size / data_size ) (*note Minimum archive sizes::),
|
||||
|
@@ -281,7 +282,8 @@ to '-1 --solid'
|
|||
|
||||
'-v'
|
||||
'--verbose'
|
||||
Verbosely list files processed.
|
||||
Verbosely list files processed. Further -v's (up to 4) increase the
|
||||
verbosity level.
|
||||
|
||||
'-x'
|
||||
'--extract'
|
||||
|
@@ -376,7 +378,8 @@ to '-1 --solid'
|
|||
Don't delete partially extracted files. If a decompression error
|
||||
happens while extracting a file, keep the partial data extracted. Use
|
||||
this option to recover as much data as possible from each damaged
|
||||
member.
|
||||
member. It is recommended to run tarlz in single-threaded mode
|
||||
(-threads=0) when using this option.
|
||||
|
||||
'--missing-crc'
|
||||
Exit with error status 2 if the CRC of the extended records is missing.
|
||||
|
@@ -396,6 +399,15 @@ to '-1 --solid'
|
|||
more memory. Valid values range from 1 to 1024. The default value is
|
||||
64.
|
||||
|
||||
'--check-lib'
|
||||
Compare the version of lzlib used to compile tarlz with the version
|
||||
actually being used and exit. Report any differences found. Exit with
|
||||
error status 1 if differences are found. A mismatch may indicate that
|
||||
lzlib is not correctly installed or that a different version of lzlib
|
||||
has been installed after compiling tarlz. 'tarlz -v --check-lib' shows
|
||||
the version of lzlib being used and the value of 'LZ_API_VERSION' (if
|
||||
defined). *Note Library version: (lzlib)Library version.
|
||||
|
||||
|
||||
Exit status: 0 for a normal exit, 1 for environmental problems (file not
|
||||
found, files differ, invalid flags, I/O errors, etc), 2 to indicate a
|
||||
|
@@ -546,6 +558,10 @@ space, equal-sign, and newline.
|
|||
the swapping of two bytes.
|
||||
|
||||
|
||||
At verbosity level 1 or higher tarlz prints a diagnostic for each unknown
|
||||
extended header keyword found in an archive, once per keyword.
|
||||
|
||||
|
||||
4.2 Ustar header block
|
||||
======================
|
||||
|
||||
|
@@ -770,11 +786,12 @@ interesting parts described here are those related to Multi-threaded
|
|||
processing.
|
||||
|
||||
The structure of the part of tarlz performing Multi-threaded archive
|
||||
creation is somewhat similar to that of plzip with the added complication of
|
||||
the solidity levels. A grouper thread and several worker threads are
|
||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
||||
courier" takes care of data transfers among threads and limits the maximum
|
||||
number of data blocks (packets) being processed simultaneously.
|
||||
creation is somewhat similar to that of plzip with the added complication
|
||||
of the solidity levels. *Note Program design: (plzip)Program design. A
|
||||
grouper thread and several worker threads are created, acting the main
|
||||
thread as muxer (multiplexer) thread. A "packet courier" takes care of data
|
||||
transfers among threads and limits the maximum number of data blocks
|
||||
(packets) being processed simultaneously.
|
||||
|
||||
The grouper traverses the directory tree, groups together the metadata of
|
||||
the files to be archived in each lzip member, and distributes them to the
|
||||
|
@@ -805,8 +822,7 @@ the archive.
|
|||
,--------,
|
||||
| file |<---> data to/from each worker below
|
||||
| system |
|
||||
`--------'
|
||||
,------------,
|
||||
`--------' ,------------,
|
||||
,-->| worker 0 |--,
|
||||
| `------------' |
|
||||
,---------, | ,------------, | ,-------, ,--------,
|
||||
|
@@ -870,8 +886,7 @@ possible decoding it safely in parallel.
|
|||
Tarlz is able to automatically decode aligned and unaligned multimember
|
||||
tar.lz archives, keeping backwards compatibility. If tarlz finds a member
|
||||
misalignment during multi-threaded decoding, it switches to single-threaded
|
||||
mode and continues decoding the archive. Currently only the options
|
||||
'--diff' and '--list' are able to do multi-threaded decoding.
|
||||
mode and continues decoding the archive.
|
||||
|
||||
If the files in the archive are large, multi-threaded '--list' on a
|
||||
regular (seekable) tar.lz archive can be hundreds of times faster than
|
||||
|
@@ -886,7 +901,33 @@ example listing the Silesia corpus on a dual core machine:
|
|||
|
||||
On the other hand, multi-threaded '--list' won't detect corruption in
|
||||
the tar member data because it only decodes the part of each lzip member
|
||||
corresponding to the tar member header.
|
||||
corresponding to the tar member header. This is another reason why the tar
|
||||
headers must provide its own integrity checking.
|
||||
|
||||
|
||||
7.1 Limitations of multi-threaded extraction
|
||||
============================================
|
||||
|
||||
Multi-threaded extraction may produce different output than single-threaded
|
||||
extraction in some cases:
|
||||
|
||||
During multi-threaded extraction, several independent processes are
|
||||
simultaneously reading the archive and creating files in the file system.
|
||||
The archive is not read sequentially. As a consequence, any error or
|
||||
weirdness in the archive (like a corrupt member or an EOF block in the
|
||||
middle of the archive) won't be usually detected until part of the archive
|
||||
beyond that point has been processed.
|
||||
|
||||
If the archive contains two or more tar members with the same name,
|
||||
single-threaded extraction extracts the members in the order they appear in
|
||||
the archive and leaves in the file system the last version of the file. But
|
||||
multi-threaded extraction may extract the members in any order and leave in
|
||||
the file system any version of the file nondeterministically. It is
|
||||
unspecified which of the tar members is extracted.
|
||||
|
||||
If the same file is extracted through several paths (different member
|
||||
names resolve to the same file in the file system), the result is undefined.
|
||||
(Probably the resulting file will be mangled).
|
||||
|
||||
|
||||
File: tarlz.info, Node: Minimum archive sizes, Next: Examples, Prev: Multi-threaded decoding, Up: Top
|
||||
|
@@ -1028,22 +1069,22 @@ Concept index
|
|||
|
||||
Tag Table:
|
||||
Node: Top223
|
||||
Node: Introduction1212
|
||||
Node: Invoking tarlz3982
|
||||
Ref: --data-size6193
|
||||
Ref: --bsolid14608
|
||||
Node: Portable character set18244
|
||||
Node: File format18887
|
||||
Ref: key_crc3223812
|
||||
Node: Amendments to pax format29271
|
||||
Ref: crc3229935
|
||||
Ref: flawed-compat31220
|
||||
Node: Program design33865
|
||||
Node: Multi-threaded decoding37756
|
||||
Node: Minimum archive sizes40492
|
||||
Node: Examples42630
|
||||
Node: Problems44345
|
||||
Node: Concept index44873
|
||||
Node: Introduction1214
|
||||
Node: Invoking tarlz4022
|
||||
Ref: --data-size6233
|
||||
Ref: --bsolid14593
|
||||
Node: Portable character set18852
|
||||
Node: File format19495
|
||||
Ref: key_crc3224420
|
||||
Node: Amendments to pax format30021
|
||||
Ref: crc3230685
|
||||
Ref: flawed-compat31970
|
||||
Node: Program design34615
|
||||
Node: Multi-threaded decoding38540
|
||||
Node: Minimum archive sizes42482
|
||||
Node: Examples44620
|
||||
Node: Problems46335
|
||||
Node: Concept index46863
|
||||
|
||||
End Tag Table
|
||||
|
||||
|
|
123
doc/tarlz.texi
|
@@ -6,8 +6,8 @@
|
|||
@finalout
|
||||
@c %**end of header
|
||||
|
||||
@set UPDATED 30 July 2020
|
||||
@set VERSION 0.17
|
||||
@set UPDATED 8 January 2021
|
||||
@set VERSION 0.19
|
||||
|
||||
@dircategory Data Compression
|
||||
@direntry
|
||||
|
@@ -29,6 +29,7 @@
|
|||
@contents
|
||||
@end ifnothtml
|
||||
|
||||
@ifnottex
|
||||
@node Top
|
||||
@top
|
||||
|
||||
|
@@ -49,10 +50,11 @@ This manual is for Tarlz (version @value{VERSION}, @value{UPDATED}).
|
|||
@end menu
|
||||
|
||||
@sp 1
|
||||
Copyright @copyright{} 2013-2020 Antonio Diaz Diaz.
|
||||
Copyright @copyright{} 2013-2021 Antonio Diaz Diaz.
|
||||
|
||||
This manual is free documentation: you have unlimited permission
|
||||
to copy, distribute, and modify it.
|
||||
This manual is free documentation: you have unlimited permission to copy,
|
||||
distribute, and modify it.
|
||||
@end ifnottex
|
||||
|
||||
|
||||
@node Introduction
|
||||
|
@@ -61,13 +63,15 @@ to copy, distribute, and modify it.
|
|||
|
||||
@uref{http://www.nongnu.org/lzip/tarlz.html,,Tarlz} is a massively parallel
|
||||
(multi-threaded) combined implementation of the tar archiver and the
|
||||
@uref{http://www.nongnu.org/lzip/lzip.html,,lzip} compressor. Tarlz creates,
|
||||
lists and extracts archives in a simplified and safer variant of the POSIX
|
||||
pax format compressed with lzip, keeping the alignment between tar members
|
||||
and lzip members. The resulting multimember tar.lz archive is fully backward
|
||||
compatible with standard tar tools like GNU tar, which treat it like any
|
||||
other tar.lz archive. Tarlz can append files to the end of such compressed
|
||||
archives.
|
||||
@uref{http://www.nongnu.org/lzip/lzip.html,,lzip} compressor. Tarlz uses the
|
||||
compression library @uref{http://www.nongnu.org/lzip/lzlib.html,,lzlib}.
|
||||
|
||||
Tarlz creates tar archives using a simplified and safer variant of the POSIX
|
||||
pax format compressed in lzip format, keeping the alignment between tar
|
||||
members and lzip members. The resulting multimember tar.lz archive is fully
|
||||
backward compatible with standard tar tools like GNU tar, which treat it
|
||||
like any other tar.lz archive. Tarlz can append files to the end of such
|
||||
compressed archives.
|
||||
|
||||
Keeping the alignment between tar members and lzip members has two
|
||||
advantages. It adds an indexed lzip layer on top of the tar archive, making
|
||||
|
@@ -76,7 +80,7 @@ amount of data lost in case of corruption. Compressing a tar archive with
|
|||
plzip may even double the amount of files lost for each lzip member damaged
|
||||
because it does not keep the members aligned.
|
||||
|
||||
Tarlz can create tar archives with five levels of compression granularity;
|
||||
Tarlz can create tar archives with five levels of compression granularity:
|
||||
per file (---no-solid), per block (---bsolid, default), per directory
|
||||
(---dsolid), appendable solid (---asolid), and solid (---solid). It can also
|
||||
create uncompressed tar archives.
|
||||
|
@@ -97,17 +101,17 @@ member), and unwanted members can be deleted from the archive. Just
|
|||
like an uncompressed tar archive.
|
||||
|
||||
@item
|
||||
It is a safe POSIX-style backup format. In case of corruption,
|
||||
tarlz can extract all the undamaged members from the tar.lz
|
||||
archive, skipping over the damaged members, just like the standard
|
||||
(uncompressed) tar. Moreover, the option @samp{--keep-damaged} can be
|
||||
used to recover as much data as possible from each damaged member,
|
||||
and lziprecover can be used to recover some of the damaged members.
|
||||
It is a safe POSIX-style backup format. In case of corruption, tarlz
|
||||
can extract all the undamaged members from the tar.lz archive,
|
||||
skipping over the damaged members, just like the standard
|
||||
(uncompressed) tar. Moreover, the option @samp{--keep-damaged} can be used
|
||||
to recover as much data as possible from each damaged member, and
|
||||
lziprecover can be used to recover some of the damaged members.
|
||||
|
||||
@item
|
||||
A multimember tar.lz archive is usually smaller than the
|
||||
corresponding solidly compressed tar.gz archive, except when
|
||||
compressing files smaller than about 32 KiB individually.
|
||||
A multimember tar.lz archive is usually smaller than the corresponding
|
||||
solidly compressed tar.gz archive, except when individually
|
||||
compressing files smaller than about 32 KiB.
|
||||
@end itemize
|
||||
|
||||
Tarlz protects the extended records with a Cyclic Redundancy Check (CRC) in
|
||||
|
@@ -275,8 +279,6 @@ of 0 disables threads entirely. If this option is not used, tarlz tries to
|
|||
detect the number of processors in the system and use it as default value.
|
||||
@w{@samp{tarlz --help}} shows the system's default value. See the note about
|
||||
multi-threaded archive creation in the option @samp{-C} above.
|
||||
Multi-threaded extraction of files from an archive is not yet implemented.
|
||||
@xref{Multi-threaded decoding}.
|
||||
|
||||
Note that the number of usable threads is limited during compression to
|
||||
@w{ceil( uncompressed_size / data_size )} (@pxref{Minimum archive sizes}),
|
||||
|
@@ -316,7 +318,8 @@ List the contents of an archive. If @var{files} are given, list only the
|
|||
|
||||
@item -v
|
||||
@itemx --verbose
|
||||
Verbosely list files processed.
|
||||
Verbosely list files processed. Further -v's (up to 4) increase the
|
||||
verbosity level.
|
||||
|
||||
@item -x
|
||||
@itemx --extract
|
||||
|
@@ -409,8 +412,9 @@ decimal numeric group ID.
|
|||
|
||||
@item --keep-damaged
|
||||
Don't delete partially extracted files. If a decompression error happens
|
||||
while extracting a file, keep the partial data extracted. Use this
|
||||
option to recover as much data as possible from each damaged member.
|
||||
while extracting a file, keep the partial data extracted. Use this option to
|
||||
recover as much data as possible from each damaged member. It is recommended
|
||||
to run tarlz in single-threaded mode (--threads=0) when using this option.
|
||||
|
||||
@item --missing-crc
|
||||
Exit with error status 2 if the CRC of the extended records is missing.
|
||||
|
@@ -429,6 +433,19 @@ number of packets may increase compression speed if the files being archived
|
|||
are larger than @w{64 MiB} compressed, but requires more memory. Valid
|
||||
values range from 1 to 1024. The default value is 64.
|
||||
|
||||
@item --check-lib
|
||||
Compare the
|
||||
@uref{http://www.nongnu.org/lzip/manual/lzlib_manual.html#Library-version,,version of lzlib}
|
||||
used to compile tarlz with the version actually being used and exit. Report
|
||||
any differences found. Exit with error status 1 if differences are found. A
|
||||
mismatch may indicate that lzlib is not correctly installed or that a
|
||||
different version of lzlib has been installed after compiling tarlz.
|
||||
@w{@samp{tarlz -v --check-lib}} shows the version of lzlib being used and
|
||||
the value of @samp{LZ_API_VERSION} (if defined).
|
||||
@ifnothtml
|
||||
@xref{Library version,,,lzlib}.
|
||||
@end ifnothtml
|
||||
|
||||
@ignore
|
||||
@item --permissive
|
||||
Allow some violations of the archive format, like consecutive extended
|
||||
|
@@ -613,8 +630,12 @@ protected by the CRC to guarante that corruption is always detected
|
|||
(except in case of CRC collision). A CRC was chosen because a checksum
|
||||
is too weak for a potentially large list of variable sized records. A
|
||||
checksum can't detect simple errors like the swapping of two bytes.
|
||||
|
||||
@end table
|
||||
|
||||
At verbosity level 1 or higher tarlz prints a diagnostic for each unknown
|
||||
extended header keyword found in an archive, once per keyword.
|
||||
|
||||
@sp 1
|
||||
@section Ustar header block
|
||||
|
||||
|
@@ -839,11 +860,16 @@ or less similar to any other tar and won't be described here. The interesting
|
|||
parts described here are those related to Multi-threaded processing.
|
||||
|
||||
The structure of the part of tarlz performing Multi-threaded archive
|
||||
creation is somewhat similar to that of plzip with the added complication of
|
||||
the solidity levels. A grouper thread and several worker threads are
|
||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
||||
courier" takes care of data transfers among threads and limits the maximum
|
||||
number of data blocks (packets) being processed simultaneously.
|
||||
creation is somewhat similar to that of
|
||||
@uref{http://www.nongnu.org/lzip/plzip.html#Program-design,,plzip} with the
|
||||
added complication of the solidity levels.
|
||||
@ifnothtml
|
||||
@xref{Program design,,,plzip}.
|
||||
@end ifnothtml
|
||||
A grouper thread and several worker threads are created, acting the main
|
||||
thread as muxer (multiplexer) thread. A "packet courier" takes care of data
|
||||
transfers among threads and limits the maximum number of data blocks
|
||||
(packets) being processed simultaneously.
|
||||
|
||||
The grouper traverses the directory tree, groups together the metadata of
|
||||
the files to be archived in each lzip member, and distributes them to the
|
||||
|
@@ -876,8 +902,7 @@ access files in the file system either to read them (diff) or write them
|
|||
,--------,
|
||||
| file |<---> data to/from each worker below
|
||||
| system |
|
||||
`--------'
|
||||
,------------,
|
||||
`--------' ,------------,
|
||||
,-->| worker 0 |--,
|
||||
| `------------' |
|
||||
,---------, | ,------------, | ,-------, ,--------,
|
||||
|
@@ -941,8 +966,7 @@ decoding it safely in parallel.
|
|||
Tarlz is able to automatically decode aligned and unaligned multimember
|
||||
tar.lz archives, keeping backwards compatibility. If tarlz finds a member
|
||||
misalignment during multi-threaded decoding, it switches to single-threaded
|
||||
mode and continues decoding the archive. Currently only the options
|
||||
@samp{--diff} and @samp{--list} are able to do multi-threaded decoding.
|
||||
mode and continues decoding the archive.
|
||||
|
||||
If the files in the archive are large, multi-threaded @samp{--list} on a
|
||||
regular (seekable) tar.lz archive can be hundreds of times faster than
|
||||
|
@@ -959,7 +983,32 @@ time tarlz -tf silesia.tar.lz (0.020s)
|
|||
|
||||
On the other hand, multi-threaded @samp{--list} won't detect corruption in
|
||||
the tar member data because it only decodes the part of each lzip member
|
||||
corresponding to the tar member header.
|
||||
corresponding to the tar member header. This is another reason why the tar
|
||||
headers must provide its own integrity checking.
|
||||
|
||||
@sp 1
|
||||
@section Limitations of multi-threaded extraction
|
||||
|
||||
Multi-threaded extraction may produce different output than single-threaded
|
||||
extraction in some cases:
|
||||
|
||||
During multi-threaded extraction, several independent processes are
|
||||
simultaneously reading the archive and creating files in the file system. The
|
||||
archive is not read sequentially. As a consequence, any error or weirdness
|
||||
in the archive (like a corrupt member or an EOF block in the middle of the
|
||||
archive) won't be usually detected until part of the archive beyond that
|
||||
point has been processed.
|
||||
|
||||
If the archive contains two or more tar members with the same name,
|
||||
single-threaded extraction extracts the members in the order they appear in
|
||||
the archive and leaves in the file system the last version of the file. But
|
||||
multi-threaded extraction may extract the members in any order and leave in
|
||||
the file system any version of the file nondeterministically. It is
|
||||
unspecified which of the tar members is extracted.
|
||||
|
||||
If the same file is extracted through several paths (different member names
|
||||
resolve to the same file in the file system), the result is undefined.
|
||||
(Probably the resulting file will be mangled).
|
||||
|
||||
|
||||
@node Minimum archive sizes
|
||||
|
|
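Reviewer note on the new "Limitations of multi-threaded extraction" section: the text says that when an archive holds two or more members with the same name, multi-threaded extraction may leave any version of the file on disk. The following is a minimal sketch, not part of this commit or of tarlz, of how one might check an archive for repeated member names before extracting with several threads. It assumes the archive is available as an uncompressed tar stream (Python's standard `tarfile` module does not read lzip, so a tar.lz file would have to be decompressed first); the helper names `duplicate_member_names` and `build_demo_archive` are hypothetical.

```python
import io
import tarfile
from collections import Counter


def duplicate_member_names(tar):
    """Return the member names that occur more than once, sorted."""
    counts = Counter(member.name for member in tar.getmembers())
    return sorted(name for name, n in counts.items() if n > 1)


def build_demo_archive():
    """Build a small in-memory uncompressed tar with one repeated name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in [("a.txt", b"v1"), ("b.txt", b"x"), ("a.txt", b"v2")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


# Demo: "a.txt" is stored twice, so it is reported as a duplicate.
with tarfile.open(fileobj=build_demo_archive(), mode="r") as tar:
    duplicates = duplicate_member_names(tar)
print(duplicates)
```

If the list is non-empty, extracting in single-threaded mode (`--threads=0`, as described in the diff above) keeps the behavior the manual documents for that case: the last version in the archive is the one left in the file system.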