Merging upstream version 0.11.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-17 21:12:14 +01:00
parent 3b818501c2
commit 5db1949a73
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
18 changed files with 1504 additions and 654 deletions


@ -1,20 +1,20 @@
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.46.1.
.TH TARLZ "1" "February 2019" "tarlz 0.10a" "User Commands"
.TH TARLZ "1" "February 2019" "tarlz 0.11" "User Commands"
.SH NAME
tarlz \- creates tar archives with multimember lzip compression
.SH SYNOPSIS
.B tarlz
[\fI\,options\/\fR] [\fI\,files\/\fR]
.SH DESCRIPTION
Tarlz is a combined implementation of the tar archiver and the lzip
compressor. By default tarlz creates, lists and extracts archives in a
simplified posix pax format compressed with lzip on a per file basis. Each
tar member is compressed in its own lzip member, as well as the end\-of\-file
blocks. This method adds an indexed lzip layer on top of the tar archive,
making it possible to decode the archive safely in parallel. The resulting
multimember tar.lz archive is fully backward compatible with standard tar
tools like GNU tar, which treat it like any other tar.lz archive. Tarlz can
append files to the end of such compressed archives.
Tarlz is a massively parallel (multi\-threaded) combined implementation of
the tar archiver and the lzip compressor. Tarlz creates, lists and extracts
archives in a simplified posix pax format compressed with lzip, keeping the
alignment between tar members and lzip members. This method adds an indexed
lzip layer on top of the tar archive, making it possible to decode the
archive safely in parallel. The resulting multimember tar.lz archive is
fully backward compatible with standard tar tools like GNU tar, which treat
it like any other tar.lz archive. Tarlz can append files to the end of such
compressed archives.
.PP
The tarlz file format is a safe posix\-style backup format. In case of
corruption, tarlz can extract all the undamaged members from the tar.lz
@ -46,7 +46,7 @@ change to directory <dir>
use archive file <archive>
.TP
\fB\-n\fR, \fB\-\-threads=\fR<n>
set number of decompression threads [2]
set number of (de)compression threads [2]
.TP
\fB\-q\fR, \fB\-\-quiet\fR
suppress all messages
@ -70,13 +70,13 @@ set compression level [default 6]
create solidly compressed appendable archive
.TP
\fB\-\-bsolid\fR
create per\-data\-block compressed archive
create per block compressed archive (default)
.TP
\fB\-\-dsolid\fR
create per\-directory compressed archive
create per directory compressed archive
.TP
\fB\-\-no\-solid\fR
create per\-file compressed archive (default)
create per file compressed archive
.TP
\fB\-\-solid\fR
create solidly compressed archive


@ -11,7 +11,7 @@ File: tarlz.info, Node: Top, Next: Introduction, Up: (dir)
Tarlz Manual
************
This manual is for Tarlz (version 0.10, 31 January 2019).
This manual is for Tarlz (version 0.11, 13 February 2019).
* Menu:
@ -20,6 +20,7 @@ This manual is for Tarlz (version 0.10, 31 January 2019).
* File format:: Detailed format of the compressed archive
* Amendments to pax format:: The reasons for the differences with pax
* Multi-threaded tar:: Limitations of parallel tar decoding
* Minimum archive sizes:: Sizes required for full multi-threaded speed
* Examples:: A small tutorial with examples
* Problems:: Reporting bugs
* Concept index:: Index of concepts
@ -36,23 +37,23 @@ File: tarlz.info, Node: Introduction, Next: Invoking tarlz, Prev: Top, Up: T
1 Introduction
**************
Tarlz is a combined implementation of the tar archiver and the lzip
compressor. By default tarlz creates, lists and extracts archives in a
simplified posix pax format compressed with lzip on a per file basis.
Each tar member is compressed in its own lzip member, as well as the
end-of-file blocks. This method adds an indexed lzip layer on top of
the tar archive, making it possible to decode the archive safely in
parallel. The resulting multimember tar.lz archive is fully backward
compatible with standard tar tools like GNU tar, which treat it like
any other tar.lz archive. Tarlz can append files to the end of such
compressed archives.
Tarlz is a massively parallel (multi-threaded) combined implementation
of the tar archiver and the lzip compressor. Tarlz creates, lists and
extracts archives in a simplified posix pax format compressed with
lzip, keeping the alignment between tar members and lzip members. This
method adds an indexed lzip layer on top of the tar archive, making it
possible to decode the archive safely in parallel. The resulting
multimember tar.lz archive is fully backward compatible with standard
tar tools like GNU tar, which treat it like any other tar.lz archive.
Tarlz can append files to the end of such compressed archives.
Tarlz can create tar archives with four levels of compression
granularity; per file, per directory, appendable solid, and solid.
Tarlz can create tar archives with five levels of compression
granularity; per file, per block, per directory, appendable solid, and
solid.
Of course, compressing each file (or each directory) individually is
less efficient than compressing the whole tar archive, but it has the
following advantages:
Of course, compressing each file (or each directory) individually can't
achieve a compression ratio as high as compressing solidly the whole tar
archive, but it has the following advantages:
* The resulting multimember tar.lz archive can be decompressed in
parallel, multiplying the decompression speed.
@ -87,17 +88,23 @@ The format for running tarlz is:
tarlz [OPTIONS] [FILES]
On archive creation or appending, tarlz removes leading and trailing
slashes from filenames, as well as filename prefixes containing a '..'
component. On extraction, archive members containing a '..' component
are skipped. Tarlz detects when the archive being created or enlarged
is among the files to be dumped, appended or concatenated, and skips it.
On archive creation or appending tarlz archives the files specified, but
removes from member names any leading and trailing slashes and any
filename prefixes containing a '..' component. On extraction, leading
and trailing slashes are also removed from member names, and archive
members containing a '..' component in the filename are skipped. Tarlz
detects when the archive being created or enlarged is among the files
to be dumped, appended or concatenated, and skips it.
On extraction and listing, tarlz removes leading './' strings from
member names in the archive or given in the command line, so that
'tarlz -xf foo ./bar baz' extracts members 'bar' and './baz' from
archive 'foo'.
If several compression levels or '--*solid' options are given, the
last setting is used. For example '-9 --solid --uncompressed -1' is
equivalent to '-1 --solid'
tarlz supports the following options:
'-h'
@ -125,7 +132,7 @@ archive 'foo'.
Set target size of input data blocks for the '--bsolid' option.
Valid values range from 8 KiB to 1 GiB. Default value is two times
the dictionary size, except for option '-0' where it defaults to
1 MiB.
1 MiB. *Note Minimum archive sizes::.
'-c'
'--create'
@ -142,6 +149,11 @@ archive 'foo'.
relative to the then current working directory, perhaps changed by
a previous '-C' option.
Note that a process can only have one current working directory
(CWD). Therefore multi-threading can't be used to create an
archive if a '-C' option appears after a relative filename in the
command line.
'-f ARCHIVE'
'--file=ARCHIVE'
Use archive file ARCHIVE. '-' used as an ARCHIVE argument reads
@ -149,18 +161,21 @@ archive 'foo'.
'-n N'
'--threads=N'
Set the number of decompression threads, overriding the system's
Set the number of (de)compression threads, overriding the system's
default. Valid values range from 0 to "as many as your system can
support". A value of 0 disables threads entirely. If this option
is not used, tarlz tries to detect the number of processors in the
system and use it as default value. 'tarlz --help' shows the
system's default value. This option currently only has effect when
listing the contents of a multimember compressed archive. *Note
system's default value. See the note about multi-threaded archive
creation in the '-C' option above. Multi-threaded extraction of
files from an archive is not yet implemented. *Note
Multi-threaded tar::.
Note that the number of usable threads is limited during
decompression to the number of lzip members in the tar.lz archive,
which you can find by running 'lzip -lv archive.tar.lz'.
compression to ceil( uncompressed_size / data_size ) (*note
Minimum archive sizes::), and during decompression to the number
of lzip members in the tar.lz archive, which you can find by
running 'lzip -lv archive.tar.lz'.
'-q'
'--quiet'
@ -180,7 +195,7 @@ archive 'foo'.
'-t'
'--list'
List the contents of an archive. If FILES are given, list only the
given FILES.
FILES given.
'-v'
'--verbose'
@ -189,7 +204,7 @@ archive 'foo'.
'-x'
'--extract'
Extract files from an archive. If FILES are given, extract only
the given FILES. Else extract all the files in the archive.
the FILES given. Else extract all the files in the archive.
'-0 .. -9'
Set the compression level. The default compression level is '-6'.
@ -214,38 +229,43 @@ archive 'foo'.
solid compression. All the files being added to the archive are
compressed into a single lzip member, but the end-of-file blocks
are compressed into a separate lzip member. This creates a solidly
compressed appendable archive.
compressed appendable archive. Solid archives can't be created
nor decoded in parallel.
'--bsolid'
When creating or appending to a compressed archive, compress tar
members together in a lzip member until they approximate a target
uncompressed size. The size can't be exact because each solidly
compressed data block must contain an integer number of tar
members. This option improves compression efficiency for archives
with lots of small files. *Note --data-size::, to set the target
When creating or appending to a compressed archive, use block
compression. Tar members are compressed together in a lzip member
until they approximate a target uncompressed size. The size can't
be exact because each solidly compressed data block must contain
an integer number of tar members. Block compression is the default
because it improves compression ratio for archives with many files
smaller than the block size. This option allows tarlz revert to
default behavior if, for example, it is invoked through an alias
like 'tar='tarlz --solid''. *Note --data-size::, to set the target
block size.
'--dsolid'
When creating or appending to a compressed archive, use solid
compression for each directory especified in the command line. The
end-of-file blocks are compressed into a separate lzip member. This
creates a compressed appendable archive with a separate lzip
member for each top-level directory.
When creating or appending to a compressed archive, compress each
file specified in the command line separately in its own lzip
member, and use solid compression for each directory specified in
the command line. The end-of-file blocks are compressed into a
separate lzip member. This creates a compressed appendable archive
with a separate lzip member for each file or top-level directory
specified.
'--no-solid'
When creating or appending to a compressed archive, compress each
file separately. The end-of-file blocks are compressed into a
separate lzip member. This creates a compressed appendable archive
with a separate lzip member for each file. This option allows
tarlz revert to default behavior if, for example, tarlz is invoked
through an alias like 'tar='tarlz --solid''.
file separately in its own lzip member. The end-of-file blocks are
compressed into a separate lzip member. This creates a compressed
appendable archive with a lzip member for each file.
'--solid'
When creating or appending to a compressed archive, use solid
compression. The files being added to the archive, along with the
compression. The files being added to the archive, along with the
end-of-file blocks, are compressed into a single lzip member. The
resulting archive is not appendable. No more files can be later
appended to the archive.
appended to the archive. Solid archives can't be created nor
decoded in parallel.
'--anonymous'
Equivalent to '--owner=root --group=root'.
@ -341,9 +361,9 @@ blocks are either compressed in a separate lzip member or compressed
along with the tar members contained in the last lzip member.
The diagram below shows the correspondence between each tar member
(formed by one or two headers plus optional data) in the tar archive and
each lzip member in the resulting multimember tar.lz archive: *Note
File format: (lzip)File format.
(formed by one or two headers plus optional data) in the tar archive
and each lzip member in the resulting multimember tar.lz archive, when
per file compression is used: *Note File format: (lzip)File format.
tar
+========+======+=================+===============+========+======+========+
@ -612,12 +632,12 @@ wasteful for a backup format.
There is no portable way to tell what charset a text string is coded
into. Therefore, tarlz stores all fields representing text strings
as-is, without conversion to UTF-8 nor any other transformation. This
prevents accidental double UTF-8 conversions. If the need arises this
behavior will be adjusted with a command line option in the future.
unmodified, without conversion to UTF-8 nor any other transformation.
This prevents accidental double UTF-8 conversions. If the need arises
this behavior will be adjusted with a command line option in the future.

File: tarlz.info, Node: Multi-threaded tar, Next: Examples, Prev: Amendments to pax format, Up: Top
File: tarlz.info, Node: Multi-threaded tar, Next: Minimum archive sizes, Prev: Amendments to pax format, Up: Top
5 Limitations of parallel tar decoding
**************************************
@ -659,15 +679,53 @@ sequential '--list' because, in addition to using several processors,
it only needs to decompress part of each lzip member. See the following
example listing the Silesia corpus on a dual core machine:
tarlz -9 -cf silesia.tar.lz silesia
tarlz -9 --no-solid -cf silesia.tar.lz silesia
time lzip -cd silesia.tar.lz | tar -tf - (5.032s)
time plzip -cd silesia.tar.lz | tar -tf - (3.256s)
time tarlz -tf silesia.tar.lz (0.020s)

File: tarlz.info, Node: Examples, Next: Problems, Prev: Multi-threaded tar, Up: Top
File: tarlz.info, Node: Minimum archive sizes, Next: Examples, Prev: Multi-threaded tar, Up: Top
6 A small tutorial with examples
6 Minimum archive sizes required for multi-threaded block compression
*********************************************************************
When creating or appending to a compressed archive using multi-threaded
block compression, tarlz puts tar members together in blocks and
compresses as many blocks simultaneously as worker threads are chosen,
creating a multimember compressed archive.
For this to work as expected (and roughly multiply the compression
speed by the number of available processors), the uncompressed archive
must be at least as large as the number of worker threads times the
block size (*note --data-size::). Else some processors will not get any
data to compress, and compression will be proportionally slower. The
maximum speed increase achievable on a given file is limited by the
ratio (uncompressed_size / data_size). For example, a tarball the size
of gcc or linux will scale up to 10 or 12 processors at level -9.
The following table shows the minimum uncompressed archive size
needed for full use of N processors at a given compression level, using
the default data size for each level:
Processors 2 4 8 16 64 256
------------------------------------------------------------------
Level
-0 2 MiB 4 MiB 8 MiB 16 MiB 64 MiB 256 MiB
-1 4 MiB 8 MiB 16 MiB 32 MiB 128 MiB 512 MiB
-2 6 MiB 12 MiB 24 MiB 48 MiB 192 MiB 768 MiB
-3 8 MiB 16 MiB 32 MiB 64 MiB 256 MiB 1 GiB
-4 12 MiB 24 MiB 48 MiB 96 MiB 384 MiB 1.5 GiB
-5 16 MiB 32 MiB 64 MiB 128 MiB 512 MiB 2 GiB
-6 32 MiB 64 MiB 128 MiB 256 MiB 1 GiB 4 GiB
-7 64 MiB 128 MiB 256 MiB 512 MiB 2 GiB 8 GiB
-8 96 MiB 192 MiB 384 MiB 768 MiB 3 GiB 12 GiB
-9 128 MiB 256 MiB 512 MiB 1 GiB 4 GiB 16 GiB

File: tarlz.info, Node: Examples, Next: Problems, Prev: Minimum archive sizes, Up: Top
7 A small tutorial with examples
********************************
Example 1: Create a multimember compressed archive 'archive.tar.lz'
@ -725,7 +783,7 @@ Example 8: Copy the contents of directory 'sourcedir' to the directory

File: tarlz.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
7 Reporting bugs
8 Reporting bugs
****************
There are probably bugs in tarlz. There are certainly errors and
@ -754,6 +812,7 @@ Concept index
* getting help: Problems. (line 6)
* introduction: Introduction. (line 6)
* invoking: Invoking tarlz. (line 6)
* minimum archive sizes: Minimum archive sizes. (line 6)
* options: Invoking tarlz. (line 6)
* usage: Invoking tarlz. (line 6)
* version: Invoking tarlz. (line 6)
@ -762,18 +821,19 @@ Concept index

Tag Table:
Node: Top223
Node: Introduction1013
Node: Invoking tarlz3125
Ref: --data-size4717
Node: File format11536
Ref: key_crc3216321
Node: Amendments to pax format21738
Ref: crc3222262
Ref: flawed-compat23287
Node: Multi-threaded tar25649
Node: Examples28164
Node: Problems29830
Node: Concept index30356
Node: Introduction1089
Node: Invoking tarlz3218
Ref: --data-size5097
Node: File format12673
Ref: key_crc3217493
Node: Amendments to pax format22910
Ref: crc3223434
Ref: flawed-compat24459
Node: Multi-threaded tar26826
Node: Minimum archive sizes29365
Node: Examples31495
Node: Problems33164
Node: Concept index33690

End Tag Table
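
Note on the tarlz.info hunks: the table in the new "Minimum archive sizes" node follows from the rule stated in the same node, that the uncompressed archive must be at least (worker threads x block size), and from the thread limit ceil( uncompressed_size / data_size ) given under '--threads'. The sketch below reproduces that arithmetic. It is an illustration only, not code from tarlz: the function names are hypothetical, and the per-level default block sizes are an assumption obtained by halving the 2-processor column of the table.

```python
MIB = 1 << 20  # bytes in one MiB

# Assumed default --data-size per compression level, in MiB.
# Derived from the table (2-processor column divided by 2), not from tarlz source.
DEFAULT_DATA_SIZE_MIB = {0: 1, 1: 2, 2: 3, 3: 4, 4: 6, 5: 8, 6: 16, 7: 32, 8: 48, 9: 64}


def min_archive_size(level, processors):
    """Minimum uncompressed size (bytes) needed to keep `processors` threads busy."""
    return processors * DEFAULT_DATA_SIZE_MIB[level] * MIB


def usable_compression_threads(uncompressed_size, data_size, requested):
    """Threads that can get a block: min(requested, ceil(uncompressed_size / data_size))."""
    blocks = -(-uncompressed_size // data_size)  # integer ceiling division
    return min(requested, blocks)


if __name__ == "__main__":
    for level in range(10):
        row = [min_archive_size(level, n) // MIB for n in (2, 4, 8, 16, 64, 256)]
        print(f"-{level}", *(f"{mib} MiB" for mib in row))
```

Under these assumptions, a 100 MiB archive compressed at level -9 (64 MiB blocks) yields only two blocks, so requesting eight threads would still leave six of them idle.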


@ -6,8 +6,8 @@
@finalout
@c %**end of header
@set UPDATED 31 January 2019
@set VERSION 0.10
@set UPDATED 13 February 2019
@set VERSION 0.11
@dircategory Data Compression
@direntry
@ -40,6 +40,7 @@ This manual is for Tarlz (version @value{VERSION}, @value{UPDATED}).
* File format:: Detailed format of the compressed archive
* Amendments to pax format:: The reasons for the differences with pax
* Multi-threaded tar:: Limitations of parallel tar decoding
* Minimum archive sizes:: Sizes required for full multi-threaded speed
* Examples:: A small tutorial with examples
* Problems:: Reporting bugs
* Concept index:: Index of concepts
@ -56,25 +57,24 @@ to copy, distribute and modify it.
@chapter Introduction
@cindex introduction
@uref{http://www.nongnu.org/lzip/tarlz.html,,Tarlz} is a combined
implementation of the tar archiver and the
@uref{http://www.nongnu.org/lzip/lzip.html,,lzip} compressor. By default
tarlz creates, lists and extracts archives in a simplified posix pax format
compressed with lzip on a per file basis. Each tar member is compressed in
its own lzip member, as well as the end-of-file blocks. This method adds an
indexed lzip layer on top of the tar archive, making it possible to decode
the archive safely in parallel. The resulting multimember tar.lz archive is
fully backward compatible with standard tar tools like GNU tar, which treat
it like any other tar.lz archive. Tarlz can append files to the end of such
compressed archives.
@uref{http://www.nongnu.org/lzip/tarlz.html,,Tarlz} is a massively parallel
(multi-threaded) combined implementation of the tar archiver and the
@uref{http://www.nongnu.org/lzip/lzip.html,,lzip} compressor. Tarlz creates,
lists and extracts archives in a simplified posix pax format compressed with
lzip, keeping the alignment between tar members and lzip members. This
method adds an indexed lzip layer on top of the tar archive, making it
possible to decode the archive safely in parallel. The resulting multimember
tar.lz archive is fully backward compatible with standard tar tools like GNU
tar, which treat it like any other tar.lz archive. Tarlz can append files to
the end of such compressed archives.
Tarlz can create tar archives with four levels of compression granularity;
per file, per directory, appendable solid, and solid.
Tarlz can create tar archives with five levels of compression granularity;
per file, per block, per directory, appendable solid, and solid.
@noindent
Of course, compressing each file (or each directory) individually is
less efficient than compressing the whole tar archive, but it has the
following advantages:
Of course, compressing each file (or each directory) individually can't
achieve a compression ratio as high as compressing solidly the whole tar
archive, but it has the following advantages:
@itemize @bullet
@item
@ -120,18 +120,23 @@ tarlz [@var{options}] [@var{files}]
@end example
@noindent
On archive creation or appending, tarlz removes leading and trailing
slashes from filenames, as well as filename prefixes containing a
@samp{..} component. On extraction, archive members containing a
@samp{..} component are skipped. Tarlz detects when the archive being
created or enlarged is among the files to be dumped, appended or
concatenated, and skips it.
On archive creation or appending tarlz archives the files specified, but
removes from member names any leading and trailing slashes and any filename
prefixes containing a @samp{..} component. On extraction, leading and
trailing slashes are also removed from member names, and archive members
containing a @samp{..} component in the filename are skipped. Tarlz detects
when the archive being created or enlarged is among the files to be dumped,
appended or concatenated, and skips it.
On extraction and listing, tarlz removes leading @samp{./} strings from
member names in the archive or given in the command line, so that
@w{@code{tarlz -xf foo ./bar baz}} extracts members @samp{bar} and
@samp{./baz} from archive @samp{foo}.
If several compression levels or @samp{--*solid} options are given, the last
setting is used. For example @w{@samp{-9 --solid --uncompressed -1}} is
equivalent to @samp{-1 --solid}
tarlz supports the following options:
@table @code
@ -160,6 +165,7 @@ specified. Tarlz can't concatenate uncompressed tar archives.
Set target size of input data blocks for the @samp{--bsolid} option. Valid
values range from @w{8 KiB} to @w{1 GiB}. Default value is two times the
dictionary size, except for option @samp{-0} where it defaults to @w{1 MiB}.
@xref{Minimum archive sizes}.
@item -c
@itemx --create
@ -176,6 +182,10 @@ extraction. Listing ignores any @samp{-C} options specified. @var{dir}
is relative to the then current working directory, perhaps changed by a
previous @samp{-C} option.
Note that a process can only have one current working directory (CWD).
Therefore multi-threading can't be used to create an archive if a @samp{-C}
option appears after a relative filename in the command line.
@item -f @var{archive}
@itemx --file=@var{archive}
Use archive file @var{archive}. @samp{-} used as an @var{archive}
@ -183,17 +193,19 @@ argument reads from standard input or writes to standard output.
@item -n @var{n}
@itemx --threads=@var{n}
Set the number of decompression threads, overriding the system's default.
Set the number of (de)compression threads, overriding the system's default.
Valid values range from 0 to "as many as your system can support". A value
of 0 disables threads entirely. If this option is not used, tarlz tries to
detect the number of processors in the system and use it as default value.
@w{@samp{tarlz --help}} shows the system's default value. This option
currently only has effect when listing the contents of a multimember
compressed archive. @xref{Multi-threaded tar}.
@w{@samp{tarlz --help}} shows the system's default value. See the note about
multi-threaded archive creation in the @samp{-C} option above.
Multi-threaded extraction of files from an archive is not yet implemented.
@xref{Multi-threaded tar}.
Note that the number of usable threads is limited during decompression to
the number of lzip members in the tar.lz archive, which you can find by
running @w{@code{lzip -lv archive.tar.lz}}.
Note that the number of usable threads is limited during compression to
@w{ceil( uncompressed_size / data_size )} (@pxref{Minimum archive sizes}),
and during decompression to the number of lzip members in the tar.lz
archive, which you can find by running @w{@code{lzip -lv archive.tar.lz}}.
@item -q
@itemx --quiet
@ -213,7 +225,7 @@ to an uncompressed tar archive.
@item -t
@itemx --list
List the contents of an archive. If @var{files} are given, list only the
given @var{files}.
@var{files} given.
@item -v
@itemx --verbose
@ -222,7 +234,7 @@ Verbosely list files processed.
@item -x
@itemx --extract
Extract files from an archive. If @var{files} are given, extract only
the given @var{files}. Else extract all the files in the archive.
the @var{files} given. Else extract all the files in the archive.
@item -0 .. -9
Set the compression level. The default compression level is @samp{-6}.
@ -245,40 +257,42 @@ it creates, reducing the amount of memory required for decompression.
@item --asolid
When creating or appending to a compressed archive, use appendable solid
compression. All the files being added to the archive are compressed
into a single lzip member, but the end-of-file blocks are compressed
into a separate lzip member. This creates a solidly compressed
appendable archive.
compression. All the files being added to the archive are compressed into a
single lzip member, but the end-of-file blocks are compressed into a
separate lzip member. This creates a solidly compressed appendable archive.
Solid archives can't be created nor decoded in parallel.
@item --bsolid
When creating or appending to a compressed archive, compress tar members
together in a lzip member until they approximate a target uncompressed size.
The size can't be exact because each solidly compressed data block must
contain an integer number of tar members. This option improves compression
efficiency for archives with lots of small files. @xref{--data-size}, to set
the target block size.
When creating or appending to a compressed archive, use block compression.
Tar members are compressed together in a lzip member until they approximate
a target uncompressed size. The size can't be exact because each solidly
compressed data block must contain an integer number of tar members. Block
compression is the default because it improves compression ratio for
archives with many files smaller than the block size. This option allows
tarlz revert to default behavior if, for example, it is invoked through an
alias like @code{tar='tarlz --solid'}. @xref{--data-size}, to set the target
block size.
@item --dsolid
When creating or appending to a compressed archive, use solid
compression for each directory especified in the command line. The
end-of-file blocks are compressed into a separate lzip member. This
creates a compressed appendable archive with a separate lzip member for
each top-level directory.
When creating or appending to a compressed archive, compress each file
specified in the command line separately in its own lzip member, and use
solid compression for each directory specified in the command line. The
end-of-file blocks are compressed into a separate lzip member. This creates
a compressed appendable archive with a separate lzip member for each file or
top-level directory specified.
@item --no-solid
When creating or appending to a compressed archive, compress each file
separately. The end-of-file blocks are compressed into a separate lzip
member. This creates a compressed appendable archive with a separate
lzip member for each file. This option allows tarlz revert to default
behavior if, for example, tarlz is invoked through an alias like
@code{tar='tarlz --solid'}.
separately in its own lzip member. The end-of-file blocks are compressed
into a separate lzip member. This creates a compressed appendable archive
with a lzip member for each file.
@item --solid
When creating or appending to a compressed archive, use solid
compression. The files being added to the archive, along with the
end-of-file blocks, are compressed into a single lzip member. The
resulting archive is not appendable. No more files can be later appended
to the archive.
When creating or appending to a compressed archive, use solid compression.
The files being added to the archive, along with the end-of-file blocks, are
compressed into a single lzip member. The resulting archive is not
appendable. No more files can be later appended to the archive. Solid
archives can't be created nor decoded in parallel.
@item --anonymous
Equivalent to @samp{--owner=root --group=root}.
@ -388,11 +402,11 @@ binary zeros, interpreted as an end-of-archive indicator. These EOF
blocks are either compressed in a separate lzip member or compressed
along with the tar members contained in the last lzip member.
The diagram below shows the correspondence between each tar member
(formed by one or two headers plus optional data) in the tar archive and
each
The diagram below shows the correspondence between each tar member (formed
by one or two headers plus optional data) in the tar archive and each
@uref{http://www.nongnu.org/lzip/manual/lzip_manual.html#File-format,,lzip member}
in the resulting multimember tar.lz archive:
in the resulting multimember tar.lz archive, when per file compression is
used:
@ifnothtml
@xref{File format,,,lzip}.
@end ifnothtml
@ -672,10 +686,10 @@ format.
@section Avoid misconversions to/from UTF-8
There is no portable way to tell what charset a text string is coded into.
Therefore, tarlz stores all fields representing text strings as-is, without
conversion to UTF-8 nor any other transformation. This prevents accidental
double UTF-8 conversions. If the need arises this behavior will be adjusted
with a command line option in the future.
Therefore, tarlz stores all fields representing text strings unmodified,
without conversion to UTF-8 nor any other transformation. This prevents
accidental double UTF-8 conversions. If the need arises this behavior will
be adjusted with a command line option in the future.
@node Multi-threaded tar
@ -717,13 +731,51 @@ it only needs to decompress part of each lzip member. See the following
example listing the Silesia corpus on a dual core machine:
@example
tarlz -9 -cf silesia.tar.lz silesia
tarlz -9 --no-solid -cf silesia.tar.lz silesia
time lzip -cd silesia.tar.lz | tar -tf - (5.032s)
time plzip -cd silesia.tar.lz | tar -tf - (3.256s)
time tarlz -tf silesia.tar.lz (0.020s)
@end example
@node Minimum archive sizes
@chapter Minimum archive sizes required for multi-threaded block compression
@cindex minimum archive sizes
When creating or appending to a compressed archive using multi-threaded
block compression, tarlz puts tar members together in blocks and compresses
as many blocks simultaneously as worker threads are chosen, creating a
multimember compressed archive.
For this to work as expected (and roughly multiply the compression speed by
the number of available processors), the uncompressed archive must be at
least as large as the number of worker threads times the block size
(@pxref{--data-size}). Else some processors will not get any data to
compress, and compression will be proportionally slower. The maximum speed
increase achievable on a given file is limited by the ratio
@w{(uncompressed_size / data_size)}. For example, a tarball the size of gcc
or linux will scale up to 10 or 12 processors at level -9.
The following table shows the minimum uncompressed archive size needed for
full use of N processors at a given compression level, using the default
data size for each level:
@multitable {Processors} {512 MiB} {512 MiB} {512 MiB} {512 MiB} {512 MiB} {512 MiB}
@headitem Processors @tab 2 @tab 4 @tab 8 @tab 16 @tab 64 @tab 256
@item Level
@item -0 @tab 2 MiB @tab 4 MiB @tab 8 MiB @tab 16 MiB @tab 64 MiB @tab 256 MiB
@item -1 @tab 4 MiB @tab 8 MiB @tab 16 MiB @tab 32 MiB @tab 128 MiB @tab 512 MiB
@item -2 @tab 6 MiB @tab 12 MiB @tab 24 MiB @tab 48 MiB @tab 192 MiB @tab 768 MiB
@item -3 @tab 8 MiB @tab 16 MiB @tab 32 MiB @tab 64 MiB @tab 256 MiB @tab 1 GiB
@item -4 @tab 12 MiB @tab 24 MiB @tab 48 MiB @tab 96 MiB @tab 384 MiB @tab 1.5 GiB
@item -5 @tab 16 MiB @tab 32 MiB @tab 64 MiB @tab 128 MiB @tab 512 MiB @tab 2 GiB
@item -6 @tab 32 MiB @tab 64 MiB @tab 128 MiB @tab 256 MiB @tab 1 GiB @tab 4 GiB
@item -7 @tab 64 MiB @tab 128 MiB @tab 256 MiB @tab 512 MiB @tab 2 GiB @tab 8 GiB
@item -8 @tab 96 MiB @tab 192 MiB @tab 384 MiB @tab 768 MiB @tab 3 GiB @tab 12 GiB
@item -9 @tab 128 MiB @tab 256 MiB @tab 512 MiB @tab 1 GiB @tab 4 GiB @tab 16 GiB
@end multitable
@node Examples
@chapter A small tutorial with examples
@cindex examples