Merging upstream version 0.24.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent b3a4316df0
commit d842f57fc5
33 changed files with 905 additions and 882 deletions
@@ -1,5 +1,5 @@
-.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.16.
-.TH TARLZ "1" "September 2022" "tarlz 0.23" "User Commands"
+.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2.
+.TH TARLZ "1" "September 2023" "tarlz 0.24" "User Commands"
 .SH NAME
 tarlz \- creates tar archives with multimember lzip compression
 .SH SYNOPSIS
@@ -80,7 +80,7 @@ follow symlinks; archive the files they point to
 set number of (de)compression threads [2]
 .TP
 \fB\-o\fR, \fB\-\-output=\fR<file>
-compress to <file>
+compress to <file> ('\-' for stdout)
 .TP
 \fB\-p\fR, \fB\-\-preserve\-permissions\fR
 don't subtract the umask on extraction
@@ -157,7 +157,7 @@ Report bugs to lzip\-bug@nongnu.org
 .br
 Tarlz home page: http://www.nongnu.org/lzip/tarlz.html
 .SH COPYRIGHT
-Copyright \(co 2022 Antonio Diaz Diaz.
+Copyright \(co 2023 Antonio Diaz Diaz.
 Using lzlib 1.13
 License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
 .br
doc/tarlz.info (140 lines changed)
@@ -11,7 +11,7 @@ File: tarlz.info, Node: Top, Next: Introduction, Up: (dir)
 Tarlz Manual
 ************
 
-This manual is for Tarlz (version 0.23, 23 September 2022).
+This manual is for Tarlz (version 0.24, 20 September 2023).
 
 * Menu:
 
@@ -28,7 +28,7 @@ This manual is for Tarlz (version 0.23, 23 September 2022).
 * Concept index:: Index of concepts
 
 
-Copyright (C) 2013-2022 Antonio Diaz Diaz.
+Copyright (C) 2013-2023 Antonio Diaz Diaz.
 
 This manual is free documentation: you have unlimited permission to copy,
 distribute, and modify it.
@@ -58,9 +58,9 @@ plzip may even double the amount of files lost for each lzip member damaged
 because it does not keep the members aligned.
 
 Tarlz can create tar archives with five levels of compression
-granularity: per file (--no-solid), per block (--bsolid, default), per
-directory (--dsolid), appendable solid (--asolid), and solid (--solid). It
-can also create uncompressed tar archives.
+granularity: per file ('--no-solid'), per block ('--bsolid', default), per
+directory ('--dsolid'), appendable solid ('--asolid'), and solid
+('--solid'). It can also create uncompressed tar archives.
 
 Of course, compressing each file (or each directory) individually can't
 achieve a compression ratio as high as compressing solidly the whole tar
@@ -87,9 +87,9 @@ archive, but it has the following advantages:
 Tarlz protects the extended records with a Cyclic Redundancy Check (CRC)
 in a way compatible with standard tar tools. *Note crc32::.
 
-Tarlz does not understand other tar formats like 'gnu', 'oldgnu', 'star'
-or 'v7'. The command 'tarlz -tf archive.tar.lz > /dev/null' can be used to
-verify that the format of the archive is compatible with tarlz.
+Tarlz does not understand other tar formats like 'gnu', 'oldgnu',
+'star', or 'v7'. The command 'tarlz -t -f archive.tar.lz > /dev/null' can
+be used to check that the format of the archive is compatible with tarlz.
 
 
 File: tarlz.info, Node: Invoking tarlz, Next: Portable character set, Prev: Introduction, Up: Top
@@ -140,7 +140,7 @@ to '-1 --solid'.
 '-A'
 '--concatenate'
 Append one or more archives to the end of an archive. If no archive is
-specified with the option '-f', the input archives are concatenated to
+specified with the option '-f', concatenate the input archives to
 standard output. All the archives involved must be regular (seekable)
 files, and must be either all compressed or all uncompressed.
 Compressed and uncompressed archives can't be mixed. Compressed
@@ -163,7 +163,7 @@ to '-1 --solid'.
 '-d'
 '--diff'
 Compare and report differences between archive and file system. For
-each tar member in the archive, verify that the corresponding file in
+each tar member in the archive, check that the corresponding file in
 the file system exists and is of the same type (regular file,
 directory, etc). Report on standard output the differences found in
 type, mode (permissions), owner and group IDs, modification time, file
@@ -224,22 +224,25 @@ to '-1 --solid'.
 directory without extracting the files under it, use
 'tarlz -xf foo --exclude='dir/*' dir'. Tarlz removes files and empty
 directories unconditionally before extracting over them. Other than
-that, it will not make any special effort to extract a file over an
+that, it does not make any special effort to extract a file over an
 incompatible type of file. For example, extracting a file over a
-non-empty directory will usually fail.
+non-empty directory usually fails.
 
 '-z'
 '--compress'
 Compress existing POSIX tar archives aligning the lzip members to the
-tar members with choice of granularity (--bsolid by default, --dsolid
-works like --asolid). The input archives are kept unchanged. Existing
-compressed archives are not overwritten. A hyphen '-' used as the name
-of an input archive reads from standard input and writes to standard
-output (unless the option '--output' is used). Tarlz can be used as
-compressor for GNU tar using a command like
-'tar -c -Hustar foo | tarlz -z -o foo.tar.lz'. Note that tarlz only
-works reliably on archives without global headers, or with global
-headers whose content can be ignored.
+tar members with choice of granularity ('--bsolid' by default,
+'--dsolid' works like '--asolid'). Exit with error status 2 if any
+input archive is an empty file. The input archives are kept unchanged.
+Existing compressed archives are not overwritten. A hyphen '-' used as
+the name of an input archive reads from standard input and writes to
+standard output (unless the option '--output' is used). Tarlz can be
+used as compressor for GNU tar by using a command like
+'tar -c -Hustar foo | tarlz -z -o foo.tar.lz'. Tarlz can be used as
+compressor for zupdate (zutils) by using a command like
+'zupdate --lz="tarlz -z" foo.tar.gz'. Note that tarlz only works
+reliably on archives without global headers, or with global headers
+whose content can be ignored.
 
 The compression is reversible, including any garbage present after the
 end-of-archive blocks. Tarlz stops parsing after the first
@@ -277,18 +280,18 @@ to '-1 --solid'.
 
 '-C DIR'
 '--directory=DIR'
-Change to directory DIR. When creating or appending, the position of
-each '-C' option in the command line is significant; it will change the
-current working directory for the following FILES until a new '-C'
-option appears in the command line. When extracting or comparing, all
-the '-C' options are executed in sequence before reading the archive.
-Listing ignores any '-C' options specified. DIR is relative to the
-then current working directory, perhaps changed by a previous '-C'
+Change to directory DIR. When creating, appending, comparing, or
+extracting, the position of each '-C' option in the command line is
+significant; it changes the current working directory for the following
+FILES until a new '-C' option appears in the command line. '--list'
+and '--delete' ignore any '-C' options specified. DIR is relative to
+the then current working directory, perhaps changed by a previous '-C'
 option.
 
 Note that a process can only have one current working directory (CWD).
-Therefore multi-threading can't be used to create an archive if a '-C'
-option appears after a relative file name in the command line.
+Therefore multi-threading can't be used to create or decode an archive
+if a '-C' option appears after a (relative) file name in the command
+line. (All file names are made relative when decoding).
 
 '-f ARCHIVE'
 '--file=ARCHIVE'
@@ -308,8 +311,7 @@ to '-1 --solid'.
 support". A value of 0 disables threads entirely. If this option is
 not used, tarlz tries to detect the number of processors in the system
 and use it as default value. 'tarlz --help' shows the system's default
-value. See the note about multi-threaded archive creation in the
-option '-C' above.
+value. See the note about multi-threading in the option '-C' above.
 
 Note that the number of usable threads is limited during compression to
 ceil( uncompressed_size / data_size ) (*note Minimum archive sizes::),
@@ -360,7 +362,9 @@ to '-1 --solid'.
 With '--create', don't compress the tar archive created. Create an
 uncompressed tar archive instead. With '--append', don't compress the
 new members appended to the tar archive. Compressed members can't be
-appended to an uncompressed archive, nor vice versa.
+appended to an uncompressed archive, nor vice versa. '--uncompressed'
+can be omitted if it can be deduced from the archive name. (An
+uncompressed archive name lacks a '.lz' or '.tlz' extension).
 
 '--asolid'
 When creating or appending to a compressed archive, use appendable
@@ -438,13 +442,13 @@ to '-1 --solid'.
 happens while extracting a file, keep the partial data extracted. Use
 this option to recover as much data as possible from each damaged
 member. It is recommended to run tarlz in single-threaded mode
-(--threads=0) when using this option.
+('--threads=0') when using this option.
 
 '--missing-crc'
 Exit with error status 2 if the CRC of the extended records is
 missing. When this option is used, tarlz detects any corruption in the
 extended records (only limited by CRC collisions). But note that a
-corrupt 'GNU.crc32' keyword, for example 'GNU.crc33', is reported as a
+corrupt 'GNU.crc32' keyword, for example 'GNU.crc30', is reported as a
 missing CRC instead of as a corrupt record. This misleading
 'Missing CRC' message is the consequence of a flaw in the POSIX pax
 format; i.e., the lack of a mandatory check sequence of the extended
@@ -588,6 +592,10 @@ header block are zeroed on archive creation to prevent trouble if the
 archive is read by an ustar tool, and are ignored by tarlz on archive
 extraction. *Note flawed-compat::.
 
+Tarlz limits the size of the pax extended header data so that the whole
+header set (extended header + extended data + ustar header) can be read and
+decoded in a buffer of size INT_MAX.
+
 The pax extended header data consists of one or more records, each of
 them constructed as follows:
 '"%d %s=%s\n", <length>, <keyword>, <value>'
@@ -618,11 +626,11 @@ space, equal-sign, and newline.
 previously archived. This record overrides the field 'linkname' in the
 following ustar header block. The following ustar header block
 determines the type of link created. If typeflag of the following
-header block is 1, it will be a hard link. If typeflag is 2, it will
-be a symbolic link and the linkpath value will be used as the contents
-of the symbolic link. The linkpath record is created only for links
-with a link name that does not fit in the space provided by the ustar
-header.
+header block is 1, a hard link is created. If typeflag is 2, a
+symbolic link is created and the linkpath value is used as the
+contents of the symbolic link. The linkpath record is created only for
+links with a link name that does not fit in the space provided by the
+ustar header.
 
 'mtime'
 The signed decimal representation of the modification time of the
@@ -657,7 +665,7 @@ space, equal-sign, and newline.
 CRC32-C (Castagnoli) of the extended header data excluding the 8 bytes
 representing the CRC <value> itself. The <value> is represented as 8
 hexadecimal digits in big endian order, '22 GNU.crc32=00000000\n'. The
-keyword of the CRC record is protected by the CRC to guarante that
+keyword of the CRC record is protected by the CRC to guarantee that
 corruption is always detected when using '--missing-crc' (except in
 case of CRC collision). A CRC was chosen because a checksum is too
 weak for a potentially large list of variable sized records. A
@@ -843,11 +851,11 @@ to the POSIX-2:1993 standard, POSIX.1-2008 recommends selecting extended
 header field values that allow such tar to create a regular file containing
 the extended header records as data. This approach is broken because if the
 extended header is needed because of a long file name, the fields 'name'
-and 'prefix' will be unable to contain the full file name. (Some tar
+and 'prefix' are unable to contain the full file name. (Some tar
 implementations store the truncated name in the field 'name' alone,
 truncating the name to only 100 bytes instead of 256). Therefore the files
 corresponding to both the extended header and the overridden ustar header
-will be extracted using truncated file names, perhaps overwriting existing
+are extracted using truncated file names, perhaps overwriting existing
 files or directories. It may be a security risk to extract a file with a
 truncated file name.
 
@@ -1098,11 +1106,11 @@ multimember compressed archive.
 For this to work as expected (and roughly multiply the compression speed
 by the number of available processors), the uncompressed archive must be at
 least as large as the number of worker threads times the block size (*note
---data-size::). Else some processors will not get any data to compress, and
-compression will be proportionally slower. The maximum speed increase
-achievable on a given archive is limited by the ratio
-(uncompressed_size / data_size). For example, a tarball the size of gcc or
-linux will scale up to 10 or 14 processors at level -9.
+--data-size::). Else some processors do not get any data to compress, and
+compression is proportionally slower. The maximum speed increase achievable
+on a given archive is limited by the ratio (uncompressed_size / data_size).
+For example, a tarball the size of gcc or linux scales up to 10 or 14
+processors at level -9.
 
 The following table shows the minimum uncompressed archive size needed
 for full use of N processors at a given compression level, using the default
@@ -1245,24 +1253,24 @@ Concept index
 Tag Table:
 Node: Top216
 Node: Introduction1210
-Node: Invoking tarlz4029
-Ref: --data-size12880
-Ref: --bsolid17192
-Node: Portable character set22788
-Node: File format23431
-Ref: key_crc3230188
-Ref: ustar-uid-gid33452
-Ref: ustar-mtime34254
-Node: Amendments to pax format36254
-Ref: crc3236963
-Ref: flawed-compat38274
-Node: Program design42364
-Node: Multi-threaded decoding46289
-Ref: mt-extraction49570
-Node: Minimum archive sizes50876
-Node: Examples53014
-Node: Problems55381
-Node: Concept index55936
+Node: Invoking tarlz4041
+Ref: --data-size13085
+Ref: --bsolid17521
+Node: Portable character set23119
+Node: File format23762
+Ref: key_crc3230703
+Ref: ustar-uid-gid33968
+Ref: ustar-mtime34770
+Node: Amendments to pax format36770
+Ref: crc3237479
+Ref: flawed-compat38790
+Node: Program design42872
+Node: Multi-threaded decoding46797
+Ref: mt-extraction50078
+Node: Minimum archive sizes51384
+Node: Examples53511
+Node: Problems55878
+Node: Concept index56433
 
 End Tag Table
 
doc/tarlz.texi (179 lines changed)
@@ -6,8 +6,8 @@
 @finalout
 @c %**end of header
 
-@set UPDATED 23 September 2022
-@set VERSION 0.23
+@set UPDATED 20 September 2023
+@set VERSION 0.24
 
 @dircategory Archiving
 @direntry
@@ -50,7 +50,7 @@ This manual is for Tarlz (version @value{VERSION}, @value{UPDATED}).
 @end menu
 
 @sp 1
-Copyright @copyright{} 2013-2022 Antonio Diaz Diaz.
+Copyright @copyright{} 2013-2023 Antonio Diaz Diaz.
 
 This manual is free documentation: you have unlimited permission to copy,
 distribute, and modify it.
@@ -81,9 +81,9 @@ plzip may even double the amount of files lost for each lzip member damaged
 because it does not keep the members aligned.
 
 Tarlz can create tar archives with five levels of compression granularity:
-per file (---no-solid), per block (---bsolid, default), per directory
-(---dsolid), appendable solid (---asolid), and solid (---solid). It can also
-create uncompressed tar archives.
+per file (@option{--no-solid}), per block (@option{--bsolid}, default), per
+directory (@option{--dsolid}), appendable solid (@option{--asolid}), and
+solid (@option{--solid}). It can also create uncompressed tar archives.
 
 @noindent
 Of course, compressing each file (or each directory) individually can't
@@ -104,7 +104,7 @@ archive. Just like an uncompressed tar archive.
 It is a safe POSIX-style backup format. In case of corruption, tarlz
 can extract all the undamaged members from the tar.lz archive,
 skipping over the damaged members, just like the standard
-(uncompressed) tar. Moreover, the option @samp{--keep-damaged} can be used
+(uncompressed) tar. Moreover, the option @option{--keep-damaged} can be used
 to recover as much data as possible from each damaged member, and
 lziprecover can be used to recover some of the damaged members.
 
@@ -118,8 +118,8 @@ Tarlz protects the extended records with a Cyclic Redundancy Check (CRC) in
 a way compatible with standard tar tools. @xref{crc32}.
 
 Tarlz does not understand other tar formats like @samp{gnu}, @samp{oldgnu},
-@samp{star} or @samp{v7}. The command
-@w{@samp{tarlz -tf archive.tar.lz > /dev/null}} can be used to verify that
+@samp{star}, or @samp{v7}. The command
+@w{@samp{tarlz -t -f archive.tar.lz > /dev/null}} can be used to check that
 the format of the archive is compatible with tarlz.
 
 
@@ -137,9 +137,9 @@ tarlz @var{operation} [@var{options}] [@var{files}]
 @end example
 
 @noindent
-All operations except @samp{--concatenate} and @samp{--compress} operate on
-whole trees if any @var{file} is a directory. All operations except
-@samp{--compress} overwrite output files without warning. If no archive is
+All operations except @option{--concatenate} and @option{--compress} operate
+on whole trees if any @var{file} is a directory. All operations except
+@option{--compress} overwrite output files without warning. If no archive is
 specified, tarlz tries to read it from standard input or write it to
 standard output. Tarlz refuses to read archive data from a terminal or write
 archive data to a terminal. Tarlz detects when the archive being created or
@@ -147,7 +147,7 @@ enlarged is among the files to be archived, appended, or concatenated, and
 skips it.
 
 Tarlz does not use absolute file names nor file names above the current
-working directory (perhaps changed by option @samp{-C}). On archive creation
+working directory (perhaps changed by option @option{-C}). On archive creation
 or appending tarlz archives the files specified, but removes from member
 names any leading and trailing slashes and any file name prefixes containing
 a @samp{..} component. On extraction, leading and trailing slashes are also
@@ -161,9 +161,9 @@ member names in the archive or given in the command line, so that
 @w{@samp{tarlz -xf foo ./bar baz}} extracts members @samp{bar} and
 @samp{./baz} from archive @samp{foo}.
 
-If several compression levels or @samp{--*solid} options are given, the last
-setting is used. For example @w{@samp{-9 --solid --uncompressed -1}} is
-equivalent to @w{@samp{-1 --solid}}.
+If several compression levels or @option{--*solid} options are given, the last
+setting is used. For example @w{@option{-9 --solid --uncompressed -1}} is
+equivalent to @w{@option{-1 --solid}}.
 
 tarlz supports the following operations:
 
@@ -179,7 +179,7 @@ This version number should be included in all bug reports.
 @item -A
 @itemx --concatenate
 Append one or more archives to the end of an archive. If no archive is
-specified with the option @samp{-f}, the input archives are concatenated to
+specified with the option @option{-f}, concatenate the input archives to
 standard output. All the archives involved must be regular (seekable) files,
 and must be either all compressed or all uncompressed. Compressed and
 uncompressed archives can't be mixed. Compressed archives must be
@@ -202,23 +202,23 @@ Create a new archive from @var{files}.
 @item -d
 @itemx --diff
 Compare and report differences between archive and file system. For each tar
-member in the archive, verify that the corresponding file in the file system
+member in the archive, check that the corresponding file in the file system
 exists and is of the same type (regular file, directory, etc). Report on
 standard output the differences found in type, mode (permissions), owner and
 group IDs, modification time, file size, file contents (of regular files),
 target (of symlinks) and device number (of block/character special files).
 
-As tarlz removes leading slashes from member names, the option @samp{-C} may
-be used in combination with @samp{--diff} when absolute file names were used
+As tarlz removes leading slashes from member names, the option @option{-C} may
+be used in combination with @option{--diff} when absolute file names were used
 on archive creation: @w{@samp{tarlz -C / -d}}. Alternatively, tarlz may be
 run from the root directory to perform the comparison.
 
 @item --delete
 Delete files and directories from an archive in place. It currently can
 delete only from uncompressed archives and from archives with files
-compressed individually (@samp{--no-solid} archives). Note that files of
-about @samp{--data-size} or larger are compressed individually even if
-@samp{--bsolid} is used, and can therefore be deleted. Tarlz takes care to
+compressed individually (@option{--no-solid} archives). Note that files of
+about @option{--data-size} or larger are compressed individually even if
+@option{--bsolid} is used, and can therefore be deleted. Tarlz takes care to
 not delete a tar member unless it is possible to do so. For example it won't
 try to delete a tar member that is not compressed individually. Even in the
 case of finding a corrupt member after having deleted some member(s), tarlz
@@ -261,32 +261,36 @@ Extract files from an archive. If @var{files} are given, extract only the
 directory without extracting the files under it, use
 @w{@samp{tarlz -xf foo --exclude='dir/*' dir}}. Tarlz removes files and
 empty directories unconditionally before extracting over them. Other than
-that, it will not make any special effort to extract a file over an
+that, it does not make any special effort to extract a file over an
 incompatible type of file. For example, extracting a file over a non-empty
-directory will usually fail.
+directory usually fails.
 
 @item -z
 @itemx --compress
 Compress existing POSIX tar archives aligning the lzip members to the tar
-members with choice of granularity (---bsolid by default, ---dsolid works
-like ---asolid). The input archives are kept unchanged. Existing compressed
-archives are not overwritten. A hyphen @samp{-} used as the name of an input
-archive reads from standard input and writes to standard output (unless the
-option @samp{--output} is used). Tarlz can be used as compressor for GNU tar
-using a command like @w{@samp{tar -c -Hustar foo | tarlz -z -o foo.tar.lz}}.
-Note that tarlz only works reliably on archives without global headers, or
-with global headers whose content can be ignored.
+members with choice of granularity (@option{--bsolid} by default,
+@option{--dsolid} works like @option{--asolid}). Exit with error status 2 if
+any input archive is an empty file. The input archives are kept unchanged.
+Existing compressed archives are not overwritten. A hyphen @samp{-} used as
+the name of an input archive reads from standard input and writes to
+standard output (unless the option @option{--output} is used). Tarlz can be
+used as compressor for GNU tar by using a command like
+@w{@samp{tar -c -Hustar foo | tarlz -z -o foo.tar.lz}}. Tarlz can be used as
+compressor for zupdate (zutils) by using a command like
+@w{@samp{zupdate --lz="tarlz -z" foo.tar.gz}}. Note that tarlz only works
+reliably on archives without global headers, or with global headers whose
+content can be ignored.
 
 The compression is reversible, including any garbage present after the
 end-of-archive blocks. Tarlz stops parsing after the first end-of-archive
 block is found, and then compresses the rest of the archive. Unless solid
 compression is requested, the end-of-archive blocks are compressed in a lzip
 member separated from the preceding members and from any non-zero garbage
-following the end-of-archive blocks. @samp{--compress} implies plzip
+following the end-of-archive blocks. @option{--compress} implies plzip
 argument style, not tar style. Each input archive is compressed to a file
-with the extension @samp{.lz} added unless the option @samp{--output} is
-used. When @samp{--output} is used, only one input archive can be specified.
-@samp{-f} can't be used with @samp{--compress}.
+with the extension @samp{.lz} added unless the option @option{--output} is
+used. When @option{--output} is used, only one input archive can be specified.
+@option{-f} can't be used with @option{--compress}.
 
 @item --check-lib
 Compare the
@@ -314,25 +318,25 @@ tarlz supports the following
 @anchor{--data-size}
 @item -B @var{bytes}
 @itemx --data-size=@var{bytes}
-Set target size of input data blocks for the option @samp{--bsolid}.
+Set target size of input data blocks for the option @option{--bsolid}.
 @xref{--bsolid}. Valid values range from @w{8 KiB} to @w{1 GiB}. Default
-value is two times the dictionary size, except for option @samp{-0} where it
+value is two times the dictionary size, except for option @option{-0} where it
 defaults to @w{1 MiB}. @xref{Minimum archive sizes}.
 
 @item -C @var{dir}
 @itemx --directory=@var{dir}
-Change to directory @var{dir}. When creating or appending, the position of
-each @samp{-C} option in the command line is significant; it will change the
-current working directory for the following @var{files} until a new
-@samp{-C} option appears in the command line. When extracting or comparing,
-all the @samp{-C} options are executed in sequence before reading the
-archive. Listing ignores any @samp{-C} options specified. @var{dir} is
-relative to the then current working directory, perhaps changed by a
-previous @samp{-C} option.
+Change to directory @var{dir}. When creating, appending, comparing, or
+extracting, the position of each @option{-C} option in the command line is
+significant; it changes the current working directory for the following
+@var{files} until a new @option{-C} option appears in the command line.
+@option{--list} and @option{--delete} ignore any @option{-C} options
+specified. @var{dir} is relative to the then current working directory,
+perhaps changed by a previous @option{-C} option.
 
 Note that a process can only have one current working directory (CWD).
-Therefore multi-threading can't be used to create an archive if a @samp{-C}
-option appears after a relative file name in the command line.
+Therefore multi-threading can't be used to create or decode an archive if a
+@option{-C} option appears after a (relative) file name in the command line.
+(All file names are made relative when decoding).
 
 @item -f @var{archive}
 @itemx --file=@var{archive}
@@ -351,7 +355,7 @@ Valid values range from 0 to "as many as your system can support". A value
 of 0 disables threads entirely. If this option is not used, tarlz tries to
 detect the number of processors in the system and use it as default value.
 @w{@samp{tarlz --help}} shows the system's default value. See the note about
-multi-threaded archive creation in the option @samp{-C} above.
+multi-threading in the option @option{-C} above.
 
 Note that the number of usable threads is limited during compression to
 @w{ceil( uncompressed_size / data_size )} (@pxref{Minimum archive sizes}),
@@ -360,9 +364,9 @@ archive, which you can find by running @w{@samp{lzip -lv archive.tar.lz}}.
 
 @item -o @var{file}
 @itemx --output=@var{file}
-Write the compressed output to @var{file}. @w{@samp{-o -}} writes the
-compressed output to standard output. Currently @samp{--output} only works
-with @samp{--compress}.
+Write the compressed output to @var{file}. @w{@option{-o -}} writes the
+compressed output to standard output. Currently @option{--output} only works
+with @option{--compress}.
 
 @item -p
 @itemx --preserve-permissions
@@ -381,8 +385,8 @@ Verbosely list files processed. Further -v's (up to 4) increase the
 verbosity level.
 
 @item -0 .. -9
-Set the compression level for @samp{--create}, @samp{--append}, and
-@samp{--compress}. The default compression level is @samp{-6}. Like lzip,
+Set the compression level for @option{--create}, @option{--append}, and
+@option{--compress}. The default compression level is @option{-6}. Like lzip,
 tarlz also minimizes the dictionary size of the lzip members it creates,
 reducing the amount of memory required for decompression.
 
@@ -401,10 +405,12 @@ reducing the amount of memory required for decompression.
 @end multitable
 
 @item --uncompressed
-With @samp{--create}, don't compress the tar archive created. Create an
-uncompressed tar archive instead. With @samp{--append}, don't compress the
+With @option{--create}, don't compress the tar archive created. Create an
+uncompressed tar archive instead. With @option{--append}, don't compress the
 new members appended to the tar archive. Compressed members can't be
-appended to an uncompressed archive, nor vice versa.
+appended to an uncompressed archive, nor vice versa. @option{--uncompressed}
+can be omitted if it can be deduced from the archive name. (An uncompressed
+archive name lacks a @samp{.lz} or @samp{.tlz} extension).
 
 @item --asolid
 When creating or appending to a compressed archive, use appendable solid
|
@ -447,7 +453,7 @@ appendable. No more files can be later appended to the archive. Solid
|
|||
archives can't be created nor decoded in parallel.
|
||||
|
||||
@item --anonymous
|
||||
Equivalent to @w{@samp{--owner=root --group=root}}.
|
||||
Equivalent to @w{@option{--owner=root --group=root}}.
|
||||
|
||||
@item --owner=@var{owner}
|
||||
When creating or appending, use @var{owner} for files added to the archive.
|
||||
|
@ -465,27 +471,28 @@ to match if any component of the file name matches. For example, @samp{*.o}
|
|||
matches @samp{foo.o}, @samp{foo.o/bar} and @samp{foo/bar.o}. If
|
||||
@var{pattern} contains a @samp{/}, it matches a corresponding @samp{/} in
|
||||
the file name. For example, @samp{foo/*.o} matches @samp{foo/bar.o}.
|
||||
Multiple @samp{--exclude} options can be specified.
|
||||
Multiple @option{--exclude} options can be specified.
|
||||
|
||||
@item --ignore-ids
|
||||
Make @samp{--diff} ignore differences in owner and group IDs. This option is
|
||||
useful when comparing an @samp{--anonymous} archive.
|
||||
Make @option{--diff} ignore differences in owner and group IDs. This option is
|
||||
useful when comparing an @option{--anonymous} archive.
|
||||
|
||||
@item --ignore-overflow
|
||||
Make @samp{--diff} ignore differences in mtime caused by overflow on 32-bit
|
||||
Make @option{--diff} ignore differences in mtime caused by overflow on 32-bit
|
||||
systems with a 32-bit time_t.
|
||||
|
||||
@item --keep-damaged
|
||||
Don't delete partially extracted files. If a decompression error happens
|
||||
while extracting a file, keep the partial data extracted. Use this option to
|
||||
recover as much data as possible from each damaged member. It is recommended
|
||||
to run tarlz in single-threaded mode (---threads=0) when using this option.
|
||||
to run tarlz in single-threaded mode (@option{--threads=0}) when using this
|
||||
option.
|
||||
|
||||
@item --missing-crc
|
||||
Exit with error status 2 if the CRC of the extended records is missing. When
|
||||
this option is used, tarlz detects any corruption in the extended records
|
||||
(only limited by CRC collisions). But note that a corrupt @samp{GNU.crc32}
|
||||
keyword, for example @samp{GNU.crc33}, is reported as a missing CRC instead
|
||||
keyword, for example @samp{GNU.crc30}, is reported as a missing CRC instead
|
||||
of as a corrupt record. This misleading @w{@samp{Missing CRC}} message is
|
||||
the consequence of a flaw in the POSIX pax format; i.e., the lack of a
|
||||
mandatory check sequence of the extended records. @xref{crc32}.
|
||||
|
@ -606,7 +613,7 @@ Zero or more blocks that contain the contents of the file.
|
|||
@end itemize
|
||||
|
||||
Each tar member must be contiguously stored in a lzip member for the
|
||||
parallel decoding operations like @samp{--list} to work. If any tar member
|
||||
parallel decoding operations like @option{--list} to work. If any tar member
|
||||
is split over two or more lzip members, the archive must be decoded
|
||||
sequentially. @xref{Multi-threaded decoding}.
|
||||
|
||||
|
@ -639,7 +646,7 @@ tar.lz
|
|||
@end verbatim
|
||||
|
||||
@ignore
|
||||
When @samp{--permissive} is used, the following violations of the
|
||||
When @option{--permissive} is used, the following violations of the
|
||||
archive format are allowed:@*
|
||||
If several extended headers precede an ustar header, only the last
|
||||
extended header takes effect. The other extended headers are ignored.
|
||||
|
@ -660,6 +667,10 @@ fields in the pax header block are zeroed on archive creation to prevent
|
|||
trouble if the archive is read by an ustar tool, and are ignored by tarlz on
|
||||
archive extraction. @xref{flawed-compat}.
|
||||
|
||||
Tarlz limits the size of the pax extended header data so that the whole
|
||||
header set (extended header + extended data + ustar header) can be read and
|
||||
decoded in a buffer of size INT_MAX.
|
||||
|
||||
The pax extended header data consists of one or more records, each of
|
||||
them constructed as follows:@*
|
||||
@w{@samp{"%d %s=%s\n", <length>, <keyword>, <value>}}
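The record layout above can be sketched in Python. This is a minimal illustration, not tarlz's implementation; the helper name `pax_record` is hypothetical. It relies on the pax rule that <length> counts the whole record, including the length digits themselves, the space, and the trailing newline, so the length has to be found by iterating until it is self-consistent.

```python
def pax_record(keyword: str, value: str) -> bytes:
    """Build one pax extended record: "%d %s=%s\\n" (hypothetical helper)."""
    # Everything after the length digits: space, keyword, '=', value, newline.
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body)
    while True:
        # The total length includes the decimal digits of the length itself.
        total = len(str(length)) + len(body)
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


record = pax_record("GNU.crc32", "00000000")
```

Here `record` is the 22-byte record `22 GNU.crc32=00000000\n`, the same placeholder record that the manual shows for the CRC keyword.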
|
||||
|
@ -689,11 +700,11 @@ greater than 2_097_151 (octal 7777777). @xref{ustar-uid-gid}.
|
|||
The file name of a link being created to another file, of any type,
|
||||
previously archived. This record overrides the field @samp{linkname} in the
|
||||
following ustar header block. The following ustar header block determines
|
||||
the type of link created. If typeflag of the following header block is 1, it
|
||||
will be a hard link. If typeflag is 2, it will be a symbolic link and the
|
||||
linkpath value will be used as the contents of the symbolic link. The
|
||||
linkpath record is created only for links with a link name that does not fit
|
||||
in the space provided by the ustar header.
|
||||
the type of link created. If typeflag of the following header block is 1, a
|
||||
hard link is created. If typeflag is 2, a symbolic link is created and the
|
||||
linkpath value is used as the contents of the symbolic link. The linkpath
|
||||
record is created only for links with a link name that does not fit in the
|
||||
space provided by the ustar header.
|
||||
|
||||
@item mtime
|
||||
The signed decimal representation of the modification time of the following
|
||||
|
@ -728,8 +739,8 @@ CRC32-C (Castagnoli) of the extended header data excluding the 8 bytes
|
|||
representing the CRC <value> itself. The <value> is represented as 8
|
||||
hexadecimal digits in big endian order,
|
||||
@w{@samp{22 GNU.crc32=00000000\n}}. The keyword of the CRC record is
|
||||
protected by the CRC to guarante that corruption is always detected when
|
||||
using @samp{--missing-crc} (except in case of CRC collision). A CRC was
|
||||
protected by the CRC to guarantee that corruption is always detected when
|
||||
using @option{--missing-crc} (except in case of CRC collision). A CRC was
|
||||
chosen because a checksum is too weak for a potentially large list of
|
||||
variable sized records. A checksum can't detect simple errors like the
|
||||
swapping of two bytes.
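The difference between a checksum and a CRC can be illustrated with a short Python sketch. It is not taken from tarlz: the function names are hypothetical, and the CRC is a plain bitwise CRC32-C using the standard reflected Castagnoli polynomial 0x82F63B78. Swapping two adjacent bytes leaves a simple byte sum unchanged but changes the CRC.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32-C (Castagnoli), reflected form; slow but simple."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; apply the reflected polynomial when the low bit is set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def byte_sum(data: bytes) -> int:
    """Simple additive checksum, shown only for comparison."""
    return sum(data) & 0xFFFFFFFF


original = b"13 path=ab\n"
swapped = b"13 path=ba\n"   # same bytes, two of them exchanged

sum_detects_swap = byte_sum(original) != byte_sum(swapped)
crc_detects_swap = crc32c(original) != crc32c(swapped)
```

With these inputs `sum_detects_swap` is false and `crc_detects_swap` is true. The CRC is computed here over a whole byte string; excluding the 8 bytes of the CRC <value>, as described above, is left out of the sketch.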
|
||||
|
@ -878,7 +889,7 @@ character.
|
|||
Tarlz creates safe archives that allow the reliable detection of invalid or
|
||||
corrupt metadata during decoding even when the integrity checking of lzip
|
||||
can't be used because the lzip members are only decompressed partially, as
|
||||
it happens in parallel @samp{--diff}, @samp{--list}, and @samp{--extract}.
|
||||
it happens in parallel @option{--diff}, @option{--list}, and @option{--extract}.
|
||||
In order to achieve this goal and avoid some other flaws in the pax format,
|
||||
tarlz makes some changes to the variant of the pax format that it uses. This
|
||||
chapter describes these changes and the concrete reasons to implement them.
|
||||
|
@ -919,11 +930,11 @@ to the POSIX-2:1993 standard, POSIX.1-2008 recommends selecting extended
|
|||
header field values that allow such tar to create a regular file containing
|
||||
the extended header records as data. This approach is broken because if the
|
||||
extended header is needed because of a long file name, the fields
|
||||
@samp{name} and @samp{prefix} will be unable to contain the full file name.
|
||||
@samp{name} and @samp{prefix} are unable to contain the full file name.
|
||||
(Some tar implementations store the truncated name in the field @samp{name}
|
||||
alone, truncating the name to only 100 bytes instead of 256). Therefore the
|
||||
files corresponding to both the extended header and the overridden ustar
|
||||
header will be extracted using truncated file names, perhaps overwriting
|
||||
header are extracted using truncated file names, perhaps overwriting
|
||||
existing files or directories. It may be a security risk to extract a file
|
||||
with a truncated file name.
|
||||
|
||||
|
@ -1117,9 +1128,9 @@ tar.lz archives, keeping backwards compatibility. If tarlz finds a member
|
|||
misalignment during multi-threaded decoding, it switches to single-threaded
|
||||
mode and continues decoding the archive.
|
||||
|
||||
If the files in the archive are large, multi-threaded @samp{--list} on a
|
||||
If the files in the archive are large, multi-threaded @option{--list} on a
|
||||
regular (seekable) tar.lz archive can be hundreds of times faster than
|
||||
sequential @samp{--list} because, in addition to using several processors,
|
||||
sequential @option{--list} because, in addition to using several processors,
|
||||
it only needs to decompress part of each lzip member. See the following
|
||||
example listing the Silesia corpus on a dual core machine:
|
||||
|
||||
|
@ -1130,7 +1141,7 @@ time plzip -cd silesia.tar.lz | tar -tf - (3.256s)
|
|||
time tarlz -tf silesia.tar.lz (0.020s)
|
||||
@end example
|
||||
|
||||
On the other hand, multi-threaded @samp{--list} won't detect corruption in
|
||||
On the other hand, multi-threaded @option{--list} won't detect corruption in
|
||||
the tar member data because it only decodes the part of each lzip member
|
||||
corresponding to the tar member header. This is another reason why the tar
|
||||
headers must provide their own integrity checking.
|
||||
|
@ -1176,11 +1187,11 @@ multimember compressed archive.
|
|||
For this to work as expected (and roughly multiply the compression speed by
|
||||
the number of available processors), the uncompressed archive must be at
|
||||
least as large as the number of worker threads times the block size
|
||||
(@pxref{--data-size}). Else some processors will not get any data to
|
||||
compress, and compression will be proportionally slower. The maximum speed
|
||||
increase achievable on a given archive is limited by the ratio
|
||||
(@pxref{--data-size}). Else some processors do not get any data to compress,
|
||||
and compression is proportionally slower. The maximum speed increase
|
||||
achievable on a given archive is limited by the ratio
|
||||
@w{(uncompressed_size / data_size)}. For example, a tarball the size of gcc
|
||||
or linux will scale up to 10 or 14 processors at level -9.
|
||||
or linux scales up to 10 or 14 processors at level -9.
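The two limits described in this paragraph can be written as a small Python sketch. The function and parameter names are hypothetical and sizes are in bytes; this only restates the arithmetic in the text and is not how tarlz schedules its workers.

```python
def usable_threads(requested: int, uncompressed_size: int, data_size: int) -> int:
    """Threads that can receive data: min(requested, ceil(size / data_size))."""
    blocks = -(-uncompressed_size // data_size)   # integer ceiling division
    return min(requested, blocks)


def min_archive_size(workers: int, data_size: int) -> int:
    """Smallest uncompressed archive that gives every worker one full block."""
    return workers * data_size


MiB = 1 << 20
threads = usable_threads(8, 100 * MiB, 30 * MiB)
needed = min_archive_size(8, 30 * MiB)
```

In this example a 100 MiB archive split into 30 MiB blocks yields 4 blocks, so only 4 of the 8 requested threads get data; keeping all 8 busy would need at least 240 MiB of uncompressed data.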
|
||||
|
||||
The following table shows the minimum uncompressed archive size needed for
|
||||
full use of N processors at a given compression level, using the default
|
||||
|
|