Merging upstream version 1.13~rc1.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent f40403d840
commit 95e3ee3bd3
29 changed files with 472 additions and 517 deletions
204 doc/zutils.texi
@@ -6,8 +6,8 @@
 @finalout
 @c %**end of header
 
-@set UPDATED 7 January 2023
-@set VERSION 1.12
+@set UPDATED 31 December 2023
+@set VERSION 1.13-rc1
 
 @dircategory Compression
 @direntry
@@ -66,8 +66,8 @@ is a collection of utilities able to process any combination of
 compressed and uncompressed files transparently. If any file given,
 including standard input, is compressed, its decompressed content is used.
 Compressed files are decompressed on the fly; no temporary files are
-created. Data format is detected by its magic bytes, not by the file name
-extension.
+created. Data format is detected by its identifier string (magic bytes), not
+by the file name extension. Empty files are considered uncompressed.
 
 These utilities are not wrapper scripts but safer and more efficient C++
 programs. In particular the option @option{--recursive} is very efficient in
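The hunk above says the format is detected from the identifier string (magic bytes) rather than the file name extension, and that empty files count as uncompressed. A minimal sketch of that idea follows. It assumes the well-known leading bytes of each format; the table and the function name `detect_format` are illustrative and not taken from the zutils source, which may check more than these bytes.

```python
# Leading magic bytes of each supported format (well-known values,
# not copied from zutils). Names follow the format names used in the manual.
MAGIC = {
    "lz": b"LZIP",
    "gz": b"\x1f\x8b",
    "bz2": b"BZh",
    "zst": b"\x28\xb5\x2f\xfd",
    "xz": b"\xfd7zXZ\x00",
}


def detect_format(header: bytes) -> str:
    """Return the format whose magic bytes start `header`, else 'un'.

    'un' (uncompressed) is also the answer for empty input, matching the
    rule that empty files are considered uncompressed.
    """
    for name, magic in MAGIC.items():
        if header.startswith(magic):
            return name
    return "un"
```

Only the first few bytes of a file are needed, so a caller could pass something like `open(path, "rb").read(6)`.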
@@ -86,6 +86,11 @@ improved replacements for the shell scripts provided by GNU gzip.
 @command{ztest} is unique to zutils. @command{zupdate} is similar to gzip's
 znew.
 
+@anchor{search-order}
+When @command{zcat}, @command{zcmp}, @command{zdiff}, or @command{zgrep}
+need to try compressed file names, the search order is: lzip, gzip, bzip2,
+zstd, xz. (@var{file}.[lz|gz|bz2|zst|xz]).
+
 NOTE: Bzip2 and lzip provide well-defined values of exit status, which makes
 them safe to use with zutils. Gzip and xz may return ambiguous warning
 values, making them less reliable back ends for zutils. Zstd currently does
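The search order added in the hunk above (lzip, gzip, bzip2, zstd, xz) can be sketched as a simple first-match loop. This is an illustration of the documented order only; `candidate_names` and `find_compressed` are hypothetical helper names, and the real programs are written in C++.

```python
import os

# Extensions in the documented search order: lzip, gzip, bzip2, zstd, xz.
SEARCH_ORDER = ("lz", "gz", "bz2", "zst", "xz")


def candidate_names(name):
    """List the compressed file names to try for `name`, in search order."""
    return [f"{name}.{ext}" for ext in SEARCH_ORDER]


def find_compressed(name, exists=os.path.exists):
    """Return the first candidate that exists, or None if none is found.

    `exists` is injectable so the lookup can be exercised without
    touching the file system.
    """
    for candidate in candidate_names(name):
        if exists(candidate):
            return candidate
    return None
```

With both `foo.gz` and `foo.xz` present, the sketch picks `foo.gz`, because gzip comes before xz in the order.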
@@ -106,24 +111,6 @@ LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
 been compressed. Decompressed is used to refer to data which have undergone
 the process of decompression.
 
-@sp 1
-Numbers given as arguments to options (positions, sizes) may be followed
-by a multiplier and an optional @samp{B} for "byte".
-
-Table of SI and binary prefixes (unit multipliers):
-
-@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
-@item Prefix @tab Value @tab | @tab Prefix @tab Value
-@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
-@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
-@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
-@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
-@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
-@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
-@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
-@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
-@end multitable
-
 
 @node Common options
 @chapter Common options
@@ -132,7 +119,8 @@ Table of SI and binary prefixes (unit multipliers):
 The following
 @uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}:
 are available in all the utilities. Rather than writing identical
-descriptions for each of the programs, they are described here.
+descriptions for each of the programs, they are described here. Remember to
+prepend @file{./} to any file name beginning with a hyphen, or use @samp{--}.
 @ifnothtml
 @xref{Argument syntax,,,arg_parser}.
 @end ifnothtml
@@ -209,6 +197,26 @@ It must return 0 if no errors occurred, and a non-zero value otherwise.
 
 @end table
 
+Numbers given as arguments to options may be expressed in decimal,
+hexadecimal, or octal (using the same syntax as integer constants in C++),
+and may be followed by a multiplier and an optional @samp{B} for "byte".
+
+Table of SI and binary prefixes (unit multipliers):
+
+@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
+@item Prefix @tab Value @tab | @tab Prefix @tab Value
+@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
+@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
+@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
+@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
+@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
+@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
+@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
+@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
+@item R @tab ronnabyte (10^27) @tab | @tab Ri @tab robibyte (2^90)
+@item Q @tab quettabyte (10^30) @tab | @tab Qi @tab quebibyte (2^100)
+@end multitable
+
 
 @node Configuration
 @chapter The configuration file 'zutils.conf'
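The numeric argument syntax moved into the common options (decimal, hexadecimal, or octal, then an optional multiplier from the table, then an optional `B`) can be illustrated with a small parser. This is a sketch under stated assumptions: the function name `parse_number` and the regular expression are hypothetical, the accepted spellings are limited to those listed in the table, and the real parser in zutils may accept or reject other forms.

```python
import re

# Multipliers from the table: SI prefixes are powers of 1000,
# binary prefixes (with a trailing 'i') are powers of 1024.
MULTIPLIERS = {}
for exp, letter in enumerate("kMGTPEZYRQ", start=1):
    MULTIPLIERS[letter] = 10 ** (3 * exp)
    MULTIPLIERS[letter.upper() + "i"] = 2 ** (10 * exp)

# Integer in C++ constant syntax (0x.. hex, leading 0 octal, else decimal),
# optional multiplier, optional 'B'. In hex input the digits are matched
# greedily, so a trailing 'B' or 'E' is read as a hex digit.
NUMBER = re.compile(
    r"(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)([kKMGTPEZYRQ]i?)?B?"
)


def parse_number(text: str) -> int:
    """Parse strings such as '64Ki', '2MB', '0x10' or '010' into an int."""
    match = NUMBER.fullmatch(text)
    if not match:
        raise ValueError(f"bad numeric argument: {text!r}")
    digits, prefix = match.groups()
    if digits[:2].lower() == "0x":
        value = int(digits, 16)
    elif digits.startswith("0"):
        value = int(digits, 8)
    else:
        value = int(digits, 10)
    if prefix is not None:
        if prefix not in MULTIPLIERS:  # e.g. 'K' alone or 'ki'
            raise ValueError(f"bad multiplier in: {text!r}")
        value *= MULTIPLIERS[prefix]
    return value
```

Python integers have arbitrary precision, so even `1Qi` (2^100) is representable here; a C++ implementation would have to reject values that overflow its integer type.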
@@ -249,8 +257,9 @@ where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, or
 sequence. If any file given is compressed, its decompressed content is
 copied. If a file given does not exist, and its name does not end with one
 of the known extensions, @command{zcat} tries the compressed file names
-corresponding to the formats supported. If a file fails to decompress,
-@command{zcat} continues copying the rest of the files.
+corresponding to the formats supported until one is found.
+@xref{search-order}. If a file fails to decompress, @command{zcat} continues
+copying the rest of the files.
 
 If a file is specified as @samp{-}, data are read from standard input,
 decompressed if needed, and sent to standard output. Data read from
@@ -297,8 +306,8 @@ Number all output lines, starting with 1. The line count is unlimited.
 Force the compressed format given. Valid values for @var{format} are
 @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, @samp{zst}, and @samp{un} for
 @samp{uncompressed}. If this option is used, the files are passed to the
-corresponding decompressor (or transmitted unmodified) without verifying
-their format, and the exact file name must be given. Other names won't be
+corresponding decompressor (or transmitted unmodified) without checking
+their format, and the exact file name must be given. Other names are not
 tried.
 
 @item -q
@@ -360,17 +369,10 @@ zcmp [@var{options}] @var{file1} [@var{file2}]
 @noindent
 This compares @var{file1} to @var{file2}. The standard input is used only if
 @var{file1} or @var{file2} refers to standard input. If @var{file2} is
-omitted @command{zcmp} tries the following:
-
-@itemize -
-@item
-If @var{file1} is compressed, compares its decompressed contents with
-the corresponding uncompressed file (the name of @var{file1} with the
-extension removed).
-@item
-If @var{file1} is uncompressed, compares it with the decompressed
-contents of @var{file1}.[lz|bz2|gz|zst|xz] (the first one that is found).
-@end itemize
+omitted @command{zcmp} tries to compare @var{file1} with the corresponding
+uncompressed file (if @var{file1} is compressed), and then with the
+corresponding compressed files of the remaining formats until one is found.
+@xref{search-order}.
 
 @noindent
 An exit status of 0 means no differences were found, 1 means some
@@ -409,14 +411,14 @@ Compare at most @var{count} input bytes.
 
 @item -O [@var{format1}][,@var{format2}]
 @itemx --force-format=[@var{format1}][,@var{format2}]
-Force the compressed formats given. Any of @var{format1} or @var{format2}
-may be omitted and the corresponding format will be automatically detected.
-Valid values for @var{format} are @samp{bz2}, @samp{gz}, @samp{lz},
-@samp{xz}, @samp{zst}, and @samp{un} for @samp{uncompressed}. If at least
-one format is specified with this option, the file is passed to the
-corresponding decompressor (or transmitted unmodified) without verifying its
-format, and the exact file names of both @var{file1} and @var{file2} must be
-given. Other names won't be tried.
+Force the compressed formats given. If @var{format1} or @var{format2} is
+omitted, the corresponding format is automatically detected. Valid values
+for @var{format} are @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz},
+@samp{zst}, and @samp{un} for @samp{uncompressed}. If at least one format is
+specified with this option, the file is passed to the corresponding
+decompressor (or transmitted unmodified) without checking its format, and
+the exact file names of both @var{file1} and @var{file2} must be given.
+Other names are not tried.
 
 @item -q
 @itemx --quiet
@@ -441,24 +443,6 @@ the verbosity level. @xref{version}.
 
 @end table
 
-Byte counts given as arguments to options may be expressed in decimal,
-hexadecimal, or octal (using the same syntax as integer constants in C++),
-and may be followed by a multiplier and an optional @samp{B} for "byte".
-
-Table of SI and binary prefixes (unit multipliers):
-
-@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
-@item Prefix @tab Value @tab | @tab Prefix @tab Value
-@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
-@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
-@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
-@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
-@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
-@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
-@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
-@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
-@end multitable
-
 
 @node Zdiff
 @chapter Zdiff
@@ -480,17 +464,10 @@ zdiff [@var{options}] @var{file1} [@var{file2}]
 @noindent
 This compares @var{file1} to @var{file2}. The standard input is used only if
 @var{file1} or @var{file2} refers to standard input. If @var{file2} is
-omitted @command{zdiff} tries the following:
-
-@itemize -
-@item
-If @var{file1} is compressed, compares its decompressed contents with
-the corresponding uncompressed file (the name of @var{file1} with the
-extension removed).
-@item
-If @var{file1} is uncompressed, compares it with the decompressed
-contents of @var{file1}.[lz|bz2|gz|zst|xz] (the first one that is found).
-@end itemize
+omitted @command{zdiff} tries to compare @var{file1} with the corresponding
+uncompressed file (if @var{file1} is compressed), and then with the
+corresponding compressed files of the remaining formats until one is found.
+@xref{search-order}.
 
 @noindent
 An exit status of 0 means no differences were found, 1 means some
@@ -529,18 +506,18 @@ Ignore changes due to tab expansion.
 
 @item -i
 @itemx --ignore-case
-Ignore case differences in file contents.
+Ignore case differences. Consider uppercase and lowercase letters equivalent.
 
 @item -O [@var{format1}][,@var{format2}]
 @itemx --force-format=[@var{format1}][,@var{format2}]
-Force the compressed formats given. Any of @var{format1} or @var{format2}
-may be omitted and the corresponding format will be automatically detected.
-Valid values for @var{format} are @samp{bz2}, @samp{gz}, @samp{lz},
-@samp{xz}, @samp{zst}, and @samp{un} for @samp{uncompressed}. If at least
-one format is specified with this option, the file is passed to the
-corresponding decompressor (or transmitted unmodified) without verifying its
-format, and the exact file names of both @var{file1} and @var{file2} must be
-given. Other names won't be tried.
+Force the compressed formats given. If @var{format1} or @var{format2} is
+omitted, the corresponding format is automatically detected. Valid values
+for @var{format} are @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz},
+@samp{zst}, and @samp{un} for @samp{uncompressed}. If at least one format is
+specified with this option, the file is passed to the corresponding
+decompressor (or transmitted unmodified) without checking its format, and
+the exact file names of both @var{file1} and @var{file2} must be given.
+Other names are not tried.
 
 @item -p
 @itemx --show-c-function
@@ -599,13 +576,12 @@ search on any combination of compressed and uncompressed files. If any file
 given is compressed, its decompressed content is used. If a file given does
 not exist, and its name does not end with one of the known extensions,
 @command{zgrep} tries the compressed file names corresponding to the formats
-supported. If a file fails to decompress, @command{zgrep} continues
-searching the rest of the files.
+supported until one is found. @xref{search-order}. If a file fails to
+decompress, @command{zgrep} continues searching the rest of the files.
 
 If a file is specified as @samp{-}, data are read from standard input,
-decompressed if needed, and fed to grep. Data read from standard input
-must be of the same type; all uncompressed or all in the same
-compressed format.
+decompressed if needed, and fed to grep. Data read from standard input must
+be of the same type; all uncompressed or all in the same compressed format.
 
 If no files are specified, recursive searches examine the current working
 directory, and nonrecursive searches read standard input.
@@ -738,8 +714,8 @@ Show only the part of matching lines that actually matches @var{pattern}.
 Force the compressed format given. Valid values for @var{format} are
 @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, @samp{zst}, and @samp{un} for
 @samp{uncompressed}. If this option is used, the files are passed to the
-corresponding decompressor (or transmitted unmodified) without verifying
-their format, and the exact file name must be given. Other names won't be
+corresponding decompressor (or transmitted unmodified) without checking
+their format, and the exact file name must be given. Other names are not
 tried.
 
 @item -P
@@ -809,14 +785,14 @@ unusual characters like newlines.
 @chapter Ztest
 @cindex ztest
 
-@command{ztest} verifies the integrity of the compressed files specified. It
+@command{ztest} checks the integrity of the compressed files specified. It
 also warns if an uncompressed file has a compressed file name extension, or
 if a compressed file has a wrong compressed extension. Uncompressed files
 are otherwise ignored. If a file is specified as @samp{-}, the integrity of
-compressed data read from standard input is verified. Data read from
+compressed data read from standard input is checked. Data read from
 standard input must be all in the same compressed format. If a file fails to
 decompress, does not exist, can't be opened, or is a terminal, @command{ztest}
-continues verifying the rest of the files. A final diagnostic is shown at
+continues testing the rest of the files. A final diagnostic is shown at
 verbosity level 1 or higher if any file fails the test when testing multiple
 files.
 
@@ -827,14 +803,14 @@ Bzip2, gzip, and lzip are the primary formats. Xz and zstd are optional. If
 the decompressor for the xz or zstd formats is not found, the corresponding
 files are ignored.
 
-Note that error detection in the xz format is broken. First, some xz
-files lack integrity information. Second, not all xz decompressors can
-@uref{http://www.nongnu.org/lzip/xz_inadequate.html#fragmented,,verify the integrity}
+Note that error detection in the xz format is broken. First, some xz files
+lack integrity information. Second, not all xz decompressors can
+@uref{http://www.nongnu.org/lzip/xz_inadequate.html#fragmented,,check the integrity}
 of all xz files. Third, section 2.1.1.2 'Stream Flags' of the
 @uref{http://tukaani.org/xz/xz-file-format.txt,,xz format specification}
 allows xz decompressors to produce garbage output without issuing any
-warning. Therefore, xz files can't always be verified as reliably as
-files in the other formats can.
+warning. Therefore, xz files can't always be checked as reliably as files in
+the other formats can.
 @c We can only hope that xz is soon abandoned.
 
 The format for running @command{ztest} is:
@@ -844,8 +820,8 @@ ztest [@var{options}] [@var{files}]
 @end example
 
 @noindent
-Exit status is 0 if all compressed files verify OK, 1 if environmental
-problems (file not found, invalid command line options, I/O errors, etc),
+Exit status is 0 if all compressed files check OK, 1 if environmental
+problems (file not found, invalid command-line options, I/O errors, etc),
 2 if any compressed file is corrupt or invalid, or if any file has an
 incorrect file name extension.
 
@@ -857,8 +833,8 @@ incorrect file name extension.
 Force the compressed format given. Valid values for @var{format} are
 @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, and @samp{zst}. If this option
 is used, the files are passed to the corresponding decompressor without
-verifying their format, and any files in a format that the decompressor
-can't understand will fail.
+checking their format, and any files in a format that the decompressor can't
+understand fail the test.
 
 @item -q
 @itemx --quiet
@@ -877,7 +853,7 @@ recursively, following all symbolic links.
 
 @item -v
 @itemx --verbose
-Verbose mode. Show the verify status for each file processed. Further -v's
+Verbose mode. Show the check status for each file processed. Further -v's
 increase the verbosity level. @xref{version}.
 
 @end table
@@ -894,21 +870,21 @@ recompressed, other files are ignored. Compressed files are decompressed and
 then recompressed on the fly; no temporary files are created. If an error
 happens while recompressing a file, @command{zupdate} exits immediately
 without recompressing the rest of the files. The lzip format is chosen as
-destination because it is the most appropriate for long-term data archiving.
+destination because it is the most appropriate for long-term archiving.
 
 If no files are specified, recursive searches examine the current working
 directory, and nonrecursive searches do nothing.
 
-If the lzip compressed version of a file already exists, the file is skipped
+If the lzip-compressed version of a file already exists, the file is skipped
 unless the option @option{--force} is given. In this case, if the comparison
 with the existing lzip version fails, an error is returned and the original
 file is not deleted. The operation of @command{zupdate} is meant to be safe
-and not cause any data loss. Therefore, existing lzip compressed files are
+and not cause any data loss. Therefore, existing lzip-compressed files are
 never overwritten nor deleted.
 
 Combining the options @option{--force} and @option{--keep}, as in
-@w{@samp{zupdate -f -k *.gz}}, verifies that there are no differences
-between each pair of files in a multiformat set of files.
+@w{@samp{zupdate -f -k *.gz}}, checks that there are no differences between
+each pair of files in a multiformat set of files.
 
 The names of the original files must have one of the following extensions:@*
 @samp{.bz2}, @samp{.gz}, @samp{.xz}, @samp{.zst}, or @samp{.Z}, which are
@@ -938,7 +914,7 @@ zupdate [@var{options}] [@var{files}]
 Exit status is 0 if all the compressed files were successfully recompressed
 (if needed), compared, and deleted (if requested). 1 if a non-fatal error
 occurred (file not found or not regular, or has invalid format, or can't be
-deleted). 2 if a fatal error occurred (invalid command line options,
+deleted). 2 if a fatal error occurred (invalid command-line options,
 compressor can't be run, or comparison fails).
 
 @command{zupdate} supports the following options:
@@ -968,10 +944,10 @@ Expand combined file name extensions; recompress @samp{.tbz}, @samp{.tbz2},
 
 @item -f
 @itemx --force
-Don't skip a file for which a lzip compressed version already exists.
-@option{--force} compares the content of the input file with the content
-of the existing lzip file and deletes the input file if both contents
-are identical.
+Don't skip a file for which a lzip-compressed version already exists.
+@option{--force} compares the content of the input file with the content of
+the existing lzip file and deletes the input file if both contents are
+identical.
 
 @item -i
 @itemx --ignore-errors