Merging upstream version 1.11.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-24 06:00:56 +01:00
parent ddac2f7869
commit bd6a3e4e88
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
31 changed files with 734 additions and 377 deletions

@@ -6,10 +6,10 @@
@finalout
@c %**end of header
@set UPDATED 5 January 2021
@set VERSION 1.10
@set UPDATED 25 January 2022
@set VERSION 1.11
@dircategory Data Compression
@dircategory Compression
@direntry
* Zutils: (zutils). Utilities dealing with compressed files
@end direntry
@@ -50,7 +50,7 @@ This manual is for Zutils (version @value{VERSION}, @value{UPDATED}).
@end menu
@sp 1
Copyright @copyright{} 2009-2021 Antonio Diaz Diaz.
Copyright @copyright{} 2009-2022 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission to copy,
distribute, and modify it.
@@ -74,7 +74,7 @@ those utilities supporting it.
@noindent
The utilities provided are zcat, zcmp, zdiff, zgrep, ztest, and zupdate.@*
The formats supported are bzip2, gzip, lzip, and xz.@*
The formats supported are bzip2, gzip, lzip, xz, and zstd.@*
Zutils uses external compressors. The compressor to be used for each format
is configurable at runtime.
@@ -84,12 +84,15 @@ gzip's znew.
NOTE: Bzip2 and lzip provide well-defined values of exit status, which makes
them safe to use with zutils. Gzip and xz may return ambiguous warning
values, making them less reliable back ends for zutils.
values, making them less reliable back ends for zutils. Zstd currently does
not even document its exit status in its man page.
@xref{compressor-requirements}.
FORMAT NOTE 1: The option @samp{--format} allows the processing of a subset
of formats in recursive mode and when trying compressed file names:
@w{@samp{zgrep foo -r --format=bz2,lz somedir somefile.tar}}.
of formats in recursive mode and when trying compressed file names. For
example, use the following command to search for the string @samp{foo} in
gzip and lzip files only:
@w{@samp{zgrep foo -r --format=gz,lz somedir somefile.tar}}.
FORMAT NOTE 2: If the option @samp{--force-format} is given, the files are
passed to the corresponding decompressor without verifying their format,
@@ -141,17 +144,19 @@ only supports the @samp{--help} form of this option.
@itemx --version
Print the version number on the standard output and exit.
This version number should be included in all bug reports.
In verbose mode, zdiff and zgrep print also the version of the diff or grep
program used respectively.
@item -M @var{format_list}
@itemx --format=@var{format_list}
Process only the formats listed in the comma-separated
@var{format_list}. Valid formats are @samp{bz2}, @samp{gz}, @samp{lz},
@samp{xz}, and @samp{un} for @samp{uncompressed}, meaning "any file name
without a known extension". This option excludes files based on
extension, instead of format, because it is more efficient. The
exclusion only applies to names generated automatically (for example
when adding extensions to a file name or when operating recursively on
directories). Files given in the command line are always processed.
Process only the formats listed in the comma-separated @var{format_list}.
Valid formats are @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, @samp{zst},
and @samp{un} for @samp{uncompressed}, meaning "any file name without a
known extension". This option excludes files based on extension, instead of
format, because it is more efficient. The exclusion only applies to names
generated automatically (for example when adding extensions to a file name
or when operating recursively on directories). Files given in the command
line are always processed.
Each format in @var{format_list} enables file names with the following
extensions:
@@ -161,6 +166,7 @@ extensions:
@item gz @tab enables @tab .gz .tgz
@item lz @tab enables @tab .lz .tlz
@item xz @tab enables @tab .xz .txz
@item zst @tab enables @tab .zst .tzst
@item un @tab enables @tab any other file name
@end multitable
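The extension table above can be read as a simple lookup from file name to format name. The following is only an illustrative sketch in POSIX shell, not how zutils itself implements the check. The helper names `format_of` and `enabled` are hypothetical, and the `bz2` row (`.bz2 .tbz .tbz2`) is an assumption, because that row falls outside the lines shown in this hunk.

```shell
# Hypothetical helpers, not part of zutils.

# Print the format name whose extensions match the file name "$1".
format_of() {
  case "$1" in
    *.bz2|*.tbz|*.tbz2) echo bz2 ;;   # assumed row, not visible in this hunk
    *.gz|*.tgz)         echo gz  ;;
    *.lz|*.tlz)         echo lz  ;;
    *.xz|*.txz)         echo xz  ;;
    *.zst|*.tzst)       echo zst ;;
    *)                  echo un  ;;   # any other file name
  esac
}

# Succeed if file name "$2" is enabled by the comma-separated list "$1".
enabled() {
  case ",$1," in
    *",$(format_of "$2"),"*) return 0 ;;
    *)                       return 1 ;;
  esac
}
```

For instance, `enabled gz,lz backup.tgz && echo "would be processed"` prints the message, while a name such as `backup.txz` is rejected by the same list. As the manual text states, this exclusion only concerns names generated automatically; files given in the command line are always processed.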
@@ -172,19 +178,21 @@ Don't read the runtime configuration file @samp{zutilsrc}.
@itemx --gz=@var{command}
@itemx --lz=@var{command}
@itemx --xz=@var{command}
@itemx --zst=@var{command}
Set program to be used as (de)compressor for the corresponding format.
@var{command} may include arguments. For example
@w{@samp{--lz='plzip --threads=2'}}. The program set with @samp{--lz} is
used for both compression and decompression. The other three are used only
for decompression. The name of the program can't begin with @samp{-}. These
used for both compression and decompression. The others are used only for
decompression. The name of the program can't begin with @samp{-}. These
options override the values set in @file{zutilsrc}. The compression program
used must meet three requirements:
@anchor{compressor-requirements}
@enumerate
@item
When called with the option @samp{-d}, it must read compressed data from
the standard input and produce decompressed data on the standard output.
When called with the option @samp{-d} and without file names, it must read
compressed data from the standard input and produce decompressed data on the
standard output.
@item
If the option @samp{-q} is passed to zutils, the compression program must
also accept it.
@@ -220,7 +228,8 @@ format, with the syntax:
@example
<format> = <compressor> [options]
@end example
where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz}, or @samp{xz}.
where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, or
@samp{zst}.
@end enumerate
@@ -278,10 +287,10 @@ Number all output lines, starting with 1. The line count is unlimited.
@item -O @var{format}
@itemx --force-format=@var{format}
Force the compressed format given. Valid values for @var{format} are
@samp{bz2}, @samp{gz}, @samp{lz}, and @samp{xz}. If this option is used,
the files are passed to the corresponding decompressor without verifying
their format, and the exact file name must be given. Other names won't
be tried.
@samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, and @samp{zst}. If this option
is used, the files are passed to the corresponding decompressor without
verifying their format, and the exact file name must be given. Other names
won't be tried.
@item -q
@itemx --quiet
@@ -350,7 +359,7 @@ the corresponding uncompressed file (the name of @var{file1} with the
extension removed).
@item
If @var{file1} is uncompressed, compares it with the decompressed
contents of @var{file1}.[lz|bz2|gz|xz] (the first one that is found).
contents of @var{file1}.[lz|bz2|gz|zst|xz] (the first one that is found).
@end itemize
@noindent
@@ -387,13 +396,13 @@ Compare at most @var{count} input bytes.
@item -O [@var{format1}][,@var{format2}]
@itemx --force-format=[@var{format1}][,@var{format2}]
Force the compressed formats given. Any of @var{format1} or
@var{format2} may be omitted and the corresponding format will be
automatically detected. Valid values for @var{format} are @samp{bz2},
@samp{gz}, @samp{lz}, and @samp{xz}. If at least one format is specified
with this option, the file is passed to the corresponding decompressor
without verifying its format, and the exact file names of both
@var{file1} and @var{file2} must be given. Other names won't be tried.
Force the compressed formats given. Any of @var{format1} or @var{format2}
may be omitted and the corresponding format will be automatically detected.
Valid values for @var{format} are @samp{bz2}, @samp{gz}, @samp{lz},
@samp{xz}, and @samp{zst}. If at least one format is specified with this
option, the file is passed to the corresponding decompressor without
verifying its format, and the exact file names of both @var{file1} and
@var{file2} must be given. Other names won't be tried.
@item -q
@itemx -s
@@ -434,7 +443,7 @@ the corresponding uncompressed file (the name of @var{file1} with the
extension removed).
@item
If @var{file1} is uncompressed, compares it with the decompressed
contents of @var{file1}.[lz|bz2|gz|xz] (the first one that is found).
contents of @var{file1}.[lz|bz2|gz|zst|xz] (the first one that is found).
@end itemize
@noindent
@@ -478,13 +487,13 @@ Ignore case differences in file contents.
@item -O [@var{format1}][,@var{format2}]
@itemx --force-format=[@var{format1}][,@var{format2}]
Force the compressed formats given. Any of @var{format1} or
@var{format2} may be omitted and the corresponding format will be
automatically detected. Valid values for @var{format} are @samp{bz2},
@samp{gz}, @samp{lz}, and @samp{xz}. If at least one format is specified
with this option, the file is passed to the corresponding decompressor
without verifying its format, and the exact file names of both
@var{file1} and @var{file2} must be given. Other names won't be tried.
Force the compressed formats given. Any of @var{format1} or @var{format2}
may be omitted and the corresponding format will be automatically detected.
Valid values for @var{format} are @samp{bz2}, @samp{gz}, @samp{lz},
@samp{xz}, and @samp{zst}. If at least one format is specified with this
option, the file is passed to the corresponding decompressor without
verifying its format, and the exact file names of both @var{file1} and
@var{file2} must be given. Other names won't be tried.
@item -p
@itemx --show-c-function
@@ -513,6 +522,11 @@ Use the unified output format.
@itemx --unified=@var{n}
Same as -u but use @var{n} lines of context.
@item -v
@itemx --verbose
When specified before @samp{--version}, print the version of the diff
program used.
@item -w
@itemx --ignore-all-space
Ignore all white space.
@@ -644,10 +658,10 @@ Show only the part of matching lines that actually matches @var{pattern}.
@item -O @var{format}
@itemx --force-format=@var{format}
Force the compressed format given. Valid values for @var{format} are
@samp{bz2}, @samp{gz}, @samp{lz}, and @samp{xz}. If this option is used,
the files are passed to the corresponding decompressor without verifying
their format, and the exact file name must be given. Other names won't
be tried.
@samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, and @samp{zst}. If this option
is used, the files are passed to the corresponding decompressor without
verifying their format, and the exact file name must be given. Other names
won't be tried.
@item -q
@itemx --quiet
@@ -674,7 +688,8 @@ Suppress error messages about nonexistent or unreadable files.
Select non-matching lines.
@item --verbose
Verbose mode. Show error messages.
Verbose mode. Show error messages. When specified before @samp{--version},
print the version of the grep program used.
@item -w
@itemx --word-regexp
@@ -703,6 +718,10 @@ test when testing multiple files.
If no files are specified, recursive searches examine the current working
directory, and nonrecursive searches read standard input.
Bzip2, gzip, and lzip are the primary formats. Xz and zstd are optional. If
the decompressor for the xz or zstd formats is not found, the corresponding
files are ignored.
Note that error detection in the xz format is broken. First, some xz
files lack integrity information. Second, not all xz decompressors can
@uref{http://www.nongnu.org/lzip/xz_inadequate.html#fragmented,,verify the integrity}
@@ -730,11 +749,11 @@ ztest supports the following options:
@item -O @var{format}
@itemx --force-format=@var{format}
Force the compressed format given. Valid values for @var{format} are
@samp{bz2}, @samp{gz}, @samp{lz}, and @samp{xz}. If this option is used, the
files are passed to the corresponding decompressor without verifying their
format, and any files in a format that the decompressor can't understand
will fail. For example, @samp{--force-format=gz} can test gzipped (.gz) and
compress'd (.Z) files if the compressor used is GNU gzip.
@samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, and @samp{zst}. If this option
is used, the files are passed to the corresponding decompressor without
verifying their format, and any files in a format that the decompressor
can't understand will fail. For example, @samp{--force-format=gz} can test
gzipped (.gz) and compress'd (.Z) files if the compressor used is GNU gzip.
@item -q
@itemx --quiet
@@ -763,14 +782,14 @@ Further -v's increase the verbosity level.
@chapter Zupdate
@cindex zupdate
zupdate recompresses files from bzip2, gzip, and xz formats to lzip format.
Each original is compared with the new file and then deleted. Only regular
files with standard file name extensions are recompressed, other files are
ignored. Compressed files are decompressed and then recompressed on the fly;
no temporary files are created. If an error happens while recompressing a
file, zupdate exits immediately without recompressing the rest of the files.
The lzip format is chosen as destination because it is the most appropriate
for long-term data archiving.
zupdate recompresses files from bzip2, gzip, xz, and zstd formats to lzip
format. Each original is compared with the new file and then deleted. Only
regular files with standard file name extensions are recompressed, other
files are ignored. Compressed files are decompressed and then recompressed
on the fly; no temporary files are created. If an error happens while
recompressing a file, zupdate exits immediately without recompressing the
rest of the files. The lzip format is chosen as destination because it is
the most appropriate for long-term data archiving.
If no files are specified, recursive searches examine the current working
directory, and nonrecursive searches do nothing.
@@ -782,21 +801,29 @@ and the original file is not deleted. The operation of zupdate is meant
to be safe and not cause any data loss. Therefore, existing lzip
compressed files are never overwritten nor deleted.
Recompressing files from a read-only file system to another place can be
done by first linking the files from the destination directory and then
compressing the links: @w{@samp{ln -s /src/foo.gz . && zupdate foo.gz}}
Combining the options @samp{--force} and @samp{--keep}, as in
@w{@samp{zupdate -f -k *.gz}}, verifies that there are no differences
between each pair of files in a multiformat set of files.
The names of the original files must have one of the following extensions:@*
@samp{.bz2}, @samp{.gz}, or @samp{.xz}, which are recompressed to
@samp{.lz};@*
@samp{.tbz}, @samp{.tbz2}, @samp{.tgz}, or @samp{.txz}, which are
recompressed to @samp{.tlz}.@*
@samp{.bz2}, @samp{.gz}, @samp{.xz}, or @samp{.zst}, which are recompressed
to @samp{.lz};@*
@samp{.tbz}, @samp{.tbz2}, @samp{.tgz}, @samp{.txz}, or @samp{.tzst}, which
are recompressed to @samp{.tlz}.@*
Keeping the combined extensions (@samp{.tgz} --> @samp{.tlz}) may be useful
when recompressing Slackware packages, for example.
Bzip2, gzip, and lzip are the primary formats. Xz and zstd are optional. If
the decompressor for the xz or zstd formats is not found, the corresponding
files are ignored.
Recompressing a file is much like copying or moving it; therefore zupdate
preserves the access and modification dates, permissions, and, when
possible, ownership of the file just as @samp{cp -p} does. (If the user ID or
possible, ownership of the file just as @w{@samp{cp -p}} does. (If the user ID or
the group ID can't be duplicated, the file permission bits S_ISUID and
S_ISGID are cleared).
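The renaming rule stated for zupdate in this hunk (plain extensions become `.lz`, combined tar extensions become `.tlz`) can be sketched as a small POSIX shell function. This is an illustration of the naming rule only, under the assumption that nothing but the last extension is replaced. The function name `lz_name` is hypothetical and this is not zupdate's actual code.

```shell
# Hypothetical helper, not part of zutils: print the name that a
# recompressed file would receive under the extension rules quoted above.
# Returns nonzero for names that carry none of the listed extensions.
lz_name() {
  case "$1" in
    *.tbz|*.tbz2|*.tgz|*.txz|*.tzst)
      printf '%s\n' "${1%.*}.tlz" ;;   # combined extension is kept combined
    *.bz2|*.gz|*.xz|*.zst)
      printf '%s\n' "${1%.*}.lz" ;;    # only the last extension is replaced
    *)
      return 1 ;;                      # other files are ignored
  esac
}
```

Calling `lz_name package.tgz` yields a `.tlz` name, which mirrors the Slackware package case mentioned in the text, and a name without a listed extension makes the function fail, matching the statement that other files are ignored.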