\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename zutils.info
@documentencoding ISO-8859-15
@settitle Zutils Manual
@finalout
@c %**end of header

@set UPDATED 2 August 2013
@set VERSION 1.1

@dircategory Data Compression
@direntry
* Zutils: (zutils).             Utilities dealing with compressed files
@end direntry


@ifnothtml
@titlepage
@title Zutils
@subtitle Utilities dealing with compressed files
@subtitle for Zutils version @value{VERSION}, @value{UPDATED}
@author by Antonio Diaz Diaz

@page
@vskip 0pt plus 1filll
@end titlepage

@contents
@end ifnothtml

@node Top
@top

This manual is for Zutils (version @value{VERSION}, @value{UPDATED}).

@menu
* Introduction::         Purpose and features of zutils
* Common options::       Common options
* The zutilsrc file::    The zutils configuration file
* Zcat::                 Concatenating compressed files
* Zcmp::                 Comparing compressed files byte by byte
* Zdiff::                Comparing compressed files line by line
* Zgrep::                Searching inside compressed files
* Ztest::                Testing integrity of compressed files
* Problems::             Reporting bugs
* Concept index::        Index of concepts
@end menu

@sp 1
Copyright @copyright{} 2008, 2009, 2010, 2011, 2012, 2013
Antonio Diaz Diaz.

This manual is free documentation: you have unlimited permission
to copy, distribute and modify it.


@node Introduction
@chapter Introduction
@cindex introduction

Zutils is a collection of utilities able to deal with any combination of
compressed and uncompressed files transparently. If any given file,
including standard input, is compressed, its decompressed content is
used. Compressed files are decompressed on the fly; no temporary files
are created.
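
As a rough illustration only, and not the actual zutils code, the
following Python sketch shows one way a compressed input could be told
apart from an uncompressed one. It assumes that each format is
recognized by the well-known magic bytes at the start of the data; the
names @code{MAGIC} and @code{detect_format} are hypothetical.

@example
# Illustrative sketch only; not taken from the zutils sources.
# Leading "magic" bytes that identify each supported format.
MAGIC = (
    ("bz2", b"BZh"),
    ("gz", b"\x1f\x8b"),
    ("lz", b"LZIP"),
    ("xz", b"\xfd7zXZ\x00"),
)

def detect_format(header):
    """Return 'bz2', 'gz', 'lz' or 'xz', or None for uncompressed data."""
    for name, magic in MAGIC:
        if header.startswith(magic):
            return name
    return None
@end example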

These utilities are not wrapper scripts but safer and more efficient C++
programs. In particular the @samp{--recursive} option is very efficient
in those utilities supporting it.

@noindent
The provided utilities are zcat, zcmp, zdiff, zgrep and ztest.@*
The supported formats are bzip2, gzip, lzip and xz.@*
The compressor to be used for each format is configurable at runtime.

Zcat, zcmp, zdiff, and zgrep are improved replacements for the shell
scripts provided with GNU gzip. Ztest is unique to zutils.

NOTE: Bzip2 and lzip provide well-defined values of exit status, which
makes them safe to use with zutils. Gzip and xz may return ambiguous
warning values, making them less reliable backends for zutils.

LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never
have been compressed. Decompressed is used to refer to data which has
undergone the process of decompression.

@sp 1
Numbers given as arguments to options (positions, sizes) may be followed
by a multiplier and an optional @samp{B} for "byte".

Table of SI and binary prefixes (unit multipliers):

@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
@item Prefix @tab Value @tab | @tab Prefix @tab Value
@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
@end multitable
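
The multipliers in the table can be applied as in the following Python
sketch. It is only an illustration under stated assumptions, not the
parser used by zutils: the function name @code{parse_size} is
hypothetical, and the sketch assumes that the SI prefix for 1000 is the
lowercase @samp{k} while the binary prefix is @samp{Ki}, exactly as
listed in the table.

@example
# Illustrative sketch only; not the parser used by zutils.
SI = "kMGTPEZY"        # k = 10^3, M = 10^6, ... Y = 10^24
BINARY = "KMGTPEZY"    # Ki = 2^10, Mi = 2^20, ... Yi = 2^80

def parse_size(text):
    """Convert a string such as '64KiB' or '10M' to a plain integer."""
    end = 0
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == 0:
        raise ValueError("no digits in " + repr(text))
    value = int(text[:end])
    suffix = text[end:]
    if suffix.endswith("B"):           # optional 'B' for "byte"
        suffix = suffix[:-1]
    if not suffix:
        return value
    binary = suffix.endswith("i")      # 'Ki', 'Mi', ... are powers of 1024
    letter = suffix[:-1] if binary else suffix
    exponent = (BINARY if binary else SI).find(letter) + 1
    if len(letter) != 1 or exponent == 0:
        raise ValueError("bad multiplier in " + repr(text))
    return value * (1024 if binary else 1000) ** exponent
@end example

With these definitions, @code{parse_size("4KiB")} gives 4096 and
@code{parse_size("10k")} gives 10000.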


@node Common options
@chapter Common options
@cindex common options

The following options are available in all the utilities. Rather than
writing identical descriptions for each of the programs, they are
described here.

@table @samp
@item -h
@itemx --help
Print an informative help message describing the options and exit. Zgrep
only supports the @samp{--help} form of this option.

@item -V
@itemx --version
Print the version number on the standard output and exit.

@item -N
@itemx --no-rcfile
Don't read the runtime configuration file @file{zutilsrc}.

@item --bz2=@var{command}
@itemx --gz=@var{command}
@itemx --lz=@var{command}
@itemx --xz=@var{command}
Set the program (which may include arguments) to be used as the
(de)compressor for the given format. These options override the values
set in @file{zutilsrc}. The compression program used must meet three
requirements:

@enumerate
@item
When called with the @samp{-d} option, it must read compressed data from
the standard input and produce decompressed data on the standard output.
@item
If the @samp{-q} option is passed to zutils, the compression program
must also accept it.
@item
It must return 0 if no errors occurred, and a non-zero value otherwise.
@end enumerate

@end table
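
The three requirements placed on a compression program can be pictured
with the following Python sketch. It only illustrates the calling
convention described above and is not how zutils is implemented; the
function name @code{decompress_with} and its parameters are
hypothetical.

@example
# Illustrative sketch only; zutils itself is written in C++.
import subprocess

def decompress_with(command, data, quiet=False):
    """Feed 'data' to 'command -d' and return the decompressed bytes."""
    argv = list(command) + ["-d"]      # requirement 1: stdin to stdout
    if quiet:
        argv.append("-q")              # requirement 2: '-q' is passed on
    result = subprocess.run(argv, input=data, stdout=subprocess.PIPE)
    if result.returncode != 0:         # requirement 3: non-zero is an error
        raise RuntimeError("decompressor failed: " + " ".join(argv))
    return result.stdout
@end example

A call such as @code{decompress_with(["gzip"], compressed_bytes)} would
then run @samp{gzip -d}, assuming a gzip program is installed.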


@node The zutilsrc file
@chapter The zutilsrc file
@cindex the zutilsrc file

@file{zutilsrc} is the runtime configuration file for zutils. In it you
may define the compressor name and options to be used for each format.
The @file{zutilsrc} file is optional; you do not need to install it in
order to run zutils.

The compressors specified on the command line override those specified
in the @file{zutilsrc} file.

You may copy the system @file{zutilsrc} file
@file{$@{sysconfdir@}/zutilsrc} to @file{$HOME/.zutilsrc} and customize
these options as you like. The file syntax is fairly obvious (and there
are further instructions in it):

@enumerate
@item
Any line beginning with @samp{#} is a comment line.
@item
Each non-comment line defines the command to be used for the given
format, with the syntax:
@example
<format> = <compressor> [options]
@end example
where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz} or @samp{xz}.
@end enumerate
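
The two syntax rules above are simple enough to be read by a few lines
of code. The following Python sketch is an illustration only, not the
reader used by zutils; the name @code{parse_zutilsrc} is hypothetical,
and skipping blank lines and ignoring leading white space are
assumptions that the rules above do not state.

@example
# Illustrative sketch only; not the reader used by zutils.
FORMATS = ("bz2", "gz", "lz", "xz")

def parse_zutilsrc(text):
    """Map each format named in 'text' to its command as a list of words."""
    commands = dict()
    for line in text.splitlines():
        line = line.strip()            # assumption: outer blanks ignored
        if not line or line.startswith("#"):
            continue                   # blank line or comment line
        name, sep, command = line.partition("=")
        name = name.strip()
        if not sep or name not in FORMATS or not command.strip():
            raise ValueError("bad line: " + line)
        commands[name] = command.split()
    return commands
@end example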


@node Zcat
@chapter Zcat
@cindex zcat

Zcat copies each given file (@samp{-} means standard input) to standard
output. If any given file is compressed, its decompressed content is
used. If a given file does not exist, and its name does not end with one
of the known extensions, zcat tries the compressed file names
corresponding to the supported formats.
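
The name lookup just described can be sketched in Python as follows.
This is an illustration under assumptions, not the zutils code: the
function name @code{resolve_name} is hypothetical, the set of known
extensions is assumed to be the four format names, and the order in
which they are tried is assumed.

@example
# Illustrative sketch only; extension set and order are assumptions.
import os

EXTENSIONS = (".lz", ".bz2", ".gz", ".xz")

def resolve_name(name, exists=os.path.exists):
    """Return the file name to open, or None if nothing suitable exists."""
    if exists(name):
        return name
    if name.endswith(EXTENSIONS):      # already has a known extension
        return None
    for ext in EXTENSIONS:             # try the compressed file names
        if exists(name + ext):
            return name + ext
    return None
@end example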

If no files are specified, data is read from standard input,
decompressed if needed, and sent to standard output. Data read from
standard input must be of the same type: all uncompressed or all in the
same compression format.

The format for running zcat is:

@example
zcat [@var{options}] [@var{files}]
@end example

@noindent
Exit status is 0 if no errors occurred, non-zero otherwise.

Zcat supports the following options:

@table @samp
@item -A
@itemx --show-all
Equivalent to @samp{-vET}.

@item -b
@itemx --number-nonblank
Number all nonblank output lines, starting with 1. The line count is
unlimited.

@item -e
Equivalent to @samp{-vE}.

@item -E
@itemx --show-ends
Print a @samp{$} after the end of each line.

@item --format=@var{fmt}
Force the given compression format. Valid values for @var{fmt} are
@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used,
the exact file name must be given. Other names won't be tried.

@item -n
@itemx --number
Number all output lines, starting with 1. The line count is unlimited.

@item -q
@itemx --quiet
Quiet operation. Suppress all messages.

@item -r
@itemx --recursive
Operate recursively on directories.

@item -s
@itemx --squeeze-blank
Replace multiple adjacent blank lines with a single blank line.

@item -t
Equivalent to @samp{-vT}.

@item -T
@itemx --show-tabs
Print TAB characters as @samp{^I}.

@item -v
@itemx --show-nonprinting
Print control characters except for LF (newline) and TAB using @samp{^}
notation and precede characters larger than 127 with @samp{M-} (which
stands for "meta").

@item --verbose
Verbose mode. Show error messages.

@end table
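
The notation used by @samp{--show-nonprinting} can be made concrete
with a short Python sketch. It is an illustration only: the function
name @code{show_nonprinting} is hypothetical, and the treatment of
bytes above 127 (an @samp{M-} prefix followed by the notation for the
byte minus 128) follows the usual convention of @samp{cat -v}, which is
assumed here rather than taken from the zutils sources.

@example
# Illustrative sketch only; modelled on the convention of 'cat -v'.
def show_nonprinting(byte):
    """Return the '-v' style representation of one byte value (0-255)."""
    if byte in (9, 10):                # TAB and LF are printed unchanged
        return chr(byte)
    prefix = ""
    if byte >= 128:                    # "meta" characters
        prefix = "M-"
        byte -= 128
    if byte < 32:                      # control characters, 1 becomes ^A
        return prefix + "^" + chr(byte + 64)
    if byte == 127:                    # DEL
        return prefix + "^?"
    return prefix + chr(byte)
@end example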


@node Zcmp
@chapter Zcmp
@cindex zcmp

Zcmp compares two files (@samp{-} means standard input) and, if they
differ, reports the first byte and line number where they differ. Bytes
and lines are numbered starting with 1. If any given file is compressed,
its decompressed content is used. Compressed files are decompressed on
the fly; no temporary files are created.
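
The numbering of bytes and lines can be illustrated with the following
Python sketch, which works on data that has already been decompressed.
It is not the zutils implementation; the name @code{first_difference}
is hypothetical, and reporting a position when one input is a prefix of
the other is a simplification made for the sketch.

@example
# Illustrative sketch only; operates on already decompressed bytes.
def first_difference(a, b):
    """Return (byte_number, line_number) of the first difference, or None."""
    line = 1
    for index in range(min(len(a), len(b))):
        if a[index] != b[index]:
            return (index + 1, line)   # both numbers start at 1
        if a[index] == 10:             # a newline ends the current line
            line += 1
    if len(a) != len(b):               # one input is a prefix of the other
        return (min(len(a), len(b)) + 1, line)
    return None
@end example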

The format for running zcmp is:

@example
zcmp [@var{options}] @var{file1} [@var{file2}]
@end example

@noindent
This compares @var{file1} to @var{file2}. If @var{file2} is omitted,
zcmp tries the following:

@enumerate
@item
If @var{file1} is compressed, compares its decompressed contents with
the corresponding uncompressed file (the name of @var{file1} with the
extension removed).
@item
If @var{file1} is uncompressed, compares it with the decompressed
contents of @var{file1}.[lz|bz2|gz|xz] (the first one that is found).
@item
If no suitable file is found, compares @var{file1} with data read from
standard input.
@end enumerate
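
The three rules above can be sketched in Python as follows. The sketch
is an illustration only and not the zutils code: the function name
@code{second_operand} is hypothetical, and for simplicity it decides
whether @var{file1} is compressed by looking at its extension, which is
an assumption.

@example
# Illustrative sketch only; "compressed" is judged by extension here.
import os

EXTENSIONS = (".lz", ".bz2", ".gz", ".xz")   # order as listed in rule 2

def second_operand(file1, exists=os.path.exists):
    """Return the name to compare 'file1' with; '-' means standard input."""
    for ext in EXTENSIONS:
        if file1.endswith(ext):        # rule 1: file1 is compressed
            plain = file1[:-len(ext)]
            return plain if exists(plain) else "-"
    for ext in EXTENSIONS:             # rule 2: file1 is uncompressed
        if exists(file1 + ext):
            return file1 + ext
    return "-"                         # rule 3: nothing suitable found
@end example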

@noindent
An exit status of 0 means no differences were found, 1 means some
differences were found, and 2 means trouble.

Zcmp supports the following options:

@table @samp
@item -b
@itemx --print-bytes
Print the differing bytes. Print control bytes as a @samp{^} followed by
a letter, and precede bytes larger than 127 with @samp{M-} (which stands
for "meta").

@item --format=[@var{fmt1}][,@var{fmt2}]
Force the given compression formats. Either @var{fmt1} or @var{fmt2} may
be omitted, and the corresponding format will be automatically detected.
Valid values for @var{fmt} are @samp{bz2}, @samp{gz}, @samp{lz} and
@samp{xz}. If at least one format is specified with this option, the
exact file names of both @var{file1} and @var{file2} must be given.
Other names won't be tried.

@item -i @var{size}
@itemx --ignore-initial=@var{size}
Ignore any differences in the first @var{size} bytes of the input files.
Treat files with fewer than @var{size} bytes as if they were empty. If
@var{size} is in the form @samp{@var{size1},@var{size2}}, ignore the
first @var{size1} bytes of the first input file and the first
@var{size2} bytes of the second input file.

@item -l
@itemx -v
@itemx --list
@itemx --verbose
Print the byte numbers (in decimal) and values (in octal) of all
differing bytes.

@item -n @var{count}
@itemx --bytes=@var{count}
Compare at most @var{count} input bytes.

@item -q
@itemx -s
@itemx --quiet
@itemx --silent
Do not print anything; only return an exit status indicating whether the
files differ.

@end table


@node Zdiff
@chapter Zdiff
@cindex zdiff

Zdiff compares two files (@samp{-} means standard input), and if they
differ, shows the differences line by line. If any given file is
compressed, its decompressed content is used. Zdiff is a front end to
the diff program and has the limitation that messages from diff refer to
temporary filenames instead of those specified.

The format for running zdiff is:

@example
zdiff [@var{options}] @var{file1} [@var{file2}]
@end example

@noindent
This compares @var{file1} to @var{file2}. If @var{file2} is omitted,
zdiff tries the following:

@enumerate
@item
If @var{file1} is compressed, compares its decompressed contents with
the corresponding uncompressed file (the name of @var{file1} with the
extension removed).
@item
If @var{file1} is uncompressed, compares it with the decompressed
contents of @var{file1}.[lz|bz2|gz|xz] (the first one that is found).
@item
If no suitable file is found, compares @var{file1} with data read from
standard input.
@end enumerate

@noindent
An exit status of 0 means no differences were found, 1 means some
differences were found, and 2 means trouble.

Zdiff supports the following options:

@table @samp
@item -a
@itemx --text
Treat all files as text.

@item -b
@itemx --ignore-space-change
Ignore changes in the amount of white space.

@item -B
@itemx --ignore-blank-lines
Ignore changes whose lines are all blank.

@item -c
Use the context output format.

@item -C @var{n}
@itemx --context=@var{n}
Same as @samp{-c} but use @var{n} lines of context.

@item -d
@itemx --minimal
Try hard to find a smaller set of changes.

@item -E
@itemx --ignore-tab-expansion
Ignore changes due to tab expansion.

@item --format=[@var{fmt1}][,@var{fmt2}]
Force the given compression formats. Either @var{fmt1} or @var{fmt2} may
be omitted, and the corresponding format will be automatically detected.
Valid values for @var{fmt} are @samp{bz2}, @samp{gz}, @samp{lz} and
@samp{xz}. If at least one format is specified with this option, the
exact file names of both @var{file1} and @var{file2} must be given.
Other names won't be tried.

@item -i
@itemx --ignore-case
Ignore case differences in file contents.

@item -p
@itemx --show-c-function
Show which C function each change is in.

@item -q
@itemx --brief
Output only whether files differ.

@item -s
@itemx --report-identical-files
Report when two files are identical.

@item -t
@itemx --expand-tabs
Expand tabs to spaces in output.

@item -T
@itemx --initial-tab
Make tabs line up by prepending a tab.

@item -u
Use the unified output format.

@item -U @var{n}
@itemx --unified=@var{n}
Same as @samp{-u} but use @var{n} lines of context.

@item -w
@itemx --ignore-all-space
Ignore all white space.

@end table


@node Zgrep
@chapter Zgrep
@cindex zgrep

Zgrep is a front end to the grep program that allows transparent search
on any combination of compressed and uncompressed files. If any given
file is compressed, its decompressed content is used. If a given file
does not exist, and its name does not end with one of the known
extensions, zgrep tries the compressed file names corresponding to the
supported formats.

If no files are specified, data is read from standard input,
decompressed if needed, and fed to grep. Data read from standard input
must be of the same type: all uncompressed or all in the same
compression format.

The format for running zgrep is:

@example
zgrep [@var{options}] @var{pattern} [@var{files}]
@end example

@noindent
An exit status of 0 means at least one match was found, 1 means no
matches were found, and 2 means trouble.
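
The idea of searching decompressed content and reporting the result
through the exit status can be sketched in Python. The sketch is an
illustration only and differs from zgrep in important ways that are
assumptions of the example: it handles only plain and gzip data, it
uses Python's @code{re} module instead of the grep program, and the
name @code{grep_status} is hypothetical.

@example
# Illustrative sketch only; zgrep itself feeds the data to grep.
import gzip
import re
import zlib

def grep_status(pattern, data):
    """Return (matching lines, exit status) for plain or gzip 'data'."""
    try:
        if data.startswith(b"\x1f\x8b"):       # gzip magic bytes
            data = gzip.decompress(data)
        regex = re.compile(pattern)
    except (OSError, EOFError, zlib.error, re.error):
        return ([], 2)                         # 2 means trouble
    matches = [line for line in data.splitlines() if regex.search(line)]
    return (matches, 0 if matches else 1)      # 0: match found, 1: none
@end example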

Zgrep supports the following options:

@table @samp
@item -a
@itemx --text
Treat all files as text.

@item -A @var{n}
@itemx --after-context=@var{n}
Print @var{n} lines of trailing context.

@item -b
@itemx --byte-offset
Print the byte offset of each line.

@item -B @var{n}
@itemx --before-context=@var{n}
Print @var{n} lines of leading context.

@item -c
@itemx --count
Only print a count of matching lines per file.

@item -C @var{n}
@itemx --context=@var{n}
Print @var{n} lines of output context.

@item -e @var{pattern}
@itemx --regexp=@var{pattern}
Use @var{pattern} as the pattern to match.

@item -E
@itemx --extended-regexp
Treat @var{pattern} as an extended regular expression.

@item -f @var{file}
@itemx --file=@var{file}
Obtain patterns from @var{file}, one per line.

@item -F
@itemx --fixed-strings
Treat @var{pattern} as a set of newline-separated strings.

@item --format=@var{fmt}
Force the given compression format. Valid values for @var{fmt} are
@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used,
the exact file name must be given. Other names won't be tried.

@item -h
@itemx --no-filename
Suppress the prefixing of filenames on output when multiple files are
searched.

@item -H
@itemx --with-filename
Print the filename for each match.

@item -i
@itemx --ignore-case
Ignore case distinctions.

@item -I
Ignore binary files.

@item -l
@itemx --files-with-matches
Only print names of files containing at least one match.

@item -L
@itemx --files-without-match
Only print names of files not containing any matches.

@item -m @var{n}
@itemx --max-count=@var{n}
Stop after @var{n} matches.

@item -n
@itemx --line-number
Prefix each matched line with its line number in the input file.

@item -o
@itemx --only-matching
Show only the part of matching lines that actually matches @var{pattern}.

@item -q
@itemx --quiet
Suppress all messages.

@item -r
@itemx --recursive
Operate recursively on directories.

@item -s
@itemx --no-messages
Suppress error messages.

@item -v
@itemx --invert-match
Select non-matching lines.

@item --verbose
Verbose mode. Show error messages.

@item -w
@itemx --word-regexp
Match only whole words.

@item -x
@itemx --line-regexp
Match only whole lines.

@end table


@node Ztest
@chapter Ztest
@cindex ztest

Ztest verifies the integrity of the specified compressed files.
Uncompressed files are ignored. If no files are specified, the integrity
of compressed data read from standard input is verified. Data read from
standard input must be all in the same compression format.

Note that some xz files lack integrity information, and therefore can't
be verified as reliably as the other formats can.

The format for running ztest is:

@example
ztest [@var{options}] [@var{files}]
@end example

@noindent
The exit status is 0 if all compressed files verify OK, 1 if there were
environmental problems (file not found, invalid flags, I/O errors, etc.), and
2 if any compressed file is corrupt or invalid.
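
What it means for compressed data to verify OK can be illustrated with
a small Python sketch. It is not how ztest works internally, since
ztest runs the configured compressor; the sketch instead uses Python's
own @code{gzip} module, handles only data already known to be in gzip
format, does not model exit status 1, and the name @code{verify_gzip}
is hypothetical.

@example
# Illustrative sketch only; ztest itself runs the configured compressor.
import gzip
import zlib

def verify_gzip(data):
    """Return 0 if 'data' is intact gzip data, 2 if it is corrupt."""
    try:
        gzip.decompress(data)          # checks the stored CRC and size
    except (OSError, EOFError, zlib.error):
        return 2
    return 0
@end example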

Ztest supports the following options:

@table @samp
@item --format=@var{fmt}
Force the given compression format. Valid values for @var{fmt} are
@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used,
all files not in the given format will fail.

@item -q
@itemx --quiet
Quiet operation. Suppress all messages.

@item -r
@itemx --recursive
Operate recursively on directories.

@item -v
@itemx --verbose
Verbose mode. Show the verify status for each file processed.
Further -v's increase the verbosity level.

@end table


@node Problems
@chapter Reporting Bugs
@cindex bugs
@cindex getting help

There are probably bugs in zutils. There are certainly errors and
omissions in this manual. If you report them, they will get fixed. If
you don't, no one will ever know about them and they will remain unfixed
for all eternity, if not longer.

If you find a bug in zutils, please send electronic mail to
@email{zutils-bug@@nongnu.org}. Include the version number, which you can
find by running @w{@samp{zutils --version}}.


@node Concept index
@unnumbered Concept index

@printindex cp

@bye