Merging upstream version 1.5.

Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-02-24 04:12:55 +01:00
commit 66060d80f9
parent 5e1f92d2a0
20 changed files with 632 additions and 272 deletions
--- a/doc/plzip.info
+++ b/doc/plzip.info
@@ -11,7 +11,7 @@ File: plzip.info,  Node: Top,  Next: Introduction,  Up: (dir)
 Plzip Manual
 ************

-This manual is for Plzip (version 1.4, 9 July 2015).
+This manual is for Plzip (version 1.5, 14 May 2016).

 * Menu:

@@ -21,11 +21,13 @@ This manual is for Plzip (version 1.4, 9 July 2015).
 * File format::            Detailed format of the compressed file
 * Memory requirements::    Memory required to compress and decompress
 * Minimum file sizes::     Minimum file sizes required for full speed
+* Trailing data::          Extra data appended to the file
+* Examples::               A small tutorial with examples
 * Problems::               Reporting bugs
 * Concept index::          Index of concepts


-   Copyright (C) 2009-2015 Antonio Diaz Diaz.
+   Copyright (C) 2009-2016 Antonio Diaz Diaz.

   This manual is free documentation: you have unlimited permission to
 copy, distribute and modify it.
@@ -59,7 +61,7 @@ availability:
     recovery means. The lziprecover program can repair bit-flip errors
     (one of the most common forms of data corruption) in lzip files,
     and provides data recovery capabilities, including error-checked
-     merging of damaged copies of a file.  *note Data safety:
+     merging of damaged copies of a file.  *Note Data safety:
     (lziprecover)Data safety.

   * The lzip format is as simple as possible (but not simpler). The
@@ -115,13 +117,6 @@ two or more compressed files. The result is the concatenation of the
 corresponding uncompressed files. Integrity testing of concatenated
 compressed files is also supported.

-   WARNING! Even if plzip is bug-free, other causes may result in a
-corrupt compressed file (bugs in the system libraries, memory errors,
-etc).  Therefore, if the data you are going to compress are important,
-give the '--keep' option to plzip and do not remove the original file
-until you verify the compressed file with a command like
-'plzip -cd file.lz | cmp file -'.
-

 File: plzip.info,  Node: Invoking plzip,  Next: Program design,  Prev: Introduction,  Up: Top

@@ -132,6 +127,10 @@ The format for running plzip is:

     plzip [OPTIONS] [FILES]

+'-' used as a FILE argument means standard input. It can be mixed with
+other FILES and is read just once, the first time it appears in the
+command line.
+
   Plzip supports the following options:

 '-h'
@@ -142,6 +141,13 @@ The format for running plzip is:
 '--version'
     Print the version number of plzip on the standard output and exit.

+'-a'
+'--trailing-error'
+     Exit with error status 2 if any remaining input is detected after
+     decompressing the last member. Such remaining input is usually
+     trailing garbage that can be safely ignored. *Note
+     concat-example::.
+
 '-B BYTES'
 '--data-size=BYTES'
     Set the size of the input data blocks, in bytes. The input file
@@ -153,12 +159,17 @@ The format for running plzip is:

 '-c'
 '--stdout'
-     Compress or decompress to standard output. Needed when reading
-     from a named pipe (fifo) or from a device.
+     Compress or decompress to standard output; keep input files
+     unchanged.  If compressing several files, each file is compressed
+     independently.  This option is needed when reading from a named
+     pipe (fifo) or from a device.

 '-d'
 '--decompress'
-     Decompress.
+     Decompress the specified file(s). If a file does not exist or
+     can't be opened, plzip continues decompressing the rest of the
+     files. If a file fails to decompress, plzip exits immediately
+     without decompressing the rest of the files.

 '-f'
 '--force'
@@ -207,12 +218,13 @@ The format for running plzip is:

 '-s BYTES'
 '--dictionary-size=BYTES'
-     Set the dictionary size limit in bytes. Valid values range from 4
-     KiB to 512 MiB. Plzip will use the smallest possible dictionary
-     size for each file without exceeding this limit. Note that
-     dictionary sizes are quantized. If the specified size does not
-     match one of the valid sizes, it will be rounded upwards by adding
-     up to (BYTES / 16) to it.
+     Set the dictionary size limit in bytes. Plzip will use the smallest
+     possible dictionary size for each file without exceeding this
+     limit.  Valid values range from 4 KiB to 512 MiB. Values 12 to 29
+     are interpreted as powers of two, meaning 2^12 to 2^29 bytes. Note
+     that dictionary sizes are quantized. If the specified size does
+     not match one of the valid sizes, it will be rounded upwards by
+     adding up to (BYTES / 8) to it.

     For maximum compression you should use a dictionary size limit as
     large as possible, but keep in mind that the decompression memory
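The '--dictionary-size' rule in the hunk above can be sketched in a few
lines. This is only an illustration under stated assumptions, not plzip's
actual code: the function names are hypothetical, and the set of valid
(quantized) sizes is assumed to be every power of two from 2^12 to 2^29
minus 0/16 to 7/16 of itself, as in the lzip file format.

```python
MIN_DICT = 1 << 12   # 4 KiB, lower bound stated in the manual
MAX_DICT = 1 << 29   # 512 MiB, upper bound stated in the manual


def valid_sizes():
    """Return the sorted list of quantized dictionary sizes (assumed set)."""
    sizes = set()
    for log2 in range(12, 30):
        base = 1 << log2
        for sixteenths in range(8):          # subtract 0/16 .. 7/16 of base
            size = base - sixteenths * (base // 16)
            if MIN_DICT <= size <= MAX_DICT:
                sizes.add(size)
    return sorted(sizes)


def dictionary_size_limit(arg):
    """Map a '-s' argument to a dictionary size limit (hypothetical helper)."""
    if 12 <= arg <= 29:                      # small values are powers of two
        return 1 << arg
    if not MIN_DICT <= arg <= MAX_DICT:
        raise ValueError("dictionary size out of range")
    # Round upwards to the nearest valid size.
    return min(size for size in valid_sizes() if size >= arg)
```

For instance, a request of 300 KiB would be rounded up to 320 KiB, which
adds less than one eighth of the requested value.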
@@ -224,7 +236,8 @@ The format for running plzip is:
     Check integrity of the specified file(s), but don't decompress
     them.  This really performs a trial decompression and throws away
     the result.  Use it together with '-v' to see information about
-     the file.
+     the file(s). If a file fails the test, plzip may be unable to
+     check the rest of the files.

 '-v'
 '--verbose'
@@ -237,14 +250,14 @@ The format for running plzip is:

 '-0 .. -9'
     Set the compression parameters (dictionary size and match length
-     limit) as shown in the table below. Note that '-9' can be much
-     slower than '-0'. These options have no effect when decompressing.
+     limit) as shown in the table below. The default compression level
+     is '-6'.  Note that '-9' can be much slower than '-0'. These
+     options have no effect when decompressing.

     The bidimensional parameter space of LZMA can't be mapped to a
     linear scale optimal for all files. If your files are large, very
-     repetitive, etc, you may need to use the '--match-length' and
-     '--dictionary-size' options directly to achieve optimal
-     performance.
+     repetitive, etc, you may need to use the '--dictionary-size' and
+     '--match-length' options directly to achieve optimal performance.

     Level   Dictionary size   Match length limit
     -0      64 KiB            16 bytes
@@ -292,7 +305,7 @@ File: plzip.info,  Node: Program design,  Next: File format,  Prev: Invoking plz

 When compressing, plzip divides the input file into chunks and
 compresses as many chunks simultaneously as worker threads are chosen,
-creating a multi-member compressed file.
+creating a multimember compressed file.

   When decompressing, plzip decompresses as many members
 simultaneously as worker threads are chosen. Files that were compressed
@@ -348,12 +361,12 @@ additional information before, between, or after them.

   Each member has the following structure:
 +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-| ID string | VN | DS | Lzma stream | CRC32 |   Data size   |  Member size  |
+| ID string | VN | DS | LZMA stream | CRC32 |   Data size   |  Member size  |
 +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   All multibyte values are stored in little endian order.

-'ID string'
+'ID string (the "magic" bytes)'
     A four byte string, identifying the lzip format, with the value
     "LZIP" (0x4C, 0x5A, 0x49, 0x50).

@@ -371,8 +384,8 @@ additional information before, between, or after them.
     Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
     Valid values for dictionary size range from 4 KiB to 512 MiB.

-'Lzma stream'
-     The lzma stream, finished by an end of stream marker. Uses default
+'LZMA stream'
+     The LZMA stream, finished by an end of stream marker. Uses default
     values for encoder properties.  *Note Stream format: (lzip)Stream
     format, for a complete description.

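The worked example in the hunk above (0xD3 = 320 KiB) can be reproduced
with a small decoder for the DS byte. This is a sketch, not code from
plzip. The bit layout is not visible in this hunk and is assumed from the
lzip format description: bits 4-0 hold the base 2 logarithm of the base
size (12 to 29), and bits 7-5 hold the number of sixteenths of the base
size to subtract. The function name is hypothetical.

```python
def decode_dictionary_size(ds):
    """Decode the one-byte coded dictionary size 'DS' of a lzip header."""
    base_log2 = ds & 0x1F            # bits 4-0: log2 of the base size
    sixteenths = (ds >> 5) & 0x07    # bits 7-5: fraction numerator (0 to 7)
    if not 12 <= base_log2 <= 29:
        raise ValueError("invalid base size in DS byte")
    base = 1 << base_log2
    size = base - sixteenths * (base // 16)
    if size < (1 << 12):             # below the 4 KiB minimum
        raise ValueError("dictionary size below 4 KiB")
    return size


print(decode_dictionary_size(0xD3))  # 2^19 - 6 * 2^15 = 327680 (320 KiB)
```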
@@ -386,7 +399,7 @@ additional information before, between, or after them.
     Total size of the member, including header and trailer. This field
     acts as a distributed index, allows the verification of stream
     integrity, and facilitates safe recovery of undamaged members from
-     multi-member files.
+     multimember files.



@@ -408,7 +421,7 @@ following:
     file, or for testing of a regular file; the dictionary size.

     (Note that regular files with more than 1024 bytes of trailing
-     garbage are treated as non-seekable).
+     data are treated as non-seekable).

   * For testing of a non-seekable file or of standard input; the
     dictionary size plus up to 5 MiB.
@@ -420,14 +433,14 @@ following:
     dictionary size plus up to 35 MiB.


-File: plzip.info,  Node: Minimum file sizes,  Next: Problems,  Prev: Memory requirements,  Up: Top
+File: plzip.info,  Node: Minimum file sizes,  Next: Trailing data,  Prev: Memory requirements,  Up: Top

 6 Minimum file sizes required for full compression speed
 ********************************************************

 When compressing, plzip divides the input file into chunks and
 compresses as many chunks simultaneously as worker threads are chosen,
-creating a multi-member compressed file.
+creating a multimember compressed file.

   For this to work as expected (and roughly multiply the compression
 speed by the number of available processors), the uncompressed file
@@ -456,9 +469,106 @@ Level
 -9           128 MiB   256 MiB   512 MiB   1 GiB     4 GiB     16 GiB


-File: plzip.info,  Node: Problems,  Next: Concept index,  Prev: Minimum file sizes,  Up: Top
+File: plzip.info,  Node: Trailing data,  Next: Examples,  Prev: Minimum file sizes,  Up: Top

-7 Reporting bugs
+7 Extra data appended to the file
+*********************************
+
+Sometimes extra data is found appended to a lzip file after the last
+member. Such trailing data may be:
+
+   * Padding added to make the file size a multiple of some block size,
+     for example when writing to a tape.
+
+   * Garbage added by some not totally successful copy operation.
+
+   * Useful data added by the user; a cryptographically secure hash, a
+     description of file contents, etc.
+
+   * Malicious data added to the file in order to make its total size
+     and hash value (for a chosen hash) coincide with those of another
+     file.
+
+   * In very rare cases, trailing data could be the corrupt header of
+     another member. In multimember or concatenated files the
+     probability of corruption happening in the magic bytes is 5 times
+     smaller than the probability of getting a false positive caused by
+     the corruption of the integrity information itself. Therefore it
+     can be considered to be below the noise level.
+
+   Trailing data can be safely ignored in most cases. In some cases,
+like that of user-added data, it is expected to be ignored. In those
+cases where a file containing trailing data must be rejected, the option
+'--trailing-error' can be used. *Note --trailing-error::.
+
+
+File: plzip.info,  Node: Examples,  Next: Problems,  Prev: Trailing data,  Up: Top
+
+8 A small tutorial with examples
+********************************
+
+WARNING! Even if plzip is bug-free, other causes may result in a corrupt
+compressed file (bugs in the system libraries, memory errors, etc).
+Therefore, if the data you are going to compress are important, give the
+'--keep' option to plzip and don't remove the original file until you
+verify the compressed file with a command like
+'plzip -cd file.lz | cmp file -'.
+
+
+Example 1: Replace a regular file with its compressed version 'file.lz'
+and show the compression ratio.
+
+     plzip -v file
+
+
+Example 2: Like example 1 but the created 'file.lz' has a block size of
+1 MiB. The compression ratio is not shown.
+
+     plzip -B 1MiB file
+
+
+Example 3: Restore a regular file from its compressed version
+'file.lz'. If the operation is successful, 'file.lz' is removed.
+
+     plzip -d file.lz
+
+
+Example 4: Verify the integrity of the compressed file 'file.lz' and
+show status.
+
+     plzip -tv file.lz
+
+
+Example 5: Compress a whole device in /dev/sdc and send the output to
+'file.lz'.
+
+     plzip -c /dev/sdc > file.lz
+
+
+Example 6: The right way of concatenating compressed files.  *Note
+Trailing data::.
+
+     Don't do this
+       cat file1.lz file2.lz file3.lz | plzip -d
+     Do this instead
+       plzip -cd file1.lz file2.lz file3.lz
+
+
+Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed
+data are produced.
+
+     plzip -cd file.lz | dd bs=1024 count=10
+
+
+Example 8: Decompress 'file.lz' partially from decompressed byte 10000
+to decompressed byte 15000 (5000 bytes are produced).
+
+     plzip -cd file.lz | dd bs=1000 skip=10 count=5
+
+
+File: plzip.info,  Node: Problems,  Next: Concept index,  Prev: Examples,  Up: Top
+
+9 Reporting bugs
 ****************

 There are probably bugs in plzip. There are certainly errors and
@@ -480,6 +590,7 @@ Concept index
 * Menu:

 * bugs:                                  Problems.              (line 6)
+* examples:                              Examples.              (line 6)
 * file format:                           File format.           (line 6)
 * getting help:                          Problems.              (line 6)
 * introduction:                          Introduction.          (line 6)
@@ -488,6 +599,7 @@ Concept index
 * minimum file sizes:                    Minimum file sizes.    (line 6)
 * options:                               Invoking plzip.        (line 6)
 * program design:                        Program design.        (line 6)
+* trailing data:                         Trailing data.         (line 6)
 * usage:                                 Invoking plzip.        (line 6)
 * version:                               Invoking plzip.        (line 6)

@@ -495,15 +607,19 @@ Concept index

 Tag Table:
 Node: Top221
-Node: Introduction984
-Node: Invoking plzip5332
-Ref: --data-size5747
-Node: Program design10972
-Node: File format12560
-Node: Memory requirements14973
-Node: Minimum file sizes16085
-Node: Problems18007
-Node: Concept index18543
+Node: Introduction1101
+Node: Invoking plzip5078
+Ref: --trailing-error5647
+Ref: --data-size5890
+Node: Program design11683
+Node: File format13270
+Node: Memory requirements15702
+Node: Minimum file sizes16811
+Node: Trailing data18737
+Node: Examples20121
+Ref: concat-example21286
+Node: Problems21823
+Node: Concept index22349

 End Tag Table