
Merging upstream version 1.2.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-24 04:05:33 +01:00
parent 844a3e48f2
commit 73f1304e10
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
19 changed files with 181 additions and 105 deletions

@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
Plzip Manual
************
This manual is for Plzip (version 1.2-rc2, 1 July 2014).
This manual is for Plzip (version 1.2, 29 August 2014).
* Menu:
@ -23,7 +23,7 @@ This manual is for Plzip (version 1.2-rc2, 1 July 2014).
* Concept index:: Index of concepts
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
Copyright (C) 2009-2014 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission to
copy, distribute and modify it.
@ -34,15 +34,15 @@ File: plzip.info, Node: Introduction, Next: Program design, Prev: Top, Up: T
1 Introduction
**************
Plzip is a massively parallel (multi-threaded), lossless data compressor
Plzip is a massively parallel (multi-threaded) lossless data compressor
based on the lzlib compression library, with a user interface similar to
the one of lzip, bzip2 or gzip.
Plzip can compress/decompress large files on multiprocessor machines
much faster than lzip, at the cost of a slightly reduced compression
ratio. Note that the number of usable threads is limited by file size,
so on files larger than a few GB plzip can use hundreds of processors,
but on files of only a few MB plzip is no faster than lzip.
ratio. Note that the number of usable threads is limited by file size;
on files larger than a few GB plzip can use hundreds of processors, but
on files of only a few MB plzip is no faster than lzip.
Plzip uses the lzip file format; the files produced by plzip are
fully compatible with lzip-1.4 or newer, and can be rescued with
@ -67,6 +67,11 @@ into account both data integrity and decoder availability:
* Additionally lzip is copylefted, which guarantees that it will
remain free forever.
A nice feature of the lzip format is that a corrupt byte is easier to
repair the nearer it is from the beginning of the file. Therefore, with
the help of lziprecover, losing an entire archive just because of a
corrupt byte near the beginning is a thing of the past.
The member trailer stores the 32-bit CRC of the original data, the
size of the original data and the size of the member. These values,
together with the value remaining in the range decoder and the
@ -81,7 +86,23 @@ uncompressed data.
Plzip uses the same well-defined exit status values used by lzip and
bzip2, which makes it safer than compressors returning ambiguous warning
values (like gzip) when it is used as a back end for tar or zutils.
values (like gzip) when it is used as a back end for other programs like
tar or zutils.
The amount of memory required *per thread* is approximately the
following:
* For compression; 3 times the data size (*note --data-size::) plus
11 times the dictionary size.
* For decompression or testing of a non-seekable file or of standard
input; 2 times the dictionary size plus up to 32 MiB.
* For decompression of a regular file to a non-seekable file or to
standard output; the dictionary size plus up to 32 MiB.
* For decompression of a regular file to another regular file, or for
testing of a regular file; the dictionary size.
Plzip will automatically use the smallest possible dictionary size
for each file without exceeding the given limit. Keep in mind that the
@ -129,7 +150,17 @@ File: plzip.info, Node: Program design, Next: Invoking plzip, Prev: Introduct
2 Program design
****************
For each input file, a splitter thread and several worker threads are
When compressing, plzip divides the input file into chunks and
compresses as many chunks simultaneously as worker threads are chosen,
creating a multi-member compressed file.
When decompressing, plzip decompresses as many members
simultaneously as worker threads are chosen. Files that were compressed
with lzip will not be decompressed faster than using lzip (unless the
'-b' option was used) because lzip usually produces single-member
files, which can't be decompressed in parallel.
For each input file, a splitter thread and several worker threads are
created, acting the main thread as muxer (multiplexer) thread. A "packet
courier" takes care of data transfers among threads and limits the
maximum number of data blocks (packets) being processed simultaneously.
@ -141,10 +172,11 @@ writes them to the output file.
When decompressing from a regular file, the splitter is removed and
the workers read directly from the input file. If the output file is
also a regular file, the muxer is also removed, and the workers write
directly to the output file. With these optimizations, decompression
speed of large files with many members is only limited by the number of
processors available and by I/O speed.
also a regular file, the muxer is also removed and the workers write
directly to the output file. With these optimizations, the use of RAM
is greatly reduced and the decompression speed of large files with many
members is only limited by the number of processors available and by
I/O speed.

File: plzip.info, Node: Invoking plzip, Next: File format, Prev: Program design, Up: Top
@ -168,11 +200,11 @@ The format for running plzip is:
'-B BYTES'
'--data-size=BYTES'
Set the input data block size in bytes. The input file will be
divided in chunks of this size before compression is performed.
Valid values range from 8 KiB to 1 GiB. Default value is two times
the dictionary size. Plzip will reduce the dictionary size if it
is larger than the chosen data size.
Set the size of the input data blocks, in bytes. The input file
will be divided in chunks of this size before compression is
performed. Valid values range from 8 KiB to 1 GiB. Default value
is two times the dictionary size. Plzip will reduce the dictionary
size if it is larger than the chosen data size.
'-c'
'--stdout'
@ -418,13 +450,13 @@ Concept index

Tag Table:
Node: Top221
Node: Introduction873
Node: Program design5442
Node: Invoking plzip6496
Ref: --data-size6941
Node: File format12090
Node: Problems14595
Node: Concept index15124
Node: Introduction847
Node: Program design6279
Node: Invoking plzip7868
Ref: --data-size8313
Node: File format13471
Node: Problems15976
Node: Concept index16505

End Tag Table
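
The per-thread memory figures added to the Introduction in this version can be turned into a rough estimate. The following is a minimal sketch in Python, not part of plzip; the function name and the mode labels are hypothetical and chosen only for illustration. It assumes the default data size of two times the dictionary size when none is given, and it treats the "up to 32 MiB" terms as upper bounds.

```python
MIB = 1024 * 1024

def per_thread_memory(mode, dict_size, data_size=None):
    """Approximate upper bound, in bytes, of memory needed per thread.

    Mode labels are illustrative names for the four cases listed in the
    manual; they are not plzip options.
    """
    if mode == "compress":
        if data_size is None:
            # Manual: the default data size is two times the dictionary size.
            data_size = 2 * dict_size
        return 3 * data_size + 11 * dict_size
    if mode == "decompress_from_nonseekable":
        # Non-seekable input file or standard input (also applies to testing).
        return 2 * dict_size + 32 * MIB
    if mode == "decompress_to_nonseekable":
        # Regular input file, non-seekable output file or standard output.
        return dict_size + 32 * MIB
    if mode == "decompress_regular_to_regular":
        # Regular file to regular file, or testing of a regular file.
        return dict_size
    raise ValueError(f"unknown mode: {mode}")

if __name__ == "__main__":
    dict_size = 8 * MIB
    for mode in ("compress",
                 "decompress_from_nonseekable",
                 "decompress_to_nonseekable",
                 "decompress_regular_to_regular"):
        print(mode, per_thread_memory(mode, dict_size) // MIB, "MiB")
```

Multiplying the result by the number of worker threads gives an approximate total. The sketch does not model the case where plzip reduces the dictionary size because it is larger than the chosen data size.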