Adding upstream version 1.2.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 63578be2c3
commit 9f17fcd573
19 changed files with 181 additions and 105 deletions
18
ChangeLog
|
@ -1,20 +1,12 @@
|
|||
2014-07-01 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
2014-08-29 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.2-rc2 released.
|
||||
* License changed to GPL version 2 or later.
|
||||
|
||||
2014-05-08 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.2-rc1 released.
|
||||
* Minor changes.
|
||||
|
||||
2014-01-20 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
* Version 1.2-pre1 released.
|
||||
* Version 1.2 released.
|
||||
* main.cc (close_and_set_permissions): Behave like 'cp -p'.
|
||||
* dec_stdout.cc dec_stream.cc: Make 'slot_av' a vector to limit
|
||||
the number of packets produced by each worker individually.
|
||||
* plzip.texinfo: Renamed to plzip.texi.
|
||||
* plzip.texi: Documented the approximate amount of memory required.
|
||||
* License changed to GPL version 2 or later.
|
||||
|
||||
2013-09-17 Antonio Diaz Diaz <antonio@gnu.org>
|
||||
|
||||
|
@ -120,7 +112,7 @@
|
|||
until something better appears on the net.
|
||||
|
||||
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This file is a collection of facts, and thus it is not copyrightable,
|
||||
but just in case, you have unlimited permission to copy, distribute and
|
||||
|
|
6
INSTALL
|
@ -1,7 +1,7 @@
|
|||
Requirements
|
||||
------------
|
||||
You will need a C++ compiler and the lzlib compression library installed.
|
||||
I use gcc 4.8.1 and 3.3.6, but the code should compile with any
|
||||
I use gcc 4.9.1 and 3.3.6, but the code should compile with any
|
||||
standards compliant compiler.
|
||||
Lzlib must be version 1.0 or newer.
|
||||
Gcc is available at http://gcc.gnu.org.
|
||||
|
@ -34,7 +34,7 @@ the main archive.
|
|||
5. Type 'make install' to install the program and any data files and
|
||||
documentation.
|
||||
|
||||
You can install only the program, the info manual or the man page
|
||||
You can install only the program, the info manual or the man page by
|
||||
typing 'make install-bin', 'make install-info' or 'make install-man'
|
||||
respectively.
|
||||
|
||||
|
@ -60,7 +60,7 @@ After running 'configure', you can run 'make' and 'make install' as
|
|||
explained above.
|
||||
|
||||
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This file is free documentation: you have unlimited permission to copy,
|
||||
distribute and modify it.
|
||||
|
|
3
NEWS
|
@ -8,6 +8,9 @@ Individual limits have been set on the number of packets produced by
|
|||
each decompressor worker thread to limit the amount of memory used in all
|
||||
cases.
|
||||
|
||||
The approximate amount of memory required has been documented in the
|
||||
manual.
|
||||
|
||||
"plzip.texinfo" has been renamed to "plzip.texi".
|
||||
|
||||
The license has been changed to GPL version 2 or later.
|
||||
|
|
28
README
|
@ -1,14 +1,24 @@
|
|||
Description
|
||||
|
||||
Plzip is a massively parallel (multi-threaded), lossless data compressor
|
||||
Plzip is a massively parallel (multi-threaded) lossless data compressor
|
||||
based on the lzlib compression library, with a user interface similar to
|
||||
the one of lzip, bzip2 or gzip.
|
||||
|
||||
Plzip can compress/decompress large files on multiprocessor machines
|
||||
much faster than lzip, at the cost of a slightly reduced compression
|
||||
ratio. Note that the number of usable threads is limited by file size,
|
||||
so on files larger than a few GB plzip can use hundreds of processors,
|
||||
but on files of only a few MB plzip is no faster than lzip.
|
||||
ratio. Note that the number of usable threads is limited by file size;
|
||||
on files larger than a few GB plzip can use hundreds of processors, but
|
||||
on files of only a few MB plzip is no faster than lzip.
|
||||
|
||||
When compressing, plzip divides the input file into chunks and
|
||||
compresses as many chunks simultaneously as worker threads are chosen,
|
||||
creating a multi-member compressed file.
|
||||
|
||||
When decompressing, plzip decompresses as many members simultaneously as
|
||||
worker threads are chosen. Files that were compressed with lzip will not
|
||||
be decompressed faster than using lzip (unless the "-b" option was used)
|
||||
because lzip usually produces single-member files, which can't be
|
||||
decompressed in parallel.
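As a rough illustration of why small files cannot keep many threads busy, the chunking described above can be sketched as a ceiling division. This is a minimal sketch, not plzip code: the function names are hypothetical, the empty-input case is left unspecified, and the sizes in the usage note are assumed values.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of plzip): number of chunks, and therefore
// of members, produced when 'file_size' bytes are split into chunks of at
// most 'data_size' bytes.
long long chunk_count( const long long file_size, const long long data_size )
  {
  assert( file_size >= 0 && data_size > 0 );
  return ( file_size + data_size - 1 ) / data_size;    // ceiling division
  }

// Hypothetical helper: worker threads that can be busy at the same time.
long long usable_workers( const long long file_size, const long long data_size,
                          const long long num_workers )
  {
  return std::min( num_workers, chunk_count( file_size, data_size ) );
  }
```

With an assumed data size of 16 MiB, a 5 MiB file yields a single chunk, so only one worker has anything to do, while a 4 GiB file yields 256 chunks.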
|
||||
|
||||
Plzip uses the lzip file format; the files produced by plzip are fully
|
||||
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
|
||||
|
@ -32,9 +42,15 @@ into account both data integrity and decoder availability:
|
|||
* Additionally lzip is copylefted, which guarantees that it will
|
||||
remain free forever.
|
||||
|
||||
A nice feature of the lzip format is that a corrupt byte is easier to
|
||||
repair the nearer it is from the beginning of the file. Therefore, with
|
||||
the help of lziprecover, losing an entire archive just because of a
|
||||
corrupt byte near the beginning is a thing of the past.
|
||||
|
||||
Plzip uses the same well-defined exit status values used by lzip and
|
||||
bzip2, which makes it safer than compressors returning ambiguous warning
|
||||
values (like gzip) when it is used as a back end for tar or zutils.
|
||||
values (like gzip) when it is used as a back end for other programs like
|
||||
tar or zutils.
|
||||
|
||||
Plzip will automatically use the smallest possible dictionary size for
|
||||
each file without exceeding the given limit. Keep in mind that the
|
||||
|
@ -70,7 +86,7 @@ corresponding uncompressed files. Integrity testing of concatenated
|
|||
compressed files is also supported.
|
||||
|
||||
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This file is free documentation: you have unlimited permission to copy,
|
||||
distribute and modify it.
|
||||
|
|
|
@ -1,6 +1,5 @@
|
|||
/* Arg_parser - POSIX/GNU command line argument parser. (C++ version)
|
||||
Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014
|
||||
Antonio Diaz Diaz.
|
||||
Copyright (C) 2006-2014 Antonio Diaz Diaz.
|
||||
|
||||
This library is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
|
|
@ -1,6 +1,5 @@
|
|||
/* Arg_parser - POSIX/GNU command line argument parser. (C++ version)
|
||||
Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014
|
||||
Antonio Diaz Diaz.
|
||||
Copyright (C) 2006-2014 Antonio Diaz Diaz.
|
||||
|
||||
This library is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
/* Plzip - Parallel compressor compatible with lzip
|
||||
Copyright (C) 2009 Laszlo Ersek.
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
|
6
configure
vendored
|
@ -1,12 +1,12 @@
|
|||
#! /bin/sh
|
||||
# configure script for Plzip - Parallel compressor compatible with lzip
|
||||
# Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
# Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
#
|
||||
# This configure script is free software: you have unlimited permission
|
||||
# to copy, distribute and modify it.
|
||||
|
||||
pkgname=plzip
|
||||
pkgversion=1.2-rc2
|
||||
pkgversion=1.2
|
||||
progname=plzip
|
||||
srctrigger=doc/${pkgname}.texi
|
||||
|
||||
|
@ -165,7 +165,7 @@ echo "LDFLAGS = ${LDFLAGS}"
|
|||
rm -f Makefile
|
||||
cat > Makefile << EOF
|
||||
# Makefile for Plzip - Parallel compressor compatible with lzip
|
||||
# Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
# Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
# This file was generated automatically by configure. Do not edit.
|
||||
#
|
||||
# This Makefile is free software: you have unlimited permission
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
/* Plzip - Parallel compressor compatible with lzip
|
||||
Copyright (C) 2009 Laszlo Ersek.
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
/* Plzip - Parallel compressor compatible with lzip
|
||||
Copyright (C) 2009 Laszlo Ersek.
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
/* Plzip - Parallel compressor compatible with lzip
|
||||
Copyright (C) 2009 Laszlo Ersek.
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
@ -43,17 +43,17 @@
 int preadblock( const int fd, uint8_t * const buf, const int size,
                 const long long pos )
   {
-  int rest = size;
+  int sz = 0;
   errno = 0;
-  while( rest > 0 )
+  while( sz < size )
     {
-    const int n = pread( fd, buf + size - rest, rest, pos + size - rest );
-    if( n > 0 ) rest -= n;
+    const int n = pread( fd, buf + sz, size - sz, pos + sz );
+    if( n > 0 ) sz += n;
     else if( n == 0 ) break;    // EOF
     else if( errno != EINTR ) break;
     errno = 0;
     }
-  return size - rest;
+  return sz;
   }
 
 
@ -63,16 +63,16 @@ int preadblock( const int fd, uint8_t * const buf, const int size,
 int pwriteblock( const int fd, const uint8_t * const buf, const int size,
                  const long long pos )
   {
-  int rest = size;
+  int sz = 0;
   errno = 0;
-  while( rest > 0 )
+  while( sz < size )
     {
-    const int n = pwrite( fd, buf + size - rest, rest, pos + size - rest );
-    if( n > 0 ) rest -= n;
+    const int n = pwrite( fd, buf + sz, size - sz, pos + sz );
+    if( n > 0 ) sz += n;
     else if( n < 0 && errno != EINTR ) break;
     errno = 0;
     }
-  return size - rest;
+  return sz;
   }
 
 
|
|
12
doc/plzip.1
|
@ -1,10 +1,10 @@
|
|||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.37.1.
|
||||
.TH PLZIP "1" "July 2014" "plzip 1.2-rc2" "User Commands"
|
||||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.46.1.
|
||||
.TH PLZIP "1" "August 2014" "plzip 1.2" "User Commands"
|
||||
.SH NAME
|
||||
plzip \- reduces the size of files
|
||||
.SH SYNOPSIS
|
||||
.B plzip
|
||||
[\fIoptions\fR] [\fIfiles\fR]
|
||||
[\fI\,options\/\fR] [\fI\,files\/\fR]
|
||||
.SH DESCRIPTION
|
||||
Plzip \- Parallel compressor compatible with lzip.
|
||||
.SH OPTIONS
|
||||
|
@ -16,7 +16,7 @@ display this help and exit
|
|||
output version information and exit
|
||||
.TP
|
||||
\fB\-B\fR, \fB\-\-data\-size=\fR<bytes>
|
||||
set input data block size in bytes
|
||||
set size of input data blocks, in bytes
|
||||
.TP
|
||||
\fB\-c\fR, \fB\-\-stdout\fR
|
||||
send output to standard output
|
||||
|
@ -37,7 +37,7 @@ keep (don't delete) input files
|
|||
set match length limit in bytes [36]
|
||||
.TP
|
||||
\fB\-n\fR, \fB\-\-threads=\fR<n>
|
||||
set number of (de)compression threads [1]
|
||||
set number of (de)compression threads [2]
|
||||
.TP
|
||||
\fB\-o\fR, \fB\-\-output=\fR<file>
|
||||
if reading stdin, place the output into <file>
|
||||
|
@ -85,7 +85,7 @@ Plzip home page: http://www.nongnu.org/lzip/plzip.html
|
|||
Copyright \(co 2009 Laszlo Ersek.
|
||||
.br
|
||||
Copyright \(co 2014 Antonio Diaz Diaz.
|
||||
Using Lzlib 1.6\-rc2
|
||||
Using Lzlib 1.6
|
||||
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
|
||||
.br
|
||||
This is free software: you are free to change and redistribute it.
|
||||
|
|
|
@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
|
|||
Plzip Manual
|
||||
************
|
||||
|
||||
This manual is for Plzip (version 1.2-rc2, 1 July 2014).
|
||||
This manual is for Plzip (version 1.2, 29 August 2014).
|
||||
|
||||
* Menu:
|
||||
|
||||
|
@ -23,7 +23,7 @@ This manual is for Plzip (version 1.2-rc2, 1 July 2014).
|
|||
* Concept index:: Index of concepts
|
||||
|
||||
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This manual is free documentation: you have unlimited permission to
|
||||
copy, distribute and modify it.
|
||||
|
@ -34,15 +34,15 @@ File: plzip.info, Node: Introduction, Next: Program design, Prev: Top, Up: T
|
|||
1 Introduction
|
||||
**************
|
||||
|
||||
Plzip is a massively parallel (multi-threaded), lossless data compressor
|
||||
Plzip is a massively parallel (multi-threaded) lossless data compressor
|
||||
based on the lzlib compression library, with a user interface similar to
|
||||
the one of lzip, bzip2 or gzip.
|
||||
|
||||
Plzip can compress/decompress large files on multiprocessor machines
|
||||
much faster than lzip, at the cost of a slightly reduced compression
|
||||
ratio. Note that the number of usable threads is limited by file size,
|
||||
so on files larger than a few GB plzip can use hundreds of processors,
|
||||
but on files of only a few MB plzip is no faster than lzip.
|
||||
ratio. Note that the number of usable threads is limited by file size;
|
||||
on files larger than a few GB plzip can use hundreds of processors, but
|
||||
on files of only a few MB plzip is no faster than lzip.
|
||||
|
||||
Plzip uses the lzip file format; the files produced by plzip are
|
||||
fully compatible with lzip-1.4 or newer, and can be rescued with
|
||||
|
@ -67,6 +67,11 @@ into account both data integrity and decoder availability:
|
|||
* Additionally lzip is copylefted, which guarantees that it will
|
||||
remain free forever.
|
||||
|
||||
A nice feature of the lzip format is that a corrupt byte is easier to
|
||||
repair the nearer it is from the beginning of the file. Therefore, with
|
||||
the help of lziprecover, losing an entire archive just because of a
|
||||
corrupt byte near the beginning is a thing of the past.
|
||||
|
||||
The member trailer stores the 32-bit CRC of the original data, the
|
||||
size of the original data and the size of the member. These values,
|
||||
together with the value remaining in the range decoder and the
|
||||
|
@ -81,7 +86,23 @@ uncompressed data.
|
|||
|
||||
Plzip uses the same well-defined exit status values used by lzip and
|
||||
bzip2, which makes it safer than compressors returning ambiguous warning
|
||||
values (like gzip) when it is used as a back end for tar or zutils.
|
||||
values (like gzip) when it is used as a back end for other programs like
|
||||
tar or zutils.
|
||||
|
||||
The amount of memory required *per thread* is approximately the
|
||||
following:
|
||||
|
||||
* For compression; 3 times the data size (*note --data-size::) plus
|
||||
11 times the dictionary size.
|
||||
|
||||
* For decompression or testing of a non-seekable file or of standard
|
||||
input; 2 times the dictionary size plus up to 32 MiB.
|
||||
|
||||
* For decompression of a regular file to a non-seekable file or to
|
||||
standard output; the dictionary size plus up to 32 MiB.
|
||||
|
||||
* For decompression of a regular file to another regular file, or for
|
||||
testing of a regular file; the dictionary size.
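The four cases above can be turned into a small estimator. This is only a sketch of the documented approximations: the enum, the function name and the constant name are hypothetical, and "up to 32 MiB" is treated as a worst-case addend.

```cpp
#include <cassert>

// Hypothetical names; the four cases mirror the list in the manual.
enum class Mode
  {
  compress,                     // compression
  decompress_stream_input,      // decompress/test non-seekable file or stdin
  decompress_file_to_stream,    // regular file to non-seekable file or stdout
  decompress_file_to_file       // regular file to regular file, or testing one
  };

const unsigned long long extra_max = 32ULL << 20;      // "up to 32 MiB"

// Approximate upper bound, in bytes, of the memory needed per thread.
unsigned long long approx_mem_per_thread( const Mode mode,
                                          const unsigned long long data_size,
                                          const unsigned long long dict_size )
  {
  switch( mode )
    {
    case Mode::compress: return 3 * data_size + 11 * dict_size;
    case Mode::decompress_stream_input: return 2 * dict_size + extra_max;
    case Mode::decompress_file_to_stream: return dict_size + extra_max;
    case Mode::decompress_file_to_file: return dict_size;
    }
  assert( false );      // not reached
  return 0;
  }
```

For example, compressing with an assumed 8 MiB dictionary and a 16 MiB data size comes to about 136 MiB per thread, so 8 threads would need roughly 1.1 GiB.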
|
||||
|
||||
Plzip will automatically use the smallest possible dictionary size
|
||||
for each file without exceeding the given limit. Keep in mind that the
|
||||
|
@ -129,7 +150,17 @@ File: plzip.info, Node: Program design, Next: Invoking plzip, Prev: Introduct
|
|||
2 Program design
|
||||
****************
|
||||
|
||||
For each input file, a splitter thread and several worker threads are
|
||||
When compressing, plzip divides the input file into chunks and
|
||||
compresses as many chunks simultaneously as worker threads are chosen,
|
||||
creating a multi-member compressed file.
|
||||
|
||||
When decompressing, plzip decompresses as many members
|
||||
simultaneously as worker threads are chosen. Files that were compressed
|
||||
with lzip will not be decompressed faster than using lzip (unless the
|
||||
'-b' option was used) because lzip usually produces single-member
|
||||
files, which can't be decompressed in parallel.
|
||||
|
||||
For each input file, a splitter thread and several worker threads are
|
||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
||||
courier" takes care of data transfers among threads and limits the
|
||||
maximum number of data blocks (packets) being processed simultaneously.
|
||||
|
@ -141,10 +172,11 @@ writes them to the output file.
|
|||
|
||||
When decompressing from a regular file, the splitter is removed and
|
||||
the workers read directly from the input file. If the output file is
|
||||
also a regular file, the muxer is also removed, and the workers write
|
||||
directly to the output file. With these optimizations, decompression
|
||||
speed of large files with many members is only limited by the number of
|
||||
processors available and by I/O speed.
|
||||
also a regular file, the muxer is also removed and the workers write
|
||||
directly to the output file. With these optimizations, the use of RAM
|
||||
is greatly reduced and the decompression speed of large files with many
|
||||
members is only limited by the number of processors available and by
|
||||
I/O speed.
|
||||
|
||||
|
||||
File: plzip.info, Node: Invoking plzip, Next: File format, Prev: Program design, Up: Top
|
||||
|
@ -168,11 +200,11 @@ The format for running plzip is:
|
|||
|
||||
'-B BYTES'
|
||||
'--data-size=BYTES'
|
||||
Set the input data block size in bytes. The input file will be
|
||||
divided in chunks of this size before compression is performed.
|
||||
Valid values range from 8 KiB to 1 GiB. Default value is two times
|
||||
the dictionary size. Plzip will reduce the dictionary size if it
|
||||
is larger than the chosen data size.
|
||||
Set the size of the input data blocks, in bytes. The input file
|
||||
will be divided in chunks of this size before compression is
|
||||
performed. Valid values range from 8 KiB to 1 GiB. Default value
|
||||
is two times the dictionary size. Plzip will reduce the dictionary
|
||||
size if it is larger than the chosen data size.
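The rules in this option description (valid range, default, and dictionary reduction) can be sketched as follows. The struct, the function and the convention that 0 means "option not given" are assumptions made for this example and are not plzip's actual code.

```cpp
#include <cassert>

struct Sizes { long long data_size; long long dictionary_size; };

const long long min_data_size = 8LL << 10;      // 8 KiB
const long long max_data_size = 1LL << 30;      // 1 GiB

// Hypothetical helper (not plzip's code). 'requested' is the value given
// to -B, or 0 if the option was not given. Returns false if out of range.
bool resolve_sizes( const long long requested, const long long dictionary_size,
                    Sizes & out )
  {
  assert( dictionary_size > 0 );
  long long data_size = requested;
  if( data_size == 0 ) data_size = 2 * dictionary_size;  // documented default
  else if( data_size < min_data_size || data_size > max_data_size )
    return false;
  out.data_size = data_size;
  // The dictionary is reduced if it is larger than the chosen data size.
  out.dictionary_size =
    ( dictionary_size > data_size ) ? data_size : dictionary_size;
  return true;
  }
```

Under these assumptions, an 8 MiB dictionary with no `-B` gives 16 MiB blocks, while `-B` set to 1 MiB shrinks the dictionary to 1 MiB as well.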
|
||||
|
||||
'-c'
|
||||
'--stdout'
|
||||
|
@ -418,13 +450,13 @@ Concept index
|
|||
|
||||
Tag Table:
|
||||
Node: Top221
|
||||
Node: Introduction873
|
||||
Node: Program design5442
|
||||
Node: Invoking plzip6496
|
||||
Ref: --data-size6941
|
||||
Node: File format12090
|
||||
Node: Problems14595
|
||||
Node: Concept index15124
|
||||
Node: Introduction847
|
||||
Node: Program design6279
|
||||
Node: Invoking plzip7868
|
||||
Ref: --data-size8313
|
||||
Node: File format13471
|
||||
Node: Problems15976
|
||||
Node: Concept index16505
|
||||
|
||||
End Tag Table
|
||||
|
||||
|
|
|
@ -6,8 +6,8 @@
|
|||
@finalout
|
||||
@c %**end of header
|
||||
|
||||
@set UPDATED 1 July 2014
|
||||
@set VERSION 1.2-rc2
|
||||
@set UPDATED 29 August 2014
|
||||
@set VERSION 1.2
|
||||
|
||||
@dircategory Data Compression
|
||||
@direntry
|
||||
|
@ -44,8 +44,7 @@ This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
|
|||
@end menu
|
||||
|
||||
@sp 1
|
||||
Copyright @copyright{} 2009, 2010, 2011, 2012, 2013, 2014
|
||||
Antonio Diaz Diaz.
|
||||
Copyright @copyright{} 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This manual is free documentation: you have unlimited permission
|
||||
to copy, distribute and modify it.
|
||||
|
@ -55,15 +54,15 @@ to copy, distribute and modify it.
|
|||
@chapter Introduction
|
||||
@cindex introduction
|
||||
|
||||
Plzip is a massively parallel (multi-threaded), lossless data compressor
|
||||
Plzip is a massively parallel (multi-threaded) lossless data compressor
|
||||
based on the lzlib compression library, with a user interface similar to
|
||||
the one of lzip, bzip2 or gzip.
|
||||
|
||||
Plzip can compress/decompress large files on multiprocessor machines
|
||||
much faster than lzip, at the cost of a slightly reduced compression
|
||||
ratio. Note that the number of usable threads is limited by file size,
|
||||
so on files larger than a few GB plzip can use hundreds of processors,
|
||||
but on files of only a few MB plzip is no faster than lzip.
|
||||
ratio. Note that the number of usable threads is limited by file size;
|
||||
on files larger than a few GB plzip can use hundreds of processors, but
|
||||
on files of only a few MB plzip is no faster than lzip.
|
||||
|
||||
Plzip uses the lzip file format; the files produced by plzip are fully
|
||||
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
|
||||
|
@ -92,6 +91,11 @@ Additionally lzip is copylefted, which guarantees that it will remain
|
|||
free forever.
|
||||
@end itemize
|
||||
|
||||
A nice feature of the lzip format is that a corrupt byte is easier to
|
||||
repair the nearer it is from the beginning of the file. Therefore, with
|
||||
the help of lziprecover, losing an entire archive just because of a
|
||||
corrupt byte near the beginning is a thing of the past.
|
||||
|
||||
The member trailer stores the 32-bit CRC of the original data, the size
|
||||
of the original data and the size of the member. These values, together
|
||||
with the value remaining in the range decoder and the end-of-stream
|
||||
|
@ -105,7 +109,29 @@ wrong. It can't help you recover the original uncompressed data.
|
|||
|
||||
Plzip uses the same well-defined exit status values used by lzip and
|
||||
bzip2, which makes it safer than compressors returning ambiguous warning
|
||||
values (like gzip) when it is used as a back end for tar or zutils.
|
||||
values (like gzip) when it is used as a back end for other programs like
|
||||
tar or zutils.
|
||||
|
||||
The amount of memory required @strong{per thread} is approximately the
|
||||
following:
|
||||
|
||||
@itemize @bullet
|
||||
@item
|
||||
For compression; 3 times the data size (@pxref{--data-size}) plus 11
|
||||
times the dictionary size.
|
||||
|
||||
@item
|
||||
For decompression or testing of a non-seekable file or of standard
|
||||
input; 2 times the dictionary size plus up to 32 MiB.
|
||||
|
||||
@item
|
||||
For decompression of a regular file to a non-seekable file or to
|
||||
standard output; the dictionary size plus up to 32 MiB.
|
||||
|
||||
@item
|
||||
For decompression of a regular file to another regular file, or for
|
||||
testing of a regular file; the dictionary size.
|
||||
@end itemize
|
||||
|
||||
Plzip will automatically use the smallest possible dictionary size for
|
||||
each file without exceeding the given limit. Keep in mind that the
|
||||
|
@ -154,6 +180,16 @@ you verify the compressed file with a command like
|
|||
@chapter Program design
|
||||
@cindex program design
|
||||
|
||||
When compressing, plzip divides the input file into chunks and
|
||||
compresses as many chunks simultaneously as worker threads are chosen,
|
||||
creating a multi-member compressed file.
|
||||
|
||||
When decompressing, plzip decompresses as many members simultaneously as
|
||||
worker threads are chosen. Files that were compressed with lzip will not
|
||||
be decompressed faster than using lzip (unless the @samp{-b} option was
|
||||
used) because lzip usually produces single-member files, which can't be
|
||||
decompressed in parallel.
|
||||
|
||||
For each input file, a splitter thread and several worker threads are
|
||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
||||
courier" takes care of data transfers among threads and limits the
|
||||
|
@ -166,10 +202,10 @@ writes them to the output file.
|
|||
|
||||
When decompressing from a regular file, the splitter is removed and the
|
||||
workers read directly from the input file. If the output file is also a
|
||||
regular file, the muxer is also removed, and the workers write directly
|
||||
to the output file. With these optimizations, decompression speed of
|
||||
large files with many members is only limited by the number of
|
||||
processors available and by I/O speed.
|
||||
regular file, the muxer is also removed and the workers write directly
|
||||
to the output file. With these optimizations, the use of RAM is greatly
|
||||
reduced and the decompression speed of large files with many members is
|
||||
only limited by the number of processors available and by I/O speed.
|
||||
|
||||
|
||||
@node Invoking plzip
|
||||
|
@ -199,11 +235,11 @@ Print the version number of plzip on the standard output and exit.
|
|||
@item -B @var{bytes}
|
||||
@itemx --data-size=@var{bytes}
|
||||
@anchor{--data-size}
|
||||
Set the input data block size in bytes. The input file will be divided
|
||||
in chunks of this size before compression is performed. Valid values
|
||||
range from 8 KiB to 1 GiB. Default value is two times the dictionary
|
||||
size. Plzip will reduce the dictionary size if it is larger than the
|
||||
chosen data size.
|
||||
Set the size of the input data blocks, in bytes. The input file will be
|
||||
divided in chunks of this size before compression is performed. Valid
|
||||
values range from 8 KiB to 1 GiB. Default value is two times the
|
||||
dictionary size. Plzip will reduce the dictionary size if it is larger
|
||||
than the chosen data size.
|
||||
|
||||
@item -c
|
||||
@itemx --stdout
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
/* Plzip - Parallel compressor compatible with lzip
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
@ -59,7 +59,7 @@ File_index::File_index( const int infd )
|
|||
{
|
||||
const long long isize = lseek( infd, 0, SEEK_END );
|
||||
if( isize < 0 )
|
||||
{ set_errno_error( "Input file is not seekable :" ); return; }
|
||||
{ set_errno_error( "Input file is not seekable: " ); return; }
|
||||
if( isize < min_member_size )
|
||||
{ error_ = "Input file is too short."; retval_ = 2; return; }
|
||||
if( isize > INT64_MAX )
|
||||
|
@ -68,7 +68,7 @@ File_index::File_index( const int infd )
|
|||
|
||||
File_header header;
|
||||
if( seek_read( infd, header.data, File_header::size, 0 ) != File_header::size )
|
||||
{ set_errno_error( "Error reading member header :" ); return; }
|
||||
{ set_errno_error( "Error reading member header: " ); return; }
|
||||
if( !header.verify_magic() )
|
||||
{ error_ = "Bad magic number (file not in lzip format).";
|
||||
retval_ = 2; return; }
|
||||
|
@ -82,7 +82,7 @@ File_index::File_index( const int infd )
|
|||
File_trailer trailer;
|
||||
if( seek_read( infd, trailer.data, File_trailer::size,
|
||||
pos - File_trailer::size ) != File_trailer::size )
|
||||
{ set_errno_error( "Error reading member trailer :" ); break; }
|
||||
{ set_errno_error( "Error reading member trailer: " ); break; }
|
||||
const long long member_size = trailer.member_size();
|
||||
if( member_size < min_member_size || member_size > pos )
|
||||
{
|
||||
|
@ -93,7 +93,7 @@ File_index::File_index( const int infd )
|
|||
}
|
||||
if( seek_read( infd, header.data, File_header::size,
|
||||
pos - member_size ) != File_header::size )
|
||||
{ set_errno_error( "Error reading member header :" ); break; }
|
||||
{ set_errno_error( "Error reading member header: " ); break; }
|
||||
if( !header.verify_magic() || !header.verify_version() )
|
||||
{
|
||||
if( member_vector.empty() ) // maybe trailing garbage
|
||||
|
@ -119,7 +119,7 @@ File_index::File_index( const int infd )
|
|||
return;
|
||||
}
|
||||
std::reverse( member_vector.begin(), member_vector.end() );
|
||||
for( unsigned i = 0; i < member_vector.size() - 1; ++i )
|
||||
for( unsigned long i = 0; i < member_vector.size() - 1; ++i )
|
||||
{
|
||||
const long long end = member_vector[i].dblock.end();
|
||||
if( end < 0 || end > INT64_MAX )
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
/* Plzip - Parallel compressor compatible with lzip
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
|
2
lzip.h
|
@ -1,5 +1,5 @@
|
|||
/* Plzip - Parallel compressor compatible with lzip
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
|
9
main.cc
|
@ -1,6 +1,6 @@
|
|||
/* Plzip - Parallel compressor compatible with lzip
|
||||
Copyright (C) 2009 Laszlo Ersek.
|
||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
|
@ -103,7 +103,7 @@ void show_help( const long num_online )
|
|||
std::printf( "\nOptions:\n"
|
||||
" -h, --help display this help and exit\n"
|
||||
" -V, --version output version information and exit\n"
|
||||
" -B, --data-size=<bytes> set input data block size in bytes\n"
|
||||
" -B, --data-size=<bytes> set size of input data blocks, in bytes\n"
|
||||
" -c, --stdout send output to standard output\n"
|
||||
" -d, --decompress decompress\n"
|
||||
" -f, --force overwrite existing output files\n"
|
||||
|
@ -688,9 +688,8 @@ int main( const int argc, const char * const argv[] )
     int tmp;
     if( program_mode == m_compress )
       {
-      if( verbosity >= 2 )
-        show_progress( 0, &pp, ( in_statsp && S_ISREG( in_statsp->st_mode ) ) ?
-                       in_statsp->st_size / 100 : 0 ); // init
+      if( verbosity >= 2 ) // init
+        show_progress( 0, &pp, infd_isreg ? in_statsp->st_size / 100 : 0 );
       tmp = compress( data_size, encoder_options.dictionary_size,
                       encoder_options.match_len_limit,
                       num_workers, infd, outfd, pp, debug_level );
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
#! /bin/sh
|
||||
# check script for Plzip - Parallel compressor compatible with lzip
|
||||
# Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
||||
# Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||
#
|
||||
# This script is free software: you have unlimited permission
|
||||
# to copy, distribute and modify it.
|
||||
|
|