Merging upstream version 1.2.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 844a3e48f2
commit 73f1304e10
19 changed files with 181 additions and 105 deletions
18
ChangeLog
|
@ -1,20 +1,12 @@
|
||||||
2014-07-01 Antonio Diaz Diaz <antonio@gnu.org>
|
2014-08-29 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
* Version 1.2-rc2 released.
|
* Version 1.2 released.
|
||||||
* License changed to GPL version 2 or later.
|
|
||||||
|
|
||||||
2014-05-08 Antonio Diaz Diaz <antonio@gnu.org>
|
|
||||||
|
|
||||||
* Version 1.2-rc1 released.
|
|
||||||
* Minor changes.
|
|
||||||
|
|
||||||
2014-01-20 Antonio Diaz Diaz <antonio@gnu.org>
|
|
||||||
|
|
||||||
* Version 1.2-pre1 released.
|
|
||||||
* main.cc (close_and_set_permissions): Behave like 'cp -p'.
|
* main.cc (close_and_set_permissions): Behave like 'cp -p'.
|
||||||
* dec_stdout.cc dec_stream.cc: Make 'slot_av' a vector to limit
|
* dec_stdout.cc dec_stream.cc: Make 'slot_av' a vector to limit
|
||||||
the number of packets produced by each worker individually.
|
the number of packets produced by each worker individually.
|
||||||
* plzip.texinfo: Renamed to plzip.texi.
|
* plzip.texinfo: Renamed to plzip.texi.
|
||||||
|
* plzip.texi: Documented the approximate amount of memory required.
|
||||||
|
* License changed to GPL version 2 or later.
|
||||||
|
|
||||||
2013-09-17 Antonio Diaz Diaz <antonio@gnu.org>
|
2013-09-17 Antonio Diaz Diaz <antonio@gnu.org>
|
||||||
|
|
||||||
|
@ -120,7 +112,7 @@
|
||||||
until something better appears on the net.
|
until something better appears on the net.
|
||||||
|
|
||||||
|
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This file is a collection of facts, and thus it is not copyrightable,
|
This file is a collection of facts, and thus it is not copyrightable,
|
||||||
but just in case, you have unlimited permission to copy, distribute and
|
but just in case, you have unlimited permission to copy, distribute and
|
||||||
|
|
6
INSTALL
|
@ -1,7 +1,7 @@
|
||||||
Requirements
|
Requirements
|
||||||
------------
|
------------
|
||||||
You will need a C++ compiler and the lzlib compression library installed.
|
You will need a C++ compiler and the lzlib compression library installed.
|
||||||
I use gcc 4.8.1 and 3.3.6, but the code should compile with any
|
I use gcc 4.9.1 and 3.3.6, but the code should compile with any
|
||||||
standards compliant compiler.
|
standards compliant compiler.
|
||||||
Lzlib must be version 1.0 or newer.
|
Lzlib must be version 1.0 or newer.
|
||||||
Gcc is available at http://gcc.gnu.org.
|
Gcc is available at http://gcc.gnu.org.
|
||||||
|
@ -34,7 +34,7 @@ the main archive.
|
||||||
5. Type 'make install' to install the program and any data files and
|
5. Type 'make install' to install the program and any data files and
|
||||||
documentation.
|
documentation.
|
||||||
|
|
||||||
You can install only the program, the info manual or the man page
|
You can install only the program, the info manual or the man page by
|
||||||
typing 'make install-bin', 'make install-info' or 'make install-man'
|
typing 'make install-bin', 'make install-info' or 'make install-man'
|
||||||
respectively.
|
respectively.
|
||||||
|
|
||||||
|
@ -60,7 +60,7 @@ After running 'configure', you can run 'make' and 'make install' as
|
||||||
explained above.
|
explained above.
|
||||||
|
|
||||||
|
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This file is free documentation: you have unlimited permission to copy,
|
This file is free documentation: you have unlimited permission to copy,
|
||||||
distribute and modify it.
|
distribute and modify it.
|
||||||
|
|
3
NEWS
|
@ -8,6 +8,9 @@ Individual limits have been set on the number of packets produced by
|
||||||
each decompresor worker thread to limit the amount of memory used in all
|
each decompresor worker thread to limit the amount of memory used in all
|
||||||
cases.
|
cases.
|
||||||
|
|
||||||
|
The approximate amount of memory required has been documented in the
|
||||||
|
manual.
|
||||||
|
|
||||||
"plzip.texinfo" has been renamed to "plzip.texi".
|
"plzip.texinfo" has been renamed to "plzip.texi".
|
||||||
|
|
||||||
The license has been changed to GPL version 2 or later.
|
The license has been changed to GPL version 2 or later.
|
||||||
|
|
28
README
|
@ -1,14 +1,24 @@
|
||||||
Description
|
Description
|
||||||
|
|
||||||
Plzip is a massively parallel (multi-threaded), lossless data compressor
|
Plzip is a massively parallel (multi-threaded) lossless data compressor
|
||||||
based on the lzlib compression library, with a user interface similar to
|
based on the lzlib compression library, with a user interface similar to
|
||||||
the one of lzip, bzip2 or gzip.
|
the one of lzip, bzip2 or gzip.
|
||||||
|
|
||||||
Plzip can compress/decompress large files on multiprocessor machines
|
Plzip can compress/decompress large files on multiprocessor machines
|
||||||
much faster than lzip, at the cost of a slightly reduced compression
|
much faster than lzip, at the cost of a slightly reduced compression
|
||||||
ratio. Note that the number of usable threads is limited by file size,
|
ratio. Note that the number of usable threads is limited by file size;
|
||||||
so on files larger than a few GB plzip can use hundreds of processors,
|
on files larger than a few GB plzip can use hundreds of processors, but
|
||||||
but on files of only a few MB plzip is no faster than lzip.
|
on files of only a few MB plzip is no faster than lzip.
|
||||||
|
|
||||||
|
When compressing, plzip divides the input file into chunks and
|
||||||
|
compresses as many chunks simultaneously as worker threads are chosen,
|
||||||
|
creating a multi-member compressed file.
|
||||||
|
|
||||||
|
When decompressing, plzip decompresses as many members simultaneously as
|
||||||
|
worker threads are chosen. Files that were compressed with lzip will not
|
||||||
|
be decompressed faster than using lzip (unless the "-b" option was used)
|
||||||
|
because lzip usually produces single-member files, which can't be
|
||||||
|
decompressed in parallel.
|
||||||
|
|
||||||
Plzip uses the lzip file format; the files produced by plzip are fully
|
Plzip uses the lzip file format; the files produced by plzip are fully
|
||||||
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
|
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
|
||||||
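Reviewer's note: the README text added in this hunk says that plzip compresses as many chunks at once as there are worker threads, and that the number of usable threads is limited by file size. A minimal sketch of that arithmetic follows, assuming every chunk is exactly the chosen data size. The names `chunk_count` and `busy_workers` are hypothetical and do not appear in plzip.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of chunks the input is divided into,
// assuming fixed-size chunks of 'data_size' bytes (rounded up).
uint64_t chunk_count( const uint64_t file_size, const uint64_t data_size )
  {
  assert( data_size > 0 );
  return ( file_size + data_size - 1 ) / data_size;
  }

// Hypothetical helper: workers that can have a chunk to process at the
// same time. Extra workers beyond the number of chunks stay idle.
uint64_t busy_workers( const uint64_t file_size, const uint64_t data_size,
                       const uint64_t num_workers )
  {
  return std::min( chunk_count( file_size, data_size ), num_workers );
  }
```

With an illustrative data size of 16 MiB (chosen for the example, not claimed to be a plzip default), a 4 GiB file yields 256 chunks, so at most 256 workers have work. A 4 MiB file yields a single chunk, which matches the README's remark that plzip is no faster than lzip on files of only a few MB.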
|
@ -32,9 +42,15 @@ into account both data integrity and decoder availability:
|
||||||
* Additionally lzip is copylefted, which guarantees that it will
|
* Additionally lzip is copylefted, which guarantees that it will
|
||||||
remain free forever.
|
remain free forever.
|
||||||
|
|
||||||
|
A nice feature of the lzip format is that a corrupt byte is easier to
|
||||||
|
repair the nearer it is from the beginning of the file. Therefore, with
|
||||||
|
the help of lziprecover, losing an entire archive just because of a
|
||||||
|
corrupt byte near the beginning is a thing of the past.
|
||||||
|
|
||||||
Plzip uses the same well-defined exit status values used by lzip and
|
Plzip uses the same well-defined exit status values used by lzip and
|
||||||
bzip2, which makes it safer than compressors returning ambiguous warning
|
bzip2, which makes it safer than compressors returning ambiguous warning
|
||||||
values (like gzip) when it is used as a back end for tar or zutils.
|
values (like gzip) when it is used as a back end for other programs like
|
||||||
|
tar or zutils.
|
||||||
|
|
||||||
Plzip will automatically use the smallest possible dictionary size for
|
Plzip will automatically use the smallest possible dictionary size for
|
||||||
each file without exceeding the given limit. Keep in mind that the
|
each file without exceeding the given limit. Keep in mind that the
|
||||||
|
@ -70,7 +86,7 @@ corresponding uncompressed files. Integrity testing of concatenated
|
||||||
compressed files is also supported.
|
compressed files is also supported.
|
||||||
|
|
||||||
|
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This file is free documentation: you have unlimited permission to copy,
|
This file is free documentation: you have unlimited permission to copy,
|
||||||
distribute and modify it.
|
distribute and modify it.
|
||||||
|
|
|
@ -1,6 +1,5 @@
|
||||||
/* Arg_parser - POSIX/GNU command line argument parser. (C++ version)
|
/* Arg_parser - POSIX/GNU command line argument parser. (C++ version)
|
||||||
Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014
|
Copyright (C) 2006-2014 Antonio Diaz Diaz.
|
||||||
Antonio Diaz Diaz.
|
|
||||||
|
|
||||||
This library is free software: you can redistribute it and/or modify
|
This library is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
|
|
@ -1,6 +1,5 @@
|
||||||
/* Arg_parser - POSIX/GNU command line argument parser. (C++ version)
|
/* Arg_parser - POSIX/GNU command line argument parser. (C++ version)
|
||||||
Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014
|
Copyright (C) 2006-2014 Antonio Diaz Diaz.
|
||||||
Antonio Diaz Diaz.
|
|
||||||
|
|
||||||
This library is free software: you can redistribute it and/or modify
|
This library is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
/* Plzip - Parallel compressor compatible with lzip
|
/* Plzip - Parallel compressor compatible with lzip
|
||||||
Copyright (C) 2009 Laszlo Ersek.
|
Copyright (C) 2009 Laszlo Ersek.
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
|
6
configure
vendored
|
@ -1,12 +1,12 @@
|
||||||
#! /bin/sh
|
#! /bin/sh
|
||||||
# configure script for Plzip - Parallel compressor compatible with lzip
|
# configure script for Plzip - Parallel compressor compatible with lzip
|
||||||
# Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
# Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
#
|
#
|
||||||
# This configure script is free software: you have unlimited permission
|
# This configure script is free software: you have unlimited permission
|
||||||
# to copy, distribute and modify it.
|
# to copy, distribute and modify it.
|
||||||
|
|
||||||
pkgname=plzip
|
pkgname=plzip
|
||||||
pkgversion=1.2-rc2
|
pkgversion=1.2
|
||||||
progname=plzip
|
progname=plzip
|
||||||
srctrigger=doc/${pkgname}.texi
|
srctrigger=doc/${pkgname}.texi
|
||||||
|
|
||||||
|
@ -165,7 +165,7 @@ echo "LDFLAGS = ${LDFLAGS}"
|
||||||
rm -f Makefile
|
rm -f Makefile
|
||||||
cat > Makefile << EOF
|
cat > Makefile << EOF
|
||||||
# Makefile for Plzip - Parallel compressor compatible with lzip
|
# Makefile for Plzip - Parallel compressor compatible with lzip
|
||||||
# Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
# Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
# This file was generated automatically by configure. Do not edit.
|
# This file was generated automatically by configure. Do not edit.
|
||||||
#
|
#
|
||||||
# This Makefile is free software: you have unlimited permission
|
# This Makefile is free software: you have unlimited permission
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
/* Plzip - Parallel compressor compatible with lzip
|
/* Plzip - Parallel compressor compatible with lzip
|
||||||
Copyright (C) 2009 Laszlo Ersek.
|
Copyright (C) 2009 Laszlo Ersek.
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
/* Plzip - Parallel compressor compatible with lzip
|
/* Plzip - Parallel compressor compatible with lzip
|
||||||
Copyright (C) 2009 Laszlo Ersek.
|
Copyright (C) 2009 Laszlo Ersek.
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
/* Plzip - Parallel compressor compatible with lzip
|
/* Plzip - Parallel compressor compatible with lzip
|
||||||
Copyright (C) 2009 Laszlo Ersek.
|
Copyright (C) 2009 Laszlo Ersek.
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
@ -43,17 +43,17 @@
 int preadblock( const int fd, uint8_t * const buf, const int size,
                 const long long pos )
   {
-  int rest = size;
+  int sz = 0;
   errno = 0;
-  while( rest > 0 )
+  while( sz < size )
     {
-    const int n = pread( fd, buf + size - rest, rest, pos + size - rest );
-    if( n > 0 ) rest -= n;
+    const int n = pread( fd, buf + sz, size - sz, pos + sz );
+    if( n > 0 ) sz += n;
     else if( n == 0 ) break;                // EOF
     else if( errno != EINTR ) break;
     errno = 0;
     }
-  return size - rest;
+  return sz;
   }
|
|
||||||
|
|
||||||
|
@ -63,16 +63,16 @@ int preadblock( const int fd, uint8_t * const buf, const int size,
 int pwriteblock( const int fd, const uint8_t * const buf, const int size,
                  const long long pos )
   {
-  int rest = size;
+  int sz = 0;
   errno = 0;
-  while( rest > 0 )
+  while( sz < size )
     {
-    const int n = pwrite( fd, buf + size - rest, rest, pos + size - rest );
-    if( n > 0 ) rest -= n;
+    const int n = pwrite( fd, buf + sz, size - sz, pos + sz );
+    if( n > 0 ) sz += n;
     else if( n < 0 && errno != EINTR ) break;
     errno = 0;
     }
-  return size - rest;
+  return sz;
   }
|
|
||||||
|
|
||||||
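Reviewer's note: the two hunks above replace the countdown variable `rest` with a running count `sz`, so the buffer offset, the remaining length and the file position are all derived from one value. The sketch below restates both loops as a self-contained program fragment so the new shape can be exercised in isolation. The names `read_at`, `write_at` and `round_trip_demo` are hypothetical, and the temporary-file round trip is only an illustration, not something plzip does.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <unistd.h>

// Reads up to 'size' bytes at offset 'pos'. Returns the number of bytes
// read. A short count means EOF if errno == 0, or a read error otherwise.
int read_at( const int fd, uint8_t * const buf, const int size,
             const long long pos )
  {
  assert( size >= 0 );
  int sz = 0;                               // bytes transferred so far
  errno = 0;
  while( sz < size )
    {
    const int n = pread( fd, buf + sz, size - sz, pos + sz );
    if( n > 0 ) sz += n;
    else if( n == 0 ) break;                // EOF
    else if( errno != EINTR ) break;        // error other than interruption
    errno = 0;
    }
  return sz;
  }

// Writes 'size' bytes at offset 'pos'. Returns the number of bytes
// written. A short count means a write error (errno is set).
int write_at( const int fd, const uint8_t * const buf, const int size,
              const long long pos )
  {
  assert( size >= 0 );
  int sz = 0;
  errno = 0;
  while( sz < size )
    {
    const int n = pwrite( fd, buf + sz, size - sz, pos + sz );
    if( n > 0 ) sz += n;
    else if( n < 0 && errno != EINTR ) break;
    errno = 0;
    }
  return sz;
  }

// Illustration only: round trip through an anonymous temporary file.
bool round_trip_demo()
  {
  std::FILE * const f = std::tmpfile();
  if( !f ) return false;
  const int fd = fileno( f );
  const uint8_t out[] = "0123456789";       // 10 data bytes
  uint8_t in[16];
  const bool ok =
    write_at( fd, out, 10, 0 ) == 10 &&
    read_at( fd, in, 4, 3 ) == 4 && std::memcmp( in, "3456", 4 ) == 0 &&
    read_at( fd, in, 16, 5 ) == 5 && errno == 0;   // short read at EOF
  std::fclose( f );
  return ok;
  }
```

The behavior is the same as before the change: interrupted calls (`EINTR`) are retried, and the caller tells EOF from failure by checking `errno` after a short count.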
|
|
12
doc/plzip.1
|
@ -1,10 +1,10 @@
|
||||||
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.37.1.
|
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.46.1.
|
||||||
.TH PLZIP "1" "July 2014" "plzip 1.2-rc2" "User Commands"
|
.TH PLZIP "1" "August 2014" "plzip 1.2" "User Commands"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
plzip \- reduces the size of files
|
plzip \- reduces the size of files
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
.B plzip
|
.B plzip
|
||||||
[\fIoptions\fR] [\fIfiles\fR]
|
[\fI\,options\/\fR] [\fI\,files\/\fR]
|
||||||
.SH DESCRIPTION
|
.SH DESCRIPTION
|
||||||
Plzip \- Parallel compressor compatible with lzip.
|
Plzip \- Parallel compressor compatible with lzip.
|
||||||
.SH OPTIONS
|
.SH OPTIONS
|
||||||
|
@ -16,7 +16,7 @@ display this help and exit
|
||||||
output version information and exit
|
output version information and exit
|
||||||
.TP
|
.TP
|
||||||
\fB\-B\fR, \fB\-\-data\-size=\fR<bytes>
|
\fB\-B\fR, \fB\-\-data\-size=\fR<bytes>
|
||||||
set input data block size in bytes
|
set size of input data blocks, in bytes
|
||||||
.TP
|
.TP
|
||||||
\fB\-c\fR, \fB\-\-stdout\fR
|
\fB\-c\fR, \fB\-\-stdout\fR
|
||||||
send output to standard output
|
send output to standard output
|
||||||
|
@ -37,7 +37,7 @@ keep (don't delete) input files
|
||||||
set match length limit in bytes [36]
|
set match length limit in bytes [36]
|
||||||
.TP
|
.TP
|
||||||
\fB\-n\fR, \fB\-\-threads=\fR<n>
|
\fB\-n\fR, \fB\-\-threads=\fR<n>
|
||||||
set number of (de)compression threads [1]
|
set number of (de)compression threads [2]
|
||||||
.TP
|
.TP
|
||||||
\fB\-o\fR, \fB\-\-output=\fR<file>
|
\fB\-o\fR, \fB\-\-output=\fR<file>
|
||||||
if reading stdin, place the output into <file>
|
if reading stdin, place the output into <file>
|
||||||
|
@ -85,7 +85,7 @@ Plzip home page: http://www.nongnu.org/lzip/plzip.html
|
||||||
Copyright \(co 2009 Laszlo Ersek.
|
Copyright \(co 2009 Laszlo Ersek.
|
||||||
.br
|
.br
|
||||||
Copyright \(co 2014 Antonio Diaz Diaz.
|
Copyright \(co 2014 Antonio Diaz Diaz.
|
||||||
Using Lzlib 1.6\-rc2
|
Using Lzlib 1.6
|
||||||
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
|
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
|
||||||
.br
|
.br
|
||||||
This is free software: you are free to change and redistribute it.
|
This is free software: you are free to change and redistribute it.
|
||||||
|
|
|
@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
|
||||||
Plzip Manual
|
Plzip Manual
|
||||||
************
|
************
|
||||||
|
|
||||||
This manual is for Plzip (version 1.2-rc2, 1 July 2014).
|
This manual is for Plzip (version 1.2, 29 August 2014).
|
||||||
|
|
||||||
* Menu:
|
* Menu:
|
||||||
|
|
||||||
|
@ -23,7 +23,7 @@ This manual is for Plzip (version 1.2-rc2, 1 July 2014).
|
||||||
* Concept index:: Index of concepts
|
* Concept index:: Index of concepts
|
||||||
|
|
||||||
|
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This manual is free documentation: you have unlimited permission to
|
This manual is free documentation: you have unlimited permission to
|
||||||
copy, distribute and modify it.
|
copy, distribute and modify it.
|
||||||
|
@ -34,15 +34,15 @@ File: plzip.info, Node: Introduction, Next: Program design, Prev: Top, Up: T
|
||||||
1 Introduction
|
1 Introduction
|
||||||
**************
|
**************
|
||||||
|
|
||||||
Plzip is a massively parallel (multi-threaded), lossless data compressor
|
Plzip is a massively parallel (multi-threaded) lossless data compressor
|
||||||
based on the lzlib compression library, with a user interface similar to
|
based on the lzlib compression library, with a user interface similar to
|
||||||
the one of lzip, bzip2 or gzip.
|
the one of lzip, bzip2 or gzip.
|
||||||
|
|
||||||
Plzip can compress/decompress large files on multiprocessor machines
|
Plzip can compress/decompress large files on multiprocessor machines
|
||||||
much faster than lzip, at the cost of a slightly reduced compression
|
much faster than lzip, at the cost of a slightly reduced compression
|
||||||
ratio. Note that the number of usable threads is limited by file size,
|
ratio. Note that the number of usable threads is limited by file size;
|
||||||
so on files larger than a few GB plzip can use hundreds of processors,
|
on files larger than a few GB plzip can use hundreds of processors, but
|
||||||
but on files of only a few MB plzip is no faster than lzip.
|
on files of only a few MB plzip is no faster than lzip.
|
||||||
|
|
||||||
Plzip uses the lzip file format; the files produced by plzip are
|
Plzip uses the lzip file format; the files produced by plzip are
|
||||||
fully compatible with lzip-1.4 or newer, and can be rescued with
|
fully compatible with lzip-1.4 or newer, and can be rescued with
|
||||||
|
@ -67,6 +67,11 @@ into account both data integrity and decoder availability:
|
||||||
* Additionally lzip is copylefted, which guarantees that it will
|
* Additionally lzip is copylefted, which guarantees that it will
|
||||||
remain free forever.
|
remain free forever.
|
||||||
|
|
||||||
|
A nice feature of the lzip format is that a corrupt byte is easier to
|
||||||
|
repair the nearer it is from the beginning of the file. Therefore, with
|
||||||
|
the help of lziprecover, losing an entire archive just because of a
|
||||||
|
corrupt byte near the beginning is a thing of the past.
|
||||||
|
|
||||||
The member trailer stores the 32-bit CRC of the original data, the
|
The member trailer stores the 32-bit CRC of the original data, the
|
||||||
size of the original data and the size of the member. These values,
|
size of the original data and the size of the member. These values,
|
||||||
together with the value remaining in the range decoder and the
|
together with the value remaining in the range decoder and the
|
||||||
|
@ -81,7 +86,23 @@ uncompressed data.
|
||||||
|
|
||||||
Plzip uses the same well-defined exit status values used by lzip and
|
Plzip uses the same well-defined exit status values used by lzip and
|
||||||
bzip2, which makes it safer than compressors returning ambiguous warning
|
bzip2, which makes it safer than compressors returning ambiguous warning
|
||||||
values (like gzip) when it is used as a back end for tar or zutils.
|
values (like gzip) when it is used as a back end for other programs like
|
||||||
|
tar or zutils.
|
||||||
|
|
||||||
|
The amount of memory required *per thread* is approximately the
|
||||||
|
following:
|
||||||
|
|
||||||
|
* For compression; 3 times the data size (*note --data-size::) plus
|
||||||
|
11 times the dictionary size.
|
||||||
|
|
||||||
|
* For decompression or testing of a non-seekable file or of standard
|
||||||
|
input; 2 times the dictionary size plus up to 32 MiB.
|
||||||
|
|
||||||
|
* For decompression of a regular file to a non-seekable file or to
|
||||||
|
standard output; the dictionary size plus up to 32 MiB.
|
||||||
|
|
||||||
|
* For decompression of a regular file to another regular file, or for
|
||||||
|
testing of a regular file; the dictionary size.
|
||||||
|
|
||||||
Plzip will automatically use the smallest possible dictionary size
|
Plzip will automatically use the smallest possible dictionary size
|
||||||
for each file without exceeding the given limit. Keep in mind that the
|
for each file without exceeding the given limit. Keep in mind that the
|
||||||
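Reviewer's note: the per-thread memory figures added to the manual in this hunk can be turned into a quick estimate. The sketch below only encodes the four approximate formulas quoted above. The names `Mode`, `per_thread_bytes` and `total_bytes` are hypothetical, and the "up to 32 MiB" term is taken at its maximum, so the results are rough upper estimates rather than measurements.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; the multipliers are the approximate figures
// quoted in the manual text above.
enum class Mode
  {
  compress,                 // compression
  decompress_from_stream,   // non-seekable input or stdin (also testing)
  decompress_to_stream,     // regular file to non-seekable output or stdout
  decompress_file_to_file   // regular file to regular file, or testing one
  };

const uint64_t MiB = 1ULL << 20;

// Approximate memory needed by one worker thread, in bytes.
uint64_t per_thread_bytes( const Mode mode, const uint64_t dict_size,
                           const uint64_t data_size )
  {
  switch( mode )
    {
    case Mode::compress: return 3 * data_size + 11 * dict_size;
    case Mode::decompress_from_stream: return 2 * dict_size + 32 * MiB;
    case Mode::decompress_to_stream: return dict_size + 32 * MiB;
    case Mode::decompress_file_to_file: return dict_size;
    }
  return 0;                 // not reached
  }

// The manual states the figures per thread, so scale by the thread count.
uint64_t total_bytes( const Mode mode, const uint64_t dict_size,
                      const uint64_t data_size, const unsigned threads )
  {
  assert( threads > 0 );
  return threads * per_thread_bytes( mode, dict_size, data_size );
  }
```

For example, with an illustrative 8 MiB dictionary and a data size of two times the dictionary size (16 MiB), compression needs about 3 * 16 + 11 * 8 = 136 MiB per thread, or about 544 MiB with 4 threads. Decompressing the same file to another regular file needs about 8 MiB per thread.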
|
@ -129,7 +150,17 @@ File: plzip.info, Node: Program design, Next: Invoking plzip, Prev: Introduct
|
||||||
2 Program design
|
2 Program design
|
||||||
****************
|
****************
|
||||||
|
|
||||||
For each input file, a splitter thread and several worker threads are
|
When compressing, plzip divides the input file into chunks and
|
||||||
|
compresses as many chunks simultaneously as worker threads are chosen,
|
||||||
|
creating a multi-member compressed file.
|
||||||
|
|
||||||
|
When decompressing, plzip decompresses as many members
|
||||||
|
simultaneously as worker threads are chosen. Files that were compressed
|
||||||
|
with lzip will not be decompressed faster than using lzip (unless the
|
||||||
|
'-b' option was used) because lzip usually produces single-member
|
||||||
|
files, which can't be decompressed in parallel.
|
||||||
|
|
||||||
|
For each input file, a splitter thread and several worker threads are
|
||||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
||||||
courier" takes care of data transfers among threads and limits the
|
courier" takes care of data transfers among threads and limits the
|
||||||
maximum number of data blocks (packets) being processed simultaneously.
|
maximum number of data blocks (packets) being processed simultaneously.
|
||||||
|
@ -141,10 +172,11 @@ writes them to the output file.
|
||||||
|
|
||||||
When decompressing from a regular file, the splitter is removed and
|
When decompressing from a regular file, the splitter is removed and
|
||||||
the workers read directly from the input file. If the output file is
|
the workers read directly from the input file. If the output file is
|
||||||
also a regular file, the muxer is also removed, and the workers write
|
also a regular file, the muxer is also removed and the workers write
|
||||||
directly to the output file. With these optimizations, decompression
|
directly to the output file. With these optimizations, the use of RAM
|
||||||
speed of large files with many members is only limited by the number of
|
is greatly reduced and the decompression speed of large files with many
|
||||||
processors available and by I/O speed.
|
members is only limited by the number of processors available and by
|
||||||
|
I/O speed.
|
||||||
|
|
||||||
|
|
||||||
File: plzip.info, Node: Invoking plzip, Next: File format, Prev: Program design, Up: Top
|
File: plzip.info, Node: Invoking plzip, Next: File format, Prev: Program design, Up: Top
|
||||||
|
@ -168,11 +200,11 @@ The format for running plzip is:
|
||||||
|
|
||||||
'-B BYTES'
|
'-B BYTES'
|
||||||
'--data-size=BYTES'
|
'--data-size=BYTES'
|
||||||
Set the input data block size in bytes. The input file will be
|
Set the size of the input data blocks, in bytes. The input file
|
||||||
divided in chunks of this size before compression is performed.
|
will be divided in chunks of this size before compression is
|
||||||
Valid values range from 8 KiB to 1 GiB. Default value is two times
|
performed. Valid values range from 8 KiB to 1 GiB. Default value
|
||||||
the dictionary size. Plzip will reduce the dictionary size if it
|
is two times the dictionary size. Plzip will reduce the dictionary
|
||||||
is larger than the chosen data size.
|
size if it is larger than the chosen data size.
|
||||||
|
|
||||||
'-c'
|
'-c'
|
||||||
'--stdout'
|
'--stdout'
|
||||||
|
@ -418,13 +450,13 @@ Concept index
|
||||||
|
|
||||||
Tag Table:
|
Tag Table:
|
||||||
Node: Top221
|
Node: Top221
|
||||||
Node: Introduction873
|
Node: Introduction847
|
||||||
Node: Program design5442
|
Node: Program design6279
|
||||||
Node: Invoking plzip6496
|
Node: Invoking plzip7868
|
||||||
Ref: --data-size6941
|
Ref: --data-size8313
|
||||||
Node: File format12090
|
Node: File format13471
|
||||||
Node: Problems14595
|
Node: Problems15976
|
||||||
Node: Concept index15124
|
Node: Concept index16505
|
||||||
|
|
||||||
End Tag Table
|
End Tag Table
|
||||||
|
|
||||||
|
|
|
@ -6,8 +6,8 @@
|
||||||
@finalout
|
@finalout
|
||||||
@c %**end of header
|
@c %**end of header
|
||||||
|
|
||||||
@set UPDATED 1 July 2014
|
@set UPDATED 29 August 2014
|
||||||
@set VERSION 1.2-rc2
|
@set VERSION 1.2
|
||||||
|
|
||||||
@dircategory Data Compression
|
@dircategory Data Compression
|
||||||
@direntry
|
@direntry
|
||||||
|
@ -44,8 +44,7 @@ This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
|
||||||
@end menu
|
@end menu
|
||||||
|
|
||||||
@sp 1
|
@sp 1
|
||||||
Copyright @copyright{} 2009, 2010, 2011, 2012, 2013, 2014
|
Copyright @copyright{} 2009-2014 Antonio Diaz Diaz.
|
||||||
Antonio Diaz Diaz.
|
|
||||||
|
|
||||||
This manual is free documentation: you have unlimited permission
|
This manual is free documentation: you have unlimited permission
|
||||||
to copy, distribute and modify it.
|
to copy, distribute and modify it.
|
||||||
|
@ -55,15 +54,15 @@ to copy, distribute and modify it.
|
||||||
@chapter Introduction
|
@chapter Introduction
|
||||||
@cindex introduction
|
@cindex introduction
|
||||||
|
|
||||||
Plzip is a massively parallel (multi-threaded), lossless data compressor
|
Plzip is a massively parallel (multi-threaded) lossless data compressor
|
||||||
based on the lzlib compression library, with a user interface similar to
|
based on the lzlib compression library, with a user interface similar to
|
||||||
the one of lzip, bzip2 or gzip.
|
the one of lzip, bzip2 or gzip.
|
||||||
|
|
||||||
Plzip can compress/decompress large files on multiprocessor machines
|
Plzip can compress/decompress large files on multiprocessor machines
|
||||||
much faster than lzip, at the cost of a slightly reduced compression
|
much faster than lzip, at the cost of a slightly reduced compression
|
||||||
ratio. Note that the number of usable threads is limited by file size,
|
ratio. Note that the number of usable threads is limited by file size;
|
||||||
so on files larger than a few GB plzip can use hundreds of processors,
|
on files larger than a few GB plzip can use hundreds of processors, but
|
||||||
but on files of only a few MB plzip is no faster than lzip.
|
on files of only a few MB plzip is no faster than lzip.
|
||||||
|
|
||||||
Plzip uses the lzip file format; the files produced by plzip are fully
|
Plzip uses the lzip file format; the files produced by plzip are fully
|
||||||
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
|
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
|
||||||
|
@ -92,6 +91,11 @@ Additionally lzip is copylefted, which guarantees that it will remain
|
||||||
free forever.
|
free forever.
|
||||||
@end itemize
|
@end itemize
|
||||||
|
|
||||||
|
A nice feature of the lzip format is that a corrupt byte is easier to
|
||||||
|
repair the nearer it is from the beginning of the file. Therefore, with
|
||||||
|
the help of lziprecover, losing an entire archive just because of a
|
||||||
|
corrupt byte near the beginning is a thing of the past.
|
||||||
|
|
||||||
The member trailer stores the 32-bit CRC of the original data, the size
|
The member trailer stores the 32-bit CRC of the original data, the size
|
||||||
of the original data and the size of the member. These values, together
|
of the original data and the size of the member. These values, together
|
||||||
with the value remaining in the range decoder and the end-of-stream
|
with the value remaining in the range decoder and the end-of-stream
|
||||||
|
@ -105,7 +109,29 @@ wrong. It can't help you recover the original uncompressed data.
|
||||||
|
|
||||||
Plzip uses the same well-defined exit status values used by lzip and
|
Plzip uses the same well-defined exit status values used by lzip and
|
||||||
bzip2, which makes it safer than compressors returning ambiguous warning
|
bzip2, which makes it safer than compressors returning ambiguous warning
|
||||||
values (like gzip) when it is used as a back end for tar or zutils.
|
values (like gzip) when it is used as a back end for other programs like
|
||||||
|
tar or zutils.
|
||||||
|
|
||||||
|
The amount of memory required @strong{per thread} is approximately the
|
||||||
|
following:
|
||||||
|
|
||||||
|
@itemize @bullet
|
||||||
|
@item
|
||||||
|
For compression; 3 times the data size (@pxref{--data-size}) plus 11
|
||||||
|
times the dictionary size.
|
||||||
|
|
||||||
|
@item
|
||||||
|
For decompression or testing of a non-seekable file or of standard
|
||||||
|
input; 2 times the dictionary size plus up to 32 MiB.
|
||||||
|
|
||||||
|
@item
|
||||||
|
For decompression of a regular file to a non-seekable file or to
|
||||||
|
standard output; the dictionary size plus up to 32 MiB.
|
||||||
|
|
||||||
|
@item
|
||||||
|
For decompression of a regular file to another regular file, or for
|
||||||
|
testing of a regular file; the dictionary size.
|
||||||
|
@end itemize
|
||||||
|
|
||||||
Plzip will automatically use the smallest possible dictionary size for
|
Plzip will automatically use the smallest possible dictionary size for
|
||||||
each file without exceeding the given limit. Keep in mind that the
|
each file without exceeding the given limit. Keep in mind that the
|
||||||
|
@ -154,6 +180,16 @@ you verify the compressed file with a command like
|
||||||
@chapter Program design
|
@chapter Program design
|
||||||
@cindex program design
|
@cindex program design
|
||||||
|
|
||||||
|
When compressing, plzip divides the input file into chunks and
|
||||||
|
compresses as many chunks simultaneously as worker threads are chosen,
|
||||||
|
creating a multi-member compressed file.
|
||||||
|
|
||||||
|
When decompressing, plzip decompresses as many members simultaneously as
|
||||||
|
worker threads are chosen. Files that were compressed with lzip will not
|
||||||
|
be decompressed faster than using lzip (unless the @samp{-b} option was
|
||||||
|
used) because lzip usually produces single-member files, which can't be
|
||||||
|
decompressed in parallel.
|
||||||
|
|
||||||
For each input file, a splitter thread and several worker threads are
|
For each input file, a splitter thread and several worker threads are
|
||||||
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
created, acting the main thread as muxer (multiplexer) thread. A "packet
|
||||||
courier" takes care of data transfers among threads and limits the
|
courier" takes care of data transfers among threads and limits the
|
||||||
|
@ -166,10 +202,10 @@ writes them to the output file.
|
||||||
|
|
||||||
When decompressing from a regular file, the splitter is removed and the
|
When decompressing from a regular file, the splitter is removed and the
|
||||||
workers read directly from the input file. If the output file is also a
|
workers read directly from the input file. If the output file is also a
|
||||||
regular file, the muxer is also removed, and the workers write directly
|
regular file, the muxer is also removed and the workers write directly
|
||||||
to the output file. With these optimizations, decompression speed of
|
to the output file. With these optimizations, the use of RAM is greatly
|
||||||
large files with many members is only limited by the number of
|
reduced and the decompression speed of large files with many members is
|
||||||
processors available and by I/O speed.
|
only limited by the number of processors available and by I/O speed.
|
||||||
|
|
||||||
|
|
||||||
@node Invoking plzip
|
@node Invoking plzip
|
||||||
|
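
Note on the Program design hunk above: the new text says the input is divided into chunks that are compressed simultaneously, as many at a time as there are worker threads. A minimal sketch of that arithmetic follows. It assumes each chunk becomes one member of the output file, which the hunk implies but does not state. The names `num_chunks` and `busy_workers` are hypothetical and are not taken from the plzip sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of chunks of 'data_size' bytes needed to
// cover 'file_size' bytes (ceiling division). Under the assumption that
// each chunk becomes one member, this is also the member count.
std::uint64_t num_chunks( const std::uint64_t file_size,
                          const std::uint64_t data_size )
  {
  assert( data_size > 0 );
  return ( file_size + data_size - 1 ) / data_size;
  }

// Hypothetical helper: how many workers can be busy at the same time.
// There is no point in counting more workers than there are chunks.
std::uint64_t busy_workers( const std::uint64_t chunks,
                            const std::uint64_t workers )
  {
  return std::min( chunks, workers );
  }
```

Under these assumptions, a file smaller than one data block yields a single chunk, so extra worker threads would have nothing to do. This is consistent with the hunk's remark that single-member files can't be decompressed in parallel.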
@ -199,11 +235,11 @@ Print the version number of plzip on the standard output and exit.
|
||||||
@item -B @var{bytes}
|
@item -B @var{bytes}
|
||||||
@itemx --data-size=@var{bytes}
|
@itemx --data-size=@var{bytes}
|
||||||
@anchor{--data-size}
|
@anchor{--data-size}
|
||||||
Set the input data block size in bytes. The input file will be divided
|
Set the size of the input data blocks, in bytes. The input file will be
|
||||||
in chunks of this size before compression is performed. Valid values
|
divided in chunks of this size before compression is performed. Valid
|
||||||
range from 8 KiB to 1 GiB. Default value is two times the dictionary
|
values range from 8 KiB to 1 GiB. Default value is two times the
|
||||||
size. Plzip will reduce the dictionary size if it is larger than the
|
dictionary size. Plzip will reduce the dictionary size if it is larger
|
||||||
chosen data size.
|
than the chosen data size.
|
||||||
|
|
||||||
@item -c
|
@item -c
|
||||||
@itemx --stdout
|
@itemx --stdout
|
||||||
|
|
|
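
Note on the plzip.texi hunks above: the new documentation gives approximate per-thread memory figures. They can be written down as a small calculator, sketched here under stated assumptions. Only the coefficients (3, 11, 2 and the 32 MiB ceiling) come from the documentation. The enum, the function names, and the idea of multiplying by the number of workers to get a total are illustrative assumptions, not plzip code.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t MiB = 1ULL << 20;

// Hypothetical labels for the four cases listed in the documentation.
enum class Mode { compress,
                  decompress_from_stream,      // non-seekable input or stdin
                  decompress_file_to_stream,   // regular file to stdout/pipe
                  decompress_file_to_file };   // regular to regular, or test

// Approximate memory per worker thread, in bytes. The "+ 32 MiB" terms are
// documented as "up to 32 MiB", so those results are upper estimates.
std::uint64_t per_thread_bytes( const Mode mode, const std::uint64_t data_size,
                                const std::uint64_t dictionary_size )
  {
  switch( mode )
    {
    case Mode::compress: return 3 * data_size + 11 * dictionary_size;
    case Mode::decompress_from_stream: return 2 * dictionary_size + 32 * MiB;
    case Mode::decompress_file_to_stream: return dictionary_size + 32 * MiB;
    case Mode::decompress_file_to_file: return dictionary_size;
    }
  return 0;   // not reached for valid modes
  }

// Assumption: the total is simply the per-thread figure times the number
// of worker threads; any shared overhead is ignored.
std::uint64_t estimate_total( const Mode mode, const std::uint64_t data_size,
                              const std::uint64_t dictionary_size,
                              const int workers )
  {
  assert( workers > 0 );
  return per_thread_bytes( mode, data_size, dictionary_size ) * workers;
  }
```

For example, with an 8 MiB dictionary and the documented default data size of twice the dictionary size (16 MiB), the compression figure is 3 * 16 + 11 * 8 = 136 MiB per thread.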
@ -1,5 +1,5 @@
|
||||||
/* Plzip - Parallel compressor compatible with lzip
|
/* Plzip - Parallel compressor compatible with lzip
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
@ -59,7 +59,7 @@ File_index::File_index( const int infd )
|
||||||
{
|
{
|
||||||
const long long isize = lseek( infd, 0, SEEK_END );
|
const long long isize = lseek( infd, 0, SEEK_END );
|
||||||
if( isize < 0 )
|
if( isize < 0 )
|
||||||
{ set_errno_error( "Input file is not seekable :" ); return; }
|
{ set_errno_error( "Input file is not seekable: " ); return; }
|
||||||
if( isize < min_member_size )
|
if( isize < min_member_size )
|
||||||
{ error_ = "Input file is too short."; retval_ = 2; return; }
|
{ error_ = "Input file is too short."; retval_ = 2; return; }
|
||||||
if( isize > INT64_MAX )
|
if( isize > INT64_MAX )
|
||||||
|
@ -68,7 +68,7 @@ File_index::File_index( const int infd )
|
||||||
|
|
||||||
File_header header;
|
File_header header;
|
||||||
if( seek_read( infd, header.data, File_header::size, 0 ) != File_header::size )
|
if( seek_read( infd, header.data, File_header::size, 0 ) != File_header::size )
|
||||||
{ set_errno_error( "Error reading member header :" ); return; }
|
{ set_errno_error( "Error reading member header: " ); return; }
|
||||||
if( !header.verify_magic() )
|
if( !header.verify_magic() )
|
||||||
{ error_ = "Bad magic number (file not in lzip format).";
|
{ error_ = "Bad magic number (file not in lzip format).";
|
||||||
retval_ = 2; return; }
|
retval_ = 2; return; }
|
||||||
|
@ -82,7 +82,7 @@ File_index::File_index( const int infd )
|
||||||
File_trailer trailer;
|
File_trailer trailer;
|
||||||
if( seek_read( infd, trailer.data, File_trailer::size,
|
if( seek_read( infd, trailer.data, File_trailer::size,
|
||||||
pos - File_trailer::size ) != File_trailer::size )
|
pos - File_trailer::size ) != File_trailer::size )
|
||||||
{ set_errno_error( "Error reading member trailer :" ); break; }
|
{ set_errno_error( "Error reading member trailer: " ); break; }
|
||||||
const long long member_size = trailer.member_size();
|
const long long member_size = trailer.member_size();
|
||||||
if( member_size < min_member_size || member_size > pos )
|
if( member_size < min_member_size || member_size > pos )
|
||||||
{
|
{
|
||||||
|
@ -93,7 +93,7 @@ File_index::File_index( const int infd )
|
||||||
}
|
}
|
||||||
if( seek_read( infd, header.data, File_header::size,
|
if( seek_read( infd, header.data, File_header::size,
|
||||||
pos - member_size ) != File_header::size )
|
pos - member_size ) != File_header::size )
|
||||||
{ set_errno_error( "Error reading member header :" ); break; }
|
{ set_errno_error( "Error reading member header: " ); break; }
|
||||||
if( !header.verify_magic() || !header.verify_version() )
|
if( !header.verify_magic() || !header.verify_version() )
|
||||||
{
|
{
|
||||||
if( member_vector.empty() ) // maybe trailing garbage
|
if( member_vector.empty() ) // maybe trailing garbage
|
||||||
|
@ -119,7 +119,7 @@ File_index::File_index( const int infd )
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
std::reverse( member_vector.begin(), member_vector.end() );
|
std::reverse( member_vector.begin(), member_vector.end() );
|
||||||
for( unsigned i = 0; i < member_vector.size() - 1; ++i )
|
for( unsigned long i = 0; i < member_vector.size() - 1; ++i )
|
||||||
{
|
{
|
||||||
const long long end = member_vector[i].dblock.end();
|
const long long end = member_vector[i].dblock.end();
|
||||||
if( end < 0 || end > INT64_MAX )
|
if( end < 0 || end > INT64_MAX )
|
||||||
|
|
|
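
Note on the last hunk above: it widens the loop index from `unsigned` to `unsigned long` while keeping the bound `member_vector.size() - 1`. As an illustration of the same adjacent-pair pattern (a sketch, not the project's code; `Block` and `contiguous` are hypothetical names), the index can be declared as `std::size_t` and the bound written as `i + 1 < v.size()`. This form also behaves correctly on an empty vector, where `size() - 1` would wrap around to a very large unsigned value.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a block described by position and size.
struct Block
  {
  long long pos, size;
  long long end() const { return pos + size; }
  };

// Returns true if every block starts exactly where the previous one ends.
// The bound 'i + 1 < v.size()' visits each adjacent pair once and does
// nothing when the vector has fewer than two elements.
bool contiguous( const std::vector< Block > & v )
  {
  for( std::size_t i = 0; i + 1 < v.size(); ++i )
    if( v[i].end() != v[i+1].pos ) return false;
  return true;
  }
```

Whether the original loop can ever see an empty vector is not visible in this hunk. The surrounding context suggests an earlier return handles that case, so the sketch only shows a defensive alternative.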
@ -1,5 +1,5 @@
|
||||||
/* Plzip - Parallel compressor compatible with lzip
|
/* Plzip - Parallel compressor compatible with lzip
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
|
2
lzip.h
|
@ -1,5 +1,5 @@
|
||||||
/* Plzip - Parallel compressor compatible with lzip
|
/* Plzip - Parallel compressor compatible with lzip
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
|
9
main.cc
|
@ -1,6 +1,6 @@
|
||||||
/* Plzip - Parallel compressor compatible with lzip
|
/* Plzip - Parallel compressor compatible with lzip
|
||||||
Copyright (C) 2009 Laszlo Ersek.
|
Copyright (C) 2009 Laszlo Ersek.
|
||||||
Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
|
|
||||||
This program is free software: you can redistribute it and/or modify
|
This program is free software: you can redistribute it and/or modify
|
||||||
it under the terms of the GNU General Public License as published by
|
it under the terms of the GNU General Public License as published by
|
||||||
|
@ -103,7 +103,7 @@ void show_help( const long num_online )
|
||||||
std::printf( "\nOptions:\n"
|
std::printf( "\nOptions:\n"
|
||||||
" -h, --help display this help and exit\n"
|
" -h, --help display this help and exit\n"
|
||||||
" -V, --version output version information and exit\n"
|
" -V, --version output version information and exit\n"
|
||||||
" -B, --data-size=<bytes> set input data block size in bytes\n"
|
" -B, --data-size=<bytes> set size of input data blocks, in bytes\n"
|
||||||
" -c, --stdout send output to standard output\n"
|
" -c, --stdout send output to standard output\n"
|
||||||
" -d, --decompress decompress\n"
|
" -d, --decompress decompress\n"
|
||||||
" -f, --force overwrite existing output files\n"
|
" -f, --force overwrite existing output files\n"
|
||||||
|
@ -688,9 +688,8 @@ int main( const int argc, const char * const argv[] )
|
||||||
int tmp;
|
int tmp;
|
||||||
if( program_mode == m_compress )
|
if( program_mode == m_compress )
|
||||||
{
|
{
|
||||||
if( verbosity >= 2 )
|
if( verbosity >= 2 ) // init
|
||||||
show_progress( 0, &pp, ( in_statsp && S_ISREG( in_statsp->st_mode ) ) ?
|
show_progress( 0, &pp, infd_isreg ? in_statsp->st_size / 100 : 0 );
|
||||||
in_statsp->st_size / 100 : 0 ); // init
|
|
||||||
tmp = compress( data_size, encoder_options.dictionary_size,
|
tmp = compress( data_size, encoder_options.dictionary_size,
|
||||||
encoder_options.match_len_limit,
|
encoder_options.match_len_limit,
|
||||||
num_workers, infd, outfd, pp, debug_level );
|
num_workers, infd, outfd, pp, debug_level );
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
#! /bin/sh
|
#! /bin/sh
|
||||||
# check script for Plzip - Parallel compressor compatible with lzip
|
# check script for Plzip - Parallel compressor compatible with lzip
|
||||||
# Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
|
# Copyright (C) 2009-2014 Antonio Diaz Diaz.
|
||||||
#
|
#
|
||||||
# This script is free software: you have unlimited permission
|
# This script is free software: you have unlimited permission
|
||||||
# to copy, distribute and modify it.
|
# to copy, distribute and modify it.
|
||||||
|
|