Description

Tarlz is a combined implementation of the tar archiver and the lzip
compressor. By default tarlz creates, lists and extracts archives in a
simplified POSIX pax format compressed with lzip on a per file basis. Each
tar member, as well as the end-of-file blocks, is compressed in its own
lzip member. This method adds an indexed lzip layer on top of the tar
archive, making it possible to decode the archive safely in parallel. The
resulting multimember tar.lz archive is fully backward compatible with
standard tar tools like GNU tar, which treat it like any other tar.lz
archive. Tarlz can append files to the end of such compressed archives.

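The per file layout described above can be sketched in a few lines of
Python. This is only an illustration, not how tarlz itself is implemented:
gzip stands in for lzip because the Python standard library has no lzip
codec (gzip streams may also consist of several concatenated members), and
the helper names tar_member and per_file_archive are hypothetical.

```python
import gzip
import io
import tarfile

BLOCK = 512  # tar block size in bytes


def tar_member(name: str, data: bytes) -> bytes:
    """Return the raw tar bytes (header plus padded data) of one regular file."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    header = info.tobuf(format=tarfile.PAX_FORMAT)
    padding = b"\0" * (-len(data) % BLOCK)
    return header + data + padding


def per_file_archive(files: dict) -> bytes:
    """Compress every tar member, and the EOF blocks, in its own member."""
    members = [gzip.compress(tar_member(n, d)) for n, d in files.items()]
    # The two zero blocks that end a tar archive get a member of their own.
    members.append(gzip.compress(b"\0" * (2 * BLOCK)))
    return b"".join(members)


files = {"a.txt": b"alpha\n", "b.txt": b"bravo\n" * 100}
archive = per_file_archive(files)

# A standard reader sees an ordinary compressed tar archive.
with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
    names = tar.getnames()
    extracted = {n: tar.extractfile(n).read() for n in names}
```

The reader does not need to know where the member boundaries are. The
concatenated members decompress to the same byte stream as a solidly
compressed archive would.
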
Tarlz can create tar archives with four levels of compression granularity:
per file, per directory, appendable solid, and solid.

Of course, compressing each file (or each directory) individually is
less efficient than compressing the whole tar archive, but it has the
following advantages:

   * The resulting multimember tar.lz archive can be decompressed in
     parallel, multiplying the decompression speed.

   * New members can be appended to the archive (by removing the EOF
     member) just as with an uncompressed tar archive.

   * It is a safe POSIX-style backup format. In case of corruption,
     tarlz can extract all the undamaged members from the tar.lz
     archive, skipping over the damaged members, just like the standard
     (uncompressed) tar. Moreover, the option '--keep-damaged' can be
     used to recover as much data as possible from each damaged member,
     and lziprecover can be used to recover some of the damaged members.

   * A multimember tar.lz archive is usually smaller than the
     corresponding solidly compressed tar.gz archive, except when
     individually compressing files smaller than about 32 KiB.

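The first three advantages all follow from every member being decodable on
its own. The sketch below illustrates this with gzip members as a stand-in
for lzip members (an assumption, since the Python standard library has no
lzip codec). The archive is modelled as a plain list of members, and the
names try_decompress, decode_all and append_member are hypothetical.

```python
import gzip
import zlib
from concurrent.futures import ThreadPoolExecutor


def try_decompress(member: bytes):
    """Return the decoded bytes, or None if this member is damaged."""
    try:
        return gzip.decompress(member)
    except (OSError, EOFError, zlib.error):
        return None


def decode_all(members):
    """Hand each member to a worker; no member depends on another one."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(try_decompress, members))


def append_member(members, new_member):
    """Remove the trailing EOF member, add the new one, then restore EOF."""
    return members[:-1] + [new_member, members[-1]]


# Stand-ins for three tar members followed by the EOF blocks.
payloads = [b"first file " * 50, b"second file " * 50, b"third file " * 50]
eof = gzip.compress(b"\0" * 1024)
members = [gzip.compress(p) for p in payloads] + [eof]

# Damage the stored CRC of the second member.
damaged = bytearray(members[1])
damaged[-8] ^= 0xFF
members[1] = bytes(damaged)

decoded = decode_all(members)
recovered = [d for d in decoded[:-1] if d is not None]  # undamaged members
grown = append_member(members, gzip.compress(b"fourth file"))
```

Only the damaged member is lost. The members before and after it are
decoded normally, and appending touches nothing but the end of the archive.
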
Note that the POSIX pax format has a serious flaw. The metadata stored in
pax extended records are not protected by any kind of check sequence.
Corruption in a long filename may cause the extraction of the file in the
wrong place without warning. Corruption in a large file size may cause the
truncation of the file or the appending of garbage to the file, both
followed by a spurious warning about a corrupt header far from the place of
the undetected corruption.

Metadata like filename and file size must always be protected in an archive
format because of the adverse effects of undetected corruption in them,
potentially much worse than undetected corruption in the data. Even more so
in the case of pax because the amount of metadata it stores is potentially
large, making undetected corruption more probable.

Because of the above, tarlz protects the extended records with a CRC in
a way compatible with standard tar tools.

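One way to picture such a protection is to store the CRC as an additional
extended record, which tools that do not know its keyword can skip. The
sketch below uses the standard pax record layout ("length keyword=value"
followed by a newline, where the length counts the whole record). The rest
is assumed for illustration only: the keyword EXAMPLE.crc32, the use of
zlib.crc32, and the helper names are hypothetical and do not describe the
actual record written by tarlz.

```python
import zlib

CRC_KEYWORD = "EXAMPLE.crc32"  # hypothetical keyword, for illustration only


def pax_record(keyword: str, value: str) -> bytes:
    """Build one pax extended record: b'<length> <keyword>=<value>\\n'."""
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body)
    # The length field counts its own digits, so iterate to a fixed point.
    while len(body) + len(str(length)) != length:
        length = len(body) + len(str(length))
    return str(length).encode("ascii") + body


def protect(records: bytes) -> bytes:
    """Append a CRC record computed with its value field set to zeros."""
    data = records + pax_record(CRC_KEYWORD, "00000000")
    crc = zlib.crc32(data)
    return data[:-9] + f"{crc:08X}".encode("ascii") + b"\n"


def verify(data: bytes) -> bool:
    """Recompute the CRC with the stored value replaced by zeros."""
    try:
        stored = int(data[-9:-1], 16)
    except ValueError:
        return False
    return stored == zlib.crc32(data[:-9] + b"00000000\n")


block = protect(pax_record("path", "some/long/name") + pax_record("size", "123456"))
intact = verify(block)
corrupted = verify(block.replace(b"name", b"nbme"))
```

With a check like this, a damaged filename or size is detected before the
file is extracted, instead of being noticed (if at all) much later in the
archive.
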
Tarlz does not understand other tar formats like gnu, oldgnu, star or v7.

The diagram below shows the correspondence between each tar member
(formed by one or two headers plus optional data) in the tar archive and
each lzip member in the resulting multimember tar.lz archive:

tar
+========+======+=================+===============+========+======+========+
| header | data | extended header | extended data | header | data |   EOF  |
+========+======+=================+===============+========+======+========+

tar.lz
+===============+=================================================+========+
|     member    |                      member                     | member |
+===============+=================================================+========+


Copyright (C) 2013-2019 Antonio Diaz Diaz.

This file is free documentation: you have unlimited permission to copy,
distribute and modify it.

The file Makefile.in is a data file used by configure to produce the
Makefile. It has the same copyright owner and permissions that configure
itself.