Description

Tarlz is a massively parallel (multi-threaded) combined implementation of
the tar archiver and the lzip compressor. Tarlz creates, lists and extracts
archives in a simplified posix pax format compressed with lzip, keeping the
alignment between tar members and lzip members. This method adds an indexed
lzip layer on top of the tar archive, making it possible to decode the
archive safely in parallel. The resulting multimember tar.lz archive is
fully backward compatible with standard tar tools like GNU tar, which treat
it like any other tar.lz archive. Tarlz can append files to the end of such
compressed archives.

Tarlz can create tar archives with five levels of compression granularity:
per file, per block (default), per directory, appendable solid, and solid.

Of course, compressing each file (or each directory) individually can't
achieve a compression ratio as high as solidly compressing the whole tar
archive, but it has the following advantages:

   * The resulting multimember tar.lz archive can be decompressed in
     parallel, multiplying the decompression speed.

   * New members can be appended to the archive (by removing the EOF
     member) just like to an uncompressed tar archive.

   * It is a safe posix-style backup format. In case of corruption,
     tarlz can extract all the undamaged members from the tar.lz
     archive, skipping over the damaged members, just like the standard
     (uncompressed) tar. Moreover, the option '--keep-damaged' can be
     used to recover as much data as possible from each damaged member,
     and lziprecover can be used to recover some of the damaged members.

   * A multimember tar.lz archive is usually smaller than the
     corresponding solidly compressed tar.gz archive, except when
     individually compressing files smaller than about 32 KiB.
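
The first two advantages can be illustrated with a minimal Python sketch.
It is not tarlz code. It assumes that xz streams from Python's standard
lzma module can stand in for lzip members (Python has no lzip module), and
the helper names and the dummy member contents are hypothetical.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor

# Stand-in assumption: xz streams from the standard lzma module play the
# role of lzip members; tarlz itself writes lzip members.
EOF_BLOCKS = bytes(1024)  # a tar archive ends with two 512-byte zero blocks


def compress_members(members):
    """Compress each tar member independently (per-file granularity)."""
    return [lzma.compress(m) for m in members]


def decompress_parallel(compressed, workers=4):
    """Decompress the members concurrently and rejoin them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(lzma.decompress, compressed))


def append_member(compressed, new_member):
    """Drop the trailing EOF member, add the new one, then a fresh EOF."""
    return compressed[:-1] + compress_members([new_member, EOF_BLOCKS])


# Hypothetical 512-byte-aligned members standing in for header+data pairs.
members = [b"A" * 1024, b"B" * 512, EOF_BLOCKS]
archive = compress_members(members)
restored = decompress_parallel(archive)

# Decoding the plain concatenation serially gives the same bytes, which is
# why a tool unaware of the member layout can still read the archive.
serial = lzma.decompress(b"".join(archive))

# Appending touches only the EOF member; earlier members stay as they are.
longer = append_member(archive, b"C" * 512)
```

Because every member is a complete compressed stream, appending never
requires recompressing the members already in the archive.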

Note that the posix pax format has a serious flaw. The metadata stored in
pax extended records are not protected by any kind of check sequence.
Corruption in a long filename may cause the extraction of the file in the
wrong place without warning. Corruption in a large file size may cause the
truncation of the file or the appending of garbage to the file, both
followed by a spurious warning about a corrupt header far from the place of
the undetected corruption.

Metadata like filename and file size must always be protected in an archive
format because of the adverse effects of undetected corruption in them,
potentially much worse than undetected corruption in the data. Even more so
in the case of pax because the amount of metadata it stores is potentially
large, making undetected corruption more probable.

Because of the above, tarlz protects the extended records with a CRC in a
way compatible with standard tar tools.
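
The idea of protecting extended records with a CRC can be sketched in
Python as follows. The record layout ("<length> <keyword>=<value>\n", where
the length counts the whole record) is the standard pax one. Everything
else is an assumption made for illustration: the keyword "example.crc32",
the use of zlib's CRC-32, and the choice to cover only the records that
precede the checksum record are placeholders, not a description of what
tarlz stores.

```python
import zlib

CRC_KEYWORD = "example.crc32"  # hypothetical keyword, not the one tarlz uses


def pax_record(keyword, value):
    """Build one pax extended record: "<length> <keyword>=<value>\n".

    The decimal length counts the whole record, including its own digits.
    """
    body = b" " + keyword.encode() + b"=" + value.encode() + b"\n"
    length = len(body) + 1
    while len(str(length)) + len(body) != length:
        length += 1  # grow until the digit count is accounted for
    return str(length).encode() + body


def protect(records):
    """Append a checksum record covering the records that precede it."""
    data = b"".join(records)
    return data + pax_record(CRC_KEYWORD, f"{zlib.crc32(data):08X}")


def verify(block):
    """Return True if the checksum record matches the preceding records."""
    pos = 0
    while pos < len(block):
        space = block.index(b" ", pos)
        length = int(block[pos:space].decode("ascii"))
        record = block[space + 1:pos + length - 1]
        keyword, _, value = record.partition(b"=")
        if keyword == CRC_KEYWORD.encode():
            return int(value.decode("ascii"), 16) == zlib.crc32(block[:pos])
        pos += length
    return False  # no checksum record found


block = protect([
    pax_record("path", "dir/very_long_name.txt"),
    pax_record("size", "8589934592"),
])
# Simulate corruption inside the stored filename.
damaged = block.replace(b"dir/", b"dix/")
```

Without the checksum record, the damaged block would still parse cleanly
and the file would be extracted under the wrong name, which is the failure
described above.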

Tarlz does not understand other tar formats like gnu, oldgnu, star or v7.

The diagram below shows the correspondence between each tar member (formed
by one or two headers plus optional data) in the tar archive and each lzip
member in the resulting multimember tar.lz archive, when per file
compression is used:

tar
+========+======+=================+===============+========+======+========+
| header | data | extended header | extended data | header | data |   EOF  |
+========+======+=================+===============+========+======+========+

tar.lz
+===============+=================================================+========+
|     member    |                      member                     | member |
+===============+=================================================+========+
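
The correspondence in the diagram can be reproduced with a short Python
sketch built on the standard tarfile module. It is only an illustration:
xz streams from the lzma module stand in for lzip members, and the file
names and the tar_member helper are hypothetical. A name longer than 100
characters is used so that the second tar member needs an extended header.

```python
import io
import lzma
import tarfile

BLOCK = 512  # tar block size in bytes


def tar_member(name, data):
    """Return one complete tar member: header block(s) plus padded data."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    # In pax format, a name that does not fit the 100-byte field is
    # written as an extended header followed by its extended data.
    headers = info.tobuf(format=tarfile.PAX_FORMAT)
    return headers + data + bytes(-len(data) % BLOCK)


members = [
    tar_member("short.txt", b"hello"),       # header + data
    tar_member("d/" + "x" * 120, b"world"),  # extended header + extended
                                             # data + header + data
    bytes(2 * BLOCK),                        # EOF
]

# Per file compression: one compressed member per tar member, stand-in
# xz streams here instead of lzip members.
archive = b"".join(lzma.compress(m) for m in members)

# A reader that knows nothing about the member layout still sees an
# ordinary compressed tar archive.
tar_bytes = lzma.decompress(archive)
with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
    names = tar.getnames()
```

Each compressed member starts on a tar member boundary, so any of them can
be decoded without looking at the others.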


Copyright (C) 2013-2019 Antonio Diaz Diaz.

This file is free documentation: you have unlimited permission to copy,
distribute and modify it.

The file Makefile.in is a data file used by configure to produce the
Makefile. It has the same copyright owner and permissions that configure
itself.