Merging upstream version 0.6.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent e39d8907e0
commit f4329ad86e
13 changed files with 378 additions and 275 deletions
96
README
@@ -1,25 +1,23 @@
 Description
 
-Xlunzip is a test tool for the lzip decompression code of my lzip patch
-for linux. Xlunzip is similar to lunzip, but it uses the lzip_decompress
-linux module as a backend. Xlunzip tests the module for stream,
-buffer-to-buffer and mixed decompression modes, including in-place
-decompression (using the same buffer for input and output). You can use
-xlunzip to verify that the module produces correct results when
-decompressing single member files, multimember files, or the
-concatenation of two or more compressed files. Xlunzip can be used with
-unzcrash to test the robustness of the module to the decompression of
-corrupted data.
+Xlunzip is a test tool for the lzip decompression code of my lzip patch for
+linux. Xlunzip is similar to lunzip, but it uses the lzip_decompress linux
+module as a backend. Xlunzip tests the module for stream, buffer-to-buffer,
+and mixed decompression modes, including in-place decompression (using the
+same buffer for input and output). You can use xlunzip to verify that the
+module produces correct results when decompressing single member files,
+multimember files, or the concatenation of two or more compressed files.
+Xlunzip can be used with unzcrash to test the robustness of the module to
+the decompression of corrupted data.
 
-Note that the in-place decompression of concatenated files can't be
-guaranteed to work because an arbitrarily low compression ratio of the
-last part of the data can be achieved by appending enough empty
-compressed members to a file, masking a high compression ratio at the
-beginning of the data.
+The distributed index feature of the lzip format allows xlunzip to
+decompress concatenated files in place. This can't be guaranteed to work
+with formats like gzip or bzip2 because they can't detect whether a high
+compression ratio in the first members of the multimember data is being
+masked by a low compression ratio in the last members.
 
-The xlunzip tarball contains a copy of the lzip_decompress module and
-can be compiled and tested without downloading or applying the patch to
-the kernel.
+The xlunzip tarball contains a copy of the lzip_decompress module and can be
+compiled and tested without downloading or applying the patch to the kernel.
 
 My lzip patch for linux can be found at
 http://download.savannah.gnu.org/releases/lzip/kernel/
 
@@ -29,14 +27,72 @@ Lzip related components in the kernel
 
 The lzip_decompress module in lib/lzip_decompress.c provides a versatile
 lzip decompression function able to do buffer to buffer decompression or
-stream decompression with fill and flush callback functions. The usage
-of the function is documented in include/linux/lzip.h.
+stream decompression with fill and flush callback functions. The usage of
+the function is documented in include/linux/lzip.h.
 
 For decompressing the kernel image, initramfs, and initrd, there is a
 wrapper function in lib/decompress_lunzip.c providing the same common
 interface as the other decompress_*.c files, which is defined in
 include/linux/decompress/generic.h.
 
+Analysis of the in-place decompression
+======================================
+
+In order to decompress the kernel in place (using the same buffer for input
+and output), the compressed data is placed at the end of the buffer used to
+hold the decompressed data. The buffer must be large enough to contain after
+the decompressed data extra space for a marker, a trailer, the maximum
+possible data expansion, and (if the compressed data consists of more than
+one member) N-1 empty members.
+
+                   |------ compressed data ------|
+                   V                             V
+  |----------------|-------------------|---------|
+  ^                                    ^  extra
+  |-------- decompressed data ---------|
+
+The input pointer initially points to the beginning of the compressed data
+and the output pointer initially points to the beginning of the buffer.
+Decompressing compressible data reduces the distance between the pointers,
+while decompressing uncompressible data increases the distance. The extra
+space must be large enough that the output pointer does not overrun the
+input pointer even if all the overlap between compressed and decompressed
+data is uncompressible. The worst case is very compressible data followed by
+uncompressible data because in this case the output pointer increases faster
+when the input pointer is smaller.
+
+          |                        * <-- input pointer
+          |                    * , <-- output pointer
+          |                * , '
+          |            x ' <-- overrun (x)
+memory    |        * ,'
+address   |    *   ,'
+          |*     ,'
+          |    ,'
+          |  ,'
+          |,'
+          `--------------------------
+                     time
+
+All we need to know to calculate the minimum required extra space is:
+  The maximum expansion ratio.
+  The size of the last part of a member required to verify integrity.
+  For multimember data, the overhead per member. (36 bytes for lzip).
+
+The maximum expansion ratio of LZMA data is of about 1.4%. Rounding this up
+to 1/64 (1.5625%) and adding 36 bytes per input member, the extra space
+required to decompress lzip data in place is:
+  extra_bytes = ( compressed_size >> 6 ) + members * 36
+
+Using the compressed size to calculate the extra_bytes (as in the equation
+above) may slightly overestimate the amount of space required in the worst
+case. But calculating the extra_bytes from the uncompressed size (as does
+linux) is wrong (and inefficient for high compression ratios). The formula
+used in arch/x86/boot/header.S
+  extra_bytes = (uncompressed_size >> 8) + 65536
+fails with 1 MB of zeros followed by 8 MB of random data, and wastes memory
+for compression ratios > 4:1.
+
 
 Copyright (C) 2016-2020 Antonio Diaz Diaz.
 
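
The two extra_bytes formulas in the new "Analysis of the in-place decompression" section can be checked numerically. What follows is a minimal Python sketch, not code from xlunzip or the kernel. The function names, the 200-byte size assumed for 1 MiB of compressed zeros, and the flat 1.4% expansion assumed for random data are all illustrative assumptions. The model only compares the two pointers at segment boundaries and ignores the marker, the trailer, and decoder read-ahead.

```python
MIB = 1 << 20
LZIP_MEMBER_OVERHEAD = 36  # bytes per member, as stated in the README


def lzip_extra_bytes(compressed_size, members=1):
    """README formula: 1/64 of the compressed size plus 36 bytes per member."""
    return (compressed_size >> 6) + members * LZIP_MEMBER_OVERHEAD


def linux_extra_bytes(uncompressed_size):
    """Formula the README quotes from arch/x86/boot/header.S."""
    return (uncompressed_size >> 8) + 65536


def overruns(segments, extra):
    """Return True if the output pointer ever passes the input pointer.

    segments is a list of (uncompressed_len, compressed_len) pairs decoded
    in order. The compressed data sits at the end of a buffer of size
    total uncompressed length + extra; output starts at offset 0.
    """
    total_out = sum(u for u, _ in segments)
    total_in = sum(c for _, c in segments)
    inp = total_out + extra - total_in  # offset where compressed data begins
    out = 0
    if inp < out:
        return True
    for u, c in segments:
        out += u  # bytes written while decoding this segment
        inp += c  # bytes consumed while decoding this segment
        if out > inp:
            return True
    return False


# Hypothetical sizes: zeros shrink to about 200 bytes, random data grows 1.4%.
ZEROS = (1 * MIB, 200)
RANDOM = (8 * MIB, 8 * MIB + (8 * MIB * 14) // 1000)
SEGMENTS = [ZEROS, RANDOM]
UNCOMPRESSED = sum(u for u, _ in SEGMENTS)
COMPRESSED = sum(c for _, c in SEGMENTS)

if __name__ == "__main__":
    for name, extra in (("linux", linux_extra_bytes(UNCOMPRESSED)),
                        ("lzip", lzip_extra_bytes(COMPRESSED))):
        print(name, extra, "overrun" if overruns(SEGMENTS, extra) else "ok")
```

Under these assumptions the header.S formula leaves the compressed data starting below the 1 MiB mark, so the output pointer passes the input pointer as soon as the zeros are decoded, while the compressed-size formula leaves enough room.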