diff --git a/AUTHORS b/AUTHORS index dfd16e1..f34d016 100644 --- a/AUTHORS +++ b/AUTHORS @@ -1,7 +1,7 @@ Lzlib was written by Antonio Diaz Diaz. The ideas embodied in lzlib are due to (at least) the following people: -Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the -definition of Markov chains), G.N.N. Martin (for the definition of range -encoding), Igor Pavlov (for putting all the above together in LZMA), and -Julian Seward (for bzip2's CLI). +Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for +the definition of Markov chains), G.N.N. Martin (for the definition of +range encoding), Igor Pavlov (for putting all the above together in +LZMA), and Julian Seward (for bzip2's CLI). diff --git a/COPYING b/COPYING index a6511c8..4ad17ae 100644 --- a/COPYING +++ b/COPYING @@ -1,17 +1,338 @@ - Lzlib - Compression library for the lzip format - Copyright (C) Antonio Diaz Diaz. + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + Preamble - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + The licenses for most software are designed to take away your +freedom to share and change it. 
By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. 
If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. 
You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. 
If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. 
(This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. 
Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. 
+ +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. 
If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. 
+ + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. 
+ +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. diff --git a/COPYING.GPL b/COPYING.GPL deleted file mode 100644 index 42fe735..0000000 --- a/COPYING.GPL +++ /dev/null @@ -1,337 +0,0 @@ - GNU GENERAL PUBLIC LICENSE - Version 2, June 1991 - - Copyright (C) 1989, 1991 Free Software Foundation, Inc. - Everyone is permitted to copy and distribute verbatim copies - of this license document, but changing it is not allowed. - - Preamble - - The licenses for most software are designed to take away your -freedom to share and change it. By contrast, the GNU General Public -License is intended to guarantee your freedom to share and change free -software--to make sure the software is free for all its users. This -General Public License applies to most of the Free Software -Foundation's software and to any other program whose authors commit to -using it. (Some other Free Software Foundation software is covered by -the GNU Lesser General Public License instead.) You can apply it to -your programs, too. - - When we speak of free software, we are referring to freedom, not -price. 
Our General Public Licenses are designed to make sure that you -have the freedom to distribute copies of free software (and charge for -this service if you wish), that you receive source code or can get it -if you want it, that you can change the software or use pieces of it -in new free programs; and that you know you can do these things. - - To protect your rights, we need to make restrictions that forbid -anyone to deny you these rights or to ask you to surrender the rights. -These restrictions translate to certain responsibilities for you if you -distribute copies of the software, or if you modify it. - - For example, if you distribute copies of such a program, whether -gratis or for a fee, you must give the recipients all the rights that -you have. You must make sure that they, too, receive or can get the -source code. And you must show them these terms so they know their -rights. - - We protect your rights with two steps: (1) copyright the software, and -(2) offer you this license which gives you legal permission to copy, -distribute and/or modify the software. - - Also, for each author's protection and ours, we want to make certain -that everyone understands that there is no warranty for this free -software. If the software is modified by someone else and passed on, we -want its recipients to know that what they have is not the original, so -that any problems introduced by others will not reflect on the original -authors' reputations. - - Finally, any free program is threatened constantly by software -patents. We wish to avoid the danger that redistributors of a free -program will individually obtain patent licenses, in effect making the -program proprietary. To prevent this, we have made it clear that any -patent must be licensed for everyone's free use or not licensed at all. - - The precise terms and conditions for copying, distribution and -modification follow. 
- - GNU GENERAL PUBLIC LICENSE - TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION - - 0. This License applies to any program or other work which contains -a notice placed by the copyright holder saying it may be distributed -under the terms of this General Public License. The "Program", below, -refers to any such program or work, and a "work based on the Program" -means either the Program or any derivative work under copyright law: -that is to say, a work containing the Program or a portion of it, -either verbatim or with modifications and/or translated into another -language. (Hereinafter, translation is included without limitation in -the term "modification".) Each licensee is addressed as "you". - -Activities other than copying, distribution and modification are not -covered by this License; they are outside its scope. The act of -running the Program is not restricted, and the output from the Program -is covered only if its contents constitute a work based on the -Program (independent of having been made by running the Program). -Whether that is true depends on what the Program does. - - 1. You may copy and distribute verbatim copies of the Program's -source code as you receive it, in any medium, provided that you -conspicuously and appropriately publish on each copy an appropriate -copyright notice and disclaimer of warranty; keep intact all the -notices that refer to this License and to the absence of any warranty; -and give any other recipients of the Program a copy of this License -along with the Program. - -You may charge a fee for the physical act of transferring a copy, and -you may at your option offer warranty protection in exchange for a fee. - - 2. 
You may modify your copy or copies of the Program or any portion -of it, thus forming a work based on the Program, and copy and -distribute such modifications or work under the terms of Section 1 -above, provided that you also meet all of these conditions: - - a) You must cause the modified files to carry prominent notices - stating that you changed the files and the date of any change. - - b) You must cause any work that you distribute or publish, that in - whole or in part contains or is derived from the Program or any - part thereof, to be licensed as a whole at no charge to all third - parties under the terms of this License. - - c) If the modified program normally reads commands interactively - when run, you must cause it, when started running for such - interactive use in the most ordinary way, to print or display an - announcement including an appropriate copyright notice and a - notice that there is no warranty (or else, saying that you provide - a warranty) and that users may redistribute the program under - these conditions, and telling the user how to view a copy of this - License. (Exception: if the Program itself is interactive but - does not normally print such an announcement, your work based on - the Program is not required to print an announcement.) - -These requirements apply to the modified work as a whole. If -identifiable sections of that work are not derived from the Program, -and can be reasonably considered independent and separate works in -themselves, then this License, and its terms, do not apply to those -sections when you distribute them as separate works. But when you -distribute the same sections as part of a whole which is a work based -on the Program, the distribution of the whole must be on the terms of -this License, whose permissions for other licensees extend to the -entire whole, and thus to each and every part regardless of who wrote it. 
- -Thus, it is not the intent of this section to claim rights or contest -your rights to work written entirely by you; rather, the intent is to -exercise the right to control the distribution of derivative or -collective works based on the Program. - -In addition, mere aggregation of another work not based on the Program -with the Program (or with a work based on the Program) on a volume of -a storage or distribution medium does not bring the other work under -the scope of this License. - - 3. You may copy and distribute the Program (or a work based on it, -under Section 2) in object code or executable form under the terms of -Sections 1 and 2 above provided that you also do one of the following: - - a) Accompany it with the complete corresponding machine-readable - source code, which must be distributed under the terms of Sections - 1 and 2 above on a medium customarily used for software interchange; or, - - b) Accompany it with a written offer, valid for at least three - years, to give any third party, for a charge no more than your - cost of physically performing source distribution, a complete - machine-readable copy of the corresponding source code, to be - distributed under the terms of Sections 1 and 2 above on a medium - customarily used for software interchange; or, - - c) Accompany it with the information you received as to the offer - to distribute corresponding source code. (This alternative is - allowed only for noncommercial distribution and only if you - received the program in object code or executable form with such - an offer, in accord with Subsection b above.) - -The source code for a work means the preferred form of the work for -making modifications to it. For an executable work, complete source -code means all the source code for all modules it contains, plus any -associated interface definition files, plus the scripts used to -control compilation and installation of the executable. 
However, as a -special exception, the source code distributed need not include -anything that is normally distributed (in either source or binary -form) with the major components (compiler, kernel, and so on) of the -operating system on which the executable runs, unless that component -itself accompanies the executable. - -If distribution of executable or object code is made by offering -access to copy from a designated place, then offering equivalent -access to copy the source code from the same place counts as -distribution of the source code, even though third parties are not -compelled to copy the source along with the object code. - - 4. You may not copy, modify, sublicense, or distribute the Program -except as expressly provided under this License. Any attempt -otherwise to copy, modify, sublicense or distribute the Program is -void, and will automatically terminate your rights under this License. -However, parties who have received copies, or rights, from you under -this License will not have their licenses terminated so long as such -parties remain in full compliance. - - 5. You are not required to accept this License, since you have not -signed it. However, nothing else grants you permission to modify or -distribute the Program or its derivative works. These actions are -prohibited by law if you do not accept this License. Therefore, by -modifying or distributing the Program (or any work based on the -Program), you indicate your acceptance of this License to do so, and -all its terms and conditions for copying, distributing or modifying -the Program or works based on it. - - 6. Each time you redistribute the Program (or any work based on the -Program), the recipient automatically receives a license from the -original licensor to copy, distribute or modify the Program subject to -these terms and conditions. You may not impose any further -restrictions on the recipients' exercise of the rights granted herein. 
-You are not responsible for enforcing compliance by third parties to -this License. - - 7. If, as a consequence of a court judgment or allegation of patent -infringement or for any other reason (not limited to patent issues), -conditions are imposed on you (whether by court order, agreement or -otherwise) that contradict the conditions of this License, they do not -excuse you from the conditions of this License. If you cannot -distribute so as to satisfy simultaneously your obligations under this -License and any other pertinent obligations, then as a consequence you -may not distribute the Program at all. For example, if a patent -license would not permit royalty-free redistribution of the Program by -all those who receive copies directly or indirectly through you, then -the only way you could satisfy both it and this License would be to -refrain entirely from distribution of the Program. - -If any portion of this section is held invalid or unenforceable under -any particular circumstance, the balance of the section is intended to -apply and the section as a whole is intended to apply in other -circumstances. - -It is not the purpose of this section to induce you to infringe any -patents or other property right claims or to contest validity of any -such claims; this section has the sole purpose of protecting the -integrity of the free software distribution system, which is -implemented by public license practices. Many people have made -generous contributions to the wide range of software distributed -through that system in reliance on consistent application of that -system; it is up to the author/donor to decide if he or she is willing -to distribute software through any other system and a licensee cannot -impose that choice. - -This section is intended to make thoroughly clear what is believed to -be a consequence of the rest of this License. - - 8. 
If the distribution and/or use of the Program is restricted in -certain countries either by patents or by copyrighted interfaces, the -original copyright holder who places the Program under this License -may add an explicit geographical distribution limitation excluding -those countries, so that distribution is permitted only in or among -countries not thus excluded. In such case, this License incorporates -the limitation as if written in the body of this License. - - 9. The Free Software Foundation may publish revised and/or new versions -of the General Public License from time to time. Such new versions will -be similar in spirit to the present version, but may differ in detail to -address new problems or concerns. - -Each version is given a distinguishing version number. If the Program -specifies a version number of this License which applies to it and "any -later version", you have the option of following the terms and conditions -either of that version or of any later version published by the Free -Software Foundation. If the Program does not specify a version number of -this License, you may choose any version ever published by the Free Software -Foundation. - - 10. If you wish to incorporate parts of the Program into other free -programs whose distribution conditions are different, write to the author -to ask for permission. For software which is copyrighted by the Free -Software Foundation, write to the Free Software Foundation; we sometimes -make exceptions for this. Our decision will be guided by the two goals -of preserving the free status of all derivatives of our free software and -of promoting the sharing and reuse of software generally. - - NO WARRANTY - - 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY -FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN -OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES -PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED -OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF -MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS -TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE -PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, -REPAIR OR CORRECTION. - - 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING -WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR -REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, -INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING -OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED -TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY -YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER -PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE -POSSIBILITY OF SUCH DAMAGES. - - END OF TERMS AND CONDITIONS - - How to Apply These Terms to Your New Programs - - If you develop a new program, and you want it to be of the greatest -possible use to the public, the best way to achieve this is to make it -free software which everyone can redistribute and change under these terms. - - To do so, attach the following notices to the program. It is safest -to attach them to the start of each source file to most effectively -convey the exclusion of warranty; and each file should have at least -the "copyright" line and a pointer to where the full notice is found. - - - Copyright (C) - - This program is free software: you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation, either version 2 of the License, or - (at your option) any later version. 
- - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program. If not, see . - -Also add information on how to contact you by electronic and paper mail. - -If the program is interactive, make it output a short notice like this -when it starts in an interactive mode: - - Gnomovision version 69, Copyright (C) - Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. - This is free software, and you are welcome to redistribute it - under certain conditions; type `show c' for details. - -The hypothetical commands `show w' and `show c' should show the appropriate -parts of the General Public License. Of course, the commands you use may -be called something other than `show w' and `show c'; they could even be -mouse-clicks or menu items--whatever suits your program. - -You should also get your employer (if you work as a programmer) or your -school, if any, to sign a "copyright disclaimer" for the program, if -necessary. Here is a sample; alter the names: - - Yoyodyne, Inc., hereby disclaims all copyright interest in the program - `Gnomovision' (which makes passes at compilers) written by James Hacker. - - , 1 April 1989 - Ty Coon, President of Vice - -This General Public License does not permit incorporating your program into -proprietary programs. If your program is a subroutine library, you may -consider it more useful to permit linking proprietary applications with the -library. If this is what you want to do, use the GNU Lesser General -Public License instead of this License. diff --git a/ChangeLog b/ChangeLog index d3e52e4..3c65439 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,155 +1,38 @@ -2025-01-09 Antonio Diaz Diaz - - * Version 1.15 released. 
- * decoder.h (Rd_try_reload): Reject a nonzero first LZMA byte. - * minilzip.c (do_decompress): Reject empty member in multimember. - (Pp_free): New function. - * lzlib.h: Declare LZ_Errno, LZ_Encoder, and LZ_Decoder as typedef. - * Makefile.in: New target 'lib' which builds just the library. - New target 'bin' which builds the library and minilzip. - 'lib' is now the default; minilzip is no longer built by default. - 'install-bin' installs minilzip and its man page again. - * configure, Makefile.in: Use '--soname' conditionally. - (Reported by Michael Sullivan). - * INSTALL: Document use of 'make bin'. - * check.sh: Use 'cp' instead of 'cat'. - -2024-01-20 Antonio Diaz Diaz - - * Version 1.14 released. - * minilzip.c: Reformat file diagnostics as 'PROGRAM: FILE: MESSAGE'. - (show_option_error): New function showing argument and option name. - (main): Make -o preserve date/mode/owner if 1 input file. - * lzip.h: Rename verify_* to check_*. - * lzlib.texi: Document the need to declare uint8_t before lzlib.h. - (Reported by Michal Górny). - * configure, Makefile.in: New variable 'MAKEINFO'. - * INSTALL: Document use of CFLAGS+='--std=c99 -D_XOPEN_SOURCE=500'. - -2022-01-23 Antonio Diaz Diaz - - * Version 1.13 released. - * configure: Set variables AR and ARFLAGS. (Reported by Hoël Bézier). - * main.c: Rename to minilzip.c. - * minilzip.c (getnum): Show option name and valid range if error. - (check_lib): Check that LZ_API_VERSION and LZ_version_string match. - * Improve several descriptions in manual, '--help', and man page. - * lzlib.texi: Change GNU Texinfo category to 'Compression'. - (Reported by Alfred M. Szmidt). - -2021-01-02 Antonio Diaz Diaz - - * Version 1.12 released. - * lzlib.h: Define LZ_API_VERSION as 1000 * major + minor. 1.12 = 1012. - This change does not affect the soversion. - * lzlib.h, lzlib.c: New function LZ_api_version. - * LZd_try_verify_trailer: Return 2 if EOF at trailer or EOS marker. - * Decompression speed has been slightly increased. 
- * decoder.h: Increase 'rd_min_available_bytes' from 8 to 10. - * encoder_base.c (LZeb_try_sync_flush): - Compensate for the increase in 'rd_min_available_bytes'. - * main.c (do_decompress): Fix false report about library stall. - New option '--check-lib'. - (main): Report an error if a file name is empty. - Make '-o' behave like '-c', but writing to file instead of stdout. - Make '-c' and '-o' check whether the output is a terminal only once. - Do not open output if input is a terminal. - Replace 'decompressed', 'compressed' with 'out', 'in' in output. - Set a valid invocation_name even if argc == 0. - * lzlib.texi: Document the new way of checking the library version. - Document that 'LZ_(de)compress_close' and 'LZ_(de)compress_errno' - can be called with a null argument. - Document that sync flush marker is not allowed in lzip files. - Document the consequences of not calling 'LZ_decompress_finish'. - Document that 'LZ_decompress_read' returns at least once per member. - Document that 'LZ_(de)compress_read' can be called with a null - buffer pointer argument. - Real code examples for common uses have been added to the tutorial. - * bbexample.c: Don't use 'LZ_(de)compress_write_size'. - * lzcheck.c: New options '-s' (sync) and '-m' (member by member). - Test member by member without 'LZ_decompress_finish'. - * ffexample.c: New file containing example functions for file-to-file - compression/decompression. - * Document extraction from tar.lz in '--help' output and man page. - * Makefile.in: 'install-bin' no longer installs the man page. - New targets 'install-bin-compress' and 'install-bin-strip-compress'. - * testsuite: Add 9 new test files. - -2019-01-02 Antonio Diaz Diaz - - * Version 1.11 released. - * Rename File_* to Lzip_*. - * LZ_decompress_read: Don't return error until all data is read. - * decoder.c (LZd_decode_member): Decode truncated data until EOF. - * cbuffer.c (Cb_read_data): Allow a null buffer pointer. 
- * main.c: Don't allow mixing different operations (-d and -t). - * main.c: Check return value of close( infd ). - * main.c: Compile on DOS with DJGPP. - * lzlib.texi: Improve descriptions of '-0..-9', '-m', and '-s'. - Document that 'LZ_(de)compress_finish' can be called repeatedly. - * configure: Accept appending to CFLAGS; 'CFLAGS+=OPTIONS'. - * Makefile.in: Rename targets 'install-bin*' to 'install-lib*'. - * Makefile.in: Targets 'install-bin*' now install minilzip. - * INSTALL: Document use of CFLAGS+='-D __USE_MINGW_ANSI_STDIO'. - -2018-02-07 Antonio Diaz Diaz - - * Version 1.10 released. - * LZ_compress_finish now adjusts dictionary size for each member. - (Older versions can adjust dictionary size only once). - * lzlib.c (LZ_decompress_read): Detect corrupt header with HD=3. - * main.c: New option '--loose-trailing'. - (main): Make option '-S, --volume-size' keep input files. - Replace 'bits/byte' with inverse compression ratio in output. - (main): Show final diagnostic when testing multiple files. - (set_c_outname): Do not add a second '.lz' to the arg of '-o'. - (do_decompress): Show dictionary size at verbosity level 4 (-vvvv). - * lzlib.texi: New chapter 'Invoking minilzip'. - -2017-04-11 Antonio Diaz Diaz - - * Version 1.9 released. - * Compression time of option '-0' has been reduced by 3%. - * Compression time of options -1 to -9 has been reduced by 1%. - * Decompression time has been reduced by 3%. - * main.c: Continue testing if any input file is a terminal. - * Change the license of the library to "2-clause BSD". - 2016-05-17 Antonio Diaz Diaz * Version 1.8 released. - * lzlib.h: Define LZ_API_VERSION to 1. - * lzlib.c (LZ_decompress_sync_to_member): Add skipped size to in_size. - * decoder.c (LZd_verify_trailer): Remove test of final code. - * main.c: New option '-a, --trailing-error'. - (main): Delete '--output' file if infd is a terminal. - (main): Don't use stdin more than once. + * decoder.c (LZd_verify_trailer): Removed test of final code. 
+ * main.c: Added new option '-a, --trailing-error'. + * main.c (main): Delete '--output' file if infd is a terminal. + * main.c (main): Don't use stdin more than once. * configure: Avoid warning on some shells when testing for gcc. * Makefile.in: Detect the existence of install-info. - * check.sh: Require a POSIX shell. Don't check error messages. + * testsuite/check.sh: A POSIX shell is required to run the tests. + * testsuite/check.sh: Don't check error messages. 2015-07-08 Antonio Diaz Diaz * Version 1.7 released. - * Port fast encoder and option '-0' from lzip. + * Ported fast encoder and option '-0' from lzip. * If open-->write-->finish, produce same dictionary size as lzip. - * Makefile.in: New targets 'install*-compress'. + * Makefile.in: Added new targets 'install*-compress'. 2014-08-27 Antonio Diaz Diaz * Version 1.6 released. - * Compression ratio of option '-9' has been slightly increased. - * configure: New options '--disable-static' and '--disable-ldconfig'. + * Compression ratio of option -9 has been slightly increased. + * configure: Added new option '--disable-static'. + * configure: Added new option '--disable-ldconfig'. * Makefile.in: Ignore errors from ldconfig. * Makefile.in: Use 'CFLAGS' in every invocation of 'CC'. * main.c (close_and_set_permissions): Behave like 'cp -p'. - * lzlib.texinfo: Rename to lzlib.texi. - * Change license to "GPL version 2 or later with link exception". + * lzlib.texinfo: Renamed to lzlib.texi. + * License changed to "GPL version 2 or later with link exception". 2013-09-15 Antonio Diaz Diaz * Version 1.5 released. - * Remove decompression support for version 0 files. + * Removed decompression support for version 0 files. * The LZ_compress_sync_flush mechanism has been fixed (again). * Minor fixes. @@ -160,19 +43,20 @@ * Compression ratio has been slightly increased. * Compression time has been reduced by 8%. * Decompression time has been reduced by 7%. - * lzlib.h: Change 'long long' values to 'unsigned long long'. 
+ * lzlib.h: Changed 'long long' values to 'unsigned long long'. * encoder.c (Mf_init): Reduce minimum buffer size to 64KiB. * lzlib.c (LZ_decompress_read): Tell LZ_header_error from LZ_unexpected_eof the same way as lzip does. - * Makefile.in: New targets 'install-as-lzip' and 'install-bin'. + * Makefile.in: Added new target 'install-as-lzip'. + * Makefile.in: Added new target 'install-bin'. + * main.c: Use 'setmode' instead of '_setmode' on Windows and OS/2. * main.c: Define 'strtoull' to 'strtoul' on Windows. - (main): Use 'setmode' instead of '_setmode' on Windows and OS/2. 2012-02-29 Antonio Diaz Diaz * Version 1.3 released. * Translated to C from the C++ source of lzlib 1.2. - * configure: Rename 'datadir' to 'datarootdir'. + * configure: 'datadir' renamed to 'datarootdir'. 2011-10-25 Antonio Diaz Diaz @@ -181,11 +65,12 @@ independently of the value of 'pos_state'. This gives better compression for large values of '--match-length' without being slower. - * encoder.h, encoder.cc: Optimize pair price calculations, reducing - compression time for large values of '--match-length' by up to 6%. - * main.cc: New option '-F, --recompress'. - * Makefile.in: 'make install' no longer tries to run '/sbin/ldconfig' - on systems lacking it. + * encoder.h encoder.cc: Optimize pair price calculations. This + reduces compression time for large values of '--match-length' + by up to 6%. + * main.cc: Added new option '-F, --recompress'. + * Makefile.in: 'make install' no more tries to run + '/sbin/ldconfig' on systems lacking it. 2011-01-03 Antonio Diaz Diaz @@ -193,20 +78,24 @@ * Compression time has been reduced by 2%. * All declarations not belonging to the API have been encapsulated in the namespace 'Lzlib'. - * testsuite: Rename 'test1' to 'test.txt'. New tests. - * main.cc (main): Set match length limits to same values as lzip 1.11. - (main): Set stdin/stdout in binary mode on OS2. + * testsuite: 'test1' renamed to 'test.txt'. Added new tests. 
+ * Match length limits set by options -1 to -9 of minilzip have + been changed to match those of lzip 1.11. + * main.cc: Set stdin/stdout in binary mode on OS2. * bbexample.cc: New file containing example functions for buffer-to-buffer compression/decompression. 2010-05-08 Antonio Diaz Diaz * Version 1.0 released. - * New functions LZ_decompress_member_version, LZ_decompress_data_crc, - LZ_decompress_member_finished, and LZ_decompress_dictionary_size. - * Variables declared 'extern' have been encapsulated in a namespace. - * main.cc: Fix warning about fchown's return value being ignored. - * decoder.h: Integrate Input_buffer in Range_decoder. + * Added new function LZ_decompress_member_finished. + * Added new function LZ_decompress_member_version. + * Added new function LZ_decompress_dictionary_size. + * Added new function LZ_decompress_data_crc. + * Variables declared 'extern' have been encapsulated in a + namespace. + * main.cc: Fixed warning about fchown's return value being ignored. + * decoder.h: Input_buffer integrated in Range_decoder. 2010-02-10 Antonio Diaz Diaz @@ -217,26 +106,28 @@ 2010-01-17 Antonio Diaz Diaz * Version 0.8 released. - * New functions LZ_decompress_reset, LZ_decompress_sync_to_member, - LZ_decompress_write_size, and LZ_strerror. - * lzlib.h: API change. Replace 'enum' with functions for values of - dictionary size limits to make interface names consistent. - * lzlib.h: API change. Rename 'LZ_errno' to 'LZ_Errno'. - * lzlib.h: API change. Replace 'void *' with 'struct LZ_Encoder *' + * Added new function LZ_decompress_reset. + * Added new function LZ_decompress_sync_to_member. + * Added new function LZ_decompress_write_size. + * Added new function LZ_strerror. + * lzlib.h: API change. Replaced 'enum' with functions for values + of dictionary size limits to make interface names consistent. + * lzlib.h: API change. 'LZ_errno' replaced with 'LZ_Errno'. + * lzlib.h: API change. 
Replaced 'void *' with 'struct LZ_Encoder *' and 'struct LZ_Decoder *' to make interface type safe. - * decoder.cc: A truncated member trailer is now correctly detected. + * decoder.cc: Truncated member trailer is now correctly detected. * encoder.cc: Matchfinder::reset now also clears at_stream_end_, allowing LZ_compress_restart_member to restart a finished stream. * lzlib.cc: Accept only query or close operations after a fatal error has occurred. - * The shared version of lzlib is no longer built by default. - * check.sh: Use 'test1' instead of 'COPYING' for testing. + * Shared version of lzlib is no more built by default. + * testsuite/check.sh: Use 'test1' instead of 'COPYING' for testing. 2009-10-20 Antonio Diaz Diaz * Version 0.7 released. * Compression time has been reduced by 4%. - * check.sh: Remove -9 to run in less than 256MiB of RAM. + * testsuite/check.sh: Removed -9 to run in less than 256MiB of RAM. * lzcheck.cc: Read files of any size up to 2^63 bytes. 2009-09-02 Antonio Diaz Diaz @@ -248,14 +139,15 @@ * Version 0.5 released. * Decompression speed has been improved. - * main.cc (signal_handler): Declare as 'extern "C"'. + * main.cc (signal_handler): Declared as 'extern "C"'. 2009-06-03 Antonio Diaz Diaz * Version 0.4 released. - * New functions LZ_compress_sync_flush and LZ_compress_write_size. + * Added new function LZ_compress_sync_flush. + * Added new function LZ_compress_write_size. * Decompression speed has been improved. - * lzlib.texinfo: New chapter 'Buffering'. + * Added chapter 'Buffering' to the manual. 2009-05-03 Antonio Diaz Diaz @@ -265,15 +157,16 @@ 2009-04-26 Antonio Diaz Diaz * Version 0.2 released. - * Fix a segfault when decompressing trailing garbage. - * Fix a false positive in LZ_(de)compress_finished. + * Fixed a segfault when decompressing trailing garbage. + * Fixed a false positive in LZ_(de)compress_finished. 2009-04-21 Antonio Diaz Diaz * Version 0.1 released. -Copyright (C) 2009-2025 Antonio Diaz Diaz. 
+Copyright (C) 2009-2016 Antonio Diaz Diaz. -This file is a collection of facts, and thus it is not copyrightable, but just -in case, you have unlimited permission to copy, distribute, and modify it. +This file is a collection of facts, and thus it is not copyrightable, +but just in case, you have unlimited permission to copy, distribute and +modify it. diff --git a/INSTALL b/INSTALL index ba3337e..31237fc 100644 --- a/INSTALL +++ b/INSTALL @@ -1,14 +1,9 @@ Requirements ------------ -You will need a C99 compiler. (gcc 3.3.6 or newer is recommended). -I use gcc 6.1.0 and 3.3.6, but the code should compile with any standards -compliant compiler. -Gcc is available at http://gcc.gnu.org -Lzip is available at http://www.nongnu.org/lzip/lzip.html - -The operating system must allow signal handlers read access to objects with -static storage duration so that the cleanup handler for Control-C can delete -the partial output file. (This requirement is for minilzip only). +You will need a C compiler. +I use gcc 5.3.0 and 4.1.2, but the code should compile with any +standards compliant compiler. +Gcc is available at http://gcc.gnu.org. Procedure @@ -19,8 +14,8 @@ Procedure or lzip -cd lzlib[version].tar.lz | tar -xf - -This creates the directory ./lzlib[version] containing the source code -extracted from the archive. +This creates the directory ./lzlib[version] containing the source from +the main archive. 2. Change to lzlib directory and run configure. (Try 'configure --help' for usage instructions). @@ -28,65 +23,46 @@ extracted from the archive. cd lzlib[version] ./configure - If you choose a C standard, enable the POSIX features explicitly: - - ./configure CFLAGS+='--std=c99 -D_XOPEN_SOURCE=500' - - If you are compiling on MinGW, use: - - ./configure CFLAGS+='-D __USE_MINGW_ANSI_STDIO' - -3. Run make +3. Run make. make -to build the library, or - - make bin - -to build also minilzip. - 4. Optionally, type 'make check' to run the tests that come with lzlib. 5. 
Type 'make install' to install the library and any data files and - documentation. You need root privileges to install into a prefix owned - by root. (You may need to run ldconfig also). + documentation. (You may need to run ldconfig also). Or type 'make install-compress', which additionally compresses the - info manual after installation. - (Installing compressed docs may become the default in the future). + info manual after installation. (Installing compressed docs may + become the default in the future). - You can install only the library or the info manual by typing - 'make install-lib' or 'make install-info' respectively. + You can install only the library, the info manual or the man page by + typing 'make install-bin', 'make install-info' or 'make install-man' + respectively. - 'make install-bin' installs the program minilzip and its man page. It - installs a shared minilzip if the shared library has been configured. - Else it installs a static minilzip. - 'make install-bin-compress' additionally compresses the man page after - installation. - - 'make install-as-lzip' runs 'make install-bin' and then links minilzip to - the name 'lzip'. + Instead of 'make install', you can type 'make install-as-lzip' to + install the library and any data files and documentation, and link + minilzip to the name 'lzip'. Another way ----------- You can also compile lzlib into a separate directory. -To do this, you must use a version of 'make' that supports the variable -'VPATH', such as GNU 'make'. 'cd' to the directory where you want the +To do this, you must use a version of 'make' that supports the 'VPATH' +variable, such as GNU 'make'. 'cd' to the directory where you want the object files and executables to go and run the 'configure' script. -'configure' automatically checks for the source code in '.', in '..', and +'configure' automatically checks for the source code in '.', in '..' and in the directory that 'configure' is in. 
-'configure' recognizes the option '--srcdir=DIR' to control where to look -for the source code. Usually 'configure' can determine that directory +'configure' recognizes the option '--srcdir=DIR' to control where to +look for the sources. Usually 'configure' can determine that directory automatically. After running 'configure', you can run 'make' and 'make install' as explained above. -Copyright (C) 2009-2025 Antonio Diaz Diaz. +Copyright (C) 2009-2016 Antonio Diaz Diaz. This file is free documentation: you have unlimited permission to copy, -distribute, and modify it. +distribute and modify it. diff --git a/Makefile.in b/Makefile.in index 4f99874..02a1870 100644 --- a/Makefile.in +++ b/Makefile.in @@ -1,55 +1,44 @@ DISTNAME = $(pkgname)-$(pkgversion) +AR = ar INSTALL = install INSTALL_PROGRAM = $(INSTALL) -m 755 -INSTALL_DIR = $(INSTALL) -d -m 755 INSTALL_DATA = $(INSTALL) -m 644 -INSTALL_SO = $(INSTALL) -m 644 +INSTALL_DIR = $(INSTALL) -d -m 755 LDCONFIG = /sbin/ldconfig SHELL = /bin/sh CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1 -objs = carg_parser.o minilzip.o +objs = carg_parser.o main.o .PHONY : all install install-bin install-info install-man \ install-strip install-compress install-strip-compress \ install-bin-strip install-info-compress install-man-compress \ - install-bin-compress install-bin-strip-compress \ - install-lib install-lib-strip \ - install-as-lzip \ - uninstall uninstall-bin uninstall-lib uninstall-info uninstall-man \ + install-as-lzip uninstall uninstall-bin uninstall-info uninstall-man \ doc info man check dist clean distclean -all : lib - -lib : $(libname_static) $(libname_shared) +all : $(progname_static) $(progname_shared) lib$(libname).a : lzlib.o - $(AR) $(ARFLAGS) $@ $< + $(AR) -rcs $@ $< -lib$(libname).so.$(soversion) : lzlib_sh.o - $(CC) $(CFLAGS) $(LDFLAGS) -fpic -fPIC -shared -Wl,--soname=$@ -o $@ $< || \ - $(CC) $(CFLAGS) $(LDFLAGS) -fpic -fPIC -shared -o $@ $< - -bin : $(progname_static) 
$(progname_shared) +lib$(libname).so.$(pkgversion) : lzlib_sh.o + $(CC) $(LDFLAGS) $(CFLAGS) -fpic -fPIC -shared -Wl,--soname=lib$(libname).so.$(soversion) -o $@ $< $(progname) : $(objs) lib$(libname).a - $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(objs) lib$(libname).a + $(CC) $(LDFLAGS) $(CFLAGS) -o $@ $(objs) lib$(libname).a -$(progname)_shared : $(objs) lib$(libname).so.$(soversion) - $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(objs) lib$(libname).so.$(soversion) +$(progname)_shared : $(objs) lib$(libname).so.$(pkgversion) + $(CC) $(LDFLAGS) $(CFLAGS) -o $@ $(objs) lib$(libname).so.$(pkgversion) bbexample : bbexample.o lib$(libname).a - $(CC) $(CFLAGS) $(LDFLAGS) -o $@ bbexample.o lib$(libname).a - -ffexample : ffexample.o lib$(libname).a - $(CC) $(CFLAGS) $(LDFLAGS) -o $@ ffexample.o lib$(libname).a + $(CC) $(LDFLAGS) $(CFLAGS) -o $@ bbexample.o lib$(libname).a lzcheck : lzcheck.o lib$(libname).a - $(CC) $(CFLAGS) $(LDFLAGS) -o $@ lzcheck.o lib$(libname).a + $(CC) $(LDFLAGS) $(CFLAGS) -o $@ lzcheck.o lib$(libname).a -minilzip.o : minilzip.c +main.o : main.c $(CC) $(CPPFLAGS) $(CFLAGS) -DPROGVERSION=\"$(pkgversion)\" -c -o $@ $< lzlib_sh.o : lzlib.c @@ -58,11 +47,6 @@ lzlib_sh.o : lzlib.c %.o : %.c $(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $< -# prevent 'make' from trying to remake source files -$(VPATH)/configure $(VPATH)/Makefile.in $(VPATH)/doc/$(pkgname).texi : ; -MAKEFLAGS += -r -.SUFFIXES : - lzdeps = lzlib.h lzip.h cbuffer.c decoder.h decoder.c encoder_base.h \ encoder_base.c encoder.h encoder.c fast_encoder.h fast_encoder.c @@ -70,73 +54,64 @@ $(objs) : Makefile carg_parser.o : carg_parser.h lzlib.o : Makefile $(lzdeps) lzlib_sh.o : Makefile $(lzdeps) -minilzip.o : carg_parser.h lzlib.h +main.o : carg_parser.h lzlib.h bbexample.o : Makefile lzlib.h -ffexample.o : Makefile lzlib.h lzcheck.o : Makefile lzlib.h + doc : info man info : $(VPATH)/doc/$(pkgname).info $(VPATH)/doc/$(pkgname).info : $(VPATH)/doc/$(pkgname).texi - cd $(VPATH)/doc && $(MAKEINFO) $(pkgname).texi + cd 
$(VPATH)/doc && makeinfo $(pkgname).texi man : $(VPATH)/doc/$(progname).1 $(VPATH)/doc/$(progname).1 : $(progname) - help2man -n 'reduces the size of files' -o $@ --info-page=$(pkgname) ./$(progname) + help2man -n 'reduces the size of files' -o $@ --no-info ./$(progname) Makefile : $(VPATH)/configure $(VPATH)/Makefile.in ./config.status -check : $(progname) bbexample ffexample lzcheck +check : $(progname) bbexample lzcheck @$(VPATH)/testsuite/check.sh $(VPATH)/testsuite $(pkgversion) -install : install-lib install-info -install-strip : install-lib-strip install-info -install-compress : install-lib install-info-compress -install-strip-compress : install-lib-strip install-info-compress -install-bin-compress : install-bin install-man-compress -install-bin-strip-compress : install-bin-strip install-man-compress +install : install-bin install-info +install-strip : install-bin-strip install-info +install-compress : install-bin install-info-compress +install-strip-compress : install-bin-strip install-info-compress -install-bin : bin install-man - if [ ! -d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi - $(INSTALL_PROGRAM) ./$(progname_lzip) "$(DESTDIR)$(bindir)/$(progname)" - -install-bin-strip : bin - $(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install-bin - -install-lib : lib +install-bin : all if [ ! -d "$(DESTDIR)$(includedir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(includedir)" ; fi if [ ! 
-d "$(DESTDIR)$(libdir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(libdir)" ; fi $(INSTALL_DATA) $(VPATH)/$(libname)lib.h "$(DESTDIR)$(includedir)/$(libname)lib.h" - if [ -n "$(libname_static)" ] ; then \ + if [ -n "$(progname_static)" ] ; then \ $(INSTALL_DATA) ./lib$(libname).a "$(DESTDIR)$(libdir)/lib$(libname).a" ; \ fi - if [ -n "$(libname_shared)" ] ; then \ + if [ -n "$(progname_shared)" ] ; then \ + $(INSTALL_PROGRAM) ./lib$(libname).so.$(pkgversion) "$(DESTDIR)$(libdir)/lib$(libname).so.$(pkgversion)" ; \ if [ -e "$(DESTDIR)$(libdir)/lib$(libname).so.$(soversion)" ] ; then \ run_ldconfig=no ; \ else run_ldconfig=yes ; \ fi ; \ rm -f "$(DESTDIR)$(libdir)/lib$(libname).so" ; \ rm -f "$(DESTDIR)$(libdir)/lib$(libname).so.$(soversion)" ; \ - $(INSTALL_SO) ./lib$(libname).so.$(soversion) "$(DESTDIR)$(libdir)/lib$(libname).so.$(pkgversion)" ; \ cd "$(DESTDIR)$(libdir)" && ln -s lib$(libname).so.$(pkgversion) lib$(libname).so ; \ cd "$(DESTDIR)$(libdir)" && ln -s lib$(libname).so.$(pkgversion) lib$(libname).so.$(soversion) ; \ if [ "${disable_ldconfig}" != yes ] && [ $${run_ldconfig} = yes ] && \ [ -x "$(LDCONFIG)" ] ; then "$(LDCONFIG)" -n "$(DESTDIR)$(libdir)" || true ; fi ; \ fi -install-lib-strip : lib - $(MAKE) INSTALL_SO='$(INSTALL_SO) -s' install-lib +install-bin-strip : all + $(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install-bin install-info : if [ ! 
-d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"* $(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info" -if $(CAN_RUN_INSTALLINFO) ; then \ - install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ + install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ fi install-info-compress : install-info @@ -150,24 +125,25 @@ install-man : install-man-compress : install-man lzip -v -9 "$(DESTDIR)$(mandir)/man1/$(progname).1" -install-as-lzip : install-bin +install-as-lzip : install install-man + if [ ! -d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi + $(INSTALL_PROGRAM) ./$(progname_lzip) "$(DESTDIR)$(bindir)/$(progname)" -rm -f "$(DESTDIR)$(bindir)/lzip" cd "$(DESTDIR)$(bindir)" && ln -s $(progname) lzip -uninstall : uninstall-info uninstall-lib +uninstall : uninstall-man uninstall-info uninstall-bin uninstall-bin : -rm -f "$(DESTDIR)$(bindir)/$(progname)" - -uninstall-lib : -rm -f "$(DESTDIR)$(includedir)/$(libname)lib.h" -rm -f "$(DESTDIR)$(libdir)/lib$(libname).a" -rm -f "$(DESTDIR)$(libdir)/lib$(libname).so" -rm -f "$(DESTDIR)$(libdir)/lib$(libname).so.$(soversion)" + -rm -f "$(DESTDIR)$(libdir)/lib$(libname).so.$(pkgversion)" uninstall-info : -if $(CAN_RUN_INSTALLINFO) ; then \ - install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ + install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ fi -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"* @@ -179,7 +155,6 @@ dist : doc tar -Hustar --owner=root --group=root -cvf $(DISTNAME).tar \ $(DISTNAME)/AUTHORS \ $(DISTNAME)/COPYING \ - $(DISTNAME)/COPYING.GPL \ $(DISTNAME)/ChangeLog \ $(DISTNAME)/INSTALL \ $(DISTNAME)/Makefile.in \ @@ -189,22 +164,20 @@ dist : doc $(DISTNAME)/doc/$(progname).1 \ $(DISTNAME)/doc/$(pkgname).info \ 
$(DISTNAME)/doc/$(pkgname).texi \ - $(DISTNAME)/*.h \ - $(DISTNAME)/*.c \ $(DISTNAME)/testsuite/check.sh \ $(DISTNAME)/testsuite/test.txt \ - $(DISTNAME)/testsuite/fox_lf \ - $(DISTNAME)/testsuite/fox.lz \ - $(DISTNAME)/testsuite/fox_*.lz \ + $(DISTNAME)/testsuite/test2.txt \ + $(DISTNAME)/testsuite/test.txt.lz \ $(DISTNAME)/testsuite/test_sync.lz \ - $(DISTNAME)/testsuite/test.txt.lz + $(DISTNAME)/*.h \ + $(DISTNAME)/*.c rm -f $(DISTNAME) lzip -v -9 $(DISTNAME).tar clean : - -rm -f $(progname) $(objs) lzlib.o lib$(libname).a - -rm -f $(progname)_shared lzlib_sh.o lib$(libname).so* - -rm -f bbexample bbexample.o ffexample ffexample.o lzcheck lzcheck.o + -rm -f $(progname) $(objs) + -rm -f $(progname)_shared lzlib_sh.o *.so.$(pkgversion) + -rm -f bbexample bbexample.o lzcheck lzcheck.o lzlib.o *.a distclean : clean -rm -f Makefile config.status *.tar *.tar.lz diff --git a/NEWS b/NEWS index 1528dce..e60b1bd 100644 --- a/NEWS +++ b/NEWS @@ -1,21 +1,16 @@ -Changes in version 1.15: +Changes in version 1.8: -Lzlib now reports a nonzero first LZMA byte as a LZ_data_error. +The test of the value remaining in the range decoder has been removed. +(After extensive testing it has been found useless to detect corruption +in the decompressed data. Eliminating it reduces the number of false +positives for corruption and makes error detection more accurate). -minilzip now exits with error status 2 if any empty member is found in a -multimember file. +The option "-a, --trailing-error", which makes minilzip exit with error +status 2 if any remaining input is detected after decompressing the last +member, has been added. -LZ_Errno, LZ_Encoder, and LZ_Decoder are now declared in lzlib.h as typedef. +When decompressing with minilzip, the file specified with the '--output' +option is now deleted if the input is a terminal. -The targets 'lib' and 'bin' have been added to Makefile.in. 'lib' is the new -default and builds just the library. 'bin' builds both the library and -minilzip. 
- -minilzip is no longer built by default. - -'install-bin' installs minilzip and its man page again. - -To improve portability, the linker option '--soname' is now used conditionally. -(Reported by Michael Sullivan). - -The use of the target 'bin' has been documented in INSTALL. +A harmless check failure on Windows, caused by the failed comparison of +a message in text mode, has been fixed. diff --git a/README b/README index b52806d..97f11e9 100644 --- a/README +++ b/README @@ -1,89 +1,97 @@ -See the file INSTALL for compilation and installation instructions. - Description -Lzlib is a data compression library providing in-memory LZMA compression and -decompression functions, including integrity checking of the decompressed -data. The compressed data format used by the library is the lzip format. -Lzlib is written in C and is distributed under a 2-clause BSD license. +Lzlib is a data compression library providing in-memory LZMA compression +and decompression functions, including integrity checking of the +decompressed data. The compressed data format used by the library is the +lzip format. Lzlib is written in C. -The functions and variables forming the interface of the compression library -are declared in the file 'lzlib.h'. Usage examples of the library are given -in the files 'bbexample.c', 'ffexample.c', and 'minilzip.c' from the source -distribution. +The lzip file format is designed for data sharing and long-term +archiving, taking into account both data integrity and decoder +availability: -As 'lzlib.h' can be used in C and C++ programs, it must not impose a choice -of system headers on the program by including one of them. Therefore it is -the responsibility of the program using lzlib to include before 'lzlib.h' -some header that declares the type 'uint8_t'. There are at least four such -headers in C and C++: 'stdint.h', 'cstdint', 'inttypes.h', and 'cinttypes'. + * The lzip format provides very safe integrity checking and some data + recovery means. 
The lziprecover program can repair bit-flip errors + (one of the most common forms of data corruption) in lzip files, + and provides data recovery capabilities, including error-checked + merging of damaged copies of a file. -All the library functions are thread safe. The library does not install any -signal handler. The decoder checks the consistency of the compressed data, -so the library should never crash even in case of corrupted input. + * The lzip format is as simple as possible (but not simpler). The + lzip manual provides the code of a simple decompressor along with a + detailed explanation of how it works, so that with the only help of + the lzip manual it would be possible for a digital archaeologist to + extract the data from a lzip file long after quantum computers + eventually render LZMA obsolete. + + * Additionally the lzip reference implementation is copylefted, which + guarantees that it will remain free forever. + +A nice feature of the lzip format is that a corrupt byte is easier to +repair the nearer it is from the beginning of the file. Therefore, with +the help of lziprecover, losing an entire archive just because of a +corrupt byte near the beginning is a thing of the past. + +The functions and variables forming the interface of the compression +library are declared in the file 'lzlib.h'. Usage examples of the +library are given in the files 'main.c' and 'bbexample.c' from the +source distribution. Compression/decompression is done by repeatedly calling a couple of -read/write functions until all the data have been processed by the library. -This interface is safer and less error prone than the traditional zlib -interface. +read/write functions until all the data have been processed by the +library. This interface is safer and less error prone than the +traditional zlib interface. Compression/decompression is done when the read function is called. 
This -means the value returned by the position functions is not updated until a -read call, even if a lot of data are written. If you want the data to be -compressed in advance, just call the read function with a size equal to 0. +means the value returned by the position functions will not be updated +until a read call, even if a lot of data is written. If you want the +data to be compressed in advance, just call the read function with a +size equal to 0. -If all the data to be compressed are written in advance, lzlib automatically -adjusts the header of the compressed data to use the largest dictionary size -that does not exceed neither the data size nor the limit given to -'LZ_compress_open'. This feature reduces the amount of memory needed for -decompression and allows minilzip to produce identical compressed output as -lzip. +If all the data to be compressed are written in advance, lzlib will +automatically adjust the header of the compressed data to use the +smallest possible dictionary size. This feature reduces the amount of +memory needed for decompression and allows minilzip to produce identical +compressed output as lzip. -Lzlib correctly decompresses a data stream which is the concatenation of -two or more compressed data streams. The result is the concatenation of the -corresponding decompressed data streams. Integrity testing of concatenated -compressed data streams is also supported. +Lzlib will correctly decompress a data stream which is the concatenation +of two or more compressed data streams. The result is the concatenation +of the corresponding decompressed data streams. Integrity testing of +concatenated compressed data streams is also supported. -Lzlib is able to compress and decompress streams of unlimited size by -automatically creating multimember output. The members so created are large, -about 2 PiB each. +All the library functions are thread safe. The library does not install +any signal handler. 
The decoder checks the consistency of the compressed +data, so the library should never crash even in case of corrupted input. In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a concrete algorithm; it is more like "any algorithm using the LZMA coding -scheme". For example, the option '-0' of lzip uses the scheme in almost the -simplest way possible; issuing the longest match it can find, or a literal -byte if it can't find a match. Inversely, a more elaborate way of finding -coding sequences of minimum size than the one currently used by lzip could -be developed, and the resulting sequence could also be coded using the LZMA -coding scheme. +scheme". For example, the option '-0' of lzip uses the scheme in almost +the simplest way possible; issuing the longest match it can find, or a +literal byte if it can't find a match. Inversely, a much more elaborated +way of finding coding sequences of minimum size than the one currently +used by lzip could be developed, and the resulting sequence could also +be coded using the LZMA coding scheme. -Lzlib currently implements two variants of the LZMA algorithm: fast (used by -option '-0' of minilzip) and normal (used by all other compression levels). +Lzlib currently implements two variants of the LZMA algorithm; fast +(used by option '-0' of minilzip) and normal (used by all other +compression levels). The high compression of LZMA comes from combining two basic, well-proven -compression ideas: sliding dictionaries (LZ77) and Markov models (the thing -used by every compression algorithm that uses a range encoder or similar -order-0 entropy coder as its last stage) with segregation of contexts -according to what the bits are used for. +compression ideas: sliding dictionaries (LZ77/78) and markov models (the +thing used by every compression algorithm that uses a range encoder or +similar order-0 entropy coder as its last stage) with segregation of +contexts according to what the bits are used for. 
The ideas embodied in lzlib are due to (at least) the following people: -Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the -definition of Markov chains), G.N.N. Martin (for the definition of range -encoding), Igor Pavlov (for putting all the above together in LZMA), and -Julian Seward (for bzip2's CLI). - -LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have -been compressed. Decompressed is used to refer to data which have undergone -the process of decompression. - -minilzip uses Arg_parser for command-line argument parsing: -http://www.nongnu.org/arg-parser/arg_parser.html +Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for +the definition of Markov chains), G.N.N. Martin (for the definition of +range encoding), Igor Pavlov (for putting all the above together in +LZMA), and Julian Seward (for bzip2's CLI). -Copyright (C) 2009-2025 Antonio Diaz Diaz. +Copyright (C) 2009-2016 Antonio Diaz Diaz. This file is free documentation: you have unlimited permission to copy, -distribute, and modify it. +distribute and modify it. -The file Makefile.in is a data file used by configure to produce the Makefile. -It has the same copyright owner and permissions that configure itself. +The file Makefile.in is a data file used by configure to produce the +Makefile. It has the same copyright owner and permissions that configure +itself. diff --git a/bbexample.c b/bbexample.c index 4370c7e..6b1352b 100644 --- a/bbexample.c +++ b/bbexample.c @@ -1,20 +1,21 @@ -/* Buffer to buffer example - Test program for the library lzlib - Copyright (C) 2010-2025 Antonio Diaz Diaz. +/* Buffer to buffer example - Test program for the lzlib library + Copyright (C) 2010-2016 Antonio Diaz Diaz. - This program is free software: you have unlimited permission - to copy, distribute, and modify it. + This program is free software: you have unlimited permission + to copy, distribute and modify it. 
- Usage: bbexample filename + Usage is: + bbexample filename - This program is an example of how buffer-to-buffer - compression/decompression can be implemented using lzlib. + This program is an example of how buffer-to-buffer + compression/decompression can be implemented using lzlib. */ -#define _FILE_OFFSET_BITS 64 - #include #include +#ifndef __cplusplus #include +#endif #include #include #include @@ -23,72 +24,70 @@ #include "lzlib.h" -#ifndef min - #define min(x,y) ((x) <= (y) ? (x) : (y)) -#endif - -/* Return the address of a malloc'd buffer containing the file data and - the file size in '*file_sizep'. - In case of error, return 0 and do not modify '*file_sizep'. +/* Returns the address of a malloc'd buffer containing the file data and + its size in '*size'. + In case of error, returns 0 and does not modify '*size'. */ -uint8_t * read_file( const char * const name, long * const file_sizep ) +uint8_t * read_file( const char * const name, long * const size ) { long buffer_size = 1 << 20, file_size; uint8_t * buffer, * tmp; FILE * const f = fopen( name, "rb" ); if( !f ) - { fprintf( stderr, "bbexample: %s: Can't open input file: %s\n", - name, strerror( errno ) ); return 0; } + { + fprintf( stderr, "bbexample: Can't open input file '%s': %s\n", + name, strerror( errno ) ); + return 0; + } buffer = (uint8_t *)malloc( buffer_size ); if( !buffer ) - { fputs( "bbexample: read_file: Not enough memory.\n", stderr ); - fclose( f ); return 0; } + { fputs( "bbexample: Not enough memory.\n", stderr ); return 0; } file_size = fread( buffer, 1, buffer_size, f ); while( file_size >= buffer_size ) { if( buffer_size >= LONG_MAX ) { - fprintf( stderr, "bbexample: %s: Input file is too large.\n", name ); - free( buffer ); fclose( f ); return 0; + fprintf( stderr, "bbexample: Input file '%s' is too large.\n", name ); + free( buffer ); return 0; } - buffer_size = (buffer_size <= LONG_MAX / 2) ? 2 * buffer_size : LONG_MAX; + buffer_size = ( buffer_size <= LONG_MAX / 2 ) ? 
2 * buffer_size : LONG_MAX; tmp = (uint8_t *)realloc( buffer, buffer_size ); if( !tmp ) - { fputs( "bbexample: read_file: Not enough memory.\n", stderr ); - free( buffer ); fclose( f ); return 0; } + { fputs( "bbexample: Not enough memory.\n", stderr ); + free( buffer ); return 0; } buffer = tmp; file_size += fread( buffer + file_size, 1, buffer_size - file_size, f ); } if( ferror( f ) || !feof( f ) ) { - fprintf( stderr, "bbexample: %s: Error reading file: %s\n", + fprintf( stderr, "bbexample: Error reading file '%s': %s\n", name, strerror( errno ) ); - free( buffer ); fclose( f ); return 0; + free( buffer ); return 0; } fclose( f ); - *file_sizep = file_size; + *size = file_size; return buffer; } -/* Compress 'insize' bytes from 'inbuf'. - Return the address of a malloc'd buffer containing the compressed data, - and the size of the data in '*outlenp'. - In case of error, return 0 and do not modify '*outlenp'. +/* Compresses 'size' bytes from 'data'. Returns the address of a + malloc'd buffer containing the compressed data and its size in + '*out_sizep'. + In case of error, returns 0 and does not modify '*out_sizep'. */ -uint8_t * bbcompressl( const uint8_t * const inbuf, const long insize, - const int level, long * const outlenp ) +uint8_t * bbcompress( const uint8_t * const data, const long size, + const int level, long * const out_sizep ) { - typedef struct Lzma_options + struct Lzma_options { int dictionary_size; /* 4 KiB .. 512 MiB */ int match_len_limit; /* 5 .. 273 */ - } Lzma_options; - /* Mapping from gzip/bzip2 style 0..9 compression levels to the - corresponding LZMA compression parameters. */ - const Lzma_options option_mapping[] = + }; + /* Mapping from gzip/bzip2 style 1..9 compression modes + to the corresponding LZMA compression modes. 
*/ + const struct Lzma_options option_mapping[] = { { 65535, 16 }, /* -0 (65535,16 chooses fast encoder) */ { 1 << 20, 5 }, /* -1 */ @@ -100,247 +99,133 @@ uint8_t * bbcompressl( const uint8_t * const inbuf, const long insize, { 1 << 24, 68 }, /* -7 */ { 3 << 23, 132 }, /* -8 */ { 1 << 25, 273 } }; /* -9 */ - Lzma_options encoder_options; - LZ_Encoder * encoder; - uint8_t * outbuf; - const long delta_size = insize / 4 + 64; /* insize may be zero */ - long outsize = delta_size; /* initial outsize */ - long inpos = 0; - long outpos = 0; + struct Lzma_options encoder_options; + const unsigned long long member_size = 0x7FFFFFFFFFFFFFFFULL; /* INT64_MAX */ + struct LZ_Encoder * encoder; + uint8_t * new_data; + const long delta_size = ( size / 4 ) + 64; /* size may be zero */ + long new_data_size = delta_size; /* initial size */ + long new_pos = 0; + long written = 0; bool error = false; if( level < 0 || level > 9 ) return 0; encoder_options = option_mapping[level]; - if( encoder_options.dictionary_size > insize && level != 0 ) - encoder_options.dictionary_size = insize; /* saves memory */ + if( encoder_options.dictionary_size > size && level != 0 ) + encoder_options.dictionary_size = size; /* saves memory */ if( encoder_options.dictionary_size < LZ_min_dictionary_size() ) encoder_options.dictionary_size = LZ_min_dictionary_size(); encoder = LZ_compress_open( encoder_options.dictionary_size, - encoder_options.match_len_limit, INT64_MAX ); - outbuf = (uint8_t *)malloc( outsize ); - if( !encoder || LZ_compress_errno( encoder ) != LZ_ok || !outbuf ) - { free( outbuf ); LZ_compress_close( encoder ); return 0; } - - while( true ) - { - int ret = LZ_compress_write( encoder, inbuf + inpos, - min( INT_MAX, insize - inpos ) ); - if( ret < 0 ) { error = true; break; } - inpos += ret; - if( inpos >= insize ) LZ_compress_finish( encoder ); - ret = LZ_compress_read( encoder, outbuf + outpos, - min( INT_MAX, outsize - outpos ) ); - if( ret < 0 ) { error = true; break; } - outpos += 
ret; - if( LZ_compress_finished( encoder ) == 1 ) break; - if( outpos >= outsize ) - { - uint8_t * tmp; - if( outsize > LONG_MAX - delta_size ) { error = true; break; } - outsize += delta_size; - tmp = (uint8_t *)realloc( outbuf, outsize ); - if( !tmp ) { error = true; break; } - outbuf = tmp; - } - } - - if( LZ_compress_close( encoder ) < 0 ) error = true; - if( error ) { free( outbuf ); return 0; } - *outlenp = outpos; - return outbuf; - } - - -/* Decompress 'insize' bytes from 'inbuf'. - Return the address of a malloc'd buffer containing the decompressed - data, and the size of the data in '*outlenp'. - In case of error, return 0 and do not modify '*outlenp'. -*/ -uint8_t * bbdecompressl( const uint8_t * const inbuf, const long insize, - long * const outlenp ) - { - LZ_Decoder * const decoder = LZ_decompress_open(); - const long delta_size = insize; /* insize must be > zero */ - long outsize = delta_size; /* initial outsize */ - uint8_t * outbuf = (uint8_t *)malloc( outsize ); - long inpos = 0; - long outpos = 0; - bool error = false; - if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok || !outbuf ) - { free( outbuf ); LZ_decompress_close( decoder ); return 0; } - - while( true ) - { - int ret = LZ_decompress_write( decoder, inbuf + inpos, - min( INT_MAX, insize - inpos ) ); - if( ret < 0 ) { error = true; break; } - inpos += ret; - if( inpos >= insize ) LZ_decompress_finish( decoder ); - ret = LZ_decompress_read( decoder, outbuf + outpos, - min( INT_MAX, outsize - outpos ) ); - if( ret < 0 ) { error = true; break; } - outpos += ret; - if( LZ_decompress_finished( decoder ) == 1 ) break; - if( outpos >= outsize ) - { - uint8_t * tmp; - if( outsize > LONG_MAX - delta_size ) { error = true; break; } - outsize += delta_size; - tmp = (uint8_t *)realloc( outbuf, outsize ); - if( !tmp ) { error = true; break; } - outbuf = tmp; - } - } - - if( LZ_decompress_close( decoder ) < 0 ) error = true; - if( error ) { free( outbuf ); return 0; } - *outlenp = outpos; - 
return outbuf; - } - - -/* Test the whole file at all levels. */ -int full_test( const uint8_t * const inbuf, const long insize ) - { - int level; - for( level = 0; level <= 9; ++level ) - { - long midsize = 0, outsize = 0; - uint8_t * outbuf; - uint8_t * midbuf = bbcompressl( inbuf, insize, level, &midsize ); - if( !midbuf ) - { fputs( "bbexample: full_test: Not enough memory or compress error.\n", - stderr ); return 1; } - - outbuf = bbdecompressl( midbuf, midsize, &outsize ); - free( midbuf ); - if( !outbuf ) - { fputs( "bbexample: full_test: Not enough memory or decompress error.\n", - stderr ); return 1; } - - if( insize != outsize || - ( insize > 0 && memcmp( inbuf, outbuf, insize ) != 0 ) ) - { fputs( "bbexample: full_test: Decompressed data differs from original.\n", - stderr ); free( outbuf ); return 1; } - - free( outbuf ); - } - return 0; - } - - -/* Compress 'insize' bytes from 'inbuf' to 'outbuf'. - Return the size of the compressed data in '*outlenp'. - In case of error, or if 'outsize' is too small, return false and do not - modify '*outlenp'. 
-*/ -bool bbcompress( const uint8_t * const inbuf, const int insize, - const int dictionary_size, const int match_len_limit, - uint8_t * const outbuf, const int outsize, - int * const outlenp ) - { - int inpos = 0, outpos = 0; - bool error = false; - LZ_Encoder * const encoder = - LZ_compress_open( dictionary_size, match_len_limit, INT64_MAX ); + encoder_options.match_len_limit, member_size ); if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) - { LZ_compress_close( encoder ); return false; } + { LZ_compress_close( encoder ); return 0; } + + new_data = (uint8_t *)malloc( new_data_size ); + if( !new_data ) + { LZ_compress_close( encoder ); return 0; } while( true ) { - int ret = LZ_compress_write( encoder, inbuf + inpos, insize - inpos ); - if( ret < 0 ) { error = true; break; } - inpos += ret; - if( inpos >= insize ) LZ_compress_finish( encoder ); - ret = LZ_compress_read( encoder, outbuf + outpos, outsize - outpos ); - if( ret < 0 ) { error = true; break; } - outpos += ret; + int rd; + if( LZ_compress_write_size( encoder ) > 0 ) + { + if( written < size ) + { + const int wr = LZ_compress_write( encoder, data + written, + size - written ); + if( wr < 0 ) { error = true; break; } + written += wr; + } + if( written >= size ) LZ_compress_finish( encoder ); + } + rd = LZ_compress_read( encoder, new_data + new_pos, + new_data_size - new_pos ); + if( rd < 0 ) { error = true; break; } + new_pos += rd; if( LZ_compress_finished( encoder ) == 1 ) break; - if( outpos >= outsize ) { error = true; break; } + if( new_pos >= new_data_size ) + { + uint8_t * tmp; + if( new_data_size > LONG_MAX - delta_size ) { error = true; break; } + new_data_size += delta_size; + tmp = (uint8_t *)realloc( new_data, new_data_size ); + if( !tmp ) { error = true; break; } + new_data = tmp; + } } if( LZ_compress_close( encoder ) < 0 ) error = true; - if( error ) return false; - *outlenp = outpos; - return true; + if( error ) { free( new_data ); return 0; } + *out_sizep = new_pos; + return 
new_data; } -/* Decompress 'insize' bytes from 'inbuf' to 'outbuf'. - Return the size of the decompressed data in '*outlenp'. - In case of error, or if 'outsize' is too small, return false and do not - modify '*outlenp'. +/* Decompresses 'size' bytes from 'data'. Returns the address of a + malloc'd buffer containing the decompressed data and its size in + '*out_sizep'. + In case of error, returns 0 and does not modify '*out_sizep'. */ -bool bbdecompress( const uint8_t * const inbuf, const int insize, - uint8_t * const outbuf, const int outsize, - int * const outlenp ) +uint8_t * bbdecompress( const uint8_t * const data, const long size, + long * const out_sizep ) { - int inpos = 0, outpos = 0; + struct LZ_Decoder * const decoder = LZ_decompress_open(); + uint8_t * new_data; + const long delta_size = size; /* size must be > zero */ + long new_data_size = delta_size; /* initial size */ + long new_pos = 0; + long written = 0; bool error = false; - LZ_Decoder * const decoder = LZ_decompress_open(); if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) - { LZ_decompress_close( decoder ); return false; } + { LZ_decompress_close( decoder ); return 0; } + + new_data = (uint8_t *)malloc( new_data_size ); + if( !new_data ) + { LZ_decompress_close( decoder ); return 0; } while( true ) { - int ret = LZ_decompress_write( decoder, inbuf + inpos, insize - inpos ); - if( ret < 0 ) { error = true; break; } - inpos += ret; - if( inpos >= insize ) LZ_decompress_finish( decoder ); - ret = LZ_decompress_read( decoder, outbuf + outpos, outsize - outpos ); - if( ret < 0 ) { error = true; break; } - outpos += ret; + int rd; + if( LZ_decompress_write_size( decoder ) > 0 ) + { + if( written < size ) + { + const int wr = LZ_decompress_write( decoder, data + written, + size - written ); + if( wr < 0 ) { error = true; break; } + written += wr; + } + if( written >= size ) LZ_decompress_finish( decoder ); + } + rd = LZ_decompress_read( decoder, new_data + new_pos, + new_data_size - new_pos 
); + if( rd < 0 ) { error = true; break; } + new_pos += rd; if( LZ_decompress_finished( decoder ) == 1 ) break; - if( outpos >= outsize ) { error = true; break; } + if( new_pos >= new_data_size ) + { + uint8_t * tmp; + if( new_data_size > LONG_MAX - delta_size ) { error = true; break; } + new_data_size += delta_size; + tmp = (uint8_t *)realloc( new_data, new_data_size ); + if( !tmp ) { error = true; break; } + new_data = tmp; + } } if( LZ_decompress_close( decoder ) < 0 ) error = true; - if( error ) return false; - *outlenp = outpos; - return true; - } - - -/* Test at most INT_MAX bytes from the file with buffers of fixed size. */ -int fixed_test( const uint8_t * const inbuf, const int insize ) - { - int dictionary_size = 65535; /* fast encoder */ - int midsize = min( INT_MAX, ( insize / 8 ) * 9LL + 44 ), outsize = insize; - uint8_t * midbuf = (uint8_t *)malloc( midsize ); - uint8_t * outbuf = (uint8_t *)malloc( outsize ); - if( !midbuf || !outbuf ) - { fputs( "bbexample: fixed_test: Not enough memory.\n", stderr ); - free( outbuf ); free( midbuf ); return 1; } - - for( ; dictionary_size <= 8 << 20; dictionary_size += 8323073 ) - { - int midlen, outlen; - if( !bbcompress( inbuf, insize, dictionary_size, 16, midbuf, midsize, &midlen ) ) - { fputs( "bbexample: fixed_test: Not enough memory or compress error.\n", - stderr ); free( outbuf ); free( midbuf ); return 1; } - - if( !bbdecompress( midbuf, midlen, outbuf, outsize, &outlen ) ) - { fputs( "bbexample: fixed_test: Not enough memory or decompress error.\n", - stderr ); free( outbuf ); free( midbuf ); return 1; } - - if( insize != outlen || - ( insize > 0 && memcmp( inbuf, outbuf, insize ) != 0 ) ) - { fputs( "bbexample: fixed_test: Decompressed data differs from original.\n", - stderr ); free( outbuf ); free( midbuf ); return 1; } - - } - free( outbuf ); - free( midbuf ); - return 0; + if( error ) { free( new_data ); return 0; } + *out_sizep = new_pos; + return new_data; } int main( const int argc, const char * 
const argv[] ) { - int retval = 0, i; - int open_failures = 0; - const bool verbose = argc > 2; + uint8_t * in_buffer; + long in_size = 0; + int level; if( argc < 2 ) { @@ -348,20 +233,38 @@ int main( const int argc, const char * const argv[] ) return 1; } - for( i = 1; i < argc && retval == 0; ++i ) - { - long insize; - uint8_t * const inbuf = read_file( argv[i], &insize ); - if( !inbuf ) { ++open_failures; continue; } - if( verbose ) fprintf( stderr, " Testing file '%s'\n", argv[i] ); + in_buffer = read_file( argv[1], &in_size ); + if( !in_buffer ) return 1; - retval = full_test( inbuf, insize ); - if( retval == 0 ) retval = fixed_test( inbuf, min( INT_MAX, insize ) ); - free( inbuf ); + for( level = 0; level <= 9; ++level ) + { + uint8_t * mid_buffer, * out_buffer; + long mid_size = 0, out_size = 0; + + mid_buffer = bbcompress( in_buffer, in_size, level, &mid_size ); + if( !mid_buffer ) + { + fputs( "bbexample: Not enough memory or compress error.\n", stderr ); + return 1; + } + + out_buffer = bbdecompress( mid_buffer, mid_size, &out_size ); + if( !out_buffer ) + { + fputs( "bbexample: Not enough memory or decompress error.\n", stderr ); + return 1; + } + + if( in_size != out_size || + ( in_size > 0 && memcmp( in_buffer, out_buffer, in_size ) != 0 ) ) + { + fputs( "bbexample: Decompressed data differs from original.\n", stderr ); + return 1; + } + + free( out_buffer ); + free( mid_buffer ); } - if( open_failures > 0 && verbose ) - fprintf( stderr, "bbexample: warning: %d %s failed to open.\n", - open_failures, ( open_failures == 1 ) ? "file" : "files" ); - if( retval == 0 && open_failures ) retval = 1; - return retval; + free( in_buffer ); + return 0; } diff --git a/carg_parser.c b/carg_parser.c index 20b8a16..3d4e89f 100644 --- a/carg_parser.c +++ b/carg_parser.c @@ -1,20 +1,20 @@ -/* Arg_parser - POSIX/GNU command-line argument parser. (C version) - Copyright (C) 2006-2025 Antonio Diaz Diaz. +/* Arg_parser - POSIX/GNU command line argument parser. 
(C version) + Copyright (C) 2006-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software. Redistribution and use in source and + binary forms, with or without modification, are permitted provided + that the following conditions are met: - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
*/ #include @@ -32,46 +32,28 @@ static void * ap_resize_buffer( void * buf, const int min_size ) } -static char push_back_record( Arg_parser * const ap, const int code, - const char * const long_name, - const char * const argument ) +static char push_back_record( struct Arg_parser * const ap, + const int code, const char * const argument ) { - ap_Record * p; + const int len = strlen( argument ); + struct ap_Record * p; void * tmp = ap_resize_buffer( ap->data, - ( ap->data_size + 1 ) * sizeof (ap_Record) ); + ( ap->data_size + 1 ) * sizeof (struct ap_Record) ); if( !tmp ) return 0; - ap->data = (ap_Record *)tmp; + ap->data = (struct ap_Record *)tmp; p = &(ap->data[ap->data_size]); p->code = code; - if( long_name ) - { - const int len = strlen( long_name ); - p->parsed_name = (char *)malloc( len + 2 + 1 ); - if( !p->parsed_name ) return 0; - p->parsed_name[0] = p->parsed_name[1] = '-'; - strncpy( p->parsed_name + 2, long_name, len + 1 ); - } - else if( code > 0 && code < 256 ) - { - p->parsed_name = (char *)malloc( 2 + 1 ); - if( !p->parsed_name ) return 0; - p->parsed_name[0] = '-'; p->parsed_name[1] = code; p->parsed_name[2] = 0; - } - else p->parsed_name = 0; - if( argument ) - { - const int len = strlen( argument ); - p->argument = (char *)malloc( len + 1 ); - if( !p->argument ) { free( p->parsed_name ); return 0; } - strncpy( p->argument, argument, len + 1 ); - } - else p->argument = 0; + p->argument = 0; + tmp = ap_resize_buffer( p->argument, len + 1 ); + if( !tmp ) return 0; + p->argument = (char *)tmp; + strncpy( p->argument, argument, len + 1 ); ++ap->data_size; return 1; } -static char add_error( Arg_parser * const ap, const char * const msg ) +static char add_error( struct Arg_parser * const ap, const char * const msg ) { const int len = strlen( msg ); void * tmp = ap_resize_buffer( ap->error, ap->error_size + len + 1 ); @@ -83,20 +65,19 @@ static char add_error( Arg_parser * const ap, const char * const msg ) } -static void free_data( Arg_parser * const 
ap ) +static void free_data( struct Arg_parser * const ap ) { int i; - for( i = 0; i < ap->data_size; ++i ) - { free( ap->data[i].argument ); free( ap->data[i].parsed_name ); } + for( i = 0; i < ap->data_size; ++i ) free( ap->data[i].argument ); if( ap->data ) { free( ap->data ); ap->data = 0; } ap->data_size = 0; } -/* Return 0 only if out of memory. */ -static char parse_long_option( Arg_parser * const ap, +static char parse_long_option( struct Arg_parser * const ap, const char * const opt, const char * const arg, - const ap_Option options[], int * const argindp ) + const struct ap_Option options[], + int * const argindp ) { unsigned len; int index = -1, i; @@ -106,15 +87,14 @@ static char parse_long_option( Arg_parser * const ap, /* Test all long options for either exact match or abbreviated matches. */ for( i = 0; options[i].code != 0; ++i ) - if( options[i].long_name && - strncmp( options[i].long_name, &opt[2], len ) == 0 ) + if( options[i].name && strncmp( options[i].name, &opt[2], len ) == 0 ) { - if( strlen( options[i].long_name ) == len ) /* Exact match found */ + if( strlen( options[i].name ) == len ) /* Exact match found */ { index = i; exact = 1; break; } else if( index < 0 ) index = i; /* First nonexact match found */ else if( options[index].code != options[i].code || options[index].has_arg != options[i].has_arg ) - ambig = 1; /* Second or later nonexact match found */ + ambig = 1; /* Second or later nonexact match found */ } if( ambig && !exact ) @@ -137,55 +117,52 @@ static char parse_long_option( Arg_parser * const ap, { if( options[index].has_arg == ap_no ) { - add_error( ap, "option '--" ); add_error( ap, options[index].long_name ); + add_error( ap, "option '--" ); add_error( ap, options[index].name ); add_error( ap, "' doesn't allow an argument" ); return 1; } if( options[index].has_arg == ap_yes && !opt[len+3] ) { - add_error( ap, "option '--" ); add_error( ap, options[index].long_name ); + add_error( ap, "option '--" ); add_error( ap, 
options[index].name ); add_error( ap, "' requires an argument" ); return 1; } - return push_back_record( ap, options[index].code, options[index].long_name, - &opt[len+3] ); /* argument may be empty */ + return push_back_record( ap, options[index].code, &opt[len+3] ); } - if( options[index].has_arg == ap_yes || options[index].has_arg == ap_yme ) + if( options[index].has_arg == ap_yes ) { - if( !arg || ( options[index].has_arg == ap_yes && !arg[0] ) ) + if( !arg || !arg[0] ) { - add_error( ap, "option '--" ); add_error( ap, options[index].long_name ); + add_error( ap, "option '--" ); add_error( ap, options[index].name ); add_error( ap, "' requires an argument" ); return 1; } ++*argindp; - return push_back_record( ap, options[index].code, options[index].long_name, - arg ); /* argument may be empty */ + return push_back_record( ap, options[index].code, arg ); } - return push_back_record( ap, options[index].code, - options[index].long_name, 0 ); + return push_back_record( ap, options[index].code, "" ); } -/* Return 0 only if out of memory. 
*/ -static char parse_short_option( Arg_parser * const ap, +static char parse_short_option( struct Arg_parser * const ap, const char * const opt, const char * const arg, - const ap_Option options[], int * const argindp ) + const struct ap_Option options[], + int * const argindp ) { int cind = 1; /* character index in opt */ while( cind > 0 ) { int index = -1, i; - const unsigned char c = opt[cind]; + const unsigned char code = opt[cind]; char code_str[2]; - code_str[0] = c; code_str[1] = 0; + code_str[0] = code; code_str[1] = 0; - if( c != 0 ) + if( code != 0 ) for( i = 0; options[i].code; ++i ) - if( c == options[i].code ) + if( code == options[i].code ) { index = i; break; } if( index < 0 ) @@ -199,34 +176,34 @@ static char parse_short_option( Arg_parser * const ap, if( options[index].has_arg != ap_no && cind > 0 && opt[cind] ) { - if( !push_back_record( ap, c, 0, &opt[cind] ) ) return 0; + if( !push_back_record( ap, code, &opt[cind] ) ) return 0; ++*argindp; cind = 0; } - else if( options[index].has_arg == ap_yes || options[index].has_arg == ap_yme ) + else if( options[index].has_arg == ap_yes ) { - if( !arg || ( options[index].has_arg == ap_yes && !arg[0] ) ) + if( !arg || !arg[0] ) { add_error( ap, "option requires an argument -- '" ); add_error( ap, code_str ); add_error( ap, "'" ); return 1; } - ++*argindp; cind = 0; /* argument may be empty */ - if( !push_back_record( ap, c, 0, arg ) ) return 0; + ++*argindp; cind = 0; + if( !push_back_record( ap, code, arg ) ) return 0; } - else if( !push_back_record( ap, c, 0, 0 ) ) return 0; + else if( !push_back_record( ap, code, "" ) ) return 0; } return 1; } -char ap_init( Arg_parser * const ap, +char ap_init( struct Arg_parser * const ap, const int argc, const char * const argv[], - const ap_Option options[], const char in_order ) + const struct ap_Option options[], const char in_order ) { const char ** non_options = 0; /* skipped non-options */ int non_options_size = 0; /* number of skipped non-options */ int argind 
= 1; /* index in argv */ - char done = 0; /* false until success */ + int i; ap->data = 0; ap->error = 0; @@ -246,41 +223,38 @@ char ap_init( Arg_parser * const ap, if( ch2 == '-' ) { if( !argv[argind][2] ) { ++argind; break; } /* we found "--" */ - else if( !parse_long_option( ap, opt, arg, options, &argind ) ) goto out; + else if( !parse_long_option( ap, opt, arg, options, &argind ) ) return 0; } - else if( !parse_short_option( ap, opt, arg, options, &argind ) ) goto out; + else if( !parse_short_option( ap, opt, arg, options, &argind ) ) return 0; if( ap->error ) break; } else { - if( in_order ) - { if( !push_back_record( ap, 0, 0, argv[argind++] ) ) goto out; } - else + if( !in_order ) { void * tmp = ap_resize_buffer( non_options, ( non_options_size + 1 ) * sizeof *non_options ); - if( !tmp ) goto out; + if( !tmp ) return 0; non_options = (const char **)tmp; non_options[non_options_size++] = argv[argind++]; } + else if( !push_back_record( ap, 0, argv[argind++] ) ) return 0; } } if( ap->error ) free_data( ap ); else { - int i; for( i = 0; i < non_options_size; ++i ) - if( !push_back_record( ap, 0, 0, non_options[i] ) ) goto out; + if( !push_back_record( ap, 0, non_options[i] ) ) return 0; while( argind < argc ) - if( !push_back_record( ap, 0, 0, argv[argind++] ) ) goto out; + if( !push_back_record( ap, 0, argv[argind++] ) ) return 0; } - done = 1; -out: if( non_options ) free( non_options ); - return done; + if( non_options ) free( non_options ); + return 1; } -void ap_free( Arg_parser * const ap ) +void ap_free( struct Arg_parser * const ap ) { free_data( ap ); if( ap->error ) { free( ap->error ); ap->error = 0; } @@ -288,26 +262,23 @@ void ap_free( Arg_parser * const ap ) } -const char * ap_error( const Arg_parser * const ap ) { return ap->error; } +const char * ap_error( const struct Arg_parser * const ap ) + { return ap->error; } -int ap_arguments( const Arg_parser * const ap ) { return ap->data_size; } -int ap_code( const Arg_parser * const ap, const int i ) 
+int ap_arguments( const struct Arg_parser * const ap ) + { return ap->data_size; } + + +int ap_code( const struct Arg_parser * const ap, const int i ) { - if( i < 0 || i >= ap_arguments( ap ) ) return 0; - return ap->data[i].code; + if( i >= 0 && i < ap_arguments( ap ) ) return ap->data[i].code; + else return 0; } -const char * ap_parsed_name( const Arg_parser * const ap, const int i ) +const char * ap_argument( const struct Arg_parser * const ap, const int i ) { - if( i < 0 || i >= ap_arguments( ap ) || !ap->data[i].parsed_name ) return ""; - return ap->data[i].parsed_name; - } - - -const char * ap_argument( const Arg_parser * const ap, const int i ) - { - if( i < 0 || i >= ap_arguments( ap ) || !ap->data[i].argument ) return ""; - return ap->data[i].argument; + if( i >= 0 && i < ap_arguments( ap ) ) return ap->data[i].argument; + else return ""; } diff --git a/carg_parser.h b/carg_parser.h index 28eabee..e918942 100644 --- a/carg_parser.h +++ b/carg_parser.h @@ -1,101 +1,92 @@ -/* Arg_parser - POSIX/GNU command-line argument parser. (C version) - Copyright (C) 2006-2025 Antonio Diaz Diaz. +/* Arg_parser - POSIX/GNU command line argument parser. (C version) + Copyright (C) 2006-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software. Redistribution and use in source and + binary forms, with or without modification, are permitted provided + that the following conditions are met: - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. - 2. 
Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. */ -/* Arg_parser reads the arguments in 'argv' and creates a number of - option codes, option arguments, and non-option arguments. +/* Arg_parser reads the arguments in 'argv' and creates a number of + option codes, option arguments and non-option arguments. - In case of error, 'ap_error' returns a non-null pointer to an error - message. + In case of error, 'ap_error' returns a non-null pointer to an error + message. - 'options' is an array of 'struct ap_Option' terminated by an element - containing a code which is zero. A null long_name means a short-only - option. A code value outside the unsigned char range means a long-only - option. + 'options' is an array of 'struct ap_Option' terminated by an element + containing a code which is zero. A null name means a short-only + option. A code value outside the unsigned char range means a + long-only option. - Arg_parser normally makes it appear as if all the option arguments - were specified before all the non-option arguments for the purposes - of parsing, even if the user of your program intermixed option and - non-option arguments. If you want the arguments in the exact order - the user typed them, call 'ap_init' with 'in_order' = true. 
+ Arg_parser normally makes it appear as if all the option arguments + were specified before all the non-option arguments for the purposes + of parsing, even if the user of your program intermixed option and + non-option arguments. If you want the arguments in the exact order + the user typed them, call 'ap_init' with 'in_order' = true. - The argument '--' terminates all options; any following arguments are - treated as non-option arguments, even if they begin with a hyphen. + The argument '--' terminates all options; any following arguments are + treated as non-option arguments, even if they begin with a hyphen. - The syntax of options with an optional argument is - '-' (without whitespace), or - '--='. - - The syntax of options with an empty argument is '- ""', - '-- ""', or '--=""'. + The syntax for optional option arguments is '-' + (without whitespace), or '--='. */ #ifdef __cplusplus extern "C" { #endif -/* ap_yme = yes but maybe empty */ -typedef enum ap_Has_arg { ap_no, ap_yes, ap_maybe, ap_yme } ap_Has_arg; +enum ap_Has_arg { ap_no, ap_yes, ap_maybe }; -typedef struct ap_Option +struct ap_Option { int code; /* Short option letter or code ( code != 0 ) */ - const char * long_name; /* Long option name (maybe null) */ - ap_Has_arg has_arg; - } ap_Option; + const char * name; /* Long option name (maybe null) */ + enum ap_Has_arg has_arg; + }; -typedef struct ap_Record +struct ap_Record { int code; - char * parsed_name; char * argument; - } ap_Record; + }; -typedef struct Arg_parser +struct Arg_parser { - ap_Record * data; + struct ap_Record * data; char * error; int data_size; int error_size; - } Arg_parser; + }; -char ap_init( Arg_parser * const ap, +char ap_init( struct Arg_parser * const ap, const int argc, const char * const argv[], - const ap_Option options[], const char in_order ); + const struct ap_Option options[], const char in_order ); -void ap_free( Arg_parser * const ap ); +void ap_free( struct Arg_parser * const ap ); -const char * ap_error( const 
Arg_parser * const ap ); +const char * ap_error( const struct Arg_parser * const ap ); -/* The number of arguments parsed. May be different from argc. */ -int ap_arguments( const Arg_parser * const ap ); + /* The number of arguments parsed (may be different from argc) */ +int ap_arguments( const struct Arg_parser * const ap ); -/* If ap_code( i ) is 0, ap_argument( i ) is a non-option. - Else ap_argument( i ) is the option's argument (or empty). */ -int ap_code( const Arg_parser * const ap, const int i ); + /* If ap_code( i ) is 0, ap_argument( i ) is a non-option. + Else ap_argument( i ) is the option's argument (or empty). */ +int ap_code( const struct Arg_parser * const ap, const int i ); -/* Full name of the option parsed (short or long). */ -const char * ap_parsed_name( const Arg_parser * const ap, const int i ); - -const char * ap_argument( const Arg_parser * const ap, const int i ); +const char * ap_argument( const struct Arg_parser * const ap, const int i ); #ifdef __cplusplus } diff --git a/cbuffer.c b/cbuffer.c index 23d95e1..89693ab 100644 --- a/cbuffer.c +++ b/cbuffer.c @@ -1,31 +1,42 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. 
+ This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see . - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. */ -typedef struct Circular_buffer +struct Circular_buffer { uint8_t * buffer; unsigned buffer_size; /* capacity == buffer_size - 1 */ unsigned get; /* buffer is empty when get == put */ unsigned put; - } Circular_buffer; + }; -static inline bool Cb_init( Circular_buffer * const cb, +static inline void Cb_reset( struct Circular_buffer * const cb ) + { cb->get = 0; cb->put = 0; } + +static inline bool Cb_init( struct Circular_buffer * const cb, const unsigned buf_size ) { cb->buffer_size = buf_size + 1; @@ -33,39 +44,35 @@ static inline bool Cb_init( Circular_buffer * const cb, cb->put = 0; cb->buffer = ( cb->buffer_size > 1 ) ? 
(uint8_t *)malloc( cb->buffer_size ) : 0; - return cb->buffer != 0; + return ( cb->buffer != 0 ); } -static inline void Cb_free( Circular_buffer * const cb ) +static inline void Cb_free( struct Circular_buffer * const cb ) { free( cb->buffer ); cb->buffer = 0; } -static inline void Cb_reset( Circular_buffer * const cb ) - { cb->get = 0; cb->put = 0; } - -static inline unsigned Cb_empty( const Circular_buffer * const cb ) - { return cb->get == cb->put; } - -static inline unsigned Cb_used_bytes( const Circular_buffer * const cb ) +static inline unsigned Cb_used_bytes( const struct Circular_buffer * const cb ) { return ( (cb->get <= cb->put) ? 0 : cb->buffer_size ) + cb->put - cb->get; } -static inline unsigned Cb_free_bytes( const Circular_buffer * const cb ) +static inline unsigned Cb_free_bytes( const struct Circular_buffer * const cb ) { return ( (cb->get <= cb->put) ? cb->buffer_size : 0 ) - cb->put + cb->get - 1; } -static inline uint8_t Cb_get_byte( Circular_buffer * const cb ) +static inline uint8_t Cb_get_byte( struct Circular_buffer * const cb ) { const uint8_t b = cb->buffer[cb->get]; if( ++cb->get >= cb->buffer_size ) cb->get = 0; return b; } -static inline void Cb_put_byte( Circular_buffer * const cb, const uint8_t b ) +static inline void Cb_put_byte( struct Circular_buffer * const cb, + const uint8_t b ) { cb->buffer[cb->put] = b; if( ++cb->put >= cb->buffer_size ) cb->put = 0; } -static bool Cb_unread_data( Circular_buffer * const cb, const unsigned size ) +static bool Cb_unread_data( struct Circular_buffer * const cb, + const unsigned size ) { if( size > Cb_free_bytes( cb ) ) return false; if( cb->get >= size ) cb->get -= size; @@ -74,11 +81,10 @@ static bool Cb_unread_data( Circular_buffer * const cb, const unsigned size ) } -/* Copy up to 'out_size' bytes to 'out_buffer' and update 'get'. - If 'out_buffer' is null, the bytes are discarded. - Return the number of bytes copied or discarded. 
+/* Copies up to 'out_size' bytes to 'out_buffer' and updates 'get'. + Returns the number of bytes copied. */ -static unsigned Cb_read_data( Circular_buffer * const cb, +static unsigned Cb_read_data( struct Circular_buffer * const cb, uint8_t * const out_buffer, const unsigned out_size ) { @@ -89,7 +95,7 @@ static unsigned Cb_read_data( Circular_buffer * const cb, size = min( cb->buffer_size - cb->get, out_size ); if( size > 0 ) { - if( out_buffer ) memcpy( out_buffer, cb->buffer + cb->get, size ); + memcpy( out_buffer, cb->buffer + cb->get, size ); cb->get += size; if( cb->get >= cb->buffer_size ) cb->get = 0; } @@ -99,7 +105,7 @@ static unsigned Cb_read_data( Circular_buffer * const cb, const unsigned size2 = min( cb->put - cb->get, out_size - size ); if( size2 > 0 ) { - if( out_buffer ) memcpy( out_buffer + size, cb->buffer + cb->get, size2 ); + memcpy( out_buffer + size, cb->buffer + cb->get, size2 ); cb->get += size2; size += size2; } @@ -108,10 +114,10 @@ static unsigned Cb_read_data( Circular_buffer * const cb, } -/* Copy up to 'in_size' bytes from 'in_buffer' and update 'put'. - Return the number of bytes copied. +/* Copies up to 'in_size' bytes from 'in_buffer' and updates 'put'. + Returns the number of bytes copied. */ -static unsigned Cb_write_data( Circular_buffer * const cb, +static unsigned Cb_write_data( struct Circular_buffer * const cb, const uint8_t * const in_buffer, const unsigned in_size ) { diff --git a/configure b/configure index 90ab72d..2182cfa 100755 --- a/configure +++ b/configure @@ -1,21 +1,19 @@ #! /bin/sh # configure script for Lzlib - Compression library for the lzip format -# Copyright (C) 2009-2025 Antonio Diaz Diaz. +# Copyright (C) 2009-2016 Antonio Diaz Diaz. # # This configure script is free software: you have unlimited permission -# to copy, distribute, and modify it. +# to copy, distribute and modify it. 
pkgname=lzlib -pkgversion=1.15 +pkgversion=1.8 soversion=1 -libname=lz -libname_static=lib${libname}.a -libname_shared= progname=minilzip progname_static=${progname} progname_shared= progname_lzip=${progname} disable_ldconfig= +libname=lz srctrigger=doc/${pkgname}.texi # clear some things potentially inherited from environment. @@ -31,15 +29,16 @@ infodir='$(datarootdir)/info' libdir='$(exec_prefix)/lib' mandir='$(datarootdir)/man' CC=gcc -AR=ar CPPFLAGS= CFLAGS='-Wall -W -O2' LDFLAGS= -ARFLAGS=-rcs -MAKEINFO=makeinfo # checking whether we are using GNU C. -/bin/sh -c "${CC} --version" > /dev/null 2>&1 || { CC=cc ; CFLAGS=-O2 ; } +if /bin/sh -c "${CC} --version" > /dev/null 2>&1 ; then true +else + CC=cc + CFLAGS='-W -O2' +fi # Loop over all args args= @@ -51,26 +50,22 @@ while [ $# != 0 ] ; do shift # Add the argument quoted to args - if [ -z "${args}" ] ; then args="\"${option}\"" - else args="${args} \"${option}\"" ; fi + args="${args} \"${option}\"" # Split out the argument for options that take them case ${option} in - *=*) optarg=`echo "${option}" | sed -e 's,^[^=]*=,,;s,/$,,'` ;; + *=*) optarg=`echo ${option} | sed -e 's,^[^=]*=,,;s,/$,,'` ;; esac # Process the options case ${option} in --help | -h) - echo "Usage: $0 [OPTION]... [VAR=VALUE]..." + echo "Usage: configure [options]" echo - echo "To assign makefile variables (e.g., CC, CFLAGS...), specify them as" - echo "arguments to configure in the form VAR=VALUE." - echo - echo "Options and variables: [defaults in brackets]" + echo "Options: [defaults in brackets]" echo " -h, --help display this help and exit" echo " -V, --version output version information and exit" - echo " --srcdir=DIR find the source code in DIR [. or ..]" + echo " --srcdir=DIR find the sources in DIR [. 
or ..]" echo " --prefix=DIR install into DIR [${prefix}]" echo " --exec-prefix=DIR base directory for arch-dependent files [${exec_prefix}]" echo " --bindir=DIR user executables directory [${bindir}]" @@ -84,13 +79,9 @@ while [ $# != 0 ] ; do echo " --enable-shared build also a shared library [disable]" echo " --disable-ldconfig don't run ldconfig after install" echo " CC=COMPILER C compiler to use [${CC}]" - echo " AR=ARCHIVER library archiver to use [${AR}]" - echo " CPPFLAGS=OPTIONS command-line options for the preprocessor [${CPPFLAGS}]" - echo " CFLAGS=OPTIONS command-line options for the C compiler [${CFLAGS}]" - echo " CFLAGS+=OPTIONS append options to the current value of CFLAGS" - echo " LDFLAGS=OPTIONS command-line options for the linker [${LDFLAGS}]" - echo " ARFLAGS=OPTIONS command-line options for the library archiver [${ARFLAGS}]" - echo " MAKEINFO=NAME makeinfo program to use [${MAKEINFO}]" + echo " CPPFLAGS=OPTIONS command line options for the preprocessor [${CPPFLAGS}]" + echo " CFLAGS=OPTIONS command line options for the C compiler [${CFLAGS}]" + echo " LDFLAGS=OPTIONS command line options for the linker [${LDFLAGS}]" echo exit 0 ;; --version | -V) @@ -117,25 +108,16 @@ while [ $# != 0 ] ; do --mandir=*) mandir=${optarg} ;; --no-create) no_create=yes ;; --disable-static) - libname_static= progname_static= - libname_shared=lib${libname}.so.${soversion} - progname_shared=${progname}_shared - progname_lzip=${progname}_shared ;; - --enable-shared) - libname_shared=lib${libname}.so.${soversion} progname_shared=${progname}_shared progname_lzip=${progname}_shared ;; + --enable-shared) progname_shared=${progname}_shared ;; --disable-ldconfig) disable_ldconfig=yes ;; - CC=*) CC=${optarg} ;; - AR=*) AR=${optarg} ;; - CPPFLAGS=*) CPPFLAGS=${optarg} ;; - CFLAGS=*) CFLAGS=${optarg} ;; - CFLAGS+=*) CFLAGS="${CFLAGS} ${optarg}" ;; - LDFLAGS=*) LDFLAGS=${optarg} ;; - ARFLAGS=*) ARFLAGS=${optarg} ;; - MAKEINFO=*) MAKEINFO=${optarg} ;; + CC=*) CC=${optarg} ;; + 
CPPFLAGS=*) CPPFLAGS=${optarg} ;; + CFLAGS=*) CFLAGS=${optarg} ;; + LDFLAGS=*) LDFLAGS=${optarg} ;; --*) echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;; @@ -146,7 +128,7 @@ while [ $# != 0 ] ; do exit 1 ;; esac - # Check whether the option took a separate argument + # Check if the option took a separate argument if [ "${arg2}" = yes ] ; then if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift else echo "configure: Missing argument to '${option}'" 1>&2 @@ -155,19 +137,19 @@ while [ $# != 0 ] ; do fi done -# Find the source code, if location was not specified. +# Find the source files, if location was not specified. srcdirtext= if [ -z "${srcdir}" ] ; then srcdirtext="or . or .." ; srcdir=. if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi if [ ! -r "${srcdir}/${srctrigger}" ] ; then ## the sed command below emulates the dirname command - srcdir=`echo "$0" | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'` + srcdir=`echo $0 | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'` fi fi if [ ! -r "${srcdir}/${srctrigger}" ] ; then - echo "configure: Can't find source code in ${srcdir} ${srcdirtext}" 1>&2 + echo "configure: Can't find sources in ${srcdir} ${srcdirtext}" 1>&2 echo "configure: (At least ${srctrigger} is missing)." 1>&2 exit 1 fi @@ -185,9 +167,9 @@ if [ -z "${no_create}" ] ; then # Run this file to recreate the current configuration. # # This script is free software: you have unlimited permission -# to copy, distribute, and modify it. +# to copy, distribute and modify it. 
-exec /bin/sh "$0" ${args} --no-create +exec /bin/sh $0 ${args} --no-create EOF chmod +x config.status fi @@ -203,32 +185,27 @@ echo "infodir = ${infodir}" echo "libdir = ${libdir}" echo "mandir = ${mandir}" echo "CC = ${CC}" -echo "AR = ${AR}" echo "CPPFLAGS = ${CPPFLAGS}" echo "CFLAGS = ${CFLAGS}" echo "LDFLAGS = ${LDFLAGS}" -echo "ARFLAGS = ${ARFLAGS}" -echo "MAKEINFO = ${MAKEINFO}" rm -f Makefile cat > Makefile << EOF # Makefile for Lzlib - Compression library for the lzip format -# Copyright (C) 2009-2025 Antonio Diaz Diaz. +# Copyright (C) 2009-2016 Antonio Diaz Diaz. # This file was generated automatically by configure. Don't edit. # # This Makefile is free software: you have unlimited permission -# to copy, distribute, and modify it. +# to copy, distribute and modify it. pkgname = ${pkgname} pkgversion = ${pkgversion} soversion = ${soversion} -libname = ${libname} -libname_static = ${libname_static} -libname_shared = ${libname_shared} progname = ${progname} progname_static = ${progname_static} progname_shared = ${progname_shared} progname_lzip = ${progname_lzip} disable_ldconfig = ${disable_ldconfig} +libname = ${libname} VPATH = ${srcdir} prefix = ${prefix} exec_prefix = ${exec_prefix} @@ -239,12 +216,9 @@ infodir = ${infodir} libdir = ${libdir} mandir = ${mandir} CC = ${CC} -AR = ${AR} CPPFLAGS = ${CPPFLAGS} CFLAGS = ${CFLAGS} LDFLAGS = ${LDFLAGS} -ARFLAGS = ${ARFLAGS} -MAKEINFO = ${MAKEINFO} EOF cat "${srcdir}/Makefile.in" >> Makefile diff --git a/decoder.c b/decoder.c index 83a128c..8ef3942 100644 --- a/decoder.c +++ b/decoder.c @@ -1,148 +1,167 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. 
Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see . - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. 
*/ -static int LZd_try_check_trailer( LZ_decoder * const d ) +static bool LZd_verify_trailer( struct LZ_decoder * const d ) { - Lzip_trailer trailer; - if( Rd_available_bytes( d->rdec ) < Lt_size ) - { if( !d->rdec->at_stream_end ) return 0; else return 2; } - d->check_trailer_pending = false; - d->member_finished = true; + File_trailer trailer; + int size = Rd_read_data( d->rdec, trailer, Ft_size ); - if( Rd_read_data( d->rdec, trailer, Lt_size ) == Lt_size && - Lt_get_data_crc( trailer ) == LZd_crc( d ) && - Lt_get_data_size( trailer ) == LZd_data_position( d ) && - Lt_get_member_size( trailer ) == d->rdec->member_position ) return 0; - return 3; + if( size < Ft_size ) + return false; + + return ( Ft_get_data_crc( trailer ) == LZd_crc( d ) && + Ft_get_data_size( trailer ) == LZd_data_position( d ) && + Ft_get_member_size( trailer ) == d->rdec->member_position ); } /* Return value: 0 = OK, 1 = decoder error, 2 = unexpected EOF, 3 = trailer error, 4 = unknown marker found, - 5 = nonzero first LZMA byte found, 6 = library error. */ -static int LZd_decode_member( LZ_decoder * const d ) + 5 = library error. 
*/ +static int LZd_decode_member( struct LZ_decoder * const d ) { - Range_decoder * const rdec = d->rdec; + struct Range_decoder * const rdec = d->rdec; State * const state = &d->state; - unsigned old_mpos = rdec->member_position; +/* unsigned long long old_mpos = d->rdec->member_position; */ if( d->member_finished ) return 0; - const int tmp = Rd_try_reload( rdec ); - if( tmp > 1 ) return 5; - if( !tmp ) { if( !rdec->at_stream_end ) return 0; else return 2; } - if( d->check_trailer_pending ) return LZd_try_check_trailer( d ); + if( !Rd_try_reload( rdec, false ) ) + { if( !rdec->at_stream_end ) return 0; else return 2; } + if( d->verify_trailer_pending ) + { + if( Rd_available_bytes( rdec ) < Ft_size && !rdec->at_stream_end ) + return 0; + d->verify_trailer_pending = false; + d->member_finished = true; + if( LZd_verify_trailer( d ) ) return 0; else return 3; + } while( !Rd_finished( rdec ) ) { - const unsigned mpos = rdec->member_position; - if( mpos - old_mpos > rd_min_available_bytes ) return 6; - old_mpos = mpos; - if( !Rd_enough_available_bytes( rdec ) ) /* check unexpected EOF */ - { if( !rdec->at_stream_end ) return 0; - if( Cb_empty( &rdec->cb ) ) break; } /* decode until EOF */ - if( !LZd_enough_free_bytes( d ) ) return 0; const int pos_state = LZd_data_position( d ) & pos_state_mask; - if( Rd_decode_bit( rdec, &d->bm_match[*state][pos_state] ) == 0 ) /* 1st bit */ +/* const unsigned long long mpos = d->rdec->member_position; + if( mpos - old_mpos > rd_min_available_bytes ) return 5; + old_mpos = mpos; */ + if( !Rd_enough_available_bytes( rdec ) ) /* check unexpected eof */ + { if( !rdec->at_stream_end ) return 0; else break; } + if( !LZd_enough_free_bytes( d ) ) return 0; + if( Rd_decode_bit( rdec, &d->bm_match[*state][pos_state] ) == 0 ) /* 1st bit */ { - /* literal byte */ - Bit_model * const bm = d->bm_literal[get_lit_state(LZd_peek_prev( d ))]; - if( ( *state = St_set_char( *state ) ) < 4 ) - LZd_put_byte( d, Rd_decode_tree8( rdec, bm ) ); - else - 
LZd_put_byte( d, Rd_decode_matched( rdec, bm, LZd_peek( d, d->rep0 ) ) ); - continue; - } - /* match or repeated match */ - int len; - if( Rd_decode_bit( rdec, &d->bm_rep[*state] ) != 0 ) /* 2nd bit */ - { - if( Rd_decode_bit( rdec, &d->bm_rep0[*state] ) == 0 ) /* 3rd bit */ + const uint8_t prev_byte = LZd_peek_prev( d ); + if( St_is_char( *state ) ) { - if( Rd_decode_bit( rdec, &d->bm_len[*state][pos_state] ) == 0 ) /* 4th bit */ - { *state = St_set_shortrep( *state ); - LZd_put_byte( d, LZd_peek( d, d->rep0 ) ); continue; } + *state -= ( *state < 4 ) ? *state : 3; + LZd_put_byte( d, Rd_decode_tree( rdec, + d->bm_literal[get_lit_state(prev_byte)], 8 ) ); } else { - unsigned distance; - if( Rd_decode_bit( rdec, &d->bm_rep1[*state] ) == 0 ) /* 4th bit */ - distance = d->rep1; - else + *state -= ( *state < 10 ) ? 3 : 6; + LZd_put_byte( d, Rd_decode_matched( rdec, + d->bm_literal[get_lit_state(prev_byte)], + LZd_peek( d, d->rep0 ) ) ); + } + } + else /* match or repeated match */ + { + int len; + if( Rd_decode_bit( rdec, &d->bm_rep[*state] ) != 0 ) /* 2nd bit */ + { + if( Rd_decode_bit( rdec, &d->bm_rep0[*state] ) != 0 ) /* 3rd bit */ { - if( Rd_decode_bit( rdec, &d->bm_rep2[*state] ) == 0 ) /* 5th bit */ - distance = d->rep2; + unsigned distance; + if( Rd_decode_bit( rdec, &d->bm_rep1[*state] ) == 0 ) /* 4th bit */ + distance = d->rep1; else - { distance = d->rep3; d->rep3 = d->rep2; } - d->rep2 = d->rep1; + { + if( Rd_decode_bit( rdec, &d->bm_rep2[*state] ) == 0 ) /* 5th bit */ + distance = d->rep2; + else + { distance = d->rep3; d->rep3 = d->rep2; } + d->rep2 = d->rep1; + } + d->rep1 = d->rep0; + d->rep0 = distance; } - d->rep1 = d->rep0; - d->rep0 = distance; - } - *state = St_set_rep( *state ); - len = Rd_decode_len( rdec, &d->rep_len_model, pos_state ); - } - else /* match */ - { - len = Rd_decode_len( rdec, &d->match_len_model, pos_state ); - unsigned distance = Rd_decode_tree6( rdec, d->bm_dis_slot[get_len_state(len)] ); - if( distance >= start_dis_model ) - { 
- const unsigned dis_slot = distance; - const int direct_bits = ( dis_slot >> 1 ) - 1; - distance = ( 2 | ( dis_slot & 1 ) ) << direct_bits; - if( dis_slot < end_dis_model ) - distance += Rd_decode_tree_reversed( rdec, - d->bm_dis + ( distance - dis_slot ), direct_bits ); else { - distance += - Rd_decode( rdec, direct_bits - dis_align_bits ) << dis_align_bits; - distance += Rd_decode_tree_reversed4( rdec, d->bm_align ); - if( distance == 0xFFFFFFFFU ) /* marker found */ + if( Rd_decode_bit( rdec, &d->bm_len[*state][pos_state] ) == 0 ) /* 4th bit */ + { *state = St_set_short_rep( *state ); + LZd_put_byte( d, LZd_peek( d, d->rep0 ) ); continue; } + } + *state = St_set_rep( *state ); + len = min_match_len + Rd_decode_len( rdec, &d->rep_len_model, pos_state ); + } + else /* match */ + { + const unsigned rep0_saved = d->rep0; + int dis_slot; + len = min_match_len + Rd_decode_len( rdec, &d->match_len_model, pos_state ); + dis_slot = Rd_decode_tree6( rdec, d->bm_dis_slot[get_len_state(len)] ); + if( dis_slot < start_dis_model ) d->rep0 = dis_slot; + else + { + const int direct_bits = ( dis_slot >> 1 ) - 1; + d->rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits; + if( dis_slot < end_dis_model ) + d->rep0 += Rd_decode_tree_reversed( rdec, + d->bm_dis + d->rep0 - dis_slot - 1, direct_bits ); + else { - Rd_normalize( rdec ); - const unsigned mpos = rdec->member_position; - if( mpos - old_mpos > rd_min_available_bytes ) return 6; - old_mpos = mpos; - if( len == min_match_len ) /* End Of Stream marker */ + d->rep0 += Rd_decode( rdec, direct_bits - dis_align_bits ) << dis_align_bits; + d->rep0 += Rd_decode_tree_reversed4( rdec, d->bm_align ); + if( d->rep0 == 0xFFFFFFFFU ) /* marker found */ { - d->check_trailer_pending = true; - return LZd_try_check_trailer( d ); + d->rep0 = rep0_saved; + Rd_normalize( rdec ); + if( len == min_match_len ) /* End Of Stream marker */ + { + if( Rd_available_bytes( rdec ) < Ft_size && !rdec->at_stream_end ) + { d->verify_trailer_pending = true; return 
0; } + d->member_finished = true; + if( LZd_verify_trailer( d ) ) return 0; else return 3; + } + if( len == min_match_len + 1 ) /* Sync Flush marker */ + { + if( Rd_try_reload( rdec, true ) ) { /*old_mpos += 5;*/ continue; } + else { if( !rdec->at_stream_end ) return 0; else break; } + } + return 4; } - if( len == min_match_len + 1 ) /* Sync Flush marker */ - { - rdec->reload_pending = true; - const int tmp = Rd_try_reload( rdec ); - if( tmp > 1 ) return 5; - if( tmp ) continue; - if( !rdec->at_stream_end ) return 0; else break; - } - return 4; } } + d->rep3 = d->rep2; d->rep2 = d->rep1; d->rep1 = rep0_saved; + *state = St_set_match( *state ); + if( d->rep0 >= d->dictionary_size || + ( d->rep0 >= d->cb.put && !d->pos_wrapped ) ) + return 1; } - d->rep3 = d->rep2; d->rep2 = d->rep1; d->rep1 = d->rep0; d->rep0 = distance; - *state = St_set_match( *state ); - if( d->rep0 >= d->dictionary_size || - ( d->rep0 >= d->cb.put && !d->pos_wrapped ) ) return 1; + LZd_copy_block( d, d->rep0, len ); } - LZd_copy_block( d, d->rep0, len ); } return 2; } diff --git a/decoder.h b/decoder.h index f880849..a14156e 100644 --- a/decoder.h +++ b/decoder.h @@ -1,35 +1,43 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. 
+ This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see <http://www.gnu.org/licenses/>. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License.
*/ -enum { rd_min_available_bytes = 10 }; +enum { rd_min_available_bytes = 8 }; -typedef struct Range_decoder +struct Range_decoder { - Circular_buffer cb; /* input buffer */ + struct Circular_buffer cb; /* input buffer */ unsigned long long member_position; uint32_t code; uint32_t range; bool at_stream_end; bool reload_pending; - } Range_decoder; + }; -static inline bool Rd_init( Range_decoder * const rdec ) +static inline bool Rd_init( struct Range_decoder * const rdec ) { if( !Cb_init( &rdec->cb, 65536 + rd_min_available_bytes ) ) return false; rdec->member_position = 0; @@ -40,25 +48,25 @@ static inline bool Rd_init( Range_decoder * const rdec ) return true; } -static inline void Rd_free( Range_decoder * const rdec ) +static inline void Rd_free( struct Range_decoder * const rdec ) { Cb_free( &rdec->cb ); } -static inline bool Rd_finished( const Range_decoder * const rdec ) - { return rdec->at_stream_end && Cb_empty( &rdec->cb ); } +static inline bool Rd_finished( const struct Range_decoder * const rdec ) + { return rdec->at_stream_end && !Cb_used_bytes( &rdec->cb ); } -static inline void Rd_finish( Range_decoder * const rdec ) +static inline void Rd_finish( struct Range_decoder * const rdec ) { rdec->at_stream_end = true; } -static inline bool Rd_enough_available_bytes( const Range_decoder * const rdec ) - { return Cb_used_bytes( &rdec->cb ) >= rd_min_available_bytes; } +static inline bool Rd_enough_available_bytes( const struct Range_decoder * const rdec ) + { return ( Cb_used_bytes( &rdec->cb ) >= rd_min_available_bytes ); } -static inline unsigned Rd_available_bytes( const Range_decoder * const rdec ) +static inline unsigned Rd_available_bytes( const struct Range_decoder * const rdec ) { return Cb_used_bytes( &rdec->cb ); } -static inline unsigned Rd_free_bytes( const Range_decoder * const rdec ) - { return rdec->at_stream_end ? 
0 : Cb_free_bytes( &rdec->cb ); } +static inline unsigned Rd_free_bytes( const struct Range_decoder * const rdec ) + { if( rdec->at_stream_end ) return 0; return Cb_free_bytes( &rdec->cb ); } -static inline unsigned long long Rd_purge( Range_decoder * const rdec ) +static inline unsigned long long Rd_purge( struct Range_decoder * const rdec ) { const unsigned long long size = rdec->member_position + Cb_used_bytes( &rdec->cb ); @@ -67,32 +75,32 @@ static inline unsigned long long Rd_purge( Range_decoder * const rdec ) return size; } -static inline void Rd_reset( Range_decoder * const rdec ) +static inline void Rd_reset( struct Range_decoder * const rdec ) { Cb_reset( &rdec->cb ); rdec->member_position = 0; rdec->at_stream_end = false; } -/* Seek for a member header and update 'get'. Set '*skippedp' to the number - of bytes skipped. Return true if a valid header is found. +/* Seeks a member header and updates 'get'. '*skippedp' is set to the + number of bytes skipped. Returns true if it finds a valid header. 
*/ -static bool Rd_find_header( Range_decoder * const rdec, - unsigned * const skippedp ) +static bool Rd_find_header( struct Range_decoder * const rdec, + int * const skippedp ) { *skippedp = 0; while( rdec->cb.get != rdec->cb.put ) { - if( rdec->cb.buffer[rdec->cb.get] == lzip_magic[0] ) + if( rdec->cb.buffer[rdec->cb.get] == magic_string[0] ) { unsigned get = rdec->cb.get; int i; - Lzip_header header; - for( i = 0; i < Lh_size; ++i ) + File_header header; + for( i = 0; i < Fh_size; ++i ) { if( get == rdec->cb.put ) return false; /* not enough data */ header[i] = rdec->cb.buffer[get]; if( ++get >= rdec->cb.buffer_size ) get = 0; } - if( Lh_check( header ) ) return true; + if( Fh_verify( header ) ) return true; } if( ++rdec->cb.get >= rdec->cb.buffer_size ) rdec->cb.get = 0; ++*skippedp; @@ -101,22 +109,20 @@ static bool Rd_find_header( Range_decoder * const rdec, } -static inline int Rd_write_data( Range_decoder * const rdec, +static inline int Rd_write_data( struct Range_decoder * const rdec, const uint8_t * const inbuf, const int size ) { if( rdec->at_stream_end || size <= 0 ) return 0; return Cb_write_data( &rdec->cb, inbuf, size ); } -static inline uint8_t Rd_get_byte( Range_decoder * const rdec ) +static inline uint8_t Rd_get_byte( struct Range_decoder * const rdec ) { - /* 0xFF avoids decoder error if member is truncated at EOS marker */ - if( Rd_finished( rdec ) ) return 0xFF; ++rdec->member_position; return Cb_get_byte( &rdec->cb ); } -static inline int Rd_read_data( Range_decoder * const rdec, +static inline int Rd_read_data( struct Range_decoder * const rdec, uint8_t * const outbuf, const int size ) { const int sz = Cb_read_data( &rdec->cb, outbuf, size ); @@ -124,7 +130,7 @@ static inline int Rd_read_data( Range_decoder * const rdec, return sz; } -static inline bool Rd_unread_data( Range_decoder * const rdec, +static inline bool Rd_unread_data( struct Range_decoder * const rdec, const unsigned size ) { if( size > rdec->member_position || 
!Cb_unread_data( &rdec->cb, size ) ) @@ -133,211 +139,172 @@ static inline bool Rd_unread_data( Range_decoder * const rdec, return true; } -static int Rd_try_reload( Range_decoder * const rdec ) +static bool Rd_try_reload( struct Range_decoder * const rdec, const bool force ) { + if( force ) rdec->reload_pending = true; if( rdec->reload_pending && Rd_available_bytes( rdec ) >= 5 ) { + int i; rdec->reload_pending = false; rdec->code = 0; - rdec->range = 0xFFFFFFFFU; - /* check first byte of the LZMA stream without reading it */ - if( rdec->cb.buffer[rdec->cb.get] != 0 ) return 2; - Rd_get_byte( rdec ); /* discard first byte of the LZMA stream */ - int i; for( i = 0; i < 4; ++i ) + for( i = 0; i < 5; ++i ) rdec->code = (rdec->code << 8) | Rd_get_byte( rdec ); + rdec->range = 0xFFFFFFFFU; + rdec->code &= rdec->range; /* make sure that first byte is discarded */ } return !rdec->reload_pending; } -static inline void Rd_normalize( Range_decoder * const rdec ) +static inline void Rd_normalize( struct Range_decoder * const rdec ) { if( rdec->range <= 0x00FFFFFFU ) - { rdec->range <<= 8; rdec->code = (rdec->code << 8) | Rd_get_byte( rdec ); } + { + rdec->range <<= 8; + rdec->code = (rdec->code << 8) | Rd_get_byte( rdec ); + } } -static inline unsigned Rd_decode( Range_decoder * const rdec, - const int num_bits ) +static inline int Rd_decode( struct Range_decoder * const rdec, + const int num_bits ) { - unsigned symbol = 0; + int symbol = 0; int i; for( i = num_bits; i > 0; --i ) { + uint32_t mask; Rd_normalize( rdec ); rdec->range >>= 1; /* symbol <<= 1; */ /* if( rdec->code >= rdec->range ) { rdec->code -= rdec->range; symbol |= 1; } */ - const bool bit = rdec->code >= rdec->range; - symbol <<= 1; symbol += bit; - rdec->code -= rdec->range & ( 0U - bit ); + mask = 0U - (rdec->code < rdec->range); + rdec->code -= rdec->range; + rdec->code += rdec->range & mask; + symbol = (symbol << 1) + (mask + 1); } return symbol; } -static inline unsigned Rd_decode_bit( Range_decoder * 
const rdec, - Bit_model * const probability ) +static inline int Rd_decode_bit( struct Range_decoder * const rdec, + Bit_model * const probability ) { + uint32_t bound; Rd_normalize( rdec ); - const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability; + bound = ( rdec->range >> bit_model_total_bits ) * *probability; if( rdec->code < bound ) { rdec->range = bound; - *probability += ( bit_model_total - *probability ) >> bit_model_move_bits; + *probability += (bit_model_total - *probability) >> bit_model_move_bits; return 0; } else { - rdec->code -= bound; rdec->range -= bound; + rdec->code -= bound; *probability -= *probability >> bit_model_move_bits; return 1; } } -static inline void Rd_decode_symbol_bit( Range_decoder * const rdec, - Bit_model * const probability, unsigned * symbol ) +static inline int Rd_decode_tree( struct Range_decoder * const rdec, + Bit_model bm[], const int num_bits ) { - Rd_normalize( rdec ); - *symbol <<= 1; - const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability; - if( rdec->code < bound ) - { - rdec->range = bound; - *probability += ( bit_model_total - *probability ) >> bit_model_move_bits; - } - else - { - rdec->code -= bound; - rdec->range -= bound; - *probability -= *probability >> bit_model_move_bits; - *symbol |= 1; - } + int symbol = 1; + int i; + for( i = num_bits; i > 0; --i ) + symbol = ( symbol << 1 ) | Rd_decode_bit( rdec, &bm[symbol] ); + return symbol - (1 << num_bits); } -static inline void Rd_decode_symbol_bit_reversed( Range_decoder * const rdec, - Bit_model * const probability, unsigned * model, - unsigned * symbol, const int i ) +static inline int Rd_decode_tree6( struct Range_decoder * const rdec, + Bit_model bm[] ) { - Rd_normalize( rdec ); - *model <<= 1; - const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability; - if( rdec->code < bound ) - { - rdec->range = bound; - *probability += ( bit_model_total - *probability ) >> bit_model_move_bits; - } - else 
- { - rdec->code -= bound; - rdec->range -= bound; - *probability -= *probability >> bit_model_move_bits; - *model |= 1; - *symbol |= 1 << i; - } - } - -static inline unsigned Rd_decode_tree6( Range_decoder * const rdec, - Bit_model bm[] ) - { - unsigned symbol = 1; - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + int symbol = 1; + symbol = ( symbol << 1 ) | Rd_decode_bit( rdec, &bm[symbol] ); + symbol = ( symbol << 1 ) | Rd_decode_bit( rdec, &bm[symbol] ); + symbol = ( symbol << 1 ) | Rd_decode_bit( rdec, &bm[symbol] ); + symbol = ( symbol << 1 ) | Rd_decode_bit( rdec, &bm[symbol] ); + symbol = ( symbol << 1 ) | Rd_decode_bit( rdec, &bm[symbol] ); + symbol = ( symbol << 1 ) | Rd_decode_bit( rdec, &bm[symbol] ); return symbol & 0x3F; } -static inline unsigned Rd_decode_tree8( Range_decoder * const rdec, - Bit_model bm[] ) +static inline int Rd_decode_tree_reversed( struct Range_decoder * const rdec, + Bit_model bm[], const int num_bits ) { - unsigned symbol = 1; - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + int model = 1; + int symbol = 0; + int i; + for( i = 0; i < num_bits; ++i ) + { + const bool bit = Rd_decode_bit( rdec, &bm[model] ); + model <<= 1; + if( bit ) { ++model; symbol |= (1 << i); } + } + return symbol; + } + +static inline int Rd_decode_tree_reversed4( struct Range_decoder * const rdec, + 
Bit_model bm[] ) + { + int model = 1; + int symbol = Rd_decode_bit( rdec, &bm[model] ); + int bit; + model = (model << 1) + symbol; + bit = Rd_decode_bit( rdec, &bm[model] ); + model = (model << 1) + bit; symbol |= (bit << 1); + bit = Rd_decode_bit( rdec, &bm[model] ); + model = (model << 1) + bit; symbol |= (bit << 2); + if( Rd_decode_bit( rdec, &bm[model] ) ) symbol |= 8; + return symbol; + } + +static inline int Rd_decode_matched( struct Range_decoder * const rdec, + Bit_model bm[], int match_byte ) + { + Bit_model * const bm1 = bm + 0x100; + int symbol = 1; + while( symbol < 0x100 ) + { + int match_bit, bit; + match_byte <<= 1; + match_bit = match_byte & 0x100; + bit = Rd_decode_bit( rdec, &bm1[match_bit+symbol] ); + symbol = ( symbol << 1 ) | bit; + if( match_bit != bit << 8 ) + { + while( symbol < 0x100 ) + symbol = ( symbol << 1 ) | Rd_decode_bit( rdec, &bm[symbol] ); + break; + } + } return symbol & 0xFF; } -static inline unsigned -Rd_decode_tree_reversed( Range_decoder * const rdec, - Bit_model bm[], const int num_bits ) +static inline int Rd_decode_len( struct Range_decoder * const rdec, + struct Len_model * const lm, + const int pos_state ) { - unsigned model = 1; - unsigned symbol = 0; - int i; - for( i = 0; i < num_bits; ++i ) - Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, i ); - return symbol; - } - -static inline unsigned -Rd_decode_tree_reversed4( Range_decoder * const rdec, Bit_model bm[] ) - { - unsigned model = 1; - unsigned symbol = 0; - Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 0 ); - Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 1 ); - Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 2 ); - Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 3 ); - return symbol; - } - -static inline unsigned Rd_decode_matched( Range_decoder * const rdec, - Bit_model bm[], unsigned match_byte ) - { - unsigned symbol = 1; - unsigned mask = 0x100; - while( true ) - { 
- const unsigned match_bit = ( match_byte <<= 1 ) & mask; - const unsigned bit = Rd_decode_bit( rdec, &bm[symbol+match_bit+mask] ); - symbol <<= 1; symbol += bit; - if( symbol > 0xFF ) return symbol & 0xFF; - mask &= ~(match_bit ^ (bit << 8)); /* if( match_bit != bit ) mask = 0; */ - } - } - -static inline unsigned Rd_decode_len( Range_decoder * const rdec, - Len_model * const lm, - const int pos_state ) - { - Bit_model * bm; - unsigned mask, offset, symbol = 1; - if( Rd_decode_bit( rdec, &lm->choice1 ) == 0 ) - { bm = lm->bm_low[pos_state]; mask = 7; offset = 0; goto len3; } + return Rd_decode_tree( rdec, lm->bm_low[pos_state], len_low_bits ); if( Rd_decode_bit( rdec, &lm->choice2 ) == 0 ) - { bm = lm->bm_mid[pos_state]; mask = 7; offset = len_low_symbols; goto len3; } - bm = lm->bm_high; mask = 0xFF; offset = len_low_symbols + len_mid_symbols; - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); -len3: - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); - return ( symbol & mask ) + min_match_len + offset; + return len_low_symbols + + Rd_decode_tree( rdec, lm->bm_mid[pos_state], len_mid_bits ); + return len_low_symbols + len_mid_symbols + + Rd_decode_tree( rdec, lm->bm_high, len_high_bits ); } enum { lzd_min_free_bytes = max_match_len }; -typedef struct LZ_decoder +struct LZ_decoder { - Circular_buffer cb; + struct Circular_buffer cb; unsigned long long partial_data_pos; - Range_decoder * rdec; + struct Range_decoder * rdec; unsigned dictionary_size; uint32_t crc; - bool check_trailer_pending; bool member_finished; + bool verify_trailer_pending; bool pos_wrapped; unsigned rep0; /* rep[0-3] latest four distances */ unsigned rep1; /* used for 
efficient coding of */ @@ -353,28 +320,31 @@ typedef struct LZ_decoder Bit_model bm_rep2[states]; Bit_model bm_len[states][pos_states]; Bit_model bm_dis_slot[len_states][1<cb ) >= lzd_min_free_bytes; } -static inline uint8_t LZd_peek_prev( const LZ_decoder * const d ) - { return d->cb.buffer[((d->cb.put > 0) ? d->cb.put : d->cb.buffer_size)-1]; } - -static inline uint8_t LZd_peek( const LZ_decoder * const d, - const unsigned distance ) +static inline uint8_t LZd_peek_prev( const struct LZ_decoder * const d ) { - const unsigned i = ( (d->cb.put > distance) ? 0 : d->cb.buffer_size ) + - d->cb.put - distance - 1; + const unsigned i = ( ( d->cb.put > 0 ) ? d->cb.put : d->cb.buffer_size ) - 1; return d->cb.buffer[i]; } -static inline void LZd_put_byte( LZ_decoder * const d, const uint8_t b ) +static inline uint8_t LZd_peek( const struct LZ_decoder * const d, + const unsigned distance ) + { + unsigned i = d->cb.put - distance - 1; + if( d->cb.put <= distance ) i += d->cb.buffer_size; + return d->cb.buffer[i]; + } + +static inline void LZd_put_byte( struct LZ_decoder * const d, const uint8_t b ) { CRC32_update_byte( &d->crc, b ); d->cb.buffer[d->cb.put] = b; @@ -382,31 +352,21 @@ static inline void LZd_put_byte( LZ_decoder * const d, const uint8_t b ) { d->partial_data_pos += d->cb.put; d->cb.put = 0; d->pos_wrapped = true; } } -static inline void LZd_copy_block( LZ_decoder * const d, +static inline void LZd_copy_block( struct LZ_decoder * const d, const unsigned distance, unsigned len ) { - unsigned lpos = d->cb.put, i = lpos - distance - 1; - bool fast, fast2; - if( lpos > distance ) - { - fast = len < d->cb.buffer_size - lpos; - fast2 = fast && len <= lpos - i; - } + unsigned i = d->cb.put - distance - 1; + bool fast; + if( d->cb.put <= distance ) + { i += d->cb.buffer_size; + fast = ( len <= d->cb.buffer_size - i && len <= i - d->cb.put ); } else + fast = ( len < d->cb.buffer_size - d->cb.put && len <= d->cb.put - i ); + if( fast ) /* no wrap, no overlap */ { - i += 
d->cb.buffer_size; - fast = len < d->cb.buffer_size - i; /* (i == pos) may happen */ - fast2 = fast && len <= i - lpos; - } - if( fast ) /* no wrap */ - { - const unsigned tlen = len; - if( fast2 ) /* no wrap, no overlap */ - memcpy( d->cb.buffer + lpos, d->cb.buffer + i, len ); - else - for( ; len > 0; --len ) d->cb.buffer[lpos++] = d->cb.buffer[i++]; - CRC32_update_buf( &d->crc, d->cb.buffer + d->cb.put, tlen ); - d->cb.put += tlen; + CRC32_update_buf( &d->crc, d->cb.buffer + i, len ); + memcpy( d->cb.buffer + d->cb.put, d->cb.buffer + i, len ); + d->cb.put += len; } else for( ; len > 0; --len ) { @@ -415,7 +375,8 @@ static inline void LZd_copy_block( LZ_decoder * const d, } } -static inline bool LZd_init( LZ_decoder * const d, Range_decoder * const rde, +static inline bool LZd_init( struct LZ_decoder * const d, + struct Range_decoder * const rde, const unsigned dict_size ) { if( !Cb_init( &d->cb, max( 65536, dict_size ) + lzd_min_free_bytes ) ) @@ -424,11 +385,9 @@ static inline bool LZd_init( LZ_decoder * const d, Range_decoder * const rde, d->rdec = rde; d->dictionary_size = dict_size; d->crc = 0xFFFFFFFFU; - d->check_trailer_pending = false; d->member_finished = false; + d->verify_trailer_pending = false; d->pos_wrapped = false; - /* prev_byte of first byte; also for LZd_peek( 0 ) on corrupt file */ - d->cb.buffer[d->cb.buffer_size-1] = 0; d->rep0 = 0; d->rep1 = 0; d->rep2 = 0; @@ -443,21 +402,23 @@ static inline bool LZd_init( LZ_decoder * const d, Range_decoder * const rde, Bm_array_init( d->bm_rep2, states ); Bm_array_init( d->bm_len[0], states * pos_states ); Bm_array_init( d->bm_dis_slot[0], len_states * (1 << dis_slot_bits) ); - Bm_array_init( d->bm_dis, modeled_distances - end_dis_model + 1 ); + Bm_array_init( d->bm_dis, modeled_distances - end_dis_model ); Bm_array_init( d->bm_align, dis_align_size ); Lm_init( &d->match_len_model ); Lm_init( &d->rep_len_model ); + d->cb.buffer[d->cb.buffer_size-1] = 0; /* prev_byte of first byte */ return true; } 
-static inline void LZd_free( LZ_decoder * const d ) { Cb_free( &d->cb ); } +static inline void LZd_free( struct LZ_decoder * const d ) + { Cb_free( &d->cb ); } -static inline bool LZd_member_finished( const LZ_decoder * const d ) - { return d->member_finished && Cb_empty( &d->cb ); } +static inline bool LZd_member_finished( const struct LZ_decoder * const d ) + { return ( d->member_finished && !Cb_used_bytes( &d->cb ) ); } -static inline unsigned LZd_crc( const LZ_decoder * const d ) +static inline unsigned LZd_crc( const struct LZ_decoder * const d ) { return d->crc ^ 0xFFFFFFFFU; } static inline unsigned long long -LZd_data_position( const LZ_decoder * const d ) +LZd_data_position( const struct LZ_decoder * const d ) { return d->partial_data_pos + d->cb.put; } diff --git a/doc/lzlib.info b/doc/lzlib.info index 4e8d079..a9f47b3 100644 --- a/doc/lzlib.info +++ b/doc/lzlib.info @@ -1,6 +1,6 @@ This is lzlib.info, produced by makeinfo version 4.13+ from lzlib.texi. -INFO-DIR-SECTION Compression +INFO-DIR-SECTION Data Compression START-INFO-DIR-ENTRY * Lzlib: (lzlib). Compression library for the lzip format END-INFO-DIR-ENTRY @@ -11,29 +11,28 @@ File: lzlib.info, Node: Top, Next: Introduction, Up: (dir) Lzlib Manual ************ -This manual is for Lzlib (version 1.15, 9 January 2025). +This manual is for Lzlib (version 1.8, 17 May 2016). 
* Menu: -* Introduction:: Purpose and features of lzlib -* Library version:: Checking library version -* Buffering:: Sizes of lzlib's buffers -* Parameter limits:: Min / max values for some parameters -* Compression functions:: Descriptions of the compression functions -* Decompression functions:: Descriptions of the decompression functions -* Error codes:: Meaning of codes returned by functions -* Error messages:: Error messages corresponding to error codes -* Invoking minilzip:: Command-line interface of the test program -* File format:: Detailed format of the compressed file -* Examples:: A small tutorial with examples -* Problems:: Reporting bugs -* Concept index:: Index of concepts +* Introduction:: Purpose and features of lzlib +* Library version:: Checking library version +* Buffering:: Sizes of lzlib's buffers +* Parameter limits:: Min / max values for some parameters +* Compression functions:: Descriptions of the compression functions +* Decompression functions:: Descriptions of the decompression functions +* Error codes:: Meaning of codes returned by functions +* Error messages:: Error messages corresponding to error codes +* Data format:: Detailed format of the compressed data +* Examples:: A small tutorial with examples +* Problems:: Reporting bugs +* Concept index:: Index of concepts - Copyright (C) 2009-2025 Antonio Diaz Diaz. + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This manual is free documentation: you have unlimited permission to copy, -distribute, and modify it. + This manual is free documentation: you have unlimited permission to +copy, distribute and modify it.  File: lzlib.info, Node: Introduction, Next: Library version, Prev: Top, Up: Top @@ -41,81 +40,93 @@ File: lzlib.info, Node: Introduction, Next: Library version, Prev: Top, Up: 1 Introduction ************** -Lzlib is a data compression library providing in-memory LZMA compression and -decompression functions, including integrity checking of the decompressed -data. 
The compressed data format used by the library is the lzip format. -Lzlib is written in C and is distributed under a 2-clause BSD license. +Lzlib is a data compression library providing in-memory LZMA compression +and decompression functions, including integrity checking of the +decompressed data. The compressed data format used by the library is the +lzip format. Lzlib is written in C. + + The lzip file format is designed for data sharing and long-term +archiving, taking into account both data integrity and decoder +availability: + + * The lzip format provides very safe integrity checking and some data + recovery means. The lziprecover program can repair bit-flip errors + (one of the most common forms of data corruption) in lzip files, + and provides data recovery capabilities, including error-checked + merging of damaged copies of a file. *Note Data safety: + (lziprecover)Data safety. + + * The lzip format is as simple as possible (but not simpler). The + lzip manual provides the code of a simple decompressor along with + a detailed explanation of how it works, so that with the only help + of the lzip manual it would be possible for a digital + archaeologist to extract the data from a lzip file long after + quantum computers eventually render LZMA obsolete. + + * Additionally the lzip reference implementation is copylefted, which + guarantees that it will remain free forever. + + A nice feature of the lzip format is that a corrupt byte is easier to +repair the nearer it is from the beginning of the file. Therefore, with +the help of lziprecover, losing an entire archive just because of a +corrupt byte near the beginning is a thing of the past. The functions and variables forming the interface of the compression -library are declared in the file 'lzlib.h'. Usage examples of the library -are given in the files 'bbexample.c', 'ffexample.c', and 'minilzip.c' from -the source distribution. 
- - As 'lzlib.h' can be used in C and C++ programs, it must not impose a -choice of system headers on the program by including one of them. Therefore -it is the responsibility of the program using lzlib to include before -'lzlib.h' some header that declares the type 'uint8_t'. There are at least -four such headers in C and C++: 'stdint.h', 'cstdint', 'inttypes.h', and -'cinttypes'. - - All the library functions are thread safe. The library does not install -any signal handler. The decoder checks the consistency of the compressed -data, so the library should never crash even in case of corrupted input. +library are declared in the file 'lzlib.h'. Usage examples of the +library are given in the files 'main.c' and 'bbexample.c' from the +source distribution. Compression/decompression is done by repeatedly calling a couple of -read/write functions until all the data have been processed by the library. -This interface is safer and less error prone than the traditional zlib -interface. +read/write functions until all the data have been processed by the +library. This interface is safer and less error prone than the +traditional zlib interface. - Compression/decompression is done when the read function is called. This -means the value returned by the position functions is not updated until a -read call, even if a lot of data are written. If you want the data to be -compressed in advance, just call the read function with a SIZE equal to 0. + Compression/decompression is done when the read function is called. +This means the value returned by the position functions will not be +updated until a read call, even if a lot of data is written. If you +want the data to be compressed in advance, just call the read function +with a SIZE equal to 0. 
- If all the data to be compressed are written in advance, lzlib -automatically adjusts the header of the compressed data to use the largest -dictionary size that does not exceed neither the data size nor the limit -given to 'LZ_compress_open'. This feature reduces the amount of memory -needed for decompression and allows minilzip to produce identical + If all the data to be compressed are written in advance, lzlib will +automatically adjust the header of the compressed data to use the +smallest possible dictionary size. This feature reduces the amount of +memory needed for decompression and allows minilzip to produce identical compressed output as lzip. - Lzlib correctly decompresses a data stream which is the concatenation of -two or more compressed data streams. The result is the concatenation of the -corresponding decompressed data streams. Integrity testing of concatenated -compressed data streams is also supported. + Lzlib will correctly decompress a data stream which is the +concatenation of two or more compressed data streams. The result is the +concatenation of the corresponding decompressed data streams. Integrity +testing of concatenated compressed data streams is also supported. - Lzlib is able to compress and decompress streams of unlimited size by -automatically creating multimember output. The members so created are large, -about 2 PiB each. + All the library functions are thread safe. The library does not +install any signal handler. The decoder checks the consistency of the +compressed data, so the library should never crash even in case of +corrupted input. - In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a -concrete algorithm; it is more like "any algorithm using the LZMA coding -scheme". For example, the option '-0' of lzip uses the scheme in almost the -simplest way possible; issuing the longest match it can find, or a literal -byte if it can't find a match. 
Inversely, a more elaborate way of finding -coding sequences of minimum size than the one currently used by lzip could -be developed, and the resulting sequence could also be coded using the LZMA -coding scheme. + In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is +not a concrete algorithm; it is more like "any algorithm using the LZMA +coding scheme". For example, the option '-0' of lzip uses the scheme in +almost the simplest way possible; issuing the longest match it can +find, or a literal byte if it can't find a match. Inversely, a much +more elaborated way of finding coding sequences of minimum size than +the one currently used by lzip could be developed, and the resulting +sequence could also be coded using the LZMA coding scheme. - Lzlib currently implements two variants of the LZMA algorithm: fast -(used by option '-0' of minilzip) and normal (used by all other compression -levels). + Lzlib currently implements two variants of the LZMA algorithm; fast +(used by option '-0' of minilzip) and normal (used by all other +compression levels). - The high compression of LZMA comes from combining two basic, well-proven -compression ideas: sliding dictionaries (LZ77) and Markov models (the thing -used by every compression algorithm that uses a range encoder or similar -order-0 entropy coder as its last stage) with segregation of contexts -according to what the bits are used for. + The high compression of LZMA comes from combining two basic, +well-proven compression ideas: sliding dictionaries (LZ77/78) and +markov models (the thing used by every compression algorithm that uses +a range encoder or similar order-0 entropy coder as its last stage) +with segregation of contexts according to what the bits are used for. - The ideas embodied in lzlib are due to (at least) the following people: -Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the -definition of Markov chains), G.N.N. 
Martin (for the definition of range -encoding), Igor Pavlov (for putting all the above together in LZMA), and -Julian Seward (for bzip2's CLI). - - LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never -have been compressed. Decompressed is used to refer to data which have -undergone the process of decompression. + The ideas embodied in lzlib are due to (at least) the following +people: Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey +Markov (for the definition of Markov chains), G.N.N. Martin (for the +definition of range encoding), Igor Pavlov (for putting all the above +together in LZMA), and Julian Seward (for bzip2's CLI).  File: lzlib.info, Node: Library version, Next: Buffering, Prev: Introduction, Up: Top @@ -123,55 +134,19 @@ File: lzlib.info, Node: Library version, Next: Buffering, Prev: Introduction, 2 Library version ***************** -One goal of lzlib is to keep perfect backward compatibility with older -versions of itself down to 1.0. Any application working with an older lzlib -should work with a newer lzlib. Installing a newer lzlib should not break -anything. This chapter describes the constants and functions that the -application can use to discover the version of the library being used. All -of them are declared in 'lzlib.h'. - - -- Constant: LZ_API_VERSION - This constant is defined in 'lzlib.h' and works as a version test - macro. The application should check at compile time that - LZ_API_VERSION is greater than or equal to the version required by the - application: - - #if !defined LZ_API_VERSION || LZ_API_VERSION < 1012 - #error "lzlib 1.12 or newer needed." - #endif - - Before version 1.8, lzlib didn't define LZ_API_VERSION. - LZ_API_VERSION was first defined in lzlib 1.8 to 1. - Since lzlib 1.12, LZ_API_VERSION is defined as (major * 1000 + minor). - - NOTE: Version test macros are the library's way of announcing -functionality to the application. 
They should not be confused with feature -test macros, which allow the application to announce to the library its -desire to have certain symbols and prototypes exposed. - - -- Function: int LZ_api_version ( void ) - If LZ_API_VERSION >= 1012, this function is declared in 'lzlib.h' (else - it doesn't exist). It returns the LZ_API_VERSION of the library object - code being used. The application should check at run time that the - value returned by 'LZ_api_version' is greater than or equal to the - version required by the application. An application may be dynamically - linked at run time with a different version of lzlib than the one it - was compiled for, and this should not break the application as long as - the library used provides the functionality required by the - application. - - #if defined LZ_API_VERSION && LZ_API_VERSION >= 1012 - if( LZ_api_version() < 1012 ) - show_error( "lzlib 1.12 or newer needed." ); - #endif + -- Function: const char * LZ_version ( void ) + Returns the library version as a string. -- Constant: const char * LZ_version_string - This string constant is defined in the header file 'lzlib.h' and - represents the version of the library being used at compile time. + This constant is defined in the header file 'lzlib.h'. - -- Function: const char * LZ_version ( void ) - This function returns a string representing the version of the library - being used at run time. + The application should compare LZ_version and LZ_version_string for +consistency. If the first character differs, the library code actually +used may be incompatible with the 'lzlib.h' header file used by the +application. 
+ + if( LZ_version()[0] != LZ_version_string[0] ) + error( "bad library version" );  File: lzlib.info, Node: Buffering, Next: Parameter limits, Prev: Library version, Up: Top @@ -179,31 +154,30 @@ File: lzlib.info, Node: Buffering, Next: Parameter limits, Prev: Library vers 3 Buffering *********** -Lzlib internal functions need access to a memory chunk at least as large as -the dictionary size (sliding window). For efficiency reasons, the input -buffer for compression is twice or sixteen times as large as the dictionary -size. +Lzlib internal functions need access to a memory chunk at least as large +as the dictionary size (sliding window). For efficiency reasons, the +input buffer for compression is twice or sixteen times as large as the +dictionary size. Finally, for safety reasons, lzlib uses two more internal buffers. - These are the four buffers used by lzlib, and their guaranteed minimum -sizes: + These are the four buffers used by lzlib, and their guaranteed +minimum sizes: - * Input compression buffer. Written to by the function - 'LZ_compress_write'. For the normal variant of LZMA, its size is two - times the dictionary size set with the function 'LZ_compress_open' or - 64 KiB, whichever is larger. For the fast variant, its size is 1 MiB. + * Input compression buffer. Written to by the 'LZ_compress_write' + function. For the normal variant of LZMA, its size is two times + the dictionary size set with the 'LZ_compress_open' function or 64 + KiB, whichever is larger. For the fast variant, its size is 1 MiB. - * Output compression buffer. Read from by the function - 'LZ_compress_read'. Its size is 64 KiB. + * Output compression buffer. Read from by the 'LZ_compress_read' + function. Its size is 64 KiB. - * Input decompression buffer. Written to by the function - 'LZ_decompress_write'. Its size is 64 KiB. + * Input decompression buffer. Written to by the + 'LZ_decompress_write' function. Its size is 64 KiB. - * Output decompression buffer. 
Read from by the function - 'LZ_decompress_read'. Its size is the dictionary size set in the header - of the member currently being decompressed or 64 KiB, whichever is - larger. + * Output decompression buffer. Read from by the 'LZ_decompress_read' + function. Its size is the dictionary size set in the header of the + member currently being decompressed or 64 KiB, whichever is larger.  File: lzlib.info, Node: Parameter limits, Next: Compression functions, Prev: Buffering, Up: Top @@ -222,7 +196,8 @@ Current values are shown in square brackets. Returns the smallest valid dictionary size [4 KiB]. -- Function: int LZ_max_dictionary_bits ( void ) - Returns the base 2 logarithm of the largest valid dictionary size [29]. + Returns the base 2 logarithm of the largest valid dictionary size + [29]. -- Function: int LZ_max_dictionary_size ( void ) Returns the largest valid dictionary size [512 MiB]. @@ -241,151 +216,135 @@ File: lzlib.info, Node: Compression functions, Next: Decompression functions, These are the functions used to compress data. In case of error, all of them return -1 or 0, for signed and unsigned return values respectively, -except 'LZ_compress_open' whose return value must be checked by calling -'LZ_compress_errno' before using it. +except 'LZ_compress_open' whose return value must be verified by +calling 'LZ_compress_errno' before using it. - -- Function: LZ_Encoder * LZ_compress_open ( const int DICTIONARY_SIZE, - const int MATCH_LEN_LIMIT, const unsigned long long MEMBER_SIZE ) + -- Function: struct LZ_Encoder * LZ_compress_open ( const int + DICTIONARY_SIZE, const int MATCH_LEN_LIMIT, const unsigned + long long MEMBER_SIZE ) Initializes the internal stream state for compression and returns a - pointer that can only be used as the ENCODER argument for the other - LZ_compress functions, or a null pointer if the encoder could not be - allocated. 
+ pointer that can only be used as the ENCODER argument for the + other LZ_compress functions, or a null pointer if the encoder + could not be allocated. - The returned pointer must be checked by calling 'LZ_compress_errno' - before using it. If 'LZ_compress_errno' does not return 'LZ_ok', the - returned pointer must not be used and should be freed with - 'LZ_compress_close' to avoid memory leaks. + The returned pointer must be verified by calling + 'LZ_compress_errno' before using it. If 'LZ_compress_errno' does + not return 'LZ_ok', the returned pointer must not be used and + should be freed with 'LZ_compress_close' to avoid memory leaks. - DICTIONARY_SIZE sets the dictionary size to be used, in bytes. Valid - values range from 4 KiB to 512 MiB. Note that dictionary sizes are - quantized. If the size specified does not match one of the valid - sizes, it is rounded upwards by adding up to (DICTIONARY_SIZE / 8) to - it. + DICTIONARY_SIZE sets the dictionary size to be used, in bytes. + Valid values range from 4 KiB to 512 MiB. Note that dictionary + sizes are quantized. If the specified size does not match one of + the valid sizes, it will be rounded upwards by adding up to + (DICTIONARY_SIZE / 8) to it. MATCH_LEN_LIMIT sets the match length limit in bytes. Valid values range from 5 to 273. Larger values usually give better compression ratios but longer compression times. If DICTIONARY_SIZE is 65535 and MATCH_LEN_LIMIT is 16, the fast - variant of LZMA is chosen, which produces identical compressed output - as 'lzip -0'. (The dictionary size used is rounded upwards to 64 KiB). + variant of LZMA is chosen, which produces identical compressed + output as 'lzip -0'. (The dictionary size used will be rounded + upwards to 64 KiB). - MEMBER_SIZE sets the member size limit in bytes. Valid values range - from 4 KiB to 2 PiB. A small member size may degrade compression + MEMBER_SIZE sets the member size limit in bytes. Minimum member + size limit is 100 kB. 
Small member size may degrade compression ratio, so use it only when needed. To produce a single-member data - stream, give MEMBER_SIZE a value larger than the amount of data to be - produced. Values larger than 2 PiB are reduced to 2 PiB to prevent the - uncompressed size of the member from overflowing. + stream, give MEMBER_SIZE a value larger than the amount of data to + be produced, for example INT64_MAX. - -- Function: int LZ_compress_close ( LZ_Encoder * const ENCODER ) - Frees all dynamically allocated data structures for this stream. This - function discards any unprocessed input and does not flush any pending - output. After a call to 'LZ_compress_close', ENCODER can no longer be - used as an argument to any LZ_compress function. It is safe to call - 'LZ_compress_close' with a null argument. + -- Function: int LZ_compress_close ( struct LZ_Encoder * const ENCODER + ) + Frees all dynamically allocated data structures for this stream. + This function discards any unprocessed input and does not flush + any pending output. After a call to 'LZ_compress_close', ENCODER + can no more be used as an argument to any LZ_compress function. - -- Function: int LZ_compress_finish ( LZ_Encoder * const ENCODER ) + -- Function: int LZ_compress_finish ( struct LZ_Encoder * const + ENCODER ) Use this function to tell 'lzlib' that all the data for this member - have already been written (with the function 'LZ_compress_write'). It - is safe to call 'LZ_compress_finish' as many times as needed. After - all the compressed data have been read with 'LZ_compress_read' and - 'LZ_compress_member_finished' returns 1, a new member can be started - with 'LZ_compress_restart_member'. + have already been written (with the 'LZ_compress_write' function). + After all the produced compressed data have been read with + 'LZ_compress_read' and 'LZ_compress_member_finished' returns 1, a + new member can be started with 'LZ_compress_restart_member'. 
- -- Function: int LZ_compress_restart_member ( LZ_Encoder * const ENCODER, - const unsigned long long MEMBER_SIZE ) - Use this function to start a new member in a multimember data stream. - Call this function only after 'LZ_compress_member_finished' indicates - that the current member has been fully read (with the function - 'LZ_compress_read'). *Note member_size::, for a description of - MEMBER_SIZE. + -- Function: int LZ_compress_restart_member ( struct LZ_Encoder * + const ENCODER, const unsigned long long MEMBER_SIZE ) + Use this function to start a new member in a multimember data + stream. Call this function only after + 'LZ_compress_member_finished' indicates that the current member + has been fully read (with the 'LZ_compress_read' function). - -- Function: int LZ_compress_sync_flush ( LZ_Encoder * const ENCODER ) - Use this function to make available to 'LZ_compress_read' all the data - already written with the function 'LZ_compress_write'. First call - 'LZ_compress_sync_flush'. Then call 'LZ_compress_read' until it - returns 0. - - This function writes at least one LZMA marker '3' ('Sync Flush' marker) - to the compressed output. Note that the sync flush marker is not - allowed in lzip files; it is a device for interactive communication - between applications using lzlib, but is useless and wasteful in a - file, and is excluded from the media type 'application/lzip'. The LZMA - marker '2' ('End Of Stream' marker) is the only marker allowed in lzip - files. *Note File format::. + -- Function: int LZ_compress_sync_flush ( struct LZ_Encoder * const + ENCODER ) + Use this function to make available to 'LZ_compress_read' all the + data already written with the 'LZ_compress_write' function. First + call 'LZ_compress_sync_flush'. Then call 'LZ_compress_read' until + it returns 0. Repeated use of 'LZ_compress_sync_flush' may degrade compression - ratio, so use it only when needed. 
If the interval between calls to - 'LZ_compress_sync_flush' is large (comparable to dictionary size), - creating a multimember data stream with 'LZ_compress_restart_member' - may be an alternative. + ratio, so use it only when needed. - Combining multimember stream creation with flushing may be tricky. If - there are more bytes available than those needed to complete - MEMBER_SIZE, 'LZ_compress_restart_member' needs to be called when - 'LZ_compress_member_finished' returns 1, followed by a new call to - 'LZ_compress_sync_flush'. + -- Function: int LZ_compress_read ( struct LZ_Encoder * const ENCODER, + uint8_t * const BUFFER, const int SIZE ) + The 'LZ_compress_read' function reads up to SIZE bytes from the + stream pointed to by ENCODER, storing the results in BUFFER. - -- Function: int LZ_compress_read ( LZ_Encoder * const ENCODER, uint8_t * - const BUFFER, const int SIZE ) - Reads up to SIZE bytes from the stream pointed to by ENCODER, storing - the results in BUFFER. If LZ_API_VERSION >= 1012, BUFFER may be a null - pointer, in which case the bytes read are discarded. + The return value is the number of bytes actually read. This might + be less than SIZE; for example, if there aren't that many bytes + left in the stream or if more bytes have to be yet written with the + 'LZ_compress_write' function. Note that reading less than SIZE + bytes is not an error. - Returns the number of bytes actually read. This might be less than - SIZE; for example, if there aren't that many bytes left in the stream - or if more bytes have to be yet written with the function - 'LZ_compress_write'. Note that reading less than SIZE bytes is not an - error. + -- Function: int LZ_compress_write ( struct LZ_Encoder * const + ENCODER, uint8_t * const BUFFER, const int SIZE ) + The 'LZ_compress_write' function writes up to SIZE bytes from + BUFFER to the stream pointed to by ENCODER. 
- -- Function: int LZ_compress_write ( LZ_Encoder * const ENCODER, uint8_t * - const BUFFER, const int SIZE ) - Writes up to SIZE bytes from BUFFER to the stream pointed to by - ENCODER. Returns the number of bytes actually written. This might be - less than SIZE. Note that writing less than SIZE bytes is not an error. + The return value is the number of bytes actually written. This + might be less than SIZE. Note that writing less than SIZE bytes is + not an error. - -- Function: int LZ_compress_write_size ( LZ_Encoder * const ENCODER ) - Returns the maximum number of bytes that can be immediately written - through 'LZ_compress_write'. For efficiency reasons, once the input - buffer is full and 'LZ_compress_write_size' returns 0, almost all the - buffer must be compressed before a size greater than 0 is returned - again. (This is done to minimize the amount of data that must be - copied to the beginning of the buffer before new data can be accepted). + -- Function: int LZ_compress_write_size ( struct LZ_Encoder * const + ENCODER ) + The 'LZ_compress_write_size' function returns the maximum number of + bytes that can be immediately written through the + 'LZ_compress_write' function. It is guaranteed that an immediate call to 'LZ_compress_write' will accept a SIZE up to the returned number of bytes. - -- Function: LZ_Errno LZ_compress_errno ( LZ_Encoder * const ENCODER ) - Returns the current error code for ENCODER. *Note Error codes::. It is - safe to call 'LZ_compress_errno' with a null argument, in which case - it returns 'LZ_bad_argument'. - - -- Function: int LZ_compress_finished ( LZ_Encoder * const ENCODER ) - Returns 1 if all the data have been read and 'LZ_compress_close' can - be safely called. Otherwise it returns 0. 'LZ_compress_finished' - implies 'LZ_compress_member_finished'. 
- - -- Function: int LZ_compress_member_finished ( LZ_Encoder * const ENCODER ) - Returns 1 if the current member, in a multimember data stream, has been - fully read and 'LZ_compress_restart_member' can be safely called. - Otherwise it returns 0. - - -- Function: unsigned long long LZ_compress_data_position ( LZ_Encoder * + -- Function: enum LZ_Errno LZ_compress_errno ( struct LZ_Encoder * const ENCODER ) + Returns the current error code for ENCODER (*note Error codes::). + + -- Function: int LZ_compress_finished ( struct LZ_Encoder * const + ENCODER ) + Returns 1 if all the data have been read and 'LZ_compress_close' + can be safely called. Otherwise it returns 0. + + -- Function: int LZ_compress_member_finished ( struct LZ_Encoder * + const ENCODER ) + Returns 1 if the current member, in a multimember data stream, has + been fully read and 'LZ_compress_restart_member' can be safely + called. Otherwise it returns 0. + + -- Function: unsigned long long LZ_compress_data_position ( struct + LZ_Encoder * const ENCODER ) Returns the number of input bytes already compressed in the current member. - -- Function: unsigned long long LZ_compress_member_position ( LZ_Encoder * - const ENCODER ) - Returns the number of compressed bytes already produced, but perhaps - not yet read, in the current member. + -- Function: unsigned long long LZ_compress_member_position ( struct + LZ_Encoder * const ENCODER ) + Returns the number of compressed bytes already produced, but + perhaps not yet read, in the current member. - -- Function: unsigned long long LZ_compress_total_in_size ( LZ_Encoder * - const ENCODER ) + -- Function: unsigned long long LZ_compress_total_in_size ( struct + LZ_Encoder * const ENCODER ) Returns the total number of input bytes already compressed. 
- -- Function: unsigned long long LZ_compress_total_out_size ( LZ_Encoder * - const ENCODER ) + -- Function: unsigned long long LZ_compress_total_out_size ( struct + LZ_Encoder * const ENCODER ) Returns the total number of compressed bytes already produced, but perhaps not yet read. @@ -395,146 +354,132 @@ File: lzlib.info, Node: Decompression functions, Next: Error codes, Prev: Com 6 Decompression functions ************************* -These are the functions used to decompress data. In case of error, all of -them return -1 or 0, for signed and unsigned return values respectively, -except 'LZ_decompress_open' whose return value must be checked by calling -'LZ_decompress_errno' before using it. +These are the functions used to decompress data. In case of error, all +of them return -1 or 0, for signed and unsigned return values +respectively, except 'LZ_decompress_open' whose return value must be +verified by calling 'LZ_decompress_errno' before using it. - -- Function: LZ_Decoder * LZ_decompress_open ( void ) - Initializes the internal stream state for decompression and returns a - pointer that can only be used as the DECODER argument for the other - LZ_decompress functions, or a null pointer if the decoder could not be - allocated. + -- Function: struct LZ_Decoder * LZ_decompress_open ( void ) + Initializes the internal stream state for decompression and + returns a pointer that can only be used as the DECODER argument + for the other LZ_decompress functions, or a null pointer if the + decoder could not be allocated. - The returned pointer must be checked by calling 'LZ_decompress_errno' - before using it. If 'LZ_decompress_errno' does not return 'LZ_ok', the - returned pointer must not be used and should be freed with - 'LZ_decompress_close' to avoid memory leaks. + The returned pointer must be verified by calling + 'LZ_decompress_errno' before using it. 
If 'LZ_decompress_errno' + does not return 'LZ_ok', the returned pointer must not be used and + should be freed with 'LZ_decompress_close' to avoid memory leaks. - -- Function: int LZ_decompress_close ( LZ_Decoder * const DECODER ) - Frees all dynamically allocated data structures for this stream. This - function discards any unprocessed input and does not flush any pending - output. After a call to 'LZ_decompress_close', DECODER can no longer - be used as an argument to any LZ_decompress function. It is safe to - call 'LZ_decompress_close' with a null argument. + -- Function: int LZ_decompress_close ( struct LZ_Decoder * const + DECODER ) + Frees all dynamically allocated data structures for this stream. + This function discards any unprocessed input and does not flush + any pending output. After a call to 'LZ_decompress_close', DECODER + can no more be used as an argument to any LZ_decompress function. - -- Function: int LZ_decompress_finish ( LZ_Decoder * const DECODER ) + -- Function: int LZ_decompress_finish ( struct LZ_Decoder * const + DECODER ) Use this function to tell 'lzlib' that all the data for this stream - have already been written (with the function 'LZ_decompress_write'). - It is safe to call 'LZ_decompress_finish' as many times as needed. It - is not required to call 'LZ_decompress_finish' if the input stream - only contains whole members, but not calling it prevents lzlib from - detecting a truncated member. + have already been written (with the 'LZ_decompress_write' + function). - -- Function: int LZ_decompress_reset ( LZ_Decoder * const DECODER ) - Resets the internal state of DECODER as it was just after opening it - with the function 'LZ_decompress_open'. Data stored in the internal - buffers are discarded. Position counters are set to 0. + -- Function: int LZ_decompress_reset ( struct LZ_Decoder * const + DECODER ) + Resets the internal state of DECODER as it was just after opening + it with the 'LZ_decompress_open' function. 
Data stored in the + internal buffers is discarded. Position counters are set to 0. - -- Function: int LZ_decompress_sync_to_member ( LZ_Decoder * const DECODER - ) - Resets the error state of DECODER and enters a search state that lasts - until a new member header (or the end of the stream) is found. After a - successful call to 'LZ_decompress_sync_to_member', data written with - 'LZ_decompress_write' is consumed and 'LZ_decompress_read' returns 0 - until a header is found. + -- Function: int LZ_decompress_sync_to_member ( struct LZ_Decoder * + const DECODER ) + Resets the error state of DECODER and enters a search state that + lasts until a new member header (or the end of the stream) is + found. After a successful call to 'LZ_decompress_sync_to_member', + data written with 'LZ_decompress_write' will be consumed and + 'LZ_decompress_read' will return 0 until a header is found. This function is useful to discard any data preceding the first - member, or to discard the rest of the current member, for example in - case of a data error. If the decoder is already at the beginning of a - member, this function does nothing. + member, or to discard the rest of the current member, for example + in case of a data error. If the decoder is already at the + beginning of a member, this function does nothing. - -- Function: int LZ_decompress_read ( LZ_Decoder * const DECODER, uint8_t - * const BUFFER, const int SIZE ) - Reads up to SIZE bytes from the stream pointed to by DECODER, storing - the results in BUFFER. If LZ_API_VERSION >= 1012, BUFFER may be a null - pointer, in which case the bytes read are discarded. + -- Function: int LZ_decompress_read ( struct LZ_Decoder * const + DECODER, uint8_t * const BUFFER, const int SIZE ) + The 'LZ_decompress_read' function reads up to SIZE bytes from the + stream pointed to by DECODER, storing the results in BUFFER. - Returns the number of bytes actually read. 
This might be less than - SIZE; for example, if there aren't that many bytes left in the stream - or if more bytes have to be yet written with the function - 'LZ_decompress_write'. Note that reading less than SIZE bytes is not - an error. + The return value is the number of bytes actually read. This might + be less than SIZE; for example, if there aren't that many bytes + left in the stream or if more bytes have to be yet written with the + 'LZ_decompress_write' function. Note that reading less than SIZE + bytes is not an error. - 'LZ_decompress_read' returns at least once per member so that - 'LZ_decompress_member_finished' can be called (and trailer data - retrieved) for each member, even for empty members. Therefore, - 'LZ_decompress_read' returning 0 does not mean that the end of the - stream has been reached. The increase in the value returned by - 'LZ_decompress_total_in_size' can be used to tell the end of the stream - from an empty member. + -- Function: int LZ_decompress_write ( struct LZ_Decoder * const + DECODER, uint8_t * const BUFFER, const int SIZE ) + The 'LZ_decompress_write' function writes up to SIZE bytes from + BUFFER to the stream pointed to by DECODER. - In case of decompression error caused by corrupt or truncated data, - 'LZ_decompress_read' does not signal the error immediately to the - application, but waits until all the bytes decoded have been read. This - allows tools like tarlz to recover as much data as possible from each - damaged member. *Note tarlz manual: (tarlz)Top. + The return value is the number of bytes actually written. This + might be less than SIZE. Note that writing less than SIZE bytes is + not an error. - -- Function: int LZ_decompress_write ( LZ_Decoder * const DECODER, uint8_t - * const BUFFER, const int SIZE ) - Writes up to SIZE bytes from BUFFER to the stream pointed to by - DECODER. Returns the number of bytes actually written. This might be - less than SIZE. Note that writing less than SIZE bytes is not an error. 
- - -- Function: int LZ_decompress_write_size ( LZ_Decoder * const DECODER ) - Returns the maximum number of bytes that can be immediately written - through 'LZ_decompress_write'. This number varies smoothly; each - compressed byte consumed may be overwritten immediately, increasing by - 1 the value returned. - - It is guaranteed that an immediate call to 'LZ_decompress_write' will - accept a SIZE up to the returned number of bytes. - - -- Function: LZ_Errno LZ_decompress_errno ( LZ_Decoder * const DECODER ) - Returns the current error code for DECODER. *Note Error codes::. It is - safe to call 'LZ_decompress_errno' with a null argument, in which case - it returns 'LZ_bad_argument'. - - -- Function: int LZ_decompress_finished ( LZ_Decoder * const DECODER ) - Returns 1 if all the data have been read and 'LZ_decompress_close' can - be safely called. Otherwise it returns 0. 'LZ_decompress_finished' - does not imply 'LZ_decompress_member_finished'. - - -- Function: int LZ_decompress_member_finished ( LZ_Decoder * const + -- Function: int LZ_decompress_write_size ( struct LZ_Decoder * const DECODER ) - Returns 1 if the previous call to 'LZ_decompress_read' finished reading - the current member, indicating that final values for the member are - available through 'LZ_decompress_data_crc', - 'LZ_decompress_data_position', and 'LZ_decompress_member_position'. - Otherwise it returns 0. + The 'LZ_decompress_write_size' function returns the maximum number + of bytes that can be immediately written through the + 'LZ_decompress_write' function. - -- Function: int LZ_decompress_member_version ( LZ_Decoder * const DECODER - ) - Returns the version of the current member, read from the member header. + It is guaranteed that an immediate call to 'LZ_decompress_write' + will accept a SIZE up to the returned number of bytes. 
- -- Function: int LZ_decompress_dictionary_size ( LZ_Decoder * const + -- Function: enum LZ_Errno LZ_decompress_errno ( struct LZ_Decoder * + const DECODER ) + Returns the current error code for DECODER (*note Error codes::). + + -- Function: int LZ_decompress_finished ( struct LZ_Decoder * const DECODER ) - Returns the dictionary size of the current member, read from the - member header. + Returns 1 if all the data have been read and 'LZ_decompress_close' + can be safely called. Otherwise it returns 0. - -- Function: unsigned LZ_decompress_data_crc ( LZ_Decoder * const DECODER ) - Returns the 32 bit Cyclic Redundancy Check of the data decompressed - from the current member. The value returned is valid only when - 'LZ_decompress_member_finished' returns 1. - - -- Function: unsigned long long LZ_decompress_data_position ( LZ_Decoder * + -- Function: int LZ_decompress_member_finished ( struct LZ_Decoder * const DECODER ) - Returns the number of decompressed bytes already produced, but perhaps - not yet read, in the current member. + Returns 1 if the previous call to 'LZ_decompress_read' finished + reading the current member, indicating that final values for + member are available through 'LZ_decompress_data_crc', + 'LZ_decompress_data_position', and + 'LZ_decompress_member_position'. Otherwise it returns 0. - -- Function: unsigned long long LZ_decompress_member_position ( LZ_Decoder - * const DECODER ) - Returns the number of input bytes already decompressed in the current - member. - - -- Function: unsigned long long LZ_decompress_total_in_size ( LZ_Decoder * + -- Function: int LZ_decompress_member_version ( struct LZ_Decoder * const DECODER ) + Returns the version of current member from member header. + + -- Function: int LZ_decompress_dictionary_size ( struct LZ_Decoder * + const DECODER ) + Returns the dictionary size of current member from member header. 
+ + -- Function: unsigned LZ_decompress_data_crc ( struct LZ_Decoder * + const DECODER ) + Returns the 32 bit Cyclic Redundancy Check of the data + decompressed from the current member. The returned value is valid + only when 'LZ_decompress_member_finished' returns 1. + + -- Function: unsigned long long LZ_decompress_data_position ( struct + LZ_Decoder * const DECODER ) + Returns the number of decompressed bytes already produced, but + perhaps not yet read, in the current member. + + -- Function: unsigned long long LZ_decompress_member_position ( struct + LZ_Decoder * const DECODER ) + Returns the number of input bytes already decompressed in the + current member. + + -- Function: unsigned long long LZ_decompress_total_in_size ( struct + LZ_Decoder * const DECODER ) Returns the total number of input bytes already decompressed. - -- Function: unsigned long long LZ_decompress_total_out_size ( LZ_Decoder - * const DECODER ) - Returns the total number of decompressed bytes already produced, but - perhaps not yet read. + -- Function: unsigned long long LZ_decompress_total_out_size ( struct + LZ_Decoder * const DECODER ) + Returns the total number of decompressed bytes already produced, + but perhaps not yet read.  File: lzlib.info, Node: Error codes, Next: Error messages, Prev: Decompression functions, Up: Top @@ -544,345 +489,96 @@ File: lzlib.info, Node: Error codes, Next: Error messages, Prev: Decompressio Most library functions return -1 to indicate that they have failed. But this return value only tells you that an error has occurred. To find out -what kind of error it was, you need to check the error code by calling +what kind of error it was, you need to verify the error code by calling 'LZ_(de)compress_errno'. 
Library functions don't change the value returned by 'LZ_(de)compress_errno' when they succeed; thus, the value returned by -'LZ_(de)compress_errno' after a successful call is not necessarily LZ_ok, -and you should not use 'LZ_(de)compress_errno' to determine whether a call -failed. If the call failed, then you can examine 'LZ_(de)compress_errno'. +'LZ_(de)compress_errno' after a successful call is not necessarily +LZ_ok, and you should not use 'LZ_(de)compress_errno' to determine +whether a call failed. If the call failed, then you can examine +'LZ_(de)compress_errno'. - The error codes are defined in the header file 'lzlib.h'. 'LZ_Errno' is -an enum type: + The error codes are defined in the header file 'lzlib.h'. - -- Constant: LZ_Errno LZ_ok - The value of this constant is 0 and is used to indicate that there is - no error. + -- Constant: enum LZ_Errno LZ_ok + The value of this constant is 0 and is used to indicate that there + is no error. - -- Constant: LZ_Errno LZ_bad_argument + -- Constant: enum LZ_Errno LZ_bad_argument At least one of the arguments passed to the library function was invalid. - -- Constant: LZ_Errno LZ_mem_error + -- Constant: enum LZ_Errno LZ_mem_error No memory available. The system cannot allocate more virtual memory because its capacity is full. - -- Constant: LZ_Errno LZ_sequence_error + -- Constant: enum LZ_Errno LZ_sequence_error A library function was called in the wrong order. For example 'LZ_compress_restart_member' was called before - 'LZ_compress_member_finished' indicated that the current member is + 'LZ_compress_member_finished' indicates that the current member is finished. - -- Constant: LZ_Errno LZ_header_error - An invalid member header (one with the wrong magic bytes) was read. If - this happens at the end of the data stream it may indicate trailing - data. + -- Constant: enum LZ_Errno LZ_header_error + An invalid member header (one with the wrong magic bytes) was + read. 
If this happens at the end of the data stream it may + indicate trailing data. - -- Constant: LZ_Errno LZ_unexpected_eof + -- Constant: enum LZ_Errno LZ_unexpected_eof The end of the data stream was reached in the middle of a member. - -- Constant: LZ_Errno LZ_data_error - The data stream is corrupt. If 'LZ_decompress_member_position' is 6 or - less, it indicates either a format version not supported, an invalid - dictionary size, a nonzero first LZMA byte, a corrupt header in a - multimember data stream, or trailing data too similar to a valid lzip - header. Lziprecover can be used to repair some of these errors and to - remove conflicting trailing data from a file. + -- Constant: enum LZ_Errno LZ_data_error + The data stream is corrupt. - -- Constant: LZ_Errno LZ_library_error - A bug was detected in the library. Please, report it. *Note Problems::. + -- Constant: enum LZ_Errno LZ_library_error + A bug was detected in the library. Please, report it (*note + Problems::).  -File: lzlib.info, Node: Error messages, Next: Invoking minilzip, Prev: Error codes, Up: Top +File: lzlib.info, Node: Error messages, Next: Data format, Prev: Error codes, Up: Top 8 Error messages **************** - -- Function: const char * LZ_strerror ( const LZ_Errno LZ_ERRNO ) - Returns the error message corresponding to the error code LZ_ERRNO. - The messages are fairly short; there are no multi-line messages or - embedded newlines. This function makes it easy for your program to - report informative error messages about the failure of a library call. + -- Function: const char * LZ_strerror ( const enum LZ_Errno LZ_ERRNO ) + Returns the standard error message for a given error code. The + messages are fairly short; there are no multi-line messages or + embedded newlines. This function makes it easy for your program + to report informative error messages about the failure of a + library call. The value of LZ_ERRNO normally comes from a call to 'LZ_(de)compress_errno'.  
-File: lzlib.info, Node: Invoking minilzip, Next: File format, Prev: Error messages, Up: Top +File: lzlib.info, Node: Data format, Next: Examples, Prev: Error messages, Up: Top -9 Invoking minilzip -******************* - -Minilzip is a test program for the compression library lzlib. Minilzip is -not intended to be installed because lzip has more features, but minilzip is -well tested and you can use it as your main compressor if so you wish. -*Note lzip: (lzip)Top. - - Lzip is a lossless data compressor with a user interface similar to the -one of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov -chain-Algorithm) designed to achieve complete interoperability between -implementations. The maximum dictionary size is 512 MiB so that any lzip -file can be decompressed on 32-bit machines. Lzip provides accurate and -robust 3-factor integrity checking. 'lzip -0' compresses about as fast as -gzip, while 'lzip -9' compresses most files more than bzip2. Decompression -speed is intermediate between gzip and bzip2. Lzip provides better data -recovery capabilities than gzip and bzip2. Lzip has been designed, written, -and tested with great care to replace gzip and bzip2 as general-purpose -compressed format for Unix-like systems. - -The format for running minilzip is: - - minilzip [OPTIONS] [FILES] - -If no file names are specified, minilzip compresses (or decompresses) from -standard input to standard output. A hyphen '-' used as a FILE argument -means standard input. It can be mixed with other FILES and is read just -once, the first time it appears in the command line. Remember to prepend -'./' to any file name beginning with a hyphen, or use '--'. - -minilzip supports the following options: *Note Argument syntax: -(plzip)Argument syntax. - -'-h' -'--help' - Print an informative help message describing the options and exit. - -'-V' -'--version' - Print the version number of minilzip on the standard output and exit. 
- This version number should be included in all bug reports. - -'-a' -'--trailing-error' - Exit with error status 2 if any remaining input is detected after - decompressing the last member. Such remaining input is usually trailing - garbage that can be safely ignored. - -'-b BYTES' -'--member-size=BYTES' - When compressing, set the member size limit to BYTES. If BYTES is - smaller than the compressed size, a multimember file is produced. It is - advisable to keep members smaller than RAM size so that they can be - repaired with lziprecover in case of corruption. A small member size - may degrade compression ratio, so use it only when needed. Valid - values range from 100 kB to 2 PiB. Defaults to 2 PiB. - -'-c' -'--stdout' - Compress or decompress to standard output; keep input files unchanged. - If compressing several files, each file is compressed independently. - (The output consists of a sequence of independently compressed - members). This option (or '-o') is needed when reading from a named - pipe (fifo) or from a device. Use it also to recover as much of the - decompressed data as possible when decompressing a corrupt file. '-c' - overrides '-o' and '-S'. '-c' has no effect when testing. - -'-d' -'--decompress' - Decompress the files specified. The integrity of the files specified is - checked. If a file does not exist, can't be opened, or the destination - file already exists and '--force' has not been specified, minilzip - continues decompressing the rest of the files and exits with error - status 1. If a file fails to decompress, or is a terminal, minilzip - exits immediately with error status 2 without decompressing the rest - of the files. A terminal is considered an uncompressed file, and - therefore invalid. A multimember file with one or more empty members - is accepted if redirected to standard input. - -'-f' -'--force' - Force overwrite of output files. 
- -'-F' -'--recompress' - When compressing, force re-compression of files whose name already has - the '.lz' or '.tlz' suffix. - -'-k' -'--keep' - Keep (don't delete) input files during compression or decompression. - -'-m BYTES' -'--match-length=BYTES' - When compressing, set the match length limit in bytes. After a match - this long is found, the search is finished. Valid values range from 5 - to 273. Larger values usually give better compression ratios but - longer compression times. - -'-o FILE' -'--output=FILE' - If '-c' has not been also specified, write the (de)compressed output - to FILE; keep input files unchanged. If compressing several files, - each file is compressed independently. (The output consists of a - sequence of independently compressed members). This option (or '-c') - is needed when reading from a named pipe (fifo) or from a device. - '-o -' is equivalent to '-c'. '-o' has no effect when testing. - - When compressing and splitting the output in volumes, FILE is used as - a prefix, and several files named 'FILE00001.lz', 'FILE00002.lz', etc, - are created. In this case, only one input file is allowed. - -'-q' -'--quiet' - Quiet operation. Suppress all messages. - -'-s BYTES' -'--dictionary-size=BYTES' - When compressing, set the dictionary size limit in bytes. Minilzip - uses for each file the largest dictionary size that does not exceed - neither the file size nor this limit. Valid values range from 4 KiB to - 512 MiB. Values 12 to 29 are interpreted as powers of two, meaning - 2^12 to 2^29 bytes. Dictionary sizes are quantized so that they can be - coded in just one byte (*note coded-dict-size::). If the size - specified does not match one of the valid sizes, it is rounded upwards - by adding up to (BYTES / 8) to it. - - For maximum compression you should use a dictionary size limit as large - as possible, but keep in mind that the decompression memory requirement - is affected at compression time by the choice of dictionary size limit. 
- The dictionary size used for decompression is the same dictionary size - used for compression. - -'-S BYTES' -'--volume-size=BYTES' - When compressing, and '-c' has not been also specified, split the - compressed output into several volume files with names - 'original_name00001.lz', 'original_name00002.lz', etc, and set the - volume size limit to BYTES. Input files are kept unchanged. Each - volume is a complete, maybe multimember, lzip file. A small volume - size may degrade compression ratio, so use it only when needed. Valid - values range from 100 kB to 4 EiB. - -'-t' -'--test' - Check integrity of the files specified, but don't decompress them. This - really performs a trial decompression and throws away the result. Use - it together with '-v' to see information about the files. If a file - fails the test, does not exist, can't be opened, or is a terminal, - minilzip continues testing the rest of the files. A final diagnostic - is shown at verbosity level 1 or higher if any file fails the test - when testing multiple files. A multimember file with one or more empty - members is accepted if redirected to standard input. - -'-v' -'--verbose' - Verbose mode. - When compressing, show the compression ratio and size for each file - processed. - When decompressing or testing, further -v's (up to 4) increase the - verbosity level, showing status, compression ratio, dictionary size, - and trailer contents (CRC, data size, member size). - -'-0 .. -9' - Compression level. Set the compression parameters (dictionary size and - match length limit) as shown in the table below. The default - compression level is '-6', equivalent to '-s8MiB -m36'. Note that '-9' - can be much slower than '-0'. These options have no effect when - decompressing or testing. - - The bidimensional parameter space of LZMA can't be mapped to a linear - scale optimal for all files. 
If your files are large, very repetitive, - etc, you may need to use the options '--dictionary-size' and - '--match-length' directly to achieve optimal performance. - - If several compression levels or '-s' or '-m' options are given, the - last setting is used. For example '-9 -s64MiB' is equivalent to - '-s64MiB -m273' - - Level Dictionary size (-s) Match length limit (-m) - ------------------------------------------------------ - -0 64 KiB 16 bytes - -1 1 MiB 5 bytes - -2 1.5 MiB 6 bytes - -3 2 MiB 8 bytes - -4 3 MiB 12 bytes - -5 4 MiB 20 bytes - -6 8 MiB 36 bytes - -7 16 MiB 68 bytes - -8 24 MiB 132 bytes - -9 32 MiB 273 bytes - -'--fast' -'--best' - Aliases for GNU gzip compatibility. - -'--loose-trailing' - When decompressing or testing, allow trailing data whose first bytes - are so similar to the magic bytes of a lzip header that they can be - confused with a corrupt header. Use this option if a file triggers a - 'corrupt header' error and the cause is not indeed a corrupt header. - -'--check-lib' - Compare the version of lzlib used to compile minilzip with the version - actually being used at run time and exit. Report any differences - found. Exit with error status 1 if differences are found. A mismatch - may indicate that lzlib is not correctly installed or that a different - version of lzlib has been installed after compiling the shared version - of minilzip. Exit with error status 2 if LZ_API_VERSION and - LZ_version_string don't match. 'minilzip -v --check-lib' shows the - version of lzlib being used and the value of LZ_API_VERSION (if - defined). *Note Library version::. - - - Numbers given as arguments to options may be expressed in decimal, -hexadecimal, or octal (using the same syntax as integer constants in C++), -and may be followed by a multiplier and an optional 'B' for "byte". 
- - Table of SI and binary prefixes (unit multipliers): - -Prefix Value | Prefix Value ----------------------------------------------------------------------- -k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024) -M megabyte (10^6) | Mi mebibyte (2^20) -G gigabyte (10^9) | Gi gibibyte (2^30) -T terabyte (10^12) | Ti tebibyte (2^40) -P petabyte (10^15) | Pi pebibyte (2^50) -E exabyte (10^18) | Ei exbibyte (2^60) -Z zettabyte (10^21) | Zi zebibyte (2^70) -Y yottabyte (10^24) | Yi yobibyte (2^80) -R ronnabyte (10^27) | Ri robibyte (2^90) -Q quettabyte (10^30) | Qi quebibyte (2^100) - - - Exit status: 0 for a normal exit, 1 for environmental problems (file not -found, invalid command-line options, I/O errors, etc), 2 to indicate a -corrupt or invalid input file, 3 for an internal consistency error (e.g., -bug) which caused minilzip to panic. - - -File: lzlib.info, Node: File format, Next: Examples, Prev: Invoking minilzip, Up: Top - -10 File format -************** +9 Data format +************* Perfection is reached, not when there is no longer anything to add, but when there is no longer anything to take away. -- Antoine de Saint-Exupery - In the diagram below, a box like this: + In the diagram below, a box like this: +---+ | | <-- the vertical bars might be missing +---+ represents one byte; a box like this: - +==============+ | | +==============+ represents a variable number of bytes. -A lzip file consists of one or more independent "members" (compressed data -sets). The members simply appear one after another in the file, with no -additional information before, between, or after them. Each member can -encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The -size of a multimember file is unlimited. Empty members (data size = 0) are -not allowed in multimember files. + + A lzip data stream consists of a series of "members" (compressed data +sets). 
The members simply appear one after another in the data stream, +with no additional information before, between, or after them. Each member has the following structure: - +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size | +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ @@ -890,361 +586,178 @@ not allowed in multimember files. All multibyte values are stored in little endian order. 'ID string (the "magic" bytes)' - A four byte string, identifying the lzip format, with the value "LZIP" - (0x4C, 0x5A, 0x49, 0x50). + A four byte string, identifying the lzip format, with the value + "LZIP" (0x4C, 0x5A, 0x49, 0x50). 'VN (version number, 1 byte)' - Just in case something needs to be modified in the future. 1 for now. + Just in case something needs to be modified in the future. 1 for + now. 'DS (coded dictionary size, 1 byte)' The dictionary size is calculated by taking a power of 2 (the base - size) and subtracting from it a fraction between 0/16 and 7/16 of the - base size. + size) and subtracting from it a fraction between 0/16 and 7/16 of + the base size. Bits 4-0 contain the base 2 logarithm of the base size (12 to 29). - Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract - from the base size to obtain the dictionary size. + Bits 7-5 contain the numerator of the fraction (0 to 7) to + subtract from the base size to obtain the dictionary size. Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB Valid values for dictionary size range from 4 KiB to 512 MiB. 'LZMA stream' - The LZMA stream, terminated by an 'End Of Stream' marker. Uses default - values for encoder properties. *Note Stream format: (lzip)Stream + The LZMA stream, finished by an end of stream marker. Uses default + values for encoder properties. *Note Stream format: (lzip)Stream format, for a complete description.
- Lzip only uses the LZMA marker '2' ('End Of Stream' marker). Lzlib - also uses the LZMA marker '3' ('Sync Flush' marker). *Note - sync_flush::. + Lzip only uses the LZMA marker '2' ("End Of Stream" marker). Lzlib + also uses the LZMA marker '3' ("Sync Flush" marker). 'CRC32 (4 bytes)' - Cyclic Redundancy Check (CRC) of the original uncompressed data. + CRC of the uncompressed original data. 'Data size (8 bytes)' - Size of the original uncompressed data. + Size of the uncompressed original data. 'Member size (8 bytes)' - Total size of the member, including header and trailer. This field acts - as a distributed index, improves the checking of stream integrity, and - facilitates the safe recovery of undamaged members from multimember - files. Lzip limits the member size to 2 PiB to prevent the data size - field from overflowing. + Total size of the member, including header and trailer. This field + acts as a distributed index, allows the verification of stream + integrity, and facilitates safe recovery of undamaged members from + multimember files.  -File: lzlib.info, Node: Examples, Next: Problems, Prev: File format, Up: Top +File: lzlib.info, Node: Examples, Next: Problems, Prev: Data format, Up: Top -11 A small tutorial with examples +10 A small tutorial with examples ********************************* -This chapter provides real code examples for the most common uses of the -library. See these examples in context in the files 'bbexample.c' and -'ffexample.c' from the source distribution of lzlib. +This chapter shows the order in which the library functions should be +called depending on what kind of data stream you want to compress or +decompress. See the file 'bbexample.c' in the source distribution for +an example of how buffer-to-buffer compression/decompression can be +implemented using lzlib. - Note that the interface of lzlib is symmetrical. 
That is, the code for -normal compression and decompression is identical except because one calls -LZ_compress* functions while the other calls LZ_decompress* functions. - -* Menu: - -* Buffer compression:: Buffer-to-buffer single-member compression -* Buffer decompression:: Buffer-to-buffer decompression -* File compression:: File-to-file single-member compression -* File decompression:: File-to-file decompression -* File compression mm:: File-to-file multimember compression -* Skipping data errors:: Decompression with automatic resynchronization - - -File: lzlib.info, Node: Buffer compression, Next: Buffer decompression, Up: Examples - -11.1 Buffer compression -======================= - -Buffer-to-buffer single-member compression (MEMBER_SIZE > total output). - -/* Compress 'insize' bytes from 'inbuf' to 'outbuf'. - Return the size of the compressed data in '*outlenp'. - In case of error, or if 'outsize' is too small, return false and do not - modify '*outlenp'. -*/ -bool bbcompress( const uint8_t * const inbuf, const int insize, - const int dictionary_size, const int match_len_limit, - uint8_t * const outbuf, const int outsize, - int * const outlenp ) - { - int inpos = 0, outpos = 0; - bool error = false; - LZ_Encoder * const encoder = - LZ_compress_open( dictionary_size, match_len_limit, INT64_MAX ); - if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) - { LZ_compress_close( encoder ); return false; } - - while( true ) - { - int ret = LZ_compress_write( encoder, inbuf + inpos, insize - inpos ); - if( ret < 0 ) { error = true; break; } - inpos += ret; - if( inpos >= insize ) LZ_compress_finish( encoder ); - ret = LZ_compress_read( encoder, outbuf + outpos, outsize - outpos ); - if( ret < 0 ) { error = true; break; } - outpos += ret; - if( LZ_compress_finished( encoder ) == 1 ) break; - if( outpos >= outsize ) { error = true; break; } - } - - if( LZ_compress_close( encoder ) < 0 ) error = true; - if( error ) return false; - *outlenp = outpos; - return true; 
- } - - -File: lzlib.info, Node: Buffer decompression, Next: File compression, Prev: Buffer compression, Up: Examples - -11.2 Buffer decompression -========================= - -Buffer-to-buffer decompression. - -/* Decompress 'insize' bytes from 'inbuf' to 'outbuf'. - Return the size of the decompressed data in '*outlenp'. - In case of error, or if 'outsize' is too small, return false and do not - modify '*outlenp'. -*/ -bool bbdecompress( const uint8_t * const inbuf, const int insize, - uint8_t * const outbuf, const int outsize, - int * const outlenp ) - { - int inpos = 0, outpos = 0; - bool error = false; - LZ_Decoder * const decoder = LZ_decompress_open(); - if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) - { LZ_decompress_close( decoder ); return false; } - - while( true ) - { - int ret = LZ_decompress_write( decoder, inbuf + inpos, insize - inpos ); - if( ret < 0 ) { error = true; break; } - inpos += ret; - if( inpos >= insize ) LZ_decompress_finish( decoder ); - ret = LZ_decompress_read( decoder, outbuf + outpos, outsize - outpos ); - if( ret < 0 ) { error = true; break; } - outpos += ret; - if( LZ_decompress_finished( decoder ) == 1 ) break; - if( outpos >= outsize ) { error = true; break; } - } - - if( LZ_decompress_close( decoder ) < 0 ) error = true; - if( error ) return false; - *outlenp = outpos; - return true; - } - - -File: lzlib.info, Node: File compression, Next: File decompression, Prev: Buffer decompression, Up: Examples - -11.3 File compression -===================== - -File-to-file compression using LZ_compress_write_size. 
- -int ffcompress( LZ_Encoder * const encoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_compress_finish( encoder ); - } - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_finished( encoder ) == 1 ) return 0; - } - return 1; - } - - -File: lzlib.info, Node: File decompression, Next: File compression mm, Prev: File compression, Up: Examples - -11.4 File decompression -======================= - -File-to-file decompression using LZ_decompress_write_size. - -int ffdecompress( LZ_Decoder * const decoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_decompress_write_size( decoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_decompress_write( decoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_decompress_finish( decoder ); - } - ret = LZ_decompress_read( decoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_decompress_finished( decoder ) == 1 ) return 0; - } - return 1; - } - - -File: lzlib.info, Node: File compression mm, Next: Skipping data errors, Prev: File decompression, Up: Examples - -11.5 File-to-file multimember compression -========================================= - -Example 1: Multimember compression with members of fixed size -(MEMBER_SIZE < total output). 
- -int ffmmcompress( FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384, member_size = 4096 }; - uint8_t buffer[buffer_size]; - bool done = false; - LZ_Encoder * const encoder = LZ_compress_open( 65535, 16, member_size ); - if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) - { fputs( "ffexample: Not enough memory.\n", stderr ); - LZ_compress_close( encoder ); return 1; } - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_compress_finish( encoder ); - } - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_member_finished( encoder ) == 1 ) - { - if( LZ_compress_finished( encoder ) == 1 ) { done = true; break; } - if( LZ_compress_restart_member( encoder, member_size ) < 0 ) break; - } - } - if( LZ_compress_close( encoder ) < 0 ) done = false; - return done; - } + Note that lzlib's interface is symmetrical. That is, the code for +normal compression and decompression is identical except because one +calls LZ_compress* functions while the other calls LZ_decompress* +functions. -Example 2: Multimember compression (user-restarted members). (Call -LZ_compress_open with MEMBER_SIZE > largest member). +Example 1: Normal compression (MEMBER_SIZE > total output). -/* Compress 'infile' to 'outfile' as a multimember stream with one member - for each line of text terminated by a newline character or by EOF. - Return 0 if success, 1 if error. 
-*/ -int fflfcompress( LZ_Encoder * const encoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - for( len = 0; len < size; ) - { - int ch = getc( infile ); - if( ch == EOF || ( buffer[len++] = ch ) == '\n' ) break; - } - /* avoid writing an empty member to outfile */ - if( len == 0 && LZ_compress_data_position( encoder ) == 0 ) return 0; - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) || buffer[len-1] == '\n' ) - LZ_compress_finish( encoder ); - } - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_member_finished( encoder ) == 1 ) - { - if( feof( infile ) && LZ_compress_finished( encoder ) == 1 ) return 0; - if( LZ_compress_restart_member( encoder, INT64_MAX ) < 0 ) break; - } - } - return 1; - } + 1) LZ_compress_open + 2) LZ_compress_write + 3) LZ_compress_read + 4) go back to step 2 until all input data have been written + 5) LZ_compress_finish + 6) LZ_compress_read + 7) go back to step 6 until LZ_compress_finished returns 1 + 8) LZ_compress_close - -File: lzlib.info, Node: Skipping data errors, Prev: File compression mm, Up: Examples -11.6 Skipping data errors -========================= +Example 2: Normal compression using LZ_compress_write_size. -/* Decompress 'infile' to 'outfile' with automatic resynchronization to - next member in case of data error, including the automatic removal of - leading garbage. 
-*/ -int ffrsdecompress( LZ_Decoder * const decoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_decompress_write_size( decoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_decompress_write( decoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_decompress_finish( decoder ); - } - ret = LZ_decompress_read( decoder, buffer, buffer_size ); - if( ret < 0 ) - { - if( LZ_decompress_errno( decoder ) == LZ_header_error || - LZ_decompress_errno( decoder ) == LZ_data_error ) - { LZ_decompress_sync_to_member( decoder ); continue; } - break; - } - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_decompress_finished( decoder ) == 1 ) return 0; - } - return 1; - } + 1) LZ_compress_open + 2) go to step 5 if LZ_compress_write_size returns 0 + 3) LZ_compress_write + 4) if no more data to write, call LZ_compress_finish + 5) LZ_compress_read + 6) go back to step 2 until LZ_compress_finished returns 1 + 7) LZ_compress_close + + +Example 3: Decompression. + + 1) LZ_decompress_open + 2) LZ_decompress_write + 3) LZ_decompress_read + 4) go back to step 2 until all input data have been written + 5) LZ_decompress_finish + 6) LZ_decompress_read + 7) go back to step 6 until LZ_decompress_finished returns 1 + 8) LZ_decompress_close + + +Example 4: Decompression using LZ_decompress_write_size. + + 1) LZ_decompress_open + 2) go to step 5 if LZ_decompress_write_size returns 0 + 3) LZ_decompress_write + 4) if no more data to write, call LZ_decompress_finish + 5) LZ_decompress_read + 5a) optionally, if LZ_decompress_member_finished returns 1, read + final values for member with LZ_decompress_data_crc, etc. + 6) go back to step 2 until LZ_decompress_finished returns 1 + 7) LZ_decompress_close + + +Example 5: Multimember compression (MEMBER_SIZE < total output). 
+ + 1) LZ_compress_open + 2) go to step 5 if LZ_compress_write_size returns 0 + 3) LZ_compress_write + 4) if no more data to write, call LZ_compress_finish + 5) LZ_compress_read + 6) go back to step 2 until LZ_compress_member_finished returns 1 + 7) go to step 10 if LZ_compress_finished() returns 1 + 8) LZ_compress_restart_member + 9) go back to step 2 + 10) LZ_compress_close + + +Example 6: Multimember compression (user-restarted members). + + 1) LZ_compress_open + 2) LZ_compress_write + 3) LZ_compress_read + 4) go back to step 2 until member termination is desired + 5) LZ_compress_finish + 6) LZ_compress_read + 7) go back to step 6 until LZ_compress_member_finished returns 1 + 8) verify that LZ_compress_finished returns 1 + 9) go to step 12 if all input data have been written + 10) LZ_compress_restart_member + 11) go back to step 2 + 12) LZ_compress_close + + +Example 7: Decompression with automatic removal of leading data. + + 1) LZ_decompress_open + 2) LZ_decompress_sync_to_member + 3) go to step 6 if LZ_decompress_write_size returns 0 + 4) LZ_decompress_write + 5) if no more data to write, call LZ_decompress_finish + 6) LZ_decompress_read + 7) go back to step 3 until LZ_decompress_finished returns 1 + 8) LZ_decompress_close + + +Example 8: Streamed decompression with automatic resynchronization to +next member in case of data error. + + 1) LZ_decompress_open + 2) go to step 5 if LZ_decompress_write_size returns 0 + 3) LZ_decompress_write + 4) if no more data to write, call LZ_decompress_finish + 5) if LZ_decompress_read produces LZ_header_error or LZ_data_error, + call LZ_decompress_sync_to_member + 6) go back to step 2 until LZ_decompress_finished returns 1 + 7) LZ_decompress_close  File: lzlib.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top -12 Reporting bugs +11 Reporting bugs ***************** -There are probably bugs in lzlib. There are certainly errors and omissions -in this manual. If you report them, they will get fixed. 
If you don't, no -one will ever know about them and they will remain unfixed for all -eternity, if not longer. +There are probably bugs in lzlib. There are certainly errors and +omissions in this manual. If you report them, they will get fixed. If +you don't, no one will ever know about them and they will remain unfixed +for all eternity, if not longer. If you find a bug in lzlib, please send electronic mail to -. Include the version number, which you can find by -running 'minilzip --version' and 'minilzip -v --check-lib'. +. Include the version number, which you can find +by running 'minilzip --version' or in 'LZ_version_string' from +'lzlib.h'.  File: lzlib.info, Node: Concept index, Prev: Problems, Up: Top @@ -1255,53 +768,36 @@ Concept index [index] * Menu: -* buffer compression: Buffer compression. (line 6) -* buffer decompression: Buffer decompression. (line 6) -* buffering: Buffering. (line 6) -* bugs: Problems. (line 6) -* compression functions: Compression functions. (line 6) -* decompression functions: Decompression functions. (line 6) -* error codes: Error codes. (line 6) -* error messages: Error messages. (line 6) -* examples: Examples. (line 6) -* file compression: File compression. (line 6) -* file decompression: File decompression. (line 6) -* file format: File format. (line 6) -* getting help: Problems. (line 6) -* introduction: Introduction. (line 6) -* invoking: Invoking minilzip. (line 6) -* library version: Library version. (line 6) -* multimember compression: File compression mm. (line 6) -* options: Invoking minilzip. (line 6) -* parameter limits: Parameter limits. (line 6) -* skipping data errors: Skipping data errors. (line 6) +* buffering: Buffering. (line 6) +* bugs: Problems. (line 6) +* compression functions: Compression functions. (line 6) +* data format: Data format. (line 6) +* decompression functions: Decompression functions. + (line 6) +* error codes: Error codes. (line 6) +* error messages: Error messages. 
(line 6) +* examples: Examples. (line 6) +* getting help: Problems. (line 6) +* introduction: Introduction. (line 6) +* library version: Library version. (line 6) +* parameter limits: Parameter limits. (line 6)  Tag Table: -Node: Top215 -Node: Introduction1337 -Node: Library version5500 -Node: Buffering8051 -Node: Parameter limits9276 -Node: Compression functions10230 -Ref: member_size12006 -Ref: sync_flush13747 -Node: Decompression functions18313 -Node: Error codes25700 -Node: Error messages28045 -Node: Invoking minilzip28628 -Node: File format39710 -Ref: coded-dict-size41208 -Node: Examples42615 -Node: Buffer compression43576 -Node: Buffer decompression45089 -Node: File compression46496 -Node: File decompression47472 -Node: File compression mm48469 -Node: Skipping data errors51480 -Node: Problems52778 -Node: Concept index53339 +Node: Top220 +Node: Introduction1301 +Node: Library version5918 +Node: Buffering6563 +Node: Parameter limits7783 +Node: Compression functions8742 +Node: Decompression functions15282 +Node: Error codes21450 +Node: Error messages23425 +Node: Data format24004 +Node: Examples26569 +Node: Problems30650 +Node: Concept index31222  End Tag Table diff --git a/doc/lzlib.texi b/doc/lzlib.texi index 3e15079..bc3b9fe 100644 --- a/doc/lzlib.texi +++ b/doc/lzlib.texi @@ -6,10 +6,10 @@ @finalout @c %**end of header -@set UPDATED 9 January 2025 -@set VERSION 1.15 +@set UPDATED 17 May 2016 +@set VERSION 1.8 -@dircategory Compression +@dircategory Data Compression @direntry * Lzlib: (lzlib). Compression library for the lzip format @end direntry @@ -29,178 +29,154 @@ @contents @end ifnothtml -@ifnottex @node Top @top This manual is for Lzlib (version @value{VERSION}, @value{UPDATED}). 
@menu -* Introduction:: Purpose and features of lzlib -* Library version:: Checking library version -* Buffering:: Sizes of lzlib's buffers -* Parameter limits:: Min / max values for some parameters -* Compression functions:: Descriptions of the compression functions -* Decompression functions:: Descriptions of the decompression functions -* Error codes:: Meaning of codes returned by functions -* Error messages:: Error messages corresponding to error codes -* Invoking minilzip:: Command-line interface of the test program -* File format:: Detailed format of the compressed file -* Examples:: A small tutorial with examples -* Problems:: Reporting bugs -* Concept index:: Index of concepts +* Introduction:: Purpose and features of lzlib +* Library version:: Checking library version +* Buffering:: Sizes of lzlib's buffers +* Parameter limits:: Min / max values for some parameters +* Compression functions:: Descriptions of the compression functions +* Decompression functions:: Descriptions of the decompression functions +* Error codes:: Meaning of codes returned by functions +* Error messages:: Error messages corresponding to error codes +* Data format:: Detailed format of the compressed data +* Examples:: A small tutorial with examples +* Problems:: Reporting bugs +* Concept index:: Index of concepts @end menu @sp 1 -Copyright @copyright{} 2009-2025 Antonio Diaz Diaz. +Copyright @copyright{} 2009-2016 Antonio Diaz Diaz. -This manual is free documentation: you have unlimited permission to copy, -distribute, and modify it. -@end ifnottex +This manual is free documentation: you have unlimited permission +to copy, distribute and modify it. @node Introduction @chapter Introduction @cindex introduction -@uref{http://www.nongnu.org/lzip/lzlib.html,,Lzlib} -is a data compression library providing in-memory LZMA compression and -decompression functions, including integrity checking of the decompressed -data. 
The compressed data format used by the library is the -@uref{http://www.nongnu.org/lzip/lzip.html,,lzip} format. -Lzlib is written in C and is distributed under a 2-clause BSD license. +Lzlib is a data compression library providing in-memory LZMA compression +and decompression functions, including integrity checking of the +decompressed data. The compressed data format used by the library is the +lzip format. Lzlib is written in C. -The functions and variables forming the interface of the compression library -are declared in the file @file{lzlib.h}. Usage examples of the library are -given in the files @file{bbexample.c}, @file{ffexample.c}, and -@file{minilzip.c} from the source distribution. +The lzip file format is designed for data sharing and long-term +archiving, taking into account both data integrity and decoder +availability: -As @file{lzlib.h} can be used in C and C++ programs, it must not impose a -choice of system headers on the program by including one of them. Therefore -it is the responsibility of the program using lzlib to include before -@file{lzlib.h} some header that declares the type @samp{uint8_t}. There are -at least four such headers in C and C++: @file{stdint.h}, @file{cstdint}, -@file{inttypes.h}, and @file{cinttypes}. +@itemize @bullet +@item +The lzip format provides very safe integrity checking and some data +recovery means. The +@uref{http://www.nongnu.org/lzip/manual/lziprecover_manual.html#Data-safety,,lziprecover} +program can repair bit-flip errors (one of the most common forms of data +corruption) in lzip files, and provides data recovery capabilities, +including error-checked merging of damaged copies of a file. +@ifnothtml +@xref{Data safety,,,lziprecover}. +@end ifnothtml -All the library functions are thread safe. The library does not install any -signal handler. The decoder checks the consistency of the compressed data, -so the library should never crash even in case of corrupted input. 
+@item +The lzip format is as simple as possible (but not simpler). The lzip +manual provides the code of a simple decompressor along with a detailed +explanation of how it works, so that with the only help of the lzip +manual it would be possible for a digital archaeologist to extract the +data from a lzip file long after quantum computers eventually render +LZMA obsolete. + +@item +Additionally the lzip reference implementation is copylefted, which +guarantees that it will remain free forever. +@end itemize + +A nice feature of the lzip format is that a corrupt byte is easier to +repair the nearer it is from the beginning of the file. Therefore, with +the help of lziprecover, losing an entire archive just because of a +corrupt byte near the beginning is a thing of the past. + +The functions and variables forming the interface of the compression +library are declared in the file @samp{lzlib.h}. Usage examples of the +library are given in the files @samp{main.c} and @samp{bbexample.c} from +the source distribution. Compression/decompression is done by repeatedly calling a couple of -read/write functions until all the data have been processed by the library. -This interface is safer and less error prone than the traditional zlib -interface. +read/write functions until all the data have been processed by the +library. This interface is safer and less error prone than the +traditional zlib interface. Compression/decompression is done when the read function is called. This -means the value returned by the position functions is not updated until a -read call, even if a lot of data are written. If you want the data to be -compressed in advance, just call the read function with a @var{size} equal -to 0. +means the value returned by the position functions will not be updated +until a read call, even if a lot of data is written. If you want the +data to be compressed in advance, just call the read function with a +@var{size} equal to 0. 
-If all the data to be compressed are written in advance, lzlib automatically -adjusts the header of the compressed data to use the largest dictionary size -that does not exceed neither the data size nor the limit given to -@samp{LZ_compress_open}. This feature reduces the amount of memory needed for -decompression and allows minilzip to produce identical compressed output as -lzip. +If all the data to be compressed are written in advance, lzlib will +automatically adjust the header of the compressed data to use the +smallest possible dictionary size. This feature reduces the amount of +memory needed for decompression and allows minilzip to produce identical +compressed output as lzip. -Lzlib correctly decompresses a data stream which is the concatenation of -two or more compressed data streams. The result is the concatenation of the -corresponding decompressed data streams. Integrity testing of concatenated -compressed data streams is also supported. +Lzlib will correctly decompress a data stream which is the concatenation +of two or more compressed data streams. The result is the concatenation +of the corresponding decompressed data streams. Integrity testing of +concatenated compressed data streams is also supported. -Lzlib is able to compress and decompress streams of unlimited size by -automatically creating multimember output. The members so created are large, -about @w{2 PiB} each. +All the library functions are thread safe. The library does not install +any signal handler. The decoder checks the consistency of the compressed +data, so the library should never crash even in case of corrupted input. In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a concrete algorithm; it is more like "any algorithm using the LZMA coding -scheme". For example, the option @option{-0} of lzip uses the scheme in -almost the simplest way possible; issuing the longest match it can find, or -a literal byte if it can't find a match. 
Inversely, a more elaborate way of -finding coding sequences of minimum size than the one currently used by lzip -could be developed, and the resulting sequence could also be coded using the -LZMA coding scheme. +scheme". For example, the option @samp{-0} of lzip uses the scheme in almost +the simplest way possible; issuing the longest match it can find, or a +literal byte if it can't find a match. Inversely, a much more elaborated +way of finding coding sequences of minimum size than the one currently +used by lzip could be developed, and the resulting sequence could also +be coded using the LZMA coding scheme. -Lzlib currently implements two variants of the LZMA algorithm: fast (used by -option @option{-0} of minilzip) and normal (used by all other compression levels). +Lzlib currently implements two variants of the LZMA algorithm; fast +(used by option @samp{-0} of minilzip) and normal (used by all other +compression levels). The high compression of LZMA comes from combining two basic, well-proven -compression ideas: sliding dictionaries (LZ77) and Markov models (the thing -used by every compression algorithm that uses a range encoder or similar -order-0 entropy coder as its last stage) with segregation of contexts -according to what the bits are used for. +compression ideas: sliding dictionaries (LZ77/78) and markov models (the +thing used by every compression algorithm that uses a range encoder or +similar order-0 entropy coder as its last stage) with segregation of +contexts according to what the bits are used for. The ideas embodied in lzlib are due to (at least) the following people: -Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the -definition of Markov chains), G.N.N. Martin (for the definition of range -encoding), Igor Pavlov (for putting all the above together in LZMA), and -Julian Seward (for bzip2's CLI). - -LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have -been compressed. 
Decompressed is used to refer to data which have undergone -the process of decompression. +Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for +the definition of Markov chains), G.N.N. Martin (for the definition of +range encoding), Igor Pavlov (for putting all the above together in +LZMA), and Julian Seward (for bzip2's CLI). @node Library version @chapter Library version @cindex library version -One goal of lzlib is to keep perfect backward compatibility with older -versions of itself down to 1.0. Any application working with an older lzlib -should work with a newer lzlib. Installing a newer lzlib should not break -anything. This chapter describes the constants and functions that the -application can use to discover the version of the library being used. All -of them are declared in @file{lzlib.h}. - -@defvr Constant LZ_API_VERSION -This constant is defined in @file{lzlib.h} and works as a version test -macro. The application should check at compile time that LZ_API_VERSION is -greater than or equal to the version required by the application: - -@example -#if !defined LZ_API_VERSION || LZ_API_VERSION < 1012 -#error "lzlib 1.12 or newer needed." -#endif -@end example - -Before version 1.8, lzlib didn't define LZ_API_VERSION.@* -LZ_API_VERSION was first defined in lzlib 1.8 to 1.@* -Since lzlib 1.12, LZ_API_VERSION is defined as (major * 1000 + minor). -@end defvr - -NOTE: Version test macros are the library's way of announcing functionality -to the application. They should not be confused with feature test macros, -which allow the application to announce to the library its desire to have -certain symbols and prototypes exposed. - -@deftypefun int LZ_api_version ( void ) -If LZ_API_VERSION >= 1012, this function is declared in @file{lzlib.h} (else -it doesn't exist). It returns the LZ_API_VERSION of the library object code -being used. 
The application should check at run time that the value -returned by @code{LZ_api_version} is greater than or equal to the version -required by the application. An application may be dynamically linked at run -time with a different version of lzlib than the one it was compiled for, and -this should not break the application as long as the library used provides -the functionality required by the application. - -@example -#if defined LZ_API_VERSION && LZ_API_VERSION >= 1012 - if( LZ_api_version() < 1012 ) - show_error( "lzlib 1.12 or newer needed." ); -#endif -@end example +@deftypefun {const char *} LZ_version ( void ) +Returns the library version as a string. @end deftypefun @deftypevr Constant {const char *} LZ_version_string -This string constant is defined in the header file @file{lzlib.h} and -represents the version of the library being used at compile time. +This constant is defined in the header file @samp{lzlib.h}. @end deftypevr -@deftypefun {const char *} LZ_version ( void ) -This function returns a string representing the version of the library being -used at run time. -@end deftypefun +The application should compare LZ_version and LZ_version_string for +consistency. If the first character differs, the library code actually +used may be incompatible with the @samp{lzlib.h} header file used by the +application. + +@example +if( LZ_version()[0] != LZ_version_string[0] ) + error( "bad library version" ); +@end example @node Buffering @@ -214,23 +190,26 @@ dictionary size. Finally, for safety reasons, lzlib uses two more internal buffers. -These are the four buffers used by lzlib, and their guaranteed minimum sizes: +These are the four buffers used by lzlib, and their guaranteed minimum +sizes: @itemize @bullet -@item Input compression buffer. Written to by the function -@samp{LZ_compress_write}. For the normal variant of LZMA, its size is two -times the dictionary size set with the function @samp{LZ_compress_open} or -@w{64 KiB}, whichever is larger. 
For the fast variant, its size is @w{1 MiB}. +@item Input compression buffer. Written to by the +@samp{LZ_compress_write} function. For the normal variant of LZMA, its +size is two times the dictionary size set with the +@samp{LZ_compress_open} function or 64 KiB, whichever is larger. For the +fast variant, its size is 1 MiB. -@item Output compression buffer. Read from by the function -@samp{LZ_compress_read}. Its size is @w{64 KiB}. +@item Output compression buffer. Read from by the +@samp{LZ_compress_read} function. Its size is 64 KiB. -@item Input decompression buffer. Written to by the function -@samp{LZ_decompress_write}. Its size is @w{64 KiB}. +@item Input decompression buffer. Written to by the +@samp{LZ_decompress_write} function. Its size is 64 KiB. -@item Output decompression buffer. Read from by the function -@samp{LZ_decompress_read}. Its size is the dictionary size set in the header -of the member currently being decompressed or @w{64 KiB}, whichever is larger. +@item Output decompression buffer. Read from by the +@samp{LZ_decompress_read} function. Its size is the dictionary size set +in the header of the member currently being decompressed or 64 KiB, +whichever is larger. @end itemize @@ -272,176 +251,149 @@ Returns the largest valid match length limit [273]. These are the functions used to compress data. In case of error, all of them return -1 or 0, for signed and unsigned return values respectively, -except @samp{LZ_compress_open} whose return value must be checked by +except @samp{LZ_compress_open} whose return value must be verified by calling @samp{LZ_compress_errno} before using it. 
-@deftypefun {LZ_Encoder *} LZ_compress_open ( const int @var{dictionary_size}, const int @var{match_len_limit}, const unsigned long long @var{member_size} ) +@deftypefun {struct LZ_Encoder *} LZ_compress_open ( const int @var{dictionary_size}, const int @var{match_len_limit}, const unsigned long long @var{member_size} ) Initializes the internal stream state for compression and returns a pointer that can only be used as the @var{encoder} argument for the other LZ_compress functions, or a null pointer if the encoder could not be allocated. -The returned pointer must be checked by calling @samp{LZ_compress_errno} -before using it. If @samp{LZ_compress_errno} does not return @samp{LZ_ok}, -the returned pointer must not be used and should be freed with -@samp{LZ_compress_close} to avoid memory leaks. +The returned pointer must be verified by calling +@samp{LZ_compress_errno} before using it. If @samp{LZ_compress_errno} +does not return @samp{LZ_ok}, the returned pointer must not be used and +should be freed with @samp{LZ_compress_close} to avoid memory leaks. @var{dictionary_size} sets the dictionary size to be used, in bytes. -Valid values range from @w{4 KiB} to @w{512 MiB}. Note that dictionary -sizes are quantized. If the size specified does not match one of the -valid sizes, it is rounded upwards by adding up to -@w{(@var{dictionary_size} / 8)} to it. +Valid values range from 4 KiB to 512 MiB. Note that dictionary sizes are +quantized. If the specified size does not match one of the valid sizes, +it will be rounded upwards by adding up to (@var{dictionary_size} / 8) +to it. @var{match_len_limit} sets the match length limit in bytes. Valid values -range from 5 to 273. Larger values usually give better compression ratios -but longer compression times. +range from 5 to 273. Larger values usually give better compression +ratios but longer compression times. 
-If @var{dictionary_size} is 65535 and @var{match_len_limit} is 16, the fast -variant of LZMA is chosen, which produces identical compressed output as -@w{@samp{lzip -0}}. (The dictionary size used is rounded upwards to -@w{64 KiB}). +If @var{dictionary_size} is 65535 and @var{match_len_limit} is 16, the +fast variant of LZMA is chosen, which produces identical compressed +output as @code{lzip -0}. (The dictionary size used will be rounded +upwards to 64 KiB). -@anchor{member_size} -@var{member_size} sets the member size limit in bytes. Valid values range -from @w{4 KiB} to @w{2 PiB}. A small member size may degrade compression -ratio, so use it only when needed. To produce a single-member data stream, -give @var{member_size} a value larger than the amount of data to be -produced. Values larger than @w{2 PiB} are reduced to @w{2 PiB} to prevent -the uncompressed size of the member from overflowing. +@var{member_size} sets the member size limit in bytes. Minimum member +size limit is 100 kB. Small member size may degrade compression ratio, so +use it only when needed. To produce a single-member data stream, give +@var{member_size} a value larger than the amount of data to be produced, +for example INT64_MAX. @end deftypefun -@deftypefun int LZ_compress_close ( LZ_Encoder * const @var{encoder} ) +@deftypefun int LZ_compress_close ( struct LZ_Encoder * const @var{encoder} ) Frees all dynamically allocated data structures for this stream. This function discards any unprocessed input and does not flush any pending output. After a call to @samp{LZ_compress_close}, @var{encoder} can no -longer be used as an argument to any LZ_compress function. -It is safe to call @samp{LZ_compress_close} with a null argument. +more be used as an argument to any LZ_compress function. 
@end deftypefun -@deftypefun int LZ_compress_finish ( LZ_Encoder * const @var{encoder} ) +@deftypefun int LZ_compress_finish ( struct LZ_Encoder * const @var{encoder} ) Use this function to tell @samp{lzlib} that all the data for this member -have already been written (with the function @samp{LZ_compress_write}). -It is safe to call @samp{LZ_compress_finish} as many times as needed. -After all the compressed data have been read with @samp{LZ_compress_read} -and @samp{LZ_compress_member_finished} returns 1, a new member can be -started with @samp{LZ_compress_restart_member}. +have already been written (with the @samp{LZ_compress_write} function). +After all the produced compressed data have been read with +@samp{LZ_compress_read} and @samp{LZ_compress_member_finished} returns +1, a new member can be started with @samp{LZ_compress_restart_member}. @end deftypefun -@deftypefun int LZ_compress_restart_member ( LZ_Encoder * const @var{encoder}, const unsigned long long @var{member_size} ) -Use this function to start a new member in a multimember data stream. Call -this function only after @samp{LZ_compress_member_finished} indicates that -the current member has been fully read (with the function -@samp{LZ_compress_read}). @xref{member_size}, for a description of -@var{member_size}. +@deftypefun int LZ_compress_restart_member ( struct LZ_Encoder * const @var{encoder}, const unsigned long long @var{member_size} ) +Use this function to start a new member in a multimember data stream. +Call this function only after @samp{LZ_compress_member_finished} +indicates that the current member has been fully read (with the +@samp{LZ_compress_read} function). @end deftypefun -@anchor{sync_flush} -@deftypefun int LZ_compress_sync_flush ( LZ_Encoder * const @var{encoder} ) -Use this function to make available to @samp{LZ_compress_read} all the data -already written with the function @samp{LZ_compress_write}. First call -@samp{LZ_compress_sync_flush}. 
Then call @samp{LZ_compress_read} until it -returns 0. - -This function writes at least one LZMA marker @samp{3} ('Sync Flush' marker) -to the compressed output. Note that the sync flush marker is not allowed in -lzip files; it is a device for interactive communication between -applications using lzlib, but is useless and wasteful in a file, and is -excluded from the media type @samp{application/lzip}. The LZMA marker -@samp{2} ('End Of Stream' marker) is the only marker allowed in lzip files. -@xref{File format}. +@deftypefun int LZ_compress_sync_flush ( struct LZ_Encoder * const @var{encoder} ) +Use this function to make available to @samp{LZ_compress_read} all the +data already written with the @samp{LZ_compress_write} function. First +call @samp{LZ_compress_sync_flush}. Then call @samp{LZ_compress_read} +until it returns 0. Repeated use of @samp{LZ_compress_sync_flush} may degrade compression -ratio, so use it only when needed. If the interval between calls to -@samp{LZ_compress_sync_flush} is large (comparable to dictionary size), -creating a multimember data stream with @samp{LZ_compress_restart_member} -may be an alternative. - -Combining multimember stream creation with flushing may be tricky. If there -are more bytes available than those needed to complete @var{member_size}, -@samp{LZ_compress_restart_member} needs to be called when -@samp{LZ_compress_member_finished} returns 1, followed by a new call to -@samp{LZ_compress_sync_flush}. +ratio, so use it only when needed. @end deftypefun -@deftypefun int LZ_compress_read ( LZ_Encoder * const @var{encoder}, uint8_t * const @var{buffer}, const int @var{size} ) -Reads up to @var{size} bytes from the stream pointed to by @var{encoder}, -storing the results in @var{buffer}. If @w{LZ_API_VERSION >= 1012}, -@var{buffer} may be a null pointer, in which case the bytes read are -discarded. 
+@deftypefun int LZ_compress_read ( struct LZ_Encoder * const @var{encoder}, uint8_t * const @var{buffer}, const int @var{size} ) +The @samp{LZ_compress_read} function reads up to @var{size} bytes from +the stream pointed to by @var{encoder}, storing the results in +@var{buffer}. -Returns the number of bytes actually read. This might be less than -@var{size}; for example, if there aren't that many bytes left in the stream -or if more bytes have to be yet written with the function -@samp{LZ_compress_write}. Note that reading less than @var{size} bytes is +The return value is the number of bytes actually read. This might be +less than @var{size}; for example, if there aren't that many bytes left +in the stream or if more bytes have to be yet written with the +@samp{LZ_compress_write} function. Note that reading less than +@var{size} bytes is not an error. +@end deftypefun + + +@deftypefun int LZ_compress_write ( struct LZ_Encoder * const @var{encoder}, uint8_t * const @var{buffer}, const int @var{size} ) +The @samp{LZ_compress_write} function writes up to @var{size} bytes from +@var{buffer} to the stream pointed to by @var{encoder}. + +The return value is the number of bytes actually written. This might be +less than @var{size}. Note that writing less than @var{size} bytes is not an error. @end deftypefun -@deftypefun int LZ_compress_write ( LZ_Encoder * const @var{encoder}, uint8_t * const @var{buffer}, const int @var{size} ) -Writes up to @var{size} bytes from @var{buffer} to the stream pointed to by -@var{encoder}. Returns the number of bytes actually written. This might be -less than @var{size}. Note that writing less than @var{size} bytes is not an -error. -@end deftypefun - - -@deftypefun int LZ_compress_write_size ( LZ_Encoder * const @var{encoder} ) -Returns the maximum number of bytes that can be immediately written through -@samp{LZ_compress_write}. 
For efficiency reasons, once the input buffer is -full and @samp{LZ_compress_write_size} returns 0, almost all the buffer must -be compressed before a size greater than 0 is returned again. (This is done -to minimize the amount of data that must be copied to the beginning of the -buffer before new data can be accepted). +@deftypefun int LZ_compress_write_size ( struct LZ_Encoder * const @var{encoder} ) +The @samp{LZ_compress_write_size} function returns the maximum number of +bytes that can be immediately written through the @samp{LZ_compress_write} +function. It is guaranteed that an immediate call to @samp{LZ_compress_write} will accept a @var{size} up to the returned number of bytes. @end deftypefun -@deftypefun {LZ_Errno} LZ_compress_errno ( LZ_Encoder * const @var{encoder} ) -Returns the current error code for @var{encoder}. @xref{Error codes}. -It is safe to call @samp{LZ_compress_errno} with a null argument, in which -case it returns @samp{LZ_bad_argument}. +@deftypefun {enum LZ_Errno} LZ_compress_errno ( struct LZ_Encoder * const @var{encoder} ) +Returns the current error code for @var{encoder} (@pxref{Error codes}). @end deftypefun -@deftypefun int LZ_compress_finished ( LZ_Encoder * const @var{encoder} ) +@deftypefun int LZ_compress_finished ( struct LZ_Encoder * const @var{encoder} ) Returns 1 if all the data have been read and @samp{LZ_compress_close} -can be safely called. Otherwise it returns 0. @samp{LZ_compress_finished} -implies @samp{LZ_compress_member_finished}. +can be safely called. Otherwise it returns 0. @end deftypefun -@deftypefun int LZ_compress_member_finished ( LZ_Encoder * const @var{encoder} ) +@deftypefun int LZ_compress_member_finished ( struct LZ_Encoder * const @var{encoder} ) Returns 1 if the current member, in a multimember data stream, has been fully read and @samp{LZ_compress_restart_member} can be safely called. Otherwise it returns 0. 
@end deftypefun -@deftypefun {unsigned long long} LZ_compress_data_position ( LZ_Encoder * const @var{encoder} ) -Returns the number of input bytes already compressed in the current member. +@deftypefun {unsigned long long} LZ_compress_data_position ( struct LZ_Encoder * const @var{encoder} ) +Returns the number of input bytes already compressed in the current +member. @end deftypefun -@deftypefun {unsigned long long} LZ_compress_member_position ( LZ_Encoder * const @var{encoder} ) +@deftypefun {unsigned long long} LZ_compress_member_position ( struct LZ_Encoder * const @var{encoder} ) Returns the number of compressed bytes already produced, but perhaps not yet read, in the current member. @end deftypefun -@deftypefun {unsigned long long} LZ_compress_total_in_size ( LZ_Encoder * const @var{encoder} ) +@deftypefun {unsigned long long} LZ_compress_total_in_size ( struct LZ_Encoder * const @var{encoder} ) Returns the total number of input bytes already compressed. @end deftypefun -@deftypefun {unsigned long long} LZ_compress_total_out_size ( LZ_Encoder * const @var{encoder} ) +@deftypefun {unsigned long long} LZ_compress_total_out_size ( struct LZ_Encoder * const @var{encoder} ) Returns the total number of compressed bytes already produced, but perhaps not yet read. @end deftypefun @@ -451,172 +403,149 @@ perhaps not yet read. @chapter Decompression functions @cindex decompression functions -These are the functions used to decompress data. In case of error, all of -them return -1 or 0, for signed and unsigned return values respectively, -except @samp{LZ_decompress_open} whose return value must be checked by -calling @samp{LZ_decompress_errno} before using it. +These are the functions used to decompress data. In case of error, all +of them return -1 or 0, for signed and unsigned return values +respectively, except @samp{LZ_decompress_open} whose return value must +be verified by calling @samp{LZ_decompress_errno} before using it. 
-@deftypefun {LZ_Decoder *} LZ_decompress_open ( void ) +@deftypefun {struct LZ_Decoder *} LZ_decompress_open ( void ) Initializes the internal stream state for decompression and returns a -pointer that can only be used as the @var{decoder} argument for the other -LZ_decompress functions, or a null pointer if the decoder could not be -allocated. +pointer that can only be used as the @var{decoder} argument for the +other LZ_decompress functions, or a null pointer if the decoder could +not be allocated. -The returned pointer must be checked by calling @samp{LZ_decompress_errno} -before using it. If @samp{LZ_decompress_errno} does not return @samp{LZ_ok}, -the returned pointer must not be used and should be freed with +The returned pointer must be verified by calling +@samp{LZ_decompress_errno} before using it. If +@samp{LZ_decompress_errno} does not return @samp{LZ_ok}, the returned +pointer must not be used and should be freed with @samp{LZ_decompress_close} to avoid memory leaks. @end deftypefun -@deftypefun int LZ_decompress_close ( LZ_Decoder * const @var{decoder} ) +@deftypefun int LZ_decompress_close ( struct LZ_Decoder * const @var{decoder} ) Frees all dynamically allocated data structures for this stream. This function discards any unprocessed input and does not flush any pending output. After a call to @samp{LZ_decompress_close}, @var{decoder} can no -longer be used as an argument to any LZ_decompress function. -It is safe to call @samp{LZ_decompress_close} with a null argument. +more be used as an argument to any LZ_decompress function. @end deftypefun -@deftypefun int LZ_decompress_finish ( LZ_Decoder * const @var{decoder} ) +@deftypefun int LZ_decompress_finish ( struct LZ_Decoder * const @var{decoder} ) Use this function to tell @samp{lzlib} that all the data for this stream -have already been written (with the function @samp{LZ_decompress_write}). -It is safe to call @samp{LZ_decompress_finish} as many times as needed. 
-It is not required to call @samp{LZ_decompress_finish} if the input stream -only contains whole members, but not calling it prevents lzlib from -detecting a truncated member. +have already been written (with the @samp{LZ_decompress_write} function). @end deftypefun -@deftypefun int LZ_decompress_reset ( LZ_Decoder * const @var{decoder} ) +@deftypefun int LZ_decompress_reset ( struct LZ_Decoder * const @var{decoder} ) Resets the internal state of @var{decoder} as it was just after opening -it with the function @samp{LZ_decompress_open}. Data stored in the -internal buffers are discarded. Position counters are set to 0. +it with the @samp{LZ_decompress_open} function. Data stored in the +internal buffers is discarded. Position counters are set to 0. @end deftypefun -@deftypefun int LZ_decompress_sync_to_member ( LZ_Decoder * const @var{decoder} ) -Resets the error state of @var{decoder} and enters a search state that lasts -until a new member header (or the end of the stream) is found. After a -successful call to @samp{LZ_decompress_sync_to_member}, data written with -@samp{LZ_decompress_write} is consumed and @samp{LZ_decompress_read} returns -0 until a header is found. +@deftypefun int LZ_decompress_sync_to_member ( struct LZ_Decoder * const @var{decoder} ) +Resets the error state of @var{decoder} and enters a search state that +lasts until a new member header (or the end of the stream) is found. +After a successful call to @samp{LZ_decompress_sync_to_member}, data +written with @samp{LZ_decompress_write} will be consumed and +@samp{LZ_decompress_read} will return 0 until a header is found. -This function is useful to discard any data preceding the first member, or -to discard the rest of the current member, for example in case of a data -error. If the decoder is already at the beginning of a member, this function -does nothing. 
+This function is useful to discard any data preceding the first member, +or to discard the rest of the current member, for example in case of a +data error. If the decoder is already at the beginning of a member, this +function does nothing. @end deftypefun -@deftypefun int LZ_decompress_read ( LZ_Decoder * const @var{decoder}, uint8_t * const @var{buffer}, const int @var{size} ) -Reads up to @var{size} bytes from the stream pointed to by @var{decoder}, -storing the results in @var{buffer}. If @w{LZ_API_VERSION >= 1012}, -@var{buffer} may be a null pointer, in which case the bytes read are -discarded. +@deftypefun int LZ_decompress_read ( struct LZ_Decoder * const @var{decoder}, uint8_t * const @var{buffer}, const int @var{size} ) +The @samp{LZ_decompress_read} function reads up to @var{size} bytes from +the stream pointed to by @var{decoder}, storing the results in +@var{buffer}. -Returns the number of bytes actually read. This might be less than -@var{size}; for example, if there aren't that many bytes left in the stream -or if more bytes have to be yet written with the function -@samp{LZ_decompress_write}. Note that reading less than @var{size} bytes is +The return value is the number of bytes actually read. This might be +less than @var{size}; for example, if there aren't that many bytes left +in the stream or if more bytes have to be yet written with the +@samp{LZ_decompress_write} function. Note that reading less than +@var{size} bytes is not an error. +@end deftypefun + + +@deftypefun int LZ_decompress_write ( struct LZ_Decoder * const @var{decoder}, uint8_t * const @var{buffer}, const int @var{size} ) +The @samp{LZ_decompress_write} function writes up to @var{size} bytes from +@var{buffer} to the stream pointed to by @var{decoder}. + +The return value is the number of bytes actually written. This might be +less than @var{size}. Note that writing less than @var{size} bytes is not an error. 
- -@samp{LZ_decompress_read} returns at least once per member so that -@samp{LZ_decompress_member_finished} can be called (and trailer data -retrieved) for each member, even for empty members. Therefore, -@samp{LZ_decompress_read} returning 0 does not mean that the end of the -stream has been reached. The increase in the value returned by -@samp{LZ_decompress_total_in_size} can be used to tell the end of the stream -from an empty member. - -In case of decompression error caused by corrupt or truncated data, -@samp{LZ_decompress_read} does not signal the error immediately to the -application, but waits until all the bytes decoded have been read. This -allows tools like -@uref{http://www.nongnu.org/lzip/manual/tarlz_manual.html,,tarlz} to -recover as much data as possible from each damaged member. -@ifnothtml -@xref{Top,tarlz manual,,tarlz}. -@end ifnothtml @end deftypefun -@deftypefun int LZ_decompress_write ( LZ_Decoder * const @var{decoder}, uint8_t * const @var{buffer}, const int @var{size} ) -Writes up to @var{size} bytes from @var{buffer} to the stream pointed to by -@var{decoder}. Returns the number of bytes actually written. This might be -less than @var{size}. Note that writing less than @var{size} bytes is not an -error. +@deftypefun int LZ_decompress_write_size ( struct LZ_Decoder * const @var{decoder} ) +The @samp{LZ_decompress_write_size} function returns the maximum number +of bytes that can be immediately written through the +@samp{LZ_decompress_write} function. + +It is guaranteed that an immediate call to @samp{LZ_decompress_write} +will accept a @var{size} up to the returned number of bytes. @end deftypefun -@deftypefun int LZ_decompress_write_size ( LZ_Decoder * const @var{decoder} ) -Returns the maximum number of bytes that can be immediately written through -@samp{LZ_decompress_write}. This number varies smoothly; each compressed -byte consumed may be overwritten immediately, increasing by 1 the value -returned. 
- -It is guaranteed that an immediate call to @samp{LZ_decompress_write} will -accept a @var{size} up to the returned number of bytes. +@deftypefun {enum LZ_Errno} LZ_decompress_errno ( struct LZ_Decoder * const @var{decoder} ) +Returns the current error code for @var{decoder} (@pxref{Error codes}). @end deftypefun -@deftypefun {LZ_Errno} LZ_decompress_errno ( LZ_Decoder * const @var{decoder} ) -Returns the current error code for @var{decoder}. @xref{Error codes}. -It is safe to call @samp{LZ_decompress_errno} with a null argument, in which -case it returns @samp{LZ_bad_argument}. -@end deftypefun - - -@deftypefun int LZ_decompress_finished ( LZ_Decoder * const @var{decoder} ) +@deftypefun int LZ_decompress_finished ( struct LZ_Decoder * const @var{decoder} ) Returns 1 if all the data have been read and @samp{LZ_decompress_close} -can be safely called. Otherwise it returns 0. @samp{LZ_decompress_finished} -does not imply @samp{LZ_decompress_member_finished}. +can be safely called. Otherwise it returns 0. @end deftypefun -@deftypefun int LZ_decompress_member_finished ( LZ_Decoder * const @var{decoder} ) -Returns 1 if the previous call to @samp{LZ_decompress_read} finished reading -the current member, indicating that final values for the member are available -through @samp{LZ_decompress_data_crc}, @samp{LZ_decompress_data_position}, -and @samp{LZ_decompress_member_position}. Otherwise it returns 0. +@deftypefun int LZ_decompress_member_finished ( struct LZ_Decoder * const @var{decoder} ) +Returns 1 if the previous call to @samp{LZ_decompress_read} finished +reading the current member, indicating that final values for member are +available through @samp{LZ_decompress_data_crc}, +@samp{LZ_decompress_data_position}, and +@samp{LZ_decompress_member_position}. Otherwise it returns 0. @end deftypefun -@deftypefun int LZ_decompress_member_version ( LZ_Decoder * const @var{decoder} ) -Returns the version of the current member, read from the member header. 
+@deftypefun int LZ_decompress_member_version ( struct LZ_Decoder * const @var{decoder} ) +Returns the version of current member from member header. @end deftypefun -@deftypefun int LZ_decompress_dictionary_size ( LZ_Decoder * const @var{decoder} ) -Returns the dictionary size of the current member, read from the member header. +@deftypefun int LZ_decompress_dictionary_size ( struct LZ_Decoder * const @var{decoder} ) +Returns the dictionary size of current member from member header. @end deftypefun -@deftypefun {unsigned} LZ_decompress_data_crc ( LZ_Decoder * const @var{decoder} ) +@deftypefun {unsigned} LZ_decompress_data_crc ( struct LZ_Decoder * const @var{decoder} ) Returns the 32 bit Cyclic Redundancy Check of the data decompressed from -the current member. The value returned is valid only when +the current member. The returned value is valid only when @samp{LZ_decompress_member_finished} returns 1. @end deftypefun -@deftypefun {unsigned long long} LZ_decompress_data_position ( LZ_Decoder * const @var{decoder} ) +@deftypefun {unsigned long long} LZ_decompress_data_position ( struct LZ_Decoder * const @var{decoder} ) Returns the number of decompressed bytes already produced, but perhaps not yet read, in the current member. @end deftypefun -@deftypefun {unsigned long long} LZ_decompress_member_position ( LZ_Decoder * const @var{decoder} ) -Returns the number of input bytes already decompressed in the current member. +@deftypefun {unsigned long long} LZ_decompress_member_position ( struct LZ_Decoder * const @var{decoder} ) +Returns the number of input bytes already decompressed in the current +member. @end deftypefun -@deftypefun {unsigned long long} LZ_decompress_total_in_size ( LZ_Decoder * const @var{decoder} ) +@deftypefun {unsigned long long} LZ_decompress_total_in_size ( struct LZ_Decoder * const @var{decoder} ) Returns the total number of input bytes already decompressed. 
@end deftypefun -@deftypefun {unsigned long long} LZ_decompress_total_out_size ( LZ_Decoder * const @var{decoder} ) +@deftypefun {unsigned long long} LZ_decompress_total_out_size ( struct LZ_Decoder * const @var{decoder} ) Returns the total number of decompressed bytes already produced, but perhaps not yet read. @end deftypefun @@ -628,7 +557,7 @@ perhaps not yet read. Most library functions return -1 to indicate that they have failed. But this return value only tells you that an error has occurred. To find out -what kind of error it was, you need to check the error code by calling +what kind of error it was, you need to verify the error code by calling @samp{LZ_(de)compress_errno}. Library functions don't change the value returned by @@ -638,49 +567,46 @@ necessarily LZ_ok, and you should not use @samp{LZ_(de)compress_errno} to determine whether a call failed. If the call failed, then you can examine @samp{LZ_(de)compress_errno}. -The error codes are defined in the header file @file{lzlib.h}. -@samp{LZ_Errno} is an enum type: +The error codes are defined in the header file @samp{lzlib.h}. -@deftypevr Constant {LZ_Errno} LZ_ok -The value of this constant is 0 and is used to indicate that there is no error. +@deftypevr Constant {enum LZ_Errno} LZ_ok +The value of this constant is 0 and is used to indicate that there is no +error. @end deftypevr -@deftypevr Constant {LZ_Errno} LZ_bad_argument -At least one of the arguments passed to the library function was invalid. +@deftypevr Constant {enum LZ_Errno} LZ_bad_argument +At least one of the arguments passed to the library function was +invalid. @end deftypevr -@deftypevr Constant {LZ_Errno} LZ_mem_error -No memory available. The system cannot allocate more virtual memory because -its capacity is full. +@deftypevr Constant {enum LZ_Errno} LZ_mem_error +No memory available. The system cannot allocate more virtual memory +because its capacity is full. 
@end deftypevr -@deftypevr Constant {LZ_Errno} LZ_sequence_error +@deftypevr Constant {enum LZ_Errno} LZ_sequence_error A library function was called in the wrong order. For example @samp{LZ_compress_restart_member} was called before -@samp{LZ_compress_member_finished} indicated that the current member is +@samp{LZ_compress_member_finished} indicates that the current member is finished. @end deftypevr -@deftypevr Constant {LZ_Errno} LZ_header_error -An invalid member header (one with the wrong magic bytes) was read. If this -happens at the end of the data stream it may indicate trailing data. +@deftypevr Constant {enum LZ_Errno} LZ_header_error +An invalid member header (one with the wrong magic bytes) was read. If +this happens at the end of the data stream it may indicate trailing +data. @end deftypevr -@deftypevr Constant {LZ_Errno} LZ_unexpected_eof +@deftypevr Constant {enum LZ_Errno} LZ_unexpected_eof The end of the data stream was reached in the middle of a member. @end deftypevr -@deftypevr Constant {LZ_Errno} LZ_data_error -The data stream is corrupt. If @samp{LZ_decompress_member_position} is 6 or -less, it indicates either a format version not supported, an invalid -dictionary size, a nonzero first LZMA byte, a corrupt header in a multimember -data stream, or trailing data too similar to a valid lzip header. -Lziprecover can be used to repair some of these errors and to remove -conflicting trailing data from a file. +@deftypevr Constant {enum LZ_Errno} LZ_data_error +The data stream is corrupt. @end deftypevr -@deftypevr Constant {LZ_Errno} LZ_library_error -A bug was detected in the library. Please, report it. @xref{Problems}. +@deftypevr Constant {enum LZ_Errno} LZ_library_error +A bug was detected in the library. Please, report it (@pxref{Problems}). @end deftypevr @@ -688,285 +614,27 @@ A bug was detected in the library. Please, report it. @xref{Problems}. 
@chapter Error messages @cindex error messages -@deftypefun {const char *} LZ_strerror ( const LZ_Errno @var{lz_errno} ) -Returns the error message corresponding to the error code @var{lz_errno}. -The messages are fairly short; there are no multi-line messages or embedded -newlines. This function makes it easy for your program to report informative -error messages about the failure of a library call. +@deftypefun {const char *} LZ_strerror ( const enum LZ_Errno @var{lz_errno} ) +Returns the standard error message for a given error code. The messages +are fairly short; there are no multi-line messages or embedded newlines. +This function makes it easy for your program to report informative error +messages about the failure of a library call. The value of @var{lz_errno} normally comes from a call to @samp{LZ_(de)compress_errno}. @end deftypefun -@node Invoking minilzip -@chapter Invoking minilzip -@cindex invoking -@cindex options - -Minilzip is a test program for the compression library lzlib. Minilzip is -not intended to be installed because lzip has more features, but minilzip is -well tested and you can use it as your main compressor if so you wish. -@ifnothtml -@xref{Top,lzip,,lzip}. -@end ifnothtml - -@uref{http://www.nongnu.org/lzip/lzip.html,,Lzip} -is a lossless data compressor with a user interface similar to the one -of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov -chain-Algorithm) designed to achieve complete interoperability between -implementations. The maximum dictionary size is 512 MiB so that any lzip -file can be decompressed on 32-bit machines. Lzip provides accurate and -robust 3-factor integrity checking. @w{@samp{lzip -0}} compresses about as fast as -gzip, while @w{@samp{lzip -9}} compresses most files more than bzip2. Decompression -speed is intermediate between gzip and bzip2. Lzip provides better data -recovery capabilities than gzip and bzip2. 
Lzip has been designed, written, -and tested with great care to replace gzip and bzip2 as general-purpose -compressed format for Unix-like systems. - -@noindent -The format for running minilzip is: - -@example -minilzip [@var{options}] [@var{files}] -@end example - -@noindent -If no file names are specified, minilzip compresses (or decompresses) from -standard input to standard output. A hyphen @samp{-} used as a @var{file} -argument means standard input. It can be mixed with other @var{files} and is -read just once, the first time it appears in the command line. Remember to -prepend @file{./} to any file name beginning with a hyphen, or use @samp{--}. - -@noindent -minilzip supports the following -@uref{http://www.nongnu.org/lzip/manual/plzip_manual.html#Argument-syntax,,options}: -@ifnothtml -@xref{Argument syntax,,,plzip}. -@end ifnothtml - -@table @code -@item -h -@itemx --help -Print an informative help message describing the options and exit. - -@item -V -@itemx --version -Print the version number of minilzip on the standard output and exit. -This version number should be included in all bug reports. - -@item -a -@itemx --trailing-error -Exit with error status 2 if any remaining input is detected after -decompressing the last member. Such remaining input is usually trailing -garbage that can be safely ignored. - -@item -b @var{bytes} -@itemx --member-size=@var{bytes} -When compressing, set the member size limit to @var{bytes}. If @var{bytes} -is smaller than the compressed size, a multimember file is produced. It is -advisable to keep members smaller than RAM size so that they can be repaired -with lziprecover in case of corruption. A small member size may degrade -compression ratio, so use it only when needed. Valid values range from -@w{100 kB} to @w{2 PiB}. Defaults to @w{2 PiB}. - -@item -c -@itemx --stdout -Compress or decompress to standard output; keep input files unchanged. If -compressing several files, each file is compressed independently. 
(The -output consists of a sequence of independently compressed members). This -option (or @option{-o}) is needed when reading from a named pipe (fifo) or -from a device. Use it also to recover as much of the decompressed data as -possible when decompressing a corrupt file. @option{-c} overrides @option{-o} -and @option{-S}. @option{-c} has no effect when testing. - -@item -d -@itemx --decompress -Decompress the files specified. The integrity of the files specified is -checked. If a file does not exist, can't be opened, or the destination file -already exists and @option{--force} has not been specified, minilzip continues -decompressing the rest of the files and exits with error status 1. If a file -fails to decompress, or is a terminal, minilzip exits immediately with error -status 2 without decompressing the rest of the files. A terminal is -considered an uncompressed file, and therefore invalid. A multimember file -with one or more empty members is accepted if redirected to standard input. - -@item -f -@itemx --force -Force overwrite of output files. - -@item -F -@itemx --recompress -When compressing, force re-compression of files whose name already has -the @file{.lz} or @file{.tlz} suffix. - -@item -k -@itemx --keep -Keep (don't delete) input files during compression or decompression. - -@item -m @var{bytes} -@itemx --match-length=@var{bytes} -When compressing, set the match length limit in bytes. After a match this -long is found, the search is finished. Valid values range from 5 to 273. -Larger values usually give better compression ratios but longer compression -times. - -@item -o @var{file} -@itemx --output=@var{file} -If @option{-c} has not been also specified, write the (de)compressed output -to @var{file}; keep input files unchanged. If compressing several files, -each file is compressed independently. (The output consists of a sequence of -independently compressed members). 
This option (or @option{-c}) is needed -when reading from a named pipe (fifo) or from a device. @w{@option{-o -}} is -equivalent to @option{-c}. @option{-o} has no effect when testing. - -When compressing and splitting the output in volumes, @var{file} is used as -a prefix, and several files named @file{@var{file}00001.lz}, -@file{@var{file}00002.lz}, etc, are created. In this case, only one input -file is allowed. - -@item -q -@itemx --quiet -Quiet operation. Suppress all messages. - -@item -s @var{bytes} -@itemx --dictionary-size=@var{bytes} -When compressing, set the dictionary size limit in bytes. Minilzip uses for -each file the largest dictionary size that does not exceed neither the file -size nor this limit. Valid values range from @w{4 KiB} to @w{512 MiB}. -Values 12 to 29 are interpreted as powers of two, meaning 2^12 to 2^29 -bytes. Dictionary sizes are quantized so that they can be coded in just one -byte (@pxref{coded-dict-size}). If the size specified does not match one of -the valid sizes, it is rounded upwards by adding up to @w{(@var{bytes} / 8)} -to it. - -For maximum compression you should use a dictionary size limit as large -as possible, but keep in mind that the decompression memory requirement -is affected at compression time by the choice of dictionary size limit. -The dictionary size used for decompression is the same dictionary size used -for compression. - -@item -S @var{bytes} -@itemx --volume-size=@var{bytes} -When compressing, and @option{-c} has not been also specified, split the -compressed output into several volume files with names -@file{original_name00001.lz}, @file{original_name00002.lz}, etc, and set the -volume size limit to @var{bytes}. Input files are kept unchanged. Each -volume is a complete, maybe multimember, lzip file. A small volume size may -degrade compression ratio, so use it only when needed. Valid values range -from @w{100 kB} to @w{4 EiB}. 
- -@item -t -@itemx --test -Check integrity of the files specified, but don't decompress them. This -really performs a trial decompression and throws away the result. Use it -together with @option{-v} to see information about the files. If a file -fails the test, does not exist, can't be opened, or is a terminal, minilzip -continues testing the rest of the files. A final diagnostic is shown at -verbosity level 1 or higher if any file fails the test when testing multiple -files. A multimember file with one or more empty members is accepted if -redirected to standard input. - -@item -v -@itemx --verbose -Verbose mode.@* -When compressing, show the compression ratio and size for each file processed.@* -When decompressing or testing, further -v's (up to 4) increase the verbosity -level, showing status, compression ratio, dictionary size, and trailer -contents (CRC, data size, member size). - -@item -0 .. -9 -Compression level. Set the compression parameters (dictionary size and -match length limit) as shown in the table below. The default compression -level is @option{-6}, equivalent to @w{@option{-s8MiB -m36}}. Note that -@option{-9} can be much slower than @option{-0}. These options have no -effect when decompressing or testing. - -The bidimensional parameter space of LZMA can't be mapped to a linear scale -optimal for all files. If your files are large, very repetitive, etc, you -may need to use the options @option{--dictionary-size} and -@option{--match-length} directly to achieve optimal performance. - -If several compression levels or @option{-s} or @option{-m} options are -given, the last setting is used. 
For example @w{@option{-9 -s64MiB}} is -equivalent to @w{@option{-s64MiB -m273}} - -@multitable {Level} {Dictionary size (-s)} {Match length limit (-m)} -@headitem Level @tab Dictionary size (-s) @tab Match length limit (-m) -@item -0 @tab 64 KiB @tab 16 bytes -@item -1 @tab 1 MiB @tab 5 bytes -@item -2 @tab 1.5 MiB @tab 6 bytes -@item -3 @tab 2 MiB @tab 8 bytes -@item -4 @tab 3 MiB @tab 12 bytes -@item -5 @tab 4 MiB @tab 20 bytes -@item -6 @tab 8 MiB @tab 36 bytes -@item -7 @tab 16 MiB @tab 68 bytes -@item -8 @tab 24 MiB @tab 132 bytes -@item -9 @tab 32 MiB @tab 273 bytes -@end multitable - -@item --fast -@itemx --best -Aliases for GNU gzip compatibility. - -@item --loose-trailing -When decompressing or testing, allow trailing data whose first bytes are -so similar to the magic bytes of a lzip header that they can be confused -with a corrupt header. Use this option if a file triggers a 'corrupt -header' error and the cause is not indeed a corrupt header. - -@item --check-lib -Compare the @uref{#Library-version,,version of lzlib} used to compile -minilzip with the version actually being used at run time and exit. Report -any differences found. Exit with error status 1 if differences are found. A -mismatch may indicate that lzlib is not correctly installed or that a -different version of lzlib has been installed after compiling the shared -version of minilzip. Exit with error status 2 if LZ_API_VERSION and -LZ_version_string don't match. @w{@samp{minilzip -v --check-lib}} shows the -version of lzlib being used and the value of LZ_API_VERSION (if defined). -@ifnothtml -@xref{Library version}. -@end ifnothtml - -@end table - -Numbers given as arguments to options may be expressed in decimal, -hexadecimal, or octal (using the same syntax as integer constants in C++), -and may be followed by a multiplier and an optional @samp{B} for "byte". 
- -Table of SI and binary prefixes (unit multipliers): - -@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)} -@headitem Prefix @tab Value @tab | @tab Prefix @tab Value -@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024) -@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20) -@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30) -@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40) -@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50) -@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60) -@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70) -@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80) -@item R @tab ronnabyte (10^27) @tab | @tab Ri @tab robibyte (2^90) -@item Q @tab quettabyte (10^30) @tab | @tab Qi @tab quebibyte (2^100) -@end multitable - -@sp 1 -Exit status: 0 for a normal exit, 1 for environmental problems -(file not found, invalid command-line options, I/O errors, etc), 2 to -indicate a corrupt or invalid input file, 3 for an internal consistency -error (e.g., bug) which caused minilzip to panic. - - -@node File format -@chapter File format -@cindex file format +@node Data format +@chapter Data format +@cindex data format Perfection is reached, not when there is no longer anything to add, but when there is no longer anything to take away.@* --- Antoine de Saint-Exupery +@sp 1 In the diagram below, a box like this: - @verbatim +---+ | | <-- the vertical bars might be missing @@ -974,7 +642,6 @@ In the diagram below, a box like this: @end verbatim represents one byte; a box like this: - @verbatim +==============+ | | @@ -983,16 +650,12 @@ represents one byte; a box like this: represents a variable number of bytes. -@noindent -A lzip file consists of one or more independent "members" (compressed data -sets). 
The members simply appear one after another in the file, with no -additional information before, between, or after them. Each member can -encode in compressed form up to @w{16 EiB - 1 byte} of uncompressed data. -The size of a multimember file is unlimited. Empty members (data size = 0) -are not allowed in multimember files. +@sp 1 +A lzip data stream consists of a series of "members" (compressed data +sets). The members simply appear one after another in the data stream, +with no additional information before, between, or after them. Each member has the following structure: - @verbatim +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size | @@ -1009,19 +672,19 @@ A four byte string, identifying the lzip format, with the value "LZIP" @item VN (version number, 1 byte) Just in case something needs to be modified in the future. 1 for now. -@anchor{coded-dict-size} @item DS (coded dictionary size, 1 byte) The dictionary size is calculated by taking a power of 2 (the base size) -and subtracting from it a fraction between 0/16 and 7/16 of the base size.@* +and substracting from it a fraction between 0/16 and 7/16 of the base +size.@* Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).@* -Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract +Bits 7-5 contain the numerator of the fraction (0 to 7) to substract from the base size to obtain the dictionary size.@* Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@* Valid values for dictionary size range from 4 KiB to 512 MiB. @item LZMA stream -The LZMA stream, terminated by an 'End Of Stream' marker. Uses default values -for encoder properties. +The LZMA stream, finished by an end of stream marker. Uses default +values for encoder properties. 
@ifnothtml @xref{Stream format,,,lzip}, @end ifnothtml @@ -1030,21 +693,19 @@ See @uref{http://www.nongnu.org/lzip/manual/lzip_manual.html#Stream-format,,Stream format} @end ifhtml for a complete description.@* -Lzip only uses the LZMA marker @samp{2} ('End Of Stream' marker). Lzlib -also uses the LZMA marker @samp{3} ('Sync Flush' marker). @xref{sync_flush}. +Lzip only uses the LZMA marker @samp{2} ("End Of Stream" marker). Lzlib +also uses the LZMA marker @samp{3} ("Sync Flush" marker). @item CRC32 (4 bytes) -Cyclic Redundancy Check (CRC) of the original uncompressed data. +CRC of the uncompressed original data. @item Data size (8 bytes) -Size of the original uncompressed data. +Size of the uncompressed original data. @item Member size (8 bytes) Total size of the member, including header and trailer. This field acts -as a distributed index, improves the checking of stream integrity, and -facilitates the safe recovery of undamaged members from multimember files. -Lzip limits the member size to @w{2 PiB} to prevent the data size field from -overflowing. +as a distributed index, allows the verification of stream integrity, and +facilitates safe recovery of undamaged members from multimember files. @end table @@ -1053,312 +714,142 @@ overflowing. @chapter A small tutorial with examples @cindex examples -This chapter provides real code examples for the most common uses of the -library. See these examples in context in the files @file{bbexample.c} and -@file{ffexample.c} from the source distribution of lzlib. +This chapter shows the order in which the library functions should be +called depending on what kind of data stream you want to compress or +decompress. See the file @samp{bbexample.c} in the source distribution +for an example of how buffer-to-buffer compression/decompression can be +implemented using lzlib. -Note that the interface of lzlib is symmetrical. 
That is, the code for -normal compression and decompression is identical except because one calls +Note that lzlib's interface is symmetrical. That is, the code for normal +compression and decompression is identical except because one calls LZ_compress* functions while the other calls LZ_decompress* functions. -@menu -* Buffer compression:: Buffer-to-buffer single-member compression -* Buffer decompression:: Buffer-to-buffer decompression -* File compression:: File-to-file single-member compression -* File decompression:: File-to-file decompression -* File compression mm:: File-to-file multimember compression -* Skipping data errors:: Decompression with automatic resynchronization -@end menu - - -@node Buffer compression -@section Buffer compression -@cindex buffer compression - -Buffer-to-buffer single-member compression -@w{(@var{member_size} > total output)}. - -@verbatim -/* Compress 'insize' bytes from 'inbuf' to 'outbuf'. - Return the size of the compressed data in '*outlenp'. - In case of error, or if 'outsize' is too small, return false and do not - modify '*outlenp'. 
-*/ -bool bbcompress( const uint8_t * const inbuf, const int insize, - const int dictionary_size, const int match_len_limit, - uint8_t * const outbuf, const int outsize, - int * const outlenp ) - { - int inpos = 0, outpos = 0; - bool error = false; - LZ_Encoder * const encoder = - LZ_compress_open( dictionary_size, match_len_limit, INT64_MAX ); - if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) - { LZ_compress_close( encoder ); return false; } - - while( true ) - { - int ret = LZ_compress_write( encoder, inbuf + inpos, insize - inpos ); - if( ret < 0 ) { error = true; break; } - inpos += ret; - if( inpos >= insize ) LZ_compress_finish( encoder ); - ret = LZ_compress_read( encoder, outbuf + outpos, outsize - outpos ); - if( ret < 0 ) { error = true; break; } - outpos += ret; - if( LZ_compress_finished( encoder ) == 1 ) break; - if( outpos >= outsize ) { error = true; break; } - } - - if( LZ_compress_close( encoder ) < 0 ) error = true; - if( error ) return false; - *outlenp = outpos; - return true; - } -@end verbatim - - -@node Buffer decompression -@section Buffer decompression -@cindex buffer decompression - -Buffer-to-buffer decompression. - -@verbatim -/* Decompress 'insize' bytes from 'inbuf' to 'outbuf'. - Return the size of the decompressed data in '*outlenp'. - In case of error, or if 'outsize' is too small, return false and do not - modify '*outlenp'. 
-*/ -bool bbdecompress( const uint8_t * const inbuf, const int insize, - uint8_t * const outbuf, const int outsize, - int * const outlenp ) - { - int inpos = 0, outpos = 0; - bool error = false; - LZ_Decoder * const decoder = LZ_decompress_open(); - if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) - { LZ_decompress_close( decoder ); return false; } - - while( true ) - { - int ret = LZ_decompress_write( decoder, inbuf + inpos, insize - inpos ); - if( ret < 0 ) { error = true; break; } - inpos += ret; - if( inpos >= insize ) LZ_decompress_finish( decoder ); - ret = LZ_decompress_read( decoder, outbuf + outpos, outsize - outpos ); - if( ret < 0 ) { error = true; break; } - outpos += ret; - if( LZ_decompress_finished( decoder ) == 1 ) break; - if( outpos >= outsize ) { error = true; break; } - } - - if( LZ_decompress_close( decoder ) < 0 ) error = true; - if( error ) return false; - *outlenp = outpos; - return true; - } -@end verbatim - - -@node File compression -@section File compression -@cindex file compression - -File-to-file compression using LZ_compress_write_size. - -@verbatim -int ffcompress( LZ_Encoder * const encoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_compress_finish( encoder ); - } - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_finished( encoder ) == 1 ) return 0; - } - return 1; - } -@end verbatim - - -@node File decompression -@section File decompression -@cindex file decompression - -File-to-file decompression using LZ_decompress_write_size. 
- -@verbatim -int ffdecompress( LZ_Decoder * const decoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_decompress_write_size( decoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_decompress_write( decoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_decompress_finish( decoder ); - } - ret = LZ_decompress_read( decoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_decompress_finished( decoder ) == 1 ) return 0; - } - return 1; - } -@end verbatim - - -@node File compression mm -@section File-to-file multimember compression -@cindex multimember compression - -Example 1: Multimember compression with members of fixed size -@w{(@var{member_size} < total output)}. - -@verbatim -int ffmmcompress( FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384, member_size = 4096 }; - uint8_t buffer[buffer_size]; - bool done = false; - LZ_Encoder * const encoder = LZ_compress_open( 65535, 16, member_size ); - if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) - { fputs( "ffexample: Not enough memory.\n", stderr ); - LZ_compress_close( encoder ); return 1; } - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_compress_finish( encoder ); - } - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_member_finished( encoder ) == 1 ) - { - if( LZ_compress_finished( encoder ) == 1 ) { done = true; break; } - if( 
LZ_compress_restart_member( encoder, member_size ) < 0 ) break; - } - } - if( LZ_compress_close( encoder ) < 0 ) done = false; - return done; - } -@end verbatim - @sp 1 @noindent -Example 2: Multimember compression (user-restarted members). -(Call LZ_compress_open with @var{member_size} > largest member). +Example 1: Normal compression (@var{member_size} > total output). -@verbatim -/* Compress 'infile' to 'outfile' as a multimember stream with one member - for each line of text terminated by a newline character or by EOF. - Return 0 if success, 1 if error. -*/ -int fflfcompress( LZ_Encoder * const encoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - for( len = 0; len < size; ) - { - int ch = getc( infile ); - if( ch == EOF || ( buffer[len++] = ch ) == '\n' ) break; - } - /* avoid writing an empty member to outfile */ - if( len == 0 && LZ_compress_data_position( encoder ) == 0 ) return 0; - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) || buffer[len-1] == '\n' ) - LZ_compress_finish( encoder ); - } - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_member_finished( encoder ) == 1 ) - { - if( feof( infile ) && LZ_compress_finished( encoder ) == 1 ) return 0; - if( LZ_compress_restart_member( encoder, INT64_MAX ) < 0 ) break; - } - } - return 1; - } -@end verbatim +@example +1) LZ_compress_open +2) LZ_compress_write +3) LZ_compress_read +4) go back to step 2 until all input data have been written +5) LZ_compress_finish +6) LZ_compress_read +7) go back to step 6 until LZ_compress_finished returns 1 +8) LZ_compress_close +@end example +@sp 1 +@noindent +Example 2: Normal compression using 
LZ_compress_write_size. -@node Skipping data errors -@section Skipping data errors -@cindex skipping data errors +@example +1) LZ_compress_open +2) go to step 5 if LZ_compress_write_size returns 0 +3) LZ_compress_write +4) if no more data to write, call LZ_compress_finish +5) LZ_compress_read +6) go back to step 2 until LZ_compress_finished returns 1 +7) LZ_compress_close +@end example -@verbatim -/* Decompress 'infile' to 'outfile' with automatic resynchronization to - next member in case of data error, including the automatic removal of - leading garbage. -*/ -int ffrsdecompress( LZ_Decoder * const decoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_decompress_write_size( decoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_decompress_write( decoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_decompress_finish( decoder ); - } - ret = LZ_decompress_read( decoder, buffer, buffer_size ); - if( ret < 0 ) - { - if( LZ_decompress_errno( decoder ) == LZ_header_error || - LZ_decompress_errno( decoder ) == LZ_data_error ) - { LZ_decompress_sync_to_member( decoder ); continue; } - break; - } - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_decompress_finished( decoder ) == 1 ) return 0; - } - return 1; - } -@end verbatim +@sp 1 +@noindent +Example 3: Decompression. + +@example +1) LZ_decompress_open +2) LZ_decompress_write +3) LZ_decompress_read +4) go back to step 2 until all input data have been written +5) LZ_decompress_finish +6) LZ_decompress_read +7) go back to step 6 until LZ_decompress_finished returns 1 +8) LZ_decompress_close +@end example + +@sp 1 +@noindent +Example 4: Decompression using LZ_decompress_write_size. 
+ +@example +1) LZ_decompress_open +2) go to step 5 if LZ_decompress_write_size returns 0 +3) LZ_decompress_write +4) if no more data to write, call LZ_decompress_finish +5) LZ_decompress_read +5a) optionally, if LZ_decompress_member_finished returns 1, read + final values for member with LZ_decompress_data_crc, etc. +6) go back to step 2 until LZ_decompress_finished returns 1 +7) LZ_decompress_close +@end example + +@sp 1 +@noindent +Example 5: Multimember compression (@var{member_size} < total output). + +@example + 1) LZ_compress_open + 2) go to step 5 if LZ_compress_write_size returns 0 + 3) LZ_compress_write + 4) if no more data to write, call LZ_compress_finish + 5) LZ_compress_read + 6) go back to step 2 until LZ_compress_member_finished returns 1 + 7) go to step 10 if LZ_compress_finished() returns 1 + 8) LZ_compress_restart_member + 9) go back to step 2 +10) LZ_compress_close +@end example + +@sp 1 +@noindent +Example 6: Multimember compression (user-restarted members). + +@example + 1) LZ_compress_open + 2) LZ_compress_write + 3) LZ_compress_read + 4) go back to step 2 until member termination is desired + 5) LZ_compress_finish + 6) LZ_compress_read + 7) go back to step 6 until LZ_compress_member_finished returns 1 + 8) verify that LZ_compress_finished returns 1 + 9) go to step 12 if all input data have been written +10) LZ_compress_restart_member +11) go back to step 2 +12) LZ_compress_close +@end example + +@sp 1 +@noindent +Example 7: Decompression with automatic removal of leading data. + +@example +1) LZ_decompress_open +2) LZ_decompress_sync_to_member +3) go to step 6 if LZ_decompress_write_size returns 0 +4) LZ_decompress_write +5) if no more data to write, call LZ_decompress_finish +6) LZ_decompress_read +7) go back to step 3 until LZ_decompress_finished returns 1 +8) LZ_decompress_close +@end example + +@sp 1 +@noindent +Example 8: Streamed decompression with automatic resynchronization to +next member in case of data error. 
+ +@example +1) LZ_decompress_open +2) go to step 5 if LZ_decompress_write_size returns 0 +3) LZ_decompress_write +4) if no more data to write, call LZ_decompress_finish +5) if LZ_decompress_read produces LZ_header_error or LZ_data_error, + call LZ_decompress_sync_to_member +6) go back to step 2 until LZ_decompress_finished returns 1 +7) LZ_decompress_close +@end example @node Problems @@ -1373,8 +864,8 @@ for all eternity, if not longer. If you find a bug in lzlib, please send electronic mail to @email{lzip-bug@@nongnu.org}. Include the version number, which you can -find by running @w{@samp{minilzip --version}} and -@w{@samp{minilzip -v --check-lib}}. +find by running @w{@code{minilzip --version}} or in +@samp{LZ_version_string} from @samp{lzlib.h}. @node Concept index diff --git a/doc/minilzip.1 b/doc/minilzip.1 index e375123..b5ebf78 100644 --- a/doc/minilzip.1 +++ b/doc/minilzip.1 @@ -1,26 +1,12 @@ -.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2. -.TH MINILZIP "1" "January 2025" "minilzip 1.15" "User Commands" +.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.46.1. +.TH MINILZIP "1" "May 2016" "minilzip 1.8" "User Commands" .SH NAME minilzip \- reduces the size of files .SH SYNOPSIS .B minilzip [\fI\,options\/\fR] [\fI\,files\/\fR] .SH DESCRIPTION -Minilzip is a test program for the compression library lzlib. Minilzip is -not intended to be installed because lzip has more features, but minilzip is -well tested and you can use it as your main compressor if so you wish. -.PP -Lzip is a lossless data compressor with a user interface similar to the one -of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel\-Ziv\-Markov -chain\-Algorithm) designed to achieve complete interoperability between -implementations. The maximum dictionary size is 512 MiB so that any lzip -file can be decompressed on 32\-bit machines. Lzip provides accurate and -robust 3\-factor integrity checking. 
'lzip \fB\-0\fR' compresses about as fast as -gzip, while 'lzip \fB\-9\fR' compresses most files more than bzip2. Decompression -speed is intermediate between gzip and bzip2. Lzip provides better data -recovery capabilities than gzip and bzip2. Lzip has been designed, written, -and tested with great care to replace gzip and bzip2 as general\-purpose -compressed format for Unix\-like systems. +Minilzip \- Test program for the lzlib library. .SH OPTIONS .TP \fB\-h\fR, \fB\-\-help\fR @@ -33,13 +19,13 @@ output version information and exit exit with error status if trailing data .TP \fB\-b\fR, \fB\-\-member\-size=\fR -set member size limit of multimember files +set member size limit in bytes .TP \fB\-c\fR, \fB\-\-stdout\fR write to standard output, keep input files .TP \fB\-d\fR, \fB\-\-decompress\fR -decompress, test compressed file integrity +decompress .TP \fB\-f\fR, \fB\-\-force\fR overwrite existing output files @@ -54,7 +40,7 @@ keep (don't delete) input files set match length limit in bytes [36] .TP \fB\-o\fR, \fB\-\-output=\fR -write to , keep input files +if reading standard input, write to .TP \fB\-q\fR, \fB\-\-quiet\fR suppress all messages @@ -79,59 +65,31 @@ alias for \fB\-0\fR .TP \fB\-\-best\fR alias for \fB\-9\fR -.TP -\fB\-\-loose\-trailing\fR -allow trailing data seeming corrupt header -.TP -\fB\-\-check\-lib\fR -compare version of lzlib.h with liblz.{a,so} .PP If no file names are given, or if a file is '\-', minilzip compresses or decompresses from standard input to standard output. Numbers may be followed by a multiplier: k = kB = 10^3 = 1000, Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc... -Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12 to -2^29 bytes. +Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12 +to 2^29 bytes. .PP -The bidimensional parameter space of LZMA can't be mapped to a linear scale -optimal for all files. 
If your files are large, very repetitive, etc, you -may need to use the options \fB\-\-dictionary\-size\fR and \fB\-\-match\-length\fR directly -to achieve optimal performance. +The bidimensional parameter space of LZMA can't be mapped to a linear +scale optimal for all files. If your files are large, very repetitive, +etc, you may need to use the \fB\-\-dictionary\-size\fR and \fB\-\-match\-length\fR +options directly to achieve optimal performance. .PP -To extract all the files from archive 'foo.tar.lz', use the commands -\&'tar \fB\-xf\fR foo.tar.lz' or 'minilzip \fB\-cd\fR foo.tar.lz | tar \fB\-xf\fR \-'. -.PP -Exit status: 0 for a normal exit, 1 for environmental problems -(file not found, invalid command\-line options, I/O errors, etc), 2 to -indicate a corrupt or invalid input file, 3 for an internal consistency -error (e.g., bug) which caused minilzip to panic. -.PP -The ideas embodied in lzlib are due to (at least) the following people: -Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the -definition of Markov chains), G.N.N. Martin (for the definition of range -encoding), Igor Pavlov (for putting all the above together in LZMA), and -Julian Seward (for bzip2's CLI). +Exit status: 0 for a normal exit, 1 for environmental problems (file +not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or +invalid input file, 3 for an internal consistency error (eg, bug) which +caused minilzip to panic. .SH "REPORTING BUGS" Report bugs to lzip\-bug@nongnu.org .br Lzlib home page: http://www.nongnu.org/lzip/lzlib.html .SH COPYRIGHT -Copyright \(co 2025 Antonio Diaz Diaz. +Copyright \(co 2016 Antonio Diaz Diaz. +Using lzlib 1.8 License GPLv2+: GNU GPL version 2 or later .br This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. -Using lzlib 1.15 -Using LZ_API_VERSION = 1015 -.SH "SEE ALSO" -The full documentation for -.B minilzip -is maintained as a Texinfo manual. 
If the -.B info -and -.B minilzip -programs are properly installed at your site, the command -.IP -.B info lzlib -.PP -should give you access to the complete manual. diff --git a/encoder.c b/encoder.c index 442670a..f5b5b46 100644 --- a/encoder.c +++ b/encoder.c @@ -1,27 +1,46 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see . - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. 
Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. */ -static int LZe_get_match_pairs( LZ_encoder * const e, Pair * pairs ) +static int LZe_get_match_pairs( struct LZ_encoder * const e, struct Pair * pairs ) { int32_t * ptr0 = e->eb.mb.pos_array + ( e->eb.mb.cyclic_pos << 1 ); int32_t * ptr1 = ptr0 + 1; + int32_t * newptr; + int len = 0, len0 = 0, len1 = 0; + int maxlen = 0; + int num_pairs = 0; + const int pos1 = e->eb.mb.pos + 1; + const int min_pos = ( e->eb.mb.pos > e->eb.mb.dictionary_size ) ? + e->eb.mb.pos - e->eb.mb.dictionary_size : 0; + const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb ); + int count, key2, key3, key4, newpos; + unsigned tmp; int len_limit = e->match_len_limit; + if( len_limit > Mb_available_bytes( &e->eb.mb ) ) { e->been_flushed = true; @@ -29,61 +48,54 @@ static int LZe_get_match_pairs( LZ_encoder * const e, Pair * pairs ) if( len_limit < 4 ) { *ptr0 = *ptr1 = 0; return 0; } } - int maxlen = 3; /* only used if pairs != 0 */ - int num_pairs = 0; - const int min_pos = (e->eb.mb.pos > e->eb.mb.dictionary_size) ? 
- e->eb.mb.pos - e->eb.mb.dictionary_size : 0; - const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb ); - - unsigned tmp = crc32[data[0]] ^ data[1]; - const int key2 = tmp & ( num_prev_positions2 - 1 ); + tmp = crc32[data[0]] ^ data[1]; + key2 = tmp & ( num_prev_positions2 - 1 ); tmp ^= (unsigned)data[2] << 8; - const int key3 = num_prev_positions2 + ( tmp & ( num_prev_positions3 - 1 ) ); - const int key4 = num_prev_positions2 + num_prev_positions3 + - ( ( tmp ^ ( crc32[data[3]] << 5 ) ) & e->eb.mb.key4_mask ); + key3 = num_prev_positions2 + ( tmp & ( num_prev_positions3 - 1 ) ); + key4 = num_prev_positions2 + num_prev_positions3 + + ( ( tmp ^ ( crc32[data[3]] << 5 ) ) & e->eb.mb.key4_mask ); if( pairs ) { - const int np2 = e->eb.mb.prev_positions[key2]; - const int np3 = e->eb.mb.prev_positions[key3]; + int np2 = e->eb.mb.prev_positions[key2]; + int np3 = e->eb.mb.prev_positions[key3]; if( np2 > min_pos && e->eb.mb.buffer[np2-1] == data[0] ) { pairs[0].dis = e->eb.mb.pos - np2; - pairs[0].len = maxlen = 2 + ( np2 == np3 ); + pairs[0].len = maxlen = 2; num_pairs = 1; } if( np2 != np3 && np3 > min_pos && e->eb.mb.buffer[np3-1] == data[0] ) { maxlen = 3; - pairs[num_pairs++].dis = e->eb.mb.pos - np3; + np2 = np3; + pairs[num_pairs].dis = e->eb.mb.pos - np2; + ++num_pairs; } if( num_pairs > 0 ) { - const int delta = pairs[num_pairs-1].dis + 1; + const int delta = pos1 - np2; while( maxlen < len_limit && data[maxlen-delta] == data[maxlen] ) ++maxlen; pairs[num_pairs-1].len = maxlen; - if( maxlen < 3 ) maxlen = 3; if( maxlen >= len_limit ) pairs = 0; /* done. 
now just skip */ } + if( maxlen < 3 ) maxlen = 3; } - const int pos1 = e->eb.mb.pos + 1; e->eb.mb.prev_positions[key2] = pos1; e->eb.mb.prev_positions[key3] = pos1; - int newpos1 = e->eb.mb.prev_positions[key4]; + newpos = e->eb.mb.prev_positions[key4]; e->eb.mb.prev_positions[key4] = pos1; - int len = 0, len0 = 0, len1 = 0; - - int count; for( count = e->cycles; ; ) { - if( newpos1 <= min_pos || --count < 0 ) { *ptr0 = *ptr1 = 0; break; } + int delta; + if( newpos <= min_pos || --count < 0 ) { *ptr0 = *ptr1 = 0; break; } if( e->been_flushed ) len = 0; - const int delta = pos1 - newpos1; - int32_t * const newptr = e->eb.mb.pos_array + + delta = pos1 - newpos; + newptr = e->eb.mb.pos_array + ( ( e->eb.mb.cyclic_pos - delta + ( (e->eb.mb.cyclic_pos >= delta) ? 0 : e->eb.mb.dictionary_size + 1 ) ) << 1 ); if( data[len-delta] == data[len] ) @@ -104,16 +116,16 @@ static int LZe_get_match_pairs( LZ_encoder * const e, Pair * pairs ) } if( data[len-delta] < data[len] ) { - *ptr0 = newpos1; + *ptr0 = newpos; ptr0 = newptr + 1; - newpos1 = *ptr0; + newpos = *ptr0; len0 = len; if( len1 < len ) len = len1; } else { - *ptr1 = newpos1; + *ptr1 = newpos; ptr1 = newptr; - newpos1 = *ptr1; + newpos = *ptr1; len1 = len; if( len0 < len ) len = len0; } } @@ -121,7 +133,7 @@ static int LZe_get_match_pairs( LZ_encoder * const e, Pair * pairs ) } -static void LZe_update_distance_prices( LZ_encoder * const e ) +static void LZe_update_distance_prices( struct LZ_encoder * const e ) { int dis, len_state; for( dis = start_dis_model; dis < modeled_distances; ++dis ) @@ -129,7 +141,7 @@ static void LZe_update_distance_prices( LZ_encoder * const e ) const int dis_slot = dis_slots[dis]; const int direct_bits = ( dis_slot >> 1 ) - 1; const int base = ( 2 | ( dis_slot & 1 ) ) << direct_bits; - const int price = price_symbol_reversed( e->eb.bm_dis + ( base - dis_slot ), + const int price = price_symbol_reversed( e->eb.bm_dis + base - dis_slot - 1, dis - base, direct_bits ); for( len_state = 0; 
len_state < len_states; ++len_state ) e->dis_prices[len_state][dis] = price; @@ -138,15 +150,15 @@ static void LZe_update_distance_prices( LZ_encoder * const e ) for( len_state = 0; len_state < len_states; ++len_state ) { int * const dsp = e->dis_slot_prices[len_state]; + int * const dp = e->dis_prices[len_state]; const Bit_model * const bmds = e->eb.bm_dis_slot[len_state]; int slot = 0; for( ; slot < end_dis_model; ++slot ) - dsp[slot] = price_symbol6( bmds, slot ); + dsp[slot] = price_symbol( bmds, slot, dis_slot_bits ); for( ; slot < e->num_dis_slots; ++slot ) - dsp[slot] = price_symbol6( bmds, slot ) + + dsp[slot] = price_symbol( bmds, slot, dis_slot_bits ) + (((( slot >> 1 ) - 1 ) - dis_align_bits ) << price_shift_bits ); - int * const dp = e->dis_prices[len_state]; for( dis = 0; dis < start_dis_model; ++dis ) dp[dis] = dsp[dis]; for( ; dis < modeled_distances; ++dis ) @@ -155,17 +167,18 @@ static void LZe_update_distance_prices( LZ_encoder * const e ) } -/* Return the number of bytes advanced (ahead). +/* Returns the number of bytes advanced (ahead). trials[0]..trials[ahead-1] contain the steps to encode. - ( trials[0].dis4 == -1 ) means literal. + ( trials[0].dis == -1 ) means literal. A match/rep longer or equal than match_len_limit finishes the sequence. */ -static int LZe_sequence_optimizer( LZ_encoder * const e, +static int LZe_sequence_optimizer( struct LZ_encoder * const e, const int reps[num_rep_distances], const State state ) { - int num_pairs, num_trials; - int i, rep, len; + int main_len, num_pairs, i, rep, cur = 0, num_trials, len; + int replens[num_rep_distances]; + int rep_index = 0; if( e->pending_num_pairs > 0 ) /* from previous call */ { @@ -174,19 +187,17 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, } else num_pairs = LZe_read_match_distances( e ); - const int main_len = (num_pairs > 0) ? e->pairs[num_pairs-1].len : 0; + main_len = ( num_pairs > 0 ) ? 
e->pairs[num_pairs-1].len : 0; - int replens[num_rep_distances]; - int rep_index = 0; for( i = 0; i < num_rep_distances; ++i ) { - replens[i] = Mb_true_match_len( &e->eb.mb, 0, reps[i] + 1 ); + replens[i] = Mb_true_match_len( &e->eb.mb, 0, reps[i] + 1, max_match_len ); if( replens[i] > replens[rep_index] ) rep_index = i; } if( replens[rep_index] >= e->match_len_limit ) { e->trials[0].price = replens[rep_index]; - e->trials[0].dis4 = rep_index; + e->trials[0].dis = rep_index; if( !LZe_move_and_update( e, replens[rep_index] ) ) return 0; return replens[rep_index]; } @@ -194,12 +205,15 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, if( main_len >= e->match_len_limit ) { e->trials[0].price = main_len; - e->trials[0].dis4 = e->pairs[num_pairs-1].dis + num_rep_distances; + e->trials[0].dis = e->pairs[num_pairs-1].dis + num_rep_distances; if( !LZe_move_and_update( e, main_len ) ) return 0; return main_len; } + { const int pos_state = Mb_data_position( &e->eb.mb ) & pos_state_mask; + const int match_price = price1( e->eb.bm_match[state][pos_state] ); + const int rep_match_price = match_price + price1( e->eb.bm_rep[state] ); const uint8_t prev_byte = Mb_peek( &e->eb.mb, 1 ); const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 ); const uint8_t match_byte = Mb_peek( &e->eb.mb, reps[0] + 1 ); @@ -209,10 +223,7 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, e->trials[1].price += LZeb_price_literal( &e->eb, prev_byte, cur_byte ); else e->trials[1].price += LZeb_price_matched( &e->eb, prev_byte, cur_byte, match_byte ); - e->trials[1].dis4 = -1; /* literal */ - - const int match_price = price1( e->eb.bm_match[state][pos_state] ); - const int rep_match_price = match_price + price1( e->eb.bm_rep[state] ); + e->trials[1].dis = -1; /* literal */ if( match_byte == cur_byte ) Tr_update( &e->trials[1], rep_match_price + @@ -223,7 +234,7 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, if( num_trials < min_match_len ) { e->trials[0].price = 1; - 
e->trials[0].dis4 = e->trials[1].dis4; + e->trials[0].dis = e->trials[1].dis; if( !Mb_move_pos( &e->eb.mb ) ) return 0; return 1; } @@ -237,8 +248,9 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, for( rep = 0; rep < num_rep_distances; ++rep ) { + int price; if( replens[rep] < min_match_len ) continue; - const int price = rep_match_price + LZeb_price_rep( &e->eb, rep, state, pos_state ); + price = rep_match_price + LZeb_price_rep( &e->eb, rep, state, pos_state ); for( len = min_match_len; len <= replens[rep]; ++len ) Tr_update( &e->trials[len], price + Lp_price( &e->rep_len_prices, len, pos_state ), rep, 0 ); @@ -247,7 +259,7 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, if( main_len > replens[0] ) { const int normal_match_price = match_price + price0( e->eb.bm_rep[state] ); - int i = 0, len = max( replens[0] + 1, min_match_len ); + i = 0, len = max( replens[0] + 1, min_match_len ); while( len > e->pairs[i].len ) ++i; while( true ) { @@ -258,10 +270,17 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, if( ++len > e->pairs[i].len && ++i >= num_pairs ) break; } } + } - int cur = 0; while( true ) /* price optimization loop */ { + struct Trial *cur_trial, *next_trial; + int newlen, pos_state, triable_bytes, len_limit; + int start_len = min_match_len; + int next_price, match_price, rep_match_price; + State cur_state; + uint8_t prev_byte, cur_byte, match_byte; + if( !Mb_move_pos( &e->eb.mb ) ) return 0; if( ++cur >= num_trials ) /* no more initialized trials */ { @@ -269,8 +288,8 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, return cur; } - const int num_pairs = LZe_read_match_distances( e ); - const int newlen = (num_pairs > 0) ? e->pairs[num_pairs-1].len : 0; + num_pairs = LZe_read_match_distances( e ); + newlen = ( num_pairs > 0 ) ? 
e->pairs[num_pairs-1].len : 0; if( newlen >= e->match_len_limit ) { e->pending_num_pairs = num_pairs; @@ -279,10 +298,9 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, } /* give final values to current trial */ - Trial * cur_trial = &e->trials[cur]; - State cur_state; + cur_trial = &e->trials[cur]; { - const int dis4 = cur_trial->dis4; + int dis = cur_trial->dis; int prev_index = cur_trial->prev_index; const int prev_index2 = cur_trial->prev_index2; @@ -291,47 +309,55 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, cur_state = e->trials[prev_index].state; if( prev_index + 1 == cur ) /* len == 1 */ { - if( dis4 == 0 ) cur_state = St_set_shortrep( cur_state ); + if( dis == 0 ) cur_state = St_set_short_rep( cur_state ); else cur_state = St_set_char( cur_state ); /* literal */ } - else if( dis4 < num_rep_distances ) cur_state = St_set_rep( cur_state ); + else if( dis < num_rep_distances ) cur_state = St_set_rep( cur_state ); else cur_state = St_set_match( cur_state ); } - else + else if( prev_index2 == dual_step_trial ) /* dis == 0 */ { - if( prev_index2 == dual_step_trial ) /* dis4 == 0 (rep0) */ - --prev_index; - else /* prev_index2 >= 0 */ - prev_index = prev_index2; - cur_state = St_set_char_rep(); + --prev_index; + cur_state = e->trials[prev_index].state; + cur_state = St_set_char( cur_state ); + cur_state = St_set_rep( cur_state ); + } + else /* if( prev_index2 >= 0 ) */ + { + prev_index = prev_index2; + cur_state = e->trials[prev_index].state; + if( dis < num_rep_distances ) cur_state = St_set_rep( cur_state ); + else cur_state = St_set_match( cur_state ); + cur_state = St_set_char( cur_state ); + cur_state = St_set_rep( cur_state ); } cur_trial->state = cur_state; for( i = 0; i < num_rep_distances; ++i ) cur_trial->reps[i] = e->trials[prev_index].reps[i]; - mtf_reps( dis4, cur_trial->reps ); /* literal is ignored */ + mtf_reps( dis, cur_trial->reps ); } - const int pos_state = Mb_data_position( &e->eb.mb ) & pos_state_mask; - const 
uint8_t prev_byte = Mb_peek( &e->eb.mb, 1 ); - const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 ); - const uint8_t match_byte = Mb_peek( &e->eb.mb, cur_trial->reps[0] + 1 ); + pos_state = Mb_data_position( &e->eb.mb ) & pos_state_mask; + prev_byte = Mb_peek( &e->eb.mb, 1 ); + cur_byte = Mb_peek( &e->eb.mb, 0 ); + match_byte = Mb_peek( &e->eb.mb, cur_trial->reps[0] + 1 ); - int next_price = cur_trial->price + - price0( e->eb.bm_match[cur_state][pos_state] ); + next_price = cur_trial->price + + price0( e->eb.bm_match[cur_state][pos_state] ); if( St_is_char( cur_state ) ) next_price += LZeb_price_literal( &e->eb, prev_byte, cur_byte ); else next_price += LZeb_price_matched( &e->eb, prev_byte, cur_byte, match_byte ); /* try last updates to next trial */ - Trial * next_trial = &e->trials[cur+1]; + next_trial = &e->trials[cur+1]; Tr_update( next_trial, next_price, -1, cur ); /* literal */ - const int match_price = cur_trial->price + price1( e->eb.bm_match[cur_state][pos_state] ); - const int rep_match_price = match_price + price1( e->eb.bm_rep[cur_state] ); + match_price = cur_trial->price + price1( e->eb.bm_match[cur_state][pos_state] ); + rep_match_price = match_price + price1( e->eb.bm_rep[cur_state] ); - if( match_byte == cur_byte && next_trial->dis4 != 0 && + if( match_byte == cur_byte && next_trial->dis != 0 && next_trial->prev_index2 == single_step_trial ) { const int price = rep_match_price + @@ -339,16 +365,16 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, if( price <= next_trial->price ) { next_trial->price = price; - next_trial->dis4 = 0; /* rep0 */ + next_trial->dis = 0; next_trial->prev_index = cur; } } - const int triable_bytes = + triable_bytes = min( Mb_available_bytes( &e->eb.mb ), max_num_trials - 1 - cur ); if( triable_bytes < min_match_len ) continue; - const int len_limit = min( e->match_len_limit, triable_bytes ); + len_limit = min( e->match_len_limit, triable_bytes ); /* try literal + rep0 */ if( match_byte != cur_byte && 
next_trial->prev_index != cur ) @@ -356,28 +382,27 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb ); const int dis = cur_trial->reps[0] + 1; const int limit = min( e->match_len_limit + 1, triable_bytes ); - int len = 1; + len = 1; while( len < limit && data[len-dis] == data[len] ) ++len; if( --len >= min_match_len ) { const int pos_state2 = ( pos_state + 1 ) & pos_state_mask; const State state2 = St_set_char( cur_state ); const int price = next_price + - price1( e->eb.bm_match[state2][pos_state2] ) + - price1( e->eb.bm_rep[state2] ) + - LZe_price_rep0_len( e, len, state2, pos_state2 ); + price1( e->eb.bm_match[state2][pos_state2] ) + + price1( e->eb.bm_rep[state2] ) + + LZe_price_rep0_len( e, len, state2, pos_state2 ); while( num_trials < cur + 1 + len ) e->trials[++num_trials].price = infinite_price; Tr_update2( &e->trials[cur+1+len], price, cur + 1 ); } } - int start_len = min_match_len; - /* try rep distances */ for( rep = 0; rep < num_rep_distances; ++rep ) { const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb ); + int price; const int dis = cur_trial->reps[rep] + 1; if( data[0-dis] != data[0] || data[1-dis] != data[1] ) continue; @@ -385,7 +410,7 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, if( data[len-dis] != data[len] ) break; while( num_trials < cur + len ) e->trials[++num_trials].price = infinite_price; - int price = rep_match_price + LZeb_price_rep( &e->eb, rep, cur_state, pos_state ); + price = rep_match_price + LZeb_price_rep( &e->eb, rep, cur_state, pos_state ); for( i = min_match_len; i <= len; ++i ) Tr_update( &e->trials[cur+i], price + Lp_price( &e->rep_len_prices, i, pos_state ), rep, cur ); @@ -393,14 +418,17 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, if( rep == 0 ) start_len = len + 1; /* discard shorter matches */ /* try rep + literal + rep0 */ + { int len2 = len + 1; const int limit = min( e->match_len_limit + len2, 
triable_bytes ); + int pos_state2; + State state2; while( len2 < limit && data[len2-dis] == data[len2] ) ++len2; len2 -= len + 1; if( len2 < min_match_len ) continue; - int pos_state2 = ( pos_state + len ) & pos_state_mask; - State state2 = St_set_rep( cur_state ); + pos_state2 = ( pos_state + len ) & pos_state_mask; + state2 = St_set_rep( cur_state ); price += Lp_price( &e->rep_len_prices, len, pos_state ) + price0( e->eb.bm_match[state2][pos_state2] ) + LZeb_price_matched( &e->eb, data[len-1], data[len], data[len-dis] ); @@ -413,22 +441,25 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, e->trials[++num_trials].price = infinite_price; Tr_update3( &e->trials[cur+len+1+len2], price, rep, cur + len + 1, cur ); } + } /* try matches */ if( newlen >= start_len && newlen <= len_limit ) { + int dis; const int normal_match_price = match_price + price0( e->eb.bm_rep[cur_state] ); while( num_trials < cur + newlen ) e->trials[++num_trials].price = infinite_price; - int i = 0; + i = 0; while( e->pairs[i].len < start_len ) ++i; - int dis = e->pairs[i].dis; + dis = e->pairs[i].dis; for( len = start_len; ; ++len ) { int price = normal_match_price + LZe_price_pair( e, dis, len, pos_state ); + Tr_update( &e->trials[cur+len], price, dis + num_rep_distances, cur ); /* try match + literal + rep0 */ @@ -466,26 +497,30 @@ static int LZe_sequence_optimizer( LZ_encoder * const e, } -static bool LZe_encode_member( LZ_encoder * const e ) +static bool LZe_encode_member( struct LZ_encoder * const e ) { - const bool best = e->match_len_limit > 12; + const bool best = ( e->match_len_limit > 12 ); const int dis_price_count = best ? 1 : 512; const int align_price_count = best ? 1 : dis_align_size; - const int price_count = (e->match_len_limit > 36) ? 1013 : 4093; - int i; + const int price_count = ( e->match_len_limit > 36 ) ? 
1013 : 4093; + int ahead, i; State * const state = &e->eb.state; if( e->eb.member_finished ) return true; if( Re_member_position( &e->eb.renc ) >= e->eb.member_size_limit ) - { LZeb_try_full_flush( &e->eb ); return true; } + { + if( LZeb_full_flush( &e->eb ) ) e->eb.member_finished = true; + return true; + } if( Mb_data_position( &e->eb.mb ) == 0 && !Mb_data_finished( &e->eb.mb ) ) /* encode first byte */ { + const uint8_t prev_byte = 0; + uint8_t cur_byte; if( !Mb_enough_available_bytes( &e->eb.mb ) || !Re_enough_free_bytes( &e->eb.renc ) ) return true; - const uint8_t prev_byte = 0; - const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 ); + cur_byte = Mb_peek( &e->eb.mb, 0 ); Re_encode_bit( &e->eb.renc, &e->eb.bm_match[*state][0], 0 ); LZeb_encode_literal( &e->eb, prev_byte, cur_byte ); CRC32_update_byte( &e->eb.crc, cur_byte ); @@ -512,7 +547,8 @@ static bool LZe_encode_member( LZ_encoder * const e ) Lp_update_prices( &e->rep_len_prices ); } - int ahead = LZe_sequence_optimizer( e, e->eb.reps, *state ); + ahead = LZe_sequence_optimizer( e, e->eb.reps, *state ); + if( ahead <= 0 ) return false; /* can't happen */ e->price_counter -= ahead; for( i = 0; ahead > 0; ) @@ -520,32 +556,33 @@ static bool LZe_encode_member( LZ_encoder * const e ) const int pos_state = ( Mb_data_position( &e->eb.mb ) - ahead ) & pos_state_mask; const int len = e->trials[i].price; - int dis = e->trials[i].dis4; + const int dis = e->trials[i].dis; - bool bit = dis < 0; + bool bit = ( dis < 0 ); Re_encode_bit( &e->eb.renc, &e->eb.bm_match[*state][pos_state], !bit ); if( bit ) /* literal byte */ { const uint8_t prev_byte = Mb_peek( &e->eb.mb, ahead + 1 ); const uint8_t cur_byte = Mb_peek( &e->eb.mb, ahead ); CRC32_update_byte( &e->eb.crc, cur_byte ); - if( ( *state = St_set_char( *state ) ) < 4 ) + if( St_is_char( *state ) ) LZeb_encode_literal( &e->eb, prev_byte, cur_byte ); else { const uint8_t match_byte = Mb_peek( &e->eb.mb, ahead + e->eb.reps[0] + 1 ); LZeb_encode_matched( &e->eb, prev_byte, 
cur_byte, match_byte ); } + *state = St_set_char( *state ); } else /* match or repeated match */ { CRC32_update_buf( &e->eb.crc, Mb_ptr_to_current_pos( &e->eb.mb ) - ahead, len ); mtf_reps( dis, e->eb.reps ); - bit = dis < num_rep_distances; + bit = ( dis < num_rep_distances ); Re_encode_bit( &e->eb.renc, &e->eb.bm_rep[*state], bit ); if( bit ) /* repeated match */ { - bit = dis == 0; + bit = ( dis == 0 ); Re_encode_bit( &e->eb.renc, &e->eb.bm_rep0[*state], !bit ); if( bit ) Re_encode_bit( &e->eb.renc, &e->eb.bm_len[*state][pos_state], len > 1 ); @@ -555,7 +592,7 @@ static bool LZe_encode_member( LZ_encoder * const e ) if( dis > 1 ) Re_encode_bit( &e->eb.renc, &e->eb.bm_rep2[*state], dis > 2 ); } - if( len == 1 ) *state = St_set_shortrep( *state ); + if( len == 1 ) *state = St_set_short_rep( *state ); else { Re_encode_len( &e->eb.renc, &e->eb.rep_len_model, len, pos_state ); @@ -565,9 +602,9 @@ static bool LZe_encode_member( LZ_encoder * const e ) } else /* match */ { - dis -= num_rep_distances; - LZeb_encode_pair( &e->eb, dis, len, pos_state ); - if( dis >= modeled_distances ) --e->align_price_counter; + LZeb_encode_pair( &e->eb, dis - num_rep_distances, len, pos_state ); + if( get_slot( dis - num_rep_distances ) >= end_dis_model ) + --e->align_price_counter; --e->dis_price_counter; Lp_decrement_counter( &e->match_len_prices, pos_state ); *state = St_set_match( *state ); @@ -577,11 +614,11 @@ static bool LZe_encode_member( LZ_encoder * const e ) if( Re_member_position( &e->eb.renc ) >= e->eb.member_size_limit ) { if( !Mb_dec_pos( &e->eb.mb, ahead ) ) return false; - LZeb_try_full_flush( &e->eb ); + if( LZeb_full_flush( &e->eb ) ) e->eb.member_finished = true; return true; } } } - LZeb_try_full_flush( &e->eb ); + if( LZeb_full_flush( &e->eb ) ) e->eb.member_finished = true; return true; } diff --git a/encoder.h b/encoder.h index cb9689e..b70a8ec 100644 --- a/encoder.h +++ b/encoder.h @@ -1,47 +1,56 @@ -/* Lzlib - Compression library for the lzip format - Copyright 
(C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see . - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. 
*/ -typedef struct Len_prices +struct Len_prices { - const Len_model * lm; + const struct Len_model * lm; int len_symbols; int count; int prices[pos_states][max_len_symbols]; - int counters[pos_states]; /* may decrement below 0 */ - } Len_prices; + int counters[pos_states]; + }; -static inline void Lp_update_low_mid_prices( Len_prices * const lp, +static inline void Lp_update_low_mid_prices( struct Len_prices * const lp, const int pos_state ) { int * const pps = lp->prices[pos_state]; int tmp = price0( lp->lm->choice1 ); int len = 0; + lp->counters[pos_state] = lp->count; for( ; len < len_low_symbols && len < lp->len_symbols; ++len ) - pps[len] = tmp + price_symbol3( lp->lm->bm_low[pos_state], len ); + pps[len] = tmp + price_symbol( lp->lm->bm_low[pos_state], len, len_low_bits ); if( len >= lp->len_symbols ) return; tmp = price1( lp->lm->choice1 ) + price0( lp->lm->choice2 ); for( ; len < len_low_symbols + len_mid_symbols && len < lp->len_symbols; ++len ) pps[len] = tmp + - price_symbol3( lp->lm->bm_mid[pos_state], len - len_low_symbols ); + price_symbol( lp->lm->bm_mid[pos_state], len - len_low_symbols, len_mid_bits ); } -static inline void Lp_update_high_prices( Len_prices * const lp ) +static inline void Lp_update_high_prices( struct Len_prices * const lp ) { const int tmp = price1( lp->lm->choice1 ) + price1( lp->lm->choice2 ); int len; @@ -49,152 +58,161 @@ static inline void Lp_update_high_prices( Len_prices * const lp ) /* using 4 slots per value makes "Lp_price" faster */ lp->prices[3][len] = lp->prices[2][len] = lp->prices[1][len] = lp->prices[0][len] = tmp + - price_symbol8( lp->lm->bm_high, len - len_low_symbols - len_mid_symbols ); + price_symbol( lp->lm->bm_high, len - len_low_symbols - len_mid_symbols, len_high_bits ); } -static inline void Lp_reset( Len_prices * const lp ) +static inline void Lp_reset( struct Len_prices * const lp ) { int i; for( i = 0; i < pos_states; ++i ) lp->counters[i] = 0; } -static inline void Lp_init( Len_prices * const lp, 
const Len_model * const lm, +static inline void Lp_init( struct Len_prices * const lp, + const struct Len_model * const lm, const int match_len_limit ) { lp->lm = lm; lp->len_symbols = match_len_limit + 1 - min_match_len; - lp->count = (match_len_limit > 12) ? 1 : lp->len_symbols; + lp->count = ( match_len_limit > 12 ) ? 1 : lp->len_symbols; Lp_reset( lp ); } -static inline void Lp_decrement_counter( Len_prices * const lp, +static inline void Lp_decrement_counter( struct Len_prices * const lp, const int pos_state ) { --lp->counters[pos_state]; } -static inline void Lp_update_prices( Len_prices * const lp ) +static inline void Lp_update_prices( struct Len_prices * const lp ) { int pos_state; bool high_pending = false; for( pos_state = 0; pos_state < pos_states; ++pos_state ) if( lp->counters[pos_state] <= 0 ) - { lp->counters[pos_state] = lp->count; - Lp_update_low_mid_prices( lp, pos_state ); high_pending = true; } + { Lp_update_low_mid_prices( lp, pos_state ); high_pending = true; } if( high_pending && lp->len_symbols > len_low_symbols + len_mid_symbols ) Lp_update_high_prices( lp ); } -static inline int Lp_price( const Len_prices * const lp, - const int len, const int pos_state ) - { return lp->prices[pos_state][len - min_match_len]; } +static inline int Lp_price( const struct Len_prices * const lp, + const int symbol, const int pos_state ) + { return lp->prices[pos_state][symbol - min_match_len]; } -typedef struct Pair /* distance-length pair */ +struct Pair /* distance-length pair */ { int dis; int len; - } Pair; + }; enum { infinite_price = 0x0FFFFFFF, max_num_trials = 1 << 13, single_step_trial = -2, dual_step_trial = -1 }; -typedef struct Trial +struct Trial { State state; int price; /* dual use var; cumulative price, match length */ - int dis4; /* -1 for literal, or rep, or match distance + 4 */ + int dis; /* rep index or match distance. 
(-1 for literal) */ int prev_index; /* index of prev trial in trials[] */ int prev_index2; /* -2 trial is single step */ /* -1 literal + rep0 */ /* >= 0 ( rep or match ) + literal + rep0 */ int reps[num_rep_distances]; - } Trial; + }; -static inline void Tr_update( Trial * const trial, const int pr, - const int distance4, const int p_i ) +static inline void Tr_update( struct Trial * const trial, const int pr, + const int distance, const int p_i ) { if( pr < trial->price ) - { trial->price = pr; trial->dis4 = distance4; trial->prev_index = p_i; - trial->prev_index2 = single_step_trial; } + { + trial->price = pr; trial->dis = distance; trial->prev_index = p_i; + trial->prev_index2 = single_step_trial; + } } -static inline void Tr_update2( Trial * const trial, const int pr, +static inline void Tr_update2( struct Trial * const trial, const int pr, const int p_i ) { if( pr < trial->price ) - { trial->price = pr; trial->dis4 = 0; trial->prev_index = p_i; - trial->prev_index2 = dual_step_trial; } + { + trial->price = pr; trial->dis = 0; trial->prev_index = p_i; + trial->prev_index2 = dual_step_trial; + } } -static inline void Tr_update3( Trial * const trial, const int pr, - const int distance4, const int p_i, +static inline void Tr_update3( struct Trial * const trial, const int pr, + const int distance, const int p_i, const int p_i2 ) { if( pr < trial->price ) - { trial->price = pr; trial->dis4 = distance4; trial->prev_index = p_i; - trial->prev_index2 = p_i2; } + { + trial->price = pr; trial->dis = distance; trial->prev_index = p_i; + trial->prev_index2 = p_i2; + } } -typedef struct LZ_encoder +struct LZ_encoder { - LZ_encoder_base eb; + struct LZ_encoder_base eb; int cycles; int match_len_limit; - Len_prices match_len_prices; - Len_prices rep_len_prices; + struct Len_prices match_len_prices; + struct Len_prices rep_len_prices; int pending_num_pairs; - Pair pairs[max_match_len+1]; - Trial trials[max_num_trials]; + struct Pair pairs[max_match_len+1]; + struct Trial 
trials[max_num_trials]; int dis_slot_prices[len_states][2*max_dictionary_bits]; int dis_prices[len_states][modeled_distances]; int align_prices[dis_align_size]; int num_dis_slots; - int price_counter; /* counters may decrement below 0 */ + int price_counter; int dis_price_counter; int align_price_counter; bool been_flushed; - } LZ_encoder; + }; -static inline bool Mb_dec_pos( Matchfinder_base * const mb, const int ahead ) +static inline bool Mb_dec_pos( struct Matchfinder_base * const mb, + const int ahead ) { if( ahead < 0 || mb->pos < ahead ) return false; mb->pos -= ahead; - if( mb->cyclic_pos < ahead ) mb->cyclic_pos += mb->dictionary_size + 1; mb->cyclic_pos -= ahead; + if( mb->cyclic_pos < 0 ) mb->cyclic_pos += mb->dictionary_size + 1; return true; } -static int LZe_get_match_pairs( LZ_encoder * const e, Pair * pairs ); +static int LZe_get_match_pairs( struct LZ_encoder * const e, struct Pair * pairs ); - /* move-to-front dis in/into reps; do nothing if( dis4 <= 0 ) */ -static inline void mtf_reps( const int dis4, int reps[num_rep_distances] ) + /* move-to-front dis in/into reps if( dis > 0 ) */ +static inline void mtf_reps( const int dis, int reps[num_rep_distances] ) { - if( dis4 >= num_rep_distances ) /* match */ + int i; + if( dis >= num_rep_distances ) { - reps[3] = reps[2]; reps[2] = reps[1]; reps[1] = reps[0]; - reps[0] = dis4 - num_rep_distances; + for( i = num_rep_distances - 1; i > 0; --i ) reps[i] = reps[i-1]; + reps[0] = dis - num_rep_distances; } - else if( dis4 > 0 ) /* repeated match */ + else if( dis > 0 ) { - const int distance = reps[dis4]; - int i; for( i = dis4; i > 0; --i ) reps[i] = reps[i-1]; + const int distance = reps[dis]; + for( i = dis; i > 0; --i ) reps[i] = reps[i-1]; reps[0] = distance; } } -static inline int LZeb_price_shortrep( const LZ_encoder_base * const eb, +static inline int LZeb_price_shortrep( const struct LZ_encoder_base * const eb, const State state, const int pos_state ) { return price0( eb->bm_rep0[state] ) + 
price0( eb->bm_len[state][pos_state] ); } -static inline int LZeb_price_rep( const LZ_encoder_base * const eb, - const int rep, const State state, - const int pos_state ) +static inline int LZeb_price_rep( const struct LZ_encoder_base * const eb, + const int rep, + const State state, const int pos_state ) { + int price; if( rep == 0 ) return price0( eb->bm_rep0[state] ) + price1( eb->bm_len[state][pos_state] ); - int price = price1( eb->bm_rep0[state] ); + price = price1( eb->bm_rep0[state] ); if( rep == 1 ) price += price0( eb->bm_rep1[state] ); else @@ -205,15 +223,15 @@ static inline int LZeb_price_rep( const LZ_encoder_base * const eb, return price; } -static inline int LZe_price_rep0_len( const LZ_encoder * const e, - const int len, const State state, - const int pos_state ) +static inline int LZe_price_rep0_len( const struct LZ_encoder * const e, + const int len, + const State state, const int pos_state ) { return LZeb_price_rep( &e->eb, 0, state, pos_state ) + Lp_price( &e->rep_len_prices, len, pos_state ); } -static inline int LZe_price_pair( const LZ_encoder * const e, +static inline int LZe_price_pair( const struct LZ_encoder * const e, const int dis, const int len, const int pos_state ) { @@ -226,20 +244,23 @@ static inline int LZe_price_pair( const LZ_encoder * const e, e->align_prices[dis & (dis_align_size - 1)]; } -static inline int LZe_read_match_distances( LZ_encoder * const e ) +static inline int LZe_read_match_distances( struct LZ_encoder * const e ) { const int num_pairs = LZe_get_match_pairs( e, e->pairs ); if( num_pairs > 0 ) { - const int len = e->pairs[num_pairs-1].len; + int len = e->pairs[num_pairs-1].len; if( len == e->match_len_limit && len < max_match_len ) - e->pairs[num_pairs-1].len = - Mb_true_match_len( &e->eb.mb, len, e->pairs[num_pairs-1].dis + 1 ); + { + len += Mb_true_match_len( &e->eb.mb, len, e->pairs[num_pairs-1].dis + 1, + max_match_len - len ); + e->pairs[num_pairs-1].len = len; + } } return num_pairs; } -static inline bool 
LZe_move_and_update( LZ_encoder * const e, int n ) +static inline bool LZe_move_and_update( struct LZ_encoder * const e, int n ) { while( true ) { @@ -250,29 +271,29 @@ static inline bool LZe_move_and_update( LZ_encoder * const e, int n ) return true; } -static inline void LZe_backward( LZ_encoder * const e, int cur ) +static inline void LZe_backward( struct LZ_encoder * const e, int cur ) { - int dis4 = e->trials[cur].dis4; + int * const dis = &e->trials[cur].dis; while( cur > 0 ) { const int prev_index = e->trials[cur].prev_index; - Trial * const prev_trial = &e->trials[prev_index]; + struct Trial * const prev_trial = &e->trials[prev_index]; if( e->trials[cur].prev_index2 != single_step_trial ) { - prev_trial->dis4 = -1; /* literal */ + prev_trial->dis = -1; prev_trial->prev_index = prev_index - 1; prev_trial->prev_index2 = single_step_trial; if( e->trials[cur].prev_index2 >= 0 ) { - Trial * const prev_trial2 = &e->trials[prev_index-1]; - prev_trial2->dis4 = dis4; dis4 = 0; /* rep0 */ + struct Trial * const prev_trial2 = &e->trials[prev_index-1]; + prev_trial2->dis = *dis; *dis = 0; prev_trial2->prev_index = e->trials[cur].prev_index2; prev_trial2->prev_index2 = single_step_trial; } } prev_trial->price = cur - prev_index; /* len */ - cur = dis4; dis4 = prev_trial->dis4; prev_trial->dis4 = cur; + cur = *dis; *dis = prev_trial->dis; prev_trial->dis = cur; cur = prev_index; } } @@ -280,11 +301,11 @@ static inline void LZe_backward( LZ_encoder * const e, int cur ) enum { num_prev_positions3 = 1 << 16, num_prev_positions2 = 1 << 10 }; -static inline bool LZe_init( LZ_encoder * const e, +static inline bool LZe_init( struct LZ_encoder * const e, const int dict_size, const int len_limit, const unsigned long long member_size ) { - enum { before_size = max_num_trials, + enum { before = max_num_trials + 1, /* bytes to keep in buffer after pos */ after_size = max_num_trials + ( 2 * max_match_len ) + 1, dict_factor = 2, @@ -292,10 +313,10 @@ static inline bool LZe_init( 
LZ_encoder * const e, pos_array_factor = 2, min_free_bytes = 2 * max_num_trials }; - if( !LZeb_init( &e->eb, before_size, dict_size, after_size, dict_factor, + if( !LZeb_init( &e->eb, before, dict_size, after_size, dict_factor, num_prev_positions23, pos_array_factor, min_free_bytes, member_size ) ) return false; - e->cycles = (len_limit < max_match_len) ? 16 + ( len_limit / 2 ) : 256; + e->cycles = ( len_limit < max_match_len ) ? 16 + ( len_limit / 2 ) : 256; e->match_len_limit = len_limit; Lp_init( &e->match_len_prices, &e->eb.match_len_model, e->match_len_limit ); Lp_init( &e->rep_len_prices, &e->eb.rep_len_model, e->match_len_limit ); @@ -310,7 +331,7 @@ static inline bool LZe_init( LZ_encoder * const e, return true; } -static inline void LZe_reset( LZ_encoder * const e, +static inline void LZe_reset( struct LZ_encoder * const e, const unsigned long long member_size ) { LZeb_reset( &e->eb, member_size ); diff --git a/encoder_base.c b/encoder_base.c index b823dfa..ee7e0bb 100644 --- a/encoder_base.c +++ b/encoder_base.c @@ -1,35 +1,42 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see . - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. 
*/ -static bool Mb_normalize_pos( Matchfinder_base * const mb ) +static bool Mb_normalize_pos( struct Matchfinder_base * const mb ) { if( mb->pos > mb->stream_pos ) { mb->pos = mb->stream_pos; return false; } if( !mb->at_stream_end ) { int i; - /* offset is int32_t for the min below */ - const int32_t offset = mb->pos - mb->before_size - mb->dictionary_size; + const int offset = mb->pos - mb->dictionary_size - mb->before_size; const int size = mb->stream_pos - offset; memmove( mb->buffer, mb->buffer + offset, size ); mb->partial_data_pos += offset; - mb->pos -= offset; /* pos = before_size + dictionary_size */ + mb->pos -= offset; mb->stream_pos -= offset; for( i = 0; i < mb->num_prev_positions; ++i ) mb->prev_positions[i] -= min( mb->prev_positions[i], offset ); @@ -40,42 +47,43 @@ static bool Mb_normalize_pos( Matchfinder_base * const mb ) } -static bool Mb_init( Matchfinder_base * const mb, const int before_size, - const int dict_size, const int after_size, - const int dict_factor, const int num_prev_positions23, +static bool Mb_init( struct Matchfinder_base * const mb, + const int before, const int dict_size, + const int after_size, const int dict_factor, + const int num_prev_positions23, const int pos_array_factor ) { const int buffer_size_limit = - ( dict_factor * dict_size ) + before_size + after_size; + ( dict_factor * dict_size ) + before + after_size; + unsigned size; int i; mb->partial_data_pos = 0; - mb->before_size = before_size; + mb->before_size = before; mb->after_size = after_size; mb->pos = 0; mb->cyclic_pos = 0; mb->stream_pos = 0; - mb->num_prev_positions23 = num_prev_positions23; mb->at_stream_end = false; - mb->sync_flush_pending = false; + mb->flushing = false; mb->buffer_size = max( 65536, buffer_size_limit ); mb->buffer = (uint8_t *)malloc( mb->buffer_size ); if( !mb->buffer ) return false; - mb->saved_dictionary_size = dict_size; mb->dictionary_size = dict_size; mb->pos_limit = mb->buffer_size - after_size; - unsigned size = 1 << max( 16, 
real_bits( mb->dictionary_size - 1 ) - 2 ); - if( mb->dictionary_size > 1 << 26 ) size >>= 1; /* 64 MiB */ - mb->key4_mask = size - 1; /* increases with dictionary size */ + size = 1 << max( 16, real_bits( mb->dictionary_size - 1 ) - 2 ); + if( mb->dictionary_size > 1 << 26 ) /* 64 MiB */ + size >>= 1; + mb->key4_mask = size - 1; + mb->num_prev_positions23 = num_prev_positions23; size += num_prev_positions23; - mb->num_prev_positions = size; + mb->num_prev_positions = size; mb->pos_array_size = pos_array_factor * ( mb->dictionary_size + 1 ); size += mb->pos_array_size; - if( size * sizeof mb->prev_positions[0] <= size ) mb->prev_positions = 0; - else mb->prev_positions = - (int32_t *)malloc( size * sizeof mb->prev_positions[0] ); + if( size * sizeof (int32_t) <= size ) mb->prev_positions = 0; + else mb->prev_positions = (int32_t *)malloc( size * sizeof (int32_t) ); if( !mb->prev_positions ) { free( mb->buffer ); return false; } mb->pos_array = mb->prev_positions + mb->num_prev_positions; for( i = 0; i < mb->num_prev_positions; ++i ) mb->prev_positions[i] = 0; @@ -83,29 +91,26 @@ static bool Mb_init( Matchfinder_base * const mb, const int before_size, } -static void Mb_adjust_array( Matchfinder_base * const mb ) - { - int size = 1 << max( 16, real_bits( mb->dictionary_size - 1 ) - 2 ); - if( mb->dictionary_size > 1 << 26 ) size >>= 1; /* 64 MiB */ - mb->key4_mask = size - 1; - size += mb->num_prev_positions23; - mb->num_prev_positions = size; - mb->pos_array = mb->prev_positions + mb->num_prev_positions; - } - - -static void Mb_adjust_dictionary_size( Matchfinder_base * const mb ) +static void Mb_adjust_dictionary_size( struct Matchfinder_base * const mb ) { if( mb->stream_pos < mb->dictionary_size ) { - mb->dictionary_size = max( min_dictionary_size, mb->stream_pos ); - Mb_adjust_array( mb ); - mb->pos_limit = mb->buffer_size; + int size; + mb->buffer_size = + mb->dictionary_size = + mb->pos_limit = max( min_dictionary_size, mb->stream_pos ); + size = 1 << max( 16, 
real_bits( mb->dictionary_size - 1 ) - 2 ); + if( mb->dictionary_size > 1 << 26 ) + size >>= 1; + mb->key4_mask = size - 1; + size += mb->num_prev_positions23; + mb->num_prev_positions = size; + mb->pos_array = mb->prev_positions + mb->num_prev_positions; } } -static void Mb_reset( Matchfinder_base * const mb ) +static void Mb_reset( struct Matchfinder_base * const mb ) { int i; if( mb->stream_pos > mb->pos ) @@ -115,62 +120,60 @@ static void Mb_reset( Matchfinder_base * const mb ) mb->pos = 0; mb->cyclic_pos = 0; mb->at_stream_end = false; - mb->sync_flush_pending = false; - mb->dictionary_size = mb->saved_dictionary_size; - Mb_adjust_array( mb ); - mb->pos_limit = mb->buffer_size - mb->after_size; + mb->flushing = false; for( i = 0; i < mb->num_prev_positions; ++i ) mb->prev_positions[i] = 0; } -/* End Of Stream marker => (dis == 0xFFFFFFFFU, len == min_match_len) */ -static void LZeb_try_full_flush( LZ_encoder_base * const eb ) + /* End Of Stream mark => (dis == 0xFFFFFFFFU, len == min_match_len) */ +static bool LZeb_full_flush( struct LZ_encoder_base * const eb ) { - if( eb->member_finished || Cb_free_bytes( &eb->renc.cb ) < - max_marker_size + eb->renc.ff_count + Lt_size ) return; - eb->member_finished = true; + int i; const int pos_state = Mb_data_position( &eb->mb ) & pos_state_mask; const State state = eb->state; + File_trailer trailer; + if( eb->member_finished || + Cb_free_bytes( &eb->renc.cb ) < max_marker_size + eb->renc.ff_count + Ft_size ) + return false; Re_encode_bit( &eb->renc, &eb->bm_match[state][pos_state], 1 ); Re_encode_bit( &eb->renc, &eb->bm_rep[state], 0 ); LZeb_encode_pair( eb, 0xFFFFFFFFU, min_match_len, pos_state ); Re_flush( &eb->renc ); - Lzip_trailer trailer; - Lt_set_data_crc( trailer, LZeb_crc( eb ) ); - Lt_set_data_size( trailer, Mb_data_position( &eb->mb ) ); - Lt_set_member_size( trailer, Re_member_position( &eb->renc ) + Lt_size ); - int i; for( i = 0; i < Lt_size; ++i ) Cb_put_byte( &eb->renc.cb, trailer[i] ); + 
Ft_set_data_crc( trailer, LZeb_crc( eb ) ); + Ft_set_data_size( trailer, Mb_data_position( &eb->mb ) ); + Ft_set_member_size( trailer, Re_member_position( &eb->renc ) + Ft_size ); + for( i = 0; i < Ft_size; ++i ) + Cb_put_byte( &eb->renc.cb, trailer[i] ); + return true; } -/* Sync Flush marker => (dis == 0xFFFFFFFFU, len == min_match_len + 1) */ -static void LZeb_try_sync_flush( LZ_encoder_base * const eb ) + /* Sync Flush mark => (dis == 0xFFFFFFFFU, len == min_match_len + 1) */ +static bool LZeb_sync_flush( struct LZ_encoder_base * const eb ) { - const unsigned min_size = eb->renc.ff_count + max_marker_size; - if( eb->member_finished || - Cb_free_bytes( &eb->renc.cb ) < min_size + max_marker_size ) return; - eb->mb.sync_flush_pending = false; - const unsigned long long old_mpos = Re_member_position( &eb->renc ); + int i; const int pos_state = Mb_data_position( &eb->mb ) & pos_state_mask; const State state = eb->state; - do { /* size of markers must be >= rd_min_available_bytes + 5 */ + if( eb->member_finished || + Cb_free_bytes( &eb->renc.cb ) < (2 * max_marker_size) + eb->renc.ff_count ) + return false; + for( i = 0; i < 2; ++i ) /* 2 consecutive markers guarantee decoding */ + { Re_encode_bit( &eb->renc, &eb->bm_match[state][pos_state], 1 ); Re_encode_bit( &eb->renc, &eb->bm_rep[state], 0 ); LZeb_encode_pair( eb, 0xFFFFFFFFU, min_match_len + 1, pos_state ); Re_flush( &eb->renc ); } - while( Re_member_position( &eb->renc ) - old_mpos < min_size ); + return true; } -static void LZeb_reset( LZ_encoder_base * const eb, +static void LZeb_reset( struct LZ_encoder_base * const eb, const unsigned long long member_size ) { - const unsigned long long min_member_size = min_dictionary_size; - const unsigned long long max_member_size = 0x0008000000000000ULL; /* 2 PiB */ + int i; Mb_reset( &eb->mb ); - eb->member_size_limit = min( max( min_member_size, member_size ), - max_member_size ) - Lt_size - max_marker_size; + eb->member_size_limit = member_size - Ft_size - 
max_marker_size; eb->crc = 0xFFFFFFFFU; Bm_array_init( eb->bm_literal[0], (1 << literal_context_bits) * 0x300 ); Bm_array_init( eb->bm_match[0], states * pos_states ); @@ -180,12 +183,12 @@ static void LZeb_reset( LZ_encoder_base * const eb, Bm_array_init( eb->bm_rep2, states ); Bm_array_init( eb->bm_len[0], states * pos_states ); Bm_array_init( eb->bm_dis_slot[0], len_states * (1 << dis_slot_bits) ); - Bm_array_init( eb->bm_dis, modeled_distances - end_dis_model + 1 ); + Bm_array_init( eb->bm_dis, modeled_distances - end_dis_model ); Bm_array_init( eb->bm_align, dis_align_size ); Lm_init( &eb->match_len_model ); Lm_init( &eb->rep_len_model ); - Re_reset( &eb->renc, eb->mb.dictionary_size ); - int i; for( i = 0; i < num_rep_distances; ++i ) eb->reps[i] = 0; + Re_reset( &eb->renc ); + for( i = 0; i < num_rep_distances; ++i ) eb->reps[i] = 0; eb->state = 0; eb->member_finished = false; } diff --git a/encoder_base.h b/encoder_base.h index b4a6f02..3209922 100644 --- a/encoder_base.h +++ b/encoder_base.h @@ -1,20 +1,28 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see <http://www.gnu.org/licenses/>. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. */ enum { price_shift_bits = 6, @@ -141,45 +149,22 @@ static inline int price0( const Bit_model probability ) static inline int price1( const Bit_model probability ) { return get_price( bit_model_total - probability ); } -static inline int price_bit( const Bit_model bm, const bool bit ) - { return bit ? 
price1( bm ) : price0( bm ); } +static inline int price_bit( const Bit_model bm, const int bit ) + { if( bit ) return price1( bm ); else return price0( bm ); } -static inline int price_symbol3( const Bit_model bm[], int symbol ) +static inline int price_symbol( const Bit_model bm[], int symbol, + const int num_bits ) { - bool bit = symbol & 1; - symbol |= 8; symbol >>= 1; - int price = price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - return price + price_bit( bm[1], symbol & 1 ); - } - - -static inline int price_symbol6( const Bit_model bm[], unsigned symbol ) - { - bool bit = symbol & 1; - symbol |= 64; symbol >>= 1; - int price = price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - return price + price_bit( bm[1], symbol & 1 ); - } - - -static inline int price_symbol8( const Bit_model bm[], int symbol ) - { - bool bit = symbol & 1; - symbol |= 0x100; symbol >>= 1; - int price = price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit ); - return price + price_bit( bm[1], symbol & 1 ); + int price = 0; + symbol |= ( 1 << num_bits ); + while( symbol > 1 ) + { + const int bit = symbol & 1; + symbol >>= 1; + price += price_bit( bm[symbol], bit ); + } + return price; } @@ -191,33 +176,37 @@ static inline int price_symbol_reversed( const Bit_model 
bm[], int symbol, int i; for( i = num_bits; i > 0; --i ) { - const bool bit = symbol & 1; - symbol >>= 1; + const int bit = symbol & 1; price += price_bit( bm[model], bit ); - model <<= 1; model |= bit; + model = ( model << 1 ) | bit; + symbol >>= 1; } return price; } -static inline int price_matched( const Bit_model bm[], unsigned symbol, - unsigned match_byte ) +static inline int price_matched( const Bit_model bm[], int symbol, + int match_byte ) { int price = 0; - unsigned mask = 0x100; + int mask = 0x100; symbol |= mask; - while( true ) - { - const unsigned match_bit = ( match_byte <<= 1 ) & mask; - const bool bit = ( symbol <<= 1 ) & 0x100; - price += price_bit( bm[(symbol>>9)+match_bit+mask], bit ); - if( symbol >= 0x10000 ) return price; - mask &= ~(match_bit ^ symbol); /* if( match_bit != bit ) mask = 0; */ + + do { + int match_bit, bit; + match_byte <<= 1; + match_bit = match_byte & mask; + symbol <<= 1; + bit = symbol & 0x100; + price += price_bit( bm[match_bit+(symbol>>9)+mask], bit ); + mask &= ~(match_byte ^ symbol); /* if( match_bit != bit ) mask = 0; */ } + while( symbol < 0x10000 ); + return price; } -typedef struct Matchfinder_base +struct Matchfinder_base { unsigned long long partial_data_pos; uint8_t * buffer; /* input buffer */ @@ -235,55 +224,56 @@ typedef struct Matchfinder_base int num_prev_positions23; int num_prev_positions; /* size of prev_positions */ int pos_array_size; - int saved_dictionary_size; /* dictionary_size restored by Mb_reset */ bool at_stream_end; /* stream_pos shows real end of file */ - bool sync_flush_pending; - } Matchfinder_base; + bool flushing; + }; -static bool Mb_normalize_pos( Matchfinder_base * const mb ); +static bool Mb_normalize_pos( struct Matchfinder_base * const mb ); -static bool Mb_init( Matchfinder_base * const mb, const int before_size, - const int dict_size, const int after_size, - const int dict_factor, const int num_prev_positions23, +static bool Mb_init( struct Matchfinder_base * const mb, + const 
int before, const int dict_size, + const int after_size, const int dict_factor, + const int num_prev_positions23, const int pos_array_factor ); -static inline void Mb_free( Matchfinder_base * const mb ) +static inline void Mb_free( struct Matchfinder_base * const mb ) { free( mb->prev_positions ); free( mb->buffer ); } -static inline uint8_t Mb_peek( const Matchfinder_base * const mb, +static inline uint8_t Mb_peek( const struct Matchfinder_base * const mb, const int distance ) { return mb->buffer[mb->pos-distance]; } -static inline int Mb_available_bytes( const Matchfinder_base * const mb ) +static inline int Mb_available_bytes( const struct Matchfinder_base * const mb ) { return mb->stream_pos - mb->pos; } static inline unsigned long long -Mb_data_position( const Matchfinder_base * const mb ) +Mb_data_position( const struct Matchfinder_base * const mb ) { return mb->partial_data_pos + mb->pos; } -static inline void Mb_finish( Matchfinder_base * const mb ) - { mb->at_stream_end = true; mb->sync_flush_pending = false; } +static inline void Mb_finish( struct Matchfinder_base * const mb ) + { mb->at_stream_end = true; mb->flushing = false; } -static inline bool Mb_data_finished( const Matchfinder_base * const mb ) - { return mb->at_stream_end && mb->pos >= mb->stream_pos; } +static inline bool Mb_data_finished( const struct Matchfinder_base * const mb ) + { return mb->at_stream_end && !mb->flushing && mb->pos >= mb->stream_pos; } -static inline bool Mb_flushing_or_end( const Matchfinder_base * const mb ) - { return mb->at_stream_end || mb->sync_flush_pending; } +static inline bool Mb_flushing_or_end( const struct Matchfinder_base * const mb ) + { return mb->at_stream_end || mb->flushing; } -static inline int Mb_free_bytes( const Matchfinder_base * const mb ) +static inline int Mb_free_bytes( const struct Matchfinder_base * const mb ) { if( Mb_flushing_or_end( mb ) ) return 0; return mb->buffer_size - mb->stream_pos; } -static inline bool -Mb_enough_available_bytes( 
const Matchfinder_base * const mb ) - { return mb->pos + mb->after_size <= mb->stream_pos || - ( Mb_flushing_or_end( mb ) && mb->pos < mb->stream_pos ); } +static inline bool Mb_enough_available_bytes( const struct Matchfinder_base * const mb ) + { + return ( mb->pos + mb->after_size <= mb->stream_pos || + ( Mb_flushing_or_end( mb ) && mb->pos < mb->stream_pos ) ); + } static inline const uint8_t * -Mb_ptr_to_current_pos( const Matchfinder_base * const mb ) +Mb_ptr_to_current_pos( const struct Matchfinder_base * const mb ) { return mb->buffer + mb->pos; } -static int Mb_write_data( Matchfinder_base * const mb, +static int Mb_write_data( struct Matchfinder_base * const mb, const uint8_t * const inbuf, const int size ) { const int sz = min( mb->buffer_size - mb->stream_pos, size ); @@ -293,17 +283,19 @@ static int Mb_write_data( Matchfinder_base * const mb, return sz; } -static inline int Mb_true_match_len( const Matchfinder_base * const mb, - const int index, const int distance ) +static inline int Mb_true_match_len( const struct Matchfinder_base * const mb, + const int index, const int distance, + int len_limit ) { - const uint8_t * const data = mb->buffer + mb->pos; - int i = index; - const int len_limit = min( Mb_available_bytes( mb ), max_match_len ); + const uint8_t * const data = mb->buffer + mb->pos + index; + int i = 0; + if( index + len_limit > Mb_available_bytes( mb ) ) + len_limit = Mb_available_bytes( mb ) - index; while( i < len_limit && data[i-distance] == data[i] ) ++i; return i; } -static inline bool Mb_move_pos( Matchfinder_base * const mb ) +static inline bool Mb_move_pos( struct Matchfinder_base * const mb ) { if( ++mb->cyclic_pos > mb->dictionary_size ) mb->cyclic_pos = 0; if( ++mb->pos >= mb->pos_limit ) return Mb_normalize_pos( mb ); @@ -311,23 +303,23 @@ static inline bool Mb_move_pos( Matchfinder_base * const mb ) } -typedef struct Range_encoder +struct Range_encoder { - Circular_buffer cb; + struct Circular_buffer cb; unsigned 
min_free_bytes; uint64_t low; unsigned long long partial_member_pos; uint32_t range; unsigned ff_count; uint8_t cache; - Lzip_header header; - } Range_encoder; + File_header header; + }; -static inline void Re_shift_low( Range_encoder * const renc ) +static inline void Re_shift_low( struct Range_encoder * const renc ) { - if( renc->low >> 24 != 0xFF ) + const bool carry = ( renc->low > 0xFFFFFFFFU ); + if( carry || renc->low < 0xFF000000U ) { - const bool carry = renc->low > 0xFFFFFFFFU; Cb_put_byte( &renc->cb, renc->cache + carry ); for( ; renc->ff_count > 0; --renc->ff_count ) Cb_put_byte( &renc->cb, 0xFF + carry ); @@ -337,41 +329,42 @@ static inline void Re_shift_low( Range_encoder * const renc ) renc->low = ( renc->low & 0x00FFFFFFU ) << 8; } -static inline void Re_reset( Range_encoder * const renc, - const unsigned dictionary_size ) +static inline void Re_reset( struct Range_encoder * const renc ) { + int i; Cb_reset( &renc->cb ); renc->low = 0; renc->partial_member_pos = 0; renc->range = 0xFFFFFFFFU; renc->ff_count = 0; renc->cache = 0; - Lh_set_dictionary_size( renc->header, dictionary_size ); - int i; for( i = 0; i < Lh_size; ++i ) Cb_put_byte( &renc->cb, renc->header[i] ); + for( i = 0; i < Fh_size; ++i ) + Cb_put_byte( &renc->cb, renc->header[i] ); } -static inline bool Re_init( Range_encoder * const renc, +static inline bool Re_init( struct Range_encoder * const renc, const unsigned dictionary_size, const unsigned min_free_bytes ) { if( !Cb_init( &renc->cb, 65536 + min_free_bytes ) ) return false; renc->min_free_bytes = min_free_bytes; - Lh_set_magic( renc->header ); - Re_reset( renc, dictionary_size ); + Fh_set_magic( renc->header ); + Fh_set_dictionary_size( renc->header, dictionary_size ); + Re_reset( renc ); return true; } -static inline void Re_free( Range_encoder * const renc ) +static inline void Re_free( struct Range_encoder * const renc ) { Cb_free( &renc->cb ); } static inline unsigned long long -Re_member_position( const Range_encoder * const 
renc ) +Re_member_position( const struct Range_encoder * const renc ) { return renc->partial_member_pos + Cb_used_bytes( &renc->cb ) + renc->ff_count; } -static inline bool Re_enough_free_bytes( const Range_encoder * const renc ) +static inline bool Re_enough_free_bytes( const struct Range_encoder * const renc ) { return Cb_free_bytes( &renc->cb ) >= renc->min_free_bytes + renc->ff_count; } -static inline int Re_read_data( Range_encoder * const renc, +static inline int Re_read_data( struct Range_encoder * const renc, uint8_t * const out_buffer, const int out_size ) { const int size = Cb_read_data( &renc->cb, out_buffer, out_size ); @@ -379,7 +372,7 @@ static inline int Re_read_data( Range_encoder * const renc, return size; } -static inline void Re_flush( Range_encoder * const renc ) +static inline void Re_flush( struct Range_encoder * const renc ) { int i; for( i = 0; i < 5; ++i ) Re_shift_low( renc ); renc->low = 0; @@ -388,20 +381,21 @@ static inline void Re_flush( Range_encoder * const renc ) renc->cache = 0; } -static inline void Re_encode( Range_encoder * const renc, +static inline void Re_encode( struct Range_encoder * const renc, const int symbol, const int num_bits ) { - unsigned mask; - for( mask = 1 << ( num_bits - 1 ); mask > 0; mask >>= 1 ) + int i; + for( i = num_bits - 1; i >= 0; --i ) { renc->range >>= 1; - if( symbol & mask ) renc->low += renc->range; - if( renc->range <= 0x00FFFFFFU ) { renc->range <<= 8; Re_shift_low( renc ); } + if( (symbol >> i) & 1 ) renc->low += renc->range; + if( renc->range <= 0x00FFFFFFU ) + { renc->range <<= 8; Re_shift_low( renc ); } } } -static inline void Re_encode_bit( Range_encoder * const renc, - Bit_model * const probability, const bool bit ) +static inline void Re_encode_bit( struct Range_encoder * const renc, + Bit_model * const probability, const int bit ) { const uint32_t bound = ( renc->range >> bit_model_total_bits ) * *probability; if( !bit ) @@ -415,96 +409,76 @@ static inline void Re_encode_bit( 
Range_encoder * const renc, renc->range -= bound; *probability -= *probability >> bit_model_move_bits; } - if( renc->range <= 0x00FFFFFFU ) { renc->range <<= 8; Re_shift_low( renc ); } + if( renc->range <= 0x00FFFFFFU ) + { renc->range <<= 8; Re_shift_low( renc ); } } -static inline void Re_encode_tree3( Range_encoder * const renc, - Bit_model bm[], const int symbol ) - { - bool bit = ( symbol >> 2 ) & 1; - Re_encode_bit( renc, &bm[1], bit ); - int model = 2 | bit; - bit = ( symbol >> 1 ) & 1; - Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit; - Re_encode_bit( renc, &bm[model], symbol & 1 ); - } - -static inline void Re_encode_tree6( Range_encoder * const renc, - Bit_model bm[], const unsigned symbol ) - { - bool bit = ( symbol >> 5 ) & 1; - Re_encode_bit( renc, &bm[1], bit ); - int model = 2 | bit; - bit = ( symbol >> 4 ) & 1; - Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit; - bit = ( symbol >> 3 ) & 1; - Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit; - bit = ( symbol >> 2 ) & 1; - Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit; - bit = ( symbol >> 1 ) & 1; - Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit; - Re_encode_bit( renc, &bm[model], symbol & 1 ); - } - -static inline void Re_encode_tree8( Range_encoder * const renc, - Bit_model bm[], const int symbol ) +static inline void Re_encode_tree( struct Range_encoder * const renc, + Bit_model bm[], const int symbol, const int num_bits ) { + int mask = ( 1 << ( num_bits - 1 ) ); int model = 1; int i; - for( i = 7; i >= 0; --i ) + for( i = num_bits; i > 0; --i, mask >>= 1 ) { - const bool bit = ( symbol >> i ) & 1; + const int bit = ( symbol & mask ); Re_encode_bit( renc, &bm[model], bit ); - model <<= 1; model |= bit; + model <<= 1; + if( bit ) model |= 1; } } -static inline void Re_encode_tree_reversed( Range_encoder * const renc, - Bit_model bm[], int symbol, const int num_bits ) +static inline void 
Re_encode_tree_reversed( struct Range_encoder * const renc, + Bit_model bm[], int symbol, const int num_bits ) { int model = 1; int i; for( i = num_bits; i > 0; --i ) { - const bool bit = symbol & 1; - symbol >>= 1; + const int bit = symbol & 1; Re_encode_bit( renc, &bm[model], bit ); - model <<= 1; model |= bit; + model = ( model << 1 ) | bit; + symbol >>= 1; } } -static inline void Re_encode_matched( Range_encoder * const renc, - Bit_model bm[], unsigned symbol, - unsigned match_byte ) +static inline void Re_encode_matched( struct Range_encoder * const renc, + Bit_model bm[], int symbol, + int match_byte ) { - unsigned mask = 0x100; + int mask = 0x100; symbol |= mask; - while( true ) - { - const unsigned match_bit = ( match_byte <<= 1 ) & mask; - const bool bit = ( symbol <<= 1 ) & 0x100; - Re_encode_bit( renc, &bm[(symbol>>9)+match_bit+mask], bit ); - if( symbol >= 0x10000 ) break; - mask &= ~(match_bit ^ symbol); /* if( match_bit != bit ) mask = 0; */ + + do { + int match_bit, bit; + match_byte <<= 1; + match_bit = match_byte & mask; + symbol <<= 1; + bit = symbol & 0x100; + Re_encode_bit( renc, &bm[match_bit+(symbol>>9)+mask], bit ); + mask &= ~(match_byte ^ symbol); /* if( match_bit != bit ) mask = 0; */ } + while( symbol < 0x10000 ); } -static inline void Re_encode_len( Range_encoder * const renc, - Len_model * const lm, +static inline void Re_encode_len( struct Range_encoder * const renc, + struct Len_model * const lm, int symbol, const int pos_state ) { - bool bit = ( symbol -= min_match_len ) >= len_low_symbols; + bool bit = ( ( symbol -= min_match_len ) >= len_low_symbols ); Re_encode_bit( renc, &lm->choice1, bit ); if( !bit ) - Re_encode_tree3( renc, lm->bm_low[pos_state], symbol ); + Re_encode_tree( renc, lm->bm_low[pos_state], symbol, len_low_bits ); else { - bit = ( symbol -= len_low_symbols ) >= len_mid_symbols; + bit = ( symbol >= len_low_symbols + len_mid_symbols ); Re_encode_bit( renc, &lm->choice2, bit ); if( !bit ) - Re_encode_tree3( renc, 
lm->bm_mid[pos_state], symbol ); + Re_encode_tree( renc, lm->bm_mid[pos_state], + symbol - len_low_symbols, len_mid_bits ); else - Re_encode_tree8( renc, lm->bm_high, symbol - len_mid_symbols ); + Re_encode_tree( renc, lm->bm_high, + symbol - len_low_symbols - len_mid_symbols, len_high_bits ); } } @@ -512,9 +486,9 @@ static inline void Re_encode_len( Range_encoder * const renc, enum { max_marker_size = 16, num_rep_distances = 4 }; /* must be 4 */ -typedef struct LZ_encoder_base +struct LZ_encoder_base { - Matchfinder_base mb; + struct Matchfinder_base mb; unsigned long long member_size_limit; uint32_t crc; @@ -526,28 +500,28 @@ typedef struct LZ_encoder_base Bit_model bm_rep2[states]; Bit_model bm_len[states][pos_states]; Bit_model bm_dis_slot[len_states][1<mb, before_size, dict_size, after_size, dict_factor, + if( !Mb_init( &eb->mb, before, dict_size, after_size, dict_factor, num_prev_positions23, pos_array_factor ) ) return false; if( !Re_init( &eb->renc, eb->mb.dictionary_size, min_free_bytes ) ) return false; @@ -555,40 +529,44 @@ static inline bool LZeb_init( LZ_encoder_base * const eb, return true; } -static inline bool LZeb_member_finished( const LZ_encoder_base * const eb ) - { return eb->member_finished && Cb_empty( &eb->renc.cb ); } +static inline bool LZeb_member_finished( const struct LZ_encoder_base * const eb ) + { return ( eb->member_finished && !Cb_used_bytes( &eb->renc.cb ) ); } -static inline void LZeb_free( LZ_encoder_base * const eb ) +static inline void LZeb_free( struct LZ_encoder_base * const eb ) { Re_free( &eb->renc ); Mb_free( &eb->mb ); } -static inline unsigned LZeb_crc( const LZ_encoder_base * const eb ) +static inline unsigned LZeb_crc( const struct LZ_encoder_base * const eb ) { return eb->crc ^ 0xFFFFFFFFU; } -static inline int LZeb_price_literal( const LZ_encoder_base * const eb, - const uint8_t prev_byte, const uint8_t symbol ) - { return price_symbol8( eb->bm_literal[get_lit_state(prev_byte)], symbol ); } +static inline int 
LZeb_price_literal( const struct LZ_encoder_base * const eb, + uint8_t prev_byte, uint8_t symbol ) + { return price_symbol( eb->bm_literal[get_lit_state(prev_byte)], symbol, 8 ); } -static inline int LZeb_price_matched( const LZ_encoder_base * const eb, - const uint8_t prev_byte, const uint8_t symbol, const uint8_t match_byte ) +static inline int LZeb_price_matched( const struct LZ_encoder_base * const eb, + uint8_t prev_byte, uint8_t symbol, + uint8_t match_byte ) { return price_matched( eb->bm_literal[get_lit_state(prev_byte)], symbol, match_byte ); } -static inline void LZeb_encode_literal( LZ_encoder_base * const eb, - const uint8_t prev_byte, const uint8_t symbol ) - { Re_encode_tree8( &eb->renc, eb->bm_literal[get_lit_state(prev_byte)], symbol ); } +static inline void LZeb_encode_literal( struct LZ_encoder_base * const eb, + uint8_t prev_byte, uint8_t symbol ) + { Re_encode_tree( &eb->renc, + eb->bm_literal[get_lit_state(prev_byte)], symbol, 8 ); } -static inline void LZeb_encode_matched( LZ_encoder_base * const eb, - const uint8_t prev_byte, const uint8_t symbol, const uint8_t match_byte ) +static inline void LZeb_encode_matched( struct LZ_encoder_base * const eb, + uint8_t prev_byte, uint8_t symbol, + uint8_t match_byte ) { Re_encode_matched( &eb->renc, eb->bm_literal[get_lit_state(prev_byte)], symbol, match_byte ); } -static inline void LZeb_encode_pair( LZ_encoder_base * const eb, +static inline void LZeb_encode_pair( struct LZ_encoder_base * const eb, const unsigned dis, const int len, const int pos_state ) { + const int dis_slot = get_slot( dis ); Re_encode_len( &eb->renc, &eb->match_len_model, len, pos_state ); - const unsigned dis_slot = get_slot( dis ); - Re_encode_tree6( &eb->renc, eb->bm_dis_slot[get_len_state(len)], dis_slot ); + Re_encode_tree( &eb->renc, eb->bm_dis_slot[get_len_state(len)], dis_slot, + dis_slot_bits ); if( dis_slot >= start_dis_model ) { @@ -597,7 +575,7 @@ static inline void LZeb_encode_pair( LZ_encoder_base * const eb, const 
unsigned direct_dis = dis - base; if( dis_slot < end_dis_model ) - Re_encode_tree_reversed( &eb->renc, eb->bm_dis + ( base - dis_slot ), + Re_encode_tree_reversed( &eb->renc, eb->bm_dis + base - dis_slot - 1, direct_dis, direct_bits ); else { diff --git a/fast_encoder.c b/fast_encoder.c index bd675bb..03697cc 100644 --- a/fast_encoder.c +++ b/fast_encoder.c @@ -1,79 +1,99 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see <http://www.gnu.org/licenses/>. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. 
Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. */ -static int FLZe_longest_match_len( FLZ_encoder * const fe, int * const distance ) +int FLZe_longest_match_len( struct FLZ_encoder * const fe, int * const distance ) { enum { len_limit = 16 }; - int32_t * ptr0 = fe->eb.mb.pos_array + fe->eb.mb.cyclic_pos; - const int available = min( Mb_available_bytes( &fe->eb.mb ), max_match_len ); - if( available < len_limit ) { *ptr0 = 0; return 0; } - const uint8_t * const data = Mb_ptr_to_current_pos( &fe->eb.mb ); - fe->key4 = ( ( fe->key4 << 4 ) ^ data[3] ) & fe->eb.mb.key4_mask; + int32_t * ptr0 = fe->eb.mb.pos_array + fe->eb.mb.cyclic_pos; + int32_t * newptr; const int pos1 = fe->eb.mb.pos + 1; - int newpos1 = fe->eb.mb.prev_positions[fe->key4]; + int maxlen = 0; + int count, delta, newpos; + if( len_limit > Mb_available_bytes( &fe->eb.mb ) ) { *ptr0 = 0; return 0; } + + fe->key4 = ( ( fe->key4 << 4 ) ^ data[3] ) & fe->eb.mb.key4_mask; + newpos = fe->eb.mb.prev_positions[fe->key4]; fe->eb.mb.prev_positions[fe->key4] = pos1; - int maxlen = 0, count; for( count = 4; ; ) { - int delta; - if( newpos1 <= 0 || --count < 0 || - ( delta = pos1 - newpos1 ) > fe->eb.mb.dictionary_size ) - { *ptr0 = 0; break; } - int32_t * const newptr = fe->eb.mb.pos_array + + if( --count < 0 || newpos <= 0 ) { *ptr0 = 0; break; } + delta = pos1 - newpos; + if( delta > fe->eb.mb.dictionary_size ) { *ptr0 = 0; break; } + newptr = fe->eb.mb.pos_array + ( fe->eb.mb.cyclic_pos - delta + ( ( fe->eb.mb.cyclic_pos >= delta ) ? 
0 : fe->eb.mb.dictionary_size + 1 ) ); if( data[maxlen-delta] == data[maxlen] ) { int len = 0; - while( len < available && data[len-delta] == data[len] ) ++len; - if( maxlen < len ) - { maxlen = len; *distance = delta - 1; - if( maxlen >= len_limit ) { *ptr0 = *newptr; break; } } + while( len < len_limit && data[len-delta] == data[len] ) ++len; + if( maxlen < len ) { maxlen = len; *distance = delta - 1; } } - *ptr0 = newpos1; - ptr0 = newptr; - newpos1 = *ptr0; + if( maxlen < len_limit ) + { + *ptr0 = newpos; + ptr0 = newptr; + newpos = *ptr0; + } + else + { + *ptr0 = *newptr; + maxlen += Mb_true_match_len( &fe->eb.mb, maxlen, *distance + 1, + max_match_len - maxlen ); + break; + } } return maxlen; } -static bool FLZe_encode_member( FLZ_encoder * const fe ) +bool FLZe_encode_member( struct FLZ_encoder * const fe ) { int rep = 0, i; State * const state = &fe->eb.state; if( fe->eb.member_finished ) return true; if( Re_member_position( &fe->eb.renc ) >= fe->eb.member_size_limit ) - { LZeb_try_full_flush( &fe->eb ); return true; } + { + if( LZeb_full_flush( &fe->eb ) ) fe->eb.member_finished = true; + return true; + } if( Mb_data_position( &fe->eb.mb ) == 0 && !Mb_data_finished( &fe->eb.mb ) ) /* encode first byte */ { + const uint8_t prev_byte = 0; + uint8_t cur_byte; if( !Mb_enough_available_bytes( &fe->eb.mb ) || !Re_enough_free_bytes( &fe->eb.renc ) ) return true; - const uint8_t prev_byte = 0; - const uint8_t cur_byte = Mb_peek( &fe->eb.mb, 0 ); + cur_byte = Mb_peek( &fe->eb.mb, 0 ); Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[*state][0], 0 ); LZeb_encode_literal( &fe->eb, prev_byte, cur_byte ); CRC32_update_byte( &fe->eb.crc, cur_byte ); @@ -84,16 +104,17 @@ static bool FLZe_encode_member( FLZ_encoder * const fe ) while( !Mb_data_finished( &fe->eb.mb ) && Re_member_position( &fe->eb.renc ) < fe->eb.member_size_limit ) { + int match_distance; + int main_len, pos_state, len = 0; if( !Mb_enough_available_bytes( &fe->eb.mb ) || !Re_enough_free_bytes( &fe->eb.renc ) 
) return true; - int match_distance = 0; /* avoid warning from gcc 6.1.0 */ - const int main_len = FLZe_longest_match_len( fe, &match_distance ); - const int pos_state = Mb_data_position( &fe->eb.mb ) & pos_state_mask; - int len = 0; + main_len = FLZe_longest_match_len( fe, &match_distance ); + pos_state = Mb_data_position( &fe->eb.mb ) & pos_state_mask; for( i = 0; i < num_rep_distances; ++i ) { - const int tlen = Mb_true_match_len( &fe->eb.mb, 0, fe->eb.reps[i] + 1 ); + const int tlen = Mb_true_match_len( &fe->eb.mb, 0, + fe->eb.reps[i] + 1, max_match_len ); if( tlen > len ) { len = tlen; rep = i; } } if( len > min_match_len && len + 3 > main_len ) @@ -106,10 +127,11 @@ static bool FLZe_encode_member( FLZ_encoder * const fe ) Re_encode_bit( &fe->eb.renc, &fe->eb.bm_len[*state][pos_state], 1 ); else { + int distance; Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep1[*state], rep > 1 ); if( rep > 1 ) Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep2[*state], rep > 2 ); - const int distance = fe->eb.reps[rep]; + distance = fe->eb.reps[rep]; for( i = rep; i > 0; --i ) fe->eb.reps[i] = fe->eb.reps[i-1]; fe->eb.reps[0] = distance; } @@ -134,6 +156,7 @@ static bool FLZe_encode_member( FLZ_encoder * const fe ) continue; } + { const uint8_t prev_byte = Mb_peek( &fe->eb.mb, 1 ); const uint8_t cur_byte = Mb_peek( &fe->eb.mb, 0 ); const uint8_t match_byte = Mb_peek( &fe->eb.mb, fe->eb.reps[0] + 1 ); @@ -142,34 +165,36 @@ static bool FLZe_encode_member( FLZ_encoder * const fe ) if( match_byte == cur_byte ) { - const int shortrep_price = price1( fe->eb.bm_match[*state][pos_state] ) + - price1( fe->eb.bm_rep[*state] ) + - price0( fe->eb.bm_rep0[*state] ) + - price0( fe->eb.bm_len[*state][pos_state] ); + const int short_rep_price = price1( fe->eb.bm_match[*state][pos_state] ) + + price1( fe->eb.bm_rep[*state] ) + + price0( fe->eb.bm_rep0[*state] ) + + price0( fe->eb.bm_len[*state][pos_state] ); int price = price0( fe->eb.bm_match[*state][pos_state] ); if( St_is_char( *state ) ) price += 
LZeb_price_literal( &fe->eb, prev_byte, cur_byte ); else price += LZeb_price_matched( &fe->eb, prev_byte, cur_byte, match_byte ); - if( shortrep_price < price ) + if( short_rep_price < price ) { Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[*state][pos_state], 1 ); Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep[*state], 1 ); Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep0[*state], 0 ); Re_encode_bit( &fe->eb.renc, &fe->eb.bm_len[*state][pos_state], 0 ); - *state = St_set_shortrep( *state ); + *state = St_set_short_rep( *state ); continue; } } /* literal byte */ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[*state][pos_state], 0 ); - if( ( *state = St_set_char( *state ) ) < 4 ) + if( St_is_char( *state ) ) LZeb_encode_literal( &fe->eb, prev_byte, cur_byte ); else LZeb_encode_matched( &fe->eb, prev_byte, cur_byte, match_byte ); + *state = St_set_char( *state ); + } } - LZeb_try_full_flush( &fe->eb ); + if( LZeb_full_flush( &fe->eb ) ) fe->eb.member_finished = true; return true; } diff --git a/fast_encoder.h b/fast_encoder.h index bce1b26..9d9e1c7 100644 --- a/fast_encoder.h +++ b/fast_encoder.h @@ -1,29 +1,37 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. 
+ This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see <http://www.gnu.org/licenses/>. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License.
*/ -typedef struct FLZ_encoder +struct FLZ_encoder { - LZ_encoder_base eb; - unsigned key4; /* key made from latest 4 bytes */ - } FLZ_encoder; + struct LZ_encoder_base eb; + int key4; /* key made from latest 4 bytes */ + }; -static inline void FLZe_reset_key4( FLZ_encoder * const fe ) +static inline void FLZe_reset_key4( struct FLZ_encoder * const fe ) { int i; fe->key4 = 0; @@ -31,40 +39,44 @@ static inline void FLZe_reset_key4( FLZ_encoder * const fe ) fe->key4 = ( fe->key4 << 4 ) ^ fe->eb.mb.buffer[i]; } -static inline bool FLZe_update_and_move( FLZ_encoder * const fe, int n ) +int FLZe_longest_match_len( struct FLZ_encoder * const fe, int * const distance ); + +static inline bool FLZe_update_and_move( struct FLZ_encoder * const fe, int n ) { - Matchfinder_base * const mb = &fe->eb.mb; while( --n >= 0 ) { - if( Mb_available_bytes( mb ) >= 4 ) + if( Mb_available_bytes( &fe->eb.mb ) >= 4 ) { - fe->key4 = ( ( fe->key4 << 4 ) ^ mb->buffer[mb->pos+3] ) & mb->key4_mask; - mb->pos_array[mb->cyclic_pos] = mb->prev_positions[fe->key4]; - mb->prev_positions[fe->key4] = mb->pos + 1; + int newpos; + fe->key4 = ( ( fe->key4 << 4 ) ^ fe->eb.mb.buffer[fe->eb.mb.pos+3] ) & + fe->eb.mb.key4_mask; + newpos = fe->eb.mb.prev_positions[fe->key4]; + fe->eb.mb.prev_positions[fe->key4] = fe->eb.mb.pos + 1; + fe->eb.mb.pos_array[fe->eb.mb.cyclic_pos] = newpos; } - else mb->pos_array[mb->cyclic_pos] = 0; - if( !Mb_move_pos( mb ) ) return false; + else fe->eb.mb.pos_array[fe->eb.mb.cyclic_pos] = 0; + if( !Mb_move_pos( &fe->eb.mb ) ) return false; } return true; } -static inline bool FLZe_init( FLZ_encoder * const fe, +static inline bool FLZe_init( struct FLZ_encoder * const fe, const unsigned long long member_size ) { - enum { before_size = 0, + enum { before = 0, dict_size = 65536, /* bytes to keep in buffer after pos */ after_size = max_match_len, dict_factor = 16, - min_free_bytes = max_marker_size, num_prev_positions23 = 0, - pos_array_factor = 1 }; + pos_array_factor = 1, + 
min_free_bytes = max_marker_size }; - return LZeb_init( &fe->eb, before_size, dict_size, after_size, dict_factor, + return LZeb_init( &fe->eb, before, dict_size, after_size, dict_factor, num_prev_positions23, pos_array_factor, min_free_bytes, member_size ); } -static inline void FLZe_reset( FLZ_encoder * const fe, +static inline void FLZe_reset( struct FLZ_encoder * const fe, const unsigned long long member_size ) { LZeb_reset( &fe->eb, member_size ); } diff --git a/ffexample.c b/ffexample.c deleted file mode 100644 index 9f313ae..0000000 --- a/ffexample.c +++ /dev/null @@ -1,298 +0,0 @@ -/* File to file example - Test program for the library lzlib - Copyright (C) 2010-2025 Antonio Diaz Diaz. - - This program is free software: you have unlimited permission - to copy, distribute, and modify it. - - Try 'ffexample -h' for usage information. - - This program is an example of how file-to-file - compression/decompression can be implemented using lzlib. -*/ - -#define _FILE_OFFSET_BITS 64 - -#include -#include -#include -#include -#include -#include -#include -#include -#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__ -#include -#include -#endif - -#include "lzlib.h" - -#ifndef min - #define min(x,y) ((x) <= (y) ? (x) : (y)) -#endif - - -static void show_help( void ) - { - printf( "ffexample is an example program showing how file-to-file (de)compression can\n" - "be implemented using lzlib. 
The content of infile is compressed,\n" - "decompressed, or both, and then written to outfile.\n" - "\nUsage: ffexample operation [infile [outfile]]\n" ); - printf( "\nOperation:\n" - " -h display this help and exit\n" - " -c compress infile to outfile\n" - " -d decompress infile to outfile\n" - " -b both (compress then decompress) infile to outfile\n" - " -m compress (multimember) infile to outfile\n" - " -l compress (1 member per line) infile to outfile\n" - " -r decompress with resync if data error or leading garbage\n" - "\nIf infile or outfile are omitted, or are specified as '-', standard input or\n" - "standard output are used in their place respectively.\n" - "\nReport bugs to lzip-bug@nongnu.org\n" - "Lzlib home page: http://www.nongnu.org/lzip/lzlib.html\n" ); - } - - -int ffcompress( LZ_Encoder * const encoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_compress_finish( encoder ); - } - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_finished( encoder ) == 1 ) return 0; - } - return 1; - } - - -int ffdecompress( LZ_Decoder * const decoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_decompress_write_size( decoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_decompress_write( decoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_decompress_finish( decoder ); - } - ret = 
LZ_decompress_read( decoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_decompress_finished( decoder ) == 1 ) return 0; - } - return 1; - } - - -int ffboth( LZ_Encoder * const encoder, LZ_Decoder * const decoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_compress_finish( encoder ); - } - size = min( buffer_size, LZ_decompress_write_size( decoder ) ); - if( size > 0 ) - { - ret = LZ_compress_read( encoder, buffer, size ); - if( ret < 0 ) break; - ret = LZ_decompress_write( decoder, buffer, ret ); - if( ret < 0 ) break; - if( LZ_compress_finished( encoder ) == 1 ) - LZ_decompress_finish( decoder ); - } - ret = LZ_decompress_read( decoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_decompress_finished( decoder ) == 1 ) return 0; - } - return 1; - } - - -int ffmmcompress( FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384, member_size = 4096 }; - uint8_t buffer[buffer_size]; - bool done = false; - LZ_Encoder * const encoder = LZ_compress_open( 65535, 16, member_size ); - if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) - { fputs( "ffexample: Not enough memory.\n", stderr ); - LZ_compress_close( encoder ); return 1; } - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_compress_finish( encoder ); - 
} - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_member_finished( encoder ) == 1 ) - { - if( LZ_compress_finished( encoder ) == 1 ) { done = true; break; } - if( LZ_compress_restart_member( encoder, member_size ) < 0 ) break; - } - } - if( LZ_compress_close( encoder ) < 0 ) done = false; - return done; - } - - -/* Compress 'infile' to 'outfile' as a multimember stream with one member - for each line of text terminated by a newline character or by EOF. - Return 0 if success, 1 if error. -*/ -int fflfcompress( LZ_Encoder * const encoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_compress_write_size( encoder ) ); - if( size > 0 ) - { - for( len = 0; len < size; ) - { - int ch = getc( infile ); - if( ch == EOF || ( buffer[len++] = ch ) == '\n' ) break; - } - /* avoid writing an empty member to outfile */ - if( len == 0 && LZ_compress_data_position( encoder ) == 0 ) return 0; - ret = LZ_compress_write( encoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) || buffer[len-1] == '\n' ) - LZ_compress_finish( encoder ); - } - ret = LZ_compress_read( encoder, buffer, buffer_size ); - if( ret < 0 ) break; - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_compress_member_finished( encoder ) == 1 ) - { - if( feof( infile ) && LZ_compress_finished( encoder ) == 1 ) return 0; - if( LZ_compress_restart_member( encoder, INT64_MAX ) < 0 ) break; - } - } - return 1; - } - - -/* Decompress 'infile' to 'outfile' with automatic resynchronization to - next member in case of data error, including the automatic removal of - leading garbage. 
-*/ -int ffrsdecompress( LZ_Decoder * const decoder, - FILE * const infile, FILE * const outfile ) - { - enum { buffer_size = 16384 }; - uint8_t buffer[buffer_size]; - while( true ) - { - int len, ret; - int size = min( buffer_size, LZ_decompress_write_size( decoder ) ); - if( size > 0 ) - { - len = fread( buffer, 1, size, infile ); - ret = LZ_decompress_write( decoder, buffer, len ); - if( ret < 0 || ferror( infile ) ) break; - if( feof( infile ) ) LZ_decompress_finish( decoder ); - } - ret = LZ_decompress_read( decoder, buffer, buffer_size ); - if( ret < 0 ) - { - if( LZ_decompress_errno( decoder ) == LZ_header_error || - LZ_decompress_errno( decoder ) == LZ_data_error ) - { LZ_decompress_sync_to_member( decoder ); continue; } - break; - } - len = fwrite( buffer, 1, ret, outfile ); - if( len < ret ) break; - if( LZ_decompress_finished( decoder ) == 1 ) return 0; - } - return 1; - } - - -int main( const int argc, const char * const argv[] ) - { -#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__ - setmode( STDIN_FILENO, O_BINARY ); - setmode( STDOUT_FILENO, O_BINARY ); -#endif - - LZ_Encoder * const encoder = LZ_compress_open( 65535, 16, INT64_MAX ); - LZ_Decoder * const decoder = LZ_decompress_open(); - FILE * const infile = (argc >= 3 && strcmp( argv[2], "-" ) != 0) ? - fopen( argv[2], "rb" ) : stdin; - FILE * const outfile = (argc >= 4 && strcmp( argv[3], "-" ) != 0) ? 
- fopen( argv[3], "wb" ) : stdout; - int retval; - - if( argc < 2 || argc > 4 || strlen( argv[1] ) != 2 || argv[1][0] != '-' ) - { show_help(); return 1; } - if( !encoder || LZ_compress_errno( encoder ) != LZ_ok || - !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) - { fputs( "ffexample: Not enough memory.\n", stderr ); - LZ_compress_close( encoder ); LZ_decompress_close( decoder ); return 1; } - if( !infile ) - { fprintf( stderr, "ffexample: %s: Can't open input file: %s\n", - argv[2], strerror( errno ) ); return 1; } - if( !outfile ) - { fprintf( stderr, "ffexample: %s: Can't open output file: %s\n", - argv[3], strerror( errno ) ); return 1; } - - switch( argv[1][1] ) - { - case 'c': retval = ffcompress( encoder, infile, outfile ); break; - case 'd': retval = ffdecompress( decoder, infile, outfile ); break; - case 'b': retval = ffboth( encoder, decoder, infile, outfile ); break; - case 'm': retval = ffmmcompress( infile, outfile ); break; - case 'l': retval = fflfcompress( encoder, infile, outfile ); break; - case 'r': retval = ffrsdecompress( decoder, infile, outfile ); break; - default: show_help(); return argv[1][1] != 'h'; - } - - if( LZ_decompress_close( decoder ) < 0 || LZ_compress_close( encoder ) < 0 || - fclose( outfile ) != 0 || fclose( infile ) != 0 ) retval = 1; - return retval; - } diff --git a/lzcheck.c b/lzcheck.c index 86ce87d..b9ba11b 100644 --- a/lzcheck.c +++ b/lzcheck.c @@ -1,398 +1,239 @@ -/* Lzcheck - Test program for the library lzlib - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzcheck - Test program for the lzlib library + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This program is free software: you have unlimited permission - to copy, distribute, and modify it. + This program is free software: you have unlimited permission + to copy, distribute and modify it. - Usage: lzcheck [-m|-s] filename.txt... 
+ Usage is: + lzcheck filename.txt - This program reads each text file specified and then compresses it, - line by line, to test the flushing mechanism and the member - restart/reset/sync functions. + This program reads the specified text file and then compresses it, + line by line, to test the flushing mechanism and the member + restart/reset/sync functions. */ #define _FILE_OFFSET_BITS 64 -#include #include #include #include #include #include #include -#include #include "lzlib.h" +#ifndef min + #define min(x,y) ((x) <= (y) ? (x) : (y)) +#endif -const unsigned long long member_size = INT64_MAX; -enum { buffer_size = 32749 }; /* largest prime < 32768 */ + +enum { buffer_size = 32768 }; uint8_t in_buffer[buffer_size]; uint8_t mid_buffer[buffer_size]; uint8_t out_buffer[buffer_size]; -static void show_line( const uint8_t * const buffer, const int size ) - { - int i; - for( i = 0; i < size; ++i ) - fputc( isprint( buffer[i] ) ? buffer[i] : '.', stderr ); - fputc( '\n', stderr ); - } - - -static LZ_Encoder * xopen_encoder( const int dictionary_size ) +int lzcheck( FILE * const file, const int dictionary_size ) { const int match_len_limit = 16; - LZ_Encoder * const encoder = - LZ_compress_open( dictionary_size, match_len_limit, member_size ); + const unsigned long long member_size = 0x7FFFFFFFFFFFFFFFULL; /* INT64_MAX */ + struct LZ_Encoder * encoder; + struct LZ_Decoder * decoder; + int retval = 0; + + encoder = LZ_compress_open( dictionary_size, match_len_limit, member_size ); if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) { - const bool bad_arg = - encoder && ( LZ_compress_errno( encoder ) == LZ_bad_argument ); + const bool mem_error = ( LZ_compress_errno( encoder ) == LZ_mem_error ); LZ_compress_close( encoder ); - if( bad_arg ) + if( mem_error ) { - fputs( "lzcheck: internal error: Invalid argument to encoder.\n", stderr ); - exit( 3 ); + fputs( "lzcheck: Not enough memory.\n", stderr ); + return 1; } - fputs( "lzcheck: Not enough memory.\n", stderr ); - 
exit( 1 ); + fputs( "lzcheck: internal error: Invalid argument to encoder.\n", stderr ); + return 3; } - return encoder; - } - -static LZ_Decoder * xopen_decoder( void ) - { - LZ_Decoder * const decoder = LZ_decompress_open(); + decoder = LZ_decompress_open(); if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) { LZ_decompress_close( decoder ); fputs( "lzcheck: Not enough memory.\n", stderr ); - exit( 1 ); - } - return decoder; - } - - -static void xclose_encoder( LZ_Encoder * const encoder, const bool finish ) - { - if( finish ) - { - unsigned long long size = 0; - LZ_compress_finish( encoder ); - while( true ) - { - const int rd = LZ_compress_read( encoder, mid_buffer, buffer_size ); - if( rd < 0 ) - { - fprintf( stderr, "lzcheck: xclose: LZ_compress_read error: %s\n", - LZ_strerror( LZ_compress_errno( encoder ) ) ); - exit( 3 ); - } - size += rd; - if( LZ_compress_finished( encoder ) == 1 ) break; - } - if( size > 0 ) - { - fprintf( stderr, "lzcheck: %lld bytes remain in encoder.\n", size ); - exit( 3 ); - } - } - if( LZ_compress_close( encoder ) < 0 ) exit( 1 ); - } - - -static void xclose_decoder( LZ_Decoder * const decoder, const bool finish ) - { - if( finish ) - { - unsigned long long size = 0; - LZ_decompress_finish( decoder ); - while( true ) - { - const int rd = LZ_decompress_read( decoder, out_buffer, buffer_size ); - if( rd < 0 ) - { - fprintf( stderr, "lzcheck: xclose: LZ_decompress_read error: %s\n", - LZ_strerror( LZ_decompress_errno( decoder ) ) ); - exit( 3 ); - } - size += rd; - if( LZ_decompress_finished( decoder ) == 1 ) break; - } - if( size > 0 ) - { - fprintf( stderr, "lzcheck: %lld bytes remain in decoder.\n", size ); - exit( 3 ); - } - } - if( LZ_decompress_close( decoder ) < 0 ) exit( 1 ); - } - - -/* Return the next (usually newline-terminated) chunk of data from file. - The size returned in *sizep is always <= buffer_size. - If sizep is a null pointer, rewind the file, reset state, and return. 
- If file is at EOF, return an empty line. -*/ -static const uint8_t * next_line( FILE * const file, int * const sizep ) - { - static int l = 0; - static int read_size = 0; - int r; - - if( !sizep ) { rewind( file ); l = read_size = 0; return in_buffer; } - if( l >= read_size ) - { - l = 0; read_size = fread( in_buffer, 1, buffer_size, file ); - if( l >= read_size ) { *sizep = 0; return in_buffer; } /* end of file */ + return 1; } - for( r = l + 1; r < read_size && in_buffer[r-1] != '\n'; ++r ); - *sizep = r - l; l = r; - return in_buffer + l - *sizep; - } - - -static int check_sync_flush( FILE * const file, const int dictionary_size ) - { - LZ_Encoder * const encoder = xopen_encoder( dictionary_size ); - LZ_Decoder * const decoder = xopen_decoder(); - int retval = 0; - - while( retval <= 1 ) /* test LZ_compress_sync_flush */ + while( retval <= 1 ) { - int in_size, mid_size, out_size; - int line_size; - const uint8_t * const line_buf = next_line( file, &line_size ); - if( line_size <= 0 ) break; /* end of file */ + int i, l, r; + const int read_size = fread( in_buffer, 1, buffer_size, file ); + if( read_size <= 0 ) break; /* end of file */ - in_size = LZ_compress_write( encoder, line_buf, line_size ); - if( in_size < 0 ) + for( l = 0, r = 1; r <= read_size; l = r, ++r ) { - fprintf( stderr, "lzcheck: LZ_compress_write error: %s\n", - LZ_strerror( LZ_compress_errno( encoder ) ) ); - retval = 3; break; - } - if( in_size < line_size ) - { - fprintf( stderr, "lzcheck: sync: LZ_compress_write only accepted %d " - "of %d bytes\n", in_size, line_size ); + int in_size, mid_size, out_size; + while( r < read_size && in_buffer[r-1] != '\n' ) ++r; + in_size = LZ_compress_write( encoder, in_buffer + l, r - l ); + if( in_size < r - l ) r = l + in_size; + LZ_compress_sync_flush( encoder ); mid_size = LZ_compress_read( encoder, mid_buffer, buffer_size ); - const int wr = - LZ_compress_write( encoder, line_buf + in_size, line_size - in_size ); - if( wr < 0 ) + if( mid_size < 0 ) { 
- fprintf( stderr, "lzcheck: LZ_compress_write error: %s\n", + fprintf( stderr, "lzcheck: LZ_compress_read error: %s\n", LZ_strerror( LZ_compress_errno( encoder ) ) ); retval = 3; break; } - if( wr + in_size != line_size ) + LZ_decompress_write( decoder, mid_buffer, mid_size ); + out_size = LZ_decompress_read( decoder, out_buffer, buffer_size ); + if( out_size < 0 ) { - fprintf( stderr, "lzcheck: sync: LZ_compress_write only accepted %d " - "of %d remaining bytes\n", wr, line_size - in_size ); + fprintf( stderr, "lzcheck: LZ_decompress_read error: %s\n", + LZ_strerror( LZ_decompress_errno( decoder ) ) ); retval = 3; break; } - in_size += wr; - LZ_compress_sync_flush( encoder ); - const int rd = LZ_compress_read( encoder, mid_buffer + mid_size, - buffer_size - mid_size ); - if( rd > 0 ) mid_size += rd; - else if( rd < 0 ) mid_size = -1; - } - else - { - LZ_compress_sync_flush( encoder ); - if( line_buf[0] & 1 ) /* read all data at once or byte by byte */ - mid_size = LZ_compress_read( encoder, mid_buffer, buffer_size ); - else for( mid_size = 0; mid_size < buffer_size; ) - { - const int rd = LZ_compress_read( encoder, mid_buffer + mid_size, 1 ); - if( rd > 0 ) mid_size += rd; - else { if( rd < 0 ) { mid_size = -1; } break; } - } - } - if( mid_size < 0 ) - { - fprintf( stderr, "lzcheck: LZ_compress_read error: %s\n", - LZ_strerror( LZ_compress_errno( encoder ) ) ); - retval = 3; break; - } - LZ_decompress_write( decoder, mid_buffer, mid_size ); - out_size = LZ_decompress_read( decoder, out_buffer, buffer_size ); - if( out_size < 0 ) - { - fprintf( stderr, "lzcheck: LZ_decompress_read error: %s\n", - LZ_strerror( LZ_decompress_errno( decoder ) ) ); - retval = 3; break; - } - if( out_size != in_size || memcmp( line_buf, out_buffer, out_size ) ) - { - fprintf( stderr, "lzcheck: LZ_compress_sync_flush error: " - "in_size = %d, out_size = %d\n", in_size, out_size ); - show_line( line_buf, in_size ); - show_line( out_buffer, out_size ); - retval = 1; + if( out_size != 
in_size || memcmp( in_buffer + l, out_buffer, out_size ) ) + { + fprintf( stderr, "lzcheck: Sync error at pos %d in_size = %d, out_size = %d\n", + l, in_size, out_size ); + for( i = 0; i < in_size; ++i ) + fputc( in_buffer[l+i], stderr ); + if( in_buffer[l+in_size-1] != '\n' ) + fputc( '\n', stderr ); + for( i = 0; i < out_size; ++i ) + fputc( out_buffer[i], stderr ); + fputc( '\n', stderr ); + retval = 1; + } } } if( retval <= 1 ) { - int rd = 0; + rewind( file ); if( LZ_compress_finish( encoder ) < 0 || - ( rd = LZ_compress_read( encoder, mid_buffer, buffer_size ) ) < 0 ) + LZ_decompress_write( decoder, mid_buffer, LZ_compress_read( encoder, mid_buffer, buffer_size ) ) < 0 || + LZ_decompress_read( decoder, out_buffer, buffer_size ) != 0 || + LZ_compress_restart_member( encoder, member_size ) < 0 ) { - fprintf( stderr, "lzcheck: Can't drain encoder: %s\n", - LZ_strerror( LZ_compress_errno( encoder ) ) ); + fprintf( stderr, "lzcheck: Can't finish member: %s\n", + LZ_strerror( LZ_decompress_errno( decoder ) ) ); retval = 3; } - LZ_decompress_write( decoder, mid_buffer, rd ); } - xclose_decoder( decoder, retval == 0 ); - xclose_encoder( encoder, retval == 0 ); - return retval; - } - - -/* Test member by member decompression without calling LZ_decompress_finish, - inserting leading garbage before some members, and resetting the - decompressor sometimes. Test that the increase in total_in_size when - syncing to member is equal to the size of the leading garbage skipped. 
-*/ -static int check_members( FILE * const file, const int dictionary_size ) - { - LZ_Encoder * const encoder = xopen_encoder( dictionary_size ); - LZ_Decoder * const decoder = xopen_decoder(); - int retval = 0; - - while( retval <= 1 ) /* test LZ_compress_restart_member */ + while( retval <= 1 ) { - unsigned long long garbage_begin = 0; /* avoid warning from gcc 3.3.6 */ - int leading_garbage, in_size, mid_size, out_size; - int line_size; - const uint8_t * const line_buf = next_line( file, &line_size ); - if( line_size <= 0 && /* end of file, write at least 1 member */ - LZ_decompress_total_in_size( decoder ) != 0 ) break; + int i, l, r, size; + const int read_size = fread( in_buffer, 1, buffer_size / 2, file ); + if( read_size <= 0 ) break; /* end of file */ - if( LZ_compress_finished( encoder ) == 1 ) + for( l = 0, r = 1; r <= read_size; l = r, ++r ) { - if( LZ_compress_restart_member( encoder, member_size ) < 0 ) + int leading_garbage, in_size, mid_size, out_size; + while( r < read_size && in_buffer[r-1] != '\n' ) ++r; + leading_garbage = (l == 0) ? 
min( r, read_size / 2 ) : 0; + in_size = LZ_compress_write( encoder, in_buffer + l, r - l ); + if( in_size < r - l ) r = l + in_size; + LZ_compress_sync_flush( encoder ); + if( leading_garbage ) + memset( mid_buffer, in_buffer[0], leading_garbage ); + mid_size = LZ_compress_read( encoder, mid_buffer + leading_garbage, + buffer_size - leading_garbage ); + if( mid_size < 0 ) { - fprintf( stderr, "lzcheck: Can't restart member: %s\n", + fprintf( stderr, "lzcheck: LZ_compress_read error: %s\n", LZ_strerror( LZ_compress_errno( encoder ) ) ); retval = 3; break; } - if( line_size >= 2 && line_buf[1] == 'h' ) - LZ_decompress_reset( decoder ); - } - in_size = LZ_compress_write( encoder, line_buf, line_size ); - if( in_size < line_size ) - fprintf( stderr, "lzcheck: member: LZ_compress_write only accepted %d of %d bytes\n", - in_size, line_size ); - LZ_compress_finish( encoder ); - if( line_size * 3 < buffer_size && line_buf[0] == 't' ) - { leading_garbage = line_size; - memset( mid_buffer, in_buffer[0], leading_garbage ); - garbage_begin = LZ_decompress_total_in_size( decoder ); } - else leading_garbage = 0; - mid_size = LZ_compress_read( encoder, mid_buffer + leading_garbage, - buffer_size - leading_garbage ); - if( mid_size < 0 ) - { - fprintf( stderr, "lzcheck: member: LZ_compress_read error: %s\n", - LZ_strerror( LZ_compress_errno( encoder ) ) ); - retval = 3; break; - } - LZ_decompress_write( decoder, mid_buffer, leading_garbage + mid_size ); - out_size = LZ_decompress_read( decoder, out_buffer, buffer_size ); - if( out_size < 0 ) - { - if( leading_garbage && - ( LZ_decompress_errno( decoder ) == LZ_header_error || - LZ_decompress_errno( decoder ) == LZ_data_error ) ) - { - LZ_decompress_sync_to_member( decoder ); /* skip leading garbage */ - const unsigned long long garbage_end = - LZ_decompress_total_in_size( decoder ); - if( garbage_end - garbage_begin != (unsigned)leading_garbage ) - { - fprintf( stderr, "lzcheck: member: LZ_decompress_sync_to_member error:\n" - " 
garbage_begin = %llu garbage_end = %llu " - "difference = %llu expected = %d\n", garbage_begin, - garbage_end, garbage_end - garbage_begin, leading_garbage ); - retval = 3; break; - } - out_size = LZ_decompress_read( decoder, out_buffer, buffer_size ); - } + LZ_decompress_write( decoder, mid_buffer, mid_size + leading_garbage ); + out_size = LZ_decompress_read( decoder, out_buffer, buffer_size ); if( out_size < 0 ) { - fprintf( stderr, "lzcheck: member: LZ_decompress_read error: %s\n", - LZ_strerror( LZ_decompress_errno( decoder ) ) ); - retval = 3; break; + if( LZ_decompress_errno( decoder ) == LZ_header_error || + LZ_decompress_errno( decoder ) == LZ_data_error ) + { + LZ_decompress_sync_to_member( decoder ); /* remove leading garbage */ + out_size = LZ_decompress_read( decoder, out_buffer, buffer_size ); + } + if( out_size < 0 ) + { + fprintf( stderr, "lzcheck: LZ_decompress_read error: %s\n", + LZ_strerror( LZ_decompress_errno( decoder ) ) ); + retval = 3; break; + } + } + + if( out_size != in_size || memcmp( in_buffer + l, out_buffer, out_size ) ) + { + fprintf( stderr, "lzcheck: Sync error at pos %d in_size = %d, out_size = %d, leading garbage = %d\n", + l, in_size, out_size, leading_garbage ); + for( i = 0; i < in_size; ++i ) + fputc( in_buffer[l+i], stderr ); + if( in_buffer[l+in_size-1] != '\n' ) + fputc( '\n', stderr ); + for( i = 0; i < out_size; ++i ) + fputc( out_buffer[i], stderr ); + fputc( '\n', stderr ); + retval = 1; } } + if( retval >= 3 ) break; - if( out_size != in_size || memcmp( line_buf, out_buffer, out_size ) ) + if( LZ_compress_finish( encoder ) < 0 || + LZ_decompress_write( decoder, mid_buffer, LZ_compress_read( encoder, mid_buffer, buffer_size ) ) < 0 || + LZ_decompress_read( decoder, out_buffer, buffer_size ) != 0 || + LZ_decompress_reset( decoder ) < 0 || + LZ_compress_restart_member( encoder, member_size ) < 0 ) { - fprintf( stderr, "lzcheck: LZ_compress_restart_member error: " - "in_size = %d, out_size = %d\n", in_size, out_size ); - 
show_line( line_buf, in_size ); - show_line( out_buffer, out_size ); - retval = 1; + fprintf( stderr, "lzcheck: Can't restart member: %s\n", + LZ_strerror( LZ_decompress_errno( decoder ) ) ); + retval = 3; break; + } + + size = min( 100, read_size ); + if( LZ_compress_write( encoder, in_buffer, size ) != size || + LZ_compress_finish( encoder ) < 0 || + LZ_decompress_write( decoder, mid_buffer, LZ_compress_read( encoder, mid_buffer, buffer_size ) ) < 0 || + LZ_decompress_read( decoder, out_buffer, 0 ) != 0 || + LZ_decompress_sync_to_member( decoder ) < 0 || + LZ_compress_restart_member( encoder, member_size ) < 0 ) + { + fprintf( stderr, "lzcheck: Can't seek to next member: %s\n", + LZ_strerror( LZ_decompress_errno( decoder ) ) ); + retval = 3; break; } } - xclose_decoder( decoder, retval == 0 ); - xclose_encoder( encoder, retval == 0 ); + LZ_decompress_close( decoder ); + LZ_compress_close( encoder ); return retval; } int main( const int argc, const char * const argv[] ) { - int retval = 0, i; - int open_failures = 0; - const char opt = ( argc > 2 && - ( strcmp( argv[1], "-m" ) == 0 || strcmp( argv[1], "-s" ) == 0 ) ) ? - argv[1][1] : 0; - const int first = opt ? 
2 : 1; - const bool verbose = opt != 0 || argc > first + 1; + FILE * file; + int retval; if( argc < 2 ) { - fputs( "Usage: lzcheck [-m|-s] filename.txt...\n", stderr ); + fputs( "Usage: lzcheck filename.txt\n", stderr ); return 1; } - for( i = first; i < argc && retval == 0; ++i ) + file = fopen( argv[1], "rb" ); + if( !file ) { - struct stat st; - if( stat( argv[i], &st ) != 0 || !S_ISREG( st.st_mode ) ) continue; - FILE * file = fopen( argv[i], "rb" ); - if( !file ) - { - fprintf( stderr, "lzcheck: %s: Can't open file for reading.\n", argv[i] ); - ++open_failures; continue; - } - if( verbose ) fprintf( stderr, " Testing file '%s'\n", argv[i] ); - - /* 65535,16 chooses fast encoder */ - if( opt != 'm' ) retval = check_sync_flush( file, 65535 ); - if( retval == 0 && opt != 'm' ) - { next_line( file, 0 ); retval = check_sync_flush( file, 1 << 20 ); } - if( retval == 0 && opt != 's' ) - { next_line( file, 0 ); retval = check_members( file, 65535 ); } - if( retval == 0 && opt != 's' ) - { next_line( file, 0 ); retval = check_members( file, 1 << 20 ); } - fclose( file ); + fprintf( stderr, "lzcheck: Can't open file '%s' for reading.\n", argv[1] ); + return 1; } - if( open_failures > 0 && verbose ) - fprintf( stderr, "lzcheck: warning: %d %s failed to open.\n", - open_failures, ( open_failures == 1 ) ? "file" : "files" ); - if( retval == 0 && open_failures ) retval = 1; +/* fprintf( stderr, "lzcheck: Testing file '%s'\n", argv[1] ); */ + + retval = lzcheck( file, 65535 ); /* 65535,16 chooses fast encoder */ + if( retval == 0 ) + { rewind( file ); retval = lzcheck( file, 1 << 20 ); } + fclose( file ); return retval; } diff --git a/lzip.h b/lzip.h index 0f2c1ed..73fc7f1 100644 --- a/lzip.h +++ b/lzip.h @@ -1,20 +1,28 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. 
Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see <http://www.gnu.org/licenses/>. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. 
*/ #ifndef max @@ -35,13 +43,15 @@ static inline State St_set_char( const State st ) static const State next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 }; return next[st]; } -static inline State St_set_char_rep() { return 8; } + static inline State St_set_match( const State st ) - { return ( st < 7 ) ? 7 : 10; } + { return ( ( st < 7 ) ? 7 : 10 ); } + static inline State St_set_rep( const State st ) - { return ( st < 7 ) ? 8 : 11; } -static inline State St_set_shortrep( const State st ) - { return ( st < 7 ) ? 9 : 11; } + { return ( ( st < 7 ) ? 8 : 11 ); } + +static inline State St_set_short_rep( const State st ) + { return ( ( st < 7 ) ? 9 : 11 ); } enum { @@ -79,7 +89,7 @@ static inline int get_len_state( const int len ) { return min( len - min_match_len, len_states - 1 ); } static inline int get_lit_state( const uint8_t prev_byte ) - { return prev_byte >> ( 8 - literal_context_bits ); } + { return ( prev_byte >> ( 8 - literal_context_bits ) ); } enum { bit_model_move_bits = 5, @@ -94,16 +104,16 @@ static inline void Bm_init( Bit_model * const probability ) static inline void Bm_array_init( Bit_model bm[], const int size ) { int i; for( i = 0; i < size; ++i ) Bm_init( &bm[i] ); } -typedef struct Len_model +struct Len_model { Bit_model choice1; Bit_model choice2; Bit_model bm_low[pos_states][len_low_symbols]; Bit_model bm_mid[pos_states][len_mid_symbols]; Bit_model bm_high[len_high_symbols]; - } Len_model; + }; -static inline void Lm_init( Len_model * const lm ) +static inline void Lm_init( struct Len_model * const lm ) { Bm_init( &lm->choice1 ); Bm_init( &lm->choice2 ); @@ -164,22 +174,19 @@ static const uint32_t crc32[256] = static inline void CRC32_update_byte( uint32_t * const crc, const uint8_t byte ) { *crc = crc32[(*crc^byte)&0xFF] ^ ( *crc >> 8 ); } -/* about as fast as it is possible without messing with endianness */ static inline void CRC32_update_buf( uint32_t * const crc, const uint8_t * const buffer, const int size ) { int i; - uint32_t c = 
*crc; for( i = 0; i < size; ++i ) - c = crc32[(c^buffer[i])&0xFF] ^ ( c >> 8 ); - *crc = c; + *crc = crc32[(*crc^buffer[i])&0xFF] ^ ( *crc >> 8 ); } static inline bool isvalid_ds( const unsigned dictionary_size ) - { return dictionary_size >= min_dictionary_size && - dictionary_size <= max_dictionary_size; } + { return ( dictionary_size >= min_dictionary_size && + dictionary_size <= max_dictionary_size ); } static inline int real_bits( unsigned value ) @@ -190,51 +197,42 @@ static inline int real_bits( unsigned value ) } -static const uint8_t lzip_magic[4] = { 0x4C, 0x5A, 0x49, 0x50 }; /* "LZIP" */ +static const uint8_t magic_string[4] = { 0x4C, 0x5A, 0x49, 0x50 }; /* "LZIP" */ -enum { Lh_size = 6 }; -typedef uint8_t Lzip_header[Lh_size]; /* 0-3 magic bytes */ +typedef uint8_t File_header[6]; /* 0-3 magic bytes */ /* 4 version */ - /* 5 coded dictionary size */ + /* 5 coded_dict_size */ +enum { Fh_size = 6 }; -static inline void Lh_set_magic( Lzip_header data ) - { memcpy( data, lzip_magic, 4 ); data[4] = 1; } +static inline void Fh_set_magic( File_header data ) + { memcpy( data, magic_string, 4 ); data[4] = 1; } -static inline bool Lh_check_magic( const Lzip_header data ) - { return memcmp( data, lzip_magic, 4 ) == 0; } +static inline bool Fh_verify_magic( const File_header data ) + { return ( memcmp( data, magic_string, 4 ) == 0 ); } -/* detect (truncated) header */ -static inline bool Lh_check_prefix( const Lzip_header data, const int sz ) +/* detect truncated header */ +static inline bool Fh_verify_prefix( const File_header data, const int size ) { - int i; for( i = 0; i < sz && i < 4; ++i ) - if( data[i] != lzip_magic[i] ) return false; - return sz > 0; + int i; for( i = 0; i < size && i < 4; ++i ) + if( data[i] != magic_string[i] ) return false; + return ( size > 0 ); } -/* detect corrupt header */ -static inline bool Lh_check_corrupt( const Lzip_header data ) - { - int matches = 0; - int i; for( i = 0; i < 4; ++i ) - if( data[i] == lzip_magic[i] ) ++matches; 
- return matches > 1 && matches < 4; - } - -static inline uint8_t Lh_version( const Lzip_header data ) +static inline uint8_t Fh_version( const File_header data ) { return data[4]; } -static inline bool Lh_check_version( const Lzip_header data ) - { return data[4] == 1; } +static inline bool Fh_verify_version( const File_header data ) + { return ( data[4] == 1 ); } -static inline unsigned Lh_get_dictionary_size( const Lzip_header data ) +static inline unsigned Fh_get_dictionary_size( const File_header data ) { - unsigned sz = 1 << ( data[5] & 0x1F ); + unsigned sz = ( 1 << ( data[5] & 0x1F ) ); if( sz > min_dictionary_size ) sz -= ( sz / 16 ) * ( ( data[5] >> 5 ) & 7 ); return sz; } -static inline bool Lh_set_dictionary_size( Lzip_header data, const unsigned sz ) +static inline bool Fh_set_dictionary_size( File_header data, const unsigned sz ) { if( !isvalid_ds( sz ) ) return false; data[5] = real_bits( sz - 1 ); @@ -242,53 +240,55 @@ static inline bool Lh_set_dictionary_size( Lzip_header data, const unsigned sz ) { const unsigned base_size = 1 << data[5]; const unsigned fraction = base_size / 16; - unsigned i; + int i; for( i = 7; i >= 1; --i ) if( base_size - ( i * fraction ) >= sz ) - { data[5] |= i << 5; break; } + { data[5] |= ( i << 5 ); break; } } return true; } -static inline bool Lh_check( const Lzip_header data ) +static inline bool Fh_verify( const File_header data ) { - return Lh_check_magic( data ) && Lh_check_version( data ) && - isvalid_ds( Lh_get_dictionary_size( data ) ); + if( Fh_verify_magic( data ) && Fh_verify_version( data ) ) + return isvalid_ds( Fh_get_dictionary_size( data ) ); + return false; } -enum { Lt_size = 20 }; -typedef uint8_t Lzip_trailer[Lt_size]; +typedef uint8_t File_trailer[20]; /* 0-3 CRC32 of the uncompressed data */ /* 4-11 size of the uncompressed data */ /* 12-19 member size including header and trailer */ -static inline unsigned Lt_get_data_crc( const Lzip_trailer data ) +enum { Ft_size = 20 }; + +static inline unsigned 
Ft_get_data_crc( const File_trailer data ) { unsigned tmp = 0; int i; for( i = 3; i >= 0; --i ) { tmp <<= 8; tmp += data[i]; } return tmp; } -static inline void Lt_set_data_crc( Lzip_trailer data, unsigned crc ) +static inline void Ft_set_data_crc( File_trailer data, unsigned crc ) { int i; for( i = 0; i <= 3; ++i ) { data[i] = (uint8_t)crc; crc >>= 8; } } -static inline unsigned long long Lt_get_data_size( const Lzip_trailer data ) +static inline unsigned long long Ft_get_data_size( const File_trailer data ) { unsigned long long tmp = 0; int i; for( i = 11; i >= 4; --i ) { tmp <<= 8; tmp += data[i]; } return tmp; } -static inline void Lt_set_data_size( Lzip_trailer data, unsigned long long sz ) +static inline void Ft_set_data_size( File_trailer data, unsigned long long sz ) { int i; for( i = 4; i <= 11; ++i ) { data[i] = (uint8_t)sz; sz >>= 8; } } -static inline unsigned long long Lt_get_member_size( const Lzip_trailer data ) +static inline unsigned long long Ft_get_member_size( const File_trailer data ) { unsigned long long tmp = 0; int i; for( i = 19; i >= 12; --i ) { tmp <<= 8; tmp += data[i]; } return tmp; } -static inline void Lt_set_member_size( Lzip_trailer data, unsigned long long sz ) +static inline void Ft_set_member_size( File_trailer data, unsigned long long sz ) { int i; for( i = 12; i <= 19; ++i ) { data[i] = (uint8_t)sz; sz >>= 8; } } diff --git a/lzlib.c b/lzlib.c index 3dd2566..953f8e3 100644 --- a/lzlib.c +++ b/lzlib.c @@ -1,20 +1,28 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. 
Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see <http://www.gnu.org/licenses/>. - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. 
*/ #include @@ -39,14 +47,14 @@ struct LZ_Encoder { unsigned long long partial_in_size; unsigned long long partial_out_size; - LZ_encoder_base * lz_encoder_base; /* these 3 pointers make a */ - LZ_encoder * lz_encoder; /* polymorphic encoder */ - FLZ_encoder * flz_encoder; - LZ_Errno lz_errno; + struct LZ_encoder_base * lz_encoder_base; /* these 3 pointers make a */ + struct LZ_encoder * lz_encoder; /* polymorphic encoder */ + struct FLZ_encoder * flz_encoder; + enum LZ_Errno lz_errno; bool fatal; }; -static void LZ_Encoder_init( LZ_Encoder * const e ) +static void LZ_Encoder_init( struct LZ_Encoder * const e ) { e->partial_in_size = 0; e->partial_out_size = 0; @@ -62,16 +70,16 @@ struct LZ_Decoder { unsigned long long partial_in_size; unsigned long long partial_out_size; - Range_decoder * rdec; - LZ_decoder * lz_decoder; - LZ_Errno lz_errno; - Lzip_header member_header; /* header of current member */ + struct Range_decoder * rdec; + struct LZ_decoder * lz_decoder; + enum LZ_Errno lz_errno; + File_header member_header; /* header of current member */ bool fatal; bool first_header; /* true until first header is read */ bool seeking; }; -static void LZ_Decoder_init( LZ_Decoder * const d ) +static void LZ_Decoder_init( struct LZ_Decoder * const d ) { int i; d->partial_in_size = 0; @@ -79,14 +87,14 @@ static void LZ_Decoder_init( LZ_Decoder * const d ) d->rdec = 0; d->lz_decoder = 0; d->lz_errno = LZ_ok; - for( i = 0; i < Lh_size; ++i ) d->member_header[i] = 0; + for( i = 0; i < Fh_size; ++i ) d->member_header[i] = 0; d->fatal = false; d->first_header = true; d->seeking = false; } -static bool check_encoder( LZ_Encoder * const e ) +static bool verify_encoder( struct LZ_Encoder * const e ) { if( !e ) return false; if( !e->lz_encoder_base || ( !e->lz_encoder && !e->flz_encoder ) || @@ -96,7 +104,7 @@ static bool check_encoder( LZ_Encoder * const e ) } -static bool check_decoder( LZ_Decoder * const d ) +static bool verify_decoder( struct LZ_Decoder * const d ) { if( !d ) 
return false; if( !d->rdec ) @@ -105,13 +113,12 @@ static bool check_decoder( LZ_Decoder * const d ) } -/* ------------------------- Misc Functions ------------------------- */ - -int LZ_api_version( void ) { return LZ_API_VERSION; } +/*------------------------- Misc Functions -------------------------*/ const char * LZ_version( void ) { return LZ_version_string; } -const char * LZ_strerror( const LZ_Errno lz_errno ) + +const char * LZ_strerror( const enum LZ_Errno lz_errno ) { switch( lz_errno ) { @@ -120,7 +127,7 @@ const char * LZ_strerror( const LZ_Errno lz_errno ) case LZ_mem_error : return "Not enough memory"; case LZ_sequence_error: return "Sequence error"; case LZ_header_error : return "Header error"; - case LZ_unexpected_eof: return "Unexpected EOF"; + case LZ_unexpected_eof: return "Unexpected eof"; case LZ_data_error : return "Data error"; case LZ_library_error : return "Library error"; } @@ -136,17 +143,18 @@ int LZ_min_match_len_limit( void ) { return min_match_len_limit; } int LZ_max_match_len_limit( void ) { return max_match_len; } -/* --------------------- Compression Functions --------------------- */ +/*---------------------- Compression Functions ----------------------*/ -LZ_Encoder * LZ_compress_open( const int dictionary_size, - const int match_len_limit, - const unsigned long long member_size ) +struct LZ_Encoder * LZ_compress_open( const int dictionary_size, + const int match_len_limit, + const unsigned long long member_size ) { - Lzip_header header; - LZ_Encoder * const e = (LZ_Encoder *)malloc( sizeof (LZ_Encoder) ); + File_header header; + struct LZ_Encoder * const e = + (struct LZ_Encoder *)malloc( sizeof (struct LZ_Encoder) ); if( !e ) return 0; LZ_Encoder_init( e ); - if( !Lh_set_dictionary_size( header, dictionary_size ) || + if( !Fh_set_dictionary_size( header, dictionary_size ) || match_len_limit < min_match_len_limit || match_len_limit > max_match_len || member_size < min_dictionary_size ) @@ -155,15 +163,15 @@ LZ_Encoder * 
LZ_compress_open( const int dictionary_size, { if( dictionary_size == 65535 && match_len_limit == 16 ) { - e->flz_encoder = (FLZ_encoder *)malloc( sizeof (FLZ_encoder) ); + e->flz_encoder = (struct FLZ_encoder *)malloc( sizeof (struct FLZ_encoder) ); if( e->flz_encoder && FLZe_init( e->flz_encoder, member_size ) ) { e->lz_encoder_base = &e->flz_encoder->eb; return e; } free( e->flz_encoder ); e->flz_encoder = 0; } else { - e->lz_encoder = (LZ_encoder *)malloc( sizeof (LZ_encoder) ); - if( e->lz_encoder && LZe_init( e->lz_encoder, Lh_get_dictionary_size( header ), + e->lz_encoder = (struct LZ_encoder *)malloc( sizeof (struct LZ_encoder) ); + if( e->lz_encoder && LZe_init( e->lz_encoder, Fh_get_dictionary_size( header ), match_len_limit, member_size ) ) { e->lz_encoder_base = &e->lz_encoder->eb; return e; } free( e->lz_encoder ); e->lz_encoder = 0; @@ -175,7 +183,7 @@ LZ_Encoder * LZ_compress_open( const int dictionary_size, } -int LZ_compress_close( LZ_Encoder * const e ) +int LZ_compress_close( struct LZ_Encoder * const e ) { if( !e ) return -1; if( e->lz_encoder_base ) @@ -186,17 +194,17 @@ int LZ_compress_close( LZ_Encoder * const e ) } -int LZ_compress_finish( LZ_Encoder * const e ) +int LZ_compress_finish( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) || e->fatal ) return -1; + if( !verify_encoder( e ) || e->fatal ) return -1; Mb_finish( &e->lz_encoder_base->mb ); /* if (open --> write --> finish) use same dictionary size as lzip. */ /* this does not save any memory. 
*/ if( Mb_data_position( &e->lz_encoder_base->mb ) == 0 && - Re_member_position( &e->lz_encoder_base->renc ) == Lh_size ) + LZ_compress_total_out_size( e ) == Fh_size ) { Mb_adjust_dictionary_size( &e->lz_encoder_base->mb ); - Lh_set_dictionary_size( e->lz_encoder_base->renc.header, + Fh_set_dictionary_size( e->lz_encoder_base->renc.header, e->lz_encoder_base->mb.dictionary_size ); e->lz_encoder_base->renc.cb.buffer[5] = e->lz_encoder_base->renc.header[5]; } @@ -204,10 +212,10 @@ int LZ_compress_finish( LZ_Encoder * const e ) } -int LZ_compress_restart_member( LZ_Encoder * const e, +int LZ_compress_restart_member( struct LZ_Encoder * const e, const unsigned long long member_size ) { - if( !check_encoder( e ) || e->fatal ) return -1; + if( !verify_encoder( e ) || e->fatal ) return -1; if( !LZeb_member_finished( e->lz_encoder_base ) ) { e->lz_errno = LZ_sequence_error; return -1; } if( member_size < min_dictionary_size ) @@ -223,111 +231,114 @@ int LZ_compress_restart_member( LZ_Encoder * const e, } -int LZ_compress_sync_flush( LZ_Encoder * const e ) +int LZ_compress_sync_flush( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) || e->fatal ) return -1; - if( !e->lz_encoder_base->mb.at_stream_end ) - e->lz_encoder_base->mb.sync_flush_pending = true; + if( !verify_encoder( e ) || e->fatal ) return -1; + if( !Mb_flushing_or_end( &e->lz_encoder_base->mb ) ) + e->lz_encoder_base->mb.flushing = true; return 0; } -int LZ_compress_read( LZ_Encoder * const e, +int LZ_compress_read( struct LZ_Encoder * const e, uint8_t * const buffer, const int size ) { - if( !check_encoder( e ) || e->fatal ) return -1; + int out_size = 0; + if( !verify_encoder( e ) || e->fatal ) return -1; if( size < 0 ) return 0; - - { LZ_encoder_base * const eb = e->lz_encoder_base; - int out_size = Re_read_data( &eb->renc, buffer, size ); - /* minimize number of calls to encode_member */ - if( out_size < size || size == 0 ) - { + do { if( ( e->flz_encoder && !FLZe_encode_member( e->flz_encoder ) ) 
|| ( e->lz_encoder && !LZe_encode_member( e->lz_encoder ) ) ) { e->lz_errno = LZ_library_error; e->fatal = true; return -1; } - if( eb->mb.sync_flush_pending && Mb_available_bytes( &eb->mb ) <= 0 ) - LZeb_try_sync_flush( eb ); - out_size += Re_read_data( &eb->renc, buffer + out_size, size - out_size ); + if( e->lz_encoder_base->mb.flushing && + Mb_available_bytes( &e->lz_encoder_base->mb ) <= 0 && + LZeb_sync_flush( e->lz_encoder_base ) ) + e->lz_encoder_base->mb.flushing = false; + out_size += Re_read_data( &e->lz_encoder_base->renc, + buffer + out_size, size - out_size ); } - return out_size; } + while( e->lz_encoder_base->mb.flushing && out_size < size && + Mb_enough_available_bytes( &e->lz_encoder_base->mb ) && + Re_enough_free_bytes( &e->lz_encoder_base->renc ) ); + return out_size; } -int LZ_compress_write( LZ_Encoder * const e, +int LZ_compress_write( struct LZ_Encoder * const e, const uint8_t * const buffer, const int size ) { - if( !check_encoder( e ) || e->fatal ) return -1; + if( !verify_encoder( e ) || e->fatal ) return -1; return Mb_write_data( &e->lz_encoder_base->mb, buffer, size ); } -int LZ_compress_write_size( LZ_Encoder * const e ) +int LZ_compress_write_size( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) || e->fatal ) return -1; + if( !verify_encoder( e ) || e->fatal ) return -1; return Mb_free_bytes( &e->lz_encoder_base->mb ); } -LZ_Errno LZ_compress_errno( LZ_Encoder * const e ) +enum LZ_Errno LZ_compress_errno( struct LZ_Encoder * const e ) { if( !e ) return LZ_bad_argument; return e->lz_errno; } -int LZ_compress_finished( LZ_Encoder * const e ) +int LZ_compress_finished( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) ) return -1; - return Mb_data_finished( &e->lz_encoder_base->mb ) && - LZeb_member_finished( e->lz_encoder_base ); + if( !verify_encoder( e ) ) return -1; + return ( Mb_data_finished( &e->lz_encoder_base->mb ) && + LZeb_member_finished( e->lz_encoder_base ) ); } -int LZ_compress_member_finished( LZ_Encoder 
* const e ) +int LZ_compress_member_finished( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) ) return -1; + if( !verify_encoder( e ) ) return -1; return LZeb_member_finished( e->lz_encoder_base ); } -unsigned long long LZ_compress_data_position( LZ_Encoder * const e ) +unsigned long long LZ_compress_data_position( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) ) return 0; + if( !verify_encoder( e ) ) return 0; return Mb_data_position( &e->lz_encoder_base->mb ); } -unsigned long long LZ_compress_member_position( LZ_Encoder * const e ) +unsigned long long LZ_compress_member_position( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) ) return 0; + if( !verify_encoder( e ) ) return 0; return Re_member_position( &e->lz_encoder_base->renc ); } -unsigned long long LZ_compress_total_in_size( LZ_Encoder * const e ) +unsigned long long LZ_compress_total_in_size( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) ) return 0; + if( !verify_encoder( e ) ) return 0; return e->partial_in_size + Mb_data_position( &e->lz_encoder_base->mb ); } -unsigned long long LZ_compress_total_out_size( LZ_Encoder * const e ) +unsigned long long LZ_compress_total_out_size( struct LZ_Encoder * const e ) { - if( !check_encoder( e ) ) return 0; + if( !verify_encoder( e ) ) return 0; return e->partial_out_size + Re_member_position( &e->lz_encoder_base->renc ); } -/* -------------------- Decompression Functions -------------------- */ +/*--------------------- Decompression Functions ---------------------*/ -LZ_Decoder * LZ_decompress_open( void ) +struct LZ_Decoder * LZ_decompress_open( void ) { - LZ_Decoder * const d = (LZ_Decoder *)malloc( sizeof (LZ_Decoder) ); + struct LZ_Decoder * const d = + (struct LZ_Decoder *)malloc( sizeof (struct LZ_Decoder) ); if( !d ) return 0; LZ_Decoder_init( d ); - d->rdec = (Range_decoder *)malloc( sizeof (Range_decoder) ); + d->rdec = (struct Range_decoder *)malloc( sizeof (struct Range_decoder) ); if( !d->rdec || !Rd_init( 
d->rdec ) ) { if( d->rdec ) { Rd_free( d->rdec ); free( d->rdec ); d->rdec = 0; } @@ -337,7 +348,7 @@ LZ_Decoder * LZ_decompress_open( void ) } -int LZ_decompress_close( LZ_Decoder * const d ) +int LZ_decompress_close( struct LZ_Decoder * const d ) { if( !d ) return -1; if( d->lz_decoder ) @@ -348,9 +359,9 @@ int LZ_decompress_close( LZ_Decoder * const d ) } -int LZ_decompress_finish( LZ_Decoder * const d ) +int LZ_decompress_finish( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) || d->fatal ) return -1; + if( !verify_decoder( d ) || d->fatal ) return -1; if( d->seeking ) { d->seeking = false; d->partial_in_size += Rd_purge( d->rdec ); } else Rd_finish( d->rdec ); @@ -358,9 +369,9 @@ int LZ_decompress_finish( LZ_Decoder * const d ) } -int LZ_decompress_reset( LZ_Decoder * const d ) +int LZ_decompress_reset( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) ) return -1; + if( !verify_decoder( d ) ) return -1; if( d->lz_decoder ) { LZd_free( d->lz_decoder ); free( d->lz_decoder ); d->lz_decoder = 0; } d->partial_in_size = 0; @@ -374,10 +385,10 @@ int LZ_decompress_reset( LZ_Decoder * const d ) } -int LZ_decompress_sync_to_member( LZ_Decoder * const d ) +int LZ_decompress_sync_to_member( struct LZ_Decoder * const d ) { - unsigned skipped = 0; - if( !check_decoder( d ) ) return -1; + int skipped = 0; + if( !verify_decoder( d ) ) return -1; if( d->lz_decoder ) { LZd_free( d->lz_decoder ); free( d->lz_decoder ); d->lz_decoder = 0; } if( Rd_find_header( d->rdec, &skipped ) ) d->seeking = false; @@ -393,16 +404,12 @@ int LZ_decompress_sync_to_member( LZ_Decoder * const d ) } -int LZ_decompress_read( LZ_Decoder * const d, +int LZ_decompress_read( struct LZ_Decoder * const d, uint8_t * const buffer, const int size ) { int result; - if( !check_decoder( d ) ) return -1; - if( size < 0 ) return 0; - if( d->fatal ) /* don't return error until pending bytes are read */ - { if( d->lz_decoder && !Cb_empty( &d->lz_decoder->cb ) ) goto get_data; - return -1; } - 
if( d->seeking ) return 0; + if( !verify_decoder( d ) || d->fatal ) return -1; + if( d->seeking || size < 0 ) return 0; if( d->lz_decoder && LZd_member_finished( d->lz_decoder ) ) { @@ -414,42 +421,25 @@ int LZ_decompress_read( LZ_Decoder * const d, int rd; d->partial_in_size += d->rdec->member_position; d->rdec->member_position = 0; - if( Rd_available_bytes( d->rdec ) < Lh_size + 5 && + if( Rd_available_bytes( d->rdec ) < Fh_size + 5 && !d->rdec->at_stream_end ) return 0; if( Rd_finished( d->rdec ) && !d->first_header ) return 0; - rd = Rd_read_data( d->rdec, d->member_header, Lh_size ); - if( rd < Lh_size || Rd_finished( d->rdec ) ) /* End Of File */ + rd = Rd_read_data( d->rdec, d->member_header, Fh_size ); + if( Rd_finished( d->rdec ) ) { - if( rd <= 0 || Lh_check_prefix( d->member_header, rd ) ) + if( rd <= 0 || Fh_verify_prefix( d->member_header, rd ) ) d->lz_errno = LZ_unexpected_eof; else d->lz_errno = LZ_header_error; d->fatal = true; return -1; } - if( !Lh_check_magic( d->member_header ) ) + if( !Fh_verify( d->member_header ) ) { /* unreading the header prevents sync_to_member from skipping a member if leading garbage is shorter than a full header; "lgLZIP\x01\x0C" */ if( Rd_unread_data( d->rdec, rd ) ) - { - if( d->first_header || !Lh_check_corrupt( d->member_header ) ) - d->lz_errno = LZ_header_error; - else - d->lz_errno = LZ_data_error; /* corrupt header */ - } - else - d->lz_errno = LZ_library_error; - d->fatal = true; - return -1; - } - if( !Lh_check_version( d->member_header ) || - !isvalid_ds( Lh_get_dictionary_size( d->member_header ) ) ) - { - /* Skip a possible "LZIP" leading garbage; "LZIPLZIP\x01\x0C". - Leave member_pos pointing to the first error. 
*/ - if( Rd_unread_data( d->rdec, 1 + !Lh_check_version( d->member_header ) ) ) - d->lz_errno = LZ_data_error; /* bad version or bad dict size */ + d->lz_errno = LZ_header_error; else d->lz_errno = LZ_library_error; d->fatal = true; @@ -465,9 +455,9 @@ int LZ_decompress_read( LZ_Decoder * const d, d->fatal = true; return -1; } - d->lz_decoder = (LZ_decoder *)malloc( sizeof (LZ_decoder) ); + d->lz_decoder = (struct LZ_decoder *)malloc( sizeof (struct LZ_decoder) ); if( !d->lz_decoder || !LZd_init( d->lz_decoder, d->rdec, - Lh_get_dictionary_size( d->member_header ) ) ) + Fh_get_dictionary_size( d->member_header ) ) ) { /* not enough free memory */ if( d->lz_decoder ) { LZd_free( d->lz_decoder ); free( d->lz_decoder ); d->lz_decoder = 0; } @@ -480,32 +470,30 @@ int LZ_decompress_read( LZ_Decoder * const d, result = LZd_decode_member( d->lz_decoder ); if( result != 0 ) { - if( result == 2 ) /* set input position at EOF */ - { d->rdec->member_position += Cb_used_bytes( &d->rdec->cb ); - Cb_reset( &d->rdec->cb ); - d->lz_errno = LZ_unexpected_eof; } - else if( result == 6 ) d->lz_errno = LZ_library_error; + if( result == 2 ) + { d->lz_errno = LZ_unexpected_eof; + d->rdec->member_position += Cb_used_bytes( &d->rdec->cb ); + Cb_reset( &d->rdec->cb ); } + else if( result == 5 ) d->lz_errno = LZ_library_error; else d->lz_errno = LZ_data_error; d->fatal = true; - if( Cb_empty( &d->lz_decoder->cb ) ) return -1; + return -1; } -get_data: return Cb_read_data( &d->lz_decoder->cb, buffer, size ); } -int LZ_decompress_write( LZ_Decoder * const d, +int LZ_decompress_write( struct LZ_Decoder * const d, const uint8_t * const buffer, const int size ) { int result; - if( !check_decoder( d ) || d->fatal ) return -1; + if( !verify_decoder( d ) || d->fatal ) return -1; if( size < 0 ) return 0; result = Rd_write_data( d->rdec, buffer, size ); while( d->seeking ) { - int size2; - unsigned skipped = 0; + int size2, skipped = 0; if( Rd_find_header( d->rdec, &skipped ) ) d->seeking = false; 
d->partial_in_size += skipped; if( result >= size ) break; @@ -517,82 +505,82 @@ int LZ_decompress_write( LZ_Decoder * const d, } -int LZ_decompress_write_size( LZ_Decoder * const d ) +int LZ_decompress_write_size( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) || d->fatal ) return -1; + if( !verify_decoder( d ) || d->fatal ) return -1; return Rd_free_bytes( d->rdec ); } -LZ_Errno LZ_decompress_errno( LZ_Decoder * const d ) +enum LZ_Errno LZ_decompress_errno( struct LZ_Decoder * const d ) { if( !d ) return LZ_bad_argument; return d->lz_errno; } -int LZ_decompress_finished( LZ_Decoder * const d ) +int LZ_decompress_finished( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) || d->fatal ) return -1; - return Rd_finished( d->rdec ) && - ( !d->lz_decoder || LZd_member_finished( d->lz_decoder ) ); + if( !verify_decoder( d ) ) return -1; + return ( Rd_finished( d->rdec ) && + ( !d->lz_decoder || LZd_member_finished( d->lz_decoder ) ) ); } -int LZ_decompress_member_finished( LZ_Decoder * const d ) +int LZ_decompress_member_finished( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) || d->fatal ) return -1; - return d->lz_decoder && LZd_member_finished( d->lz_decoder ); + if( !verify_decoder( d ) ) return -1; + return ( d->lz_decoder && LZd_member_finished( d->lz_decoder ) ); } -int LZ_decompress_member_version( LZ_Decoder * const d ) +int LZ_decompress_member_version( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) ) return -1; - return Lh_version( d->member_header ); + if( !verify_decoder( d ) ) return -1; + return Fh_version( d->member_header ); } -int LZ_decompress_dictionary_size( LZ_Decoder * const d ) +int LZ_decompress_dictionary_size( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) ) return -1; - return Lh_get_dictionary_size( d->member_header ); + if( !verify_decoder( d ) ) return -1; + return Fh_get_dictionary_size( d->member_header ); } -unsigned LZ_decompress_data_crc( LZ_Decoder * const d ) +unsigned 
LZ_decompress_data_crc( struct LZ_Decoder * const d ) { - if( check_decoder( d ) && d->lz_decoder ) + if( verify_decoder( d ) && d->lz_decoder ) return LZd_crc( d->lz_decoder ); return 0; } -unsigned long long LZ_decompress_data_position( LZ_Decoder * const d ) +unsigned long long LZ_decompress_data_position( struct LZ_Decoder * const d ) { - if( check_decoder( d ) && d->lz_decoder ) + if( verify_decoder( d ) && d->lz_decoder ) return LZd_data_position( d->lz_decoder ); return 0; } -unsigned long long LZ_decompress_member_position( LZ_Decoder * const d ) +unsigned long long LZ_decompress_member_position( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) ) return 0; + if( !verify_decoder( d ) ) return 0; return d->rdec->member_position; } -unsigned long long LZ_decompress_total_in_size( LZ_Decoder * const d ) +unsigned long long LZ_decompress_total_in_size( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) ) return 0; + if( !verify_decoder( d ) ) return 0; return d->partial_in_size + d->rdec->member_position; } -unsigned long long LZ_decompress_total_out_size( LZ_Decoder * const d ) +unsigned long long LZ_decompress_total_out_size( struct LZ_Decoder * const d ) { - if( !check_decoder( d ) ) return 0; + if( !verify_decoder( d ) ) return 0; if( d->lz_decoder ) return d->partial_out_size + LZd_data_position( d->lz_decoder ); return d->partial_out_size; diff --git a/lzlib.h b/lzlib.h index 926124a..a734fbf 100644 --- a/lzlib.h +++ b/lzlib.h @@ -1,42 +1,45 @@ -/* Lzlib - Compression library for the lzip format - Copyright (C) 2009-2025 Antonio Diaz Diaz. +/* Lzlib - Compression library for the lzip format + Copyright (C) 2009-2016 Antonio Diaz Diaz. - This library is free software. 
Redistribution and use in source and - binary forms, with or without modification, are permitted provided - that the following conditions are met: + This library is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. - 1. Redistributions of source code must retain the above copyright - notice, this list of conditions, and the following disclaimer. + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. - 2. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions, and the following disclaimer in the - documentation and/or other materials provided with the distribution. + You should have received a copy of the GNU General Public License + along with this library. If not, see . - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + As a special exception, you may use this file as part of a free + software library without restriction. Specifically, if other files + instantiate templates or use macros or inline functions from this + file, or you compile this file and link it with other files to + produce an executable, this file does not by itself cause the + resulting executable to be covered by the GNU General Public + License. This exception does not however invalidate any other + reasons why the executable file might be covered by the GNU General + Public License. */ #ifdef __cplusplus extern "C" { #endif -/* LZ_API_VERSION was first defined in lzlib 1.8 to 1. - Since lzlib 1.12, LZ_API_VERSION is defined as (major * 1000 + minor). 
*/ +#define LZ_API_VERSION 1 -#define LZ_API_VERSION 1015 +static const char * const LZ_version_string = "1.8"; -static const char * const LZ_version_string = "1.15"; - -typedef enum LZ_Errno - { LZ_ok = 0, LZ_bad_argument, LZ_mem_error, - LZ_sequence_error, LZ_header_error, LZ_unexpected_eof, - LZ_data_error, LZ_library_error } LZ_Errno; +enum LZ_Errno { LZ_ok = 0, LZ_bad_argument, LZ_mem_error, + LZ_sequence_error, LZ_header_error, LZ_unexpected_eof, + LZ_data_error, LZ_library_error }; -int LZ_api_version( void ); /* new in 1.12 */ const char * LZ_version( void ); -const char * LZ_strerror( const LZ_Errno lz_errno ); +const char * LZ_strerror( const enum LZ_Errno lz_errno ); int LZ_min_dictionary_bits( void ); int LZ_min_dictionary_size( void ); @@ -46,65 +49,65 @@ int LZ_min_match_len_limit( void ); int LZ_max_match_len_limit( void ); -/* --------------------- Compression Functions --------------------- */ +/*---------------------- Compression Functions ----------------------*/ -typedef struct LZ_Encoder LZ_Encoder; +struct LZ_Encoder; -LZ_Encoder * LZ_compress_open( const int dictionary_size, - const int match_len_limit, - const unsigned long long member_size ); -int LZ_compress_close( LZ_Encoder * const encoder ); +struct LZ_Encoder * LZ_compress_open( const int dictionary_size, + const int match_len_limit, + const unsigned long long member_size ); +int LZ_compress_close( struct LZ_Encoder * const encoder ); -int LZ_compress_finish( LZ_Encoder * const encoder ); -int LZ_compress_restart_member( LZ_Encoder * const encoder, +int LZ_compress_finish( struct LZ_Encoder * const encoder ); +int LZ_compress_restart_member( struct LZ_Encoder * const encoder, const unsigned long long member_size ); -int LZ_compress_sync_flush( LZ_Encoder * const encoder ); +int LZ_compress_sync_flush( struct LZ_Encoder * const encoder ); -int LZ_compress_read( LZ_Encoder * const encoder, +int LZ_compress_read( struct LZ_Encoder * const encoder, uint8_t * const buffer, const int size ); 
-int LZ_compress_write( LZ_Encoder * const encoder, +int LZ_compress_write( struct LZ_Encoder * const encoder, const uint8_t * const buffer, const int size ); -int LZ_compress_write_size( LZ_Encoder * const encoder ); +int LZ_compress_write_size( struct LZ_Encoder * const encoder ); -LZ_Errno LZ_compress_errno( LZ_Encoder * const encoder ); -int LZ_compress_finished( LZ_Encoder * const encoder ); -int LZ_compress_member_finished( LZ_Encoder * const encoder ); +enum LZ_Errno LZ_compress_errno( struct LZ_Encoder * const encoder ); +int LZ_compress_finished( struct LZ_Encoder * const encoder ); +int LZ_compress_member_finished( struct LZ_Encoder * const encoder ); -unsigned long long LZ_compress_data_position( LZ_Encoder * const encoder ); -unsigned long long LZ_compress_member_position( LZ_Encoder * const encoder ); -unsigned long long LZ_compress_total_in_size( LZ_Encoder * const encoder ); -unsigned long long LZ_compress_total_out_size( LZ_Encoder * const encoder ); +unsigned long long LZ_compress_data_position( struct LZ_Encoder * const encoder ); +unsigned long long LZ_compress_member_position( struct LZ_Encoder * const encoder ); +unsigned long long LZ_compress_total_in_size( struct LZ_Encoder * const encoder ); +unsigned long long LZ_compress_total_out_size( struct LZ_Encoder * const encoder ); -/* -------------------- Decompression Functions -------------------- */ +/*--------------------- Decompression Functions ---------------------*/ -typedef struct LZ_Decoder LZ_Decoder; +struct LZ_Decoder; -LZ_Decoder * LZ_decompress_open( void ); -int LZ_decompress_close( LZ_Decoder * const decoder ); +struct LZ_Decoder * LZ_decompress_open( void ); +int LZ_decompress_close( struct LZ_Decoder * const decoder ); -int LZ_decompress_finish( LZ_Decoder * const decoder ); -int LZ_decompress_reset( LZ_Decoder * const decoder ); -int LZ_decompress_sync_to_member( LZ_Decoder * const decoder ); +int LZ_decompress_finish( struct LZ_Decoder * const decoder ); +int 
LZ_decompress_reset( struct LZ_Decoder * const decoder ); +int LZ_decompress_sync_to_member( struct LZ_Decoder * const decoder ); -int LZ_decompress_read( LZ_Decoder * const decoder, +int LZ_decompress_read( struct LZ_Decoder * const decoder, uint8_t * const buffer, const int size ); -int LZ_decompress_write( LZ_Decoder * const decoder, +int LZ_decompress_write( struct LZ_Decoder * const decoder, const uint8_t * const buffer, const int size ); -int LZ_decompress_write_size( LZ_Decoder * const decoder ); +int LZ_decompress_write_size( struct LZ_Decoder * const decoder ); -LZ_Errno LZ_decompress_errno( LZ_Decoder * const decoder ); -int LZ_decompress_finished( LZ_Decoder * const decoder ); -int LZ_decompress_member_finished( LZ_Decoder * const decoder ); +enum LZ_Errno LZ_decompress_errno( struct LZ_Decoder * const decoder ); +int LZ_decompress_finished( struct LZ_Decoder * const decoder ); +int LZ_decompress_member_finished( struct LZ_Decoder * const decoder ); -int LZ_decompress_member_version( LZ_Decoder * const decoder ); -int LZ_decompress_dictionary_size( LZ_Decoder * const decoder ); -unsigned LZ_decompress_data_crc( LZ_Decoder * const decoder ); +int LZ_decompress_member_version( struct LZ_Decoder * const decoder ); +int LZ_decompress_dictionary_size( struct LZ_Decoder * const decoder ); +unsigned LZ_decompress_data_crc( struct LZ_Decoder * const decoder ); -unsigned long long LZ_decompress_data_position( LZ_Decoder * const decoder ); -unsigned long long LZ_decompress_member_position( LZ_Decoder * const decoder ); -unsigned long long LZ_decompress_total_in_size( LZ_Decoder * const decoder ); -unsigned long long LZ_decompress_total_out_size( LZ_Decoder * const decoder ); +unsigned long long LZ_decompress_data_position( struct LZ_Decoder * const decoder ); +unsigned long long LZ_decompress_member_position( struct LZ_Decoder * const decoder ); +unsigned long long LZ_decompress_total_in_size( struct LZ_Decoder * const decoder ); +unsigned long long 
LZ_decompress_total_out_size( struct LZ_Decoder * const decoder ); #ifdef __cplusplus } diff --git a/main.c b/main.c new file mode 100644 index 0000000..c2754bf --- /dev/null +++ b/main.c @@ -0,0 +1,1072 @@ +/* Minilzip - Test program for the lzlib library + Copyright (C) 2009-2016 Antonio Diaz Diaz. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . +*/ +/* + Exit status: 0 for a normal exit, 1 for environmental problems + (file not found, invalid flags, I/O errors, etc), 2 to indicate a + corrupt or invalid input file, 3 for an internal consistency error + (eg, bug) which caused minilzip to panic. +*/ + +#define _FILE_OFFSET_BITS 64 + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#if defined(__MSVCRT__) +#include +#define fchmod(x,y) 0 +#define fchown(x,y,z) 0 +#define strtoull strtoul +#define SIGHUP SIGTERM +#define S_ISSOCK(x) 0 +#define S_IRGRP 0 +#define S_IWGRP 0 +#define S_IROTH 0 +#define S_IWOTH 0 +#endif +#if defined(__OS2__) +#include +#endif + +#include "carg_parser.h" +#include "lzlib.h" + +#ifndef O_BINARY +#define O_BINARY 0 +#endif + +#if CHAR_BIT != 8 +#error "Environments where CHAR_BIT != 8 are not supported." +#endif + +#ifndef max + #define max(x,y) ((x) >= (y) ? (x) : (y)) +#endif +#ifndef min + #define min(x,y) ((x) <= (y) ? 
(x) : (y)) +#endif + +void cleanup_and_fail( const int retval ); +void show_error( const char * const msg, const int errcode, const bool help ); +void internal_error( const char * const msg ); + +int verbosity = 0; + +const char * const Program_name = "Minilzip"; +const char * const program_name = "minilzip"; +const char * const program_year = "2016"; +const char * invocation_name = 0; + +struct { const char * from; const char * to; } const known_extensions[] = { + { ".lz", "" }, + { ".tlz", ".tar" }, + { 0, 0 } }; + +struct Lzma_options + { + int dictionary_size; /* 4 KiB .. 512 MiB */ + int match_len_limit; /* 5 .. 273 */ + }; + +enum Mode { m_compress, m_decompress, m_test }; + +char * output_filename = 0; +int outfd = -1; +bool delete_output_on_interrupt = false; + + +struct Pretty_print + { + const char * name; + const char * stdin_name; + unsigned longest_name; + bool first_post; + }; + +static void Pp_init( struct Pretty_print * const pp, + const char * const filenames[], + const int num_filenames, const int verbosity ) + { + unsigned stdin_name_len; + int i; + pp->name = 0; + pp->stdin_name = "(stdin)"; + pp->longest_name = 0; + pp->first_post = false; + stdin_name_len = strlen( pp->stdin_name ); + + if( verbosity <= 0 ) return; + for( i = 0; i < num_filenames; ++i ) + { + const char * const s = filenames[i]; + const unsigned len = (strcmp( s, "-" ) == 0) ? 
stdin_name_len : strlen( s ); + if( len > pp->longest_name ) pp->longest_name = len; + } + if( pp->longest_name == 0 ) pp->longest_name = stdin_name_len; + } + +static inline void Pp_set_name( struct Pretty_print * const pp, + const char * const filename ) + { + if( filename && filename[0] && strcmp( filename, "-" ) != 0 ) + pp->name = filename; + else pp->name = pp->stdin_name; + pp->first_post = true; + } + +static inline void Pp_reset( struct Pretty_print * const pp ) + { if( pp->name && pp->name[0] ) pp->first_post = true; } + +static void Pp_show_msg( struct Pretty_print * const pp, const char * const msg ) + { + if( verbosity >= 0 ) + { + if( pp->first_post ) + { + unsigned i; + pp->first_post = false; + fprintf( stderr, " %s: ", pp->name ); + for( i = strlen( pp->name ); i < pp->longest_name; ++i ) + fputc( ' ', stderr ); + if( !msg ) fflush( stderr ); + } + if( msg ) fprintf( stderr, "%s\n", msg ); + } + } + + +static void show_help( void ) + { + printf( "%s - Test program for the lzlib library.\n", Program_name ); + printf( "\nUsage: %s [options] [files]\n", invocation_name ); + printf( "\nOptions:\n" + " -h, --help display this help and exit\n" + " -V, --version output version information and exit\n" + " -a, --trailing-error exit with error status if trailing data\n" + " -b, --member-size= set member size limit in bytes\n" + " -c, --stdout write to standard output, keep input files\n" + " -d, --decompress decompress\n" + " -f, --force overwrite existing output files\n" + " -F, --recompress force re-compression of compressed files\n" + " -k, --keep keep (don't delete) input files\n" + " -m, --match-length= set match length limit in bytes [36]\n" + " -o, --output= if reading standard input, write to \n" + " -q, --quiet suppress all messages\n" + " -s, --dictionary-size= set dictionary size limit in bytes [8 MiB]\n" + " -S, --volume-size= set volume size limit in bytes\n" + " -t, --test test compressed file integrity\n" + " -v, --verbose be verbose (a 2nd -v 
gives more)\n" + " -0 .. -9 set compression level [default 6]\n" + " --fast alias for -0\n" + " --best alias for -9\n" + "If no file names are given, or if a file is '-', minilzip compresses or\n" + "decompresses from standard input to standard output.\n" + "Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,\n" + "Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...\n" + "Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12\n" + "to 2^29 bytes.\n" + "\nThe bidimensional parameter space of LZMA can't be mapped to a linear\n" + "scale optimal for all files. If your files are large, very repetitive,\n" + "etc, you may need to use the --dictionary-size and --match-length\n" + "options directly to achieve optimal performance.\n" + "\nExit status: 0 for a normal exit, 1 for environmental problems (file\n" + "not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or\n" + "invalid input file, 3 for an internal consistency error (eg, bug) which\n" + "caused minilzip to panic.\n" + "\nReport bugs to lzip-bug@nongnu.org\n" + "Lzlib home page: http://www.nongnu.org/lzip/lzlib.html\n" ); + } + + +static void show_version( void ) + { + printf( "%s %s\n", program_name, PROGVERSION ); + printf( "Copyright (C) %s Antonio Diaz Diaz.\n", program_year ); + printf( "Using lzlib %s\n", LZ_version() ); + printf( "License GPLv2+: GNU GPL version 2 or later \n" + "This is free software: you are free to change and redistribute it.\n" + "There is NO WARRANTY, to the extent permitted by law.\n" ); + } + + +static void show_header( const unsigned dictionary_size ) + { + if( verbosity >= 3 ) + { + const char * const prefix[8] = + { "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi" }; + enum { factor = 1024 }; + const char * p = ""; + const char * np = " "; + unsigned num = dictionary_size, i; + bool exact = ( num % factor == 0 ); + + for( i = 0; i < 8 && ( num > 9999 || ( exact && num >= factor ) ); ++i ) + { num /= factor; if( num % 
factor != 0 ) exact = false; + p = prefix[i]; np = ""; } + fprintf( stderr, "dictionary size %s%4u %sB. ", np, num, p ); + } + } + + +static unsigned long long getnum( const char * const ptr, + const unsigned long long llimit, + const unsigned long long ulimit ) + { + unsigned long long result; + char * tail; + errno = 0; + result = strtoull( ptr, &tail, 0 ); + if( tail == ptr ) + { + show_error( "Bad or missing numerical argument.", 0, true ); + exit( 1 ); + } + + if( !errno && tail[0] ) + { + const int factor = ( tail[1] == 'i' ) ? 1024 : 1000; + int exponent = 0; /* 0 = bad multiplier */ + int i; + switch( tail[0] ) + { + case 'Y': exponent = 8; break; + case 'Z': exponent = 7; break; + case 'E': exponent = 6; break; + case 'P': exponent = 5; break; + case 'T': exponent = 4; break; + case 'G': exponent = 3; break; + case 'M': exponent = 2; break; + case 'K': if( factor == 1024 ) exponent = 1; break; + case 'k': if( factor == 1000 ) exponent = 1; break; + } + if( exponent <= 0 ) + { + show_error( "Bad multiplier in numerical argument.", 0, true ); + exit( 1 ); + } + for( i = 0; i < exponent; ++i ) + { + if( ulimit / factor >= result ) result *= factor; + else { errno = ERANGE; break; } + } + } + if( !errno && ( result < llimit || result > ulimit ) ) errno = ERANGE; + if( errno ) + { + show_error( "Numerical argument out of limits.", 0, false ); + exit( 1 ); + } + return result; + } + + +static int get_dict_size( const char * const arg ) + { + char * tail; + int dictionary_size; + const int bits = strtol( arg, &tail, 0 ); + if( bits >= LZ_min_dictionary_bits() && + bits <= LZ_max_dictionary_bits() && *tail == 0 ) + return ( 1 << bits ); + dictionary_size = getnum( arg, LZ_min_dictionary_size(), + LZ_max_dictionary_size() ); + if( dictionary_size == 65535 ) ++dictionary_size; /* no fast encoder */ + return dictionary_size; + } + + +static int extension_index( const char * const name ) + { + int i; + for( i = 0; known_extensions[i].from; ++i ) + { + const char * 
const ext = known_extensions[i].from; + const unsigned name_len = strlen( name ); + const unsigned ext_len = strlen( ext ); + if( name_len > ext_len && + strncmp( name + name_len - ext_len, ext, ext_len ) == 0 ) + return i; + } + return -1; + } + + +static int open_instream( const char * const name, struct stat * const in_statsp, + const enum Mode program_mode, const int eindex, + const bool recompress, const bool to_stdout ) + { + int infd = -1; + if( program_mode == m_compress && !recompress && eindex >= 0 ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: Input file '%s' already has '%s' suffix.\n", + program_name, name, known_extensions[eindex].from ); + } + else + { + infd = open( name, O_RDONLY | O_BINARY ); + if( infd < 0 ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: Can't open input file '%s': %s\n", + program_name, name, strerror( errno ) ); + } + else + { + const int i = fstat( infd, in_statsp ); + const mode_t mode = in_statsp->st_mode; + const bool can_read = ( i == 0 && + ( S_ISBLK( mode ) || S_ISCHR( mode ) || + S_ISFIFO( mode ) || S_ISSOCK( mode ) ) ); + const bool no_ofile = ( to_stdout || program_mode == m_test ); + if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || !no_ofile ) ) ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: Input file '%s' is not a regular file%s.\n", + program_name, name, + ( can_read && !no_ofile ) ? 
+ ",\n and '--stdout' was not specified" : "" ); + close( infd ); + infd = -1; + } + } + } + return infd; + } + + +/* assure at least a minimum size for buffer 'buf' */ +static void * resize_buffer( void * buf, const int min_size ) + { + if( buf ) buf = realloc( buf, min_size ); + else buf = malloc( min_size ); + if( !buf ) + { + show_error( "Not enough memory.", 0, false ); + cleanup_and_fail( 1 ); + } + return buf; + } + + +static void set_c_outname( const char * const name, const bool multifile ) + { + output_filename = resize_buffer( output_filename, strlen( name ) + 5 + + strlen( known_extensions[0].from ) + 1 ); + strcpy( output_filename, name ); + if( multifile ) strcat( output_filename, "00001" ); + strcat( output_filename, known_extensions[0].from ); + } + + +static void set_d_outname( const char * const name, const int i ) + { + const unsigned name_len = strlen( name ); + if( i >= 0 ) + { + const char * const from = known_extensions[i].from; + const unsigned from_len = strlen( from ); + if( name_len > from_len ) + { + output_filename = resize_buffer( output_filename, name_len + + strlen( known_extensions[0].to ) + 1 ); + strcpy( output_filename, name ); + strcpy( output_filename + name_len - from_len, known_extensions[i].to ); + return; + } + } + output_filename = resize_buffer( output_filename, name_len + 4 + 1 ); + strcpy( output_filename, name ); + strcat( output_filename, ".out" ); + if( verbosity >= 1 ) + fprintf( stderr, "%s: Can't guess original name for '%s' -- using '%s'\n", + program_name, name, output_filename ); + } + + +static bool open_outstream( const bool force, const bool from_stdin ) + { + const mode_t usr_rw = S_IRUSR | S_IWUSR; + const mode_t all_rw = usr_rw | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH; + const mode_t outfd_mode = from_stdin ? 
all_rw : usr_rw; + int flags = O_CREAT | O_WRONLY | O_BINARY; + if( force ) flags |= O_TRUNC; else flags |= O_EXCL; + + outfd = open( output_filename, flags, outfd_mode ); + if( outfd >= 0 ) delete_output_on_interrupt = true; + else if( verbosity >= 0 ) + { + if( errno == EEXIST ) + fprintf( stderr, "%s: Output file '%s' already exists, skipping.\n", + program_name, output_filename ); + else + fprintf( stderr, "%s: Can't create output file '%s': %s\n", + program_name, output_filename, strerror( errno ) ); + } + return ( outfd >= 0 ); + } + + +static bool check_tty( const int infd, const enum Mode program_mode ) + { + if( program_mode == m_compress && isatty( outfd ) ) + { + show_error( "I won't write compressed data to a terminal.", 0, true ); + return false; + } + if( ( program_mode == m_decompress || program_mode == m_test ) && + isatty( infd ) ) + { + show_error( "I won't read compressed data from a terminal.", 0, true ); + return false; + } + return true; + } + + +void cleanup_and_fail( const int retval ) + { + if( delete_output_on_interrupt ) + { + delete_output_on_interrupt = false; + if( verbosity >= 0 ) + fprintf( stderr, "%s: Deleting output file '%s', if it exists.\n", + program_name, output_filename ); + if( outfd >= 0 ) { close( outfd ); outfd = -1; } + if( remove( output_filename ) != 0 && errno != ENOENT ) + show_error( "WARNING: deletion of output file (apparently) failed.", 0, false ); + } + exit( retval ); + } + + + /* Set permissions, owner and times. */ +static void close_and_set_permissions( const struct stat * const in_statsp ) + { + bool warning = false; + if( in_statsp ) + { + const mode_t mode = in_statsp->st_mode; + /* fchown will in many cases return with EPERM, which can be safely ignored. 
*/ + if( fchown( outfd, in_statsp->st_uid, in_statsp->st_gid ) == 0 ) + { if( fchmod( outfd, mode ) != 0 ) warning = true; } + else + if( errno != EPERM || + fchmod( outfd, mode & ~( S_ISUID | S_ISGID | S_ISVTX ) ) != 0 ) + warning = true; + } + if( close( outfd ) != 0 ) + { + show_error( "Error closing output file", errno, false ); + cleanup_and_fail( 1 ); + } + outfd = -1; + delete_output_on_interrupt = false; + if( in_statsp ) + { + struct utimbuf t; + t.actime = in_statsp->st_atime; + t.modtime = in_statsp->st_mtime; + if( utime( output_filename, &t ) != 0 ) warning = true; + } + if( warning && verbosity >= 1 ) + show_error( "Can't change output file attributes.", 0, false ); + } + + +/* Returns the number of bytes really read. + If (returned value < size) and (errno == 0), means EOF was reached. +*/ +static int readblock( const int fd, uint8_t * const buf, const int size ) + { + int sz = 0; + errno = 0; + while( sz < size ) + { + const int n = read( fd, buf + sz, size - sz ); + if( n > 0 ) sz += n; + else if( n == 0 ) break; /* EOF */ + else if( errno != EINTR ) break; + errno = 0; + } + return sz; + } + + +/* Returns the number of bytes really written. + If (returned value < size), it is always an error. 
+*/ +static int writeblock( const int fd, const uint8_t * const buf, const int size ) + { + int sz = 0; + errno = 0; + while( sz < size ) + { + const int n = write( fd, buf + sz, size - sz ); + if( n > 0 ) sz += n; + else if( n < 0 && errno != EINTR ) break; + errno = 0; + } + return sz; + } + + +static bool next_filename( void ) + { + const unsigned name_len = strlen( output_filename ); + const unsigned ext_len = strlen( known_extensions[0].from ); + int i, j; + if( name_len >= ext_len + 5 ) /* "*00001.lz" */ + for( i = name_len - ext_len - 1, j = 0; j < 5; --i, ++j ) + { + if( output_filename[i] < '9' ) { ++output_filename[i]; return true; } + else output_filename[i] = '0'; + } + return false; + } + + +static int do_compress( struct LZ_Encoder * const encoder, + const unsigned long long member_size, + const unsigned long long volume_size, + const int infd, struct Pretty_print * const pp, + const struct stat * const in_statsp ) + { + unsigned long long partial_volume_size = 0; + enum { buffer_size = 65536 }; + uint8_t buffer[buffer_size]; + if( verbosity >= 1 ) Pp_show_msg( pp, 0 ); + + while( true ) + { + int in_size = 0, out_size; + while( LZ_compress_write_size( encoder ) > 0 ) + { + const int size = min( LZ_compress_write_size( encoder ), buffer_size ); + const int rd = readblock( infd, buffer, size ); + if( rd != size && errno ) + { + Pp_show_msg( pp, 0 ); show_error( "Read error", errno, false ); + return 1; + } + if( rd > 0 && rd != LZ_compress_write( encoder, buffer, rd ) ) + internal_error( "library error (LZ_compress_write)." 
); + if( rd < size ) LZ_compress_finish( encoder ); +/* else LZ_compress_sync_flush( encoder ); */ + in_size += rd; + } + out_size = LZ_compress_read( encoder, buffer, buffer_size ); + if( out_size < 0 ) + { + Pp_show_msg( pp, 0 ); + if( verbosity >= 0 ) + fprintf( stderr, "%s: LZ_compress_read error: %s\n", + program_name, LZ_strerror( LZ_compress_errno( encoder ) ) ); + return 1; + } + else if( out_size > 0 ) + { + const int wr = writeblock( outfd, buffer, out_size ); + if( wr != out_size ) + { + Pp_show_msg( pp, 0 ); show_error( "Write error", errno, false ); + return 1; + } + } + else if( in_size == 0 ) internal_error( "library error (LZ_compress_read)." ); + if( LZ_compress_member_finished( encoder ) ) + { + unsigned long long size; + if( LZ_compress_finished( encoder ) == 1 ) break; + if( volume_size > 0 ) + { + partial_volume_size += LZ_compress_member_position( encoder ); + if( partial_volume_size >= volume_size - LZ_min_dictionary_size() ) + { + partial_volume_size = 0; + if( delete_output_on_interrupt ) + { + close_and_set_permissions( in_statsp ); + if( !next_filename() ) + { Pp_show_msg( pp, "Too many volume files." 
); return 1; } + if( !open_outstream( true, !in_statsp ) ) return 1; + } + } + size = min( member_size, volume_size - partial_volume_size ); + } + else + size = member_size; + if( LZ_compress_restart_member( encoder, size ) < 0 ) + { + Pp_show_msg( pp, 0 ); + if( verbosity >= 0 ) + fprintf( stderr, "%s: LZ_compress_restart_member error: %s\n", + program_name, LZ_strerror( LZ_compress_errno( encoder ) ) ); + return 1; + } + } + } + + if( verbosity >= 1 ) + { + const unsigned long long in_size = LZ_compress_total_in_size( encoder ); + const unsigned long long out_size = LZ_compress_total_out_size( encoder ); + if( in_size == 0 || out_size == 0 ) + fputs( " no data compressed.\n", stderr ); + else + fprintf( stderr, "%6.3f:1, %6.3f bits/byte, " + "%5.2f%% saved, %llu in, %llu out.\n", + (double)in_size / out_size, + ( 8.0 * out_size ) / in_size, + 100.0 * ( 1.0 - ( (double)out_size / in_size ) ), + in_size, out_size ); + } + return 0; + } + + +static int compress( const unsigned long long member_size, + const unsigned long long volume_size, const int infd, + const struct Lzma_options * const encoder_options, + struct Pretty_print * const pp, + const struct stat * const in_statsp ) + { + struct LZ_Encoder * const encoder = + LZ_compress_open( encoder_options->dictionary_size, + encoder_options->match_len_limit, ( volume_size > 0 ) ? + min( member_size, volume_size ) : member_size ); + int retval; + + if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) + { + if( !encoder || LZ_compress_errno( encoder ) == LZ_mem_error ) + Pp_show_msg( pp, "Not enough memory. Try a smaller dictionary size." ); + else + internal_error( "invalid argument to encoder." 
); + retval = 1; + } + else retval = do_compress( encoder, member_size, volume_size, + infd, pp, in_statsp ); + LZ_compress_close( encoder ); + return retval; + } + + +static int do_decompress( struct LZ_Decoder * const decoder, const int infd, + struct Pretty_print * const pp, + const bool ignore_trailing, const bool testing ) + { + enum { buffer_size = 65536 }; + uint8_t buffer[buffer_size]; + bool first_member; + + for( first_member = true; ; ) + { + const int max_in_size = min( LZ_decompress_write_size( decoder ), buffer_size ); + int in_size = 0, out_size = 0; + if( max_in_size > 0 ) + { + in_size = readblock( infd, buffer, max_in_size ); + if( in_size != max_in_size && errno ) + { + Pp_show_msg( pp, 0 ); show_error( "Read error", errno, false ); + return 1; + } + if( in_size > 0 && in_size != LZ_decompress_write( decoder, buffer, in_size ) ) + internal_error( "library error (LZ_decompress_write)." ); + if( in_size < max_in_size ) LZ_decompress_finish( decoder ); + } + while( true ) + { + const int rd = LZ_decompress_read( decoder, buffer, buffer_size ); + if( rd > 0 ) + { + out_size += rd; + if( outfd >= 0 ) + { + const int wr = writeblock( outfd, buffer, rd ); + if( wr != rd ) + { + Pp_show_msg( pp, 0 ); show_error( "Write error", errno, false ); + return 1; + } + } + } + else if( rd < 0 ) { out_size = rd; break; } + if( LZ_decompress_member_finished( decoder ) == 1 ) + { + if( verbosity >= 1 ) + { + const unsigned long long data_size = LZ_decompress_data_position( decoder ); + const unsigned long long member_size = LZ_decompress_member_position( decoder ); + Pp_show_msg( pp, 0 ); + show_header( LZ_decompress_dictionary_size( decoder ) ); + if( verbosity >= 2 && data_size > 0 && member_size > 0 ) + fprintf( stderr, "%6.3f:1, %6.3f bits/byte, %5.2f%% saved. 
", + (double)data_size / member_size, + ( 8.0 * member_size ) / data_size, + 100.0 * ( 1.0 - ( (double)member_size / data_size ) ) ); + if( verbosity >= 4 ) + fprintf( stderr, "data CRC %08X, data size %9llu, member size %8llu. ", + LZ_decompress_data_crc( decoder ), data_size, member_size ); + fputs( testing ? "ok\n" : "done\n", stderr ); + } + first_member = false; Pp_reset( pp ); + } + if( rd <= 0 ) break; + } + if( out_size < 0 || ( first_member && out_size == 0 ) ) + { + const enum LZ_Errno lz_errno = LZ_decompress_errno( decoder ); + if( lz_errno == LZ_unexpected_eof && + LZ_decompress_member_position( decoder ) <= 6 ) + { Pp_show_msg( pp, "File ends unexpectedly at member header." ); + return 2; } + if( lz_errno == LZ_header_error ) + { + if( first_member ) + { Pp_show_msg( pp, "Bad magic number (file not in lzip format)." ); + return 2; } + else if( !ignore_trailing ) + { show_error( "Trailing data not allowed.", 0, false ); return 2; } + break; + } + if( lz_errno == LZ_mem_error ) + { Pp_show_msg( pp, "Not enough memory." ); return 1; } + if( verbosity >= 0 ) + { + Pp_show_msg( pp, 0 ); + if( lz_errno == LZ_unexpected_eof ) + fprintf( stderr, "File ends unexpectedly at pos %llu\n", + LZ_decompress_total_in_size( decoder ) ); + else + fprintf( stderr, "Decoder error at pos %llu: %s\n", + LZ_decompress_total_in_size( decoder ), + LZ_strerror( LZ_decompress_errno( decoder ) ) ); + } + return 2; + } + if( LZ_decompress_finished( decoder ) == 1 ) break; + if( in_size == 0 && out_size == 0 ) + internal_error( "library error (LZ_decompress_read)." ); + } + return 0; + } + + +static int decompress( const int infd, struct Pretty_print * const pp, + const bool ignore_trailing, const bool testing ) + { + struct LZ_Decoder * const decoder = LZ_decompress_open(); + int retval; + + if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) + { Pp_show_msg( pp, "Not enough memory." 
); retval = 1; } + else retval = do_decompress( decoder, infd, pp, ignore_trailing, testing ); + + LZ_decompress_close( decoder ); + return retval; + } + + +void signal_handler( int sig ) + { + if( sig ) {} /* keep compiler happy */ + show_error( "Control-C or similar caught, quitting.", 0, false ); + cleanup_and_fail( 1 ); + } + + +static void set_signals( void ) + { + signal( SIGHUP, signal_handler ); + signal( SIGINT, signal_handler ); + signal( SIGTERM, signal_handler ); + } + + +void show_error( const char * const msg, const int errcode, const bool help ) + { + if( verbosity < 0 ) return; + if( msg && msg[0] ) + { + fprintf( stderr, "%s: %s", program_name, msg ); + if( errcode > 0 ) fprintf( stderr, ": %s", strerror( errcode ) ); + fputc( '\n', stderr ); + } + if( help ) + fprintf( stderr, "Try '%s --help' for more information.\n", + invocation_name ); + } + + +void internal_error( const char * const msg ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: internal error: %s\n", program_name, msg ); + exit( 3 ); + } + + +int main( const int argc, const char * const argv[] ) + { + /* Mapping from gzip/bzip2 style 1..9 compression modes + to the corresponding LZMA compression modes. 
*/ + const struct Lzma_options option_mapping[] = + { + { 65535, 16 }, /* -0 (65535,16 chooses fast encoder) */ + { 1 << 20, 5 }, /* -1 */ + { 3 << 19, 6 }, /* -2 */ + { 1 << 21, 8 }, /* -3 */ + { 3 << 20, 12 }, /* -4 */ + { 1 << 22, 20 }, /* -5 */ + { 1 << 23, 36 }, /* -6 */ + { 1 << 24, 68 }, /* -7 */ + { 3 << 23, 132 }, /* -8 */ + { 1 << 25, 273 } }; /* -9 */ + struct Lzma_options encoder_options = option_mapping[6]; /* default = "-6" */ + const unsigned long long max_member_size = 0x0008000000000000ULL; + const unsigned long long max_volume_size = 0x4000000000000000ULL; + unsigned long long member_size = max_member_size; + unsigned long long volume_size = 0; + const char * input_filename = ""; + const char * default_output_filename = ""; + const char ** filenames = 0; + int num_filenames = 0; + int infd = -1; + enum Mode program_mode = m_compress; + int argind = 0; + int retval = 0; + int i; + bool filenames_given = false; + bool force = false; + bool ignore_trailing = true; + bool keep_input_files = false; + bool stdin_used = false; + bool recompress = false; + bool to_stdout = false; + struct Pretty_print pp; + + const struct ap_Option options[] = + { + { '0', "fast", ap_no }, + { '1', 0, ap_no }, + { '2', 0, ap_no }, + { '3', 0, ap_no }, + { '4', 0, ap_no }, + { '5', 0, ap_no }, + { '6', 0, ap_no }, + { '7', 0, ap_no }, + { '8', 0, ap_no }, + { '9', "best", ap_no }, + { 'a', "trailing-error", ap_no }, + { 'b', "member-size", ap_yes }, + { 'c', "stdout", ap_no }, + { 'd', "decompress", ap_no }, + { 'f', "force", ap_no }, + { 'F', "recompress", ap_no }, + { 'h', "help", ap_no }, + { 'k', "keep", ap_no }, + { 'm', "match-length", ap_yes }, + { 'n', "threads", ap_yes }, + { 'o', "output", ap_yes }, + { 'q', "quiet", ap_no }, + { 's', "dictionary-size", ap_yes }, + { 'S', "volume-size", ap_yes }, + { 't', "test", ap_no }, + { 'v', "verbose", ap_no }, + { 'V', "version", ap_no }, + { 0 , 0, ap_no } }; + + struct Arg_parser parser; + + invocation_name = argv[0]; + 
+ if( LZ_version()[0] != LZ_version_string[0] ) + internal_error( "bad library version." ); + if( strcmp( PROGVERSION, LZ_version_string ) != 0 ) + internal_error( "bad library version_string." ); + + if( !ap_init( &parser, argc, argv, options, 0 ) ) + { show_error( "Not enough memory.", 0, false ); return 1; } + if( ap_error( &parser ) ) /* bad option */ + { show_error( ap_error( &parser ), 0, true ); return 1; } + + for( ; argind < ap_arguments( &parser ); ++argind ) + { + const int code = ap_code( &parser, argind ); + const char * const arg = ap_argument( &parser, argind ); + if( !code ) break; /* no more options */ + switch( code ) + { + case '0': case '1': case '2': case '3': case '4': + case '5': case '6': case '7': case '8': case '9': + encoder_options = option_mapping[code-'0']; break; + case 'a': ignore_trailing = false; break; + case 'b': member_size = getnum( arg, 100000, max_member_size ); break; + case 'c': to_stdout = true; break; + case 'd': program_mode = m_decompress; break; + case 'f': force = true; break; + case 'F': recompress = true; break; + case 'h': show_help(); return 0; + case 'k': keep_input_files = true; break; + case 'm': encoder_options.match_len_limit = + getnum( arg, LZ_min_match_len_limit(), + LZ_max_match_len_limit() ); break; + case 'n': break; + case 'o': default_output_filename = arg; break; + case 'q': verbosity = -1; break; + case 's': encoder_options.dictionary_size = get_dict_size( arg ); + break; + case 'S': volume_size = getnum( arg, 100000, max_volume_size ); break; + case 't': program_mode = m_test; break; + case 'v': if( verbosity < 4 ) ++verbosity; break; + case 'V': show_version(); return 0; + default : internal_error( "uncaught option." 
); + } + } /* end process options */ + +#if defined(__MSVCRT__) || defined(__OS2__) + setmode( STDIN_FILENO, O_BINARY ); + setmode( STDOUT_FILENO, O_BINARY ); +#endif + + if( program_mode == m_test ) + outfd = -1; + + num_filenames = max( 1, ap_arguments( &parser ) - argind ); + filenames = resize_buffer( filenames, num_filenames * sizeof filenames[0] ); + filenames[0] = "-"; + + for( i = 0; argind + i < ap_arguments( &parser ); ++i ) + { + filenames[i] = ap_argument( &parser, argind + i ); + if( strcmp( filenames[i], "-" ) != 0 ) filenames_given = true; + } + + if( !to_stdout && program_mode != m_test && + ( filenames_given || default_output_filename[0] ) ) + set_signals(); + + Pp_init( &pp, filenames, num_filenames, verbosity ); + + output_filename = resize_buffer( output_filename, 1 ); + for( i = 0; i < num_filenames; ++i ) + { + int tmp; + struct stat in_stats; + const struct stat * in_statsp; + output_filename[0] = 0; + + if( !filenames[i][0] || strcmp( filenames[i], "-" ) == 0 ) + { + if( stdin_used ) continue; else stdin_used = true; + input_filename = ""; + infd = STDIN_FILENO; + if( program_mode != m_test ) + { + if( to_stdout || !default_output_filename[0] ) + outfd = STDOUT_FILENO; + else + { + if( program_mode == m_compress ) + set_c_outname( default_output_filename, volume_size > 0 ); + else + { + output_filename = resize_buffer( output_filename, + strlen( default_output_filename ) + 1 ); + strcpy( output_filename, default_output_filename ); + } + if( !open_outstream( force, true ) ) + { + if( retval < 1 ) retval = 1; + close( infd ); infd = -1; + continue; + } + } + } + } + else + { + const int eindex = extension_index( filenames[i] ); + input_filename = filenames[i]; + infd = open_instream( input_filename, &in_stats, program_mode, + eindex, recompress, to_stdout ); + if( infd < 0 ) { if( retval < 1 ) retval = 1; continue; } + if( program_mode != m_test ) + { + if( to_stdout ) outfd = STDOUT_FILENO; + else + { + if( program_mode == m_compress ) + 
set_c_outname( input_filename, volume_size > 0 ); + else set_d_outname( input_filename, eindex ); + if( !open_outstream( force, false ) ) + { + if( retval < 1 ) retval = 1; + close( infd ); infd = -1; + continue; + } + } + } + } + + if( !check_tty( infd, program_mode ) ) + { + if( retval < 1 ) retval = 1; + cleanup_and_fail( retval ); + } + + in_statsp = input_filename[0] ? &in_stats : 0; + Pp_set_name( &pp, input_filename ); + if( program_mode == m_compress ) + tmp = compress( member_size, volume_size, infd, &encoder_options, &pp, + in_statsp ); + else + tmp = decompress( infd, &pp, ignore_trailing, program_mode == m_test ); + if( tmp > retval ) retval = tmp; + if( tmp && program_mode != m_test ) cleanup_and_fail( retval ); + + if( delete_output_on_interrupt ) + close_and_set_permissions( in_statsp ); + if( input_filename[0] ) + { + close( infd ); infd = -1; + if( !keep_input_files && !to_stdout && program_mode != m_test ) + remove( input_filename ); + } + } + if( outfd >= 0 && close( outfd ) != 0 ) + { + show_error( "Can't close stdout", errno, false ); + if( retval < 1 ) retval = 1; + } + free( output_filename ); + free( filenames ); + ap_free( &parser ); + return retval; + } diff --git a/minilzip.c b/minilzip.c deleted file mode 100644 index 733506c..0000000 --- a/minilzip.c +++ /dev/null @@ -1,1306 +0,0 @@ -/* Minilzip - Test program for the library lzlib - Copyright (C) 2009-2025 Antonio Diaz Diaz. - - This program is free software: you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation, either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. 
- - You should have received a copy of the GNU General Public License - along with this program. If not, see . -*/ -/* - Exit status: 0 for a normal exit, 1 for environmental problems - (file not found, invalid command-line options, I/O errors, etc), 2 to - indicate a corrupt or invalid input file, 3 for an internal consistency - error (e.g., bug) which caused minilzip to panic. -*/ - -#define _FILE_OFFSET_BITS 64 - -#include -#include -#include -#include /* CHAR_BIT, SSIZE_MAX */ -#include -#include -#include /* SIZE_MAX */ -#include -#include -#include -#include -#include -#include -#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__ -#include -#if defined __MSVCRT__ -#define fchmod(x,y) 0 -#define fchown(x,y,z) 0 -#define strtoull strtoul -#define SIGHUP SIGTERM -#define S_ISSOCK(x) 0 -#ifndef S_IRGRP -#define S_IRGRP 0 -#define S_IWGRP 0 -#define S_IROTH 0 -#define S_IWOTH 0 -#endif -#endif -#if defined __DJGPP__ -#define S_ISSOCK(x) 0 -#define S_ISVTX 0 -#endif -#endif - -#include "carg_parser.h" -#include "lzlib.h" - -#ifndef O_BINARY -#define O_BINARY 0 -#endif - -#if CHAR_BIT != 8 -#error "Environments where CHAR_BIT != 8 are not supported." -#endif - -#if ( defined SIZE_MAX && SIZE_MAX < UINT_MAX ) || \ - ( defined SSIZE_MAX && SSIZE_MAX < INT_MAX ) -#error "Environments where 'size_t' is narrower than 'int' are not supported." -#endif - -#ifndef max - #define max(x,y) ((x) >= (y) ? (x) : (y)) -#endif -#ifndef min - #define min(x,y) ((x) <= (y) ? 
(x) : (y)) -#endif - -static void cleanup_and_fail( const int retval ); -static void show_error( const char * const msg, const int errcode, - const bool help ); -static void show_file_error( const char * const filename, - const char * const msg, const int errcode ); -static void internal_error( const char * const msg ); -static const char * const mem_msg = "Not enough memory."; - -int verbosity = 0; - -static const char * const program_name = "minilzip"; -static const char * const program_year = "2025"; -static const char * invocation_name = "minilzip"; /* default value */ - -static const struct { const char * from; const char * to; } known_extensions[] = { - { ".lz", "" }, - { ".tlz", ".tar" }, - { 0, 0 } }; - -typedef struct Lzma_options - { - int dictionary_size; /* 4 KiB .. 512 MiB */ - int match_len_limit; /* 5 .. 273 */ - } Lzma_options; - -typedef enum Mode { m_compress, m_decompress, m_test } Mode; - -/* Variables used in signal handler context. - They are not declared volatile because the handler never returns. */ -static char * output_filename = 0; -static int outfd = -1; -static bool delete_output_on_interrupt = false; - - -static void show_help( void ) - { - printf( "Minilzip is a test program for the compression library lzlib. Minilzip is\n" - "not intended to be installed because lzip has more features, but minilzip is\n" - "well tested and you can use it as your main compressor if so you wish.\n" - "\nLzip is a lossless data compressor with a user interface similar to the one\n" - "of gzip or bzip2. Lzip uses a simplified form of LZMA (Lempel-Ziv-Markov\n" - "chain-Algorithm) designed to achieve complete interoperability between\n" - "implementations. The maximum dictionary size is 512 MiB so that any lzip\n" - "file can be decompressed on 32-bit machines. Lzip provides accurate and\n" - "robust 3-factor integrity checking. 'lzip -0' compresses about as fast as\n" - "gzip, while 'lzip -9' compresses most files more than bzip2. 
Decompression\n" - "speed is intermediate between gzip and bzip2. Lzip provides better data\n" - "recovery capabilities than gzip and bzip2. Lzip has been designed, written,\n" - "and tested with great care to replace gzip and bzip2 as general-purpose\n" - "compressed format for Unix-like systems.\n" - "\nUsage: %s [options] [files]\n", invocation_name ); - printf( "\nOptions:\n" - " -h, --help display this help and exit\n" - " -V, --version output version information and exit\n" - " -a, --trailing-error exit with error status if trailing data\n" - " -b, --member-size= set member size limit of multimember files\n" - " -c, --stdout write to standard output, keep input files\n" - " -d, --decompress decompress, test compressed file integrity\n" - " -f, --force overwrite existing output files\n" - " -F, --recompress force re-compression of compressed files\n" - " -k, --keep keep (don't delete) input files\n" - " -m, --match-length= set match length limit in bytes [36]\n" - " -o, --output= write to , keep input files\n" - " -q, --quiet suppress all messages\n" - " -s, --dictionary-size= set dictionary size limit in bytes [8 MiB]\n" - " -S, --volume-size= set volume size limit in bytes\n" - " -t, --test test compressed file integrity\n" - " -v, --verbose be verbose (a 2nd -v gives more)\n" - " -0 .. 
-9 set compression level [default 6]\n" - " --fast alias for -0\n" - " --best alias for -9\n" - " --loose-trailing allow trailing data seeming corrupt header\n" - " --check-lib compare version of lzlib.h with liblz.{a,so}\n" - "\nIf no file names are given, or if a file is '-', minilzip compresses or\n" - "decompresses from standard input to standard output.\n" - "Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,\n" - "Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...\n" - "Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12 to\n" - "2^29 bytes.\n" - "\nThe bidimensional parameter space of LZMA can't be mapped to a linear scale\n" - "optimal for all files. If your files are large, very repetitive, etc, you\n" - "may need to use the options --dictionary-size and --match-length directly\n" - "to achieve optimal performance.\n" - "\nTo extract all the files from archive 'foo.tar.lz', use the commands\n" - "'tar -xf foo.tar.lz' or 'minilzip -cd foo.tar.lz | tar -xf -'.\n" - "\nExit status: 0 for a normal exit, 1 for environmental problems\n" - "(file not found, invalid command-line options, I/O errors, etc), 2 to\n" - "indicate a corrupt or invalid input file, 3 for an internal consistency\n" - "error (e.g., bug) which caused minilzip to panic.\n" - "\nThe ideas embodied in lzlib are due to (at least) the following people:\n" - "Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the\n" - "definition of Markov chains), G.N.N. 
Martin (for the definition of range\n" - "encoding), Igor Pavlov (for putting all the above together in LZMA), and\n" - "Julian Seward (for bzip2's CLI).\n" - "\nReport bugs to lzip-bug@nongnu.org\n" - "Lzlib home page: http://www.nongnu.org/lzip/lzlib.html\n" ); - } - - -static void show_lzlib_version( void ) - { - printf( "Using lzlib %s\n", LZ_version() ); -#if !defined LZ_API_VERSION - fputs( "LZ_API_VERSION is not defined.\n", stdout ); -#elif LZ_API_VERSION >= 1012 - printf( "Using LZ_API_VERSION = %u\n", LZ_api_version() ); -#else - printf( "Compiled with LZ_API_VERSION = %u. " - "Using an unknown LZ_API_VERSION\n", LZ_API_VERSION ); -#endif - } - - -static void show_version( void ) - { - printf( "%s %s\n", program_name, PROGVERSION ); - printf( "Copyright (C) %s Antonio Diaz Diaz.\n", program_year ); - printf( "License GPLv2+: GNU GPL version 2 or later \n" - "This is free software: you are free to change and redistribute it.\n" - "There is NO WARRANTY, to the extent permitted by law.\n" ); - show_lzlib_version(); - } - - -static inline void set_retval( int * retval, const int new_val ) - { if( *retval < new_val ) *retval = new_val; } - - -static int check_lzlib_ver() /* . or .[a-z.-]* */ - { -#if defined LZ_API_VERSION && LZ_API_VERSION >= 1012 - const unsigned char * p = (unsigned char *)LZ_version_string; - unsigned major = 0, minor = 0; - while( major < 100000 && isdigit( *p ) ) - { major *= 10; major += *p - '0'; ++p; } - if( *p == '.' ) ++p; - else -out: { show_error( "Invalid LZ_version_string in lzlib.h", 0, false ); return 2; } - while( minor < 100 && isdigit( *p ) ) - { minor *= 10; minor += *p - '0'; ++p; } - if( *p && *p != '-' && *p != '.' 
&& !islower( *p ) ) goto out; - const unsigned version = major * 1000 + minor; - if( LZ_API_VERSION != version ) - { - if( verbosity >= 0 ) - fprintf( stderr, "%s: Version mismatch in lzlib.h: " - "LZ_API_VERSION = %u, should be %u.\n", - program_name, LZ_API_VERSION, version ); - return 2; - } -#endif - return 0; - } - - -static int check_lib() - { - int retval = check_lzlib_ver(); - if( strcmp( LZ_version_string, LZ_version() ) != 0 ) - { set_retval( &retval, 1 ); - if( verbosity >= 0 ) - printf( "warning: LZ_version_string != LZ_version() (%s vs %s)\n", - LZ_version_string, LZ_version() ); } -#if defined LZ_API_VERSION && LZ_API_VERSION >= 1012 - if( LZ_API_VERSION != LZ_api_version() ) - { set_retval( &retval, 1 ); - if( verbosity >= 0 ) - printf( "warning: LZ_API_VERSION != LZ_api_version() (%u vs %u)\n", - LZ_API_VERSION, LZ_api_version() ); } -#endif - if( verbosity >= 1 ) show_lzlib_version(); - return retval; - } - - -/* assure at least a minimum size for buffer 'buf' */ -static void * resize_buffer( void * buf, const unsigned min_size ) - { - if( buf ) buf = realloc( buf, min_size ); - else buf = malloc( min_size ); - if( !buf ) { show_error( mem_msg, 0, false ); cleanup_and_fail( 1 ); } - return buf; - } - - -typedef struct Pretty_print - { - const char * name; - char * padded_name; - const char * stdin_name; - unsigned longest_name; - bool first_post; - } Pretty_print; - -static void Pp_init( Pretty_print * const pp, - const char * const filenames[], const int num_filenames ) - { - pp->name = 0; - pp->padded_name = 0; - pp->stdin_name = "(stdin)"; - pp->longest_name = 0; - pp->first_post = false; - - if( verbosity <= 0 ) return; - const unsigned stdin_name_len = strlen( pp->stdin_name ); - int i; - for( i = 0; i < num_filenames; ++i ) - { - const char * const s = filenames[i]; - const unsigned len = (strcmp( s, "-" ) == 0) ? 
stdin_name_len : strlen( s ); - if( pp->longest_name < len ) pp->longest_name = len; - } - if( pp->longest_name == 0 ) pp->longest_name = stdin_name_len; - } - -void Pp_free( Pretty_print * const pp ) - { if( pp->padded_name ) { free( pp->padded_name ); pp->padded_name = 0; } } - -static void Pp_set_name( Pretty_print * const pp, const char * const filename ) - { - unsigned name_len, padded_name_len, i = 0; - - if( filename && filename[0] && strcmp( filename, "-" ) != 0 ) - pp->name = filename; - else pp->name = pp->stdin_name; - name_len = strlen( pp->name ); - padded_name_len = max( name_len, pp->longest_name ) + 4; - pp->padded_name = resize_buffer( pp->padded_name, padded_name_len + 1 ); - while( i < 2 ) pp->padded_name[i++] = ' '; - while( i < name_len + 2 ) { pp->padded_name[i] = pp->name[i-2]; ++i; } - pp->padded_name[i++] = ':'; - while( i < padded_name_len ) pp->padded_name[i++] = ' '; - pp->padded_name[i] = 0; - pp->first_post = true; - } - -static void Pp_reset( Pretty_print * const pp ) - { if( pp->name && pp->name[0] ) pp->first_post = true; } - -static void Pp_show_msg( Pretty_print * const pp, const char * const msg ) - { - if( verbosity < 0 ) return; - if( pp->first_post ) - { - pp->first_post = false; - fputs( pp->padded_name, stderr ); - if( !msg ) fflush( stderr ); - } - if( msg ) fprintf( stderr, "%s\n", msg ); - } - - -static void show_header( const unsigned dictionary_size ) - { - enum { factor = 1024, n = 3 }; - const char * const prefix[n] = { "Ki", "Mi", "Gi" }; - const char * p = ""; - const char * np = " "; - unsigned num = dictionary_size; - bool exact = num % factor == 0; - - int i; for( i = 0; i < n && ( num > 9999 || ( exact && num >= factor ) ); ++i ) - { num /= factor; if( num % factor != 0 ) exact = false; - p = prefix[i]; np = ""; } - fprintf( stderr, "dict %s%4u %sB, ", np, num, p ); - } - - -/* separate numbers of 5 or more digits in groups of 3 digits using '_' */ -static const char * format_num3( unsigned long long num ) - { - 
enum { buffers = 8, bufsize = 4 * sizeof num, n = 10 }; - const char * const si_prefix = "kMGTPEZYRQ"; - const char * const binary_prefix = "KMGTPEZYRQ"; - static char buffer[buffers][bufsize]; /* circle of static buffers for printf */ - static int current = 0; - int i; - char * const buf = buffer[current++]; current %= buffers; - char * p = buf + bufsize - 1; /* fill the buffer backwards */ - *p = 0; /* terminator */ - if( num > 9999 ) - { - char prefix = 0; /* try binary first, then si */ - for( i = 0; i < n && num != 0 && num % 1024 == 0; ++i ) - { num /= 1024; prefix = binary_prefix[i]; } - if( prefix ) *(--p) = 'i'; - else - for( i = 0; i < n && num != 0 && num % 1000 == 0; ++i ) - { num /= 1000; prefix = si_prefix[i]; } - if( prefix ) *(--p) = prefix; - } - const bool split = num >= 10000; - - for( i = 0; ; ) - { - *(--p) = num % 10 + '0'; num /= 10; if( num == 0 ) break; - if( split && ++i >= 3 ) { i = 0; *(--p) = '_'; } - } - return p; - } - - -void show_option_error( const char * const arg, const char * const msg, - const char * const option_name ) - { - if( verbosity >= 0 ) - fprintf( stderr, "%s: '%s': %s option '%s'.\n", - program_name, arg, msg, option_name ); - } - - -/* Recognized formats: k, Ki, [MGTPEZYRQ][i] */ -static unsigned long long getnum( const char * const arg, - const char * const option_name, - const unsigned long long llimit, - const unsigned long long ulimit ) - { - char * tail; - errno = 0; - unsigned long long result = strtoull( arg, &tail, 0 ); - if( tail == arg ) - { show_option_error( arg, "Bad or missing numerical argument in", - option_name ); exit( 1 ); } - - if( !errno && tail[0] ) - { - const unsigned factor = (tail[1] == 'i') ? 
1024 : 1000; - int exponent = 0; /* 0 = bad multiplier */ - int i; - switch( tail[0] ) - { - case 'Q': exponent = 10; break; - case 'R': exponent = 9; break; - case 'Y': exponent = 8; break; - case 'Z': exponent = 7; break; - case 'E': exponent = 6; break; - case 'P': exponent = 5; break; - case 'T': exponent = 4; break; - case 'G': exponent = 3; break; - case 'M': exponent = 2; break; - case 'K': if( factor == 1024 ) exponent = 1; break; - case 'k': if( factor == 1000 ) exponent = 1; break; - } - if( exponent <= 0 ) - { show_option_error( arg, "Bad multiplier in numerical argument of", - option_name ); exit( 1 ); } - for( i = 0; i < exponent; ++i ) - { - if( ulimit / factor >= result ) result *= factor; - else { errno = ERANGE; break; } - } - } - if( !errno && ( result < llimit || result > ulimit ) ) errno = ERANGE; - if( errno ) - { - if( verbosity >= 0 ) - fprintf( stderr, "%s: '%s': Value out of limits [%s,%s] in " - "option '%s'.\n", program_name, arg, format_num3( llimit ), - format_num3( ulimit ), option_name ); - exit( 1 ); - } - return result; - } - - -static int get_dict_size( const char * const arg, const char * const option_name ) - { - char * tail; - const long bits = strtol( arg, &tail, 0 ); - if( bits >= LZ_min_dictionary_bits() && - bits <= LZ_max_dictionary_bits() && *tail == 0 ) - return 1 << bits; - int dictionary_size = getnum( arg, option_name, LZ_min_dictionary_size(), - LZ_max_dictionary_size() ); - if( dictionary_size == 65535 ) ++dictionary_size; /* no fast encoder */ - return dictionary_size; - } - - -static void set_mode( Mode * const program_modep, const Mode new_mode ) - { - if( *program_modep != m_compress && *program_modep != new_mode ) - { - show_error( "Only one operation can be specified.", 0, true ); - exit( 1 ); - } - *program_modep = new_mode; - } - - -static int extension_index( const char * const name ) - { - int eindex; - for( eindex = 0; known_extensions[eindex].from; ++eindex ) - { - const char * const ext = 
known_extensions[eindex].from; - const unsigned name_len = strlen( name ); - const unsigned ext_len = strlen( ext ); - if( name_len > ext_len && - strncmp( name + name_len - ext_len, ext, ext_len ) == 0 ) - return eindex; - } - return -1; - } - - -static void set_c_outname( const char * const name, const bool force_ext, - const bool multifile ) - { - output_filename = resize_buffer( output_filename, strlen( name ) + 5 + - strlen( known_extensions[0].from ) + 1 ); - strcpy( output_filename, name ); - if( multifile ) strcat( output_filename, "00001" ); - if( force_ext || multifile ) - strcat( output_filename, known_extensions[0].from ); - } - - -static void set_d_outname( const char * const name, const int eindex ) - { - const unsigned name_len = strlen( name ); - if( eindex >= 0 ) - { - const char * const from = known_extensions[eindex].from; - const unsigned from_len = strlen( from ); - if( name_len > from_len ) - { - output_filename = resize_buffer( output_filename, name_len + - strlen( known_extensions[eindex].to ) + 1 ); - strcpy( output_filename, name ); - strcpy( output_filename + name_len - from_len, known_extensions[eindex].to ); - return; - } - } - output_filename = resize_buffer( output_filename, name_len + 4 + 1 ); - strcpy( output_filename, name ); - strcat( output_filename, ".out" ); - if( verbosity >= 1 ) - fprintf( stderr, "%s: %s: Can't guess original name -- using '%s'\n", - program_name, name, output_filename ); - } - - -static int open_instream( const char * const name, struct stat * const in_statsp, - const Mode program_mode, const int eindex, - const bool one_to_one, const bool recompress ) - { - if( program_mode == m_compress && !recompress && eindex >= 0 ) - { - if( verbosity >= 0 ) - fprintf( stderr, "%s: %s: Input file already has '%s' suffix, ignored.\n", - program_name, name, known_extensions[eindex].from ); - return -1; - } - int infd = open( name, O_RDONLY | O_BINARY ); - if( infd < 0 ) - show_file_error( name, "Can't open input file", 
errno ); - else - { - const int i = fstat( infd, in_statsp ); - const mode_t mode = in_statsp->st_mode; - const bool can_read = i == 0 && - ( S_ISBLK( mode ) || S_ISCHR( mode ) || - S_ISFIFO( mode ) || S_ISSOCK( mode ) ); - if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || one_to_one ) ) ) - { - if( verbosity >= 0 ) - fprintf( stderr, "%s: %s: Input file is not a regular file%s.\n", - program_name, name, ( can_read && one_to_one ) ? - ",\n and neither '-c' nor '-o' were specified" : "" ); - close( infd ); - infd = -1; - } - } - return infd; - } - - -static bool open_outstream( const bool force, const bool protect ) - { - const mode_t usr_rw = S_IRUSR | S_IWUSR; - const mode_t all_rw = usr_rw | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH; - const mode_t outfd_mode = protect ? usr_rw : all_rw; - int flags = O_CREAT | O_WRONLY | O_BINARY; - if( force ) flags |= O_TRUNC; else flags |= O_EXCL; - - outfd = open( output_filename, flags, outfd_mode ); - if( outfd >= 0 ) delete_output_on_interrupt = true; - else if( errno == EEXIST ) - show_file_error( output_filename, - "Output file already exists, skipping.", 0 ); - else - show_file_error( output_filename, "Can't create output file", errno ); - return outfd >= 0; - } - - -static void set_signals( void (*action)(int) ) - { - signal( SIGHUP, action ); - signal( SIGINT, action ); - signal( SIGTERM, action ); - } - - -static void cleanup_and_fail( const int retval ) - { - set_signals( SIG_IGN ); /* ignore signals */ - if( delete_output_on_interrupt ) - { - delete_output_on_interrupt = false; - show_file_error( output_filename, "Deleting output file, if it exists.", 0 ); - if( outfd >= 0 ) { close( outfd ); outfd = -1; } - if( remove( output_filename ) != 0 && errno != ENOENT ) - show_error( "warning: deletion of output file failed", errno, false ); - } - exit( retval ); - } - - -static void signal_handler( int sig ) - { - if( sig ) {} /* keep compiler happy */ - show_error( "Control-C or similar caught, quitting.", 0, false ); - 
cleanup_and_fail( 1 ); - } - - -static bool check_tty_in( const char * const input_filename, const int infd, - const Mode program_mode, int * const retval ) - { - if( ( program_mode == m_decompress || program_mode == m_test ) && - isatty( infd ) ) /* for example /dev/tty */ - { show_file_error( input_filename, - "I won't read compressed data from a terminal.", 0 ); - close( infd ); set_retval( retval, 2 ); - if( program_mode != m_test ) cleanup_and_fail( *retval ); - return false; } - return true; - } - -static bool check_tty_out( const Mode program_mode ) - { - if( program_mode == m_compress && isatty( outfd ) ) - { show_file_error( output_filename[0] ? - output_filename : "(stdout)", - "I won't write compressed data to a terminal.", 0 ); - return false; } - return true; - } - - -/* Set permissions, owner, and times. */ -static void close_and_set_permissions( const struct stat * const in_statsp ) - { - bool warning = false; - if( in_statsp ) - { - const mode_t mode = in_statsp->st_mode; - /* fchown in many cases returns with EPERM, which can be safely ignored. */ - if( fchown( outfd, in_statsp->st_uid, in_statsp->st_gid ) == 0 ) - { if( fchmod( outfd, mode ) != 0 ) warning = true; } - else - if( errno != EPERM || - fchmod( outfd, mode & ~( S_ISUID | S_ISGID | S_ISVTX ) ) != 0 ) - warning = true; - } - if( close( outfd ) != 0 ) - { show_file_error( output_filename, "Error closing output file", errno ); - cleanup_and_fail( 1 ); } - outfd = -1; - delete_output_on_interrupt = false; - if( in_statsp ) - { - struct utimbuf t; - t.actime = in_statsp->st_atime; - t.modtime = in_statsp->st_mtime; - if( utime( output_filename, &t ) != 0 ) warning = true; - } - if( warning && verbosity >= 1 ) - show_file_error( output_filename, - "warning: can't change output file attributes", errno ); - } - - -/* Return the number of bytes really read. - If (value returned < size) and (errno == 0), means EOF was reached. 
-*/ -static int readblock( const int fd, uint8_t * const buf, const int size ) - { - int sz = 0; - errno = 0; - while( sz < size ) - { - const int n = read( fd, buf + sz, size - sz ); - if( n > 0 ) sz += n; - else if( n == 0 ) break; /* EOF */ - else if( errno != EINTR ) break; - errno = 0; - } - return sz; - } - - -/* Return the number of bytes really written. - If (value returned < size), it is always an error. -*/ -static int writeblock( const int fd, const uint8_t * const buf, const int size ) - { - int sz = 0; - errno = 0; - while( sz < size ) - { - const int n = write( fd, buf + sz, size - sz ); - if( n > 0 ) sz += n; - else if( n < 0 && errno != EINTR ) break; - errno = 0; - } - return sz; - } - - -static bool next_filename( void ) - { - const unsigned name_len = strlen( output_filename ); - const unsigned ext_len = strlen( known_extensions[0].from ); - int i, j; - if( name_len >= ext_len + 5 ) /* "*00001.lz" */ - for( i = name_len - ext_len - 1, j = 0; j < 5; --i, ++j ) - { - if( output_filename[i] < '9' ) { ++output_filename[i]; return true; } - else output_filename[i] = '0'; - } - return false; - } - - -static int do_compress( LZ_Encoder * const encoder, - const unsigned long long member_size, - const unsigned long long volume_size, const int infd, - Pretty_print * const pp, - const struct stat * const in_statsp ) - { - unsigned long long partial_volume_size = 0; - enum { buffer_size = 65536 }; - uint8_t buffer[buffer_size]; /* read/write buffer */ - if( verbosity >= 1 ) Pp_show_msg( pp, 0 ); - - while( true ) - { - int in_size = 0; - while( LZ_compress_write_size( encoder ) > 0 ) - { - const int size = min( LZ_compress_write_size( encoder ), buffer_size ); - const int rd = readblock( infd, buffer, size ); - if( rd != size && errno ) - { - Pp_show_msg( pp, 0 ); show_error( "Read error", errno, false ); - return 1; - } - if( rd > 0 && rd != LZ_compress_write( encoder, buffer, rd ) ) - internal_error( "library error (LZ_compress_write)." 
); - if( rd < size ) LZ_compress_finish( encoder ); -/* else LZ_compress_sync_flush( encoder ); */ - in_size += rd; - } - const int out_size = LZ_compress_read( encoder, buffer, buffer_size ); - if( out_size < 0 ) - { - Pp_show_msg( pp, 0 ); - if( verbosity >= 0 ) - fprintf( stderr, "%s: LZ_compress_read error: %s\n", - program_name, LZ_strerror( LZ_compress_errno( encoder ) ) ); - return 1; - } - else if( out_size > 0 ) - { - const int wr = writeblock( outfd, buffer, out_size ); - if( wr != out_size ) - { - Pp_show_msg( pp, 0 ); show_error( "Write error", errno, false ); - return 1; - } - } - else if( in_size == 0 ) - internal_error( "library error (LZ_compress_read)." ); - if( LZ_compress_member_finished( encoder ) ) - { - unsigned long long size; - if( LZ_compress_finished( encoder ) == 1 ) break; - if( volume_size > 0 ) - { - partial_volume_size += LZ_compress_member_position( encoder ); - if( partial_volume_size >= volume_size - LZ_min_dictionary_size() ) - { - partial_volume_size = 0; - if( delete_output_on_interrupt ) - { - close_and_set_permissions( in_statsp ); - if( !next_filename() ) - { Pp_show_msg( pp, "Too many volume files." 
); return 1; } - if( !open_outstream( true, in_statsp ) ) return 1; - } - } - size = min( member_size, volume_size - partial_volume_size ); - } - else - size = member_size; - if( LZ_compress_restart_member( encoder, size ) < 0 ) - { - Pp_show_msg( pp, 0 ); - if( verbosity >= 0 ) - fprintf( stderr, "%s: LZ_compress_restart_member error: %s\n", - program_name, LZ_strerror( LZ_compress_errno( encoder ) ) ); - return 1; - } - } - } - - if( verbosity >= 1 ) - { - const unsigned long long in_size = LZ_compress_total_in_size( encoder ); - const unsigned long long out_size = LZ_compress_total_out_size( encoder ); - if( in_size == 0 || out_size == 0 ) - fputs( " no data compressed.\n", stderr ); - else - fprintf( stderr, "%6.3f:1, %5.2f%% ratio, %5.2f%% saved, " - "%llu in, %llu out.\n", - (double)in_size / out_size, - ( 100.0 * out_size ) / in_size, - 100.0 - ( ( 100.0 * out_size ) / in_size ), - in_size, out_size ); - } - return 0; - } - - -static int compress( const unsigned long long member_size, - const unsigned long long volume_size, const int infd, - const Lzma_options * const encoder_options, - Pretty_print * const pp, - const struct stat * const in_statsp ) - { - LZ_Encoder * const encoder = - LZ_compress_open( encoder_options->dictionary_size, - encoder_options->match_len_limit, ( volume_size > 0 ) ? - min( member_size, volume_size ) : member_size ); - int retval; - - if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ) - { - if( !encoder || LZ_compress_errno( encoder ) == LZ_mem_error ) - Pp_show_msg( pp, "Not enough memory. Try a smaller dictionary size." ); - else - internal_error( "invalid argument to encoder." 
); - retval = 1; - } - else retval = do_compress( encoder, member_size, volume_size, - infd, pp, in_statsp ); - LZ_compress_close( encoder ); - return retval; - } - - -static int do_decompress( LZ_Decoder * const decoder, const int infd, - Pretty_print * const pp, const bool from_stdin, - const bool ignore_trailing, const bool loose_trailing, - const bool testing ) - { - enum { buffer_size = 65536 }; - uint8_t buffer[buffer_size]; /* read/write buffer */ - unsigned long long total_in = 0; /* to detect library stall */ - bool first_member; - bool empty = false, multi = false; - - for( first_member = true; ; ) - { - const int max_in_size = - min( LZ_decompress_write_size( decoder ), buffer_size ); - int in_size = 0, out_size = 0; - if( max_in_size > 0 ) - { - in_size = readblock( infd, buffer, max_in_size ); - if( in_size != max_in_size && errno ) - { - Pp_show_msg( pp, 0 ); show_error( "Read error", errno, false ); - return 1; - } - if( in_size > 0 && in_size != LZ_decompress_write( decoder, buffer, in_size ) ) - internal_error( "library error (LZ_decompress_write)." ); - if( in_size < max_in_size ) LZ_decompress_finish( decoder ); - } - while( true ) - { - const int rd = - LZ_decompress_read( decoder, (outfd >= 0) ? 
buffer : 0, buffer_size ); - if( rd > 0 ) - { - out_size += rd; - if( outfd >= 0 ) - { - const int wr = writeblock( outfd, buffer, rd ); - if( wr != rd ) - { - Pp_show_msg( pp, 0 ); show_error( "Write error", errno, false ); - return 1; - } - } - } - else if( rd < 0 ) { out_size = rd; break; } - if( LZ_decompress_member_finished( decoder ) == 1 ) - { - const unsigned long long data_size = LZ_decompress_data_position( decoder ); - if( !from_stdin ) - { multi = !first_member; if( data_size == 0 ) empty = true; } - if( verbosity >= 1 ) - { - const unsigned long long member_size = - LZ_decompress_member_position( decoder ); - if( verbosity >= 2 || ( verbosity == 1 && first_member ) ) - Pp_show_msg( pp, 0 ); - if( verbosity >= 2 ) - { - if( verbosity >= 4 ) - show_header( LZ_decompress_dictionary_size( decoder ) ); - if( data_size == 0 || member_size == 0 ) - fputs( "no data compressed. ", stderr ); - else - fprintf( stderr, "%6.3f:1, %5.2f%% ratio, %5.2f%% saved. ", - (double)data_size / member_size, - ( 100.0 * member_size ) / data_size, - 100.0 - ( ( 100.0 * member_size ) / data_size ) ); - if( verbosity >= 4 ) - fprintf( stderr, "CRC %08X, ", LZ_decompress_data_crc( decoder ) ); - if( verbosity >= 3 ) - fprintf( stderr, "%9llu out, %8llu in. ", data_size, member_size ); - fputs( testing ? "ok\n" : "done\n", stderr ); Pp_reset( pp ); - } - } - first_member = false; /* member decompressed successfully */ - } - if( rd <= 0 ) break; - } - if( out_size < 0 || ( first_member && out_size == 0 ) ) - { - const unsigned long long member_pos = LZ_decompress_member_position( decoder ); - const LZ_Errno lz_errno = LZ_decompress_errno( decoder ); - if( lz_errno == LZ_library_error ) - internal_error( "library error (LZ_decompress_read)." ); - if( member_pos <= 6 ) - { - if( lz_errno == LZ_unexpected_eof ) - { - if( first_member ) - show_file_error( pp->name, "File ends unexpectedly at member header.", 0 ); - else - Pp_show_msg( pp, "Truncated header in multimember file." 
); - return 2; - } - else if( lz_errno == LZ_data_error ) - { - if( member_pos == 4 ) - { if( verbosity >= 0 ) - { Pp_show_msg( pp, 0 ); - fprintf( stderr, "Version %d member format not supported.\n", - LZ_decompress_member_version( decoder ) ); } } - else if( member_pos == 5 ) - Pp_show_msg( pp, "Invalid dictionary size in member header." ); - else if( member_pos == 6 ) - Pp_show_msg( pp, "Nonzero first LZMA byte." ); - else if( first_member ) /* for lzlib older than 1.10 */ - Pp_show_msg( pp, "Bad version or dictionary size in member header." ); - else if( !loose_trailing ) - Pp_show_msg( pp, "Corrupt header in multimember file." ); - else if( !ignore_trailing ) - Pp_show_msg( pp, "Trailing data not allowed." ); - else break; /* trailing data */ - return 2; - } - } - if( lz_errno == LZ_header_error ) - { - if( first_member ) - show_file_error( pp->name, - "Bad magic number (file not in lzip format).", 0 ); - else if( !ignore_trailing ) - Pp_show_msg( pp, "Trailing data not allowed." ); - else break; /* trailing data */ - return 2; - } - if( lz_errno == LZ_mem_error ) { Pp_show_msg( pp, mem_msg ); return 1; } - if( verbosity >= 0 ) - { - Pp_show_msg( pp, 0 ); - fprintf( stderr, "%s at pos %llu\n", ( lz_errno == LZ_unexpected_eof ) ? - "File ends unexpectedly" : "Decoder error", - LZ_decompress_total_in_size( decoder ) ); - } - return 2; - } - if( LZ_decompress_finished( decoder ) == 1 ) break; - if( in_size == 0 && out_size == 0 ) - { - const unsigned long long size = LZ_decompress_total_in_size( decoder ); - if( total_in == size ) internal_error( "library error (stalled)." ); - total_in = size; - } - } - if( verbosity == 1 ) fputs( testing ? 
"ok\n" : "done\n", stderr ); - if( empty && multi ) - { show_file_error( pp->name, "Empty member not allowed.", 0 ); return 2; } - return 0; - } - - -static int decompress( const int infd, Pretty_print * const pp, - const bool from_stdin, const bool ignore_trailing, - const bool loose_trailing, const bool testing ) - { - LZ_Decoder * const decoder = LZ_decompress_open(); - int retval; - - if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) - { Pp_show_msg( pp, mem_msg ); retval = 1; } - else retval = do_decompress( decoder, infd, pp, from_stdin, ignore_trailing, - loose_trailing, testing ); - LZ_decompress_close( decoder ); - return retval; - } - - -static void show_error( const char * const msg, const int errcode, - const bool help ) - { - if( verbosity < 0 ) return; - if( msg && msg[0] ) - fprintf( stderr, "%s: %s%s%s\n", program_name, msg, - ( errcode > 0 ) ? ": " : "", - ( errcode > 0 ) ? strerror( errcode ) : "" ); - if( help ) - fprintf( stderr, "Try '%s --help' for more information.\n", - invocation_name ); - } - - -static void show_file_error( const char * const filename, - const char * const msg, const int errcode ) - { - if( verbosity >= 0 ) - fprintf( stderr, "%s: %s: %s%s%s\n", program_name, filename, msg, - ( errcode > 0 ) ? ": " : "", - ( errcode > 0 ) ? strerror( errcode ) : "" ); - } - - -static void internal_error( const char * const msg ) - { - if( verbosity >= 0 ) - fprintf( stderr, "%s: internal error: %s\n", program_name, msg ); - exit( 3 ); - } - - -int main( const int argc, const char * const argv[] ) - { - /* Mapping from gzip/bzip2 style 0..9 compression levels to the - corresponding LZMA compression parameters. 
*/ - const Lzma_options option_mapping[] = - { - { 65535, 16 }, /* -0 (65535,16 chooses fast encoder) */ - { 1 << 20, 5 }, /* -1 */ - { 3 << 19, 6 }, /* -2 */ - { 1 << 21, 8 }, /* -3 */ - { 3 << 20, 12 }, /* -4 */ - { 1 << 22, 20 }, /* -5 */ - { 1 << 23, 36 }, /* -6 */ - { 1 << 24, 68 }, /* -7 */ - { 3 << 23, 132 }, /* -8 */ - { 1 << 25, 273 } }; /* -9 */ - Lzma_options encoder_options = option_mapping[6]; /* default = "-6" */ - const unsigned long long max_member_size = 0x0008000000000000ULL; /* 2 PiB */ - const unsigned long long max_volume_size = 0x4000000000000000ULL; /* 4 EiB */ - unsigned long long member_size = max_member_size; - unsigned long long volume_size = 0; - const char * default_output_filename = ""; - Mode program_mode = m_compress; - bool force = false; - bool ignore_trailing = true; - bool keep_input_files = false; - bool loose_trailing = false; - bool recompress = false; - bool to_stdout = false; - if( argc > 0 ) invocation_name = argv[0]; - - enum { opt_chk = 256, opt_lt }; - const ap_Option options[] = - { - { '0', "fast", ap_no }, - { '1', 0, ap_no }, - { '2', 0, ap_no }, - { '3', 0, ap_no }, - { '4', 0, ap_no }, - { '5', 0, ap_no }, - { '6', 0, ap_no }, - { '7', 0, ap_no }, - { '8', 0, ap_no }, - { '9', "best", ap_no }, - { 'a', "trailing-error", ap_no }, - { 'b', "member-size", ap_yes }, - { 'c', "stdout", ap_no }, - { 'd', "decompress", ap_no }, - { 'f', "force", ap_no }, - { 'F', "recompress", ap_no }, - { 'h', "help", ap_no }, - { 'k', "keep", ap_no }, - { 'm', "match-length", ap_yes }, - { 'n', "threads", ap_yes }, - { 'o', "output", ap_yes }, - { 'q', "quiet", ap_no }, - { 's', "dictionary-size", ap_yes }, - { 'S', "volume-size", ap_yes }, - { 't', "test", ap_no }, - { 'v', "verbose", ap_no }, - { 'V', "version", ap_no }, - { opt_chk, "check-lib", ap_no }, - { opt_lt, "loose-trailing", ap_no }, - { 0, 0, ap_no } }; - - /* static because valgrind complains and memory management in C sucks */ - static Arg_parser parser; - if( !ap_init( 
&parser, argc, argv, options, 0 ) ) - { show_error( mem_msg, 0, false ); return 1; } - if( ap_error( &parser ) ) /* bad option */ - { show_error( ap_error( &parser ), 0, true ); return 1; } - - int argind = 0; - for( ; argind < ap_arguments( &parser ); ++argind ) - { - const int code = ap_code( &parser, argind ); - if( !code ) break; /* no more options */ - const char * const pn = ap_parsed_name( &parser, argind ); - const char * const arg = ap_argument( &parser, argind ); - switch( code ) - { - case '0': case '1': case '2': case '3': case '4': case '5': - case '6': case '7': case '8': case '9': - encoder_options = option_mapping[code-'0']; break; - case 'a': ignore_trailing = false; break; - case 'b': member_size = getnum( arg, pn, 100000, max_member_size ); break; - case 'c': to_stdout = true; break; - case 'd': set_mode( &program_mode, m_decompress ); break; - case 'f': force = true; break; - case 'F': recompress = true; break; - case 'h': show_help(); return 0; - case 'k': keep_input_files = true; break; - case 'm': encoder_options.match_len_limit = - getnum( arg, pn, LZ_min_match_len_limit(), - LZ_max_match_len_limit() ); break; - case 'n': break; /* ignored */ - case 'o': if( strcmp( arg, "-" ) == 0 ) to_stdout = true; - else { default_output_filename = arg; } break; - case 'q': verbosity = -1; break; - case 's': encoder_options.dictionary_size = get_dict_size( arg, pn ); - break; - case 'S': volume_size = getnum( arg, pn, 100000, max_volume_size ); break; - case 't': set_mode( &program_mode, m_test ); break; - case 'v': if( verbosity < 4 ) ++verbosity; break; - case 'V': show_version(); return 0; - case opt_chk: return check_lib(); - case opt_lt: loose_trailing = true; break; - default: internal_error( "uncaught option." ); - } - } /* end process options */ - - if( strcmp( PROGVERSION, LZ_version_string ) != 0 ) - internal_error( "wrong PROGVERSION." ); -#if !defined LZ_API_VERSION || LZ_API_VERSION < 1012 -#error "lzlib 1.12 or newer needed." 
-#else - if( LZ_api_version() < 1012 ) /* minilzip passes null to LZ_decompress_read */ - { show_error( "lzlib 1.12 or newer needed. Try --check-lib.", 0, false ); - return 1; } - if( LZ_api_version() != LZ_API_VERSION ) show_error( - "warning: wrong library API version. Try --check-lib.", 0, false ); - else -#endif - if( strcmp( LZ_version_string, LZ_version() ) != 0 ) show_error( - "warning: wrong library version_string. Try --check-lib.", 0, false ); - -#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__ - setmode( STDIN_FILENO, O_BINARY ); - setmode( STDOUT_FILENO, O_BINARY ); -#endif - - static const char ** filenames = 0; - int num_filenames = max( 1, ap_arguments( &parser ) - argind ); - filenames = resize_buffer( filenames, num_filenames * sizeof filenames[0] ); - filenames[0] = "-"; - - int i; - bool filenames_given = false; - for( i = 0; argind + i < ap_arguments( &parser ); ++i ) - { - filenames[i] = ap_argument( &parser, argind + i ); - if( strcmp( filenames[i], "-" ) != 0 ) filenames_given = true; - } - - if( program_mode == m_compress ) - { - if( volume_size > 0 && !to_stdout && default_output_filename[0] && - num_filenames > 1 ) - { show_error( "Only can compress one file when using '-o' and '-S'.", - 0, true ); return 1; } - } - else volume_size = 0; - if( program_mode == m_test ) to_stdout = false; /* apply overrides */ - if( program_mode == m_test || to_stdout ) default_output_filename = ""; - - output_filename = resize_buffer( output_filename, 1 ); - output_filename[0] = 0; - if( to_stdout && program_mode != m_test ) /* check tty only once */ - { outfd = STDOUT_FILENO; if( !check_tty_out( program_mode ) ) return 1; } - else outfd = -1; - - const bool to_file = !to_stdout && program_mode != m_test && - default_output_filename[0]; - if( !to_stdout && program_mode != m_test && ( filenames_given || to_file ) ) - set_signals( signal_handler ); - - static Pretty_print pp; - Pp_init( &pp, filenames, num_filenames ); - - int failed_tests = 0; 
- int retval = 0; - const bool one_to_one = !to_stdout && program_mode != m_test && !to_file; - bool stdin_used = false; - struct stat in_stats; - for( i = 0; i < num_filenames; ++i ) - { - const char * input_filename = ""; - int infd; - const bool from_stdin = strcmp( filenames[i], "-" ) == 0; - - Pp_set_name( &pp, filenames[i] ); - if( from_stdin ) - { - if( stdin_used ) continue; else stdin_used = true; - infd = STDIN_FILENO; - if( !check_tty_in( pp.name, infd, program_mode, &retval ) ) continue; - if( one_to_one ) { outfd = STDOUT_FILENO; output_filename[0] = 0; } - } - else - { - const int eindex = extension_index( input_filename = filenames[i] ); - infd = open_instream( input_filename, &in_stats, program_mode, - eindex, one_to_one, recompress ); - if( infd < 0 ) { set_retval( &retval, 1 ); continue; } - if( !check_tty_in( pp.name, infd, program_mode, &retval ) ) continue; - if( one_to_one ) /* open outfd after checking infd */ - { - if( program_mode == m_compress ) - set_c_outname( input_filename, true, volume_size > 0 ); - else set_d_outname( input_filename, eindex ); - if( !open_outstream( force, true ) ) - { close( infd ); set_retval( &retval, 1 ); continue; } - } - } - - if( one_to_one && !check_tty_out( program_mode ) ) - { set_retval( &retval, 1 ); return retval; } /* don't delete a tty */ - - if( to_file && outfd < 0 ) /* open outfd after checking infd */ - { - if( program_mode == m_compress ) set_c_outname( default_output_filename, - false, volume_size > 0 ); - else - { output_filename = resize_buffer( output_filename, - strlen( default_output_filename ) + 1 ); - strcpy( output_filename, default_output_filename ); } - if( !open_outstream( force, false ) || !check_tty_out( program_mode ) ) - return 1; /* check tty only once and don't try to delete a tty */ - } - - const struct stat * const in_statsp = - ( input_filename[0] && one_to_one ) ? 
&in_stats : 0; - int tmp; - if( program_mode == m_compress ) - tmp = compress( member_size, volume_size, infd, &encoder_options, &pp, - in_statsp ); - else - tmp = decompress( infd, &pp, from_stdin, ignore_trailing, loose_trailing, - program_mode == m_test ); - if( close( infd ) != 0 ) - { show_file_error( pp.name, "Error closing input file", errno ); - set_retval( &tmp, 1 ); } - set_retval( &retval, tmp ); - if( tmp ) - { if( program_mode != m_test ) cleanup_and_fail( retval ); - else ++failed_tests; } - - if( delete_output_on_interrupt && one_to_one ) - close_and_set_permissions( in_statsp ); - if( input_filename[0] && !keep_input_files && one_to_one && - ( program_mode != m_compress || volume_size == 0 ) ) - remove( input_filename ); - } - if( delete_output_on_interrupt ) /* -o */ - close_and_set_permissions( ( retval == 0 && !stdin_used && - filenames_given && num_filenames == 1 ) ? &in_stats : 0 ); - else if( outfd >= 0 && close( outfd ) != 0 ) /* -c */ - { - show_error( "Error closing stdout", errno, false ); - set_retval( &retval, 1 ); - } - if( failed_tests > 0 && verbosity >= 1 && num_filenames > 1 ) - fprintf( stderr, "%s: warning: %d %s failed the test.\n", - program_name, failed_tests, - ( failed_tests == 1 ) ? "file" : "files" ); - free( output_filename ); - Pp_free( &pp ); - free( filenames ); - ap_free( &parser ); - return retval; - } diff --git a/testsuite/check.sh b/testsuite/check.sh index d4c5eff..a78f156 100755 --- a/testsuite/check.sh +++ b/testsuite/check.sh @@ -1,9 +1,9 @@ #! /bin/sh -# check script for Lzlib - Compression library for the lzip format -# Copyright (C) 2009-2025 Antonio Diaz Diaz. +# check script for Lzlib - A compression library for lzip files +# Copyright (C) 2009-2016 Antonio Diaz Diaz. # # This script is free software: you have unlimited permission -# to copy, distribute, and modify it. +# to copy, distribute and modify it. 
LC_ALL=C export LC_ALL @@ -11,7 +11,6 @@ objdir=`pwd` testdir=`cd "$1" ; pwd` LZIP="${objdir}"/minilzip BBEXAMPLE="${objdir}"/bbexample -FFEXAMPLE="${objdir}"/ffexample LZCHECK="${objdir}"/lzcheck framework_failure() { echo "failure in testing framework" ; exit 1 ; } @@ -20,471 +19,178 @@ if [ ! -f "${LZIP}" ] || [ ! -x "${LZIP}" ] ; then exit 1 fi -[ -e "${LZIP}" ] 2> /dev/null || - { +if [ -e "${LZIP}" ] 2> /dev/null ; then true +else echo "$0: a POSIX shell is required to run the tests" echo "Try bash -c \"$0 $1 $2\"" exit 1 - } +fi if [ -d tmp ] ; then rm -rf tmp ; fi mkdir tmp cd "${objdir}"/tmp || framework_failure -cp "${testdir}"/test.txt in || framework_failure +cat "${testdir}"/test.txt > in || framework_failure in_lz="${testdir}"/test.txt.lz -fox_lf="${testdir}"/fox_lf -fox_lz="${testdir}"/fox.lz -fnz_lz="${testdir}"/fox_nz.lz +test2="${testdir}"/test2.txt fail=0 -test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; } - -"${LZIP}" --check-lib # just print warning -[ $? != 2 ] || { test_failed $LINENO ; exit 2 ; } # unless bad lzlib.h printf "testing lzlib-%s..." "$2" "${LZIP}" -fkqm4 in -[ $? = 1 ] || test_failed $LINENO -[ ! -e in.lz ] || test_failed $LINENO +if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi "${LZIP}" -fkqm274 in -[ $? = 1 ] || test_failed $LINENO -[ ! -e in.lz ] || test_failed $LINENO -for i in bad_size -1 0 4095 513MiB 1G 1T 1P 1E 1Z 1Y 10KB ; do - "${LZIP}" -fkqs $i in - [ $? = 1 ] || test_failed $LINENO $i - [ ! -e in.lz ] || test_failed $LINENO $i -done +if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -fkqs-1 in +if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -fkqs0 in +if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -fkqs4095 in +if [ $? = 1 ] && [ ! -e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -fkqs513MiB in +if [ $? = 1 ] && [ ! 
-e in.lz ] ; then printf . ; else printf - ; fail=1 ; fi "${LZIP}" -tq in -[ $? = 2 ] || test_failed $LINENO +if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi "${LZIP}" -tq < in -[ $? = 2 ] || test_failed $LINENO +if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi "${LZIP}" -cdq in -[ $? = 2 ] || test_failed $LINENO +if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi "${LZIP}" -cdq < in -[ $? = 2 ] || test_failed $LINENO -"${LZIP}" -dq -o in < "${in_lz}" -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" -dq -o in "${in_lz}" -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" -dq -o out nx_file.lz -[ $? = 1 ] || test_failed $LINENO -[ ! -e out ] || test_failed $LINENO -"${LZIP}" -q -o out.lz nx_file -[ $? = 1 ] || test_failed $LINENO -[ ! -e out.lz ] || test_failed $LINENO -"${LZIP}" -qf -S100k -o out in in # only one file with -o and -S -[ $? = 1 ] || test_failed $LINENO -{ [ ! -e out ] && [ ! -e out.lz ] ; } || test_failed $LINENO -# these are for code coverage -"${LZIP}" -cdt "${in_lz}" 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" -t -- nx_file.lz 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" -t "" < /dev/null 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" --help > /dev/null || test_failed $LINENO -"${LZIP}" -n1 -V > /dev/null || test_failed $LINENO -"${LZIP}" -m 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" -z 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" --bad_option 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" --t 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" --test=2 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" --output= 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" --output 2> /dev/null -[ $? = 1 ] || test_failed $LINENO -printf "LZIP\001-.............................." | "${LZIP}" -t 2> /dev/null -printf "LZIP\002-.............................." | "${LZIP}" -t 2> /dev/null -printf "LZIP\001+.............................." 
| "${LZIP}" -t 2> /dev/null +if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi +dd if="${in_lz}" bs=1 count=6 2> /dev/null | "${LZIP}" -tq +if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi +dd if="${in_lz}" bs=1 count=20 2> /dev/null | "${LZIP}" -tq +if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi printf "\ntesting decompression..." -for i in "${in_lz}" "${testdir}"/test_sync.lz ; do - "${LZIP}" -t "$i" || test_failed $LINENO "$i" - "${LZIP}" -d "$i" -o out || test_failed $LINENO "$i" - cmp in out || test_failed $LINENO "$i" - "${LZIP}" -cd "$i" > out || test_failed $LINENO "$i" - cmp in out || test_failed $LINENO "$i" - "${LZIP}" -d "$i" -o - > out || test_failed $LINENO "$i" - cmp in out || test_failed $LINENO "$i" - "${LZIP}" -d < "$i" > out || test_failed $LINENO "$i" - cmp in out || test_failed $LINENO "$i" - rm -f out || framework_failure -done +"${LZIP}" -t "${in_lz}" +if [ $? = 0 ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -cd "${in_lz}" > copy || fail=1 +cmp in copy || fail=1 +printf . -cp "${in_lz}" out.lz || framework_failure -"${LZIP}" -dk out.lz || test_failed $LINENO -cmp in out || test_failed $LINENO -rm -f out || framework_failure -"${LZIP}" -cd "${fox_lz}" > fox || test_failed $LINENO -cp fox copy || framework_failure -cp "${in_lz}" copy.lz || framework_failure -"${LZIP}" -d copy.lz out.lz 2> /dev/null # skip copy, decompress out -[ $? = 1 ] || test_failed $LINENO -[ ! -e out.lz ] || test_failed $LINENO -cmp fox copy || test_failed $LINENO -cmp in out || test_failed $LINENO -"${LZIP}" -df copy.lz || test_failed $LINENO -[ ! -e copy.lz ] || test_failed $LINENO -cmp in copy || test_failed $LINENO -rm -f copy out || framework_failure +"${LZIP}" -t "${testdir}"/test_sync.lz +if [ $? = 0 ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -cd "${testdir}"/test_sync.lz > copy || fail=1 +cmp in copy || fail=1 +printf . 
-cp "${in_lz}" out.lz || framework_failure -"${LZIP}" -d -S100k out.lz || test_failed $LINENO # ignore -S -[ ! -e out.lz ] || test_failed $LINENO -cmp in out || test_failed $LINENO +rm -f copy +cat "${in_lz}" > copy.lz || framework_failure +"${LZIP}" -dk copy.lz || fail=1 +cmp in copy || fail=1 +printf "to be overwritten" > copy || framework_failure +"${LZIP}" -dq copy.lz +if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -df copy.lz +if [ $? = 0 ] && [ ! -e copy.lz ] && cmp in copy ; then + printf . ; else printf - ; fail=1 ; fi -printf "to be overwritten" > out || framework_failure -"${LZIP}" -df -o out < "${in_lz}" || test_failed $LINENO -cmp in out || test_failed $LINENO -"${LZIP}" -d -o ./- "${in_lz}" || test_failed $LINENO -cmp in ./- || test_failed $LINENO -rm -f ./- || framework_failure -"${LZIP}" -d -o ./- < "${in_lz}" || test_failed $LINENO -cmp in ./- || test_failed $LINENO -rm -f ./- || framework_failure +printf "to be overwritten" > copy || framework_failure +"${LZIP}" -df -o copy < "${in_lz}" || fail=1 +cmp in copy || fail=1 +printf . -cp "${in_lz}" anyothername || framework_failure -"${LZIP}" -dv - anyothername - < "${in_lz}" > out 2> /dev/null || - test_failed $LINENO -cmp in out || test_failed $LINENO -cmp in anyothername.out || test_failed $LINENO -rm -f anyothername.out || framework_failure +rm -f copy +"${LZIP}" -s16 < in > anyothername || fail=1 +"${LZIP}" -d -o copy - anyothername - < "${in_lz}" +if [ $? = 0 ] && cmp in copy && cmp in anyothername.out ; then + printf . ; else printf - ; fail=1 ; fi +rm -f copy anyothername.out "${LZIP}" -tq in "${in_lz}" -[ $? = 2 ] || test_failed $LINENO -"${LZIP}" -tq nx_file.lz "${in_lz}" -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" -cdq in "${in_lz}" > out -[ $? = 2 ] || test_failed $LINENO -cat out in | cmp in - || test_failed $LINENO # out must be empty -"${LZIP}" -cdq nx_file.lz "${in_lz}" > out # skip nx_file, decompress in -[ $? 
= 1 ] || test_failed $LINENO -cmp in out || test_failed $LINENO -rm -f out || framework_failure -cp "${in_lz}" out.lz || framework_failure -for i in 1 2 3 4 5 6 7 ; do - printf "g" >> out.lz || framework_failure - "${LZIP}" -atvvvv out.lz "${in_lz}" 2> /dev/null - [ $? = 2 ] || test_failed $LINENO $i -done -"${LZIP}" -dq in out.lz -[ $? = 2 ] || test_failed $LINENO -[ -e out.lz ] || test_failed $LINENO -[ ! -e out ] || test_failed $LINENO -[ ! -e in.out ] || test_failed $LINENO -"${LZIP}" -dq nx_file.lz out.lz -[ $? = 1 ] || test_failed $LINENO -[ ! -e out.lz ] || test_failed $LINENO -[ ! -e nx_file ] || test_failed $LINENO -cmp in out || test_failed $LINENO -rm -f out || framework_failure +if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -tq foo.lz "${in_lz}" +if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -cdq in "${in_lz}" > copy +if [ $? = 2 ] && cat copy in | cmp in - ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -cdq foo.lz "${in_lz}" > copy +if [ $? = 1 ] && cmp in copy ; then printf . ; else printf - ; fail=1 ; fi +rm -f copy +cat "${in_lz}" > copy.lz || framework_failure +"${LZIP}" -dq in copy.lz +if [ $? = 2 ] && [ -e copy.lz ] && [ ! -e copy ] && [ ! -e in.out ] ; then + printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -dq foo.lz copy.lz +if [ $? = 1 ] && [ ! -e copy.lz ] && [ ! -e foo ] && cmp in copy ; then + printf . ; else printf - ; fail=1 ; fi cat in in > in2 || framework_failure -"${LZIP}" -t "${in_lz}" "${in_lz}" || test_failed $LINENO -"${LZIP}" -cd "${in_lz}" "${in_lz}" -o out > out2 || test_failed $LINENO -[ ! 
-e out ] || test_failed $LINENO # override -o -cmp in2 out2 || test_failed $LINENO -rm -f out2 || framework_failure -"${LZIP}" -d "${in_lz}" "${in_lz}" -o out2 || test_failed $LINENO -cmp in2 out2 || test_failed $LINENO -rm -f out2 || framework_failure +"${LZIP}" -s16 -o copy2 < in2 || fail=1 +"${LZIP}" -t copy2.lz || fail=1 +"${LZIP}" -cd copy2.lz > copy2 || fail=1 +cmp in2 copy2 || fail=1 +printf . -cat "${in_lz}" "${in_lz}" > out2.lz || framework_failure -lines=`"${LZIP}" -tvv out2.lz 2>&1 | wc -l` || test_failed $LINENO -[ "${lines}" -eq 2 ] || test_failed $LINENO "${lines}" - -printf "\ngarbage" >> out2.lz || framework_failure -"${LZIP}" -tvvvv out2.lz 2> /dev/null || test_failed $LINENO -"${LZIP}" -atq out2.lz -[ $? = 2 ] || test_failed $LINENO -"${LZIP}" -atq < out2.lz -[ $? = 2 ] || test_failed $LINENO -"${LZIP}" -adkq out2.lz -[ $? = 2 ] || test_failed $LINENO -[ ! -e out2 ] || test_failed $LINENO -"${LZIP}" -adkq -o out2 < out2.lz -[ $? = 2 ] || test_failed $LINENO -[ ! -e out2 ] || test_failed $LINENO -printf "to be overwritten" > out2 || framework_failure -"${LZIP}" -df out2.lz || test_failed $LINENO -cmp in2 out2 || test_failed $LINENO -rm -f out2 || framework_failure - -touch empty em || framework_failure -"${LZIP}" -0 em || test_failed $LINENO -"${LZIP}" -dk em.lz || test_failed $LINENO -cmp empty em || test_failed $LINENO -cat em.lz em.lz | "${LZIP}" -t || test_failed $LINENO -cat em.lz em.lz | "${LZIP}" -d > em || test_failed $LINENO -cmp empty em || test_failed $LINENO -cat em.lz "${in_lz}" | "${LZIP}" -t || test_failed $LINENO -cat em.lz "${in_lz}" | "${LZIP}" -d > out || test_failed $LINENO -cmp in out || test_failed $LINENO -cat "${in_lz}" em.lz | "${LZIP}" -t || test_failed $LINENO -cat "${in_lz}" em.lz | "${LZIP}" -d > out || test_failed $LINENO -cmp in out || test_failed $LINENO +printf "garbage" >> copy2.lz || framework_failure +rm -f copy2 +"${LZIP}" -atq copy2.lz +if [ $? = 2 ] ; then printf . 
; else printf - ; fail=1 ; fi +"${LZIP}" -atq < copy2.lz +if [ $? = 2 ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -adkq copy2.lz +if [ $? = 2 ] && [ ! -e copy2 ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -adkq -o copy2 < copy2.lz +if [ $? = 2 ] && [ ! -e copy2 ] ; then printf . ; else printf - ; fail=1 ; fi +printf "to be overwritten" > copy2 || framework_failure +"${LZIP}" -df copy2.lz || fail=1 +cmp in2 copy2 || fail=1 +printf . printf "\ntesting compression..." -"${LZIP}" -c -0 in in in -S100k -o out3.lz > copy2.lz || test_failed $LINENO -[ ! -e out3.lz ] || test_failed $LINENO # override -o and -S -"${LZIP}" -0f in in --output=copy2.lz || test_failed $LINENO -"${LZIP}" -d copy2.lz -o out2 || test_failed $LINENO -[ -e copy2.lz ] || test_failed $LINENO -cmp in2 out2 || test_failed $LINENO -rm -f copy2.lz || framework_failure - -"${LZIP}" -cf "${in_lz}" > lzlz 2> /dev/null # /dev/null is a tty on OS/2 -[ $? = 1 ] || test_failed $LINENO -"${LZIP}" -Fvvm36 -o - -s16 "${in_lz}" > lzlz 2> /dev/null || test_failed $LINENO -"${LZIP}" -cd lzlz | "${LZIP}" -d > out || test_failed $LINENO -cmp in out || test_failed $LINENO -rm -f lzlz out || framework_failure - -"${LZIP}" -0 -o ./- in || test_failed $LINENO -"${LZIP}" -cd ./- | cmp in - || test_failed $LINENO -rm -f ./- || framework_failure -"${LZIP}" -0 -o ./- < in || test_failed $LINENO # don't add .lz -[ ! -e ./-.lz ] || test_failed $LINENO -"${LZIP}" -cd ./- | cmp in - || test_failed $LINENO -rm -f ./- || framework_failure +"${LZIP}" -cfq "${in_lz}" > out # /dev/null is a tty on OS/2 +if [ $? = 1 ] ; then printf . ; else printf - ; fail=1 ; fi +"${LZIP}" -cF -s16 "${in_lz}" > out || fail=1 +"${LZIP}" -cd out | "${LZIP}" -d > copy || fail=1 +cmp in copy || fail=1 +printf . 
 for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
- "${LZIP}" -k -$i -s16 in || test_failed $LINENO $i
- mv in.lz out.lz || test_failed $LINENO $i
- printf "garbage" >> out.lz || framework_failure
- "${LZIP}" -df out.lz || test_failed $LINENO $i
- cmp in out || test_failed $LINENO $i
-
- "${LZIP}" -$i -s16 in -c > out || test_failed $LINENO $i
- "${LZIP}" -$i -s16 in -o o_out || test_failed $LINENO $i # don't add .lz
- [ ! -e o_out.lz ] || test_failed $LINENO
- cmp out o_out || test_failed $LINENO $i
- rm -f o_out || framework_failure
- printf "g" >> out || framework_failure
- "${LZIP}" -cd out > copy || test_failed $LINENO $i
- cmp in copy || test_failed $LINENO $i
-
- "${LZIP}" -$i -s16 < in > out || test_failed $LINENO $i
- "${LZIP}" -d < out > copy || test_failed $LINENO $i
- cmp in copy || test_failed $LINENO $i
-
- rm -f out.lz || framework_failure
- printf "to be overwritten" > out || framework_failure
- "${LZIP}" -f -$i -s16 -o out < in || test_failed $LINENO $i # don't add .lz
- [ ! -e out.lz ] || test_failed $LINENO
- "${LZIP}" -df -o copy < out || test_failed $LINENO $i
- cmp in copy || test_failed $LINENO $i
+ "${LZIP}" -k -$i -s16 in || fail=1
+ mv -f in.lz copy.lz || fail=1
+ printf "garbage" >> copy.lz || fail=1
+ "${LZIP}" -df copy.lz || fail=1
+ cmp in copy || fail=1
 done
-rm -f copy out || framework_failure
+printf .
-cat in in in in in in in in > in8 || framework_failure
-"${LZIP}" -1s12 -S100k in8 || test_failed $LINENO
-"${LZIP}" -t in800001.lz in800002.lz || test_failed $LINENO
-"${LZIP}" -cd in800001.lz in800002.lz | cmp in8 - || test_failed $LINENO
-[ ! -e in800003.lz ] || test_failed $LINENO
-rm -f in800001.lz in800002.lz || framework_failure
-"${LZIP}" -1s12 -S100k -o out.lz in8 || test_failed $LINENO
-# ignore -S
-"${LZIP}" -d out.lz00001.lz out.lz00002.lz -S100k -o out || test_failed $LINENO
-cmp in8 out || test_failed $LINENO
-"${LZIP}" -t out.lz00001.lz out.lz00002.lz || test_failed $LINENO
-[ ! -e out.lz00003.lz ] || test_failed $LINENO
-rm -f out out.lz00001.lz out.lz00002.lz || framework_failure
-"${LZIP}" -1ks4Ki -b100000 in8 || test_failed $LINENO
-"${LZIP}" -t in8.lz || test_failed $LINENO
-"${LZIP}" -cd in8.lz -o out | cmp in8 - || test_failed $LINENO # override -o
-[ ! -e out ] || test_failed $LINENO
-"${LZIP}" -0 -S100k -o out < in8.lz || test_failed $LINENO
-"${LZIP}" -t out00001.lz out00002.lz || test_failed $LINENO
-"${LZIP}" -cd out00001.lz out00002.lz | cmp in8.lz - || test_failed $LINENO
-[ ! -e out00003.lz ] || test_failed $LINENO
-rm -f out00001.lz out00002.lz || framework_failure
-"${LZIP}" -1 -S100k -o out < in8.lz || test_failed $LINENO
-"${LZIP}" -t out00001.lz out00002.lz || test_failed $LINENO
-"${LZIP}" -cd out00001.lz out00002.lz | cmp in8.lz - || test_failed $LINENO
-[ ! -e out00003.lz ] || test_failed $LINENO
-rm -f out00001.lz out00002.lz || framework_failure
-"${LZIP}" -0 -F -S100k in8.lz || test_failed $LINENO
-"${LZIP}" -t in8.lz00001.lz in8.lz00002.lz || test_failed $LINENO
-"${LZIP}" -cd in8.lz00001.lz in8.lz00002.lz | cmp in8.lz - || test_failed $LINENO
-[ ! -e in8.lz00003.lz ] || test_failed $LINENO
-rm -f in8.lz00001.lz in8.lz00002.lz || framework_failure
-"${LZIP}" -0kF -b100k in8.lz || test_failed $LINENO
-"${LZIP}" -t in8.lz.lz || test_failed $LINENO
-"${LZIP}" -cd in8.lz.lz | cmp in8.lz - || test_failed $LINENO
-rm -f in8.lz in8.lz.lz || framework_failure
-
-"${BBEXAMPLE}" in || test_failed $LINENO
-"${BBEXAMPLE}" "${in_lz}" || test_failed $LINENO
-"${BBEXAMPLE}" "${fox_lf}" || test_failed $LINENO
-
-"${FFEXAMPLE}" -h > /dev/null || test_failed $LINENO
-"${FFEXAMPLE}" > /dev/null
-[ $? = 1 ] || test_failed $LINENO
-rm -f out || framework_failure
-"${FFEXAMPLE}" -b in out || test_failed $LINENO
-cmp in out || test_failed $LINENO
-"${FFEXAMPLE}" -b in | cmp in - || test_failed $LINENO
-"${FFEXAMPLE}" -b in8 | cmp in8 - || test_failed $LINENO
-"${FFEXAMPLE}" -b "${fox_lf}" | cmp "${fox_lf}" - || test_failed $LINENO
-"${FFEXAMPLE}" -d "${in_lz}" - | cmp in - || test_failed $LINENO
-"${FFEXAMPLE}" -c in | "${FFEXAMPLE}" -d | cmp in - || test_failed $LINENO
-"${FFEXAMPLE}" -m in | "${FFEXAMPLE}" -d | cmp in - || test_failed $LINENO
-"${FFEXAMPLE}" -l in | "${FFEXAMPLE}" -d | cmp in - || test_failed $LINENO
-cat "${fox_lf}" "${in_lz}" | "${FFEXAMPLE}" -r | cmp in - || test_failed $LINENO
-cat in8 "${in_lz}" | "${FFEXAMPLE}" -r | cmp in - || test_failed $LINENO
-cat "${in_lz}" "${fox_lf}" "${in_lz}" | "${FFEXAMPLE}" -r - | cmp in2 - ||
- test_failed $LINENO
-cat "${in_lz}" in8 "${in_lz}" | "${FFEXAMPLE}" -r - - | cmp in2 - ||
- test_failed $LINENO
-
-"${LZCHECK}" in || test_failed $LINENO
-"${LZCHECK}" "${in_lz}" || test_failed $LINENO
-"${LZCHECK}" "${fox_lf}" || test_failed $LINENO
-rm -f in8 || framework_failure
-
-printf "\ntesting bad input..."
-
-cat em.lz em.lz > ee.lz || framework_failure
-"${LZIP}" -t < ee.lz || test_failed $LINENO
-"${LZIP}" -d < ee.lz > em || test_failed $LINENO
-cmp empty em || test_failed $LINENO
-"${LZIP}" -tq ee.lz
-[ $? = 2 ] || test_failed $LINENO
-"${LZIP}" -dq ee.lz
-[ $? = 2 ] || test_failed $LINENO
-[ ! -e ee ] || test_failed $LINENO
-"${LZIP}" -cdq ee.lz > em
-[ $? = 2 ] || test_failed $LINENO
-cmp empty em || test_failed $LINENO
-rm -f empty em || framework_failure
-cat "${in_lz}" em.lz "${in_lz}" > inein.lz || framework_failure
-"${LZIP}" -t < inein.lz || test_failed $LINENO
-"${LZIP}" -d < inein.lz > out2 || test_failed $LINENO
-cmp in2 out2 || test_failed $LINENO
-"${LZIP}" -tq inein.lz
-[ $? = 2 ] || test_failed $LINENO
-"${LZIP}" -dq inein.lz
-[ $? = 2 ] || test_failed $LINENO
-[ ! -e inein ] || test_failed $LINENO
-"${LZIP}" -cdq inein.lz > out2
-[ $? = 2 ] || test_failed $LINENO
-cmp in2 out2 || test_failed $LINENO
-rm -f out2 inein.lz em.lz || framework_failure
-
-headers='LZIp LZiP LZip LzIP LzIp LziP lZIP lZIp lZiP lzIP'
-body='\001\014\000\000\101\376\367\377\377\340\000\200\000\215\357\002\322\001\000\000\000\000\000\000\000\045\000\000\000\000\000\000\000'
-cp "${in_lz}" int.lz || framework_failure
-printf "LZIP${body}" >> int.lz || framework_failure
-if "${LZIP}" -t int.lz ; then
- for header in ${headers} ; do
- printf "${header}${body}" > int.lz || framework_failure
- "${LZIP}" -tq int.lz # first member
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -tq < int.lz
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -cdq int.lz > /dev/null
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -tq --loose-trailing int.lz
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -tq --loose-trailing < int.lz
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -cdq --loose-trailing int.lz > /dev/null
- [ $? = 2 ] || test_failed $LINENO ${header}
- cp "${in_lz}" int.lz || framework_failure
- printf "${header}${body}" >> int.lz || framework_failure
- "${LZIP}" -tq int.lz # trailing data
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -tq < int.lz
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -cdq int.lz > /dev/null
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -t --loose-trailing int.lz ||
- test_failed $LINENO ${header}
- "${LZIP}" -t --loose-trailing < int.lz ||
- test_failed $LINENO ${header}
- "${LZIP}" -cd --loose-trailing int.lz > /dev/null ||
- test_failed $LINENO ${header}
- "${LZIP}" -tq --loose-trailing --trailing-error int.lz
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -tq --loose-trailing --trailing-error < int.lz
- [ $? = 2 ] || test_failed $LINENO ${header}
- "${LZIP}" -cdq --loose-trailing --trailing-error int.lz > /dev/null
- [ $? = 2 ] || test_failed $LINENO ${header}
- done
-else
- printf "warning: skipping header test: 'printf' does not work on your system."
-fi
-rm -f int.lz || framework_failure
-
-"${LZIP}" -tq "${fnz_lz}"
-[ $? = 2 ] || test_failed $LINENO
-
-for i in fox_v2.lz fox_s11.lz fox_de20.lz \
- fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
- "${LZIP}" -tq "${testdir}"/$i
- [ $? = 2 ] || test_failed $LINENO $i
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+ "${LZIP}" -c -$i -s16 in > out || fail=1
+ printf "g" >> out || fail=1
+ "${LZIP}" -cd out > copy || fail=1
+ cmp in copy || fail=1
 done
+printf .
-for i in fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
- "${LZIP}" -cdq "${testdir}"/$i > out
- [ $? = 2 ] || test_failed $LINENO $i
- cmp fox out || test_failed $LINENO $i
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+ "${LZIP}" -$i -s16 < in > out || fail=1
+ "${LZIP}" -d < out > copy || fail=1
+ cmp in copy || fail=1
 done
-rm -f fox || framework_failure
+printf .
-cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
-cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
-if dd if=in3.lz of=trunc.lz bs=14682 count=1 2> /dev/null &&
- [ -e trunc.lz ] && cmp in2.lz trunc.lz ; then
- for i in 6 20 14664 14683 14684 14685 14686 14687 14688 ; do
- dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null
- "${LZIP}" -tq trunc.lz
- [ $? = 2 ] || test_failed $LINENO $i
- "${LZIP}" -tq < trunc.lz
- [ $? = 2 ] || test_failed $LINENO $i
- "${LZIP}" -cdq trunc.lz > /dev/null
- [ $? = 2 ] || test_failed $LINENO $i
- "${LZIP}" -dq < trunc.lz > /dev/null
- [ $? = 2 ] || test_failed $LINENO $i
- done
-else
- printf "warning: skipping truncation test: 'dd' does not work on your system."
-fi
-rm -f in2.lz in3.lz trunc.lz || framework_failure
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+ "${LZIP}" -f -$i -s16 -o out < in || fail=1
+ "${LZIP}" -df -o copy < out.lz || fail=1
+ cmp in copy || fail=1
+done
+printf .
-cp "${in_lz}" ingin.lz || framework_failure
-printf "g" >> ingin.lz || framework_failure
-cat "${in_lz}" >> ingin.lz || framework_failure
-"${LZIP}" -atq ingin.lz
-[ $? = 2 ] || test_failed $LINENO
-"${LZIP}" -atq < ingin.lz
-[ $? = 2 ] || test_failed $LINENO
-"${LZIP}" -acdq ingin.lz > out
-[ $? = 2 ] || test_failed $LINENO
-cmp in out || test_failed $LINENO
-"${LZIP}" -adq < ingin.lz > out
-[ $? = 2 ] || test_failed $LINENO
-cmp in out || test_failed $LINENO
-"${LZIP}" -t ingin.lz || test_failed $LINENO
-"${LZIP}" -t < ingin.lz || test_failed $LINENO
-"${LZIP}" -dk ingin.lz || test_failed $LINENO
-cmp in ingin || test_failed $LINENO
-"${LZIP}" -cd ingin.lz > out || test_failed $LINENO
-cmp in out || test_failed $LINENO
-"${LZIP}" -d < ingin.lz > out || test_failed $LINENO
-cmp in out || test_failed $LINENO
-"${FFEXAMPLE}" -d ingin.lz | cmp in - || test_failed $LINENO
-"${FFEXAMPLE}" -r ingin.lz | cmp in2 - || test_failed $LINENO
-rm -f in2 out ingin ingin.lz || framework_failure
+"${BBEXAMPLE}" in || fail=1
+printf .
+"${BBEXAMPLE}" out || fail=1
+printf .
+"${BBEXAMPLE}" ${test2} || fail=1
+printf .
+
+"${LZCHECK}" in || fail=1
+printf .
+"${LZCHECK}" out || fail=1
+printf .
+"${LZCHECK}" ${test2} || fail=1
+printf .
 echo
 if [ ${fail} = 0 ] ; then
diff --git a/testsuite/fox.lz b/testsuite/fox.lz
deleted file mode 100644
index 509da82..0000000
Binary files a/testsuite/fox.lz and /dev/null differ
diff --git a/testsuite/fox_bcrc.lz b/testsuite/fox_bcrc.lz
deleted file mode 100644
index 8f6a7c4..0000000
Binary files a/testsuite/fox_bcrc.lz and /dev/null differ
diff --git a/testsuite/fox_crc0.lz b/testsuite/fox_crc0.lz
deleted file mode 100644
index 1abe926..0000000
Binary files a/testsuite/fox_crc0.lz and /dev/null differ
diff --git a/testsuite/fox_das46.lz b/testsuite/fox_das46.lz
deleted file mode 100644
index 43ed9f9..0000000
Binary files a/testsuite/fox_das46.lz and /dev/null differ
diff --git a/testsuite/fox_de20.lz b/testsuite/fox_de20.lz
deleted file mode 100644
index 10949d8..0000000
Binary files a/testsuite/fox_de20.lz and /dev/null differ
diff --git a/testsuite/fox_mes81.lz b/testsuite/fox_mes81.lz
deleted file mode 100644
index d50ef2e..0000000
Binary files a/testsuite/fox_mes81.lz and /dev/null differ
diff --git a/testsuite/fox_nz.lz b/testsuite/fox_nz.lz
deleted file mode 100644
index 44a4b58..0000000
Binary files a/testsuite/fox_nz.lz and /dev/null differ
diff --git a/testsuite/fox_s11.lz b/testsuite/fox_s11.lz
deleted file mode 100644
index dca909c..0000000
Binary files a/testsuite/fox_s11.lz and /dev/null differ
diff --git a/testsuite/fox_v2.lz b/testsuite/fox_v2.lz
deleted file mode 100644
index 8620981..0000000
Binary files a/testsuite/fox_v2.lz and /dev/null differ
diff --git a/testsuite/test.txt b/testsuite/test.txt
index 423f0c0..9196a3a 100644
--- a/testsuite/test.txt
+++ b/testsuite/test.txt
@@ -1,7 +1,8 @@
 GNU GENERAL PUBLIC LICENSE
 Version 2, June 1991
- Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.
@@ -338,7 +339,8 @@ Public License instead of this License.
 GNU GENERAL PUBLIC LICENSE
 Version 2, June 1991
- Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.
diff --git a/testsuite/test.txt.lz b/testsuite/test.txt.lz
index 5dc169f..41d2e39 100644
Binary files a/testsuite/test.txt.lz and b/testsuite/test.txt.lz differ
diff --git a/testsuite/fox_lf b/testsuite/test2.txt
similarity index 100%
rename from testsuite/fox_lf
rename to testsuite/test2.txt
diff --git a/testsuite/test_sync.lz b/testsuite/test_sync.lz
index 2a6218b..db680c3 100644
Binary files a/testsuite/test_sync.lz and b/testsuite/test_sync.lz differ