Commit | Line | Data |
---|---|---|
14b82d6c C |
1 | TODO file for gzip. |
2 | ||
3 | Some of the planned features include: | |
4 | ||
5 | - Structure the sources so that the compression and decompression code | |
6 | form a library usable by any program, and write both gzip and zip on | |
7 | top of this library. Ideally the library would be reentrant (thread-safe), | |
8 | but that would degrade performance. In the meantime, you can | |
9 | look at the sample program zread.c. | |
10 | ||
11 | The library should have one mode in which compressed data is sent | |
12 | as soon as input is available, instead of waiting for complete | |
13 | blocks. This can be useful for sending compressed data to/from interactive | |
14 | programs. | |
15 | ||
16 | - Make it convenient to define alternative user interfaces (in | |
17 | particular for windowing environments). | |
18 | ||
19 | - Support in-memory compression for arbitrarily large amounts of data. | |
20 | (zip currently supports in-memory compression only for a single buffer.) | |
21 | ||
22 | - Map files in memory when possible; this is generally much faster | |
23 | than read/write. (zip currently maps entire files at once; this | |
24 | should be done in chunks to reduce memory usage.) | |
25 | ||
26 | - Add a super-fast compression method, suitable for implementing | |
27 | file systems with transparent compression. One problem is that the | |
28 | best candidate (lzrw1) is patented twice (Waterworth 4,701,745 | |
29 | and Gibson & Graybill 5,049,881). The lzrw series of algorithms | |
30 | is available by ftp at ftp.adelaide.edu.au:/pub/compression/lzrw*. | |
31 | ||
32 | - Add a super-tight (but slow) compression method, suitable for long | |
33 | term archives. One problem is that the best versions of arithmetic | |
34 | coding are patented (4,286,256 4,295,125 4,463,342 4,467,317 | |
35 | 4,633,490 4,652,856 4,891,643 4,905,297 4,935,882 4,973,961 | |
36 | 5,023,611 5,025,258). | |
37 | ||
38 | Note: I will introduce new compression methods only if they are | |
39 | significantly better in either speed or compression ratio than the | |
40 | existing method(s). So the total number of different methods should | |
41 | reasonably not exceed 3. (The current 9 compression levels are just | |
42 | tuning parameters for a single method, deflation.) | |
43 | ||
44 | - Add optional error correction. One problem is that the current version | |
45 | of ecc cannot recover from inserted or missing bytes. It would be | |
46 | nice to recover from the most common error (transfer of a binary | |
47 | file in ASCII mode). | |
48 | ||
49 | - Add a block size (-b) option to improve error recovery in case of | |
50 | failure of a complete sector. Each block could then be extracted | |
51 | independently, at the cost of a lower compression ratio. | |
52 | ||
53 | - Use a larger window size to deal with some large redundant files that | |
54 | 'compress' currently handles better than gzip. | |
55 | ||
56 | - Implement the -e (encrypt) option. | |
57 | ||
58 | Send comments to Jean-loup Gailly <jloup@chorus.fr>. |
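
The library mode described in the first item, in which compressed data is sent as soon as input is available instead of waiting for complete blocks, can be illustrated with a small sketch. This is not gzip's own code: it assumes Python's standard zlib module, and the helper names `compress_interactive` and `decompress_interactive` are hypothetical. A sync flush after every chunk pushes all pending output to the receiver, at some cost in compression ratio.

```python
import zlib


def compress_interactive(chunks, level=6):
    """Yield one compressed piece per input chunk, then a final trailer piece.

    Hypothetical helper for illustration only. Z_SYNC_FLUSH emits all
    pending output after each chunk, so the receiver can decode it
    immediately instead of waiting for a complete block.
    """
    # wbits=31 selects the gzip container (header and CRC-32 trailer).
    comp = zlib.compressobj(level=level, method=zlib.DEFLATED, wbits=31)
    for chunk in chunks:
        yield comp.compress(chunk) + comp.flush(zlib.Z_SYNC_FLUSH)
    # The final flush ends the deflate stream and writes the gzip trailer.
    yield comp.flush()


def decompress_interactive(pieces):
    """Yield the decoded bytes for each piece as soon as it is received."""
    decomp = zlib.decompressobj(wbits=31)
    for piece in pieces:
        yield decomp.decompress(piece)


if __name__ == "__main__":
    messages = [b"ls -l\n", b"cd /tmp\n", b"exit\n"]
    pieces = list(compress_interactive(messages))
    for original, decoded in zip(messages, decompress_interactive(pieces)):
        print(original, decoded)
```

Each flushed piece decodes to exactly the chunk that produced it, which is what an interactive program needs. The concatenation of all pieces is still an ordinary gzip stream, so a receiver that does not care about latency can decompress it in one call.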