Commit | Line | Data |
---|---|---|
598c9c42 WJ |
1 | This directory contains the GNU DIFF and DIFF3 utilities, version 1.15. |
2 | See file COPYING for copying conditions. To compile and install on | |
3 | System V, you must edit the makefile according to the comments therein. | |
4 | ||
5 | Report bugs to bug-gnu-utils@prep.ai.mit.edu | |
6 | ||
7 | Version 1.15 has the following new features; please see below for details. | |
8 | ||
9 | -L (+file-label) option | |
10 | -u (+unified) option | |
11 | -a and -m options for diff3 | |
12 | Most output styles can represent incomplete input lines. | |
13 | `Text' is defined by ISO 8859. | |
14 | diff3 exit status 0 means success, 1 means overlaps, 2 means trouble. | |
15 | ||
16 | ||
17 | This version of diff provides all the features of BSD's diff. | |
18 | It has these additional features: | |
19 | ||
20 | An input file may end in a non-newline character. If so, its last | |
21 | line is called an incomplete line and is distinguished on output | |
22 | from a full line. In the default, -c, and -u output styles, an | |
23 | incomplete output line is followed by a diagnostic line that starts | |
24 | with \. With -n, an incomplete line is output without a trailing | |
25 | newline. Other output styles (-D, -e, -f) cannot represent an | |
26 | incomplete line, so they pretend that there was a newline, and -e and -f | |
27 | also print an error message. For example, suppose F and G are one-byte | |
28 | files that contain just ``f'' and ``g'', respectively. | |
29 | ||
30 | Then ``diff F G'' outputs | |
31 | ||
32 | 1c1 | |
33 | < f | |
34 | \ No newline at end of file | |
35 | --- | |
36 | > g | |
37 | \ No newline at end of file | |
38 | ||
39 | (The exact diagnostic message may differ, e.g. for non-English locales.) | |
40 | ``diff -n F G'' outputs the following without a trailing newline: | |
41 | ||
42 | d1 1 | |
43 | a1 1 | |
44 | g | |
45 | ||
46 | ``diff -e F G'' sends two diagnostics to stderr and the following to stdout: | |
47 | ||
48 | 1c | |
49 | g | |
50 | . | |
51 | ||
52 | A file is considered to be text if its first characters are all in the | |
53 | ISO 8859 character set; BSD's diff uses ASCII. | |
54 | ||
55 | GNU DIFF has the following additional options: | |
56 | ||
57 | -a Always treat files as text and compare them line-by-line, | |
58 | even if they do not appear to be text. | |
59 | ||
60 | -B ignore changes that just insert or delete blank lines. | |
61 | ||
62 | -C # | |
63 | request -c format and specify number of context lines. | |
64 | ||
65 | -F regexp | |
66 | in context format, for each unit of differences, show some of | |
67 | the last preceding line that matches the specified regexp. | |
68 | ||
69 | -H use heuristics to speed handling of large files that | |
70 | have numerous scattered small changes. The algorithm becomes | |
71 | asymptotically linear for such files! | |
72 | ||
73 | -I regexp | |
74 | ignore changes that just insert or delete lines that | |
75 | match the specified regexp. | |
76 | ||
77 | -L label | |
78 | Use the specified label in file header lines output by the -c option. | |
79 | This option may be given zero, one, or two times, | |
80 | to affect neither label, just the first file's label, or both labels. | |
81 | A file's default label is its name, a tab, and its modification date. | |
82 | ||
83 | -N in directory comparison, if a file is found in only one directory, | |
84 | treat it as present but empty in the other directory. | |
85 | ||
86 | -p equivalent to -c -F'^[_a-zA-Z]'. This is useful for C code | |
87 | because it shows which function each change is in. | |
88 | ||
89 | -T print a tab rather than a space before the text of a line | |
90 | in normal or context format. This causes the alignment | |
91 | of tabs in the line to look normal. | |
92 | ||
93 | -u[#] | |
94 | produce unified style output with # context lines (default 3). | |
95 | This style is like -c, but it is more compact because context | |
96 | lines are printed only once. Lines from just the first file | |
97 | are marked '-'; lines from just the second file are marked '+'. | |
98 | ||
99 | This version of diff3 has all of BSD diff3's features, with the following | |
100 | additional features. | |
101 | ||
102 | An input file may end in a non-newline character. With the -m option, | |
103 | an incomplete last line stays incomplete. Other output styles treat | |
104 | incomplete lines as diff does. | |
105 | ||
106 | The file name '-' denotes the standard input. It can appear at most once. | |
107 | ||
108 | diff3 has the following additional options: | |
109 | ||
110 | -a Always treat files as text and compare them line-by-line, | |
111 | even if they do not appear to be text. | |
112 | ||
113 | -i Include 'w' and 'q' commands at the end of the output, to write out | |
114 | the changed file, thus emulating System V behavior. One of the edit | |
115 | script options -e, -E, -x, -X, -3 must also be specified. | |
116 | ||
117 | -m Apply the edit script to the first file and send the result to | |
118 | standard output. Unlike piping diff3's output to ed(1), this works | |
119 | even for binary files and incomplete lines. -E is assumed if no edit | |
120 | script option is specified. This option is incompatible with -i. | |
121 | ||
122 | -L label | |
123 | Use the specified label for lines output by the -E and -X options, | |
124 | one of which must also be specified. This option may be given zero, | |
125 | one, or two times; the first label marks <<<<<<< lines and the second | |
126 | marks >>>>>>> lines. The default labels are the names of the first and | |
127 | third files on the command line. Thus ``diff3 -L X -L Z -E A B C'' | |
128 | acts like ``diff3 -E A B C'', except that the output looks like it | |
129 | came from files named X and Z rather than from files named A and C. | |
130 | ||
131 | Exit status 0 means success, 1 means overlaps were found and -E or -X was | |
132 | specified, and 2 means trouble. | |
133 | ||
134 | ||
135 | ||
136 | GNU DIFF was written by Mike Haertel, David Hayes, Richard Stallman | |
137 | and Len Tower. The basic algorithm is described in: "An O(ND) | |
138 | Difference Algorithm and its Variations", Eugene Myers, Algorithmica | |
139 | Vol. 1 No. 2, 1986, p 251. | |
140 | ||
141 | Many bugs were fixed by Paul Eggert. The unified diff idea and format | |
142 | are from Wayne Davison. | |
143 | ||
144 | Suggested projects for improving GNU DIFF: | |
145 | ||
146 | * Handle very large files by not keeping the entire text in core. | |
147 | ||
148 | One way to do this is to scan the files sequentially to compute hash | |
149 | codes of the lines and put the lines in equivalence classes based only | |
150 | on hash code. Then compare the files normally. This will produce | |
151 | some false matches. | |
152 | ||
153 | Then scan the two files sequentially again, checking each match to see | |
154 | whether it is real. When a match is not real, mark both the | |
155 | "matching" lines as changed. Then build an edit script as usual. | |
156 | ||
157 | The output routines would have to be changed to scan the files | |
158 | sequentially looking for the text to print. |
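
The two-pass idea in the suggested project above can be sketched roughly as follows. This is only an illustration in Python, not code from GNU DIFF: the function names (`line_hashes`, `changed_lines`) are hypothetical, and `difflib.SequenceMatcher` stands in for diff's own comparison step. The first pass keeps one hash code per line instead of the text; the second pass rereads both files sequentially and drops any match whose lines differ.

```python
import difflib
import zlib


def line_hashes(path, hash_line):
    """Pass 1: scan a file sequentially, keeping one hash code per line."""
    with open(path, "rb") as f:
        return [hash_line(line) for line in f]


def changed_lines(path_a, path_b, hash_line=zlib.crc32):
    """Return 0-based numbers of changed lines in each file.

    Only hash codes and match positions are held in memory, never the
    full text of either file.
    """
    hashes_a = line_hashes(path_a, hash_line)
    hashes_b = line_hashes(path_b, hash_line)

    # Compare the hash sequences "normally". SequenceMatcher is a
    # stand-in (an assumption of this sketch) for diff's own algorithm.
    matcher = difflib.SequenceMatcher(None, hashes_a, hashes_b, autojunk=False)
    candidates = [
        (block.a + k, block.b + k)
        for block in matcher.get_matching_blocks()
        for k in range(block.size)
    ]

    # Pass 2: scan both files sequentially again. Candidate pairs are
    # increasing in both coordinates, so neither file is ever rewound.
    real_a, real_b = set(), set()
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        lines_a = enumerate(fa)
        lines_b = enumerate(fb)
        for i, j in candidates:
            text_a = next(line for n, line in lines_a if n == i)
            text_b = next(line for n, line in lines_b if n == j)
            if text_a == text_b:  # the match is real
                real_a.add(i)
                real_b.add(j)
            # Otherwise the hashes collided: both lines stay "changed".

    return (
        [i for i in range(len(hashes_a)) if i not in real_a],
        [j for j in range(len(hashes_b)) if j not in real_b],
    )
```

Passing a deliberately weak hash such as `len` as `hash_line` forces false matches and exercises the second pass. An edit script would then be built from the changed-line lists as usual, and the output routines would need one more sequential scan to fetch the text to print.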