BSD 3 development
[unix-history] / usr / doc / password
CommitLineData
2074ceed
BJ
1.\" tbl mm ^ eqn ^ troff -ms
2.EQ
3delim $$
4.EN
5.RP
6....TM 78-1271-5 39199 39199-11
7.ND April 3, 1978
8.TL
9Password Security:
10A Case History
11.OK
12Encryption
13Computing
14.AU "MH 2C-524" 3878
15Robert Morris
16.AU "MH 2C-523" 2394
17Ken Thompson
18.AI
19.MH
20.AB
21This paper describes the history of the design of the
22password security scheme on a remotely accessed time-sharing
23system.
24The present design was the result of countering
25observed attempts to penetrate the system.
26The result is a compromise between extreme security and
27ease of use.
28.AE
29.CS 6 0 6 0 0 4
30.SH
31INTRODUCTION
32.PP
33Password security on the
34.UX
35time-sharing system [1] is provided by a
36collection of programs
37whose elaborate and strange design is the outgrowth of
38many years of experience with earlier versions.
39To help develop a secure system, we have had a continuing
40competition to devise new ways to
41attack the security of the system (the bad guy) and, at the same time, to
42devise new techniques to resist the new attacks (the good guy).
43This competition has been in the same vein as the
44competition of long standing between manufacturers of armor
45plate and those of armor-piercing shells.
46For this reason, the description that follows will
47trace the history of the password system rather than simply
48presenting the program in its current state.
49In this way, the reasons for the design will be made clearer,
50as the design cannot be understood without also
51understanding the potential attacks.
52.PP
53An underlying goal has been to provide password security
54at minimal inconvenience to the users of the system.
55For example, those who want to run a completely open
56system without passwords, or to have passwords only at the
57option of the individual users, are able to do so, while
58those who require all of their users to have passwords
59gain a high degree of security
60against penetration of the system by unauthorized
61users.
62.PP
63The password system must be able not only to prevent
64any access to the system by unauthorized users
65(i.e. prevent them from logging in at all),
66but it must also
67prevent users who are already logged in from doing
68things that they are not authorized to do.
69The so called ``super-user'' password, for example, is especially
70critical because the super-user has all sorts of
71permissions and has essentially unlimited access to
72all system resources.
73.PP
74Password security is of course only one component of
75overall system security, but it is an essential component.
76Experience has shown that attempts to penetrate
77remote-access systems have been astonishingly
78sophisticated.
79.PP
80Remote-access systems are peculiarly vulnerable to
81penetration by outsiders as there are threats at the
82remote terminal, along the communications link, as well
83as at the computer itself.
84Although the security of a password encryption algorithm
85is an interesting intellectual and mathematical problem,
86it is only one tiny facet of a very large problem.
87In practice, physical security of the computer, communications
88security of the communications link, and physical control
89of the computer itself loom as far more important issues.
90Perhaps most important of all is control over the actions
91of ex-employees, since they are not under any direct control
92and they may have intimate
93knowledge about the system, its resources, and
94methods of access.
95Good system security involves realistic
96evaluation of the risks not only of deliberate
97attacks but also of casual unauthorized access
98and accidental disclosure.
99.SH
100PROLOGUE
101.PP
102The UNIX system was first implemented with a password file that contained
103the actual passwords of all the users, and for that reason
104the password file had to
105be heavily protected against being either read or written.
106Although historically, this had been the technique used
107for remote-access systems,
108it was completely unsatisfactory for several reasons.
109.PP
110The technique is excessively vulnerable to lapses in
111security.
112Temporary loss of protection can occur when
113the password file is being edited or otherwise modified.
114There is no way to prevent the making of copies by
115privileged users.
116Experience with several earlier remote-access systems
117showed that such lapses occur with frightening frequency.
118Perhaps the most memorable such occasion occurred
119in the early 60's when
120a system administrator on the CTSS system at MIT
121was editing the
122password file and another system administrator was editing
123the daily message that is printed on everyone's terminal
124on login.
125Due to a software design error, the temporary editor files
126of the two users were interchanged and thus, for a time, the password
127file was printed on every terminal when it was logged in.
128.PP
129Once such a lapse in security has been discovered, everyone's
130password must be changed, usually simultaneously, at a considerable
131administrative cost.
132This is not a great matter, but
133far more serious is the high probability of such lapses
134going unnoticed by the system administrators.
135.PP
136Security against unauthorized disclosure of the passwords was,
137in the last analysis, impossible with this system because,
138for example, if the
139contents of the file system are put on to magnetic tape for
140backup, as they must be, then anyone who has physical
141access to the tape
142can read anything on it with no restriction.
143.PP
144Many programs must get information of various kinds
145about the users of the system, and these programs in general
146should have no special permission to read the password file.
147The information which should have been in the password file actually was
148distributed (or replicated) into a number of files, all of
149which had to be updated whenever a user was added to or
150dropped from the system.
151.SH
152THE FIRST SCHEME
153.PP
154The obvious solution is to arrange that the passwords not
155appear in the system at all, and it is not difficult to decide
156that this can be done by encrypting each user's password,
157putting only the encrypted form in the password file, and
158throwing away his original password (the one that
159he typed in).
160When the user later tries to log in to the system, the password
161that he types is encrypted and compared with the encrypted
162version in the password file.
163If the two match, his login attempt is accepted.
164Such a scheme was first described
165in [3, p.91ff.].
166It also seemed advisable to devise
167a system in which neither the password file nor the
168password program itself needed to be
169protected against being read by anyone.
170.PP
171All that was needed to implement these ideas
172was to find a means of encryption that was very difficult
173to invert, even when the encryption program
174is available.
175Most of the standard encryption methods used (in the past)
176for encryption of messages are rather easy to invert.
177A convenient and rather good encryption program happened
178to exist on the system at the time; it simulated the
179M-209 cipher machine [4]
180used by the U.S. Army during World War II.
181It turned out that the M-209 program was usable, but with
182a given key, the ciphers produced by this program are
183trivial to invert.
184It is a much more difficult matter to find out the key
185given the cleartext input and the enciphered output of the program.
186Therefore,
187the password was used not as the text to be encrypted but as the
188key, and a constant was encrypted using this key.
189The encrypted result was entered into the password file.
190.SH
191ATTACKS ON THE FIRST APPROACH
192.PP
193Suppose that the bad guy has available
194the text of the password encryption program and
195the complete password file.
196Suppose also that he has substantial computing
197capacity at his disposal.
198.PP
199One obvious approach to penetrating the password
200mechanism is to attempt to find a general method of inverting
201the encryption algorithm.
202Very possibly this can be done, but few
203successful results
204have come to light, despite substantial efforts extending
205over a period of more than five years.
206The results have not proved to be very useful
207in penetrating systems.
208.PP
209Another approach to penetration is simply to keep trying
210potential
211passwords until one succeeds; this is a general cryptanalytic
212approach called
213.I
214key search.
215.R
216Human beings being what they are, there is a strong tendency
217for people to choose relatively short and simple passwords that
218they can remember.
219Given free choice, most people will choose their passwords
220from a restricted character set (e.g. all lower-case letters),
221and will often choose words or names.
222This human habit makes the key search job a great deal easier.
223.PP
224The critical factor involved in key search is the amount of
225time needed to encrypt a potential password and to check the result
226against an entry in the password file.
227The running time to encrypt one trial password and check
228the result turned out to be approximately 1.25 milliseconds on
229a PDP-11/70 when the encryption algorithm was recoded for
230maximum speed.
231It is takes essentially no more time to test the encrypted
232trial password against all the passwords in
233an entire password file, or for that matter, against
234any collection of encrypted passwords, perhaps collected
235from many installations.
236.PP
237If we want to check all passwords of length
238.I
239n
240.R
241that consist entirely of lower-case letters, the number
242of such passwords is $26 sup n$.
243If we suppose that the password consists of
244printable characters only, then the number of possible passwords
245is somewhat less than $95 sup n$.
246(The standard system ``character erase'' and ``line kill''
247characters are, for example, not prime
248candidates.)
249We can immediately estimate the running time of a program that
250will test every password of a given length with all of its
251characters chosen from some set of characters.
252The following table gives estimates of the running time
253required on a PDP-11/70
254to test all possible character strings of length $n$
255chosen from various sets of characters: namely, all lower-case
256letters, all lower-case letters plus digits,
257all alphanumeric characters, all 95 printable
258ASCII characters, and finally all 128 ASCII characters.
259.TS
260cccccc
261cccccc
262nnnnnn.
263 26 lower-case 36 lower-case letters 62 alphanumeric 95 printable all 128 ASCII
264n letters and digits characters characters characters
265.sp .5
2661 30 msec. 40 msec. 80 msec. 120 msec. 160 msec.
2672 800 msec. 2 sec. 5 sec. 11 sec. 20 sec.
2683 22 sec. 58 sec. 5 min. 17 min. 43 min.
2694 10 min. 35 min. 5 hrs. 28 hrs. 93 hrs.
2705 4 hrs. 21 hrs. 318 hrs.
2716 107 hrs.
272.TE
273.LP
274One has to conclude that it is no great matter for someone with
275access to a PDP-11 to test all lower-case alphabetic strings up
276to length five
277and, given access to the machine for, say, several weekends, to test
278all such strings up to six characters in length.
279By using such a program against a collection of actual encrypted
280passwords, a substantial fraction of all the passwords will be
281found.
282.PP
283Another profitable approach for the bad guy is to use the word
284list from a dictionary or to use a list of names.
285For example, a large commercial dictionary contains typicallly about
286250,000 words; these words can be checked in about five minutes.
287Again, a noticeable fraction of any collection of passwords
288will be found.
289Improvements and extensions will be (and have been) found by
290a determined bad guy.
291Some ``good'' things to try are:
292.IP -
293The dictionary with the words spelled backwards.
294.IP -
295A list of first names (best obtained from some mailing list).
296Last names, street names, and city names also work well.
297.IP -
298The above with initial upper-case letters.
299.IP -
300All valid license plate numbers in your state.
301(This takes about five hours in New Jersey.)
302.IP -
303Room numbers, social security numbers, telephone numbers, and
304the like.
305.PP
306The authors have conducted experiments to try to determine
307typical users' habits in the choice of passwords when no
308constraint is put on their choice.
309The results were disappointing, except to the bad guy.
310In a collection of 3,289 passwords
311gathered from many users over a long period of time;
312.IP
31315 were a single ASCII character;
314.IP
31572 were strings of two ASCII characters;
316.IP
317464 were strings of three ASCII characters;
318.IP
319477 were string of four alphamerics;
320.IP
321706 were five letters, all upper-case or all lower-case;
322.IP
323605 were six letters, all lower-case.
324.LP
325An additional 492 passwords appeared in various available
326dictionaries, name lists, and the like.
327A total of 2,831, or 86% of this sample of passwords fell into one of
328these classes.
329.PP
330There was, of course, considerable overlap between the
331dictionary results and the character string searches.
332The dictionary search alone, which required only five
333minutes to run, produced about one third of the passwords.
334.PP
335Users could be urged (or forced) to use either longer passwords
336or passwords chosen from a larger character set, or the system
337could itself choose passwords for the users.
338.SH
339AN ANECDOTE
340.PP
341An entertaining and instructive example is
342the attempt made at one installation to force users to use less predictable
343passwords.
344The users did not choose their own passwords; the system supplied
345them.
346The supplied passwords were eight characters long and
347were taken from the character set consisting of
348lower-case letters and digits.
349They were generated by a pseudo-random number generator
350with only $2 sup 15$ starting values.
351The time required to search (again on a PDP-11/70) through
352all character strings of length 8 from a 36-character
353alphabet is 112 years.
354.PP
355Unfortunately, only $2 sup 15$ of them need be looked at,
356because that is the number of possible outputs of the random
357number generator.
358The bad guy did, in fact, generate and test each of these strings
359and found every one of the system-generated passwords using
360a total of only about one minute of machine time.
361.SH
362IMPROVEMENTS TO THE FIRST APPROACH
363.NH
364Slower Encryption
365.PP
366Obviously, the first algorithm used was far too fast.
367The announcement of the DES encryption algorithm [2]
368by the National Bureau of Standards
369was timely and fortunate.
370The DES is, by design, hard to invert, but equally valuable
371is the fact that it is extremely slow when implemented in
372software.
373The DES was implemented and used in the following way:
374The first eight characters of the user's password are
375used as a key for the DES; then the algorithm
376is used to encrypt a constant.
377Although this constant is zero at the moment, it is easily
378accessible and can be made installation-dependent.
379Then the DES algorithm is iterated 25 times and the
380resulting 64 bits are repacked to become a string of
38111 printable characters.
382.NH
383Less Predictable Passwords
384.PP
385The password entry program was modified so as to urge
386the user to use more obscure passwords.
387If the user enters an alphabetic password (all upper-case or
388all lower-case) shorter than six characters, or a
389password from a larger character set shorter than five
390characters, then the program asks him to enter a
391longer password.
392This further reduces the efficacy of key search.
393.PP
394These improvements make it exceedingly difficult to find
395any individual password.
396The user is warned of the risks and if he cooperates,
397he is very safe indeed.
398On the other hand, he is not prevented from using
399his spouse's name if he wants to.
400.NH
401Salted Passwords
402.PP
403The key search technique is still
404likely to turn up a few passwords when it is used
405on a large collection of passwords, and it seemed wise to make this
406task as difficult as possible.
407To this end, when a password is first entered, the password program
408obtains a 12-bit random number (by reading the real-time clock)
409and appends this to the password typed in by the user.
410The concatenated string is encrypted and both the
41112-bit random quantity (called the $salt$) and the 64-bit
412result of the encryption are entered into the password
413file.
414.PP
415When the user later logs in to the system, the 12-bit
416quantity is extracted from the password file and appended
417to the typed password.
418The encrypted result is required, as before, to be the same as the
419remaining 64 bits in the password file.
420This modification does not increase the task of finding
421any individual
422password,
423starting from scratch,
424but now the work of testing a given character string
425against a large collection of encrypted passwords has
426been multiplied by 4096 ($2 sup 12$).
427The reason for this is that there are 4096 encrypted
428versions of each password and one of them has been picked more
429or less at random by the system.
430.PP
431With this modification,
432it is likely that the bad guy can spend days of computer
433time trying to find a password on a system with hundreds
434of passwords, and find none at all.
435More important is the fact that it becomes impractical
436to prepare an encrypted dictionary in advance.
437Such an encrypted dictionary could be used to crack
438new passwords in milliseconds when they appear.
439.PP
440There is a (not inadvertent) side effect of this
441modification.
442It becomes nearly impossible to find out whether a
443person with passwords on two or more systems has used
444the same password on all of them,
445unless you already know that.
446.NH
447The Threat of the DES Chip
448.PP
449Chips to perform the DES encryption are already commercially
450available and they are very fast.
451The use of such a chip speeds up the process of password
452hunting by three orders of magnitude.
453To avert this possibility, one of the internal tables
454of the DES algorithm
455(in particular, the so-called E-table)
456is changed in a way that depends on the 12-bit random
457number.
458The E-table is inseparably wired into the DES chip,
459so that the commercial chip cannot be used.
460Obviously, the bad guy could have his own chip designed and
461built, but the cost would be unthinkable.
462.NH
463A Subtle Point
464.PP
465To login successfully on the UNIX system, it is necessary
466after dialing in to type a valid user name, and then the
467correct password for that user name.
468It is poor design to write the login command in such a way that it
469tells an interloper when he has typed in a invalid user name.
470The response to an invalid name should be identical to
471that for a valid name.
472.PP
473When the slow encryption algorithm was first implemented,
474the encryption was done only if the user name was valid,
475because otherwise there was no encrypted password to
476compare with the supplied password.
477The result was that the response was delayed
478by about one-half second if the name was valid, but was
479immediate if invalid.
480The bad guy could find out
481whether a particular user name was valid.
482The routine was modified to do the encryption in either
483case.
484.SH
485CONCLUSIONS
486.PP
487On the issue of password security, UNIX is probably
488better than most systems.
489The use of encrypted passwords appears reasonably
490secure in the absence of serious attention of experts
491in the field.
492.PP
493It is also worth some effort to conceal even the encrypted
494passwords.
495Some UNIX systems have instituted what is called an
496``external security code'' that must be typed when
497dialing into the system, but before logging in.
498If this code is changed periodically, then someone
499with an old password will likely be prevented from
500using it.
501.PP
502Whenever any security procedure is instituted that attempts
503to deny access to unauthorized persons, it is wise to
504keep a record of both successful and unsuccessful attempts
505to get at the secured resource.
506Just as an out-of-hours visitor to a computer center normally
507must not only identify himself, but a record is usually also kept of
508his entry.
509Just so, it is a wise precaution to make and keep a record
510of all attempts to log into a remote-access time-sharing
511system, and certainly all unsuccessful attempts.
512.PP
513Bad guys fall on a spectrum whose one end is someone with
514ordinary access to a system and whose goal is to find
515out a particular password (usually that of the super-user)
516and, at the other end, someone who wishes to collect as
517much password information as possible from as many systems
518as possible.
519Most of the work reported here serves to frustrate the latter type;
520our experience indicates that the former type of bad guy never
521was very successful.
522.PP
523We recognize that a time-sharing system must operate in a
524hostile environment.
525We did not attempt to hide the security aspects of the operating
526system, thereby playing the customary make-believe game in
527which weaknesses of the system are not discussed no matter
528how apparent.
529Rather we advertised the password algorithm and invited attack
530in the belief that this approach would minimize future trouble.
531The approach has been successful.
532.SG MH-1271-RM/KT
533.SH
534References
535.IP [1]
536Ritchie, D.M. and Thompson, K.
537The UNIX Time-Sharing System.
538.I
539Comm. ACM
540.B
54117
542.R
543(July 1974),
544pp. 365-375.
545.IP [2]
546.I
547Proposed Federal Information Processing Data Encryption Standard.
548.R
549Federal Register (40FR12134), March 17, 1975
550.IP [3]
551Wilkes, M. V.
552.I
553Time-Sharing Computer Systems.
554.R
555American Elsevier,
556New York, (1968).
557.IP [4]
558U. S. Patent Number 2,089,603.