This section describes the
bottom level of I/O on the UNIX system.
The lowest level of I/O in UNIX
provides no buffering or any other services;
it is in fact a direct entry into the operating system.
You are entirely on your own,
but on the other hand,
you have the most control over what happens.
And since the calls and usage are quite simple,
this isn't as bad as it sounds.
In the UNIX system,
all input and output is done
by reading or writing files,
because all peripheral devices, even the user's terminal,
are files in the file system.
This means that a single, homogeneous interface
handles all communication between a program and peripheral devices.
In the most general case,
before reading or writing a file,
it is necessary to inform the system
of your intent to do so,
a process called ``opening'' the file.
If you are going to write on a file,
it may also be necessary to create it.
The system checks your right to do so
(Does the file exist?
Do you have permission to access it?),
and if all is well,
returns a small positive integer
called a file descriptor.
Whenever I/O is to be done on the file,
the file descriptor is used instead of the name to identify the file.
(This is roughly analogous to the use of
All information about an open file is maintained by the system;
the user program refers to the file
only by its file descriptor.
The file pointers discussed in section 3
are similar in spirit to file descriptors,
but file descriptors are more fundamental.
A file pointer is a pointer to a structure that contains,
among other things, the file descriptor for the file in question.
Since input and output involving the user's terminal
is so common,
special arrangements exist to make this convenient.
When the command interpreter (the
``shell'')
runs a program,
it opens
three files, with file descriptors 0, 1, and 2,
called the standard input,
the standard output, and the standard error output.
All of these are normally connected to the terminal,
so if a program reads file descriptor 0
and writes file descriptors 1 and 2,
it can do terminal I/O
without worrying about opening the files.
If I/O is redirected to and from files with < and >,
the shell changes the default assignments for file descriptors
0 and 1
from the terminal to the named files.
Similar observations hold if the input or output is associated with a pipe.
Normally file descriptor 2 remains attached to the terminal,
so error messages can go there.
In all cases,
the file assignments are changed by the shell,
not by the program.
The program does not need to know where its input
comes from nor where its output goes,
so long as it uses file 0 for input and 1 and 2 for output.
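The convention can be sketched in present-day C as follows. This is a minimal illustration, assuming a POSIX system; the helper names emit and report are hypothetical, and the constants STDOUT_FILENO and STDERR_FILENO from <unistd.h> are simply the numbers 1 and 2.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a C string to the given file descriptor.
   Returns the byte count reported by write, or -1 on error. */
long emit(int fd, const char *msg)
{
    return (long)write(fd, msg, strlen(msg));
}

/* Normal output goes to descriptor 1, diagnostics to descriptor 2. */
void report(const char *result, const char *warning)
{
    emit(STDOUT_FILENO, result);   /* descriptor 1 */
    emit(STDERR_FILENO, warning);  /* descriptor 2 */
}
```

Because the program names only descriptor numbers, the same code works whether the shell has attached them to the terminal, a file, or a pipe.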
All input and output is done by
two functions called read and write.
For both, the first argument is a file descriptor.
The second argument is a buffer in your program where the data is to
come from or go to.
The third argument is the number of bytes to be transferred.
The calls are
n_read = read(fd, buf, n);
n_written = write(fd, buf, n);
Each call returns a byte count
which is the number of bytes actually transferred.
On reading,
the number of bytes returned may be less than
the number asked for,
because fewer than n
bytes remained to be read.
(When the file is a terminal,
read
normally reads only up to the next newline,
which is generally less than what was requested.)
A return value of zero bytes implies end of file,
and -1
indicates an error of some sort.
For writing, the returned value is the number of bytes
actually written;
it is generally an error if this isn't equal
to the number supposed to be written.
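One way to act on the returned count is sketched below. It is an illustration only, written in present-day C for a POSIX system; the helper write_all is hypothetical. Instead of treating a short count as fatal, it keeps writing the remainder, and it gives up only when write itself reports an error.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write until all n bytes are out.
   Returns 0 on success, -1 if write reports an error. */
int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t k = write(fd, buf, n);
        if (k < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            return -1;
        }
        buf += k;               /* skip what was actually written */
        n -= (size_t)k;
    }
    return 0;
}
```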
The number of bytes to be read or written is quite arbitrary.
The two most common values are
1,
which means one character at a time
(``unbuffered''),
and 512,
which corresponds to a physical blocksize on many peripheral devices.
This latter size will be most efficient,
but even character at a time I/O
is not inordinately expensive.
Putting these facts together,
we can write a simple program to copy
its input to its output.
This program will copy anything to anything,
since the input and output can be redirected to any file or device.
#define BUFSIZE 512 /* best size for PDP-11 UNIX */
main() /* copy input to output */
{
	char buf[BUFSIZE];
	int n;	/* bytes read by the last call */
	while ((n = read(0, buf, BUFSIZE)) > 0)
		write(1, buf, n);
}
If the file size is not a multiple of
BUFSIZE,
some read
will return a smaller number of bytes
to be written;
the next call to read
after that will return zero.
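The same loop can be packaged as a function that works on any pair of descriptors. What follows is a sketch in present-day C, assuming a POSIX system; the name copy_fd and its return convention are hypothetical. Unlike the program above, it checks the count returned by write.

```c
#include <unistd.h>

#define BUFSIZE 512   /* same block size as in the text */

/* Hypothetical helper: copy everything from descriptor in to descriptor out.
   Returns the number of bytes copied, or -1 on a read or write error. */
long copy_fd(int in, int out)
{
    char buf[BUFSIZE];
    long total = 0;
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (write(out, buf, (size_t)n) != n)
            return -1;          /* short write treated as an error */
        total += n;
    }
    return n < 0 ? -1 : total;  /* n == 0 means end of file */
}
```

Calling copy_fd(0, 1) reproduces the behavior of the program in the text.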
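It is instructive to see how
read and write
can be used to construct
higher level routines like
getchar and putchar.
For example, here is a version of getchar
which does unbuffered input.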
#define CMASK 0377 /* for making char's > 0 */
getchar() /* unbuffered single character input */
{
	char c;
	return((read(0, &c, 1) > 0) ? c & CMASK : EOF);
}
c must be declared char, because read
accepts a character pointer.
The character being returned must be masked with
0377
to ensure that it is positive;
otherwise sign extension may make it negative.
(The constant 0377
is appropriate for the PDP-11
but not necessarily for other machines.)
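The effect of the mask can be seen in isolation. The following is a small sketch, assuming 8-bit bytes; the helper name char_to_int is hypothetical. On a machine where char is signed, the byte 0377 stored in a char would otherwise come back as a negative number, which a caller could mistake for an end-of-file indicator.

```c
/* Hypothetical helper: convert a char to a non-negative int, as the
   masking in getchar does. Assumes 8-bit bytes. */
int char_to_int(char c)
{
    return c & 0377;    /* keep only the low 8 bits */
}
```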
The second version of getchar
does input in big chunks,
and hands out the characters one at a time.
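#define CMASK 0377 /* for making char's > 0 */
getchar() /* buffered version */
{
	static char buf[BUFSIZE];
	static char *bufp = buf;	/* next character to hand out */
	static int n = 0;	/* characters left in buf */
	if (n == 0) { /* buffer is empty */
		n = read(0, buf, BUFSIZE);
		bufp = buf;
	}
	return((--n >= 0) ? *bufp++ & CMASK : EOF);
}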
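A variation on the buffered routine keeps its state in a structure instead of in static variables, so that several descriptors can be read at once. This is only a sketch in present-day C for a POSIX system; struct reader, reader_init, reader_getc and END_OF_INPUT are hypothetical names. Declaring the buffer as unsigned char makes the mask unnecessary.

```c
#include <unistd.h>

#define BUFSIZE 512
#define END_OF_INPUT (-1)   /* stands in for EOF in this sketch */

/* Hypothetical per-descriptor buffer state. */
struct reader {
    int fd;                     /* descriptor to read from */
    int n;                      /* characters left in buf */
    unsigned char *bufp;        /* next character to hand out */
    unsigned char buf[BUFSIZE];
};

void reader_init(struct reader *r, int fd)
{
    r->fd = fd;
    r->n = 0;
    r->bufp = r->buf;
}

/* Return the next character as a non-negative int, or END_OF_INPUT. */
int reader_getc(struct reader *r)
{
    if (r->n <= 0) {            /* buffer is empty: refill it */
        r->n = (int)read(r->fd, r->buf, BUFSIZE);
        r->bufp = r->buf;
        if (r->n <= 0) {
            r->n = 0;
            return END_OF_INPUT;    /* end of file or error */
        }
    }
    r->n--;
    return *r->bufp++;          /* unsigned char, so no masking needed */
}
```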
Open, Creat, Close, Unlink
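Other than the default
standard input, output and error files,
you must explicitly open files in order to
read or write them.
There are two system entry points for this,
open and creat.
open is rather like the fopen
discussed in the previous section,
except that instead of returning a file pointer,
it returns a file descriptor,
which is just an int.
In the call open(name, rwmode),
name
is a character string corresponding to the external file name.
The access mode rwmode
is 0 for read, 1 for write, and 2 for read and write access.
open returns -1 if any error occurs;
otherwise it returns a valid file descriptor.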
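It is an error to try to open
a file that does not exist.
The entry point creat
is provided to create new files,
or to re-write old ones.
The call creat(name, pmode)
returns a file descriptor
if it was able to create the file
called name, and -1 if not.
If the file already exists, creat
will truncate it to zero length;
it is not an error to creat
a file that already exists.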
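If the file is brand new,
creat creates it with the protection mode
specified by the pmode argument.
In the UNIX file system,
there are nine bits of protection information
associated with a file,
controlling read, write and execute permission for
the owner of the file, for the owner's group, and for all others.
Thus a three-digit octal number
is most convenient for specifying the permissions.
For example, 0755
specifies read, write and execute permission for the owner,
and read and execute permission for the group and everyone else.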
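To make the nine bits concrete, here is a small sketch that spells a mode out as the familiar three groups of r, w and x. The helper name mode_string is hypothetical; the bit layout assumed is the one just described, with the owner's read permission as the highest of the nine bits.

```c
/* Hypothetical helper: render the nine protection bits of mode as
   "rwxrwxrwx" with '-' for each bit that is off.
   out must have room for 10 characters. */
void mode_string(unsigned mode, char out[10])
{
    const char letters[] = "rwxrwxrwx";
    int i;

    for (i = 0; i < 9; i++) {
        unsigned bit = 0400u >> i;      /* owner read is the top bit */
        out[i] = (mode & bit) ? letters[i] : '-';
    }
    out[9] = '\0';
}
```

With this helper, 0755 comes out as rwxr-xr-x and 0644 as rw-r--r--.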
To illustrate,
here is a simplified version of
the UNIX utility cp,
a program which copies one file to another.
(The main simplification is that our version
copies only one file,
and does not permit the second argument
to be a directory.)
#define PMODE 0644 /* RW for owner, R for group, others */
main(argc, argv) /* cp: copy f1 to f2 */
int argc; char *argv[];
{
	int f1, f2, n;
	char buf[BUFSIZE];
	if (argc != 3)
		error("Usage: cp from to", NULL);
	if ((f1 = open(argv[1], 0)) == -1)
		error("cp: can't open %s", argv[1]);
	if ((f2 = creat(argv[2], PMODE)) == -1)
		error("cp: can't create %s", argv[2]);
	while ((n = read(f1, buf, BUFSIZE)) > 0)
		if (write(f2, buf, n) != n)
			error("cp: write error", NULL);
}
error(s1, s2) /* print error message and die */
char *s1, *s2;
{
	printf(s1, s2); printf("\n");
	exit(1);
}
There is a limit (typically 15-25)
on the number of files which a program
may have open simultaneously.
Accordingly, any program which intends to process
many files must be prepared to re-use
file descriptors.
The routine close
breaks the connection between a file descriptor
and an open file, and frees the
file descriptor for use with some other file.
Termination of a program via exit
or return from the main program closes all open files.
The function unlink
removes a file name from the file system.
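The four entry points fit together as in the sketch below, which creates a file, writes to it, closes it, reopens it for reading, and finally removes it. It is written in present-day C for a POSIX system; the helper name round_trip is hypothetical, and O_RDONLY from <fcntl.h> is the symbolic name for access mode 0 on typical systems.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create path, write msg (n bytes) to it, read it
   back into out, then remove the file. Returns bytes read back, or -1. */
int round_trip(const char *path, const char *msg, int n, char *out)
{
    int fd, got;

    if ((fd = creat(path, 0644)) == -1)     /* create or truncate */
        return -1;
    if (write(fd, msg, (size_t)n) != n) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);                              /* free the descriptor */

    if ((fd = open(path, O_RDONLY)) == -1) {  /* read access */
        unlink(path);
        return -1;
    }
    got = (int)read(fd, out, (size_t)n);
    close(fd);
    unlink(path);                           /* remove the file name */
    return got;
}
```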
Random Access: Seek and Lseek
File I/O is normally sequential:
each read or write
takes place at a position in the file
right after the previous one.
When necessary, however,
a file can be read or written in any arbitrary order.
The system call lseek
provides a way to move around in
a file without actually reading
or writing:
lseek(fd, offset, origin);
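forces the current position in the file
whose descriptor is fd
to move to position offset,
which is taken relative to the location
specified by origin.
Subsequent reading or writing will begin at that position.
offset is a long;
fd and origin are int's.
origin
can be 0, 1, or 2 to specify that
offset is to be measured from
the beginning, from the current position, or from the
end of the file respectively.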
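For example, to append to a file,
seek to the end before writing,
with lseek(fd, 0L, 2).
To get back to the beginning (``rewind''),
use lseek(fd, 0L, 0).
Notice the 0L argument;
it could also be written as
(long) 0.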
With lseek,
it is possible to treat files more or less like large arrays,
at the price of slower access.
For example, the following simple function reads any number of bytes
from any arbitrary place in a file.
get(fd, pos, buf, n) /* read n bytes from position pos */
int fd, n; long pos; char *buf;
{
	lseek(fd, pos, 0); /* get to pos */
	return(read(fd, buf, n));
}
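In earlier versions of UNIX,
the basic entry point to the I/O system
is called seek.
seek is identical to lseek,
except that its offset argument is an int rather than a long.
Accordingly,
since PDP-11
integers have only 16 bits,
the offset that can be specified is limited;
for this reason, origin values of 3, 4, 5 cause seek
to multiply the given offset by 512
(the number of bytes in one physical block)
and then interpret origin
as if it were 0, 1, or 2 respectively.
Thus to get to an arbitrary place in a large file
requires two seeks, first one which selects
the block, then one which
has origin
equal to 1 and moves to the desired byte within the block.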
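The arithmetic behind the two seeks is simple division by the block size. The sketch below shows only that arithmetic, since the old seek call is not available on current systems; the helper name split_offset is hypothetical. The block number would be the offset for the first, block-selecting seek, and the byte count the offset for the second seek with origin 1.

```c
#define BLOCK 512   /* bytes per physical block, as in the text */

/* Hypothetical helper: split a byte position into the block number and
   the byte within that block, the two offsets the pair of seeks needs. */
void split_offset(long pos, int *block, int *byte)
{
    *block = (int)(pos / BLOCK);
    *byte = (int)(pos % BLOCK);
}
```

For position 100000, this gives block 195 and byte 160, since 195 times 512 is 99840.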
The routines discussed in this section,
and in fact all the routines which are direct entries into the system
can incur errors.
Usually they indicate an error by returning a value of -1.
Sometimes it is nice to know what sort of error occurred;
for this purpose all these routines, when appropriate,
leave an error number in the external cell
errno.
The meanings of the various error numbers are
listed
in the introduction to Section II
of the UNIX Programmer's Manual,
so your program can, for example, determine if
an attempt to open a file failed because it did not exist
or because the user lacked permission to read it.
Perhaps most commonly,
you may want to print out the
reason for failure.
The routine perror
will print a message associated with the value
of errno;
more generally, sys_errlist
is an array of character strings which can be indexed
by errno
and printed by your program.
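A present-day sketch of this kind of error handling follows, assuming a POSIX system, where the external error number is called errno. The helper name open_or_explain is hypothetical. The symbolic codes ENOENT and EACCES come from <errno.h>, and the standard function strerror is used here to turn an error number into a message string.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open path for reading; on failure, say why.
   Returns the descriptor, or -1 with errno left as open set it. */
int open_or_explain(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;      /* fprintf may disturb errno */

        if (saved == ENOENT)
            fprintf(stderr, "%s: no such file\n", path);
        else if (saved == EACCES)
            fprintf(stderr, "%s: permission denied\n", path);
        else
            fprintf(stderr, "%s: %s\n", path, strerror(saved));
        errno = saved;
    }
    return fd;
}
```

The error number is meaningful only right after a call has returned -1, which is why the sketch copies it before doing anything else.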