How to Use File Descriptors to Read from and Write to Multiple Files in Linux
File Input and Output
A script can receive input from a file and send output to a file so that the script can run without user interaction. Later, a user can review the output file, or another script can use the output file as its input.
In the shell, file input and output go through file descriptors: integer handles that the kernel uses to track every open file in a process. The best-known file descriptors are 0 (stdin), 1 (stdout), and 2 (stderr).
The numbers 3 through 9 are for programmer-defined file descriptors. Use these to associate numeric values to pathnames. In a program, if there is a need to read or write multiple times to the same file, a shorthand file descriptor value might reduce errors in referencing the file name. Also, a change to the file name must be done only once if the file is subsequently accessed through the file descriptor. The table below shows the syntax for file redirection.
Command | Description |
---|---|
< file | Takes standard input from file |
0< file | Takes standard input from file |
> file | Puts standard output to file |
1> file | Puts standard output to file |
2> file | Puts standard error to file |
exec fd> /some/filename | Assigns the file descriptor fd to /some/filename for output |
exec fd< /some/filename | Assigns the file descriptor fd to /some/filename for input |
read <&fd var1 | Reads from the file descriptor fd and stores into variable var1 |
cmd >& fd | Executes cmd and sends output to the file descriptor fd |
exec fd<&- | Closes the file descriptor fd |
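As a rough illustration of the first rows of the table, the sketch below redirects each of the three standard descriptors to a file. The working directory and the file names (in.txt, out.txt, err.txt, no-such-file) are hypothetical and are created with mktemp only for this example.

```shell
#!/bin/sh
# Sketch: the three standard descriptors redirected to files.
# All paths are hypothetical and live in a throwaway directory.
workdir=$(mktemp -d)

# "> file" is shorthand for "1> file": standard output goes to in.txt
printf 'alpha\nbeta\n' > "$workdir/in.txt"

# "0< file" is the explicit form of "< file": wc reads stdin from in.txt
line_count=$(wc -l 0< "$workdir/in.txt")

# "1>" captures standard output and "2>" captures standard error.
# ls exits non-zero because one name does not exist, hence "|| :".
ls "$workdir/in.txt" "$workdir/no-such-file" \
    1> "$workdir/out.txt" 2> "$workdir/err.txt" || :

echo "lines read from stdin: $line_count"
echo "stdout captured: $(cat "$workdir/out.txt")"
if [ -s "$workdir/err.txt" ]; then
    echo "stderr captured: yes"
fi
```

The error message that ls writes differs between systems, so the sketch only checks that err.txt is non-empty rather than printing it.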
User-Defined File Descriptors
You can use a file descriptor to assign a numeric value to a file instead of using the file name. The exec command is one of the built-in commands in a shell. Use this command to assign a file descriptor to a file. The syntax is:
# exec fd> filename
# exec fd< filename
No spaces are allowed between the file descriptor number (fd) and the redirection symbol (> for output, < for input). After a file descriptor is assigned to a file, you can use the descriptor with the shell redirection operators. On output, if the file does not exist, it is created. If it does exist, it is emptied. On input, if the file does not exist, an error occurs.
# command >&fd
# command <&fd
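The following minimal sketch puts the two exec forms and the two redirection forms together. The descriptor numbers 3 and 4 and the temporary log file are arbitrary choices for illustration, not names from the article.

```shell
#!/bin/sh
# Sketch: open, use, and close user-defined descriptors.
# The log file is a hypothetical temporary file.
logfile=$(mktemp)

exec 3> "$logfile"        # open fd 3 for output; the file is emptied
echo "first line"  >&3    # each write goes through fd 3
echo "second line" >&3
exec 3>&-                 # close fd 3

exec 4< "$logfile"        # open fd 4 for input
read -r first  <&4        # each read consumes the next line
read -r second <&4
exec 4<&-                 # close fd 4

echo "$first / $second"   # prints: first line / second line
```

The file name appears only in the two exec statements, so renaming the file means changing it in one place per descriptor. Both `fd>&-` and `fd<&-` close a descriptor; the sketch uses the form that matches how the descriptor was opened.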
The file descriptor assigned to a file is valid in the current shell only.
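One way to see this scoping is with a subshell, as in the sketch below. It assumes that descriptor 9 is free to use and writes to a hypothetical temporary file; it shows only that a descriptor opened inside a parenthesized subshell does not exist in the shell that started it.

```shell
#!/bin/sh
# Sketch: a descriptor opened in a subshell is not visible to the parent.
# fd 9 and the temporary file are arbitrary choices for illustration.
tmpfile=$(mktemp)
exec 9>&-                  # make sure fd 9 starts out closed

# fd 9 is opened and used only inside the ( ) subshell
( exec 9> "$tmpfile"; echo "from subshell" >&9 )

# Back in the parent shell, fd 9 was never opened, so the write fails.
# The braces let 2>/dev/null hide the shell's own error message.
if { echo "from parent" >&9; } 2>/dev/null; then
    parent_write=succeeded
else
    parent_write=failed
fi

echo "parent write $parent_write"   # prints: parent write failed
cat "$tmpfile"                      # prints: from subshell
```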
File Descriptors in the Bourne Shell
The following example reads and writes through file descriptors using the Bourne shell syntax for the read and echo statements. This syntax works in both the Bourne shell and the Korn shell.
Try doing the following:
- Copy the /etc/hosts file to the /tmp/hosts2 file.
- Use grep to read the /tmp/hosts2 file, strip out the comment lines, and send the output to the /tmp/hosts3 file.
- Assign file descriptor 3 to the /tmp/hosts3 file for input, and then each read statement issued to file descriptor 3 reads a record from the file. The statement used to associate file descriptor 3 to the input file named /tmp/hosts3 is: “exec 3< /tmp/hosts3”
- Assign file descriptor 4 to the /tmp/hostsfinal file for output. If the output file does not exist, it is created. If it does exist, its size truncates to 0 bytes. The statement that associates file descriptor 4 to the output file is: “exec 4> /tmp/hostsfinal”
- Read the /tmp/hosts3 file and output two fields to the /tmp/hostsfinal file. Reading from the input file is accomplished by redirecting the input of the read statement from the file descriptor with the <& operator. Writing to the output file is accomplished by specifying the file descriptor number after the >& operator of the echo statement, which writes the output as one line in the output file. The output file is written sequentially.
- Close the input file when it is no longer needed. This is good practice, although the OS closes the file automatically when the process terminates if the program fails to close it.
- When all writes to the output file are complete, close the file.
$ cat readex.sh
#!/bin/sh
# Script name: readex.sh
##### Step 1 - Copy /etc/hosts
cp /etc/hosts /tmp/hosts2
##### Step 2 - Strip out comment lines
grep -v '^#' /tmp/hosts2 > /tmp/hosts3
##### Step 3 - fd 3 is input file /tmp/hosts3
exec 3< /tmp/hosts3
##### Step 4 - fd 4 is output file /tmp/hostsfinal
exec 4> /tmp/hostsfinal
##### The following 4 statements accomplish STEP 5
read <& 3 addr1 name1 alias # Read from fd 3
read <& 3 addr2 name2 alias # Read from fd 3
echo $name1 $addr1 >& 4 # Write to fd 4 (do not write aliases)
echo $name2 $addr2 >& 4 # Write to fd 4 (do not write aliases)
##### END OF STEP 5 statements
exec 3<&- # Step 6 - close fd 3
exec 4<&- # Step 7 - close fd 4
$ ./readex.sh
$ more /tmp/hosts2
#
# Internet host table
#
127.0.0.1 localhost
192.9.200.111 ultrabear loghost
192.9.200.121 ladybear
$ more /tmp/hosts3
127.0.0.1 localhost
192.9.200.111 ultrabear loghost
192.9.200.121 ladybear
$ more /tmp/hostsfinal
localhost 127.0.0.1
ultrabear 192.9.200.111
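The script above issues exactly two read statements, so the third record (ladybear) never reaches the output file. A possible variant, sketched below, loops until read reaches the end of the file. To stay self-contained it uses hypothetical temporary files in place of /tmp/hosts3 and /tmp/hostsfinal and fills the input with the three sample records from the listing.

```shell
#!/bin/sh
# Sketch: loop variant of readex.sh that handles every record.
# Temporary files stand in for /tmp/hosts3 and /tmp/hostsfinal.
infile=$(mktemp)
outfile=$(mktemp)
printf '%s\n' '127.0.0.1 localhost' \
              '192.9.200.111 ultrabear loghost' \
              '192.9.200.121 ladybear' > "$infile"

exec 3< "$infile"      # fd 3: input
exec 4> "$outfile"     # fd 4: output

# read returns non-zero at end of file, which ends the loop
while read -r addr name alias <&3; do
    echo "$name $addr" >&4     # swap the fields, drop any aliases
done

exec 3<&-              # close the input descriptor
exec 4>&-              # close the output descriptor

cat "$outfile"
```

With the sample records, the output file holds three lines: localhost 127.0.0.1, ultrabear 192.9.200.111, and ladybear 192.9.200.121.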
Korn Shell File Descriptors
The following example shows how the previous example would be written in the Korn shell. In the Korn shell, the read statement can take a -ufd option in place of the <& fd redirection.
Likewise, the print statement takes a -ufd option to direct its output to a file descriptor, in this case file descriptor 4.
$ cat readex.ksh
#!/bin/ksh
# Script name: readex.ksh
##### Step 1 - Copy /etc/hosts
cp /etc/hosts /tmp/hosts2
##### Step 2 - Strip out comment lines
grep -v '^#' /tmp/hosts2 > /tmp/hosts3
##### Step 3 - fd 3 is an input file /tmp/hosts3
exec 3< /tmp/hosts3
##### Step 4 - fd 4 is an output file /tmp/hostsfinal
exec 4> /tmp/hostsfinal
##### The following 4 statements accomplish STEP 5
read -u3 addr1 name1 alias # read from fd 3
read -u3 addr2 name2 alias # read from fd 3
print -u4 $name1 $addr1 # write to fd 4 (do not write aliases)
print -u4 $name2 $addr2 # write to fd 4 (do not write aliases)
##### END OF STEP 5 statements
exec 3<&- # Step 6 - close fd 3
exec 4<&- # Step 7 - close fd 4
$ ./readex.ksh
$ more /tmp/hosts2
#
# Internet host table
#
127.0.0.1 localhost
192.9.200.111 ultrabear loghost
192.9.200.121 ladybear
$ more /tmp/hosts3
127.0.0.1 localhost
192.9.200.111 ultrabear loghost
192.9.200.121 ladybear
$ more /tmp/hostsfinal
localhost 127.0.0.1
ultrabear 192.9.200.111