Linux 高级编程 - 必须掌握的 5 个底层 IO 函数

版权声明：本文为 DLonng 原创文章，可以随意转载，但必须在明确位置注明出处！

底层输入输出（Low-Level Input/Output）

这篇博客主要介绍 Linux 原生的 IO 操作（Low IO），你可能会想不是有跨平台的 ANSI C 可以使用么，为啥还要学习底层 IO ? 有以下 4 个原因：

用于读取大块的二进制文件
在解析之前将整个文件读入内核
执行数据传输以外的操作
将描述符传递给子进程

你需要知道这 4 个使用底层 IO 的原因，在以后遇到实际的情况时能够想到利用底层 IO 来解决。因为底层输入输出的函数也有很多，这篇博客主要介绍 5 个最常用，最基本的底层 IO 函数：

打开文件：open
关闭文件：close
读取文件：read
写入文件：write
操作文件指针：lseek

如果你还有兴趣学习其他的底层 IO 函数，建议你查看 glibc 的官方底层 IO 的学习资料，那是最好，最权威的资料，下面就一起来看看这 5 个函数的用法。

open & close

我们操作 IO 首先要学会的就是打开和关闭文件，我们使用 open 和 close 这两个函数，他们的声明如下（man 2 open）：

open

// open 需要 3 个头文件
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/*
 * filename: 要打开的文件名称
 * flags: 打开标记，例如：O_CREAT | ORDWR 表示文件不存在就创建，并且可读可写
 * mode: 打开权限
 * return: 成功，返回一个新的文件描述符 fd，失败返回 -1，并设置 errno
 */
int open (const char *filename, int flags, mode t mode);

打开文件也可以用 create，不过这个函数已经弃用了！

close

再来看看 close，关闭文件会有如下结果：

文件描述符被取消分配
文件上进程所拥有的任何记录锁都将被解锁
当与管道或 FIFO 相关联的所有文件描述符都已关闭时，任何未读数据被丢弃

#include <unistd.h>

/*
 * fd: 要关闭的文件的描述符
 * return: 成功返回 0，失败返回 -1，并设置 errno
 */
int close(int fd);

实例 1: open_close.c

我们来看一个简单的例子：

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char* argv[]) {
	if (argc < 2) {
		printf("please input filename:\n");
		exit(1);
	}

	// argv[1] = "1.txt", 打开的文件不存在就创建 | 可读可写，666（rw-rw-rw）
	int fd = open(argv[1], O_CREAT | O_RDWR, 666);

	if (fd < 0) {
		printf("file open fail.\n");
		exit(1);
	} else {
		printf("file open success.\n");
		// 必须关闭文件
		close(fd);
		printf("file closed.\n");
	}

	return 0;
}

编译：

gcc open_close.c -o open_close

运行，我们没有新建 1.txt 文件：

./open_close 1.txt
file open success.
file close success.

文件打开成功，并且当前目录下也有了一个 1.txt 文件了，说明我们指定的 O_CREAT 标记使得程序建立了这个文件。这两个函数基本用法就是这样，更多更详细的用法还需要你自己到 glibc 官网或者 man 2 open 去找。

read & write

我们在打开文件后肯定会做的就是读写文件了，不然你打开文件干嘛，我们来看看读文件的 API：

read

read 函数从文件描述符 filedes 指定的文件中读取 size 个字节，存储到 buffer 中。

#include <unistd.h>

/*
 * filedes: 要读取的文件描述符
 * buffer: 存储读取字节的缓冲区
 * size: 要读取的大小
 * return: 成功返回读取的字节数，失败返回 -1，并设置 errno
 */
ssize_t read (int filedes, void *buffer, size t size);

write

write 函数从 buffer 中取出 size 个字节的数据，写到 filedes 描述符表示的文件中。

#include <unistd.h>

/*
 * filedes: 写入的文件描述符
 * buffer: 存储待写入数据的缓存区
 * size: 要写入的字节数
 * return: 成功返回写入的字节数，失败返回 -1，并设置 errno
 */
ssize_t write (int filedes, const void *buffer, size t size);

实例 2: read_write.c

来看一个简单的读写文件的例子：读取 file1 的内容，写到 file2 中，相当于文件拷贝

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <string.h>

int main(int argc, char* argv[]) {
	// 1. 打开两个文件
	int fd1 = open(argv[1], O_CREAT | O_RDWR, 0666);
	int fd2 = open(argv[2], O_CREAT | O_RDWR, 0666);

	if ((fd1 < 0) || (fd2 < 0)) {
		printf("file open fail.\n");
		exit(1);
	} else {
		printf("file1 open success: fd1 = %d\n", fd1);
		printf("file2 open success: fd2 = %d\n", fd2);

		char buf[1024];
		// clear
		memset(buf, 0, 1024);

		int read_length = 0;
		int write_length = 0;

		// 2. 将文件 1 的内容读取到 buf 中
		if ((read_length = read(fd1, buf, 1024)) != -1)  {
			// 3. 将 buf 的内容写入到文件 2 中
			if (read_length == write(fd2, buf, read_length)) 
				printf("write to fd2 success.\n");
			else 
				printf("write to fd2 falil.\n");
		} else {
			printf("read fd1 fail.\n");
		}
		
		// 必须关闭 2 个文件
		close(fd1);
		close(fd2);

		printf("file1 close success.\n");
		printf("file2 close success.\n");
	}

	return 0;
}

编译：

gcc read_write.c -o read_write

运行，1.txt 的内容是 hello world：

./read_write 1.txt 2.txt

file1 open success: fd1 = 3
file2 open success: fd2 = 4
write to fd2 success.
file1 close success.
file2 close success.

写入成功，查看下 2.txt 的内容：

cat 2.txt

hello world

发现成功写入 hello world 到 2.txt 文件了，并且注意到 fd1 = 3, fd2 = 4，这也说明前面的 3 个文件描述符被系统使用了。

lseek

lseek 用来移动文件指针，什么是文件指针呢？你可以理解为当前读取或者写入的位置，我们移动这个指针可以控制读取或者写入数据的位置，声明如下：

#include <sys/types.h>
#include <unistd.h>

/*
 * filedes: 要操作的文件描述符
 * offset: 根据当前 whence 的偏移量
 * whence: 指定当前文件指针的位置
 * return: 成功返回设置后的文件位置，可以使用 `SEEK_CUR` 查看当前文件指针位置，
 *         失败返回 -1 并设置 errno
 */
off_t lseek (int filedes, off t offset, int whence);

其中 whence 参数需要特别注意，它有 3 种情况：

SEEK_SET：设置文件指针指向文件开始并偏移 offset 字节处
SEEK_CUR：设置文件指针只想当前位置偏移 offset 字节处
SEEK_END：设置文件指针指向文件末尾偏移 offset 字节处

实例 3：file_length.c

我们可以使用 int file_length = lseek(fd, 0, SEEK_END) 来求文件的长度，这个操作经常被使用。

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char* argv[]) {
	// 打开文件
	int fd = open(argv[1], O_CREAT | O_RDWR, 0666);

	if (fd < 0) {
		printf("file open fail.\n");
		exit(1);
	} else {
		printf("file1 open success: fd = %d\n", fd);

		// 得到文件长度
		int file_length = lseek(fd, 0, SEEK_END);
		
		// 输出文件长度
		printf("file_length = %d\n", file_length);

		// 关闭文件
		close(fd);
		printf("file1 close success.\n");
	}

	return 0;
}

编译：

gcc file_length.c -o file_length

运行，注意到 1.txt 中的 hello world 加上 ‘\0’ 一共 12 个字符：

./file_length 1.txt
file1 open success: fd = 3
file_length = 12
file1 close success.

打印出 12，计算正确啦。

结语

到此，我们就学习了 6 个底层的 IO 函数，并实际练习了 3 个例子，把这 3 个例子搞清楚，基本的用法也就掌握的差不多了，更加详细的用法可以查看系统提供的 man 手册 man 2 open 等，或者查阅 glibc 官方文档，祝你学习愉快。

最后，感谢你的阅读，我们下次再见 :)