Week 9 · Unix Files & Storage Interfaces
From Descriptors to Directories
System call pipelines, metadata, hard/soft links, and directory walking aligned with lab & exam scenarios
🎯 Learning Objectives
- Explain how file descriptors abstract open files and how they relate to stdio streams.
- Use the low-level system calls (open, read, write, close) safely for streaming I/O.
- Inspect metadata with stat and interpret permissions, link counts, and special file types.
- Compare hard and symbolic links and reason about inode sharing.
- Traverse directories recursively while avoiding cycles and respecting sandbox rules.
🧭 Exam Alignment
- Descriptor tracing: 22T3 Q7 variants require mapping duplicated and inherited fds.
- Metadata Q&A: Lab09 “stat_report” prepares for questions about permissions and link counts.
- Directory walkers: past finals ask you to detect cycles and skip hidden entries.
- Link semantics: 20T2 Q6 discusses the effects of rename/link/unlink on inodes.
- Streaming vs buffered I/O: Tutorial09 compares read/write with stdio buffering.
Coverage goals: assemble ≥3 file-descriptor diagrams, catalogue metadata fields, and document at least two directory traversal patterns with cycle guards.
📚 Core Concepts
File Descriptor Basics
A file descriptor is a small non-negative integer that indexes the process's table of open files in the kernel. Descriptors 0, 1, and 2 are stdin, stdout, and stderr. Duplicating a descriptor with dup2 is what makes redirection possible.
Low-level System Calls
open returns a descriptor; read and write transfer bytes and may transfer fewer than requested; close releases the kernel resources. Always check return values.
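The read/write loop this card describes can be sketched as follows. This is a minimal sketch, not the Lab09 reference solution: the helper names write_all and copy_fd and the 4096-byte buffer are assumptions made for illustration. It retries on EINTR and keeps writing until a short write has been completed.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes, retrying on short
 * writes and on EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR) continue;   /* interrupted by a signal: retry */
            return -1;
        }
        buf += n;                           /* skip the bytes already written */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copy everything from in_fd to out_fd.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd(int in_fd, int out_fd) {
    char buf[4096];                         /* buffer size is an arbitrary choice */
    long total = 0;
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0) break;                  /* 0 means end of file */
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        if (write_all(out_fd, buf, (size_t)n) < 0) return -1;
        total += n;
    }
    return total;
}
```

A caller would open the source with O_RDONLY and the destination with O_WRONLY | O_CREAT | O_TRUNC, pass both descriptors to copy_fd, and check the return value of close on the destination as well.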
Metadata & Permissions
stat reports size, ownership, timestamps, mode bits, link count, and device IDs. Make sure you can read octal permissions and the special bits (setuid, sticky).
Hard vs Symbolic Links
A hard link is another name for the same inode, so creating one increases the link count, and it cannot cross filesystems. A symlink is a separate inode that stores path text; it can span filesystems and does not change the target's link count.
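The link-count behaviour can be checked with a short experiment. The sketch below is illustrative only: link_count and link_experiment are hypothetical names, and the three paths are supplied by the caller. It uses lstat so that a symlink passed in would be examined itself rather than followed.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: link count of path (a final symlink is not
 * followed), or -1 on error. */
long link_count(const char *path) {
    struct stat st;
    if (lstat(path, &st) != 0) return -1;
    return (long)st.st_nlink;
}

/* Hypothetical experiment: create orig, add a hard link and a symlink to it,
 * and record orig's link count after each step. Returns 0 on success. */
int link_experiment(const char *orig, const char *hard, const char *soft,
                    long counts[3]) {
    int fd = open(orig, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) return -1;
    close(fd);
    counts[0] = link_count(orig);   /* one name refers to the inode */

    if (link(orig, hard) != 0) return -1;
    counts[1] = link_count(orig);   /* the hard link shares the inode */

    if (symlink(orig, soft) != 0) return -1;
    counts[2] = link_count(orig);   /* the symlink has its own inode */
    return 0;
}
```

On a typical local filesystem the recorded counts are 1, 2, 2. Calling unlink on the original name afterwards drops the count to 1, and the data stays reachable through the hard link.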
Directory Traversal
Use opendir/readdir/closedir, or ftw/nftw. Skip “.” and “..”, and detect cycles by recording the (device, inode) pair of every directory you enter.
Streams vs Syscalls
stdio buffers data in user space for convenience. If you mix a stdio stream with raw read/write calls on the same descriptor, flush the stream first so the output stays in order.
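The ordering problem is easiest to see with one stream and one raw write to the same descriptor. In this sketch (write_header_then_raw is a hypothetical name), the fflush call is the step that keeps the header ahead of the payload; without it the header could still be sitting in the stdio buffer when write runs.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: emit a header through stdio, then a payload through
 * write(), on the same underlying descriptor and in that order.
 * Returns 0 on success, -1 on failure. */
int write_header_then_raw(FILE *out, const char *header,
                          const char *raw, size_t raw_len) {
    if (fputs(header, out) == EOF) return -1;
    if (fflush(out) == EOF) return -1;      /* push stdio's buffer to the kernel */
    ssize_t n = write(fileno(out), raw, raw_len);
    /* sketch only: a short write is treated as a failure rather than retried */
    return (n == (ssize_t)raw_len) ? 0 : -1;
}
```

The same rule applies in the other direction: after raw writes, nothing extra is needed before using the stream again for output, but buffered input that stdio has already read ahead is invisible to a raw read on the descriptor.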
🧪 Worked Examples
Example 1: Implementing tee with dup2
Goal: send stdout to a log file while still being able to print to the terminal. After opening the log, call dup2(log_fd, STDOUT_FILENO); from then on, everything written to stdout lands in the log only. For tee-style output, write each chunk to both the log and a saved copy of the original stdout.
Save the original stdout with dup before the dup2 call if you need to restore it later.
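The save, redirect, and restore sequence from this example can be sketched as below. The function name print_to_log and its interface are assumptions for illustration. Note that it redirects rather than tees: while fd 1 points at the log, nothing reaches the terminal.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print msg via stdout while stdout is redirected to
 * log_path, then put the original stdout back. Returns 0 on success. */
int print_to_log(const char *log_path, const char *msg) {
    fflush(stdout);                         /* keep earlier output out of the log */

    int saved = dup(STDOUT_FILENO);         /* remember where stdout pointed */
    if (saved < 0) return -1;

    int log_fd = open(log_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (log_fd < 0) { close(saved); return -1; }

    if (dup2(log_fd, STDOUT_FILENO) < 0) {  /* fd 1 now refers to the log */
        close(log_fd);
        close(saved);
        return -1;
    }
    close(log_fd);                          /* fd 1 keeps the log open */

    int rc = 0;
    if (fputs(msg, stdout) == EOF || fflush(stdout) == EOF)
        rc = -1;                            /* flush while fd 1 is still the log */

    if (dup2(saved, STDOUT_FILENO) < 0) rc = -1;   /* restore original stdout */
    close(saved);
    return rc;
}
```

To get real tee behaviour, keep the saved descriptor open and write each chunk to both it and the log instead of restoring straight away.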
Example 2: Summarising metadata from stat
Collect size, mode, UID/GID, link count, and modification time, and format them as a table for Lab09 “stat_report”.
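A single table row could be produced along these lines. The column layout, the function name print_stat_row, and the choice of lstat (so that symlinks are reported as themselves) are assumptions; the Lab09 spec may require a different format.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: print one row with path, size, octal mode, uid, gid,
 * link count, and modification time. Returns 0 on success, -1 on error. */
int print_stat_row(FILE *out, const char *path) {
    struct stat st;
    if (lstat(path, &st) != 0) {
        perror(path);                       /* e.g. "path: No such file or directory" */
        return -1;
    }

    char when[32] = "?";
    struct tm tm;
    if (localtime_r(&st.st_mtime, &tm) != NULL)
        strftime(when, sizeof when, "%Y-%m-%d %H:%M", &tm);

    /* st_mode & 07777 keeps the permission and special bits, dropping the type */
    fprintf(out, "%-24s %10lld %04o %6ld %6ld %3ld %s\n",
            path,
            (long long)st.st_size,
            (unsigned)(st.st_mode & 07777),
            (long)st.st_uid,
            (long)st.st_gid,
            (long)st.st_nlink,
            when);
    return 0;
}
```

Calling it once per command-line argument, after printing a header row, gives the table; rows that fail can be skipped so one missing file does not abort the report.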
⚠️ Common Pitfalls
- Neglecting error handling for read/write (short reads, EINTR).
- Following symbolic links blindly during recursion, which can loop forever.
- Assuming unlink frees the file's data immediately. The data is released only once the link count reaches zero and no process still has the file open.
- Mixing stdio and raw descriptors without flushing buffered data first.
🛠️ Practice Task
Build fs_inspector.c: recursively scan a directory tree, print a metadata summary, detect cycles, and optionally copy regular files to a backup location.
- Track visited (device, inode) pairs to avoid repeated traversal.
- Respect file permissions; fall back gracefully when access is denied.
- Optional: add a JSON export of the metadata for integration with dashboards.
🧪 Tutorial & Lab Mapping
Tutorial 09 Highlights
- Descriptor duplication puzzles (dup2 behaviour).
- Permission matrix exercises translating mode bits to human-readable form.
- Link-count reasoning with rename/unlink scenarios.
Lab 09 Programming Tasks
- filecopy.c: robust copy using read/write loops.
- stat_report.c: summarise metadata for a list of files.
- link_mirror.c: replicate a directory tree using hard links.
- walk_fs.c (challenge): recursive walker with cycle detection.
📝 Study Log
- Inputs shared: files.pdf, Lab09 spec, Tutorial09 worksheet, sample 22T2/22T3 file-system questions.
- Prompt: “Create descriptor/link diagrams that double as exam answer templates.”
- Breakthrough: Tracking device+inode pairs simplified the cycle detection logic.
- Misconception fixed: Previously thought a symlink increments the target's link count; corrected via stat experiments.
- Action items: Practise translating mode bits to an rwx string; rehearse unlink vs remove scenarios.
Premium Quiz: 40 Questions on Unix Files
28 basic (syscalls, metadata) · 8 intermediate (directory walking) · 4 advanced (link semantics & streaming design)
🔭 Next Week Preview
Week 10 shifts to numeric representations and encoding (floating point & Unicode). Preview floating_point.pdf and unicode.pdf.
📎 Resources & Checklist
- Slides: files.pdf, pp. 1–68; focus on the system call examples.
- Autotest: 1521 autotest lab09_filecopy, lab09_stat_report, lab09_link_mirror, lab09_walk_fs.
- Self-check: Can you explain the difference between hard and soft links? Can you safely interleave stdio and raw descriptors?