This lab covers chaining Unix commands together with pipelines and using filters such as sort, cut, uniq, wc, grep, awk, and sed to process text data. Mastering these patterns is essential for the lab exercises and the exam.
The pipe | connects the standard output of one command to the standard input of the next.
Read file → sort lines → show first 10 lines
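A minimal sketch of the read → sort → show-first-lines pipeline. The file name names.txt and its contents are hypothetical sample data created only for this example, and it keeps 3 lines instead of 10 because the sample is short.

```shell
# Create a hypothetical sample file in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' pear apple fig cherry banana > "$tmpdir/names.txt"

# Read the file, sort its lines, and keep the first 3 (use head -n 10 for ten).
first_three=$(cat "$tmpdir/names.txt" | sort | head -n 3)
echo "$first_three"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

Each stage reads only what the previous stage wrote, so stages can be added or removed without touching the rest of the pipeline.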
| Option | Meaning | Example |
|---|---|---|
| -t'\|' | Set delimiter to \| | sort -t'\|' |
| -k2,2 | Sort by column 2 only | sort -k2,2 |
| -n | Numeric sort | sort -k2,2n |
| -r | Reverse (descending) | sort -k1,1r |
| -u | Sort and remove duplicates | sort -u |
| -k6.5,6.7 | Column 6, characters 5 to 7 | sort -k6.5,6.7nr |
Without -n, numbers sort alphabetically: 10 comes before 9. With -n, they sort numerically: 9 comes before 10.
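The difference is easy to see on three numbers. This is a small sketch with made-up input; the variable names are only for illustration.

```shell
# Without -n the lines are compared as text, so "10" sorts before "2" and "9".
alpha=$(printf '%s\n' 9 10 2 | sort)

# With -n the lines are compared as numbers.
numeric=$(printf '%s\n' 9 10 2 | sort -n)

echo "alphabetic:"; echo "$alpha"
echo "numeric:";    echo "$numeric"
```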
-kSTART,END[options]
| Syntax | Meaning |
|---|---|
| -k1,1 | Column 1 only, alphabetic ascending |
| -k1,1r | Column 1 only, alphabetic descending |
| -k2,2n | Column 2 only, numeric ascending |
| -k6.5,6.7 | Column 6, characters 5 through 7 |
⚠️ -k1,14 means "from column 1 to column 14", not "column 1, character 4"! To mean "column 1 only", always write -k1,1.
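The key syntax can be tried on a few pipe-delimited records. This is a sketch under assumptions: the name|score records below are invented for illustration and are not the lab's data.

```shell
# Hypothetical records in the form name|score.
data='carol|7
alice|10
bob|9'

# Column 2 only, numeric ascending.
by_score=$(printf '%s\n' "$data" | sort -t'|' -k2,2n)

# Column 2 only, numeric descending.
by_score_desc=$(printf '%s\n' "$data" | sort -t'|' -k2,2nr)

# Column 1 only, alphabetic ascending.
by_name=$(printf '%s\n' "$data" | sort -t'|' -k1,1)

echo "$by_score"
```

Writing the end field explicitly (-k2,2 rather than -k2) keeps the key from running on to the end of the line.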
cut slices each line into parts using a delimiter, then picks the part(s) you want.
| Option | Meaning |
|---|---|
| -f1 | Take field/column 1 (Tab-separated by default) |
| -f1,3 | Take columns 1 and 3 |
| -d'\|' | Use \| as delimiter |
Step 1: split by -, take part 1 → Mon 10:00
Step 2: split by :, take part 1 → Mon 10
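The two steps can be sketched as a pair of chained cut commands. The input string Mon 10:00-12:00 is an assumption, chosen because it produces the intermediate results shown above.

```shell
# Assumed input: a day and a time range.
slot='Mon 10:00-12:00'

# Step 1: split on '-', keep field 1.
step1=$(echo "$slot" | cut -d'-' -f1)

# Step 2: split on ':', keep field 1.
step2=$(echo "$step1" | cut -d':' -f1)

# The same two steps as one pipeline.
chained=$(echo "$slot" | cut -d'-' -f1 | cut -d':' -f1)

echo "$step1"
echo "$step2"
```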
| Option | Meaning |
|---|---|
| uniq | Remove adjacent duplicate lines |
| uniq -c | Count how many times each line appears (adjacent duplicates only, so sort first) |
Output format: count value. When this output is piped to awk, the count is $1 and the value is $2.
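Because uniq only compares adjacent lines, it is normally preceded by sort. The sketch below uses invented course codes as sample input; the exact padding in front of each count varies between uniq implementations.

```shell
# Sort first so identical lines become adjacent, then count them.
counts=$(printf '%s\n' COMP1511 COMP2041 COMP1511 MATH1131 COMP1511 | sort | uniq -c)
echo "$counts"

# Without sort, the two non-adjacent "a" lines are NOT merged.
unsorted=$(printf '%s\n' a b a | uniq)
echo "$unsorted"
```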
| Option | Meaning |
|---|---|
| wc -l | Count lines |
| wc -w | Count words |
| wc -c | Count bytes (same as characters for plain ASCII; use wc -m for characters) |
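A short sketch of the three counts on made-up text. Note that wc -c counts bytes, which equals the character count only for plain ASCII input like this; some wc implementations pad the number with leading spaces.

```shell
# Hypothetical two-line sample text.
text='one two three
four five'

lines=$(printf '%s\n' "$text" | wc -l)   # number of newline-terminated lines
words=$(printf '%s\n' "$text" | wc -w)   # number of whitespace-separated words
bytes=$(printf '%s\n' "$text" | wc -c)   # number of bytes, newlines included

echo "lines=$lines words=$words bytes=$bytes"
```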
| Option | Meaning |
|---|---|
| grep 'pattern' | Keep lines matching pattern |
| grep -v 'pattern' | Keep lines NOT matching pattern |
| grep '^COMP' | Lines starting with COMP |
| grep -E | Extended regex (enables \|, +, ?) |
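The options in the table can be compared side by side. The course list below is hypothetical sample data.

```shell
# Hypothetical course list.
courses='COMP1511 Programming
MATH1131 Maths
COMP2041 Software
ENGG1000 Design'

# Keep lines that start with COMP.
comp=$(printf '%s\n' "$courses" | grep '^COMP')

# Keep lines that do NOT start with COMP.
not_comp=$(printf '%s\n' "$courses" | grep -v '^COMP')

# Extended regex: alternation with | needs -E.
comp_or_math=$(printf '%s\n' "$courses" | grep -E '^(COMP|MATH)')

echo "$comp_or_math"
```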
awk processes its input one line at a time. After uniq -c, use it to filter by occurrence count.
| Syntax | Meaning |
|---|---|
| $1 | First column (the count from uniq -c) |
| $2 | Second column (the value) |
| $1 >= 2 | Condition: count ≥ 2 |
| {print $2} | Action: print second column |
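Put together, the condition and the action form a single awk program placed after sort and uniq -c. The course codes are invented sample input.

```shell
# Print every value that appears at least twice.
repeated=$(printf '%s\n' COMP1511 COMP2041 COMP1511 MATH1131 COMP2041 COMP1511 \
  | sort | uniq -c | awk '$1 >= 2 {print $2}')

echo "$repeated"
```

awk splits on runs of whitespace by default, so the padding that uniq -c puts in front of the count does not shift the field numbers.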
| Command | Meaning |
|---|---|
| s/old/new/ | Replace first match per line |
| s/old/new/g | Replace ALL matches per line |
| /pattern/d | Delete lines matching pattern |
| s/pattern// | Delete matched text; the line itself stays, possibly empty |
| -n '/pattern/p' | Print only matching lines |
| /start/,/end/d | Delete a range of lines |
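The commands in the table behave as follows on a small made-up input. This is only a sketch; the text and variable names are assumptions for illustration.

```shell
# Hypothetical four-line input.
text='foo bar foo
keep me
drop this TODO
foo'

# Replace only the first foo on each line.
first=$(printf '%s\n' "$text" | sed 's/foo/baz/')

# Replace every foo on each line.
all=$(printf '%s\n' "$text" | sed 's/foo/baz/g')

# Print only the lines that contain foo (-n suppresses the default output).
only_foo=$(printf '%s\n' "$text" | sed -n '/foo/p')

# Delete from the first line matching keep through the next line matching TODO.
range=$(printf '%s\n' "$text" | sed '/keep/,/TODO/d')

echo "$all"
```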
Use \(.*\) to capture content and \1 to reuse it.
"stdlib.h" → captured as \1 = stdlib.h → result: <stdlib.h>
⚠️ Capture groups must be paired: \( opens, \) closes. Missing one causes an error!
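A minimal sketch of the capture-group substitution, assuming the input is a C include line; the #include prefix is an assumption and only the quoted part is rewritten.

```shell
# Assumed input line.
line='#include "stdlib.h"'

# Capture everything between the double quotes as \1,
# then write it back surrounded by angle brackets.
converted=$(echo "$line" | sed 's/"\(.*\)"/<\1>/')

echo "$converted"
```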
| | Command | Result |
|---|---|---|
| Delete whole line | /TODO/d | Line disappears completely |
| Delete matched text | s/\/\/ TODO.*// | Line remains, empty apart from any leading spaces |
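The two behaviours can be compared on the same input. The three-line C snippet below is hypothetical sample data.

```shell
# Hypothetical source snippet with an indented TODO comment.
code='int x = 1;
    // TODO remove this
int y = 2;'

# /TODO/d removes the whole line, leaving two lines.
whole=$(printf '%s\n' "$code" | sed '/TODO/d')

# s/...// removes only the matched text, leaving three lines;
# the middle one still holds its four leading spaces.
text_only=$(printf '%s\n' "$code" | sed 's/\/\/ TODO.*//')

printf '%s\n' "$whole" | wc -l
printf '%s\n' "$text_only" | wc -l
```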
Test your knowledge with 30 practice questions covering sort, cut, uniq, wc, grep, awk, and sed, with immediate feedback on your answers and detailed explanations.