COMP2521 Week 3 - Sorting Algorithms

⚖️ Sorting Stability ⚖️ 排序稳定性

A sorting algorithm is stable if elements with equal keys maintain their original relative order after sorting. 如果一个排序算法在排序后，对于相等键值的元素能保持它们原始的相对顺序，那么该算法就是稳定的。

⚡ Why does stability matter? ⚡ 为什么稳定性很重要？

Stability is essential for multi-key sorting. When sorting by multiple fields (e.g., primary by band name, secondary by title), you sort by the secondary key first, then by the primary key using a stable algorithm. This preserves the secondary ordering within equal primary keys. 稳定性对于多关键字排序至关重要。当按多个字段排序（如主关键字为乐队名、次关键字为歌名）时，你需要先按次关键字排序，然后用稳定算法按主关键字排序。这样可以保持相同主关键字中次关键字的顺序。

Example: Sort by letter (stable) Original: { (a,5), (b,4), (c,1), (d,2), (d,3), (a,4), (c,2), (b,3) } Stable: { (a,5), (a,4), (b,4), (b,3), (c,1), (c,2), (d,2), (d,3) } ↑ original order preserved within same letter!示例：按字母排序（稳定）原始数组：{ (a,5), (b,4), (c,1), (d,2), (d,3), (a,4), (c,2), (b,3) } 稳定排序：{ (a,5), (a,4), (b,4), (b,3), (c,1), (c,2), (d,2), (d,3) } ↑ 相同字母的元素保持了原始的相对顺序！

💡 Multi-Key Sorting (LSD Strategy): 💡 多关键字排序（LSD 策略）：

Sort by the secondary key first (any algorithm) 先按次要关键字排序（任何算法都行）
Then sort by the primary key using a stable algorithm 然后用稳定算法按主要关键字排序

🎯 Selection Sort 🎯 选择排序 Unstable 不稳定

Idea: Find the minimum element in the unsorted region and swap it to the front. Like picking the lightest item from a shelf, one by one. 思路：在未排序区域中找到最小元素，将其与最前面的元素交换。就像从货架上一个一个挑最轻的物品。

void selectionSort(int A[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++) {
            if (A[j] < A[min]) {
                min = j;
            }
        }
        swap(A[i], A[min]);
    }
}

Trace Example: { 6, 3, 9, 7, 1 } 执行演示：{ 6, 3, 9, 7, 1 }

Round 1: Find min=1, swap with 6 → [1, 3, 9, 7, 6] Round 2: Find min=3, already there → [1, 3, 9, 7, 6] Round 3: Find min=6, swap with 9 → [1, 3, 6, 7, 9] Round 4: Find min=7, already there → [1, 3, 6, 7, 9] ✓第1轮：找到最小值1，与6交换 → [1, 3, 9, 7, 6] 第2轮：找到最小值3，已在正确位置 → [1, 3, 9, 7, 6] 第3轮：找到最小值6，与9交换 → [1, 3, 6, 7, 9] 第4轮：找到最小值7，已在正确位置 → [1, 3, 6, 7, 9] ✓

⚠️ Why Unstable? The "swap" can jump equal elements past each other. E.g. [5a, 5b, 2] → swap 5a with 2 → [2, 5b, 5a]. Now 5a is after 5b! ⚠️ 为什么不稳定？"交换"操作会让相等的元素跨过彼此。例如 [5a, 5b, 2] → 将 5a 与 2 交换 → [2, 5b, 5a]。现在 5a 在 5b 后面了！

Complexity: O(n²) comparisons always. O(n) swaps (at most n-1). 复杂度：比较次数始终为 O(n²)。交换次数 O(n)（最多 n-1 次）。

🫧 Bubble Sort 🫧 冒泡排序 Stable 稳定

Idea: Repeatedly compare adjacent pairs and swap if out of order. Large values "bubble up" to the end. Includes an early-exit optimization with a swapped flag. 思路：反复比较相邻的数对，如果顺序错误就交换。大的值会"冒泡"到末尾。包含一个用 swapped 标志实现的提前退出优化。

void bubbleSort(int A[], int n) {
    for (int i = n - 1; i >= 1; i--) {
        bool swapped = false;
        for (int j = 0; j < i; j++) {
            if (A[j] > A[j + 1]) {
                swap(A[j], A[j + 1]);
                swapped = true;
            }
        }
        if (!swapped) break;  // Already sorted!
    }
}

⚡ Rabbits vs Turtles: ⚡ 兔子与乌龟：

Rabbits 🐇 — Large values at the front. They "hop" to the end quickly in one pass. 兔子 🐇 — 在前端的大数字。它们一轮就能"跳"到末尾。
Turtles 🐢 — Small values at the end. They crawl forward only 1 position per pass, forcing maximum iterations. 乌龟 🐢 — 在末端的小数字。它们每轮只能往前爬 1 个位置，迫使算法执行最多的轮次。

💡 Why Stable? Only swaps when A[j] > A[j+1] (strictly greater). Equal elements are never swapped, so their relative order is preserved. 💡 为什么稳定？只在 A[j] > A[j+1]（严格大于）时交换。相等的元素永远不会被交换，因此相对顺序得以保留。

Best case: O(n) — already sorted, 1 pass. Worst case: O(n²) — reverse sorted. 最好情况：O(n) — 已排序，1轮结束。最坏情况：O(n²) — 完全逆序。

🃏 Insertion Sort 🃏 插入排序 Stable 稳定

Idea: Like sorting playing cards in your hand. Pick up one card at a time and insert it into the correct position among cards already sorted. 思路：就像整理手中的扑克牌。每次拿起一张牌，插入到已排好的牌中的正确位置。

void insertionSort(int A[], int n) {
    for (int i = 1; i < n; i++) {
        int key = A[i];
        int j = i - 1;
        while (j >= 0 && A[j] > key) {
            A[j + 1] = A[j];  // shift right
            j--;
        }
        A[j + 1] = key;  // insert
    }
}

Counting Comparisons Example 比较次数计算示例

Array: { 2, 4, 6, 8, 7, 5, 3, 1 } Insert 4: compare 4 vs 2 → stop = 1 comparison Insert 6: compare 6 vs 4 → stop = 1 comparison Insert 8: compare 8 vs 6 → stop = 1 comparison Insert 7: compare 7 vs 8, 7 vs 6 → stop = 2 comparisons Insert 5: compare 5 vs 8,7,6,4 → stop = 4 comparisons Insert 3: compare 3 vs 8,7,6,5,4,2 → stop = 6 comparisons Insert 1: compare 1 vs 8,7,6,5,4,3,2 = 7 comparisons Total: 1+1+1+2+4+6+7 = 22数组：{ 2, 4, 6, 8, 7, 5, 3, 1 } 插入 4：比较 4 vs 2 → 停止 = 1 次比较插入 6：比较 6 vs 4 → 停止 = 1 次比较插入 8：比较 8 vs 6 → 停止 = 1 次比较插入 7：比较 7 vs 8, 7 vs 6 → 停止 = 2 次比较插入 5：比较 5 vs 8,7,6,4 → 停止 = 4 次比较插入 3：比较 3 vs 8,7,6,5,4,2 → 停止 = 6 次比较插入 1：比较 1 vs 8,7,6,5,4,3,2 = 7 次比较总计：1+1+1+2+4+6+7 = 22

Best case: O(n) — already sorted. Worst case: O(n²) — reverse sorted. 最好情况：O(n) — 已排序。最坏情况：O(n²) — 完全逆序。

🔀 Merge Sort 🔀 归并排序 Stable 稳定

Idea: Divide the array in half, recursively sort each half, then merge the two sorted halves together. Classic divide-and-conquer. 思路：将数组分成两半，递归排序每一半，然后将两个有序的半数组合并在一起。经典的分治法。

The Merge Algorithm 合并算法

Two pointers walk through two sorted subarrays. Compare the front elements, pick the smaller one. When one subarray is exhausted, append the remaining elements without any comparison. 两个指针分别遍历两个有序子数组。比较队首元素，取较小的一个。当一个子数组用完后，剩余元素直接追加，不再进行比较。

Merge: { 4, 6, 7, 11, 11, 16 } + { 2, 5, 6, 9, 10, 11 } ↑ ptr1 ↑ ptr2 Step 1: 4 vs 2 → pick 2 from S2 Step 2: 4 vs 5 → pick 4 from S1 Step 3: 6 vs 5 → pick 5 from S2 Step 4: 6 vs 6 → pick 6 from S1 (stable: prefer S1) Step 5: 7 vs 6 → pick 6 from S2 Step 6: 7 vs 9 → pick 7 from S1 Step 7: 11 vs 9 → pick 9 from S2 Step 8: 11 vs 10→ pick 10 from S2 Step 9: 11 vs 11→ pick 11 from S1 Step 10: 11 vs 11→ pick 11 from S1 Step 11: 16 vs 11→ pick 11 from S2 S2 empty → append [16] without comparison Total: 11 comparisons合并：{ 4, 6, 7, 11, 11, 16 } + { 2, 5, 6, 9, 10, 11 } ↑ 指针1 ↑ 指针2 第1步：4 vs 2 → 取 2 (S2) 第2步：4 vs 5 → 取 4 (S1) 第3步：6 vs 5 → 取 5 (S2) 第4步：6 vs 6 → 取 6 (S1)（稳定：优先取S1）第5步：7 vs 6 → 取 6 (S2) 第6步：7 vs 9 → 取 7 (S1) 第7步：11 vs 9 → 取 9 (S2) 第8步：11 vs 10→ 取 10 (S2) 第9步：11 vs 11→ 取 11 (S1) 第10步：11 vs 11→ 取 11 (S1) 第11步：16 vs 11→ 取 11 (S2) S2 为空 → 直接追加 [16]，不需要比较总计：11 次比较

Time: O(n log n) always. Space: O(n) extra. Stable. 时间：始终 O(n log n)。空间：额外 O(n)。稳定。

⚡ Quick Sort ⚡ 快速排序 Unstable 不稳定

Idea: Choose a pivot, partition the array into elements smaller and larger than the pivot, then recursively sort each partition. 思路：选择一个枢纽元（pivot），将数组分成比枢纽元小和大的两部分，然后递归排序每一部分。

Median-of-Three Pivot Selection 三数取中法选择枢纽元

To avoid worst-case performance on sorted arrays, pick three elements and use the median as pivot: 为了避免对已排序数组出现最坏情况，选取三个元素并用其中位数作为枢纽元：

Array: { 6, 4, 5, 9, 2, 8, 1, 3, 7 } Index: 0 1 2 3 4 5 6 7 8 1. First element (index 0): 6 2. Last element (index 8): 7 3. Middle element (index 4): 2 Three candidates: { 6, 2, 7 } Sorted: { 2, 6, 7 } Median = 6 → This is the pivot!数组：{ 6, 4, 5, 9, 2, 8, 1, 3, 7 } 索引： 0 1 2 3 4 5 6 7 8 1. 第一个元素（索引 0）：6 2. 最后一个元素（索引 8）：7 3. 中间元素（索引 4）：2 三个候选值：{ 6, 2, 7 } 排序后：{ 2, 6, 7 } 中位数 = 6 → 这就是枢纽元！

⚠️ Why Unstable? During partitioning, elements can be swapped across large distances. Equal elements may be rearranged unpredictably. ⚠️ 为什么不稳定？在分区过程中，元素可能会进行大跨度交换。相等的元素可能会被不可预测地重新排列。

Average: O(n log n). Worst case: O(n²) — bad pivot. In-place. Unstable. 平均：O(n log n)。最坏：O(n²) — 选了坏的枢纽元。原地排序。不稳定。

🔢 Radix Sort (LSD) 🔢 基数排序（LSD） Stable 稳定

Idea: Sort by one digit/character at a time, starting from the least significant (rightmost) position. Each pass uses a stable sub-sort (e.g., counting sort). 思路：每次按一个数位/字符排序，从最低位（最右边）开始。每一轮都使用稳定的子排序（如计数排序）。

Example: Sorting 3-letter Words 示例：排序三字母单词

Original: ["mad", "dog", "pod", "bag", "lag", "lab", "dab"] Pass 1 — Sort by 3rd letter (rightmost): b: lab, dab d: mad, pod g: dog, bag, lag → ["lab", "dab", "mad", "pod", "dog", "bag", "lag"] Pass 2 — Sort by 2nd letter: a: mad, bag, lag, lab, dab o: dog, pod → ["mad", "bag", "lag", "lab", "dab", "dog", "pod"] Pass 3 — Sort by 1st letter: b: bag d: dab, dog l: lag, lab m: mad p: pod → ["bag", "dab", "dog", "lab", "lag", "mad", "pod"] ✓原始：["mad", "dog", "pod", "bag", "lag", "lab", "dab"] 第1轮 — 按第3个字母（最右）排序： b: lab, dab d: mad, pod g: dog, bag, lag → ["lab", "dab", "mad", "pod", "dog", "bag", "lag"] 第2轮 — 按第2个字母排序： a: mad, bag, lag, lab, dab o: dog, pod → ["mad", "bag", "lag", "lab", "dab", "dog", "pod"] 第3轮 — 按第1个字母排序： b: bag d: dab, dog l: lag, lab m: mad p: pod → ["bag", "dab", "dog", "lab", "lag", "mad", "pod"] ✓

⚡ Critical: Each pass MUST use a stable sort! Otherwise, previous passes' work is destroyed. ⚡ 关键：每一轮必须使用稳定排序！否则之前的排序成果会被破坏。

Time: O(d × (n + k)) where d = digits, k = base. Not comparison-based! 时间：O(d × (n + k))，其中 d = 位数，k = 基数。不是基于比较的排序！

📊 Algorithm Comparison 📊 算法对比

Algorithm算法	Best最好	Average平均	Worst最坏	Space空间	Stable?稳定？
Selection	O(n²)	O(n²)	O(n²)	O(1)	❌
Bubble	O(n)	O(n²)	O(n²)	O(1)	✅
Insertion	O(n)	O(n²)	O(n²)	O(1)	✅
Merge	O(n log n)	O(n log n)	O(n log n)	O(n)	✅
Quick	O(n log n)	O(n log n)	O(n²)	O(log n)	❌
Radix (LSD)	O(d·n)	O(d·n)	O(d·n)	O(n+k)	✅