This note is mainly about sorting algorithms. In this series of notes, pseudocode will be used; unless data structures are crucial for the implementation of an algorithm, we will mostly consider the algorithm itself rather than its implementation.
The table below is a Cheat Sheet for common sorting algorithms.
Complexity Sorting
To understand sorting, we need to be able to describe it using both intuitive language and code.
The sorting problem can be summarized as:
- Input: unordered elements (with a total order on <).
- Output: ordered elements, from smallest to largest.
Let's start with sorting.
Selection Sort
Described in words:
- Step 0: Find the minimum in indices , then swap it with the element at position 0.
- Step 1: Find the minimum in indices , then swap it with the element at position 1.
- ...
- Step n-1: Find the minimum at index (only one left, so no swap needed).
A total of swaps are performed, and each swap requires checking up to elements of the array. Therefore, the complexity is .
Described in pseudocode:
C++for(int i = 0; i < n; i++):
// 维护每个循环中的最小的数值
min_index = i
for(int j = i + 1; j < n; j++):
// 将 a[i] 与 a[i, length(A)]中最小的元素交换
if A[j] < A[min_index]
min_index = j
swap A[i] and A[min_index]
This algorithm places the -th smallest element into a[i]. To the left of the -th position in the array are the smallest elements, and they will not be accessed again.
Time Complexity: Worst case .
Bubble Sort
- Starting from position 0, compare two adjacent numbers. If the former is larger than the latter, swap their positions. If the former is smaller, keep them as they are.
- After one complete round of comparisons, the largest number will reach the last position, just like a bubble rising from water to the surface.
- Then repeat this process, but the range of comparison narrows by one each time (since the last number is already sorted).
- Until no numbers need to be swapped, the sorting is complete.
C++for (int i = 0; i < n; i++): // 控制排序轮数
for (int j = 0; j < n - i - 1; j++): // 遍历未排序部分
if A[j] > A[j + 1]: // 如果前一个元素比后一个大
swap A[j] and A[j + 1] // 交换这两个元素
A total of bubbles will occur. Each time, the maximum value is pushed to the right. In the worst case, it will be pushed times, so the complexity is .
Insertion Sort
- Ensure is ordered (already done).
- Ensure is ordered.
- Ensure is ordered.
- ……
- Ensure is ordered.
It's like picking up cards.
C++for (int i = 1; i < n; i++):
// 当前抓到的牌是第 i 张牌
card = A[i]
j = i - 1
while j > 0 and A[j] > card:
// 排序前 i 张牌
A[j+1] = A[j]
j = j - 1
A[j+1] = card
Characteristics: It is sensitive to the initial state. If the array is already mostly sorted, the sorting will be much faster.
Time Complexity: Average and worst case , best case .
Complexity Sorting
Merge Sort
Merge sort is a recursive sorting algorithm. Described in words:
- First, sort the left side (recursive step)
- Then, sort the right side (recursive step)
- Then, integrate the entire array (merge)
Described in code:
C++mergesort(int[] arr, int L, int R):
if (L==R) return;
// 获得中点
mid = L + ((R-L)>>1);
// 排序左侧
mergesort(arr, L, mid);
// 排序右侧
mergesort(arr, mid+1, R);
// 合并步骤
merge(arr, L, mid, R);
Merge sort applies the idea of recursion, using a Divide and Conquer approach, processing half of the sorting each time and finally unifying them. Merge sort splits at the midpoint, sorts the part before the midpoint, sorts the part after the midpoint, and finally merges the two sorted parts with the midpoint value.
Here, merge is actually the most important and error-prone part.
C++merge(int[] arr, int L, int mid, int R):
// 准备合并用的辅助空间 (其大小为 R-L + 1)
int[] help = new int[R-L+1];
// 准备指向左侧第一位的下标和右侧第一位的下标
int i = 0; p1 = L; p2 = M+1;
while(p1 <= M && p2 <=R): //左右都不越界
// 永远先拷贝小的;当左右相等的时候,先复制左侧以保证稳定性
help[i++] = arr[p1] <= arr[p2] ? arr[p1++] : arr[p2++];
while(p1 <= M): // 这种情况下 p2 先越界了,把 p1 剩下的部分拷贝完
help[i++] = arr[p1++];
while(p2 <= R): // 这种情况下 p1 先越界了,把 p2 剩下的部分拷贝完
help[i++] = arr[p2++];
for(int i = 0; i < help.length; i++):
arr[L+i] = help[i];
But it's not actually difficult; the core purpose is just to merge two sorted arrays. The auxiliary space prepared for merging must have a length equal to the sum of the two, which is . Then, use dual pointers to look at both lists simultaneously, copying whichever is smaller first.
Time Complexity: Average, worst, and best cases are all .
Quick Sort
C++quickSort(A, low, high)
if low < high
pi = partition(A, low, high)
quickSort(A, low, pi-1)
quickSort(A, pi+1, high)
else return
In short, quick sort selects a pivot value each time, then throws elements smaller than the pivot to its left and elements larger than the pivot to its right, and then recursively performs quick sort on both sides.
Average case:
Best case:
Worst case: - when the array is already sorted or all elements are equal.
Swap Operation
Here, we specifically introduce swapping based on the XOR operation (^). We can swap data using the following code.
C++int a, b; // 假设 a 和 b 现在都已经初始化并且有值
a = a ^ b;
b = a ^ b;
a = a ^ b; // 此时 a 和 b 已经完成数据的交换
The XOR operation is a bitwise operation. Its property is that if two bits are equal, the XOR result is 0; if they are different, it is 1, i.e.,
- 1 ^ 1 = 0, 0 ^ 0 = 0, 1 ^ 0 = 1, 0 ^ 1 = 1;
Therefore, it satisfies the following properties:
- 0 ^ n = n.
- n ^ n = 0. => a ^ b ^ a = a ^ a ^ b = 0 ^ b = b.
- a ^ b = b ^ a.
- a ^ (b ^ c) = (a ^ b) ^ c.
Readers can verify the effect of the three lines of code above on their own.
References:
- Introduction to Algorithm, 3rd Edition
