sort
the bubble sort
def bubble_sort(data):
n = len(data)
for idx1 in range(n-1):
for idx2 in range(idx1+1,n):
if data[idx1] > data[idx2]:
data[idx1], data[idx2] = data[idx2], data[idx1]
return data
data = [2,7,1,6,6,3,0,6,9]
print(bubble_sort(data))
[0, 1, 2, 3, 6, 6, 6, 7, 9]
A bubble sort can be modified to stop early if it finds that the list has become sorted. This means that for lists that require just a few passes, a bubble sort may have an advantage in that it will recognize the sorted list and stop.
def short_bubble(data):
n = len(data)
exchange = True
compare_num = n - 1 # at most n-1 compare times
while compare_num > 0 and exchange:
exchange = False
for idx in range(compare_num):
if data[idx] > data[idx+1]:
exchange = True
data[idx], data[idx+1] = data[idx+1], data[idx]
compare_num -= 1
return data
data = [2,7,1,6,6,3,0,6,9]
print(short_bubble(data))
[0, 1, 2, 3, 6, 6, 6, 7, 9]
the selection sort
def select_sort(data):
n = len(data)
exchange_times = n-1
for fill_idx in range(n-1, 0,-1):
max_idx = 0
for idx in range(1,fill_idx+1):
if data[idx] > data[max_idx]:
max_idx = idx
data[fill_idx], data[max_idx] = data[max_idx], data[fill_idx]
return data
data = [2,7,1,6,6,3,0,6,9,-1]
print(select_sort(data))
[-1, 0, 1, 2, 3, 6, 6, 6, 7, 9]
the insertion sort
It always maintains a sorted sublist in the lower positions of the list. Each new item is then “inserted” back into the previous sublist such that the sorted sublist is one item larger.
def insert_sort(data):
n = len(data)
for idx1 in range(1,n):
for idx2 in range(idx1):
if data[idx1] < data[idx2]:
data[idx1],data[idx2] = data[idx2],data[idx1]
return data
def insertionSort(alist):
for index in range(1,len(alist)):
currentvalue = alist[index]
position = index
while position>0 and alist[position-1]>currentvalue:
alist[position]=alist[position-1]
position = position-1
alist[position]=currentvalue
data = [2,7,1,6,6,3,0,6,9,-1]
alist = [54,26,93,17,77,31,44,55,20]
print(insert_sort(data))
print(insert_sort(alist))
[-1, 0, 1, 2, 3, 6, 6, 6, 7, 9]
[17, 20, 26, 31, 44, 54, 55, 77, 93]
the shell sort
The shell sort, sometimes called the “diminishing increment sort,” improves on the insertion sort by breaking the original list into a number of smaller sublists, each of which is sorted using an insertion sort. The unique way that these sublists are chosen is the key to the shell sort. Instead of breaking the list into sublists of contiguous items, the shell sort uses an increment i, sometimes called the gap, to create a sublist by choosing all items that are i items apart.
the complexity of shell sort tends to fall somewhere between
def gap_shell_sort(data, start_pos, gap_len):
n = len(data)
for idx1 in range(start_pos+gap_len, n, gap_len):
for idx2 in range(0, idx1+1, gap_len):
if data[idx2] > data[idx1]:
data[idx2], data[idx1] = data[idx1], data[idx2]
def shell_sort(data):
sublist_len = len(data) // 2
while sublist_len > 0:
for idx in range(sublist_len):
gap_shell_sort(data, idx, sublist_len)
sublist_len //= 2
alist = [54,26,93,17,77,31,44,55,20]
shell_sort(alist)
print(alist)
[17, 20, 26, 31, 44, 54, 55, 77, 93]
the merge sort
the merge sort achives a better time complixity
however, the space complixity is
the core of merge sort algorithm is inpired by divid and conqure algorithm.
def merge_sort(data):
n = len(data)
if n<2:
return data
left = merge_sort(data[:n//2])
right = merge_sort(data[n//2:])
idx_left = idx_right = 0
idx = 0
while idx_left < len(left) and idx_right < len(right):
if left[idx_left] < right[idx_right]:
data[idx] = left[idx_left]
idx_left += 1
idx += 1
else:
data[idx] = right[idx_right]
idx_right += 1
idx += 1
while idx_left < len(left):
data[idx] = left[idx_left]
idx_left += 1
idx += 1
while idx_right < len(right):
data[idx] = right[idx_right]
idx_right += 1
idx += 1
return data
alist = [54,26,93,17,77,31,44,55,20]
merge_sort(alist)
print(alist)
[17, 20, 26, 31, 44, 54, 55, 77, 93]
the quick sort
The quick sort uses divide and conquer to gain the same advantages as the merge sort, while not using additional storage. As a trade-off, however, it is possible that the list may not be divided in half. When this happens, we will see that performance is diminished.
A quick sort first selects a value, which is called the pivot value. Although there are many different ways to choose the pivot value, we will simply use the first item in the list. The role of the pivot value is to assist with splitting the list. The actual position where the pivot value belongs in the final sorted list, commonly called the split point, will be used to divide the list for subsequent calls to the quick sort.
To analyze the quickSort function, note that for a list of length n, if the partition always occurs in the middle of the list, there will again be logn divisions. In order to find the split point, each of the n items needs to be checked against the pivot value. The result is
Unfortunately, in the worst case, the split points may not be in the middle and can be very skewed to the left or the right, leaving a very uneven division. In this case, sorting a list of n items divides into sorting a list of 0 items and a list of n−1 items. Then sorting a list of n−1 divides into a list of size 0 and a list of size n−2, and so on. The result is an
We mentioned earlier that there are different ways to choose the pivot value. In particular, we can attempt to alleviate some of the potential for an uneven division by using a technique called median of three. To choose the pivot value, we will consider the first, the middle, and the last element in the list. In our example, those are 54, 77, and 20. Now pick the median value, in our case 54, and use it for the pivot value (of course, that was the pivot value we used originally).
Ealier version of python used quick sort to implement the
def quick_sort(data):
"""
using data[0] as the
"""
def qsort(data, start, end):
if end - start < 1:
return
pivot = data[start]
left_idx = start
right_idx = end
# partition
while left_idx < right_idx:
while data[right_idx] >= pivot and left_idx < right_idx:
right_idx -= 1
data[left_idx] = data[right_idx]
while data[left_idx] <= pivot and left_idx < right_idx:
left_idx += 1
data[right_idx] = data[left_idx]
data[left_idx] = pivot
qsort(data, 0, left_idx-1)
qsort(data, left_idx+1, end)
qsort(data, 0, len(data)-1)
alist = [54,26,93,17,77,31,44,55,20]
quick_sort(alist)
print(alist)
[17, 20, 26, 31, 44, 54, 55, 77, 93]
def quickSort(alist):
quickSortHelper(alist,0,len(alist)-1)
def quickSortHelper(alist,first,last):
if first<last:
splitpoint = partition(alist,first,last)
quickSortHelper(alist,first,splitpoint-1)
quickSortHelper(alist,splitpoint+1,last)
def partition(alist,first,last):
pivotvalue = alist[first]
leftmark = first+1
rightmark = last
done = False
while not done:
while leftmark <= rightmark and \
alist[leftmark] <= pivotvalue:
leftmark = leftmark + 1
while alist[rightmark] >= pivotvalue and \
rightmark >= leftmark:
rightmark = rightmark -1
if rightmark < leftmark:
done = True
else:
temp = alist[leftmark]
alist[leftmark] = alist[rightmark]
alist[rightmark] = temp
temp = alist[first]
alist[first] = alist[rightmark]
alist[rightmark] = temp
return rightmark
alist = [54,26,93,17,77,31,44,55,20]
quickSort(alist)
print(alist)
[17, 20, 26, 31, 44, 54, 55, 77, 93]
def qsort(data):
n = len(data)
def mid_idx(data, start, end):
mid = (start + end)//2
if data[mid]<= max(data[start], data[end]) and data[mid]>= min(data[start], data[end]):
return mid
if data[start]<= max(data[mid], data[end]) and data[start]>= min(data[mid], data[end]):
return start
if data[end]<= max(data[start], data[mid]) and data[end]>= min(data[start], data[mid]):
return end
def qsort_helper(data, start, end):
if end - start < 1:
return
pivot_idx = mid_idx(data, start, end)
data[start], data[pivot_idx] = data[pivot_idx], data[start]
pivot = data[start]
left = start
right = end
# partition
while left < right:
while data[right] >= pivot and left < right:
right -= 1
data[left] = data[right]
while data[left] <= pivot and left < right:
left += 1
data[right] = data[left]
data[left] = pivot
qsort_helper(data, start, left-1)
qsort_helper(data, left+1, end)
qsort_helper(data, 0, n-1)
alist = [54,26,93,17,77,31,44,55,20]
qsort(alist)
print(alist)
[17, 20, 26, 31, 44, 54, 55, 77, 93]
the Radix sort
the comparison sort algorithms achieve their goal by comparing the individual sort keys to other keys in the list. We have reviewed five sorting algorithms above. The first three ——- bubble, selection, and insertion ——have a worst case time of
the natural question is can we do better than
this dose not mean, however, that the sorting operation cannot be done faster than
radix sort is a fast distribution sorting algorithm that orders keys by examining the individual components of the keys instead of comparing the keys themselves. For example, when sorting integer keys, the individual digits of the keys are compared from least significant to most significant. This is a special purpose sorting algorithm but can be used to sort many types of keys, including positive integers, strings, and floating-point values.
from collections import deque
def radix_sort(data, digit_num):
slots = [deque() for i in range(10)]
col_num = 1
for _ in range(digit_num):
for ele in data:
key = (ele // col_num) % 10
slots[key].append(ele)
idx = 0
for que in slots:
while que:
data[idx] = que.popleft()
idx += 1
col_num *= 10
return data
if __name__ == "__main__":
data = [23,4,67,1,4,9,49,21,18]
print(radix_sort(data, 2))