MIT Algorithms (3)


Sorting
    -How fast can we sort?
    -It depends on the model :)
    
    the model describes what you can do with the elements
    
    e.g.
        Quicksort       Θ(n lg n)
        Heapsort        Θ(n lg n)
        Merge sort      Θ(n lg n)
        Insertion sort  Θ(n²)
        
    so, can we do better than Θ(n lg n)?

    
1)  Comparison sorting (model)
    only use comparisons to determine relative order of elements
2)  Decision-tree model
    e.g:
        sort <a1, a2, a3>
        
                  1 : 2    
                    │
            ┌───────┴─────────┐
          2 : 3             1 : 3
            │                 │
       ┌────┴────┐       ┌────┴──────┐
    ┌─────┐    1 : 3  ┌─────┐      2 : 3
    │1 2 3│      │    │2 1 3│        │
    └─────┘      │    └─────┘        │
            ┌────┴────┐         ┌────┴────┐
         ┌─────┐   ┌─────┐   ┌─────┐   ┌─────┐
         │1 3 2│   │3 1 2│   │2 3 1│   │3 2 1│
         └─────┘   └─────┘   └─────┘   └─────┘
        
        What this tree means: at each node you make a comparison. "a : b" means compare a with b; if a <= b you take the left branch, otherwise the right. You keep descending, and when you reach a leaf, that leaf gives the answer (the sorted order).
        
        In general, for <a1, a2, ... , an>:
        - each internal node has a label of the form i:j, with 1 <= i, j <= n
        - it means: compare ai vs aj
        - the left subtree gives the subsequent comparisons if ai <= aj
        - the right subtree gives the subsequent comparisons if ai > aj
        - each leaf gives a permutation of the input, i.e. the sorted order (see the small sketch below)
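        
        As a small illustration (my own Python sketch, not from the lecture), here is this n = 3 decision tree written out as nested comparisons; each path makes at most 3 comparisons, and each return is one of the 3! = 6 leaves:
        
            def sort3(a1, a2, a3):
                if a1 <= a2:                 # node 1:2, go left
                    if a2 <= a3:             # node 2:3
                        return (a1, a2, a3)
                    elif a1 <= a3:           # node 1:3
                        return (a1, a3, a2)
                    else:
                        return (a3, a1, a2)
                else:                        # node 1:2, go right
                    if a1 <= a3:             # node 1:3
                        return (a2, a1, a3)
                    elif a2 <= a3:           # node 2:3
                        return (a2, a3, a1)
                    else:
                        return (a3, a2, a1)
            
            print(sort3(9, 4, 7))            # (4, 7, 9)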
        
3)  Decision trees model comparison sorts
    - one tree for each n
    - view the algorithm as branching into two subtrees whenever it makes a comparison
    - the tree lists the comparisons along all possible instruction traces
    - running time (#comparisons)
      = length of path
    - worst-case run time
      = height of tree
    
    Lower bound on decision-tree sorting:
    Any decision tree sorting n elements has height Ω(n lg n).
    
    Proof.
        - #leaves must be >= n! (one reachable leaf for every output permutation)
        - let h be the height of the tree
          => #leaves <= 2^h
          => n! <= 2^h
          => h >= lg(n!)
               >= lg((n/e)^n)        (Stirling's approximation)
               =  n lg(n/e)
               =  n lg n - n lg e
               =  Ω(n lg n)
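        
        A quick numeric check (my own Python, not from the lecture): h >= ceil(lg(n!)) can be evaluated directly, and it already grows like n lg n:
        
            import math
            
            # minimum worst-case number of comparisons vs. n lg n
            for n in (4, 8, 16, 100):
                lower = math.ceil(math.log2(math.factorial(n)))
                print(n, lower, round(n * math.log2(n)))
            # n = 4   -> at least   5 comparisons   (n lg n =   8)
            # n = 8   -> at least  16 comparisons   (n lg n =  24)
            # n = 100 -> at least 525 comparisons   (n lg n = 664)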
              
4)  Sorting in linear time
    
    Counting sort:
        Input:  A[1...n], where each A[j] is an integer in {1, ..., k}
        Output: B[1...n], a sorted permutation of A
        Auxiliary storage: C[1...k]
        
        for i <- 1 to k
            do C[i] <- 0
        for j <- 1 to n
            do C[A[j]] <- C[A[j]] + 1      ⊳ C[i] = #{j : A[j] = i}
        for i <- 2 to k
            do C[i] <- C[i] + C[i-1]       ⊳ C[i] = #{j : A[j] <= i}
        for j <- n downto 1
            do B[C[A[j]]] <- A[j]
               C[A[j]] <- C[A[j]] - 1
              
        e.g
         init.
        
              ┌───────────────────┐
         A =  │ 4 │ 1 │ 3 │ 4 │ 3 │
              └───────────────────┘
                1   2   3   4   5
              ┌───────────────────┐
         B =  │   │   │   │   │   │
              └───────────────────┘
                1   2   3   4   5
              ┌───────────────┐
         C =  │   │   │   │   │
              └───────────────┘
                1   2   3   4
                
         B has the same size as A
         C.length = k = the maximum value in A (known in advance)
        
         step.
            A[1] equals 4, so increase C[4] to 1
            A[2] equals 1, so increase C[1] to 1
            A[3] equals 3, so increase C[3] to 1
            A[4] equals 4, so increase C[4] to 2
            A[5] equals 3, so increase C[3] to 2
            
         result. (the end of the third for loop)
              ┌───────────────┐
         C =  │ 1 │ 0 │ 2 │ 2 │
              └───────────────┘
                1   2   3   4
              ┌───────────────┐
         C'=  │ 1 │ 1 │ 3 │ 5 │
              └───────────────┘
         C'[i] = C'[i-1] + C[i]   (prefix sums, computed in place by the third loop)
        
         now, look at the last for loop:
         we take values from the end of array A
           loop. (j = 5)
             A[j] == 3
             C[A[j]] == 3
             B[C[A[j]]] <- 3        (B[3] gets the value 3)
             C[A[j]]--              (C[3] becomes 2)
             j--
           loop. (j = 4)
             A[j] == 4, C[A[j]] == 5, so B[5] <- 4, C[4]--, j--
             ... and so on for j = 3, 2, 1
          running time: O(k+n)
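           
           A runnable Python version of the pseudocode above (my own sketch; it assumes values in 1..k as in the lecture, and gives C one extra slot so it can be indexed 1..k even though Python lists are 0-based):
           
               def counting_sort(A, k):
                   """Stable counting sort of A, whose values are integers in 1..k."""
                   n = len(A)
                   C = [0] * (k + 1)            # C[1..k]; index 0 is unused
                   B = [0] * n                  # output, same size as A
                   for x in A:                  # count: C[i] = #{j : A[j] == i}
                       C[x] += 1
                   for i in range(2, k + 1):    # prefix sums: C[i] = #{j : A[j] <= i}
                       C[i] += C[i - 1]
                   for x in reversed(A):        # go back to front so the sort is stable
                       B[C[x] - 1] = x          # C[x] is a 1-based rank; B is 0-based
                       C[x] -= 1
                   return B
               
               print(counting_sort([4, 1, 3, 4, 3], 4))   # [1, 3, 3, 4, 4]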
          
5)  Radix sort (Herman Hollerith, ~1890)
      sort on the least significant digit first
      (digit by digit)
      e.g
        3 2 9
        4 5 7
        6 5 7
        8 3 9
        4 3 6
        7 2 0
        3 5 5
        
       step.
       a. take the least significant digit and sort by it; if two digits are equal, the original order of those numbers is preserved (the sort on each digit must be stable).
        7 2 0
        3 5 5
        4 3 6
        4 5 7
        6 5 7
        3 2 9
        8 3 9
        
       b. now sort by the middle digit
        7 2 0
        3 2 9
        4 3 6
        8 3 9
        3 5 5
        4 5 7
        6 5 7
        
       c. finally, sort by the most significant digit
        3 2 9
        3 5 5
        4 3 6
        4 5 7
        6 5 7
        7 2 0
        8 3 9
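         
       A short runnable sketch of this example (my own Python, not from the lecture), doing one stable counting-sort pass per decimal digit, least significant first:
       
           def radix_sort(nums, base=10):
               """LSD radix sort of non-negative integers."""
               out = list(nums)
               exp = 1
               while out and exp <= max(out):
                   # stable counting sort on the digit d(x) = (x // exp) % base
                   C = [0] * base
                   for x in out:
                       C[(x // exp) % base] += 1
                   for d in range(1, base):
                       C[d] += C[d - 1]
                   B = [0] * len(out)
                   for x in reversed(out):      # back to front keeps equal digits in order
                       d = (x // exp) % base
                       B[C[d] - 1] = x
                       C[d] -= 1
                   out = B
                   exp *= base
               return out
           
           print(radix_sort([329, 457, 657, 839, 436, 720, 355]))
           # [329, 355, 436, 457, 657, 720, 839]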
        
      Correctness:
        induct on t, the digit position we are currently sorting on
        assume the numbers are already sorted on their low-order t-1 digits
        then sort on digit t
        └> if two elements have the same t-th digit, their relative order is not changed,
           so they stay sorted by their low-order digits (stability)
        └> if they have different t-th digits, the sort on digit t puts them in the right order
        
      Running time: O(k+n) per digit pass
        the digit values range from 1 to k or 0 to k - 1 (counting sort on each digit)
    
      Analysis:
        - use counting sort for each digit
        - suppose we have n integers, each b bits long
          (range 0 to 2^b - 1)
        - split each integer into b/r digits, each r bits long
          (base 2^r; each digit is between 0 and 2^r - 1, so 2^r plays the role of k in one run of counting sort)
        - b/r is the number of rounds
    
      Running time:
         O((b/r) * (n + k))
        =O((b/r) * (n + 2^r))
        
        How do we minimize this as a function of the one variable r?
        We could differentiate with respect to r, set the derivative equal to 0,
        and solve for the critical point; the function is unimodal in r, so that
        critical point is the minimum. Intuitively:
        
        - the (b/r) * n term wants r big
        - the (b/r) * 2^r term wants r small
        - so choose r as large as possible subject to 2^r <= n, i.e.
          r = lg n
        
      So if we plug in r = lg n, we get
      O(b*n / lg n)
      
      If the numbers range over 0 ... 2^b - 1
                              = 0 ... n^d - 1   (d a constant, or some parameter)
      then b = d lg n, and the time is O(d*n).
      If d = O(1), the time is O(n); radix sort stays at least as fast as the Θ(n lg n) comparison sorts as long as d is at most lg n.
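      
      A quick numeric sanity check of that trade-off (my own numbers, not from the lecture): for n = 2^16 integers of b = 64 bits, evaluating (b/r) * (n + 2^r) at a few values of r shows the minimum sitting at r = lg n = 16.
      
          n, b = 2 ** 16, 64
          for r in (4, 8, 16, 24, 32):
              cost = (b / r) * (n + 2 ** r)    # (#rounds) * (work per counting-sort pass)
              print(r, f"{cost:,.0f}")
          # r = 4  -> 1,048,832
          # r = 8  ->   526,336
          # r = 16 ->   524,288   <- minimum, on the order of b*n / lg n
          # r = 24 -> ~44.9 million
          # r = 32 -> ~8.6 billion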
      
