MIT Algorithms (3)


Sorting
    -How fast can we sort?
    -It depends on the model :)
    
    the model describes what you can do with the elements
    
    e.g.
        Quicksort       Θ(n lg n)
        Heapsort        Θ(n lg n)
        Merge sort      Θ(n lg n)
        Insertion sort  Θ(n²)
        
    so, can we do better than Θ(n lg n)?

    
1)  Comparison sorting (model)
    only use comparisons to determine relative order of elements
2)  Decision-tree model
    e.g:
        sort <a1, a2, a3>
        
                  1 : 2    
                    │
            ┌───────┴─────────┐
          2 : 3             1 : 3
            │                 │
       ┌────┴────┐       ┌────┴──────┐
    ┌─────┐    1 : 3  ┌─────┐      2 : 3
    │1 2 3│      │    │2 1 3│        │
    └─────┘      │    └─────┘        │
            ┌────┴────┐         ┌────┴────┐
         ┌─────┐   ┌─────┐   ┌─────┐   ┌─────┐
         │1 3 2│   │3 1 2│   │2 3 1│   │3 2 1│
         └─────┘   └─────┘   └─────┘   └─────┘
        
        What this tree means: at each node you make a comparison. "a : b" means compare a with b; if a <= b you take the left branch, otherwise the right. You keep descending, and when you reach a leaf, that leaf gives the answer (the sorted order).
        
        In general, for <a1, a2, ... , an>:
        - each internal node has a label of the form i:j, with 1 <= i, j <= n
        - it means: compare ai vs aj
        - the left subtree gives the subsequent comparisons if ai <= aj
        - the right subtree gives the subsequent comparisons if ai > aj
        - each leaf gives a permutation of the input, i.e. the sorted order (see the small sketch below)
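        
        As a small illustration (my own Python sketch, not from the lecture), here is this n = 3 decision tree written out as nested comparisons; each path makes at most 3 comparisons, and each return is one of the 3! = 6 leaves:
        
            def sort3(a1, a2, a3):
                if a1 <= a2:                 # node 1:2, go left
                    if a2 <= a3:             # node 2:3
                        return (a1, a2, a3)
                    elif a1 <= a3:           # node 1:3
                        return (a1, a3, a2)
                    else:
                        return (a3, a1, a2)
                else:                        # node 1:2, go right
                    if a1 <= a3:             # node 1:3
                        return (a2, a1, a3)
                    elif a2 <= a3:           # node 2:3
                        return (a2, a3, a1)
                    else:
                        return (a3, a2, a1)
            
            print(sort3(9, 4, 7))            # (4, 7, 9)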
        
3)  Decision trees model comparison sorts
    - one tree for each n
    - view the algorithm as branching into two subtrees whenever it makes a comparison
    - the tree lists the comparisons along all possible instruction traces
    - running time (#comparisons)
      = length of path
    - worst-case run time
      = height of tree
    
    Lower bound on decision-tree sorting:
    Any decision tree sorting n elements has height Ω(n lg n).
    
    Proof.
        - #leaves must be >= n! (one reachable leaf for every output permutation)
        - let h be the height of the tree
          => #leaves <= 2^h
          => n! <= 2^h
          => h >= lg(n!)
               >= lg((n/e)^n)        (Stirling's approximation)
               =  n lg(n/e)
               =  n lg n - n lg e
               =  Ω(n lg n)
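        
        A quick numeric check (my own Python, not from the lecture): h >= ceil(lg(n!)) can be evaluated directly, and it already grows like n lg n:
        
            import math
            
            # minimum worst-case number of comparisons vs. n lg n
            for n in (4, 8, 16, 100):
                lower = math.ceil(math.log2(math.factorial(n)))
                print(n, lower, round(n * math.log2(n)))
            # n = 4   -> at least   5 comparisons   (n lg n =   8)
            # n = 8   -> at least  16 comparisons   (n lg n =  24)
            # n = 100 -> at least 525 comparisons   (n lg n = 664)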
              
4)  Sorting in linear time
    
    Counting sort:
        Input:  A[1...n], where each A[j] is an integer in {1, ..., k}
        Output: B[1...n], a sorted permutation of A
        Auxiliary storage: C[1...k]
        
        for i <- 1 to k
            do C[i] <- 0
        for j <- 1 to n
            do C[A[j]] <- C[A[j]] + 1      ⊳ C[i] = #{j : A[j] = i}
        for i <- 2 to k
            do C[i] <- C[i] + C[i-1]       ⊳ C[i] = #{j : A[j] <= i}
        for j <- n downto 1
            do B[C[A[j]]] <- A[j]
               C[A[j]] <- C[A[j]] - 1
              
        e.g
         init.
        
              ┌───────────────────┐
         A =  │ 4 │ 1 │ 3 │ 4 │ 3 │
              └───────────────────┘
                1   2   3   4   5
              ┌───────────────────┐
         B =  │   │   │   │   │   │
              └───────────────────┘
                1   2   3   4   5
              ┌───────────────┐
         C =  │   │   │   │   │
              └───────────────┘
                1   2   3   4
                
         B has the same size as A
         C.length = k = the maximum value in A (known in advance)
        
         step.
            A[1] equals 4, so increase C[4] to 1
            A[2] equals 1, so increase C[1] to 1
            A[3] equals 3, so increase C[3] to 1
            A[4] equals 4, so increase C[4] to 2
            A[5] equals 3, so increase C[3] to 2
            
         result. (the end of the third for loop)
              ┌───────────────┐
         C =  │ 1 │ 0 │ 2 │ 2 │
              └───────────────┘
                1   2   3   4
              ┌───────────────┐
         C'=  │ 1 │ 1 │ 3 │ 5 │
              └───────────────┘
         C'[i] = C'[i-1] + C[i]   (prefix sums, computed in place by the third loop)
        
         now, look at the last for loop:
         we take values from the end of array A
           loop. (j = 5)
             A[j] == 3
             C[A[j]] == 3
             B[C[A[j]]] <- 3        (B[3] gets the value 3)
             C[A[j]]--              (C[3] becomes 2)
             j--
           loop. (j = 4)
             A[j] == 4, C[A[j]] == 5, so B[5] <- 4, C[4]--, j--
             ... and so on for j = 3, 2, 1
          running time: O(k+n)
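           
           A runnable Python version of the pseudocode above (my own sketch; it assumes values in 1..k as in the lecture, and gives C one extra slot so it can be indexed 1..k even though Python lists are 0-based):
           
               def counting_sort(A, k):
                   """Stable counting sort of A, whose values are integers in 1..k."""
                   n = len(A)
                   C = [0] * (k + 1)            # C[1..k]; index 0 is unused
                   B = [0] * n                  # output, same size as A
                   for x in A:                  # count: C[i] = #{j : A[j] == i}
                       C[x] += 1
                   for i in range(2, k + 1):    # prefix sums: C[i] = #{j : A[j] <= i}
                       C[i] += C[i - 1]
                   for x in reversed(A):        # go back to front so the sort is stable
                       B[C[x] - 1] = x          # C[x] is a 1-based rank; B is 0-based
                       C[x] -= 1
                   return B
               
               print(counting_sort([4, 1, 3, 4, 3], 4))   # [1, 3, 3, 4, 4]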
          
5)  Radix sort (Herman Hollerith, ~1890)
      sort on the least significant digit first
      (digit by digit)
      e.g
        3 2 9
        4 5 7
        6 5 7
        8 3 9
        4 3 6
        7 2 0
        3 5 5
        
       step.
       a. take the least significant digit and sort by it; if two digits are equal, the original order of those numbers is preserved (the sort on each digit must be stable).
        7 2 0
        3 5 5
        4 3 6
        4 5 7
        6 5 7
        3 2 9
        8 3 9
        
       b. now sort by the middle digit
        7 2 0
        3 2 9
        4 3 6
        8 3 9
        3 5 5
        4 5 7
        6 5 7
        
       c. finally, sort by the most significant digit
        3 2 9
        3 5 5
        4 3 6
        4 5 7
        6 5 7
        7 2 0
        8 3 9
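         
       A short runnable sketch of this example (my own Python, not from the lecture), doing one stable counting-sort pass per decimal digit, least significant first:
       
           def radix_sort(nums, base=10):
               """LSD radix sort of non-negative integers."""
               out = list(nums)
               exp = 1
               while out and exp <= max(out):
                   # stable counting sort on the digit d(x) = (x // exp) % base
                   C = [0] * base
                   for x in out:
                       C[(x // exp) % base] += 1
                   for d in range(1, base):
                       C[d] += C[d - 1]
                   B = [0] * len(out)
                   for x in reversed(out):      # back to front keeps equal digits in order
                       d = (x // exp) % base
                       B[C[d] - 1] = x
                       C[d] -= 1
                   out = B
                   exp *= base
               return out
           
           print(radix_sort([329, 457, 657, 839, 436, 720, 355]))
           # [329, 355, 436, 457, 657, 720, 839]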
        
      Correctness:
        induct on t, the digit position we are currently sorting on
        assume the numbers are already sorted on their low-order t-1 digits
        then sort on digit t
        └> if two elements have the same t-th digit, their relative order is not changed,
           so they stay sorted by their low-order digits (stability)
        └> if they have different t-th digits, the sort on digit t puts them in the right order
        
      Running time: O(k+n) per digit pass
        the digit values range from 1 to k or 0 to k - 1 (counting sort on each digit)
    
      Analysis:
        - use counting sort for each digit
        - suppose we have n integers, each b bits long
          (range 0 to 2^b - 1)
        - split each integer into b/r digits, each r bits long
          (base 2^r; each digit is between 0 and 2^r - 1, so 2^r plays the role of k in one run of counting sort)
        - b/r is the number of rounds
    
      Running time:
         O((b/r) * (n + k))
        =O((b/r) * (n + 2^r))
        
        How do we minimize this as a function of the one variable r?
        We could differentiate with respect to r, set the derivative equal to 0,
        and solve for the critical point; the function is unimodal in r, so that
        critical point is the minimum. Intuitively:
        
        - the (b/r) * n term wants r big
        - the (b/r) * 2^r term wants r small
        - so choose r as large as possible subject to 2^r <= n, i.e.
          r = lg n
        
      So if we plug in r = lg n, we get
      O(b*n / lg n)
      
      If the numbers range over 0 ... 2^b - 1
                              = 0 ... n^d - 1   (d a constant, or some parameter)
      then b = d lg n, and the time is O(d*n).
      If d = O(1), the time is O(n); radix sort stays at least as fast as the Θ(n lg n) comparison sorts as long as d is at most lg n.
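      
      A quick numeric sanity check of that trade-off (my own numbers, not from the lecture): for n = 2^16 integers of b = 64 bits, evaluating (b/r) * (n + 2^r) at a few values of r shows the minimum sitting at r = lg n = 16.
      
          n, b = 2 ** 16, 64
          for r in (4, 8, 16, 24, 32):
              cost = (b / r) * (n + 2 ** r)    # (#rounds) * (work per counting-sort pass)
              print(r, f"{cost:,.0f}")
          # r = 4  -> 1,048,832
          # r = 8  ->   526,336
          # r = 16 ->   524,288   <- minimum, on the order of b*n / lg n
          # r = 24 -> ~44.9 million
          # r = 32 -> ~8.6 billion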
      
