ML: Homework 3

Due Friday, May 10

  1. Compute the information gain and gain ratio for a decision tree with leaf nodes: {0,0,0,1,1,2}, {1,1,1,2}, {2,2,2}
  2. Simplify the expression
    log2(3/5) + log2(5/7) + log2(7/9)

  3. What is loge(1)?
  4. Apply a version of the k-means algorithm (Euclidean distance) to create two clusters for the following data:
        (1, 0), (2, 0), (2.9, 0), (4, 0), (5, 0)
      

    When does the algorithm terminate?
    Does the result depend on which points you choose as the seeds?
    What is the recommended strategy in the text?
    What are the possible clusters?

  5. Find the inverse of the following matrix:
    | 1  1|
    |-1  1|
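
For checking hand calculations in problem 1, the following Python sketch may help. It is one way to compute the quantities and rests on assumptions the problem does not state: the three leaves are taken to come from a single split, and the parent node is taken to be the union of all leaf labels. The function names (entropy, gain_and_ratio) are illustrative only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_and_ratio(leaves):
    """Information gain and gain ratio of one split.

    Assumption: the parent node holds the union of all leaf labels.
    """
    parent = [label for leaf in leaves for label in leaf]
    n = len(parent)
    # Weighted average entropy of the children.
    remainder = sum(len(leaf) / n * entropy(leaf) for leaf in leaves)
    gain = entropy(parent) - remainder
    # Split information: entropy of the leaf sizes themselves.
    split_info = -sum(len(leaf) / n * math.log2(len(leaf) / n) for leaf in leaves)
    return gain, gain / split_info

leaves = [[0, 0, 0, 1, 1, 2], [1, 1, 1, 2], [2, 2, 2]]
gain, ratio = gain_and_ratio(leaves)
print(f"gain = {gain:.4f}, gain ratio = {ratio:.4f}")
```

Gain ratio here is the gain divided by the split information, so a split into many small leaves is penalized relative to plain information gain.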
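
For problem 4, a minimal sketch of the standard assign-then-update loop (Lloyd's algorithm) can be used to experiment with different seeds. The problem only says "a version of the k-means algorithm", so this is one plausible version and not necessarily the one in the text. Assumptions: the seeds are supplied by the caller, ties go to the lower-numbered cluster, an empty cluster keeps its old centroid, and the loop stops when no point changes cluster. The name kmeans and its parameters are illustrative.

```python
import math

def kmeans(points, seeds, max_iter=100):
    """Cluster points around the given seeds using Euclidean distance.

    Returns (clusters, centroids). Stops when assignments no longer change.
    """
    centroids = list(seeds)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point
        # (min() breaks ties in favor of the lower index).
        new_assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point moved
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    clusters = [
        [p for p, a in zip(points, assignment) if a == j]
        for j in range(len(centroids))
    ]
    return clusters, centroids

points = [(1, 0), (2, 0), (2.9, 0), (4, 0), (5, 0)]
for seeds in ([(1, 0), (5, 0)], [(1, 0), (2, 0)]):
    clusters, _ = kmeans(points, seeds)
    print(seeds, "->", clusters)
```

Running the loop with several seed pairs and comparing the resulting clusters is a quick way to check the answers about termination and seed dependence.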