ML: Solutions HW 1

  1. Rule sets: rules need not have an else clause, and the rules are not ordered.
    • if wage increase ≤ 2.5 then bad
    • if wage increase > 2.5 and holidays > 10 then good
    • if wage increase > 2.5 and holidays ≤ 10 and
      wage increase ≤ 4 then bad

    • if wage increase > 2.5 and holidays ≤ 10 and
      wage increase > 4 then good
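
As a quick check that the four rules above behave as an unordered rule set, here is a minimal Python sketch. The names `RULES` and `classify` are illustrative and not part of the assignment. It assumes the conditions are mutually exclusive and cover every input, so exactly one rule fires regardless of the order in which the rules are listed.

```python
# Each rule is (condition, label); the list order carries no meaning.
# w = wage increase, h = holidays.
RULES = [
    (lambda w, h: w <= 2.5, "bad"),
    (lambda w, h: w > 2.5 and h > 10, "good"),
    (lambda w, h: w > 2.5 and h <= 10 and w <= 4, "bad"),
    (lambda w, h: w > 2.5 and h <= 10 and w > 4, "good"),
]


def classify(wage_increase, holidays, rules=RULES):
    """Return the label of the single rule that fires."""
    fired = [label for cond, label in rules if cond(wage_increase, holidays)]
    if len(fired) != 1:
        raise ValueError(f"expected exactly one rule to fire, got {len(fired)}")
    return fired[0]


if __name__ == "__main__":
    for w, h in [(2.0, 12), (3.0, 11), (3.0, 10), (4.5, 10)]:
        # Reversing the rule list does not change the answer.
        assert classify(w, h) == classify(w, h, rules=RULES[::-1])
        print(w, h, classify(w, h))
```

Because no two conditions overlap, no else clause and no rule ordering is needed to resolve conflicts.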
  2. (replicated subtree)
                 p = T
          true  /     \ false
             q = H    q = H
            /    \   /     \
           0      1  1      0
      

    Four rules are needed, one per leaf, unless an else clause is used.
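
The rule count can be illustrated with a short Python sketch. It assumes the left branch under each `q = H` node is the "true" branch (the diagram labels only the top split), and it uses `"F"` and `"L"` purely as placeholder values for "p is not T" and "q is not H". The function names are hypothetical.

```python
from itertools import product


def four_rules(p, q):
    """One rule per leaf of the tree; no else clause."""
    if p == "T" and q == "H":
        return 0
    if p == "T" and q != "H":
        return 1
    if p != "T" and q == "H":
        return 1
    if p != "T" and q != "H":
        return 0


def two_rules_with_else(p, q):
    """Two rules for class 1; the else clause covers both class-0 leaves."""
    if p == "T" and q != "H":
        return 1
    if p != "T" and q == "H":
        return 1
    return 0


if __name__ == "__main__":
    # "F" and "L" are placeholder values standing for the false branches.
    for p, q in product(["T", "F"], ["H", "L"]):
        assert four_rules(p, q) == two_rules_with_else(p, q)
        print(p, q, four_rules(p, q))
```

Both versions test `q` under each value of `p`, which is the rule-set counterpart of the replicated `q = H` subtree.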

    • Web mining: number of times accessed, number of times linked to by other pages
    • Credit rating: income, outstanding debt, own-home, mortgage/rent
      payment, loan payments, years at job, years in home

    • Image screening: list of pixel values, adjacent region labels,
      confidence of adjacent region labels, average intensity, number of
      pixels

    • Load forecasting: temperature, humidity, hours sunshine, wind,
      outlook, day of week, major holiday

    • Device diagnosis: Fourier spectrum, loudness, location of sound
  3. When data has been anonymized, data mining could be used to
    identify a person from the data.

  4. classification: the outcome is play or not play
     clustering: multiple classes; play or not play is treated like the other attributes
     associations: if rainy then humidity is high
     numerical outcome: number of hours played as a function of temperature