ML: Final

Due Thursday morning
10. The purpose of this question is to demonstrate that you understand how to explore a complex data set and apply a paired t-test to find a good classification algorithm. You will explore the glass.arff data set using many of the tools that Weka provides. First, run the Weka Explorer on glass.arff with the ZeroR, 1R, J48, and CART algorithms. Look at the misclassification rates. Report these rates, and use the visualization tool to see which attributes seem most useful. Next, run the Weka Experimenter (described in Chapter 13) to compare the algorithms pairwise. Report which algorithm performs best and whether it is significantly better than the others.

The glass.arff data set can be found in Weka’s data folder, which is located at C:\Program Files\Weka-3-6\data\glass.arff on the lab computers. It should also be in the data folder of the Weka installation on your own machine. If you can’t find it, download it from the file share in the handouts section.
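
As a reminder of what the paired t-test in question 10 computes, here is a minimal Python sketch using only the standard library. It is not part of Weka: the function name paired_t_statistic is hypothetical, and the per-fold error rates are made-up illustrative numbers, not results from glass.arff. The Experimenter may apply a corrected variant of the test, so its figures can differ from this plain version.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for two paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on the per-pair differences, e.g. error on fold i for A minus B.
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical error rates for two classifiers on the same 10 folds.
# These values are invented for illustration only.
errors_a = [0.30, 0.35, 0.28, 0.33, 0.31, 0.36, 0.29, 0.34, 0.32, 0.30]
errors_b = [0.33, 0.36, 0.30, 0.37, 0.32, 0.40, 0.30, 0.36, 0.35, 0.31]

t, df = paired_t_statistic(errors_a, errors_b)
print(f"t = {t:.3f}, df = {df}")
```

With 9 degrees of freedom, the two-sided critical value at the 0.05 level is about 2.262, so a |t| larger than that would indicate a significant difference between the two classifiers on these folds.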

11. Choose one of the problems you had trouble with on the in-class final and write out the solution. You may also hand in any notes you made during the exam.