Combining Pattern Classifiers: Methods and Algorithms, Wiley, June 2004.
04.08 : WHITAKER, C.J., KUNCHEVA, L.I. & COCKCROFT, P.D.
A logodds criterion for selection of diagnostic tests
Summary: We propose a criterion for selection of independent binary diagnostic tests (signs). The criterion maximises the difference between the logodds for having the disease and the logodds for not having the disease. A parallel is drawn between the logodds criterion and the standard minimum error criterion. The error criterion is "progression non-monotone", which means that even for independent binary signs, the best set of two signs might not contain the single best sign. The logodds criterion is progression monotone, therefore the selection procedure consists of simply selecting the individually best features. A data set for scrapie in sheep is used as an illustration.
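The selection procedure described above can be sketched in a few lines. The per-sign score below is a plausible reconstruction (the expected log-odds gap contributed by one independent binary sign), not necessarily the paper's exact formula, and the sign names and probabilities are made up for illustration:

```python
import math

def logodds_score(p, q):
    # Expected log-odds gap contributed by one independent binary sign,
    # where p = P(sign present | disease), q = P(sign present | no disease).
    # A plausible per-sign score, assumed here for illustration.
    return (p - q) * math.log((p * (1 - q)) / (q * (1 - p)))

def select_signs(sign_probs, k):
    # Progression monotonicity means the best k-subset is simply the
    # k individually best signs, so ranking by score suffices.
    ranked = sorted(sign_probs, key=lambda s: logodds_score(*s[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical signs with (p, q) pairs.
signs = [("s1", (0.9, 0.2)), ("s2", (0.6, 0.5)), ("s3", (0.8, 0.15))]
print(select_signs(signs, 2))  # → ['s1', 's3']
```

Because the criterion is progression monotone, no subset search is needed; the greedy ranking is already optimal under the independence assumption.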
Published in: Proc. IAPR International Workshop on Statistical Pattern Recognition, Lisbon, Portugal (2004) 575-582.
04.09 : KUNCHEVA, L.I., WHITAKER, C.J., COCKCROFT, P.D. & HOARE, Z.
Pre-Selection of Independent Binary Features: An Application to Diagnosing Scrapie in Sheep
Accepted for: 20th Conference on Uncertainty in Artificial Intelligence, UAI-2004, Banff, Canada (2004) 325-332.
04.10 : KUNCHEVA, L.I.
Classifier Ensembles for Changing Environments
Published in: Proc. International Workshop on Multiple Classifier Systems, MCS 2004, Cagliari, Italy,
Lecture Notes in Computer Science, vol. 3077, eds. F. Roli, J. Kittler & T. Windeatt (2004) 1-15.
04.11 : KUNCHEVA, L.I.
Diversity in multiple classifier systems
Guest editorial in: Information Fusion 6 (2005) 3-4.
04.12 : KUNCHEVA, L.I. & WHITAKER, C.J.
Published in: Encyclopedia of Statistics in Behavioral Science 3,
D. Howell and B. Everitt (Eds.), Wiley (2005) 1532-1535.
04.13 : SHIPP, C.A.
A Study on Diversity in Classifier Ensembles
Summary: In this thesis we carry out a series of investigations into the relationship between diversity and combination methods, and between diversity and AdaBoost.
First, we study the relationships among nine combination methods using two data sets. We consider the overall accuracies of the combination methods, their improvement over the single best classifier, and the correlation between the ensemble outputs produced by the different combination methods.
Next we introduce ten diversity measures. Using the same two data sets, we study the relationships between the diversity measures. Then we look at their relationship to the combination methods previously studied. The ranges of the ten diversity measures for three classifiers are derived. They are compared with the theoretical ranges and their implications for the accuracy of the ensemble are studied.
We then proceed to investigate the diversity of classifier ensembles built using the AdaBoost algorithm. We carry out experiments with two datasets using ten-fold cross-validation. We build 100 classifiers each time, using linear classifiers, quadratic classifiers or neural networks. We study how diversity varies as the classifier ensemble grows and how the different types of classifier compare.
Next we consider ways of improving AdaBoost's performance. We conduct an investigation into how modifying the size of the training sets and the complexity of the individual classifiers alters the ensemble's performance. We carry out experiments using three datasets.
Lastly we consider using Pareto optimality to determine which classifiers built by AdaBoost to add to the ensemble. We carry out experiments with ten datasets. We compare standard AdaBoost to AdaBoost with two versions of the Pareto-optimality method called Pareto 5 and Pareto 10, to see whether we can reduce the ensemble size without harming the ensemble accuracy.
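Pareto 5 and Pareto 10 are defined in the thesis itself; as a generic illustration of the underlying idea, selecting the Pareto-optimal classifiers under two objectives (here, hypothetical accuracy and diversity scores, both maximised) might look like:

```python
def dominates(a, b):
    # a dominates b if it is at least as good on both objectives
    # and strictly better on at least one (both objectives maximised).
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])

def pareto_front(points):
    # Indices of the non-dominated points; these are the classifiers
    # a Pareto-based pruning step would keep in the ensemble.
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]

# Hypothetical (accuracy, diversity) pairs for four ensemble members.
print(pareto_front([(0.9, 0.1), (0.8, 0.3), (0.7, 0.2), (0.85, 0.25)]))  # → [0, 1, 3]
```

Classifiers off the front are dominated (some other member is no worse on both objectives), so dropping them shrinks the ensemble without discarding any best trade-off.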
Published in: PhD thesis, University of Wales, Bangor (2004).
Download: gzipped postscript file: shipp.ps.gz
pdf file: shipp.pdf