
Bangor University - School of Computer Science

Mathematics Preprints 2007

Pattern Recognition

07.02 : KUNCHEVA, L.I.

A stability index for feature selection

Download:

lkAIA07.pdf

Published in:

Proc. IASTED, Artificial Intelligence and Applications, Innsbruck, Austria (2007) 390-395.

07.03 : NARASIMHAMURTHY, A. & KUNCHEVA, L.I.

A framework for generating data to simulate changing environments

Download:

anlkAIA07.pdf

Published in:

Proc. IASTED, Artificial Intelligence and Applications, Innsbruck, Austria (2007) 384-389.

07.04 : HOARE, Z.S.J.

Feature selection and classification of non-traditional data.
Examples from veterinary medicine.

Summary:

Early diagnosis of notifiable diseases in the veterinary domain is important with regard to agriculture, the health sector and the economy. With no diagnostic test in the live animal for either BSE or Scrapie, many cases may be misdiagnosed.

Traditionally, data for pattern recognition is stored as recorded cases of interest, either labelled with their outcome (suitable for supervised classification) or unlabelled. Each case is described by a collection of symptoms, recorded as present/absent. These are called binary features. In the case of medical data, the number of cases recorded in this way may be limited for many reasons. To overcome this lack of data, expert-estimated probability tables have been proposed as a substitute. These non-traditional tables contain the estimated percentage frequencies of clinical symptoms in various diseases. The construction of the tables assumed that the clinical signs (features) were independent given the diseases (classes).

Given the data, various feature selection techniques were applied and compared in this study in order to select a reduced subset of features (symptoms). In particular, the potential, limitations and stability of Sequential Forward Selection (SFS) were investigated.

Decision trees and Naive Bayes classifier models were applied to the diagnosis task. The apparent success and stability of Naive Bayes in the medical domain led to an in-depth investigation of the effects of this type of data, and its inherent assumptions, on the model. Naive Bayes is known to be optimal in the case of independent features, which is the condition assumed by the estimated probability tables in the non-traditional data. Various proposed adaptations to the Naive Bayes model were investigated with regard to their optimality when the independence assumption is violated. Finally, the performance of Naive Bayes on traditionally stored medical data with binary features was assessed. Naive Bayes and its adaptations performed well with the traditional data. Since the effect of assuming independence when it does not hold is minimal, using the non-traditional data with the Naive Bayes classifier can be a practical solution for veterinary diagnosis.
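As a rough illustration of how a Naive Bayes diagnosis could be computed directly from a table of estimated percentage frequencies, the following Python sketch may help. It is not the method or data of the thesis: the sign names, the percentages, the uniform priors and the function name `posterior` are all hypothetical and chosen only for the example.

```python
import math

# Hypothetical expert-estimated table: percentage of cases of each disease
# in which a clinical sign is present. Values are illustrative only and are
# NOT taken from the thesis.
FREQ_PERCENT = {
    "BSE":     {"ataxia": 80, "pruritus": 10, "weight_loss": 60},
    "Scrapie": {"ataxia": 70, "pruritus": 85, "weight_loss": 65},
}


def posterior(observed, table=FREQ_PERCENT, priors=None):
    """Return P(disease | observed signs) under the Naive Bayes assumption
    that signs are independent given the disease.

    observed maps a sign name to True (present) or False (absent);
    signs that are not mentioned are simply ignored.
    """
    diseases = list(table)
    if priors is None:
        # Assumption for the sketch: equal prior probability for each disease.
        priors = {d: 1.0 / len(diseases) for d in diseases}

    log_scores = {}
    for d in diseases:
        score = math.log(priors[d])
        for sign, present in observed.items():
            p = table[d][sign] / 100.0
            # Clip to avoid log(0) when an expert enters 0% or 100%.
            p = min(max(p, 1e-6), 1.0 - 1e-6)
            score += math.log(p if present else 1.0 - p)
        log_scores[d] = score

    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    unnorm = {d: math.exp(s - top) for d, s in log_scores.items()}
    total = sum(unnorm.values())
    return {d: v / total for d, v in unnorm.items()}
```

Calling `posterior({"ataxia": True, "pruritus": True, "weight_loss": False})` returns one probability per disease; with the illustrative table above, the Scrapie entry comes out larger than the BSE entry because pruritus is listed as far more frequent in Scrapie.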

Published in:

University of Wales, Bangor, PhD thesis (February 2007)

07.12 : RODRIGUEZ, J.J. & KUNCHEVA, L.I.

Time series classification: Decision forests and SVM on interval and DTW features

Download:

jrlkkdd07.pdf

Published in:

Proc Workshop on Time Series Classification, 13th International Conference on Knowledge Discovery and Data Mining, San Jose, CA, 2007.

07.13 : KUNCHEVA, L.I. & RODRIGUEZ, J.J.

Classifier ensembles with a random linear oracle

Download:

lkjrtkde07.pdf

Published in:

IEEE Transactions on Knowledge and Data Engineering, 19 (2007) 500-508.

07.14 : KUNCHEVA, L.I. & RODRIGUEZ, J.J.

An experimental study on Rotation Forest ensembles

Download:

lkjrmcs07.pdf

Published in:

Proc 7th International Workshop on Multiple Classifier Systems, MCS'07, Prague, Czech Republic, 2007,
Lecture Notes in Computer Science 4472 (2007) 459-468.

07.15 : RODRIGUEZ, J.J. & KUNCHEVA, L.I.

Naive Bayes ensembles with a random oracle

Download:

jrlkmcs07.pdf

Published in:

Proc 7th International Workshop on Multiple Classifier Systems, MCS'07, Prague, Czech Republic, 2007,
Lecture Notes in Computer Science 4472 (2007) 450-458.

07.16 : HADJITODOROV, S. T. & KUNCHEVA, L.I.

Selecting diversifying heuristics for cluster ensembles

Download:

shlkmcs07.pdf

Published in:

Proc 7th International Workshop on Multiple Classifier Systems, MCS'07, Prague, Czech Republic, 2007,
Lecture Notes in Computer Science 4472 (2007) 200-209.

07.17 : SANCHEZ, J.S. & KUNCHEVA, L.I.

Data reduction using classifier ensembles

Download:

sslkesann07.pdf

Published in:

Proc. 11th European Symposium on Artificial Neural Networks, Bruges, Belgium (2007).
