On performance of Meta-learning Templates on Different Datasets
Pavel Kordík, Jan Černý
Czech Technical University in Prague
Department of Computer Systems
Faculty of Information Technology
Computational Intelligence Group
MI-POA
WCCI (IJCNN) 2012, Brisbane
Introduction
Overview
• Introduction to predictive modeling
o Base models
o Optimization of topology and parameters
o Ensembles
• Meta-learning templates explained
• Benchmarking results
• Landmarking experiment
• Evaluation
• Conclusion
Meta-learning templates
An example
Meta-learning template – An Example
Ensembles:
• Bagging (Bootstrap Aggregating),
• Boosting,
• Stacking,
• and many others
[Diagram: the example template as a tree – Bagging at the root over DT (sm=5), a Boosting node containing 2NN and a Stacking (SVM) node over NN and DT (sm=2), plus 5NN and DT (sm=10)]
Base algorithms:
• DT (Decision Tree),
• KNN (K-Nearest Neighbor),
• NN (Neural Network),
• SVM (Support Vector Machine),
• and many others
ClassifierBagging{DecisionTree(splitmin=5),
  ClassifierBoosting{2NearestNeighborClassifier,
    StackingClassifier(SVM){NeuralNetClassifier,
      DecisionTree(splitmin=2)}},
  5NearestNeighborClassifier,
  DecisionTree(splitmin=10)}
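The same template can also be written down as a small recursive data structure. The sketch below is only an illustration of that idea; the Template and Base classes and their field names are hypothetical, not the authors' implementation.

```python
# Minimal sketch of the example template as a nested data structure.
# Template/Base and their fields are hypothetical, not the authors' code.
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Base:
    algorithm: str                      # base algorithm name, e.g. "DecisionTree"
    params: dict = field(default_factory=dict)

@dataclass
class Template:
    ensemble: str                       # ensemble method, e.g. "Bagging"
    members: List[Union["Template", Base]] = field(default_factory=list)
    params: dict = field(default_factory=dict)

# ClassifierBagging{DT(splitmin=5), Boosting{2NN, Stacking(SVM){NN, DT(splitmin=2)}}, 5NN, DT(splitmin=10)}
example = Template("Bagging", [
    Base("DecisionTree", {"splitmin": 5}),
    Template("Boosting", [
        Base("KNN", {"k": 2}),
        Template("Stacking", [Base("NeuralNet"), Base("DecisionTree", {"splitmin": 2})],
                 params={"meta": "SVM"}),
    ]),
    Base("KNN", {"k": 5}),
    Base("DecisionTree", {"splitmin": 10}),
])
```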
Meta-learning templates
Template execution
Meta-learning template execution
Template execution:
• Distributing data – bootstrap sampling for the Bagging members, sequential weighted sampling inside the Boosting node
• Building base models – DT (sm=5), 2NN, 5NN, DT (sm=10)
• Finalizing meta models – Stacking (SVM), detailed below
Finalizing the Stacking (SVM) meta-model: copy the train set to its base models (NN and DT (sm=2)), collect their outputs, and build the SVM on those outputs.
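A minimal runnable sketch of this stacking step is shown below, using scikit-learn classifiers on synthetic data as stand-ins for NN, DT and SVM. It follows the slide literally (the meta-model is trained on plain train-set outputs; a careful implementation would use out-of-fold predictions) and is not the authors' implementation.

```python
# Sketch of the Stacking (SVM) step: copy the train set to the base models,
# collect their outputs, build the SVM meta-model on them.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Copy the train set to the base models and build them.
base_models = [MLPClassifier(max_iter=2000, random_state=0),
               DecisionTreeClassifier(min_samples_split=2, random_state=0)]
for model in base_models:
    model.fit(X, y)

# Collect the base-model outputs and build the SVM meta-model on them.
meta_inputs = np.column_stack([model.predict(X) for model in base_models])
meta_model = SVC().fit(meta_inputs, y)

# Recall: propagate an input through the base models, then through the meta-model.
x_new = X[:1]
print(meta_model.predict(np.column_stack([m.predict(x_new) for m in base_models])))
```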
Meta-learning templates
Model generated
Meta-learning template
Template execution finished – the model is built.
[Diagram: the trained model mirrors the template hierarchy – Bagging over DT, a Boosting node with 2NN and a Stacking (SVM) node over NN and DT, plus 5NN and DT]
Meta-learning templates
Model recall explained
Using the model
Model recall:
• Input vector propagation – the input vector is passed down through the ensemble nodes to every base model
• Base models recall – each base model (DT, 2NN, 5NN, NN) produces an output
• Output blending and propagation – the Stacking node combines its base-model outputs with the SVM meta-model, the Boosting node combines them by weighted voting, and the root Bagging combines them by voting; the blended output is the final prediction
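The blending step at recall time can be sketched as follows, assuming each trained member exposes an sklearn-style predict(); the voting helpers are illustrative only, not the authors' implementation.

```python
# Sketch of output blending during model recall (illustrative helpers assuming
# sklearn-style members; not the authors' implementation).
from collections import Counter

def recall_voting(members, x):
    """Bagging-style blending: recall every member, blend by majority vote."""
    outputs = [m.predict([x])[0] for m in members]
    return Counter(outputs).most_common(1)[0][0]

def recall_weighted_voting(members, weights, x):
    """Boosting-style blending: each member's vote counts with its weight."""
    votes = Counter()
    for m, w in zip(members, weights):
        votes[m.predict([x])[0]] += w
    return votes.most_common(1)[0][0]
```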
Meta-learning templates
Evolving templates by genetic programming
Evolution of templates
Genetic programming:
Fitness of a template – the average generalization performance of the models produced by the template.
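One way to compute such a fitness is sketched below: estimate the generalization performance of the models a template produces by cross-validation and average the fold accuracies. The build(template, X, y) callable is an assumed helper that executes a template (as sketched earlier) and returns a trained model; this is not the authors' implementation.

```python
# Sketch of the template fitness: average generalization performance of the
# models a template produces, estimated by k-fold cross-validation.
# `build(template, X, y)` is an assumed helper returning a trained model
# with an sklearn-style predict().
import numpy as np
from sklearn.model_selection import KFold

def template_fitness(template, X, y, build, folds=10):
    scores = []
    for train_idx, test_idx in KFold(n_splits=folds, shuffle=True, random_state=0).split(X):
        model = build(template, X[train_idx], y[train_idx])
        scores.append(np.mean(model.predict(X[test_idx]) == y[test_idx]))
    return float(np.mean(scores))
```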
Benchmarking templates
Templates evolved for individual datasets
Benchmarking results - templates
Meta-learning templates evolved on benchmarking datasets:

Dataset      Evolved template
Glass        StackingProbabilities{4x KNN(k=2,vote=true,measure=ManhattanDistance)}
Balance      Boosting{57x ClassifierModel{<outputs>x PolynomialModel(degree=4)}}
Breast       ClassifierModel{<outputs>x CascadeGenModel[7x CascadeGenModel[5x GaussianModel]]}
Diabetes     SVM(kernel=dot)
Ecoli        ClassifierArbitrating{2x ClassifierBagging{3x SVM(kernel=anova)}}
Heart        KNN(k=15,vote=false,measure=CosineSimilarity)
Texture1     ClassifierArbitrating{4x ClassifierModel{<outputs>x PolynomialModel(degree=2)}}
Texture2     CascadeGenProb{8x Boosting{2x ClassifierModel{<outputs>x ExpModel}}}
Ionosphere   DecisionTree(maxdepth=20,conf=0.25,alt=10)
Spirals      CascadeGenProb{8x ClassifierArbitrating{4x KNN(k=3,vote=false,measure=MixedEuclideanDistance)}}
Pendigits    KNN(k=3,vote=false,measure=CosineSimilarity)
Vehicle      ClassifierArbitrating{6x ClassifierModel{<outputs>x DivideModel(mult=6.68)[7x PolynomialModel(degree=3)]}}
Wine         CascadeGenProb{9x ClassifierModel{<outputs>x BoostingRTModel(tr=0.1)[8x GaussianModel]}}
Spambase     ClassifierModel{<outputs>x CascadeGenModel[9x SigmoidModel]}
Segment      Boosting{17x DecisionTree(maxdepth=24,conf=0.082,alt=0)}
Fourier      NeuralNetClassifier(net=-1x0,epsilon=0.00001,learn=0.3,momentum=0.2)
Spirals+3    KNN(k=3,vote=true,measure=EuclideanDistance)
Spread       CascadeGenProb{5x CascadeGenProb{3x KNN(k=9,vote=true,measure=CosineSimilarity)}}
Benchmarking templates
Comparing the results to the performance of base algorithms
Benchmarking results - performances
Classification accuracy [%] of evolved meta-learning templates and baseline algorithms:

Dataset       Templates   Bayes      DT     KNN   NeuNet     SVM
Glass            83.08    46.38   63.81   67.33    61.96   50.38
Balance          99.49    90.35   41.35   78.00    93.30   86.25
Breast           97.65    95.80   94.38   95.74    96.03   96.14
Diabetes         79.03    75.13   70.53   70.74    75.42   77.68
Ecoli            90.00    80.18   82.36   79.58    84.18   85.88
Heart            86.53    83.04   73.19   77.26    82.01   78.96
Texture1         99.38    87.87   89.49   97.36    98.44   94.85
Texture2         65.59    22.05    7.04   43.21    60.01   22.22
Ionosphere       95.83    89.37   92.46   86.80    89.71   84.17
Spirals          95.56    47.79   42.84   79.47    48.56   49.05
Pendigits        99.47    85.69   93.59   99.37    94.60   83.79
Vehicle          78.46    43.79   54.93   71.69    75.38   67.69
Wine            100.00    97.65   92.35   96.12    97.65   98.71
Spambase         93.40    82.07   90.55   91.16    91.25   90.26
Segment          97.62    80.01   94.51   96.94    95.70   88.35
Fourier          83.33    76.52   72.14   79.63    82.79   80.19
Spirals+3        94.22    55.37   43.16   39.26    54.52   54.21
Spread           99.00     2.60    3.60   62.10     4.87    6.50
Average [%]      90.98    68.98   66.79   78.43    77.02   71.96
Landmarking experiment
Executing all templates on all datasets
Landmarking experiment
Run evolved templates on all datasets
What is the accuracy of a template evolved on dataset X and executed on dataset Y?
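The experiment can be sketched as a nested loop that fills an accuracy matrix; evaluate(template, X, y) is an assumed callable (e.g. the cross-validated fitness sketched earlier), not the authors' implementation.

```python
# Sketch of the landmarking experiment: every evolved template is executed on
# every dataset. `evaluate(template, X, y)` is an assumed callable.
import numpy as np

def landmarking(templates, datasets, evaluate):
    """templates/datasets: dicts keyed by dataset name; returns an accuracy
    matrix with row = template evolved on dataset X, column = dataset Y."""
    acc = np.zeros((len(templates), len(datasets)))
    for i, template in enumerate(templates.values()):
        for j, (X, y) in enumerate(datasets.values()):
            acc[i, j] = evaluate(template, X, y)
    return acc
```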
Landmarking experiment
Resulting performances on individual datasets
Questions
Discussion
Conclusion
Summary and remarks
Concluding remarks
• Hierarchical templates are especially useful for problems with a complex decision boundary.
• They are the best on average, even without being optimized for concrete datasets.
Best average performance over all benchmarking datasets:
StackingProb{5x StackingProb{5x NeuralNetClassifier}}
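In the Template/Base sketch from the example slide, this overall winner would read as follows (illustrative only, not the authors' code):

```python
# The best-on-average template: probabilistic stacking of five stacking
# ensembles, each built over five neural network classifiers.
universal = Template("StackingProb", [
    Template("StackingProb", [Base("NeuralNetClassifier") for _ in range(5)])
    for _ in range(5)
])
```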
Discover meta-learning templates for predictive data modeling.
MODGEN provides you with meta-learning templates for predictive modeling.
Upload your data and we compute the performance of several well-known algorithms, evolve the best combination of them, and produce a report.
www.modgen.net
Coming soon …
