About research
Algorithms for handling concept drift are important for many applications. In this paper we present a decision tree ensemble classification method for concept drift, based on the Random Forest algorithm.
Ensemble outputs are aggregated by weighted majority voting, following the ideas of the Accuracy Weighted Ensemble (AWE) method. In our case, the weight of each base learner is computed separately for every sample being evaluated, using the base learner's accuracy and the intrinsic proximity measure of Random Forest. Our algorithm exploits both temporal weighting of samples and ensemble pruning as a forgetting strategy.
We present the results of an empirical comparison of our method with state-of-the-art classifiers for concept drift handling such as SEA, Hoeffding Adaptive Tree and Online Bagging.
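
The per-sample weighting described above can be illustrated with a minimal sketch. It assumes one plausible reading of the text: for each sample, the most proximate samples in a labeled window are found, each tree is weighted by its accuracy on those neighbors, and the weighted votes are summed. The function names, the data layout and the choice of a k-nearest-neighbor selection are illustrative assumptions, not the authors' implementation. The proximity used here is the standard Random Forest one: the share of trees in which two samples fall into the same leaf.

```python
from collections import defaultdict


def proximity(leaves_a, leaves_b):
    """Random Forest proximity: share of trees where both samples reach the same leaf."""
    same = sum(a == b for a, b in zip(leaves_a, leaves_b))
    return same / len(leaves_a)


def nearest_by_proximity(query_leaves, window_leaves, k):
    """Indices of the k window samples most proximate to the query sample."""
    order = sorted(
        range(len(window_leaves)),
        key=lambda i: proximity(query_leaves, window_leaves[i]),
        reverse=True,
    )
    return order[:k]


def local_weights(window_preds, window_labels, neighbors):
    """Accuracy of every tree restricted to the neighbor samples.

    window_preds[t][i] is the prediction of tree t for window sample i.
    """
    weights = []
    for tree_preds in window_preds:
        hits = sum(tree_preds[i] == window_labels[i] for i in neighbors)
        weights.append(hits / len(neighbors) if neighbors else 0.0)
    return weights


def weighted_majority_vote(votes, weights):
    """Return the label whose votes carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(votes, weights):
        totals[label] += weight
    return max(totals, key=totals.get)
```

With this scheme, a tree that is accurate in the neighborhood of the evaluated sample can outvote a larger number of trees that are inaccurate there, which a plain majority vote would not allow.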
Paper link
Benchmark datasets
In this experiment, we evaluated our algorithm on the Covertype dataset. It contains the forest cover type for 30 x 30 meter cells obtained from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. The dataset has 581,012 instances and 54 attributes, and it has been used as a benchmark in several papers on data stream classification.
Testing results
Mean accuracy on the Covertype dataset is shown in the table below.
blockSize | windowSize | numNN | RF(new blocks) | PDSRF(new blocks) | PDSRF(windows) |
---|---|---|---|---|---|
300 | 1000 | 5 | 0.7765 | 0.8115 | 0.5550 |
300 | 1000 | 10 | 0.7769 | 0.8105 | 0.5534 |
300 | 1000 | 20 | 0.7769 | 0.8052 | 0.5459 |
300 | 1500 | 5 | 0.7767 | 0.8111 | 0.5567 |
300 | 1500 | 10 | 0.7789 | 0.8121 | 0.5555 |
300 | 1500 | 20 | 0.7780 | 0.8083 | 0.5478 |
500 | 500 | 5 | 0.8276 | 0.8638 | 0.5293 |
500 | 500 | 10 | 0.8273 | 0.8616 | 0.5266 |
500 | 500 | 20 | 0.8268 | 0.8602 | 0.5286 |
500 | 1000 | 5 | 0.8276 | 0.8645 | 0.5253 |
500 | 1000 | 10 | 0.8287 | 0.8627 | 0.5288 |
500 | 1000 | 20 | 0.8275 | 0.8604 | 0.5260 |
500 | 1500 | 5 | 0.8275 | 0.8649 | 0.5247 |
500 | 1500 | 10 | 0.8274 | 0.8629 | 0.5206 |
500 | 1500 | 20 | 0.8270 | 0.8596 | 0.5227 |
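
As a rough illustration of how a mean accuracy over new blocks could be computed, the sketch below scores each incoming block before the model is updated with it (a test-then-train scheme) and averages the per-block accuracies. This protocol is an assumption suggested by the column names, not something this page states, and the names `blocks`, `mean_block_accuracy`, `predict` and `update` are hypothetical.

```python
def blocks(stream, block_size):
    """Split a list of (features, label) pairs into consecutive blocks."""
    for start in range(0, len(stream), block_size):
        yield stream[start:start + block_size]


def mean_block_accuracy(stream, block_size, predict, update):
    """Test-then-train: score every block before the model learns from it."""
    scores = []
    for index, block in enumerate(blocks(stream, block_size)):
        if index > 0:  # the first block is used for training only
            hits = sum(predict(x) == y for x, y in block)
            scores.append(hits / len(block))
        update(block)  # the model sees the labels only after being scored
    return sum(scores) / len(scores) if scores else 0.0
```

Here `predict` maps one feature vector to a label and `update` receives the whole labeled block, so any incremental classifier can be plugged in through two small wrapper functions.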
Authors
Aleksei V. Zhukov
Denis N. Sidorov
Aoife M. Foley
Adele H. Marshall
Contact
For any information related to this research, contact Aleksei Zhukov.