Proximity driven streaming random forest

Machine learning under concept drift

About research

Algorithms for concept drift handling are important for many applications. In this paper we present a decision tree ensemble classification method for concept drift, based on the Random Forest algorithm.
The ensemble is aggregated by weighted majority voting, following the ideas of the Accuracy Weighted Ensemble (AWE) method. In our case, the weight of each base learner is computed separately for every sample being evaluated, using the base learner's accuracy and the intrinsic proximity measure of Random Forest. Our algorithm uses both temporal weighting of samples and ensemble pruning as forgetting strategies.
We present the results of an empirical comparison of our method with state-of-the-art classifiers for concept drift handling, such as SEA, Hoeffding Adaptive Tree and Online Bagging.
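The per-sample weighting described above can be illustrated with a minimal sketch. It is one plausible reading of the idea, not the implementation from the paper: the function names, the input layout, and the choice to score each tree by its accuracy on the `num_nn` most proximate reference samples are all assumptions made for illustration. Proximity is taken here as the fraction of trees in which two samples land in the same leaf; with scikit-learn, leaf indices of this kind could be obtained from `RandomForestClassifier.apply`.

```python
from collections import Counter


def proximity(leaves_a, leaves_b):
    """Fraction of trees in which two samples fall into the same leaf."""
    same = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return same / len(leaves_a)


def weighted_vote(query_leaves, query_votes, ref_leaves, ref_correct, num_nn=5):
    """Hypothetical proximity-weighted majority vote for one query sample.

    query_leaves: leaf index of the query in each tree (length T)
    query_votes:  class predicted for the query by each tree (length T)
    ref_leaves:   leaf indices of N reference samples (N lists of length T)
    ref_correct:  ref_correct[i][t] is True if tree t classified
                  reference sample i correctly
    """
    if not ref_leaves:
        raise ValueError("at least one reference sample is required")
    # Rank reference samples by proximity to the query, closest first.
    ranked = sorted(
        range(len(ref_leaves)),
        key=lambda i: proximity(query_leaves, ref_leaves[i]),
        reverse=True,
    )
    neighbours = ranked[:num_nn]
    scores = Counter()
    for t, vote in enumerate(query_votes):
        # Assumed weight: accuracy of tree t on the nearest neighbours.
        weight = sum(ref_correct[i][t] for i in neighbours) / len(neighbours)
        scores[vote] += weight
    return max(scores, key=scores.get)


# Toy data: 3 trees, 4 reference samples.
query_leaves = [1, 2, 3]
query_votes = ["a", "b", "b"]
ref_leaves = [[1, 2, 3], [1, 2, 0], [0, 0, 0], [9, 9, 9]]
ref_correct = [
    [True, False, False],
    [True, False, True],
    [False, True, True],
    [False, True, True],
]
result = weighted_vote(query_leaves, query_votes, ref_leaves, ref_correct, num_nn=2)
print(result)  # "a": the locally accurate tree outweighs the two others
```

In this toy case a plain majority vote would return "b", while the weighted vote follows the single tree that was accurate on the samples closest to the query.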

Paper link

Benchmark datasets

In this experiment, we evaluated our algorithm on the Covertype dataset. It contains the forest cover type for 30 x 30 meter cells obtained from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. The dataset has 581,012 instances and 54 attributes, and it has been used as a benchmark in several papers on data stream classification.

Testing results

Mean accuracy on the Covertype dataset is shown in the table below.

blockSize windowSize numNN RF(new blocks) PDSRF(new blocks) PDSRF(windows)
300 1000 5 0.7765 0.8115 0.5550
300 1000 10 0.7769 0.8105 0.5534
300 1000 20 0.7769 0.8052 0.5459
300 1500 5 0.7767 0.8111 0.5567
300 1500 10 0.7789 0.8121 0.5555
300 1500 20 0.7780 0.8083 0.5478
500 500 5 0.8276 0.8638 0.5293
500 500 10 0.8273 0.8616 0.5266
500 500 20 0.8268 0.8602 0.5286
500 1000 5 0.8276 0.8645 0.5253
500 1000 10 0.8287 0.8627 0.5288
500 1000 20 0.8275 0.8604 0.5260
500 1500 5 0.8275 0.8649 0.5247
500 1500 10 0.8274 0.8629 0.5206
500 1500 20 0.8270 0.8596 0.5227
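The page does not spell out how the block-based accuracies were computed. As a rough illustration only, the sketch below assumes a test-then-train protocol: the stream is cut into consecutive blocks of `block_size` samples, each new block is first used for testing and then for training, and the per-block accuracies are averaged. `MajorityClass` and `blockwise_mean_accuracy` are hypothetical names, and the learner is a trivial stand-in rather than RF or PDSRF.

```python
from collections import Counter


class MajorityClass:
    """Hypothetical stand-in learner: predicts the most frequent label so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def partial_fit(self, xs, ys):
        self.counts.update(ys)


def blockwise_mean_accuracy(xs, ys, model, block_size):
    """Test each new block with the current model, then train on it."""
    accuracies = []
    for start in range(0, len(xs), block_size):
        block_x = xs[start:start + block_size]
        block_y = ys[start:start + block_size]
        if start > 0:  # the first block is used for training only
            hits = sum(model.predict(x) == y for x, y in zip(block_x, block_y))
            accuracies.append(hits / len(block_y))
        model.partial_fit(block_x, block_y)
    if not accuracies:
        raise ValueError("need at least two blocks to measure accuracy")
    return sum(accuracies) / len(accuracies)


# Toy stream whose labels drift from mostly 0 to all 1.
xs = list(range(12))
ys = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1]
mean_acc = blockwise_mean_accuracy(xs, ys, MajorityClass(), block_size=4)
print(mean_acc)  # 0.25 (block accuracies 0.5 and 0.0)
```

The toy stream shows why a forgetting strategy matters: a learner that never discards old data keeps predicting the old majority class after the labels have drifted.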

Authors

Aleksei V. Zhukov
Denis N. Sidorov
Aoife M. Foley
Adele H. Marshall

Contact

For any information related to this research, contact Aleksei Zhukov.