Abstract
As the volume of data increases rapidly, most traditional machine learning algorithms become computationally prohibitive. Furthermore, the available data can be so large that it easily exceeds a single machine's memory.
We propose Coreset-Based Conformal Prediction, a strategy for dealing with big data by applying conformal predictors to a weighted summary of the data, namely the coreset. We compare our approach against stand-alone inductive conformal predictors on three large competition-grade datasets and demonstrate that our coreset-based strategy may not only significantly improve the learning speed but also retain the validity of the predictions and the efficiency of the predictors.
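The general idea can be illustrated with a minimal sketch. It is not the authors' implementation: an inductive conformal predictor is calibrated on held-out data, while the underlying logistic regression is fitted on a small importance-sampled, weighted subset. The sampling scores (`1 + |x|`), the function names, the synthetic data, and all hyperparameters are hypothetical placeholders, not taken from the paper.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def importance_sample(scores, m, rng):
    """Draw m indices with probability proportional to `scores`.

    Each draw gets weight 1 / (m * p_i), so weighted sums over the sample
    are unbiased estimates of the corresponding sums over all points.
    """
    total = sum(scores)
    probs = [s / total for s in scores]
    idx = rng.choices(range(len(scores)), weights=probs, k=m)
    weights = [1.0 / (m * probs[i]) for i in idx]
    return idx, weights


def prob_one(beta, x):
    """Predicted probability of label 1; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def fit_weighted_logreg(X, y, w, lr=0.5, epochs=200):
    """Weighted logistic regression fitted by plain gradient descent."""
    beta = [0.0] * (len(X[0]) + 1)
    total_w = sum(w)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for xi, yi, wi in zip(X, y, w):
            err = wi * (prob_one(beta, xi) - yi)
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g / total_w for b, g in zip(beta, grad)]
    return beta


def nonconformity(beta, x, label):
    """One minus the predicted probability of `label`."""
    p1 = prob_one(beta, x)
    return 1.0 - (p1 if label == 1 else 1.0 - p1)


def p_value(cal_scores, score):
    """Inductive conformal p-value of a test nonconformity score."""
    at_least = sum(1 for s in cal_scores if s >= score)
    return (at_least + 1) / (len(cal_scores) + 1)


def prediction_set(beta, cal_scores, x, epsilon):
    """Labels whose p-value exceeds the significance level epsilon."""
    return [lab for lab in (0, 1)
            if p_value(cal_scores, nonconformity(beta, x, lab)) > epsilon]


# Usage on synthetic data (all values below are illustrative assumptions).
rng = random.Random(0)
X = [[rng.gauss(0, 1)] for _ in range(400)]
y = [1 if rng.random() < sigmoid(2 * x[0]) else 0 for x in X]
X_train, y_train, X_cal, y_cal = X[:300], y[:300], X[300:], y[300:]

scores = [1.0 + abs(x[0]) for x in X_train]      # hypothetical sampling scores
idx, w = importance_sample(scores, 50, rng)      # weighted subset of 50 points
beta = fit_weighted_logreg([X_train[i] for i in idx],
                           [y_train[i] for i in idx], w)
cal_scores = [nonconformity(beta, x, t) for x, t in zip(X_cal, y_cal)]
print(prediction_set(beta, cal_scores, [1.5], epsilon=0.1))
```

In this sketch the calibration step is unchanged from a standard inductive conformal predictor; only the training step sees the reduced, weighted sample, which is where any speed-up would come from.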
Original language | English |
---|---|
Pages | 142-162 |
Publication status | Published - 2019 |
Event | 8th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2019), Bulgaria. Duration: 9 Sept 2019 → 11 Sept 2019. https://cml.rhul.ac.uk/copa2019/ |
Conference
Conference | 8th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2019) |
---|---|
Country/Territory | Bulgaria |
Period | 9/09/19 → 11/09/19 |
Internet address | https://cml.rhul.ac.uk/copa2019/ |
Keywords
- logistic regression
- conformal predictors
- importance sampling