IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012)
Boston, MA, USA
June 25, 2012 to June 28, 2012
Robert Cain, School of Computing Science, Centre for Cybercrime and Computer Security, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
Aad van Moorsel, School of Computing Science, Centre for Cybercrime and Computer Security, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
Probabilistic and stochastic models are routinely used in performance, dependability and security evaluation, and determining appropriate values for model parameters is a long-standing problem in the practical use of such models. With the increasing emphasis on human aspects and business considerations, data collection to estimate parameter values often gets prohibitively expensive, since it may involve questionnaires, costly audits or additional monitoring and processing. In this paper we articulate a set of optimization problems related to data collection, and provide efficient algorithms to determine the optimal data collection strategy for a model. The main idea is to model the uncertainty of data sources and determine its influence on output accuracy by solving the model. This approach is particularly natural for data sources that rely on sampling, such as questionnaires or monitoring, since uncertainty can be expressed using the central limit theorem. We pay special attention to the efficiency of our optimization algorithm, using ideas inspired by importance sampling to derive optimal strategies for a range of parameter values from a single set of experiments.
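To make the central-limit-theorem argument concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a hypothetical model with two sampled parameters, where the standard error of each estimate shrinks as sigma / sqrt(n), and it propagates those errors to the model output with a first-order (delta-method) approximation. The function names, the sensitivities `sens1` and `sens2`, and the per-sample costs are all illustrative assumptions. The search is a plain brute-force enumeration of sample counts under a fixed budget, rather than the importance-sampling-inspired technique mentioned in the abstract.

```python
import math


def output_std_error(n1, n2, sens1=1.0, sens2=1.0, sigma1=1.0, sigma2=1.0):
    """Approximate standard error of the model output (delta method).

    By the central limit theorem, the estimate of parameter i from n_i
    samples has standard error sigma_i / sqrt(n_i). sens_i is the assumed
    sensitivity (partial derivative) of the output to parameter i.
    """
    variance = (sens1 * sigma1) ** 2 / n1 + (sens2 * sigma2) ** 2 / n2
    return math.sqrt(variance)


def best_allocation(budget, cost1, cost2, **model):
    """Enumerate sample counts within the budget; return (n1, n2, error).

    Returns None when the budget cannot pay for one sample of each source.
    """
    best = None
    max_n1 = int((budget - cost2) // cost1)
    for n1 in range(1, max_n1 + 1):
        # Spend whatever remains of the budget on the second data source.
        n2 = int((budget - n1 * cost1) // cost2)
        if n2 < 1:
            continue
        err = output_std_error(n1, n2, **model)
        if best is None or err < best[2]:
            best = (n1, n2, err)
    return best


if __name__ == "__main__":
    # Hypothetical scenario: source 2 is four times as expensive per sample.
    print(best_allocation(budget=100, cost1=1, cost2=4))
```

In the symmetric case (equal costs, sensitivities and variances) the sketch splits the budget evenly between the two sources. Raising the cost or lowering the sensitivity of one source shifts samples toward the other, which is the kind of trade-off a data collection strategy has to resolve.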
optimization, data collection, probabilistic modelling, dependability, information security
R. Cain and A. van Moorsel, "Optimization of data collection strategies for model-based evaluation and decision-making," IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012), Boston, MA, USA, 2012, pp. 1-10.