Learning with a regularizer is popular and effective in reducing generalization errors. It is well known that generalization errors depend not only on the regularization parameter but also on the type of regularizer. For example, comparative studies of a Gaussian regularizer, i.e., the sum of squared connection weights, and a Laplace regularizer, i.e., the sum of the absolute values of connection weights, have been carried out empirically, but the results have varied from paper to paper [3][4][6][7]. This inconsistency has motivated theoretical evaluation of regularizers. Hansen et al. [2] and Goutte et al. [1] evaluated generalization errors in estimating the mean of a Gaussian variable. Estimating a mean, however, is too simple a task to support useful conclusions on the comparative advantages of various regularizers.

We have already proposed estimating model parameters as a function of those obtained without a regularizer [5]. A generalization error can therefore be calculated using the parameters of this function, which are closely related to the regularization parameters. Minimizing the generalization error then provides the optimal regularization parameters and model parameters. To evaluate generalization errors theoretically based on a finite number of samples, linear regression models are considered here. Although linear regression models are much simpler than neural networks, they can still yield qualitative conclusions on the comparative advantages of regularizers, which are hardly obtainable from empirical studies.

The previous study by the authors was based on the assumption that the true model parameters and the noise variance were known a priori [5]. It was further assumed that the input variables were mutually independent, which does not accord with real data. The latter assumption was adopted merely to simplify the computation of generalization errors.
Introducing correlations between input variables allows us to dispense with the independence assumption at the expense of computational complexity.

In real data, the assumption that the true model parameters and the noise variance are known is also rarely satisfied. In the present paper, we propose a novel procedure for theoretically evaluating regularizers based on data. First, we estimate the model parameters and the noise variance from data; this is easy for linear regression models. Second, treating these estimates as if they were true, we calculate the optimal regularization parameters and model parameters by the previously proposed method [5]. This hopefully provides better estimates. Third, treating the resulting estimates as true, we again calculate the optimal regularization parameters and model parameters. This procedure can be repeated iteratively. This iterative estimation is the key idea of the present paper. The detailed formulation and the results of computer experiments follow.
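
The three steps of the iterative procedure can be sketched for the simplest possible case. The Python sketch below is an illustration under stated assumptions, not the method of [5]: it assumes a single-input linear model y = w x + noise, a Gaussian (ridge) regularizer with one regularization parameter, fixed inputs, and a noise variance that is estimated once and then held fixed. Under these assumptions, the regularization parameter that minimizes the expected squared error of the ridge estimate for a known true weight has the closed form noise_var / w**2, and the loop re-evaluates it with the current weight estimate plugged in as if it were true. All function names are hypothetical.

```python
import random


def ols_fit(xs, ys):
    """Step 1: least-squares weight and noise-variance estimate for y = w*x + noise."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    w = sxy / sxx
    rss = sum((y - w * x) ** 2 for x, y in zip(xs, ys))
    noise_var = rss / (len(xs) - 1)  # one fitted parameter, so n - 1 degrees of freedom
    return w, noise_var, sxx, sxy


def optimal_lambda(w_true, noise_var):
    """Ridge parameter minimizing the expected squared error of sxy / (sxx + lam),
    assuming (w_true, noise_var) are the true values (single-input case only)."""
    return noise_var / (w_true * w_true)


def iterate_plug_in(xs, ys, n_iter=50, tol=1e-10):
    """Steps 2 and 3, repeated: treat the current weight as true, compute the
    optimal ridge parameter, re-estimate the weight, and iterate."""
    w_ols, noise_var, sxx, sxy = ols_fit(xs, ys)
    w, lam = w_ols, 0.0
    for _ in range(n_iter):
        if w * w == 0.0:  # weight collapsed to zero; nothing left to update
            break
        lam = optimal_lambda(w, noise_var)  # assumption: noise_var is held fixed
        w_new = sxy / (sxx + lam)           # ridge estimate for this lam
        converged = abs(w_new - w) < tol
        w = w_new
        if converged:
            break
    return w_ols, w, lam


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic data, for illustration only
    xs = [rng.uniform(-1.0, 1.0) for _ in range(100)]
    ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
    w_ols, w_reg, lam = iterate_plug_in(xs, ys)
    print("OLS weight:", w_ols)
    print("regularized weight:", w_reg)
    print("regularization parameter:", lam)
```

In this sketch the regularized weight is always shrunk relative to the least-squares weight. When the signal is weak relative to the noise, the plug-in iteration can drive the weight toward zero rather than settling at a nonzero value, which is why the loop caps the number of iterations and guards against a zero weight.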