Fifth IEEE International Conference on Data Mining (ICDM'05)
Effective Estimation of Posterior Probabilities: Explaining the Accuracy of Randomized Decision Tree Approaches
Houston, Texas
November 27-November 30
ISBN: 0-7695-2278-5
Wei Fan, IBM T.J.Watson Research
Ed Greengrass, US Department of Defense
Joe McCloskey, US Department of Defense
Philip S. Yu, IBM T.J.Watson Research
Kevin Drummey, US Department of Defense
An increasing number of randomization methods have been independently proposed for different stages of decision tree construction in order to build multiple trees. Randomized decision tree methods have been reported to be significantly more accurate than widely accepted single decision trees, even though the training procedure of some of these methods incorporates a surprisingly random factor and therefore runs counter to the generally accepted idea of using gain functions to choose the optimal feature at each node and compute a single tree that fits the data. One important question that is not yet well understood is the reason behind this high accuracy. We provide an insight based on posterior probability estimation. We first establish the relationship between effective posterior probability estimation and effective loss reduction. We then argue that randomized decision tree methods effectively approximate the true probability distribution using the decision tree hypothesis space. We conduct experiments on both synthetic and real-world datasets under both 0-1 and cost-sensitive loss functions.
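To make the link between posterior estimation and loss concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes binary features, one simple form of randomization (the split feature at each node is drawn at random instead of being chosen by a gain function), and a posterior estimate obtained by averaging leaf class frequencies over the trees. All function names and the cost-matrix layout are hypothetical choices made for this example.

```python
import random
from collections import Counter

def build_random_tree(rows, labels, features, depth, rng):
    """Grow a tree whose split features are drawn at random (no gain function)."""
    node = {"counts": Counter(labels)}  # class counts of examples reaching this node
    if depth == 0 or not features or len(set(labels)) <= 1:
        return node  # leaf
    f = rng.choice(features)
    rest = [g for g in features if g != f]  # use each feature once per path
    node["feature"] = f
    node["children"] = {}
    for value in (0, 1):  # assumption: binary features
        idx = [i for i, r in enumerate(rows) if r[f] == value]
        if idx:
            node["children"][value] = build_random_tree(
                [rows[i] for i in idx], [labels[i] for i in idx],
                rest, depth - 1, rng)
    return node

def tree_posterior(node, x, classes):
    """Estimate P(class | x) from class frequencies at the deepest node x reaches."""
    while "feature" in node:
        child = node["children"].get(x[node["feature"]])
        if child is None:  # no training example went this way; stop here
            break
        node = child
    counts = node["counts"]
    total = sum(counts.values())
    return {c: counts[c] / total for c in classes}

def ensemble_posterior(trees, x, classes):
    """Average the per-tree posterior estimates."""
    per_tree = [tree_posterior(t, x, classes) for t in trees]
    return {c: sum(p[c] for p in per_tree) / len(per_tree) for c in classes}

def decide(posterior, cost=None):
    """Pick a label: argmax posterior under 0-1 loss, else minimum expected cost.

    cost[(predicted, actual)] is the cost of predicting `predicted`
    when the true class is `actual` (hypothetical layout).
    """
    if cost is None:
        return max(posterior, key=posterior.get)
    expected = {a: sum(posterior[c] * cost[(a, c)] for c in posterior)
                for a in posterior}
    return min(expected, key=expected.get)

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: label is x0 XOR x1, x2 is irrelevant.
    rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 5
    labels = [r[0] ^ r[1] for r in rows]
    trees = [build_random_tree(rows, labels, [0, 1, 2], 2, rng) for _ in range(30)]
    p = ensemble_posterior(trees, (1, 0, 1), [0, 1])
    print(p, decide(p))
```

The same posterior estimate feeds both decision rules: under 0-1 loss only the most probable class matters, whereas under a cost-sensitive loss the estimated probabilities themselves determine the expected cost of each prediction, so the quality of the estimate matters more directly.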
Citation:
Wei Fan, Ed Greengrass, Joe McCloskey, Philip S. Yu, Kevin Drummey, "Effective Estimation of Posterior Probabilities: Explaining the Accuracy of Randomized Decision Tree Approaches," Fifth IEEE International Conference on Data Mining (ICDM'05), pp. 154-161, 2005