The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies
Found in: Journal of the ACM (JACM)
By David M. Blei, Michael I. Jordan, Thomas L. Griffiths
Issue Date:January 2010
pp. 1-30
We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees. We show how this stochastic process can be used as a prior distribution in a B...
     
MLI: An API for Distributed Machine Learning
Found in: 2013 IEEE International Conference on Data Mining (ICDM)
By Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska
Issue Date:December 2013
pp. 1187-1192
MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, ...
 
Distributed Low-Rank Subspace Segmentation
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Ameet Talwalkar, Lester Mackey, Yadong Mu, Shih-Fu Chang, Michael I. Jordan
Issue Date:December 2013
pp. 3543-3550
Vision problems ranging from image clustering to motion segmentation to semi-supervised learning can naturally be framed as subspace segmentation problems, in which one aims to recover multiple low-dimensional subspaces from noisy and corrupted input data....
 
Local Privacy and Statistical Minimax Rates
Found in: 2013 IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS)
By John C. Duchi, Michael I. Jordan, Martin J. Wainwright
Issue Date:October 2013
pp. 429-438
Working under local differential privacy (a model of privacy in which data remains private even from the statistician or learner), we study the tradeoff between privacy guarantees and the utility of the resulting statistical estimators. We prove bounds on inf...
 
Qualcomm Context-Awareness Symposium Sets Research Agenda for Context-Aware Smartphones
Found in: IEEE Pervasive Computing
By Paul Lukowicz, Sanjiv Nanda, Vidya Narayanan, Hal Abelson, Deborah L. McGuinness, Michael I. Jordan
Issue Date:January 2012
pp. 76-79
The first context-aware applications have found their way into app stores. However, these are mostly simple location-aware services and basic motion-analysis tools that are well behind the state of the art in wearable context recognition. Understanding how...
 
Visually Relating Gene Expression and in vivo DNA Binding Data
Found in: Bioinformatics and Biomedicine, IEEE International Conference on
By Min-Yu Huang, Lester Mackey, Soile V.E. Keränen, Gunther H. Weber, Michael I. Jordan, David W. Knowles, Mark D. Biggin, Bernd Hamann
Issue Date:November 2011
pp. 586-589
Gene expression and in vivo DNA binding data provide important information for understanding gene regulatory networks: in vivo DNA binding data indicate genomic regions where transcription factors are bound, and expression data show the output resulting fr...
 
Sufficient dimension reduction for visual sequence classification
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Alex Shyr, Raquel Urtasun, Michael I. Jordan
Issue Date:June 2010
pp. 3610-3617
When classifying high-dimensional sequence data, traditional methods (e.g., HMMs, CRFs) may require large amounts of training data to avoid overfitting. In such cases dimensionality reduction can be employed to find a low-dimensional representation on whic...
 
Online System Problem Detection by Mining Patterns of Console Logs
Found in: Data Mining, IEEE International Conference on
By Wei Xu, Ling Huang, Armando Fox, David Patterson, Michael Jordan
Issue Date:December 2009
pp. 588-597
We describe a novel application of using data mining and statistical learning methods to automatically monitor and detect abnormal execution traces from console logs in an online setting. Different from existing solutions, we use a two stage detection syst...
 
Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning
Found in: Data Engineering, International Conference on
By Archana Ganapathi, Harumi Kuno, Umeshwar Dayal, Janet L. Wiener, Armando Fox, Michael Jordan, David Patterson
Issue Date:April 2009
pp. 592-603
One of the most challenging aspects of managing a very large data warehouse is identifying how queries will behave before they start executing. Yet knowing their performance characteristics --- their runtimes and resource usage --- can solve two important ...
 
Nonnegative Matrix Factorization for Combinatorial Optimization: Spectral Clustering, Graph Matching, and Clique Finding
Found in: Data Mining, IEEE International Conference on
By Chris Ding, Tao Li, Michael I. Jordan
Issue Date:December 2008
pp. 183-192
Nonnegative matrix factorization (NMF) is a versatile model for data clustering. In this paper, we propose several NMF inspired algorithms to solve different data mining problems. They include (1) multi-way normalized cut spectral clustering, (2) graph mat...
 
ICMLA 2008 Invited Speakers
Found in: Machine Learning and Applications, Fourth International Conference on
By Dan Roth, Jude Shavlik, Michael I. Jordan, Bin Yu, Andrew Moore, Philip Bourne
Issue Date:December 2008
pp. xxv-xxviii
No summary available.
   
Convex and Semi-Nonnegative Matrix Factorizations
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Chris Ding, Tao Li, Michael I. Jordan
Issue Date:January 2010
pp. 45-55
We present several new variations on the theme of nonnegative matrix factorization (NMF). Considering factorizations of the form X=FG^T, we focus on algorithms in which G is restricted to containing nonnegative entries, but allowing the data matrix X to ha...
 
Learning Multiscale Representations of Natural Scenes Using Dirichlet Processes
Found in: Computer Vision, IEEE International Conference on
By Jyri J. Kivinen, Erik B. Sudderth, Michael I. Jordan
Issue Date:October 2007
pp. 1-8
We develop nonparametric Bayesian models for multiscale representations of images depicting natural scene categories. Individual features or wavelet coefficients are marginally described by Dirichlet process (DP) mixtures, yielding the heavy-tailed margina...
 
Solving Consensus and Semi-supervised Clustering Problems Using Nonnegative Matrix Factorization
Found in: Data Mining, IEEE International Conference on
By Tao Li, Chris Ding, Michael I. Jordan
Issue Date:October 2007
pp. 577-582
Consensus clustering and semi-supervised clustering are important extensions of the standard clustering paradigm. Consensus clustering (also known as aggregation of clustering) can improve clustering robustness, deal with distributed and heterogeneous data...
 
Combining Visualization and Statistical Analysis to Improve Operator Confidence and Efficiency for Failure Detection and Localization
Found in: Autonomic Computing, International Conference on
By Peter Bodíc, Greg Friedman, Lukas Biewald, Helen Levine, George Candea, Kayur Patel, Gilman Tolle, Jon Hui, Armando Fox, Michael I. Jordan, David Patterson
Issue Date:June 2005
pp. 89-100
Web applications suffer from software and configuration faults that lower their availability. Recovering from failure is dominated by the time interval between when these faults appear and when they are detected by site operators. We introduce a set of too...
 
Failure Diagnosis Using Decision Trees
Found in: Autonomic Computing, International Conference on
By Mike Chen, Alice X. Zheng, Jim Lloyd, Michael I. Jordan, Eric Brewer
Issue Date:May 2004
pp. 36-43
We present a decision tree learning approach to diagnosing failures in large Internet sites. We record runtime properties of each request and apply automated machine learning and data mining techniques to identify the causes of failures. We train decision ...
 
Iterative Discovery of Multiple Alternative Clustering Views
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Donglin Niu, Jennifer G. Dy, Michael I. Jordan
Issue Date:July 2014
pp. 1340-1353
Complex data can be grouped and interpreted in many different ways. Most existing clustering algorithms, however, only find one clustering solution, and provide little guidance to data analysts who may not be satisfied with that single clustering and may w...
 
Combinatorial Clustering and the Beta Negative Binomial Process
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Tamara Broderick, Lester Mackey, John Paisley, Michael Jordan
Issue Date:May 2014
pp. 1
We develop a Bayesian nonparametric approach to a general family of latent class problems in which individuals can belong simultaneously to multiple classes and where each class can be exhibited multiple times by an individual. We introduce a combinatorial...
 
Statistical Machine Learning and Computational Biology
Found in: Bioinformatics and Biomedicine, IEEE International Conference on
By Michael I. Jordan
Issue Date:November 2007
pp. 4
Statistical machine learning is a field that combines algorithmic ideas with foundational concepts from probability and statistics. This combination makes statistical machine learning an essential tool for computational biology, in part because probabilist...
   
LOGOS: a modular Bayesian model for de novo motif detection
Found in: Computational Systems Bioinformatics Conference, International IEEE Computer Society
By Eric P. Xing, Wei Wu, Michael I. Jordan, Richard M. Karp
Issue Date:August 2003
pp. 266
The complexity of the global organization and internal structures of motifs in higher eukaryotic organisms raises significant challenges for motif detection techniques. To achieve successful de novo motif detection it is necessary to model the complex depe...
 
Active spectral clustering via iterative uncertainty reduction
Found in: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '12)
By Fabian L. Wauthier, Michael I. Jordan, Nebojsa Jojic
Issue Date:August 2012
pp. 1339-1347
Spectral clustering is a widely used method for organizing data that only relies on pairwise similarity measurements. This makes its application to non-vectorial data straight-forward in principle, as long as all pairwise similarities are available. Howeve...
     
Divide-and-conquer and statistical inference for big data
Found in: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '12)
By Michael I. Jordan
Issue Date:August 2012
pp. 4-4
I present some recent work on statistical inference for Big Data. Divide-and-conquer is a natural computational paradigm for approaching Big Data problems, particularly given recent developments in distributed and parallel computing, but some interesting c...
     
Managing data transfers in computer clusters with orchestra
Found in: Proceedings of the ACM SIGCOMM 2011 conference on SIGCOMM (SIGCOMM '11)
By Ion Stoica, Justin Ma, Matei Zaharia, Michael I. Jordan, Mosharaf Chowdhury
Issue Date:August 2011
pp. 98-109
Cluster computing applications like MapReduce and Dryad transfer massive amounts of data between their computation stages. These transfers can have a significant impact on job performance, accounting for more than 50% of job completion times. Despite this ...
     
Detecting large-scale system problems by mining console logs
Found in: Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles (SOSP '09)
By Armando Fox, David Patterson, Ling Huang, Michael I. Jordan, Wei Xu
Issue Date:October 2009
pp. 117-132
Surprisingly, console logs rarely help operators detect problems in large-scale datacenter services, for they often consist of the voluminous intermixing of messages from many software components written by independent developers. We propose a general meth...
     
Fast approximate spectral clustering
Found in: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '09)
By Donghui Yan, Ling Huang, Michael I. Jordan
Issue Date:June 2009
pp. 1-24
Spectral clustering refers to a flexible class of clustering procedures that can produce high-quality clusterings on small data sets but which has limited applicability to large-scale problems due to its computational complexity of O(n^3) in general, with n...
     
Learning from measurements in exponential families
Found in: Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09)
By Dan Klein, Michael I. Jordan, Percy Liang
Issue Date:June 2009
pp. 1-8
Given a model family and a set of unlabeled examples, one could either label specific examples or state general constraints---both provide information about the desired model. In general, what is the most cost-effective way to learn? To address this questi...
     
An asymptotic analysis of generative, discriminative, and pseudolikelihood estimators
Found in: Proceedings of the 25th international conference on Machine learning (ICML '08)
By Michael I. Jordan, Percy Liang
Issue Date:July 2008
pp. 584-591
Statistical and computational concerns have motivated parameter estimators based on various forms of likelihood, e.g., joint, conditional, and pseudolikelihood. In this paper, we present a unified framework for studying these estimators, which allows us to...
     
An HDP-HMM for systems with state persistence
Found in: Proceedings of the 25th international conference on Machine learning (ICML '08)
By Alan S. Willsky, Emily B. Fox, Erik B. Sudderth, Michael I. Jordan
Issue Date:July 2008
pp. 312-319
The hierarchical Dirichlet process hidden Markov model (HDP-HMM) is a flexible, nonparametric model which allows state spaces of unknown size to be learned from data. We demonstrate some limitations of the original HDP-HMM formulation (Teh et al., 2006), a...
     
Statistical debugging: simultaneous identification of multiple bugs
Found in: Proceedings of the 23rd international conference on Machine learning (ICML '06)
By Alex Aiken, Alice X. Zheng, Ben Liblit, Mayur Naik, Michael I. Jordan
Issue Date:June 2006
pp. 1105-1112
We describe a statistical approach to software debugging in the presence of multiple bugs. Due to sparse sampling issues and complex interaction between program predicates, many generic off-the-shelf algorithms fail to select useful bug predictors. Taking ...
     
Bayesian multi-population haplotype inference via a hierarchical Dirichlet process mixture
Found in: Proceedings of the 23rd international conference on Machine learning (ICML '06)
By Eric P. Xing, Kyung-Ah Sohn, Michael I. Jordan, Yee-Whye Teh
Issue Date:June 2006
pp. 1049-1056
Uncovering the haplotypes of single nucleotide polymorphisms and their population demography is essential for many biological and medical applications. Methods for haplotype inference developed thus far---including methods based on coalescence, finite and ...
     
A graphical model for predicting protein molecular function
Found in: Proceedings of the 23rd international conference on Machine learning (ICML '06)
By Barbara E. Engelhardt, Michael I. Jordan, Steven E. Brenner
Issue Date:June 2006
pp. 297-304
We present a simple statistical model of molecular function evolution to predict protein function. The model description encodes general knowledge of how molecular function evolves within a phylogenetic tree based on the proteins' sequence. Inputs are a ph...
     
Predictive low-rank decomposition for kernel methods
Found in: Proceedings of the 22nd international conference on Machine learning (ICML '05)
By Francis R. Bach, Michael I. Jordan
Issue Date:August 2005
pp. 33-40
Low-rank matrix decompositions are essential tools in the application of kernel methods to large-scale learning problems. These decompositions have generally been treated as black boxes---the decomposition of the kernel matrix that they deliver is independ...
     
Scalable statistical bug isolation
Found in: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation (PLDI '05)
By Alex Aiken, Alice X. Zheng, Ben Liblit, Mayur Naik, Michael I. Jordan
Issue Date:June 2005
pp. 280-280
We present a statistical debugging algorithm that isolates bugs in programs containing multiple undiagnosed bugs. Earlier statistical algorithms that focus solely on identifying predictors that correlate with program failure perform poorly when there are m...
     
Variational methods for the Dirichlet process
Found in: Twenty-first international conference on Machine learning (ICML '04)
By David M. Blei, Michael I. Jordan
Issue Date:July 2004
pp. 182-182
Variational inference methods, including mean field methods and loopy belief propagation, have been widely used for approximate probabilistic inference in graphical models. While often less accurate than MCMC, variational methods provide a fast determinist...
     
Decentralized detection and classification using kernel methods
Found in: Twenty-first international conference on Machine learning (ICML '04)
By Martin J. Wainwright, Michael I. Jordan, XuanLong Nguyen
Issue Date:July 2004
pp. 182-182
We consider the problem of decentralized detection under constraints on the number of bits that can be transmitted by each sensor. In contrast to most previous work, in which the joint distribution of sensor observations is assumed to be known, we address ...
     
Multiple kernel learning, conic duality, and the SMO algorithm
Found in: Twenty-first international conference on Machine learning (ICML '04)
By Francis R. Bach, Gert R. G. Lanckriet, Michael I. Jordan
Issue Date:July 2004
pp. 182-182
While classical kernel-based classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vecto...
     
Bayesian haplotype inference via the Dirichlet process
Found in: Twenty-first international conference on Machine learning (ICML '04)
By Eric Xing, Michael I. Jordan, Roded Sharan
Issue Date:July 2004
pp. 182-182
The problem of inferring haplotypes from genotypes of single nucleotide polymorphisms (SNPs) is essential for the understanding of genetic variation within and among populations, with important applications to the genetic analysis of disease propensities a...
     
Bug isolation via remote program sampling
Found in: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation (PLDI'03)
By Alex Aiken, Alice X. Zheng, Ben Liblit, Michael I. Jordan
Issue Date:June 2003
pp. 329-338
We propose a low-overhead sampling infrastructure for gathering information from the executions experienced by a program's user community. Several example applications illustrate ways to use sampled instrumentation to isolate bugs. Assertion-dense code can...
     
Stable algorithms for link analysis
Found in: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '01)
By Alice X. Zheng, Andrew Y. Ng, Michael I. Jordan
Issue Date:September 2001
pp. 258-266
The Kleinberg HITS and the Google PageRank algorithms are eigenvector methods for identifying "authoritative" or "influential" articles, given hyperlink or citation information. That such algorithms should give reliable or consistent answers is surely...
     
A statistical approach to decision tree modeling
Found in: Proceedings of the seventh annual conference on Computational learning theory (COLT '94)
By Michael I. Jordan
Issue Date:July 1994
pp. 13-20
A statistical approach to decision tree modeling is described. In this approach, each decision in the tree is modeled parametrically as is the process by which an output is generated from an input and a sequence of decisions. The resulting model yields a l...
     
Neural networks
Found in: ACM Computing Surveys (CSUR)
By Christopher M. Bishop, Michael I. Jordan
Issue Date:March 1988
pp. 73-75
Two-dimensional image motion is the projection of the three-dimensional motion of objects, relative to a visual sensor, onto its image plane. Sequences of time-ordered images allow the estimation of projected two-dimensional image motion as either instantan...
     