Guest Editorial: Special Focus on Bioinformatics and Systems Biology
Found in: IEEE/ACM Transactions on Computational Biology and Bioinformatics
By Fang-Xiang Wu, Jun Huan
Issue Date:March 2011
pp. 292-293
No summary available.
 
Knowledge Discovery in Academic Drug Discovery Programs: Opportunities and Challenges
Found in: Data Mining, IEEE International Conference on
By Jun Huan
Issue Date:December 2010
pp. 1218
In the United States, several universities and research institutes, including the National Institutes of Health (NIH), recently started programs aimed at drug discovery. With the initiatives, huge volumes of data have been collected and shared with the public free of ch...
 
GPD: A Graph Pattern Diffusion Kernel for Accurate Graph Classification with Applications in Cheminformatics
Found in: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
By Aaron Smalter, Gerald Lushington, Jun Huan, Yi Jia
Issue Date:April 2010
pp. 197-207
Graph data mining is an active research area. Graphs are general modeling tools to organize information from heterogeneous sources and have been applied in many scientific, engineering, and business fields. With the fast accumulation of graph data, buildin...
     
Multitask Learning with Feature Selection for Groups of Related Tasks
Found in: 2013 IEEE International Conference on Data Mining (ICDM)
By Meenakshi Mishra, Jun Huan
Issue Date:December 2013
pp. 1157-1162
Multitask learning has been thoroughly proven to improve the generalization performance given a set of related tasks. Most multitask learning algorithms assume that all tasks are related. However, if all the tasks are not related, negative transfer of infor...
 
A Family of Joint Sparse PCA Algorithms for Anomaly Localization in Network Data Streams
Found in: IEEE Transactions on Knowledge and Data Engineering
By Ruoyi Jiang, Hongliang Fei, Jun Huan
Issue Date:November 2013
pp. 2421-2433
Determining anomalies in data streams that are collected and transformed from various types of networks has recently attracted significant research interest. Principal component analysis (PCA) has been extensively applied to detecting anomalies in network ...
 
Text Categorization of Biomedical Data Sets Using Graph Kernels and a Controlled Vocabulary
Found in: IEEE/ACM Transactions on Computational Biology and Bioinformatics
By Said Bleik, Meenakshi Mishra, Jun Huan, Min Song
Issue Date:September 2013
pp. 1211-1217
Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph ...
 
When Additional Views are Not Free: Active View Completion for Multi-view Semi-Supervised Learning
Found in: 2012 IEEE 12th International Conference on Data Mining Workshops
By Brian Quanz, Jun Huan
Issue Date:December 2012
pp. 169-178
Multi-view semi-supervised learning methods exploit the combination of multiple data views and unlabeled data in order to learn better predictive functions with limited labeled data. However, their applicability is limited since typically one data view is ...
 
Identification of transposable elements of the giant panda (Ailuropoda melanoleuca) genome
Found in: 2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW)
By Avindra Fernando, Jun Huan, Justin P. Blumenstiel, Jin Lin, Xue-wen Chen, Bo Luo
Issue Date:October 2012
pp. 674-681
Transposable elements are very common in genomes and play an important role in evolution. Recently, using next-generation sequencing technologies, more and more non-traditional genomes such as the giant panda are being sequenced. Identifying transposab...
 
Drug-induced QT prolongation prediction using co-regularized multi-view learning
Found in: 2012 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
By Jintao Zhang, Jun Huan
Issue Date:October 2012
pp. 1-6
Drug-induced QT prolongation is a major life-threatening adverse drug effect. It is crucial to predict the QT prolongation effect as early as possible in drug development; however, data on drugs that induce QT prolongation are very limited and noisy. Multi...
 
Structured Feature Selection and Task Relationship Inference for Multi-task Learning
Found in: Data Mining, IEEE International Conference on
By Hongliang Fei, Jun Huan
Issue Date:December 2011
pp. 171-180
Multi-task Learning (MTL) aims to enhance the generalization performance of supervised regression or classification by learning multiple related tasks simultaneously. In this paper, we aim to extend the current MTL techniques to high dimensional data sets ...
 
Bayesian Classifiers for Chemical Toxicity Prediction
Found in: Bioinformatics and Biomedicine, IEEE International Conference on
By Meenakshi Mishra, Brian Potetz, Jun Huan
Issue Date:November 2011
pp. 595-599
A major concern across the globe is the growing number of new chemicals that are brought to use on a regular basis without having any knowledge about their toxic behavior. The challenge here is that the growth in the number of chemicals is fast, and the tr...
 
Knowledge transfer with low-quality data: A feature extraction issue
Found in: Data Engineering, International Conference on
By Brian Quanz, Jun Huan, Meenakshi Mishra
Issue Date:April 2011
pp. 769-779
Effectively utilizing readily available auxiliary data to improve predictive performance on new modeling tasks is a key problem in data mining. In this research the goal is to transfer knowledge between sources of data, particularly when ground truth infor...
 
Feature Selection in the Tensor Product Feature Space
Found in: Data Mining, IEEE International Conference on
By Aaron Smalter, Jun Huan, Gerald Lushington
Issue Date:December 2009
pp. 1004-1009
Classifying objects that are sampled jointly from two or more domains has many applications. The tensor product feature space is useful for modeling interactions between feature sets in different domains but feature selection in the tensor product feature ...
 
CGM: A biomedical text categorization approach using concept graph mining
Found in: Bioinformatics and Biomedicine Workshop, IEEE International Conference on
By S. Bleik, Min Song, A. Smalter, Jun Huan, G. Lushington
Issue Date:November 2009
pp. 38-43
Text Categorization is used to organize and manage biomedical text databases that are growing at an exponential rate. Feature representations for documents are a crucial factor for the performance of text categorization. Most of the successful existing tec...
 
Application of Kernel Functions for Accurate Similarity Search in Large Chemical Databases
Found in: Bioinformatics and Biomedicine, IEEE International Conference on
By Xiaohong Wang, Jun Huan, Aaron Smalter, Gerald H. Lushington
Issue Date:November 2009
pp. 356-361
Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screening, among others. It is widely believed that structure based methods provide an efficient w...
 
The Analysis of Arabidopsis thaliana Circadian Network Based on Non-stationary DBNs Approach with Flexible Time Lag Choosing Mechanism
Found in: Bioinformatics and Biomedicine, IEEE International Conference on
By Yi Jia, Jun Huan
Issue Date:November 2009
pp. 178-181
Dynamic Bayesian Networks (DBNs) are widely used in regulatory network structure inference from noisy gene expression data. However, most previous research assumed that the underlying stochastic processes that generate the gene expression data are sta...
 
Anomaly Detection with Sensor Data for Distributed Security
Found in: Computer Communications and Networks, International Conference on
By Brian Quanz, Hongliang Fei, Jun Huan, Joseph Evans, Victor Frost, Gary Minden, Daniel Deavours, Leon Searl, Daniel DePardo, Martin Kuehnhausen, Daniel Fokum, Matt Zeets, Angela Oguna
Issue Date:August 2009
pp. 1-6
No summary available.
 
Towards Site-Based Protein Functional Annotations
Found in: Bioinformatics and Biomedicine, IEEE International Conference on
By Seak Fei Lei, Jun Huan
Issue Date:November 2008
pp. 193-198
The exact relationship between protein active centers and protein functions is unclear even after decades of intensive study. To improve the functional prediction ability based on the local protein structures, we proposed three different methods. 1) We use...
 
Mining RNA Tertiary Motifs with Structure Graphs
Found in: Scientific and Statistical Database Management, International Conference on
By Xueyi Wang, Jun Huan, Jack S. Snoeyink, Wei Wang
Issue Date:July 2007
pp. 31
We present a novel application of graph database mining to identify tertiary motifs in RNA structures. In our method, we abstract an RNA molecule as a labeled graph and use a frequent subgraph mining technique to derive tertiary motifs. By applying our tec...
 
An Efficient Exact Algorithm for the Motif Stem Search Problem over Large Alphabets
Found in: IEEE/ACM Transactions on Computational Biology and Bioinformatics
By Qiang Yu, Hongwei Huo, Jeffrey Scott Vitter, Jun Huan, Yakov Nekrich
Issue Date:February 2015
pp. 1
In recent years, there has been an increasing interest in planted (l, d) motif search (PMS) with applications to discovering significant segments in biological sequences. However, there has been little discussion about PMS over large alphabets. This paper ...
 
Semantics-driven frequent data pattern mining on electronic health records for effective adverse drug event monitoring
Found in: 2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
By Jingshan Huang, Jun Huan, Alexander Tropsha, Jiangbo Dang, He Zhang, Min Xiong
Issue Date:December 2013
pp. 608-611
Continued surveillance of post-marketing Adverse Drug Events (ADEs) is considered essential for patient safety, and Electronic Health Records (EHRs) serve as a critical source for identifying relevant information. But effective EHR knowledge discovery and ...
   
A new on-line chemical biology data visualization system
Found in: 2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
By Peng Hao, Jintao Zhang, Jun Huan
Issue Date:December 2013
pp. 35-37
To support the nation's public sector probe and drug discovery programs, in this paper, we develop a chemical biology data visualization system based on our designed Molecular Libraries Biological Database (MLBD) that tackles the limitations of the primary ...
   
StemFinder: An efficient algorithm for searching motif stems over large alphabets
Found in: 2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
By Qiang Yu, Hongwei Huo, Jeffrey Scott Vitter, Jun Huan, Yakov Nekrich
Issue Date:December 2013
pp. 473-476
Motif stem search (MSS) is a recent motif search problem to search motifs on large-alphabet inputs. A motif stem is an l-length string with some wildcards. The goal of the MSS problem is to find a set of stems that represents a superset of all (l, d) motif...
   
Graph Database Indexing Using Structured Graph Decomposition
Found in: Data Engineering, International Conference on
By David W. Williams, Jun Huan, Wei Wang
Issue Date:April 2007
pp. 976-985
We introduce a novel method of indexing graph databases in order to facilitate subgraph isomorphism and similarity queries. The index is comprised of two major data structures. The primary structure is a directed acyclic graph which contains a node for eac...
 
Reconstruction of Ancestral Gene Order after Segmental Duplication and Gene Loss
Found in: Computational Systems Bioinformatics Conference, International IEEE Computer Society
By Jun Huan, Jan Prins, Wei Wang, Todd Vision
Issue Date:August 2003
pp. 484
No summary available.
   
Text Categorization of Biomedical Data Sets Using Graph Kernels and a Controlled Vocabulary
Found in: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
By Jun Huan, Meenakshi Mishra, Min Song, Said Bleik
Issue Date:September 2013
pp. 1211-1217
Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph ...
     
CoNet: feature generation for multi-view semi-supervised learning with partially observed views
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Brian Quanz, Jun Huan
Issue Date:October 2012
pp. 1273-1282
Multi-view semi-supervised learning methods try to exploit the combination of multiple views along with large amounts of unlabeled data in order to learn better predictive functions when limited labeled data is available. However, lack of complete view dat...
     
Non-stationary Bayesian networks based on perfect simulation
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Jun Huan, Wenrong Zeng, Yi Jia
Issue Date:October 2012
pp. 1095-1104
Non-stationary Dynamic Bayesian Networks (Non-stationary DBNs) are widely used to model the temporal changes of directed dependency structures from multivariate time series data. However, the existing change-points based non-stationary DBNs methods have se...
     
Biomedical text categorization with concept graph representations using a controlled vocabulary
Found in: Proceedings of the 11th International Workshop on Data Mining in Bioinformatics (BIOKDD '12)
By Jun Huan, Meenakshi Mishra, Min Song, Said Bleik
Issue Date:August 2012
pp. 26-32
Recent work using graph representations for text categorization has shown promising performance over conventional bag-of-words representation of text documents. In this paper we investigate a graph representation of texts for the task of text categorizatio...
     
Inductive multi-task learning with multiple view data
Found in: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '12)
By Jintao Zhang, Jun Huan
Issue Date:August 2012
pp. 543-551
In many real-world applications, it is becoming common to have data extracted from multiple diverse sources, known as "multi-view" data. Multi-view learning (MVL) has been widely studied in many applications, but existing MVL methods learn a single task in...
     
Content based social behavior prediction: a multi-task learning approach
Found in: Proceedings of the 20th ACM international conference on Information and knowledge management (CIKM '11)
By Bo Luo, Hongliang Fei, Jun Huan, Ruoyi Jiang, Yuhao Yang
Issue Date:October 2011
pp. 995-1000
Information flow studies analyze the principles and mechanisms of social information distribution and are an essential research topic in social networks. Traditional approaches are primarily based on the social network graph topology. However, topology itse...
     
Guest Editorial: Special Focus on Bioinformatics and Systems Biology
Found in: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
By Fang-Xiang Wu, Jun Huan
Issue Date:March 2011
pp. 292-293
Markov chain Monte Carlo has been the standard technique for inferring the posterior distribution of genome rearrangement scenarios under a Bayesian approach. We present here a negative result on the rate of convergence of the generally used Markov chains....
     
Regularization and feature selection for networked features
Found in: Proceedings of the 19th ACM international conference on Information and knowledge management (CIKM '10)
By Brian Quanz, Hongliang Fei, Jun Huan
Issue Date:October 2010
pp. 1893-1896
In the standard formalization of supervised learning problems, a datum is represented as a vector of features without prior knowledge about relationships among features. However, for many real world problems, we have such prior knowledge about structure re...
     
Boosting with structure information in the functional space: an application to graph classification
Found in: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '10)
By Hongliang Fei, Jun Huan
Issue Date:July 2010
pp. 643-652
Boosting is a very successful classification algorithm that produces a linear combination of "weak" classifiers (a.k.a. base learners) to obtain high quality classification models. In this paper we propose a new boosting algorithm where base learners have ...
     
Large margin transductive transfer learning
Found in: Proceeding of the 18th ACM conference on Information and knowledge management (CIKM '09)
By Brian Quanz, Jun Huan
Issue Date:November 2009
pp. 1327-1336
Recently there has been increasing interest in the problem of transfer learning, in which the typical assumption that training and testing data are drawn from identical distributions is relaxed. We specifically address the problem of transductive transfer ...
     
L2 norm regularized feature kernel regression for graph data
Found in: Proceeding of the 18th ACM conference on Information and knowledge management (CIKM '09)
By Hongliang Fei, Jun Huan
Issue Date:November 2009
pp. 593-600
Features in many real world applications such as Cheminformatics, Bioinformatics and Information Retrieval have complex internal structure. For example, frequent patterns mined from graph data are graphs. Such graph features have different numbers of nodes ...
     
G-hash: towards fast kernel-based similarity search in large graph databases
Found in: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (EDBT '09)
By Aaron Smalter, Gerald H. Lushington, Jun Huan, Xiaohong Wang
Issue Date:March 2009
pp. 94-104
Structured data, including sets, sequences, trees and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and similarity search. With the fast accumulation of graph databases, similarity search ...
     
Biological pathways as features for microarray data classification
Found in: Proceeding of the 2nd international workshop on Data and text mining in bioinformatics (DTMBIO '08)
By Brian Quanz, Jun Huan, Meeyoung Park
Issue Date:October 2008
pp. 1001-1001
Classification using microarray gene expression data is an important task in bioinformatics. Due to the high dimensionality and small sample size that characterizes microarray data, there has recently been a drive to incorporate any available information i...
     
Structure feature selection for graph classification
Found in: Proceeding of the 17th ACM conference on Information and knowledge management (CIKM '08)
By Hongliang Fei, Jun Huan
Issue Date:October 2008
pp. 1001-1001
With the development of highly efficient graph data collection technology in many application fields, classification of graph data emerges as an important topic in the data mining and machine learning community. Towards building highly accurate classificat...
     
SPIN: mining maximal frequent subgraphs from graph databases
Found in: Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '04)
By Jan Prins, Jiong Yang, Jun Huan, Wei Wang
Issue Date:August 2004
pp. 581-586
One fundamental challenge for mining recurring subgraphs from semi-structured data sets is the overwhelming abundance of such patterns. In large graph databases, the total number of frequent subgraphs can become too large to allow a full enumeration using ...
     