Third IEEE International Conference on Data Mining (2003)
Melbourne, Florida
Nov. 19, 2003 to Nov. 22, 2003
ISBN: 0-7695-1978-4
pp: 735
Byung-Hoon Park, Oak Ridge National Laboratory
George Ostrouchov, Oak Ridge National Laboratory
Gong-Xin Yu, Oak Ridge National Laboratory
Al Geist, Oak Ridge National Laboratory
Andrey Gorin, Oak Ridge National Laboratory
Nagiza F. Samatova, Oak Ridge National Laboratory
We note that a set of statistically "unusual" protein-profile pairs in an experimentally determined database of protein-protein interactions can typify protein-protein interactions, and we propose a novel method, PICUPP, that sifts out such protein-profile pairs using statistical simulation. We demonstrate that unusual Pfam and InterPro profile pairs can be extracted from the DIP database using a bootstrapping approach. In particular, we illustrate that such protein-profile pairs can be used to predict putative pairs of interacting proteins. At a 75% confidence level, prediction accuracy is around 86% with InterPro profiles and around 90% with Pfam profiles.
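The abstract does not spell out the simulation, so the following is only a rough sketch of how a resampling test for "unusual" profile pairs might look, not the PICUPP procedure itself. It assumes a null model in which interaction partners are drawn uniformly at random, and it treats a profile pair as unusual when its observed count exceeds the simulated count in at least a chosen fraction of rounds. The function names, the `profiles` mapping, and the parameters `n_rounds` and `confidence` are all hypothetical.

```python
import random
from collections import Counter
from itertools import product


def pair_counts(interactions, profiles):
    """Count, per unordered profile pair, how many protein pairs it spans."""
    counts = Counter()
    for a, b in interactions:
        # Collect each profile pair once per protein pair.
        seen = {tuple(sorted((p, q)))
                for p, q in product(profiles[a], profiles[b])}
        counts.update(seen)
    return counts


def unusual_pairs(interactions, profiles, n_rounds=200, confidence=0.75, seed=0):
    """Return profile pairs seen more often than in random pairings.

    Assumed null model (not taken from the paper): the same number of
    protein pairs, with both partners drawn uniformly at random.
    """
    rng = random.Random(seed)
    proteins = sorted(profiles)
    observed = pair_counts(interactions, profiles)
    wins = Counter()
    for _ in range(n_rounds):
        random_pairs = [tuple(rng.sample(proteins, 2)) for _ in interactions]
        simulated = pair_counts(random_pairs, profiles)
        for pair, obs in observed.items():
            if simulated[pair] < obs:
                wins[pair] += 1
    # Keep pairs whose observed count beat the simulation often enough.
    return {pair for pair, w in wins.items() if w / n_rounds >= confidence}
```

Here `profiles` maps each protein identifier to the set of Pfam or InterPro profile identifiers it carries, and `interactions` is a list of protein-identifier pairs such as those recorded in DIP. A new protein pair could then be flagged as putatively interacting when it carries at least one of the returned profile pairs, although the paper's actual scoring rule may differ.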

