2008 7th Computer Information Systems and Industrial Management Applications
Nonlinear Dimensionality Reduction by Isomap and MLEdim as Applied to Amino-Acid Distribution in Yeast ORFs
June 26-June 28
ISBN: 978-0-7695-3184-7
We consider the multivariate distribution of amino-acids coding for proteins in Open Reading Frames (ORFs). An appropriate statistical model of this distribution might throw some light on the interdependency of the 20 amino-acids and contribute to the problem of verifying known ORFs (as of 3 April 2008, only 71.02% of known ORFs were verified). From a graphical analysis of the data we deduce that the data cloud might be modelled by a curvilinear manifold of smaller dimension embedded in the larger, 20-dimensional space. To check that assumption we applied two nonlinear methods, Isomap (Tenenbaum et al., 2000) and MLEdim (Levina and Bickel, 2005), to the recorded data, which contain the frequencies of occurrence of the 20 amino-acids in ORFs found in the 7th yeast chromosome. These two methods, based on completely different principles, gave similar results: the true 'intrinsic' dimension of the investigated data appears to be several dimensions smaller than originally supposed.
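As an illustration of the idea behind a maximum-likelihood intrinsic-dimension estimate in the spirit of Levina and Bickel (2005), the following is a minimal sketch, not the code used in the paper. The function name `mle_intrinsic_dim`, the neighbourhood size `k`, and the brute-force distance computation are assumptions made for this example; it also assumes the data contain no duplicate points.

```python
import numpy as np


def mle_intrinsic_dim(X, k=10):
    """Average per-point MLE of intrinsic dimension from k nearest neighbours.

    Hypothetical helper for illustration; assumes no duplicate rows in X
    and 2 <= k < number of rows.
    """
    X = np.asarray(X, dtype=float)
    # Brute-force pairwise Euclidean distances (fine for a few thousand points).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # After sorting each row, column 0 is the point itself (distance 0);
    # columns 1..k hold the distances T_1 <= ... <= T_k to the neighbours.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    t_k = knn[:, -1:]
    # Per-point estimate: inverse of the mean of log(T_k / T_j), j = 1..k-1.
    log_ratio = np.log(t_k / knn[:, :-1])
    m_hat = (k - 1) / np.sum(log_ratio, axis=1)
    # Average the local estimates over all points.
    return float(np.mean(m_hat))


if __name__ == "__main__":
    # Synthetic check: a 2-D plane linearly embedded in 20-D space.
    rng = np.random.default_rng(0)
    Z = rng.uniform(size=(500, 2))
    A = rng.normal(size=(2, 20))
    print(mle_intrinsic_dim(Z @ A, k=10))
```

On data like the synthetic plane above, the estimate should come out close to 2 rather than 20, which is the kind of gap between ambient and intrinsic dimension the abstract reports for the amino-acid frequencies. In practice the result depends on the choice of `k`, so estimates are often averaged over a range of neighbourhood sizes.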
Index Terms:
intrinsic dimension, reduction of dimensionality, genetic code, Open Reading Frames (ORFs) in yeast, Isomap, MLEdim estimator
Citation:
Anna Bartkowiak, "Nonlinear Dimensionality Reduction by Isomap and MLEdim as Applied to Amino-Acid Distribution in Yeast ORFs," cisim, pp.183-188, 2008 7th Computer Information Systems and Industrial Management Applications, 2008