2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (2016)
Haifa, Israel
Sept. 11, 2016 to Sept. 15, 2016
ISBN: 978-1-5090-5308-7
pp: 363-372
Andrew Anderson, Lero, Trinity College Dublin, Ireland
David Gregg, Lero, Trinity College Dublin, Ireland
ABSTRACT
We propose a scheme for reduced-precision representation of floating-point data on a continuum between IEEE-754 floating-point types. Our scheme enables the use of lower-precision formats to reduce storage space requirements and data transfer volume. We describe how our scheme can be accelerated using existing hardware vector units on a general-purpose processor (GPP). Exploiting native vector hardware allows us to support reduced-precision floating point with low overhead. We demonstrate that supporting reduced precision in the compiler, as opposed to using a library approach, can yield a low-overhead solution for GPPs.
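As a rough illustration of what a format "between" IEEE-754 types could look like, the sketch below stores a binary32 value in 3 bytes by keeping only its most significant bytes (sign, 8-bit exponent, 15 mantissa bits). This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `pack_f24` and `unpack_f24` are hypothetical, the 3-byte width is just one possible point on the continuum, the discarded bits are truncated rather than rounded, and the code is scalar Python rather than the vectorized approach the abstract describes.

```python
import struct

def pack_f24(x: float) -> bytes:
    """Hypothetical helper: encode x as binary32 (big-endian) and keep the
    3 most significant bytes. The low 8 mantissa bits are truncated, not
    rounded (an assumption of this sketch)."""
    return struct.pack(">f", x)[:3]

def unpack_f24(b: bytes) -> float:
    """Hypothetical helper: widen a 3-byte value back to binary32 by
    zero-filling the discarded low byte."""
    if len(b) != 3:
        raise ValueError("expected exactly 3 bytes")
    return struct.unpack(">f", b + b"\x00")[0]

if __name__ == "__main__":
    value = 3.14159
    stored = pack_f24(value)
    print(len(stored), unpack_f24(stored))
```

With this layout each value occupies 3 bytes instead of 4, a 25% reduction in storage and transfer volume, in exchange for a relative error of roughly 2^-15 from the dropped mantissa bits.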
INDEX TERMS
Hardware, Standards, Writing, Software, Approximation algorithms, Memory management, Vector Architecture, Approximate Computing, Floating Point, Multiple Precision, SIMD
CITATION
Andrew Anderson, David Gregg, "Vectorization of multibyte floating point data formats", 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT), pp. 363-372, 2016, doi:10.1145/2967938.2967966