References

[AMD2013]

AMD. AMD64 Architecture Programmer’s Manual Volume 1: Application Programming. http://support.amd.com/TechDocs/24592.pdf.

[AMD2013b]

AMD. AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions. http://support.amd.com/TechDocs/24594.pdf.

[AMD2013c]

AMD. AMD64 Architecture Programmer’s Manual Volume 4: 128-Bit and 256-Bit Media Instructions. http://support.amd.com/TechDocs/26568.pdf.

[AMD2013d]

AMD. AMD64 Architecture Programmer’s Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions. http://support.amd.com/TechDocs/26569_APM_v5.pdf.

[AMD2014]

AMD. Software Optimization Guide for AMD Family 15h Processors. http://support.amd.com/TechDocs/47414_15h_sw_opt_guide.pdf.

[Anderson1999]

I. J. Anderson. A Distillation Algorithm for Floating-Point Summation. DOI: 10.1137/S1064827596314200.

[Brisebarre2010]

J.-M. Muller, N. Brisebarre, F. de Dinechin, C.-P. Jeannerod, V. Lefèvre, G. Melquiond, N. Revol, D. Stehlé, and S. Torres. Handbook of Floating-Point Arithmetic. DOI: 10.1007/978-0-8176-4705-6_1. ISBN: 978-0-8176-4704-9.

[Fog2014]

A. Fog. 4. Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs. http://www.agner.org/optimize/instruction_tables.pdf.

[Graillat2007]

S. Graillat, P. Langlois, and N. Louvet. Accurate dot products with FMA. http://rnc7.loria.fr/louvet_poster.pdf.

[Hayes2009]

Y.-K. Zhu and W. B. Hayes. Correct Rounding and a Hybrid Approach to Exact Floating-Point Summation. DOI: 10.1137/070710020.

[Hayes2010]

Y.-K. Zhu and W. B. Hayes. Algorithm 908: Online Exact Summation of Floating-Point Streams. DOI: 10.1145/1824801.1824815.

[Higham2002]

N. J. Higham. Accuracy and Stability of Numerical Algorithms. DOI: 10.1137/1.9780898718027.

[Hollingsworth2012]

B. Hollingsworth. New “Bulldozer” and “Piledriver” Instructions. http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf.

[IEEE-754-2008]

Institute of Electrical and Electronics Engineers (IEEE). 754-2008 - IEEE Standard for Floating-Point Arithmetic. https://standards.ieee.org/findstds/standard/754-2008.html.

[ISO-IEC-14882-2011]

International Organization for Standardization (ISO). Programming languages - C++. http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=50372.

[ISO-IEC-9899-2011]

International Organization for Standardization (ISO). Programming languages - C. http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=57853.

[Intel2015]

Intel. Intel(R) 64 and IA-32 Architectures Software Developer’s Manual Volume 1: Basic Architecture. https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf.

[Intel2015a]

Intel. Intel(R) 64 and IA-32 Architectures Software Developer’s Manual Volume 2: Instruction Set Reference, A-Z. https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf.

[Kulisch1986]

U. Kulisch and W. L. Miranker. The Arithmetic of the Digital Computer: A New Approach. DOI: 10.1137/1028001.

[Kulisch2013]

U. Kulisch. Computer Arithmetic and Validity. Theory, Implementation, and Applications. http://www.degruyter.com/view/product/186284. ISBN: 978-3-11-030180-9.

[Lathus2012]

Lathus. Verifikationsmethoden für spärlich besetzte Matrizen.

[Lomont2011]

C. Lomont. Introduction to Intel(R) Advanced Vector Extensions. https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions.

[Malcolm1971]

M. A. Malcolm. On Accurate Floating-point Summation. DOI: 10.1145/362854.362889.

[Meghana2013]

Meghana. An Introduction to the 4th Generation Intel(R) Core(TM) Processor. https://software.intel.com/en-us/articles/an-introduction-to-the-intel-4th-generation-core-processor.

[Ogita2005]

T. Ogita, S. M. Rump, and S. Oishi. Accurate Sum and Dot Product. DOI: 10.1137/030601818.

[Ogita2008]

S. M. Rump, T. Ogita, and S. Oishi. Accurate Floating-Point Summation Part I: Faithful Rounding. DOI: 10.1137/050645671.

[Rump2009]

S. M. Rump. Ultimately Fast Accurate Summation. DOI: 10.1137/080738490.

[Rump2012]

S. M. Rump. Error estimation of floating-point summation and dot product. DOI: 10.1007/s10543-011-0342-4.

[Stallman2015]

R. M. Stallman. Using the GNU Compiler Collection. https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc.pdf.

[Youngbauer2012]

Youngbauer. Second-Generation AMD A-Series APUs Enable Best-in-Class PC Mobility, Entertainment, and Gaming Experience in Single Chip. http://www.amd.com/en-us/press-releases/Pages/second-generation-amd-a-series-2012may15.aspx.