References

[AMD2013] AMD. AMD64 Architecture Programmer’s Manual Volume 1: Application Programming. http://support.amd.com/TechDocs/24592.pdf.
[AMD2013b] AMD. AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions. http://support.amd.com/TechDocs/24594.pdf.
[AMD2013c] AMD. AMD64 Architecture Programmer’s Manual Volume 4: 128-Bit and 256-Bit Media Instructions. http://support.amd.com/TechDocs/26568.pdf.
[AMD2013d] AMD. AMD64 Architecture Programmer’s Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions. http://support.amd.com/TechDocs/26569_APM_v5.pdf.
[AMD2014] AMD. Software Optimization Guide for AMD Family 15h Processors. http://support.amd.com/TechDocs/47414_15h_sw_opt_guide.pdf.
[Anderson1999] I. J. Anderson. A Distillation Algorithm for Floating-Point Summation. DOI: 10.1137/S1064827596314200.
[Brisebarre2010] J.-M. Muller, N. Brisebarre, F. de Dinechin, C.-P. Jeannerod, V. Lefèvre, G. Melquiond, N. Revol, D. Stehlé, and S. Torres. Handbook of Floating-Point Arithmetic. DOI: 10.1007/978-0-8176-4705-6_1. ISBN: 978-0-8176-4704-9.
[Fog2014] A. Fog. 4. Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs. http://www.agner.org/optimize/instruction_tables.pdf.
[Graillat2007] S. Graillat, P. Langlois, and N. Louvet. Accurate dot products with FMA. http://rnc7.loria.fr/louvet_poster.pdf.
[Hayes2009] Y.-K. Zhu and W. B. Hayes. Correct Rounding and a Hybrid Approach to Exact Floating-Point Summation. DOI: 10.1137/070710020.
[Hayes2010] Y.-K. Zhu and W. B. Hayes. Algorithm 908: Online Exact Summation of Floating-Point Streams. DOI: 10.1145/1824801.1824815.
[Higham2002] N. J. Higham. Accuracy and Stability of Numerical Algorithms. DOI: 10.1137/1.9780898718027.
[Hollingsworth2012] B. Hollingsworth. New “Bulldozer” and “Piledriver” Instructions. http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf.
[IEEE-754-2008] Institute of Electrical and Electronics Engineers (IEEE). 754-2008 - IEEE Standard for Floating-Point Arithmetic. https://standards.ieee.org/findstds/standard/754-2008.html.
[ISO-IEC-14882-2011] International Organization for Standardization (ISO). Programming languages - C++. http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=50372.
[ISO-IEC-9899-2011] International Organization for Standardization (ISO). Programming languages - C. http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=57853.
[Intel2015] Intel. Intel(R) 64 and IA-32 Architectures Software Developer’s Manual Volume 1: Basic Architecture. https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf.
[Intel2015a] Intel. Intel(R) 64 and IA-32 Architectures Software Developer’s Manual Volume 2: Instruction Set Reference, A-Z. https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf.
[Kulisch1986] U. W. Kulisch and W. L. Miranker. The Arithmetic of the Digital Computer: A New Approach. DOI: 10.1137/1028001.
[Kulisch2013] U. W. Kulisch. Computer Arithmetic and Validity. Theory, Implementation, and Applications. http://www.degruyter.com/view/product/186284. ISBN: 978-3-11-030180-9.
[Lathus2012] Lathus. Verifikationsmethoden für spärlich besetzte Matrizen.
[Lomont2011] C. Lomont. Introduction to Intel(R) Advanced Vector Extensions. https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions.
[Malcolm1971] M. A. Malcolm. On Accurate Floating-point Summation. DOI: 10.1145/362854.362889.
[Meghana2013] Meghana. An Introduction to the 4th Generation Intel(R) Core(TM) Processor. https://software.intel.com/en-us/articles/an-introduction-to-the-intel-4th-generation-core-processor.
[Ogita2005] T. Ogita, S. M. Rump, and S. Oishi. Accurate Sum and Dot Product. DOI: 10.1137/030601818.
[Ogita2008] S. M. Rump, T. Ogita, and S. Oishi. Accurate Floating-Point Summation Part I: Faithful Rounding. DOI: 10.1137/050645671.
[Rump2009] S. M. Rump. Ultimately Fast Accurate Summation. DOI: 10.1137/080738490.
[Rump2012] S. M. Rump. Error estimation of floating-point summation and dot product. DOI: 10.1007/s10543-011-0342-4.
[Stallman2015] R. M. Stallman. Using the GNU Compiler Collection. https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc.pdf.
[Youngbauer2012] Youngbauer. Second-Generation AMD A-Series APUs Enable Best-in-Class PC Mobility, Entertainment, and Gaming Experience in Single Chip. http://www.amd.com/en-us/press-releases/Pages/second-generation-amd-a-series-2012may15.aspx.