Processors and their architecture (collected works)
30 October 2014
Modern processors are very complicated silicon beasts, and many people have no idea, or only a hazy one, of what goes on inside them and why some programs run incredibly fast while others are so slow that you have time to brew a coffee and make something to eat.
If you want to learn how CPUs work, I have put together a list of articles, videos, papers and books that will help you on this journey of discovery and shed light on the fantastic gears of the Machine down to the smallest detail.
General
- 08/2012 - Modern Microprocessors: A 90-Minute Guide!
- 05/2013 - A Journey Through the CPU Pipeline
- 01/2010 - A Crash Course in Modern Hardware
- 08/2012 - Memory Access Patterns Are Important
- 09/2014 - How L1 and L2 CPU caches work, and why they’re an essential part of modern chips
- 06/2014 - CPU Cache Essentials
- 09/2012 - CPU Caches (video)
- 02/2013 - CPU Cache Flushing Fallacy
- 10/2013 - Caching In: Understand, Measure and Use your CPU Cache more effectively
- 01/2013 - Cache Money Hoes
- 11/2007 - What Every Programmer Should Know About Memory
- 07/2010 - Memory Barriers: a Hardware View for Software Hackers
- 05/2010 - Fast and slow if-statements: branch prediction in modern processors (the numbers no longer hold; current CPUs detect repeating patterns)
- 02/2013 - Frustum culling: turning the crank
- 04/2014 - Native Code Performance on Modern CPUs: A Changing Landscape (video)
- 02/2014 - Instruction latencies and throughput for AMD and Intel x86 processors
- 10/2014 - x86 Instruction tables
- 09/2012 - Weak vs. Strong Memory Models
- A Primer on Memory Consistency and Cache Coherence
- CPU cache
- The microarchitecture of Intel, AMD and VIA CPUs - An optimization guide for assembly programmers and compiler makers
Processor and architecture details
- 08/2012 - AMD's Steamroller Detailed: 3rd Generation Bulldozer Core
- 09/2012 - Oracle hurls Sparc T5 gladiators into big-iron arena
- 08/2013 - You won't find this in your phone: A 4GHz 12-core Power8 for badass boxes
- 09/2013 - Architecture and Performance of the Tilera TILE-Gx8072 Manycore Processor (video)
- 11/2013 - Intel unveils 72-core x86 Knights Landing CPU for exascale supercomputing
- 12/2013 - Programming a 144-computer Chip to Minimize Power (a processor designed specifically for the stack-based language Forth)
- 01/2014 - Knights Landing Details
- 02/2014 - Better late than never: Monster 15-core Xeon chips let loose by Intel
- 06/2014 - Researchers unveil experimental 36-core chip
- 08/2014 - Oracle reveals 32-core, 10 BEEELLION-transistor SPARC M7
- 09/2014 - Inside Intel's Xeon E5-2600 V3: Huge Performance Leaps
- 09/2014 - Intel Xeon E5 Version 3: Up to 18 Haswell EP Cores
- Benchmarks : Measuring Cache and Memory Latency (access patterns, paging and TLBs)
- Benchmarks : Intel Mobile Haswell (CrystalWell): Memory Sub-System
- Epiphany Architecture Reference
Algorithms and data structures that exploit CPU properties to their advantage
- 06/2007 - Cache-oblivious data structures
- 10/2013 - Cache-Oblivious Maps (video)
- 09/2013 - Methods for High-Throughput Computation of Elementary Functions
- An Experimental Study of Sorting and Branch Prediction
- AA-Sort: A New Parallel Sorting Algorithm for Multi-Core SIMD Processors
- MICA: A Holistic Approach to Fast In-Memory Key-Value Storage
- The Bw-Tree: A B-tree for New Hardware Platforms
- The Multikernel: A new OS architecture for scalable multicore systems
Mill CPU
An illustrative series of videos that introduces the upcoming Mill architecture and, along the way, shows how current processors work and how the Mill differs from them.
- 08/2013 - Instruction Encoding
- 07/2013 - The Belt
- 10/2013 - Memory
- 11/2013 - Prediction
- 12/2013 - Metadata
- 02/2014 - Execution
- 03/2014 - Security
- 05/2014 - Specification
- 07/2014 - Pipelining
Itanium
- The Itanium processor, part 1: Warming up
- The Itanium processor, part 2: Instruction encoding, templates, and stops
- The Itanium processor, part 3: The Windows calling convention, how parameters are passed
- The Itanium processor, part 3b: How does spilling actually work?
- The Itanium processor, part 4: The Windows calling convention, leaf functions
- The Itanium processor, part 5: The GP register, calling functions, and function pointers
- The Itanium processor, part 6: Calculating conditionals
- The Itanium processor, part 7: Speculative loads
- The Itanium processor, part 8: Advanced loads
- The Itanium processor, part 9: Counted loops and loop pipelining
- The Itanium processor, part 10: Register rotation
Concurrency
- 08/2014 - An Overview of Kernel Lock Improvements
- 08/2013 - Lock-Based vs Lock-Free Concurrent Algorithms
- 09/2014 - Making Sense of the Intel Haswell Transactional Synchronization eXtensions
- 01/2014 - What's the deal with Hardware Transactional Memory!?! (video)
- 02/2014 - Hardware Transactional Memory in Java, or why synchronized will be cool again.
Java, JVM
- 03/2007 - Advanced Topics in Programming Languages: The Java Memory Model (video)
- 06/2014 - Advice for the concurrently confused: AtomicLong JDK7/8 vs. LongAdder
- Optimizing the future Java through collaboration
- Java vs. Scala: Divided We Fail
- 11/2013 - Java's Atomic and volatile, under the hood on x86