Edit count of the user (user_editcount ) | null |
Name of the user account (user_name ) | '2601:1C0:5801:F538:545:DE8C:D854:8943' |
Age of the user account (user_age ) | 0 |
Groups (including implicit) the user is in (user_groups ) | [
0 => '*'
] |
Global groups that the user is in (global_user_groups ) | [] |
Whether or not a user is editing through the mobile interface (user_mobile ) | true |
Page ID (page_id ) | 137146 |
Page namespace (page_namespace ) | 0 |
Page title without namespace (page_title ) | 'Memory hierarchy' |
Full page title (page_prefixedtitle ) | 'Memory hierarchy' |
Last ten users to contribute to the page (page_recent_contributors ) | [
0 => '122.163.201.111',
1 => 'Bhattasamuel',
2 => 'GliderMaven',
3 => 'Jarble',
4 => '70.247.162.60',
5 => 'Nbarth',
6 => 'Ahunt',
7 => '168.167.134.101',
8 => 'BD2412',
9 => '72.231.176.223'
] |
First user to contribute to the page (page_first_contributor ) | 'The Anome' |
Action (action ) | 'edit' |
Edit summary/reason (summary ) | 'Fixed typo' |
Whether or not the edit is marked as minor (no longer in use) (minor_edit ) | false |
Old page wikitext, before the edit (old_wikitext ) | '{{merge|Computer data storage|date=November 2015}}
[[File:ComputerMemoryHierarchy.svg|thumb|300px|Diagram of the computer memory hierarchy]]
In [[computer architecture]] the '''memory hierarchy''' is a concept used for storin & discussing performance issues in computer architectural design, algorithm predictions, and the lower level [[computer programming|programming]] constructs such as involving [[locality of reference]]. The memory hierarchy in [[computer storage]] distinguishes each level in the hierarchy by response time. Since response time, complexity, and capacity are related,<ref name=toyzee /> the levels may also be distinguished by their performance and controlling technologies.
The many trade-offs in designing for high performance will include the structure of the memory hierarchy, i.e. the size and technology of each component. So the various components can be viewed as forming a hierarchy of memories (m<sub>1</sub>,m<sub>2</sub>,...,m<sub>n</sub>) in which each member m<sub>i</sub> is in a sense subordinate to the next highest member m<sub>i+1</sub> of the hierarchy. To limit waiting by higher levels, a lower level will respond by filling a buffer and then signaling to activate the transfer.
There are four major storage levels.<ref name="toyzee">{{cite book
|last1=Toy
|first1=Wing
|last2=Zee
|first2=Benjamin
|title=Computer Hardware/Software Architecture
|year=1986
|publisher=Prentice Hall
|isbn=0-13-163502-6
|page=30
}}</ref>
# ''Internal'' – [[Processor register]]s and [[CPU cache|cache]].
# Main – the system [[Random-access memory|RAM]] and controller cards.
# On-line mass storage – [[Computer storage#Secondary storage|Secondary]] storage.
# Off-line bulk storage – [[Computer storage#Tertiary storage|Tertiary]] and [[Computer storage#Off-line storage|Off-line storage]].
This is a general memory hierarchy structuring. Many other structures are useful. For example, a paging algorithm may be considered as a level for [[virtual memory]] when designing a [[computer architecture]], and one can include a level of [[nearline storage]] between online and offline storage.
==Properties of the technologies in the memory hierarchy==
* Adding complexity slows down the ''memory hierarchy''.<ref>[[Write-combining]]</ref>
* CMOx memory technology stretches the Flash space in the memory hierarchy<ref>{{cite web
|title=Memory Hierarchy
|url=http://www.unitysemi.com/applications-memory-hierarchy.html
|publisher=Unitity Semiconductor Corporation
|accessdate=16 September 2009}}</ref>
* One of the main ways to increase system performance is minimising how far down the memory hierarchy one has to go to manipulate data.<ref>{{cite web
|title=Multi-Core
|url=http://www.pixelbeat.org/docs/memory_hierarchy/
|author=Pádraig Brady
|accessdate=16 September 2009}}</ref>
* Latency and bandwidth are two metrics associated with caches and memory. Neither of them is uniform, but is specific to a particular component of the memory hierarchy.<ref name=sun>{{Cite journal
| first = Ruud
| last = van der Pas
| author-link =
| first2 =
| last2 =
| author2-link =
| editor-last =
| editor-first =
| editor2-last =
| editor2-first =
| contribution = Memory Hierarchy in Cache-Based Systems
| contribution-url = http://www.sun.com/blueprints/1102/817-0742.pdf
| series =
| year = 2002
| page = 26
| place = Santa Clara, California
| publisher = Sun Microsystems
| url = http://www.sun.com/
| doi =
| id = 817-0742-10}}</ref>
* Predicting where in the memory hierarchy the data resides is difficult.<ref name=sun />
* ...the location in the memory hierarchy dictates the time required for the prefetch to occur.<ref name=sun />
==Application of the concept==
[[File:Hwloc.png|thumb|right|300px|Memory hierarchy of an AMD Bulldozer server.]]
The number of levels in the memory hierarchy and the performance at each level has increased over time. For example, the memory hierarchy of an Intel Haswell Mobile <ref>{{cite web|last=Crothers |first=Brooke |url=http://news.cnet.com/8301-13579_3-57609045-37/dissecting-intels-top-graphics-in-apples-15-inch-macbook-pro/ |title=Dissecting Intel's top graphics in Apple's 15-inch MacBook Pro - CNET |publisher=News.cnet.com |date= |accessdate=2014-07-31}}</ref> processor circa 2013 is:
* [[Processor register]]s – the fastest possible access (usually 1 CPU cycle). A few thousand bytes in size
* [[CPU cache|Cache]]
** Level 0 (L0) [[Micro-operation|Micro operations]] cache – 6 [[KiB]] <ref>{{cite web|url=http://www.anandtech.com/show/6355/intels-haswell-architecture/6 |title=Intel's Haswell Architecture Analyzed: Building a New PC and a New Intel |publisher=AnandTech |date= |accessdate=2014-07-31}}</ref> in size
** Level 1 (L1) [[Opcode|Instruction]] cache – 128 KiB in size
** Level 1 (L1) Data cache – 128 KiB in size. Best access speed is around 700 [[GiB]]/second<ref name=sisd_qa_f_mem_hsw>{{cite web|url=http://www.sisoftware.co.uk/?d=qa&f=mem_hsw |title=SiSoftware Zone |publisher=Sisoftware.co.uk |date= |accessdate=2014-07-31}}</ref>
** Level 2 (L2) Instruction and data (shared) – 1 [[MiB]] in size. Best access speed is around 200 GiB/second<ref name=sisd_qa_f_mem_hsw />
** Level 3 (L3) Shared cache – 6 MiB in size. Best access speed is around 100 GB/second<ref name=sisd_qa_f_mem_hsw />
** Level 4 (L4) Shared cache – 128 MiB in size. Best access speed is around 40 GB/second<ref name=sisd_qa_f_mem_hsw />
* [[Computer memory|Main memory]] ([[Primary storage]]) – [[GiB|Gigabytes]] in size. Best access speed is around 10 GB/second.<ref name=sisd_qa_f_mem_hsw /> In the case of a [[Non-Uniform Memory Access|NUMA]] machine, access times may not be uniform
* [[Disk storage]] ([[Secondary storage]]) – [[TiB|Terabytes]] in size. As of 2013, best access speed is from a [[Solid-state drive|solid state drive]] is about 600 MB/second <ref>{{cite web|url=http://www.tomshardware.com/charts/ssd-charts-2013/AS-SSD-Sequential-Read,2782.html |title=Charts, benchmarks SSD Charts 2013, AS-SSD Sequential Read |publisher=Tomshardware.com |date= |accessdate=2014-07-31}}</ref>
* [[Nearline storage]] ([[Tertiary storage]]) – Up to [[exabytes]] in size. As of 2013, best access speed is about 160 MB/second<ref>{{cite web|url=http://www.lto.org/technology/generations.html |title=Ultrium - LTO Technology - Ultrium GenerationsLTO |publisher=Lto.org |date= |accessdate=2014-07-31}}</ref>
* [[Offline storage]]
The lower levels of the hierarchy – from disks downwards – are also known as [[tiered storage]]. The formal distinction between online, nearline, and offline storage is:<ref name="pearson2010">{{cite web |last=Pearson |first=Tony |year=2010 |url=https://www.ibm.com/developerworks/community/blogs/InsideSystemStorage/entry/the_correct_use_of_the_term_nearline2 |title=Correct use of the term Nearline. |work=IBM Developerworks, Inside System Storage |accessdate=2015-08-16}}</ref>
* Online storage is immediately available for I/O.
* Nearline storage is not immediately available, but can be made online quickly without human intervention.
* Offline storage is not immediately available, and requires some human intervention to bring online.
For example, always-on spinning disks are online, while spinning disks that spin-down, such as massive array of idle disk ([[MAID]]), are nearline. Removable media such as tape cartridges that can be automatically loaded, as in a [[tape library]], are nearline, while cartridges that must be manually loaded are offline.
Most modern [[Central processing unit|CPUs]] are so fast that for most program workloads, the [[wikt:bottleneck|bottleneck]] is the [[locality of reference]] of memory accesses and the efficiency of the [[CPU cache|caching]] and memory transfer between different levels of the hierarchy{{Citation needed|date=September 2009}}. As a result, the CPU spends much of its time idling, waiting for memory I/O to complete. This is sometimes called the ''space cost'', as a larger memory object is more likely to overflow a small/fast level and require use of a larger/slower level. The resulting load on memory use is known as ''pressure'' (respectively ''register pressure'', ''cache pressure'', and (main) ''memory pressure''). Terms for data being missing from a higher level and needing to be fetched from a lower level are, respectively: [[register spilling]] (due to [[register pressure]]: register to cache), [[cache miss]] (cache to main memory), and (hard) [[page fault]] (main memory to disk).
Modern [[programming language]]s mainly assume two levels of memory, main memory and disk storage, though in [[assembly language]] and [[inline assembler]]s in languages such as [[C (programming language)|C]], registers can be directly accessed. Taking optimal advantage of the memory hierarchy requires the cooperation of programmers, hardware, and compilers (as well as underlying support from the operating system):
*'''Programmers''' are responsible for moving data between disk and memory through file I/O.
*'''Hardware''' is responsible for moving data between memory and caches.
*'''[[Compiler optimization|Optimizing compilers]]''' are responsible for generating code that, when executed, will cause the hardware to use caches and registers efficiently.
Many programmers assume one level of memory. This works fine until the application hits a performance wall. Then the memory hierarchy will be assessed during [[code refactoring]].
==See also==
* [[Cloud storage]]
* [[Memory wall]]
* [[Locality of reference#Spatial and temporal locality usage|Use of spatial and temporal locality: hierarchical memory]]
* [[Cache (computing)#The difference between buffer and cache|The difference between buffer and cache]]
* [[Random-access memory#Memory hierarchy]]
* [[CPU cache#Cache hierarchy in a modern processor|Cache hierarchy in a modern processor]]
* [[Computer data storage]]
* [[Computer memory]]
* [[Tiered storage]]
==References==
{{reflist}}
{{DEFAULTSORT:Memory Hierarchy}}
[[Category:Computer architecture]]
[[Category:Computer memory]]
[[Category:Hierarchy]]' |
New page wikitext, after the edit (new_wikitext ) | '{{merge|Computer data storage|date=November 2015}}
[[File:ComputerMemoryHierarchy.svg|thumb|300px|Diagram of the computer memory hierarchy]]
In [[computer architecture]] the '''memory hierarchy''' is a concept used for storing & discussing performance issues in computer architectural design, algorithm predictions, and the lower level [[computer programming|programming]] constructs such as involving [[locality of reference]]. The memory hierarchy in [[computer storage]] distinguishes each level in the hierarchy by response time. Since response time, complexity, and capacity are related,<ref name=toyzee /> the levels may also be distinguished by their performance and controlling technologies.
The many trade-offs in designing for high performance will include the structure of the memory hierarchy, i.e. the size and technology of each component. So the various components can be viewed as forming a hierarchy of memories (m<sub>1</sub>,m<sub>2</sub>,...,m<sub>n</sub>) in which each member m<sub>i</sub> is in a sense subordinate to the next highest member m<sub>i+1</sub> of the hierarchy. To limit waiting by higher levels, a lower level will respond by filling a buffer and then signaling to activate the transfer.
There are four major storage levels.<ref name="toyzee">{{cite book
|last1=Toy
|first1=Wing
|last2=Zee
|first2=Benjamin
|title=Computer Hardware/Software Architecture
|year=1986
|publisher=Prentice Hall
|isbn=0-13-163502-6
|page=30
}}</ref>
# ''Internal'' – [[Processor register]]s and [[CPU cache|cache]].
# Main – the system [[Random-access memory|RAM]] and controller cards.
# On-line mass storage – [[Computer storage#Secondary storage|Secondary]] storage.
# Off-line bulk storage – [[Computer storage#Tertiary storage|Tertiary]] and [[Computer storage#Off-line storage|Off-line storage]].
This is a general memory hierarchy structuring. Many other structures are useful. For example, a paging algorithm may be considered as a level for [[virtual memory]] when designing a [[computer architecture]], and one can include a level of [[nearline storage]] between online and offline storage.
==Properties of the technologies in the memory hierarchy==
* Adding complexity slows down the ''memory hierarchy''.<ref>[[Write-combining]]</ref>
* CMOx memory technology stretches the Flash space in the memory hierarchy<ref>{{cite web
|title=Memory Hierarchy
|url=http://www.unitysemi.com/applications-memory-hierarchy.html
|publisher=Unitity Semiconductor Corporation
|accessdate=16 September 2009}}</ref>
* One of the main ways to increase system performance is minimising how far down the memory hierarchy one has to go to manipulate data.<ref>{{cite web
|title=Multi-Core
|url=http://www.pixelbeat.org/docs/memory_hierarchy/
|author=Pádraig Brady
|accessdate=16 September 2009}}</ref>
* Latency and bandwidth are two metrics associated with caches and memory. Neither of them is uniform, but is specific to a particular component of the memory hierarchy.<ref name=sun>{{Cite journal
| first = Ruud
| last = van der Pas
| author-link =
| first2 =
| last2 =
| author2-link =
| editor-last =
| editor-first =
| editor2-last =
| editor2-first =
| contribution = Memory Hierarchy in Cache-Based Systems
| contribution-url = http://www.sun.com/blueprints/1102/817-0742.pdf
| series =
| year = 2002
| page = 26
| place = Santa Clara, California
| publisher = Sun Microsystems
| url = http://www.sun.com/
| doi =
| id = 817-0742-10}}</ref>
* Predicting where in the memory hierarchy the data resides is difficult.<ref name=sun />
* ...the location in the memory hierarchy dictates the time required for the prefetch to occur.<ref name=sun />
==Application of the concept==
[[File:Hwloc.png|thumb|right|300px|Memory hierarchy of an AMD Bulldozer server.]]
The number of levels in the memory hierarchy and the performance at each level has increased over time. For example, the memory hierarchy of an Intel Haswell Mobile <ref>{{cite web|last=Crothers |first=Brooke |url=http://news.cnet.com/8301-13579_3-57609045-37/dissecting-intels-top-graphics-in-apples-15-inch-macbook-pro/ |title=Dissecting Intel's top graphics in Apple's 15-inch MacBook Pro - CNET |publisher=News.cnet.com |date= |accessdate=2014-07-31}}</ref> processor circa 2013 is:
* [[Processor register]]s – the fastest possible access (usually 1 CPU cycle). A few thousand bytes in size
* [[CPU cache|Cache]]
** Level 0 (L0) [[Micro-operation|Micro operations]] cache – 6 [[KiB]] <ref>{{cite web|url=http://www.anandtech.com/show/6355/intels-haswell-architecture/6 |title=Intel's Haswell Architecture Analyzed: Building a New PC and a New Intel |publisher=AnandTech |date= |accessdate=2014-07-31}}</ref> in size
** Level 1 (L1) [[Opcode|Instruction]] cache – 128 KiB in size
** Level 1 (L1) Data cache – 128 KiB in size. Best access speed is around 700 [[GiB]]/second<ref name=sisd_qa_f_mem_hsw>{{cite web|url=http://www.sisoftware.co.uk/?d=qa&f=mem_hsw |title=SiSoftware Zone |publisher=Sisoftware.co.uk |date= |accessdate=2014-07-31}}</ref>
** Level 2 (L2) Instruction and data (shared) – 1 [[MiB]] in size. Best access speed is around 200 GiB/second<ref name=sisd_qa_f_mem_hsw />
** Level 3 (L3) Shared cache – 6 MiB in size. Best access speed is around 100 GB/second<ref name=sisd_qa_f_mem_hsw />
** Level 4 (L4) Shared cache – 128 MiB in size. Best access speed is around 40 GB/second<ref name=sisd_qa_f_mem_hsw />
* [[Computer memory|Main memory]] ([[Primary storage]]) – [[GiB|Gigabytes]] in size. Best access speed is around 10 GB/second.<ref name=sisd_qa_f_mem_hsw /> In the case of a [[Non-Uniform Memory Access|NUMA]] machine, access times may not be uniform
* [[Disk storage]] ([[Secondary storage]]) – [[TiB|Terabytes]] in size. As of 2013, best access speed is from a [[Solid-state drive|solid state drive]] is about 600 MB/second <ref>{{cite web|url=http://www.tomshardware.com/charts/ssd-charts-2013/AS-SSD-Sequential-Read,2782.html |title=Charts, benchmarks SSD Charts 2013, AS-SSD Sequential Read |publisher=Tomshardware.com |date= |accessdate=2014-07-31}}</ref>
* [[Nearline storage]] ([[Tertiary storage]]) – Up to [[exabytes]] in size. As of 2013, best access speed is about 160 MB/second<ref>{{cite web|url=http://www.lto.org/technology/generations.html |title=Ultrium - LTO Technology - Ultrium GenerationsLTO |publisher=Lto.org |date= |accessdate=2014-07-31}}</ref>
* [[Offline storage]]
The lower levels of the hierarchy – from disks downwards – are also known as [[tiered storage]]. The formal distinction between online, nearline, and offline storage is:<ref name="pearson2010">{{cite web |last=Pearson |first=Tony |year=2010 |url=https://www.ibm.com/developerworks/community/blogs/InsideSystemStorage/entry/the_correct_use_of_the_term_nearline2 |title=Correct use of the term Nearline. |work=IBM Developerworks, Inside System Storage |accessdate=2015-08-16}}</ref>
* Online storage is immediately available for I/O.
* Nearline storage is not immediately available, but can be made online quickly without human intervention.
* Offline storage is not immediately available, and requires some human intervention to bring online.
For example, always-on spinning disks are online, while spinning disks that spin-down, such as massive array of idle disk ([[MAID]]), are nearline. Removable media such as tape cartridges that can be automatically loaded, as in a [[tape library]], are nearline, while cartridges that must be manually loaded are offline.
Most modern [[Central processing unit|CPUs]] are so fast that for most program workloads, the [[wikt:bottleneck|bottleneck]] is the [[locality of reference]] of memory accesses and the efficiency of the [[CPU cache|caching]] and memory transfer between different levels of the hierarchy{{Citation needed|date=September 2009}}. As a result, the CPU spends much of its time idling, waiting for memory I/O to complete. This is sometimes called the ''space cost'', as a larger memory object is more likely to overflow a small/fast level and require use of a larger/slower level. The resulting load on memory use is known as ''pressure'' (respectively ''register pressure'', ''cache pressure'', and (main) ''memory pressure''). Terms for data being missing from a higher level and needing to be fetched from a lower level are, respectively: [[register spilling]] (due to [[register pressure]]: register to cache), [[cache miss]] (cache to main memory), and (hard) [[page fault]] (main memory to disk).
Modern [[programming language]]s mainly assume two levels of memory, main memory and disk storage, though in [[assembly language]] and [[inline assembler]]s in languages such as [[C (programming language)|C]], registers can be directly accessed. Taking optimal advantage of the memory hierarchy requires the cooperation of programmers, hardware, and compilers (as well as underlying support from the operating system):
*'''Programmers''' are responsible for moving data between disk and memory through file I/O.
*'''Hardware''' is responsible for moving data between memory and caches.
*'''[[Compiler optimization|Optimizing compilers]]''' are responsible for generating code that, when executed, will cause the hardware to use caches and registers efficiently.
Many programmers assume one level of memory. This works fine until the application hits a performance wall. Then the memory hierarchy will be assessed during [[code refactoring]].
==See also==
* [[Cloud storage]]
* [[Memory wall]]
* [[Locality of reference#Spatial and temporal locality usage|Use of spatial and temporal locality: hierarchical memory]]
* [[Cache (computing)#The difference between buffer and cache|The difference between buffer and cache]]
* [[Random-access memory#Memory hierarchy]]
* [[CPU cache#Cache hierarchy in a modern processor|Cache hierarchy in a modern processor]]
* [[Computer data storage]]
* [[Computer memory]]
* [[Tiered storage]]
==References==
{{reflist}}
{{DEFAULTSORT:Memory Hierarchy}}
[[Category:Computer architecture]]
[[Category:Computer memory]]
[[Category:Hierarchy]]' |
Unified diff of changes made by edit (edit_diff ) | '@@ -1,5 +1,5 @@
{{merge|Computer data storage|date=November 2015}}
[[File:ComputerMemoryHierarchy.svg|thumb|300px|Diagram of the computer memory hierarchy]]
-In [[computer architecture]] the '''memory hierarchy''' is a concept used for storin & discussing performance issues in computer architectural design, algorithm predictions, and the lower level [[computer programming|programming]] constructs such as involving [[locality of reference]]. The memory hierarchy in [[computer storage]] distinguishes each level in the hierarchy by response time. Since response time, complexity, and capacity are related,<ref name=toyzee /> the levels may also be distinguished by their performance and controlling technologies.
+In [[computer architecture]] the '''memory hierarchy''' is a concept used for storing & discussing performance issues in computer architectural design, algorithm predictions, and the lower level [[computer programming|programming]] constructs such as involving [[locality of reference]]. The memory hierarchy in [[computer storage]] distinguishes each level in the hierarchy by response time. Since response time, complexity, and capacity are related,<ref name=toyzee /> the levels may also be distinguished by their performance and controlling technologies.
The many trade-offs in designing for high performance will include the structure of the memory hierarchy, i.e. the size and technology of each component. So the various components can be viewed as forming a hierarchy of memories (m<sub>1</sub>,m<sub>2</sub>,...,m<sub>n</sub>) in which each member m<sub>i</sub> is in a sense subordinate to the next highest member m<sub>i+1</sub> of the hierarchy. To limit waiting by higher levels, a lower level will respond by filling a buffer and then signaling to activate the transfer.
' |
New page size (new_size ) | 10181 |
Old page size (old_size ) | 10180 |
Size change in edit (edit_delta ) | 1 |
Lines added in edit (added_lines ) | [
0 => 'In [[computer architecture]] the '''memory hierarchy''' is a concept used for storing & discussing performance issues in computer architectural design, algorithm predictions, and the lower level [[computer programming|programming]] constructs such as involving [[locality of reference]]. The memory hierarchy in [[computer storage]] distinguishes each level in the hierarchy by response time. Since response time, complexity, and capacity are related,<ref name=toyzee /> the levels may also be distinguished by their performance and controlling technologies.'
] |
Lines removed in edit (removed_lines ) | [
0 => 'In [[computer architecture]] the '''memory hierarchy''' is a concept used for storin & discussing performance issues in computer architectural design, algorithm predictions, and the lower level [[computer programming|programming]] constructs such as involving [[locality of reference]]. The memory hierarchy in [[computer storage]] distinguishes each level in the hierarchy by response time. Since response time, complexity, and capacity are related,<ref name=toyzee /> the levels may also be distinguished by their performance and controlling technologies.'
] |
Whether or not the change was made through a Tor exit node (tor_exit_node ) | 0 |
Unix timestamp of change (timestamp ) | 1463000256 |