HCIP: Hybrid Short Long History Table-based Cache Instruction Prefetcher

Swapnita Srivastava; P.K. Singh

doi:10.47164/ijngc.v13i3.758

Published Oct 31, 2022

Volume 13, Special Issue 3, October 2022

Swapnita Srivastava

P.K. Singh

Department of Computer Science and Engineering, Madan Mohan Malaviya University, Gorakhpur

Abstract

In modern applications, instruction cache misses have become a performance bottleneck, and numerous prefetchers have been developed to hide memory latency. Today's client and server workloads have large instruction working sets that demand more capacity than the first-level cache offers. These working sets are typically small enough to fit in the Last Level Cache (LLC). However, the Level 1 Instruction (L1-I) cache suffers a high miss rate, which often starves the processor front-end of instructions. Instruction prefetching is a latency-hiding method that brings instructions from the LLC into the L1-I cache before the front-end requests them. Prefetching instructions into the L1-I cache is therefore a fundamental approach to designing a high-performance cache architecture. When developing an efficient and effective prefetcher, accuracy and coverage are the most important parameters to consider. This paper proposes a novel Hybrid Short Long History Table-based Cache Instruction Prefetcher (HCIP) for the L1-I cache. HCIP uses a hybrid configuration of two history-based prefetcher tables: the Long History Table (LST) and the Short History Table (SHT). The PRE+PC table used in HCIP is the transitive closure of the control flow graph. Compared with PIPS and NOPREF, HCIP achieves a maximum coverage of 67% for the majority of the benchmarks given.
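
The two parameters named in the abstract, accuracy and coverage, are commonly defined as ratios over simulator counters. The following is a minimal sketch of those standard definitions, not the evaluation code used for HCIP. The `PrefetchStats` class, its field names, and the numbers in `example` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PrefetchStats:
    """Hypothetical counters a simulator might report for one benchmark run."""
    baseline_misses: int        # L1-I misses with prefetching disabled
    misses_with_prefetch: int   # L1-I misses that remain with the prefetcher enabled
    prefetches_issued: int      # all prefetch requests sent by the prefetcher
    useful_prefetches: int      # prefetched lines referenced before being evicted


def coverage(stats: PrefetchStats) -> float:
    """Fraction of baseline misses that the prefetcher eliminated."""
    if stats.baseline_misses == 0:
        return 0.0
    removed = stats.baseline_misses - stats.misses_with_prefetch
    return removed / stats.baseline_misses


def accuracy(stats: PrefetchStats) -> float:
    """Fraction of issued prefetches that turned out to be useful."""
    if stats.prefetches_issued == 0:
        return 0.0
    return stats.useful_prefetches / stats.prefetches_issued


# Made-up counter values, not taken from the paper.
example = PrefetchStats(
    baseline_misses=1000,
    misses_with_prefetch=400,
    prefetches_issued=800,
    useful_prefetches=600,
)
print(f"coverage = {coverage(example):.2f}, accuracy = {accuracy(example):.2f}")
```

A prefetcher can trade one metric for the other: issuing more prefetches tends to raise coverage while lowering accuracy, which is why both are usually reported together.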

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Srivastava, S., & Singh, P. K. (2022). HCIP: Hybrid Short Long History Table-based Cache Instruction Prefetcher. International Journal of Next-Generation Computing, 13(3). https://doi.org/10.47164/ijngc.v13i3.758

References

Ansari, A., Golshan, F., Lotfi-Kamran, P., and Sarbazi-Azad, H. 2021. Mana: Microarchitecting an instruction prefetcher. arXiv preprint arXiv:2102.01764. DOI: https://doi.org/10.1109/TC.2022.3176825
Ayers, G., Nagendra, N. P., August, D. I., Cho, H. K., Kanev, S., Kozyrakis, C., Krishnamurthy, T., Litz, H., Moseley, T., and Ranganathan, P. 2019. Asmdb: Understanding and mitigating front-end stalls in warehouse-scale computers. In Proceedings of the 46th International Symposium on Computer Architecture. 462–473. DOI: https://doi.org/10.1145/3307650.3322234
Baer, J.-L. 2009. Microprocessor architecture: from simple pipelines to chip multiprocessors. Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511811258
Barroso, L. A., Gharachorloo, K., and Bugnion, E. 1998. Memory system characterization of commercial workloads. In Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No. 98CB36235). IEEE, 3–14. DOI: https://doi.org/10.1145/279361.279363
Christian, A. and Chapa, D. 2021. Instruction prefetchers and cache replacement policies. Ph.D. thesis.
Falsafi, B. and Wenisch, T. F. 2014. A primer on hardware prefetching. Synthesis Lectures on Computer Architecture 9, 1, 1–67. DOI: https://doi.org/10.1007/978-3-031-01743-8
Ferdman, M., Kaynak, C., and Falsafi, B. 2011. Proactive instruction fetch. In Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture. 152–162. DOI: https://doi.org/10.1145/2155620.2155638
Ferdman, M., Wenisch, T. F., Ailamaki, A., Falsafi, B., and Moshovos, A. 2008. Temporal instruction fetch streaming. In 2008 41st IEEE/ACM International Symposium on Microarchitecture. IEEE, 1–10. DOI: https://doi.org/10.1109/MICRO.2008.4771774
Gober, N., Chacon, G., Jiménez, D., and Gratz, P. 2020. Temporal ancestry prefetcher. The 1st Instruction Prefetching Championship (IPC1).
Gupta, V., Kalani, N. S., and Panda, B. Run-jump-run: Bouquet of instruction pointer jumpers for high performance instruction prefetching.
Jin, R., Ruan, N., Xiang, Y., and Wang, H. 2011. Path-tree: An efficient reachability indexing scheme for large directed graphs. ACM Transactions on Database Systems (TODS) 36, 1, 1–44. DOI: https://doi.org/10.1145/1929934.1929941
Kanev, S., Darago, J. P., Hazelwood, K., Ranganathan, P., Moseley, T., Wei, G.-Y., and Brooks, D. 2015. Profiling a warehouse-scale computer. In Proceedings of the 42nd Annual International Symposium on Computer Architecture. 158–169. DOI: https://doi.org/10.1145/2749469.2750392
Karp, R. M. 1990. The transitive closure of a random digraph. Random Structures & Algorithms 1, 1, 73–93. DOI: https://doi.org/10.1002/rsa.3240010106
Khan, T. A., Sriraman, A., Devietti, J., Pokam, G., Litz, H., and Kasikci, B. 2020. I-spy: Context-driven conditional instruction prefetching with coalescing. In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 146–159. DOI: https://doi.org/10.1109/MICRO50266.2020.00024
Kolli, A., Saidi, A., and Wenisch, T. F. 2013. Rdip: Return-address-stack directed instruction prefetching. In 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 260–271. DOI: https://doi.org/10.1145/2540708.2540731
Lo, J. L., Barroso, L. A., Eggers, S. J., Gharachorloo, K., Levy, H. M., and Parekh, S. S. 1998. An analysis of database workload performance on simultaneous multithreaded processors. In Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No. 98CB36235). IEEE, 39–50. DOI: https://doi.org/10.1145/279361.279367
Michaud, P. 2020. Pips: Prefetching instructions with probabilistic scouts. In IPC-1-First Instruction Prefetching Championship. 1–4.
Nakamura, T., Koizumi, T., Degawa, Y., Irie, H., Sakai, S., and Shioya, R. 2020. D-jolt: Distant jolt prefetcher. The 1st Instruction Prefetching Championship (IPC1).
Ramirez, A., Santana, O. J., Larriba-Pey, J. L., and Valero, M. 2002. Fetching instruction streams. In 35th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-35). Proceedings. IEEE, 371–382.
Reinman, G., Calder, B., and Austin, T. 1999. Fetch directed instruction prefetching. In MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture. IEEE, 16–27.
Ros, A. and Jimborean, A. 2020. The entangling instruction prefetcher. IEEE Computer Architecture Letters 19, 2, 84–87. DOI: https://doi.org/10.1109/LCA.2020.3002947
Seznec, A. 2020. The FNL+MMA instruction cache prefetcher. In IPC-1-First Instruction Prefetching Championship. 1–5.
Spracklen, L., Chou, Y., and Abraham, S. G. 2005. Effective instruction prefetching in chip multiprocessors for modern commercial applications. In 11th International Symposium on High-Performance Computer Architecture. IEEE, 225–236.
Weiss, M. 1992. The transitive closure of control dependence: The iterated join. ACM Letters on Programming Languages and Systems (LOPLAS) 1, 2, 178–190. DOI: https://doi.org/10.1145/151333.151337
Yeh, T.-Y., Marr, D. T., and Patt, Y. N. 1993. Increasing the instruction fetch rate via multiple branch prediction and a branch address cache. In Proceedings of the 7th International Conference on Supercomputing. 67–76. DOI: https://doi.org/10.1145/165939.165956
