Method for designing specialized computing systems on the basis of hardware and software cooptimization
https://doi.org/10.32362/2500-316X-2025-13-3-44-53
EDN: QWXGNC
Abstract
Objectives. Pipelining is an effective method for increasing the clock frequency of digital circuits. However, balancing the pipeline stages during synthesis at the register transfer level does not guarantee that the topological implementation of the pipeline will be balanced in terms of signal propagation delays in the selected technological basis. This is due to the algorithms for placing and routing the components of digital devices, which cannot find a strictly optimal solution in an acceptable time. In practice, near-optimal results are obtained by combining manual control of topological constraints, which set general rules for placing components, with automatic optimization of localized fragments of the circuit. Because pipeline circuits connect their individual stages in a simple scheme, they are a convenient example for demonstrating the effect of topological design constraints. A number of algorithms can be implemented on the basis of pipeline structures to complement programmable processor devices and provide hardware acceleration of some tasks. The present work develops methodological recommendations for managing topological design constraints in the implementation of pipeline computing structures based on programmable logic devices (PLD) with field-programmable gate array (FPGA) architecture.
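The point that a pipeline balanced at the register transfer level can still be unbalanced after placement and routing can be illustrated with a minimal timing model. The sketch below assumes that each stage delay is the sum of a logic delay and a routing delay and that the clock period is bounded by the slowest stage. The function name and all delay values are hypothetical and are not taken from the paper.

```python
def max_clock_mhz(logic_delays_ns, routing_delays_ns):
    """Return the highest clock frequency (MHz) allowed by the slowest stage."""
    if len(logic_delays_ns) != len(routing_delays_ns):
        raise ValueError("one routing delay is needed per stage")
    # Stage delay = logic delay (fixed by synthesis) + routing delay (fixed by placement).
    stage_delays = [l + r for l, r in zip(logic_delays_ns, routing_delays_ns)]
    # The clock period cannot be shorter than the slowest stage; 1000 / ns gives MHz.
    return 1000.0 / max(stage_delays)


# Hypothetical numbers: four stages with identical logic delay (balanced at RTL).
logic = [2.0, 2.0, 2.0, 2.0]
balanced_routing = [0.5, 0.5, 0.5, 0.5]    # registers of adjacent stages placed close together
unbalanced_routing = [0.5, 0.5, 3.0, 0.5]  # one stage scattered across the device

f_balanced = max_clock_mhz(logic, balanced_routing)
f_unbalanced = max_clock_mhz(logic, unbalanced_routing)
print(f_balanced, f_unbalanced)
```

In this toy model a single badly routed stage halves the achievable clock frequency even though the logic of every stage is identical, which is the situation that placement constraints are meant to prevent.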
Methods. The work is based on accepted methods for designing and modeling digital systems.
Results. Based on the analysis, modifications to a 32-bit CORDIC pipeline for computing transcendental functions were developed. Adding design constraints on the placement of the register groups that correspond to the pipeline stages achieves a significant increase in clock frequency compared with automatic placement and reduces the running time of the routing algorithms. The effect is reproduced systematically in several implemented versions of the pipeline.
Conclusions. The presented recommendations can be used to control the clock frequency and the number of stages of pipeline computing structures while reducing the time of a single placement and routing iteration for a module implemented on a PLD with FPGA architecture.
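For readers unfamiliar with CORDIC, the following is a minimal software sketch of a rotation-mode CORDIC computation in which each loop iteration plays the role of one pipeline stage. It is not the authors' design: the stage count, the fixed-point format, and all names (`STAGES`, `FRAC_BITS`, `cordic_stage`, `cordic_sin_cos`) are assumptions chosen for illustration.

```python
import math

STAGES = 32        # assumed: one CORDIC iteration per pipeline stage
FRAC_BITS = 30     # assumed fixed-point format: 30 fractional bits
SCALE = 1 << FRAC_BITS

# Per-stage rotation angles atan(2^-i), stored as fixed-point constants.
ANGLES = [round(math.atan(2.0 ** -i) * SCALE) for i in range(STAGES)]

# Reciprocal of the accumulated CORDIC gain, used as the initial x value.
GAIN_INV = 1.0
for i in range(STAGES):
    GAIN_INV /= math.sqrt(1.0 + 2.0 ** (-2 * i))
X0 = round(GAIN_INV * SCALE)


def cordic_stage(i, x, y, z):
    """Logic of stage i: shift-and-add rotation toward a zero residual angle z."""
    if z >= 0:
        return x - (y >> i), y + (x >> i), z - ANGLES[i]
    return x + (y >> i), y - (x >> i), z + ANGLES[i]


def cordic_sin_cos(angle_rad):
    """Return (sin, cos) for an angle in [-pi/2, pi/2] radians."""
    x, y, z = X0, 0, round(angle_rad * SCALE)
    for i in range(STAGES):
        # In hardware, the outputs of each stage would be latched in a register group.
        x, y, z = cordic_stage(i, x, y, z)
    return y / SCALE, x / SCALE


s, c = cordic_sin_cos(0.5)
print(s, c)
```

In a hardware pipeline, the triple (x, y, z) produced by each stage forms one register group, and these groups are the natural units to which placement constraints would be attached.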
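One way such placement constraints could be written is sketched below as a small generator of per-stage area constraints. It assumes a Xilinx/AMD Vivado flow and emits the Tcl/XDC commands `create_pblock`, `resize_pblock`, and `add_cells_to_pblock`. The source does not name the toolchain, and the function name, cell name pattern, and slice coordinates are hypothetical and do not correspond to any particular device or to the authors' constraints.

```python
def pblock_constraints(stage_count, cell_pattern, columns_per_stage, rows):
    """Build XDC lines that assign each stage's registers to its own column band."""
    lines = []
    for s in range(stage_count):
        name = f"pblock_stage{s}"
        # Hypothetical layout: stages occupy adjacent vertical bands of slices.
        x_lo = s * columns_per_stage
        x_hi = x_lo + columns_per_stage - 1
        lines.append(f"create_pblock {name}")
        lines.append(
            f"resize_pblock [get_pblocks {name}] "
            f"-add {{SLICE_X{x_lo}Y0:SLICE_X{x_hi}Y{rows - 1}}}"
        )
        # cell_pattern is a hypothetical naming scheme for the stage registers.
        cells = cell_pattern.format(stage=s)
        lines.append(
            f"add_cells_to_pblock [get_pblocks {name}] [get_cells {{{cells}}}]"
        )
    return lines


xdc = pblock_constraints(2, "cordic/stage{stage}_reg*", 4, 50)
print("\n".join(xdc))
```

Placing consecutive stages in adjacent bands keeps stage-to-stage connections short and of similar length, which is the intent of constraining register groups rather than individual cells.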
About the Authors
I. E. Tarasov
Russian Federation
Ilya E. Tarasov, Dr. Sci. (Eng.), Associate Professor, Head of the Laboratory of Specialized Computing Systems
78, Vernadskogo pr., Moscow, 119454 Russia
Scopus Author ID 57213354150
Competing Interests:
The authors declare no conflicts of interest.
P. N. Sovietov
Russian Federation
Peter N. Sovietov, Cand. Sci. (Eng.), Senior Researcher, Laboratory of Specialized Computing Systems
78, Vernadskogo pr., Moscow, 119454 Russia
Scopus Author ID 57221375427
D. V. Lulyava
Russian Federation
Daniil V. Lulyava, Junior Researcher, Laboratory of Specialized Computing Systems
78, Vernadskogo pr., Moscow, 119454 Russia
Scopus Author ID 58811698000
N. A. Duksin
Russian Federation
Nikita A. Duksin, Engineer, Laboratory of Specialized Computing Systems
78, Vernadskogo pr., Moscow, 119454 Russia
Scopus Author ID 58811361100
Supplementary files
1. P-block allocation in a programmable logic device using FPGA architecture. Type: Research instruments (35KB)
- Modifications to a 32-bit CORDIC transcendental function computation pipeline were developed.
- Adding design constraints on the placement of the register groups that correspond to the pipeline stages achieves a significant increase in clock frequency compared with automatic placement and reduces the running time of the routing algorithms.
Review
For citations:
Tarasov I.E., Sovietov P.N., Lulyava D.V., Duksin N.A. Method for designing specialized computing systems on the basis of hardware and software cooptimization. Russian Technological Journal. 2025;13(3):44-53. https://doi.org/10.32362/2500-316X-2025-13-3-44-53. EDN: QWXGNC