Method for designing specialized computing systems on the basis of hardware and software cooptimization

I. E. Tarasov; P. N. Sovietov; D. V. Lulyava; N. A. Duksin

doi:10.32362/2500-316X-2025-13-3-44-53

EDN: QWXGNC

Abstract

Objectives. Pipelining is an effective method for increasing the clock frequency of digital circuits. However, balancing the pipeline stages during circuit synthesis at the register transfer level does not by itself guarantee that the topological implementation of the pipeline in the selected technological basis will be balanced in terms of signal propagation delays. This is because the algorithms that place and route the components of digital devices cannot optimize solutions in a strict mathematical sense in an acceptable time. In practice, results close to optimal are obtained by combining manual control of topological constraints, which set general rules for placing components, with automatic optimization of localized fragments of the circuit. Because the individual stages of a pipeline are connected in a simple pattern, pipeline circuits are a convenient example for demonstrating the effect of topological design constraints. Pipeline structures can also be used to implement a number of algorithms that effectively complement programmable processor devices and provide hardware acceleration of some tasks. The present work develops methodological recommendations for managing topological design constraints in the implementation of pipeline computing structures based on programmable logic devices (PLD) with field-programmable gate array (FPGA) architecture.
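The point about stage balancing can be illustrated with a minimal sketch. It assumes a simplified timing model in which the clock period is bounded by the slowest stage, and each stage delay is the sum of a logic delay and a routing delay. The function name and all delay values are hypothetical and are not taken from the article.

```python
def max_clock_mhz(stage_delays_ns):
    """Return the clock frequency (MHz) allowed by the slowest pipeline stage.

    stage_delays_ns is a list of (logic_delay_ns, routing_delay_ns) pairs,
    one pair per stage. This is a simplified model (assumption): setup time,
    clock skew and jitter are ignored.
    """
    critical_ns = max(logic + routing for logic, routing in stage_delays_ns)
    return 1000.0 / critical_ns


# Hypothetical numbers: all stages have the same logic delay (balanced at RTL),
# but one stage receives a long route after automatic placement.
balanced_estimate = [(2.0, 0.5), (2.0, 0.5), (2.0, 0.5)]
after_auto_placement = [(2.0, 0.5), (2.0, 2.0), (2.0, 0.6)]

print(max_clock_mhz(balanced_estimate))
print(max_clock_mhz(after_auto_placement))
```

In this model a single poorly routed stage sets the clock frequency of the whole pipeline, which is why equal logic depth per stage is not sufficient on its own.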
Methods. The work is based on accepted methods for designing and modeling digital systems.
Results. Based on the analysis, modifications to a 32-bit CORDIC transcendental function computation pipeline were developed. Adding design constraints on the placement of the register groups that correspond to the pipeline stages yields a significant increase in clock frequency compared with automatic placement and reduces the running time of the routing algorithms. The effect is systematically reproduced in several implemented versions of the pipeline.
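For readers unfamiliar with CORDIC, the following is a minimal software sketch of a rotation-mode CORDIC in which each loop iteration corresponds to one pipeline stage. It is not the authors' design: the fixed-point format (30 fraction bits within a 32-bit signed word), the number of stages, and all names are assumptions chosen for illustration.

```python
import math

FRAC_BITS = 30            # assumed format: values fit in a signed 32-bit word
STAGES = 30               # assumed number of pipeline stages
ONE = 1 << FRAC_BITS

# Per-stage rotation angles atan(2^-i), stored as fixed-point constants.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(STAGES)]

# CORDIC gain compensation: product of 1/sqrt(1 + 2^-2i) over all stages.
_gain = 1.0
for _i in range(STAGES):
    _gain /= math.sqrt(1.0 + 2.0 ** (-2 * _i))
K_FIXED = round(_gain * ONE)


def cordic_stage(i, x, y, z):
    """One stage: a shift-and-add micro-rotation steered by the sign of z."""
    if z >= 0:
        return x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
    return x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]


def cordic_sin_cos(angle_rad):
    """Compute (sin, cos) for |angle_rad| within the CORDIC convergence
    range (about 1.74 rad) by passing the data through every stage in order."""
    x, y, z = K_FIXED, 0, round(angle_rad * ONE)
    for i in range(STAGES):
        x, y, z = cordic_stage(i, x, y, z)
    return y / ONE, x / ONE


s, c = cordic_sin_cos(0.5)
print(s, c)
```

In hardware, the tuple (x, y, z) produced by each stage would be held in a register group, and it is the placement of such per-stage register groups that the abstract describes constraining.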
Conclusions. The presented recommendations can be used to control the clock frequency and the number of stages of pipeline computing structures while simultaneously reducing the time of a single placement and routing iteration for a module implemented on a PLD with FPGA architecture.
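One possible way to express per-stage placement constraints is sketched below. The abstract does not name a toolchain, so this sketch assumes Xilinx Vivado-style Pblock constraints (`create_pblock`, `add_cells_to_pblock`, `resize_pblock`). The cell name patterns, Pblock names, and slice ranges are hypothetical, and the helper only generates constraint text; it does not reproduce the authors' constraints.

```python
def pblock_constraints(stage_cells, stage_regions):
    """Build Vivado-style Pblock constraint lines, one Pblock per stage.

    stage_cells   -- cell name patterns for the register group of each stage
                     (hypothetical names)
    stage_regions -- slice ranges for each stage (hypothetical coordinates)
    """
    lines = []
    for idx, (cells, region) in enumerate(zip(stage_cells, stage_regions)):
        name = f"pblock_stage{idx}"
        lines.append(f"create_pblock {name}")
        lines.append(
            f"add_cells_to_pblock [get_pblocks {name}] [get_cells {{{cells}}}]"
        )
        lines.append(f"resize_pblock [get_pblocks {name}] -add {{{region}}}")
    return "\n".join(lines)


# Hypothetical two-stage example: adjacent regions for consecutive stages.
constraints = pblock_constraints(
    ["cordic/stage0_reg*", "cordic/stage1_reg*"],
    ["SLICE_X0Y0:SLICE_X15Y9", "SLICE_X16Y0:SLICE_X31Y9"],
)
print(constraints)
```

Placing consecutive stages in adjacent regions is one plausible general rule of the kind the abstract mentions; the actual region sizes and layout would depend on the target device and on the design.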

Keywords

PLD, pipeline, constraints, CORDIC

About the Authors

I. E. Tarasov

MIREA – Russian Technological University
Russian Federation

Ilya E. Tarasov, Dr. Sci. (Eng.), Associate Professor, Head of the Laboratory of Specialized Computing Systems
78, Vernadskogo pr., Moscow, 119454 Russia
Scopus Author ID 57213354150

Competing Interests:

The authors declare no conflicts of interest.

P. N. Sovietov

MIREA – Russian Technological University
Russian Federation

Peter N. Sovietov, Cand. Sci. (Eng.), Senior Researcher, Laboratory of Specialized Computing Systems
78, Vernadskogo pr., Moscow, 119454 Russia
Scopus Author ID 57221375427

D. V. Lulyava

MIREA – Russian Technological University
Russian Federation

Daniil V. Lulyava, Junior Researcher, Laboratory of Specialized Computing Systems
78, Vernadskogo pr., Moscow, 119454 Russia
Scopus Author ID 58811698000

N. A. Duksin

MIREA – Russian Technological University
Russian Federation

Nikita A. Duksin, Engineer, Laboratory of Specialized Computing Systems
78, Vernadskogo pr., Moscow, 119454 Russia
Scopus Author ID 58811361100

Supplementary files

	1. P-block allocation in a programmable logic device using FPGA architecture
	Type	Research instruments

For citations:

Tarasov I.E., Sovietov P.N., Lulyava D.V., Duksin N.A. Method for designing specialized computing systems on the basis of hardware and software cooptimization. Russian Technological Journal. 2025;13(3):44-53. https://doi.org/10.32362/2500-316X-2025-13-3-44-53. EDN: QWXGNC

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 2782-3210 (Print)
ISSN 2500-316X (Online)

Supplementary file 1 (P-block allocation in a programmable logic device using FPGA architecture): dated 2025-07-19
