Accelerating HLS Autotuning of Large, Highly-parameterized Reconfigurable SoC Mappings


Degree type

Doctor of Philosophy (PhD)

Graduate group

Electrical and Systems Engineering

Discipline

Electrical Engineering

Subject

Bayesian optimization
Design space exploration
High-level synthesis
Reconfigurable computing

Copyright date

2023

Abstract

High-level synthesis has accelerated the adoption of autotuners that explore the design spaces of applications mapped onto systems-on-chip with reconfigurable logic. The design-space size grows exponentially with the number of design parameters, and building a single configuration of a full application can easily take hours, so existing autotuners are frequently demonstrated on small kernels and small design spaces to keep the problem tractable. This dissertation shows that applications with more than 30 parameters, mapped onto reconfigurable SoCs with more than 200k LUTs, can be explored in less than 12 build times on an 8-core host using the model-based approach we refine. We explore several techniques to reduce tuning time. At the heart of our tuner is an iterative refinement loop that builds a prediction model of the design space. Our models are multi-fidelity models, which enable the discontinuation of unpromising builds in multi-stage CAD flows. We organize the build resources into a pipeline to improve tuning performance and increase the utilization of build resources. Build failures are mitigated in several ways: invalid accelerator configurations are replaced with valid ones on the fly, and routing errors caused by congestion are mitigated through congestion models. Because the curse of dimensionality quickly degrades performance as the number of parameters increases, we apply dimensionality reduction to focus on the most important parameters. To validate our approach, we injected 32-46 parameters, ranging from pragmas to CAD-tool parameters, into the Rosetta benchmarks. Compared to OpenTuner, our tuner succeeds 71% more often at finding mappings onto the ZCU102 within 12 hours, and the mappings it finds are 3.5x faster. Alternatively, we observed that tuning runs are on average at least 8.8x shorter.
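To illustrate the flavor of the approach the abstract describes, the sketch below shows an iterative refinement loop with a multi-fidelity gate: a cheap early-stage estimate ranks candidate configurations and discontinues builds whose estimate is already worse than the best fully built result. This is not the dissertation's implementation; the two-parameter design space and the cost functions are hypothetical stand-ins (a real flow would use 32-46 parameters and actual HLS/CAD runs).

```python
# Hypothetical 2-parameter design space (e.g., loop unroll factor and
# array partition factor); the dissertation injects 32-46 parameters.
SPACE = [(u, p) for u in (1, 2, 4, 8) for p in (1, 2, 4)]

def stage1_estimate(cfg):
    # Cheap low-fidelity metric (e.g., a post-HLS latency estimate);
    # stand-in for the tuner's early-stage prediction model.
    unroll, part = cfg
    return 100.0 / (unroll * part) + 2.0 * part

def full_build(cfg):
    # Expensive full build through place and route; stand-in for a
    # multi-hour CAD run measuring the real cost (lower is better).
    unroll, part = cfg
    return 100.0 / (unroll * part) + 2.0 * part + 0.5 * unroll

def tune(iterations=6):
    observations = {}  # cfg -> measured cost (inf means pruned early)
    best_cfg, best_cost = None, float("inf")
    for _ in range(iterations):
        # Model-based proposal: pick the unexplored configuration that
        # the cheap estimate ranks best.
        candidates = [c for c in SPACE if c not in observations]
        if not candidates:
            break
        cfg = min(candidates, key=stage1_estimate)
        # Multi-fidelity gate: discontinue builds whose early-stage
        # estimate cannot beat the best completed build.
        if stage1_estimate(cfg) >= best_cost:
            observations[cfg] = float("inf")  # pruned, never fully built
            continue
        cost = full_build(cfg)
        observations[cfg] = cost
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost
```

With these stand-in cost functions, the loop fully builds only the two most promising configurations and prunes the rest, which mirrors how discontinuing unpromising builds in a multi-stage flow saves build time.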

Date of degree

2023
