Improving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations

Information

  • Paper: Improving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations
  • Author: Wei Zuo
  • Keywords: HLS, polyhedral

Background

The optimizations fall into two classes:
1. Intra-block: optimize the hardware implementation within a code block.
2. Inter-block: optimize communication and pipelining between code blocks.
Why this matters:
Real-world applications consist of data-dependent blocks of code that communicate through complex data access patterns. Existing high-level synthesis tools cannot apply these optimizations unless the code is inherently compatible with them.

Both intra-block parallelization and inter-block pipelining are supported by existing HLS tools, but these optimizations cannot always be enabled with the default data access patterns. The goal of this work is therefore to enable efficient, integrated use of intra- and inter-block optimizations through loop transformations.
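To make the access-pattern problem concrete, here is a minimal sketch (my own illustration, not taken from the paper). It assumes two dependent blocks over an n×m array: a producer that writes in row-major order and a consumer that reads column by column. All function names are hypothetical. A plain FIFO only works when the two orders match; otherwise a reorder buffer is needed, and `peak_buffer` estimates how large it gets under a simple one-element-per-step model.

```python
def row_major(n, m):
    """Order in which `for i: for j:` touches A[i][j]."""
    return [(i, j) for i in range(n) for j in range(m)]


def col_major(n, m):
    """Order in which `for j: for i:` touches A[i][j]."""
    return [(i, j) for j in range(m) for i in range(n)]


def fifo_compatible(write_order, read_order):
    """A plain FIFO delivers elements in the order they were written,
    so the consumer has to read them in exactly that order."""
    return write_order == read_order


def peak_buffer(write_order, read_order):
    """Peak number of elements held between the blocks (toy model).

    The producer writes one element per step; after each write the
    consumer drains every element it can take in its own read order.
    """
    buffered = set()
    next_read = 0
    peak = 0
    for elem in write_order:
        buffered.add(elem)
        peak = max(peak, len(buffered))
        while next_read < len(read_order) and read_order[next_read] in buffered:
            buffered.remove(read_order[next_read])
            next_read += 1
    return peak


if __name__ == "__main__":
    producer = row_major(4, 4)
    # Default consumer: column-wise reads, so the orders do not match.
    print(fifo_compatible(producer, col_major(4, 4)),
          peak_buffer(producer, col_major(4, 4)))
    # After interchanging the consumer's loops the orders match.
    print(fifo_compatible(producer, row_major(4, 4)),
          peak_buffer(producer, row_major(4, 4)))
```

Under this model, interchanging the consumer's loops is what turns a large intermediate buffer into a simple streaming connection, which is the kind of opportunity the paper tries to expose automatically.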

Work

This paper presents an integrated technique that uses the polyhedral model to describe and enable both intra- and inter-block optimizations:

  • An automated polyhedral model-based framework that systematically identifies effective access patterns and applies appropriate loop transformations that enable intra- and inter-block optimizations.
  • An automated framework to generate communication interfaces between blocks.

Optimization Framework:


Goal: minimize the overall latency (i.e., maximize the performance speedup)
Steps:

① Classification of Array Access Patterns:
The array access pattern is defined by a matrix M:

The access patterns are then classified based on the values of a1, a2, b1 and b2:

The associated loop transformations are then derived. (A transformation function T is defined here that derives the loop transformation required for a given data access pattern. I did not fully follow this part.)
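As a way to get some intuition for M and T, here is a small sketch based on standard polyhedral algebra rather than on the paper's exact definitions. It assumes (my assumption, since the matrix itself is not reproduced in these notes) that M is the 2×2 matrix with rows (a1, a2) and (b1, b2), mapping the iteration vector (i, j) to the array index (a1·i + a2·j, b1·i + b2·j). If a unimodular loop transformation T maps old iterations to new ones, the access matrix in the new loop nest is M·T⁻¹. All function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 2x2 integer matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def mat_vec(m, v):
    """Apply a 2x2 matrix to an iteration vector (i, j)."""
    return tuple(sum(m[r][k] * v[k] for k in range(2)) for r in range(2))


def inverse_unimodular(t):
    """Integer inverse of a 2x2 matrix with determinant +1 or -1."""
    (p, q), (r, s) = t
    det = p * s - q * r
    if det not in (1, -1):
        raise ValueError("T must be unimodular (det = +1 or -1)")
    # 1/det equals det when det is +1 or -1, so no fractions are needed.
    return [[det * s, -det * q], [-det * r, det * p]]


def transform_access(m, t):
    """Access matrix seen by the transformed loop nest: M * T^-1."""
    return mat_mul(m, inverse_unimodular(t))


INTERCHANGE = [[0, 1], [1, 0]]   # swap the two loops
SKEW = [[1, 0], [1, 1]]          # new inner index = i + j

# Assumed example: A[j][i] inside `for i: for j:` walks the array column-wise.
COLUMN_WISE = [[0, 1], [1, 0]]

if __name__ == "__main__":
    # After loop interchange the same statement walks the array row-wise.
    print(transform_access(COLUMN_WISE, INTERCHANGE))
```

The transformation does not change which elements are touched, only the order: for every iteration `it`, `mat_vec(M, it)` equals `mat_vec(transform_access(M, T), mat_vec(T, it))`.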

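② Performance Estimation:

To perform this evaluation, the authors develop a performance metric that combines modeling of both intra- and inter-block speedup and their associated implementation overhead.

Roughly, each program is defined as a sequence of K blocks, where each block contains a d-dimensional loop nest that accesses an n-dimensional array. The cost is then analyzed according to the different slopes. The model performs coarse-grained optimization of intra-block parallelization and inter-block pipelining at a global scope. In addition, the design space can be pruned fairly easily at this stage based on data dependences.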
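The following toy model is not the paper's metric; it is a sketch of the general shape of such an estimate under my own simplifying assumptions. Each block has a total iteration count and an intra-block parallelization factor, consumers start a fixed number of cycles after their producer, and a candidate loop transformation is pruned when it reverses a dependence (the standard lexicographic-positivity test on distance vectors). The names `Block`, `start_delays` and `is_legal` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    iterations: int    # total loop iterations in the block
    parallel: int = 1  # assumed intra-block parallelization factor

    def latency(self):
        # Ceiling division: cycles needed with `parallel` iterations per cycle.
        return -(-self.iterations // self.parallel)


def sequential_latency(blocks):
    """No inter-block pipelining: blocks run one after another."""
    return sum(b.latency() for b in blocks)


def pipelined_latency(blocks, start_delays):
    """Block k starts start_delays[k] cycles after block k-1 starts and
    cannot finish before its producer has finished."""
    start = 0
    finish = 0
    for block, delay in zip(blocks, start_delays):
        start += delay
        finish = max(finish, start + block.latency())
    return finish


def mat_vec(t, d):
    """Apply a 2x2 transformation matrix to a dependence distance vector."""
    return tuple(sum(t[r][k] * d[k] for k in range(2)) for r in range(2))


def is_legal(t, dependences):
    """A transformation is legal if every transformed dependence distance
    vector stays lexicographically positive."""
    for d in dependences:
        first_nonzero = next((x for x in mat_vec(t, d) if x != 0), 0)
        if first_nonzero <= 0:
            return False
    return True


if __name__ == "__main__":
    blocks = [Block(1000), Block(1000), Block(1000)]
    print(sequential_latency(blocks), pipelined_latency(blocks, [0, 10, 10]))
    interchange = [[0, 1], [1, 0]]
    # Interchange keeps (1, 0) forward but reverses (1, -1), so it is pruned.
    print(is_legal(interchange, [(1, 0)]), is_legal(interchange, [(1, -1)]))
```

Even in this crude form the two effects are visible: the parallelization factor shortens each block, while small start delays (which depend on matching access patterns) let the blocks overlap.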

③ Implementation:
The framework automatically performs the loop transformations, inserts high-level synthesis directives, and generates FIFO interfaces between computation and communication blocks. (If a communication block requires a large communication buffer and a complex multi-read communication interface, customized communication blocks are inserted automatically.)
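To illustrate what "inserting directives and generating FIFO interfaces" can look like in a source-to-source flow, here is a tiny text emitter. It is a sketch under my own assumptions, not the paper's generator: the emitted C++ uses the `#pragma HLS pipeline`, `#pragma HLS dataflow` and `hls::stream` spellings from Vivado/Vitis HLS, and AutoPilot's own directive syntax may differ. The function names `emit_stage` and `emit_top` and the stage bodies are hypothetical.

```python
def emit_stage(name, trip_count, expr):
    """Return C++ text for one streaming block: read a value from the input
    FIFO, apply `expr` (a C expression over `x`), write the result out."""
    return "\n".join([
        f"void {name}(hls::stream<int> &in, hls::stream<int> &out) {{",
        f"  for (int i = 0; i < {trip_count}; ++i) {{",
        "#pragma HLS pipeline II=1",
        "    int x = in.read();",
        f"    out.write({expr});",
        "  }",
        "}",
    ])


def emit_top(stage_names):
    """Return C++ text for a top function that chains the stages through
    FIFOs and marks the region for task-level pipelining."""
    lines = ["void top(hls::stream<int> &in, hls::stream<int> &out) {",
             "#pragma HLS dataflow"]
    fifos = [f"fifo_{k}" for k in range(len(stage_names) - 1)]
    for fifo in fifos:
        lines.append(f"  hls::stream<int> {fifo};")
    ports = ["in"] + fifos + ["out"]
    for k, name in enumerate(stage_names):
        lines.append(f"  {name}({ports[k]}, {ports[k + 1]});")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print("#include <hls_stream.h>\n")
    print(emit_stage("scale", 1024, "2 * x"))
    print(emit_stage("offset", 1024, "x + 1"))
    print(emit_top(["scale", "offset"]))
```

The emitted code only streams correctly if producer and consumer agree on element order, which is why the loop transformations from step ① have to be applied before this step.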

They integrated their framework into the PoCC polyhedral framework and then modified it to automatically produce source code compatible with their chosen HLS tool, AutoPilot. (AutoPilot is the AutoESL tool that later became Xilinx Vivado HLS, the predecessor of Vitis HLS; optimizations are applied through the set of directives the tool provides.)
(PoCC is a source-to-source compiler that bundles a set of tools for polyhedral compilation. It extracts the polyhedral intermediate representation at the source-code level.)

Key point: they automatically increase the number of situations in which the directives can be used.

Tags: notes, paper, FPGA