Hardware Synthesis from Simulink and CUDA Models
Using FPGAs as hardware accelerators that communicate with a central CPU is becoming common practice in the embedded design world, but there is not yet a standard methodology and toolset to support this path. In contrast, languages such as CUDA and OpenCL provide standard development environments for Graphics Processing Unit (GPU) programming. FASTCUDA is a platform that provides the software toolset, hardware architecture, and design methodology needed to adapt the CUDA approach efficiently into a new FPGA design flow. With FASTCUDA, the kernels of a CUDA-based application are partitioned into two groups with minimal user intervention: those that are compiled and executed in parallel software, and those that are synthesized and implemented in hardware. A modern low-power FPGA can provide both the processing power (via numerous embedded micro-CPUs) for the software implementations and the logic capacity for the hardware implementations of the CUDA kernels. The synthesis flow uses an intermediate translation into SystemC and exploits modern C-to-RTL synthesis tools.
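As a rough illustration of what makes a CUDA kernel amenable to either software execution or C-based synthesis, the sketch below writes a kernel body per logical thread and then runs it through explicit block and thread loops in plain C++. This is a generic serialization idea, not FASTCUDA's actual translation; the names `Dim`, `vec_add_thread`, and `launch_vec_add` are hypothetical and the index space is restricted to one dimension for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D stand-in for CUDA's blockIdx / blockDim / threadIdx.
struct Dim { std::size_t x; };

// Kernel body written for one logical thread, as in a CUDA __global__ function.
inline void vec_add_thread(Dim blockIdx, Dim blockDim, Dim threadIdx,
                           const float* a, const float* b, float* c,
                           std::size_t n) {
    const std::size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {            // guard threads that fall past the end of the data
        c[i] = a[i] + b[i];
    }
}

// Serialized "launch": the grid and block become two nested loops.
// A software target could distribute the outer loop over several cores;
// a synthesis tool could unroll or pipeline the inner loop (assumption).
inline void launch_vec_add(std::size_t grid_dim, std::size_t block_dim,
                           const std::vector<float>& a,
                           const std::vector<float>& b,
                           std::vector<float>& c) {
    assert(a.size() == b.size() && a.size() == c.size());
    for (std::size_t blk = 0; blk < grid_dim; ++blk) {
        for (std::size_t thr = 0; thr < block_dim; ++thr) {
            vec_add_thread(Dim{blk}, Dim{block_dim}, Dim{thr},
                           a.data(), b.data(), c.data(), a.size());
        }
    }
}
```

Calling `launch_vec_add(2, 4, a, b, c)` on five-element vectors covers eight logical threads, three of which are masked off by the bounds check, mirroring how an oversized CUDA grid is normally handled.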
A related approach being explored in our group is the synthesis of efficient hardware implementations from Simulink models, also based on translation to SystemC followed by high-level synthesis. We developed a modeling strategy based on a representation that makes it easy to explore different micro-architectures. It also accurately models different datapath bit widths and arithmetic overflow/saturation modes in a single C model, which is both compatible with the S-function Simulink modeling style and amenable to efficient hardware synthesis.
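To make the idea of a single model with configurable bit width and overflow behavior concrete, here is a minimal sketch in C++. It is not the group's actual modeling library: the names `Overflow`, `quantize`, and `add_fixed` are hypothetical, and only signed two's-complement integers of 2 to 32 bits are handled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical overflow modes, loosely mirroring "wrap" and "saturate"
// options found in fixed-point block settings.
enum class Overflow { Wrap, Saturate };

// Reduce a wide intermediate value to a signed field of `bits` bits.
inline std::int64_t quantize(std::int64_t v, int bits, Overflow mode) {
    assert(bits >= 2 && bits <= 32);
    const std::int64_t max = (std::int64_t{1} << (bits - 1)) - 1;
    const std::int64_t min = -max - 1;
    if (mode == Overflow::Saturate) {
        if (v > max) return max;   // clamp at the largest representable value
        if (v < min) return min;   // clamp at the smallest representable value
        return v;
    }
    // Wrap: keep the low `bits` bits, then sign-extend them.
    const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
    const std::uint64_t low = static_cast<std::uint64_t>(v) & mask;
    if (low & (std::uint64_t{1} << (bits - 1))) {
        return static_cast<std::int64_t>(low) - (std::int64_t{1} << bits);
    }
    return static_cast<std::int64_t>(low);
}

// Add two operands in wide precision, then apply the datapath width and mode.
inline std::int64_t add_fixed(std::int64_t a, std::int64_t b,
                              int bits, Overflow mode) {
    return quantize(a + b, bits, mode);
}
```

With an 8-bit datapath, `add_fixed(100, 100, 8, Overflow::Saturate)` clamps to 127, while the same call with `Overflow::Wrap` yields -56. Keeping width and mode as parameters lets one source model be evaluated under several datapath choices without rewriting the arithmetic.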
More information is available here.
Ten papers on this subject have been presented at international conferences.