ThreadPoolComposer (TPC)

ThreadPoolComposer was superseded in 2016 by TaPaSCo, the Task Parallel System Composer. TPC is no longer supported, since TaPaSCo offers significant improvements over it.
ThreadPoolComposer Flow Overview
ThreadPoolComposer (TPC) is an open-source compilation flow for high-level synthesis, based on Scala/sbt and Xilinx Vivado IP Integrator Tcl, that automates many of the tedious steps needed to produce hardware designs consisting of accelerator pools. It also provides a uniform programming interface, the TPC API, which greatly simplifies the use of these accelerators in C/C++ applications. Currently, TPC supports three platforms:

ThreadPoolComposer System Architecture Overview

With the uniform API layers, the same source code can target all three platforms: you write your application once, then run it on every platform. Using the C++ API headers (also called the TPC++ API), a blocking accelerator call is as easy as:

        ThreadPoolComposer tpc;
        tpc.launch_no_return(12345 /*magic ID for the kernel*/, param1, param2, array1, array2, ...);

Accelerators are connected via AXI4-Lite (control registers) and, optionally, AXI4 master interfaces for memory access. They can be designed in any toolflow. Vivado HLS is supported out of the box: accelerators can be parameterized using abstract descriptions in JSON format and are then built automatically with Vivado HLS. Custom IP cores (e.g., generated by Chisel or Bluespec, or written directly in Verilog/VHDL) are also supported, as long as an IP-XACT description is supplied and the control register map adheres to TPC conventions (see documentation for details).

Design Space Exploration

Automating the construction of the accelerator pools is convenient and gives FPGA beginners a valuable head start, but ThreadPoolComposer also offers a real benefit to more experienced users: TPC can automatically analyze your accelerator library with respect to area utilization and maximal frequency of the cores, and dump the data in CSV format. Based on this data, TPC can perform a simple design space exploration (DSE) in which you can choose to optimize area utilization, frequency, or both. When running in DSE mode, ThreadPoolComposer does not require any user supervision: it iterates automatically over the design space until a design achieves timing closure.
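
The exploration loop described above can be sketched roughly as follows. This is a minimal illustration, not TPC's actual implementation (the toolflow itself is written in Scala): `DesignPoint`, `rank_candidates`, `explore` and the `meets_timing` callback are hypothetical names, and the ranking heuristic (higher frequency first, then more instances) is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical design point; not a TPC data structure.
struct DesignPoint {
    int instances;    // number of accelerator instances in the pool
    double freq_mhz;  // target design frequency in MHz
};

// Orders candidates so the most ambitious ones are tried first:
// higher frequency first, then more instances (assumed heuristic).
inline void rank_candidates(std::vector<DesignPoint>& candidates) {
    std::sort(candidates.begin(), candidates.end(),
              [](const DesignPoint& a, const DesignPoint& b) {
                  if (a.freq_mhz != b.freq_mhz) return a.freq_mhz > b.freq_mhz;
                  return a.instances > b.instances;
              });
}

// Ranks the candidates in place, then tries them in order. Returns the
// index (into the ranked vector) of the first candidate for which
// meets_timing reports success, or -1 if none does. In a real flow the
// callback would run synthesis and place-and-route for that candidate.
inline int explore(std::vector<DesignPoint>& candidates,
                   const std::function<bool(const DesignPoint&)>& meets_timing) {
    assert(meets_timing && "a timing-check callback is required");
    rank_candidates(candidates);
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (meets_timing(candidates[i])) return static_cast<int>(i);
    }
    return -1;
}
```

Because each attempt stands for a full implementation run, the order in which candidates are tried matters: trying the most ambitious design points first means the first success is also the best one under the chosen ranking.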

Even more interesting for accelerator designers: TPC can handle variants of the same core and span the design space across multiple implementations of that core. This is very helpful for evaluating the impact of parameterization in high-level languages (e.g., comparing LUT-RAM vs. BRAM-based buffers, fully unrolled vs. partially unrolled loops, ...).
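
To illustrate how variants enlarge the design space, the sketch below enumerates every composition that picks exactly one variant per core, i.e., the Cartesian product of the variant lists. It is an assumption-laden illustration rather than TPC code: the `Variants` alias, `enumerate_compositions`, and the variant names used in the usage note are all hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical model: each core offers a list of named variants.
using Variants = std::vector<std::string>;

// Returns every combination that selects exactly one variant per core.
// With no cores, the result holds a single empty composition; if any
// core has no variants, the result is empty.
inline std::vector<std::vector<std::string>> enumerate_compositions(
        const std::vector<Variants>& cores) {
    // Start with one empty composition and extend it core by core.
    std::vector<std::vector<std::string>> result(1);
    for (const Variants& variants : cores) {
        std::vector<std::vector<std::string>> next;
        for (const auto& partial : result) {
            for (const auto& variant : variants) {
                std::vector<std::string> extended = partial;
                extended.push_back(variant);
                next.push_back(extended);
            }
        }
        result = next;
    }
    return result;
}
```

For example, a buffer core with the variants `buf_lutram` and `buf_bram` combined with a loop core offering `unroll_full`, `unroll_4` and `unroll_2` yields six compositions. The product grows quickly with the number of variants, which is why an unattended exploration is useful.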

Modern Hardware Description/Construction Languages

ThreadPoolComposer was developed with modern HDLs in mind, such as Chisel or Bluespec. The toolflow is written in Scala and is easily extensible to accommodate custom IP generation passes using modern hardware construction languages as well as classic Verilog/VHDL. The infrastructure provided by TPC (base design, drivers, APIs, ...) spares the user a lot of tedious groundwork, leaving more time for the interesting work. A uniform environment also makes it easier to compare results.


ThreadPoolComposer is free software published under the GNU LGPLv3. Dual licensing is possible; if you are interested in a commercial license, please contact us at
threadpoolcomposer "at"

Academic Work

If ThreadPoolComposer was helpful for your research, please cite us with the following paper:

Jens Korinth, David de la Chevallerie, Andreas Koch
An Open-Source Tool Flow for the Composition of Reconfigurable Hardware Thread Pool Architectures
The 23rd IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), Vancouver, BC, Canada, May 2015
Paper FCCM 2015


We're currently working on providing a public source code repository for ThreadPoolComposer. At the moment, the latest release can be found in archived form below:

Latest Release
The latest public release is 2016.03: ThreadPoolComposer-2016.03.tar.xz

SD Card Images
We provide ready-to-use SD card images for the ZedBoard and the ZC706, based on ArchLinuxARM. They are not required, but provide a quick starting point on the Zynq boards. See the archives for further instructions.

Support / Contact

If you run into problems, have feature requests, or would like to contribute code to TPC, don't hesitate to contact us at
threadpoolcomposer "at"

-- Main.jk - 31 Mar 2016