Release 3.3.0 (2022/01/03)
Taskflow 3.3.0 is the 4th release in the 3.x line! This release includes several new changes, such as data race fixes verified with sanitizers, pipeline parallelism, documentation, and unit tests.
Download
Taskflow 3.3.0 can be downloaded from here.
System Requirements
To use Taskflow v3.3.0, you need a compiler that supports C++17:
- GNU C++ Compiler at least v8.4 with -std=c++17
- Clang C++ Compiler at least v6.0 with -std=c++17
- Microsoft Visual Studio at least v19.27 with /std:c++17
- AppleClang Xcode Version at least v12.0 with -std=c++17
- Nvidia CUDA Toolkit and Compiler (nvcc) at least v11.1 with -std=c++17
- Intel C++ Compiler at least v19.0.1 with -std=c++17
- Intel DPC++ Clang Compiler at least v13.0.0 with -std=c++17 and SYCL20
Taskflow works on Linux, Windows, and Mac OS X.
Release Summary
- This release has resolved data race issues reported by tsan and has incorporated essential sanitizers into the continuous integration workflows to detect data races, illegal memory accesses, and memory leaks in the Taskflow codebase.
- This release has introduced a new pipeline interface (tf::Pipeline) that allows users to create a pipeline scheduling framework for implementing pipeline algorithms.
- This release has introduced a new thread-id mapping algorithm to resolve unexpected thread-local storage (TLS) errors when building Taskflow projects in a shared library environment.
New Features
Taskflow Core
- Changed all lambda operators in parallel algorithms to copy by default
- Cleaned up data race errors in tsan caused by incorrect memory order
- Enhanced scheduling performance by caching tasks in the invoke loop
- Added tf::Task::data to allow associating a task with user-level data
- Added tf::Executor::named_async to allow associating an asynchronous task with a name
- Added tf::Executor::named_silent_async to allow associating a silent asynchronous task with a name
- Added tf::Subflow::named_async to allow associating an asynchronous task with a name
- Added tf::Subflow::named_silent_async to allow associating a silent asynchronous task with a name
- Added multi-conditional tasking to allow a task to jump to multiple successors
- Added tf::Runtime tasking interface to enable in-task scheduling control 
- Added tf::Taskflow::transform to perform parallel-transform algorithms
- Added tf::Graph interface to allow users to create custom module tasks
- Added tf::FlowBuilder::erase to remove a task from the associated graph
cudaFlow
Starting from v3.3, using tf::cudaFlow requires including the header taskflow/cuda/cudaflow.hpp. See Breaking Changes.
syclFlow
This release does not have any update on syclFlow.
Utilities
- Added tf::SmallVector to the documentation 
- Added relax_cpu call to optimize the work-stealing loop
Taskflow Profiler (TFProf)
This release does not have any update on the profiler.
Bug Fixes
- Fixed incorrect static TLS access when building Taskflow in a shared lib
- Fixed memory leak in updating tf::cudaFlowCapturer of undestroyed graph 
- Fixed data race in the object-pool when accessing the heap pointer
- Fixed invalid lambda capture by reference in tf::Taskflow::sort
- Fixed invalid lambda capture by reference in tf::Taskflow::reduce
- Fixed invalid lambda capture by reference in tf::Taskflow::transform_reduce
- Fixed invalid lambda capture by reference in tf::Taskflow::for_each
- Fixed invalid lambda capture by reference in tf::Taskflow::for_each_index
If you encounter any potential bugs, please submit an issue on the issue tracker.
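The capture-by-reference fixes listed above concern a general C++ pitfall with deferred work, which the following standard-library-only sketch illustrates. It is not Taskflow's actual implementation; make_sum_task and deferred_sum are hypothetical names. The point is that a task created inside a builder function runs after that function has returned, so the builder's parameters must be captured by copy.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical task builder: returns deferred work that sums [first, last)
// into out. The returned task runs after this function has returned.
template <typename It>
std::function<void()> make_sum_task(It first, It last, long& out) {
  // A [&] capture here would hold references to the parameters first and
  // last, which are destroyed when this function returns, so the task
  // would later read dangling references. Copying the iterators avoids
  // that; out refers to a caller-owned object, so a reference is safe.
  return [first, last, &out]() {
    out = std::accumulate(first, last, 0L);
  };
}

// Hypothetical driver: builds the task first and runs it afterwards.
long deferred_sum(const std::vector<int>& values) {
  long out = 0;
  std::function<void()> task =
      make_sum_task(values.begin(), values.end(), out);
  assert(out == 0);  // nothing has run yet; the task is deferred
  task();
  return out;
}
```

When a caller does want a task to observe later changes to a variable, wrapping the argument in std::ref makes that intent explicit instead of relying on an implicit reference capture.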
Breaking Changes
To improve compilation speed, you now need to separately include the following files to use specific features and algorithms:
- taskflow/algorithm/reduce.hpp for creating a parallel-reduction task
- taskflow/algorithm/sort.hpp for creating a parallel-sort task
- taskflow/algorithm/transform.hpp for creating a parallel-transform task
- taskflow/algorithm/pipeline.hpp for creating a parallel-pipeline task
- taskflow/cuda/cudaflow.hpp for creating tf::cudaFlow and tf::cudaFlowCapturer tasks
- taskflow/cuda/algorithm/for_each.hpp for creating a single-threaded task on a CUDA GPU
- taskflow/cuda/algorithm/for_each.hpp for creating a parallel-iteration task on a CUDA GPU
- taskflow/cuda/algorithm/transform.hpp for creating a parallel-transform task on a CUDA GPU
- taskflow/cuda/algorithm/reduce.hpp for creating a parallel-reduce task on a CUDA GPU
- taskflow/cuda/algorithm/scan.hpp for creating a parallel-scan task on a CUDA GPU
- taskflow/cuda/algorithm/merge.hpp for creating a parallel-merge task on a CUDA GPU
- taskflow/cuda/algorithm/sort.hpp for creating a parallel-sort task on a CUDA GPU
- taskflow/cuda/algorithm/find.hpp for creating a parallel-find task on a CUDA GPU
Deprecated and Removed Items
This release does not have any deprecated or removed items.
Documentation
- Revised Building and Installing
- Revised Static Tasking
- Revised Composable Tasking
- Revised Conditional Tasking
- Revised GPU Tasking (cudaFlow)
- Revised GPU Tasking (cudaFlowCapturer)
- Revised Limit the Maximum Concurrency
- Revised Parallel Sort to add header-include information
- Revised Parallel Reduction to add header-include information
- Revised cudaFlow Algorithms to add header-include information
- Revised CUDA Standard Algorithms to add header-include information
- Added Interact with the Runtime
- Added Parallel Transforms
- Added Task-parallel Pipeline
Miscellaneous Items
We have published Taskflow in the following venues:
- Tsung-Wei Huang, Dian-Lun Lin, Chun-Xun Lin, and Yibo Lin, "Taskflow: A Lightweight Parallel and Heterogeneous Task Graph Computing System," IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 33, no. 6, pp. 1303-1320, June 2022
- Tsung-Wei Huang, "TFProf: Profiling Large Taskflow Programs with Modern D3 and C++," IEEE International Workshop on Programming and Performance Visualization Tools (ProTools), St. Louis, Missouri, 2021
Please do not hesitate to contact Dr. Tsung-Wei Huang if you intend to collaborate with us on using Taskflow in your scientific computing projects.