The task engine is a service for running batch jobs in the background. It performs a variety of functions primarily covering:
- Publications and Printing: handles jobs submitted for publication via the scheduling engine (through "Publish") or through user-initiated printing. The task engine performs all the relevant processing and rendering.
- Data Preparation and ETL: handles all tasks and jobs that are submitted for data processing.
- Other background functions like cleaning operations, authentication synchronization and a variety of other system level operations.
Data source performance is heavily affected by concurrency and the number of requests processed at any given time. Background tasks can heavily impact performance for end users if they compete for those resources.
To ensure optimal performance for users during normal business hours, it's best to set the time period per day that represents "peak hours". Using this time split, it's possible to ensure that fewer jobs are executed by the task engine during peak times, giving end users better access to system resources when they are actively using the system.
Enter peak hours for each day in the Peak Hours panel.
Settings can be configured for the entire platform in the mid panel, or per task engine in the cluster (if running in multi-server mode).
Select how the cluster will allocate resources to each task engine.
- Manual: the admin manually sets which task servers will perform which activities and the number of threads to run concurrently in peak and off-peak times.
- Automatic - Percentage: this lets the admin assign a percentage "coverage" that each task activity should have across all the task servers in the cluster and lets the engine automatically assign resources accordingly.
Typically, if the cluster is built manually, with additional nodes added on an irregular basis, the manual model is preferred because admins are given control over how task activities are allocated. If the cluster shrinks and grows automatically (with a Kubernetes deployment, for example), then the manual approach is infeasible and the automatic approach must be used instead.
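To illustrate why a percentage scales with the cluster while a fixed manual assignment does not, here is a minimal sketch. It assumes that a "coverage" percentage means the share of task servers that should run a given activity, rounded up. The function name, the activity names, and the rounding rule are hypothetical and are not taken from the product.

```python
import math

def servers_for_coverage(server_count: int, coverage_pct: dict[str, int]) -> dict[str, int]:
    """Return how many task servers should run each activity.

    Assumption (hypothetical): coverage is a whole-number percentage of the
    servers in the cluster, rounded up so that any non-zero coverage gets
    at least one server.
    """
    if server_count < 0:
        raise ValueError("server_count must not be negative")
    result = {}
    for activity, pct in coverage_pct.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"coverage for {activity!r} must be between 0 and 100")
        result[activity] = math.ceil(server_count * pct / 100)
    return result

# Hypothetical activity names and coverage values.
coverage = {"publish_print": 50, "data_etl": 100}

# The same percentages produce a sensible allocation as the cluster resizes.
for size in (2, 5):
    print(size, servers_for_coverage(size, coverage))
```

With these example values, a two-server cluster assigns one server to publishing and printing, and a five-server cluster assigns three, without any admin intervention when nodes are added or removed.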
- Peak threads / Off peak threads: these numbers determine how many tasks the task engine(s) can run during peak and off-peak hours, as set for each day of the week. Generally, the peak thread count should be low enough to avoid resource competition with users, while the off-peak thread count can be increased to maximize resource usage when users are not on the system (both off hours and off days).
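The relationship between the peak hours schedule and the two thread settings can be sketched as follows. This is an illustration only: the schedule structure, the `allowed_threads` function, and the example hours are hypothetical, and the sketch assumes a day with no peak window (an "off day") is treated as off-peak all day.

```python
from datetime import datetime, time

# Hypothetical schedule: weekday index (Monday = 0) -> (start, end) of peak hours.
# Saturday and Sunday have no entry, so they count as off days.
PEAK_HOURS = {day: (time(8, 0), time(18, 0)) for day in range(5)}

def allowed_threads(now: datetime, peak_threads: int, off_peak_threads: int,
                    peak_hours: dict = PEAK_HOURS) -> int:
    """Return the thread limit that applies at the given moment."""
    window = peak_hours.get(now.weekday())
    if window is not None:
        start, end = window
        # The end time is treated as exclusive in this sketch.
        if start <= now.time() < end:
            return peak_threads
    return off_peak_threads

# Example: 2 threads during peak hours, 8 threads otherwise.
print(allowed_threads(datetime(2024, 1, 3, 10, 0), 2, 8))  # Wednesday morning
print(allowed_threads(datetime(2024, 1, 3, 22, 0), 2, 8))  # Wednesday night
print(allowed_threads(datetime(2024, 1, 6, 10, 0), 2, 8))  # Saturday morning
```

In this example the Wednesday morning call returns the peak limit, while the Wednesday night and Saturday calls return the off-peak limit.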
Task Type: In a multi-server configuration, it's possible to designate which task engine will run which type of task: Print / Publish, Data processing / ETL, or both. Designating task type by server gives admins control over resource allocation, performance settings (at the hardware level), and system throughput.