Unified Parallel Runtime
Contemporary mobile architectures integrate a growing number of programmable accelerators, such as GPUs, DSPs, and FPGAs. These chips also ship with support for programming frameworks such as OpenCL, which are used to program the accelerators. Meanwhile, more mobile applications, such as computer vision and deep neural network applications, demand substantial computation resources to meet their performance requirements. When multiple applications vie for acceleration on specific compute resources, the result can be resource contention and inefficient resource utilization. Hence, we propose a unified parallel runtime framework that centrally manages the acceleration task requests from applications. The runtime schedules each task to the best-suited compute resource according to application and system requirements.
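The idea of centralized scheduling can be illustrated with a minimal sketch. It is not the proposed runtime: the class names, the per-device runtime estimates, and the "earliest estimated finish time" policy are all assumptions chosen for illustration, since the text above does not specify a scheduling policy.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str                    # illustrative label, e.g. "GPU", "DSP", "CPU"
    busy_until_ms: float = 0.0   # time at which already-queued work finishes


@dataclass
class Task:
    app: str
    # Hypothetical per-device runtime estimates in milliseconds.
    # A device missing from this map is treated as unable to run the task.
    est_ms: dict


class CentralScheduler:
    """Single entry point for acceleration requests from all applications."""

    def __init__(self, devices):
        self.devices = {d.name: d for d in devices}

    def submit(self, task, now_ms=0.0):
        # Assumed policy: pick the device with the earliest estimated finish
        # time, counting work that is already queued on that device.
        best, best_finish = None, float("inf")
        for name, cost in task.est_ms.items():
            dev = self.devices.get(name)
            if dev is None:
                continue
            finish = max(dev.busy_until_ms, now_ms) + cost
            if finish < best_finish:
                best, best_finish = dev, finish
        if best is None:
            raise ValueError("no device can run this task")
        best.busy_until_ms = best_finish  # reserve the device until then
        return best.name, best_finish


sched = CentralScheduler([Device("GPU"), Device("DSP"), Device("CPU")])
estimates = {"GPU": 10.0, "DSP": 25.0, "CPU": 60.0}
placements = [sched.submit(Task("vision", dict(estimates))) for _ in range(3)]
print(placements)
```

With these made-up estimates, the first two requests are placed on the GPU, while the third is placed on the DSP because the GPU's queue has grown long enough that the slower device would finish sooner. Because the scheduler sees every request, it can spread load in this way rather than letting each application target the same accelerator independently.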