TY - GEN
T1 - Load Characterization of AI Applications using DQoES Scheduler for Serving Multiple Requests
AU - Putra, Taufiq Odhi Dwi
AU - Ijtihadie, Royyana Muslim
AU - Ahmad, Tohari
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Many types of Artificial Intelligence (AI)-based applications are being developed to meet a variety of needs, for example counting objects recorded by a camera, identifying diseases from MRI images, and predicting traffic congestion levels at specific times. One way to provision infrastructure resources that match the workload of AI-based applications is to understand the patterns or characteristics of those workloads. Because an AI model runs on a Graphics Processing Unit (GPU), several parts of the model's architecture use Video Random Access Memory (VRAM) as temporary storage to shorten the running time. This paper analyzes the workload characteristics of AI-based applications in terms of running time and VRAM usage. Experiments are conducted in two request scenarios, sequential and concurrent, using four variants of the Super Resolution Generative Adversarial Network (SRGAN) model: no prune, random unstructured, L1 norm, and L2 norm. The experimental results show that the workloads of all four SRGAN variants are almost the same in the sequential request scenario, whereas they differ in the concurrent request scenario. Some models are processed more effectively one at a time than several at once, for example the SRGAN no prune model, while others are processed more effectively several at once than one at a time, for example the SRGAN random unstructured and L2 norm models.
KW - Application
KW - Artificial Intelligence
KW - Load Characterization
KW - Multiple Requests
KW - Scheduler
UR - http://www.scopus.com/inward/record.url?scp=85194095605&partnerID=8YFLogxK
U2 - 10.1109/ISDFS60797.2024.10527227
DO - 10.1109/ISDFS60797.2024.10527227
M3 - Conference contribution
AN - SCOPUS:85194095605
T3 - 12th International Symposium on Digital Forensics and Security, ISDFS 2024
BT - 12th International Symposium on Digital Forensics and Security, ISDFS 2024
A2 - Varol, Asaf
A2 - Karabatak, Murat
A2 - Varol, Cihan
A2 - Tuba, Eva
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 12th International Symposium on Digital Forensics and Security, ISDFS 2024
Y2 - 29 April 2024 through 30 April 2024
ER -