Kara Bembridge
March 22, 2024
One of the largest trade fairs of its kind, and a global platform for the embedded community, Embedded World 2024 will be taking place next month at the NürnbergMesse in the quaint city of Nuremberg, Germany. Mark your calendars from April 9 to 11 to get a first-hand glance at the developments that are shaping the embedded industry as we know it.
Representing the open source piece of the puzzle, Collabora will once again be exhibiting in Hall 4, Booth 4-404. We're eager to demonstrate what we've been up to with multiple demos at hand, including:
WhisperFusion, the most advanced AI Front Counter/Drive-Thru sales agent. Powered by WhisperLive, WhisperSpeech, Phi-2, torch.compile, and AMD.
Machine learning video analysis on STMicroelectronics' STM32MP2 platform using an upstream-ready H.264 encoder (V4L2) and GStreamer's all-new analytics metadata framework
In addition to our booth and demonstrations, Marcus Edel will be taking part in the Embedded Vision & Edge AI track at the Embedded World Conference, with a talk exploring how large language models utilize GPU capabilities on embedded devices.
Collabora @ Embedded World 2024
GPU-Accelerated LLM on an Embedded Device
Presented by Marcus Edel - Wednesday, April 10, 16:00 CEST (UTC+2)
NCC Ost
Open language model developments have paved the way for progress in areas like question-answering, translation, and creative tasks. While most of these models thrive on powerful GPUs, we're curious about their performance on widely available embedded systems. Enter Machine Learning Compilation: an emerging tool that autonomously sets up, refines, and manages machine learning tasks across different platforms. Interestingly, many embedded devices come with mobile GPUs, which can be harnessed for added speed. With this in mind, we've made strides in running the Llama-2 model on a Mali-G610 GPU. We've leveraged Machine Learning Compilation to harness the GPU's capabilities; specifically, we've efficiently deployed the Llama-2 model by utilizing optimization techniques such as quantization, fusion, and layout tweaks. Furthermore, we tapped into a universal GPU kernel optimization framework written in TVM TensorIR, adapting it for Mali GPUs. And lastly, we've employed the OpenCL codegen backend from TVM, tailoring it for Mali GPUs.
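To give a flavor of one of the optimization techniques mentioned above, here is a minimal pure-Python sketch of symmetric int8 weight quantization — the idea of trading precision for a smaller memory footprint, which is what makes large models fit on embedded GPUs. This is only a conceptual illustration, not the actual quantization scheme used by Machine Learning Compilation or TVM:

```python
def quantize_int8(weights):
    """Map float weights to int8 values plus a per-tensor scale factor.

    Symmetric quantization: the largest absolute weight maps to +/-127,
    and every weight is stored as round(w / scale).
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [q * scale for q in quantized]


# Example: a tiny weight vector round-trips with error bounded by scale/2.
weights = [0.42, -1.3, 0.07, 0.9]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

In a real deployment, the quantized weights are what the GPU kernels consume, cutting memory traffic roughly fourfold versus float32 at a small accuracy cost.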
Whether it's writing a line of code or shaping a long-term strategic software development plan, Collabora can accelerate and facilitate the realization of your embedded Open Source projects. Let us show you how we can help! Drop in for a friendly 'Hallo' at booth 4-404!
See you there!