Nota, a company specializing in AI model lightweighting and optimization, and FuriosaAI, a fabless AI semiconductor company, are pursuing technological cooperation with the goal of expanding their business domains. Nota’s core business is “model quantization,” a technology that reduces the size of artificial intelligence models while preserving performance as much as possible. In 2022, the company launched the AI model lightweighting platform “NetsPresso,” which currently supports three types of tasks: model development, model optimization, and model validation. FuriosaAI officially released its second-generation Neural Processing Unit (NPU), RNGD (Renegade), in June 2024, and it recently received mass-produced units and has begun delivering products.
FuriosaAI’s 2nd-generation NPU ‘RNGD’ and the NXT-RNGD server composed of eight RNGD cards / Source = IT Donga
RNGD is currently optimized for efficient large language model inference in server environments. AI semiconductors are broadly divided into those used for “training,” in which a model is built by learning from data, and those used for “inference,” in which a completed model is run. NVIDIA GPUs can be used for both training and inference, but because demand for training is so strong, the total cost of ownership for inference use is relatively high. FuriosaAI therefore proposes using RNGD in place of GPUs for AI inference workloads.
If the models being run require large amounts of memory, more compute cards are needed and power consumption increases. Running AI models in quantized form significantly reduces memory requirements, so more models can be operated on the same hardware. For example, Meta’s Llama 3.1 70B model requires about 140GB of video memory at 16-bit precision, but when quantized to 4-bit (INT4), memory usage falls to 35–40GB. Some quality is inevitably lost during compression, and the capability of a quantization technology company is measured by how closely it can preserve the original accuracy.
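The memory figures above follow from simple arithmetic on parameter count and bits per weight. The sketch below is a rough estimate only, not Nota's or FuriosaAI's method: it counts weights alone (no KV cache, activations, or quantization metadata, which presumably explains why the article quotes 35–40GB rather than exactly 35GB), it assumes a nominal 70 billion parameters, and the per-card capacity `CARD_CAPACITY_GB` is a hypothetical value chosen for illustration, not a figure from the article.

```python
import math


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for model weights only, in decimal GB."""
    return num_params * bits_per_weight / 8 / 1e9


def cards_needed(memory_gb: float, card_capacity_gb: float) -> int:
    """Minimum number of cards whose combined memory holds the weights."""
    return math.ceil(memory_gb / card_capacity_gb)


# Assumption: nominal parameter count for a "70B" model.
LLAMA_70B_PARAMS = 70e9
# Assumption: hypothetical per-card memory capacity, for illustration only.
CARD_CAPACITY_GB = 48

fp16_gb = weight_memory_gb(LLAMA_70B_PARAMS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(LLAMA_70B_PARAMS, 4)   # 4-bit (INT4) weights

print(f"FP16: {fp16_gb:.0f} GB, cards: {cards_needed(fp16_gb, CARD_CAPACITY_GB)}")
print(f"INT4: {int4_gb:.0f} GB, cards: {cards_needed(int4_gb, CARD_CAPACITY_GB)}")
```

Under these assumptions, the 4-bit weights take one quarter of the memory of the 16-bit weights, which is why fewer cards are needed to serve the same model.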
Performance results of quantizing VGG19 and MobileNetV1 using Nota NetsPresso / Source = Nota
Depending on the case, Nota’s NetsPresso can compress a model to as little as one-tenth of its original size and increase inference speed by up to 42 times. When VGG19, a CNN model composed of 19 layers (16 convolutional + 3 fully connected), was quantized, throughput rose from 5.28FPS to 222.22FPS, a roughly 42-fold increase, while MobileNetV1 improved from 28.08FPS to 480.77FPS, a 17-fold increase. The cost in accuracy was small: VGG19 declined from 72.28% to 71.14%, a drop of 1.14 percentage points, and MobileNetV1 declined from 66.68% to 66.11%, a drop of only 0.57 percentage points. NetsPresso’s role is to compress models so that more of them can run on the same hardware, improving the operational efficiency of AI accelerators.
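The speedup and accuracy figures quoted above can be checked directly from the before and after numbers. The following is a minimal sketch that only redoes the arithmetic; the dictionary layout and the `summarize` helper are illustrative names, not part of NetsPresso.

```python
# Figures as quoted in the article (FPS and top-line accuracy in percent).
results = {
    "VGG19": {
        "fps_before": 5.28, "fps_after": 222.22,
        "acc_before": 72.28, "acc_after": 71.14,
    },
    "MobileNetV1": {
        "fps_before": 28.08, "fps_after": 480.77,
        "acc_before": 66.68, "acc_after": 66.11,
    },
}


def summarize(r: dict) -> tuple[float, float]:
    """Return (speedup factor, accuracy drop in percentage points)."""
    speedup = r["fps_after"] / r["fps_before"]
    acc_drop = r["acc_before"] - r["acc_after"]
    return speedup, acc_drop


summary = {name: summarize(r) for name, r in results.items()}

for name, (speedup, acc_drop) in summary.items():
    print(f"{name}: {speedup:.1f}x faster, "
          f"{acc_drop:.2f} percentage points lower accuracy")
```

Note that the accuracy change is a difference between two percentages, so it is expressed in percentage points rather than as a relative percentage.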
Nota also helps models perform optimally on specific hardware through its hardware-aware AI optimization technology. Previously, specialized AI engineers needed several months to tune a model for a given piece of hardware, but with NetsPresso’s automated pipeline and pre-optimization, this process can be bypassed and models can be deployed directly. Optimization currently focuses mainly on edge products from Arm, Qualcomm, NVIDIA, and Renesas, and FuriosaAI’s RNGD is expected to be added to this list.
RNGD will also be integrated into Nota’s visual recognition AI solution, Nota Vision Agent / Source = Nota
Meanwhile, Nota and FuriosaAI are introducing a packaged solution that combines RNGD with Nota’s visual recognition AI solution “Nota Vision Agent (NVA).” The two companies opened the door to technology supply through a memorandum of understanding on technological cooperation signed last November, and with this collaboration they are beginning to build a joint business model. Based on a Vision-Language Model, Nota Vision Agent supports real-time monitoring of captured images, context-based incident summarization and analysis report generation, and natural language-based Q&A and video search. By combining this with RNGD’s vision processing capabilities, the companies intend to offer it as an integrated AI device.
For FuriosaAI, this demonstrates that RNGD can be deployed across diverse industrial sites. Recently, a wide range of fields, including security, healthcare, and distribution, as well as urban control and smart buildings, have been combining CCTV with AI. The wider the application of Nota Vision Agent, the more it will demonstrate the versatility of RNGD’s vision processing capabilities.
Nota CEO Chae Myung-soo stated, “This agreement is the result of NetsPresso’s AI hardware optimization technology expanding beyond on-device AI environments such as mobile and mobility into the high-performance data center domain, and once again proving its commercial value in the market,” adding, “Together with FuriosaAI, we will showcase Korea’s AI technological capabilities to the global market.”
FuriosaAI CEO Baek Joon-ho commented, “The combination of FuriosaAI’s innovative NPU technology and Nota’s advanced AI optimization capabilities will serve as an opportunity to prove the potential of Korean-style AI in the global market,” adding, “Through close collaboration with Nota, we will introduce solutions that deliver high performance and efficiency in real industrial environments.”
FuriosaAI begins commercial operations, meaningful results expected in the second half
Nota and FuriosaAI signed an MOU last December, and are now fully launching joint commercialization efforts / Source = Nota
Last year, FuriosaAI, together with LG AI Research, announced plans to configure a package that runs the Exaone model offline using the NXT-RNGD server, which combines eight RNGD cards in a rack form factor. The company also established a cooperative framework with US-based AI infrastructure company ByteBridge to support digital infrastructure across the Asia-Pacific region. On the commercialization front, it is working with Deepnoid to commercialize support for M4CXR, an AI solution for medical image reading and diagnostic reporting, and support for Nota’s NVA is part of the same initiative.
On January 28, the company announced the official delivery of 4,000 RNGD cards, manufactured by TSMC and assembled by ASUS. This indicates that FuriosaAI can now actively respond to AI semiconductor demand associated with technological cooperation and MOUs concluded in various forms since last year, and begin product sales. That said, the outlook for the AI semiconductor market itself is currently uncertain due to issues such as memory supply, and attention is focused on whether FuriosaAI will be able to move forward smoothly.
IT Donga reporter Nam Si-hyun (sh@itdonga.com)
ⓒ dongA.com. All rights reserved. Reproduction, redistribution, or use for AI training prohibited.