DeepX, Finding the Singularity
An era of ‘on-device AI’ is approaching, in which electronic devices will have AI chips embedded in them. The possibilities of talking cars and thinking refrigerators are intriguing. To realize them, however, we must first identify an AI chip with “unparalleled technology.” The world is currently engaged in a race to create AI-specific semiconductors that can overcome the limitations of GPUs, which have traditionally been considered the key driver of AI development. Among the companies in this race is the South Korea-based AI semiconductor startup DeepX, which is developing its own AI chip. It is integrating NPUs that emulate the human brain into small electronic devices such as CCTVs, robots, and drones, aiming to achieve the world’s highest power efficiency (AI computing performance per unit of power) at the device edge.
If we define the ‘technological singularity’ as the point where artificial intelligence exceeds human intelligence, then this small chip marks the start of that moment. “Without AI semiconductors, there is no AI,” says DeepX CEO Kim Lokwon. We interviewed him to gain insights into the future of AI semiconductors.
Jack-of-all-trades
Q. Companies developing AI semiconductors typically announce one chip at the end of a project. However, DeepX has surprised the industry by releasing four chips bundled together. In my opinion, DeepX’s differentiation strategy begins at this point.
A. Different electronic devices require different types of semiconductors. For instance, the AI used in CCTV only needs to analyze video, while the AI in industrial robots must perform more complex computations, such as deciding which object to pick up and how to move it. To address this, we have created chips that handle AI computation for a single device as well as chips that handle it for multiple connected devices. We have then combined four semiconductor products, each with a distinct role and function, into a complete package solution that can be applied across devices.
Q. That would be the All-Info AI Total Solution. The package consists of DX-V1, DX-V3, DX-M1, and DX-H1. Can you explain the characteristics and functions of each?
A. The four chips can be roughly divided into two types: the DX-M1 and DX-H1 are AI boosters, and the DX-V1 and DX-V3 are AI enablers. Each has slightly different target applications. The DX-M1 can be used in consumer and industrial robots, smart factories, and smart mobility. The DX-H1 is for AI servers used in colocation* and hyperscale* data centers. Dedicated to AI inference, it offers better performance, power efficiency, and cost efficiency than GPGPUs. The DX-V1 can process the latest AI algorithms on a single camera. It can be used in CCTV, robots, drones, VR cameras, and camera surveillance systems. The DX-V3 is specialized for autonomous driving, robot vision, and other applications that require processing of 3D sensors in addition to cameras. All four chips come with DX-GEN1, an NPU developed by DeepX.
*Colocation data center: A facility where multiple companies can rent space and power for their servers. The facility generates revenue by bringing servers from different companies together in one place.
*Hyperscale data center: A data center built on a massive scale, typically housing around 5,000 servers or more in 3,000 square meters or more of floor space, although there is no official definition.
Q. DeepX offers packaged solutions that can be utilized in both server and edge environments, making it a versatile and convenient option. It’s like a ‘one-size-fits-all’ solution. Furthermore, I have heard that DeepX has achieved a world-class Performance Per Watt (PPW) ratio, while still being highly adaptable. There is currently no dominant player in the AI semiconductor market. I am curious to know about DeepX’s competitive advantage that enables it to stay ahead in the race for AI chips.
A. Running very large AI models on traditional GPUs requires a great deal of power and money. In fact, the power consumed by the GPUs Nvidia sells in a year is equivalent to the annual consumption of many developing countries. With the emergence of the on-device AI market, GPUs are not a feasible way to run AI on everything we use today. Although many companies are trying to develop AI-specific semiconductors to replace GPUs, it is still difficult to find a massively commercialized AI edge device. These devices are usually battery-powered and have little room for cooling, so they must draw little power and generate little heat. Semiconductor companies like ours are therefore determined to increase the efficiency of AI computation while addressing the issue of heat dissipation. That is why we are developing source technology for low-power, high-performance AI semiconductors to achieve the world’s best performance per watt.
Q. How does DeepX manage to design semiconductors that are both the fastest and consume the lowest amount of power in the world?
A. We developed IQ8™, an INT8* model compression technology, and Smart Memory Access technology, which cuts DRAM usage to a fraction of what GPUs require. These key technologies make complex AI models lightweight while keeping them accurate, without deterioration*. Even with LPDDR, a low-power memory, rather than expensive HBM, our chips deliver the world’s highest AI computing performance and more than 10 times the power efficiency of GPUs. In particular, competitors’ AI semiconductors in the global market use 32MB to 50MB of cache memory, while DeepX’s products use only about one-fourth of that. We can say that our products are superior in most of the key features of AI semiconductors, such as computational processing power, AI computation accuracy, and the types of AI algorithms supported.
*INT8: An 8-bit integer format. Each bit can take one of two values, so 8 bits can represent 2^8 = 256 distinct integers, which for a signed integer means the range from -128 to 127.
*Deterioration: Here, the loss of accuracy that an AI model typically suffers when it is compressed to a lower numeric precision.
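To make the INT8 footnote concrete, here is a minimal sketch of generic symmetric INT8 quantization in Python. It is not DeepX’s IQ8™ method, whose details the interview does not describe; the function names and the example weights are made up for illustration. It only shows how floating-point values can be mapped into the -128 to 127 range and how much rounding error that introduces.

```python
# Generic symmetric INT8 quantization sketch (an illustration only,
# NOT DeepX's IQ8 technology; all names and values are hypothetical).

def quantize_int8(values):
    """Map floats to signed 8-bit integer codes using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer, then clamp to the INT8 range [-128, 127].
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from INT8 codes."""
    return [c * scale for c in codes]


weights = [0.82, -1.27, 0.03, 0.5, -0.64]  # made-up example weights
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# The worst-case rounding error is at most half of one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("codes:", codes)
print("scale:", scale)
print("max error:", max_error)
```

Each code fits in one byte instead of the four bytes of a 32-bit float, which is where the memory saving comes from. The accuracy cost is the rounding error, and keeping that error from hurting the model is the hard part that compression technologies aim to solve.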
DeepX has something special. What is it?
Q. What are the top three strengths of DeepX’s technology?
A. They are low manufacturing cost, low power consumption, and high efficiency and performance. One of the ways we reduced manufacturing costs was by minimizing the use of SRAM, which has a high unit cost. SRAM is used as cache memory to store data in the AI computational structure. By reducing the amount of cache memory, we reduced the size of the chip itself, which allows more chips to be manufactured on a single wafer*. This significantly lowers the cost of manufacturing.
Additionally, the smaller size of the product results in reduced power consumption. DeepX’s chips are optimized to reuse data and memory to reduce unnecessary operations, which further reduces power consumption while increasing computational processing speeds. This optimization of memory usage continues to improve the speed of computational processing, leading to higher accuracy.
*Wafer: A thin, round disk of single-crystal silicon used as the base for making semiconductors. Wafers for modern processes are typically 200 or 300 millimeters in diameter.
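The link between chip size and chips per wafer can be sketched with a widely used dies-per-wafer approximation. The wafer diameter and die areas below are assumptions chosen for illustration, not DeepX figures, and the estimate ignores yield, scribe lines, and edge exclusion.

```python
import math


def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Approximate gross dies per wafer: area term minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    # Partial dies around the circular edge of the wafer are unusable.
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)


WAFER_MM = 300          # assumed wafer diameter
LARGE_DIE_MM2 = 90.0    # hypothetical die with a large cache
SMALL_DIE_MM2 = 30.0    # hypothetical die one-third the size

large = dies_per_wafer(WAFER_MM, LARGE_DIE_MM2)
small = dies_per_wafer(WAFER_MM, SMALL_DIE_MM2)
print("large die:", large, "per wafer")
print("small die:", small, "per wafer")
print("ratio:", small / large)
```

Under these assumptions, a die one-third the size yields slightly more than three times as many chips per wafer, because smaller dies also waste less area at the wafer edge. Since a wafer costs roughly the same to process either way, the cost per chip falls accordingly.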
Q. We believe that DeepX’s unique technology means more than just being a “fast mover” in the global AI semiconductor market. How will DeepX’s AI semiconductors accelerate the commercialization of on-device AI products and change consumers’ lives?
A. The DX-M1 is an NPU designed to perform AI computation and inference for robots and security systems. It can be widely used in the security market to operate major facilities safely and prevent disasters and crimes. In fact, the physical security market is where AI is making the fastest inroads. The DX-M1 can be installed in hardware for AI-based object detection or intelligent video analysis. It is built on a 5-nanometer (㎚) process and enables real-time AI processing of more than 30 frames per second (FPS) for 16 or more channels of multichannel video on a single chip. Additionally, the manufacturing cost is extremely low. The DX-M1 has only one-third the design area of other NPUs, which cuts the manufacturing cost by a similar proportion.
Our NPUs also support more advanced algorithms than conventional ones. The DX-V1, which Samsung Electronics’ foundry manufactures on a 28-nanometer (㎚) process, runs the latest AI algorithms such as YOLOv7. These algorithms are not compatible with conventional NPUs, and they offer users faster and more efficient computation.
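The multichannel figure quoted for the DX-M1 (16 channels at 30 FPS each) implies a tight per-frame time budget. The short calculation below is a back-of-the-envelope sketch that assumes frames are processed one at a time on a single chip, with no batching or pipelining; the function name is hypothetical.

```python
def frame_budget(channels, fps_per_channel):
    """Return total frames per second and the time budget per frame in ms."""
    total_fps = channels * fps_per_channel
    budget_ms = 1000.0 / total_fps
    return total_fps, budget_ms


# Figures quoted in the interview: 16 channels at 30 FPS each.
total_fps, budget_ms = frame_budget(16, 30)
print("total frames per second:", total_fps)
print("budget per frame (ms):", round(budget_ms, 2))
```

In other words, the chip has to finish one inference roughly every two milliseconds to keep all 16 video streams running in real time.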
Q. It is impressive that DeepX has developed a software solution that enables customers to use its products easily. The ability to operate four AI semiconductors using a single software framework is a significant advantage for customers. It resembles Nvidia’s ‘CUDA’ in that it provides a complete stack of products and services, from hardware to software.
A. We set out to develop a personalized solution that was tailored to our customers’ products. From the beginning, our goal was to address the fragmented on-device AI market, both in our four semiconductors and in the software we provide. To achieve this, we created DXNN®, an AI software development kit (SDK). We designed DXNN® in such a way that even customers who are already familiar with CUDA can adopt it without difficulty, enabling them to quickly integrate with our system.
For reference, the DX-H1 is designed to work well with current GPU solutions and can support AI models trained on GPUs through a flexible interface. The DX-M1 is also available as a compact M.2 module for quick connection to existing embedded systems, making it easy for customers to integrate DeepX’s low-power, high-performance AI solutions.
The singularity is near
Q. Recently, at CES 2024, DeepX was recognized for its excellence in embedded technology, robotics, and computer hardware and components. Your booth was visited by over 5,000 attendees who appreciated your success and applauded you. Congratulations! (laughs) It’s great to see DeepX making its mark in the global AI semiconductor market. I believe that your success is just around the corner.
A. Since CES 2024, DeepX’s Early Engagement Customer Program (EECP) has doubled in size, with close to 90 global companies currently undergoing pre-qualification testing. This clearly indicates the increasing demand for DeepX’s on-device AI solutions in the industry. We are actively promoting our technology at various conferences in the US, Europe, and Taiwan. Our AI chip has already demonstrated its capabilities through samples, and we plan to have mass-produced chips available in the market later this year. If manufacturers begin to use our chips in their finished products, we aim to be the leading provider in the on-device AI market by 2025.
Q. What are the future plans of DeepX and what technological advancements can we expect?
A. We are aiming to develop a new technology that can run a very large language model while consuming less than 5 watts of power. This technology will enable artificial intelligence to become ubiquitous in human life, going beyond the realm of science. Our technology will make on-device AI significantly more affordable, lower in power consumption, and higher in performance. Currently, the latest trend in on-device AI involves incorporating large natural language processing models that handle conversation, translation, summarization, writing, coding, and other functions. We define generative AI as a “federated operation of LLMs,” which would be the decisive technology for humanity to commercialize. It is a collaboration between on-device AI and on-premise AI*. With our server and edge AI semiconductor technology, we expect to develop ultra-low power semiconductors that reduce energy consumption and carbon emissions by a factor of 1,000.
*On-premise AI: AI built within a company’s own data center, without any external connections. This contrasts with cloud AI, which processes data offsite using cloud systems.
Q. According to global market research firm Gartner, shipments of on-device AI products are expected to reach 300 million units worldwide this year. I’d love to hear your honest thoughts on AI and AI semiconductors.
A. AI is the final invention of humanity as it will mark the endpoint of human evolution. However, the realization of AI is not possible without AI semiconductors. These days, semiconductors are ubiquitous in our lives – from smartphones and cars to home appliances, elevators and even refrigerators. And with the integration of AI, semiconductors are becoming even more critical. Self-driving cars, smart factories, and robotic toys are just some of the examples that will become part of our everyday lives. The most significant question right now is who can build better AI semiconductors faster. At DeepX, we believe that we are solving the challenges of traditional AI semiconductors in one fell swoop, which is why we stand out.
Q. Last question. Did DeepX discover a singularity?
A. I believe that we have brought forward the era in which AI becomes part of our daily lives. Without DeepX’s products, it would have taken much longer for AI semiconductors in CCTVs and robots to move on from old algorithms. Someone must do it, so we are doing it first.