math

Researchers Claim New Technique Slashes AI Energy Use By 95% – Slashdot

Posted by BeauHD from the would-you-look-at-that dept.

Researchers at BitEnergy AI, Inc. have developed Linear-Complexity Multiplication (L-Mul), a technique that reduces AI model power consumption by up to 95% by replacing energy-intensive floating-point multiplications with simpler integer additions. This method promises significant energy savings without compromising accuracy, but it requires specialized hardware to fully realize its benefits. Decrypt reports: L-Mul tackles the AI energy problem head-on by reimagining how AI models handle calculations. Instead of complex floating-point multiplications, L-Mul approximates these operations using integer additions. So, for example, instead of multiplying 123.45 by 67.89, L-Mul breaks it down into smaller, easier steps using addition. This makes the calculations faster and uses less energy, while still maintaining accuracy. The results seem promising. “Applying the L-Mul operation in tensor processing hardware can potentially reduce 95% energy cost by element wise floating point tensor multiplications and 80% energy cost of dot products,” the researchers claim. Without getting overly complicated, what that means is simply this: If a model used this technique, it would require 95% less energy to think, and 80% less energy to come up with new ideas, according to this research.

The algorithm’s impact extends beyond energy savings. L-Mul outperforms current 8-bit standards in some cases, achieving higher precision while using significantly less bit-level computation. Tests across natural language processing, vision tasks, and symbolic reasoning showed an average performance drop of just 0.07% — a negligible tradeoff for the potential energy savings. Transformer-based models, the backbone of large language models like GPT, could benefit greatly from L-Mul. The algorithm seamlessly integrates into the attention mechanism, a computationally intensive part of these models. Tests on popular models such as Llama, Mistral, and Gemma even revealed some accuracy gain on certain vision tasks.

At an operational level, L-Mul’s advantages become even clearer. The research shows that multiplying two float8 numbers (the way AI models would operate today) requires 325 operations, while L-Mul uses only 157 — less than half. “To summarize the error and complexity analysis, L-Mul is both more efficient and more accurate than fp8 multiplication,” the study concludes. But nothing is perfect and this technique has a major achilles heel: It requires a special type of hardware, so the current hardware isn’t optimized to take full advantage of it. Plans for specialized hardware that natively supports L-Mul calculations may be already in motion. “To unlock the full potential of our proposed method, we will implement the L-Mul and L-Matmul kernel algorithms on hardware level and develop programming APIs for high-level model design,” the researchers say.

The opulence of the front office door varies inversely with the fundamental solvency of the firm.

Working…

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Back to top button

Adblock Detected

Block the adblockers from browsing the site, till they turn off the Ad Blocker.