In this article, we'll explore how advancements in computing infrastructure and algorithms will impact the potential of AI technologies. To grasp the future of AI and its capabilities, it's essential to first understand the current limitations and how they might be overcome.
Over the past decade, one significant trend in machine learning has been the shift towards edge computing. Edge computing means running models directly on the data-capturing devices themselves, such as IoT sensors and cameras. When done effectively, this approach cuts costs, reduces bandwidth usage, and minimizes latency.
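To make the idea concrete, here is a minimal sketch of what on-device inference often looks like in practice, assuming a quantized TensorFlow Lite model and the tflite_runtime package; the model file and tensor shapes are placeholders, not a reference to any specific product.

```python
# Minimal sketch of edge inference: the model runs on the device itself,
# so raw data never leaves the device and there is no per-request cloud cost.
import numpy as np
from tflite_runtime.interpreter import Interpreter

# "model.tflite" is a hypothetical quantized model deployed to the device.
interpreter = Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Placeholder sensor reading shaped to match the model's input tensor.
sensor_frame = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], sensor_frame)
interpreter.invoke()  # inference happens locally, with no network round trip
prediction = interpreter.get_tensor(output_details[0]["index"])
```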
However, I expect the move towards edge computing to slow down as we enter the era of generative AI and foundation models, especially in conversational AI. These new models demand so much memory and compute that they are costly, slow in terms of latency, and unsuitable for current generations of edge devices.
These limitations also affect potential business applications. For instance, OpenAI recently introduced GPT-4 Turbo, which offers an expanded context window and lower prices. Even so, running a single query that fills the 128k-token context can cost more than $2. At that price point, a SaaS product built on top of it would need either a high monthly fee or significant usage limits.
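As a rough back-of-the-envelope check, per-query cost scales linearly with the number of tokens processed. The prices in the sketch below are illustrative assumptions, not current OpenAI rates; the exact figure depends on the rates and on how much output the query produces.

```python
# Back-of-the-envelope cost of a single long-context query.
# Prices are assumed for illustration, not actual OpenAI pricing.
PRICE_PER_1K_INPUT = 0.01   # USD per 1,000 input tokens (assumed)
PRICE_PER_1K_OUTPUT = 0.03  # USD per 1,000 output tokens (assumed)

def query_cost(input_tokens: int, output_tokens: int) -> float:
    """Token-based pricing: cost grows linearly with context length."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# A query that fills a 128k context window plus a long answer.
print(f"${query_cost(128_000, 4_000):.2f} per query")          # ~$1.40 at these assumed rates
print(f"${query_cost(128_000, 4_000) * 30:.2f} per month")     # one such query per day
```

Even at the lower end of plausible rates, a product that issues a few such queries per user per day quickly accumulates a per-user cost that most subscription prices cannot absorb.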
Generic models such as ChatGPT, capable of handling a wide variety of tasks, may not be as efficient or cost-effective as models optimized for specific use cases. Infrastructure limitations and computing costs are two crucial dimensions influencing AI development.
Looking ahead to the next ten years, we can view AI development as occurring in waves of improvements:
Large waves: These entail the introduction of new algorithms, like transformers, that expand capabilities and require substantial computing infrastructure, often accessible through cloud services.
Small waves: These represent efficiency improvements, such as optimizing existing algorithms, reducing computational requirements, or introducing smaller models tailored for specific use cases.
Massive waves: These are rare and driven by shifts in computing paradigms. Quantum computers are on the horizon and could usher in a new era for AI.
We've recently witnessed a large wave with generative AI. In the next 2-3 years, we can anticipate several smaller waves of efficiency improvements in algorithms and infrastructure, along with a renewed move toward single-use-case models. New AI chips (TPUs, GPUs) may lower the cost of running models, making them more accessible for various business cases.
A few years down the road, we might see new algorithms replacing the current ones, potentially leading to another large wave requiring cloud-based models.
Between five and ten years from now, I expect quantum computers to become more widely available, possibly leading to a new era in AI algorithm capabilities. There was recently an announcement that the supercomputer Lumi, including its quantum computer, is available to any company for research purposes. For quantum computers to revolutionize AI, however, they would first have to become mainstream and available at scale through cloud providers. When that happens, expect interesting new AI capabilities to follow soon after.
For those devising AI strategies, focusing solely on the cost of running algorithms may be misguided. Costs have already dropped significantly, and further reductions are to be expected. Waiting until costs are lower is likely to be a losing strategy in segments where competition is fierce.