Last year, MIT researchers developed an AI/ML algorithm capable of learning and adapting to new information while on the job, not just during its initial training phase. These “liquid” neural networks (in the Bruce Lee sense) literally play 4D chess, since their models require time-series data to operate, which makes them ideal for time-sensitive tasks like pacemaker monitoring, weather forecasting, investment forecasting, or autonomous vehicle navigation. The problem is that data throughput has become a bottleneck, and scaling these systems has become prohibitively expensive, computationally speaking.
On Tuesday, MIT researchers announced that they have devised a solution to that restriction, not by widening the data pipeline but by solving a differential equation that has stumped mathematicians since 1907. Specifically, the team solved “the differential equation behind the interaction of two neurons through synapses… to unlock a new type of fast and efficient artificial intelligence algorithms.”
“The new machine learning models we call ‘CfC’s’ [closed-form Continuous-time] replace the differential equation defining the computation of the neuron with a closed form approximation, preserving the beautiful properties of liquid networks without the need for numerical integration,” MIT professor and CSAIL Director Daniela Rus said in a Tuesday press statement. “CfC models are causal, compact, explainable, and efficient to train and predict. They open the way to trustworthy machine learning for safety-critical applications.”
So, for those of us without a doctorate in Really Hard Math, differential equations are formulas that describe how the state of a system changes over time. For example, if you have a robot arm moving from point A to point B, you can use a differential equation to work out where it is between the two points at any given moment. However, solving these equations numerically, one small step at a time, quickly gets computationally expensive. MIT’s “closed form” solution sidesteps that issue by modeling the entire description of a system in a single computational step. As the MIT team explains:
Imagine if you have an end-to-end neural network that receives driving input from a camera mounted on a car. The network is trained to generate outputs, like the car’s steering angle. In 2020, the team solved this by using liquid neural networks with 19 nodes, so 19 neurons plus a small perception module could drive a car. A differential equation describes each node of that system. With the closed-form solution, if you replace it inside this network, it would give you the exact behavior, as it’s a good approximation of the actual dynamics of the system. They can thus solve the problem with an even lower number of neurons, which means it would be faster and less computationally expensive.
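To make the difference between stepping through an equation and evaluating a closed-form solution concrete, here is a minimal Python sketch. It is not the CfC model or the neuron equation the MIT team solved. It uses a toy stand-in chosen for illustration: a single value x that relaxes toward a target at a rate set by a time constant, dx/dt = (target - x) / tau, whose exact solution is well known. The function names and parameter values are assumptions made up for this example.

```python
import math


def simulate_euler(x0, target, tau, t_end, steps):
    """Approximate x(t_end) by stepping dx/dt = (target - x) / tau forward in time.

    Toy illustration of numerical integration (forward Euler); the cost
    grows with the number of steps.
    """
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x += dt * (target - x) / tau
    return x


def closed_form(x0, target, tau, t):
    """Evaluate the exact solution x(t) = target + (x0 - target) * exp(-t / tau).

    One expression, no stepping, regardless of how far ahead t is.
    """
    return target + (x0 - target) * math.exp(-t / tau)


if __name__ == "__main__":
    # Hypothetical values chosen only for the demonstration.
    x0, target, tau, t_end = 0.0, 1.0, 0.5, 1.0
    exact = closed_form(x0, target, tau, t_end)
    for steps in (10, 100, 1000):
        approx = simulate_euler(x0, target, tau, t_end, steps)
        print(f"{steps:>5} Euler steps: {approx:.6f} (error {abs(approx - exact):.2e})")
    print(f"closed form, one evaluation: {exact:.6f}")
```

The stepped version only approaches the right answer as the number of steps (and the compute bill) goes up, while the closed-form version gets there in a single evaluation. That is the general trade the article describes, shown here on a far simpler equation than the one behind liquid networks.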
By solving this equation at the neuron level, the team is hopeful that they’ll be able to construct models of the human brain that measure in the millions of neural connections, something not possible today. The team also notes that this CfC model might be able to take the visual training it learned in one environment and apply it to a wholly new situation without additional work, a capability known as out-of-distribution generalization. That’s not something current-gen models can really do, and it would prove to be a significant step towards the generalized AI systems of tomorrow.
Author: A. Tarantola
Source: Engadget