
Monday, January 19, 2026

Brilliance of Transformer Neural Network

 

The more you read about neural networks, the more you are in awe. And then you read about how the Transformer architecture works, the architecture that essentially drives generative AI, the present mainstream interface to AI. I am especially amazed by the process of embedding words, the way each word gets a position in a multi-dimensional context space. Vectorization of words, i.e., converting words into vectors of numbers while preserving their meaning across changing contexts, is an enviable, indeed seemingly impossible, task. It's amazing that you start by working out how often words occur and co-occur and then build an equation to measure how they are connected. These word-to-word correlations form a pattern, and if the code is able to emulate this pattern, that is, if it shows the same similarities, then it has captured the context, the essence of its reality. Well, it is much more elaborate than this, but this is the crux of self-attention. To place a word in space, use matrices of numbers to connect it to other words, and create a map of meanings from these intricate relationships is brilliance. Processing the words in parallel rather than one after another not only captures this emergent essence effectively but also suits powerful GPUs, and hence can be scaled. And the more you feed it, the sharper it gets, ready for the future!
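For readers who like to see the idea in code, here is a minimal sketch of the self-attention step described above, in plain Python. It is a simplification under stated assumptions, not the implementation from the paper: the three words and their 4-dimensional vectors are invented for illustration, and the learned query, key and value projection matrices that a real Transformer uses are left out, so each word's vector serves as its own query, key and value.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product: a simple measure of how aligned two vectors are."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections.

    Assumption for this sketch: queries, keys and values are all the
    input vectors themselves.
    """
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        # Score this word against every word, scaled by sqrt(dimension).
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # The new vector for this word is a weighted mix of all vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        )
    return weights, outputs


# Toy embeddings, made up for illustration only.
embeddings = {
    "river": [1.0, 0.0, 1.0, 0.0],
    "bank": [1.0, 1.0, 0.0, 0.0],
    "money": [0.0, 1.0, 0.0, 1.0],
}
words = list(embeddings)
weights, outputs = self_attention([embeddings[w] for w in words])

for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how much attention one word pays to every word in the list, itself included. Every row is computed independently of the others, which is why the whole thing can be handed to a GPU as one big matrix multiplication.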

“Attention Is All You Need”, the seminal 2017 paper, is now recognized as a landmark in AI development and one of the most cited papers in modern AI (see my earlier post: https://depalan.blogspot.com/2024/03/attention-is-all-you-need.html). Still, I feel ‘Team Transformer’ is not as widely recognized as it should be, especially Jakob Uszkoreit, Ashish Vaswani, Illia Polosukhin and Noam Shazeer. They surely are front-runners for the Turing Award. Meanwhile, during an interview I was conducting two years back, I was quite shocked that even a recent alumnus (BIT Mesra) was not aware of Ashish Vaswani. If Vaswani were involved in non-engineering work it would be understandable, but here he is at the core of the technology at the center of the AI revolution. The Indian government speaks about AI education; it needs to start by recognizing pioneers such as Ashish Vaswani and Niki Parmar.

Gaming the system

  "When a measure becomes a target, it ceases to be a good measure." Goodhart's Law This is precisely what education is redu...