A few hours after Google announced their new family of language models named Gemini, I was asked what I thought about them. I gave an honest answer: I hadn't really looked into them. I saw a headline that they had been released, and shrugged it off. Why? Well, I'm kind of sick of them.
When I say this, it is not meant to discredit or downplay how impressive the models are. The ability to ask a question about linear algebra, follow it with a prompt to write a poem in the tone of a pirate, and confidently get responses that are generally accurate is incredible. I remember messing around with BERT and its variants less than two years ago and being disappointed by responses that were nonsensical.
But with that said, I've become a bit unimpressed with new models. Most of them tell you that they are a few percentage points better than other models on a variety of different metrics. And while that is progress, it's just not that exciting to me. On top of that, the architectures of the models are largely the same, which means the differences stem from variations in training data. And if your model is only performing better because it has ___ billion more parameters than the last, great. I am happy that you and your team have access to more compute.
What has impressed me are the smaller models. In fact, I think most of the productivity gains we see will come from these smaller models. Retrieval Augmented Generation (RAG) is one of the more common ways to use LLMs on domain-specific data. The magic sauce there is not the LLM, but the similarity search returning the context. As long as the LLM is able to produce coherent text, all it needs to do is remix the retrieved context into an answer. In simple experiments, I have found that 3-bit quantized models are more than sufficient compared to 8-bit or even full base models. Perhaps I should run an actual experiment; I am a data scientist after all.
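To make the "magic sauce" point concrete, here is a minimal sketch of the retrieval step, using a toy bag-of-words cosine similarity in place of a real embedding model. The documents, query, and scoring below are all illustrative assumptions, not a production setup:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts.
    # A real RAG system would use a learned sentence-embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and return the top k;
    # this context is what gets stuffed into the LLM prompt.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Quantization reduces model weights to fewer bits.",
    "Pirates sailed the seas in search of treasure.",
    "A p-value measures evidence against a null hypothesis.",
]
print(retrieve("what does quantization do to weights", docs)[0])
```

If the retrieval surfaces the right passage, even a small or heavily quantized model can usually paraphrase it into a decent answer; if it surfaces the wrong one, no amount of model size saves you.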
Maybe I'm not sick of LLMs. Maybe it's just that I am sick of people talking about "AI" as if it's some magical new invention. AI is not some sentient thing; it's a vague way to refer to complex applied math. Or maybe I am just a jaded data scientist sick of hearing people who couldn't tell you how to multiply two vectors or explain what a p-value is talk about AI. It's probably the latter.