Researchers make language models scalable self-learners
The scientists used a natural language-based logical inference dataset to create smaller language models that outperformed much larger counterparts.

Socrates once said: "It is not the size of a thing, but the quality that truly matters. For it is in the nature of substance, not its volume, that true value is found." Does size always matter for large language models (LLMs)?

In a technological landscape where LLMs take center stage, a team of MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) researchers thinks smaller models shouldn't be overlooked, especially for natural language understanding products widely deployed in industry.

To that end, the researchers devised an approach to the long-standing problems of inefficiency and privacy associated with big, text-based AI models: a logic-aware model that outperforms 500-times-bigger counterparts on some language understanding tasks, without human-generated annotations, while preserving privacy and robustness.

LLMs, which have shown promising skills in generating language, art, and code, are computationally expensive, and their data requirements can risk privacy leaks when application programming interfaces are used for data upload. Smaller models have historically been less capable than their larger counterparts, particularly in multitasking and weakly supervised tasks.

So what's helping these smaller models act so mighty? Something called "textual entailment," a way to help these models understand a variety of language tasks: if one sentence (the premise) is true, then a related sentence (the hypothesis) is likely to be true as well.
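To make the entailment idea concrete, here is a minimal sketch (not the CSAIL team's code) that scores a premise and a hypothesis with an off-the-shelf natural language inference model from the Hugging Face Transformers library; the model name and the example sentences are illustrative assumptions, not details from the research.

```python
# Minimal sketch: scoring textual entailment with a pretrained NLI model.
# Assumptions: the "roberta-large-mnli" checkpoint is used purely for illustration;
# the MIT work describes its own entailment-based models, not this one.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "A soccer game with multiple males playing."      # example premise (assumed)
hypothesis = "Some men are playing a sport."                 # example hypothesis (assumed)

# Encode the sentence pair and get the model's label probabilities.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1)[0]

# Label order for this checkpoint: contradiction, neutral, entailment.
for label, p in zip(["contradiction", "neutral", "entailment"], probs):
    print(f"{label}: {p:.3f}")
```

In this sketch, a high "entailment" probability means the model judges that the hypothesis follows from the premise, which is the core capability the researchers exploit to handle many language understanding tasks with a single, smaller model.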