Alberta Machine Intelligence Institute

ACL 2023: Key Insights & Contributions from Amii

Published Oct 30, 2023

The Association for Computational Linguistics (ACL) conference took place in early July 2023 in Toronto. It’s the largest computational linguistics conference of the year, drawing more than 3,000 attendees.

This was the first ACL conference since ChatGPT was released in December 2022. Discussion of Large Language Models (LLMs) buzzed in the hallways and poster sessions, partially in response to this year’s keynotes, which gave two very different views of ChatGPT’s promise.

Concerns over the “immortal language model”

The first keynote was delivered by Geoffrey Hinton, a Turing Award winner who is often referred to as one of the “Godfathers of AI”. Recently, Geoff raised the alarm over the rapid progress of artificial intelligence. He warned that current language models are powerful enough that we ought to be concerned about their usage and availability to the general public. He felt so strongly about this danger that he left his position at Google, thinking he would be freer to voice his concerns without a Google affiliation.

So, it was no surprise that Geoff spoke about the remarkable power of LLMs. He described the “immortal language model,” noting that the power of a language model lives in software that is separable from hardware, and that the software can continue to exist even if the hardware fails. Humans don’t have that same separation; we can’t efficiently save our knowledge or transfer it to the next generation. We have institutions devoted to the transfer of knowledge, but that transfer is time-consuming and lossy at best. Compare that to the transfer of knowledge from one AI to another: it’s fast, efficient, and can be made completely lossless. This is one of the great dangers Hinton pointed to.
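To make the “immortal” part concrete: a model’s knowledge is just its parameters, which can be copied exactly. The PyTorch sketch below is a minimal illustration under that framing (the tiny model and the filename are hypothetical stand-ins): a fresh instance, on entirely different hardware, restores the learned behaviour exactly.

```python
import torch
import torch.nn as nn

# A stand-in for a "language model": any nn.Module works the same way.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))

# The model's knowledge (its weights) is saved as pure data,
# independent of the hardware it was trained on.
torch.save(model.state_dict(), "weights.pt")

# A brand-new instance -- on any machine, at any time -- restores
# that knowledge with no loss.
clone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
clone.load_state_dict(torch.load("weights.pt"))

x = torch.randn(1, 16)
assert torch.equal(model(x), clone(x))  # identical outputs, bit for bit
```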

LLMs as libraries

The final keynote featured renowned cognitive scientist Alison Gopnik. Gopnik is known for her work on child development, and has recently applied her experimental methods to study the capabilities of LLMs.

In her keynote, Gopnik spoke broadly about the history of knowledge transfer. As a species, she says, we have evolved to live well beyond our childbearing years. Why? Gopnik hypothesizes that it’s to bolster knowledge transfer. The transfer of knowledge has been core to human success: from libraries to the printing press to the internet.

Gopnik described the “post-menopausal grandmother” as the quintessential knowledge source: the original library. Like a library, a post-menopausal grandmother is full of knowledge. But, unlike a post-menopausal grandmother, a physical library doesn’t know anything. A library can’t actually leverage that knowledge to generate new ideas. To Gopnik, LLMs are just knowledge sources. They only store what we’ve already created, never creating anything new.

Gopnik noted that two cognitive capacities enable cultural evolution: the capacity to imitate and the capacity to innovate. Imitation alone doesn’t support progress; without innovation, an imitating agent is useless. Gopnik claims ChatGPT can only imitate, and presents evidence that it’s not as good at exploration as humans. Though reasonably good at causal reasoning, ChatGPT is bad at causal discovery.

But what is innovation? Often, innovating means combining existing ideas in novel ways; that is, innovation is just new combinations of imitation. The probabilistic nature of ChatGPT makes this kind of innovation entirely plausible.
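A toy sketch of why that’s plausible: language models sample from a probability distribution over next words rather than replaying memorized text, so unlikely-but-coherent combinations can surface. The miniature bigram model below is a hypothetical illustration of the mechanism, not how ChatGPT actually works.

```python
import random

# A hypothetical miniature "language model": bigram probabilities
# over a toy vocabulary.
bigrams = {
    "neural": {"network": 0.9, "grandmother": 0.1},
    "immortal": {"language": 0.8, "library": 0.2},
}

def sample_next(word, temperature=1.0):
    """Sample the next word. Higher temperature flattens the
    distribution, making rare combinations more likely."""
    options = list(bigrams[word])
    weights = [p ** (1.0 / temperature) for p in bigrams[word].values()]
    return random.choices(options, weights=weights)[0]

random.seed(0)
# Sampling can surface a low-probability pairing like
# "neural grandmother": imitation recombined into something new.
print("neural", sample_next("neural", temperature=2.0))
```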

When is imitation an innovation?

One of my favourite paper presentations at ACL this year was Kevin Du et al.’s new method for interpretability analysis, which characterizes the paths of importance within a neural network (NN). Their method combines two old ideas: SGD and entropy. The result is something surprising and useful! But it is also just the combination of two old ideas.
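Their actual method is considerably more sophisticated, but here is a hypothetical toy in the same spirit: use gradient information to score which inputs a network relies on, then use entropy to summarize how concentrated that reliance is. Every name below is my own invention for illustration.

```python
import torch
import torch.nn as nn

# A toy network standing in for the model under analysis.
net = nn.Sequential(nn.Linear(8, 8), nn.Tanh(), nn.Linear(8, 1))

x = torch.randn(1, 8, requires_grad=True)
net(x).sum().backward()

# Gradient magnitudes as a crude per-input importance score,
# normalized into a probability distribution over inputs.
importance = x.grad.abs().squeeze()
p = importance / importance.sum()

# Shannon entropy of that distribution: low entropy means the network
# leans on a few inputs; high entropy means importance is diffuse.
entropy = -(p * (p + 1e-12).log()).sum()
print(f"importance entropy: {entropy.item():.3f}")
```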

What about Geoff Hinton, godfather of AI? He is famous, in part, for backprop. Backprop is powerful because it combines the chain rule with caching, which accelerates the training of NNs. And what are NNs but a computer implementation of what we know about neurons? To be clear, these were major, transformative innovations. But they are, at their core, strongly rooted in previous knowledge. My point is not that anyone could have had Geoff Hinton’s or Kevin Du’s insights. Rather, I’m suggesting that most innovation comes from imitation. So why can’t ChatGPT innovate through imitation?
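For readers who haven’t seen that combination spelled out, here is a hand-rolled sketch for the single function y = sin(x²): the forward pass caches its intermediate value, and the backward pass reuses it when applying the chain rule instead of recomputing anything. Backprop does this same bookkeeping across an entire network.

```python
import math

def forward(x):
    # Forward pass for y = sin(x^2), caching the intermediate
    # value u = x^2 for reuse in the backward pass.
    u = x * x
    return math.sin(u), (x, u)

def backward(cache):
    # Chain rule: dy/dx = cos(u) * du/dx = cos(x^2) * 2x.
    # The cached x and u are reused rather than recomputed --
    # the "caching" half of what makes backprop fast.
    x, u = cache
    return math.cos(u) * 2 * x

y, cache = forward(1.5)
print(backward(cache))  # cos(2.25) * 3.0
```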

Both keynotes recognized LLMs as powerful tools for knowledge transfer. While Hinton gave cautionary praise to the capabilities of LLMs, Gopnik reassured us that LLMs are just the newest iteration of a library: a fancy knowledge source incapable of innovation. I think both views are a little right. A lot of what LLMs do is regurgitate. But occasionally, they regurgitate combinations that are interesting and innovative. And, crucially, recognizing that innovation still requires input from a person.

Many critiqued Hinton’s keynote for relying mostly on anecdotal evidence to support his claims about LLM capabilities. But the skill of LLMs remains largely in the eye of the beholder. LLMs currently don’t have a good sense of when their generated sentences represent good ideas versus mediocre ones. Maybe our next breakthrough will be to teach LLMs to recognize the power of their own innovations. It takes a different kind of skill to know a good idea when you see it.
