$10M gift from Charles and Lisa Simonyi establishes AI@UW to advance artificial intelligence and emerging technologies
Tue, 18 Nov 2025
The University of Washington announced a foundational $10 million gift from philanthropists Charles and Lisa Simonyi to support work in artificial intelligence and emerging technologies. Photo: University of Washington

The University of Washington today announced a foundational $10 million gift from philanthropists Charles and Lisa Simonyi to support groundbreaking work in artificial intelligence and emerging technologies.

The gift will establish a new initiative, AI@UW, to support the UW's global leadership in advancing AI, machine learning and related areas of computing. Noah A. Smith, currently the Amazon Professor of Machine Learning in the Paul G. Allen School of Computer Science & Engineering, will become the vice provost for artificial intelligence and the inaugural Charles and Lisa Simonyi Endowed Chair for Artificial Intelligence and Emerging Technologies. The chair appointment is pending Board of Regents approval.

"With this generous gift from Charles and Lisa Simonyi, we will further position the UW as a model for how universities can responsibly and creatively adapt to the age of AI across education, research, administration and governance," UW Provost Tricia Serio said. "By leading the AI@UW initiative, Vice Provost Noah Smith will guide our efforts to accelerate innovation and collaboration, illuminate achievements, propagate effective practices throughout the UW community and beyond, and ensure that our graduates are prepared for the workforce of today and tomorrow."

Noah A. Smith will become the vice provost for artificial intelligence and the inaugural Charles and Lisa Simonyi Endowed Chair for Artificial Intelligence and Emerging Technologies. Photo: University of Washington

UW researchers and faculty already are globally recognized for cultivating a deep understanding of the science and potential of these rapidly developing technologies. Work at the UW is creating practical and responsible applications for AI that span the academic enterprise, contribute to industry and uplift society.

Charles and Lisa Simonyi have a long history of supporting the UW. Lisa Simonyi is the chair of the UW Foundation Board, and Charles Simonyi is a technical fellow at Microsoft, where he was a pioneer in developing software applications.

"The future of computing, research and innovations is deeply connected to the next era in artificial intelligence and machine learning," Lisa and Charles Simonyi said. "We believe in the UW's ability to engage students and faculty toward discoveries that will transform the university, the region and, indeed, the world. We are pleased to lend our support to advancing this exciting, interdisciplinary field."

The Charles and Lisa Simonyi gift also will support the creation of an AI governance committee, student scholarships, community engagement and investments in computing resources and equipment.

"This extraordinary gift from the Simonyis demonstrates their vision and deep trust in the UW's role as a global leader in innovation," UW President Robert J. Jones said. "It is a foundational investment that will help ensure artificial intelligence is developed and applied responsibly, serving humanity and advancing knowledge in ways that reflect our shared values."


In the near term, the vice provost for artificial intelligence will establish a SEED-AI grant program to fund projects, led by UW faculty, that elevate the use of AI in UW educational activities. SEED-AI grants will support innovative, exploratory projects aiming to discover how AI can enhance learning and teaching across disciplines, enlighten the UW community, and inspire future developments of AI in the educational context.

Thanks to the Simonyi gift, Smith said, the UW will model how universities can responsibly and creatively adapt to the age of AI across education, research, administration and governance.

"The UW's people are already leading the way in shaping universities in the time of AI," Smith said. "While its rapid rise has been surprising, as an AI researcher and teacher I'm energized by the chance to promote AI literacy, explore how AI can enrich learning across disciplines and help steer AI's development in ways that are most useful to the University's mission."

Contact Smith at nasmith@cs.washington.edu.

Q&A: UW researchers answer common questions about language models like ChatGPT
Tue, 09 Jan 2024
A team of UW researchers has published a guide explaining language models, the technology that underlies chatbots.

Language models have, somewhat surreptitiously, dominated news for the last year. Often called "artificial intelligence," these systems underlie chatbots like ChatGPT and Google Bard.

But a team of researchers at the University of Washington noticed that, even amid a year of AI commotion, many people struggle to find accurate, comprehensible information on what language models are and how they work. News articles frequently focus on the latest advances or corporate controversies, while research papers are too technical and granular for the public. So the team recently published a paper explaining language models in lay terms.

For answers to some common questions, UW News spoke with lead author Sofia Serrano, a UW doctoral student in the Paul G. Allen School of Computer Science & Engineering; co-author Zander Brumbaugh, a master's student in the Allen School; and senior author Noah Smith, a professor in the Allen School.

Briefly, what are language models and how do they work?

Sofia Serrano: A language model is essentially a next-word predictor. It looks at a lot of text and notices which words tend to follow which sequences of other words. Typically, when we talk about a language model today, we mean a large machine learning model, which contains a lot of different numbers called parameters. Those numbers are tweaked with each new bit of textual data that the model is trained on. The result is a giant mathematical function that overall is pretty good at predicting which words come next, given the words that have been supplied in a prompt, or that the model has produced so far. It turns out that these large models also pick up things about the structure of language and things that fall under the umbrella of common sense or world knowledge.

Common terms:

  • Language Model (LM): An algorithm trained on large amounts of text to predict which words generally follow which sequences of other words.
  • Artificial Intelligence (AI): A broad term for several research areas focused on improving machines' ability to process information in ways that seem to mimic human intelligence.
  • Natural Language Processing (NLP): An area of computer science focused on processing and generating language.
  • Machine Learning (ML): An area of computer science focused on training algorithms to solve problems from data.
  • Parameter (in a language model): A value in a language model's mathematical function that can be adjusted as the model is trained. Current large language models can contain more than a trillion parameters.
  • Prompt: A user's text input to a language model.
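Serrano's description of a next-word predictor can be sketched with a toy bigram model. This is an illustration only: real language models are neural networks with billions of learned parameters, not simple word-pair counts, and the tiny corpus below is invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count which word follows which. In this toy model, the
    counts play the role of 'parameters': they are adjusted as
    each new bit of training text is seen."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model, word):
    """Return the most frequent next word given the previous one."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once -> cat
```

A real model conditions on the whole preceding sequence rather than one word, which is why it needs so many parameters.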

In the paper you bring up this idea of the "black box," which refers to the difficulty of knowing what's going on inside this giant function. What, specifically, do researchers still not understand?

Noah Smith: We understand the mechanical level very well: the equations that are being calculated when you push inputs through and make a prediction. We also have some understanding at the level of behavior, because people are doing all kinds of scientific studies on language models, as if they were lab subjects.

In my view, the level we have almost no understanding of is the mechanisms above the number crunching that are kind of in the middle. Are there abstractions that are being captured by the functions? Is there a way to slice through those intermediate calculations and say, "Oh, it understands concepts, or it understands syntax"?

It's not like looking under the hood of your car. Somebody who understands cars can explain to you what each piece does and why it's there. But the tools we have for inspecting what's going on inside a language model's predictions are not great. These days they have anywhere from a billion to maybe even a trillion parameters. That's more numbers than anybody can look at. Even in smaller models, the numbers don't have any individual meaning. They work together to take that previous sequence of words and turn it into a prediction about the next word.

Why do you distinguish between AI and language models?

SS: "AI" is an umbrella term that can refer to a lot of different research communities that revolve around making computers "learn" in some way. But it can also refer to systems or models that are developed using these "learning" techniques. When we say "language model," we're being more specific about a particular concept that falls under the umbrella of AI.

NS: The term "AI" brings with it a lot of preconceived ideas. I think that's part of why it's used in marketing so much. The term "language model" has a precise technical definition. We can be clear about exactly what a language model is and is not, and it isn't going to bring up all these preconceptions and feelings.

SS: Even within natural language processing research communities, people talk about language models "thinking" or "reasoning." In some respects that language makes sense as shorthand. But when we use the term "thinking," we mostly know how that works for humans. Yet when we apply that terminology to language models, it can create this perception that a similar process is happening.

Again, a language model is a bunch of numbers in a learned mathematical function. It's fair game to say that those numbers are capable of recovering or surfacing information that the model has seen before, or finding connections between input text. But often there's a tendency to go further and make assumptions about any kind of reasoning the models might possess. We haven't really seen this level of fluency decoupled from other aspects of what we consider intelligence. So it's really easy for us to mistake fluency for all of the other things that we typically roll into the term "intelligence."

Could you give an example of how that fluency translates to things that would be perceived as intelligent?

Zander Brumbaugh: I think determining what a display of intelligence is can be quite difficult. For example, someone might ask a model, "I'm struggling and feeling down; what should I do?" The model may offer seemingly reasoned advice. Someone with limited experience with language models might perceive that as intelligence, instead of next-word prediction.

NS: If you tell a model, "I'm having a bad day," and its response sounds like a therapist, it has likely read a bunch of articles online that coach people on empathy, so it can be very fluent when it's latching on to the right context. But if it starts feeding on your sadness and telling you you're awful, it's probably latching on to some other source of text. It can reproduce the various qualities of human intelligence and behavior that we see online. So if a model behaves in a way that seems intelligent, you should first ask, "What did it see in the training data that looks like this conversation?"

What makes compiling a good data set to train a language model difficult in some instances?

ZB: Today's models are trained on roughly the entire public internet. It takes enormous amounts of resources to gather that data. In language modeling, essentially, what you put in is what you're going to get out. So people are researching how to best collect data, filter it and make sure that you're not putting in something that's toxic or harmful or just at its lowest quality. Those all present separate challenges.
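Brumbaugh's point about filtering can be illustrated with a few toy heuristics. The thresholds and the blocklist term below are invented for this sketch; real data pipelines also use deduplication, language identification and learned quality classifiers.

```python
def keep_document(doc, min_words=20, blocklist=frozenset({"spamword"})):
    """Decide whether a scraped document goes into a training set,
    using toy heuristics: a minimum length, a blocklist of unwanted
    terms, and a rough mostly-alphabetic quality check."""
    words = doc.split()
    if len(words) < min_words:
        return False                      # too short to be useful
    if any(w.lower().strip(".,!?") in blocklist for w in words):
        return False                      # contains a blocked term
    alpha_frac = sum(w.isalpha() for w in words) / len(words)
    return alpha_frac > 0.7               # mostly words, not markup or noise

clean = ("language models are trained on large corpora of text gathered "
         "from many public sources on the internet every single day")
print(keep_document(clean))              # True
print(keep_document(clean + " spamword"))  # False
```

Each rule here is cheap to run at internet scale, which is why production filters tend to be stacks of many such heuristics rather than one perfect check.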

Why is it vital to have testing data that’s not in the original training data set?

NS: I call this the cardinal rule of machine learning. When you're evaluating a model, you want to make sure that you're measuring how well it does on something it hasn't seen before. In the paper, we compare this to a student who somehow gets a copy of the final exam answer key. It doesn't matter whether they looked at it. Their exam is just not useful in judging whether they learned anything. It's the same with language models. If the test examples were in the training data, then it could have just memorized what it saw. There's a large contingent of researchers who see these models as doing a lot of memorization: maybe not perfect memorization, but fuzzy memorization. Sometimes the word "contamination" gets used. If the training data was contaminated with the test, it doesn't mean the language model is stupid or smart or anything. It just means we can't conclude anything.
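Smith's "cardinal rule" can be made concrete with a crude contamination check: flag a test example when most of its word n-grams also appear verbatim in the training text. The function, the n-gram length and the threshold here are hypothetical illustrations; real contamination audits on internet-scale corpora are much more sophisticated.

```python
def word_ngrams(text, n=8):
    """All n-word sequences in a text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(train_text, test_example, n=8, threshold=0.5):
    """Flag a test example when most of its n-grams also appear
    verbatim in the training data, a crude proxy for the kind of
    'fuzzy memorization' that invalidates an evaluation."""
    test_grams = word_ngrams(test_example, n)
    if not test_grams:
        return False  # example too short to judge
    overlap = len(test_grams & word_ngrams(train_text, n)) / len(test_grams)
    return overlap >= threshold

train = "the quick brown fox jumps over the lazy dog again and again"
print(is_contaminated(train, "quick brown fox jumps over the lazy dog"))   # True
print(is_contaminated(train, "a totally different sentence about evaluating language models fairly"))  # False
```

Exact n-gram overlap only catches verbatim leakage; paraphrased or lightly edited test items slip past it, which is one reason contamination remains hard to rule out.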

What's important for the public to understand about language models right now?

ZB: We need to keep separating language models from notions of intelligence. These models are imperfect. They can sound very fluent, but they're prone to hallucinations, which is when they generate erroneous or fictional information. I know people who are using language models for something relatively important, such as looking up information. But they give a fuzzy representation of what they've learned. They're not databases or Google search.

NS: If you look at great technological achievements, like the airplane or the internet, most resulted from having a clear goal. We wanted to move people through the air, or send information between computers. But just a few years ago, language models were largely research artifacts. A few were being used in some systems, such as Google Translate. But I don't think researchers had a clear sense of solving a problem by creating a product. I think we were more saying, "Let's see what happens if we scale this up." Then, serendipitously, this fluency yielded these other results. But the research wasn't done with a target in mind, and even now nobody quite knows what that target is. And that's kind of exciting because some of us would like to see these models made more open because we think there is a lot of potential. But big tech companies have no reason to make a tool that works really well for Sofia or me or you. So the models have to be democratized.

What are some basic steps toward that democratization?

NS: Some organizations are building language models that are open, where the parameters, code and data are shared. I work part-time for one of those organizations, and there are others. Meta has put out models, without the data, but that's still better than nothing. A company called EleutherAI puts out open models. These models are still often quite expensive to run. So I think we need more investment in research that makes them more efficient, that lets us take a big model and make it cheap enough to run on a laptop.
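Smith's point about cost can be made concrete with back-of-the-envelope memory arithmetic for the weights alone (ignoring activations and runtime overhead). The 7-billion-parameter size is just an illustrative example, not a model mentioned in the interview.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate gigabytes needed just to hold a model's weights."""
    return n_params * bits_per_param / 8 / 1e9

# A hypothetical 7-billion-parameter model:
print(weight_memory_gb(7e9, 16))  # 14.0 GB in 16-bit floats
print(weight_memory_gb(7e9, 4))   # 3.5 GB quantized to 4 bits per weight
```

This is why efficiency research such as quantization matters: shrinking each weight from 16 bits to 4 brings a model from server territory into the memory range of an ordinary laptop.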

For more information, contact Serrano at sofias6@cs.washington.edu, Brumbaugh at brumbzan@cs.washington.edu and Smith at nasmith@cs.washington.edu.
