Navigating AI can be overwhelming. We’ve put together this short introductory guide to help you make confident and intentional choices about where and how to engage with AI in your classroom.
We will get you up to speed by outlining how generative AI models like ChatGPT work, summarizing debates around AI usage in the classroom, and providing you with some resources for further exploration.
1. What is AI?
Artificial intelligence (AI) refers to an evolving set of technologies, dating back to the 1950s. Public perceptions of, and fears about, AI and robots have long been stoked and perpetuated by science fiction tropes about man versus machine. Today, AI encompasses a broad set of technologies that rely on large amounts of data to make predictions or decisions. Over the past twenty years, as the ability to produce and store vast amounts of data has increased dramatically, so have the possibilities of building technologies that incorporate AI, like more precise GPS navigation, email spam filters, and search engines.
One reason Artificial Intelligence tends to be a confusing concept is that AI is an umbrella term, like the term “transportation,” which has meant different things over time (bicycle or spaceship!), and certainly means different things in different contexts (rickshaw or jet!). Given the vagueness of the term AI, continuous changes to the technologies, and misrepresentation by the media, there is a lot of uncertainty about what AI is and is not. Various industries and applications use AI, including customer service, fraud detection, medical diagnostics, and media recommendation systems.
Over the past year, a subset of AI known as “generative AI” has captured the public imagination and caused understandable concern among some educators. Generative AI technologies include conversational chatbots (such as ChatGPT and Bard) and image generation tools (such as DALL-E and Midjourney). Chatbots are based on a technology called large language models. For example, ChatGPT is a chatbot by OpenAI based on the large language models GPT-3, GPT-3.5, and GPT-4, and Bard is a chatbot by Google based on the large language model PaLM. Other generative AI tools can be used to generate code, music, and video.
2. The Basics of Generative AI: Large Language Models
“Generative AI” refers to tools that generate new content (typically text or images), and “large language models” are a specific type of generative AI model. One way to think about large language models is to picture them as an extremely powerful form of autocomplete. A simple autocomplete takes the last word you typed, refers to a table to find the most likely words that could follow it, and then suggests some options. In a tool like this, the table might have been generated by analyzing a large body of text, counting how many times each word follows any other word, and calculating probabilities. Similarly, large language models like GPT-3.5 and GPT-4, which power ChatGPT, analyze the input text and then predict the next likely word based on the words that have come so far, and add that word to the string. This continues, word by word, until a complete response is generated.
However, there are some important ways that large language models differ from a simple autocomplete:
- Size of the training dataset. The models underlying these tools generate their next-word probabilities based on patterns found in billions of words within a collection of preselected texts known as the training data.
- Complexity of the next-word determination. Rather than simply referencing a probability table, large language models like OpenAI’s GPT-3.5 and GPT-4 first perform billions of calculations using parameters that were determined from the training data in order to transform the initial prompt into a prediction about what word might come next. These are often referred to as “neural networks” (see the Glossary below for a definition of this term). They also analyze the semantic structure of the sentences, which factors into their calculations.
- Reliance on humans for training. Without proper guardrails in place, large language models have a tendency to generate toxic content. In order for these tools to provide polite and safe responses, they go through a process called “reinforcement learning from human feedback.” This process requires human workers to manually review AI-generated content and provide feedback, which is then used to further refine the model. Notably, many of these multibillion dollar US-based tech companies use international contract workers, who are exposed to traumatizing content with low financial compensation and minimal mental health support.
(See Time’s article “OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic.”)
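The "neural networks" mentioned above can sound mysterious, but at their core they are layers of weighted sums passed through simple nonlinear functions. The following minimal sketch shows a forward pass through a tiny two-layer network; the weights here are made up for illustration, whereas a real large language model learns billions of such parameters from its training data.

```python
import math

def layer(inputs, weights, biases):
    """One layer: a weighted sum of the inputs per node, then a nonlinearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Made-up parameters for a tiny network: 2 inputs -> 3 hidden nodes -> 1 output.
hidden_w = [[0.5, -0.2], [0.8, 0.1], [-0.3, 0.9]]
hidden_b = [0.1, 0.0, -0.1]
output_w = [[1.0, -0.5, 0.7]]
output_b = [0.2]

def network(inputs):
    """Pass the inputs through both layers in turn."""
    return layer(layer(inputs, hidden_w, hidden_b), output_w, output_b)

print(network([1.0, 0.5]))  # a single number between -1 and 1
```

"Training" a network means iteratively nudging those weight and bias values so that the outputs better match the training data; the code above shows only the prediction step, not the training loop.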
Large language models can be deceptive because they appear to understand, think, and reason. They do none of these things. They are designed to mimic humans, but they are not alive, do not have subjective experiences (despite their use of the term “I”), and cannot think or feel.
For an in-depth, but still accessible, exploration of how these tools work, check out the primer “Large language models, explained with a minimum of math and jargon.”
Image generation tools such as DALL-E, Midjourney, and Stable Diffusion work in a similar way to large language models under the hood. There is, however, an additional level of translation from text to image for these tools. The implications of image generation tools are far-reaching, including questions of copyright, ownership, and the human labor used to refine the models. For more information on generative image tools and their implications, we recommend Eryk Salvaggio’s Critical Topics: AI Images, which contains recorded lectures and a host of useful links.
There are also generative AI tools for creating videos, writing code, making music, and more.
The following chart is a summary of key capabilities, limitations, and concerns around ChatGPT and other conversational chatbots from the National Centre for AI:
Capabilities:
- It can write plausible-sounding text on any topic.
- It can generate answers to a range of questions, including coding, math-type problems, and multiple choice.
- It is getting increasingly accurate and sophisticated with each release.
- It generates unique text each time you use it.
- It’s great at other tasks like text summarization.

Limitations:
- It can generate plausible but incorrect information.
- ChatGPT’s large language model (GPT-3.5) is only trained on information up until September 2021 (though those with a paid ChatGPT plan have access to a version that can access the Internet).
- It has limited ability to explain the sources of information for its responses (this varies between large language models).

Concerns:
- It can and does produce biased output (culturally, politically, etc.).
- It can generate unacceptable output.
- It has a high environmental impact, and raises concerns around human labor and the ownership of training material.
- It presents security and privacy concerns around the way users’ data is used to train the models.
- There is a danger of digital inequity.
Citation: Michael Webb, “2.3. A summary of key capabilities, limitations, and concerns around ChatGPT and other Large Language Models,” A Generative AI Primer, National Centre for AI, last modified May 22, 2023, https://nationalcentreforai.jiscinvolve.org/wp/2023/05/11/generative-ai-primer/.
Small adjustments were made to the text for clarity.
3. Incorporating AI into your teaching
AI tools, like other technologies, are not wholesale good or evil; they can be useful for teaching and learning when used responsibly and with clear intention.
Given the availability of generative AI tools, students will be curious about how they might support their learning, and educators should be asking the same questions. The following list represents our recommendations for educators at the postsecondary level (university and beyond):
- Learn the AI basics. If you feel overwhelmed or confused by AI, familiarize yourself with the content on this site, and review some of the recommended readings. Even respectable publications can misrepresent the power or promise of AI. See here for a list of common misrepresentations of AI.
- Know the affordances of AI tools. Consider what AI technologies, particularly large language models (LLMs), do well, what they do poorly, and where they might provide new insights or assistance within your particular field. (See the table above for more on the capabilities, limitations, and concerns associated with AI chatbots.) Can you modify an existing assignment of yours to leverage what a particular tool does well, or create a teaching moment around what can be learned about the tech itself from what it does poorly?
- Create a classroom AI policy. Create an AI classroom policy for each of your courses, and discuss your policy with your students on the first day of class. We recommend metaLAB’s AI Code of Conduct, which you can further customize. You may even consider creating the policy in collaboration with your students. Incorporate their feedback, concerns, and requests for additional clarity.
Policy implementation suggestions:
- If you are not allowing any AI tool use for any part of an assignment, including brainstorming and ideation, make sure that you communicate this explicitly to your students, and seek feedback at the end of the term on how that policy served them and their learning. Consider revisiting the policy in the future.
- If you are allowing some AI tool use, make sure to be explicit about when students can use the tool in their work process, what tools are permitted, and how they should cite or annotate their AI tool use. Seek feedback at the end of the term on this policy, and consider revising your policy in the future.
- Use a critical approach. When giving an assignment that uses AI, always include a critical component. For example, ask students to reflect on why the tool performed in a certain way, whether it did well or poorly, and how it affected their own thinking. Beyond their current uses, what are the implications of the tool’s performance with respect to questions of equity, democracy, education, or information quality?
- Lead with trust. Be clear that any violation of your policy constitutes academic dishonesty, but lead with trust. This moment can be an opportunity to discuss what motivates your students: why are they in college or in your particular course, what do they want to learn, what do they perceive the learning outcomes to be, how do they learn best, and what impediments could limit their abilities to do so?
- Recognize the opportunity. Recognize and seize the opportunity to educate your students and to learn from them. What are some short- and long-term risks and benefits of AI (see Kapoor & Narayanan, 2023)? How do students feel about the environmental impact, risk of overreliance, media manipulation, etc.? What does the future of education look like? What does AI bring to bear on your particular field of study? How might we encourage students to learn from each other?
- Approach with humility and openness. Even AI experts don’t agree about what the future holds. Approach conversations about AI in education with humility and a healthy skepticism. Encourage your students to read and explore the various schools of thought in the field of AI. Above all, strive to distinguish fact from hype in your journey to critical AI literacy.
4. Glossary

Algorithm
A sequence of instructions for solving a problem or performing a task. Algorithms define how an artificial intelligence system processes input data to recognize patterns, make decisions, and generate outputs.

Artificial Intelligence (AI)
Computer systems designed to perform tasks associated with human intelligence, such as pattern recognition or decision making.

Chatbot
A program that communicates with humans through text in a written interface, built on top of a large language model. Examples include ChatGPT by OpenAI, Bard by Google, and more. While many people refer to chatbots and LLMs interchangeably, technically the chatbot is the user interface built on top of an LLM.

Foundation Model
A type of model designed to be a general-purpose “foundation” for a wide range of applications. Foundation models can be adapted (or “fine-tuned”) for domain- or task-specific purposes. In contrast, “narrow” models are limited to specific tasks.

Generative Artificial Intelligence (GAI)
A subfield of artificial intelligence, referring to models capable of generating content (such as language, images, or music). The output of GAI models is based on patterns learned from extensive training datasets.

Hallucination
In the context of AI, a falsehood presented as truth by a large language model. For example, the model may confidently fabricate details about an event, provide incorrect dates, create false citations, or dispense incorrect medical advice.

Large Language Model (LLM)
A type of generative AI model that works specifically with written language (both natural language and code). The models are trained on massive corpora of text taken from the Internet. Examples include GPT-3.5 and GPT-4 by OpenAI (which power ChatGPT), Claude by Anthropic, PaLM by Google, LLaMA by Meta, and more.

Machine Learning
A field of computer science in which a system learns patterns or trends from underlying data. Machine learning algorithms perform tasks like prediction or decision making.

Neural Network
A type of computational model (named for design elements that loosely resemble elements of the human brain, including interconnected nodes and layers) that can be trained to recognize patterns and make predictions. During training, the connections between different layers of the neural network are iteratively strengthened or weakened based on input “training data” and other model design parameters.

Prompt
In the context of AI, the input text written by a human that is given to a generative AI model. The prompt often describes what you are looking for, but may also give specific instructions about style, tone, or format.

Reinforcement Learning from Human Feedback (RLHF)
A technique that trains a model directly from human feedback. RLHF is often used in tasks where it’s difficult to define a clear, algorithmic solution but where humans can easily judge the quality of the model’s output. With generative AI models, RLHF is one method used to identify and filter out problematic content like violence and hate speech.

Training Data
The content used to teach a machine learning system how to perform a particular task. Training data gives the system a knowledge base from which the model can make predictions or identify patterns. Training data might include images, text, code, or other types of media. It can be structured or unstructured, depending on the type of training process being used.