Faculty Guide to Generative AI (ChatGPT)

ChatGPT, in Its Own (?) Words

ChatGPT is one of many players in the natural-language AI field, but it is the best known, thanks to the attention generated by its November 2022 beta rollout. When asked to define itself, ChatGPT replied,

ChatGPT is a large language model developed by OpenAI that uses a type of artificial intelligence called deep learning to generate human-like text in response to natural language inputs. (https://chat.openai.com/chat, 2/17/2023)

In other words, it is a complex piece of software that can be asked questions in conversational language and that produces replies reading as if a human being had written them. It achieves this by ingesting and statistically analyzing enormous amounts of existing text (hence the "large language" in "large language model"), which enables it to mimic sentence structure and argumentation.
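
For readers who want to see what this exchange looks like in practice, the sketch below shows a programmatic version of the same question-and-answer interaction. It assumes OpenAI's official Python package (openai, version 1 or later) and an API key; the model name is only an example.

    # Minimal sketch of a chat exchange with an OpenAI model.
    # Assumes: pip install openai (v1+) and an OPENAI_API_KEY
    # environment variable. The model name below is an example.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[
            {"role": "user", "content": "Define ChatGPT in one sentence."},
        ],
    )

    print(response.choices[0].message.content)  # the model's reply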

What Can Generative AI Generate?

Generative AI can produce a wide variety of content.

  • Text, including essays, stories, poems, blog posts, screenplays, instruction manuals, product blurbs, and more
  • Images, from text prompts, existing images, or a combination
  • Video
  • Audio
  • Code

One of the biggest factors in how good an AI's output will be is the quality of the prompt. Crafting effective prompts (often called "prompt engineering") is a skill in itself, and one that will become ever more valuable as AI becomes more integrated into everyday life.
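
As a quick illustration of how much the prompt matters (both prompts here are invented examples), compare:

    Vague:    Write about the French Revolution.
    Specific: Write a 300-word summary of the causes of the French
              Revolution for first-year undergraduates, naming at
              least one primary source.

The second prompt specifies length, audience, and content, and will generally produce far more usable output than the first.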

Issues and Problems

ChatGPT can produce an essay on a text, an explanation of a natural phenomenon, a review of a play, a meal plan, a vacation suggestion, or answers to a vast number of other questions. The quality of its output is highly suspect, however, and ethical questions arise from the way that AIs are trained and behave.

A major problem with ChatGPT at this point is that the dataset that informs its underlying structure comes in large part from the internet, and therefore reflects the biases and flaws of the internet. It can produce material that is false or misleading, and it can sometimes use language that is racist, misogynistic, or otherwise offensive. ChatGPT cannot verify its training data and, in most cases, cannot look up new information; when it does not have an answer, it may simply make one up (AI researchers call this "hallucination," and critics have likened the models themselves to "stochastic parrots"). Users have also found that, when the chatbot is asked for its sources, it can generate plausible-looking but fake citations.

Additionally, ChatGPT lacks the capacity for genuine critical thinking, so when a user requests involved comparison or analysis, it is likely to produce bland, generic text without individuality.

Another issue is privacy. ChatGPT and other AIs use the questions and prompts they are given as additional training data, and they may collect other kinds of data as well (ChatGPT, notably, requires users to supply a phone number). There is little transparency about what AI companies do with the data they collect, and their privacy policies raise some questions (see The Verge, linked below).

There are related questions concerning AI and copyright. Is using copyrighted material as training data for an AI a violation of copyright law? Should the tech companies making AI tools have to request permission to use that material? Should authors and artists receive compensation? Another messy area is who—if anyone—owns the material that an AI generates. (U.S. court decisions favor the idea that there must be a human creator involved, but how involved that human must be is being decided on a case-by-case basis.)