Open Source AI: To Release or Not To Release the GPT-2 Synthetic Text Generator

In February 2019, the San Francisco-based Open AI group made a decision that sent reverberations through the AI and open source communities worldwide. First, it announced “GPT-2,” a major improvement in language models which, according to its creators, generates “coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization—all without task-specific training.” Open AI then added this:

Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT-2 along with sampling code. We are not releasing the dataset, training code, or GPT-2 model weights.

Open AI also released a technical paper. GPT-2 was trained as a large-scale unsupervised language model on 40 GB of text scraped from web pages linked in Reddit posts that received a karma score of at least 3. Given the “fake news” era, much discussion followed in the community on the potential harms to society versus the benefits to researchers. Open AI then did a staged release. In May 2019, it released an expanded dataset with a more detailed model. Finally, in November, it released the full GPT-2, arguing this:

We’ve seen no strong evidence of misuse so far. While we’ve seen some discussion around GPT-2’s potential to augment high-volume/low-yield operations like spam and phishing, we haven’t seen evidence of writing code, documentation, or instances of misuse. We think synthetic text generators have a higher chance of being misused if their outputs become more reliable and coherent. We acknowledge that we cannot be aware of all threats, and that motivated actors can replicate language models without model release.

A demo website lets you test GPT-2 live: enter any sentence and watch it generate synthetic text that continues your input. Type any sentence about an accident or an assassination. See for yourself what the machine generates. Here is one example.

User: Baltimore Mayor Catherine Pugh was sentenced to three years in prison yesterday.

GPT-2: Baltimore Mayor Catherine Pugh was sentenced to three years in prison yesterday. She was convicted of knowingly accepting illegal campaign contributions from the family of Gov. Larry Hogan. According to a Fox 5 report, "Pugh’s husband Peter Ruiz donated $1,000 to Hogan’s campaign in 2013. He also hosted a fundraiser with Hogan in October 2014, with $2,500 going to her campaign, a meal and a photo of the two of them with their grandchildren. Ruiz also testified at Pugh’s trial that Ruiz is 'very pro-Trump' and 'knows what’s up.' But by taking the funds, Pugh and Ruiz admitted that they had violated Maryland law by 'tattling' or soliciting a campaign

Example 1 (generated 2/28/2020) shows how a machine trained on web pages can generate completely fictitious and politically loaded narratives. There are more examples of dehumanizing and racist text that GPT-2 generates based on what you feed it.

On GitHub, in its section on “Out-of-scope use cases” for GPT-2, Open AI states, “Because large scale language models like GPT-2 do not distinguish fact from fiction, we don’t support use-cases that require the generated text to be true.” It also acknowledges that language models like GPT-2 “reflect the biases inherent to the systems they were trained on, so we do not recommend that they be deployed into systems that interact with humans unless the deployers first carry out a study of biases relevant to the intended use-case.”

Discussion Questions
  1. For the full release of GPT-2, who are the stakeholders involved? Who are the people and/or organizations directly or indirectly impacted by GPT-2’s release? Who benefits? What types of harms might arise?

  2. What issues and concerns come into focus in this case from applying each of the five ethical lenses: Rights, Fairness/Justice, Utilitarianism, Common Good, and Virtues?

  3. Given your discussion, how would you assess the ethics of Open AI’s decision in November to release GPT-2 in full?

To Ship or Not to Ship

Rachel works as a Quality Assurance Engineer at a large electronics company. She is responsible for the final testing of her company’s servers and is part of a team which decides when new products will be shipped to distributors for sale.

Rachel’s company has a contract with another company that makes the chips incorporated into its servers. The business model for this product is to release a new-generation server approximately every six months, meaning Rachel has a limited timeframe to conduct her Quality Control tests.

Because there is so little time between product releases, the Quality Assurance department cannot perform every possible test on the servers to ensure they are defect-free. Rachel will not ship a product if there is any possibility that the server could malfunction and cause physical harm to the customer. However, she will ship a product that has a higher likelihood of failure resulting in data loss for the customer, because she knows that if she doesn’t, her company’s competitor will.

Discussion Questions
  1. Is this an ethical way to conduct business?

  2. How should she determine when to ship a product with known defects?