Academic Honesty and Integrity

Tools You Can Use Right Now

The very first question faculty members ask me when they learn about ChatGPT is whether any tools can reveal if a text was created by an AI engine. A week ago, I released a video on GPTZero.

In just one week, this area has seen significant innovation and change. Not only has GPTZero undergone a notable streamlining and redesign, but OpenAI, the company behind ChatGPT, has released its own detection tool.

You can access them here:

  • OpenAI “Text Classifier” (LINK)
  • GPTZero (LINK)

However, these programs have serious limitations, and I want to share what I know right now (the second week of February) about how those limitations will affect how you can assess student work.

Are they easy to use? 

Mostly, yes, and they are getting easier and more streamlined every day. Compare the way GPTZero looked in the videos I did last week with how it looks now; the difference is remarkable.

Will they tell me if a student’s work was generated by Artificial Intelligence? 

It’s complicated. Neither of the most well-known programs will say definitively if writing is or is not AI-generated. They use slippery terms like “Likely.”

OpenAI’s “Text Classifier” (with its notably non-judgemental appellation) will choose a level from a rubric-like spectrum of classifications: “very unlikely, unlikely, unclear if it is, possibly, or likely AI-generated.” 

  • Both programs warn users that false positives are common.
  • Both programs warn against using these tools as the sole determinant when deciding if a student’s work is their own.

You can read what OpenAI has to say about their tool and its limitations here (LINK).  

Do these programs explain how they work? In other words, do they share how they determine if writing is AI-generated? 

No, not entirely or (in OpenAI’s version) not at all. This is the most frustrating aspect of using them right now. We simply don’t know their criteria and rationale. OpenAI said recently that its classifier correctly identified only 26% of AI-written text as “likely AI-written” and incorrectly labeled human-written text as AI-written 9% of the time. It’s worth noting they haven’t shared how they do this. In addition, the low rate of identification is frightening when you consider that they’re trying to identify text their own tool generated.
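To get a feel for what those two percentages could mean in practice, here is a rough sketch in Python. Only the 26% and 9% figures come from OpenAI's statement; the class size, the share of AI-written essays, and the function name `expected_flags` are made-up assumptions for illustration, not data from any course or tool.

```python
def expected_flags(n_essays, share_ai,
                   true_positive_rate=0.26,    # OpenAI's reported figure
                   false_positive_rate=0.09):  # OpenAI's reported figure
    """Estimate how many essays a detector would flag as AI-written.

    n_essays and share_ai are hypothetical inputs chosen by the reader.
    """
    n_ai = n_essays * share_ai
    n_human = n_essays - n_ai

    flagged_ai = n_ai * true_positive_rate         # AI-written and flagged
    flagged_human = n_human * false_positive_rate  # human-written but flagged
    total_flagged = flagged_ai + flagged_human

    # Share of all flagged essays that students actually wrote themselves
    human_share = flagged_human / total_flagged if total_flagged else 0.0

    return {
        "flagged_ai": flagged_ai,
        "flagged_human": flagged_human,
        "missed_ai": n_ai - flagged_ai,
        "human_share_of_flags": human_share,
    }


# Hypothetical class: 100 essays, 20 of them AI-written (an assumption)
result = expected_flags(100, 0.20)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these hypothetical inputs, roughly 5 of the 20 AI-written essays would be flagged, alongside roughly 7 of the 80 human-written ones, so more than half of the flagged essays would be students' own work. Different assumptions give different numbers, which is one more reason to treat a flag as a starting point rather than a verdict.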

The Academic Integrity Program recommends that you use these results only in the context of a broader evaluation process when you suspect that a student’s work may not be their own. It also recommends using these results only as the starting point for a conversation with the student. You can access more information and tips about having that kind of conversation here (LINK).

It’s important to remember that this is a rapidly developing issue. These tools may look and act remarkably different in the coming days and weeks. Keep checking back with the “AI-and-AI” hub here (LINK) at the TILT website for the most up-to-date information and, as always, feel free to contact me at 970-491-2898.


