ChatGPT texts detectable: model reliably recognizes AI plagiarism
Articles from ChatGPT are meant to read as naturally as possible, modeled on the human writing the AI was trained on. It is therefore difficult to distinguish such plagiarized text from genuine writing.
Previous attempts to automatically detect AI-generated texts have in some cases achieved success rates well under 50%. Against that backdrop, a rate of 99% sounds far more promising.
A team from the University of Kansas, which published its results on November 6, 2023 at sciencedirect.com, has developed a system that reliably flags artificially created scientific articles.
Narrow area of operation
In the test setup, texts from thirteen scientific journals, all dealing with chemistry, were compared with a total of 200 texts generated by either GPT-3.5 or GPT-4.
According to the authors, 198 of these texts were recognized as AI-generated, which corresponds to a rate of 99%. Detection was based on 20 text features, such as variation in sentence length and the typical frequency of certain words or punctuation marks. In addition, the system was trained on numerous scientific texts from the field of chemistry.
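The study's actual feature set is not reproduced here, but features of the kind described (sentence-length variation, word and punctuation frequencies) can be sketched in a few lines. The function and the specific features below are illustrative assumptions, not the researchers' implementation:

```python
import re
import statistics

def stylometric_features(text: str) -> dict:
    """Compute a few simple stylometric features of the kind the study
    reportedly used (a hypothetical selection; the paper lists 20)."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    lengths = [len(s.split()) for s in sentences]
    words = text.lower().split()
    n_words = max(1, len(words))
    return {
        # variation in sentence length (std. dev. of word counts)
        "sentence_len_std": statistics.pstdev(lengths) if len(lengths) > 1 else 0.0,
        "mean_sentence_len": statistics.fmean(lengths) if lengths else 0.0,
        # punctuation density per 100 words
        "punct_per_100_words": 100 * sum(text.count(c) for c in ",;:()") / n_words,
        # rate of a marker word often noted in AI prose ("however" is an
        # assumed example, not taken from the paper)
        "however_rate": sum(1 for w in words if w.strip(",.;:") == "however") / n_words,
    }

feats = stylometric_features("This is a test. However, it works well. Short.")
print(feats)
```

A real detector would feed such feature vectors, computed over many human and AI texts, into a trained classifier; this sketch only shows the feature-extraction step.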
This combination of the classic structure and language of scientific texts with the focus on a single subject area is what makes the system so reliable.
In a further test with articles from a news site, however, the detector failed completely: virtually none of the artificially generated news items were identified as such.
Nevertheless, it seems promising that such a high success rate can be achieved with tools like text analysis tailored to specific subject areas.