The artificial intelligence (AI) chatbot ChatGPT may do a decent job of imitating human workers in several fields, but scientific research is not one of them, according to a new study that used a computer program to spot fake studies generated by the chatbot. Still, AI can fool some humans with its scientific writing, according to previous research.
Since its appearance in November 2022, ChatGPT has become an extremely popular tool for writing reports, sending emails, filling out documents, translating languages and writing computer code. But the chatbot has also been criticized for its plagiarism and lack of precision, while raising fears it could help spread “fake news” and replace some human workers.
In the new study, published June 7 in the journal Cell Reports Physical Science, researchers created a machine-learning program to differentiate between real scientific papers and fake examples written by ChatGPT. The scientists trained the program to identify key differences between 64 real studies published in the journal Science and 128 articles created by ChatGPT using the same 64 papers as prompts.
The team then tested how well their model could differentiate between a different subset of real and ChatGPT-generated articles, which included 60 real Science journal articles and 120 AI-generated forgeries. The program flagged articles written by the AI more than 99% of the time and could correctly differentiate between paragraphs written by a human and those written by a chatbot 92% of the time.
ChatGPT-generated articles differed from human text in four main ways: paragraph complexity, sentence-level diversity in sentence length, punctuation marks, and "popular words." For example, the human authors wrote longer and more complex paragraphs, while the AI articles used punctuation not found in the real articles, such as exclamation points.
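The four kinds of differences above can be illustrated with a small stylometric sketch. This is not the authors' published model; the specific features, the use of "the" as a stand-in for "popular words," and the function name are all assumptions for illustration:

```python
import re
import statistics

def stylometric_features(text: str) -> dict:
    """Illustrative style features of the kind the study describes.
    (A sketch, not the researchers' actual classifier.)"""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    sentence_lengths = [len(s.split()) for s in sentences]
    return {
        # paragraph complexity: human authors tended to write longer paragraphs
        "avg_paragraph_words": statistics.mean(len(p.split()) for p in paragraphs),
        # diversity in sentence length within the text
        "sentence_length_stdev": (
            statistics.stdev(sentence_lengths) if len(sentence_lengths) > 1 else 0.0
        ),
        # punctuation rarely seen in real Science papers, e.g. exclamation points
        "exclamation_marks": text.count("!"),
        # crude proxy for "popular words": relative frequency of "the"
        "the_frequency": len(re.findall(r"\bthe\b", text.lower()))
        / max(len(text.split()), 1),
    }
```

Features like these could then feed any standard classifier; the study's strong results suggest the real and generated texts separate cleanly even on simple signals.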
The researchers’ program also spotted many glaring factual errors in the AI-generated articles.
“One of the biggest problems is that it [ChatGPT] assembles text from many sources and there is no kind of accuracy check,” study lead author Heather Desaire, an analytical chemist at the University of Kansas, said in a statement. As a result, reading through ChatGPT-generated writing can be like “playing a game of two truths and a lie,” she added.
It is important to create computer programs to differentiate real papers from AI-generated papers, as previous studies have suggested that humans may not be as good at spotting the differences.
In December 2022, another research group uploaded a study to the preprint server bioRxiv showing that journal reviewers could identify AI-generated study abstracts — the summary paragraphs found at the beginning of a scientific article — only about 68% of the time, while computer programs could identify the fakes 99% of the time. The reviewers also misidentified 14% of real abstracts as fakes. The human reviewers would almost certainly be better at identifying entire articles than a single paragraph, the study researchers wrote, but it still highlights that human error could allow some AI-generated content to go unnoticed. (That study has not yet been peer-reviewed.)
The researchers of the new study say they are pleased that their program is effective at detecting fake papers but warn that it is only a proof of concept. Much larger studies are needed to create robust models that are even more reliable and can be trained on specific scientific disciplines to maintain the integrity of the scientific method, they wrote in their paper.