
The battle against plagiarism begins much like an arms race: each time an application is built to detect plagiarism, the plagiariser becomes better at avoiding detection.
However, the plagiarism arms race can have an end point: the moment when avoiding detection takes so much time and ingenuity that one might as well write original material. For English material, there are now many good plagiarism detectors, such as Turnitin. Many other languages, however, still lag behind.
“Plagiarism was becoming a growing problem in Hungary at the beginning of this millennium and there were no plagiarism check services available in Hungarian,” said Máté Pataki, a senior research fellow at MTA SZTAKI. Many of the existing systems also had problems with Hungarian character encoding and accented characters.
MTA SZTAKI, the Computer and Automation Research Institute at the Hungarian Academy of Sciences in Budapest, is launching a new version of the anti-plagiarism software KOPI. KOPI can now check for translations of Wikipedia pages and is powered by desktop grids.
The KOPI Plagiarism Search Portal has been available to the public since 2004, but until now it has only checked for plagiarism of Hungarian content.
“A year ago, we decided to go a step further and try to detect not only one language copy-and-paste plagiarism cases but also translated plagiarisms. Due to the spread of the Internet and the growing English language knowledge translated plagiarism has become an issue, not only in academic circles but also in newspapers,” said Pataki.
The researchers' new algorithm for KOPI checks students’ work against Wikipedia. Wikipedia may represent only a small segment of the Web, but the researchers process it in German, French and English as well as Hungarian, and handling that quantity of data took weeks on Pataki's previous server, they said.
Pataki and his colleagues now use desktop grids to process Wikipedia approximately every month. The processing time is much shorter, so their databases can be kept up to date.
KOPI was ported to desktop grids under the program GASuC in Hungary by research fellow Attila Marosi, who said that from a technical perspective, it suits desktop grids perfectly because the tasks can easily be divided into smaller tasks. “And, from social perspective we think that the application tries to solve an interesting problem that people would like to contribute to,” Marosi said.
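To illustrate why this kind of workload divides so easily, here is a minimal Python sketch. It is not KOPI's actual algorithm, which the article does not describe. It assumes a simple word n-gram index as a stand-in technique, and all function names and the tiny corpus are hypothetical. The point is only that each batch of documents can be processed independently, for example on a separate volunteer machine, and the partial results merged afterwards.

```python
from collections import defaultdict


def split_into_work_units(documents, unit_size):
    """Split a list of (doc_id, text) pairs into independent batches."""
    return [documents[i:i + unit_size]
            for i in range(0, len(documents), unit_size)]


def word_ngrams(text, n=3):
    """Return the set of word n-grams in a text (lowercased, whitespace split)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def process_work_unit(unit, n=3):
    """Work done on one machine: map each n-gram to the documents containing it.

    Needs no data from any other work unit, which is what makes the job
    easy to distribute.
    """
    partial = defaultdict(set)
    for doc_id, text in unit:
        for gram in word_ngrams(text, n):
            partial[gram].add(doc_id)
    return partial


def merge_partial_indexes(partials):
    """Combine the partial indexes returned by all work units."""
    merged = defaultdict(set)
    for partial in partials:
        for gram, doc_ids in partial.items():
            merged[gram] |= doc_ids
    return merged


# Hypothetical toy corpus standing in for a set of reference articles.
corpus = [
    ("doc1", "the quick brown fox jumps"),
    ("doc2", "a quick brown fox sleeps"),
    ("doc3", "completely unrelated text here"),
]

units = split_into_work_units(corpus, unit_size=2)
index = merge_partial_indexes(process_work_unit(u) for u in units)

# Documents sharing the phrase "quick brown fox":
shared = index["quick brown fox"]
```

In this sketch the central server only splits the corpus and merges the results; the expensive step, `process_work_unit`, is the part that would be handed out to the grid.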
KOPI will be available for everyone to use, and the MTA SZTAKI team will be showcasing the tool around Hungary, to universities and secondary schools, according to Agnes Szeberenyi, the coordinator of GASuC.
The team is already planning the next upgrade to the application, which they hope will incorporate sources beyond Wikipedia and will also be able to check whether a student has tried to escape detection by substituting synonyms while plagiarising.