Paper plagiarism and paper selling remain rampant in China


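Science magazine previously carried out a detailed undercover investigation and found that paper buying and selling was widespread in China. One year later, Science has reported again that plagiarism and paper selling in China remain rampant.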

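Guillaume Filion of the Centre for Genomic Regulation and Lucas Carey of Pompeu Fabra University downloaded PubMed records for papers published between January 2012 and April 2014. They used natural language processing to comb through the abstracts of about 2 million papers, hoping to identify the emerging hot topics of 2014.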


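They found a clear increase in papers mentioning CRISPR and lncRNA. CRISPR is a new gene-editing technique that Science named one of its top ten breakthroughs of 2013. lncRNA stands for long non-coding RNA, a type of RNA that is currently a hot topic in genomics.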

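Among these unremarkable trends, one finding was unexpected: a database that had rarely been mentioned before suddenly began appearing about once a week, starting in February 2014. The database is CISCOM (Centralised Information Service for Complementary Medicine), a little-known resource run by the Research Council for Complementary Medicine in London.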
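A spike of this kind can be illustrated with a small sketch that compares how often each term appears in an earlier and a later batch of abstracts. This is only an illustration under stated assumptions: the toy abstracts are invented, and the function names `doc_freq` and `spiking_terms`, the tokenizer, the smoothing constant, and the ratio threshold are all hypothetical choices, not the method Filion and Carey actually used.

```python
import re
from collections import Counter

def doc_freq(abstracts):
    """Count, for each term, how many abstracts mention it at least once."""
    counts = Counter()
    for text in abstracts:
        # Crude tokenizer (assumption): lowercase words of two or more characters.
        counts.update(set(re.findall(r"[a-z][a-z0-9]+", text.lower())))
    return counts

def spiking_terms(baseline, recent, min_ratio=3.0, smoothing=1.0):
    """Return (term, ratio) pairs whose share of abstracts grew by min_ratio or more."""
    base, new = doc_freq(baseline), doc_freq(recent)
    n_base, n_new = max(len(baseline), 1), max(len(recent), 1)
    scored = []
    for term, count in new.items():
        # Additive smoothing avoids division by zero for terms unseen in the baseline.
        rate_new = (count + smoothing) / (n_new + smoothing)
        rate_base = (base.get(term, 0) + smoothing) / (n_base + smoothing)
        ratio = rate_new / rate_base
        if ratio >= min_ratio:
            scored.append((term, ratio))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented toy abstracts, for illustration only.
baseline = [
    "we searched pubmed and web of science for eligible studies",
    "pubmed and embase were searched for case control studies",
    "a meta analysis of studies retrieved from pubmed",
    "gene expression was measured in mouse liver",
]
recent = [
    "we searched pubmed, ciscom and web of science",
    "ciscom and pubmed were searched for eligible studies",
    "studies were retrieved from pubmed and ciscom",
    "gene expression was measured in mouse liver",
]

for term, ratio in spiking_terms(baseline, recent):
    print(f"{term}: {ratio:.1f}x")
```

On this toy data only `ciscom` clears the threshold, because common search terms such as `pubmed` appear at similar rates in both batches. A real analysis over millions of abstracts would need a proper tokenizer and some control for overall publication growth.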

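Looking further, Filion and Carey found 32 papers on different topics that shared some odd features. All were meta-analyses or reviews of already-published data drawn from CISCOM together with more commonly used databases such as Google Scholar, PubMed, and Web of Science. All of them also came from China, written by 28 different research groups spread across several cities.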

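In a blog post, Filion described the papers as "disturbingly similar," and he and Carey set out to determine what was going on. They downloaded the full text of 25 of the suspect papers and ran them through the plagiarism detection program iThenticate, which turned up no problems.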

However, the discussion sections of these papers all contain similar statements with only minor changes. For example, one paper reads, "Importantly, the inclusion criteria of cases and controls were not well defined in all included studies and thus might have influenced our results." Another reads, "Importantly, the inclusion criteria of cases and controls were not well defined in all included studies, which might also have influenced our results."
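To make the overlap between the two quoted sentences concrete, here is a minimal sketch that scores word-level similarity with Python's standard `difflib.SequenceMatcher`. The function name `word_similarity`, the 0.8 cut-off, and the unrelated control sentence are assumptions introduced for illustration. This is not how iThenticate works, nor the comparison the researchers ran.

```python
from difflib import SequenceMatcher

def word_similarity(a, b):
    """Order-sensitive similarity in [0, 1] based on matching runs of words."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()

# The two sentences quoted in the article.
s1 = ("Importantly, the inclusion criteria of cases and controls were not well "
      "defined in all included studies and thus might have influenced our results.")
s2 = ("Importantly, the inclusion criteria of cases and controls were not well "
      "defined in all included studies, which might also have influenced our results.")
# Invented control sentence, for contrast only.
unrelated = "Gene expression was measured in mouse liver."

NEAR_DUPLICATE = 0.8  # hypothetical cut-off, chosen only for this example

for label, other in [("s2", s2), ("unrelated", unrelated)]:
    score = word_similarity(s1, other)
    verdict = "near-duplicate" if score >= NEAR_DUPLICATE else "different"
    print(f"s1 vs {label}: {score:.2f} ({verdict})")
```

The two quoted sentences score well above the cut-off, while the control sentence scores near zero. A score like this is sensitive to word order within a sentence, so it would not by itself catch passages that have been rearranged across a whole paper.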

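Four of the papers share the same grammatical error, the extraneous "had" in "our results had lacked sufficient statistical power." Filion and Carey also found that the papers appear to draw on multiple templates. This suggests that the writers actively shuffle passages of text, a technique for evading plagiarism detection software known as text laundering, by analogy with money laundering.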

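Most of the suspect papers were submitted in late 2013, so the authors could not have plagiarized one another after publication. Filion and Carey therefore hypothesized that the papers might all come from a single company. With help from Yao Yu, a geneticist at Fudan University, they identified a paper-selling company whose website advertises tailored meta-analysis papers. When contacted, the company quoted a price of about $10,000 for a meta-analysis paper aimed at a journal with an impact factor of 2 or 3.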

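Last year, Science reported on a five-month investigation that found dozens of similar companies. Besides drafting papers from data supplied by clients, these companies fabricate data, arrange to add names to papers, and sell finished manuscripts.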

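Among finished manuscripts, meta-analyses are the most popular, perhaps because they require no original data. An analysis published in PLOS ONE in June 2013 found that from 2003 to 2011, meta-analysis papers from China rose more than 16 times faster than those from the United States. Filion and Carey, however, do not plan to dig further into the issue. As Filion put it, "We are not witch-hunters, we are big data analysts."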

Copycat papers flag continuing headache in China


SHANGHAI, CHINA—Two computational biologists searching for trends in journals indexed in the search engine PubMed stumbled across signs that China’s paper-selling companies remain active, 1 year after Science published a detailed undercover investigation describing a highly sophisticated and lucrative industry.

Guillaume Filion of the Centre for Genomic Regulation and Lucas Carey from Pompeu Fabra University, both in Barcelona, downloaded all PubMed records for papers published between January 2012 and this past April. Combing over the abstracts for those 2 million papers using a big data technique called natural language processing, they isolated terms that spiked in use in 2014.

They hoped to find "new topics about to detonate," Filion says. Not surprisingly, they found an uptick in papers mentioning cutting-edge topics like CRISPR, a gene-editing technique that was named a runner-up for Science’s 2013 Breakthrough of the Year, and lncRNA, or long non-coding RNA, an unusually long form of RNA that is now a hot topic in genomics.

But alongside those more predictable trends, one term stuck out: a little-known database run by the Research Council for Complementary Medicine in London called CISCOM, or the Centralised Information Service for Complementary Medicine. Until 2013, the scholars note, the term "CISCOM" appeared in only two to three papers per year. In February, the database began cropping up once a week.

Looking more closely, Filion and Carey found a group of 32 papers on varying topics that nonetheless shared some curious characteristics. All were meta-analysis or review papers that analyzed already-published data in CISCOM, along with more commonly used databases like Google Scholar, PubMed, and Web of Science. Moreover, all originated in China, from 28 different research groups spread out across several cities.

Filion, who described what he calls the "disturbingly similar" papers in a blog post published on 4 October, set out with Carey to determine what was going on. They downloaded complete versions of the 25 papers to which they had access through various institutional subscriptions or other means. (All but two papers are behind a pay wall.) Running the papers through the plagiarism detection program iThenticate turned up no red flags.

But the discussion sections of all the papers contain similar statements, with only minor changes. For example, one paper reads, "Importantly, the inclusion criteria of cases and controls were not well defined in all included studies and thus might have influenced our results." Another states, "Importantly, the inclusion criteria of cases and controls were not well defined in all included studies, which might also have influenced our results."

Four of the papers include the same grammatical error: the extraneous "had" in "our results had lacked sufficient statistical power." But in mapping out the relationships among the papers, the duo noticed that the writers seemed to be drawing from multiple templates. That suggests, Filion says, "that the writers actively shuffle the texts," a method of evading plagiarism detection software known as text laundering.

Most of the papers were submitted in late 2013, making it impossible that some authors plagiarized others after publication. Filion and Carey thus hypothesized that the papers might all be the work of a single company. With help from Yao Yu, a geneticist at Fudan University in Shanghai, the scholars identified an outfit whose website advertises tailored meta-analysis papers and contacted the company to inquire about its services. The company reportedly offers meta-analysis papers for journals with an impact factor of 2 or 3 for about $10,000.

A 5-month investigation published in Science last year found dozens of similar companies offering an array of services aimed at securing publication in journals indexed in Thomson Reuters’ Science Citation Index, Thomson Reuters’ Social Sciences Citation Index, or Elsevier’s Engineering Index—which at many Chinese institutions are critical to securing promotions. In addition to preparing original papers from scratch with data provided by their clients, China’s paper-selling companies fabricate data, arrange to add scientists’ names to already accepted papers, and sell finished manuscripts.

Among the most popular options for finished manuscripts are meta-analyses, perhaps because they require no original data. One legitimate analysis published in PLOS ONE in June 2013 found that from 2003 to 2011, meta-analysis papers from China rose more than 16 times faster than did such papers from the United States. Combing PubMed for other trends might turn up more evidence of malfeasance. But Filion says he and Carey now plan to turn their attention to other topics: "We are not witch-hunters, we are big data analysts."

Original article in Science: Copycat papers flag continuing headache in China

This post draws on an article from 生物360 (bio360.net): http://www.bio360.net/news/show/11880.html
