BioCreative IV

IAT (Task 5) For Biocurators [2013-07-22]


A common problem faced by biocurators when using text mining systems is that the systems are difficult to use or do not provide output that can be directly exploited during the literature curation process. In this respect, the BioCreative Interactive Text Mining (IAT) task has served as a great means to observe the approaches, standards and functionalities used by state-of-the-art text mining systems with potential applications in the biocuration domain. The IAT task also provides a means for biocurators to be directly involved in the testing of text mining systems. For the upcoming BioCreative IV, nine teams have each submitted a text mining/NLP system targeted to a specific biocuration task. These systems will be formally evaluated, but not competitively.

Invitation to Participate

This is an open invitation to biocurators to participate in a user study on the system of their choice between mid-August and September, prior to the BioCreative IV workshop (October 7-9). The study is conducted remotely and its schedule is flexible. There are two levels of participation: full (involves actual curation of a selected corpus, completing a user survey, and being listed as co-author on the BioCreative IAT overview article) and partial (involves performing basic pre-defined tasks at the system's website, completing a user survey, and being acknowledged in the BioCreative IAT overview article). Below we list the text mining systems that have registered to participate in Track 5-IAT. Note that only a brief description of each system is shown. A link to the general website is provided, although the final version to be used in the biocuration task may not be ready yet. The systems will provide guidelines, tutorials, and training in early August, and results from the user evaluation should be returned by September 25. To register for participation, see the registration section below.

The results of this evaluation will be presented during the BioCreative IV workshop, October 7-9, 2013, in the Washington DC area. Attendance at the workshop is not a prerequisite for participation, but you are more than welcome to attend.

Why should I participate?

The benefits to biocurators participating in this activity are multifold, including:

  • direct communication and interaction with developers
  • exposure to new text mining tools that can be potentially adapted and integrated into the biocuration workflow
  • contribution to the development of text mining systems that meet the needs of the biocuration community
  • dissemination of findings in a peer-reviewed journal article.

What does it take to participate?

As described above, there are two levels of participation:

1-Participation in an actual biocuration task (Full): This task has a training phase, in which the user becomes familiar with the annotation guidelines for the task and the website functionalities via remote interaction with the text mining groups, and a testing phase that may involve i) deciding whether a retrieved article is relevant for the given biocuration task (triage) and/or ii) annotating entities (proteins/genes, diseases, chemicals) and the relations among them. The latter activity usually involves manual curation of a set of documents (the number depends on the specific system, but is usually around 30 abstracts or 15 full-length articles) and curation of a set of documents using the text mining tool (30 abstracts or 15 full-length articles). The dataset selected could be a set of articles relevant to the biocurator's domain of expertise (to be discussed with the system developers). The biocurator should record the total time for the manual and the system-assisted curation. As mentioned before, the schedule is flexible; the only requirement is to finish the task by September 25. Participation in this activity entitles the biocurator to be listed as co-author on the Overview BioCreative IAT manuscript, but note that its publication will depend on the peer-review process.

2-Participation in testing the usability of the website by performing basic pre-defined tasks (Partial): The user will navigate the system to perform these tasks and will report on the usability of the website via a user survey. The user will not receive training for this activity and will rely only on the help provided by the website. Participation in this activity will be acknowledged in the Overview BioCreative IAT manuscript, but note that its publication will depend on the peer-review process.

How do I register for participation?

Please check the system(s) that you are interested in (from the list below) and fill in the information required in the form. To learn more about the systems, go to the system description webpage. You can always contact us if you have additional questions by sending email to

Please select one or more systems from this list:
  • Argo (Annotation of metabolic process-related named entities, namely chemical entities and genes or gene products)
  • BioQRator (Validation of retrieved articles based on their relevance to protein-protein interaction information)
  • CellFinder (Annotation of gene, expression relation and cell type in text snippets from a set of articles)
  • Egas (Identification and extraction of PPI events described in PubMed abstracts related to neuropathological disorders)
  • MarkerRIF (Triage on disease-biomarker, and extraction of these entities with normalization)
  • OntoGene (Detection of genes, chemicals and diseases, and their interactions)
  • RLIMS-P (Annotation of kinase, substrate and site with normalization information)
  • SciKnowMine (Triage based on pre-trained categories of interest)
  • Tagtog (Annotation of gene names within full-text documents especially machine-predicted documents)
Please specify whether you would like to be considered for Full Participation or Partial Participation for the systems selected, as explained in the section above.

