1- Invitation to Text Mining Teams
We invite text mining teams to submit system descriptions that highlight a text mining/NLP system with respect to a specific biocuration task. The description should be biocurator-centric, providing clear examples of the system's input (such as PMID, gene, or keyword) and output (list of relevant articles, compound recognition, PPI, etc.), and should describe the context in which the system can be used effectively (e.g., the task is only applicable to articles about a given taxon group). The track is open to systems that process abstracts and/or full-length articles.
The description should be no longer than 6 pages including figures, and Word or RTF formats are preferred.
All submissions will be evaluated by the BioCreative Organizing Committee according to the following criteria:
- Relevance and Impact: Is the system currently being used in a biocuration task/workflow?
- Adaptability: Is it robust and adaptable to applications for other related biocuration tasks (i.e., can be utilized by multiple databases/resources)?
- Interactivity: Does it provide an interactive web interface that biocurators can use for testing?
- Performance: Can the system be benchmarked and provide precision and recall for the task?
For system evaluation, the participating teams should:
- Define a curation task according to the system capabilities
- Provide a set of annotated examples as a practice test for curators prior to the evaluation (30-50 examples). The idea is to provide some examples of correct system output for the task, so curators learn what to expect from the system.
- Suggest biocurator(s) who could annotate a literature corpus as the gold standard for evaluation before the workshop (approximately 50 documents)
- Benchmark the system and submit precision/recall for the given task(s) by March 1, 2012. Note that here we request your own benchmarking, which may have already been published, to make sure that the systems have gone through some formal evaluation.
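As an illustration of the benchmarking figures requested above, the following is a minimal sketch of how precision and recall might be computed from a system's output and a gold-standard annotation set. The function name `precision_recall` and the (PMID, gene) pairs are hypothetical; each team should adapt the unit of comparison to its own curation task.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of annotations.

    Annotations are compared by exact match; any hashable item works
    (here, hypothetical (PMID, gene) pairs).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up example data, for illustration only.
system_output = {
    ("PMID:1", "BRCA1"),
    ("PMID:1", "TP53"),
    ("PMID:2", "EGFR"),
    ("PMID:3", "MYC"),
}
gold_standard = {
    ("PMID:1", "BRCA1"),
    ("PMID:2", "EGFR"),
    ("PMID:2", "KRAS"),
}

p, r = precision_recall(system_output, gold_standard)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```

Here two of the four system annotations match the gold standard (precision 0.50), and two of the three gold-standard annotations are recovered (recall 0.67).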
The list of selected teams will be posted with accompanying descriptions on or about January 20, 2012. The BioCreative Organizing Committee, along with the teams, will identify and recruit biocurators who will participate in the evaluation of the systems.
The evaluation will compare time-on-task for manual vs. system-assisted curation, as well as precision and recall of the uncurated and curated sets against the gold standard (generated by the suggested biocurator, who is blind to the systems).
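A time-on-task comparison of this kind could be summarized roughly as follows. This is only a sketch under stated assumptions: the function `time_on_task_summary`, the per-document timings in seconds, and the use of a simple mean are illustrative choices, not the organizers' actual evaluation procedure.

```python
from statistics import mean


def time_on_task_summary(manual_seconds, assisted_seconds):
    """Compare mean per-document curation time for the two conditions."""
    manual_mean = mean(manual_seconds)
    assisted_mean = mean(assisted_seconds)
    return {
        "manual_mean": manual_mean,
        "assisted_mean": assisted_mean,
        # Values above 1.0 mean system-assisted curation was faster.
        "speedup": manual_mean / assisted_mean,
    }


# Made-up timings (seconds per document), for illustration only.
manual = [300, 240, 360]
assisted = [150, 120, 180]

summary = time_on_task_summary(manual, assisted)
print(summary["speedup"])  # 2.0
```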
To assist teams with this activity, a document describing an example system and a proposed task is available at the end of this page under Downloads.
At the workshop
The selected systems will be presented on the second day of the workshop. A demo session, where the users (biocurators) will have the opportunity to use the systems, will follow.
Finally, the results and observations from the system evaluation will be presented.
Dissemination
Accepted descriptions will be published in the workshop proceedings.
Specifications
Systems should be web based and compatible with Mozilla Firefox 4.0 or higher.
2- Invitation to Biocurators
We invite biocurators to participate in a user study on the text mining system of their choice prior to and at the workshop. The user study will involve (i) manual curation and text-mining of the literature corpus by the biocurators; (ii) recording of the user interactions with the system (with logs of all queries and clicks); and (iii) a post-study survey.
The list of systems can be found here.
a-Biocurator Tasks
Prior to the workshop (around March 1, 2012), each biocurator will curate one set of documents manually (approx. 25) and another set using the selected system (approx. 25).
Manual task: the curator will be given the list of documents in the PubMed environment for manual processing, and results should be stored in a spreadsheet.
Using system: the curator will validate the output provided by the system and store information using the system or a spreadsheet (output to be determined by each system).
More details on the evaluation will be available once the systems are selected.
b-Requirements for user study and evaluation
Biocurators assigned to test a system should install a client-side web-browser add-on that will allow tracking of time and user activity during testing (more details are upcoming). Users will be informed as to the nature of the data being collected and asked whether they want to opt out of data collection when the browser opens. The collected data will be sent to one of the organizers (Cecilia Arighi) automatically when a session is complete.
We will also conduct pre- and post-study surveys, in which users will be asked additional questions that may help elucidate their experience with the system. Questions for the post-study survey will focus on task completion (how the user rates their own performance, how the user rates the system’s help, whether the user decided to give up on the task before completion and if so why) and whether they understood how to use features of the system to perform “typical” actions (depending on the system). Questions for the pre-study survey will focus on how users typically perform the task. Surveys will need to be tailored to specific systems in order to provide the optimal feedback.
Registration to participate in this activity is now open. We encourage biocurators to register now and to provide a final commitment once the systems are posted.
3- Important Dates
| Item | Deadline | Submit via | Comment |
|---|---|---|---|
| Team Registration | November 15, 2011 | web | register here |
| Text Mining System Description | December 31, 2011 | email to arighi@dbi.udel.edu | Subject:BioCreative-2012 Track III |
| System Benchmarking Results | March 1, 2012 | email to arighi@dbi.udel.edu | Subject:BioCreative-2012 Track III-result |
| Submission of Practice Test | March 1, 2012 | email to arighi@dbi.udel.edu | Subject:BioCreative-2012 Track III-test |
| Interface Available for Testing | March 1, 2012 | email to arighi@dbi.udel.edu | Subject:BioCreative-2012 Track III-interface |
| Biocurator's Evaluation | March 15, 2012 | email to arighi@dbi.udel.edu | Subject:BioCreative-2012 Track III-biocurator |
| Workshop | Noon April 4 - 5pm April 5, 2012 | | Reporting from testing Track III will be presented |
All questions pertaining to this track should be directed to Cecilia Arighi (arighi@dbi.udel.edu).