Entity Relations Supporting Task (REL)

The Entity Relations (REL) task is a supporting task in the BioNLP Shared Task 2011.

The task concerns the detection of relations stated to hold between a gene or gene product and a related entity such as a protein domain or protein complex.

Task Results

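The REL supporting task is completed. Final submissions were received from four teams, and the results are summarized in the following table (approximate entity boundary matching criteria):

Team                                 recall    prec.    fscore
University of Turku                   50.10    68.04     57.71
VIB - Ghent University                47.48    37.04     41.62
Concordia University                  24.35    46.85     32.04
University Of Science, VNU, HCMC      15.69    23.26     18.74

The primary performance metric is overall F-score, given in the last column of the table above.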

Detailed Results

University of Turku

------------------------------------------------------------------------------------
Relation Class        gold (match)    answer (match)    recall    prec.    fscore
------------------------------------------------------------------------------------
Protein-Component      334 (  170)       245 (  168)     50.90    68.57     58.43
Subunit-Complex        163 (   79)       118 (   79)     48.47    66.95     56.23
===[TOTAL]===          497 (  249)       363 (  247)     50.10    68.04     57.71
------------------------------------------------------------------------------------

VIB - Ghent University

------------------------------------------------------------------------------------
Relation Class        gold (match)    answer (match)    recall    prec.    fscore
------------------------------------------------------------------------------------
Protein-Component      334 (  158)       427 (  156)     47.31    36.53     41.23
Subunit-Complex        163 (   78)       202 (   77)     47.85    38.12     42.43
===[TOTAL]===          497 (  236)       629 (  233)     47.48    37.04     41.62
------------------------------------------------------------------------------------

Concordia University

------------------------------------------------------------------------------------
Relation Class        gold (match)    answer (match)    recall    prec.    fscore
------------------------------------------------------------------------------------
Protein-Component      334 (   78)       146 (   76)     23.35    52.05     32.24
Subunit-Complex        163 (   43)       108 (   43)     26.38    39.81     31.73
===[TOTAL]===          497 (  121)       254 (  119)     24.35    46.85     32.04
------------------------------------------------------------------------------------

University Of Science, VNU, HCMC

------------------------------------------------------------------------------------
Relation Class        gold (match)    answer (match)    recall    prec.    fscore
------------------------------------------------------------------------------------
Protein-Component      334 (   70)       319 (   69)     20.96    21.63     21.29
Subunit-Complex        163 (    8)        12 (    8)      4.91    66.67      9.14
===[TOTAL]===          497 (   78)       331 (   77)     15.69    23.26     18.74
------------------------------------------------------------------------------------
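
The recall, prec. and fscore columns in the tables above can be reproduced from the four counts in each row. Judging from the reported numbers, recall is the gold match count divided by the gold count, and precision is the answer match count divided by the answer count. The following Python sketch illustrates that arithmetic; the function name `prf` and its parameter names are chosen here for illustration and are not part of the official evaluation tools.

```python
def prf(gold, gold_match, answer, answer_match):
    """Return (recall, precision, F-score) as percentages, rounded to 2 places.

    gold / gold_match:     number of gold relations / gold relations matched
    answer / answer_match: number of system relations / system relations matched
    """
    recall = 100.0 * gold_match / gold if gold else 0.0
    precision = 100.0 * answer_match / answer if answer else 0.0
    total = recall + precision
    # F-score is the harmonic mean of recall and precision.
    fscore = 2.0 * recall * precision / total if total else 0.0
    return round(recall, 2), round(precision, 2), round(fscore, 2)


# Counts copied from the University of Turku ===[TOTAL]=== row.
turku_total = prf(gold=497, gold_match=249, answer=363, answer_match=247)
print(turku_total)  # (50.1, 68.04, 57.71)
```

The same call with the counts from any other row should give that row's reported percentages.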

Task Definition

Entities

As in many of the main tasks of the shared task, the supporting task provides human-annotated gene and gene product entities, annotated as "Protein", as a starting point. The correct annotations for these entities are provided for both the training and test data.

Human-created gold annotation for the related entities is provided only for the training data, so systems need to detect the related entities as part of addressing the supporting task. However, the type of these entities does not need to be resolved; all non-Protein entities are annotated using an unspecified class "Entity".

Relations

Relations are binary and represented as typed, ordered entity pairs. All relations considered in the task involve exactly one Protein entity (given) and one other entity (detected by participating systems). The arguments and relation types are fixed so that the first argument (Arg1) is always a Protein and the second argument (Arg2) is always an Entity.

In contrast to the annotation of the primary tasks, the entity relations supporting task involves only relations holding between entities that co-occur within a single sentence.

The following table shows the relations considered in the supporting task.

Relation type        Arguments
Subunit-Complex      Arg1:Protein, Arg2:Entity
Protein-Component    Arg1:Protein, Arg2:Entity

Subunit-Complex is a Component-Object relation that holds between a protein complex and its subunits, which are individual proteins. Protein-Component is a less specific Object-Component relation that holds between a gene or protein and one of its components, such as a protein domain or the promoter of a gene.
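
The constraints described in this section (binary, typed, ordered pairs with Arg1 always a Protein and Arg2 always an Entity) can be captured in a small data structure. The following is a minimal Python sketch, not the task's file format: the class names `Annotation` and `Relation`, the field names, the identifiers "T1" and "T2", and the example text "p53 promoter" are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

# The two relation types considered in the supporting task.
RELATION_TYPES = {"Subunit-Complex", "Protein-Component"}


@dataclass(frozen=True)
class Annotation:
    ann_id: str    # hypothetical identifier, e.g. "T1"
    ann_type: str  # "Protein" (given) or "Entity" (detected by the system)
    start: int     # character offset, inclusive (assumed convention)
    end: int       # character offset, exclusive (assumed convention)
    text: str


@dataclass(frozen=True)
class Relation:
    rel_type: str
    arg1: Annotation  # must be a Protein
    arg2: Annotation  # must be an Entity

    def __post_init__(self):
        # Enforce the fixed relation types and argument order.
        if self.rel_type not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.rel_type}")
        if self.arg1.ann_type != "Protein":
            raise ValueError("Arg1 must be a Protein")
        if self.arg2.ann_type != "Entity":
            raise ValueError("Arg2 must be an Entity")


# Invented example: a gene and its promoter (a Protein-Component relation).
TEXT = "p53 promoter"
protein = Annotation("T1", "Protein", 0, 3, TEXT[0:3])
entity = Annotation("T2", "Entity", 4, 12, TEXT[4:12])
rel = Relation("Protein-Component", arg1=protein, arg2=entity)
print(rel.rel_type, rel.arg1.text, rel.arg2.text)
```

Because the argument order is fixed, swapping the two arguments raises a `ValueError` in this sketch rather than producing a different relation.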

Data Format

The data format of the supporting task files is described on the file formats page.

Evaluation

Evaluation is relation-oriented and based on the standard precision, recall and F-score metrics. A relation output by a participating system is considered correct if two conditions hold: the associated Entity matches an Entity in the gold annotation (using approximate boundary matching criteria), and the gold relation data includes a relation of the same type between the corresponding gold entities.

Note that only relations are evaluated; the accuracy with which systems detect (candidate) related entities will not be separately considered in the results.
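
The matching rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the official evaluation software. Each relation is represented as a hypothetical tuple `(relation_type, protein_id, entity_span)`, and the approximate boundary matching criteria, which are not specified in this section, are replaced by a simple span-overlap placeholder (`spans_overlap`). All function names are invented for this example.

```python
def spans_overlap(a, b):
    """Placeholder entity matcher (an assumption, not the official criteria):
    True if the half-open character spans a and b share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def relation_matches(pred, gold, entity_match=spans_overlap):
    """A predicted relation matches a gold relation if the type is the same,
    the given Protein is the same, and the Entity spans match."""
    p_type, p_protein, p_span = pred
    g_type, g_protein, g_span = gold
    return (p_type == g_type
            and p_protein == g_protein
            and entity_match(p_span, g_span))


def count_matches(gold, answer, entity_match=spans_overlap):
    """Return (gold_match, answer_match): the number of gold relations matched
    by some answer, and the number of answers matching some gold relation."""
    gold_match = sum(
        any(relation_matches(a, g, entity_match) for a in answer) for g in gold)
    answer_match = sum(
        any(relation_matches(a, g, entity_match) for g in gold) for a in answer)
    return gold_match, answer_match


# Invented toy data: the offsets and identifiers are for illustration only.
gold = [("Protein-Component", "T1", (4, 12)),
        ("Subunit-Complex", "T2", (30, 45))]
answer = [("Protein-Component", "T1", (0, 12)),   # overlapping span, same type
          ("Protein-Component", "T2", (30, 45))]  # right span, wrong type
print(count_matches(gold, answer))  # (1, 1)
```

Only relations are counted here; a detected Entity that takes part in no output relation has no effect on the counts, in line with the note above.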

Datasets

A small sample of annotations for the task is available below (attachment). The full training and development test data are available from the download page.