Frequently Asked Questions (FAQ)
--- Challenge ---
1. What is CholecTriplet2022?
An endoscopic vision challenge on the recognition and localization of
tool-tissue interactions in surgical videos in the form of triplets. It
is an upgrade of the CholecTriplet2021 challenge with an added
localization task.
2. What is a triplet?
A combination of {instrument, verb, anatomical target} that describes a
surgical action.
3. Who can participate?
Anyone who signs the challenge agreement except the members of the
organizing lab.
4. Will the challenge submission still be open after the submission
deadline?
Likely yes, but only submissions made before the deadline will be
eligible for awards.
--- Registration ---
1. Why is my registration not yet approved?
For your registration to be approved, you must send a signed challenge
contract. Check the Getting Started page for more details.
2. Must every member of my team submit a signed contract?
One contract per team is sufficient. However, every member of the team
must abide by the terms and conditions in the signed contract. The
dataset obtained after signing the contract remains confidential
and cannot be transferred to anyone outside the team.
3. Is it mandatory to register as a team?
Yes.
4. What if I am working alone, must I still register as a team?
Yes. A team can consist of only one person.
5. Must every member of my team register on the challenge website?
Yes.
6. When is the deadline for team registration?
All registrations end on July 1, 2022.
--- CholecT50 dataset ---
1. What labels in the dataset should be used to train the model?
Triplet labels. The standalone instrument, verb, and target labels are
provided as additional labels in case they can help your modeling. Their
usage is optional and entirely depends on your proposed methods.
Localization can be learned by weak supervision, preferably using the
instrument presence labels or the triplet labels.
2. I found some action triplets marked null for the verb & target
components in the ground truth; is this an error?
No. Some clinically valid triplets are not among the 100 considered
triplet classes, due to their occurrence frequency and clinical
relevance to the considered procedure.
3. How many types of null triplets are possible in the dataset?
The possible null triplet classes can be grouped into two, as follows:
a.) Instrument-inclusive null: seen when there is no instrument in
the frame, or when the instrument involved in the action is not from the
valid classes in the dataset. In this case, the label is
{null-instrument, null-verb, null-target}. Since triplet recognition is
a multi-label classification problem, this class is true only when all
other classes are negative in a frame. It is NOT included in the 100
triplet classes.
b.) Non-instrument-inclusive null: a situation where a valid
instrument class is present, but the verb or target involved in the
action is from an invalid class, or the triplet combination is not in
the considered 100 classes for the reason in (2) above. In this case, we
retain only the instrument presence label while the verb/target are
marked null. There are 6 classes from this situation:
{grasper, null-verb, null-target}, {bipolar, null-verb, null-target},
{hook, null-verb, null-target}, {scissors, null-verb, null-target},
{clipper, null-verb, null-target}, and {irrigator, null-verb, null-target}.
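For illustration, here is a minimal sketch (not official challenge code)
of how these two null situations can be read from the labels; the names
triplet_vec and instrument_vec are hypothetical, and the check is
simplified since null and non-null classes can in principle co-occur in
a multi-label frame:

def null_type(triplet_vec, instrument_vec):
    # triplet_vec:    100-dim binary vector over the considered triplet classes
    # instrument_vec: 6-dim binary instrument presence vector
    if not any(triplet_vec) and not any(instrument_vec):
        # no valid triplet and no valid instrument in the frame:
        # {null-instrument, null-verb, null-target}
        return "instrument-inclusive null"
    if not any(triplet_vec) and any(instrument_vec):
        # a valid instrument is present, but its verb/target are invalid:
        # one of the 6 {instrument, null-verb, null-target} classes
        return "non-instrument-inclusive null"
    return "valid triplet(s) present"

print(null_type([0] * 100, [0, 0, 0, 0, 0, 0]))  # instrument-inclusive null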
4. Is the train/val/test split different from the one used in the
published papers?
Yes. For the challenge, we restrict the test set to only videos that are
not in the public domain. We recommend a 40/5 train/val split on the
provided data, but participants are entirely free to define their own
splits. The training set is the entire CholecT45 dataset [7].
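For example, a minimal sketch of a video-level 40/5 split (the video IDs
here are placeholders, not the actual CholecT45 video names):

# placeholder video IDs; substitute the actual CholecT45 video names
video_ids = ["VID{:02d}".format(i) for i in range(1, 46)]  # 45 videos
train_videos, val_videos = video_ids[:40], video_ids[40:]  # 40/5 split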
5. Is the challenge test dataset publicly available?
No. While the training data are 45 videos from the publicly available
Cholec80 [1], the test set is a private dataset (not from Cholec80) of
the same type of surgery.
6. I have observed that there are some black images in the dataset, are
these images corrupted?
No. As a privacy protection measure, we zeroed out all images that
display the faces of clinicians or patients. For temporal consistency
reasons, we do not remove the zeroed frames from the dataset.
--- My Challenge Methods ---
1. What is the expected output of the model?
A model produces two types of outputs per frame:
- A vector of N=100 probability scores for triplet recognition, in the
format:
[score1, score2, ..., scoreN]
- A list of box-triplet pairings, one for each positive triplet
instance, in the format:
[
[tripletID, instrumentID, confidence, x, y, w, h],
[tripletID, instrumentID, confidence, x, y, w, h],
...
]
The model predictions for the final submission will be converted to a
Python dict and saved as a JSON file. We will provide a guide for this
during submission.
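As an illustration, here is a minimal sketch of how these two outputs
for one frame might be collected and saved as JSON. All values (the
tripletID, scores, box coordinates, frame key) are made-up placeholders;
the exact dict schema will come with the official submission guide.

import json

recognition = [0.0] * 100   # [score1, score2, ..., score100]
recognition[18] = 0.93      # hypothetical positive triplet class
detection = [
    # [tripletID, instrumentID, confidence, x, y, w, h]
    [18, 0, 0.93, 0.42, 0.31, 0.15, 0.20],
]

# collect one entry per frame, then serialize to a JSON file
predictions = {"frame_000001": {"recognition": recognition,
                                "detection": detection}}
with open("predictions.json", "w") as f:
    json.dump(predictions, f)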
2. Do I need to predict the instrument, verb and target separately?
No. While you may want to leverage the extra annotations provided for
instrument, verb, and target to improve your model, you are only
required to predict the final triplet IDs as a vector of 100 probability
scores.
3. Will the inference pipeline preserve the temporal information?
Yes. Participants are free to train their model either on sequential or
shuffled frames. During testing, we will use an input setup that
preserves the temporal frame order per video.
4. What is the frame-rate for the test set?
1 FPS, the same as the training data.
5. Is the testing going to be an online prediction?
Yes. We will maintain a real-time scenario during testing. This means
that our test input setup will collect your model's outputs at time t
before feeding the input frame at time t+1. Your method can accumulate
and utilize information from previous frames, but not from future ones.
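Conceptually, the test-time loop looks like the sketch below. MyModel
and video_frames are placeholders for your own model and the test
harness, not challenge-provided code.

class MyModel:
    def __init__(self):
        self.history = []  # information from past frames only; no look-ahead

    def predict(self, frame):
        self.history.append(frame)   # accumulate previous-frame information
        recognition = [0.0] * 100    # 100 triplet probability scores
        detection = []               # [tripletID, instrumentID, conf, x, y, w, h] entries
        return recognition, detection

model = MyModel()
video_frames = []  # supplied one frame at a time, in temporal order, at 1 FPS
outputs = []
for frame in video_frames:
    # the output at time t is collected before the frame at time t+1 is fed
    outputs.append(model.predict(frame))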
6. My model performance is quite low, do I still need to submit?
The triplet recognition task is generally challenging. The average
performance of a random model is 0.01%, so if you beat this performance,
you have a good method to submit for the competition.
--- Baseline Methods ---
1. Where can I find a published paper/article on surgical triplet
recognition?
* Tripnet [2]: the first deep learning model for the recognition of
action triplets in surgical videos: [Nwoye C.I. et al., Recognition of
Instrument-Tissue Interactions in Endoscopic Videos via Action Triplets,
MICCAI 2020]. Please note that the models in this paper are trained and
evaluated on CholecT40 (a subset of CholecT50).
* Rendezvous [3]: the journal extension of the Tripnet baseline, trained
on CholecT50: [Nwoye C.I. et al., Rendezvous: Attention Mechanisms for
the Recognition of Surgical Action Triplets in Endoscopic Videos].
* Summary of methods and results from the previous action triplet
challenge [4]: [Nwoye C.I. et al., CholecTriplet2021: A Benchmark
Challenge for Surgical Action Triplet Recognition].
* Official dataset splits and benchmarking of baseline methods on the
surgical action triplet datasets [5]: [Nwoye C.I. et al., Data Splits
and Metrics for Method Benchmarking on Surgical Action Triplet Datasets].
2. Where can I find a trained model (code) on triplet recognition?
Some code is available on the CAMMA public git
repo: https://github.com/CAMMA-public
We also provide sample code in the Colab code blog to help you get
started.
Note that we do not provide the weights for any sample/published model.
3. Must I follow the same strategy as in the published papers?
No, you are free to develop any method that works for you: deep
learning, machine learning, rule-based inference, etc.
4. Can I submit exactly the same model as in the published papers?
Submitting an original and novel method is highly recommended; however,
you are not constrained in what you submit.
--- Training ---
1. Is pretraining on a surgical dataset allowed?
Yes, you are free to pretrain your model on any
third-party public dataset. Additional use of any private dataset
is not allowed for this challenge.
--- Submission ---
1. How do I submit my method?
Methods are to be submitted as a Docker file. We will provide a Docker
template and submission guidelines by June 2022.
The submission channel will open on July 1, 2022.
--- Evaluation ---
1. What are the metrics for the evaluation?
Our evaluation will be based on the mean average precision (mAP)
provided by the ivtmetrics library [6]. ivtmetrics is a dedicated
library for the evaluation of tool-tissue interaction detection. It can
be installed using either the pip or conda Python package installers.
More details about the metrics and their usage can be found on the
method and evaluation page.
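As a rough usage sketch for the recognition metric: the class and method
names below follow the library's public examples but should be treated
as assumptions; check the ivtmetrics documentation for the exact API.

# pip install ivtmetrics   (or install via conda)
import numpy as np
import ivtmetrics

metric = ivtmetrics.Recognition(num_class=100)

targets = np.random.randint(0, 2, size=(8, 100))  # dummy binary triplet labels
predictions = np.random.rand(8, 100)              # dummy probability scores
metric.update(targets, predictions)               # accumulate one batch
metric.video_end()                                # mark the end of a video

results = metric.compute_video_AP("ivt")          # triplet (IVT) component AP
print(results["mAP"])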
2. What will my model be evaluated on?
We will evaluate each model on 3 criteria:
- Triplet recognition performance (mean average precision, mAP).
- Instrument localization performance (mAP at a box IoU of 0.5; see the
worked example after this list).
- Triplet detection performance (correct triplet-box matching).
We plan to award a prize for the best model in each of the sub-tasks.
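For the localization criterion, here is a worked example (not the
official evaluation code) of the IoU-0.5 test, with boxes in the
[x, y, w, h] format from the output specification above; the two boxes
are hypothetical values.

def iou(box_a, box_b):
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))  # overlap width
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))  # overlap height
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

pred = [0.40, 0.30, 0.20, 0.20]  # hypothetical predicted box
gt = [0.42, 0.31, 0.20, 0.20]    # hypothetical ground-truth box
print(iou(pred, gt) >= 0.5)      # True: counts as a correct localization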
--- Publication ---
1. Will my challenge submission be published?
We plan a joint publication on surgical action triplet detection, which
will include the submitted challenge models and results. More
information on this will be provided in due course.
2. Who will be co-authors?
Each of the top N performing teams can submit at most 2 qualifying
authors. The sub-challenge organizers determine the order of the authors
in the joint challenge paper. The value of N will be announced before
the challenge presentation and will depend on the number of
participants.
3. When can a participant publish an independent research on this
dataset?
Participants are allowed to publish their own results on triplet
recognition separately, using only the publicly released CholecT45
dataset [7]. However, triplet detection and localization results, which
are the prime focus of this challenge, cannot be published until after
the joint challenge paper. No publication can be made on CholecT50 [3]
before the joint publication.
4. When will the joint results be published?
This should be expected before the end of 2023.