Workflow question: Can you restrict automatic NER pre-annotation to simplify the annotation process? #528

Thanks for the help so far! 

### Problem
I am trying to use MedCAT in MedCATtrainer to annotate the main clinically relevant findings in pathology reports, using a large pretrained SNOMED International CDB. My goal is not to annotate every SNOMED concept in the text. I am only interested in findings such as histological type, grade, ER/PR/HER2/Ki-67, margins, and lymphovascular invasion (highlighted in blue in the screenshot). Currently, each document is automatically pre-annotated with many concepts ("material", "size", "cells", "protocol", etc.) that are not relevant for my use case, and I still have to manually mark each of them as "terminate" or "incorrect" before I can submit the document. This makes annotation quite slow. I realise my use case may differ from the intended use of MedCAT, but I am wondering if there is a better way.

I have a curated whitelist of ~200 clinically relevant CUIs per organ. The `CUI File` project filter restricts concept lookup but does not restrict automatic pre-annotation, so irrelevant concepts outside the whitelist are still automatically recognised and shown in grey.
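To make the behaviour I am after concrete, here is a minimal sketch in plain Python. It does not call MedCAT or MedCATtrainer. The entity dictionary only imitates the general shape of a pre-annotation result, and everything in it is an assumption for illustration: the one-column CSV layout with a `cui` header, the placeholder CUIs, and the helper names `load_cui_whitelist` and `keep_whitelisted`.

```python
import csv
import io


def load_cui_whitelist(handle):
    """Read a one-column CSV with a 'cui' header (assumed layout) into a set."""
    reader = csv.DictReader(handle)
    return {
        (row.get("cui") or "").strip()
        for row in reader
        if (row.get("cui") or "").strip()
    }


def keep_whitelisted(doc, whitelist):
    """Return a copy of the document with only whitelisted entities kept."""
    kept = {
        key: entity
        for key, entity in doc["entities"].items()
        if entity["cui"] in whitelist
    }
    return {**doc, "entities": kept}


# Placeholder CUIs, not real SNOMED codes.
whitelist = load_cui_whitelist(io.StringIO("cui\nC_GRADE\nC_MARGIN\n"))

# Hypothetical pre-annotation output for one report.
doc = {
    "entities": {
        0: {"cui": "C_GRADE", "source_value": "grade 2"},
        1: {"cui": "C_MATERIAL", "source_value": "material"},
        2: {"cui": "C_MARGIN", "source_value": "margins"},
    },
    "tokens": [],
}

filtered = keep_whitelisted(doc, whitelist)
print(sorted(e["source_value"] for e in filtered["entities"].values()))
```

In other words, I would like pre-annotation to behave as if this filter were applied before the document is shown, while manual lookup still searches the full CDB.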

<img width="1417" height="450" alt="Image" src="https://github.com/user-attachments/assets/1e076596-183b-40ba-8175-4bc8ad473f17" />

### Questions
- Is there a supported way to disable automatic NER pre-annotation entirely, while keeping the full CDB available for manual concept lookup?
- Alternatively, can I restrict automatic pre-annotation to only the CUIs in the project `CUI File`, while keeping the full CDB available for manual lookup when concepts are missing from the whitelist?
- Or would it be better to build a small CDB containing only the ~200 whitelisted concepts? If so, can I still manually search for and annotate concepts outside that CDB using the CDB search filter, or does every concept need to be fully added first (CUI, name, and synonyms)? The latter also seems inefficient.
- Or is the best approach to continue with the full CDB and terminate unwanted concepts as negative training examples?


