:py:mod:`medcat.utils.ner.deid`
===============================

.. py:module:: medcat.utils.ner.deid

.. autoapi-nested-parse::

   De-identification model.

   This describes a wrapper around the regular CAT model.
   The idea is to simplify the use of a DeId-specific model.

   It tackles two use cases:

   1) Creation of a deid model
   2) Loading and use of a deid model

   For example, for use case 1, instead of::

      cat = CAT(cdb=ner.cdb, addl_ner=ner)

   you can use::

      deid = DeIdModel.create(ner)

   And for use case 2, instead of::

      cat = CAT.load_model_pack(model_pack_path)
      anon_text = deid_text(cat, text)

   you can use::

      deid = DeIdModel.load_model_pack(model_pack_path)
      anon_text = deid.deid_text(text)

   Or, if structured output is desired::

      deid = DeIdModel.load_model_pack(model_pack_path)
      anon_doc = deid(text)  # the spacy document

   The wrapper also exposes some CAT parts directly:

   - config
   - cdb


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.utils.ner.deid.DeIdModel


Attributes
~~~~~~~~~~

.. autoapisummary::

   medcat.utils.ner.deid.logger


.. py:data:: logger


.. py:class:: DeIdModel(cat)

   Bases: :py:obj:`medcat.utils.ner.model.NerModel`

   The DeID model.

   This wraps a CAT instance and simplifies its use as a
   de-identification model.

   It provides methods for creating one from a TransformersNER
   as well as loading one from a model pack (along with some validation).

   It also exposes some useful parts of the CAT it wraps, such as
   the config and the concept database.

   .. py:method:: __init__(cat)


   .. py:method:: train(json_path, *args, **kwargs)

      Train the underlying transformers NER model.

      All the extra arguments are passed to the TransformersNER train method.

      :param json_path: The JSON file path to read the training data from.
      :type json_path: Union[str, list, None]
      :param train_nr: The number of the NER object in cat._addl_train to train. Defaults to 0.
      :type train_nr: int
      :param \*args: Additional arguments for TransformersNER.train.
      :param \*\*kwargs: Additional keyword arguments for TransformersNER.train.

      :Returns: **Tuple[Any, Any, Any]** -- df, examples, dataset


   .. py:method:: eval(json_path, *args, **kwargs)

      Evaluate the underlying transformers NER model.

      All the extra arguments are passed to the TransformersNER eval method.

      :param json_path: The JSON file path to read the evaluation data from.
      :type json_path: Union[str, list, None]
      :param train_nr: The number of the NER object in cat._addl_train to evaluate. Defaults to 0.
      :type train_nr: int
      :param \*args: Additional arguments for TransformersNER.eval.
      :param \*\*kwargs: Additional keyword arguments for TransformersNER.eval.

      :Returns: **Tuple[Any, Any, Any]** -- df, examples, dataset


   .. py:method:: deid_text(text, redact = False)

      Deidentify text and potentially redact information.

      If redaction is enabled, identifiable entities are replaced
      with asterisks (e.g. ``*****``). Otherwise, each entity is replaced
      with its CUI, in other words the type of information that
      was hidden (e.g. ``[PATIENT]``).

      :param text: The text to deidentify.
      :type text: str
      :param redact: Whether to redact the information.
      :type redact: bool

      :Returns: **str** -- The deidentified text.


   .. py:method:: deid_multi_texts(texts, redact = False, addl_info = ['cui2icd10', 'cui2ontologies', 'cui2snomed'], n_process = None, batch_size = None)

      Deidentify multiple texts, optionally using several processes.

      :param texts: The texts to deidentify.
      :type texts: Union[Iterable[str], Iterable[Tuple]]
      :param redact: Whether to redact the information.
      :type redact: bool
      :param addl_info: Additional info. Defaults to ['cui2icd10', 'cui2ontologies', 'cui2snomed'].
      :type addl_info: List[str], optional
      :param n_process: Number of processes. Defaults to None.
      :type n_process: Optional[int], optional
      :param batch_size: The size of a batch. Defaults to None.
      :type batch_size: Optional[int], optional

      :raises ValueError: In case of unsupported input.

      :Returns: **List[str]** -- List of deidentified documents.


   .. py:method:: load_model_pack(model_pack_path, config = None)
      :classmethod:

      Load DeId model from model pack.

      The method first loads the CAT instance.
      It then makes sure that the model pack corresponds to a
      valid DeId model.

      :param model_pack_path: The model pack path.
      :type model_pack_path: str
      :param config: Config for the DeId model pack (primarily for the stride of the overlap window).

      :raises ValueError: If the model pack does not correspond to a DeId model.

      :Returns: **DeIdModel** -- The resulting DeId model.


   .. py:method:: _is_deid_model(cat)
      :classmethod:


   .. py:method:: _get_reason_not_deid(cat)
      :classmethod:
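
The two replacement modes described for ``deid_text`` (asterisks when redacting, a bracketed type label otherwise) can be illustrated with a small standalone sketch. This is not MedCAT's implementation and it does not call MedCAT. The ``replace_spans`` helper, the ``(start, end, label)`` span format, and the choice to emit one asterisk per replaced character are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical span format used only in this sketch: (start, end, label).
Span = Tuple[int, int, str]


def replace_spans(text: str, spans: List[Span], redact: bool = False) -> str:
    """Replace each span with asterisks (redact) or with its bracketed label."""
    out = text
    # Process spans from last to first so that earlier offsets stay valid
    # even when a replacement changes the length of the text.
    for start, end, label in sorted(spans, reverse=True):
        # Assumption of this sketch: one asterisk per replaced character.
        replacement = "*" * (end - start) if redact else f"[{label}]"
        out = out[:start] + replacement + out[end:]
    return out


text = "John Smith was seen on 2020-01-01."
spans = [(0, 10, "PATIENT"), (23, 33, "DATE")]

labelled = replace_spans(text, spans)               # type labels kept
redacted = replace_spans(text, spans, redact=True)  # entities masked
```

Replacing from the end of the text backwards avoids recomputing offsets after each substitution, which matters because a label such as ``[DATE]`` is usually not the same length as the text it replaces.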