`medcat.ner.transformers_ner`

Module Contents

Classes

TransformersNER

TODO: Add documentation

Attributes

logger

medcat.ner.transformers_ner.logger

class medcat.ner.transformers_ner.TransformersNER(cdb, config=None, training_arguments=None)

Bases: object

TODO: Add documentation

Parameters:: config (Optional[medcat.config_transformers_ner.ConfigTransformersNER]) –

name = 'transformers_ner'

__init__(cdb, config=None, training_arguments=None)

Parameters:: config (Optional[medcat.config_transformers_ner.ConfigTransformersNER]) –
Return type:: None

create_eval_pipeline()

get_hash()

A partial hash trying to catch differences between models.

Returns:: str – The hex hash.
Return type:: str
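
As a rough illustration of what a partial hash can look like, the sketch below hashes a small dictionary of settings and returns the hex digest. The helper name `partial_hash` and the idea of hashing a settings dictionary are assumptions made for this example; this is not the scheme `get_hash()` uses internally.

```python
import hashlib
import json


def partial_hash(settings: dict) -> str:
    """Hypothetical helper: hex digest over a JSON dump of selected settings."""
    # sort_keys makes the serialisation, and therefore the hash, independent
    # of the order in which keys were inserted.
    payload = json.dumps(settings, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two models whose selected settings differ produce different digests, while models that differ only in fields left out of the dictionary do not, which is why such a hash is only "partial".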

_prepare_dataset(json_path, ignore_extra_labels, meta_requirements, file_name='data.json')

train(json_path=None, ignore_extra_labels=False, dataset=None, meta_requirements=None, trainer_callbacks=None)

Train or continue training a model given a json_path containing a MedCATtrainer export. Training continues if an existing model is loaded, or starts from scratch if the model is blank/new.

Parameters:

json_path (str or list) – Path or list of paths to a MedCATtrainer export containing the meta_annotations we want to train for.
ignore_extra_labels – Only makes sense when an existing deid model was loaded and we want to ignore labels in the new data that did not exist in the old model.
dataset – Defaults to None.
meta_requirements – Defaults to None.
trainer_callbacks (List[TrainerCallback]) – A list of trainer callbacks for collecting metrics during the training at the client side. The transformers Trainer object will be passed in when each callback is called.

Returns:

Tuple – The dataframe, examples, and the dataset

Return type:

Tuple
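
A minimal sketch of calling `train` and unpacking its result, based only on the signature and return description above. The wrapper name `train_from_export` is hypothetical, and `ner` is assumed to be an already constructed `TransformersNER` instance (construction is not shown here).

```python
from typing import Any, List, Tuple, Union


def train_from_export(ner: Any, json_path: Union[str, List[str]]) -> Tuple[Any, Any, Any]:
    """Hypothetical wrapper: call train() on a TransformersNER-like object."""
    # ignore_extra_labels=False is the documented default, spelled out for clarity.
    result = ner.train(json_path=json_path, ignore_extra_labels=False)
    # The documented return value is a tuple of the dataframe, the examples
    # and the dataset, in that order.
    df, examples, dataset = result
    return df, examples, dataset
```

Passing a list of paths instead of a single string should work the same way, since `json_path` is documented as `str or list`.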

eval(json_path=None, dataset=None, ignore_extra_labels=False, meta_requirements=None)

Parameters:: json_path (Union[str, list, None]) –

save(save_dir_path)

Save all components of this class to a directory.

Parameters:: save_dir_path (str) – Path to the directory where everything will be saved.
Return type:: None

classmethod load(save_dir_path, config_dict=None)

Load a TransformersNER object.

Parameters:

save_dir_path (str) – The directory where all was saved.
config_dict (dict) – This can be used to overwrite saved parameters for this TransformersNER instance. It is needed in certain cases, such as automated deployment.

Returns:

TransformersNER – The loaded TransformersNER instance.

Return type:

TransformersNER
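
A short sketch of a save/load round trip using the two signatures documented above. The helper name `save_and_reload` is hypothetical, and `ner` is assumed to be an existing `TransformersNER` instance.

```python
from typing import Any, Optional


def save_and_reload(ner: Any, save_dir_path: str,
                    config_dict: Optional[dict] = None) -> Any:
    """Hypothetical helper: save a TransformersNER-like object, then load it back."""
    ner.save(save_dir_path)
    # load is documented as a classmethod, so it is called on the class
    # rather than on the instance that was just saved.
    return type(ner).load(save_dir_path, config_dict=config_dict)
```

Leaving `config_dict` as None reloads with the saved parameters; passing a dictionary is the documented way to overwrite them at load time.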

static batch_generator(stream, batch_size_chars)

Parameters:

stream (Iterable[spacy.tokens.Doc]) –
batch_size_chars (int) –

Return type:

Iterable[List[spacy.tokens.Doc]]
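
To illustrate the general idea of batching by character count, the sketch below groups plain strings until a character budget is reached. It works on `str` rather than `spacy.tokens.Doc` so that it runs without spaCy, the name `batch_by_chars` is hypothetical, and the exact cut-off rule is an assumption rather than a description of how `batch_generator` behaves.

```python
from typing import Iterable, Iterator, List


def batch_by_chars(texts: Iterable[str], batch_size_chars: int) -> Iterator[List[str]]:
    """Hypothetical sketch: group texts into batches of roughly batch_size_chars."""
    batch: List[str] = []
    char_count = 0
    for text in texts:
        batch.append(text)
        char_count += len(text)
        # Emit the batch as soon as the character budget is reached.
        if char_count >= batch_size_chars:
            yield batch
            batch = []
            char_count = 0
    # Emit whatever is left over as a final, smaller batch.
    if batch:
        yield batch
```

Batching by characters instead of by document count keeps the work per batch roughly even when document lengths vary widely.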

pipe(stream, *args, **kwargs)

Process many documents at once.

Parameters:

stream (Iterable[spacy.tokens.Doc]) – An iterable of spaCy documents.
*args – Extra arguments (not used here).
**kwargs – Extra keyword arguments (not used here).

Yields:

Doc – The same document.

Returns:

Iterator[Doc] – If the stream is None or empty.

Return type:

Iterator[spacy.tokens.Doc]

_process(stream, batch_size_chars)

Parameters:

stream (Iterable[Union[spacy.tokens.Doc, None]]) –
batch_size_chars (int) –

Return type:

Iterator[Optional[spacy.tokens.Doc]]

__call__(doc)

Process one document, used in the spacy pipeline for sequential document processing.

Parameters:: doc (Doc) – A spacy document
Returns:: Doc – The same spacy document.
Return type:: spacy.tokens.Doc
