`medcat.datasets.medcat_annotations`

Module Contents

Classes

`MedCATAnnotationsConfig`	BuilderConfig for MedCATAnnotations.
`MedCATAnnotations`	MedCATAnnotations: Output of MedCAT

Attributes

`_CITATION`
`_DESCRIPTION`

medcat.datasets.medcat_annotations._CITATION = Multiline-String

"""@ARTICLE{Kraljevic2021-ln,
  title="Multi-domain clinical natural language processing with {MedCAT}: The Medical Concept Annotation Toolkit",
  author="Kraljevic, Zeljko and Searle, Thomas and Shek, Anthony and Roguski, Lukasz and Noor, Kawsar and Bean, Daniel and Mascio, Aurelie and Zhu, Leilei and Folarin, Amos A and Roberts, Angus and Bendayan, Rebecca and Richardson, Mark P and Stewart, Robert and Shah, Anoop D and Wong, Wai Keong and Ibrahim, Zina and Teo, James T and Dobson, Richard J B",
  journal="Artif. Intell. Med.",
  volume=117,
  pages="102083",
  month=jul,
  year=2021,
  issn="0933-3657",
  doi="10.1016/j.artmed.2021.102083"
}
"""

medcat.datasets.medcat_annotations._DESCRIPTION = Multiline-String

"""Takes as input a pickled dict of annotated documents from MedCAT. The format should be:
    {'document_id': {'entities': <entities>, ...}
Where entities is the output from medcat.get_entities(<...>)['entities']
"""
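
As an illustration of the input format described above, the following minimal sketch builds a dict keyed by document id and writes it to a pickle file. The helper names (`save_annotations`, `load_annotations`), the file name, and the entity fields and values are assumptions made for the example and are not part of this module; in practice the `entities` value would come from MedCAT's `get_entities` output.

```python
import os
import pickle
import tempfile

# Illustrative payload only: in real use, each 'entities' value would be
# taken from the ['entities'] part of the MedCAT get_entities output.
# The entity fields and values shown here are assumptions for the example.
annotated_docs = {
    "doc_1": {
        "entities": {
            0: {"cui": "C0011849", "start": 12, "end": 20, "source_value": "diabetes"},
        },
    },
    "doc_2": {"entities": {}},
}


def save_annotations(docs, path):
    """Pickle a {'document_id': {'entities': ...}} dict (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(docs, f)


def load_annotations(path):
    """Read the pickled dict back (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_path = os.path.join(tmp_dir, "annotations.pickle")
    save_annotations(annotated_docs, pickle_path)
    restored = load_annotations(pickle_path)
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.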

class medcat.datasets.medcat_annotations.MedCATAnnotationsConfig

Bases: datasets.BuilderConfig

BuilderConfig for MedCATAnnotations.

Parameters: **kwargs – keyword arguments forwarded to super.

class medcat.datasets.medcat_annotations.MedCATAnnotations(cache_dir=None, dataset_name=None, config_name=None, hash=None, base_path=None, info=None, features=None, token=None, use_auth_token='deprecated', repo_id=None, data_files=None, data_dir=None, storage_options=None, writer_batch_size=None, name='deprecated', **config_kwargs)

Bases: datasets.GeneratorBasedBuilder

MedCATAnnotations: Output of MedCAT

Parameters:

cache_dir (Optional[str])
dataset_name (Optional[str])
config_name (Optional[str])
hash (Optional[str])
base_path (Optional[str])
info (Optional[datasets.info.DatasetInfo])
features (Optional[datasets.features.Features])
token (Optional[Union[bool, str]])
repo_id (Optional[str])
data_files (Optional[Union[str, list, dict, datasets.data_files.DataFilesDict]])
data_dir (Optional[str])
storage_options (Optional[dict])
writer_batch_size (Optional[int])

BUILDER_CONFIGS

_info()

Construct the DatasetInfo object. See DatasetInfo for details.

Warning: This function is only called once and the result is cached for all following .info() calls.

Returns: info – (DatasetInfo) The dataset information

_split_generators(dl_manager): Returns SplitGenerators.

_generate_examples(filepath): This function returns the examples in the raw (text) form.
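
In Hugging Face `datasets`, a `_generate_examples` method is expected to yield `(key, example)` pairs. The standalone sketch below shows that pattern applied to a pickled dict in the format given in `_DESCRIPTION`. It is not the implementation used by `MedCATAnnotations`: the function name `generate_examples`, the key format, and the example fields (`document_id`, `cui`, `start`, `end`) are assumptions chosen for illustration.

```python
import os
import pickle
import tempfile


def generate_examples(filepath):
    """Yield (key, example) pairs from a pickled dict of annotated documents.

    Hypothetical stand-in for a builder's _generate_examples method; the
    key format and example fields are assumptions, not the library's own.
    """
    with open(filepath, "rb") as f:
        docs = pickle.load(f)
    for doc_id, doc in docs.items():
        for ent_id, ent in doc.get("entities", {}).items():
            # Keys must be unique per example, so combine document and entity ids.
            key = f"{doc_id}|{ent_id}"
            yield key, {
                "document_id": str(doc_id),
                "cui": ent.get("cui"),
                "start": ent.get("start"),
                "end": ent.get("end"),
            }


# Small illustrative input; the values are made up for the example.
sample = {
    "doc_1": {"entities": {0: {"cui": "C0011849", "start": 12, "end": 20}}},
    "doc_2": {"entities": {}},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "annotations.pickle")
    with open(path, "wb") as f:
        pickle.dump(sample, f)
    examples = list(generate_examples(path))
```

In this sketch, documents with no entities produce no examples, so `doc_2` does not appear in `examples`.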
