medcat.preprocessing.iterators

Module Contents

Classes

EmbMimicCSV

Iterate over MIMIC data in CSV format

BertEmbMimicCSV

Iterate over MIMIC data in CSV format

BaseEmbMimicCSV

Iterate over MIMIC data in CSV format

RawCSV

Iterate over MIMIC data in CSV format

FastEmbMimicCSV

Iterate over MIMIC data in CSV format

SimpleIter

Attributes

NUM

FAST_SPLIT

medcat.preprocessing.iterators.NUM = 'NUMNUM'
medcat.preprocessing.iterators.FAST_SPLIT
class medcat.preprocessing.iterators.EmbMimicCSV(csv_paths, tokenizer, emb_dict=None)

Bases: object

Iterate over MIMIC data in CSV format

csv_paths: paths to CSV files containing the MIMIC data

Parameters:
  • csv_paths (List[str]) –

  • tokenizer (Any) –

  • emb_dict (Optional[Dict]) –

__init__(csv_paths, tokenizer, emb_dict=None)
Parameters:
  • csv_paths (List[str]) –

  • tokenizer (Any) –

  • emb_dict (Optional[Dict]) –

Return type:

None

__iter__()
Return type:

Iterable[List]
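
The iteration pattern described above (one list per CSV row, with an optional embedding dictionary) can be sketched in a self-contained way. This is not MedCAT's implementation: the class name `TokenCSVIterator`, the `text` column, the plain callable tokenizer, the replacement of digit-only tokens with the `NUM` placeholder, and the use of `emb_dict` as a vocabulary filter are all assumptions made for illustration.

```python
import csv
import os
import tempfile
from typing import Callable, Dict, Iterator, List, Optional

NUM = "NUMNUM"  # same string as the module-level NUM attribute


class TokenCSVIterator:
    """Hypothetical stand-in that yields one token list per CSV row."""

    def __init__(self, csv_paths: List[str],
                 tokenizer: Callable[[str], List[str]],
                 emb_dict: Optional[Dict] = None,
                 text_column: str = "text") -> None:
        self.csv_paths = csv_paths
        self.tokenizer = tokenizer
        self.emb_dict = emb_dict
        self.text_column = text_column  # assumed column name

    def __iter__(self) -> Iterator[List[str]]:
        # Files are reopened on every call, so the object can be
        # iterated more than once (unlike a plain generator).
        for path in self.csv_paths:
            with open(path, newline="", encoding="utf-8") as handle:
                for row in csv.DictReader(handle):
                    tokens: List[str] = []
                    for token in self.tokenizer(row[self.text_column]):
                        token = token.lower()
                        if token.isdigit():
                            # Assumption: numbers collapse to one placeholder.
                            tokens.append(NUM)
                        elif self.emb_dict is None or token in self.emb_dict:
                            # Assumption: emb_dict filters out-of-vocabulary tokens.
                            tokens.append(token)
                    yield tokens


def demo_token_csv() -> List[List[str]]:
    """Write a one-row CSV to a temporary directory and iterate over it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notes.csv")
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["text"])
            writer.writerow(["Patient given 40 mg furosemide"])
        vocab = {"patient": 0, "furosemide": 1}
        return list(TokenCSVIterator([path], str.split, emb_dict=vocab))
```

Implementing `__iter__` on a class, rather than exposing a single generator, lets a consumer that makes several passes over the data restart the iteration each time.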

class medcat.preprocessing.iterators.BertEmbMimicCSV(csv_paths, tokenizer)

Bases: object

Iterate over MIMIC data in CSV format

csv_paths: paths to CSV files containing the MIMIC data

Parameters:
  • csv_paths (List[str]) –

  • tokenizer (pytorch_pretrained_bert.BertTokenizer) –

__init__(csv_paths, tokenizer)
Parameters:
  • csv_paths (List[str]) –

  • tokenizer (pytorch_pretrained_bert.BertTokenizer) –

Return type:

None

__iter__()
Return type:

Iterable[List]

class medcat.preprocessing.iterators.BaseEmbMimicCSV(csv_paths, tokenizer)

Bases: object

Iterate over MIMIC data in CSV format

csv_paths: paths to CSV files containing the MIMIC data

Parameters:
  • csv_paths (List[str]) –

  • tokenizer (pytorch_pretrained_bert.BertTokenizer) –

__init__(csv_paths, tokenizer)
Parameters:
  • csv_paths (List[str]) –

  • tokenizer (pytorch_pretrained_bert.BertTokenizer) –

Return type:

None

__iter__()
Return type:

Iterable[Tuple]

class medcat.preprocessing.iterators.RawCSV(csv_paths)

Bases: object

Iterate over MIMIC data in CSV format

csv_paths: paths to CSV files containing the MIMIC data

Parameters:

csv_paths (List[str]) –

__init__(csv_paths)
Parameters:

csv_paths (List[str]) –

Return type:

None

__iter__()
Return type:

Iterable[str]

class medcat.preprocessing.iterators.FastEmbMimicCSV(csv_paths)

Bases: object

Iterate over MIMIC data in CSV format

csv_paths: paths to CSV files containing the MIMIC data

Parameters:

csv_paths (List[str]) –

__init__(csv_paths)
Parameters:

csv_paths (List[str]) –

Return type:

None

__iter__()
Return type:

Iterable[List[str]]

class medcat.preprocessing.iterators.SimpleIter(text_path)

Bases: object

Parameters:

text_path (str) –

__init__(text_path)
Parameters:

text_path (str) –

Return type:

None

__iter__()
Return type:

Iterable[List[str]]
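
The reference gives only the signature for this class: a single text path in, lists of strings out. A minimal sketch of an iterator with that shape follows. It is not the library's code: the class name `LineTokenIterator`, the one-document-per-line layout, the lowercasing, and the `SPLIT` pattern (which is not the module's `FAST_SPLIT`, whose value is not shown here) are all assumptions.

```python
import os
import re
import tempfile
from typing import Iterator, List

# Hypothetical split pattern: any run of non-alphanumeric characters.
SPLIT = re.compile(r"[^A-Za-z0-9]+")


class LineTokenIterator:
    """Hypothetical stand-in that yields one token list per non-empty line."""

    def __init__(self, text_path: str) -> None:
        self.text_path = text_path

    def __iter__(self) -> Iterator[List[str]]:
        with open(self.text_path, encoding="utf-8") as handle:
            for line in handle:
                # Drop the empty strings re.split produces at the edges.
                tokens = [part.lower() for part in SPLIT.split(line) if part]
                if tokens:  # skip blank lines
                    yield tokens


def demo_line_tokens() -> List[List[str]]:
    """Write a small text file to a temporary directory and iterate over it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notes.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("Chest pain, no fever.\n\nBP 120/80\n")
        return list(LineTokenIterator(path))
```

Reading line by line keeps memory use independent of file size, which matters when the corpus is too large to load at once.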