:py:mod:`medcat.config_rel_cat`
===============================

.. py:module:: medcat.config_rel_cat


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.config_rel_cat.General
   medcat.config_rel_cat.Model
   medcat.config_rel_cat.Train
   medcat.config_rel_cat.ConfigRelCAT


.. py:class:: General

   Bases: :py:obj:`medcat.config.MixingConfig`, :py:obj:`medcat.config.BaseModel`

   The General part of the RelCAT config

   .. py:attribute:: device
      :type: str
      :value: 'cpu'

      The device to use (CPU or GPU).

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: relation_type_filter_pairs
      :type: List
      :value: []

      Map from category values to ID; if empty, it will be calculated automatically during training.

   .. py:attribute:: vocab_size
      :type: medcat.config.Optional[int]

   .. py:attribute:: lowercase
      :type: bool
      :value: True

      If true, all input text will be lowercased.

   .. py:attribute:: cntx_left
      :type: int
      :value: 15

      Number of tokens to take from the left of the concept.

   .. py:attribute:: cntx_right
      :type: int
      :value: 15

      Number of tokens to take from the right of the concept.

   .. py:attribute:: window_size
      :type: int
      :value: 300

      Max acceptable distance between entities (in characters). Take care when using this, as it can produce sentences that are over 512 tokens (the limit is given by the tokenizer).

   .. py:attribute:: mct_export_max_non_rel_sample_size
      :type: int
      :value: 200

      Limit the number of 'Other' samples selected for training/test. This is applied per encountered MedCAT project, as sample_size/num_projects.

   .. py:attribute:: mct_export_create_addl_rels
      :type: bool
      :value: False

      When processing relations from a MedCAT export, relations labeled as 'Other' are created from all the available annotation pairs.

   .. py:attribute:: tokenizer_name
      :type: str
      :value: 'bert'

      The name of the tokenizer used.

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: model_name
      :type: str
      :value: 'bert-base-uncased'

      The name of the model used.

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: log_level
      :type: int

      The log level for RelCAT.

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: max_seq_length
      :type: int
      :value: 512

      The maximum sequence length.

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: tokenizer_special_tokens
      :type: bool
      :value: False

      Tokenizer.

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: annotation_schema_tag_ids
      :type: List
      :value: []

      If a foreign non-MCAT trainer dataset is used, you can insert your own Rel entity token delimiters into the tokenizer and copy those token IDs here. You will also need to resize your tokenizer embeddings and adjust the hidden_size of the model; this will depend on the number of tokens you introduce.

   .. py:attribute:: labels2idx
      :type: Dict

   .. py:attribute:: idx2labels
      :type: Dict

   .. py:attribute:: pin_memory
      :type: bool
      :value: True

   .. py:attribute:: seed
      :type: int
      :value: 13

      The seed for random number generation.

      NOTE: If used alongside MetaCAT or additional NER, only one of the seeds will take effect.

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: task
      :type: str
      :value: 'train'

      The task for RelCAT.

      NB! For these changes to take effect, the pipe would need to be recreated.


.. py:class:: Model

   Bases: :py:obj:`medcat.config.MixingConfig`, :py:obj:`medcat.config.BaseModel`

   The model part of the RelCAT config

   .. py:class:: Config

      .. py:attribute:: extra
         :value: 'allow'

      .. py:attribute:: validate_assignment
         :value: True

   .. py:attribute:: input_size
      :type: int
      :value: 300

   .. py:attribute:: hidden_size
      :type: int
      :value: 768

      The hidden size.

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: hidden_layers
      :type: int
      :value: 3

      hidden_size * 5, 5 being the number of tokens, default (s1,s2,e1,e2+CLS).

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: model_size
      :type: int
      :value: 5120

      The size of the model.

      NB! For these changes to take effect, the pipe would need to be recreated.

   .. py:attribute:: dropout
      :type: float
      :value: 0.2

   .. py:attribute:: num_directions
      :type: int
      :value: 2

      2 - bidirectional model, 1 - unidirectional

   .. py:attribute:: padding_idx
      :type: int

   .. py:attribute:: emb_grad
      :type: bool
      :value: True

      If True, the embeddings will also be trained.

   .. py:attribute:: ignore_cpos
      :type: bool
      :value: False

      If set to True, center positions will be ignored when calculating the representation.


.. py:class:: Train

   Bases: :py:obj:`medcat.config.MixingConfig`, :py:obj:`medcat.config.BaseModel`

   The train part of the RelCAT config

   .. py:class:: Config

      .. py:attribute:: extra
         :value: 'allow'

      .. py:attribute:: validate_assignment
         :value: True

   .. py:attribute:: nclasses
      :type: int
      :value: 2

      Number of classes that this model will output.

   .. py:attribute:: batch_size
      :type: int
      :value: 25

   .. py:attribute:: nepochs
      :type: int
      :value: 1

   .. py:attribute:: lr
      :type: float
      :value: 0.0001

   .. py:attribute:: adam_epsilon
      :type: float
      :value: 0.0001

   .. py:attribute:: test_size
      :type: float
      :value: 0.2

   .. py:attribute:: gradient_acc_steps
      :type: int
      :value: 1

   .. py:attribute:: multistep_milestones
      :type: List[int]
      :value: [2, 4, 6, 8, 12, 15, 18, 20, 22, 24, 26, 30]

   .. py:attribute:: multistep_lr_gamma
      :type: float
      :value: 0.8

   .. py:attribute:: max_grad_norm
      :type: float
      :value: 1.0

   .. py:attribute:: shuffle_data
      :type: bool
      :value: True

      Used only during training. If set, the dataset will be shuffled before the train/test split.

   .. py:attribute:: class_weights
      :type: medcat.config.Optional[Any]

   .. py:attribute:: score_average
      :type: str
      :value: 'weighted'

      What to use for averaging F1/P/R across labels.

   .. py:attribute:: auto_save_model
      :type: bool
      :value: True

      Whether the model should be saved during training for best results.


.. py:class:: ConfigRelCAT

   Bases: :py:obj:`medcat.config.MixingConfig`, :py:obj:`medcat.config.BaseModel`

   The RelCAT part of the config

   .. py:class:: Config

      .. py:attribute:: extra
         :value: 'allow'

      .. py:attribute:: validate_assignment
         :value: True

   .. py:attribute:: general
      :type: General

   .. py:attribute:: model
      :type: Model

   .. py:attribute:: train
      :type: Train
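
The ``Train`` attributes ``lr``, ``multistep_milestones`` and ``multistep_lr_gamma`` carry no description of how they interact. The sketch below is one plausible reading, not taken from the RelCAT source: it assumes a multi-step decay in which the learning rate is multiplied by ``multistep_lr_gamma`` each time a milestone epoch is reached. The helper ``multistep_lr`` is hypothetical and uses only the standard library; the default values are copied from the attributes listed for ``Train``.

.. code-block:: python

   from bisect import bisect_right

   # Defaults copied from the ``Train`` attributes documented in this module.
   BASE_LR = 0.0001
   MILESTONES = [2, 4, 6, 8, 12, 15, 18, 20, 22, 24, 26, 30]
   GAMMA = 0.8


   def multistep_lr(epoch, base_lr=BASE_LR, milestones=MILESTONES, gamma=GAMMA):
       """Learning rate at ``epoch`` under a multi-step decay (hypothetical helper).

       Assumption: the rate is multiplied by ``gamma`` once for every
       milestone that is less than or equal to ``epoch`` (0-based epochs).
       """
       # Number of milestones already reached at this epoch.
       passed = bisect_right(milestones, epoch)
       return base_lr * gamma ** passed


   if __name__ == "__main__":
       # Print the assumed learning rate at a few sample epochs.
       for epoch in (0, 2, 12, 30):
           print(epoch, multistep_lr(epoch))

Under this assumption, the default ``nepochs`` of 1 never reaches the first milestone (epoch 2), so the decay would only matter once ``nepochs`` is raised.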