:py:mod:`medcat.config_rel_cat`
===============================

.. py:module:: medcat.config_rel_cat


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.config_rel_cat.General
   medcat.config_rel_cat.Model
   medcat.config_rel_cat.Train
   medcat.config_rel_cat.ConfigRelCAT


.. py:class:: General


   Bases: :py:obj:`medcat.config.MixingConfig`, :py:obj:`medcat.config.BaseModel`

   The General part of the RelCAT config

   .. py:attribute:: device
      :type: str
      :value: 'cpu'

      
   .. py:attribute:: relation_type_filter_pairs
      :type: List
      :value: []

      Map from category values to ID. If empty, it is calculated automatically during training.

   .. py:attribute:: vocab_size
      :type: medcat.config.Optional[int]

      
   .. py:attribute:: lowercase
      :type: bool
      :value: True

      If True, all input text will be lowercased.

   .. py:attribute:: cntx_left
      :type: int
      :value: 15

      Number of tokens to take from the left of the concept

   .. py:attribute:: cntx_right
      :type: int
      :value: 15

      Number of tokens to take from the right of the concept
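
      A minimal sketch of how ``cntx_left`` and ``cntx_right`` could bound the context around a concept, assuming a pre-tokenized input. The helper ``context_window`` and the sample tokens are hypothetical and not part of MedCAT:

      .. code-block:: python

         from typing import List


         def context_window(tokens: List[str], start: int, end: int,
                            cntx_left: int = 15, cntx_right: int = 15) -> List[str]:
             """Return the concept tokens [start, end) plus the surrounding context."""
             left = max(0, start - cntx_left)            # clamp at the start of the text
             right = min(len(tokens), end + cntx_right)  # clamp at the end of the text
             return tokens[left:right]


         tokens = "the patient was given aspirin for chest pain yesterday".split()
         # The concept "aspirin" sits at index 4; keep 2 tokens on each side.
         window = context_window(tokens, 4, 5, cntx_left=2, cntx_right=2)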

   .. py:attribute:: window_size
      :type: int
      :value: 300

      Maximum acceptable distance between entities (in characters). Use with care, as large values can produce sentences longer than 512 tokens (the limit is set by the tokenizer).
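
      As an illustration, the following sketch keeps only the entity pairs that fall within ``window_size`` characters of each other. The ``Entity`` tuple layout, the helper ``pairs_within_window``, and the choice to measure distance between start offsets are all assumptions made for this example, not MedCAT's actual implementation:

      .. code-block:: python

         from itertools import combinations
         from typing import List, Tuple

         Entity = Tuple[str, int, int]  # hypothetical layout: (text, start_char, end_char)


         def pairs_within_window(entities: List[Entity],
                                 window_size: int = 300) -> List[Tuple[Entity, Entity]]:
             """Keep pairs whose start offsets are at most window_size characters apart."""
             return [(a, b) for a, b in combinations(entities, 2)
                     if abs(b[1] - a[1]) <= window_size]


         entities = [("aspirin", 10, 17), ("chest pain", 120, 130), ("fracture", 900, 908)]
         pairs = pairs_within_window(entities)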

   .. py:attribute:: mct_export_max_non_rel_sample_size
      :type: int
      :value: 200

      Limits the number of 'Other' samples selected for training/testing. The limit is applied per encountered MedCAT project, i.e. sample_size / num_projects.
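
      The per-project arithmetic can be sketched as follows. The helper ``per_project_limit`` is hypothetical, and the use of integer division is an assumption:

      .. code-block:: python

         def per_project_limit(max_sample_size: int, num_projects: int) -> int:
             """Split the overall cap evenly across projects (integer division assumed)."""
             return max_sample_size // max(1, num_projects)


         # With the default cap of 200 and 4 projects in the export:
         limit = per_project_limit(200, 4)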

   .. py:attribute:: mct_export_create_addl_rels
      :type: bool
      :value: False

      If True, when processing relations from a MedCAT export, relations labeled as 'Other' are created from all available annotation pairs.
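
      A rough sketch of the idea, assuming that every ordered pair of annotations without an existing label becomes an 'Other' relation. The helper ``add_other_relations`` and the sample data are hypothetical. Whether MedCAT considers ordered or unordered pairs is not stated here:

      .. code-block:: python

         from itertools import permutations
         from typing import Dict, List, Tuple

         Pair = Tuple[str, str]


         def add_other_relations(annotations: List[str],
                                 labelled: Dict[Pair, str]) -> Dict[Pair, str]:
             """Label every ordered annotation pair without a relation as 'Other'."""
             relations = dict(labelled)  # copy, so the input is left untouched
             for pair in permutations(annotations, 2):
                 relations.setdefault(pair, "Other")
             return relations


         annotations = ["aspirin", "chest pain", "fracture"]
         labelled = {("aspirin", "chest pain"): "treats"}
         relations = add_other_relations(annotations, labelled)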

   .. py:attribute:: tokenizer_name
      :type: str
      :value: 'bert'

      
   .. py:attribute:: model_name
      :type: str
      :value: 'bert-base-uncased'

      
   .. py:attribute:: log_level
      :type: int

      
   .. py:attribute:: max_seq_length
      :type: int
      :value: 512

      
   .. py:attribute:: tokenizer_special_tokens
      :type: bool
      :value: False

      
   .. py:attribute:: annotation_schema_tag_ids
      :type: List
      :value: []

      If a foreign (non-MCAT trainer) dataset is used, you can insert your own relation entity token delimiters into the tokenizer and copy their token IDs here. You must also resize the tokenizer embeddings and adjust the ``hidden_size`` of the model, depending on the number of tokens you introduce.

   .. py:attribute:: labels2idx
      :type: Dict

      
   .. py:attribute:: idx2labels
      :type: Dict

      
   .. py:attribute:: pin_memory
      :type: bool
      :value: True

      
   .. py:attribute:: seed
      :type: int
      :value: 13

      
   .. py:attribute:: task
      :type: str
      :value: 'train'

      
.. py:class:: Model


   Bases: :py:obj:`medcat.config.MixingConfig`, :py:obj:`medcat.config.BaseModel`

   The model part of the RelCAT config

   .. py:class:: Config


      .. py:attribute:: extra

         
      .. py:attribute:: validate_assignment
         :value: True

         
   .. py:attribute:: input_size
      :type: int
      :value: 300

      
   .. py:attribute:: hidden_size
      :type: int
      :value: 768

      
   .. py:attribute:: hidden_layers
      :type: int
      :value: 3

      ``hidden_size`` * 5, where 5 is the default number of tokens (s1, s2, e1, e2 + CLS)
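
      The formula above can be worked through with the documented defaults. The variable names are illustrative only. Note that the documented default ``hidden_size`` of 768 yields 3840, while the documented default ``model_size`` of 5120 would correspond to a hidden size of 1024 under the same formula:

      .. code-block:: python

         hidden_size = 768   # documented default of ``hidden_size``
         num_tokens = 5      # s1, s2, e1, e2 + CLS

         model_size = hidden_size * num_tokens

         # Hidden size implied by the documented default ``model_size`` of 5120.
         implied_hidden_size = 5120 // num_tokens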

   .. py:attribute:: model_size
      :type: int
      :value: 5120

      
   .. py:attribute:: dropout
      :type: float
      :value: 0.2

      
   .. py:attribute:: num_directions
      :type: int
      :value: 2

      2 - bidirectional model, 1 - unidirectional

   .. py:attribute:: padding_idx
      :type: int

      
   .. py:attribute:: emb_grad
      :type: bool
      :value: True

      If True, the embeddings will also be trained.

   .. py:attribute:: ignore_cpos
      :type: bool
      :value: False

      If set to True, center positions will be ignored when calculating the representation.


.. py:class:: Train


   Bases: :py:obj:`medcat.config.MixingConfig`, :py:obj:`medcat.config.BaseModel`

   The train part of the RelCAT config

   .. py:class:: Config


      .. py:attribute:: extra

         
      .. py:attribute:: validate_assignment
         :value: True

         
   .. py:attribute:: nclasses
      :type: int
      :value: 2

      Number of classes that this model will output

   .. py:attribute:: batch_size
      :type: int
      :value: 25

      
   .. py:attribute:: nepochs
      :type: int
      :value: 1

      
   .. py:attribute:: lr
      :type: float
      :value: 0.0001

      
   .. py:attribute:: adam_epsilon
      :type: float
      :value: 0.0001

      
   .. py:attribute:: test_size
      :type: float
      :value: 0.2

      
   .. py:attribute:: gradient_acc_steps
      :type: int
      :value: 1

      
   .. py:attribute:: multistep_milestones
      :type: List[int]
      :value: [2, 4, 6, 8, 12, 15, 18, 20, 22, 24, 26, 30]

      
   .. py:attribute:: multistep_lr_gamma
      :type: float
      :value: 0.8
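
      The names ``multistep_milestones`` and ``multistep_lr_gamma`` suggest a multi-step learning-rate schedule, such as the one implemented by ``torch.optim.lr_scheduler.MultiStepLR``. This is an assumption and is not confirmed on this page. Under that assumption, the learning rate at a given epoch can be sketched in plain Python (``lr_at_epoch`` is a hypothetical helper):

      .. code-block:: python

         from bisect import bisect_right
         from typing import Sequence

         DEFAULT_MILESTONES = (2, 4, 6, 8, 12, 15, 18, 20, 22, 24, 26, 30)


         def lr_at_epoch(epoch: int, base_lr: float = 0.0001,
                         milestones: Sequence[int] = DEFAULT_MILESTONES,
                         gamma: float = 0.8) -> float:
             """Multiply base_lr by gamma once for every milestone already reached."""
             return base_lr * gamma ** bisect_right(milestones, epoch)


         # Learning rate for the first ten epochs with the documented defaults.
         schedule = [lr_at_epoch(epoch) for epoch in range(10)]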

      
   .. py:attribute:: max_grad_norm
      :type: float
      :value: 1.0

      
   .. py:attribute:: shuffle_data
      :type: bool
      :value: True

      Used only during training. If set, the dataset will be shuffled before the train/test split.
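
      A minimal sketch of shuffling before the split, using ``test_size`` and a fixed seed. The helper ``train_test_split`` below is a hypothetical stand-in written for this example and is not MedCAT's own splitting code:

      .. code-block:: python

         import random
         from typing import List, Sequence, Tuple


         def train_test_split(data: Sequence[int], test_size: float = 0.2,
                              shuffle_data: bool = True,
                              seed: int = 13) -> Tuple[List[int], List[int]]:
             """Optionally shuffle, then hold out the first test_size fraction for testing."""
             items = list(data)
             if shuffle_data:
                 random.Random(seed).shuffle(items)  # seeded for reproducibility
             n_test = int(len(items) * test_size)
             return items[n_test:], items[:n_test]


         train, test = train_test_split(range(10))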

   .. py:attribute:: class_weights
      :type: medcat.config.Optional[Any]

      
   .. py:attribute:: score_average
      :type: str
      :value: 'weighted'

      What to use for averaging F1/P/R across labels
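
      Assuming ``'weighted'`` carries the same meaning as the ``average`` parameter in scikit-learn's metrics (a mean of per-label scores weighted by each label's support), the computation can be sketched as follows. The helper ``weighted_average`` and the sample scores are made up for illustration:

      .. code-block:: python

         from typing import Dict


         def weighted_average(scores: Dict[str, float], support: Dict[str, int]) -> float:
             """Average per-label scores, weighting each label by its number of samples."""
             total = sum(support[label] for label in scores)
             return sum(scores[label] * support[label] for label in scores) / total


         f1 = {"treats": 0.9, "Other": 0.5}
         support = {"treats": 30, "Other": 70}
         weighted_f1 = weighted_average(f1, support)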

   .. py:attribute:: auto_save_model
      :type: bool
      :value: True

      Whether the model should be saved during training for best results.


.. py:class:: ConfigRelCAT


   Bases: :py:obj:`medcat.config.MixingConfig`, :py:obj:`medcat.config.BaseModel`

   The RelCAT part of the config

   .. py:class:: Config


      .. py:attribute:: extra

         
      .. py:attribute:: validate_assignment
         :value: True

         
   .. py:attribute:: general
      :type: General

      
   .. py:attribute:: model
      :type: Model

      
   .. py:attribute:: train
      :type: Train