:py:mod:`medcat.utils.meta_cat.ml_utils`
========================================

.. py:module:: medcat.utils.meta_cat.ml_utils


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.utils.meta_cat.ml_utils.FocalLoss


Functions
~~~~~~~~~

.. autoapisummary::

   medcat.utils.meta_cat.ml_utils.set_all_seeds
   medcat.utils.meta_cat.ml_utils.create_batch_piped_data
   medcat.utils.meta_cat.ml_utils.predict
   medcat.utils.meta_cat.ml_utils.split_list_train_test
   medcat.utils.meta_cat.ml_utils.print_report
   medcat.utils.meta_cat.ml_utils.train_model
   medcat.utils.meta_cat.ml_utils.eval_model


Attributes
~~~~~~~~~~

.. autoapisummary::

   medcat.utils.meta_cat.ml_utils.logger


.. py:data:: logger

   
.. py:function:: set_all_seeds(seed)
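
``set_all_seeds`` carries no description on this page. As a rough illustration of what seeding "all" random number generators usually involves, here is a minimal sketch. The name ``seed_everything`` is hypothetical, and the choice of generators (``random``, NumPy, PyTorch) is an assumption rather than the library's confirmed behaviour.

```python
import random


def seed_everything(seed: int) -> None:
    """Hypothetical stand-in: seed every RNG that happens to be installed."""
    random.seed(seed)
    try:
        import numpy as np  # optional in this sketch
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional in this sketch
        torch.manual_seed(seed)
    except ImportError:
        pass


# Re-seeding with the same value reproduces the same random draws.
seed_everything(13)
first = [random.random() for _ in range(3)]
seed_everything(13)
second = [random.random() for _ in range(3)]
```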


.. py:function:: create_batch_piped_data(data, start_ind, end_ind, device, pad_id)

   Creates a batch from the rows of ``data`` between ``start_ind`` and ``end_ind``,
   pads the sequences, and moves the result to the given device.

   :param data: Data in the format ``[[<input_ids>, <cpos>, <label>], ...]``. The third column is optional
                and holds the output label.
   :type data: List[Tuple[List[int], int, Optional[int]]]
   :param start_ind: Start index of this batch
   :type start_ind: int
   :param end_ind: End index of this batch
   :type end_ind: int
   :param device: Where to move the data
   :type device: torch.device
   :param pad_id: Padding index
   :type pad_id: int

   :Returns: * **x** -- Same as data, but subsetted and as a tensor
             * **cpos** -- Center positions for the data
             * **attention_mask** -- Padding mask for the data
             * **y** -- Class labels of the data (the optional third column)
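
The batching described above can be sketched in plain Python, without tensors or devices. This assumes each batch is padded to its longest sequence and that the mask is 1 for real tokens and 0 for padding. ``make_batch`` is a hypothetical name, not the library function.

```python
def make_batch(data, start_ind, end_ind, pad_id):
    """Hypothetical sketch: slice, pad, and build a padding mask with plain lists."""
    batch = data[start_ind:end_ind]
    max_len = max(len(row[0]) for row in batch)
    # Right-pad every sequence of input ids to the longest one in the batch.
    x = [list(row[0]) + [pad_id] * (max_len - len(row[0])) for row in batch]
    cpos = [row[1] for row in batch]
    # 1 marks a real token, 0 marks padding (an assumed convention).
    attention_mask = [[1] * len(row[0]) + [0] * (max_len - len(row[0])) for row in batch]
    # Labels exist only when the optional third column is present.
    y = [row[2] for row in batch] if len(batch[0]) == 3 else None
    return x, cpos, attention_mask, y


data = [([5, 6, 7], 1, 0), ([8], 0, 1), ([9, 10], 1, 0)]
x, cpos, attention_mask, y = make_batch(data, 0, 2, pad_id=0)
```

The real function additionally converts these lists to tensors and moves them to ``device``.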


.. py:function:: predict(model, data, config)

   Predict on data used in ``meta_cat.pipe``.

   :param model: The model.
   :type model: nn.Module
   :param data: Data in the format: [[<input_ids>, <cpos>], ...]
   :type data: List[Tuple[List[int], int, Optional[int]]]
   :param config: Configuration for this meta_cat instance.
   :type config: ConfigMetaCAT

   :Returns: * **predictions** (*List[int]*) -- For each row of input data a prediction
             * **confidence** (*List[float]*) -- For each prediction a confidence value
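
One common way to derive a prediction and a confidence value per row is to take the argmax of the model's logits and the softmax probability of that class. The sketch below assumes that convention; this page does not state how ``predict`` computes confidence, and ``logits_to_predictions`` is a hypothetical helper.

```python
import math


def logits_to_predictions(all_logits):
    """Hypothetical sketch: argmax class and its softmax probability per row."""
    predictions, confidence = [], []
    for logits in all_logits:
        shift = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - shift) for v in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        best = max(range(len(probs)), key=probs.__getitem__)
        predictions.append(best)
        confidence.append(probs[best])
    return predictions, confidence


preds, confs = logits_to_predictions([[2.0, 0.0], [0.0, 0.0, 3.0]])
```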


.. py:function:: split_list_train_test(data, test_size, shuffle = True)

   Split data into train and test sets, shuffling it first unless ``shuffle`` is False.

   :param data: The data.
   :type data: List
   :param test_size: The fraction of the data to use as the test set.
   :type test_size: float
   :param shuffle: Whether to shuffle the data. Defaults to True.
   :type shuffle: bool

   :Returns: **Tuple** -- The train data and the test data.
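
A minimal sketch of such a split, assuming ``test_size`` is a fraction between 0 and 1 and the test set is taken from the front of the (optionally shuffled) list. ``split_train_test`` is a hypothetical name. Unlike an in-place shuffle, this version copies the input so the caller's list is left untouched.

```python
import random


def split_train_test(data, test_size, shuffle=True):
    """Hypothetical sketch: optional shuffle, then a fractional split."""
    items = list(data)  # copy so the caller's list is not reordered
    if shuffle:
        random.shuffle(items)
    n_test = int(len(items) * test_size)
    return items[n_test:], items[:n_test]


train, test = split_train_test(list(range(10)), 0.2, shuffle=False)
shuffled_train, shuffled_test = split_train_test(list(range(10)), 0.2)
```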


.. py:function:: print_report(epoch, running_loss, all_logits, y, name = 'Train')

   Prints some basic stats during training.

   :param epoch: The current epoch number.
   :type epoch: int
   :param running_loss: List of loss values.
   :type running_loss: List
   :param all_logits: List of logits
   :type all_logits: List
   :param y: The y array.
   :type y: Any
   :param name: The name of the report. Defaults to Train.
   :type name: str


.. py:class:: FocalLoss(alpha=None, gamma=2)


   Bases: :py:obj:`torch.nn.Module`

   Focal loss implemented as a :py:obj:`torch.nn.Module`. ``alpha`` is an optional
   weighting term and ``gamma`` (default 2) is the focusing parameter.

   .. py:method:: __init__(alpha=None, gamma=2)

      Initializes the loss with the given ``alpha`` and ``gamma``.


   .. py:method:: forward(inputs, targets)
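
This page does not give the formula that ``forward`` evaluates. Focal loss is commonly defined per example as ``-alpha_t * (1 - p_t) ** gamma * log(p_t)``, where ``p_t`` is the predicted probability of the true class. The plain-Python sketch below assumes that definition, per-class ``alpha`` weights, and a mean reduction; it is an illustration, not the class's actual code.

```python
import math


def focal_loss(logits, targets, alpha=None, gamma=2.0):
    """Hypothetical sketch of the common focal loss definition, averaged over rows."""
    losses = []
    for row, target in zip(logits, targets):
        shift = max(row)  # log-sum-exp with a shift for numerical stability
        log_norm = shift + math.log(sum(math.exp(v - shift) for v in row))
        log_pt = row[target] - log_norm  # log-probability of the true class
        pt = math.exp(log_pt)
        weight = 1.0 if alpha is None else alpha[target]
        # (1 - pt) ** gamma down-weights examples the model already gets right.
        losses.append(-weight * (1.0 - pt) ** gamma * log_pt)
    return sum(losses) / len(losses)


easy = focal_loss([[5.0, 0.0]], [0])  # confident and correct
hard = focal_loss([[0.0, 5.0]], [0])  # confident and wrong
```

With ``gamma=0`` and no ``alpha``, the expression reduces to ordinary cross-entropy.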


.. py:function:: train_model(model, data, config, save_dir_path = None)

   Trains an LSTM or BERT model, with automatic checkpoints.

   :param model: The model
   :type model: nn.Module
   :param data: The data.
   :type data: List
   :param config: MetaCAT config.
   :type config: ConfigMetaCAT
   :param save_dir_path: The save dir path if required. Defaults to None.
   :type save_dir_path: Optional[str]

   :Returns: **Dict** -- The classification report for the winner (the best-scoring model).

   :raises Exception: If auto-save is enabled but no save dir path is provided.
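
The "winner" and the auto-save check described above can be pictured with a small sketch. Everything here is hypothetical: the helper ``pick_winner``, the ``auto_save`` flag, and the use of an ``f1`` key as the selection metric are assumptions for illustration, not the actual ``ConfigMetaCAT`` options.

```python
def pick_winner(epoch_reports, metric="f1", auto_save=False, save_dir_path=None):
    """Hypothetical sketch: keep the report of the best-scoring epoch."""
    if auto_save and save_dir_path is None:
        raise Exception("auto-save is enabled but no save_dir_path was provided")
    winner = None
    for epoch, report in enumerate(epoch_reports):
        if winner is None or report[metric] > winner["report"][metric]:
            # A real training loop would write a checkpoint to save_dir_path here.
            winner = {"epoch": epoch, "report": report}
    return winner


reports = [{"f1": 0.5}, {"f1": 0.8}, {"f1": 0.7}]
winner = pick_winner(reports)
```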


.. py:function:: eval_model(model, data, config, tokenizer)

   Evaluate a trained model on the provided data.

   :param model: The model.
   :type model: nn.Module
   :param data: The data.
   :type data: List
   :param config: The MetaCAT config.
   :type config: ConfigMetaCAT
   :param tokenizer: The tokenizer.
   :type tokenizer: TokenizerWrapperBase

   :Returns: **Dict** -- Results (precision, recall, f1, examples, confusion matrix)
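
To make the returned metrics concrete, the sketch below computes a confusion matrix and per-class precision, recall and F1 from integer labels in plain Python. The function name ``evaluate`` and the shape of the result dictionary are assumptions; the dictionary that ``eval_model`` actually returns may be organised differently.

```python
def evaluate(y_true, y_pred, n_classes):
    """Hypothetical sketch: confusion matrix plus per-class precision/recall/F1."""
    # confusion[t][p] counts examples with true class t predicted as class p.
    confusion = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1
    per_class = []
    for c in range(n_classes):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp
        fn = sum(confusion[c]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    return {"confusion": confusion, "per_class": per_class}


result = evaluate([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
```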