:py:mod:`medcat.utils.relation_extraction.pad_seq`
==================================================

.. py:module:: medcat.utils.relation_extraction.pad_seq


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.utils.relation_extraction.pad_seq.Pad_Sequence


.. py:class:: Pad_Sequence(seq_pad_value, label_pad_value = -1)

   .. py:method:: __init__(seq_pad_value, label_pad_value = -1)

      Used in ``rel_cat.py`` by RelCAT to create DataLoaders for the train/test datasets.

      Acts as the ``collate_fn`` for a DataLoader: it collates sequences whose input_ids, ent1/ent2 and label lengths differ into a fixed-length batch. It is applied per batch, not to the whole DataLoader data, and produces the padded x sequence, the y sequence, and the x and y lengths of the batch.

      :param seq_pad_value: pad value for input_ids.
      :type seq_pad_value: int
      :param label_pad_value: pad value for labels. Defaults to -1.
      :type label_pad_value: int

   .. py:method:: __call__(batch)

      Pads a batch of input_ids.

      :param batch: the batch of tensors from ``RelData.dataset`` (see its ``__getitem__()`` method for the data returned). The token sequences and labels are padded as needed. See https://pytorch.org/docs/stable/_modules/torch/nn/utils/rnn.html#pad_sequence for more information.
      :type batch: List[torch.Tensor]

      :Returns:

          **Tuple[Tensor, Tensor, Tensor, LongTensor, LongTensor]** -- the padded data: padded input ids, ent1 and ent2 start token positions, padded labels, input_ids lengths, and label lengths.
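
The padding behaviour described above can be illustrated with a small, dependency-free sketch. This is not MedCAT's implementation: it uses plain Python lists instead of tensors, the helper names ``pad_to_longest`` and ``collate`` are hypothetical, and the assumed item layout ``(input_ids, ent_positions, labels)`` is only a guess at what ``RelData.__getitem__()`` returns. With real tensors, ``torch.nn.utils.rnn.pad_sequence`` with ``batch_first=True`` and a ``padding_value`` does the equivalent padding.

```python
from typing import List, Sequence, Tuple


def pad_to_longest(seqs: Sequence[Sequence[int]],
                   pad_value: int) -> Tuple[List[List[int]], List[int]]:
    """Right-pad every sequence to the longest one; also return original lengths."""
    lengths = [len(s) for s in seqs]
    max_len = max(lengths, default=0)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    return padded, lengths


def collate(batch, seq_pad_value: int, label_pad_value: int = -1):
    """Collate one batch. Assumed (hypothetical) item layout:
    (input_ids, ent_positions, labels)."""
    input_ids = [item[0] for item in batch]
    ent_positions = [list(item[1]) for item in batch]  # fixed size, no padding needed
    labels = [item[2] for item in batch]

    padded_ids, id_lengths = pad_to_longest(input_ids, seq_pad_value)
    padded_labels, label_lengths = pad_to_longest(labels, label_pad_value)
    return padded_ids, ent_positions, padded_labels, id_lengths, label_lengths


# Two examples with different token counts, padded per batch.
batch = [
    ([101, 7, 8, 102], (1, 2), [0]),
    ([101, 9, 102], (1, 1), [1]),
]
padded_ids, ent_positions, padded_labels, id_lengths, label_lengths = collate(
    batch, seq_pad_value=0
)
```

Because padding is computed per batch, each batch is only as wide as its own longest sequence, and the returned lengths let downstream code ignore the padded positions.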