:py:mod:`medcat.utils.relation_extraction.bert.tokenizer`
=========================================================

.. py:module:: medcat.utils.relation_extraction.bert.tokenizer


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.utils.relation_extraction.bert.tokenizer.TokenizerWrapperBERT_RelationExtraction


Attributes
~~~~~~~~~~

.. autoapisummary::

   medcat.utils.relation_extraction.bert.tokenizer.logger


.. py:data:: logger

   
.. py:class:: TokenizerWrapperBERT_RelationExtraction(hf_tokenizers=None, max_seq_length = None, add_special_tokens = False)


   Bases: :py:obj:`medcat.utils.relation_extraction.tokenizer.BaseTokenizerWrapper_RelationExtraction`

   Base class for all fast tokenizers (wrapping the Hugging Face ``tokenizers`` library).

   Inherits from :py:class:`~transformers.tokenization_utils_base.PreTrainedTokenizerBase`.

   Handles all the shared methods for tokenization and special tokens, as well as methods for
   downloading, caching, and loading pretrained tokenizers and for adding tokens to the vocabulary.

   This class also holds the added tokens in a unified way on top of all tokenizers, so there is no need to handle the
   specific vocabulary augmentation methods of the various underlying dictionary structures (BPE, SentencePiece, ...).
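
   As a rough illustration of how the constructor arguments in the signature above
   (``hf_tokenizers``, ``max_seq_length``, ``add_special_tokens``) might be used, the
   sketch below stores a tokenizer and forwards the saved settings on every call.
   ``ToyTokenizer`` and ``SketchTokenizerWrapper`` are hypothetical stand-ins that are
   not part of MedCAT or ``transformers``, and the call behaviour shown is an
   assumption, not the library's implementation.

   .. code-block:: python

      from typing import Dict, List, Optional


      class ToyTokenizer:
          """Hypothetical stand-in for a fast tokenizer (whitespace split)."""

          def encode(self, text: str, add_special_tokens: bool = False,
                     max_length: Optional[int] = None) -> List[str]:
              tokens = text.lower().split()
              budget = max_length
              if add_special_tokens and budget is not None:
                  # Reserve two slots for the [CLS] and [SEP] markers.
                  budget = max(budget - 2, 0)
              if budget is not None:
                  tokens = tokens[:budget]
              if add_special_tokens:
                  tokens = ["[CLS]"] + tokens + ["[SEP]"]
              return tokens


      class SketchTokenizerWrapper:
          """Hypothetical wrapper mirroring the documented constructor arguments."""

          def __init__(self, hf_tokenizers=None,
                       max_seq_length: Optional[int] = None,
                       add_special_tokens: bool = False):
              self.hf_tokenizers = hf_tokenizers
              self.max_seq_length = max_seq_length
              self.add_special_tokens = add_special_tokens

          def __call__(self, text: str) -> Dict[str, List[str]]:
              # Delegate to the wrapped tokenizer, applying the stored settings.
              tokens = self.hf_tokenizers.encode(
                  text,
                  add_special_tokens=self.add_special_tokens,
                  max_length=self.max_seq_length,
              )
              return {"tokens": tokens}


      wrapper = SketchTokenizerWrapper(
          ToyTokenizer(), max_seq_length=4, add_special_tokens=True
      )
      result = wrapper("Aspirin treats mild headache")

   Keeping the sequence length and special-token settings on the wrapper means
   callers only pass text, which is the main appeal of this kind of delegation.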

   .. py:attribute:: name
      :value: 'tokenizer_wrapper_bert_rel'

      Wrapper around a Hugging Face BERT tokenizer so that it works with the
      RelCAT models.

      :param hf_tokenizers: A Hugging Face fast BERT tokenizer.
      :type hf_tokenizers: `transformers.models.bert.tokenization_bert_fast.PreTrainedTokenizerFast`

   .. py:attribute:: name
      :value: 'bert-tokenizer'

      
   .. py:attribute:: pretrained_model_name_or_path
      :value: 'bert-base-uncased'

      
   .. py:method:: load(tokenizer_path, relcat_config, **kwargs)
      :classmethod: