:py:mod:`medcat.pipe`
=====================

.. py:module:: medcat.pipe


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.pipe.Pipe


Attributes
~~~~~~~~~~

.. autoapisummary::

   medcat.pipe.logger
   medcat.pipe.DEFAULT_SPACY_MODEL


.. py:data:: logger

   
.. py:data:: DEFAULT_SPACY_MODEL
   :value: 'en_core_web_md'

   
.. py:class:: Pipe(tokenizer, config)


   Bases: :py:obj:`object`

   A wrapper around the standard spaCy pipeline.

   :param tokenizer: The tokenizer used to split text into tokens. Any object
                     built as a spaCy tokenizer can be used.
   :type tokenizer: spacy.tokenizer.Tokenizer
   :param config: Global config for medcat.
   :type config: medcat.config.Config

   Properties:
       nlp (spacy.language.<lng>):
           The base spacy NLP pipeline.

   .. py:property:: spacy_nlp
      :type: spacy.language.Language

      The spaCy Language object.

      :Returns: **Language** -- The spaCy model/language.

   .. py:method:: __init__(tokenizer, config)


   .. py:method:: _init_nlp(config)


   .. py:method:: add_tagger(tagger, name = None, additional_fields = [])

      Add any kind of tagger for tokens.

      :param tagger: Any object/function that takes a spaCy doc as input, processes it,
                     and returns the same doc.
      :type tagger: Callable
      :param name: Name for this component in the pipeline. (Default value = None)
      :type name: Optional[str], optional
      :param additional_fields: Names of fields to be added to the ``_`` properties of a token. (Default value = [])
      :type additional_fields: List[str], optional


   .. py:method:: add_token_normalizer(config, name = None, spell_checker = None)


   .. py:method:: add_ner(ner, name = None)

      Add NER from CAT to the pipeline. This also adds the necessary fields
      to the Doc and Span objects.

      :param ner: The NER instance
      :type ner: NER
      :param name: The pipeline name (Default value = None)
      :type name: Optional[str], optional


   .. py:method:: add_linker(linker, name = None)

      Add an entity linker to the pipeline. This also adds the necessary fields
      to the Span object.

      :param linker: Any object/function that meets the requirements for a spaCy pipeline component.
                     See https://spacy.io/usage/processing-pipelines#custom-components
      :type linker: Linker
      :param name: The component name (Default value = None)
      :type name: Optional[str], optional


   .. py:method:: add_meta_cat(meta_cat, name = None)


   .. py:method:: add_rel_cat(rel_cat, name = None)


   .. py:method:: add_addl_ner(addl_ner, name = None)


   .. py:method:: batch_multi_process(texts, n_process = None, batch_size = None)

      Batch process a list of texts in parallel.

      :param texts: The input sequence of texts to process.
      :type texts: Iterable[str]
      :param n_process: The number of processes running in parallel.
                        Defaults to max(mp.cpu_count() - 1, 1).
      :type n_process: int
      :param batch_size: The number of texts to buffer. Defaults to 1000.
      :type batch_size: int

      :Returns: **Generator[Doc]** -- The output sequence of spaCy documents with the extracted entities.


   .. py:method:: set_error_handler(error_handler)


   .. py:method:: reset_error_handler()


   .. py:method:: force_remove(component_name)


   .. py:method:: destroy()


   .. py:method:: _ensure_serializable(doc)
      :staticmethod:


   .. py:method:: __call__(text)
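
The defaults documented for ``batch_multi_process`` can be sketched with the standard library alone. This is a minimal illustration, not MedCAT's implementation: ``resolve_batch_args`` and ``buffered`` are hypothetical helper names, and the chunking shown is only one assumed reading of "the number of texts to buffer".

```python
import multiprocessing as mp
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple

# Default buffer size as stated in the batch_multi_process parameter docs.
DEFAULT_BATCH_SIZE = 1000


def resolve_batch_args(n_process: Optional[int] = None,
                       batch_size: Optional[int] = None) -> Tuple[int, int]:
    """Hypothetical helper: fill in the documented defaults when None is given."""
    if n_process is None:
        # Leave one core free, but always use at least one process.
        n_process = max(mp.cpu_count() - 1, 1)
    if batch_size is None:
        batch_size = DEFAULT_BATCH_SIZE
    return n_process, batch_size


def buffered(texts: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Hypothetical helper: yield lists of at most batch_size texts (assumed behaviour)."""
    iterator = iter(texts)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Passing explicit values bypasses the defaults, so ``resolve_batch_args(2, 10)`` simply returns the pair it was given. On a single-core machine the ``max(..., 1)`` guard keeps the process count from dropping to zero.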