:py:mod:`medcat.cdb_maker`
==========================

.. py:module:: medcat.cdb_maker


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.cdb_maker.CDBMaker


Attributes
~~~~~~~~~~

.. autoapisummary::

   medcat.cdb_maker.PH_REMOVE
   medcat.cdb_maker.logger


.. py:data:: PH_REMOVE

   
.. py:data:: logger

   
.. py:class:: CDBMaker(config, cdb = None)


   Bases: :py:obj:`object`

   Creates a new CDB, or updates an existing one, from a CSV in the format shown in
   https://github.com/CogStack/MedCAT/tree/master/examples/<example>.

   :param config: Global config for MedCAT.
   :type config: medcat.config.Config
   :param cdb: If set, the `CDBMaker` will update the existing `CDB` with
               new concepts from the CSV (Default value `None`).
   :type cdb: medcat.cdb.CDB

   .. py:method:: __init__(config, cdb = None)


   .. py:method:: reset_cdb()

      Re-creates the internal CDB from the same config.

      Call this if you intend to call `prepare_csvs` more than once on the
      same `CDBMaker` instance.
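
      A minimal sketch of that pattern, using only the `reset_cdb` and
      `prepare_csvs` methods documented here. The helper name
      `build_separate_cdbs` is hypothetical and not part of MedCAT. The sketch
      also assumes that `reset_cdb` swaps in a fresh CDB object rather than
      clearing the old one in place, so previously returned CDBs stay intact.

      .. code-block:: python

         def build_separate_cdbs(maker, csv_groups, **kwargs):
             """Build one CDB per group of CSV paths with a single maker.

             `maker` is expected to be a CDBMaker (or any object exposing
             `reset_cdb` and `prepare_csvs`). Hypothetical helper, not MedCAT API.
             """
             cdbs = []
             for i, csv_paths in enumerate(csv_groups):
                 if i > 0:
                     # Start from a fresh internal CDB so concepts from the
                     # previous group do not leak into this one.
                     maker.reset_cdb()
                 cdbs.append(maker.prepare_csvs(csv_paths, **kwargs))
             return cdbs

      The first group is built without a reset so that a `cdb` passed to the
      constructor, if any, is still the one being updated.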


   .. py:method:: prepare_csvs(csv_paths, sep = ',', encoding = None, escapechar = None, index_col = False, full_build = False, only_existing_cuis = False, **kwargs)

      Compile one or multiple CSVs into a CDB.

      .. note::

         This method generally reuses the same CDB instance. If you call
         `prepare_csvs` several times on the same `CDBMaker`, concepts from
         earlier calls are likely to leak into later ones. To reset the CDB,
         call `reset_cdb`.

      :param csv_paths: A list of paths to the CSV files to process. Can also be a list of `pd.DataFrame` objects.
      :type csv_paths: Union[pd.DataFrame, List[str]]
      :param sep: Custom separator for the CSV files, if needed (Default value ',').
      :type sep: str
      :param encoding: Encoding to be used for reading the CSV file (Default value `None`).
      :type encoding: Optional[str]
      :param escapechar: Escape char for the CSV (Default value None).
      :type escapechar: Optional[str]
      :param index_col: Index column for pandas read_csv (Default value False).
      :type index_col: bool
      :param full_build: If False, only the core portions of the CDB will be built (the ones required for
                         the functioning of MedCAT). If True, everything will be added to the CDB; this
                         usually includes concept descriptions, various forms of names, etc. Note that
                         this option produces a much larger CDB (Default value False).
      :type full_build: bool
      :param only_existing_cuis: If True, no new CUIs will be added; only the names linked to existing CUIs will be
                                 extended. Mainly used when enriching names of a CDB (e.g. SNOMED with UMLS terms)
                                 (Default value `False`).
      :type only_existing_cuis: bool
      :param kwargs: Will be passed to pandas for CSV reading
      :type kwargs: Any

      .. note::

         csv:
             Examples of the CSV used to make the CDB can be found on [GitHub](link)

      :Returns: **CDB** -- CDB with the new concepts added.
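
      As a rough illustration, the sketch below writes a tiny CSV and passes
      it to `prepare_csvs`. The column names `cui` and `name`, the CUI values,
      and the helper names `write_example_csv` and `build_cdb` are assumptions
      made for this example; check the example CSVs in the MedCAT repository
      for the exact format. Constructing a `CDBMaker` may also require a spaCy
      model to be installed, depending on the config.

      .. code-block:: python

         import csv


         def write_example_csv(path):
             """Write a small concept CSV (assumed columns: cui, name)."""
             # Placeholder rows for illustration only, not real ontology data.
             rows = [
                 {"cui": "C0000001", "name": "kidney failure"},
                 {"cui": "C0000001", "name": "renal failure"},
                 {"cui": "C0000002", "name": "heart attack"},
             ]
             with open(path, "w", newline="", encoding="utf-8") as f:
                 writer = csv.DictWriter(f, fieldnames=["cui", "name"])
                 writer.writeheader()
                 writer.writerows(rows)
             return rows


         def build_cdb(csv_paths):
             """Compile the given CSVs into a CDB with default settings."""
             # Imported here so the CSV helper above works without MedCAT.
             from medcat.cdb_maker import CDBMaker
             from medcat.config import Config

             maker = CDBMaker(Config())
             # full_build=False builds only the core portions of the CDB.
             return maker.prepare_csvs(csv_paths, sep=",", full_build=False)

      Calling `build_cdb(["example_cdb.csv"])` after `write_example_csv("example_cdb.csv")`
      would then return the CDB with the new concepts added.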


   .. py:method:: destroy_pipe()