:py:mod:`medcat.utils.regression.converting`
============================================

.. py:module:: medcat.utils.regression.converting


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medcat.utils.regression.converting.ContextSelector
   medcat.utils.regression.converting.PerWordContextSelector
   medcat.utils.regression.converting.PerSentenceSelector
   medcat.utils.regression.converting.UniqueNamePreserver


Functions
~~~~~~~~~

.. autoapisummary::

   medcat.utils.regression.converting.get_matching_case
   medcat.utils.regression.converting.medcat_export_json_to_regression_yml


Attributes
~~~~~~~~~~

.. autoapisummary::

   medcat.utils.regression.converting.logger


.. py:data:: logger


.. py:class:: ContextSelector

   Bases: :py:obj:`abc.ABC`

   Describes how the context of a concept is found.
   Use a sub-class, as this class has no implementation.

   .. py:method:: _splitter(text)


   .. py:method:: make_replace_safe(text)

      Make the text replace-safe.
      That is, escape every '%' as '%%' so that the `text % replacement` syntax
      can be used for an inserted part (and that part only).

      :param text: The text to use
      :type text: str

      :Returns: **str** -- The replace-safe text


   .. py:method:: get_context(text, start, end, leave_concept = False)
      :abstractmethod:

      Get the context of a concept within a larger body of text.
      The concept is specified by its start and end indices.

      :param text: The larger text
      :type text: str
      :param start: The starting index
      :type start: int
      :param end: The ending index
      :type end: int
      :param leave_concept: Whether to leave the concept or replace it by '%s'. Defaults to False
      :type leave_concept: bool

      :Returns: **str** -- The selected context



.. py:class:: PerWordContextSelector(words_before, words_after)

   Bases: :py:obj:`ContextSelector`

   Context selector that selects a number of words
   from either side of the concept, regardless of punctuation.

   :param words_before: Number of words to select from before the concept
   :type words_before: int
   :param words_after: Number of words to select from after the concept
   :type words_after: int

   .. py:method:: __init__(words_before, words_after)


   .. py:method:: get_context(text, start, end, leave_concept = False)

      Get the context of a concept within a larger body of text.
      The concept is specified by its start and end indices.

      :param text: The larger text
      :type text: str
      :param start: The starting index
      :type start: int
      :param end: The ending index
      :type end: int
      :param leave_concept: Whether to leave the concept or replace it by '%s'. Defaults to False
      :type leave_concept: bool

      :Returns: **str** -- The selected context



.. py:class:: PerSentenceSelector

   Bases: :py:obj:`ContextSelector`

   Context selector that selects a sentence as context.
   Sentences are said to end with either ".", "?" or "!".

   .. py:attribute:: stoppers
      :value: '\\.+|\\?+|!+'


   .. py:method:: get_context(text, start, end, leave_concept = False)

      Get the context of a concept within a larger body of text.
      The concept is specified by its start and end indices.

      :param text: The larger text
      :type text: str
      :param start: The starting index
      :type start: int
      :param end: The ending index
      :type end: int
      :param leave_concept: Whether to leave the concept or replace it by '%s'. Defaults to False
      :type leave_concept: bool

      :Returns: **str** -- The selected context



.. py:class:: UniqueNamePreserver

   Used to preserve unique names in a set.

   .. py:method:: __init__()


   .. py:method:: name2nrgen(name, nr)

      The method to generate name and copy-number combinations.

      :param name: The base name
      :type name: str
      :param nr: The number of the copy
      :type nr: int

      :Returns: **str** -- The combined name


   .. py:method:: get_unique_name(orig_name, dupe_nr = 0)

      Get a unique name whose duplicate number is at least as high as specified.

      :param orig_name: The original / base name
      :type orig_name: str
      :param dupe_nr: The number of the copy to start from. Defaults to 0.
      :type dupe_nr: int

      :Returns: **str** -- The unique name



.. py:function:: get_matching_case(cases, filters)

   Get a case that matches a set of filters (if one exists) from within a list.

   :param cases: The list to look in
   :type cases: List[RegressionCase]
   :param filters: The filters to compare to
   :type filters: List[TypedFilter]

   :Returns: **Optional[RegressionCase]** -- The regression case (if found) or None


.. py:function:: medcat_export_json_to_regression_yml(mct_export_file, cont_sel = PerSentenceSelector(), model_card = None)

   Extract regression test cases from a MedCATtrainer export JSON file.
   The cases are built from the context chosen by the specified context selector.

   :param mct_export_file: The MCT export file path
   :type mct_export_file: str
   :param cont_sel: The context selector. Defaults to PerSentenceSelector().
   :type cont_sel: ContextSelector
   :param model_card: The optional model card for generating metadata
   :type model_card: Optional[dict]

   :Returns: **str** -- Extracted regression cases in YAML form
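
The interplay between sentence-based context selection, the '%s' placeholder and replace-safe escaping can be illustrated with a small standalone sketch. This is not MedCAT's implementation: ``STOPPERS`` and ``sentence_context`` are hypothetical names introduced here. Keeping the closing punctuation and stripping surrounding whitespace are also assumptions made for the example. Only the regular expression (the documented ``stoppers`` value) and the '%' to '%%' escaping come from the descriptions above.

```python
import re

# Same pattern as the documented ``PerSentenceSelector.stoppers`` value.
STOPPERS = re.compile(r'\.+|\?+|!+')


def make_replace_safe(text: str) -> str:
    """Escape every '%' as '%%' so that ``text % replacement`` is safe."""
    return text.replace('%', '%%')


def sentence_context(text: str, start: int, end: int,
                     leave_concept: bool = False) -> str:
    """Hypothetical helper: return the sentence around text[start:end]."""
    # Sentence start: just past the last stopper found before the concept.
    sent_start = 0
    for match in STOPPERS.finditer(text, 0, start):
        sent_start = match.end()
    # Sentence end: include the first stopper after the concept, if any
    # (an assumption of this sketch).
    match = STOPPERS.search(text, end)
    sent_end = match.end() if match else len(text)
    before = text[sent_start:start]
    after = text[end:sent_end]
    if leave_concept:
        return (before + text[start:end] + after).strip()
    # Escape literal '%' so that the inserted '%s' is the only placeholder.
    return (make_replace_safe(before) + '%s' + make_replace_safe(after)).strip()


text = "He is well. Patient has 100% kidney failure today! Next visit soon."
start = text.index("kidney failure")
end = start + len("kidney failure")

template = sentence_context(text, start, end)
print(template)                    # Patient has 100%% %s today!
print(template % "renal failure")  # Patient has 100% renal failure today!
```

Without the escaping step, the literal '%' in "100%" would be read as the start of a format specifier and ``template % "renal failure"`` would fail or produce the wrong text, which is why the escaping is applied to the surrounding context only and not to the placeholder.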