medcat.utils.regression.targeting

Module Contents

Classes

TranslationLayer

The translation layer for translating between CUIs, names, type IDs and child CUIs.

FilterStrategy

Describes the filter strategy.

FilterType

The types of targets that can be specified

TypedFilter

A filter with multiple values to filter against.

FilterOptions

A class describing the options for the filters

CUIWithChildFilter

A filter with multiple values to filter against.

Attributes

logger

medcat.utils.regression.targeting.logger
class medcat.utils.regression.targeting.TranslationLayer(cui2names, name2cuis, cui2type_ids, cui2children)

The translation layer for translating:
  • CUIs to names

  • names to CUIs

  • type_ids to CUIs

  • CUIs to child CUIs

The idea is to decouple these translations from the CDB instance in case something changes there.

Parameters:
  • cui2names (Dict[str, Set[str]]) – The map from CUI to names

  • name2cuis (Dict[str, List[str]]) – The map from name to CUIs

  • cui2type_ids (Dict[str, Set[str]]) – The map from CUI to type_ids

  • cui2children (Dict[str, Set[str]]) – The map from CUI to child CUIs
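To give a feel for the shapes of these four maps, the sketch below builds them as plain Python dicts. The CUIs, names and type IDs are made-up placeholders rather than real ontology entries, and the consistency check at the end is only an illustration, not something the class is documented to enforce.

```python
from typing import Dict, List, Set

# Hypothetical example data; every identifier below is a placeholder.
cui2names: Dict[str, Set[str]] = {
    "C001": {"kidney failure", "renal failure"},
    "C002": {"acute kidney failure"},
}
name2cuis: Dict[str, List[str]] = {
    "kidney failure": ["C001"],
    "renal failure": ["C001"],
    "acute kidney failure": ["C002"],
}
cui2type_ids: Dict[str, Set[str]] = {"C001": {"T1"}, "C002": {"T1"}}
cui2children: Dict[str, Set[str]] = {"C001": {"C002"}}

# In this toy data the two name maps are inverses of each other.
consistent = all(
    cui in name2cuis[name]
    for cui, names in cui2names.items()
    for name in names
)
```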

__init__(cui2names, name2cuis, cui2type_ids, cui2children)
Parameters:
  • cui2names (Dict[str, Set[str]]) –

  • name2cuis (Dict[str, List[str]]) –

  • cui2type_ids (Dict[str, Set[str]]) –

  • cui2children (Dict[str, Set[str]]) –

Return type:

None

targets_for(cui)
Parameters:

cui (str) –

Return type:

Iterator[Tuple[str, str]]

all_targets(all_cuis, all_names, all_types)

Get a generator of all target information objects. This is the starting point for checking cases.

Parameters:
  • all_cuis (Set[str]) – The set of all CUIs to be queried

  • all_names (Set[str]) – The set of all names to be queried

  • all_types (Set[str]) – The set of all type IDs to be queried

Yields:

Iterator[Tuple[str, str]] – The iterator of the target info

Return type:

Iterator[Tuple[str, str]]

get_children_of(found_cuis, cui, depth=1)

Get the children of the specified CUI in the listed CUIs (if they exist).

Parameters:
  • found_cuis (Iterable[str]) – The list of CUIs to look in

  • cui (str) – The target parent CUI

  • depth (int) – The maximum depth of the search

Returns:

List[str] – The list of children found

Return type:

List[str]
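One plausible reading of this depth-limited lookup is sketched below. The function name children_within_depth and the explicit cui2children argument are assumptions made for illustration; this is not the library's implementation, only a breadth-first walk over a map shaped like cui2children.

```python
from typing import Dict, Iterable, List, Set


def children_within_depth(
    cui2children: Dict[str, Set[str]],
    found_cuis: Iterable[str],
    cui: str,
    depth: int = 1,
) -> List[str]:
    """Return the found CUIs that descend from `cui` within `depth` levels."""
    descendants: Set[str] = set()
    frontier: Set[str] = {cui}
    for _ in range(depth):
        next_frontier: Set[str] = set()
        for parent in frontier:
            next_frontier |= cui2children.get(parent, set())
        next_frontier -= descendants  # skip CUIs already visited
        descendants |= next_frontier
        frontier = next_frontier
    return [c for c in found_cuis if c in descendants]


# Hypothetical three-level hierarchy: A -> B -> C
hierarchy = {"A": {"B"}, "B": {"C"}}
direct = children_within_depth(hierarchy, ["B", "C", "X"], "A", depth=1)
two_levels = children_within_depth(hierarchy, ["B", "C", "X"], "A", depth=2)
```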

get_parents_of(found_cuis, cui, depth=1)

Get the parents of the specified CUI in the listed CUIs (if they exist).

If needed, higher-order parents (e.g. grandparents) can be queried for.

This uses the get_children_of method internally. That is, if any of the found CUIs has the specified CUI as a child at the specified depth, that found CUI is a parent at the specified depth.

Parameters:
  • found_cuis (Iterable[str]) – The list of CUIs to look in

  • cui (str) – The target child CUI

  • depth (int) – The maximum depth of the search

Returns:

List[str] – The list of parents found

Return type:

List[str]
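The "parents via children" idea described above can be sketched as follows. Both helper names are hypothetical and the hierarchy is toy data; the sketch assumes that a found CUI counts as a parent when the target appears among its descendants within the given depth.

```python
from typing import Dict, Iterable, List, Set


def descendants(cui2children: Dict[str, Set[str]], cui: str, depth: int) -> Set[str]:
    """Collect all descendants of `cui` up to `depth` levels down."""
    found: Set[str] = set()
    frontier: Set[str] = {cui}
    for _ in range(depth):
        frontier = {
            child
            for parent in frontier
            for child in cui2children.get(parent, set())
        } - found
        found |= frontier
    return found


def parents_within_depth(
    cui2children: Dict[str, Set[str]],
    found_cuis: Iterable[str],
    cui: str,
    depth: int = 1,
) -> List[str]:
    """Return the found CUIs that have `cui` as a descendant within `depth`."""
    return [c for c in found_cuis if cui in descendants(cui2children, c, depth)]


# Hypothetical three-level hierarchy: A -> B -> C
hierarchy = {"A": {"B"}, "B": {"C"}}
direct_parents = parents_within_depth(hierarchy, ["A", "B", "X"], "C", depth=1)
with_grandparents = parents_within_depth(hierarchy, ["A", "B", "X"], "C", depth=2)
```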

classmethod from_CDB(cdb)

Construct a TranslationLayer object from a context database (CDB).

This translation layer will refer to the same dicts that the CDB refers to, so a change made through either one is visible through the other. While there is no obvious reason these should be modified, it is something to keep in mind.

Parameters:

cdb (CDB) – The CDB

Returns:

TranslationLayer – The resulting TranslationLayer

Return type:

TranslationLayer
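The consequence of sharing dicts rather than copying them can be seen in a small stand-alone sketch. The Layer class here is a hypothetical stand-in holding a single map, not the real TranslationLayer, and the deep copy is just one way a caller might decouple the two if that were ever needed.

```python
import copy


class Layer:
    """Hypothetical stand-in for a translation layer holding one map."""

    def __init__(self, cui2names):
        # The dict is stored by reference, not copied.
        self.cui2names = cui2names


cdb_cui2names = {"C001": {"fever"}}  # placeholder for a map owned by the CDB
shared = Layer(cdb_cui2names)
isolated = Layer(copy.deepcopy(cdb_cui2names))

# A later change to the source dict is visible only through the shared layer.
cdb_cui2names["C002"] = {"cough"}
```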

class medcat.utils.regression.targeting.FilterStrategy

Bases: enum.Enum

Describes the filter strategy, i.e. whether to match all or any of the specified filters.

ALL = 1

Specifies that all filters must be satisfied

ANY = 2

Specifies that any of the filters must be satisfied

classmethod match_str(name)

Find a loose string match.

Parameters:

name (str) – The name of the enum

Returns:

FilterStrategy – The matched FilterStrategy

Return type:

FilterStrategy
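The documentation does not spell out what "loose" means. A minimal sketch, assuming it means ignoring case and surrounding whitespace, might look like the following. The Strategy enum and loose_match function are hypothetical stand-ins, and raising ValueError for unknown names is also an assumption.

```python
from enum import Enum


class Strategy(Enum):
    """Hypothetical stand-in mirroring the two documented members."""

    ALL = 1
    ANY = 2


def loose_match(name: str) -> Strategy:
    """Match a member name, ignoring case and surrounding whitespace."""
    key = name.strip().upper()
    try:
        return Strategy[key]
    except KeyError:
        # Assumed behaviour for names that match no member.
        raise ValueError(f"Unknown strategy: {name!r}") from None


matched_any = loose_match(" any ")
matched_all = loose_match("All")
```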

class medcat.utils.regression.targeting.FilterType

Bases: enum.Enum

The types of targets that can be specified

TYPE_ID = 1

Filters by specified type_ids

CUI = 2

Filters by specified CUIs

NAME = 3

Filters by specified names

CUI_AND_CHILDREN = 4

Filters by CUI but also allows children, up to a specified distance

classmethod match_str(name)

Case insensitive matching for FilterType

Parameters:

name (str) – The name to be matched

Returns:

FilterType – The matched FilterType

Return type:

FilterType

class medcat.utils.regression.targeting.TypedFilter

Bases: pydantic.BaseModel

A filter with multiple values to filter against.

type: FilterType
values: List[str]
get_applicable_targets(translation, in_gen)

Get all applicable targets for this filter

Parameters:
  • translation (TranslationLayer) – The translation layer

  • in_gen (Iterator[Tuple[str, str]]) – The input generator / iterator

Yields:

Iterator[Tuple[str, str]] – The output generator

Return type:

Iterator[Tuple[str, str]]

classmethod one_from_input(target_type, vals)

Get one typed filter from the input target type and values. The values can be either a string for a single target, a list of strings for multiple targets, or a dict in some more complicated cases (e.g. CUI_AND_CHILDREN).

Parameters:
  • target_type (str) – The target type as string

  • vals (Union[str, list, dict]) – The values

Raises:

ValueError – If the values are malformed

Returns:

TypedFilter – The parsed filter

Return type:

TypedFilter
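The string and list cases can be illustrated with a small normalisation helper. This is a sketch under assumptions: normalise_values is a hypothetical name, and the dict case is deliberately left out because its expected layout is not described here.

```python
from typing import List, Union


def normalise_values(vals: Union[str, list]) -> List[str]:
    """Turn a single string or a list of strings into a list of strings."""
    if isinstance(vals, str):
        return [vals]  # a single target
    if isinstance(vals, list) and all(isinstance(v, str) for v in vals):
        return list(vals)  # multiple targets
    # Anything else is treated as malformed in this sketch.
    raise ValueError(f"Malformed filter values: {vals!r}")


single = normalise_values("C001")
several = normalise_values(["C001", "C002"])
```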

to_dict()

Convert the TypedFilter to a dict to be serialised.

Returns:

dict – The dict representation

Return type:

dict

static list_to_dicts(filters)

Create a list of dicts from list of TypedFilters.

Parameters:

filters (List[TypedFilter]) – The list of typed filters

Returns:

List[dict] – The list of dicts

Return type:

List[dict]

static list_to_dict(filters)

Create a single dict from the list of TypedFilters.

Parameters:

filters (List[TypedFilter]) – The list of typed filters

Returns:

dict – The dict

Return type:

dict

classmethod from_dict(input)

Construct a list of TypedFilter from a dict.

The assumed structure is {<filter type>: <filtered value>} or {<filter type>: [<filtered value 1>, <filtered value 2>]}. There can be multiple filter types defined.

Parameters:

input (Dict[str, Any]) – The input dict.

Returns:

List[TypedFilter] – The list of constructed TypedFilter

Return type:

List[TypedFilter]
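To make the assumed input structure concrete, the sketch below parses a dict with two filter types into simple records. SimpleFilter and filters_from_dict are hypothetical stand-ins, the lowercase keys and upper-casing of the type name are assumptions, and the real class may validate or convert the type differently.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class SimpleFilter:
    """Hypothetical stand-in with the same two fields as TypedFilter."""

    type: str
    values: List[str]


def filters_from_dict(data: Dict[str, Any]) -> List[SimpleFilter]:
    """Build one filter per key; a lone string becomes a one-item list."""
    filters = []
    for type_name, vals in data.items():
        values = [vals] if isinstance(vals, str) else list(vals)
        filters.append(SimpleFilter(type=type_name.upper(), values=values))
    return filters


# Placeholder input with one single-valued and one multi-valued filter.
parsed = filters_from_dict({"cui": "C001", "name": ["heart attack", "mi"]})
```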

class medcat.utils.regression.targeting.FilterOptions

Bases: pydantic.BaseModel

A class describing the options for the filters

strategy: FilterStrategy
onlyprefnames: bool = False
to_dict()

Convert the FilterOptions to a dict.

Returns:

dict – The dict representation

Return type:

dict

classmethod from_dict(section)

Construct a FilterOptions instance from a dict.

The assumed structure is: {'strategy': <'all' or 'any'>, 'prefname-only': 'true'}

Both strategy and prefname-only are optional.

Parameters:

section (Dict[str, str]) – The dict to parse

Returns:

FilterOptions – The resulting FilterOptions

Return type:

FilterOptions
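A rough sketch of parsing such a section follows. The function name parse_options is hypothetical, and the fallback to "all" when no strategy is given is an assumption, since the default strategy is not stated here; the False default for the preferred-name flag matches the onlyprefnames field above.

```python
from typing import Dict, Tuple


def parse_options(section: Dict[str, str]) -> Tuple[str, bool]:
    """Return (strategy, only_pref_names) parsed from a string-valued dict."""
    # Assumed default: 'all' when the key is missing.
    strategy = section.get("strategy", "all").strip().lower()
    if strategy not in ("all", "any"):
        raise ValueError(f"Unknown strategy: {strategy!r}")
    # The flag arrives as a string, so compare against 'true'.
    only_pref = section.get("prefname-only", "false").strip().lower() == "true"
    return strategy, only_pref


defaults = parse_options({})
explicit = parse_options({"strategy": "ANY", "prefname-only": "true"})
```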

class medcat.utils.regression.targeting.CUIWithChildFilter

Bases: TypedFilter

A filter with multiple values to filter against.

delegate: TypedFilter
depth: int
values: List[str] = []
get_applicable_targets(translation, in_gen)

Get all applicable targets for this filter

Parameters:
  • translation (TranslationLayer) – The translation layer

  • in_gen (Iterator[Tuple[str, str]]) – The input generator / iterator

Yields:

Iterator[Tuple[str, str]] – The output generator

Return type:

Iterator[Tuple[str, str]]

get_children_of(translation, cui, cur_depth)
Return type:

Iterator[Tuple[str, str]]

to_dict()

Convert this CUIWithChildFilter to a dict.

Returns:

dict – The dict representation

Return type:

dict