medcat.preprocessing.cleaners
Text cleaners of varying aggressiveness, from removing only garbage to removing pretty much everything that is not a word.
Module Contents
Functions
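- prepare_name(raw_name, nlp, names, config): Generates different forms of a name. Will edit the provided names dictionary.
- basic_clean(text): Remove almost everything from text.
- clean_text(text): Remove almost everything from text.
- clean_drugs_uk(text, stopwords=None, umls=False)
- clean_name(text, stopwords=None, umls=False)
- clean_umls(text, stopwords=None)
- clean_def(text)
- clean_snt(text)
- clean_snomed_name(text)
Attributes
- BR_U4
- CB
- CB_D
- BR
- PH_RM
- SKIP_CHARS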
- medcat.preprocessing.cleaners.prepare_name(raw_name, nlp, names, config)
Generates different forms of a name. Will edit the provided names dictionary and add information generated from the name.
- Parameters:
raw_name (str) – The raw name to prepare.
nlp (Language) – Spacy nlp model.
names (Dict) – Dictionary of existing names for this concept in this row of a CSV. The new generated name versions and other required information will be added here.
config (Config) – Global config for medcat.
- Returns:
names (Dict) – The new dictionary of prepared names.
- Return type:
Dict
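The in-place contract described above (the function both edits `names` and returns it) can be illustrated with a minimal, self-contained sketch. It does not call MedCAT: `prepare_name_sketch` and the per-name fields it stores (`tokens`, `is_upper`) are hypothetical stand-ins chosen for illustration, and the real function additionally takes the spaCy `nlp` model and the `config` object.

```python
from typing import Dict


def prepare_name_sketch(raw_name: str, names: Dict[str, dict]) -> Dict[str, dict]:
    """Hypothetical stand-in for prepare_name: adds forms of raw_name to names."""
    # Collapse whitespace runs and trim the ends (an assumed normalisation).
    normalised = " ".join(raw_name.split())
    # Generate two simple forms: as written and lowercased.
    for form in (normalised, normalised.lower()):
        if form and form not in names:
            names[form] = {
                "tokens": form.split(),            # hypothetical field
                "is_upper": normalised.isupper(),  # hypothetical field
            }
    # The same dictionary object is returned after being edited in place.
    return names


names: Dict[str, dict] = {}
result = prepare_name_sketch("  Heart   Attack ", names)
print(result is names)  # True: the caller's dictionary was modified
print(sorted(names))    # ['Heart Attack', 'heart attack']
```

Because the dictionary is mutated, a caller that processes several raw names for the same concept can pass the same `names` object to each call and collect all generated forms in one place.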
- medcat.preprocessing.cleaners.basic_clean(text)
Remove almost everything from text
- Parameters:
text (str) – Text to be cleaned.
- Returns:
str – The cleaned text.
- Return type:
str
- medcat.preprocessing.cleaners.clean_text(text)
Remove almost everything from text
- Parameters:
text (str) – Text to be cleaned.
- Returns:
str – The cleaned text.
- Return type:
str
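As a rough illustration of what "remove almost everything" can mean for cleaners like the two above, here is a minimal sketch that keeps only lowercase letters, digits and single spaces. The name `strip_to_words` and the regular expressions are assumptions made for this example, not MedCAT's implementation, which may keep or drop different characters.

```python
import re

# Anything that is not a lowercase letter, a digit or a space.
_NON_WORD = re.compile(r"[^a-z0-9 ]+")
# Runs of two or more spaces.
_MULTI_SPACE = re.compile(r" {2,}")


def strip_to_words(text: str) -> str:
    """Hypothetical aggressive cleaner: lowercase, drop symbols, squeeze spaces."""
    lowered = text.lower()
    # Replace every run of non-word characters with a single space.
    no_symbols = _NON_WORD.sub(" ", lowered)
    # Collapse repeated spaces and trim the ends.
    return _MULTI_SPACE.sub(" ", no_symbols).strip()


print(strip_to_words("Pt. c/o  chest-pain (acute)!"))  # pt c o chest pain acute
```

Replacing symbols with a space rather than deleting them keeps neighbouring words from being fused together, at the cost of splitting tokens such as `c/o`.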
- medcat.preprocessing.cleaners.BR_U4
- medcat.preprocessing.cleaners.CB
- medcat.preprocessing.cleaners.CB_D
- medcat.preprocessing.cleaners.BR
- medcat.preprocessing.cleaners.PH_RM
- medcat.preprocessing.cleaners.SKIP_CHARS
- medcat.preprocessing.cleaners.clean_drugs_uk(text, stopwords=None, umls=False)
- Parameters:
text (str) –
stopwords (Optional[List[str]]) –
umls (bool) –
- Return type:
str
- medcat.preprocessing.cleaners.clean_name(text, stopwords=None, umls=False)
- Parameters:
text (str) –
stopwords (Optional[List[str]]) –
umls (bool) –
- Return type:
str
- medcat.preprocessing.cleaners.clean_umls(text, stopwords=None)
- Parameters:
text (str) –
stopwords (Optional[List[str]]) –
- Return type:
str
- medcat.preprocessing.cleaners.clean_def(text)
- Parameters:
text (str) –
- Return type:
str
- medcat.preprocessing.cleaners.clean_snt(text)
- Parameters:
text (str) –
- Return type:
str
- medcat.preprocessing.cleaners.clean_snomed_name(text)
- Parameters:
text (str) –
- Return type:
str