medcat.utils.checkpoint
Module Contents
Classes
Checkpoint | The base class of checkpoint objects |
CheckpointConfig | |
CheckpointManager | The class for managing checkpoints of a specific training type and their configuration |
Attributes
- medcat.utils.checkpoint.T
- medcat.utils.checkpoint.logger
- class medcat.utils.checkpoint.Checkpoint(dir_path, *, steps=DEFAULT_STEP, max_to_keep=DEFAULT_MAX_TO_KEEP)
Bases:
object
The base class of checkpoint objects
- Parameters:
dir_path (str) – The path to the parent directory of checkpoint files.
steps (int) – The number of processed sentences/documents before a checkpoint is saved (N.B.: A small number could result in a “no space left on device” error).
max_to_keep (int) – The maximum number of checkpoints to keep (N.B.: A large number could result in a “no space left on device” error).
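The interaction between steps and max_to_keep can be illustrated with a small stand-alone sketch. It is not MedCAT's implementation: the helper save_with_rotation, the placeholder file contents, and the file name pattern checkpoint-<steps>-<count> are all assumptions made for illustration.

```python
import os
import tempfile


def save_with_rotation(dir_path, count, steps=1000, max_to_keep=1):
    """Hypothetical helper: write a placeholder checkpoint, then prune old ones."""
    if max_to_keep < 1:
        raise ValueError("max_to_keep must be at least 1")
    os.makedirs(dir_path, exist_ok=True)
    # Assumed file name pattern: checkpoint-<steps>-<count>
    path = os.path.join(dir_path, f"checkpoint-{steps}-{count}")
    with open(path, "w") as f:
        f.write("placeholder for a serialized CDB")
    # Order existing checkpoints by their count (the last name component).
    names = sorted(
        (n for n in os.listdir(dir_path) if n.startswith("checkpoint-")),
        key=lambda n: int(n.rsplit("-", 1)[1]),
    )
    # Delete the oldest files so that at most max_to_keep remain.
    for old in names[:-max_to_keep]:
        os.remove(os.path.join(dir_path, old))
    return path


def demo():
    """Simulate 5000 processed documents with steps=1000 and max_to_keep=2."""
    with tempfile.TemporaryDirectory() as tmp:
        for count in range(1, 5001):
            if count % 1000 == 0:
                save_with_rotation(tmp, count, steps=1000, max_to_keep=2)
        return sorted(os.listdir(tmp))
```

In this sketch, a smaller steps value writes files more often and a larger max_to_keep retains more of them, which is why both settings affect disk usage.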
- property steps: int
- Return type:
int
- property max_to_keep: int
- Return type:
int
- property count: int
- Return type:
int
- property dir_path: str
- Return type:
str
- DEFAULT_STEP = 1000
- DEFAULT_MAX_TO_KEEP = 1
- __init__(dir_path, *, steps=DEFAULT_STEP, max_to_keep=DEFAULT_MAX_TO_KEEP)
- Parameters:
dir_path (str) –
steps (int) –
max_to_keep (int) –
- Return type:
None
- classmethod from_latest(dir_path)
Retrieve the latest checkpoint from the parent directory.
- Parameters:
dir_path (str) – The path to the directory containing checkpoint files.
- Returns:
T – A new checkpoint object.
- Raises:
Exception – If no checkpoint is found.
- Return type:
T
- save(cdb, count)
Save the CDB as the latest checkpoint.
- Parameters:
cdb (CDB) – The MedCAT CDB object to be checkpointed.
count (int) – The number of the finished steps.
- Return type:
None
- restore_latest_cdb()
Restore the CDB from the latest checkpoint.
- Returns:
cdb (CDB) – The MedCAT CDB object.
- Raises:
Exception – If no checkpoint is found.
- Return type:
CDB
- static _get_ckpt_file_paths(dir_path)
- Parameters:
dir_path (str) –
- Return type:
List[str]
- static _get_steps_and_count(file_path)
- Return type:
Tuple[int, int]
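A helper that returns a (steps, count) pair from a checkpoint file path might look like the sketch below. The function parse_steps_and_count is hypothetical, and the file name pattern checkpoint-<steps>-<count> is an assumption, not something this reference confirms.

```python
import os
from typing import Tuple


def parse_steps_and_count(file_path: str) -> Tuple[int, int]:
    """Hypothetical parser for names of the assumed form checkpoint-<steps>-<count>."""
    name = os.path.basename(file_path)
    parts = name.split("-")
    if len(parts) != 3 or parts[0] != "checkpoint":
        raise ValueError(f"Unexpected checkpoint file name: {name}")
    # int() raises ValueError if either component is not a number.
    return int(parts[1]), int(parts[2])
```

Under that assumed naming, a path ending in checkpoint-1000-5000 would yield steps 1000 and count 5000.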
- class medcat.utils.checkpoint.CheckpointConfig
Bases:
object
- output_dir: str = 'checkpoints'
- steps: int
- max_to_keep: int
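The typed attributes above suggest a simple configuration container. The following stand-in shows one way such a config could be modelled; the class name SketchCheckpointConfig and the choice of defaults for steps and max_to_keep (taken from Checkpoint.DEFAULT_STEP and Checkpoint.DEFAULT_MAX_TO_KEEP) are assumptions, since this reference lists a default only for output_dir.

```python
from dataclasses import dataclass

# Values mirrored from Checkpoint.DEFAULT_STEP and Checkpoint.DEFAULT_MAX_TO_KEEP.
DEFAULT_STEP = 1000
DEFAULT_MAX_TO_KEEP = 1


@dataclass
class SketchCheckpointConfig:
    """Hypothetical stand-in for CheckpointConfig; the last two defaults are assumed."""
    output_dir: str = "checkpoints"
    steps: int = DEFAULT_STEP
    max_to_keep: int = DEFAULT_MAX_TO_KEEP


# Checkpoint every 500 items and keep the three most recent checkpoints.
config = SketchCheckpointConfig(steps=500, max_to_keep=3)
```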
- class medcat.utils.checkpoint.CheckpointManager(name, checkpoint_config)
Bases:
object
The class for managing checkpoints of a specific training type and their configuration
- Parameters:
name (str) – The name of the checkpoint manager (also used as the checkpoint base directory name).
checkpoint_config (medcat.utils.checkpoint.CheckpointConfig) – The checkpoint config object.
- __init__(name, checkpoint_config)
- Parameters:
name (str) –
checkpoint_config (CheckpointConfig) –
- Return type:
None
- create_checkpoint(dir_path=None)
Create a new checkpoint inside the checkpoint base directory.
- Parameters:
dir_path (str) – The path to the checkpoint directory.
- Returns:
Checkpoint – A checkpoint object.
- Return type:
Checkpoint
- get_latest_checkpoint(base_dir_path=None)
Retrieve the latest checkpoint from the checkpoint base directory.
- Parameters:
base_dir_path (str) – The path to the directory containing checkpoint files.
- Returns:
Checkpoint – A checkpoint object.
- Return type:
Checkpoint
- classmethod get_latest_training_dir(base_dir_path)
Retrieve the latest training directory containing all checkpoints.
- Parameters:
base_dir_path (str) – The path to the directory containing all checkpointed trainings.
- Returns:
str – The path to the latest training directory containing all checkpoints.
- Raises:
ValueError – If no checkpoint is found.
- Return type:
str
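The contract described for get_latest_training_dir (return the latest training directory, raise ValueError when there is none) can be sketched with the standard library alone. The function latest_subdir is hypothetical, and using the modification time to decide which directory is "latest" is an assumption; the reference does not state the criterion.

```python
import os
import tempfile


def latest_subdir(base_dir_path):
    """Hypothetical helper: return the most recently modified subdirectory."""
    subdirs = [
        os.path.join(base_dir_path, d)
        for d in os.listdir(base_dir_path)
        if os.path.isdir(os.path.join(base_dir_path, d))
    ]
    if not subdirs:
        raise ValueError(f"No checkpoint found in {base_dir_path}")
    # Assumption: "latest" means the largest modification time.
    return max(subdirs, key=os.path.getmtime)


def demo():
    """Create two training directories with fixed mtimes and pick the newer one."""
    with tempfile.TemporaryDirectory() as tmp:
        older = os.path.join(tmp, "training_a")
        newer = os.path.join(tmp, "training_b")
        os.mkdir(older)
        os.mkdir(newer)
        # Set modification times explicitly so the result does not depend on timing.
        os.utime(older, (100, 100))
        os.utime(newer, (200, 200))
        return os.path.basename(latest_subdir(tmp))


def demo_empty():
    """Return True if an empty base directory triggers ValueError."""
    with tempfile.TemporaryDirectory() as tmp:
        try:
            latest_subdir(tmp)
        except ValueError:
            return True
        return False
```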