mmfewshot.classification¶
classification.apis¶
- mmfewshot.classification.apis.inference_classifier(model: torch.nn.modules.module.Module, query_img: str) → Dict[source]¶
Inference single image with the classifier.
- Parameters
model (nn.Module) – The loaded classifier.
query_img (str) – The image filename.
- Returns
The classification results, containing the pred_score of each class.
- Return type
dict
- mmfewshot.classification.apis.init_classifier(config: Union[str, mmcv.utils.config.Config], checkpoint: Optional[str] = None, device: str = 'cuda:0', options: Optional[Dict] = None) → torch.nn.modules.module.Module[source]¶
Prepare a few shot classifier from config file.
- Parameters
config (str or mmcv.Config) – Config file path or the config object.
checkpoint (str | None) – Checkpoint path. If left as None, the model will not load any weights. Default: None.
device (str) – Runtime device. Default: ‘cuda:0’.
options (dict | None) – Options to override some settings in the used config. Default: None.
- Returns
The constructed classifier.
- Return type
nn.Module
- mmfewshot.classification.apis.multi_gpu_meta_test(model: mmcv.parallel.distributed.MMDistributedDataParallel, num_test_tasks: int, support_dataloader: torch.utils.data.dataloader.DataLoader, query_dataloader: torch.utils.data.dataloader.DataLoader, test_set_dataloader: Optional[torch.utils.data.dataloader.DataLoader] = None, meta_test_cfg: Optional[Dict] = None, eval_kwargs: Optional[Dict] = None, logger: Optional[object] = None, confidence_interval: float = 0.95, show_task_results: bool = False) → Dict[source]¶
Distributed meta testing on multiple gpus.
During meta testing, the model might be further fine-tuned or given extra parameters. However, the tested model needs to be restored after meta testing, since meta testing can be used as validation in the middle of training. To detach the model from the previous phase, it is copied and wrapped with MetaTestParallel. The copy is fully independent of the training model and is discarded after meta testing.
In the distributed situation, the MetaTestParallel on each GPU is also independent. The test tasks in few shot learning are usually very small and hardly benefit from distributed acceleration. Thus, in distributed meta testing, each task is run on a single GPU and each GPU is assigned a certain number of tasks. The number of test tasks for each GPU is ceil(num_test_tasks / world_size). After all GPUs finish their tasks, the results are aggregated to get the final result.
- Parameters
model (MMDistributedDataParallel) – Model to be meta tested.
num_test_tasks (int) – Number of meta testing tasks.
support_dataloader (DataLoader) – A PyTorch dataloader of support data.
query_dataloader (DataLoader) – A PyTorch dataloader of query data.
test_set_dataloader (DataLoader) – A PyTorch dataloader of all test data. Default: None.
meta_test_cfg (dict) – Config for meta testing. Default: None.
eval_kwargs (dict) – Any keyword argument to be used for evaluation. Default: None.
logger (logging.Logger | None) – Logger used for printing related information during evaluation. Default: None.
confidence_interval (float) – Confidence interval. Default: 0.95.
show_task_results (bool) – Whether to record the eval result of each task. Default: False.
- Returns
Dict of meta evaluation results, containing accuracy_mean and accuracy_std of all test tasks.
- Return type
dict | None
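The per-GPU task split described above can be illustrated with a small standalone sketch. The helper name `tasks_per_gpu` is hypothetical and not part of mmfewshot; it only reproduces the ceil(num_test_tasks / world_size) arithmetic.

```python
import math


def tasks_per_gpu(num_test_tasks: int, world_size: int) -> int:
    """Number of test tasks each GPU runs: ceil(num_test_tasks / world_size)."""
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    return math.ceil(num_test_tasks / world_size)


# 1000 tasks on 8 GPUs: every GPU runs 125 tasks.
print(tasks_per_gpu(1000, 8))
# 1000 tasks on 3 GPUs: every GPU runs 334 tasks, i.e. 1002 task slots in total.
print(tasks_per_gpu(1000, 3))
```

Because of the ceiling, the total number of task slots can slightly exceed num_test_tasks when it is not divisible by the world size; how the surplus is handled during aggregation is not covered by this sketch.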
- mmfewshot.classification.apis.process_support_images(model: torch.nn.modules.module.Module, support_imgs: List[str], support_labels: List[str]) → None[source]¶
Process support images.
- Parameters
model (nn.Module) – Classifier model.
support_imgs (list[str]) – The image filenames.
support_labels (list[str]) – The class names of support images.
- mmfewshot.classification.apis.show_result_pyplot(img: str, result: Dict, fig_size: Tuple[int] = (15, 10), wait_time: int = 0, out_file: Optional[str] = None) → numpy.ndarray[source]¶
Visualize the classification results on the image.
- Parameters
img (str) – Image filename.
result (dict) – The classification result.
fig_size (tuple) – Figure size of the pyplot figure. Default: (15, 10).
wait_time (int) – How many seconds to display the image. Default: 0.
out_file (str | None) – Default: None
- Returns
pyplot figure.
- Return type
np.ndarray
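Taken together, the inference functions documented above suggest the following flow. This is a sketch based only on the signatures listed in this section; the wrapper name `classify_query_image` and its argument names are placeholders, and the paths you pass in are your own.

```python
def classify_query_image(config_path, checkpoint_path, support_imgs,
                         support_labels, query_img, device="cuda:0"):
    """Run few shot inference on one query image and return the result dict."""
    # Imported lazily so this sketch can be loaded without mmfewshot installed.
    from mmfewshot.classification.apis import (inference_classifier,
                                               init_classifier,
                                               process_support_images,
                                               show_result_pyplot)

    # Build the classifier from a config file and, optionally, a checkpoint.
    model = init_classifier(config_path, checkpoint_path, device=device)
    # Register the support images together with their class names.
    process_support_images(model, support_imgs, support_labels)
    # Classify the query image; the result holds the pred_score of each class.
    result = inference_classifier(model, query_img)
    # Visualize the classification result on the query image.
    show_result_pyplot(query_img, result)
    return result
```

The support set is processed before the query because the classifier needs the support images and their class names to produce per-class scores for the query.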
- mmfewshot.classification.apis.single_gpu_meta_test(model: Union[mmcv.parallel.data_parallel.MMDataParallel, torch.nn.modules.module.Module], num_test_tasks: int, support_dataloader: torch.utils.data.dataloader.DataLoader, query_dataloader: torch.utils.data.dataloader.DataLoader, test_set_dataloader: Optional[torch.utils.data.dataloader.DataLoader] = None, meta_test_cfg: Optional[Dict] = None, eval_kwargs: Optional[Dict] = None, logger: Optional[object] = None, confidence_interval: float = 0.95, show_task_results: bool = False) → Dict[source]¶
Meta testing on single gpu.
During meta testing, the model might be further fine-tuned or given extra parameters. However, the tested model needs to be restored after meta testing, since meta testing can be used as validation in the middle of training. To detach the model from the previous phase, it is copied and wrapped with MetaTestParallel. The copy is fully independent of the training model and is discarded after meta testing.
- Parameters
model (MMDataParallel | nn.Module) – Model to be meta tested.
num_test_tasks (int) – Number of meta testing tasks.
support_dataloader (DataLoader) – A PyTorch dataloader of support data, used to fetch the support data for each task.
query_dataloader (DataLoader) – A PyTorch dataloader of query data, used to fetch the query data for each task.
test_set_dataloader (DataLoader) – A PyTorch dataloader of all test data, used to extract features from the whole dataset to accelerate testing. Default: None.
meta_test_cfg (dict) – Config for meta testing. Default: None.
eval_kwargs (dict) – Any keyword argument to be used for evaluation. Default: None.
logger (logging.Logger | None) – Logger used for printing related information during evaluation. Default: None.
confidence_interval (float) – Confidence interval. Default: 0.95.
show_task_results (bool) – Whether to record the eval result of each task. Default: False.
- Returns
Dict of meta evaluation results, containing accuracy_mean and accuracy_std of all test tasks.
- Return type
dict
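To make the roles of accuracy_mean, accuracy_std and confidence_interval more concrete, here is a minimal sketch of summarizing per-task accuracies. The function name `summarize_task_accuracies` is hypothetical, and the normal-approximation interval is an assumption for illustration; how mmfewshot combines these quantities internally is not stated in this section.

```python
import math
from statistics import NormalDist, mean, pstdev


def summarize_task_accuracies(accuracies, confidence_interval=0.95):
    """Return the mean, the standard deviation and a confidence half-width."""
    n = len(accuracies)
    acc_mean = mean(accuracies)
    # Population standard deviation over all test tasks.
    acc_std = pstdev(accuracies)
    # Two-sided z-score of the requested confidence level (normal approximation).
    z = NormalDist().inv_cdf((1 + confidence_interval) / 2)
    half_width = z * acc_std / math.sqrt(n)
    return acc_mean, acc_std, half_width


acc_mean, acc_std, half_width = summarize_task_accuracies([0.5, 0.7])
print(f"{acc_mean:.3f} +- {half_width:.3f} (std {acc_std:.3f})")
```

The half-width shrinks with the square root of the number of tasks, which is why few shot results are usually reported over a large number of test tasks.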
- mmfewshot.classification.apis.test_single_task(model: mmfewshot.classification.utils.meta_test_parallel.MetaTestParallel, support_dataloader: torch.utils.data.dataloader.DataLoader, query_dataloader: torch.utils.data.dataloader.DataLoader, meta_test_cfg: Dict)[source]¶
Test a single task.
A task has two stages: handling the support set and predicting the query set. Stage one currently supports fine-tune based and metric based methods. Stage two simply forwards the query set and gathers all the results.
- Parameters
model (MetaTestParallel) – Model to be meta tested.
support_dataloader (DataLoader) – A PyTorch dataloader of support data.
query_dataloader (DataLoader) – A PyTorch dataloader of query data.
meta_test_cfg (dict) – Config for meta testing.
- Returns
results_list (list[np.ndarray]): Predict results.
gt_labels (np.ndarray): Ground truth labels.
- Return type
tuple
classification.core¶
evaluation¶
- class mmfewshot.classification.core.evaluation.DistMetaTestEvalHook(support_dataloader: torch.utils.data.dataloader.DataLoader, query_dataloader: torch.utils.data.dataloader.DataLoader, test_set_dataloader: torch.utils.data.dataloader.DataLoader, num_test_tasks: int, interval: int = 1, by_epoch: bool = True, meta_test_cfg: Optional[Dict] = None, confidence_interval: float = 0.95, save_best: bool = True, key_indicator: str = 'accuracy_mean', **eval_kwargs)[source]¶
Distributed evaluation hook.
- class mmfewshot.classification.core.evaluation.MetaTestEvalHook(support_dataloader: torch.utils.data.dataloader.DataLoader, query_dataloader: torch.utils.data.dataloader.DataLoader, test_set_dataloader: torch.utils.data.dataloader.DataLoader, num_test_tasks: int, interval: int = 1, by_epoch: bool = True, meta_test_cfg: Optional[Dict] = None, confidence_interval: float = 0.95, save_best: bool = True, key_indicator: str = 'accuracy_mean', **eval_kwargs)[source]¶
Evaluation hook for Meta Testing.
- Parameters
support_dataloader (DataLoader) – A PyTorch dataloader of support data.
query_dataloader (DataLoader) – A PyTorch dataloader of query data.
test_set_dataloader (DataLoader) – A PyTorch dataloader of all test data.
num_test_tasks (int) – Number of tasks for meta testing.
interval (int) – Evaluation interval (by epochs or iteration). Default: 1.
by_epoch (bool) – Epoch based runner or not. Default: True.
meta_test_cfg (dict) – Config for meta testing.
confidence_interval (float) – Confidence interval. Default: 0.95.
save_best (bool) – Whether to save best validated model. Default: True.
key_indicator (str) – The validation metric for selecting the best model. Default: ‘accuracy_mean’.
eval_kwargs – Any keyword argument to be used for evaluation.
classification.datasets¶
- class mmfewshot.classification.datasets.BaseFewShotDataset(data_prefix: str, pipeline: List[Dict], classes: Optional[Union[str, List[str]]] = None, ann_file: Optional[str] = None)[source]¶
Base few shot dataset.
- Parameters
data_prefix (str) – The prefix of data path.
pipeline (list) – A list of dicts, where each element represents an operation defined in mmcls.datasets.pipelines.
classes (str | Sequence[str] | None) – Classes for model training and provide fixed label for each class. Default: None.
ann_file (str | None) – The annotation file. When ann_file is str, the subclass is expected to read from the ann_file. When ann_file is None, the subclass is expected to read according to data_prefix. Default: None.
- property class_to_idx: Mapping¶
Mapping from class name to class index.
- Returns
mapping from class name to class index.
- Return type
dict
- static evaluate(results: List, gt_labels: numpy.array, metric: Union[str, List[str]] = 'accuracy', metric_options: Optional[dict] = None, logger: Optional[object] = None) → Dict[source]¶
Evaluate the dataset.
- Parameters
results (list) – Testing results of the dataset.
gt_labels (np.ndarray) – Ground truth labels.
metric (str | list[str]) – Metrics to be evaluated. Default value is accuracy.
metric_options (dict | None) – Options for calculating metrics. Allowed keys are ‘topk’, ‘thrs’ and ‘average_mode’. Default: None.
logger (logging.Logger | None) – Logger used for printing related information during evaluation. Default: None.
- Returns
evaluation results
- Return type
dict
- classmethod get_classes(classes: Optional[Union[Sequence[str], str]] = None) → Sequence[str][source]¶
Get class names of current dataset.
- Parameters
classes (Sequence[str] | str | None) –
Three types of input will correspond to different processing logics:
If classes is a tuple or list, it will override the CLASSES predefined in the dataset.
If classes is None, the pre-defined CLASSES of the dataset will be used.
If classes is a string, it is the path of a classes file that contains the name of all classes. Each line of the file contains a single class name.
- Returns
Names of categories of the dataset.
- Return type
tuple[str] or list[str]
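The three input types handled by get_classes can be sketched as follows. This is a standalone illustration, not the library code: `resolve_classes` and `PREDEFINED_CLASSES` are hypothetical names standing in for the classmethod and the dataset's CLASSES attribute.

```python
import os
import tempfile
from typing import Optional, Sequence, Union

# Hypothetical stand-in for the CLASSES predefined in a dataset.
PREDEFINED_CLASSES = ("cat", "dog")


def resolve_classes(classes: Optional[Union[Sequence[str], str]] = None):
    if classes is None:
        # No override: fall back to the predefined classes.
        return PREDEFINED_CLASSES
    if isinstance(classes, str):
        # A string is treated as the path of a file with one class name per line.
        with open(classes, encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]
    if isinstance(classes, (tuple, list)):
        # An explicit tuple or list overrides the predefined classes.
        return classes
    raise ValueError(f"Unsupported type {type(classes)} of classes.")


# Usage: write a small classes file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "classes.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("sparrow\nfinch\n")
    from_file = resolve_classes(path)
print(from_file)
```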
- class mmfewshot.classification.datasets.CUBDataset(classes_id_seed: Optional[int] = None, subset: typing_extensions.Literal[train, test, val] = 'train', *args, **kwargs)[source]¶
CUB dataset for few shot classification.
- Parameters
classes_id_seed (int | None) – A random seed to shuffle order of classes. If seed is None, the classes will be arranged in alphabetical order. Default: None.
subset (str | list[str]) – The classes of the whole dataset are split into three disjoint subsets: train, val and test. If subset is a string, only that subset will be loaded. If subset is a list of strings, the data of every subset in the list will be loaded. Options: [‘train’, ‘val’, ‘test’]. Default: ‘train’.
- get_classes(classes: Optional[Union[Sequence[str], str]] = None) → Sequence[str][source]¶
Get class names of current dataset.
- Parameters
classes (Sequence[str] | str | None) –
Three types of input will correspond to different processing logics:
If classes is a tuple or list, it will override the CLASSES predefined in the dataset.
If classes is None, the pre-defined CLASSES of the dataset will be used.
If classes is a string, it is the path of a classes file that contains the name of all classes. Each line of the file contains a single class name.
- Returns
Names of categories of the dataset.
- Return type
tuple[str] or list[str]
- class mmfewshot.classification.datasets.EpisodicDataset(dataset: torch.utils.data.dataset.Dataset, num_episodes: int, num_ways: int, num_shots: int, num_queries: int, episodes_seed: Optional[int] = None)[source]¶
A wrapper of episodic dataset.
It will generate a list of support and query image indices for each episode (support + query images). Every call of __getitem__ will fetch and return (num_ways * num_shots) support images and (num_ways * num_queries) query images according to the generated image indices. Note that all the episode indices are generated at once using a specific random seed to ensure reproducibility for the same dataset.
- Parameters
dataset (Dataset) – The dataset to be wrapped.
num_episodes (int) – Number of episodes. Note that all episodes are generated at once and will not change afterwards, so make sure to set num_episodes larger than you need.
num_ways (int) – Number of ways for each episode.
num_shots (int) – Number of support data of each way for each episode.
num_queries (int) – Number of query data of each way for each episode.
episodes_seed (int | None) – A random seed to reproduce episodic indices. If seed is None, it will use runtime random seed. Default: None.
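The episode generation described above can be sketched in plain Python. This is an illustrative assumption about the sampling procedure, not the wrapper's actual code: `generate_episodes` is a hypothetical helper that works on a flat list of integer labels.

```python
import random
from collections import defaultdict


def generate_episodes(labels, num_episodes, num_ways, num_shots,
                      num_queries, seed=None):
    """Pre-generate support/query sample indices for every episode."""
    rng = random.Random(seed)
    # Group sample indices by class label.
    indices_by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        indices_by_class[label].append(idx)
    class_ids = sorted(indices_by_class)

    episodes = []
    for _ in range(num_episodes):
        support, query = [], []
        # Pick num_ways classes, then disjoint support/query samples per class.
        for class_id in rng.sample(class_ids, num_ways):
            picked = rng.sample(indices_by_class[class_id],
                                num_shots + num_queries)
            support.extend(picked[:num_shots])
            query.extend(picked[num_shots:])
        episodes.append({"support": support, "query": query})
    return episodes


# 5 classes with 10 samples each; three 2-way 1-shot episodes with 4 queries per way.
labels = [i // 10 for i in range(50)]
episodes = generate_episodes(labels, num_episodes=3, num_ways=2,
                             num_shots=1, num_queries=4, seed=0)
print(len(episodes[0]["support"]), len(episodes[0]["query"]))
```

Generating every episode up front from a seeded generator is what makes the episodes reproducible: the same labels and seed always give the same indices.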
- class mmfewshot.classification.datasets.LoadImageFromBytes(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]¶
Load an image from bytes.
- class mmfewshot.classification.datasets.MetaTestDataset(*args, **kwargs)[source]¶
A wrapper of the episodic dataset for meta testing.
During meta testing, the MetaTestDataset will be copied and converted into three modes: test_set, support, and query. Each mode of the dataset will be used in a different dataloader, but they share the same episode and image information.
In test_set mode, the dataset will fetch all images from the whole test set to extract features from the fixed backbone, which can accelerate meta testing.
In support or query mode, the dataset will fetch images according to the episode_idxes with the same task_id. Therefore, the support and query dataset must be set to the same task_id in each test task.
- class mmfewshot.classification.datasets.MiniImageNetDataset(subset: typing_extensions.Literal[train, test, val] = 'train', file_format: str = 'JPEG', *args, **kwargs)[source]¶
MiniImageNet dataset for few shot classification.
- Parameters
subset (str | list[str]) – The classes of the whole dataset are split into three disjoint subsets: train, val and test. If subset is a string, only that subset will be loaded. If subset is a list of strings, the data of every subset in the list will be loaded. Options: [‘train’, ‘val’, ‘test’]. Default: ‘train’.
file_format (str) – The file format of the image. Default: ‘JPEG’
- get_classes(classes: Optional[Union[Sequence[str], str]] = None) → Sequence[str][source]¶
Get class names of current dataset.
- Parameters
classes (Sequence[str] | str | None) –
Three types of input will correspond to different processing logics:
If classes is a tuple or list, it will override the CLASSES predefined in the dataset.
If classes is None, the pre-defined CLASSES of the dataset will be used.
If classes is a string, it is the path of a classes file that contains the name of all classes. Each line of the file contains a single class name.
- Returns
Names of categories of the dataset.
- Return type
tuple[str] or list[str]
- class mmfewshot.classification.datasets.TieredImageNetDataset(subset: typing_extensions.Literal[train, test, val] = 'train', *args, **kwargs)[source]¶
TieredImageNet dataset for few shot classification.
- Parameters
subset (str | list[str]) – The classes of the whole dataset are split into three disjoint subsets: train, val and test. If subset is a string, only that subset will be loaded. If subset is a list of strings, the data of every subset in the list will be loaded. Options: [‘train’, ‘val’, ‘test’]. Default: ‘train’.
- get_classes(classes: Optional[Union[Sequence[str], str]] = None) → Sequence[str][source]¶
Get class names of current dataset.
- Parameters
classes (Sequence[str] | str | None) –
Three types of input will correspond to different processing logics:
If classes is a tuple or list, it will override the CLASSES predefined in the dataset.
If classes is None, the pre-defined CLASSES of the dataset will be used.
If classes is a string, it is the path of a classes file that contains the name of all classes. Each line of the file contains a single class name.
- Returns
Names of categories of the dataset.
- Return type
tuple[str] or list[str]
- mmfewshot.classification.datasets.build_dataloader(dataset: torch.utils.data.dataset.Dataset, samples_per_gpu: int, workers_per_gpu: int, num_gpus: int = 1, dist: bool = True, shuffle: bool = True, round_up: bool = True, seed: Optional[int] = None, pin_memory: bool = False, use_infinite_sampler: bool = False, **kwargs) → torch.utils.data.dataloader.DataLoader[source]¶
Build PyTorch DataLoader.
In distributed training, each GPU/process has a dataloader. In non-distributed training, there is only one dataloader for all GPUs.
- Parameters
dataset (Dataset) – A PyTorch dataset.
samples_per_gpu (int) – Number of training samples on each GPU, i.e., batch size of each GPU.
workers_per_gpu (int) – How many subprocesses to use for data loading for each GPU.
num_gpus (int) – Number of GPUs. Only used in non-distributed training.
dist (bool) – Distributed training/test or not. Default: True.
shuffle (bool) – Whether to shuffle the data at every epoch. Default: True.
round_up (bool) – Whether to round up the length of dataset by adding extra samples to make it evenly divisible. Default: True.
seed (int | None) – Random seed. Default:None.
pin_memory (bool) – Whether to use pin_memory for dataloader. Default: False.
use_infinite_sampler (bool) – Whether to use an infinite sampler. Note that an infinite sampler keeps the dataloader iterator running forever, which avoids the overhead of worker initialization between epochs. Default: False.
kwargs – any keyword argument to be used to initialize DataLoader
- Returns
A PyTorch dataloader.
- Return type
DataLoader
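The relation between the per-GPU arguments and the loader that is actually built can be sketched as follows. The helper `loader_sizes` is hypothetical, and the scaling rule for the non-distributed case is an assumption based on the description above (one dataloader shared by all GPUs); it is not confirmed by this section.

```python
def loader_sizes(samples_per_gpu, workers_per_gpu, num_gpus=1, dist=True):
    """Return (batch_size, num_workers) of the dataloader that gets built."""
    if dist:
        # Distributed: one dataloader per GPU/process, sized for a single GPU.
        return samples_per_gpu, workers_per_gpu
    # Non-distributed (assumed): a single dataloader feeds all GPUs.
    return num_gpus * samples_per_gpu, num_gpus * workers_per_gpu


print(loader_sizes(16, 2, num_gpus=4, dist=False))  # (64, 8)
print(loader_sizes(16, 2, num_gpus=4, dist=True))   # (16, 2)
```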
- mmfewshot.classification.datasets.build_meta_test_dataloader(dataset: torch.utils.data.dataset.Dataset, meta_test_cfg: Dict, **kwargs) → torch.utils.data.dataloader.DataLoader[source]¶
Build PyTorch DataLoader.
In distributed training, each GPU/process has a dataloader. In non-distributed training, there is only one dataloader for all GPUs.
- Parameters
dataset (Dataset) – A PyTorch dataset.
meta_test_cfg (dict) – Config of meta testing.
kwargs – any keyword argument to be used to initialize DataLoader
- Returns
support_data_loader, query_data_loader and test_set_data_loader.
- Return type
tuple[DataLoader]
- mmfewshot.classification.datasets.label_wrapper(labels: Union[torch.Tensor, numpy.ndarray, List], class_ids: List[int]) → Union[torch.Tensor, numpy.ndarray, list][source]¶
Map input labels into the range 0 to (number of classes - 1).
It is usually used in the meta testing phase, in which the class ids are randomly sampled and discontinuous.
- Parameters
labels (Tensor | np.ndarray | list) – The labels to be wrapped.
class_ids (list[int]) – All class ids of labels.
- Returns
Same type as the input labels.
- Return type
(Tensor | np.ndarray | list)
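A minimal sketch of this mapping for the list case is shown below. `wrap_labels` is a hypothetical stand-in for illustration; the documented function also accepts Tensor and np.ndarray inputs and returns the same type, which this sketch does not cover.

```python
def wrap_labels(labels, class_ids):
    """Map each label to the position of its class id in class_ids."""
    class_id_map = {class_id: i for i, class_id in enumerate(class_ids)}
    return [class_id_map[label] for label in labels]


# Class ids 12, 57 and 93 were sampled for a task; they map to 0, 1 and 2.
print(wrap_labels([57, 12, 57, 93], class_ids=[12, 57, 93]))  # [1, 0, 1, 2]
```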
classification.models¶
backbones¶
- class mmfewshot.classification.models.backbones.Conv4(depth: int = 4, pooling_blocks: Sequence[int] = (0, 1, 2, 3), padding_blocks: Sequence[int] = (0, 1, 2, 3), flatten: bool = True)[source]¶
- class mmfewshot.classification.models.backbones.ConvNet(depth: int, pooling_blocks: Sequence[int], padding_blocks: Sequence[int], flatten: bool = True)[source]¶
Simple ConvNet.
- Parameters
depth (int) – The number of ConvBlock.
pooling_blocks (Sequence[int]) – Indicate which block to use 2x2 max pooling.
padding_blocks (Sequence[int]) – Indicate which block to use conv layer with padding.
flatten (bool) – Whether to flatten features from (N, C, H, W) to (N, C*H*W). Default: True.
- forward(x: torch.Tensor) → torch.Tensor[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmfewshot.classification.models.backbones.ResNet12(block: torch.nn.modules.module.Module = <class 'mmfewshot.classification.models.backbones.resnet12.BasicBlock'>, with_avgpool: bool = True, pool_size: Tuple[int, int] = (1, 1), flatten: bool = True, drop_rate: float = 0.0, drop_block_size: int = 5)[source]¶
ResNet12.
- Parameters
block (nn.Module) – Block to build layers. Default: BasicBlock.
with_avgpool (bool) – Whether to average pool the features. Default: True.
pool_size (tuple(int,int)) – The output shape of average pooling layer. Default: (1, 1).
flatten (bool) – Whether to flatten features from (N, C, H, W) to (N, C*H*W). Default: True.
drop_rate (float) – Dropout rate. Default: 0.0.
drop_block_size (int) – Size of drop block. Default: 5.
- forward(x: torch.Tensor) → torch.Tensor[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmfewshot.classification.models.backbones.WRN28x10(depth: int = 28, widen_factor: int = 10, stride: int = 1, drop_rate: float = 0.5, flatten: bool = True, with_avgpool: bool = True, pool_size: Tuple[int, int] = (1, 1))[source]¶
- class mmfewshot.classification.models.backbones.WideResNet(depth: int, widen_factor: int = 1, stride: int = 1, drop_rate: float = 0.0, flatten: bool = True, with_avgpool: bool = True, pool_size: Tuple[int, int] = (1, 1))[source]¶
WideResNet.
- Parameters
depth (int) – The number of layers.
widen_factor (int) – The widen factor of channels. Default: 1.
stride (int) – Stride of first layer. Default: 1.
drop_rate (float) – Dropout rate. Default: 0.0.
with_avgpool (bool) – Whether to average pool the features. Default: True.
flatten (bool) – Whether to flatten features from (N, C, H, W) to (N, C*H*W). Default: True.
pool_size (tuple(int,int)) – The output shape of average pooling layer. Default: (1, 1).
- forward(x: torch.Tensor) → torch.Tensor[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
classifier¶
heads¶
- class mmfewshot.classification.models.heads.CosineDistanceHead(num_classes: int, in_channels: int, temperature: Optional[float] = None, eps: float = 1e-05, *args, **kwargs)[source]¶
Classification head for Baseline++ (https://arxiv.org/abs/1904.04232).
- Parameters
num_classes (int) – Number of categories.
in_channels (int) – Number of channels in the input feature map.
temperature (float | None) – Scaling factor of cls_score. Default: None.
eps (float) – Constant variable to avoid division by zero. Default: 0.00001.
- before_forward_query() → None[source]¶
Used in meta testing.
This function will be called before model forward query data during meta testing.
- before_forward_support() → None[source]¶
Used in meta testing.
This function will be called before model forward support data during meta testing.
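The roles of temperature and eps in a cosine-similarity head can be illustrated with a small pure-Python sketch. The function `cosine_scores` is hypothetical; where exactly eps enters the normalization and how a temperature of None is treated are assumptions here, not details taken from the CosineDistanceHead implementation.

```python
import math


def cosine_scores(feature, class_weights, temperature=1.0, eps=1e-5):
    """Scaled cosine similarity between one feature and each class weight."""

    def normalize(vec):
        # eps keeps the division defined for an all-zero vector.
        norm = math.sqrt(sum(v * v for v in vec))
        return [v / (norm + eps) for v in vec]

    feat = normalize(feature)
    scores = []
    for weight in class_weights:
        w = normalize(weight)
        # The temperature scales the similarity before it is used as cls_score.
        scores.append(temperature * sum(a * b for a, b in zip(feat, w)))
    return scores


print(cosine_scores([1.0, 0.0], [[2.0, 0.0], [0.0, 3.0]], temperature=10.0))
```

Since cosine similarity is bounded in [-1, 1], a scaling factor is typically needed so that a softmax over the scores can produce confident predictions.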
- class mmfewshot.classification.models.heads.LinearHead(num_classes: int, in_channels: int, *args, **kwargs)[source]¶
Classification head for Baseline.
- Parameters
num_classes (int) – Number of categories.
in_channels (int) – Number of channels in the input feature map.
- before_forward_query() → None[source]¶
Used in meta testing.
This function will be called before model forward query data during meta testing.
- before_forward_support() → None[source]¶
Used in meta testing.
This function will be called before model forward support data during meta testing.
- class mmfewshot.classification.models.heads.MatchingHead(temperature: float = 100, loss: Dict = {'loss_weight': 1.0, 'type': 'NLLLoss'}, *args, **kwargs)[source]¶
Classification head for MatchingNet (https://arxiv.org/abs/1606.04080).
Note that this implementation is without FCE(Full Context Embeddings).
- Parameters
temperature (float) – The scale factor of cls_score.
loss (dict) – Config of training loss.
- before_forward_query() → None[source]¶
Used in meta testing.
This function will be called before model forward query data during meta testing.
- before_forward_support() → None[source]¶
Used in meta testing.
This function will be called before model forward support data during meta testing.
- forward_support(x: torch.Tensor, gt_label: torch.Tensor, **kwargs) → None[source]¶
Forward support data in meta testing.
- forward_train(support_feats: torch.Tensor, support_labels: torch.Tensor, query_feats: torch.Tensor, query_labels: torch.Tensor, **kwargs) → Dict[source]¶
Forward training data.
- Parameters
support_feats (Tensor) – Features of support data with shape (N, C).
support_labels (Tensor) – Labels of support data with shape (N).
query_feats (Tensor) – Features of query data with shape (N, C).
query_labels (Tensor) – Labels of query data with shape (N).
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- class mmfewshot.classification.models.heads.MetaBaselineHead(temperature: float = 10.0, learnable_temperature: bool = True, *args, **kwargs)[source]¶
Classification head for MetaBaseline (https://arxiv.org/abs/2003.04390).
- Parameters
temperature (float) – Scaling factor of cls_score. Default: 10.0.
learnable_temperature (bool) – Whether to use learnable scale factor or not. Default: True.
- before_forward_query() → None[source]¶
Used in meta testing.
This function will be called before model forward query data during meta testing.
- before_forward_support() → None[source]¶
Used in meta testing.
This function will be called before model forward support data during meta testing.
- forward_support(x: torch.Tensor, gt_label: torch.Tensor, **kwargs) → None[source]¶
Forward support data in meta testing.
- forward_train(support_feats: torch.Tensor, support_labels: torch.Tensor, query_feats: torch.Tensor, query_labels: torch.Tensor, **kwargs) → Dict[source]¶
Forward training data.
- Parameters
support_feats (Tensor) – Features of support data with shape (N, C).
support_labels (Tensor) – Labels of support data with shape (N).
query_feats (Tensor) – Features of query data with shape (N, C).
query_labels (Tensor) – Labels of query data with shape (N).
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- class mmfewshot.classification.models.heads.NegMarginHead(num_classes: int, in_channels: int, temperature: float = 30.0, margin: float = 0.0, metric_type: str = 'cosine', *args, **kwargs)[source]¶
Classification head for NegMargin.
- Parameters
num_classes (int) – Number of categories.
in_channels (int) – Number of channels in the input feature map.
temperature (float) – Scaling factor of cls_score. Default: 30.0.
margin (float) – Margin of cls_score. Default: 0.0.
metric_type (str) – The way to calculate similarity. Options:[‘cosine’, ‘softmax’]. Default: ‘cosine’
- before_forward_query() → None[source]¶
Used in meta testing.
This function will be called before model forward query data during meta testing.
- before_forward_support() → None[source]¶
Used in meta testing.
This function will be called before model forward support data during meta testing.
- class mmfewshot.classification.models.heads.PrototypeHead(*args, **kwargs)[source]¶
Classification head for ProtoNet (https://arxiv.org/abs/1703.05175).
- before_forward_query() → None[source]¶
Used in meta testing.
This function will be called before model forward query data during meta testing.
- before_forward_support() → None[source]¶
Used in meta testing.
This function will be called before model forward support data during meta testing.
- forward_support(x: torch.Tensor, gt_label: torch.Tensor, **kwargs) → None[source]¶
Forward support data in meta testing.
- forward_train(support_feats: torch.Tensor, support_labels: torch.Tensor, query_feats: torch.Tensor, query_labels: torch.Tensor, **kwargs) → Dict[source]¶
Forward training data.
- Parameters
support_feats (Tensor) – Features of support data with shape (N, C).
support_labels (Tensor) – Labels of support data with shape (N).
query_feats (Tensor) – Features of query data with shape (N, C).
query_labels (Tensor) – Labels of query data with shape (N).
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
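For orientation, the prototype idea behind ProtoNet can be sketched without any framework: each class prototype is the mean of that class's support features, and a query is scored by its negative squared Euclidean distance to every prototype. The helpers `class_prototypes` and `prototype_scores` are hypothetical and operate on plain lists; they are not the PrototypeHead implementation.

```python
def class_prototypes(support_feats, support_labels):
    """Average the support features of each class into one prototype."""
    sums, counts = {}, {}
    for feat, label in zip(support_feats, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def prototype_scores(query_feat, prototypes):
    """Negative squared Euclidean distance to each prototype (higher is closer)."""
    return {label: -sum((q - p) ** 2 for q, p in zip(query_feat, proto))
            for label, proto in prototypes.items()}


# Support features with shape (N, C) = (4, 2) and labels with shape (N).
support_feats = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]]
support_labels = [0, 0, 1, 1]
prototypes = class_prototypes(support_feats, support_labels)
scores = prototype_scores([1.0, 1.0], prototypes)
print(max(scores, key=scores.get))  # 0
```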
- class mmfewshot.classification.models.heads.RelationHead(in_channels: int, feature_size: Tuple[int] = (7, 7), hidden_channels: int = 8, loss: Dict = {'loss_weight': 1.0, 'type': 'CrossEntropyLoss'}, *args, **kwargs)[source]¶
Classification head for RelationNet (https://arxiv.org/abs/1711.06025).
- Parameters
in_channels (int) – Number of channels in the input feature map.
feature_size (tuple(int, int)) – Size of the input feature map. Default: (7, 7).
hidden_channels (int) – Number of channels for the hidden fc layer. Default: 8.
loss (dict) – Training loss. Options are CrossEntropyLoss and MSELoss.
- before_forward_query() → None[source]¶
Used in meta testing.
This function will be called before model forward query data during meta testing.
- before_forward_support() → None[source]¶
Used in meta testing.
This function will be called before model forward support data during meta testing.
- forward_relation_module(x: torch.Tensor) → torch.Tensor[source]¶
Forward function for relation module.
- forward_support(x: torch.Tensor, gt_label: torch.Tensor, **kwargs) → None[source]¶
Forward support data in meta testing.
- forward_train(support_feats: torch.Tensor, support_labels: torch.Tensor, query_feats: torch.Tensor, query_labels: torch.Tensor, **kwargs) → Dict[source]¶
Forward training data.
- Parameters
support_feats (Tensor) – Features of support data with shape (N, C, H, W).
support_labels (Tensor) – Labels of support data with shape (N).
query_feats (Tensor) – Features of query data with shape (N, C, H, W).
query_labels (Tensor) – Labels of query data with shape (N).
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
losses¶
- class mmfewshot.classification.models.losses.MSELoss(reduction: typing_extensions.Literal[none, mean, sum] = 'mean', loss_weight: float = 1.0)[source]¶
MSELoss.
- Parameters
reduction (str) – The method that reduces the loss to a scalar. Options are “none”, “mean” and “sum”. Default: ‘mean’.
loss_weight (float) – The weight of the loss. Default: 1.0.
- forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, int]] = None, reduction_override: Optional[str] = None) → torch.Tensor[source]¶
Forward function of loss.
- Parameters
pred (Tensor) – The prediction with shape (N, *), where * means any number of additional dimensions.
target (Tensor) – The learning target of the prediction with shape (N, *), the same as the prediction.
weight (Tensor | None) – Weight of the loss for each prediction. Default: None.
avg_factor (float | int | None) – Average factor that is used to average the loss. Default: None.
reduction_override (str | None) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”. Default: None.
- Returns
The calculated loss
- Return type
Tensor
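The way weight, avg_factor and reduction interact in forward can be sketched in plain Python. This is a minimal illustration, assuming the convention commonly used in OpenMMLab losses (element-wise weighting first, then reduction, with avg_factor replacing the element count under 'mean'). The helper name reduce_loss is hypothetical and is not part of mmfewshot; check the library source for the exact behavior.

```python
from typing import List, Optional, Union


def reduce_loss(
    elementwise: List[float],
    weight: Optional[List[float]] = None,
    reduction: str = "mean",
    avg_factor: Optional[float] = None,
) -> Union[float, List[float]]:
    """Hypothetical helper: weight element-wise losses, then reduce them."""
    if weight is not None:
        elementwise = [loss * w for loss, w in zip(elementwise, weight)]
    if reduction == "none":
        return elementwise
    if reduction == "sum":
        if avg_factor is not None:
            # Assumption: an averaging factor makes no sense for a plain sum.
            raise ValueError("avg_factor cannot be combined with reduction='sum'")
        return sum(elementwise)
    if reduction == "mean":
        denom = avg_factor if avg_factor is not None else len(elementwise)
        return sum(elementwise) / denom
    raise ValueError(f"unknown reduction: {reduction!r}")


# Squared errors for pred=[1.0, 2.0, 4.0] against target=[1.0, 0.0, 2.0].
squared_errors = [(p - t) ** 2 for p, t in zip([1.0, 2.0, 4.0], [1.0, 0.0, 2.0])]
mean_loss = reduce_loss(squared_errors)  # plain mean over three elements
# Mask out the last element and average over the two remaining ones.
masked_loss = reduce_loss(squared_errors, weight=[1.0, 1.0, 0.0], avg_factor=2)
```

Passing avg_factor is useful when some elements are masked out by weight, since the plain mean would still divide by the total number of elements.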
- class mmfewshot.classification.models.losses.NLLLoss(reduction: typing_extensions.Literal[none, mean, sum] = 'mean', loss_weight: float = 1.0)[source]¶
NLLLoss.
- Parameters
reduction (str) – The method that reduces the loss to a scalar. Options are “none”, “mean” and “sum”. Default: ‘mean’.
loss_weight (float) – The weight of the loss. Default: 1.0.
- forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, int]] = None, reduction_override: Optional[str] = None) → torch.Tensor[source]¶
Forward function of loss.
- Parameters
pred (Tensor) – The prediction with shape (N, C).
target (Tensor) – The learning target of the prediction with shape (N, 1).
weight (Tensor | None) – Weight of the loss for each prediction. Default: None.
avg_factor (float | int | None) – Average factor that is used to average the loss. Default: None.
reduction_override (str | None) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”. Default: None.
- Returns
The calculated loss
- Return type
Tensor
utils¶
- mmfewshot.classification.models.utils.convert_maml_module(module: torch.nn.modules.module.Module) → torch.nn.modules.module.Module[source]¶
Convert a normal model to MAML model.
Replace nn.Linear with LinearWithFastWeight, nn.Conv2d with Conv2dWithFastWeight and BatchNorm2d with BatchNorm2dWithFastWeight.
- Parameters
module (nn.Module) – The module to be converted.
- Returns
A MAML module.
- Return type
nn.Module
classification.utils¶
- class mmfewshot.classification.utils.MetaTestParallel(module: torch.nn.modules.module.Module, dim: int = 0)[source]¶
The MetaTestParallel module that supports DataContainer.
Note that each task is tested on a single GPU, so the data and model on different GPUs should be independent. MMDistributedDataParallel always automatically synchronizes the gradients across GPUs during the loss backward pass, which does not meet this requirement. Thus we simply copy the module and wrap it with a MetaTestParallel, which will send data to the device of the model.
MetaTestParallel has two main differences from PyTorch DataParallel:
- It supports a custom type DataContainer, which allows more flexible control of input data during both GPU and CPU inference.
- It implements three more APIs: before_meta_test(), before_forward_support() and before_forward_query().
- Parameters
module (nn.Module) – Module to be encapsulated.
dim (int) – Dimension used to scatter the data. Defaults to 0.
mmfewshot.detection¶
detection.apis¶
- mmfewshot.detection.apis.inference_detector(model: torch.nn.modules.module.Module, imgs: Union[List[str], str]) → List[source]¶
Inference images with the detector.
- Parameters
model (nn.Module) – Detector.
imgs (list[str] | str) – Batch or single image file.
- Returns
If imgs is a list or tuple, a list of results of the same length is returned; otherwise the detection results are returned directly.
- Return type
list
- mmfewshot.detection.apis.init_detector(config: Union[str, mmcv.utils.config.Config], checkpoint: Optional[str] = None, device: str = 'cuda:0', cfg_options: Optional[Dict] = None, classes: Optional[List[str]] = None) → torch.nn.modules.module.Module[source]¶
Prepare a detector from config file.
- Parameters
config (str | mmcv.Config) – Config file path or the config object.
checkpoint (str | None) – Checkpoint path. If left as None, the model will not load any weights.
device (str) – Runtime device. Default: ‘cuda:0’.
cfg_options (dict | None) – Options to override some settings in the used config.
classes (list[str] | None) – Class names used to override those of the model. Default: None.
- Returns
The constructed detector.
- Return type
nn.Module
- mmfewshot.detection.apis.multi_gpu_model_init(model: torch.nn.modules.module.Module, data_loader: torch.utils.data.dataloader.DataLoader) → List[source]¶
Forward support images for meta-learning based detector initialization.
The function is usually called before single_gpu_test in QuerySupportEvalHook. It first forwards support images with mode=model_init, which saves the features in the model. Then it calls model_init() to process the extracted features of support images and finish the model initialization.
Note that the data_loader should NOT use a distributed sampler: the models on all GPUs should be initialized with the same images.
- Parameters
model (nn.Module) – Model used for extracting support template features.
data_loader (DataLoader) – PyTorch data loader.
- Returns
Extracted support template features.
- Return type
list[Tensor]
- mmfewshot.detection.apis.multi_gpu_test(model: torch.nn.modules.module.Module, data_loader: torch.utils.data.dataloader.DataLoader, tmpdir: Optional[str] = None, gpu_collect: bool = False) → List[source]¶
Test model with multiple gpus for meta-learning based detector.
The model forward function requires mode, while in mmdet it requires return_loss, and encode_mask_results is removed. This method tests the model with multiple GPUs and collects the results under one of two modes: GPU or CPU. With ‘gpu_collect=True’, it encodes results to GPU tensors and uses GPU communication for results collection. In CPU mode, it saves the results from different GPUs to ‘tmpdir’ and collects them by the rank 0 worker.
- Parameters
model (nn.Module) – Model to be tested.
data_loader (DataLoader) – PyTorch data loader.
tmpdir (str) – Path of directory to save the temporary results from different gpus under cpu mode. Default: None.
gpu_collect (bool) – Option to use either gpu or cpu to collect results. Default: False.
- Returns
The prediction results.
- Return type
list
- mmfewshot.detection.apis.process_support_images(model: torch.nn.modules.module.Module, support_imgs: List[str], support_labels: List[List[str]], support_bboxes: Optional[List[List[float]]] = None, classes: Optional[List[str]] = None) → None[source]¶
Process support images for query support detector.
- Parameters
model (nn.Module) – Detector.
support_imgs (list[str]) – Support image filenames.
support_labels (list[list[str]]) – Support labels of each bbox.
support_bboxes (list[list[list[float]]] | None) – Bboxes in support images. If set to None, [0, 0, image width, image height] is used as the bbox. Default: None.
classes (list[str] | None) – Class names used to override those of the model. Default: None.
- mmfewshot.detection.apis.single_gpu_model_init(model: torch.nn.modules.module.Module, data_loader: torch.utils.data.dataloader.DataLoader) → List[source]¶
Forward support images for meta-learning based detector initialization.
The function is usually called before single_gpu_test in QuerySupportEvalHook. It first forwards support images with mode=model_init, which saves the features in the model. Then it calls model_init() to process the extracted features of support images and finish the model initialization.
- Parameters
model (nn.Module) – Model used for extracting support template features.
data_loader (DataLoader) – PyTorch data loader.
- Returns
Extracted support template features.
- Return type
list[Tensor]
- mmfewshot.detection.apis.single_gpu_test(model: torch.nn.modules.module.Module, data_loader: torch.utils.data.dataloader.DataLoader, show: bool = False, out_dir: Optional[str] = None, show_score_thr: float = 0.3) → List[source]¶
Test model with single gpu for meta-learning based detector.
The model forward function requires mode, while in mmdet it requires return_loss, and encode_mask_results is removed.
- Parameters
model (nn.Module) – Model to be tested.
data_loader (DataLoader) – Pytorch data loader.
show (bool) – Whether to show the image. Default: False.
out_dir (str | None) – The directory to write the image. Default: None.
show_score_thr (float) – Minimum score of bboxes to be shown. Default: 0.3.
- Returns
The prediction results.
- Return type
list
detection.core¶
evaluation¶
- class mmfewshot.detection.core.evaluation.QuerySupportDistEvalHook(model_init_dataloader: torch.utils.data.dataloader.DataLoader, val_dataloader: torch.utils.data.dataloader.DataLoader, **eval_kwargs)[source]¶
Distributed evaluation hook for query support data pipeline.
This hook will first traverse model_init_dataloader to extract support features for model initialization and then evaluate the data from val_dataloader.
Note that model_init_dataloader should NOT use a distributed sampler, so that the models on all GPUs receive the same data and end up identically initialized.
- Parameters
model_init_dataloader (DataLoader) – A PyTorch dataloader of model_init dataset.
val_dataloader (DataLoader) – A PyTorch dataloader of dataset to be evaluated.
**eval_kwargs – Evaluation arguments fed into the evaluate function of the dataset.
- class mmfewshot.detection.core.evaluation.QuerySupportEvalHook(model_init_dataloader: torch.utils.data.dataloader.DataLoader, val_dataloader: torch.utils.data.dataloader.DataLoader, **eval_kwargs)[source]¶
Evaluation hook for query support data pipeline.
This hook will first traverse model_init_dataloader to extract support features for model initialization and then evaluate the data from val_dataloader.
- Parameters
model_init_dataloader (DataLoader) – A PyTorch dataloader of model_init dataset.
val_dataloader (DataLoader) – A PyTorch dataloader of dataset to be evaluated.
**eval_kwargs – Evaluation arguments fed into the evaluate function of the dataset.
- mmfewshot.detection.core.evaluation.eval_map(det_results: List[List[numpy.ndarray]], annotations: List[Dict], classes: List[str], scale_ranges: Optional[List[Tuple]] = None, iou_thr: float = 0.5, dataset: Optional[Union[str, List[str]]] = None, logger: Optional[object] = None, tpfp_fn: Optional[callable] = None, nproc: int = 4, use_legacy_coordinate: bool = False) → Tuple[List, List[Dict]][source]¶
Evaluate mAP of a dataset.
eval_map() in mmdet predefines the names of classes and thus does not support reporting mAP results for arbitrary class splits.
- Parameters
det_results (list[list[np.ndarray]] | list[tuple[np.ndarray]]) – The outer list indicates images, and the inner list indicates per-class detected bboxes.
annotations (list[dict]) –
Ground truth annotations where each item of the list indicates an image. Keys of annotations are:
bboxes: numpy array of shape (n, 4)
labels: numpy array of shape (n, )
bboxes_ignore (optional): numpy array of shape (k, 4)
labels_ignore (optional): numpy array of shape (k, )
classes (list[str]) – Names of class.
scale_ranges (list[tuple] | None) – Range of scales to be evaluated, in the format [(min1, max1), (min2, max2), …]. A range of (32, 64) means the area range between (32**2, 64**2). Default: None.
iou_thr (float) – IoU threshold to be considered as matched. Default: 0.5.
dataset (list[str] | str | None) – Dataset name or dataset classes, there are minor differences in metrics for different datasets, e.g. “voc07”, “imagenet_det”, etc. Default: None.
logger (logging.Logger | None) – The way to print the mAP summary. See mmcv.utils.print_log() for details. Default: None.
tpfp_fn (callable | None) – The function used to determine true/false positives. If None, tpfp_default() is used as default unless dataset is ‘det’ or ‘vid’ (tpfp_imagenet() in this case). If it is given as a function, then this function is used to evaluate tp & fp. Default: None.
nproc (int) – Processes used for computing TP and FP. Default: 4.
use_legacy_coordinate (bool) – Whether to use the coordinate system of mmdet v1.x, in which width and height are calculated as ‘x2 - x1 + 1’ and ‘y2 - y1 + 1’ respectively. Default: False.
- Returns
(list, [dict, dict, …])
- Return type
tuple
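The scale_ranges and use_legacy_coordinate parameters both come down to how a box area is computed and compared. The following is a minimal sketch, not the library's code: the helper names bbox_area and in_scale_range are hypothetical, and treating the range as closed on the lower bound and open on the upper bound is an assumption.

```python
from typing import Optional, Sequence, Tuple


def bbox_area(bbox: Sequence[float], use_legacy_coordinate: bool = False) -> float:
    """Area of an (x1, y1, x2, y2) box; legacy mode adds 1 to width and height."""
    extra = 1.0 if use_legacy_coordinate else 0.0
    width = bbox[2] - bbox[0] + extra
    height = bbox[3] - bbox[1] + extra
    return width * height


def in_scale_range(
    bbox: Sequence[float],
    scale_range: Optional[Tuple[float, float]],
    use_legacy_coordinate: bool = False,
) -> bool:
    """Check whether the box area lies in [min**2, max**2) (bounds assumed)."""
    if scale_range is None:
        return True
    min_side, max_side = scale_range
    area = bbox_area(bbox, use_legacy_coordinate)
    return min_side ** 2 <= area < max_side ** 2


# A 31 x 31 box (new convention) is a 32 x 32 box under the legacy convention.
box = [0.0, 0.0, 31.0, 31.0]
new_style = in_scale_range(box, (32, 64))
legacy_style = in_scale_range(box, (32, 64), use_legacy_coordinate=True)
```

The example box falls just outside the (32, 64) range under the current convention and just inside it under the legacy one, which is why the flag matters when comparing against results produced with mmdet v1.x.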
utils¶
- class mmfewshot.detection.core.utils.ContrastiveLossDecayHook(decay_steps: Sequence[int], decay_rate: float = 0.5)[source]¶
Hook for contrast loss weight decay used in FSCE.
- Parameters
decay_steps (list[int] | tuple[int]) – Each item in the list is the step to decay the loss weight.
decay_rate (float) – Decay rate. Default: 0.5.
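A step-wise decay schedule of this kind can be sketched as follows. This is an illustration under stated assumptions: the helper decayed_loss_weight is hypothetical, each step in decay_steps is assumed to multiply the weight by decay_rate once, and the exact boundary (whether the decay applies at the step itself or on the following iteration) is not specified here.

```python
from typing import Sequence


def decayed_loss_weight(
    base_weight: float,
    iteration: int,
    decay_steps: Sequence[int],
    decay_rate: float = 0.5,
) -> float:
    """Hypothetical helper: apply one decay per step already reached."""
    num_decays = sum(1 for step in decay_steps if iteration >= step)
    return base_weight * decay_rate ** num_decays


# With steps at 100 and 200, the weight halves twice over training.
schedule = [decayed_loss_weight(1.0, it, [100, 200]) for it in (0, 150, 250)]
```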
detection.datasets¶
- class mmfewshot.detection.datasets.BaseFewShotDataset(ann_cfg: List[Dict], classes: Optional[Union[str, Sequence[str]]], pipeline: Optional[List[Dict]] = None, multi_pipelines: Optional[Dict[str, List[Dict]]] = None, data_root: Optional[str] = None, img_prefix: str = '', seg_prefix: Optional[str] = None, proposal_file: Optional[str] = None, test_mode: bool = False, filter_empty_gt: bool = True, min_bbox_size: Optional[Union[float, int]] = None, ann_shot_filter: Optional[Dict] = None, instance_wise: bool = False, dataset_name: Optional[str] = None)[source]¶
Base dataset for few shot detection.
The main differences from a normal detection dataset fall into two aspects:
- It allows specifying a single pipeline (used in normal datasets) or multiple pipelines (used in query-support datasets) for data processing.
- It supports controlling the maximum number of instances of each class when loading the annotation file.
The annotation format is shown as follows. The ann field is optional for testing.
    [
        {
            'id': '0000001',
            'filename': 'a.jpg',
            'width': 1280,
            'height': 720,
            'ann': {
                'bboxes': <np.ndarray> (n, 4) in (x1, y1, x2, y2) order.
                'labels': <np.ndarray> (n, ),
                'bboxes_ignore': <np.ndarray> (k, 4), (optional field)
                'labels_ignore': <np.ndarray> (k, ) (optional field)
            }
        },
        ...
    ]
- Parameters
ann_cfg (list[dict]) –
The annotation config supports two types of config:
- Loading annotations from the common ann_file of a dataset, with or without specific classes. Example: dict(type=’ann_file’, ann_file=’path/to/ann_file’, ann_classes=[‘dog’, ‘cat’])
- Loading annotations from a json file saved by a dataset. Example: dict(type=’saved_dataset’, ann_file=’path/to/ann_file’)
classes (str | Sequence[str] | None) – Classes for model training and provide fixed label for each class.
pipeline (list[dict] | None) – Config to specify processing pipeline. Used in normal dataset. Default: None.
multi_pipelines (dict[list[dict]]) –
Config to specify data pipelines for corresponding data flow. For example, query and support data can be processed with two different pipelines, the dict should contain two keys like:
query (list[dict]): Config for query-data process pipeline.
support (list[dict]): Config for support-data process pipeline.
data_root (str | None) – Data root for ann_cfg, img_prefix, seg_prefix, proposal_file if specified. Default: None.
test_mode (bool) – If set True, annotation will not be loaded. Default: False.
filter_empty_gt (bool) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests. Default: True.
min_bbox_size (int | float | None) – The minimum size of bounding boxes in the images. If the size of a bounding box is less than min_bbox_size, it will be added to the ignored field. Default: None.
ann_shot_filter (dict | None) – Used to specify the class and the corresponding maximum number of instances when loading the annotation file. For example: {‘dog’: 10, ‘person’: 5}. If set to None, all annotations from the ann file will be loaded. Default: None.
instance_wise (bool) – If set to True, self.data_infos becomes instance-wise: if the annotation of a single image has more than one instance, it is split into num_instances items. Often used in support datasets. Default: False.
dataset_name (str | None) – Name of dataset to display. For example: ‘train_dataset’ or ‘query_dataset’. Default: None.
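The effect of ann_shot_filter can be illustrated with a small stand-alone sketch. It is not the dataset's implementation: the helper apply_shot_filter is hypothetical, it keeps the first instances it encounters (the dataset may select them differently, for example at random), and it assumes that classes missing from the filter are left untouched.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Instance = Tuple[str, str]  # (image filename, class name)


def apply_shot_filter(
    instances: List[Instance], ann_shot_filter: Dict[str, int]
) -> List[Instance]:
    """Hypothetical helper: cap the number of kept instances per listed class."""
    kept: List[Instance] = []
    seen: Dict[str, int] = defaultdict(int)
    for image_name, class_name in instances:
        limit = ann_shot_filter.get(class_name)
        if limit is not None:
            if seen[class_name] >= limit:
                continue  # this class already reached its maximum number of shots
            seen[class_name] += 1
        kept.append((image_name, class_name))
    return kept


instances = [
    ("a.jpg", "dog"),
    ("a.jpg", "dog"),
    ("b.jpg", "dog"),
    ("b.jpg", "person"),
    ("c.jpg", "cat"),
]
filtered = apply_shot_filter(instances, {"dog": 2, "person": 1})
```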
- ann_cfg_parser(ann_cfg: List[Dict]) → List[Dict][source]¶
Parse annotation config to annotation information.
- Parameters
ann_cfg (list[dict]) –
The annotation config supports two types of config:
- ’ann_file’: loading annotations from the common ann_file of a dataset. Example: dict(type=’ann_file’, ann_file=’path/to/ann_file’, ann_classes=[‘dog’, ‘cat’])
- ’saved_dataset’: loading annotations from a saved dataset. Example: dict(type=’saved_dataset’, ann_file=’path/to/ann_file’)
- Returns
Annotation information.
- Return type
list[dict]
- get_ann_info(idx: int) → Dict[source]¶
Get annotation by index.
When overriding this function, please make sure the same annotations are used throughout training.
- Parameters
idx (int) – Index of data.
- Returns
Annotation info of specified index.
- Return type
dict
- prepare_train_img(idx: int, pipeline_key: Optional[str] = None, gt_idx: Optional[List[int]] = None) → Dict[source]¶
Get training data and annotations after pipeline.
- Parameters
idx (int) – Index of data.
pipeline_key (str) – Name of pipeline
gt_idx (list[int]) – Index of used annotation.
- Returns
Training data and annotation after pipeline with new keys introduced by pipeline.
- Return type
dict
- class mmfewshot.detection.datasets.CropResizeInstance(num_context_pixels: int = 16, target_size: Tuple[int] = (320, 320))[source]¶
Crop and resize an instance from the image according to its bbox.
- Parameters
num_context_pixels (int) – Padding pixel around instance. Default: 16.
target_size (tuple[int, int]) – Resize cropped instance to target size. Default: (320, 320).
- class mmfewshot.detection.datasets.FewShotCocoDataset(classes: Optional[Union[Sequence[str], str]] = None, num_novel_shots: Optional[int] = None, num_base_shots: Optional[int] = None, ann_shot_filter: Optional[Dict[str, int]] = None, min_bbox_area: Optional[Union[float, int]] = None, dataset_name: Optional[str] = None, test_mode: bool = False, **kwargs)[source]¶
COCO dataset for few shot detection.
- Parameters
classes (str | Sequence[str] | None) – Classes for model training, which also provide a fixed label for each class. When classes is a string, it will load pre-defined classes in FewShotCocoDataset. For example: ‘BASE_CLASSES’, ‘NOVEL_CLASSES’ or ‘ALL_CLASSES’.
num_novel_shots (int | None) – Max number of instances used for each novel class. If None, all annotations will be used. Default: None.
num_base_shots (int | None) – Max number of instances used for each base class. If None, all annotations will be used. Default: None.
ann_shot_filter (dict | None) – Used to specify the class and the corresponding maximum number of instances when loading the annotation file. For example: {‘dog’: 10, ‘person’: 5}. If set it as None, ann_shot_filter will be created according to num_novel_shots and num_base_shots.
min_bbox_area (int | float | None) – Filter images with bbox whose area is smaller than min_bbox_area. If set to None, skip this filter. Default: None.
dataset_name (str | None) – Name of dataset to display. For example: ‘train dataset’ or ‘query dataset’. Default: None.
test_mode (bool) – If set True, annotation will not be loaded. Default: False.
- evaluate(results: List[Sequence], metric: Union[str, List[str]] = 'bbox', logger: Optional[object] = None, jsonfile_prefix: Optional[str] = None, classwise: bool = False, proposal_nums: Sequence[int] = (100, 300, 1000), iou_thrs: Optional[Union[float, Sequence[float]]] = None, metric_items: Optional[Union[str, List[str]]] = None, class_splits: Optional[List[str]] = None) → Dict[source]¶
Evaluation in COCO protocol and summary results of different splits of classes.
- Parameters
results (list[list | tuple]) – Testing results of the dataset.
metric (str | list[str]) – Metrics to be evaluated. Options are ‘bbox’, ‘proposal’, ‘proposal_fast’. Default: ‘bbox’
logger (logging.Logger | None) – Logger used for printing related information during evaluation. Default: None.
jsonfile_prefix (str | None) – The prefix of json files. It includes the file path and the prefix of filename, e.g., “a/b/prefix”. If not specified, a temp file will be created. Default: None.
classwise (bool) – Whether to evaluate the AP for each class. Default: False.
proposal_nums (Sequence[int]) – Proposal number used for evaluating recalls, such as recall@100, recall@1000. Default: (100, 300, 1000).
iou_thrs (Sequence[float] | float | None) – IoU threshold used for evaluating recalls/mAPs. If set to a list, the average of all IoUs will also be computed. If not specified, [0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95] will be used. Default: None.
metric_items (list[str] | str | None) – Metric items that will be returned. If not specified, ['AR@100', 'AR@300', 'AR@1000', 'AR_s@1000', 'AR_m@1000', 'AR_l@1000'] will be used when metric=='proposal', and ['mAP', 'mAP_50', 'mAP_75', 'mAP_s', 'mAP_m', 'mAP_l'] will be used when metric=='bbox'.
class_splits (list[str] | None) – Calculate metric of classes split in COCO_SPLIT. For example: [‘BASE_CLASSES’, ‘NOVEL_CLASSES’]. Default: None.
- Returns
COCO style evaluation metric.
- Return type
dict[str, float]
- get_cat_ids(idx: int) → List[int][source]¶
Get category ids by index.
Overwrite the function in CocoDataset.
- Parameters
idx (int) – Index of data.
- Returns
All categories in the image of specified index.
- Return type
list[int]
- get_classes(classes: Union[str, Sequence[str]]) → List[str][source]¶
Get class names.
It supports loading pre-defined class splits. The pre-defined class splits are: [‘ALL_CLASSES’, ‘NOVEL_CLASSES’, ‘BASE_CLASSES’]
- Parameters
classes (str | Sequence[str]) – Classes for model training and provide fixed label for each class. When classes is string, it will load pre-defined classes in FewShotCocoDataset. For example: ‘NOVEL_CLASSES’.
- Returns
list of class names.
- Return type
list[str]
- load_annotations(ann_cfg: List[Dict]) → List[Dict][source]¶
Supports loading annotations from two types of ann_cfg:
- ‘ann_file’: COCO-style annotation file.
- ‘saved_dataset’: Saved COCO dataset json.
- Parameters
ann_cfg (list[dict]) – Config of annotations.
- Returns
Annotation infos.
- Return type
list[dict]
- class mmfewshot.detection.datasets.FewShotVOCDataset(classes: Optional[Union[Sequence[str], str]] = None, num_novel_shots: Optional[int] = None, num_base_shots: Optional[int] = None, ann_shot_filter: Optional[Dict] = None, use_difficult: bool = False, min_bbox_area: Optional[Union[float, int]] = None, dataset_name: Optional[str] = None, test_mode: bool = False, coordinate_offset: List[int] = [- 1, - 1, 0, 0], **kwargs)[source]¶
VOC dataset for few shot detection.
- Parameters
classes (str | Sequence[str]) – Classes for model training and provide fixed label for each class. When classes is string, it will load pre-defined classes in FewShotVOCDataset. For example: ‘NOVEL_CLASSES_SPLIT1’.
num_novel_shots (int | None) – Max number of instances used for each novel class. If is None, all annotation will be used. Default: None.
num_base_shots (int | None) – Max number of instances used for each base class. When it is None, all annotations will be used. Default: None.
ann_shot_filter (dict | None) – Used to specify the class and the corresponding maximum number of instances when loading the annotation file. For example: {‘dog’: 10, ‘person’: 5}. If set it as None, ann_shot_filter will be created according to num_novel_shots and num_base_shots. Default: None.
use_difficult (bool) – Whether use the difficult annotation or not. Default: False.
min_bbox_area (int | float | None) – Filter images with bbox whose area is smaller than min_bbox_area. If set to None, skip this filter. Default: None.
dataset_name (str | None) – Name of dataset to display. For example: ‘train dataset’ or ‘query dataset’. Default: None.
test_mode (bool) – If set True, annotation will not be loaded. Default: False.
coordinate_offset (list[int]) – Coordinate offsets, corresponding to [x_min, y_min, x_max, y_max], that are added to the bbox annotations during training. For testing, the gt annotations are not changed, while the offsets are subtracted from the predicted results to invert the data loading logic used in training. Default: [-1, -1, 0, 0].
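The coordinate_offset round trip described above can be sketched in a few lines. The helper names offset_for_training and restore_prediction are hypothetical and only mirror the description: offsets are added to annotations when training and subtracted from predictions when testing.

```python
from typing import List, Sequence

# Default value quoted in the parameter list above.
COORDINATE_OFFSET = [-1, -1, 0, 0]


def offset_for_training(
    bbox: Sequence[float], offset: Sequence[int] = COORDINATE_OFFSET
) -> List[float]:
    """Add the [x_min, y_min, x_max, y_max] offsets to a ground-truth box."""
    return [coord + delta for coord, delta in zip(bbox, offset)]


def restore_prediction(
    bbox: Sequence[float], offset: Sequence[int] = COORDINATE_OFFSET
) -> List[float]:
    """Subtract the offsets from a predicted box to undo the training shift."""
    return [coord - delta for coord, delta in zip(bbox, offset)]


annotated = [10, 20, 50, 60]
shifted = offset_for_training(annotated)   # box the model is trained on
restored = restore_prediction(shifted)     # prediction mapped back for evaluation
```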
- evaluate(results: List[Sequence], metric: Union[str, List[str]] = 'mAP', logger: Optional[object] = None, proposal_nums: Sequence[int] = (100, 300, 1000), iou_thr: Optional[Union[float, Sequence[float]]] = 0.5, class_splits: Optional[List[str]] = None) → Dict[source]¶
Evaluation in VOC protocol and summary results of different splits of classes.
- Parameters
results (list[list | tuple]) – Predictions of the model.
metric (str | list[str]) – Metrics to be evaluated. Options are ‘mAP’, ‘recall’. Default: mAP.
logger (logging.Logger | None) – Logger used for printing related information during evaluation. Default: None.
proposal_nums (Sequence[int]) – Proposal number used for evaluating recalls, such as recall@100, recall@1000. Default: (100, 300, 1000).
iou_thr (float | list[float]) – IoU threshold. Default: 0.5.
class_splits (list[str] | None) – Calculate metric of classes split defined in VOC_SPLIT. For example: [‘BASE_CLASSES_SPLIT1’, ‘NOVEL_CLASSES_SPLIT1’]. Default: None.
- Returns
AP/recall metrics.
- Return type
dict[str, float]
- get_classes(classes: Union[str, Sequence[str]]) → List[str][source]¶
Get class names.
It supports loading pre-defined class splits. The pre-defined class splits are: [‘ALL_CLASSES_SPLIT1’, ‘ALL_CLASSES_SPLIT2’, ‘ALL_CLASSES_SPLIT3’, ‘BASE_CLASSES_SPLIT1’, ‘BASE_CLASSES_SPLIT2’, ‘BASE_CLASSES_SPLIT3’, ‘NOVEL_CLASSES_SPLIT1’, ‘NOVEL_CLASSES_SPLIT2’, ‘NOVEL_CLASSES_SPLIT3’]
- Parameters
classes (str | Sequence[str]) – Classes for model training and provide fixed label for each class. When classes is string, it will load pre-defined classes in FewShotVOCDataset. For example: ‘NOVEL_CLASSES_SPLIT1’.
- Returns
List of class names.
- Return type
list[str]
- load_annotations(ann_cfg: List[Dict]) → List[Dict][source]¶
Supports loading annotations from two types of ann_cfg.
- Parameters
ann_cfg (list[dict]) –
Supports two types of config:
- Loading annotations from the common ann_file of a dataset, with or without specific classes. Example: dict(type=’ann_file’, ann_file=’path/to/ann_file’, ann_classes=[‘dog’, ‘cat’])
- Loading annotations from a json file saved by a dataset. Example: dict(type=’saved_dataset’, ann_file=’path/to/ann_file’)
- Returns
Annotation information.
- Return type
list[dict]
- load_annotations_xml(ann_file: str, classes: Optional[List[str]] = None) → List[Dict][source]¶
Load annotation from XML style ann_file.
It supports using image id or image path as image names to load the annotation file.
- Parameters
ann_file (str) – Path of annotation file.
classes (list[str] | None) – Specific classes to load from the XML file. If set to None, it will use the classes of the whole dataset. Default: None.
- Returns
Annotation info from XML file.
- Return type
list[dict]
- class mmfewshot.detection.datasets.GenerateMask(target_size: Tuple[int] = (224, 224))[source]¶
Resize support image and generate a mask.
- Parameters
target_size (tuple[int, int]) – Crop and resize to target size. Default: (224, 224).
- class mmfewshot.detection.datasets.NWayKShotDataloader(query_data_loader: torch.utils.data.dataloader.DataLoader, support_data_loader: torch.utils.data.dataloader.DataLoader)[source]¶
A dataloader wrapper.
It creates an iterator that generates query and support batches simultaneously. Each batch contains query data and support data, whose lengths are batch_size and (num_support_ways * num_support_shots) respectively.
- Parameters
query_data_loader (DataLoader) – DataLoader of query dataset
support_data_loader (DataLoader) – DataLoader of support datasets.
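The idea of iterating two loaders in lockstep can be sketched with plain Python iterables. This is an illustration only: the class PairedLoader and the dictionary keys query_data and support_data are assumptions made for the example, and plain lists stand in for PyTorch DataLoader objects.

```python
from typing import Any, Dict, Iterator, Sequence


class PairedLoader:
    """Hypothetical stand-in that yields query and support batches together."""

    def __init__(self, query_loader: Sequence, support_loader: Sequence) -> None:
        self.query_loader = query_loader
        self.support_loader = support_loader

    def __len__(self) -> int:
        # Assumption: iteration stops when the shorter loader is exhausted.
        return min(len(self.query_loader), len(self.support_loader))

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        for query_batch, support_batch in zip(self.query_loader, self.support_loader):
            yield {"query_data": query_batch, "support_data": support_batch}


# Two query batches of size 2, and support batches of 2 ways * 2 shots.
query_loader = [["q0", "q1"], ["q2", "q3"]]
support_loader = [["s0", "s1", "s2", "s3"], ["s4", "s5", "s6", "s7"]]
batches = list(PairedLoader(query_loader, support_loader))
```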
- class mmfewshot.detection.datasets.NWayKShotDataset(query_dataset: mmfewshot.detection.datasets.base.BaseFewShotDataset, support_dataset: Optional[mmfewshot.detection.datasets.base.BaseFewShotDataset], num_support_ways: int, num_support_shots: int, one_support_shot_per_image: bool = False, num_used_support_shots: int = 200, repeat_times: int = 1)[source]¶
A dataset wrapper of NWayKShotDataset.
Building NWayKShotDataset requires a query dataset and a support dataset; the behavior of NWayKShotDataset is determined by its mode. In ‘query’ mode, the dataset returns regular images and annotations. In ‘support’ mode, the dataset first builds batch indices, each of which contains (num_support_ways * num_support_shots) samples. In other words, in support mode every call of __getitem__ returns a batch of samples, so the outside dataloader should set batch_size to 1. The default mode of NWayKShotDataset is ‘query’; calling convert_query_to_support converts the mode into ‘support’.
- Parameters
query_dataset (BaseFewShotDataset) – Query dataset to be wrapped.
support_dataset (BaseFewShotDataset | None) – Support dataset to be wrapped. If support dataset is None, support dataset will copy from query dataset.
num_support_ways (int) – Number of classes for support in mini-batch.
num_support_shots (int) – Number of support shot for each class in mini-batch.
one_support_shot_per_image (bool) – If True only one annotation will be sampled from each image. Default: False.
num_used_support_shots (int | None) – The total number of support shots sampled and used for each class during training. If set to None, all shots in dataset will be used as support shot. Default: 200.
shuffle_support (bool) – Whether to allow generating new batch indices for each epoch. Default: False.
repeat_times (int) – The length of repeated dataset will be times larger than the original dataset. Default: 1.
- convert_query_to_support(support_dataset_len: int) → None[source]¶
Convert query dataset to support dataset.
- Parameters
support_dataset_len (int) – Length of the pre-sampled batch indices.
- generate_support_batch_indices(dataset_len: int) → List[List[Tuple[int]]][source]¶
Generate batch indices from support dataset.
The batch indices have the shape [length of datasets * [support way * support shots]], and dataset_len will be the length of the support dataset.
- Parameters
dataset_len (int) – Length of batch indices.
- Returns
Pre-sample batch indices.
- Return type
list[list[(data_idx, gt_idx)]]
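The nested structure of the returned indices can be made concrete with a small sketch. It is a simplified illustration rather than the dataset's code: the free function below, its instances_by_class argument (a mapping from class id to (data_idx, gt_idx) pairs) and the seeded sampling strategy are all assumptions.

```python
import random
from typing import Dict, List, Tuple

Index = Tuple[int, int]  # (data_idx, gt_idx)


def generate_support_batch_indices(
    instances_by_class: Dict[int, List[Index]],
    dataset_len: int,
    num_support_ways: int,
    num_support_shots: int,
    seed: int = 0,
) -> List[List[Index]]:
    """Pre-sample dataset_len batches of (ways * shots) support indices."""
    rng = random.Random(seed)
    batches: List[List[Index]] = []
    for _ in range(dataset_len):
        # Pick the classes of this batch, then the shots of each class.
        classes = rng.sample(sorted(instances_by_class), num_support_ways)
        batch: List[Index] = []
        for class_id in classes:
            batch.extend(rng.sample(instances_by_class[class_id], num_support_shots))
        batches.append(batch)
    return batches


instances_by_class = {
    0: [(0, 0), (1, 0), (2, 1)],
    1: [(3, 0), (4, 0)],
    2: [(5, 0), (6, 0), (6, 1)],
}
support_batches = generate_support_batch_indices(
    instances_by_class, dataset_len=4, num_support_ways=2, num_support_shots=2
)
```

Because each pre-sampled entry already holds a full support batch, a dataloader wrapping such a dataset would use batch_size 1, as noted in the class description.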
- class mmfewshot.detection.datasets.NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶
Save numpy array obj to json.
- default(obj: object) → object[source]¶
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
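The same pattern applies to array-like objects. The sketch below is not the NumpyEncoder implementation: ArrayEncoder is a hypothetical class that converts any object exposing a tolist() method, and array.array from the standard library stands in for a numpy array so the example has no third-party dependency.

```python
import json
from array import array


class ArrayEncoder(json.JSONEncoder):
    """Hypothetical encoder: serialize objects that expose ``tolist()``."""

    def default(self, obj):
        if hasattr(obj, "tolist"):
            return obj.tolist()
        # Let the base class raise the TypeError for unsupported objects.
        return super().default(obj)


payload = {"bboxes": array("d", [1.0, 2.0, 3.0, 4.0])}
encoded = json.dumps(payload, cls=ArrayEncoder)
decoded = json.loads(encoded)
```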
- class mmfewshot.detection.datasets.QueryAwareDataset(query_dataset: mmfewshot.detection.datasets.base.BaseFewShotDataset, support_dataset: Optional[mmfewshot.detection.datasets.base.BaseFewShotDataset], num_support_ways: int, num_support_shots: int, repeat_times: int = 1)[source]¶
A wrapper of QueryAwareDataset.
Building QueryAwareDataset requires a query dataset and a support dataset. Every call of __getitem__ first samples a query image and its annotations. It then uses the query annotations to sample a batch of positive and negative support images and annotations. The positive images share the same classes with the query, while the annotations of negative images do not contain any category from the query.
- Parameters
query_dataset (BaseFewShotDataset) – Query dataset to be wrapped.
support_dataset (BaseFewShotDataset | None) – Support dataset to be wrapped. If support dataset is None, support dataset will copy from query dataset.
num_support_ways (int) – Number of classes for support in mini-batch, the first one always be the positive class.
num_support_shots (int) – Number of support shots for each class in mini-batch, the first K shots always from positive class.
repeat_times (int) – The length of repeated dataset will be times larger than the original dataset. Default: 1.
- generate_support(idx: int, query_class: int, support_classes: List[int]) → List[Tuple[int]][source]¶
Generate support indices of query images.
- Parameters
idx (int) – Index of query data.
query_class (int) – Query class.
support_classes (list[int]) – Classes of support data.
- Returns
A mini-batch (num_support_ways * num_support_shots) of support data (idx, gt_idx).
- Return type
list[tuple[int]]
- sample_support_shots(idx: int, class_id: int, allow_same_image: bool = False) → List[Tuple[int]][source]¶
Generate support indices according to the class id.
- Parameters
idx (int) – Index of query data.
class_id (int) – Support class.
allow_same_image (bool) – Allow instance sampled from same image as query image. Default: False.
- Returns
Support data (num_support_shots) of specific class.
- Return type
list[tuple[int]]
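The positive/negative support sampling described for QueryAwareDataset can be sketched in plain Python. This is only an illustration under stated assumptions, not the library implementation: the function sample_support, the instances mapping (class id to a list of (image_idx, gt_idx) pairs), and the rule of always excluding the query image are all hypothetical simplifications.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical index: class id -> list of (image_idx, gt_idx) instances.
Instances = Dict[int, List[Tuple[int, int]]]


def sample_support(instances: Instances, query_idx: int,
                   query_classes: List[int], num_ways: int,
                   num_shots: int, rng: random.Random) -> List[Tuple[int, int]]:
    """Return num_ways * num_shots (image_idx, gt_idx) pairs, positive class first."""
    positive = rng.choice(query_classes)
    # Negative classes must not appear in the query annotations.
    negatives = [c for c in instances if c not in query_classes]
    support_classes = [positive] + rng.sample(negatives, num_ways - 1)
    batch: List[Tuple[int, int]] = []
    for cls in support_classes:
        # Assumption: never reuse instances from the query image itself.
        pool = [inst for inst in instances[cls] if inst[0] != query_idx]
        batch.extend(rng.sample(pool, num_shots))
    return batch


instances = {
    0: [(0, 0), (1, 0), (2, 1)],
    1: [(3, 0), (4, 0)],
    2: [(5, 0), (6, 1), (7, 0)],
}
batch = sample_support(instances, query_idx=0, query_classes=[0],
                       num_ways=2, num_shots=2, rng=random.Random(0))
```

The first num_shots entries of the batch come from the positive class, matching the ordering stated in the parameter descriptions.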
- mmfewshot.detection.datasets.build_dataloader(dataset: torch.utils.data.dataset.Dataset, samples_per_gpu: int, workers_per_gpu: int, num_gpus: int = 1, dist: bool = True, shuffle: bool = True, seed: Optional[int] = None, data_cfg: Optional[Dict] = None, use_infinite_sampler: bool = False, **kwargs) → torch.utils.data.dataloader.DataLoader[source]¶
Build PyTorch DataLoader.
In distributed training, each GPU/process has a dataloader. In non-distributed training, there is only one dataloader for all GPUs.
- Parameters
dataset (Dataset) – A PyTorch dataset.
samples_per_gpu (int) – Number of training samples on each GPU, i.e., batch size of each GPU.
workers_per_gpu (int) – How many subprocesses to use for data loading for each GPU.
num_gpus (int) – Number of GPUs. Only used in non-distributed training. Default: 1.
dist (bool) – Distributed training/test or not. Default: True.
shuffle (bool) – Whether to shuffle the data at every epoch. Default: True.
seed (int) – Random seed. Default: None.
data_cfg (dict | None) – Dict of data configure. Default: None.
use_infinite_sampler (bool) – Whether to use an infinite sampler. Note that an infinite sampler keeps the dataloader iterator running forever, which avoids the overhead of worker initialization between epochs. Default: False.
kwargs – Any keyword arguments used to initialize the DataLoader.
- Returns
A PyTorch dataloader.
- Return type
DataLoader
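The idea behind use_infinite_sampler can be illustrated with a small generator that never stops yielding indices. This is a conceptual sketch only: the function infinite_indices is hypothetical and is not the sampler that build_dataloader uses internally.

```python
import itertools
import random
from typing import Iterator


def infinite_indices(dataset_len: int, shuffle: bool = True,
                     seed: int = 0) -> Iterator[int]:
    """Yield dataset indices forever, reshuffling after each full pass."""
    rng = random.Random(seed)
    while True:
        order = list(range(dataset_len))
        if shuffle:
            rng.shuffle(order)
        yield from order


# Take 7 indices from a dataset of length 3: the stream spans several passes.
first = list(itertools.islice(infinite_indices(3, seed=0), 7))
```

Because the iterator is never exhausted, a loader built on such a stream does not need to restart its workers at epoch boundaries, which is the benefit the parameter description mentions.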
detection.models¶
- mmfewshot.detection.models.build_detector(cfg: mmcv.utils.config.ConfigDict, logger: Optional[object] = None)[source]¶
Build detector.
Build shared head.
backbones¶
- class mmfewshot.detection.models.backbones.ResNetWithMetaConv(**kwargs)[source]¶
ResNet with meta_conv to handle different inputs in Meta R-CNN and FSDetView.
When the input has shape (N, 3, H, W) (images only), the network uses conv1 as in a regular ResNet. When the input has shape (N, 4, H, W) (image + mask), the network replaces conv1 with meta_conv to handle the additional channel.
- forward(x: torch.Tensor, use_meta_conv: bool = False) → Tuple[torch.Tensor][source]¶
Forward function.
When the input has shape (N, 3, H, W) (images only), the network uses conv1 as in a regular ResNet. When the input has shape (N, 4, H, W) (image + mask), the network replaces conv1 with meta_conv to handle the additional channel.
- Parameters
x (Tensor) – Tensor with shape (N, 3, H, W) from images or (N, 4, H, W) from (images + masks).
use_meta_conv (bool) – If True, forward the input tensor with meta_conv, which requires a tensor with shape (N, 4, H, W). Otherwise, forward the input tensor with conv1, which requires a tensor with shape (N, 3, H, W). Default: False.
- Returns
Tuple of features, each item with shape (N, C, H, W).
- Return type
tuple[Tensor]
dense_heads¶
- class mmfewshot.detection.models.dense_heads.AttentionRPNHead(num_support_ways: int, num_support_shots: int, aggregation_layer: Dict = {'aggregator_cfgs': [{'type': 'DepthWiseCorrelationAggregator', 'in_channels': 1024, 'with_fc': False}], 'type': 'AggregationLayer'}, roi_extractor: Dict = {'featmap_strides': [16], 'out_channels': 1024, 'roi_layer': {'output_size': 14, 'sampling_ratio': 0, 'type': 'RoIAlign'}, 'type': 'SingleRoIExtractor'}, **kwargs)[source]¶
RPN head for Attention RPN.
- Parameters
num_support_ways (int) – Number of sampled classes (pos + neg).
num_support_shots (int) – Number of shots for each class.
aggregation_layer (dict) – Config of aggregation_layer.
roi_extractor (dict) – Config of roi_extractor.
- extract_roi_feat(feats: List[torch.Tensor], rois: torch.Tensor) → torch.Tensor[source]¶
Forward function.
- Parameters
feats (list[Tensor]) – Input features with shape (N, C, H, W).
rois (Tensor) – RoIs with shape (m, 5).
- forward_train(query_feats: List[torch.Tensor], support_feats: List[torch.Tensor], query_gt_bboxes: List[torch.Tensor], query_img_metas: List[Dict], support_gt_bboxes: List[torch.Tensor], query_gt_bboxes_ignore: Optional[List[torch.Tensor]] = None, proposal_cfg: Optional[mmcv.utils.config.ConfigDict] = None, **kwargs) → Tuple[Dict, List[Tuple]][source]¶
Forward function in training phase.
- Parameters
query_feats (list[Tensor]) – List of query features, each item with shape (N, C, H, W).
support_feats (list[Tensor]) – List of support features, each item with shape (N, C, H, W).
query_gt_bboxes (list[Tensor]) – List of ground truth bboxes of query image, each item with shape (num_gts, 4).
query_img_metas (list[dict]) – List of query image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
support_gt_bboxes (list[Tensor]) – List of ground truth bboxes of support image, each item with shape (num_gts, 4).
query_gt_bboxes_ignore (list[Tensor]) – List of ground truth bboxes to be ignored of query image with shape (num_ignored_gts, 4). Default: None.
proposal_cfg (ConfigDict) – Test / postprocessing configuration. If None, test_cfg would be used. Default: None.
- Returns
loss components and proposals of each image.
losses: (dict[str, Tensor]): A dictionary of loss components.
proposal_list (list[Tensor]): Proposals of each image.
- Return type
tuple
- loss(cls_scores: List[torch.Tensor], bbox_preds: List[torch.Tensor], gt_bboxes: List[torch.Tensor], img_metas: List[Dict], gt_labels: Optional[List[torch.Tensor]] = None, gt_bboxes_ignore: Optional[List[torch.Tensor]] = None, pair_flags: Optional[List[bool]] = None) → Dict[source]¶
Compute losses of rpn head.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level with shape (N, num_anchors * num_classes, H, W)
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 4, H, W)
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
img_metas (list[dict]) – List of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
gt_labels (list[Tensor]) – Class indices corresponding to each box. Default: None.
gt_bboxes_ignore (None | list[Tensor]) – Specify which bounding boxes can be ignored when computing the loss. Default: None.
pair_flags (list[bool]) – Indicate predicted result is from positive pair or negative pair with shape (N). Default: None.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- simple_test(query_feats: List[torch.Tensor], support_feat: torch.Tensor, query_img_metas: List[Dict], rescale: bool = False) → List[torch.Tensor][source]¶
Test function without test time augmentation.
- Parameters
query_feats (list[Tensor]) – List of query features, each item with shape (N, C, H, W).
support_feat (Tensor) – Support features with shape (N, C, H, W).
query_img_metas (list[dict]) – List of query image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
rescale (bool) – Whether to rescale the results. Default: False.
- Returns
Proposals of each image, each item has shape (n, 5), where 5 represents (tl_x, tl_y, br_x, br_y, score).
- Return type
List[Tensor]
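The (n, 5) proposal layout above can be unpacked row by row. The sketch below is a hedged illustration that uses plain Python lists in place of tensors (for a real tensor, one would first convert it, for example with tolist()); the helper filter_proposals and the score threshold are hypothetical and not part of the library.

```python
from typing import Dict, List, Sequence


def filter_proposals(proposals: Sequence[Sequence[float]],
                     score_thr: float) -> List[Dict[str, object]]:
    """Keep proposals whose score is at least score_thr and derive box sizes."""
    kept = []
    for tl_x, tl_y, br_x, br_y, score in proposals:
        if score >= score_thr:
            kept.append({
                "box": (tl_x, tl_y, br_x, br_y),
                "width": br_x - tl_x,    # horizontal extent
                "height": br_y - tl_y,   # vertical extent
                "score": score,
            })
    return kept


# Two proposals for one image, each row is (tl_x, tl_y, br_x, br_y, score).
proposals = [[0.0, 0.0, 10.0, 20.0, 0.9], [5.0, 5.0, 6.0, 6.0, 0.2]]
kept = filter_proposals(proposals, score_thr=0.5)
```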
- class mmfewshot.detection.models.dense_heads.TwoBranchRPNHead(mid_channels: int = 64, **kwargs)[source]¶
RPN head for MPSR.
- Parameters
mid_channels (int) – Input channels of rpn_cls_conv. Default: 64.
- forward_auxiliary(feats: List[torch.Tensor]) → List[torch.Tensor][source]¶
Forward auxiliary features at multiple scales.
- Parameters
feats (list[Tensor]) – List of features at multiple scales, each is a 4D-tensor.
- Returns
Classification scores for all scale levels, each is a 4D-tensor, the number of channels is num_anchors * num_classes.
- Return type
list[Tensor]
- forward_auxiliary_single(feat: torch.Tensor) → Tuple[torch.Tensor][source]¶
Forward auxiliary feature map of a single scale level.
- forward_single(feat: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]¶
Forward feature map of a single scale level.
- forward_train(x: List[torch.Tensor], auxiliary_rpn_feats: List[torch.Tensor], img_metas: List[Dict], gt_bboxes: List[torch.Tensor], gt_labels: Optional[List[torch.Tensor]] = None, gt_bboxes_ignore: Optional[List[torch.Tensor]] = None, proposal_cfg: Optional[mmcv.utils.config.ConfigDict] = None, **kwargs) → Tuple[Dict, List[torch.Tensor]][source]¶
- Parameters
x (list[Tensor]) – Features from FPN, each item with shape (N, C, H, W).
auxiliary_rpn_feats (list[Tensor]) – Auxiliary features from FPN, each item with shape (N, C, H, W).
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
gt_bboxes (list[Tensor]) – Ground truth bboxes of the image, shape (num_gts, 4).
gt_labels (list[Tensor]) – Ground truth labels of each box, shape (num_gts,). Default: None.
gt_bboxes_ignore (list[Tensor]) – Ground truth bboxes to be ignored, shape (num_ignored_gts, 4). Default: None.
proposal_cfg (ConfigDict) – Test / postprocessing configuration, if None, test_cfg would be used. Default: None.
- Returns
losses: (dict[str, Tensor]): A dictionary of loss components.
proposal_list (List[Tensor]): Proposals of each image.
- Return type
tuple
- get_bboxes(cls_scores: List[torch.Tensor], bbox_preds: List[torch.Tensor], img_metas: List[Dict], cfg: Optional[mmcv.utils.config.ConfigDict] = None, rescale: bool = False, with_nms: bool = True) → List[torch.Tensor][source]¶
Transform network output for a batch into bbox predictions.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 4, H, W)
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
cfg (ConfigDict | None) – Test / postprocessing configuration. If None, test_cfg would be used.
rescale (bool) – If True, return boxes in original image space. Default: False.
with_nms (bool) – If True, apply NMS before returning boxes. Default: True.
- Returns
Proposals of each image, each item has shape (n, 5), where 5 represents (tl_x, tl_y, br_x, br_y, score).
- Return type
List[Tensor]
- loss(cls_scores: List[torch.Tensor], bbox_preds: List[torch.Tensor], gt_bboxes: List[torch.Tensor], gt_labels: List[torch.Tensor], img_metas: List[Dict], gt_bboxes_ignore: Optional[List[torch.Tensor]] = None, auxiliary_cls_scores: Optional[List[torch.Tensor]] = None) → Dict[source]¶
Compute losses of the head.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, each item with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 4, H, W)
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[Tensor]) – Class indices corresponding to each box.
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
gt_bboxes_ignore (list[Tensor] | None) – Specify which bounding boxes can be ignored when computing the loss. Default: None.
auxiliary_cls_scores (list[Tensor] | None) – Box scores for each scale level, each item with shape (N, num_anchors * num_classes, H, W). Default: None.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- loss_bbox_single(bbox_pred: torch.Tensor, anchors: torch.Tensor, bbox_targets: torch.Tensor, bbox_weights: torch.Tensor, num_total_samples: int) → Tuple[Dict][source]¶
Compute loss of a single scale level.
- Parameters
bbox_pred (Tensor) – Box energies / deltas for each scale level with shape (N, num_anchors * 4, H, W).
anchors (Tensor) – Box reference for each scale level with shape (N, num_total_anchors, 4).
bbox_targets (Tensor) – BBox regression targets of each anchor with shape (N, num_total_anchors, 4).
bbox_weights (Tensor) – BBox regression loss weights of each anchor with shape (N, num_total_anchors, 4).
num_total_samples (int) – If sampling, num_total_samples equals the number of total anchors; otherwise, it is the number of positive anchors.
- Returns
A dictionary of loss components.
- Return type
tuple[dict[str, Tensor]]
- loss_cls_single(cls_score: torch.Tensor, labels: torch.Tensor, label_weights: torch.Tensor, num_total_samples: int) → Tuple[Dict][source]¶
Compute loss of a single scale level.
- Parameters
cls_score (Tensor) – Box scores for each scale level with shape (N, num_anchors * num_classes, H, W).
labels (Tensor) – Labels of each anchor with shape (N, num_total_anchors).
label_weights (Tensor) – Label weights of each anchor with shape (N, num_total_anchors).
num_total_samples (int) – If sampling, num_total_samples equals the number of total anchors; otherwise, it is the number of positive anchors.
- Returns
A dictionary of loss components.
- Return type
tuple[dict[str, Tensor]]
detectors¶
- class mmfewshot.detection.models.detectors.AttentionRPNDetector(backbone: mmcv.utils.config.ConfigDict, neck: Optional[mmcv.utils.config.ConfigDict] = None, support_backbone: Optional[mmcv.utils.config.ConfigDict] = None, support_neck: Optional[mmcv.utils.config.ConfigDict] = None, rpn_head: Optional[mmcv.utils.config.ConfigDict] = None, roi_head: Optional[mmcv.utils.config.ConfigDict] = None, train_cfg: Optional[mmcv.utils.config.ConfigDict] = None, test_cfg: Optional[mmcv.utils.config.ConfigDict] = None, pretrained: Optional[mmcv.utils.config.ConfigDict] = None, init_cfg: Optional[mmcv.utils.config.ConfigDict] = None)[source]¶
Implementation of AttentionRPN.
- Parameters
backbone (dict) – Config of the backbone for query data.
neck (dict | None) – Config of the neck for query data and probably for support data. Default: None.
support_backbone (dict | None) – Config of the backbone for support data only. If None, support and query data will share same backbone. Default: None.
support_neck (dict | None) – Config of the neck for support data only. Default: None.
rpn_head (dict | None) – Config of rpn_head. Default: None.
roi_head (dict | None) – Config of roi_head. Default: None.
train_cfg (dict | None) – Training config. Default: None.
test_cfg (dict | None) – Testing config. Default: None.
pretrained (str | None) – Path to the pretrained model. Default: None.
init_cfg (dict | list[dict] | None) – Initialization config dict. Default: None.
- extract_support_feat(img: torch.Tensor) → List[torch.Tensor][source]¶
Extract features of support data.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
- Returns
Features of support images, each item with shape (N, C, H, W).
- Return type
list[Tensor]
- forward_model_init(img: torch.Tensor, img_metas: List[Dict], gt_bboxes: Optional[List[torch.Tensor]] = None, gt_labels: Optional[List[torch.Tensor]] = None, **kwargs) → Dict[source]¶
Extract and save support features for model initialization.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
img_metas (list[dict]) – List of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[Tensor]) – Class indices corresponding to each box.
- Returns
A dict containing the following keys:
- gt_labels (Tensor): Class indices corresponding to each feature.
- res4_roi_feat (Tensor): RoI features of res4 layer.
- res5_roi_feat (Tensor): RoI features of res5 layer.
- Return type
dict
- simple_test(img: torch.Tensor, img_metas: List[Dict], proposals: Optional[List[torch.Tensor]] = None, rescale: bool = False) → List[List[numpy.ndarray]][source]¶
Test without augmentation.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
img_metas (list[dict]) – List of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
proposals (list[Tensor] | None) – Override RPN proposals with custom proposals. Use when with_rpn is False. Default: None.
rescale (bool) – If True, return boxes in original image space.
- Returns
- BBox results of each image and classes.
The outer list corresponds to each image. The inner list corresponds to each class.
- Return type
list[list[np.ndarray]]
- class mmfewshot.detection.models.detectors.FSCE(backbone, neck=None, rpn_head=None, roi_head=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None)[source]¶
Implementation of FSCE.
- class mmfewshot.detection.models.detectors.FSDetView(backbone: mmcv.utils.config.ConfigDict, neck: Optional[mmcv.utils.config.ConfigDict] = None, support_backbone: Optional[mmcv.utils.config.ConfigDict] = None, support_neck: Optional[mmcv.utils.config.ConfigDict] = None, rpn_head: Optional[mmcv.utils.config.ConfigDict] = None, roi_head: Optional[mmcv.utils.config.ConfigDict] = None, train_cfg: Optional[mmcv.utils.config.ConfigDict] = None, test_cfg: Optional[mmcv.utils.config.ConfigDict] = None, pretrained: Optional[mmcv.utils.config.ConfigDict] = None, init_cfg: Optional[mmcv.utils.config.ConfigDict] = None)[source]¶
Implementation of FSDetView.
- class mmfewshot.detection.models.detectors.MPSR(rpn_select_levels: List[int], roi_select_levels: List[int], *args, **kwargs)[source]¶
Implementation of MPSR.
- Parameters
rpn_select_levels (list[int]) – Specify the corresponding level of fpn features for each scale of image. The selected features will be fed into rpn head.
roi_select_levels (list[int]) – Specify which level of FPN features is selected for each scale of image. The selected features will be fed into the roi head.
- extract_auxiliary_feat(auxiliary_img_list: List[torch.Tensor]) → Tuple[List[torch.Tensor], List[torch.Tensor]][source]¶
Extract and select features from a data list at multiple scales.
- Parameters
auxiliary_img_list (list[Tensor]) – List of data at different scales. In most cases, each dict contains: img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore.
- Returns
- rpn_feats (list[Tensor]): Features at multiple scales used for rpn head training.
- roi_feats (list[Tensor]): Features at multiple scales used for roi head training.
- Return type
tuple
- extract_feat(img: torch.Tensor) → List[torch.Tensor][source]¶
Directly extract features from the backbone+neck.
- forward(main_data: Dict = None, auxiliary_data: Dict = None, img: List[torch.Tensor] = None, img_metas: List[Dict] = None, return_loss: bool = True, **kwargs) → Dict[source]¶
Calls either forward_train() or forward_test() depending on whether return_loss is True.
Note this setting will change the expected inputs. When return_loss=True, the input will be main and auxiliary data for training, and when return_loss=False, the input will be img and img_metas for testing.
- Parameters
main_data (dict) – Used for forward_train(). Dict of data and data info, where each dict has: img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore. Default: None.
auxiliary_data (dict) – Used for forward_train(). Dict of data and data info at multiple scales, where each key uses a different suffix to indicate a different scale. For example, img_scale_i, img_metas_scale_i, gt_bboxes_scale_i, gt_labels_scale_i, gt_bboxes_ignore_scale_i, where i is in the range from 0 to the number of scales. Default: None.
img (list[Tensor]) – Used for forward_test() or forward_model_init(). List of tensors of shape (1, C, H, W). Typically these should be mean centered and std scaled. Default: None.
img_metas (list[dict]) – Used for forward_test() or forward_model_init(). List of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys, see mmdet.datasets.pipelines.Collect. Default: None.
return_loss (bool) – If set True call forward_train(), otherwise call forward_test(). Default: True.
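The suffix convention for auxiliary_data keys (img_scale_i, img_metas_scale_i, and so on) suggests that the flat dict can be regrouped into one dict per scale, which is the shape forward_train() expects as auxiliary_data_list. The following is a minimal sketch of such a regrouping, assuming only the key naming described above; the helper split_by_scale is hypothetical and may differ from how MPSR performs the conversion.

```python
import re
from collections import defaultdict
from typing import Any, Dict, List

# Matches keys such as "gt_bboxes_ignore_scale_2" -> ("gt_bboxes_ignore", 2).
_SCALE_KEY = re.compile(r"^(?P<name>.+)_scale_(?P<idx>\d+)$")


def split_by_scale(auxiliary_data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Group suffixed keys into a list with one dict per scale index."""
    per_scale: Dict[int, Dict[str, Any]] = defaultdict(dict)
    for key, value in auxiliary_data.items():
        match = _SCALE_KEY.match(key)
        if match is None:
            raise KeyError(f"key without a scale suffix: {key}")
        per_scale[int(match["idx"])][match["name"]] = value
    # Return scales in ascending order of their index.
    return [per_scale[i] for i in sorted(per_scale)]


auxiliary_data = {
    "img_scale_0": "img0", "img_metas_scale_0": "metas0",
    "img_scale_1": "img1", "img_metas_scale_1": "metas1",
}
auxiliary_data_list = split_by_scale(auxiliary_data)
```

Strings stand in for tensors and meta dicts here so that the regrouping logic can be checked without any deep learning dependencies.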
- forward_train(main_data: Dict, auxiliary_data_list: List[Dict], **kwargs) → Dict[source]¶
- Parameters
main_data (dict) – In most cases, dict of main data contains: img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore.
auxiliary_data_list (list[dict]) – List of data at different scales. In most cases, each dict contains: img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- train_step(data: Dict, optimizer: Union[object, Dict]) → Dict[source]¶
The iteration step during training.
This method defines an iteration step during training, except for the back propagation and optimizer updating, which are done in an optimizer hook. Note that in some complicated cases or models, the whole process including back propagation and optimizer updating is also defined in this method, such as GAN.
- Parameters
data (dict) – The output of dataloader.
optimizer (torch.optim.Optimizer | dict) – The optimizer of runner is passed to train_step(). This argument is unused and reserved.
- Returns
It should contain at least 3 keys: loss, log_vars, num_samples.
- loss is a tensor for back propagation, which can be a weighted sum of multiple losses.
- log_vars contains all the variables to be sent to the logger.
- num_samples indicates the batch size (when the model is DDP, it means the batch size on each GPU), which is used for averaging the logs.
- Return type
dict
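To make the role of num_samples concrete, the sketch below averages log_vars over several step outputs, weighting each step by its batch size. It is an assumption-laden illustration: plain floats stand in for tensors, the helper average_logs is hypothetical, and the way the actual runner aggregates logs is not specified in this reference.

```python
from typing import Dict, List


def average_logs(step_outputs: List[dict]) -> Dict[str, float]:
    """Average log_vars across steps, weighted by each step's num_samples."""
    total = sum(out["num_samples"] for out in step_outputs)
    averaged: Dict[str, float] = {}
    for out in step_outputs:
        weight = out["num_samples"] / total
        for name, value in out["log_vars"].items():
            averaged[name] = averaged.get(name, 0.0) + weight * value
    return averaged


# Two steps with different batch sizes; each dict mimics a train_step() output.
outputs = [
    {"loss": 1.0, "log_vars": {"loss": 1.0}, "num_samples": 2},
    {"loss": 4.0, "log_vars": {"loss": 4.0}, "num_samples": 6},
]
logs = average_logs(outputs)
```

Weighting by num_samples keeps a small final batch from having the same influence on the logged value as a full-sized one.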
- val_step(data: Dict, optimizer: Optional[Union[object, Dict]] = None) → Dict[source]¶
The iteration step during validation.
This method shares the same signature as train_step(), but is used during val epochs. Note that the evaluation after training epochs is not implemented with this method, but an evaluation hook.
- class mmfewshot.detection.models.detectors.MetaRCNN(backbone: mmcv.utils.config.ConfigDict, neck: Optional[mmcv.utils.config.ConfigDict] = None, support_backbone: Optional[mmcv.utils.config.ConfigDict] = None, support_neck: Optional[mmcv.utils.config.ConfigDict] = None, rpn_head: Optional[mmcv.utils.config.ConfigDict] = None, roi_head: Optional[mmcv.utils.config.ConfigDict] = None, train_cfg: Optional[mmcv.utils.config.ConfigDict] = None, test_cfg: Optional[mmcv.utils.config.ConfigDict] = None, pretrained: Optional[mmcv.utils.config.ConfigDict] = None, init_cfg: Optional[mmcv.utils.config.ConfigDict] = None)[source]¶
Implementation of Meta R-CNN.
- Parameters
backbone (dict) – Config of the backbone for query data.
neck (dict | None) – Config of the neck for query data and probably for support data. Default: None.
support_backbone (dict | None) – Config of the backbone for support data only. If None, support and query data will share same backbone. Default: None.
support_neck (dict | None) – Config of the neck for support data only. Default: None.
rpn_head (dict | None) – Config of rpn_head. Default: None.
roi_head (dict | None) – Config of roi_head. Default: None.
train_cfg (dict | None) – Training config. Default: None.
test_cfg (dict | None) – Testing config. Default: None.
pretrained (str | None) – Path to the pretrained model. Default: None.
init_cfg (dict | list[dict] | None) – Initialization config dict. Default: None.
- extract_support_feat(img)[source]¶
Extracting features from support data.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
- Returns
Features of input image, each item with shape (N, C, H, W).
- Return type
list[Tensor]
- forward_model_init(img: torch.Tensor, img_metas: List[Dict], gt_bboxes: Optional[List[torch.Tensor]] = None, gt_labels: Optional[List[torch.Tensor]] = None, **kwargs)[source]¶
Extract and save support features for model initialization.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
img_metas (list[dict]) – List of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[Tensor]) – Class indices corresponding to each box.
- Returns
A dict containing the following keys:
- gt_labels (Tensor): Class indices corresponding to each feature.
- res5_rois (list[Tensor]): RoI features of res5 layer.
- Return type
dict
- simple_test(img: torch.Tensor, img_metas: List[Dict], proposals: Optional[List[torch.Tensor]] = None, rescale: bool = False)[source]¶
Test without augmentation.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
img_metas (list[dict]) – List of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
proposals (list[Tensor] | None) – Override RPN proposals with custom proposals. Use when with_rpn is False. Default: None.
rescale (bool) – If True, return boxes in original image space.
- Returns
- BBox results of each image and classes.
The outer list corresponds to each image. The inner list corresponds to each class.
- Return type
list[list[np.ndarray]]
- class mmfewshot.detection.models.detectors.QuerySupportDetector(backbone: mmcv.utils.config.ConfigDict, neck: Optional[mmcv.utils.config.ConfigDict] = None, support_backbone: Optional[mmcv.utils.config.ConfigDict] = None, support_neck: Optional[mmcv.utils.config.ConfigDict] = None, rpn_head: Optional[mmcv.utils.config.ConfigDict] = None, roi_head: Optional[mmcv.utils.config.ConfigDict] = None, train_cfg: Optional[mmcv.utils.config.ConfigDict] = None, test_cfg: Optional[mmcv.utils.config.ConfigDict] = None, pretrained: Optional[mmcv.utils.config.ConfigDict] = None, init_cfg: Optional[mmcv.utils.config.ConfigDict] = None)[source]¶
Base class for two-stage detectors in query-support fashion.
Query-support detectors typically consist of a region proposal network and a task-specific regression head. There are two pipelines, one for query data and one for support data.
- Parameters
backbone (dict) – Config of the backbone for query data.
neck (dict | None) – Config of the neck for query data and probably for support data. Default: None.
support_backbone (dict | None) – Config of the backbone for support data only. If None, support and query data will share same backbone. Default: None.
support_neck (dict | None) – Config of the neck for support data only. Default: None.
rpn_head (dict | None) – Config of rpn_head. Default: None.
roi_head (dict | None) – Config of roi_head. Default: None.
train_cfg (dict | None) – Training config. Default: None.
test_cfg (dict | None) – Testing config. Default: None.
pretrained (str | None) – Path to the pretrained model. Default: None.
init_cfg (dict | list[dict] | None) – Initialization config dict. Default: None.
- extract_feat(img: torch.Tensor) → List[torch.Tensor][source]¶
Extract features of query data.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
- Returns
Features of query images.
- Return type
list[Tensor]
- extract_query_feat(img: torch.Tensor) → List[torch.Tensor][source]¶
Extract features of query data.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
- Returns
Features of query images, each item with shape (N, C, H, W).
- Return type
list[Tensor]
- forward(query_data: Optional[Dict] = None, support_data: Optional[Dict] = None, img: Optional[List[torch.Tensor]] = None, img_metas: Optional[List[Dict]] = None, mode: typing_extensions.Literal[train, model_init, test] = 'train', **kwargs) → Dict[source]¶
Calls one of (forward_train(), forward_test() and forward_model_init()) according to the mode. The inputs of the forward function change with the mode.
When mode is ‘train’, the input will be query and support data for training.
When mode is ‘model_init’, the input will be support template data at least including (img, img_metas).
When mode is ‘test’, the input will be test data at least including (img, img_metas).
- Parameters
query_data (dict) – Used for forward_train(). Dict of query data and data info where each dict has: img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore. Default: None.
support_data (dict) – Used for forward_train(). Dict of support data and data info dict where each dict has: img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore. Default: None.
img (list[Tensor]) – Used for forward_test() or forward_model_init(). List of tensors of shape (1, C, H, W). Typically these should be mean centered and std scaled. Default: None.
img_metas (list[dict]) – Used for forward_test() or forward_model_init(). List of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys, see mmdet.datasets.pipelines.Collect. Default: None.
mode (str) – Indicate which function to call. Options are ‘train’, ‘model_init’ and ‘test’. Default: ‘train’.
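The mode-based dispatch described for forward() can be sketched with a small stand-in class. This is not the QuerySupportDetector source: the class ModeDispatcher is hypothetical, its three forward_* methods only return their own names, and raising ValueError on an unknown mode is an assumption about reasonable behavior.

```python
class ModeDispatcher:
    """Hypothetical stand-in showing how forward() could switch on mode."""

    def forward_train(self, query_data, support_data, **kwargs):
        return "forward_train"

    def forward_model_init(self, img, img_metas, **kwargs):
        return "forward_model_init"

    def forward_test(self, img, img_metas, **kwargs):
        return "forward_test"

    def forward(self, query_data=None, support_data=None, img=None,
                img_metas=None, mode="train", **kwargs):
        # Each mode consumes a different subset of the inputs.
        if mode == "train":
            return self.forward_train(query_data, support_data, **kwargs)
        if mode == "model_init":
            return self.forward_model_init(img, img_metas, **kwargs)
        if mode == "test":
            return self.forward_test(img, img_metas, **kwargs)
        raise ValueError(f"unsupported mode: {mode}")


dispatcher = ModeDispatcher()
called = [
    dispatcher.forward(query_data={}, support_data={}, mode="train"),
    dispatcher.forward(img=[], img_metas=[], mode="model_init"),
    dispatcher.forward(img=[], img_metas=[], mode="test"),
]
```

Keeping all inputs optional in forward() is what lets a single entry point serve training, support-feature initialization, and testing.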
- abstract forward_model_init(img: torch.Tensor, img_metas: List[Dict], gt_bboxes: Optional[List[torch.Tensor]] = None, gt_labels: Optional[List[torch.Tensor]] = None, **kwargs)[source]¶
Extract and save support features for model initialization.
- forward_train(query_data: Dict, support_data: Dict, proposals: Optional[List] = None, **kwargs) → Dict[source]¶
Forward function for training.
- Parameters
query_data (dict) – In most cases, dict of query data contains: img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore.
support_data (dict) – In most cases, dict of support data contains: img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore.
proposals (list) – Override rpn proposals with custom proposals. Use when with_rpn is False. Default: None.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- simple_test(img: torch.Tensor, img_metas: List[Dict], proposals: Optional[List[torch.Tensor]] = None, rescale: bool = False)[source]¶
Test without augmentation.
- train_step(data: Dict, optimizer: Union[object, Dict]) → Dict[source]¶
The iteration step during training.
This method defines an iteration step during training, except for the back propagation and optimizer updating, which are done in an optimizer hook. Note that in some complicated cases or models, such as GANs, the whole process including back propagation and optimizer updating is also defined in this method. For most query-support detectors, the batch size denotes the batch size of the query data.
- Parameters
data (dict) – The output of dataloader.
optimizer (torch.optim.Optimizer | dict) – The optimizer of the runner is passed to train_step(). This argument is unused and reserved.
- Returns
- It should contain at least 3 keys: loss, log_vars, num_samples.
loss is a tensor for back propagation, which can be a weighted sum of multiple losses.
log_vars contains all the variables to be sent to the logger.
num_samples indicates the batch size (when the model is DDP, it means the batch size on each GPU), which is used for averaging the logs.
- Return type
dict
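As a rough, framework-free illustration of the return value described above, the sketch below assembles the three keys from a dictionary of named loss components. Plain floats stand in for tensors; the helper name `build_step_outputs` and the rule of summing every entry whose name contains "loss" are assumptions made for this example, not part of the documented API.

```python
def build_step_outputs(losses, num_samples):
    """Assemble a train_step-style output dict (hypothetical helper).

    Floats stand in for tensors in this sketch.
    """
    # Assumption: only entries whose name contains "loss" are back-propagated.
    total = sum(value for name, value in losses.items() if "loss" in name)
    # Every component, plus the total, is forwarded to the logger.
    log_vars = dict(losses)
    log_vars["loss"] = total
    return {"loss": total, "log_vars": log_vars, "num_samples": num_samples}


outputs = build_step_outputs(
    {"loss_cls": 0.5, "loss_bbox": 0.25, "acc": 0.75}, num_samples=4)
```

Here `acc` is logged but excluded from the total, so `outputs["loss"]` only reflects the two loss terms.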
- val_step(data: Dict, optimizer: Optional[Union[object, Dict]] = None) → Dict[source]¶
The iteration step during validation.
This method shares the same signature as train_step(), but is used during val epochs. Note that the evaluation after training epochs is not implemented with this method, but with an evaluation hook.
losses¶
- class mmfewshot.detection.models.losses.SupervisedContrastiveLoss(temperature: float = 0.2, iou_threshold: float = 0.5, reweight_type: typing_extensions.Literal[none, exp, linear] = 'none', reduction: typing_extensions.Literal[none, mean, sum] = 'mean', loss_weight: float = 1.0)[source]¶
This part of code is modified from https://github.com/MegviiDetection/FSCE.
- Parameters
temperature (float) – A constant by which the cosine similarity is divided to enlarge its magnitude. Default: 0.2.
iou_threshold (float) – Consider proposals with higher credibility to increase consistency. Default: 0.5.
reweight_type (str) – Reweight function for contrastive loss. Options are (‘none’, ‘exp’, ‘linear’). Default: ‘none’.
reduction (str) – The method used to reduce the loss into a scalar. Default: ‘mean’. Options are “none”, “mean” and “sum”.
loss_weight (float) – Weight of loss. Default: 1.0.
- forward(features: torch.Tensor, labels: torch.Tensor, ious: torch.Tensor, decay_rate: Optional[float] = None, weight: Optional[torch.Tensor] = None, avg_factor: Optional[int] = None, reduction_override: Optional[str] = None) → torch.Tensor[source]¶
Forward function.
- Parameters
features (tensor) – Shape of (N, K) where N is the number of features to be compared and K is the channels.
labels (tensor) – Shape of (N).
ious (tensor) – Shape of (N).
decay_rate (float | None) – The decay rate for total loss. Default: None.
weight (Tensor | None) – The weight of loss for each prediction with shape of (N). Default: None.
avg_factor (int | None) – Average factor that is used to average the loss. Default: None.
reduction_override (str | None) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”. Default: None.
- Returns
The calculated loss.
- Return type
Tensor
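To make the roles of temperature and iou_threshold more concrete, here is a simplified, pure-Python sketch of a supervised contrastive loss. It is not the library's implementation: lists of floats stand in for tensors, reweighting, decay_rate and loss_weight are omitted, and the function name `supervised_contrastive_loss` is chosen only for this example.

```python
import math


def supervised_contrastive_loss(features, labels, ious,
                                temperature=0.2, iou_threshold=0.5):
    """Mean contrastive loss over samples whose IoU passes the threshold."""
    # L2-normalise so that dot products are cosine similarities.
    normed = []
    for feat in features:
        norm = math.sqrt(sum(v * v for v in feat)) or 1.0
        normed.append([v / norm for v in feat])

    n = len(normed)
    per_sample = []
    for i in range(n):
        if ious[i] < iou_threshold:
            continue  # low-IoU proposals do not contribute
        sims = [sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
                for j in range(n)]
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue
        # Log of the softmax denominator over all other samples.
        log_denom = math.log(sum(math.exp(sims[j]) for j in others))
        mean_log_prob = sum(sims[j] - log_denom for j in positives) / len(positives)
        per_sample.append(-mean_log_prob)
    return sum(per_sample) / len(per_sample) if per_sample else 0.0


feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aligned = supervised_contrastive_loss(feats, [0, 0, 1, 1], [0.9] * 4)
mixed = supervised_contrastive_loss(feats, [0, 1, 0, 1], [0.9] * 4)
```

When labels agree with the feature clusters (`aligned`), the loss is small; when same-label features point in different directions (`mixed`), it is much larger.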
roi_heads¶
- class mmfewshot.detection.models.roi_heads.ContrastiveBBoxHead(mlp_head_channels: int = 128, with_weight_decay: bool = False, loss_contrast: Dict = {'iou_threshold': 0.5, 'loss_weight': 1.0, 'reweight_type': 'none', 'temperature': 0.1, 'type': 'SupervisedContrastiveLoss'}, scale: int = 20, learnable_scale: bool = False, eps: float = 1e-05, *args, **kwargs)[source]¶
BBoxHead for FSCE.
- Parameters
mlp_head_channels (int) – Output channels of contrast branch mlp. Default: 128.
with_weight_decay (bool) – Whether to decay loss weight. Default: False.
loss_contrast (dict) – Config of contrast loss.
scale (int) – Scaling factor of cls_score. Default: 20.
learnable_scale (bool) – Learnable global scaling factor. Default: False.
eps (float) – Constant variable to avoid division by zero.
- forward(x: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]¶
Forward function.
- Parameters
x (Tensor) – Shape of (num_proposals, C, H, W).
- Returns
- cls_score (Tensor): Cls scores, has shape
(num_proposals, num_classes).
- bbox_pred (Tensor): Box energies / deltas, has shape
(num_proposals, 4).
- contrast_feat (Tensor): Box features for contrast loss,
has shape (num_proposals, C).
- Return type
tuple
- loss_contrast(contrast_feat: torch.Tensor, proposal_ious: torch.Tensor, labels: torch.Tensor, reduction_override: Optional[str] = None) → Dict[source]¶
Loss for contrast.
- Parameters
contrast_feat (tensor) – BBox features with shape (N, C) used for contrast loss.
proposal_ious (tensor) – IoU between proposal and ground truth corresponding to each BBox features with shape (N).
labels (tensor) – Labels for each BBox features with shape (N).
reduction_override (str | None) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”. Default: None.
- Returns
The calculated loss.
- Return type
Dict
- class mmfewshot.detection.models.roi_heads.ContrastiveRoIHead(bbox_roi_extractor=None, bbox_head=None, mask_roi_extractor=None, mask_head=None, shared_head=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None)[source]¶
RoI head for FSCE.
- class mmfewshot.detection.models.roi_heads.CosineSimBBoxHead(scale: int = 20, learnable_scale: bool = False, eps: float = 1e-05, *args, **kwargs)[source]¶
BBoxHead for TFA.
The code is modified from the official implementation https://github.com/ucbdrive/few-shot-object-detection/
- Parameters
scale (int) – Scaling factor of cls_score. Default: 20.
learnable_scale (bool) – Learnable global scaling factor. Default: False.
eps (float) – Constant variable to avoid division by zero.
- forward(x: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]¶
Forward function.
- Parameters
x (Tensor) – Shape of (num_proposals, C, H, W).
- Returns
- cls_score (Tensor): Cls scores, has shape
(num_proposals, num_classes).
- bbox_pred (Tensor): Box energies / deltas, has shape
(num_proposals, 4).
- Return type
tuple
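The scale and eps parameters can be illustrated with a minimal sketch of a cosine-similarity classifier: both the feature and each class weight are normalised (with eps guarding against division by zero) and the resulting cosine is multiplied by scale. The function name `cosine_scores` and the use of plain lists instead of tensors are assumptions for this example; the actual head may differ in detail.

```python
import math


def cosine_scores(feature, class_weights, scale=20.0, eps=1e-5):
    """Scaled cosine similarity between one feature and each class weight."""
    def normalise(vec):
        norm = math.sqrt(sum(v * v for v in vec))
        # eps keeps the division well defined for an all-zero vector.
        return [v / (norm + eps) for v in vec]

    feat = normalise(feature)
    return [scale * sum(a * b for a, b in zip(feat, normalise(weight)))
            for weight in class_weights]


# Same direction, opposite direction, and orthogonal class weights.
scores = cosine_scores([3.0, 4.0], [[3.0, 4.0], [-3.0, -4.0], [4.0, -3.0]])
```

Because the scores depend only on direction, they stay within roughly [-scale, scale] regardless of feature magnitude.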
- class mmfewshot.detection.models.roi_heads.FSDetViewRoIHead(aggregation_layer: Optional[Dict] = None, **kwargs)[source]¶
Roi head for FSDetView.
- Parameters
aggregation_layer (dict) – Config of aggregation_layer. Default: None.
- class mmfewshot.detection.models.roi_heads.MetaRCNNResLayer(*args, **kwargs)[source]¶
Shared resLayer for metarcnn and fsdetview.
It provides different forward logics for query and support images.
- class mmfewshot.detection.models.roi_heads.MetaRCNNRoIHead(aggregation_layer: Optional[mmcv.utils.config.ConfigDict] = None, **kwargs)[source]¶
Roi head for MetaRCNN.
- Parameters
aggregation_layer (ConfigDict) – Config of aggregation_layer. Default: None.
- extract_query_roi_feat(feats: List[torch.Tensor], rois: torch.Tensor) → torch.Tensor[source]¶
Extract query BBox features; used in both training and testing.
- Parameters
feats (list[Tensor]) – List of query features, each item with shape (N, C, H, W).
rois (Tensor) – shape with (m, 5).
- Returns
RoI features with shape (N, C).
- Return type
Tensor
- extract_support_feats(feats: List[torch.Tensor]) → List[torch.Tensor][source]¶
Forward support features through shared layers.
- Parameters
feats (list[Tensor]) – List of support features, each item with shape (N, C, H, W).
- Returns
- List of support features, each item
with shape (N, C).
- Return type
list[Tensor]
- forward_train(query_feats: List[torch.Tensor], support_feats: List[torch.Tensor], proposals: List[torch.Tensor], query_img_metas: List[Dict], query_gt_bboxes: List[torch.Tensor], query_gt_labels: List[torch.Tensor], support_gt_labels: List[torch.Tensor], query_gt_bboxes_ignore: Optional[List[torch.Tensor]] = None, **kwargs) → Dict[source]¶
Forward function for training.
- Parameters
query_feats (list[Tensor]) – List of query features, each item with shape (N, C, H, W).
support_feats (list[Tensor]) – List of support features, each item with shape (N, C, H, W).
proposals (list[Tensor]) – List of region proposals with positive and negative pairs.
query_img_metas (list[dict]) – List of query image info dict where each dict has: ‘img_shape’, ‘scale_factor’, ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’, and ‘img_norm_cfg’. For details on the values of these keys see mmdet/datasets/pipelines/formatting.py:Collect.
query_gt_bboxes (list[Tensor]) – Ground truth bboxes for each query image, each item with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
query_gt_labels (list[Tensor]) – Class indices corresponding to each box of query images, each item with shape (num_gts).
support_gt_labels (list[Tensor]) – Class indices corresponding to each box of support images, each item with shape (1).
query_gt_bboxes_ignore (list[Tensor] | None) – Specify which bounding boxes can be ignored when computing the loss. Default: None.
- Returns
A dictionary of loss components
- Return type
dict[str, Tensor]
- simple_test(query_feats: List[torch.Tensor], support_feats_dict: Dict, proposal_list: List[torch.Tensor], query_img_metas: List[Dict], rescale: bool = False) → List[List[numpy.ndarray]][source]¶
Test without augmentation.
- Parameters
query_feats (list[Tensor]) – Features of query image, each item with shape (N, C, H, W).
support_feats_dict (dict[int, Tensor]) – used for inference only, each key is the class id and value is the support template features with shape (1, C).
proposal_list (list[Tensors]) – list of region proposals.
query_img_metas (list[dict]) – list of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
rescale (bool) – Whether to rescale the results. Default: False.
- Returns
- BBox results of each image and classes.
The outer list corresponds to each image. The inner list corresponds to each class.
- Return type
list[list[np.ndarray]]
- simple_test_bboxes(query_feats: List[torch.Tensor], support_feats_dict: Dict, query_img_metas: List[Dict], proposals: List[torch.Tensor], rcnn_test_cfg: mmcv.utils.config.ConfigDict, rescale: bool = False) → Tuple[List[torch.Tensor], List[torch.Tensor]][source]¶
Test only det bboxes without augmentation.
- Parameters
query_feats (list[Tensor]) – Features of query image, each item with shape (N, C, H, W).
support_feats_dict (dict[int, Tensor]) – used for inference only, each key is the class id and value is the support template features with shape (1, C).
query_img_metas (list[dict]) – list of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
proposals (list[Tensor]) – Region proposals.
rcnn_test_cfg (ConfigDict) – test_cfg of R-CNN.
rescale (bool) – If True, return boxes in original image space. Default: False.
- Returns
- Each tensor in the first list has shape (num_boxes, 4) and each tensor in the second list has shape (num_boxes, ). The length of both lists should be equal to batch_size.
- Return type
tuple[list[Tensor], list[Tensor]]
- class mmfewshot.detection.models.roi_heads.MultiRelationBBoxHead(patch_relation: bool = True, local_correlation: bool = True, global_relation: bool = True, *args, **kwargs)[source]¶
BBox head for Attention RPN.
- Parameters
patch_relation (bool) – Whether to use the patch_relation head for classification. Following the official implementation, patch_relation is always True, because only the patch relation head contains a regression head. Default: True.
local_correlation (bool) – Whether to use the local_correlation head for classification. Default: True.
global_relation (bool) – Whether to use the global_relation head for classification. Default: True.
- forward(query_feat: torch.Tensor, support_feat: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]¶
Forward function.
- Parameters
query_feat (Tensor) – Shape of (num_proposals, C, H, W).
support_feat (Tensor) – Shape of (1, C, H, W).
- Returns
- cls_score (Tensor): Cls scores, has shape
(num_proposals, num_classes).
- bbox_pred (Tensor): Box energies / deltas, has shape
(num_proposals, 4).
- Return type
tuple
- loss(cls_scores: torch.Tensor, bbox_preds: torch.Tensor, rois: torch.Tensor, labels: torch.Tensor, label_weights: torch.Tensor, bbox_targets: torch.Tensor, bbox_weights: torch.Tensor, num_pos_pair_samples: int, reduction_override: Optional[str] = None, sample_fractions: Sequence[Union[int, float]] = (1, 2, 1)) → Dict[source]¶
Compute losses of the head.
- Parameters
cls_scores (Tensor) – Box scores with shape of (num_proposals, num_classes)
bbox_preds (Tensor) – Box energies / deltas with shape of (num_proposals, num_classes * 4)
rois (Tensor) – shape (N, 4) or (N, 5)
labels (Tensor) – Labels of proposals with shape (num_proposals).
label_weights (Tensor) – Label weights of proposals with shape (num_proposals).
bbox_targets (Tensor) – BBox regression targets of each proposal weight with shape (num_proposals, num_classes * 4).
bbox_weights (Tensor) – BBox regression loss weights of each proposal with shape (num_proposals, num_classes * 4).
num_pos_pair_samples (int) – Number of samples from positive pairs.
reduction_override (str | None) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”. Default: None.
sample_fractions (Sequence[int | float]) – Fractions of positive samples, negative samples from positive pair, negative samples from negative pair. Default: (1, 2, 1).
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- class mmfewshot.detection.models.roi_heads.MultiRelationRoIHead(num_support_ways: int = 2, num_support_shots: int = 5, sample_fractions: Sequence[Union[int, float]] = (1, 2, 1), **kwargs)[source]¶
Roi head for AttentionRPN.
- Parameters
num_support_ways (int) – Number of sampled classes (pos + neg).
num_support_shots (int) – Number of shots for each class.
sample_fractions (Sequence[int | float]) – Fractions of positive samples, negative samples from positive pair, negative samples from negative pair. Default: (1, 2, 1).
- extract_roi_feat(feats: List[torch.Tensor], rois: torch.Tensor) → torch.Tensor[source]¶
Extract BBOX feature function used in both training and testing.
- Parameters
feats (list[Tensor]) – Features from backbone, each item with shape (N, C, W, H).
rois (Tensor) – shape (num_proposals, 5).
- Returns
Roi features with shape (num_proposals, C).
- Return type
Tensor
- forward_train(query_feats: List[torch.Tensor], support_feats: List[torch.Tensor], proposals: List[torch.Tensor], query_img_metas: List[Dict], query_gt_bboxes: List[torch.Tensor], query_gt_labels: List[torch.Tensor], support_gt_bboxes: List[torch.Tensor], query_gt_bboxes_ignore: Optional[List[torch.Tensor]] = None, **kwargs) → Dict[source]¶
All arguments except proposals are passed in tuple of (query, support).
- Parameters
query_feats (list[Tensor]) – List of query features, each item with shape (N, C, H, W).
support_feats (list[Tensor]) – List of support features, each item with shape (N, C, H, W).
proposals (list[Tensor]) – List of region proposals with positive and negative query-support pairs.
query_img_metas (list[dict]) – List of query image info dict where each dict has: ‘img_shape’, ‘scale_factor’, ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’, and ‘img_norm_cfg’. For details on the values of these keys see mmdet/datasets/pipelines/formatting.py:Collect.
query_gt_bboxes (list[Tensor]) – Ground truth bboxes for each query image with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
query_gt_labels (list[Tensor]) – Class indices corresponding to each bbox from query image.
support_gt_bboxes (list[Tensor]) – Ground truth bboxes for each support image with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
query_gt_bboxes_ignore (None | list[Tensor]) – Specify which bounding boxes from query image can be ignored when computing the loss. Default: None.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- simple_test(query_feats: List[torch.Tensor], support_feat: torch.Tensor, proposals: List[torch.Tensor], query_img_metas: List[Dict], rescale: bool = False) → List[List[numpy.ndarray]][source]¶
Test without augmentation.
- Parameters
query_feats (list[Tensor]) – List of query features, each item with shape (N, C, H, W).
support_feat (Tensor) – Support features with shape (N, C, H, W).
proposals (Tensor or list[Tensor]) – List of region proposals.
query_img_metas (list[dict]) – list of query image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
rescale (bool) – Whether to rescale the results. Default: False.
- Returns
- BBox results of each image and classes.
The outer list corresponds to each image. The inner list corresponds to each class.
- Return type
list[list[np.ndarray]]
- simple_test_bboxes(query_feats: List[torch.Tensor], support_feat: torch.Tensor, query_img_metas: List[Dict], proposals: List[torch.Tensor], rcnn_test_cfg: mmcv.utils.config.ConfigDict, rescale: bool = False) → Tuple[List[torch.Tensor], List[torch.Tensor]][source]¶
Test only det bboxes without augmentation.
- Parameters
query_feats (list[Tensor]) – List of query features, each item with shape (N, C, H, W).
support_feat (Tensor) – Support feature with shape (N, C, H, W).
query_img_metas (list[dict]) – list of image info dict where each dict has: img_shape, scale_factor, flip, and may also contain filename, ori_shape, pad_shape, and img_norm_cfg. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
proposals (list[Tensor]) – Region proposals.
rcnn_test_cfg (ConfigDict) – test_cfg of R-CNN.
rescale (bool) – If True, return boxes in original image space. Default: False.
- Returns
- BBox of shape [N, num_bboxes, 5]
and class labels of shape [N, num_bboxes].
- Return type
tuple[Tensor, Tensor]
- class mmfewshot.detection.models.roi_heads.TwoBranchRoIHead(bbox_roi_extractor=None, bbox_head=None, mask_roi_extractor=None, mask_head=None, shared_head=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None)[source]¶
RoI head for MPSR.
- forward_auxiliary_train(feats: Tuple[torch.Tensor], gt_labels: List[torch.Tensor]) → Dict[source]¶
Forward function and calculate loss for auxiliary data in training.
- Parameters
feats (tuple[Tensor]) – List of features at multiple scales, each is a 4D-tensor.
gt_labels (list[Tensor]) – List of class indices corresponding to each feature, each is a 4D-tensor.
- Returns
a dictionary of loss components
- Return type
dict[str, Tensor]
utils¶
detection.utils¶
- class mmfewshot.detection.utils.ContrastiveLossDecayHook(decay_steps: Sequence[int], decay_rate: float = 0.5)¶
Hook for contrast loss weight decay used in FSCE.
- Parameters
decay_steps (list[int] | tuple[int]) – Each item in the list is the step to decay the loss weight.
decay_rate (float) – Decay rate. Default: 0.5.
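The interaction between decay_steps and decay_rate can be sketched as a step schedule: each time training passes one of the decay steps, the contrast loss weight is multiplied by decay_rate once more. The function below is a hypothetical stand-alone illustration; whether the real hook applies the decay exactly at or only after a given step is an assumption here.

```python
def contrast_loss_decay(iteration, decay_steps, decay_rate=0.5):
    """Multiplicative factor applied to the contrast loss weight (sketch)."""
    factor = 1.0
    for step in decay_steps:
        # Assumption: the decay takes effect once the iteration has passed the step.
        if iteration > step:
            factor *= decay_rate
    return factor


schedule = [contrast_loss_decay(it, decay_steps=[10, 20]) for it in (5, 15, 25)]
```

With two decay steps and the default rate, the factor goes from 1.0 to 0.5 and then to 0.25.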
mmfewshot.utils¶
- class mmfewshot.utils.DistributedInfiniteGroupSampler(dataset: Iterable, samples_per_gpu: int = 1, num_replicas: Optional[int] = None, rank: Optional[int] = None, seed: int = 0, shuffle: bool = True)[source]¶
Similar to InfiniteGroupSampler but in distributed version.
The length of the sampler is set to the actual length of the dataset, thus the length of the dataloader is still determined by the dataset. The implementation logic follows https://github.com/facebookresearch/detectron2/blob/main/detectron2/data/samplers/grouped_batch_sampler.py
- Parameters
dataset (Iterable) – The dataset.
samples_per_gpu (int) – Number of training samples on each GPU, i.e., batch size of each GPU. Default: 1.
num_replicas (int | None) – Number of processes participating in distributed training. Default: None.
rank (int | None) – Rank of current process. Default: None.
seed (int) – Random seed. Default: 0.
shuffle (bool) – Whether to shuffle the indices of a dummy epoch. Note that shuffling cannot guarantee sequential indices, because it needs to ensure that all indices in a batch are in the same group. Default: True.
- class mmfewshot.utils.DistributedInfiniteSampler(dataset: Iterable, num_replicas: Optional[int] = None, rank: Optional[int] = None, seed: int = 0, shuffle: bool = True)[source]¶
Similar to InfiniteSampler but in distributed version.
The length of the sampler is set to the actual length of the dataset, thus the length of the dataloader is still determined by the dataset. The implementation logic follows https://github.com/facebookresearch/detectron2/blob/main/detectron2/data/samplers/grouped_batch_sampler.py
- Parameters
dataset (Iterable) – The dataset.
num_replicas (int | None) – Number of processes participating in distributed training. Default: None.
rank (int | None) – Rank of current process. Default: None.
seed (int) – Random seed. Default: 0.
shuffle (bool) – Whether to shuffle the dataset or not. Default: True.
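One common way to distribute an endless index stream, used here purely as an illustration, is for each process to take every num_replicas-th index starting at its own rank. The sketch below assumes an unshuffled stream for clarity, and the function name `sharded_indices` is hypothetical; the actual sampler may shard differently.

```python
import itertools


def sharded_indices(size, rank, num_replicas):
    """Every num_replicas-th index of an endless pass over range(size)."""
    stream = itertools.cycle(range(size))  # unshuffled for clarity
    # Start at `rank` and step by `num_replicas` so ranks never overlap.
    return itertools.islice(stream, rank, None, num_replicas)


rank0 = list(itertools.islice(sharded_indices(4, rank=0, num_replicas=2), 4))
rank1 = list(itertools.islice(sharded_indices(4, rank=1, num_replicas=2), 4))
```

In this toy setting the two ranks see disjoint halves of the dataset, which only works if every rank generates the same underlying stream (hence a shared seed when shuffling).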
- class mmfewshot.utils.InfiniteEpochBasedRunner(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, max_iters=None, max_epochs=None)[source]¶
Epoch-based Runner supports dataloader with InfiniteSampler.
The workers of the dataloader are re-initialized whenever the iterator of the dataloader is created. InfiniteSampler is designed to avoid these time-consuming operations, since an iterator with InfiniteSampler never reaches the end.
- class mmfewshot.utils.InfiniteGroupSampler(dataset: Iterable, samples_per_gpu: int = 1, seed: int = 0, shuffle: bool = True)[source]¶
Similar to InfiniteSampler, but all indices in a batch should be in the same group of flag.
The length of the sampler is set to the actual length of the dataset, thus the length of the dataloader is still determined by the dataset. The implementation logic follows https://github.com/facebookresearch/detectron2/blob/main/detectron2/data/samplers/grouped_batch_sampler.py
- Parameters
dataset (Iterable) – The dataset.
samples_per_gpu (int) – Number of training samples on each GPU, i.e., batch size of each GPU. Default: 1.
seed (int) – Random seed. Default: 0.
shuffle (bool) – Whether to shuffle the indices of a dummy epoch. Note that shuffling cannot guarantee sequential indices, because it needs to ensure that all indices in a batch are in the same group. Default: True.
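The constraint that all indices in a batch share a group can be sketched with one buffer per group: indices are drawn from an endless stream, parked in the buffer of their group, and released only once a buffer holds samples_per_gpu entries. This is a simplified, unshuffled illustration; `grouped_indices` and the `flags` list are names assumed for the example.

```python
import itertools


def grouped_indices(flags, samples_per_gpu):
    """Yield indices so that each run of samples_per_gpu shares one flag."""
    buffers = {}
    for idx in itertools.cycle(range(len(flags))):
        bucket = buffers.setdefault(flags[idx], [])
        bucket.append(idx)
        if len(bucket) == samples_per_gpu:
            # A full bucket becomes one batch worth of indices.
            yield from bucket
            bucket.clear()


first = list(itertools.islice(grouped_indices([0, 1, 0, 1], 2), 8))
```

Note how the emitted order differs from the dataset order: indices wait in their buffer until enough same-group neighbours have arrived.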
- class mmfewshot.utils.InfiniteSampler(dataset: Iterable, seed: int = 0, shuffle: bool = True)[source]¶
Return an infinite stream of indices.
The length of the sampler is set to the actual length of the dataset, thus the length of the dataloader is still determined by the dataset. The implementation logic follows https://github.com/facebookresearch/detectron2/blob/main/detectron2/data/samplers/grouped_batch_sampler.py
- Parameters
dataset (Iterable) – The dataset.
seed (int) – Random seed. Default: 0.
shuffle (bool) – Whether to shuffle the dataset or not. Default: True.
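A minimal sketch of an infinite index stream, assuming a generator that replays (optionally reshuffled) passes over the dataset indices forever. It uses Python's `random.Random` in place of a torch generator, and `infinite_indices` is a name chosen for this example rather than a documented function.

```python
import itertools
import random


def infinite_indices(size, seed=0, shuffle=True):
    """Endlessly yield dataset indices, one (reshuffled) pass after another."""
    rng = random.Random(seed)  # local generator, seeded for reproducibility
    while True:
        order = list(range(size))
        if shuffle:
            rng.shuffle(order)
        yield from order


sequential = list(itertools.islice(infinite_indices(3, shuffle=False), 7))
shuffled_a = list(itertools.islice(infinite_indices(5, seed=1), 10))
shuffled_b = list(itertools.islice(infinite_indices(5, seed=1), 10))
```

Because the stream never ends, a dataloader built on such a sampler would not need to recreate its iterator between epochs.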
- mmfewshot.utils.local_numpy_seed(seed: Optional[int] = None) → None[source]¶
Run numpy codes with a local random seed.
If seed is None, the default random state will be used.
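The idea of a local seed can be sketched as a context manager that saves the global random state, seeds it for the duration of a block, and restores the saved state afterwards. The example below uses Python's standard `random` module as a stand-in for NumPy's global generator, so `local_seed` is only an analogue and not the documented function.

```python
import contextlib
import random


@contextlib.contextmanager
def local_seed(seed=None):
    """Temporarily seed the global `random` state, then restore it."""
    state = random.getstate()
    if seed is not None:
        random.seed(seed)
    try:
        yield
    finally:
        # Restore the outer state even if the block raised.
        random.setstate(state)


random.seed(123)
expected = random.random()

random.seed(123)
with local_seed(0):
    inside = random.random()
after = random.random()
```

The value drawn after the block matches what the outer stream would have produced without the block, since the inner seeding leaves no trace.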
- mmfewshot.utils.multi_pipeline_collate_fn(batch, samples_per_gpu: int = 1)[source]¶
Puts each data field into a tensor/DataContainer with outer dimension batch size. This is designed to support the case where the __getitem__() of the dataset returns more than one image, such as the query_support dataloader. The main difference from the collate_fn() in mmcv is that it can process list[list[DataContainer]].
Extend default_collate to add support for mmcv.parallel.DataContainer. There are 3 cases:
cpu_only = True, e.g., meta data.
cpu_only = False, stack = True, e.g., images tensors.
cpu_only = False, stack = False, e.g., gt bboxes.
- Parameters
batch (list[list[mmcv.parallel.DataContainer]] | list[mmcv.parallel.DataContainer]) – Data of single batch.
samples_per_gpu (int) – The number of samples of single GPU.
- mmfewshot.utils.sync_random_seed(seed=None, device='cuda')[source]¶
Propagating the seed of rank 0 to all other ranks.
Make sure different ranks share the same seed. All workers must call this function, otherwise it will deadlock. This method is generally used in DistributedSampler, because the seed should be identical across all processes in the distributed group. In distributed sampling, different ranks should sample non-overlapped data in the dataset. Therefore, this function is used to make sure that each rank shuffles the data indices in the same order based on the same seed. Then different ranks could use different indices to select non-overlapped data from the same data list.
- Parameters
seed (int, optional) – The seed. Default to None.
device (str) – The device where the seed will be put on. Default to ‘cuda’.
- Returns
Seed to be used.
- Return type
int