Dataset module
The dataset module provides an interface for accessing datasets and sequences.
It also provides a set of utility functions for downloading and extracting datasets.
- class vot.dataset.BasedSequence(name: str, loader: callable, metadata: dict = None)
This class implements the caching of the sequence data.
The sequence data is loaded only when it is needed.
- channel(channel: str = None) Channel
Returns the channel with the specified name. If the channel name is not specified, the default channel is returned.
- Parameters
channel (str, optional) – Channel name. Defaults to None.
- Returns
Channel
- Return type
Channel
- channels() List[str]
Returns a list of channel names in the sequence.
- Returns
List of channel names
- Return type
List[str]
- frame(index)
Returns the frame with the specified index in the sequence as a Frame object.
- Parameters
index (int) – Frame index
- Returns
Frame
- Return type
Frame
- groundtruth(index=None)
Returns the groundtruth object. If the index is specified, the object is returned as a Region object. If the sequence contains more than one object, an exception is raised. If more objects are present, this method ignores special objects.
- Parameters
index (int, optional) – Frame index. Defaults to None.
- Returns
Groundtruth region
- Return type
- property height
Returns the sequence height.
- metadata(name: str = None, default=None)
Returns the metadata value with the specified name.
- Parameters
name (str, optional) – Metadata name. If None, the entire metadata dictionary is returned.
default (object, optional) – Default value. Defaults to None.
- Returns
Metadata value
- Return type
object
- object(oid, index=None) Region
Returns the object with the specified id. If the index is specified, the object is returned as a Region object.
- Parameters
oid (str) – Object id
index (int, optional) – Frame index. Defaults to None.
- Returns
Object region
- Return type
Region
- objects() List[str]
Returns a list of object ids in the sequence.
- Returns
List of object ids
- Return type
List[str]
- property size
Returns the sequence size as a tuple (width, height)
- Returns
Sequence size
- Return type
tuple
- tags(index: int = None) List[str]
Returns a list of tags in the sequence. If the index is specified, only the tags that are present in the frame with the specified index are returned.
- Parameters
index (int, optional) – Frame index. Defaults to None.
- Returns
List of tags
- Return type
List[str]
- values(index: int = None) List[float]
Returns a list of values in the sequence. If the index is specified, only the values that are present in the frame with the specified index are returned.
- Parameters
index (int, optional) – Frame index. Defaults to None.
- Returns
List of values
- Return type
List[float]
- property width
Returns the sequence width.
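The load-on-demand behaviour described for BasedSequence can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: `LazySequenceSketch`, `fake_loader`, and the assumption that the loader receives the metadata dictionary are all hypothetical.

```python
from typing import Callable, Optional


class LazySequenceSketch:
    """Illustrative stand-in (hypothetical), not the toolkit's BasedSequence."""

    def __init__(self, name: str, loader: Callable[[dict], dict],
                 metadata: Optional[dict] = None):
        self.name = name
        self._loader = loader
        self._metadata = dict(metadata or {})
        self._data = None  # nothing is loaded at construction time

    def _preload(self) -> dict:
        # Call the loader on first use only, then reuse the cached result.
        if self._data is None:
            self._data = self._loader(self._metadata)
        return self._data

    def channels(self) -> list:
        return list(self._preload()["channels"])

    def metadata(self, name: Optional[str] = None, default=None):
        # Metadata can be answered without loading the sequence data.
        if name is None:
            return dict(self._metadata)
        return self._metadata.get(name, default)


calls = []  # records how many times the loader actually ran


def fake_loader(metadata: dict) -> dict:
    # Hypothetical loader: a real one would read files from disk.
    calls.append(1)
    return {"channels": {"color": None, "depth": None}}


sequence = LazySequenceSketch("example", fake_loader, {"fps": 30})
fps = sequence.metadata("fps")   # does not trigger the loader
first = sequence.channels()      # triggers the loader once
second = sequence.channels()     # served from the cache
```

Keeping the metadata separate from the loaded data means cheap queries do not force a disk read.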
- class vot.dataset.Channel
Abstract representation of an individual image channel, a sequence of images with uniform dimensions.
- abstract filename(index: int) str
Returns filename for the given index of the channel sequence.
- Parameters
index (int) – Index of the frame
- Returns
Filename of the frame
- Return type
str
- abstract frame(index: int) Frame
Returns frame object for the given index.
- Parameters
index (int) – Index of the frame
- Returns
Frame object
- Return type
Frame
- abstract property size: int
Returns the size of the channel in bytes.
- class vot.dataset.Dataset(sequences: Mapping[str, Sequence])
Base abstract class for a tracking dataset, a list of image sequences that are addressable by name and iterable.
- keys() List[str]
Returns a list of unique sequence names.
- Returns
List of sequence names
- Return type
List[str]
- list() List[str]
Returns a list of unique sequence names.
- Returns
List of sequence names
- Return type
List[str]
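A name-addressable, iterable collection like the one described for Dataset might look roughly like this. `DatasetSketch` is a hypothetical stand-in. Whether iteration yields sequences rather than names is an assumption made for the example.

```python
from collections.abc import Mapping
from typing import Iterator, List


class DatasetSketch:
    """Hypothetical stand-in for a dataset: sequences keyed by name."""

    def __init__(self, sequences: Mapping):
        self._sequences = dict(sequences)

    def __getitem__(self, name: str):
        # Address a sequence by its name.
        return self._sequences[name]

    def __len__(self) -> int:
        return len(self._sequences)

    def __iter__(self) -> Iterator:
        # Assumption: iterating yields the sequences, not their names.
        return iter(self._sequences.values())

    def keys(self) -> List[str]:
        return list(self._sequences.keys())

    def list(self) -> List[str]:
        # Same result as keys(), mirroring the two documented methods.
        return self.keys()


# Plain strings stand in for real sequence objects here.
dataset = DatasetSketch({"ball": "sequence-a", "car": "sequence-b"})
names = dataset.keys()
items = [sequence for sequence in dataset]
```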
- exception vot.dataset.DatasetException
Dataset and sequence related exceptions.
- class vot.dataset.Frame(sequence, index)
Frame object represents a single frame in the sequence.
It provides access to the image data, groundtruth, tags and values as a wrapper around the sequence object.
- channel(channel: Optional[str] = None)
Returns the channel object for the given channel name.
- Parameters
channel (Optional[str], optional) – Name of the channel. Defaults to None.
- channels()
Returns the list of channels in the sequence.
- Returns
List of channels
- Return type
List[str]
- filename(channel: Optional[str] = None)
Returns the filename for the given channel name and frame index.
- Parameters
channel (Optional[str], optional) – Name of the channel. Defaults to None.
- Returns
Filename of the frame
- Return type
str
- groundtruth() Region
Returns the groundtruth region for the frame.
- Returns
Groundtruth region
- Return type
Region
- Raises
DatasetException – If groundtruth is not available
- image(channel: Optional[str] = None) ndarray
Returns the image for the given channel name and frame index.
- Parameters
channel (Optional[str], optional) – Name of the channel. Defaults to None.
- Returns
Image object
- Return type
np.ndarray
- property index: int
Returns the index of the frame.
- Returns
Index of the frame
- Return type
int
- object(id: str) Region
Returns the object region for the given object id and frame index.
- Parameters
id (str) – Id of the object
- Returns
Object region
- Return type
Region
- objects() List[str]
Returns the list of objects in the frame.
- Returns
List of object ids
- Return type
List[str]
- property sequence: Sequence
Returns the sequence object of the frame object.
- Returns
Sequence object
- Return type
Sequence
- tags() List[str]
Returns the tags for the frame.
- Returns
List of tags
- Return type
List[str]
- values() Mapping[str, float]
Returns the values for the frame.
- Returns
Mapping of values
- Return type
Mapping[str, float]
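The "wrapper around the sequence object" idea behind Frame can be illustrated with a small sketch in which every per-frame query is forwarded to the sequence with the stored index. `SequenceStub` and `FrameSketch` are hypothetical names, and regions are plain tuples here rather than Region objects.

```python
class SequenceStub:
    """Hypothetical minimal sequence holding per-frame annotations."""

    def __init__(self, regions, tags):
        self._regions = regions  # one region per frame
        self._tags = tags        # tag name -> set of frame indices

    def groundtruth(self, index=None):
        return self._regions if index is None else self._regions[index]

    def tags(self, index=None):
        if index is None:
            return sorted(self._tags)
        return [tag for tag, frames in sorted(self._tags.items())
                if index in frames]


class FrameSketch:
    """Hypothetical frame view: stores (sequence, index) and delegates."""

    def __init__(self, sequence, index):
        self._sequence = sequence
        self._index = index

    @property
    def index(self):
        return self._index

    @property
    def sequence(self):
        return self._sequence

    def groundtruth(self):
        # Forward to the sequence, supplying the frame index.
        return self._sequence.groundtruth(self._index)

    def tags(self):
        return self._sequence.tags(self._index)


stub = SequenceStub(
    regions=[(10, 10, 5, 5), (12, 10, 5, 5)],
    tags={"occlusion": {1}, "camera_motion": {0}},
)
frame = FrameSketch(stub, 1)
```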
- class vot.dataset.FrameList
Abstract base for all sequences, just a list of frame objects.
- class vot.dataset.InMemoryChannel
In-memory channel represents a sequence of images with uniform dimensions.
It is used to represent a sequence of images in memory.
- append(image)
Appends an image to the channel.
- Parameters
image (np.ndarray) – Image object
- filename(index)
Throws an exception as the sequence is available in memory and not in files.
- frame(index)
Returns the frame object for the given index in the sequence channel.
- Parameters
index (int) – Index of the frame
- Returns
Frame object
- Return type
- property size
Returns the size of the channel in the format (width, height)
- Returns
Size of the channel
- Return type
Tuple[int, int]
- class vot.dataset.InMemorySequence(name, channels)
An in-memory sequence that can be used to construct a sequence programmatically and store it to disk. Used mainly for testing and debugging.
Only single object sequences are supported at the moment.
- append(images: dict, region: Region, tags: list = None, values: dict = None)
Appends a new frame to the sequence. The frame is specified by a dictionary of images, a region and optional tags and values.
- Parameters
images (dict) – Dictionary of images
region (Region) – Region
tags (list, optional) – List of tags
values (dict, optional) – Dictionary of values
- channel(channel: str) Channel
Returns the specified channel object.
- Parameters
channel (str) – Channel name
- Returns
Channel object
- Return type
Channel
- channels() List[str]
Returns a list of channel names.
- Returns
List of channel names
- Return type
List[str]
- frame(index: int) Frame
Returns the specified frame. The frame is returned as a Frame object.
- Parameters
index (int) – Frame index
- Returns
Frame object
- Return type
Frame
- groundtruth(index: int = None) Region
Returns the groundtruth object. If the index is specified, the object is returned as a Region object. If the index is not specified, the groundtruth object is returned as a Region object. If the sequence contains more than one object, an exception is raised.
- Parameters
index (int, optional) – Frame index. Defaults to None.
- Returns
Groundtruth object
- Return type
Region
- property height: int
Returns the sequence height.
- Returns
Sequence height
- Return type
int
- metadata(name=None, default=None)
Returns the value of the specified metadata field. If the field does not exist, the default value is returned.
- Parameters
name (str) – Name of the metadata field. If None, the entire metadata dictionary is returned.
default (object, optional) – Default value
- Returns
Value of the metadata field
- Return type
object
- object(oid: str, index: int = None) Region
Returns the specified object. If the index is specified, the object is returned as a Region object. If the index is not specified, the groundtruth object is returned as a Region object. If the sequence contains more than one object, an exception is raised.
- Parameters
oid (str) – Object id
index (int, optional) – Frame index. Defaults to None.
- Returns
Object
- Return type
Region
- objects(index: str = None) List[str]
Returns a list of object ids. If the index is specified, only the objects that are present in the frame with the specified index are returned.
Since only single object sequences are supported, the only object id that is returned is “object”.
- Parameters
index (int, optional) – Frame index. Defaults to None.
- property size: tuple
Returns the sequence size as a tuple (width, height)
- Returns
Sequence size
- Return type
tuple
- tags(index=None)
Returns a list of tags in the sequence. If the index is specified, only the tags that are present in the frame with the specified index are returned.
- Parameters
index (int, optional) – Frame index. Defaults to None.
- Returns
List of tags
- Return type
List[str]
- values(index=None)
Returns a list of values in the sequence. If the index is specified, only the values that are present in the frame with the specified index are returned.
- Parameters
index (int, optional) – Frame index. Defaults to None.
- Returns
List of values
- Return type
List[str]
- property width: int
Returns the sequence width.
- Returns
Sequence width
- Return type
int
- class vot.dataset.PatternFileListChannel(path, start=1, step=1, end=None, check_files=True)
Sequence channel implementation where each frame is stored in a file and all file names follow a specific pattern.
- property base: str
Returns the base path of the sequence.
- Returns
Base path
- Return type
str
- filename(index) str
Returns the filename of the frame at the specified index.
- Parameters
index (int) – Frame index
- Returns
Filename
- Return type
str
- frame(index: int) ndarray
Returns the frame at the specified index as a numpy array. The image is loaded using OpenCV and converted to RGB color space if necessary.
- Parameters
index (int) – Frame index
- Returns
Frame
- Return type
np.ndarray
- Raises
DatasetException – If the index is out of bounds
- property height: int
Returns the height of the frames in the sequence.
- Returns
Height of the frames
- Return type
int
- property pattern
Returns the pattern of the sequence.
- Returns
Pattern
- Return type
str
- property size: tuple
Returns the size of the frames in the sequence as a tuple (width, height)
- Returns
Size of the frames
- Return type
tuple
- property width: int
Returns the width of the frames in the sequence.
- Returns
Width of the frames
- Return type
int
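How a frame index might map to a file name under a pattern with the `start` and `step` parameters from the constructor can be sketched as below. The helper `pattern_filename` is hypothetical, and the printf-style pattern (`%08d.jpg`) is an assumption about the pattern syntax, not something this reference states.

```python
import os


def pattern_filename(base: str, pattern: str, index: int,
                     start: int = 1, step: int = 1) -> str:
    """Hypothetical helper: map a zero-based frame index to a file path."""
    # Frame 0 maps to file number `start`, frame 1 to `start + step`, ...
    number = start + index * step
    return os.path.join(base, pattern % number)


# Default numbering: files 1, 2, 3 for frames 0, 1, 2.
names = [pattern_filename("color", "%08d.jpg", i) for i in range(3)]

# Every second file, starting at file 10.
strided = [pattern_filename("color", "%08d.jpg", i, start=10, step=2)
           for i in range(3)]
```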
- class vot.dataset.Sequence(name: str)
A sequence is a list of frames (multiple channels) and a list of one or more annotated objects.
It also contains additional metadata and per-frame information, such as tags and values.
- abstract channel(channel=None) Channel
Returns the channel with the specified name or the default channel if no name is specified.
- Parameters
channel (str, optional) – Name of the channel
- Returns
Channel
- Return type
Channel
- abstract channels() Set[str]
Returns the names of all channels in the sequence.
- Returns
Names of all channels
- Return type
set
- describe()
Returns a dictionary with information about the sequence.
- Returns
Dictionary with information
- Return type
dict
- abstract groundtruth(index: int) Region
Returns the ground truth region for the specified frame index or None if no ground truth is available for the frame or the frame index is out of bounds. This is a legacy method for compatibility with single-object datasets and should not be used in new code.
- Parameters
index (int) – Frame index
- Returns
Ground truth region
- Return type
Region
- abstract property height: int
Returns the height of the frames in the sequence in pixels.
- Returns
Height of the frames
- Return type
int
- property identifier: str
Returns the identifier of the sequence. The identifier is a string that uniquely identifies the sequence in the dataset. The identifier is usually the same as the name, but may be different if the name is not unique.
- Returns
Identifier
- Return type
str
- abstract metadata(name=None, default=None)
Returns the value of the specified metadata field. If the field does not exist, the default value is returned.
- Parameters
name (str) – Name of the metadata field. If None, the entire metadata dictionary is returned.
default (object, optional) – Default value
- Returns
Value of the metadata field
- Return type
object
- property name: str
Returns the name of the sequence.
- Returns
Name
- Return type
str
- abstract object(oid, index=None)
Returns the object with the specified name or identifier. If the index is specified, the object is returned only if it is visible in the frame at the specified index.
- Parameters
oid (str) – Name or identifier of the object
index (int, optional) – Frame index
- Returns
Object
- Return type
- abstract objects() Set[str]
Returns the names of all objects in the sequence.
- Returns
Names of all objects
- Return type
set
- property size: Tuple[int, int]
Returns the size of the frames in the sequence in pixels as a tuple (width, height)
- Returns
Size of the frames
- Return type
tuple
- abstract tags(index=None) List[str]
Returns the tags for the specified frame index or None if no tags are available for the frame or the frame index is out of bounds.
- Parameters
index (int, optional) – Frame index
- Returns
List of tags
- Return type
list
- abstract values(index=None) Mapping[str, Number]
Returns the values for the specified frame index or None if no values are available for the frame or the frame index is out of bounds.
- Parameters
index (int, optional) – Frame index
- Returns
Dictionary of values
- Return type
dict
- abstract property width: int
Returns the width of the frames in the sequence in pixels.
- Returns
Width of the frames
- Return type
int
- class vot.dataset.SequenceData(channels, objects, tags, values, length)
- channels
Alias for field number 0
- length
Alias for field number 4
- objects
Alias for field number 1
- tags
Alias for field number 2
- values
Alias for field number 3
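The "Alias for field number N" entries are the docstrings Python generates for named-tuple fields, so a container with the same field order can be sketched with `collections.namedtuple`. `SequenceDataSketch` and the placeholder values below are illustrative only.

```python
from collections import namedtuple

# Same field order as documented: channels=0, objects=1, tags=2,
# values=3, length=4.
SequenceDataSketch = namedtuple(
    "SequenceDataSketch",
    ["channels", "objects", "tags", "values", "length"],
)

# Placeholder contents; real data would hold channel and region objects.
data = SequenceDataSketch(
    channels={"color": None},
    objects={"object": []},
    tags={},
    values={},
    length=0,
)

# Fields are reachable by name, by position, or by unpacking.
channels, objects, tags, values, length = data
```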
- class vot.dataset.SequenceIterator(sequence: Sequence)
Sequence iterator provides an iterator interface for the sequence object.
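An iterator over a sequence's frames could follow Python's iterator protocol as sketched here. `FrameIteratorSketch` and `CountingSequence` are hypothetical. The sketch assumes the sequence supports `len()` and `frame(index)`.

```python
class CountingSequence:
    """Hypothetical sequence stub with a length and a frame() accessor."""

    def __init__(self, length):
        self._length = length

    def __len__(self):
        return self._length

    def frame(self, index):
        # A real sequence would return a frame object here.
        return "frame-%d" % index


class FrameIteratorSketch:
    """Hypothetical iterator that walks the frames in order."""

    def __init__(self, sequence):
        self._sequence = sequence
        self._position = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Stop once every frame has been visited.
        if self._position >= len(self._sequence):
            raise StopIteration
        frame = self._sequence.frame(self._position)
        self._position += 1
        return frame


frames = list(FrameIteratorSketch(CountingSequence(3)))
```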
- vot.dataset.download_bundle(url: str, path: str = '.')
Downloads a dataset bundle as a ZIP file and decompresses it.
- Parameters
url (str) – Source bundle URL
path (str, optional) – Destination directory. Defaults to “.”.
- Raises
DatasetException – If the bundle cannot be downloaded or is not supported.
- vot.dataset.download_dataset(url: str, path: str)
Downloads a dataset from a given url or an alias.
- Parameters
url (str) – URL to the data bundle or metadata description file
path (str) – Destination directory
- Raises
DatasetException – If the dataset is not found or a network error occurred
- vot.dataset.load_dataset(path: str) Dataset
Loads a dataset from a local directory.
- Parameters
path (str) – The path to the local dataset data
- Raises
DatasetException – When a folder does not exist or the format is not recognized.
- Returns
Dataset object
- Return type
Dataset
- vot.dataset.load_sequence(path: str) Sequence
Loads a sequence from a given path (directory), tries to guess the format of the sequence.
- Parameters
path (str) – The path to the local sequence data
- Raises
DatasetException – If a loading error occurs, the format is unsupported, or another issue arises.
- Returns
Sequence object
- Return type
Sequence
This module contains functionality for reading sequences from storage using the VOT-compatible format.
- vot.dataset.common.convert_int(value: str) int
Converts the given value to an integer. If the value is not a valid integer, None is returned.
- Parameters
value (str) – The value to convert.
- Returns
The converted value or None if the value is not a valid integer.
- Return type
int
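The convert-or-None behaviour described for convert_int can be sketched in a few lines. `convert_int_sketch` is a hypothetical name and this is one plausible reading of the description, not the toolkit's source.

```python
from typing import Optional


def convert_int_sketch(value: str) -> Optional[int]:
    """Return the value as an int, or None if it cannot be parsed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # Not a valid integer literal (or not a string or number at all).
        return None


parsed = convert_int_sketch("42")
rejected = convert_int_sketch("abc")
```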
- vot.dataset.common.download_dataset_meta(url: str, path: str) None
Downloads the metadata of a dataset from a given URL and stores it in the given path.
- Parameters
url (str) – The URL to download the metadata from.
path (str) – The path to store the metadata in.
- vot.dataset.common.list_sequences(path: str) None
Indexes the sequences in the given path. Only works if there is a list.txt file in the given path or the path is a list file.
- Parameters
path (str) – The path to index sequences in.
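Indexing via a list.txt file, as described above, might be sketched as follows. The function name `list_sequences_sketch` is hypothetical. That the file holds one sequence name per line, and that the sketch returns the names as a list, are assumptions made for illustration.

```python
import os
import tempfile


def list_sequences_sketch(path):
    """Hypothetical helper: read sequence names from a list file."""
    # Accept either a directory containing list.txt or the list file itself.
    list_file = os.path.join(path, "list.txt") if os.path.isdir(path) else path
    if not os.path.isfile(list_file):
        return None
    with open(list_file, "r") as handle:
        # Assumption: one sequence name per line; blank lines are skipped.
        return [line.strip() for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "list.txt"), "w") as handle:
        handle.write("ball\ncar\n\n")
    from_directory = list_sequences_sketch(root)
    from_file = list_sequences_sketch(os.path.join(root, "list.txt"))
    missing = list_sequences_sketch(os.path.join(root, "absent"))
```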
- vot.dataset.common.read_sequence(path)
Reads a sequence from the given path.
- Parameters
path (str) – The path to read the sequence from.
- Returns
The sequence read from the given path.
- Return type
- vot.dataset.common.read_sequence_legacy(path)
Reads a sequence from the given path.
- Parameters
path (str) – The path to read the sequence from.
- Returns
The sequence read from the given path.
- Return type
- vot.dataset.common.write_sequence(directory: str, sequence: Sequence)
Writes a sequence to a directory. The sequence is written as a set of images in a directory structure corresponding to the channel names. The sequence metadata is written to a file called sequence in the root directory.
- Parameters
directory (str) – The directory to write the sequence to.
sequence (Sequence) – The sequence to write.
Extended dataset support
Many datasets are supported by the toolkit using special adapters.
### OTB
OTB dataset adapter module.
OTB is one of the earliest tracking benchmarks. It is a collection of 50/100 sequences with ground truth annotations. The dataset is available at http://cvlab.hanyang.ac.kr/tracker_benchmark/datasets.html.
- vot.dataset.otb.download_otb100(path: str)
Downloads OTB100 dataset to the given path.
- Parameters
path (str) – Path to the dataset folder.
- vot.dataset.otb.download_otb50(path: str)
Downloads OTB50 dataset to the given path.
- Parameters
path (str) – Path to the dataset folder.
- vot.dataset.otb.read_sequence(path: str)
Reads a sequence from OTB dataset. The sequence is identified by the name of the folder and the groundtruth_rect.txt file is expected to be present in the folder.
- Parameters
path (str) – Path to the sequence folder.
- Returns
The sequence object.
- Return type
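Parsing a groundtruth_rect.txt file could be sketched as below. The sketch assumes one rectangle per line as `x, y, width, height`, separated by commas, tabs, or spaces. `parse_otb_rectangles` is a hypothetical helper, and it returns plain tuples rather than the toolkit's region objects.

```python
import re


def parse_otb_rectangles(text):
    """Hypothetical parser for the contents of groundtruth_rect.txt."""
    rectangles = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: fields may be separated by commas or whitespace.
        parts = [part for part in re.split(r"[,\s]+", line) if part]
        if len(parts) != 4:
            raise ValueError("Expected 4 values per line, got %d" % len(parts))
        rectangles.append(tuple(float(part) for part in parts))
    return rectangles


sample = "198,214,34,81\n197\t214\t34\t81\n"
rectangles = parse_otb_rectangles(sample)
```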
### GOT10k
GOT-10k dataset adapter module.
The format of GOT-10k dataset is very similar to a subset of VOT, so there is a lot of code duplication.
- vot.dataset.got10k.load_channel(source)
Load channel from the given source.
- Parameters
source (str) – Path to the source. If the source is a directory, it is assumed to be a pattern file list. If the source is a file, it is assumed to be a video file.
- Returns
Channel object.
- Return type
- vot.dataset.got10k.read_sequence(path)
Read GOT-10k sequence from the given path.
- Parameters
path (str) – Path to the sequence.
### TrackingNet
Dataset adapter for the TrackingNet dataset.
Note that the dataset is organized differently from the VOT datasets: annotated frames are stored in a separate directory. The dataset also contains train and test splits. The loader assumes that only one of the splits is used at a time and that the given path points to that split.
- vot.dataset.trackingnet.list_sequences(path)
List sequences in the given path. The path is expected to be the root of the TrackingNet dataset split.
- Parameters
path (str) – Path to the dataset root.
- Returns
List of sequences.
- Return type
list
- vot.dataset.trackingnet.load_channel(source)
Load channel from the given source.
- Parameters
source (str) – Path to the source. If the source is a directory, it is assumed to be a pattern file list. If the source is a file, it is assumed to be a video file.
- Returns
Channel object.
- Return type
- vot.dataset.trackingnet.read_sequence(path)
Read sequence from the given path. Unlike in VOT datasets, the sequence is not a directory but a file. The sequence name is extracted from the file name, and the path to the image frames is inferred from the standard TrackingNet directory structure.
- Parameters
path (str) – Path to the sequence groundtruth.
- Returns
Sequence object.
- Return type
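The name and frame-path inference described for the TrackingNet reader could look roughly like this. The `anno` and `frames` directory names are an assumption about the split layout, and `infer_trackingnet_paths` and the example file name are hypothetical.

```python
from pathlib import PurePosixPath


def infer_trackingnet_paths(groundtruth_file):
    """Hypothetical helper: derive sequence name and frame directory."""
    annotation = PurePosixPath(groundtruth_file)
    # The sequence name is the annotation file name without its extension.
    name = annotation.stem
    # Assumption: <split>/anno/<name>.txt sits next to <split>/frames/<name>/.
    split_root = annotation.parent.parent
    frames = split_root / "frames" / name
    return name, str(frames)


name, frames = infer_trackingnet_paths("TRAIN_0/anno/video_001.txt")
```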