|
vocabtree
0.0.1
|
The Dataset class is an abstract wrapper describing a dataset.
#include <dataset.hpp>

Public Member Functions | |
| Dataset (const std::string &base_location) | |
| Constructs a dataset given a base location. | |
| Dataset (const std::string &base_location, const std::string &db_data_location) | |
| Loads a dataset from the db_data_location. | |
| virtual | ~Dataset () |
| virtual bool | write (const std::string &db_data_location)=0 |
| Writes the dataset mapping to the input data location. | |
| virtual bool | read (const std::string &db_data_location)=0 |
| Reads the dataset mapping from the input data location. | |
| virtual std::shared_ptr< Image > | image (uint64_t id) const =0 |
| Given a unique integer ID, returns an Image associated with that ID. | |
| virtual uint64_t | num_images () const =0 |
| Returns the number of images in the dataset. | |
| std::string | location () const |
| Returns the absolute path of the data directory. | |
| std::string | location (const std::string &relative_path) const |
| Returns the absolute path of the file (appends the file path to the database path). | |
| virtual bool | add_image (const std::shared_ptr< const Image > &image)=0 |
| Adds the given image to the database. If there is an id collision, the image is not added and false is returned; otherwise returns true. | |
| std::vector< std::shared_ptr < const Image > > | all_images () const |
| Returns a vector of all images in the dataset. | |
| std::vector< std::shared_ptr < const Image > > | random_images (size_t count) const |
| Returns a vector of count random images from the dataset. | |
| std::vector< Dataset > | shard (const std::vector< std::string > &new_locations) |
| Shards the dataset to the new input locations and returns the sharded datasets. | |
Protected Attributes | |
| std::string | data_directory |
The Dataset class is an abstract wrapper describing a dataset.
A dataset consists of the actual data, plus a way to map the images, or the frames of a video, to integer indices. The dataset should at minimum provide an easy way to map image paths to unique integers. For a sample implementation of a Dataset see the SimpleDataset class.
Combined with the Image class implementation, a Dataset + Image provides a way to find relevant paths for features and images. Note that an implementation of a Dataset or Image class should store paths to the Image data relative to the data directory, so that the absolute path remains interchangeable.
Definition at line 17 of file dataset.hpp.
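The interface described above can be illustrated with a minimal sketch. Everything in it is a stand-in written for this example: the reduced Image and Dataset types only mirror the member signatures listed on this page, and InMemoryDataset is a hypothetical subclass, not the library's SimpleDataset. Returning a null pointer for an unknown id is an assumption; this page does not say what image() does in that case.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Stand-in for the library's Image class; only an id is assumed here.
struct Image {
    uint64_t id;
    explicit Image(uint64_t id_) : id(id_) {}
    virtual ~Image() = default;
};

// Reduced stand-in that mirrors the pure virtual members listed on this page.
class Dataset {
public:
    explicit Dataset(const std::string &base_location) : data_directory(base_location) {}
    virtual ~Dataset() = default;
    virtual bool write(const std::string &db_data_location) = 0;
    virtual bool read(const std::string &db_data_location) = 0;
    virtual std::shared_ptr<Image> image(uint64_t id) const = 0;
    virtual uint64_t num_images() const = 0;
    virtual bool add_image(const std::shared_ptr<const Image> &image) = 0;
    std::string location() const { return data_directory; }

protected:
    std::string data_directory;
};

// Hypothetical subclass that keeps the id -> image mapping in memory.
class InMemoryDataset : public Dataset {
public:
    using Dataset::Dataset;

    // Nothing is persisted in this sketch, so both calls trivially succeed.
    bool write(const std::string &) override { return true; }
    bool read(const std::string &) override { return true; }

    // Returns a copy of the stored image, or nullptr if the id is unknown
    // (assumed behaviour, not documented on this page).
    std::shared_ptr<Image> image(uint64_t id) const override {
        auto it = images_.find(id);
        if (it == images_.end()) return nullptr;
        return std::make_shared<Image>(it->second->id);
    }

    uint64_t num_images() const override { return images_.size(); }

    // Rejects null pointers and id collisions by returning false.
    bool add_image(const std::shared_ptr<const Image> &image) override {
        if (!image) return false;
        return images_.emplace(image->id, image).second;
    }

private:
    std::map<uint64_t, std::shared_ptr<const Image>> images_;
};
```

A real implementation would also persist the mapping in write() and restore it in read(); the sketch leaves that out to keep the id-to-image contract visible.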
| Dataset::Dataset | ( | const std::string & | base_location | ) |
Constructs a dataset given a base location.
An example base location might be /c/data/. Given this base location, an implementation of the Dataset should find all the data and construct a mapping between the data and the id, for example by searching through base_location + /images/.
Definition at line 7 of file dataset.cxx.
References data_directory.
| Dataset::Dataset | ( | const std::string & | base_location, |
| const std::string & | db_data_location | ||
| ) |
Loads a dataset from the db_data_location.
The base_location provides the absolute path of the data.
Definition at line 11 of file dataset.cxx.
References data_directory.
| Dataset::~Dataset | ( | ) | virtual |
Definition at line 15 of file dataset.cxx.
| virtual bool Dataset::add_image | ( | const std::shared_ptr< const Image > & | image | ) | pure virtual |
Adds the given image to the database. If there is an id collision, the image is not added and false is returned; otherwise returns true.
Implemented in SimpleDataset.
| std::vector< std::shared_ptr< const Image > > Dataset::all_images | ( | ) | const |
Returns a vector of all images in the dataset.
Definition at line 26 of file dataset.cxx.
References image(), and num_images().
Referenced by compute_bow(), compute_bow_features(), random_images(), and train_index().
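Since all_images() is listed as referencing image() and num_images(), one plausible shape for it is a loop over ids. The sketch below assumes ids are dense and run from 0 to num_images() - 1, which this page does not state; Image, ToyDataset, and collect_all_images are stand-ins invented for the example.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Stand-in for the library's Image class.
struct Image {
    uint64_t id;
};

// Hypothetical helper: works with any type exposing num_images() and image(id).
template <typename DatasetLike>
std::vector<std::shared_ptr<const Image>> collect_all_images(const DatasetLike &dataset) {
    std::vector<std::shared_ptr<const Image>> images;
    images.reserve(dataset.num_images());
    // Assumes ids are the dense range [0, num_images()).
    for (uint64_t id = 0; id < dataset.num_images(); ++id) {
        // shared_ptr<Image> converts implicitly to shared_ptr<const Image>.
        images.push_back(dataset.image(id));
    }
    return images;
}

// Toy dataset used only to exercise the helper.
struct ToyDataset {
    uint64_t n;
    uint64_t num_images() const { return n; }
    std::shared_ptr<Image> image(uint64_t id) const {
        return std::make_shared<Image>(Image{id});
    }
};
```

If ids were sparse, the loop would need to iterate over the stored keys instead of a counter.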
| virtual std::shared_ptr< Image > Dataset::image | ( | uint64_t | id | ) | const | pure virtual |
Given a unique integer ID, returns an Image associated with that ID.
Implemented in SimpleDataset.
Referenced by MatchesPage::add_match(), all_images(), benchmark_dataset(), compute_features(), InvertedIndex::search(), VocabTree::train(), and BagOfWords::train().
| std::string Dataset::location | ( | ) | const |
Returns the absolute path of the data directory.
Definition at line 17 of file dataset.cxx.
References data_directory.
Referenced by MatchesPage::add_match(), bench_oxford(), benchmark_dataset(), compute_bow(), compute_bow_features(), compute_features(), main(), operator<<(), VocabTree::search(), InvertedIndex::search(), VocabTree::train(), BagOfWords::train(), InvertedIndex::train(), train_bow(), and train_tree().
| std::string Dataset::location | ( | const std::string & | relative_path | ) | const |
Returns the absolute path of the file (appends the file path to the database path).
Definition at line 21 of file dataset.cxx.
References data_directory.
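The path-appending behaviour can be sketched as a small free function. The name join_location and the separator handling are assumptions for illustration; the library may simply concatenate the two strings.

```cpp
#include <string>

// Hypothetical helper: appends relative_path to data_directory with exactly
// one '/' between them, whichever side already carries a separator.
std::string join_location(const std::string &data_directory,
                          const std::string &relative_path) {
    if (data_directory.empty()) return relative_path;

    const bool dir_has_sep = data_directory.back() == '/';
    const bool rel_has_sep = !relative_path.empty() && relative_path.front() == '/';

    if (dir_has_sep && rel_has_sep) return data_directory + relative_path.substr(1);
    if (!dir_has_sep && !rel_has_sep) return data_directory + "/" + relative_path;
    return data_directory + relative_path;
}
```

With the example base location /c/data/, join_location("/c/data/", "images/a.jpg") yields /c/data/images/a.jpg.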
| virtual uint64_t Dataset::num_images | ( | ) | const | pure virtual |
Returns the number of images in the dataset.
Implemented in SimpleDataset.
Referenced by all_images(), benchmark_dataset(), compute_features(), operator<<(), and InvertedIndex::search().
| std::vector< std::shared_ptr< const Image > > Dataset::random_images | ( | size_t | count | ) | const |
Returns a vector of count random images from the dataset.
Definition at line 34 of file dataset.cxx.
References all_images().
Referenced by compute_bow(), main(), train_bow(), and train_tree().
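Because random_images() is listed as referencing all_images(), one way to picture it is shuffling the full list and keeping the first count entries. The sketch below does that for any element type. Sampling without replacement, clamping count to the available size, and the name sample_without_replacement are all assumptions; this page does not say how the library behaves in those respects.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns up to `count` distinct elements in random order.
// `items` is taken by value so the caller's vector is left untouched.
template <typename T>
std::vector<T> sample_without_replacement(std::vector<T> items, std::size_t count,
                                          std::mt19937 &rng) {
    std::shuffle(items.begin(), items.end(), rng);
    if (count < items.size()) {
        // Keep only the first `count` shuffled elements.
        items.erase(items.begin() + static_cast<std::ptrdiff_t>(count), items.end());
    }
    return items;
}
```

For a dataset, the input would be the result of all_images(); passing the generator in explicitly makes the selection reproducible when a fixed seed is used.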
| virtual bool Dataset::read | ( | const std::string & | db_data_location | ) | pure virtual |
Reads the dataset mapping from the input data location.
Returns true if successful, false otherwise.
Implemented in SimpleDataset.
| std::vector<Dataset> Dataset::shard | ( | const std::vector< std::string > & | new_locations | ) |
Shards the dataset to the new input locations and returns the sharded datasets.
| virtual bool Dataset::write | ( | const std::string & | db_data_location | ) | pure virtual |
Writes the dataset mapping to the input data location.
Returns true if successful, false otherwise.
Implemented in SimpleDataset.
| std::string Dataset::data_directory | protected |
Definition at line 66 of file dataset.hpp.
Referenced by SimpleDataset::construct_dataset(), Dataset(), and location().