Index

In order to access records both in DB and storage, an implementation of Indexes is needed. An Indexes instance can take a specification of a set of records in the form of an Indexes.IndexSpec and turn it into a specification to access those records either in the DB or in storage.

An Indexes instance also keeps track of which records are currently cached, and maintains its constituent indexes.

class tablecache.UnsupportedIndexOperation

Raised to signal that a certain operation is not supported on an index.

class tablecache.Adjustment

A specification of an adjustment to be made to the cache.

Specifies records that should be expired from the cache’s storage, as well as ones that should be loaded from the DB and put into storage.

The records specified via the expire_spec need not necessarily exist in storage. Likewise, ones specified via load_spec may already exist. Setting either to None signals that no records should be expired or loaded, respectively.

The observe_expired() and observe_loaded() methods are callbacks that should be called with expired and loaded records as the adjustment is applied. This may be used to maintain information about which records exist for the index.

Subclasses should define a __repr__() which describes the changes to be made. This will be used in logging.

__init__(expire_spec, load_spec)
Parameters:
  • expire_spec (StorageRecordsSpec | None) – Specification of records that should be expired. May be None to indicate nothing should be expired.

  • load_spec (DbRecordsSpec | None) – Specification of records that should be loaded. May be None to indicate nothing should be loaded.

Return type:

None

observe_expired(record)

Observe a record being expired.

Used to store any information needed to maintain the index.

It’s valid for the same record to be observed being loaded again afterwards.

Parameters:

record (Record) – The record that was expired.

Return type:

None

observe_loaded(record)

Observe a record being loaded.

Used to store any information needed to maintain the index.

It’s valid to observe a record being loaded that was previously observed being expired, as well as observe records that have already been loaded.

Parameters:

record (Record) – The record that was loaded.

Return type:

None
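As a rough illustration of the two callbacks, the following sketch shows an adjustment that remembers primary keys as records are expired and loaded. The Adjustment base class here is a local stand-in so that the snippet runs without tablecache. The attribute names, the KeyTrackingAdjustment class, and the "id" record field are assumptions made for illustration only.

```python
from typing import Any, Optional


class Adjustment:
    """Local stand-in for tablecache.Adjustment (illustration only)."""

    def __init__(self, expire_spec: Optional[Any], load_spec: Optional[Any]) -> None:
        self.expire_spec = expire_spec
        self.load_spec = load_spec

    def observe_expired(self, record: dict) -> None:
        pass

    def observe_loaded(self, record: dict) -> None:
        pass


class KeyTrackingAdjustment(Adjustment):
    """Hypothetical subclass that records which primary keys changed."""

    def __init__(self, expire_spec: Optional[Any], load_spec: Optional[Any]) -> None:
        super().__init__(expire_spec, load_spec)
        self.expired_keys: set = set()
        self.loaded_keys: set = set()

    def observe_expired(self, record: dict) -> None:
        key = record["id"]  # assumed primary key field
        self.loaded_keys.discard(key)
        self.expired_keys.add(key)

    def observe_loaded(self, record: dict) -> None:
        # A record may be loaded after being expired, or loaded twice.
        key = record["id"]
        self.expired_keys.discard(key)
        self.loaded_keys.add(key)

    def __repr__(self) -> str:
        # Describes the changes to be made, for use in logging.
        return f"adjustment expiring {self.expire_spec!r} and loading {self.load_spec!r}"
```

Using sets keeps the callbacks idempotent, so observing the same record more than once leaves the tracked state unchanged.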

class tablecache.RecordScorer

Score calculator for a set of indexes.

Provides a way to calculate the scores of records in a number of indexes. Scores are orderable values (most likely some kind of number) that give records a place in an index and make it possible to query many records quickly using a range of scores. Scores need not be unique (although it’s better to avoid too many collisions).

Every record has a primary key that uniquely identifies it and that can be extracted from the record using primary_key().

This is the limited interface required by implementations of StorageTable, but it’s probably best implemented as part of an Indexes.

abstract property index_names: frozenset[str]

Return names of all indexes.

These are the names of all the indexes for which scores can be calculated. Never empty.

abstract primary_key(record)

Extract the primary key from a record.

Parameters:

record (Record) – The record to extract the primary key from.

Raises:

ValueError – If the primary key is missing or otherwise invalid.

Return type:

PrimaryKey

abstract score(index_name, record)

Calculate a record’s score for an index.

Parameters:
  • index_name (str) – Name of the index to calculate the score for.

  • record (Record) – The record to calculate the score for.

Raises:

ValueError – If the given index doesn’t exist.

Return type:

Score
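The interface above can be sketched as follows. The abstract base class is a local stand-in that mirrors the documented members rather than the library class itself. UserScorer, the index names, and the record fields "id" and "age" are hypothetical.

```python
import abc
from typing import Any, Mapping

Record = Mapping[str, Any]  # assumed record shape for this sketch


class RecordScorer(abc.ABC):
    """Local stand-in mirroring the documented interface."""

    @property
    @abc.abstractmethod
    def index_names(self) -> frozenset:
        ...

    @abc.abstractmethod
    def primary_key(self, record: Record) -> Any:
        ...

    @abc.abstractmethod
    def score(self, index_name: str, record: Record) -> Any:
        ...


class UserScorer(RecordScorer):
    """Hypothetical scorer with a primary key index and an age index."""

    @property
    def index_names(self) -> frozenset:
        return frozenset(["primary_key", "age"])

    def primary_key(self, record: Record) -> Any:
        try:
            return record["id"]
        except KeyError:
            raise ValueError("Record has no primary key.") from None

    def score(self, index_name: str, record: Record) -> Any:
        if index_name == "primary_key":
            return self.primary_key(record)
        if index_name == "age":
            # Not unique: many users share an age, which is allowed.
            return record["age"]
        raise ValueError(f"Index {index_name!r} doesn't exist.")
```

Because the age score is orderable, a range of scores such as 30 to 40 selects a contiguous slice of that index.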

class tablecache.Indexes

Bases: RecordScorer, Generic

A set of indexes used to access storage and DB tables.

This adds storage state information and ways to query a storage table and the DB to the RecordScorer interface. The purpose of this class is to tie its different indexes, their respective scoring and record access, together and potentially share information between them.

Provides a uniform way to specify a set of records to be queried from either storage or DB tables. This is done with storage_records_spec() and db_records_spec(), respectively.

Also keeps track of the set of records available from storage, as opposed to those that are only available via the DB. To this end, prepare_adjustment() is expected to be called before loading records into storage, and commit_adjustment() when the load is complete. From that point on, the state considers the records that it specified to load to be in storage. Further adjustments can be made later in order to change the records in storage.

covers() can be used to check whether a set of records is available from storage.

Methods for which a set of records needs to be specified (storage_records_spec(), db_records_spec(), covers(), and prepare_adjustment()) take an instance of the IndexSpec inner class. This encapsulates the way to specify a particular set of records for the particular implementation. Subclasses may define their own IndexSpec, but these must be inner classes and subclasses of IndexSpec (i.e. issubclass(MyIndexesImplementation.IndexSpec, Indexes.IndexSpec)). These should also include a __repr__() which describes which records are specified. This will be used in logging.

Some methods (covers(), prepare_adjustment(), and storage_records_spec()) may not be supported for every index. E.g., an index may only be meant for querying (i.e. support covers() and storage_records_spec()), but not for adjusting the indexes. In that case, the unsupported methods raise an UnsupportedIndexOperation. However, if covers() is supported, so is storage_records_spec().

If any method is called with the name of an index that doesn’t exist, a ValueError is raised.
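The inner-class requirement and the two error cases can be sketched like this. Indexes and UnsupportedIndexOperation are local stand-ins reduced to what the example needs. CityIndexes, its index names, the index_name attribute, and the string returned by prepare_adjustment() are assumptions for illustration (a real implementation returns an Adjustment).

```python
class UnsupportedIndexOperation(Exception):
    """Local stand-in for tablecache.UnsupportedIndexOperation."""


class Indexes:
    """Local stand-in for tablecache.Indexes."""

    class IndexSpec:
        def __init__(self, index_name: str) -> None:
            self.index_name = index_name


class CityIndexes(Indexes):
    """Hypothetical implementation with a query-only city index."""

    class IndexSpec(Indexes.IndexSpec):
        def __init__(self, index_name: str, *, city: str = "") -> None:
            super().__init__(index_name)
            self.city = city

        def __repr__(self) -> str:
            # Describes which records are specified, for use in logging.
            return f"records in index {self.index_name} with city {self.city!r}"

    index_names = frozenset(["primary_key", "city"])

    def prepare_adjustment(self, spec: "CityIndexes.IndexSpec") -> str:
        if spec.index_name not in self.index_names:
            raise ValueError(f"Index {spec.index_name!r} doesn't exist.")
        if spec.index_name == "city":
            raise UnsupportedIndexOperation("The city index is query-only.")
        # Placeholder result; the documented method returns an Adjustment.
        return f"adjust storage to {spec!r}"


# The inner class satisfies the documented subclass relationship.
assert issubclass(CityIndexes.IndexSpec, Indexes.IndexSpec)
```

Checking for an unknown index name before checking for support keeps the two failure modes distinct: ValueError for a name that doesn't exist, UnsupportedIndexOperation for one that does but can't perform the operation.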

class IndexSpec

Bases: object

Specification of a set of records in an index.

__init__(index_name)
Parameters:

index_name (str) –

Return type:

None

abstract commit_adjustment(adjustment)

Commits a prepared adjustment.

Takes an Adjustment previously returned from prepare_adjustment() and modifies internal state to reflect it. After the call, the indexes assume that the records that were specified to be deleted from storage are no longer covered, and likewise that those specified to be loaded are. Future calls to covers() will reflect that.

Parameters:

adjustment (Adjustment) – The adjustment that was previously prepared and should be committed now.

Return type:

None

abstract covers(spec)

Check whether the specified records are covered by storage.

Returns whether all of the records specified via the spec are in storage. This determination is based on previous calls to commit_adjustment().

May also return False if the records may be covered, but there isn’t enough information to be certain. This could happen when the Indexes are adjusted by a different index than this covers check is done with. E.g., if an adjustment containing a specific set of primary keys is committed and then a covers check is done for a range of primary keys, there may not be enough information to determine whether the set that was loaded contained all primary keys in the range.

A record may also be considered covered if it doesn’t exist. E.g., say records with primary keys between 0 and 10 were loaded into storage, but none even exists with primary key 5. Then that record is still covered by storage, and the cache doesn’t need to go to the DB to check if it exists.

The implementation may lie a bit about what is covered in the pursuit of performance. E.g., it may claim to cover records it technically can’t have seen, but which can’t be very old, trading exact consistency with the DB for eventual consistency in order to reduce the number of cache misses.

Parameters:

spec (IndexSpec) – A specification of the set of records that should be checked.

Raises:

UnsupportedIndexOperation – If the given index doesn’t support checking coverage.

Return type:

bool
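A minimal sketch of this coverage check for a single range-based index is shown below. RangeSpec and RangeCoverage are hypothetical names, and the half-open interval convention is an assumption of the sketch, not a statement about how any particular implementation stores its state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RangeSpec:
    """Hypothetical spec: primary keys in the half-open interval [ge, lt)."""

    ge: float
    lt: float


class RangeCoverage:
    """Tracks the one range that was most recently committed, if any."""

    def __init__(self) -> None:
        self._loaded: Optional[RangeSpec] = None

    def commit(self, spec: RangeSpec) -> None:
        self._loaded = spec

    def covers(self, spec: RangeSpec) -> bool:
        if self._loaded is None:
            # Nothing committed yet, so nothing is known to be covered.
            return False
        # Covered only if the requested range lies entirely inside the loaded one.
        return self._loaded.ge <= spec.ge and spec.lt <= self._loaded.lt


coverage = RangeCoverage()
coverage.commit(RangeSpec(ge=0, lt=10))
# Key 5 is covered whether or not a record with that key exists.
key_five_covered = coverage.covers(RangeSpec(ge=5, lt=6))
# A range reaching past the loaded one is not covered.
overhang_covered = coverage.covers(RangeSpec(ge=5, lt=11))
```

Here the coverage decision depends only on the committed range, never on which records were actually found, which is what allows a nonexistent record to count as covered.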

abstract db_records_spec(spec)

Specify records in the DB based on an index.

Like storage_records_spec(), but specifies the same set of records in the DB.

Parameters:

spec (IndexSpec) –

Return type:

DbRecordsSpec

abstract prepare_adjustment(spec)

Prepare an adjustment of which records are covered by the indexes.

Returns an Adjustment, which contains a StorageRecordsSpec of records to delete from storage and a DbRecordsSpec of ones to load from the DB in order to attain the state in which exactly the records specified via the spec are loaded.

This method only specifies what would need to change in order to adjust the indexes, but does not modify the internal state of the Indexes. However, a subclass of Adjustment may be returned that contains additional information needed in commit_adjustment(), as well as implementing Adjustment.observe_expired() and Adjustment.observe_loaded(). These will be called with all the records that were expired and loaded, and can store information needed to maintain the index.

Parameters:

spec (IndexSpec) – A specification of the set of records that should be in cache after the adjustment is done.

Raises:

UnsupportedIndexOperation – If adjusting by the given index is not supported.

Return type:

Adjustment
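The prepare, apply, and commit sequence might look roughly like the following toy model. All names here (ToyAdjustment, ToyIndexes, apply_adjustment) are hypothetical. Plain sets of primary keys stand in for StorageRecordsSpec and DbRecordsSpec, and dictionaries stand in for the DB and storage.

```python
class ToyAdjustment:
    """Carries what to expire and load, and observes the affected records."""

    def __init__(self, expire_keys: frozenset, load_keys: frozenset) -> None:
        self.expire_keys = expire_keys  # stands in for a StorageRecordsSpec
        self.load_keys = load_keys      # stands in for a DbRecordsSpec
        self.expired: set = set()
        self.loaded: set = set()

    def observe_expired(self, record: dict) -> None:
        self.expired.add(record["id"])

    def observe_loaded(self, record: dict) -> None:
        self.loaded.add(record["id"])


class ToyIndexes:
    def __init__(self) -> None:
        self._covered: frozenset = frozenset()

    def prepare_adjustment(self, wanted_keys) -> ToyAdjustment:
        # Only computes the difference; internal state is left untouched.
        wanted = frozenset(wanted_keys)
        return ToyAdjustment(self._covered - wanted, wanted - self._covered)

    def commit_adjustment(self, adjustment: ToyAdjustment) -> None:
        self._covered = (self._covered - adjustment.expire_keys) | adjustment.load_keys

    def covers(self, keys) -> bool:
        return frozenset(keys) <= self._covered


def apply_adjustment(indexes: ToyIndexes, wanted_keys, db: dict, storage: dict) -> ToyAdjustment:
    adjustment = indexes.prepare_adjustment(wanted_keys)
    for key in adjustment.expire_keys:
        record = storage.pop(key, None)  # the record need not exist in storage
        if record is not None:
            adjustment.observe_expired(record)
    for key in adjustment.load_keys:
        if key in db:
            storage[key] = db[key]
            adjustment.observe_loaded(db[key])
    # Coverage changes only once the load is complete.
    indexes.commit_adjustment(adjustment)
    return adjustment


db = {key: {"id": key} for key in (1, 2, 3)}
storage: dict = {}
indexes = ToyIndexes()
apply_adjustment(indexes, [1, 2], db, storage)
second = apply_adjustment(indexes, [2, 3], db, storage)
```

After the second call, storage holds exactly the records with keys 2 and 3. Unlike the simple range implementation documented further down, this toy expires and loads only the difference between the old and new sets.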

abstract storage_records_spec(spec)

Specify records in storage based on an index.

Parameters:

spec (IndexSpec) –

Raises:

UnsupportedIndexOperation – If the given index doesn’t support getting records from storage.

Returns:

A specification of the set of records in storage that matches spec.

Return type:

StorageRecordsSpec

class tablecache.AllIndexes

Very simple indexes loading everything.

Provides only a single index named all, which essentially doesn’t do anything. All operations load everything. The only available control is to specify a recheck_predicate as a filter, and that predicate is used only in storage_records_spec().

class IndexSpec
__init__(index_name, recheck_predicate=<function StorageRecordsSpec.always_use_record>)
Parameters:
  • recheck_predicate (RecheckPredicate) – A predicate used to filter records.

  • index_name (str) –

Return type:

None

__init__(primary_key_extractor, query_all_string)
Parameters:
  • primary_key_extractor (Callable[[Record], PrimaryKey]) – A function extracting the primary key from a record.

  • query_all_string (str) – A string to query all records from the DB.

Return type:

None
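To illustrate the idea of a recheck predicate as a filter, here is a small sketch. It assumes a predicate is a callable that takes a record and returns a bool, which the documentation above does not spell out. The function names and record fields are hypothetical, and the filtering loop only imitates what a storage layer might do with the predicate.

```python
from typing import Any, Callable, Iterable, List, Mapping

Record = Mapping[str, Any]


def always_use_record(record: Record) -> bool:
    """Stand-in for the default predicate: keeps every record."""
    return True


def filter_records(
    records: Iterable[Record],
    recheck_predicate: Callable[[Record], bool] = always_use_record,
) -> List[Record]:
    # Keep only the records for which the predicate holds.
    return [record for record in records if recheck_predicate(record)]


users = [{"id": 1, "active": True}, {"id": 2, "active": False}]
everyone = filter_records(users)
active_only = filter_records(users, lambda record: record["active"])
```

Since the predicate applies only when reading from storage, it narrows what a query returns but not what gets loaded.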

class tablecache.PrimaryKeyIndexes

Simple indexes for only selected primary keys.

An index capable of loading either everything, or a select set of primary keys. Only the primary_key index is supported. Scores are the primary key’s hash, so any hashable value works as a key. Only a single primary key attribute is supported.

The implementation is very basic and likely only useful for testing and demonstration. Issues in practice could be:

  • In storage_records_spec(), one interval is included for every primary key, which makes no use of fast access to storage and is likely slow.

  • When loading select keys, all of them are stored in a set, which can get big.

class IndexSpec
__init__(index_name, *primary_keys, all_primary_keys=False)
Parameters:
  • index_name (str) – Must be primary_key.

  • primary_keys (PrimaryKey) – Individual primary keys to specify. Mutually exclusive with all_primary_keys.

  • all_primary_keys (bool) – Whether to specify all primary keys. Mutually exclusive with primary_keys.

__init__(primary_key_extractor, query_all_string, query_some_string)
Parameters:
  • primary_key_extractor (Callable[[Record], PrimaryKey]) – A function extracting the primary key from a record.

  • query_all_string (str) – A query string used to query all records in the DB. Will be used without parameters.

  • query_some_string (str) – A query string used to query only a selection of primary keys. Will be used with a single parameter, which is a tuple of the primary keys. Essentially, the query will have to include something like WHERE primary_key = ANY($1).

Return type:

None
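The three constructor arguments might be prepared as in the sketch below. The table name users and the column user_id are hypothetical, and records are assumed to support item access by column name. The constructor call is left as a comment so the snippet runs without tablecache installed.

```python
import operator

# Extracts the (assumed) primary key column from a mapping-like record.
primary_key_extractor = operator.itemgetter("user_id")

# Used without parameters to load everything.
query_all_string = "SELECT * FROM users"

# Used with one parameter: the collection of primary keys to load.
query_some_string = "SELECT * FROM users WHERE user_id = ANY($1)"

# With tablecache available, the arguments are passed in the documented order:
# indexes = tablecache.PrimaryKeyIndexes(
#     primary_key_extractor, query_all_string, query_some_string)

example_key = primary_key_extractor({"user_id": 3, "name": "Ada"})
```

Because scores are derived from the key's hash, the extractor should return a hashable value such as an int or a str.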

class tablecache.PrimaryKeyRangeIndexes

Simple indexes for a range of primary keys.

An index capable of loading a range of primary keys. Only the primary_key index is supported. Primary keys must be numbers.

Ranges of primary keys are specified as an inclusive lower bound (ge) and an exclusive upper bound (lt) (greater-equal and less-than).

The implementation is quite simple, and adjustments will always expire all current data and load the entire requested data set, even if they overlap substantially.

class IndexSpec
__init__(index_name, *, ge, lt)
Parameters:
  • index_name (str) – Must be primary_key.

  • ge (Real) – Lower (inclusive) bound.

  • lt (Real) – Upper (exclusive) bound.

__init__(primary_key_extractor, query_range_string)
Parameters:
  • primary_key_extractor (Callable[[Record], PrimaryKey]) – A function extracting the primary key from a record.

  • query_range_string (str) – A query string used to query a range of records in the DB. Will be used with 2 parameters, the lower inclusive bound and the upper exclusive bound. That means the query will likely have to contain something like WHERE primary_key >= $1 AND primary_key < $2.

Return type:

None
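The bound semantics can be checked with a short sketch. The helper in_range() is hypothetical and only mirrors the ge and lt comparison in plain Python. The table name users and the column user_id in the query string are likewise assumptions.

```python
def in_range(primary_key: float, *, ge: float, lt: float) -> bool:
    """Mirror the documented bounds: ge is inclusive, lt is exclusive."""
    return ge <= primary_key < lt


# Hypothetical query string matching the same bounds on the DB side.
query_range_string = "SELECT * FROM users WHERE user_id >= $1 AND user_id < $2"

# Adjacent ranges share a boundary value without overlapping:
# key 10 belongs to [10, 20) and not to [0, 10).
first_has_ten = in_range(10, ge=0, lt=10)
second_has_ten = in_range(10, ge=10, lt=20)
```

With half-open ranges, consecutive specs such as ge=0, lt=10 followed by ge=10, lt=20 partition the key space, so no key is requested twice or skipped.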