This document provides an overview of the assumptions and limitations of the driver's time handling, explains the reasoning behind them, and describes approaches to working with these types.
Timestamps in Cassandra are timezone-naive timestamps encoded as milliseconds since the UNIX epoch. Clients working with timestamps in this database usually find it easiest to reason about them if they are always assumed to be UTC. To quote the pytz documentation, “The preferred way of dealing with times is to always work in UTC, converting to localtime only when generating output to be read by humans.” The driver adheres to this tenet, and assumes UTC is always in the database. The driver normalizes values to UTC on the way in, and returns timezone-naive values on the way out.
When inserting timestamps, the driver handles serialization for the write path as follows:
If the input is a datetime.datetime, the serialization is normalized by starting with the utctimetuple() of the value.
- If the datetime object is timezone-aware, the timestamp is shifted, and represents the UTC timestamp equivalent.
- If the datetime object is timezone-naive, this results in no shift: any datetime with no timezone information is assumed to be UTC.
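The normalization described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the driver's actual serializer; the helper name to_epoch_millis is hypothetical.

```python
import calendar
from datetime import datetime, timedelta, timezone


def to_epoch_millis(dt):
    """Hypothetical helper: convert a datetime to milliseconds since the UNIX epoch.

    utctimetuple() shifts timezone-aware values to UTC and leaves
    timezone-naive values untouched, so naive input is treated as UTC.
    """
    seconds = calendar.timegm(dt.utctimetuple())
    return seconds * 1000 + dt.microsecond // 1000


# A timezone-aware value at UTC-05:00 and its UTC equivalent.
aware = datetime(2024, 1, 15, 7, 0, tzinfo=timezone(timedelta(hours=-5)))
naive_utc = datetime(2024, 1, 15, 12, 0)

# Both describe the same instant, so they encode to the same integer.
same_instant = to_epoch_millis(aware) == to_epoch_millis(naive_utc)

# A naive value is not shifted: 07:00 is encoded as 07:00 UTC.
naive_local = datetime(2024, 1, 15, 7, 0)
shift_hours = (to_epoch_millis(naive_utc) - to_epoch_millis(naive_local)) // 3_600_000
```

Here same_instant is true because the aware value is shifted by its offset, while shift_hours shows the five-hour discrepancy that appears when a naive local time is passed in unchanged.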
Note the second point above applies even to “local” times created using datetime.now():

    >>> from datetime import datetime
    >>> d = datetime.now()
    >>> print(d.tzinfo)
    None
These do not contain timezone information intrinsically, so they will be assumed to be UTC and not shifted. When generating timestamps in the application, it is clearer to use datetime.utcnow() to be explicit about it.
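As a small sketch of generating UTC timestamps explicitly: datetime.utcnow() is deprecated as of Python 3.12, and the standard-library alternative is datetime.now(timezone.utc). The variable names below are illustrative only. Under the write-path rules described above, the aware and the naive-UTC forms should both represent the same instant.

```python
from datetime import datetime, timedelta, timezone

# Timezone-aware "now" in UTC; the offset is explicit in the object.
now_aware = datetime.now(timezone.utc)

# The same instant as a timezone-naive value that is UTC by construction.
now_naive_utc = now_aware.replace(tzinfo=None)

# In contrast, datetime.now() is naive local time and carries no offset.
now_local = datetime.now()
```

Passing now_local to the driver on a machine whose local timezone is not UTC would store a value offset from the real instant, which is the pitfall the text warns about.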
If the input for a timestamp is numeric, it is assumed to be an epoch-relative millisecond timestamp, as specified in the CQL spec; no scaling or conversion is done.
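Because no scaling is applied to numeric input, the unit matters. The following sketch, using a hypothetical decoding helper from_epoch_millis, shows what happens when an epoch value in seconds is mistaken for milliseconds.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # naive, interpreted as UTC


def from_epoch_millis(millis):
    """Hypothetical helper: decode epoch milliseconds to a naive UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


seconds = 1_500_000_000  # an epoch timestamp expressed in seconds
millis = seconds * 1000  # the unit expected for numeric timestamp input

as_millis = from_epoch_millis(millis)    # the intended instant, in 2017
as_seconds = from_epoch_millis(seconds)  # same number misread as ms: January 1970
```

Values produced by time.time() are in seconds, so they need to be multiplied by 1000 (and converted to an integer) before being used as a numeric timestamp.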
The driver always assumes persisted timestamps are UTC and makes no attempt to localize them. Returned values are timezone-naive datetime.datetime. We follow this approach because the datetime API has deficiencies around daylight saving time, and the de facto package for handling this is a third-party package (we try to minimize external dependencies and not make decisions for the integrator).
The decision for how to handle timezones is left to the application. For the most part it is straightforward to apply localization to the datetimes returned by queries. One prevalent method is to use pytz for localization:

    import pytz

    user_tz = pytz.timezone('US/Central')
    timestamp_naive = row.ts  # naive datetime returned by a query
    timestamp_utc = pytz.utc.localize(timestamp_naive)  # attach UTC
    timestamp_presented = timestamp_utc.astimezone(user_tz)  # convert for display

This is the most robust approach (likely refactored into a function). If it is deemed too cumbersome to apply at all call sites in the application, it is possible to patch the driver with custom deserialization for this type. However, doing this depends somewhat on internal APIs and on which extensions are present, so we only mention the possibility and do not spell it out here.
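One way the localization step might be refactored into a function is sketched below. The helper name to_local is hypothetical, and the sketch uses only the standard library so that it is self-contained; it attaches UTC with replace() rather than pytz.utc.localize(), and accepts any tzinfo object for the target zone.

```python
from datetime import datetime, timedelta, timezone


def to_local(timestamp_naive, user_tz):
    """Hypothetical helper: present a naive UTC datetime in the user's timezone.

    user_tz may be any tzinfo object, for example a pytz or zoneinfo zone.
    """
    timestamp_utc = timestamp_naive.replace(tzinfo=timezone.utc)
    return timestamp_utc.astimezone(user_tz)


# Fixed UTC-06:00 offset used only to keep the example self-contained;
# it does not model daylight saving time.
fixed_offset = timezone(timedelta(hours=-6))
presented = to_local(datetime(2024, 1, 15, 18, 0), fixed_offset)
```

In an application, user_tz would more likely be a named zone such as pytz.timezone('US/Central'), so that daylight saving transitions are handled.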
Date and time in Cassandra are idealized markers, much like datetime.date and datetime.time in the Python standard library. Unlike these Python implementations, the Cassandra encoding supports much wider ranges. To accommodate these ranges without overflow, this driver returns these data in custom types.
When writing date or time values with simple (not prepared) statements, the input for each of these can be either a string literal or an encoded integer. See “Working with dates” or “Working with time” for details on the encoding or string formats.
The driver always returns custom types for date and time.

The driver returns a custom type for date in order to accommodate the wider range of values without overflow. For applications working within the supported range of [datetime.MINYEAR, datetime.MAXYEAR], these are easily converted to standard datetime.date instances.
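To illustrate why the standard datetime.date cannot hold every value, here is a minimal sketch. It assumes the raw value is available as a count of days relative to the UNIX epoch; the helper name days_to_date is hypothetical and is not part of the driver.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def days_to_date(days_from_epoch):
    """Hypothetical helper: convert days since the UNIX epoch to datetime.date.

    Returns None when the result falls outside
    [datetime.MINYEAR, datetime.MAXYEAR], where datetime.date would overflow.
    """
    try:
        return EPOCH + timedelta(days=days_from_epoch)
    except OverflowError:
        return None


in_range = days_to_date(365)         # one year after the epoch
too_late = days_to_date(3_000_000)   # beyond year 9999
too_early = days_to_date(-800_000)   # before year 1
```

Returning None is only one possible design; an application could equally raise its own error or keep the driver's custom type for out-of-range values.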
The driver returns a custom type for time in order to retain nanosecond precision stored in the database. For applications not concerned with this level of precision, these are easily converted to standard datetime.time instances.
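The precision trade-off can be sketched as follows. The example assumes a raw value expressed as nanoseconds since midnight; the helper name nanos_to_time is hypothetical. datetime.time resolves only to microseconds, so the last three digits are truncated.

```python
from datetime import time


def nanos_to_time(nanos_since_midnight):
    """Hypothetical helper: convert nanoseconds since midnight to datetime.time.

    Sub-microsecond digits are dropped because datetime.time stores
    microseconds at most.
    """
    seconds, nanos = divmod(nanos_since_midnight, 1_000_000_000)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return time(hour, minute, second, nanos // 1000)


# 12:34:56.123456789 expressed as nanoseconds since midnight.
raw = 45_296_123_456_789
converted = nanos_to_time(raw)
lost_nanos = raw % 1000  # digits that datetime.time cannot represent
```

Applications that need the full nanosecond value should keep working with the type returned by the driver instead of converting.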