ScyllaDB Python Driver is available under the Apache v2 License. ScyllaDB Python Driver is a fork of DataStax Python Driver. See Copyright here.
Cassandra 2.0+ offers support for automatic query paging. Starting with version 2.0 of the driver, if protocol_version is greater than 2 (it is by default), queries returning large result sets will be automatically paged.
By default, Session.default_fetch_size controls how many rows will be fetched per page. This can be overridden per-query by setting fetch_size on a Statement. By default, each page will contain at most 5000 rows.
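The precedence described above can be sketched in plain Python. This is only an illustration of the rule "a per-statement fetch_size overrides the session default", not the driver's own code; the `effective_fetch_size` function and the `_UNSET` sentinel are hypothetical names introduced for this example.

```python
# Hypothetical sentinel standing in for "no fetch_size given on the statement".
_UNSET = object()


def effective_fetch_size(session_default, statement_fetch_size=_UNSET):
    """Return the page size that applies to a single query.

    A per-statement value, when given, wins over the session-wide default.
    """
    if statement_fetch_size is _UNSET:
        return session_default
    return statement_fetch_size


print(effective_fetch_size(5000))      # only the session default is set
print(effective_fetch_size(5000, 10))  # the statement overrides the default
```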
Whenever the number of result rows for a query exceeds the page size, an instance of PagedResult will be returned instead of a normal list. This class implements the iterator interface, so you can treat it like a normal iterator over rows:
    from cassandra.query import SimpleStatement

    query = "SELECT * FROM users"  # users contains 100 rows
    statement = SimpleStatement(query, fetch_size=10)
    for user_row in session.execute(statement):
        process_user(user_row)
Whenever there are no more rows in the current page, the next page will be fetched transparently. However, note that it is possible for an Exception to be raised while fetching the next page, just like you might see on a normal call to session.execute().
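To make the failure mode concrete, here is a small simulation that runs without a cluster. It is a sketch of lazy page fetching in general, not the driver's implementation: `iter_pages`, `make_flaky_source`, and the `fetch_page(state)` callable are hypothetical stand-ins. The point it shows is that an error can surface in the middle of a `for` loop, after rows from earlier pages have already been processed.

```python
def iter_pages(fetch_page):
    """Yield rows one at a time, requesting the next page only when needed.

    fetch_page(state) is a hypothetical callable returning (rows, next_state);
    next_state is None when there are no more pages.
    """
    state = None
    while True:
        rows, state = fetch_page(state)  # may raise, like any query
        for row in rows:
            yield row
        if state is None:
            return


def make_flaky_source(pages, fail_on_page):
    """Build a fake fetch_page that raises when asked for page `fail_on_page`."""
    def fetch_page(state):
        index = 0 if state is None else state
        if index == fail_on_page:
            raise RuntimeError("simulated failure fetching page %d" % index)
        next_state = index + 1 if index + 1 < len(pages) else None
        return pages[index], next_state
    return fetch_page


processed = []
error = None
try:
    source = make_flaky_source([[1, 2], [3, 4], [5, 6]], fail_on_page=2)
    for row in iter_pages(source):
        processed.append(row)
except RuntimeError as exc:
    error = exc  # rows from the first two pages were already processed

print(processed, error)
```

With the real driver the same shape applies: wrap the iteration itself, not just the initial execute call, in whatever error handling the application needs.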
If you use Session.execute_async() along with ResponseFuture.result(), the first page will be fetched before result() returns, but later pages will be transparently fetched synchronously while iterating the result.
If callbacks are attached to a query that returns a paged result, the callback will be called once per page with a normal list of rows. Use ResponseFuture.has_more_pages and ResponseFuture.start_fetching_next_page() to continue fetching pages. For example:
    from threading import Event

    class PagedResultHandler(object):

        def __init__(self, future):
            self.error = None
            self.finished_event = Event()
            self.future = future
            self.future.add_callbacks(
                callback=self.handle_page,
                errback=self.handle_error)

        def handle_page(self, rows):
            for row in rows:
                process_row(row)

            if self.future.has_more_pages:
                self.future.start_fetching_next_page()
            else:
                self.finished_event.set()

        def handle_error(self, exc):
            self.error = exc
            self.finished_event.set()

    future = session.execute_async("SELECT * FROM users")
    handler = PagedResultHandler(future)
    handler.finished_event.wait()
    if handler.error:
        raise handler.error
You can resume pagination when executing a new query by using ResultSet.paging_state. This can be useful if you want to provide stateless pagination in your application (e.g. over HTTP). For example:
    from cassandra.query import SimpleStatement

    query = "SELECT * FROM users"
    statement = SimpleStatement(query, fetch_size=10)
    results = session.execute(statement)

    # save the paging_state somewhere and return current results
    web_session['paging_state'] = results.paging_state

    # resume the pagination sometime later...
    statement = SimpleStatement(query, fetch_size=10)
    ps = web_session['paging_state']
    results = session.execute(statement, paging_state=ps)
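If the paging state has to travel through a URL or a JSON body rather than a server-side session, it usually needs a text encoding. The sketch below assumes that paging_state is an opaque byte string, and None when there is nothing left to fetch; `encode_paging_state` and `decode_paging_state` are hypothetical helpers, not part of the driver.

```python
import base64


def encode_paging_state(paging_state):
    """Turn an opaque paging state (bytes or None) into a URL-safe string."""
    if paging_state is None:
        return None
    return base64.urlsafe_b64encode(paging_state).decode("ascii")


def decode_paging_state(token):
    """Reverse encode_paging_state; an empty or missing token means 'start over'."""
    if not token:
        return None
    return base64.urlsafe_b64decode(token.encode("ascii"))


# Round trip with made-up bytes standing in for a real paging state.
token = encode_paging_state(b"\x00\x10\xff\xfe")
restored = decode_paging_state(token)
```

The decoded bytes would then be passed as the paging_state argument to session.execute(), as in the example above.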