

Paging Large Queries¶

Cassandra 2.0+ offers support for automatic query paging. Starting with version 2.0 of the driver, if protocol_version is greater than 2 (it is by default), queries returning large result sets will be automatically paged.

Controlling the Page Size¶

Session.default_fetch_size controls how many rows are fetched per page; by default, each page contains at most 5000 rows. This can be overridden per query by setting fetch_size on a Statement.
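As a minimal sketch of the two settings described above, the helper below assigns a session-wide default and an optional per-statement override. The name `apply_page_sizes` is hypothetical and not part of the driver; it only sets the `default_fetch_size` and `fetch_size` attributes named in the text, and it is exercised with stand-in objects so that it runs without a cluster.

```python
from types import SimpleNamespace


def apply_page_sizes(session, statement, default_size=5000, query_size=None):
    """Hypothetical helper: set the session default and an optional per-query page size."""
    # Session-wide default, used by statements that do not set their own fetch_size.
    session.default_fetch_size = default_size
    if query_size is not None:
        # Per-query override on the Statement object.
        statement.fetch_size = query_size
    return statement


# Stand-ins for a real Session and Statement (assumption: no live cluster here).
demo_session = SimpleNamespace(default_fetch_size=5000)
demo_statement = SimpleNamespace(fetch_size=None)
apply_page_sizes(demo_session, demo_statement, default_size=1000, query_size=50)
```

With a real driver session, the same two assignments would be made on the Session and on a SimpleStatement (or passed as fetch_size when constructing the statement).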

Handling Paged Results¶

Whenever the number of result rows for a query exceeds the page size, an instance of PagedResult will be returned instead of a normal list. This class implements the iterator interface, so you can treat it like a normal iterator over rows:

from cassandra.query import SimpleStatement
query = "SELECT * FROM users"  # users contains 100 rows
statement = SimpleStatement(query, fetch_size=10)
for user_row in session.execute(statement):
    process_user(user_row)

Whenever there are no more rows in the current page, the next page will be fetched transparently. However, note that it is possible for an Exception to be raised while fetching the next page, just like you might see on a normal call to session.execute().
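Because a page fetch can fail partway through iteration, it may help to guard the loop itself rather than only the initial execute call. The sketch below is one way to do that; `consume_pages` and `flaky_result` are hypothetical names, and `flaky_result` is a plain generator standing in for a paged result whose second page fails to load.

```python
def consume_pages(result_set, handle_row):
    """Iterate a paged result and return (rows_processed, error_or_None)."""
    processed = 0
    try:
        for row in result_set:  # advancing may trigger a fetch of the next page
            handle_row(row)
            processed += 1
    except Exception as exc:  # in real code, narrow this to the errors you expect
        return processed, exc
    return processed, None


def flaky_result():
    """Stand-in for a paged result: one page of two rows, then a failed fetch."""
    yield from ["row-1", "row-2"]
    raise RuntimeError("simulated failure while fetching the next page")


seen = []
count, error = consume_pages(flaky_result(), seen.append)
```

Rows handled before the failure have already been processed, so callers that need all-or-nothing behavior would have to buffer rows or make row handling idempotent.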

If you use Session.execute_async() along with ResponseFuture.result(), the first page will be fetched before result() returns, but later pages will be transparently fetched synchronously while iterating the result.
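The blocking points in that pattern can be sketched as follows. `FakeFuture` is a hypothetical stand-in for a ResponseFuture so the example runs without a cluster, and `rows_from_future` is an illustrative helper, not a driver function.

```python
class FakeFuture:
    """Stand-in for ResponseFuture: result() returns an iterable of rows."""

    def __init__(self, rows):
        self._rows = rows

    def result(self):
        return iter(self._rows)


def rows_from_future(future):
    """Yield every row of an asynchronously started query."""
    # With the real driver, result() blocks until the first page has arrived.
    result = future.result()
    # Iterating may block again each time a later page has to be fetched.
    for row in result:
        yield row


collected = list(rows_from_future(FakeFuture(["a", "b", "c"])))
```

With a real session, the future would come from session.execute_async(statement) instead of FakeFuture.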

Handling Paged Results with Callbacks¶

If callbacks are attached to a query that returns a paged result, the callback will be called once per page with a normal list of rows.

Use ResponseFuture.has_more_pages and ResponseFuture.start_fetching_next_page() to continue fetching pages. For example:

from threading import Event


class PagedResultHandler(object):

    def __init__(self, future):
        self.error = None
        # Set once the last page has been handled or an error has occurred.
        self.finished_event = Event()
        self.future = future
        self.future.add_callbacks(
            callback=self.handle_page,
            errback=self.handle_error)

    def handle_page(self, rows):
        # Called once per page with a normal list of rows.
        for row in rows:
            process_row(row)

        if self.future.has_more_pages:
            self.future.start_fetching_next_page()
        else:
            self.finished_event.set()

    def handle_error(self, exc):
        # Record the error and unblock the waiting thread.
        self.error = exc
        self.finished_event.set()

future = session.execute_async("SELECT * FROM users")
handler = PagedResultHandler(future)
handler.finished_event.wait()
if handler.error:
    raise handler.error

Resume Paged Results¶

You can resume pagination when executing a new query by using ResultSet.paging_state. This can be useful if you want to provide stateless pagination in your application (e.g. over HTTP). For example:

from cassandra.query import SimpleStatement
query = "SELECT * FROM users"
statement = SimpleStatement(query, fetch_size=10)
results = session.execute(statement)

# save the paging_state somewhere and return current results
web_session['paging_state'] = results.paging_state


# resume the pagination sometime later...
statement = SimpleStatement(query, fetch_size=10)
ps = web_session['paging_state']
results = session.execute(statement, paging_state=ps)
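
If the paging state has to travel through a URL or a JSON body, it needs a text-safe encoding. The sketch below assumes paging_state is an opaque bytes value (or None when there are no further pages) and wraps it in URL-safe Base64; the helper names `encode_paging_state` and `decode_paging_state` are hypothetical and not part of the driver.

```python
import base64


def encode_paging_state(paging_state):
    """Turn an opaque bytes paging state into a URL-safe text token."""
    if paging_state is None:
        return None  # nothing to resume from
    return base64.urlsafe_b64encode(paging_state).decode("ascii")


def decode_paging_state(token):
    """Turn a token produced by encode_paging_state back into bytes."""
    if not token:
        return None  # no token means start from the first page
    return base64.urlsafe_b64decode(token.encode("ascii"))


# Made-up bytes standing in for a real paging state.
sample_state = b"\x00\x10opaque-bytes\xff"
token = encode_paging_state(sample_state)
restored = decode_paging_state(token)
```

The decoded bytes would then be passed as paging_state to session.execute(). A token that comes back from a client is untrusted input, so an application may want to sign or otherwise validate it before use.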
© ScyllaDB 2021 and © DataStax 2013-2017