Data Models

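This section documents the model classes in pyathena.model: query and calculation executions, session status, database and table metadata, and file format and compression constants.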

Query Execution

class pyathena.model.AthenaQueryExecution(response: Dict[str, Any])

Represents an Athena query execution with status and metadata.

This class encapsulates information about a query execution in Amazon Athena, including its current state, statistics, error information, and result metadata. It’s primarily used internally by PyAthena cursors but can be useful for monitoring and debugging query execution.

Query States:
  • QUEUED: Query is waiting to be executed

  • RUNNING: Query is currently executing

  • SUCCEEDED: Query completed successfully

  • FAILED: Query execution failed

  • CANCELLED: Query was cancelled

Statement Types:
  • DDL: Data Definition Language (CREATE, DROP, ALTER)

  • DML: Data Manipulation Language (SELECT, INSERT, UPDATE, DELETE)

  • UTILITY: Utility statements (SHOW, DESCRIBE, EXPLAIN)

Example

>>> # Typically accessed through cursor execution
>>> cursor.execute("SELECT COUNT(*) FROM my_table")
>>> query_execution = cursor._last_query_execution  # Internal access
>>> print(f"Query ID: {query_execution.query_id}")
>>> print(f"State: {query_execution.state}")
>>> print(f"Data scanned: {query_execution.data_scanned_in_bytes} bytes")

See also

AWS Athena QueryExecution API reference: https://docs.aws.amazon.com/athena/latest/APIReference/API_QueryExecution.html
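When monitoring a query, it is often enough to know whether its state is terminal and, on failure, why. The following is a minimal sketch under stated assumptions: `is_terminal` and `summarize` are hypothetical helper names, not part of PyAthena, and a `SimpleNamespace` stands in for a real `AthenaQueryExecution` so the snippet runs without AWS access. It relies only on the state strings and property names documented for this class.

```python
from types import SimpleNamespace

# State strings copied from the STATE_* constants documented for this class.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def is_terminal(state):
    """Return True once a query can no longer change state (hypothetical helper)."""
    return state in TERMINAL_STATES


def summarize(execution):
    """Build a one-line status string from documented properties (hypothetical helper)."""
    if execution.state == "FAILED":
        return f"{execution.query_id}: FAILED ({execution.state_change_reason})"
    if execution.state == "SUCCEEDED":
        # data_scanned_in_bytes may be None, so fall back to 0.
        scanned = execution.data_scanned_in_bytes or 0
        return f"{execution.query_id}: SUCCEEDED, {scanned} bytes scanned"
    return f"{execution.query_id}: {execution.state}"


# Stand-in object (assumption); it only mimics the documented attribute names.
fake = SimpleNamespace(
    query_id="example-id",
    state="FAILED",
    state_change_reason="example reason",
    data_scanned_in_bytes=None,
)
print(is_terminal(fake.state))
print(summarize(fake))
```

Because every property is typed as optional, helpers like these should tolerate `None` rather than assume a value is present.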

STATE_QUEUED: str = 'QUEUED'
STATE_RUNNING: str = 'RUNNING'
STATE_SUCCEEDED: str = 'SUCCEEDED'
STATE_FAILED: str = 'FAILED'
STATE_CANCELLED: str = 'CANCELLED'
STATEMENT_TYPE_DDL: str = 'DDL'
STATEMENT_TYPE_DML: str = 'DML'
STATEMENT_TYPE_UTILITY: str = 'UTILITY'
ENCRYPTION_OPTION_SSE_S3: str = 'SSE_S3'
ENCRYPTION_OPTION_SSE_KMS: str = 'SSE_KMS'
ENCRYPTION_OPTION_CSE_KMS: str = 'CSE_KMS'
ERROR_CATEGORY_SYSTEM: int = 1
ERROR_CATEGORY_USER: int = 2
ERROR_CATEGORY_OTHER: int = 3
S3_ACL_OPTION_BUCKET_OWNER_FULL_CONTROL = 'BUCKET_OWNER_FULL_CONTROL'
__init__(response: Dict[str, Any]) -> None
property database: str | None
property catalog: str | None
property query_id: str | None
property query: str | None
property statement_type: str | None
property substatement_type: str | None
property work_group: str | None
property execution_parameters: List[str]
property state: str | None
property state_change_reason: str | None
property submission_date_time: datetime | None
property completion_date_time: datetime | None
property error_category: int | None
property error_type: int | None
property retryable: bool | None
property error_message: str | None
property data_scanned_in_bytes: int | None
property engine_execution_time_in_millis: int | None
property query_queue_time_in_millis: int | None
property total_execution_time_in_millis: int | None
property query_planning_time_in_millis: int | None
property service_processing_time_in_millis: int | None
property output_location: str | None
property data_manifest_location: str | None
property reused_previous_result: bool | None
property encryption_option: str | None
property kms_key: str | None
property expected_bucket_owner: str | None
property s3_acl_option: str | None
property selected_engine_version: str | None
property effective_engine_version: str | None
property result_reuse_enabled: bool | None
property result_reuse_minutes: int | None
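The timing properties above are all reported in milliseconds and may be `None`. As an illustration, the sketch below collects them into a dictionary of seconds. `timing_report` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in that only mimics the documented property names; the numbers are made up for the example.

```python
from types import SimpleNamespace

# Property names copied from the AthenaQueryExecution listing above.
TIMING_FIELDS = (
    "query_queue_time_in_millis",
    "query_planning_time_in_millis",
    "engine_execution_time_in_millis",
    "service_processing_time_in_millis",
    "total_execution_time_in_millis",
)


def timing_report(execution):
    """Map each timing property to seconds, skipping values that are None."""
    report = {}
    for field in TIMING_FIELDS:
        millis = getattr(execution, field, None)
        if millis is not None:
            report[field.replace("_in_millis", "_seconds")] = millis / 1000
    return report


# Stand-in object with illustrative values (assumption, not real query data).
fake = SimpleNamespace(
    query_queue_time_in_millis=120,
    query_planning_time_in_millis=None,
    engine_execution_time_in_millis=2500,
    service_processing_time_in_millis=30,
    total_execution_time_in_millis=2650,
)
print(timing_report(fake))
```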
class pyathena.model.AthenaCalculationExecution(response: Dict[str, Any])
__init__(response: Dict[str, Any]) -> None
property calculation_id: str | None
property session_id: str | None
property description: str | None
property working_directory: str | None
property std_out_s3_uri: str | None
property std_error_s3_uri: str | None
property result_s3_uri: str | None
property result_type: str | None
class pyathena.model.AthenaCalculationExecutionStatus(response: Dict[str, Any])
STATE_CREATING: str = 'CREATING'
STATE_CREATED: str = 'CREATED'
STATE_QUEUED: str = 'QUEUED'
STATE_RUNNING: str = 'RUNNING'
STATE_CANCELING: str = 'CANCELING'
STATE_CANCELED: str = 'CANCELED'
STATE_COMPLETED: str = 'COMPLETED'
STATE_FAILED: str = 'FAILED'
__init__(response: Dict[str, Any]) -> None
property state: str | None
property state_change_reason: str | None
property submission_date_time: datetime | None
property completion_date_time: datetime | None
property dpu_execution_in_millis: int | None
property progress: str | None
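Calculation states differ from query states: a finished calculation reports COMPLETED rather than SUCCEEDED, and the cancelled state is spelled CANCELED with a single L. The sketch below shows one way a polling loop could use these values. `wait_for_calculation` and its `fetch_status` argument are hypothetical; in real use `fetch_status` would wrap whatever call returns an `AthenaCalculationExecutionStatus`, while here an iterator of state strings stands in so that no AWS call is made.

```python
import time
from types import SimpleNamespace

# State strings copied from AthenaCalculationExecutionStatus above.
CALCULATION_TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELED"}


def wait_for_calculation(fetch_status, poll_interval=0.0, max_polls=100):
    """Call fetch_status() until it reports a terminal state (hypothetical helper)."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.state in CALCULATION_TERMINAL_STATES:
            return status
        time.sleep(poll_interval)
    raise TimeoutError("calculation did not reach a terminal state")


# Stand-in for repeated status lookups (assumption): each call yields the next state.
states = iter(["CREATING", "QUEUED", "RUNNING", "COMPLETED"])
final = wait_for_calculation(lambda: SimpleNamespace(state=next(states)))
print(final.state)
```

Bounding the loop with `max_polls` keeps a stuck calculation from blocking the caller indefinitely.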

Session Management

class pyathena.model.AthenaSessionStatus(response: Dict[str, Any])
STATE_CREATING: str = 'CREATING'
STATE_CREATED: str = 'CREATED'
STATE_IDLE: str = 'IDLE'
STATE_BUSY: str = 'BUSY'
STATE_TERMINATING: str = 'TERMINATING'
STATE_TERMINATED: str = 'TERMINATED'
STATE_DEGRADED: str = 'DEGRADED'
STATE_FAILED: str = 'FAILED'
__init__(response: Dict[str, Any]) -> None
property session_id: str | None
property state: str | None
property state_change_reason: str | None
property start_date_time: datetime | None
property last_modified_date_time: datetime | None
property end_date_time: datetime | None
property idle_since_date_time: datetime | None
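A common use of the session status is deciding whether a session can take new work and how long it has been idle. The following sketch assumes, per the AWS description of session states, that IDLE is the state in which a session accepts a calculation. `can_accept_calculation` and `idle_duration` are hypothetical helpers, the `SimpleNamespace` object is a stand-in for `AthenaSessionStatus`, and the timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def can_accept_calculation(status):
    """Return True when the session reports the IDLE state (hypothetical helper)."""
    return status.state == "IDLE"


def idle_duration(status, now):
    """Return how long the session has been idle, or None if it is not idle."""
    if status.state != "IDLE" or status.idle_since_date_time is None:
        return None
    return now - status.idle_since_date_time


# Stand-in object with an illustrative timestamp (assumption).
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fake = SimpleNamespace(state="IDLE", idle_since_date_time=now - timedelta(minutes=5))
print(can_accept_calculation(fake))
print(idle_duration(fake, now))
```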

Database and Table Metadata

class pyathena.model.AthenaDatabase(response)
__init__(response)
property name: str | None
property description: str | None
property parameters: Dict[str, str]
class pyathena.model.AthenaTableMetadata(response)
__init__(response)
property name: str | None
property create_time: datetime | None
property last_access_time: datetime | None
property table_type: str | None
property columns: List[AthenaTableMetadataColumn]
property partition_keys: List[AthenaTableMetadataPartitionKey]
property parameters: Dict[str, str]
property comment: str | None
property location: str | None
property input_format: str | None
property output_format: str | None
property row_format: str | None
property file_format: str | None
property serde_serialization_lib: str | None
property compression: str | None
property serde_properties: Dict[str, str]
property table_properties: Dict[str, str]
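To illustrate how the table metadata properties fit together, the sketch below renders a table's columns and partition keys as text. It is not PyAthena code: `describe_table` is a hypothetical helper, the objects are `SimpleNamespace` stand-ins, and the `name` and `type` attributes on column and partition key objects are an assumption, since this section does not list the members of `AthenaTableMetadataColumn` or `AthenaTableMetadataPartitionKey`.

```python
from types import SimpleNamespace


def describe_table(table):
    """Render a table's columns and partition keys as indented text."""
    lines = [f"{table.name} ({table.table_type})"]
    for column in table.columns:
        # Assumes each column object exposes .name and .type.
        lines.append(f"  {column.name}: {column.type}")
    for key in table.partition_keys:
        lines.append(f"  {key.name}: {key.type} [partition key]")
    return "\n".join(lines)


# Stand-in metadata with illustrative values (assumption, not a real table).
fake_table = SimpleNamespace(
    name="events",
    table_type="EXTERNAL_TABLE",
    columns=[
        SimpleNamespace(name="user_id", type="bigint"),
        SimpleNamespace(name="payload", type="string"),
    ],
    partition_keys=[SimpleNamespace(name="dt", type="string")],
)
print(describe_table(fake_table))
```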

File Formats and Compression

class pyathena.model.AthenaFileFormat

Constants and utilities for Athena supported file formats.

This class provides constants for file formats supported by Amazon Athena and utility methods to check format types. These are commonly used when creating tables or configuring UNLOAD operations.

Supported formats:
  • SEQUENCEFILE: Hadoop SequenceFile format

  • TEXTFILE: Plain text files (default)

  • RCFILE: Record Columnar File format

  • ORC: Optimized Row Columnar format

  • PARQUET: Apache Parquet columnar format

  • AVRO: Apache Avro format

  • ION: Amazon Ion format

Example

>>> from pyathena.model import AthenaFileFormat
>>>
>>> # Check if format is Parquet
>>> if AthenaFileFormat.is_parquet("PARQUET"):
...     print("Using columnar format")
>>>
>>> # Use in UNLOAD operations
>>> format_type = AthenaFileFormat.FILE_FORMAT_PARQUET
>>> sql = f"UNLOAD (...) TO 's3://bucket/path/' WITH (format = '{format_type}')"
>>> cursor.execute(sql)

See also

AWS Documentation on supported file formats: https://docs.aws.amazon.com/athena/latest/ug/supported-serdes.html

FILE_FORMAT_SEQUENCEFILE: str = 'SEQUENCEFILE'
FILE_FORMAT_TEXTFILE: str = 'TEXTFILE'
FILE_FORMAT_RCFILE: str = 'RCFILE'
FILE_FORMAT_ORC: str = 'ORC'
FILE_FORMAT_PARQUET: str = 'PARQUET'
FILE_FORMAT_AVRO: str = 'AVRO'
FILE_FORMAT_ION: str = 'ION'
static is_parquet(value: str) -> bool
static is_orc(value: str) -> bool
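The class offers checks for Parquet and ORC only. Where user input needs to be validated against the full list of formats, something like the following sketch could be used. `normalize_file_format` is a hypothetical helper, not a PyAthena method, and the set simply mirrors the FILE_FORMAT_* values listed above so that the snippet runs without PyAthena installed.

```python
# Values copied from the FILE_FORMAT_* constants documented above.
SUPPORTED_FILE_FORMATS = frozenset(
    {"SEQUENCEFILE", "TEXTFILE", "RCFILE", "ORC", "PARQUET", "AVRO", "ION"}
)


def normalize_file_format(value):
    """Upper-case a format name and reject anything not listed above."""
    normalized = value.strip().upper()
    if normalized not in SUPPORTED_FILE_FORMATS:
        raise ValueError(f"unsupported file format: {value!r}")
    return normalized


print(normalize_file_format(" parquet "))
```

Normalizing before comparison avoids depending on whether a given check is case-sensitive.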
class pyathena.model.AthenaCompression

Constants and utilities for Athena supported compression formats.

This class provides constants for compression formats supported by Amazon Athena and utility methods to validate compression types. These are commonly used when creating tables, configuring UNLOAD operations, or optimizing data storage.

Supported compression formats:
  • BZIP2: BZIP2 compression

  • DEFLATE: DEFLATE compression

  • GZIP: GZIP compression (most common)

  • LZ4: LZ4 fast compression

  • LZO: LZO compression

  • SNAPPY: Snappy compression (good for Parquet)

  • ZLIB: ZLIB compression

  • ZSTD: Zstandard compression

Example

>>> from pyathena.model import AthenaCompression
>>>
>>> # Validate compression format
>>> if AthenaCompression.is_valid("GZIP"):
...     print("Valid compression format")
>>>
>>> # Use in UNLOAD operations
>>> compression = AthenaCompression.COMPRESSION_GZIP
>>> sql = f"UNLOAD (...) TO 's3://bucket/path/' WITH (compression = '{compression}')"
>>> cursor.execute(sql)

See also

AWS Documentation on compression formats: https://docs.aws.amazon.com/athena/latest/ug/compression-formats.html

Best practices for data compression in Athena: https://docs.aws.amazon.com/athena/latest/ug/compression-support.html

COMPRESSION_BZIP2: str = 'BZIP2'
COMPRESSION_DEFLATE: str = 'DEFLATE'
COMPRESSION_GZIP: str = 'GZIP'
COMPRESSION_LZ4: str = 'LZ4'
COMPRESSION_LZO: str = 'LZO'
COMPRESSION_SNAPPY: str = 'SNAPPY'
COMPRESSION_ZLIB: str = 'ZLIB'
COMPRESSION_ZSTD: str = 'ZSTD'
static is_valid(value: str) -> bool
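The UNLOAD examples in this section set `format` and `compression` separately. The sketch below shows how both options might be combined into one statement after checking the compression name. `build_unload` is a hypothetical helper, the set mirrors the COMPRESSION_* values listed above, and the query and S3 path are placeholders. Athena may still reject some format and compression combinations, which this sketch does not check.

```python
# Values copied from the COMPRESSION_* constants documented above.
SUPPORTED_COMPRESSIONS = frozenset(
    {"BZIP2", "DEFLATE", "GZIP", "LZ4", "LZO", "SNAPPY", "ZLIB", "ZSTD"}
)


def build_unload(select_sql, s3_location, file_format, compression):
    """Compose an UNLOAD statement after validating the compression name."""
    compression = compression.upper()
    if compression not in SUPPORTED_COMPRESSIONS:
        raise ValueError(f"unsupported compression: {compression!r}")
    # Values are interpolated directly, so pass only trusted input.
    return (
        f"UNLOAD ({select_sql}) TO '{s3_location}' "
        f"WITH (format = '{file_format}', compression = '{compression}')"
    )


# Placeholder query and bucket path (assumptions for illustration).
sql = build_unload("SELECT 1", "s3://bucket/path/", "PARQUET", "snappy")
print(sql)
```

The resulting string can be passed to `cursor.execute(sql)` in the same way as the examples above.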