Data Conversion

This section covers data type converters and parameter formatters.

Type Converters

class pyathena.converter.Converter(mappings: Dict[str, Callable[[str | None], Any | None]], default: Callable[[str | None], Any | None] = <function _to_default>, types: Dict[str, Type[Any]] | None = None)

Abstract base class for converting Athena data types to Python objects.

Converters handle the transformation of string values returned by Athena into appropriate Python data types. Different cursor implementations may use different converters to optimize for their specific use cases.

This class provides a framework for mapping Athena data type names to conversion functions and handles the conversion process during result set processing.

mappings

Dictionary mapping Athena type names to conversion functions.

default

Default conversion function for unmapped types.

types

Optional dictionary mapping type names to Python type objects.

__init__(mappings: Dict[str, Callable[[str | None], Any | None]], default: Callable[[str | None], Any | None] = <function _to_default>, types: Dict[str, Type[Any]] | None = None) -> None

property mappings: Dict[str, Callable[[str | None], Any | None]]

Get the current type conversion mappings.

Returns:

Dictionary mapping Athena data types to conversion functions.

property types: Dict[str, Type[Any]]

Get the current type mappings for result set descriptions.

Returns:

Dictionary mapping Athena data types to Python types.

get(type_: str) -> Callable[[str | None], Any | None]

Get the conversion function for a specific Athena data type.

Parameters:

type_ – The Athena data type name.

Returns:

The conversion function for the type, or the default converter if not found.

set(type_: str, converter: Callable[[str | None], Any | None]) -> None

Set a custom conversion function for an Athena data type.

Parameters:
  • type_ – The Athena data type name.

  • converter – The conversion function to use for this type.

remove(type_: str) -> None

Remove a custom conversion function for an Athena data type.

Parameters:

type_ – The Athena data type name to remove.

update(mappings: Dict[str, Callable[[str | None], Any | None]]) -> None

Update multiple conversion functions at once.

Parameters:

mappings – Dictionary of type names to conversion functions.
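
The lookup and mutation methods above can be illustrated with a small stand-alone registry. This is a minimal sketch, not PyAthena's implementation: SketchRegistry and to_default are hypothetical names, and the no-op behavior of remove() for an unknown type name is an assumption.

```python
from typing import Any, Callable, Dict, Optional

ConvertFn = Callable[[Optional[str]], Optional[Any]]


def to_default(value: Optional[str]) -> Optional[Any]:
    """Fallback used for unmapped types: return the value unchanged."""
    return value


class SketchRegistry:
    """Hypothetical stand-in mirroring the documented get/set/remove/update methods."""

    def __init__(self, mappings: Dict[str, ConvertFn], default: ConvertFn = to_default) -> None:
        self._mappings = dict(mappings)  # copy so the caller's dict is not mutated
        self._default = default

    def get(self, type_: str) -> ConvertFn:
        # Unmapped type names fall back to the default converter.
        return self._mappings.get(type_, self._default)

    def set(self, type_: str, converter: ConvertFn) -> None:
        self._mappings[type_] = converter

    def remove(self, type_: str) -> None:
        # Assumption: removing an unknown type name is a no-op.
        self._mappings.pop(type_, None)

    def update(self, mappings: Dict[str, ConvertFn]) -> None:
        self._mappings.update(mappings)


registry = SketchRegistry({"integer": lambda v: None if v is None else int(v)})
registry.set("double", lambda v: None if v is None else float(v))
registry.update({"boolean": lambda v: None if v is None else v.lower() == "true"})

int_result = registry.get("integer")("42")
float_result = registry.get("double")("1.5")
bool_result = registry.get("boolean")("TRUE")

registry.remove("integer")
fallback_result = registry.get("integer")("42")  # now handled by to_default
```

Once the "integer" mapping is removed, the same lookup returns the default function, so the raw string comes back unconverted.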

abstract convert(type_: str, value: str | None) -> Any | None

class pyathena.converter.DefaultTypeConverter

Default implementation of the Converter for standard Python types.

This converter provides mappings for all standard Athena data types to their corresponding Python types using built-in conversion functions. It’s used by the standard Cursor class by default.

Supported conversions:
  • Numeric types: integer, bigint, real, double, decimal

  • String types: varchar, char

  • Date/time types: date, timestamp, time (with timezone support)

  • Boolean: boolean

  • Binary: varbinary

  • Complex types: array, map, row/struct

  • JSON: json

Example

>>> from pyathena.converter import DefaultTypeConverter
>>> converter = DefaultTypeConverter()
>>> converter.convert('integer', '42')
42
>>> converter.convert('date', '2023-01-15')
datetime.date(2023, 1, 15)

__init__() -> None

convert(type_: str, value: str | None) -> Any | None
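
To show how type-name-driven conversion might apply to a whole result row, here is a minimal sketch built only on the standard library. SKETCH_MAPPINGS, convert_value, and convert_row are hypothetical names, and the conversion functions are simplified stand-ins rather than the ones DefaultTypeConverter actually registers.

```python
import json
from datetime import date
from decimal import Decimal
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical mapping from Athena type names to stdlib-based conversion functions.
SKETCH_MAPPINGS: Dict[str, Callable[[str], Any]] = {
    "integer": int,
    "bigint": int,
    "double": float,
    "decimal": Decimal,
    "boolean": lambda v: v.lower() == "true",
    "date": date.fromisoformat,
    "json": json.loads,
}


def convert_value(type_: str, value: Optional[str]) -> Optional[Any]:
    """Convert one string value; None (SQL NULL) passes through untouched."""
    if value is None:
        return None
    # Unmapped types such as varchar are returned as the original string.
    return SKETCH_MAPPINGS.get(type_, str)(value)


def convert_row(columns: List[Tuple[str, str]], row: List[Optional[str]]) -> Dict[str, Any]:
    """Pair each raw value with its column's type name and convert it."""
    return {
        name: convert_value(type_, value)
        for (name, type_), value in zip(columns, row)
    }


columns = [
    ("id", "bigint"),
    ("price", "decimal"),
    ("sold_on", "date"),
    ("note", "varchar"),
    ("tags", "json"),
    ("active", "boolean"),
]
raw_row = ["7", "19.99", "2023-01-15", "gift", '["a", "b"]', None]
converted = convert_row(columns, raw_row)
```

Handling None before the lookup keeps every conversion function free of NULL checks, which is one reasonable way to satisfy the `str | None` input type in the signatures above.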

Parameter Formatters

class pyathena.formatter.Formatter(mappings: Dict[Type[Any], Callable[[Formatter, Callable[[str], str], Any], Any]], default: Callable[[Formatter, Callable[[str], str], Any], Any] | None = None)

Abstract base class for formatting Python values for SQL queries.

Formatters handle the conversion of Python objects to SQL-compatible string representations for use in parameterized queries. They ensure proper escaping and formatting of values based on their types.

This class provides a framework for mapping Python types to formatting functions and handles the formatting process during query preparation.

mappings

Dictionary mapping Python types to formatting functions.

default

Default formatting function for unmapped types.

__init__(mappings: Dict[Type[Any], Callable[[Formatter, Callable[[str], str], Any], Any]], default: Callable[[Formatter, Callable[[str], str], Any], Any] | None = None) -> None

property mappings: Dict[Type[Any], Callable[[Formatter, Callable[[str], str], Any], Any]]

Get the current parameter formatting mappings.

Returns:

Dictionary mapping Python types to formatting functions.

get(type_) -> Callable[[Formatter, Callable[[str], str], Any], Any] | None

Get the formatting function for a specific Python type.

Parameters:

type_ – The Python value to get the formatter for.

Returns:

The formatting function for the type, or the default formatter if not found.

set(type_: Type[Any], formatter: Callable[[Formatter, Callable[[str], str], Any], Any]) -> None

remove(type_: Type[Any]) -> None

update(mappings: Dict[Type[Any], Callable[[Formatter, Callable[[str], str], Any], Any]]) -> None

abstract format(operation: str, parameters: Dict[str, Any] | None = None) -> str

static wrap_unload(operation: str, s3_staging_dir: str, format_: str = 'PARQUET', compression: str = 'SNAPPY')

Wrap a SELECT query with UNLOAD statement for high-performance result retrieval.

Transforms SELECT or WITH queries into UNLOAD statements that export results directly to S3 in optimized formats (Parquet, ORC) with compression. This approach is significantly faster than standard CSV-based result retrieval for large datasets and preserves data types more accurately.

Parameters:
  • operation – SQL query to wrap. Must be a SELECT or WITH statement.

  • s3_staging_dir – Base S3 directory for storing UNLOAD results.

  • format_ – Output file format. Defaults to Parquet for optimal performance.

  • compression – Compression algorithm. Defaults to Snappy for balanced compression ratio and speed.

Returns:

Tuple containing:
  • Modified UNLOAD query string

  • S3 location where results will be stored (None if not SELECT/WITH)

Example

>>> from pyathena.formatter import Formatter
>>> query = "SELECT * FROM sales WHERE year = 2023"
>>> unload_query, location = Formatter.wrap_unload(
...     query, "s3://my-bucket/results/"
... )
>>> print(unload_query)
UNLOAD (
    SELECT * FROM sales WHERE year = 2023
)
TO 's3://my-bucket/results/unload/20231215/uuid/'
WITH (
    format = 'PARQUET',
    compression = 'SNAPPY'
)

Note

Only SELECT and WITH statements are wrapped. Other statement types are returned unchanged with location=None.
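
The wrapping behavior described above can be approximated in a few lines. The following is a hedged sketch, not the library's code: sketch_wrap_unload is a hypothetical function, and the unload/<UTC date>/<random UUID>/ path layout is an assumption inferred from the example output.

```python
import textwrap
import uuid
from datetime import datetime, timezone
from typing import Optional, Tuple


def sketch_wrap_unload(
    operation: str,
    s3_staging_dir: str,
    format_: str = "PARQUET",
    compression: str = "SNAPPY",
) -> Tuple[str, Optional[str]]:
    """Hypothetical re-creation of the wrapping behavior described above."""
    stripped = operation.strip()
    upper = stripped.upper()
    if not (upper.startswith("SELECT") or upper.startswith("WITH")):
        # Other statement types are returned unchanged with no location.
        return operation, None

    # Assumption: results go under unload/<UTC date>/<random UUID>/ in the staging dir.
    today = datetime.now(timezone.utc).strftime("%Y%m%d")
    location = f"{s3_staging_dir.rstrip('/')}/unload/{today}/{uuid.uuid4()}/"
    body = textwrap.indent(stripped, "    ")
    wrapped = (
        f"UNLOAD (\n{body}\n)\n"
        f"TO '{location}'\n"
        f"WITH (\n    format = '{format_}',\n    compression = '{compression}'\n)"
    )
    return wrapped, location


unload_query, location = sketch_wrap_unload(
    "SELECT * FROM sales WHERE year = 2023", "s3://my-bucket/results/"
)
ddl_query, ddl_location = sketch_wrap_unload("SHOW TABLES", "s3://my-bucket/results/")
```

Generating a fresh UUID per call gives each query its own output prefix, so results from concurrent queries do not collide in the staging directory.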

class pyathena.formatter.DefaultParameterFormatter

Default implementation of the Formatter for SQL parameter formatting.

This formatter provides standard formatting for common Python types used in SQL parameters. It handles proper escaping and quoting to prevent SQL injection and ensure valid SQL syntax.

Supported types:
  • None: Converts to SQL NULL

  • Strings: Properly escaped and quoted

  • Numbers: int, float, Decimal

  • Dates and times: date, datetime, time

  • Booleans: Converted to SQL boolean literals

  • Sequences: list, tuple, set (for IN clauses)

Example

>>> from pyathena.formatter import DefaultParameterFormatter
>>> formatter = DefaultParameterFormatter()
>>> sql = formatter.format(
...     "SELECT * FROM users WHERE name = %(name)s AND age > %(age)s",
...     {"name": "John's Data", "age": 25}
... )
>>> print(sql)
SELECT * FROM users WHERE name = 'John''s Data' AND age > 25

__init__() -> None

format(operation: str, parameters: Dict[str, Any] | None = None) -> str
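
To make type-keyed formatting and quote escaping concrete, here is a minimal stand-alone sketch. SKETCH_FORMATTERS, format_value, and sketch_format are hypothetical names. The literal forms are simplified assumptions and cover only a subset of the supported types, so this should not be read as the formatter's actual output rules.

```python
from typing import Any, Callable, Dict, Optional, Type


def escape_string(value: str) -> str:
    """Quote a string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def format_sequence(values: Any) -> str:
    """Render a list, tuple, or set as a parenthesized list for IN clauses."""
    return "(" + ", ".join(format_value(v) for v in values) + ")"


# Hypothetical type-keyed mapping, analogous to the `mappings` attribute.
SKETCH_FORMATTERS: Dict[Type[Any], Callable[[Any], str]] = {
    type(None): lambda v: "NULL",
    str: escape_string,
    bool: lambda v: "TRUE" if v else "FALSE",
    int: str,
    float: str,
    list: format_sequence,
    tuple: format_sequence,
    set: format_sequence,
}


def format_value(value: Any) -> str:
    """Look up the formatter by the value's exact type and apply it."""
    formatter = SKETCH_FORMATTERS.get(type(value))
    if formatter is None:
        raise TypeError(f"No formatter registered for {type(value).__name__}")
    return formatter(value)


def sketch_format(operation: str, parameters: Optional[Dict[str, Any]] = None) -> str:
    """Substitute %(name)s placeholders with formatted SQL literals."""
    if not parameters:
        return operation
    return operation % {key: format_value(val) for key, val in parameters.items()}


sql = sketch_format(
    "SELECT * FROM users WHERE name = %(name)s AND age > %(age)s",
    {"name": "John's Data", "age": 25},
)
in_clause = sketch_format("SELECT * FROM t WHERE id IN %(ids)s", {"ids": [1, 2, 3]})
```

Looking up by exact type rather than with isinstance keeps bool from being formatted as an int, since bool is a subclass of int in Python.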