Release: Great Expectations 0.8.0


Version 0.8.0 is a significant update to Great Expectations, with many improvements focused on configurability and usability. See the migrating versions guide for more details on specific changes, which include several breaking changes to configs and APIs.

Highlights include:

  1. Validation Operators and Actions. Validation operators make it easy to integrate GE into a variety of pipeline runners. They offer one-line integration that emphasizes configurability. See the validation operators and actions feature guide for more information.

    • The DataContext get_batch method no longer treats expectation_suite_name or batch_kwargs as optional; they must be explicitly specified.
    • The top-level GE validate method offers more options for specifying which data_asset class to use.
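
The operator-plus-actions idea from this highlight can be pictured with a small, self-contained sketch. This is not GE's API: the class `ActionListOperator`, the function `validate_non_empty`, and the two action callables are hypothetical stand-ins that only show how a single call might validate several batches and then run a configurable list of follow-up actions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


def validate_non_empty(batch: List[Any]) -> Dict[str, Any]:
    """Toy validation (hypothetical): a batch passes when it has at least one row."""
    return {"success": len(batch) > 0, "row_count": len(batch)}


@dataclass
class ActionListOperator:
    """Hypothetical operator: validate each batch, then run every configured action."""

    validate: Callable[[List[Any]], Dict[str, Any]]
    actions: List[Callable[[str, Dict[str, Any]], None]] = field(default_factory=list)

    def run(self, batches: Dict[str, List[Any]]) -> Dict[str, Dict[str, Any]]:
        results = {}
        for name, batch in batches.items():
            result = self.validate(batch)
            results[name] = result
            # Actions are plain callables, so swapping them is a config change.
            for action in self.actions:
                action(name, result)
        return results


stored = {}
failures = []


def store_result(name, result):
    """Example action: keep every validation result."""
    stored[name] = result


def record_failure(name, result):
    """Example action: remember which batches failed."""
    if not result["success"]:
        failures.append(name)


operator = ActionListOperator(
    validate=validate_non_empty,
    actions=[store_result, record_failure],
)
results = operator.run({"orders": [1, 2, 3], "refunds": []})
```

Because the actions are passed in as a list, a pipeline runner only needs the one `run` call; what happens after validation is decided by configuration rather than by the calling code.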
  2. First-class support for plugins in a DataContext, with several features that make it easier to configure and maintain DataContexts across common deployment patterns.

    • Environments: A DataContext can now manage environments and secrets more easily thanks to more dynamic and flexible variable substitution.
    • Stores: A new internal abstraction for DataContexts, Stores, makes extending GE easier by consolidating logic for reading and writing resources from a database, local storage, or cloud storage.
    • Types: Utilities configured in a DataContext are now referenced using class_name and module_name throughout the DataContext configuration, making it easier to extend or supplement pre-built resources. For now, the “type” parameter is still supported, but expect it to be removed in a future release.
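
To make the Environments and Types bullets concrete, here is a minimal sketch in plain Python. It is not GE's implementation. It assumes a `${NAME}` placeholder syntax for substitution and a config dictionary carrying `class_name` and `module_name` keys; the helper names `substitute_variables` and `instantiate_from_config`, the `base_directory` key, and the sample values are all hypothetical.

```python
import importlib
import re

# Assumed placeholder syntax for this sketch: ${NAME}
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def substitute_variables(config, variables):
    """Recursively replace ${NAME} placeholders in every string inside config."""
    if isinstance(config, dict):
        return {k: substitute_variables(v, variables) for k, v in config.items()}
    if isinstance(config, list):
        return [substitute_variables(v, variables) for v in config]
    if isinstance(config, str):
        # Unknown names are left untouched so missing values are easy to spot.
        return _PLACEHOLDER.sub(
            lambda m: str(variables.get(m.group(1), m.group(0))), config
        )
    return config


def instantiate_from_config(config):
    """Build an object from a dict holding class_name, module_name, and kwargs."""
    kwargs = dict(config)  # copy so the caller's dict is not mutated
    class_name = kwargs.pop("class_name")
    module_name = kwargs.pop("module_name")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**kwargs)


# Hypothetical config: a standard-library class stands in for a pluggable store.
raw_config = {
    "class_name": "SimpleNamespace",
    "module_name": "types",
    "base_directory": "${DATA_ROOT}/expectations",
}

resolved = substitute_variables(raw_config, {"DATA_ROOT": "/tmp/ge"})
store = instantiate_from_config(resolved)
print(store.base_directory)  # prints /tmp/ge/expectations
```

Referencing classes by module and name means a custom class only has to be importable to be used; no central registry of "type" strings needs to be updated.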
  3. Partitioners: Batch Kwargs are clarified and enhanced to help easily reference well-known chunks of data using a partition_id. Batch ID and Batch Fingerprint help round out support for enhanced metadata around data assets that GE validates. See the batch identifiers documentation for more information. The GlobReaderGenerator, QueryGenerator, S3Generator, SubdirReaderGenerator, and TableGenerator all support partition_id for easily accessing data assets.
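
As a rough illustration of the partition_id idea (not the actual logic of GlobReaderGenerator or any other generator), the sketch below derives a partition identifier from each file name with a regular expression and uses it to look up the kwargs for one chunk of data. The file names, the naming pattern, the `path` key, and the function names are assumptions made for this example.

```python
import re

# Hypothetical naming convention: <asset>_<YYYYMMDD>.csv
PARTITION_PATTERN = re.compile(r"^(?P<asset>\w+?)_(?P<partition_id>\d{8})\.csv$")


def index_partitions(file_names):
    """Map (asset, partition_id) to a batch-kwargs-like dict for each matching file."""
    index = {}
    for name in file_names:
        match = PARTITION_PATTERN.match(name)
        if match is None:
            continue  # skip files that do not follow the convention
        partition_id = match.group("partition_id")
        index[(match.group("asset"), partition_id)] = {
            "path": name,
            "partition_id": partition_id,
        }
    return index


def batch_kwargs_for(index, asset, partition_id):
    """Return the kwargs for one partition; raises KeyError if it is unknown."""
    return index[(asset, partition_id)]


files = ["events_20200101.csv", "events_20200102.csv", "README.txt"]
index = index_partitions(files)
kwargs = batch_kwargs_for(index, "events", "20200102")
```

The point of the identifier is that callers can ask for "the 20200102 partition of events" without knowing the path or query that produces it.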

  4. Other Improvements:

    • We’re beginning a long series of under-the-hood refactors designed to keep GE maintainable as we add features.
    • Restructured documentation: our docs have a new structure and have been reorganized to provide space for more easily adding and accessing reference material. Stay tuned for additional detail.
    • The command build-documentation has been renamed build-docs and now opens the Data Docs in the user’s browser by default.