Enhancement request:
We are incorporating Great Expectations (GE) into a custom data quality tool. The problem is that some expectations are structural, while others check the actual data within rows. It would be very helpful if GE expectations gave us a way to know programmatically whether to expect just a pass/fail result or also statistics about how many rows passed or failed.
At the moment our workaround is to parse the response for specific words, but that is obviously brittle.
The main relevant concepts are column_map_expectations and column_aggregate_expectations. Map Expectations apply on a row-by-row basis. Aggregate Expectations apply to a whole column at once. (There are also analogous classes for multicolumn Expectations.)
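To make the distinction concrete, here is a toy sketch in plain Python. It does not use GE's API: the function names and the "kind" field are hypothetical, chosen only to show why a row-by-row check can report counts while a whole-column check reports a single observed value.

```python
from statistics import mean


def map_style_not_null(values):
    """Hypothetical map-style check: each row is judged on its own."""
    unexpected = [v for v in values if v is None]
    return {
        "kind": "column_map_expectation",  # illustrative label, not a GE field
        "success": not unexpected,
        "element_count": len(values),
        "unexpected_count": len(unexpected),
    }


def aggregate_style_mean_between(values, min_value, max_value):
    """Hypothetical aggregate-style check: the column is judged as a whole."""
    observed = mean(values)
    return {
        "kind": "column_aggregate_expectation",  # illustrative label, not a GE field
        "success": min_value <= observed <= max_value,
        "observed_value": observed,
    }


map_result = map_style_not_null([1, None, 3])
aggregate_result = aggregate_style_mean_between([1, 2, 3], 1, 3)
```

The map-style result carries per-row counts, whereas the aggregate-style result has only one value to report, which is the difference the request is asking to detect programmatically.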
These concepts are currently implemented as decorators within the Dataset class. We’re in the process of refactoring them to be their own classes. Among other things, this will make them much more inspectable.
In the meantime, I believe you can work around the issue by grepping for the decorator names (MetaDataset.column_map_expectation and MetaDataset.column_aggregate_expectation) in dataset.py:
$ grep -B 1 column_map_expectation dataset.py
(The first several matches below are unrelated docstring references and can be ignored.)
--
<great_expectations.data_asset.data_asset.DataAsset.expectation>`, not a
``column_map_expectation`` or ``column_aggregate_expectation``.
--
<great_expectations.data_asset.data_asset.DataAsset.expectation>`, not a
``column_map_expectation`` or ``column_aggregate_expectation``.
--
<great_expectations.data_asset.data_asset.DataAsset.expectation>`, not a
``column_map_expectation`` or ``column_aggregate_expectation``.
--
<great_expectations.data_asset.data_asset.DataAsset.expectation>`, not a
``column_map_expectation`` or ``column_aggregate_expectation``.
--
<great_expectations.data_asset.data_asset.DataAsset.expectation>`, not a
``column_map_expectation`` or ``column_aggregate_expectation``.
--
<great_expectations.data_asset.data_asset.DataAsset.expectation>`, not a
``column_map_expectation`` or ``column_aggregate_expectation``.
--
<great_expectations.data_asset.data_asset.DataAsset.expectation>`, not a
``column_map_expectation`` or ``column_aggregate_expectation``.
--
expect_column_values_to_be_unique is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_not_be_null is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_be_null is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
For PandasDataset columns with dtype of 'object' expect_column_values_to_be_of_type is a
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>` and will
--
'object'). For PandasDataset columns with dtype of 'object' expect_column_values_to_be_of_type is a
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>` and will
--
expect_column_values_to_be_in_set is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_not_be_in_set is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_be_between is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_be_increasing is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_be_decreasing is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_value_lengths_to_be_between is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_be_between is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_match_regex is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_not_match_regex is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_match_regex_list is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_not_match_regex_list is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_match_strftime_format is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_be_dateutil_parseable is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_be_json_parseable is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
--
expect_column_values_to_match_json_schema is a \
:func:`column_map_expectation <great_expectations.dataset.dataset.MetaDataset.column_map_expectation>`.
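If you would rather not run grep by hand, the same lookup can be sketched in Python. This is a minimal sketch that assumes the docstrings in dataset.py keep the "expect_... is a :func:`column_map_expectation ...`" wording shown in the output above; the helper name classify_expectations and the inline sample text are hypothetical.

```python
import re

# Matches docstring lines such as:
#   expect_column_values_to_be_unique is a \
#   :func:`column_map_expectation <...>`.
# The trailing backslash and the line break are both optional.
DOC_PATTERN = re.compile(
    r"(expect_\w+) is a\s*\\?\s*:func:`(column_(?:map|aggregate)_expectation)\b"
)


def classify_expectations(source_text):
    """Return {expectation_name: decorator_name} found in the docstrings."""
    return {name: kind for name, kind in DOC_PATTERN.findall(source_text)}


# Sample text copied from the grep output; in practice, pass the contents
# of dataset.py instead, e.g. open("dataset.py").read().
sample = (
    "expect_column_values_to_be_unique is a \\\n"
    ":func:`column_map_expectation <great_expectations.dataset.dataset."
    "MetaDataset.column_map_expectation>`.\n"
    "``column_map_expectation`` or ``column_aggregate_expectation``.\n"
)
kinds = classify_expectations(sample)
```

Because the pattern requires an expectation name followed by "is a", the unrelated "not a column_map_expectation" matches are skipped automatically. Like the grep approach, this depends on docstring wording, so treat it as a stopgap until the expectations become inspectable classes.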