DataSource : Spark
Hosted Environment : Databricks
Use Case : I need an expectation that compares 2 months of data in the same batch by counting the total number of stores in each month; the percentage difference between the two counts should be within +/- 5%.
More precisely, as shown in the attached screenshot, I have a batch holding 2 months of data. I need to count the total number of stores in each month and then compute the variance % using the formula below:
variance % = [ (latest month store count - previous month store count) / latest month store count ] * 100
The variance % should be within the +/- 5% threshold. If it does not meet the threshold, the Expectation must be marked as failed. Any thoughts would be greatly appreciated!
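
To make the intended check concrete, here is a minimal sketch of the logic in plain Python. It is not Great Expectations or Spark code, and it is only one way of reading the requirement. The function name `store_count_variance`, the `(month, store_id)` row shape, the `YYYY-MM` month format, and counting distinct store IDs are all assumptions made for illustration; in Spark the two per-month counts would more likely come from a group-by on the month column.

```python
from collections import defaultdict


def store_count_variance(rows, threshold_pct=5.0):
    """Compare store counts between the two months found in one batch.

    rows: iterable of (month, store_id) pairs. Months are assumed to be
    strings that sort chronologically, e.g. "2023-01" (hypothetical format).
    """
    # Collect the distinct store IDs seen in each month (assumption:
    # "total number of stores" means distinct stores, not row count).
    stores_by_month = defaultdict(set)
    for month, store_id in rows:
        stores_by_month[month].add(store_id)

    if len(stores_by_month) != 2:
        raise ValueError("expected exactly 2 months of data in the batch")

    previous_month, latest_month = sorted(stores_by_month)
    previous_count = len(stores_by_month[previous_month])
    latest_count = len(stores_by_month[latest_month])

    # Formula from the post, expressed as a percentage:
    # (latest - previous) / latest * 100
    variance_pct = (latest_count - previous_count) / latest_count * 100

    return {
        # The check passes only when the variance is within +/- threshold.
        "success": abs(variance_pct) <= threshold_pct,
        "variance_pct": variance_pct,
        "previous_count": previous_count,
        "latest_count": latest_count,
    }
```

For example, with 100 stores in the previous month and 104 in the latest month the variance is about 3.85% and the check passes; with 110 stores in the latest month it is about 9.09% and the check fails.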