What are consolidation queries?
A consolidation query is a query Alooma uses to generate a 1:1 replica of a data source in a target data warehouse after raw event data is loaded to the warehouse. Consolidation queries were "born" due to the append-only nature of some data warehouses, namely Redshift, to simulate "upsert" behavior.
A common use case: A company has an updateable application database (e.g. PostgreSQL, MySQL), as part of their production environment. The company's analytics solutions and applications that rely on the database both suffer from performance degradation as they churn the database - an issue that gets worse as the data grows. The company's data scientists would like their tables to be replicated to Redshift, with all the INSERTs, UPDATEs and DELETEs accounted for, to lessen the load on the DB and improve their analytics solution performance.
But with Redshift as an append-only-oriented data warehouse, how can this be done in a performant and reliable way? By using consolidation queries to bridge the gap.
How do they work?