Summary: General questions and answers pertaining to target data warehouses when using Alooma.
Most of the time, your data will be available in your target data warehouse within minutes. This depends, however, on the source and the rate at which the data is transmitted, and assumes there are no transformation or replication errors. For example, sources that connect via API, such as Mixpanel or Salesforce, have API call limits that may delay your data replication. Data from other sources, such as events pushed via our SDKs or MySQL replication, is usually available within minutes.
Alooma stores data as ready-to-load CSV files in a temporary S3 bucket and then periodically loads that data into the data warehouse (this is the optimized way to load data into Redshift, for example). Alooma uses COPY commands to put the data into the data warehouse, and the time those commands take to complete varies with several factors.
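To make the staging-then-COPY pattern concrete, here is a minimal sketch of what a Redshift COPY statement for CSV files staged in S3 can look like. The exact commands Alooma issues are not shown in this document; the helper name `build_copy_statement`, the table, the bucket path, and the IAM role below are all hypothetical placeholders.

```python
def build_copy_statement(table: str, s3_uri: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement that loads CSV files from an S3 prefix.

    Illustrative only: the inputs are interpolated without escaping, so this
    sketch assumes trusted values.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV;"
    )


# Hypothetical table, staging location, and role (assumptions, not Alooma's).
statement = build_copy_statement(
    table="events",
    s3_uri="s3://example-staging-bucket/events/batch-0001/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleLoadRole",
)
print(statement)
```

Because one COPY loads every file under the given prefix, batching files and loading them periodically keeps the number of commands low. This also means that a slow or queued COPY delays the whole batch.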
In high-load systems, the COPY commands can take a long time to complete, and this is usually the cause of latency. The amount of data being copied also plays a role. Delays can also come from dependencies elsewhere in the system; for example, Redshift can be down or slow due to AWS issues.
Alooma believes the data is yours to own and to do with as you wish. That’s why we focus on being the best data pipeline in the world; for now, that’s all we focus on. We do not store your data, nor do we provide visualizations or business intelligence. So what you do with your data once it is in your target data warehouse really is up to you. Most of our customers use business intelligence and visualization tools such as Looker, Mode, Chart.io, Re-dash, or Tableau to build analytical reports or dashboards.
Alooma is integral to many components of your ETL process (source data, data pipeline and transformation, data store, and business intelligence layer). However, Alooma is only as good as the integrations and targets it is connected to. If your source integration is delayed in sending data to Alooma, or if your target data warehouse is unavailable or extremely busy (forcing Alooma to queue data), there may be a delay in data replication. If you find this to be a recurring issue, contact us and we can suggest best practices for setting up your integrations and your target data warehouse for the best efficiency.
Alooma is optimized for each type of data warehouse it writes to. Alooma supports the specific data types that each data warehouse supports, accounts for each data warehouse's naming restrictions and conventions, loads to each data warehouse in its own way (e.g. COPY for Redshift or the streaming API for BigQuery), and runs the specific commands each data warehouse needs to stay optimized (e.g. VACUUM for Redshift).
Other than loading data into your target data warehouse, Alooma currently runs only consolidation queries on it, and only if you are connected to integrations that require consolidation, such as MySQL or PostgreSQL.
If you are using auto-mapping for your connected integrations, Alooma follows your data warehouse's naming conventions and restrictions: it adjusts capitalization and replaces special characters so that you do not need to quote names in your queries. If you are mapping manually and use special characters or specific capitalization, you may need to quote those names in your queries, depending on your target data warehouse.
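As an illustration of the kind of normalization described above, the following sketch lowercases a name and replaces special characters with underscores so the result can be used unquoted. The exact rules Alooma applies are not given here and differ by data warehouse; the function name `normalize_identifier` and the specific rules are assumptions made for this example.

```python
import re


def normalize_identifier(name: str) -> str:
    """Return a lowercase name containing only letters, digits, and underscores.

    Assumed rules for illustration; real warehouses each have their own.
    """
    # Lowercase first, then replace every other character with an underscore.
    cleaned = re.sub(r"[^a-z0-9_]", "_", name.lower())
    # Many warehouses reject unquoted names that start with a digit.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


print(normalize_identifier("User-ID"))
print(normalize_identifier("2fast"))
```

A column mapped manually as `User-ID` would have to be written as `"User-ID"` in most SQL dialects, while the normalized `user_id` can be referenced without quotes.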
Alooma is constantly expanding its support for integrations and targets. If you'd like us to support a target that isn't currently supported, please contact us! That target may already be in development, and there may be an opportunity to join a beta group.
The meta tables are tables Alooma creates as part of our exactly-once processing framework. Learn more about meta tables.