With Alooma, you extract data from an Input (source), and load it into an Output (destination). Before you can extract data, you'll need an Output — a place to put that data. Have you already created an Output in Alooma (something like Redshift, or BigQuery, or Snowflake)?
Once you have a place to put the data you extract, you just need to create an Input. Your Input will automatically point to your Output and start loading your data. You can see this working on the Live page or the Live tab on your Input.
Go to the Plumbing page. If you already have an Output, you'll see a blue square on the right side of the page labeled with the Output type (Snowflake, for example). If not, you can create one from the Plumbing screen (Output -> Settings).
Use the Code Engine to transform data.
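As a rough illustration, a Code Engine transformation can be thought of as a Python function that receives each event as a dictionary and returns the modified event. The sketch below assumes that shape. The field names (`email`, `user_id`, `processed`) are hypothetical, and the convention of returning `None` to discard an event is an assumption to verify against the Code Engine documentation.

```python
def transform(event):
    """Minimal sketch of a per-event transformation (field names are hypothetical)."""
    # Assumed convention: returning None discards the event.
    if event.get("user_id") is None:
        return None

    # Normalize the email field if it is present and is a string.
    email = event.get("email")
    if isinstance(email, str):
        event["email"] = email.strip().lower()

    # Enrich the event with an illustrative flag.
    event["processed"] = True
    return event


if __name__ == "__main__":
    sample = {"user_id": 42, "email": "  Jane.Doe@Example.COM "}
    print(transform(sample))
```

Keeping the function free of side effects, other than editing the event it was given, makes it easy to test locally with sample events before using the same logic in your pipeline.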
Your data loads automatically once you've created an Input (the source of the data) and an Output (the destination for the data). If you want to change the schema or how the data is mapped, use the Mapper.
First things first, don't worry, Alooma has you covered. Any events that don't get loaded into your Output are captured in the Restream Queue for reloading.
Alooma is the best data pipeline in the world. Seriously, that is our goal: to be a managed, flexible, and reliable data pipeline which enables you to focus on using your data rather than trying to manage it.
If you have multiple sources of data like Mixpanel for analytics, a MySQL database with customer information, Salesforce billing data, and server logs - you probably know how hard it can be to query all these disparate sources simultaneously to gain actionable business insight.
Alooma allows you to stream data from all these sources, transform and enrich the data, and then store this data in your own data warehouse - be it Amazon Redshift, Google BigQuery, Snowflake, MemSQL, or anything else! Most importantly, Alooma does all of this in real time, with a safety net so you never lose data, while providing a powerful and dynamic code layer on top of the real-time data. Finally, Alooma makes mapping both structured and semi-structured data quick and simple, reliably and at scale.
In short, Alooma is the modern way to ETL, without the headache.
That’s a good question. Most of our customers end up deciding based on these factors:
How quickly do I want to get set up? Days or months?
How complex are my data needs? One source with a schema that never changes or many sources with more in the future that change schemas often?
How fast am I growing? Can I handle my current and future scale or will any growth exceed my capabilities?
How important is data integrity? Do I need all of my data no matter what, or is data loss acceptable?
How fast do I want the data? Minutes or delayed up to a few days or weeks?
What kind of transformation or enrichment do I need? Is this just a dump and load or will making minor or even major changes along the way be of importance?
If you’re a bit fuzzy on answering some of these questions, let’s discuss your situation and we can share our best practices. Just contact us!
If building your own data pipeline seems like the best option, our VP of Research and Development wrote a blog post about this very topic to help get you started.
That's one of the things Alooma does best! We know data schemas change - it's just a matter of time. So, we've created a flexible way for you to either "set and forget" your data, or decide that you'd like to be actively involved in handling all data changes.
When setting up an integration you can decide if you want it to be set up and mapped to your target data warehouse automatically with OneClick, or if you'd like to set up a customized integration on your own. If you choose OneClick, all schema changes will be handled automatically. Regardless of your setup choice, you can also manage how you want Alooma to handle any schema changes.
Have no fear! Alooma's Restream Queue is your safety net: it catches all events that don't get loaded into your target data warehouse for any reason. You can fix the issue and restream the errored event(s) at any time to ensure exactly-once processing.
Learn more about the Restream Queue.
No, Alooma is a data pipeline and your data in our system is considered “data in motion,” which allows us to work within many financial services and health care regulations.
We do not persist your data as we are not a data warehouse. Once we send your data to your data warehouse, it is yours to own and manage.
If a situation occurs that prevents data from loading to your data warehouse (a copy error, a transformation error, type conversion error, etc.), the events that have not yet replicated to your data warehouse will be temporarily stored within Alooma’s Restream Queue until the error(s) are rectified and a Restream occurs.
Again, for our customers working with compliance requirements, if a policy of flushing the queue within 24 hours or another restriction is necessary, we can work with you to determine your needs and adjust to fit you best.
Setting up Alooma for the first time should take about 10 minutes, depending on the number of integrations you initially add. From there, it’s up to us to perform any back-end historical data dumps or initialize any schema mappings that need a bit of extra attention. This is usually done within 24 hours of your first sign-up, and we'll be in close communication to update you on the progress.
Of course! You can select the frequency of email digest delivery in our Settings section. Options include never, hourly, or daily. Then, you can choose whether notifications of each kind (warnings, errors, etc.) arrive immediately as they occur or in a digest format.