GRAVITIE PLATFORM

Reduce Complexity | Save Time & Money | Unlock the Value of your Data

With data at the forefront of critical business decision-making, data teams increasingly need faster, easier access to analytics.

Unlike other cloud-based analytics solutions, Dremio requires no copying or reshaping of your data.

Dremio enables you and your team to power analytics and BI directly on your data lakes, with a simple, easy-to-use, open architecture that provides enterprise readiness across the BI ecosystem and empowers cloud, technology and SI partners to meet customers where they are.

Dremio’s Data Lake Engine delivers lightning fast query speed and a self-service semantic layer operating directly against your data lake storage.

Lightning-Fast Queries

Queries operate directly on data lake storage; connect to S3, ADLS, Hadoop, or wherever your data is.

Dremio technologies like Data Reflections, Columnar Cloud Cache (C3) and Predictive Pipelining work alongside Apache Arrow to make queries on your data lake storage lightning fast.

Accelerate reads with Predictive Pipelining and Columnar Cloud Cache

Dremio’s Predictive Pipelining technology fetches data just before the execution engine needs it, dramatically reducing the time the engine spends waiting for data. The real-time Columnar Cloud Cache (C3) automatically caches data on local NVMe (Non-Volatile Memory Express) as it’s being accessed, enabling NVMe-level performance on your data lake storage.
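As a minimal sketch of the idea behind these two technologies, here is a hypothetical prefetch-plus-cache reader in Python. The `fetch` callback, block numbering and cache bound are all illustrative assumptions; this shows the general pattern of overlapping fetches with consumption, not Dremio's implementation.

```python
import threading
from collections import OrderedDict

class PrefetchingReader:
    """Illustrative prefetch-plus-cache reader (NOT Dremio's implementation).

    `fetch(block_id)` stands in for a remote object-store read; the bounded
    OrderedDict plays the role of a local (e.g. NVMe) cache.
    """

    def __init__(self, fetch, num_blocks, cache_size=4):
        self.fetch = fetch
        self.num_blocks = num_blocks
        self.cache_size = cache_size
        self.cache = OrderedDict()   # block_id -> bytes, kept in LRU order
        self._pending = {}           # block_id -> in-flight prefetch thread

    def _start_prefetch(self, block_id):
        # Predictive pipelining: begin fetching the next block early.
        if (block_id >= self.num_blocks or block_id in self.cache
                or block_id in self._pending):
            return
        t = threading.Thread(
            target=lambda: self._put(block_id, self.fetch(block_id)))
        self._pending[block_id] = t
        t.start()

    def _put(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)   # evict least recently used

    def read(self, block_id):
        self._start_prefetch(block_id + 1)   # overlap fetch with consumption
        if block_id in self._pending:
            self._pending.pop(block_id).join()
        if block_id not in self.cache:       # cold read: fetch synchronously
            self._put(block_id, self.fetch(block_id))
        self.cache.move_to_end(block_id)
        return self.cache[block_id]
```

The key point is in `read`: the fetch for block N+1 is already in flight while the engine consumes block N, and every fetched block lands in the local cache on the way through.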

 

A modern execution engine, built for the cloud

Dremio’s execution engine is built on Apache Arrow, the standard for columnar, in-memory analytics, and leverages Gandiva to compile queries to vectorized code that’s optimized for modern CPUs. A single Dremio cluster can scale elastically to meet any data volume or workload, and you can even have multiple clusters with automatic query routing.

 

Data Reflections – the ON switch for extreme speed

With a few clicks, Dremio lets you create a Data Reflection, a physically optimized data structure that can accelerate a variety of query patterns. Create as many or as few as you want; Dremio invisibly and automatically incorporates Reflections in query plans and keeps the data up to date.
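As a sketch, a reflection can also be declared in Dremio SQL; the dataset, reflection and column names below are hypothetical, and the exact DDL may vary by Dremio version:

```sql
-- Aggregate reflection over a (hypothetical) sales dataset: Dremio will
-- transparently substitute it into matching GROUP BY query plans.
ALTER DATASET s3.sales.transactions
CREATE AGGREGATE REFLECTION per_region
USING DIMENSIONS (region, sale_date)
MEASURES (amount)
```

Queries are still written against the original dataset; the planner decides when the reflection applies.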

 

Arrow Flight moves data up to 1000x faster

ODBC and JDBC were designed in the 1990s for small data, requiring every record to be serialized and deserialized. Arrow Flight replaces them with a high-speed, distributed protocol designed for big data, providing up to a 1000x increase in throughput between client applications and Dremio. You can now populate a client-side Python or R data frame with millions of records in seconds.
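To see why per-record serialization is the bottleneck, here is a small illustrative sketch in plain Python: it contrasts serializing a hundred thousand two-field records one at a time (the ODBC/JDBC pattern) with emitting each column as one contiguous fixed-width buffer (the columnar pattern Arrow uses). This demonstrates the concept only; it is not Arrow Flight itself, and the row counts are arbitrary.

```python
import pickle
from array import array

n = 100_000  # a hundred thousand rows with two integer fields

# Row-at-a-time, ODBC/JDBC style: every record is serialized individually.
rows = [(i, i * 2) for i in range(n)]
row_payload = [pickle.dumps(r) for r in rows]

# Columnar, Arrow style: each column is a single contiguous buffer of
# fixed-width 64-bit values; no per-record conversion happens at all.
col_a = array("q", range(n)).tobytes()
col_b = array("q", (i * 2 for i in range(n))).tobytes()

# One memcpy-able buffer per column versus n little serialized objects.
assert len(col_a) == 8 * n
assert len(row_payload) == n
```

In Arrow Flight, those column buffers cross the wire essentially as-is, which is where the throughput gain over record-oriented protocols comes from.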

 

Self-Service Semantic Layer

Dremio establishes views into data (called virtual datasets) in a semantic layer on top of your physical data, so data analysts and engineers can manage, curate and share data while maintaining governance and security, and without the overhead and complexity of copying data.

Connect any BI or data science tool (e.g., Tableau, Power BI, Looker or Jupyter Notebooks) to Dremio and start exploring and mining your data lake for value.

Dremio’s semantic layer is fully virtual, indexed and searchable. The relationships between your data sources, virtual datasets, transformations and queries are maintained in Dremio’s data graph, so you know exactly where each virtual dataset came from.

Role-based access control makes sure that everyone has access to exactly what they need (and nothing else), and SSO enables a seamless authentication experience.

An abstraction layer enables IT to apply security and business meaning, while enabling analysts and data scientists to explore data and derive new virtual datasets.


A semantic layer generated by your users

Dremio’s semantic layer is an integrated, searchable catalog that indexes all of your metadata, so business users can easily make sense of your data. Virtual datasets and spaces make up the semantic layer, and are all indexed and searchable.

 

Data curation, without copies

By managing data curation in a virtual context, Dremio makes it fast, easy, and cost effective to filter, transform, join, and aggregate data from one or more sources. And virtual datasets are defined with standard SQL, so you can take advantage of your existing skills and tools.
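For example, a curated virtual dataset might join a relational source to Parquet files in the lake; the source, space and column names here are hypothetical:

```sql
-- A virtual dataset is just a saved SQL definition; no data is copied.
CREATE VDS marketing.qualified_leads AS
SELECT c.customer_id,
       c.region,
       SUM(e.amount) AS total_spend
FROM   crm.public.customers c
JOIN   s3.sales."events.parquet" e
  ON   e.customer_id = c.customer_id
WHERE  c.opted_in = TRUE
GROUP  BY c.customer_id, c.region
```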

 

Use your existing BI and data science tools

Dremio appears just like a relational database and exposes ODBC, JDBC, REST and Arrow Flight interfaces, so you can connect any BI or data science tool: Tableau, Power BI, Looker, Jupyter Notebooks and more.

 

Fine-grained access control

Dremio provides row and column-level permissions, and lets you mask sensitive data. Role-based access control makes sure that everyone has access to exactly what they need, and SSO enables a seamless authentication experience.
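One common masking pattern is to expose a virtual dataset that hides the sensitive columns and grant users access to it rather than to the raw table; the dataset and column names below are hypothetical:

```sql
-- Analysts are granted the masked view, never the underlying table.
CREATE VDS finance.payments_masked AS
SELECT payment_id,
       amount,
       CONCAT('****-****-****-', SUBSTR(card_number, 13, 4)) AS card_number
FROM   finance.payments
```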

 

Data lineage

The relationships between your data sources, virtual datasets, and all your queries are maintained in Dremio’s data graph, telling you exactly where each dataset came from.

 

Flexible and Open

Dremio works directly with your data lake storage. You don’t have to send your data to Dremio, or have it stored in proprietary formats that lock you in. Dremio is built on open source technologies such as Apache Arrow, and can run in any cloud or data center.

Avoid vendor lock-in, query across clouds, and keep your data in storage that you control.

No vendor lock-in

Dremio works directly with your data lake storage, so you don’t have to load the data into a proprietary data warehouse and deal with skyrocketing costs. Your data stays in its existing systems and formats, so you can always access it with any technology, even without Dremio.

 

Multi-cloud and hybrid cloud

Run Dremio on AWS, Azure, or on-premises. You can even query data across disparate regions or clouds. And the abstraction provided by Dremio’s semantic layer enables you to migrate data from one location to another without impacting your analysts or data scientists.

 

Best-of-breed technology

With Dremio, your data can stay in data lake storage that you control. You can use Dremio alongside hundreds of other technologies that also work with data lake storage, including ETL services, data science tools and compute engines.

 

Apache Arrow inside

Apache Arrow, which was originally Dremio’s internal memory format, has become the industry standard for in-memory columnar analytics, with millions of monthly downloads. Arrow-enabled applications realize a dramatic increase in processing and data transport speeds. The Gandiva kernel, developed at Dremio, provides up to 80x speedups on top of Arrow’s other innovations, and Arrow Flight provides a modern, industry-standard way to share data across distributed systems and data science tools.

 

Join with Anything

Powerful joining abilities mean that your data is always accessible without ETL. Dremio ships with over a dozen connectors, and Dremio Hub includes many other community-developed connectors.

While a lot of your data may already be in data lake storage, you probably have data in other places too. Dremio makes it easy to join your data lake storage with all the other places you’re keeping your data, without ETL.

Connect to any database

With many built-in and community-developed connectors, Dremio can easily and securely connect to your existing databases, and even join that data to your data lake storage or other places your data is stored.

 

Powerful query pushdowns

Dremio has the most powerful query pushdowns in the industry, powered by its Advanced Relational Pushdown (ARP) engine. Dremio understands each source database’s capabilities and query language, enabling partial and complete pushdowns for even the most complex query plans.
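For instance, given a query like the following (source and column names hypothetical), Dremio can push the filter and aggregation down to the relational source, so only the small aggregated result travels over the network:

```sql
SELECT region, COUNT(*) AS order_count
FROM   postgres.public.orders
WHERE  order_date >= DATE '2021-01-01'
GROUP  BY region
```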

 

Dremio Hub

In addition to the native connectors built into Dremio, Dremio Hub provides a marketplace of community-provided connectors to download, making it easy to join your data lake storage with all the other places you keep your data, without ETL.

 

Dremio Connector SDK

Want to connect to a source we don’t support yet? Connectors can be built to any data source with a JDBC driver and are template-based, making it simple to define new connectors without complex coding.

Interested in discovering what Dremio can do for you?

For demo and licensing queries, or for more info, contact Gravitie today!

© 2021 Gravitie Data Ltd.
All rights reserved | Proudly made in Ireland