Data at your service.
The Altiscale Data Cloud is secure, scalable, and comprehensive, and our operations team keeps it hassle-free.
The Altiscale Data Cloud was built to solve enterprise-class Big Data challenges with ease. It is a comprehensive Big Data solution, based on the Apache Hadoop and Apache Spark ecosystem. It offers a full data science workbench. It is surrounded with world-class security technology and process protections. And it all runs in a highly secure datacenter where the hardware and networking are specifically selected, configured, and managed for Big Data operations.
The result is elasticity, high reliability, and blazingly fast performance. Your Big Data jobs get done—quickly—without unnecessary stress. That’s why some of the largest organizations in the world trust Altiscale with their data and analytics and even run their own applications on top of Altiscale for their customers.
Altiscale Data Cloud
The Altiscale Data Cloud is specifically constructed and managed to deliver high-performance Big Data results. A comprehensive solution built to satisfy the demands of a broad range of data analytics requirements, it comes preconfigured with core compute engines such as Spark, Tez, and MapReduce, as well as services such as Hive, Oozie, Pig, and Spark SQL.
Altiscale runs Hadoop on hardware that is purpose-built and tuned for Hadoop. Not only does Altiscale select the best hardware for Big Data, we also configure the kernel and network parameters on top of the hardware for Big Data performance.
As a result, the Altiscale Data Cloud handily beats the performance of its competitors, as proven in real-world examples by Altiscale customers.
Storage Services — HDFS
Storage services consist primarily of the Hadoop Distributed File System (HDFS) and HCatalog. HDFS is a highly reliable, fault-tolerant, distributed storage system used for storing and retrieving Big Data at high throughput. HCatalog is a storage management layer that provides a common interface to multiple compute services.
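To make the fault-tolerance point concrete, here is a minimal sketch of how HDFS replication affects storage footprint. It assumes the stock Apache Hadoop 2.x defaults (128 MiB blocks, replication factor 3); these are illustrative assumptions, not a statement of how any Altiscale cluster is configured, and the function name `hdfs_footprint` is hypothetical.

```python
# Assumed Apache Hadoop 2.x defaults; a real cluster may override both values.
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024  # dfs.blocksize: 128 MiB
DEFAULT_REPLICATION = 3                 # dfs.replication


def hdfs_footprint(file_size_bytes,
                   block_size=DEFAULT_BLOCK_SIZE,
                   replication=DEFAULT_REPLICATION):
    """Return (block_count, raw_bytes_stored) for a single file.

    HDFS splits a file into fixed-size blocks (the last block holds only
    the remaining bytes) and stores `replication` copies of each block.
    """
    if file_size_bytes < 0:
        raise ValueError("file size must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    block_count = -(-file_size_bytes // block_size)
    raw_bytes = file_size_bytes * replication
    return block_count, raw_bytes


if __name__ == "__main__":
    one_gib = 1024 ** 3
    blocks, raw = hdfs_footprint(one_gib)
    print(f"1 GiB file: {blocks} blocks, {raw / one_gib:.0f} GiB of raw storage")
```

Under these assumptions, every block survives the loss of any two of the nodes that hold it, at the cost of three times the logical storage.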
The Altiscale operations team works around the clock for you, monitoring the clusters for hardware or software failures. We take professional pride in ensuring that your service is up to date, fault tolerant, and reliable.
Altiscale supports Kerberos-enabled Hadoop clusters, which ensure the user is authenticated before accessing HDFS.
Altiscale considers scale-out architecture to be core to its business. Customers can easily expand capacity as their needs increase without worrying about hardware. Altiscale customers benefit from elastic storage, which grows and shrinks to meet their needs over time.
Resource Management — YARN
Resource management in the Altiscale Data Cloud is handled by YARN (Yet Another Resource Negotiator), a large-scale distributed operating system for Big Data introduced in Hadoop 2.x, which improves significantly on resource management in Hadoop 1.x. Big Data jobs often come in bursts, requiring rapid shifts in processing capacity. While some jobs require only a small amount of computing capacity, others dramatically exceed the data volume of an average job. One of the key benefits of the Altiscale Data Cloud is elasticity: customers get rapid access to the capacity they need to get their largest jobs done, without having to worry about hardware or job scheduling. Because Altiscale is in the cloud, the resources for scaling are available whenever the customer needs them.
YARN lets multiple data processing engines, such as Spark, Tez, and MapReduce, run on top of Hadoop. This unlocks an entirely new approach to data analytics by enabling multiple analytics frameworks to run simultaneously and take advantage of the data stored in HDFS.
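One way to picture several engines sharing a single cluster is a max-min fair split of capacity. The sketch below is a toy model only: it is not how any YARN scheduler is implemented, and the capacity units ("containers"), the demand figures, and the function name `max_min_fair_share` are hypothetical.

```python
def max_min_fair_share(capacity, demands):
    """Split `capacity` among named demands using max-min fairness.

    Small demands are met in full; the capacity they leave unused is
    shared equally among the demands that are still unsatisfied.
    """
    allocation = {name: 0.0 for name in demands}
    remaining = float(capacity)
    unsatisfied = {name: float(d) for name, d in demands.items() if d > 0}

    while unsatisfied and remaining > 1e-9:
        equal_share = remaining / len(unsatisfied)
        # Demands that fit inside the equal share are fully served this round.
        served = {n: need for n, need in unsatisfied.items()
                  if need <= equal_share}
        if not served:
            # Nobody fits: everyone left gets the same equal share.
            for name in unsatisfied:
                allocation[name] += equal_share
            break
        for name, need in served.items():
            allocation[name] += need
            remaining -= need
            del unsatisfied[name]
    return allocation


if __name__ == "__main__":
    # Hypothetical container demands from three engines on one cluster.
    print(max_min_fair_share(100, {"spark": 20, "tez": 50, "mapreduce": 80}))
```

In this example the small Spark demand is met in full, and the two larger demands split what is left evenly.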
Ready for Any Job, Anytime
Run simultaneous applications on the same dataset.
Utilize Altiscale’s elasticity and expand service capacity to accommodate spiking demand.
Altiscale is optimized for Big Data and easily scales as overall data volumes grow.
Compute services consist of a set of compute engines and the basic services that sit on top of them to perform different types of processing. Several compute engines run on Altiscale. Depending on the use case, any of the following can be used on top of the Altiscale Data Cloud.
Apache Spark: Run Spark in production.
Apache Tez: Run your existing MapReduce jobs faster on Tez.
MapReduce: Use the most stable, batch-based processing engine.
Apache Hive: Use the most reliable SQL-like data warehouse purpose-built on Hadoop. Visualize using Tableau or any tool that connects to Hive using JDBC or ODBC.
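For tools that connect to Hive over JDBC, the connection is typically described by a HiveServer2 URL. The sketch below assembles one, assuming the standard `jdbc:hive2://` scheme and the default HiveServer2 port of 10000. The host name, Kerberos principal, and the helper `hive_jdbc_url` are hypothetical placeholders, not actual Altiscale endpoints.

```python
def hive_jdbc_url(host, port=10000, database="default",
                  kerberos_principal=None):
    """Build a HiveServer2 JDBC URL.

    `kerberos_principal` is the HiveServer2 service principal; it is
    only appended when the cluster uses Kerberos authentication.
    """
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if kerberos_principal:
        url += f";principal={kerberos_principal}"
    return url


if __name__ == "__main__":
    # Hypothetical host and realm, for illustration only.
    print(hive_jdbc_url("hive.example.com"))
    print(hive_jdbc_url(
        "hive.example.com",
        database="sales",
        kerberos_principal="hive/hive.example.com@EXAMPLE.COM",
    ))
```

A BI tool or JDBC client would take the resulting string as its connection URL; the exact host, port, and principal come from your cluster's configuration.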
Big Data Solutions and Analytics
Altiscale makes it easy for you to search, explore, obtain insights, and do analytics on top of Big Data. We provide a set of open-source solutions for data exploration and analytics as well as partner with specialized analytics providers to ensure that data scientists have a complete workbench of effective tools to get their jobs done right.
Data Transfer and Connectivity Options
There are several secure options for moving data in and out of the Altiscale Data Cloud. The options available also depend on the type of data that needs to be moved. The different types of data can be roughly divided as follows:
Transfer data between Altiscale and a structured datastore, such as a relational database.
Transfer bulk data between Apache Hadoop and structured datastores, such as relational databases.
Stream data to Altiscale using a high-throughput pub/sub-based solution.
Ingest event-based data to HDFS using Camus, or stream it using Spark Streaming.
Collect, aggregate, and move large amounts of log data directly to Altiscale.
Use the standard DistCp tool that comes with Apache Hadoop, or use our open-source tools to transfer data efficiently to Altiscale. Transfer from any of the following with ease:
On-premises storage area network (SAN) or network-attached storage (NAS)
Cloud storage, such as Amazon Simple Storage Service (S3)
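As a rough illustration of a DistCp transfer from cloud storage, the sketch below assembles a `hadoop distcp` command line. The `-update` and `-m` options are standard DistCp flags; the bucket name, target path, and the helper `distcp_command` are hypothetical, and the example assumes S3 credentials are already configured for the `s3a://` connector.

```python
import shlex


def distcp_command(source, target, max_maps=None, update=False):
    """Return the argument list for a `hadoop distcp` invocation.

    update   -- copy only files that are missing or differ at the target
    max_maps -- upper bound on the number of simultaneous copy tasks
    """
    cmd = ["hadoop", "distcp"]
    if update:
        cmd.append("-update")
    if max_maps is not None:
        cmd += ["-m", str(max_maps)]
    cmd += [source, target]
    return cmd


if __name__ == "__main__":
    # Hypothetical bucket and HDFS path, for illustration only.
    cmd = distcp_command("s3a://example-bucket/logs", "hdfs:///data/logs",
                         max_maps=20, update=True)
    # Print the shell-quoted command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps paths with spaces or special characters safe when it is later passed to `subprocess.run`.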
Whether your data is in your private cloud or Amazon S3, we provide enterprise-class connectivity options to Altiscale Data Cloud.
Workbench – SSH
Use secure shell to view, run, or access your cluster.
High-Throughput Transfer Host
A high-throughput transfer host will ingest Big Data at a high frequency.
Use IPSec to authenticate and secure your data transfers.
Deliver data directly through Altiscale’s Direct Connection.
The Altiscale Portal is a central location where you can add or remove users, control access to your cluster, and view cluster information, job details, and usage details.
Altiscale Advisory Services and Proactive Support
Altiscale advisory services bring users a team of experts available to provide advice regarding which engines to use and how to best plan their jobs. By working with Altiscale advisory services in advance, users can more rapidly and easily achieve their goals. Altiscale proactive support helps customers keep their jobs on track, notifying them and providing fixes in advance when jobs look like they might be headed for trouble.
Altiscale drives superior results.
Read what Forrester Consulting has to say about the benefits Altiscale customers experience.
Highlighted results: Hadoop job failures, job completion times, and a significant improvement in data scientist productivity.