
User:Nisarg64/Apache Ambari

From Wikipedia, the free encyclopedia
Ambari
Developer(s): Apache Software Foundation
Stable release: 2.1.0 / July 9, 2015
Operating system: Cross-platform
Type: Distributed computing
License: Apache License 2.0
Website: ambari.apache.org

Apache Ambari is a software project of the Apache Software Foundation that aims to make Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs. Ambari was originally a sub-project of Hadoop but is now a top-level project in its own right.

Ambari is used by companies including Cardinal Health, eBay, Expedia, Kayak, Lending Club, Neustar, Pandora, Priceline.com, Samsung, Shutterfly, and Spotify.[1]

Overview


Ambari offers an intuitive collection of tools and APIs that simplify cluster operation by hiding the complexity of Hadoop. It enables system administrators to provision, manage, and monitor a Hadoop cluster, and to integrate Hadoop with existing enterprise infrastructure. Ambari simplifies the deployment and management of hosts regardless of the size of the cluster.[2]

  • Provision a Hadoop Cluster
  • Manage a Hadoop cluster
  • Monitor a Hadoop cluster
  • Integrate Hadoop with the Enterprise

Hadoop cluster provisioning and ongoing management can be a complicated task, especially when there are hundreds or thousands of hosts involved. Ambari provides a single control point for viewing, updating and managing Hadoop service life cycles.

Provision a Hadoop Cluster

Figure: Ambari Install Wizard

Ambari includes an intuitive web interface that provides a step-by-step wizard for installing Hadoop services across multiple hosts. This allows easy provisioning, configuration, and testing of all Hadoop services and core components. Ambari also manages the configuration of Hadoop services for the cluster.[3]





Manage a Hadoop Cluster

Figure: Cluster Configuration

Ambari acts as a central point of management for starting, stopping, and reconfiguring Hadoop services across the entire cluster. Cluster management is further simplified by the tools that Ambari provides.[3]





Monitor a Hadoop Cluster

Figure: Monitoring Dashboard

Ambari provides a dashboard that gives instant insight into the health and status of the Hadoop cluster. It uses the Ambari Metrics System to collect cluster metrics and visualizes the cluster's operational data in a web interface. Ambari also has a preconfigured alert system that notifies users when attention is needed (e.g., a node goes down or remaining disk space is low).[3]
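The kind of check behind a low-disk-space alert can be sketched as a simple threshold evaluation. The following Python example is illustrative only and is not Ambari's implementation: the function name `evaluate_disk_alert` and the default thresholds are assumptions, and the state names mirror the OK, WARNING, CRITICAL, and UNKNOWN levels commonly used by monitoring systems.

```python
def evaluate_disk_alert(free_bytes, total_bytes, warning_pct=50.0, critical_pct=80.0):
    """Classify disk usage on one host into an alert state.

    The thresholds are hypothetical defaults chosen for this sketch.
    Returns a (state, message) tuple.
    """
    if total_bytes <= 0:
        # Without a valid total there is nothing to compare against.
        return "UNKNOWN", "total disk size not reported"

    used_pct = 100.0 * (total_bytes - free_bytes) / total_bytes
    if used_pct >= critical_pct:
        state = "CRITICAL"
    elif used_pct >= warning_pct:
        state = "WARNING"
    else:
        state = "OK"
    return state, f"{used_pct:.1f}% of disk used"


if __name__ == "__main__":
    # A host with 10 GB free out of 100 GB would be flagged for attention.
    print(evaluate_disk_alert(10 * 1024**3, 100 * 1024**3))
```

A notification step would then be triggered whenever the returned state is anything other than OK.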





Integrate Hadoop with the Enterprise


Organizations can integrate Hadoop provisioning, management, and monitoring capabilities into their own enterprise applications using the Ambari REST APIs.[3]
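As an illustration of this kind of integration, the following Python sketch builds (but does not send) a request asking Ambari to change the state of a service. The host name, cluster name, and credentials are placeholders, and the helper `build_service_state_request` is hypothetical. The endpoint path, the `X-Requested-By` header, and the payload layout reflect the general shape of the Ambari REST API and should be checked against the API documentation for the Ambari version in use.

```python
import base64
import json
import urllib.request


def build_service_state_request(base_url, cluster, service, state, user, password):
    """Build a PUT request that asks Ambari to move a service to `state`.

    Nothing is sent here; the caller decides when to call urlopen().
    """
    url = f"{base_url}/api/v1/clusters/{cluster}/services/{service}"
    payload = {
        "RequestInfo": {"context": f"Set {service} to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    # HTTP Basic authentication header built from the supplied credentials.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "X-Requested-By": "ambari",
        },
    )


if __name__ == "__main__":
    # Placeholder host, cluster, and credentials for illustration only.
    req = build_service_state_request(
        "http://ambari.example.com:8080", "mycluster", "HDFS", "STARTED", "admin", "admin"
    )
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```

Separating request construction from sending keeps the example testable without a running Ambari server.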

Ambari Architecture[4]


Ambari Server


The Ambari server includes an API handler, which is also called the coordinator. The server receives a request, generates a request ID, and attaches it to the request. The corresponding API handler is then invoked to carry out the steps needed to fulfill the request.

The coordinator consults the Dependency Tracker to determine which dependencies must be handled for the request. The Dependency Tracker returns the prerequisite components and the states they must be in for the request to complete. The coordinator saves these details in the database and then passes them to the Stage Planner. The Stage Planner produces the staged sequence of operations to be performed on each node of the affected components, using the Manifest Generator to define the task roles for each node in each stage.
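The idea of turning component dependencies into ordered stages can be sketched as a layered topological sort. This Python example is a simplified illustration under stated assumptions, not Ambari's actual planner: the function `plan_stages` and the component names in the usage example are hypothetical.

```python
def plan_stages(dependencies):
    """Group components into ordered stages.

    `dependencies` maps each component to the set of components that must
    complete first. Components placed in the same stage do not depend on
    each other, so their operations could run in parallel.
    """
    remaining = {name: set(deps) for name, deps in dependencies.items()}
    # Include components that appear only as prerequisites.
    for deps in dependencies.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    stages = []
    done = set()
    while remaining:
        # A component is ready when all of its prerequisites are done.
        ready = sorted(name for name, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(f"circular dependency among: {sorted(remaining)}")
        stages.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return stages


if __name__ == "__main__":
    example = {
        "DATANODE": {"NAMENODE"},
        "HBASE_MASTER": {"NAMENODE", "ZOOKEEPER_SERVER"},
    }
    for number, stage in enumerate(plan_stages(example), start=1):
        print(f"stage {number}: {stage}")
```

Each resulting stage contains only components whose prerequisites were satisfied by earlier stages, which matches the staged execution described above.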

The coordinator passes this ordered list of tasks to the Action Manager along with the request ID. The Action Manager updates the state of each node component in the finite-state machine (FSM), which reflects the progress of the operation. The FSM is also responsible for detecting invalid event flows and generating failure messages.
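A minimal sketch of such a state machine is shown below. The state names, event names, and transition table are illustrative assumptions and do not reproduce Ambari's actual FSM; the point is only that each component has a current state and that events not allowed in that state are rejected with a failure message.

```python
class InvalidTransition(Exception):
    """Raised when an event is not allowed in the current state."""


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("INIT", "install"): "INSTALLING",
    ("INSTALLING", "succeeded"): "INSTALLED",
    ("INSTALLING", "failed"): "INSTALL_FAILED",
    ("INSTALLED", "start"): "STARTING",
    ("STARTING", "succeeded"): "STARTED",
    ("STARTING", "failed"): "INSTALLED",
    ("STARTED", "stop"): "STOPPING",
    ("STOPPING", "succeeded"): "INSTALLED",
}


class ComponentStateMachine:
    """Tracks the state of one component on one node."""

    def __init__(self, name, state="INIT"):
        self.name = name
        self.state = state

    def handle(self, event):
        """Apply an event, or raise InvalidTransition with a failure message."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise InvalidTransition(
                f"{self.name}: event '{event}' is not valid in state {self.state}"
            )
        self.state = TRANSITIONS[key]
        return self.state


if __name__ == "__main__":
    fsm = ComponentStateMachine("DATANODE")
    for event in ("install", "succeeded", "start", "succeeded"):
        print(event, "->", fsm.handle(event))
```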

The Action Manager generates an action ID for each operation and adds it to the plan. It then picks actions from the plan and adds them to the queue of each affected node for the current stage. When a stage is complete, it picks actions from the next stage. It also starts a timer for scheduled actions.

The Heartbeat Handler receives action responses and passes them to the Action Manager, which in turn informs the FSM of the state changes. An action is considered complete once all nodes have finished their assigned tasks. A stage is considered complete once all of its actions are complete, after which the next stage begins. The completion of each action is also recorded in the database.
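The staged queueing and completion tracking described above can be sketched as follows. This Python example is a simplification under stated assumptions, not Ambari's implementation: the class `ActionManagerSketch`, its method names, and the data layout are hypothetical, and timers, the FSM, and database persistence are left out.

```python
from collections import deque


class ActionManagerSketch:
    """Queues one stage of actions at a time and advances when it completes."""

    def __init__(self, stages):
        # Each stage is a dict mapping a host name to a list of commands.
        self._pending_stages = deque(stages)
        self._next_action_id = 1
        self._outstanding = set()   # action IDs not yet reported complete
        self._stage_active = False
        self.queues = {}            # host name -> deque of (action_id, command)
        self.completed_stages = 0
        self._schedule_next_stage()

    @property
    def finished(self):
        return not self._stage_active and not self._pending_stages

    def _schedule_next_stage(self):
        if not self._pending_stages:
            self._stage_active = False
            return
        stage = self._pending_stages.popleft()
        for host, commands in stage.items():
            for command in commands:
                action_id = self._next_action_id
                self._next_action_id += 1
                self.queues.setdefault(host, deque()).append((action_id, command))
                self._outstanding.add(action_id)
        self._stage_active = True

    def take_commands(self, host):
        """Hand a host its queued actions, as would happen on a heartbeat."""
        queue = self.queues.get(host, deque())
        commands = list(queue)
        queue.clear()
        return commands

    def on_heartbeat(self, completed_action_ids):
        """Record completed actions; start the next stage when none remain."""
        if not self._stage_active:
            return
        self._outstanding.difference_update(completed_action_ids)
        if not self._outstanding:
            self.completed_stages += 1
            self._schedule_next_stage()


if __name__ == "__main__":
    manager = ActionManagerSketch([
        {"host1": ["install"], "host2": ["install"]},
        {"host1": ["start"]},
    ])
    print(manager.take_commands("host1"), manager.take_commands("host2"))
    manager.on_heartbeat([1, 2])   # both installs reported complete
    print(manager.take_commands("host1"))
```

Actions from the second stage only become visible to a host after every action in the first stage has been reported complete.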

Ambari Agent


The Ambari Agent communicates with the Ambari server only through heartbeat messages. Every command received from the server is appended to the action queue. The action executor picks an action from the queue and selects the appropriate component to perform it. The resulting action responses are placed in a message queue, which is sent to the server with the next heartbeat.
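The agent-side flow can be sketched with two queues and a heartbeat function. This Python example is illustrative only: the class `AmbariAgentSketch`, the message fields, and the `send` callable that stands in for the network exchange with the server are all assumptions made for the sketch.

```python
from collections import deque


class AmbariAgentSketch:
    """Keeps an action queue and a response queue, exchanged via heartbeats."""

    def __init__(self, hostname, executors):
        self.hostname = hostname
        # Maps a component name to a callable that performs an action on it.
        self.executors = executors
        self.action_queue = deque()
        self.response_queue = deque()

    def execute_pending(self):
        """Run every queued action and queue a response for each one."""
        while self.action_queue:
            command = self.action_queue.popleft()
            try:
                executor = self.executors[command["component"]]
                output = executor(command["action"])
                status = "COMPLETED"
            except Exception as exc:  # unknown component or failed action
                output = repr(exc)
                status = "FAILED"
            self.response_queue.append(
                {"action_id": command["action_id"], "status": status, "output": output}
            )

    def heartbeat(self, send):
        """Send queued responses and enqueue any commands the server returns.

        `send` is a hypothetical callable that takes the heartbeat message
        and returns a list of new commands.
        """
        message = {"hostname": self.hostname, "responses": list(self.response_queue)}
        self.response_queue.clear()
        self.action_queue.extend(send(message))
        return message


if __name__ == "__main__":
    def fake_server(message):
        # Issue one command on the first heartbeat, nothing afterwards.
        if not message["responses"] and not fake_server.issued:
            fake_server.issued = True
            return [{"action_id": 1, "component": "DATANODE", "action": "start"}]
        return []

    fake_server.issued = False
    agent = AmbariAgentSketch("host1", {"DATANODE": lambda action: f"DATANODE {action} ok"})
    agent.heartbeat(fake_server)
    agent.execute_pending()
    print(agent.heartbeat(fake_server))
```

Because the agent only ever talks to the server inside `heartbeat`, results of an action are reported no sooner than the heartbeat that follows its execution.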

Features of Apache Ambari

  • Wizard-driven web user interface: assists installation of Hadoop across multiple hosts
  • Granular service control: precise management of Hadoop service and component lifecycles
  • Configuration change history: ongoing management of Hadoop service configurations
  • RESTful APIs: enable integration with enterprise systems

Major Features, Releases and Roadmap

Features by version and release date:

Version 2.2.0 (release date TBD)

Core Platform
  • Ambari Server HA

Version 2.1.0 (July 2015)

Core Platform
  • Update Ambari OS Support
  • Update Ambari JDK support
  • Enhanced Configs
  • Customizable Dashboard
  • Rack Awareness

Ambari Stacks: HDP
  • NFS Gateway
  • Accumulo
  • User Views

Version 2.0.0 (April 2015)

Core Platform
  • Ambari Alerts
  • Ambari Metrics System
  • Rolling Upgrade
  • Automated Cluster Upgrades
  • Automated Kerberization

Ambari Blueprints
  • "Add Hosts" Blueprints API support

Ambari Stacks
  • Service Inheritance
  • Common Services

Version 1.7.0 (November 2014)

Core Platform
  • Ubuntu support
  • Ambari Views framework
  • Ambari Administration
  • Cancel/Abort background operation requests
  • New Host Check: THP (Transparent Huge Pages)
  • Download Service client configs
  • Expose Ambari UI for config versioning, history and rollback
  • Ability to manage -env.sh configuration files
  • Ability to set configuration properties as <final>
  • Migration of the "Jobs" tab into a "Jobs View" (e.g., delivered as part of the Views framework)

Ambari Stacks
  • Recommendations and validations (via a "Stack Advisor")

Ambari Blueprints
  • Export service configurations via Blueprint

Capabilities specific to HDP Stacks
  • Install + Manage Flume
  • Capacity Scheduler queue refresh
  • HDFS Rebalance (AMBARI-5934)
  • Delete ZooKeeper Server host component
  • ResourceManager HA

Version 1.6.1 (July 2014)
  • New Host Checks
  • Database Connection Test
  • Simplified JDBC Driver Setup
  • Stack Repository Mgmt via Ambari Web
  • Bug Fixes

Version 1.6.0 (May 2014)
  • Add support for PostgreSQL DB (Ambari, Oozie, Hive Metastore)
  • Ambari Blueprints

Version 1.5.0 (April 2014)
  • Maintenance Mode
  • Rolling Restarts
  • Bulk Host Operations
  • Service and Component Restarts
  • Decommission TTs, NMs, RSs
  • Add Service
  • Customize ZooKeeper Configs
  • Refresh Client Configs
  • Default JDK 7

Version 1.4.3 (January 2014)
  • Configuration Groups
  • Staged Configuration Changes
  • Ambari DB MySQL support

Version 1.4.2 (December 2013)
  • Reassign masters
  • HBase Multi-Master
  • Simplified local repository setup
  • Better host controls
  • Background Operations dialog off option

Version 1.4.1 (October 2013)
  • Hadoop-2 stack support
  • Enable NameNode HA
  • Add ZooKeeper servers

Version 1.2.5 (August 2013)
  • Manage Kerberos secure cluster
  • Security Enhancements
  • Customize Dashboard widgets
  • Improved service controls
  • Expanded host checks during install
  • Ambari server can run as non-root users as well

Version 1.2.4 (July 2013)
  • Sudoer user ability for Ambari agent SSH
  • Hive Metastore Oracle DB support
  • Oozie Oracle DB and MySQL DB support
  • Ambari DB Oracle support
  • Customizable user accounts
  • Stack selection page during install

Version 1.2.3 (May 2013)
  • Ability to add/remove custom configs (including core-site.xml)
  • New heatmap metrics
  • Mixed OS support
  • Add host components
  • Filter hosts by status
  • Customizable Ganglia user account
  • Customizable Hive Metastore Log and PID directories

Source Code


Source code for Apache Ambari is available on GitHub.[5]


References


