Practical Hadoop Ecosystem

This book is a practical guide to using the Apache Hadoop ecosystem projects, including MapReduce, HDFS, Apache Hive, Apache HBase, Apache Kafka, Apache Mahout, and Apache Solr. From setting up the environment to running sample applications, each chapter is a practical tutorial on using an Apache Hadoop ecosystem project. While several books on Apache Hadoop are available, most are based on the main projects, MapReduce and HDFS, and none discusses the other Apache Hadoop ecosystem projects and how they all work together as a cohesive big data development platform.

What you'll learn:
* How to set up an environment in Linux for Hadoop projects using the Cloudera Hadoop Distribution CDH 5
* How to run a MapReduce job
* How to store data with Apache Hive and Apache HBase
* How to index data in HDFS with Apache Solr
* How to develop a Kafka messaging system
* How to develop a Mahout User Recommender System
* How to stream logs to HDFS with Apache Flume
* How to transfer data from a MySQL database to Hive, HDFS, and HBase with Sqoop
* How to create a Hive table over Apache Solr

Who this book is for:
The primary audience is Apache Hadoop developers. Prerequisite knowledge of Linux and some knowledge of Hadoop is required.
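The MapReduce programming model covered in the book's first chapter can be sketched conceptually in plain Python. This is an illustration of the map/shuffle/reduce flow only, not Hadoop API code (real Hadoop jobs are typically written in Java and run on a cluster); the input lines are made-up sample data.

```python
from collections import defaultdict

# Conceptual sketch of the MapReduce word-count flow:
# the map phase emits (word, 1) pairs, the shuffle groups
# pairs by key, and the reduce phase sums counts per word.
# Hadoop distributes these phases across a cluster; this
# sketch runs them in a single process.

def map_phase(line):
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    return (key, sum(values))

lines = ["Hadoop stores data in HDFS",
         "MapReduce processes data in Hadoop"]
pairs = [p for line in lines for p in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # e.g. "hadoop" appears in both lines, so its count is 2
```

The same three-phase structure underlies every MapReduce job the book runs, whether under the MR1 framework or YARN.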
Table of Contents

Introduction

1. HDFS and MapReduce
   Hadoop Distributed FileSystem; MapReduce Frameworks; Setting the Environment; Hadoop Cluster Modes; Running a MapReduce Job with the MR1 Framework; Running MR1 in Standalone Mode; Running MR1 in Pseudo-Distributed Mode; Running MapReduce with the YARN Framework; Running YARN in Pseudo-Distributed Mode; Running Hadoop Streaming

Section II: Storing & Querying

2. Apache Hive
   Setting the Environment; Configuring Hadoop; Configuring Hive; Starting HDFS; Starting the Hive Server; Starting the Hive CLI; Creating a Database; Using a Database; Creating a Managed Table; Loading Data into a Table; Creating a Table Using LIKE; Adding Data with INSERT INTO TABLE; Adding Data with INSERT OVERWRITE; Creating a Table Using AS SELECT; Altering a Table; Truncating a Table; Dropping a Table; Creating an External Table

3. Apache HBase
   Setting the Environment; Configuring Hadoop; Configuring HBase; Configuring Hive; Starting HBase; Starting the HBase Shell; Creating an HBase Table; Adding Data to an HBase Table; Listing All Tables; Getting a Row of Data; Scanning a Table; Counting the Number of Rows in a Table; Altering a Table; Deleting a Row; Deleting a Column; Disabling and Enabling a Table; Truncating a Table; Dropping a Table; Finding If a Table Exists; Creating a Hive External Table

Section III: Bulk Transferring & Streaming

4. Apache Sqoop
   Installing the MySQL Database; Creating MySQL Database Tables; Setting the Environment; Configuring Hadoop; Starting HDFS; Configuring Hive; Configuring HBase; Importing into HDFS; Exporting from HDFS; Importing into Hive; Importing into HBase

5. Apache Flume
   Setting the Environment; Configuring Hadoop; Configuring HBase; Starting HDFS; Configuring Flume; Running a Flume Agent; Configuring Flume for an HBase Sink; Streaming a MySQL Log to an HBase Sink

Section IV: Serializing

6. Apache Avro
   Setting the Environment; Creating an Avro Schema; Creating a Hive Managed Table; Creating a Hive (Version Prior to 0.14) External Table Stored as Avro

7. Apache Parquet
   Setting the Environment; Creating an Oracle Database Table; Exporting an Oracle Database to a CSV File; Importing the CSV File in MongoDB; Exporting a MongoDB Document as a CSV File; Importing a CSV File to an Oracle Database

Section V: Messaging & Indexing

8. Apache Kafka
   Setting the Environment; Starting the Kafka Server; Creating a Topic; Starting a Kafka Producer; Starting a Kafka Consumer; Producing and Consuming Messages; Streaming Log Data to Apache Kafka with Apache Flume; Setting the Environment; Creating Kafka Topics; Configuring Flume; Running a Flume Agent; Consuming Log Data as Kafka Messages

9. Apache Solr
   Setting the Environment; Configuring the Solr Schema; Starting the Solr Server; Indexing a Document in Solr; Deleting a Document from Solr; Indexing a Document in Solr with a Java Client; Searching a Document in Solr; Creating a Hive Managed Table; Creating a Hive External Table; Loading Hive External Table Data; Searching Hive Table Data Indexed in Solr

Section VI: Machine Learning

10. Apache Mahout
   Setting the Environment; Starting HDFS; Setting the Mahout Environment; Running a Mahout Classification Sample; Running a Mahout Clustering Sample; Developing a User-Based Recommender System; The Sample Data; Setting the Environment; Creating a Maven Project in Eclipse; Creating a User-Based Recommender; Creating a Recommender Evaluator; Running the Recommender; Choosing a Recommender Type; Choosing a User Similarity Measure; Choosing a Neighborhood Type; Choosing a Neighborhood Size for NearestNUserNeighborhood; Choosing a Threshold for ThresholdUserNeighborhood; Running the Evaluator; Choosing the Split Between Training Percentage and Test Percentage
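The user-based recommender developed in the Mahout chapter follows a pattern that can be sketched conceptually in plain Python: score items a target user has not seen by the similarity-weighted ratings of like-minded users. This is an illustration of the idea only, not Mahout's Taste API, and the ratings are made-up sample data.

```python
from math import sqrt

# Conceptual sketch of a user-based recommender (the approach
# Mahout's user-based recommenders implement): compute a
# similarity between the target user and each other user, then
# weight the other users' ratings of unseen items by similarity.
# The ratings below are made-up sample data.
ratings = {
    "alice": {"item1": 5.0, "item2": 3.0, "item3": 4.0},
    "bob":   {"item1": 4.0, "item2": 3.0, "item4": 5.0},
    "carol": {"item1": 1.0, "item3": 2.0, "item4": 4.0},
}

def cosine_similarity(a, b):
    # Similarity computed over the items both users have rated.
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm = sqrt(sum(a[i] ** 2 for i in common)) * sqrt(sum(b[i] ** 2 for i in common))
    return dot / norm

def recommend(user):
    scores, weights = {}, {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[user]:  # only score unseen items
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    # Return (predicted rating, item) pairs, best first.
    return sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)

print(recommend("alice"))
```

Mahout adds the pieces the chapter walks through on top of this core idea: pluggable similarity measures, neighborhood types (nearest-N vs. threshold), and an evaluator that splits the data into training and test percentages.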

About the Author

Deepak Vohra is a developer, book author, and technical reviewer.
