Learning Apache Spark 2.0 by Asif Abbasi
Key Features
- Exclusive guide that covers how to get up and running with fast data processing using Apache Spark
- Explore and exploit various possibilities with Apache Spark using real-world use cases in this book
- Want to perform efficient data processing in real time? This book will be your one-stop solution.
Book Description
The Spark juggernaut keeps on rolling and gains more momentum each day. The core challenge is understanding the key capabilities of Spark (Spark SQL, Spark Streaming, Spark ML, SparkR, GraphX, and so on). Once these capabilities are understood, it is important to understand how Spark can be used: installed as a standalone framework or as part of an existing Hadoop installation, and configured with YARN and Mesos.
The next part of the journey after installation is using the key components: APIs, clustering, machine learning APIs, data pipelines, and parallel programming. It is important to understand why each framework component is key, how widely it is used, how stable it is, and which use cases it suits.
Once we understand the individual components, we will work through a couple of real-life advanced analytics examples, such as:
- Building a Recommendation system
- Predicting customer churn
The objective of these real-life examples is to give the reader confidence in using Spark for real-world problems.
What you will learn
- Get an overview of big data analytics and its importance for organizations and data professionals
- Delve into Spark to see how it is different from existing processing platforms
- Understand the intricacies of various file formats, and how to process them with Apache Spark
- Learn how to deploy Spark with YARN, Mesos, or a standalone cluster manager
- Learn the concepts of Spark SQL, SchemaRDD, caching, and Spark UDFs, and work with Hive and Parquet file formats
- Understand the architecture of Spark MLlib while discussing some of the off-the-shelf algorithms that come with Spark
- Introduce yourself to SparkR and walk through the details of data munging, including selecting, aggregating, and grouping data using RStudio
- Walk through the importance of graph computation and the graph processing systems available in the market
- Work through a real-world example of Spark by building a recommendation engine using collaborative filtering
- Use a telco data set to predict customer churn using regression
About the Author
Asif Abbasi has worked in the industry for over 15 years in a variety of roles, from engineering solutions to selling solutions and everything in between. Asif currently works at SAS, a market leader in analytic solutions, as a Principal Business Solutions Manager for the Global Technologies Practice.
Based in London, Asif has vast experience in consulting for major organizations across the globe and in running proof-of-concepts across various industries, including but not limited to Telecommunications, Manufacturing, Retail, Finance, Services, Utilities, and Government.
Asif has presented at various conferences and delivered workshops on topics such as Big Data, Hadoop, Teradata, and Analytics using Aster on Teradata and Hadoop. Asif is an Oracle Certified Java EE 5 Enterprise Architect, Teradata Certified Master, PMP, and Hortonworks Hadoop Certified Developer and Administrator. Asif also holds a master's degree in Computer Science and Business Administration.
From reader reviews:
Steven Maravilla:
Are you one of those people who cannot enjoy reading when the sentences run on in a plain, straight style? Hold on, this book is not like that. Learning Apache Spark 2.0 is readable even for those who dislike that style. The details are arranged for an enjoyable reading experience without reducing the knowledge the book wants to give you. The writer of Learning Apache Spark 2.0 conveys ideas in a way that many people can easily understand. The printed book and the e-book do not differ in content, only in form. So, do you still think Learning Apache Spark 2.0 does not deserve a place at the top of your reading list?
Mikel Davis:
The book titled Learning Apache Spark 2.0 is one that is recommended for you to learn from. You can see the quality of its content for yourself. The language the author uses to explain ideas is easy to understand. The author did a lot of research when writing the book, so the information shared with you is accurate. You can also get the e-book of Learning Apache Spark 2.0 from the publisher to enjoy in your free time.
Donna Wright:
In this era, people who are more capable or who can do more are valued more than others. Do you want to be one of them? There is a simple way to get there: spend a little time, but enough, looking at some books. One of the books at the top of your reading list should be Learning Apache Spark 2.0. This book can bring you closer to becoming a more valuable person. By reading and reviewing this e-book you can gain many advantages.