Hangzhou Boxue International Education and Training Center

Cloudera Apache Spark Programmer (Hangzhou)

2017-12-08 13:23  57 views

  • Course price: please inquire by phone
  • Start date: rolling enrollment
  • Location: please ask customer service

Course Details
Cloudera Apache Spark Programmer

  • Class format: public course, in-house training
  • Course length: 3 days / 18 hours
  • Training dates: to be determined
  • Certification exam: none at present
  • Training location: Boxue International Education and Training Center
  • Room requirements: projector, whiteboard, flip-chart paper
  • Teaching format: instruction through worked examples, live demonstrations and exercises, with real-time discussion
  • Course materials: training textbook

Course: Cloudera Developer Training for Apache Spark

Course overview: Combine batch processing, stream processing, and interactive analytics to build complete, unified big data applications with Apache Spark. Learn to write sophisticated parallel applications that support faster, better decisions and real-time action across a wide range of use cases, architectures, and industries.

Audience: Developers and engineers who want to improve the speed, ease of use, and sophistication of their applications. Participants should have a background in Python or Scala; basic Linux knowledge is a plus.

Training objectives:

  • Using the Spark shell for interactive data analysis
  • The features of Spark's Resilient Distributed Datasets
  • How Spark runs on a cluster
  • How Spark parallelizes task execution
  • Writing Spark applications
  • Processing streaming data with Spark

Course content:

Introduction to Spark
  • What is Spark?
  • Review: From Hadoop MapReduce to Spark
  • Review: HDFS
  • Review: YARN
  • Spark Overview

Spark Basics
  • Using the Spark Shell
  • RDDs (Resilient Distributed Datasets)
  • Functional Programming in Spark

Working with RDDs in Spark
  • Creating RDDs
  • Other General RDD Operations

Aggregating Data with Pair RDDs
  • Key-Value Pair RDDs
  • Map-Reduce
  • Other Pair RDD Operations

Writing and Deploying Spark Applications
  • Spark Applications vs. Spark Shell
  • Creating the SparkContext
  • Building a Spark Application (Scala and Java)
  • Running a Spark Application
  • The Spark Application Web UI
  • Hands-On Exercise: Write and Run a Spark Application
  • Configuring Spark Properties
  • Logging

Parallel Processing
  • Review: Spark on a Cluster
  • RDD Partitions
  • Partitioning of File-based RDDs
  • HDFS and Data Locality
  • Executing Parallel Operations
  • Stages and Tasks

Spark RDD Persistence
  • RDD Lineage
  • RDD Persistence Overview
  • Distributed Persistence

Basic Spark Streaming
  • Spark Streaming Overview
  • Example: Streaming Request Count
  • DStreams
  • Developing Spark Streaming Applications

Advanced Spark Streaming
  • Multi-Batch Operations
  • State Operations
  • Sliding Window Operations
  • Advanced Data Sources

Common Patterns in Spark Data Processing
  • Common Spark Use Cases
  • Iterative Algorithms in Spark
  • Graph Processing and Analysis
  • Machine Learning
  • Example: k-means

Improving Spark Performance
  • Shared Variables: Broadcast Variables
  • Shared Variables: Accumulators
  • Common Performance Issues
  • Diagnosing Performance Problems

Spark SQL and DataFrames
  • Spark SQL and the SQL Context
  • Creating DataFrames
  • Transforming and Querying DataFrames
  • Saving DataFrames
  • DataFrames and RDDs
  • Comparing Spark SQL, Impala and Hive-on-Spark