An Overview of Apache Flink
Description
Apache Flink is an open source, distributed framework for stateful stream and batch processing. It is available in commercial distributions and managed services from vendors such as Cloudera and Amazon. The examples provided in this course were developed using Cloudera's distribution of Apache Flink. This course is intended for those who want to learn Apache Flink.
Apache Flink is used to process large volumes of data with low latency, and its Table API and SQL support let you apply traditional SQL knowledge.
To make the most of this course, you should have a good understanding of the basics of Hadoop and HDFS commands. It is also recommended to have a basic knowledge of SQL before going through this course.
Apache Flink is a next-generation Big Data tool, sometimes described as the "4G of Big Data".
It is a true stream processing framework: it handles events one at a time instead of cutting the stream into micro-batches.
Flink's kernel (core) is a streaming runtime that also provides distributed processing and fault tolerance.
Flink processes events at consistently high throughput with low latency.
It is a large-scale data processing framework that can handle data generated at very high velocity.
Flink is an alternative to MapReduce and is often claimed to process data up to 100 times faster than MapReduce, although the actual speedup depends on the workload. It is independent of Hadoop, but it can use HDFS to read and write data. Flink does not provide its own data storage system; it takes data from distributed storage and other external sources.
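To make the point about micro-batches concrete, here is a minimal sketch in plain Python. It does not use the Flink API. The function names, the event format (arrival time in milliseconds plus a payload), and the 1000 ms batch interval are all assumptions chosen for illustration. It compares when a result becomes available under event-at-a-time processing and under micro-batching.

```python
from typing import List, Tuple

# Hypothetical event shape: (arrival time in milliseconds, payload).
Event = Tuple[int, str]


def per_event_emit_times(events: List[Event]) -> List[Tuple[str, int]]:
    """Event-at-a-time model: each event is handled the moment it arrives."""
    return [(payload, arrival) for arrival, payload in events]


def micro_batch_emit_times(events: List[Event], interval_ms: int) -> List[Tuple[str, int]]:
    """Micro-batch model: events are buffered and handled together when the
    batch interval that contains them ends. An event arriving exactly on a
    boundary is counted in the next batch."""
    return [
        (payload, (arrival // interval_ms + 1) * interval_ms)
        for arrival, payload in events
    ]


if __name__ == "__main__":
    events = [(100, "a"), (400, "b"), (1200, "c")]
    streamed = per_event_emit_times(events)
    batched = micro_batch_emit_times(events, interval_ms=1000)
    for (arrival, payload), (_, t_stream), (_, t_batch) in zip(events, streamed, batched):
        print(
            f"{payload}: arrived {arrival} ms, "
            f"streaming delay {t_stream - arrival} ms, "
            f"micro-batch delay {t_batch - arrival} ms"
        )
```

In this toy model the micro-batch delay depends on where an event falls inside its interval, while the event-at-a-time delay is zero. A real system adds processing and network time on top of both.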
Who this course is for:
- Students, Programmers, Learners