What is Apache Spark?
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
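The batch model mentioned above (similar to MapReduce) can be sketched in plain Python. This is only an illustration of the map and reduce-by-key idea on a single machine; it does not use Spark's API, and the names map_phase, reduce_by_key, word_count, and sample_lines are hypothetical helpers made up for this example.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def reduce_by_key(pairs, combine):
    """Group pairs by key, then fold each group's values with `combine`."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(combine, values) for key, values in groups.items()}


def word_count(lines):
    """Classic word count: a map step followed by a per-key reduction."""
    return reduce_by_key(map_phase(lines), lambda a, b: a + b)


# Hypothetical in-memory input standing in for files stored in HDFS.
sample_lines = ["spark runs on yarn", "spark reads hdfs", "yarn schedules spark"]
counts = word_count(sample_lines)
```

In a real cluster engine the map and reduce steps would run in parallel across partitions of the data; here both run sequentially in one process to keep the sketch self-contained.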
Apache Spark is a tool in the Big Data Tools category of a tech stack.
Why developers like Apache Spark
Open-source
Fast and Flexible
One platform for every big data problem
Great for distributed SQL-like applications
Easy to install and to use
Works well for most data science use cases
Interactive Query
Machine learning library, Streaming in real time
In-memory Computation