Until 2018, I published comparisons of the component versions supported in the various Hadoop distributions. Those older posts showed that the ebb and flow of new additions to the typical stack supported by “most” distributors slowed to a halt after (almost) everyone added Kafka. Relative version currency was fairly stable, with Hortonworks typically first to market with many of the newest Apache versions and Cloudera often close behind, leading on some projects where it was more dominant on the project committee.
But as mergers, acquisitions, and the rise of the cloud-platform-as-Hadoop-provider dynamic played out over the past 2 years, I got away from the regular tracking cycle. This seems like a good time to revisit the question “Who supports what?”, since so many people seem to think “Hadoop” is HDFS and therefore “going away, because cloud object stores.” The players I include in this visit are AWS, Cloudera, Google and HPE. All of them support 9 pieces: Apache HDFS, MapReduce, YARN, Hive, Pig, Spark, Sqoop, Tez and ZooKeeper.
All but Google also support Apache HBase, Mahout and Oozie, as well as Hue. There is a nuance here: Google Cloud Dataproc allows you to perform initialization actions to add components (indicated on the table with “IA”) and offers scripts for dozens of installable components, but cautions that “the initialization actions provided in this repository are provided without support and you use them at your own risk.” The same applies to the various projects themselves. Similar mechanisms exist on AWS and, of course, on Microsoft Azure, whose HDInsight begins with the Cloudera distribution, adds its own pieces and also permits you to add yours. Zeppelin is listed as “soon” for Cloudera, which would move it into the “3 supporters” category as well, but for now it appears below.
Apache Flume, Impala, Kafka, Phoenix, Presto, Sentry, Storm and Zeppelin are supported by two vendors apiece. Note that some of these are used infrequently and some will be deprecated soon, but continue to get support because many users have them in their stacks.
There are many additional pieces supported by only one of these vendors: Apache Accumulo, Ambari, Atlas, Avro, Crunch, Drill, Druid, Flink, Knox, Kudu, Livy, Lucene, Myriad, NiFi, Ozone, Parquet, Ranger, Solr, TensorFlow and others. Many more are not directly supported by any of them; we’ll save those for the next post. Meanwhile, this is a blog post, not the result of a lengthy review process, and it is likely to have a few things that need updating or correcting. Please let me know what you spot.
Sourced from: Gartner Blog.
View the original article here.