Wednesday, December 14, 2011

HIVE: Data Warehousing & Analytics on Hadoop and Pig Latin: A Not-So-Foreign Language for Data Processing

Oftentimes, a programmer has to run several MapReduce jobs before getting the final output they want. Keeping track of how these jobs relate to one another and making sure the correct data is piped from one to the next can be difficult. HIVE and Pig both concentrate on providing a simple language layer on top of MapReduce to make that job easier for programmers.
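To make the chaining problem concrete, here is a minimal sketch in plain Python. It is not Hadoop's API: `run_mapreduce` is a hypothetical in-memory stand-in for a single job, and the mapper and reducer names are made up for illustration. The point is that the second job only works if it is fed exactly what the first job emits.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Toy in-memory stand-in for one MapReduce job (hypothetical, not Hadoop)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # "shuffle": collect values by key
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: count how often each word appears.
def count_mapper(line):
    for word in line.split():
        yield word, 1


def count_reducer(word, ones):
    yield word, sum(ones)


# Job 2: group words by their count. Its input must be job 1's output.
def invert_mapper(pair):
    word, count = pair
    yield count, word


def invert_reducer(count, words):
    yield count, sorted(words)


lines = ["pig hive pig", "hive hadoop pig"]
word_counts = run_mapreduce(lines, count_mapper, count_reducer)
words_by_count = run_mapreduce(word_counts, invert_mapper, invert_reducer)
```

Even in this toy version, the programmer has to wire the output of one job into the next by hand and keep the record shapes in sync, which is the bookkeeping a higher-level language layer aims to take over.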

HIVE looks a lot like SQL. It's meant to serve as a layer for those who are familiar with databases. Queries are then compiled down to MapReduce jobs that work on data in HDFS. Pig is a procedural language with support for user-defined functions (UDFs).
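The procedural, step-by-step style can be sketched in plain Python. This is not Pig Latin syntax, and the data and the `normalize` function are hypothetical; it only illustrates the idea of naming each intermediate result and plugging in a user-defined function along the way. A SQL-like layer would instead express the same result as a single declarative query, roughly a `GROUP BY` with a `COUNT`.

```python
from itertools import groupby
from operator import itemgetter


def normalize(url):
    """Hypothetical user-defined function: canonicalize a URL string."""
    return url.lower().rstrip("/")


# Hypothetical input records: (user, url)
visits = [
    ("alice", "Example.com/"),
    ("bob", "example.com"),
    ("alice", "Hadoop.org"),
]

# Step 1: apply the UDF to every record.
cleaned = [(user, normalize(url)) for user, url in visits]

# Step 2: group the records by URL (groupby needs sorted input).
by_url = sorted(cleaned, key=itemgetter(1))
grouped = {
    url: [user for user, _ in rows]
    for url, rows in groupby(by_url, key=itemgetter(1))
}

# Step 3: aggregate each group into a visit count.
visit_counts = {url: len(users) for url, users in grouped.items()}
```

Each step produces a named intermediate result that can be inspected on its own, which is part of what makes this style comfortable to debug.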

It seems both layers are missing some operations due to complexity. However, they both seem to do a good job of supporting the majority of operations. I do like the web interface that HIVE has for MapReduce jobs, and the debugging support in Pig seems promising. Overall, these layers on top of MapReduce are important because they lower the barrier for anyone who wishes to use the cloud for any sort of computing.
