LinkedIn’s DataFu library, a collection of useful UDFs for Pig, is a standard part of Cloudera’s Hadoop distribution (CDH) as of version 4.1:
“Support for DataFu – the LinkedIn data science team was kind enough to open source their library of Pig UDFs that make it easier to perform common jobs like sessionization or set operations. Big thanks to the LinkedIn team!!!”
Read the full announcement here. Pig users around the world rejoice!