Hadoop evolved as a distributed software platform for managing and transforming large quantities of data, and has grown to be one of the most popular tools to meet many of the above needs in a cost-effective manner. By abstracting away many of the high availability (HA) and distributed programming issues, Hadoop allows developers to focus on higher-level algorithms. Hadoop is designed to run on a large cluster of commodity servers and to scale to hundreds or thousands of nodes. Each disk, server, network link, and even rack within the cluster is assumed to be unreliable. This assumption allows the use of the least expensive cluster components consistent with delivering sufficient performance, including the use of unprotected local storage (JBODs). Hadoop's design and ability to handle large amounts of data efficiently make it a natural fit as an integration, data transformation, and analytics platform.

Hadoop use cases include:
Customizing content for users: Creating a better user experience through targeted and relevant ads, personalized home pages, and high-quality recommendations.