Hadoop is an open-source framework designed to work with large datasets

Hadoop is an open-source framework designed to work with large datasets in a distributed computing environment. It is an Apache project, released under the Apache License. Hadoop provides fast data processing across the nodes of a cluster, along with high availability and fault tolerance. It is a powerful platform for processing data and coordinating its movement across the various architectural components.

Google originally built GFS (Google File System), which works mainly on part files: small, chunk-sized files, or blocks. Hadoop was designed after GFS.

In April 2008, Hadoop established itself as a powerful system by sorting a terabyte of data on a 910-node cluster in less than 4 minutes. In April 2009, 500 GB of data was sorted in 59 seconds on 1,406 Hadoop nodes, and 1 TB was sorted in 62 seconds on the same cluster. The hardware used for the sort was 2 quad-core Xeons at 2.0 GHz per node, 4 SATA disks per node, 8 GB of RAM per node, 1 Gb Ethernet on each node, and 40 nodes per rack, running Red Hat Enterprise Linux Server release 5.1 with Sun JDK 1.6.0. (beinghadoop)
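As a minimal sketch of how those blocks look from client code, the Java snippet below uses Hadoop's FileSystem API to print a file's default block size and block locations. The class name BlockReport and the HDFS path /data/example.txt are placeholders, and the cluster configuration is assumed to be available on the classpath.

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockReport {
    public static void main(String[] args) throws IOException {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Placeholder path; pass a real HDFS file as the first argument instead.
        Path file = new Path(args.length > 0 ? args[0] : "/data/example.txt");
        FileStatus status = fs.getFileStatus(file);

        System.out.println("Default block size: " + fs.getDefaultBlockSize(file) + " bytes");

        // Each BlockLocation is one block of the file plus the DataNodes holding its replicas.
        for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("offset=" + block.getOffset()
                    + " length=" + block.getLength()
                    + " hosts=" + Arrays.toString(block.getHosts()));
        }
    }
}

Packaged into a jar, this can be launched with hadoop jar so that the cluster configuration and Hadoop libraries are resolved automatically, e.g. hadoop jar blockreport.jar BlockReport /data/example.txt.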
Posted on: Sun, 04 Aug 2013 16:04:08 +0000
