
gallowaymd at ornl
Nov 14, 2011, 6:42 AM
Avoiding storage redundancy with Openstack redundant storage and HDFS 3xreplication
Not sure this is a real answer, but you can set the HDFS replication factor to whatever you want: 1, 2, 3, etc. I'm not sure an HDFS replication factor of 1 makes sense for your application, but it is configurable.

--- michael

-----Original Message-----
From: openstack-operators-bounces [at] lists [mailto:openstack-operators-bounces [at] lists] On Behalf Of Edmon Begoli
Sent: Friday, November 11, 2011 10:30 PM
To: openstack-operators at lists.openstack.org
Subject: [Openstack-operators] Avoiding storage redundancy with Openstack redundant storage and HDFS 3xreplication

A question related to standing up cloud infrastructure for running Hadoop/HDFS.

We are building an infrastructure using Openstack, which has its own storage management redundancy. We are planning to use Openstack to instantiate Hadoop nodes (HDFS, M/R tasks, Hive, HBase) on demand.

The problem is that HDFS by design creates three copies of the data, so there is a 4x redundancy, which we would prefer to avoid.

I am asking here whether anyone has had a similar case and has a helpful solution to recommend.

Thank you in advance,
Edmon
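For reference, the replication factor mentioned in the reply is controlled by the `dfs.replication` property in `hdfs-site.xml`. The sketch below is only an illustration, not a recommendation: the helper names `hdfs_site_fragment` and `physical_copies` are made up here, and the copy count assumes that every HDFS replica lands on backing storage that keeps its own independent copies, which may not match a given Openstack deployment.

```python
import xml.etree.ElementTree as ET


def hdfs_site_fragment(replication: int) -> str:
    """Build a minimal hdfs-site.xml body that sets dfs.replication.

    Hypothetical helper for illustration; a real hdfs-site.xml usually
    carries many other properties as well.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "dfs.replication"
    ET.SubElement(prop, "value").text = str(replication)
    return ET.tostring(configuration, encoding="unicode")


def physical_copies(hdfs_replication: int, backend_copies: int) -> int:
    """Estimate the physical copies of each block.

    Assumption: each HDFS replica is stored backend_copies times by the
    underlying storage layer, so the two factors multiply.
    """
    return hdfs_replication * backend_copies


if __name__ == "__main__":
    # Lowering HDFS replication to 1 leaves redundancy to the backend only.
    print(hdfs_site_fragment(1))
    print(physical_copies(hdfs_replication=1, backend_copies=3))
```

Whether a factor of 1 is acceptable depends on the application, as the reply notes: with a single HDFS replica, losing a datanode makes its blocks unavailable to HDFS even if the backing storage still holds the bytes.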