AWS S3 alternatives for private cloud

user1459144 · Sep 20, 2017 · Viewed 7.1k times

We have a requirement to migrate from AWS to a private data center, and we need to find an alternative to AWS S3 for storage. S3 is currently used as follows:

  • Overall storage size is 10 TB;
  • Min/Avg/Max object size is 0.5/2/100 MB;
  • We have N app instances that simultaneously write and read
    objects, at approximately 50 writes/sec and 30 reads/sec;
  • The storage should be redundant (highly available), fault tolerant, and scalable.
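
To put the request rates above in bandwidth terms, here is a rough back-of-the-envelope sketch. It assumes the object sizes are megabytes and that the average object size applies to both reads and writes; the function and constant names are only illustrative.

```python
# Workload figures taken from the list above (sizes assumed to be megabytes).
WRITES_PER_SEC = 50
READS_PER_SEC = 30
AVG_OBJECT_MB = 2
MAX_OBJECT_MB = 100


def bandwidth_mb_per_sec(ops_per_sec: float, object_mb: float) -> float:
    """Sustained bandwidth needed for a given request rate and object size."""
    return ops_per_sec * object_mb


avg_write = bandwidth_mb_per_sec(WRITES_PER_SEC, AVG_OBJECT_MB)
avg_read = bandwidth_mb_per_sec(READS_PER_SEC, AVG_OBJECT_MB)

print(f"average write bandwidth: {avg_write} MB/s")
print(f"average read bandwidth:  {avg_read} MB/s")
```

Whatever replaces S3 has to sustain roughly these rates across all app instances combined, with extra headroom for bursts of objects near the 100 MB maximum.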

A naive implementation could store this data on:

  • Simple NFS storage with some added replication functionality;
  • A NoSQL DB (for example, Cassandra). However, Cassandra would require a number of instances to support this storage (it's not recommended to store more than 1 TB on one Cassandra node; see Cassandra capacity planning).
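
The Cassandra concern can be made concrete with a small calculation. This is only a sketch: the 10 TB total and the 1 TB per node guideline come from the question, while the replication factor of 3 is an assumption chosen for illustration.

```python
import math

TOTAL_DATA_TB = 10       # from the question
MAX_TB_PER_NODE = 1      # guideline cited in the question
REPLICATION_FACTOR = 3   # assumption, not stated in the question


def min_nodes(total_tb: float, per_node_tb: float, replication_factor: int) -> int:
    """Smallest node count that keeps every node at or under per_node_tb."""
    return math.ceil(total_tb * replication_factor / per_node_tb)


unreplicated = min_nodes(TOTAL_DATA_TB, MAX_TB_PER_NODE, 1)
replicated = min_nodes(TOTAL_DATA_TB, MAX_TB_PER_NODE, REPLICATION_FACTOR)

print(f"nodes without replication: {unreplicated}")
print(f"nodes with replication factor {REPLICATION_FACTOR}: {replicated}")
```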

What solution would you recommend for such a scenario?

Answer

Root G · Jun 29, 2019

MinIO is your best bet if you want private cloud storage. It is AWS S3 compatible, meaning that applications that use AWS S3 can be migrated to MinIO seamlessly. They have a tutorial on how to connect the AWS CLI to a MinIO server, and you can test against the publicly hosted MinIO server at https://play.min.io:9000. Please refer to "AWS CLI with MinIO Server".
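
As an illustration of what S3 compatibility means for application code, here is a minimal Python sketch that points an S3 client at a MinIO endpoint. It uses boto3 rather than the AWS CLI mentioned above. The helper names and the MINIO_ACCESS_KEY / MINIO_SECRET_KEY environment variables are hypothetical, and the region value is an assumption that may need to match your server's configuration.

```python
import os


def minio_client_kwargs(endpoint_url: str) -> dict:
    """Build keyword arguments for boto3.client("s3", ...) aimed at a MinIO server."""
    return {
        # The only S3-specific change: send requests to MinIO instead of AWS.
        "endpoint_url": endpoint_url,
        # Hypothetical environment variable names; supply your own credentials.
        "aws_access_key_id": os.environ.get("MINIO_ACCESS_KEY", ""),
        "aws_secret_access_key": os.environ.get("MINIO_SECRET_KEY", ""),
        # Assumed region; adjust if the server is configured differently.
        "region_name": "us-east-1",
    }


def make_client(endpoint_url: str):
    """Create an S3 client for the given endpoint (requires the boto3 package)."""
    import boto3  # third-party dependency, imported lazily

    return boto3.client("s3", **minio_client_kwargs(endpoint_url))
```

With boto3 installed and credentials set, something like `make_client("https://play.min.io:9000").list_buckets()` should behave the same way it would against AWS S3.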

You can build a highly available storage system using MinIO's distributed setup. Beware that dynamic expansion is not a feature of the distributed setup: to expand your cluster, you end up spinning up a new cluster with the desired number of servers/disks and then migrating your data from the old one to the new one.
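
The migration step described above could be sketched roughly as follows. This is not MinIO's own tooling, just a hypothetical helper that assumes `src` and `dst` are boto3-style S3 clients for the old and new clusters and that the bucket already exists on the destination. Each object is read fully into memory, which is tolerable for the object sizes in the question but would need streaming for much larger objects.

```python
def copy_bucket(src, dst, bucket: str) -> int:
    """Copy every object in `bucket` from src to dst; return the number copied."""
    copied = 0
    # list_objects_v2 returns results in pages, so iterate with a paginator.
    paginator = src.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # An empty bucket yields a page without a "Contents" entry.
        for entry in page.get("Contents", []):
            key = entry["Key"]
            body = src.get_object(Bucket=bucket, Key=key)["Body"].read()
            dst.put_object(Bucket=bucket, Key=key, Body=body)
            copied += 1
    return copied
```

A real migration would also want retries, verification of the copied data, and a plan for writes that arrive while the copy is running.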

I find it much easier to use than HDFS. In addition, many technologies outside the Hadoop ecosystem lack HDFS integration. For example, Docker Registry lacks a built-in HDFS storage driver. However, it has an S3 driver, so you can use MinIO as its object storage.