I'm getting frustrated at not finding a good explanation of how to list all files in an S3 bucket.
I have a bucket with about 20 images in it. All I want to do is list them. Someone says "just use the S3.list method", but without a special library there is no S3.list method. I do have an S3.get method, which I can't get to work. Arggh, I would appreciate it if someone told me how to simply get a list of all files (filenames) from an S3 bucket.
    val S3files = S3.get(bucketName: String, path: Option[String], prefix: Option[String], delimiter: Option[String])
returns a Future[Response]
I don't know how to use this S3.get. What would be the easiest way to list all files in my S3 bucket?
Answers much appreciated!
With Scala, you might now want to use Amazon's official SDK for Java, which provides the AmazonS3::listObjects method:
    import scala.annotation.tailrec
    import scala.collection.JavaConverters._
    import com.amazonaws.services.s3.model.ObjectListing

    // Returns every key in the bucket, following pagination.
    def keys(bucket: String): List[String] = nextBatch(s3Client.listObjects(bucket))

    // Collects the keys of the current page, then recurses while more pages remain.
    @tailrec
    private def nextBatch(listing: ObjectListing, keys: List[String] = Nil): List[String] = {
      val pageKeys = listing.getObjectSummaries.asScala.map(_.getKey).toList
      if (listing.isTruncated)
        nextBatch(s3Client.listNextBatchOfObjects(listing), pageKeys ::: keys)
      else
        pageKeys ::: keys
    }
Note the recursion on ObjectListing objects.
Keys in a bucket are listed in batches (using a pagination system described in the S3 documentation), so s3Client.listObjects(bucket).getObjectSummaries.asScala.map(_.getKey) on its own returns at most the first 1000 keys.
Hence the recursive call: to get all keys in a bucket, it keeps asking for the next page of keys for as long as ObjectListing::isTruncated is true.
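To see the truncation behaviour without touching AWS, here is a toy model in plain Scala. Page and FakeBucket are hypothetical stand-ins invented for this sketch (they are not SDK types), and the page size of 10 is arbitrary. Unlike nextBatch above, it appends each page, so the keys stay in listing order.

```scala
import scala.annotation.tailrec

// Hypothetical stand-in for one page of a listing (not an SDK type).
final case class Page(keys: List[String], nextStart: Option[Int]) {
  def isTruncated: Boolean = nextStart.isDefined
}

// Hypothetical stand-in for the service: serves at most `pageSize` keys per call.
final class FakeBucket(allKeys: Vector[String], pageSize: Int) {
  def list(start: Int = 0): Page = {
    val slice = allKeys.slice(start, start + pageSize).toList
    val end = start + slice.size
    Page(slice, if (end < allKeys.size) Some(end) else None)
  }
}

// Same idea as nextBatch: keep asking for the next page while the current one is truncated.
@tailrec
def collectKeys(bucket: FakeBucket, page: Page, acc: List[String] = Nil): List[String] = {
  val soFar = acc ::: page.keys // append, so keys stay in listing order
  page.nextStart match {
    case Some(start) => collectKeys(bucket, bucket.list(start), soFar)
    case None        => soFar
  }
}

val bucket = new FakeBucket((1 to 25).map(i => s"img-$i.png").toVector, pageSize = 10)
val firstPageOnly = bucket.list().keys            // what a single call gives you
val everything = collectKeys(bucket, bucket.list()) // what following the pages gives you
```

A single list() call only ever sees the first page; collectKeys walks all of them.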
Beware of memory issues if your bucket is huge though, since every key is accumulated into a single List.
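If memory is a concern, one option is to expose the keys as a lazy Iterator, so a page is only fetched when the previous one has been consumed. The following is a minimal sketch in plain Scala; pagedIterator, Page and fetch are names made up for illustration, and the fake page source stands in for the real service.

```scala
// Generic lazy pager (illustrative helper, not part of any SDK):
//   first - the first page, already fetched
//   next  - returns the following page, or None when there are no more
//   items - extracts the elements of a page
def pagedIterator[P, A](first: P)(next: P => Option[P])(items: P => Iterable[A]): Iterator[A] = {
  val pages: Iterator[P] =
    Iterator.iterate(Option(first))(_.flatMap(next)).takeWhile(_.isDefined).map(_.get)
  pages.flatMap(p => items(p).iterator)
}

// Hypothetical page type and page source, only here to make the sketch runnable.
final case class Page(keys: List[String], nextStart: Option[Int])

val allKeys = (1 to 25).map(i => s"img-$i.png").toVector
var fetches = 0 // counts how many pages were actually requested

def fetch(start: Int): Page = {
  fetches += 1
  val slice = allKeys.slice(start, start + 10).toList
  val end = start + slice.size
  Page(slice, if (end < allKeys.size) Some(end) else None)
}

val keys = pagedIterator(fetch(0))(_.nextStart.map(fetch))(_.keys)
val firstThree = keys.take(3).toList // only the first page has been fetched at this point
```

With the SDK, the same helper could presumably be wired up with s3Client.listObjects(bucket) as first, p => if (p.isTruncated) Some(s3Client.listNextBatchOfObjects(p)) else None as next, and _.getObjectSummaries.asScala.map(_.getKey) as items.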
s3Client can be built as follows:
    import com.amazonaws.services.s3.{AmazonS3, AmazonS3ClientBuilder}
    import com.amazonaws.auth.{AWSStaticCredentialsProvider, BasicAWSCredentials}

    // BasicAWSCredentials takes (access key ID, secret access key), in that order.
    val credentials = new BasicAWSCredentials(awsKey, awsAccessKey)
    val s3Client: AmazonS3 = AmazonS3ClientBuilder
      .standard()
      .withCredentials(new AWSStaticCredentialsProvider(credentials))
      .build()
with these dependencies in build.sbt (use the latest version):
    libraryDependencies ++= Seq(
      "com.amazonaws" % "aws-java-sdk-bom" % "1.11.391",
      "com.amazonaws" % "aws-java-sdk-s3" % "1.11.391"
    )