I'm using Cassandra 2.0.9 to store fairly large amounts of data, say 100 GB, in one column family. I would like to export this data to CSV quickly. I tried:
I use an Amazon EC2 instance with fast storage, 15 GB of RAM, and 4 cores.
Is there a better option for exporting gigabytes of data from Cassandra to CSV?
Update for 2020: DataStax provides a dedicated tool called DSBulk for loading and unloading data from Cassandra (starting with Cassandra 2.1) and DSE (starting with DSE 4.7/4.8). In the simplest case, the command line looks like this:
dsbulk unload -k keyspace -t table -url path_to_unload
DSBulk is heavily optimized for loading and unloading operations and has many options, including importing from and exporting to compressed files, providing custom queries, etc.
There is a series of blog posts about DSBulk that provide more information and examples: 1, 2, 3, 4, 5, 6
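To make the custom-query and compression options a bit more concrete, here is a minimal sketch. The keyspace, table, column names, and output directory are hypothetical placeholders, and it assumes a DSBulk version recent enough to support `--connector.csv.compression`; check `dsbulk help connector.csv` on your installation. By default the function only prints the command so it can be reviewed first:

```shell
# Hypothetical placeholders: replace with your own keyspace, table, and path.
KEYSPACE="my_keyspace"
TABLE="my_table"
OUT_DIR="/tmp/unload"

# Custom query: export only the columns you need instead of the whole table.
QUERY="SELECT id, value FROM ${KEYSPACE}.${TABLE}"

unload() {
  # DRY_RUN=1 (the default here) prints the command instead of running it.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo dsbulk unload -query "\"$QUERY\"" -url "$OUT_DIR" --connector.csv.compression gzip
  else
    # -query replaces -k/-t; the compression option writes gzip-compressed CSV files.
    dsbulk unload -query "$QUERY" -url "$OUT_DIR" --connector.csv.compression gzip
  fi
}

unload
```

Running it with `DRY_RUN=0` executes the real unload, provided `dsbulk` is on your `PATH` and can reach the cluster.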