Get current number of partitions of a DataFrame

kecso picture kecso · Feb 11, 2017 · Viewed 87.3k times · Source

Is there any way to get the current number of partitions of a DataFrame? I checked the DataFrame javadoc (spark 1.6) and didn't found a method for that, or am I just missed it? (In case of JavaRDD there's a getNumPartitions() method.)

Answer

user4601931 picture user4601931 · Feb 11, 2017

You need to call getNumPartitions() on the DataFrame's underlying RDD, e.g., df.rdd.getNumPartitions(). In the case of Scala, this is a parameterless method: df.rdd.getNumPartitions.