I am trying to find out the size/shape of a DataFrame in PySpark. I do not see a single function that can do this.
In pandas I can do
data.shape
Is there a similar function in PySpark? This is my current solution, but I am looking for a more elegant one:
row_number = data.count()
column_number = len(data.dtypes)
The computation of the number of columns is not ideal...
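For context, here is a minimal self-contained version of that approach (the session setup and the sample data are placeholders, not my real pipeline):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("shape-example").getOrCreate()

# Toy DataFrame standing in for `data`
data = spark.createDataFrame(
    [(1, "a", 3.0), (2, "b", 4.0)],
    ["id", "label", "value"],
)

# count() is an action, so it triggers a job over the data
row_number = data.count()

# dtypes is a list of (name, type) pairs, so its length is the column count
column_number = len(data.dtypes)

print((row_number, column_number))  # (2, 3)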
You can get its shape with:
print((df.count(), len(df.columns)))
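If you want this packaged as a reusable helper that mirrors pandas' shape, a minimal sketch could look like this (the name spark_shape is just for illustration):

from pyspark.sql import DataFrame

def spark_shape(df: DataFrame) -> tuple:
    # Return (row_count, column_count), mirroring pandas' .shape
    # count() is an action: it scans the data, so it can be expensive on large DataFrames
    return (df.count(), len(df.columns))

# Usage:
# print(spark_shape(df))  # e.g. (2, 3)

Keep in mind that df.count() is an action, so getting the row count triggers a full pass over the data, unlike pandas where .shape is essentially free.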