How to make shark/spark clear the cache?

venkat · Dec 11, 2013 · Viewed 45.4k times

When I run my Shark queries, memory accumulates in main memory. This is my top command output:


Mem:  74237344k total, 70080492k used, 4156852k free, 399544k buffers
Swap: 4194288k total, 480k used, 4193808k free, 65965904k cached


This doesn't change even if I kill/stop the Shark, Spark, and Hadoop processes. Right now, the only way to clear the cache is to reboot the machine.

Has anyone faced this issue before? Is it a configuration problem, or a known issue in Spark/Shark?

Answer

Henrique Florencio · May 19, 2017

To remove all cached data:

sqlContext.clearCache()

Source: https://spark.apache.org/docs/2.0.1/api/java/org/apache/spark/sql/SQLContext.html

If you want to remove a specific DataFrame from the cache:

df.unpersist()