When I run my Shark queries, memory gets hoarded in main memory. This is my top command result:
Mem:  74237344k total, 70080492k used,  4156852k free,   399544k buffers
Swap:  4194288k total,      480k used,  4193808k free, 65965904k cached
This doesn't change even if I kill/stop the Shark, Spark, and Hadoop processes. Right now, the only way to clear the cache is to reboot the machine.
Has anyone faced this issue before? Is it a configuration problem or a known issue in Spark/Shark?
To remove all cached data:
sqlContext.clearCache()
Source: https://spark.apache.org/docs/2.0.1/api/java/org/apache/spark/sql/SQLContext.html
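A minimal sketch of clearing everything, assuming a Spark 2.x session (the SparkSession builder settings and the "numbers" temp view name are only illustrative, not from the original post):

import org.apache.spark.sql.SparkSession

// Build (or reuse) a session; in spark-shell this is already available as `spark`.
val spark = SparkSession.builder().appName("cache-demo").master("local[*]").getOrCreate()
val sqlContext = spark.sqlContext

// Cache a table, run a query against it, then drop everything from the in-memory cache.
val df = spark.range(1000000L).toDF("id")
df.createOrReplaceTempView("numbers")
spark.catalog.cacheTable("numbers")
spark.sql("SELECT COUNT(*) FROM numbers").show()

sqlContext.clearCache()   // releases all cached tables/DataFrames in this session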
If you want to remove a specific DataFrame from the cache:
df.unpersist()
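For example, continuing the same hedged sketch, you can cache a single DataFrame and later release only its blocks:

val small = spark.range(100L).toDF("n").cache()
small.count()        // materializes the cached data
small.unpersist()    // frees only this DataFrame's storage, leaving other cached data intact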