I want to change the case of a whole column to lowercase in a Spark Dataset.
Input
+------+--------------------+
|ItemID|       Category name|
+------+--------------------+
|   ABC|BRUSH & BROOM HAN...|
|   XYZ|WHEEL BRUSH PARTS...|
+------+--------------------+
Desired Output
+------+--------------------+
|ItemID|       Category name|
+------+--------------------+
|   ABC|brush & broom han...|
|   XYZ|wheel brush parts...|
+------+--------------------+
I tried collectAsList() and toString(), which is a slow and complex procedure for a very large dataset.
I also found a method, lower, but couldn't figure out how to make it work on a Dataset. Please suggest a simple or efficient way to do the above. Thanks in advance.
I got it: use functions#lower (see the Javadoc for org.apache.spark.sql.functions).
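import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lower;

String columnName = "Category name";
// Reusing an existing column name in withColumn replaces that column.
src = src.withColumn(columnName, lower(col(columnName)));
src.show();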
This replaces the old column with the lowercased one while retaining the rest of the Dataset.
+------+--------------------+
|ItemID|       Category name|
+------+--------------------+
|   ABC|brush & broom han...|
|   XYZ|wheel brush parts...|
+------+--------------------+
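For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same approach. It assumes Spark SQL is on the classpath and runs in local mode. The class name LowercaseColumnExample, the helper lowercaseColumn, and the sample category values are made up for illustration (the real values in the question are truncated, so they are not reproduced here).

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lower;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class LowercaseColumnExample {

    // Hypothetical helper: returns a new Dataset in which the named
    // column has been replaced by its lowercased version.
    static Dataset<Row> lowercaseColumn(Dataset<Row> ds, String columnName) {
        return ds.withColumn(columnName, lower(col(columnName)));
    }

    public static void main(String[] args) {
        // Local session for experimentation only (assumption: no cluster).
        SparkSession spark = SparkSession.builder()
                .appName("LowercaseColumnExample")
                .master("local[*]")
                .getOrCreate();

        // Same column names as in the question; the values are invented.
        StructType schema = new StructType()
                .add("ItemID", DataTypes.StringType)
                .add("Category name", DataTypes.StringType);

        List<Row> rows = Arrays.asList(
                RowFactory.create("ABC", "SAMPLE CATEGORY ONE"),
                RowFactory.create("XYZ", "Sample Category TWO"));

        Dataset<Row> src = spark.createDataFrame(rows, schema);

        // Only "Category name" changes; "ItemID" is left untouched.
        lowercaseColumn(src, "Category name").show();

        spark.stop();
    }
}
```

A Dataset is immutable, so withColumn returns a new Dataset rather than modifying src in place. That is why the snippet above reassigns the result to src. The work stays distributed across the executors, unlike collectAsList(), which pulls every row to the driver.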