I'm creating some Datasets with the Java Spark API. They are populated from Hive tables using the spark.sql() method.
After performing some SQL operations (such as joins), I have a final Dataset. I want to add a new column to that Dataset, with a value of 1 in every row. You could think of it as adding a constant to the Dataset.
For example, I have this Dataset:
Dataset<Row> final = otherDataset.select(otherDataset.col("colA"), otherDataset.col("colB"));
I want to add a new column to the "final" Dataset, something like this:
final.addNewColumn("colName", 1); //I know this doesn't work, but just to give you an idea.
Is there a feasible way to add the new column to all the rows of the Dataset with a value of 1?
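If you want to add a constant value, you can use the lit function from org.apache.spark.sql.functions:
lit(Object literal)
Creates a Column of literal value.
Also, change the variable name final to something else: final is a reserved keyword in Java, so it cannot be used as an identifier.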
import static org.apache.spark.sql.functions.lit;
Dataset<Row> final12 = otherDataset.select(otherDataset.col("colA"), otherDataset.col("colB"));
Dataset<Row> result = final12.withColumn("columnName", lit(1));
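For anyone who wants to try this end to end, here is a minimal sketch of the same idea as a standalone program. It assumes Spark SQL is on the classpath and runs in local mode. The class name AddConstantColumn, the helper withConstant, and the spark.range(...) stand-in for the Hive-backed Dataset are hypothetical, made up for illustration and not taken from the question.

```java
import static org.apache.spark.sql.functions.lit;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

class AddConstantColumn {

    // Hypothetical helper: returns a new Dataset with one extra column
    // that holds the same literal value in every row.
    public static Dataset<Row> withConstant(Dataset<Row> input, String columnName, Object value) {
        return input.withColumn(columnName, lit(value));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("AddConstantColumn")
                .master("local[*]") // assumption: local run, for demonstration only
                .getOrCreate();

        // Stand-in for the joined Hive data: three rows with columns colA and colB.
        Dataset<Row> otherDataset = spark.range(3)
                .selectExpr("id AS colA", "id * 10 AS colB");

        // Select the two columns, then append the constant column.
        Dataset<Row> result = withConstant(otherDataset.select("colA", "colB"), "colName", 1);

        result.printSchema();
        result.show();

        spark.stop();
    }
}
```

Note that withColumn does not modify the Dataset it is called on; it returns a new one, so the result has to be assigned to a variable as shown.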
Hope this helps!