I have a Spark DataFrame df. Is there a way of sub-selecting a few columns using a list of those columns?
scala> df.columns
res0: Array[String] = Array("a", "b", "c", "d")
I know I can do something like df.select("b", "c"). But suppose I have a list containing a few column names, val cols = List("b", "c"): is there a way to pass this list to df.select? df.select(cols) throws an error. I am looking for something like df.select(*cols) in Python.
Use df.select(cols.head, cols.tail: _*)
Let me know if it works :)
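To see why the head/tail split is needed without spinning up Spark, here is a minimal sketch using a stand-in function whose signature has the same shape as DataFrame.select (the select below is hypothetical, not Spark's):

```scala
// Stand-in with the same signature shape as DataFrame.select(col: String, cols: String*).
def select(col: String, cols: String*): Seq[String] = col +: cols

val cols = List("b", "c")

// select(cols) would not compile: a List[String] is not a String.
// Splitting the list satisfies both parameters:
val picked = select(cols.head, cols.tail: _*)
// picked == Seq("b", "c")
```

The first element fills the mandatory col parameter, and the rest of the list is unpacked into the varargs parameter.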
The key is the method signature of select:
select(col: String, cols: String*)
The cols: String* entry takes a variable number of arguments, and : _* unpacks the list so its elements can be passed as those arguments. This is very similar to unpacking in Python with *args. See here and here for other examples.
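The : _* ascription is a general Scala feature, not something specific to Spark. A small self-contained illustration (concat is a made-up helper for this sketch):

```scala
// Any varargs method accepts a sequence unpacked with ": _*".
def concat(parts: String*): String = parts.mkString("-")

val xs = List("x", "y", "z")

// concat(xs) would not type-check; unpack the list instead:
val joined = concat(xs: _*)  // "x-y-z"
```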