
Answer by Martin Senne for How can I change column types in Spark SQL's DataFrame?

Since the cast operation is available on Spark Columns (and since I personally do not favour UDFs, as proposed by @Svend, at this point), how about:

df.select( df("year").cast(IntegerType).as("year"), ... )

to cast to the requested type? As a neat side effect, values that are not castable / "convertible" in that sense will become null.
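To make the null behaviour concrete, here is a minimal sketch rather than a definitive implementation. It assumes Spark 2.0 or later (for SparkSession) running in local mode; the object name CastExample, the helper yearAsInt and the sample rows are all made up for illustration. It also assumes ANSI mode is switched off, because with ANSI mode enabled an invalid cast raises an error instead of producing null.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.IntegerType

object CastExample {
  // Hypothetical helper: keeps "id" and replaces "year" with an integer version.
  def yearAsInt(df: DataFrame): DataFrame =
    df.select(df("id"), df("year").cast(IntegerType).as("year"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cast-example")
      // Assumption: ANSI mode off, so that uncastable values become null
      // instead of raising an error.
      .config("spark.sql.ansi.enabled", "false")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: "year" arrives as a string column.
    val df = Seq(("a", "2012"), ("b", "n/a")).toDF("id", "year")

    val casted = yearAsInt(df)
    casted.printSchema() // "year" is now an integer column
    casted.show()        // the "n/a" row shows null for "year"

    spark.stop()
  }
}
```

Selecting the columns explicitly, as above, drops any column that is not listed, so every column to be kept has to appear in the select.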

In case you need this as a helper method, use:

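import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.DataType

object DFHelper {
  // Replaces column `cn` with the same column cast to `tpe`.
  def castColumnTo( df: DataFrame, cn: String, tpe: DataType ) : DataFrame = {
    df.withColumn( cn, df(cn).cast(tpe) )
  }
}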

which is used like:

import DFHelper._
val df2 = castColumnTo( df, "year", IntegerType )
