To the answers suggesting the use of cast: FYI, the cast method in Spark 1.4.1 is broken.
For example, a DataFrame with a string column holding the value "8182175552014127960" ends up with the value "8182175552014128100" after a cast to bigint:
df.show

+-------------------+
|                  a|
+-------------------+
|8182175552014127960|
+-------------------+

df.selectExpr("cast(a as bigint) a").show

+-------------------+
|                  a|
+-------------------+
|8182175552014128100|
+-------------------+
We had to deal with a lot of issues before finding this bug, because we had bigint columns in production.
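The corrupted value is consistent with the conversion round-tripping through a floating-point type (a double can only represent integers exactly up to 2^53, and this value is well above that), though I have not traced the root cause. As a workaround, a UDF that parses the string explicitly avoids the cast path entirely. A minimal sketch, assuming the Spark 1.4.x Scala API (the toBigint name is mine, and this version will throw on malformed input):

import org.apache.spark.sql.functions.udf

// Parse the string explicitly with String.toLong instead of
// relying on the (buggy) cast expression.
val toBigint = udf((s: String) => s.toLong)

val fixed = df.withColumn("a", toBigint(df("a")))
fixed.show()
// +-------------------+
// |                  a|
// +-------------------+
// |8182175552014127960|
// +-------------------+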