Top "Apache-spark-mllib" questions

MLlib is a machine learning library for Apache Spark.

Encode and assemble multiple features in PySpark

I have a Python class that I'm using to load and process some data in Spark. Among various things I …

python apache-spark apache-spark-sql apache-spark-mllib apache-spark-ml
How to extract model hyper-parameters from spark.ml in PySpark?

I'm tinkering with some cross-validation code from the PySpark documentation, and trying to get PySpark to tell me what model …

pyspark modeling cross-validation apache-spark-mllib apache-spark-ml
Dealing with unbalanced datasets in Spark MLlib

I'm working on a particular binary classification problem with a highly unbalanced dataset, and I was wondering if anyone has …

apache-spark machine-learning classification apache-spark-mllib
How to prepare data into a LibSVM format from DataFrame?

I want to produce the LibSVM format, so I shaped my DataFrame into the desired layout, but I do not know how …

apache-spark apache-spark-sql apache-spark-mllib libsvm apache-spark-ml
Error ExecutorLostFailure when running a task in Spark

When I try to run it on this folder, it throws ExecutorLostFailure every time. Hi, I am a …

apache-spark pyspark apache-spark-mllib collect
How to overwrite entire existing column in Spark dataframe with new column?

I want to overwrite a Spark column with a new column which is a binary flag. I tried directly overwriting …

apache-spark dataframe pyspark apache-spark-sql apache-spark-mllib
PySpark computing correlation

I want to use the pyspark.mllib.stat.Statistics.corr function to compute the correlation between two columns of pyspark.sql.dataframe.…

python apache-spark pyspark apache-spark-sql apache-spark-mllib
Save ML model for future usage

I was applying some Machine Learning algorithms like Linear Regression, Logistic Regression, and Naive Bayes to some data, but I …

apache-spark pyspark apache-spark-mllib apache-spark-ml
How to use XGBoost in a PySpark Pipeline

I want to update my PySpark code. In PySpark, the base model must be put in a pipeline, …

apache-spark pyspark apache-spark-mllib xgboost apache-spark-ml
How to save and load MLlib model in Apache Spark?

I trained a classification model in Apache Spark (using pyspark). I stored the model in an object, LogisticRegressionModel. Now, I …

python apache-spark pyspark apache-spark-mllib