Top "Regression" questions

Regression analysis is a collection of statistical techniques for modeling and predicting one or more variables based on other data.

sklearn LogisticRegression and changing the default threshold for classification

I am using LogisticRegression from the sklearn package, and have a quick question about classification. I built a ROC curve …

python scikit-learn classification regression

ValueError: feature_names mismatch: in xgboost in the predict() function

I have trained an XGBoostRegressor model. When I have to use this trained model for predicting for a new input, …

python pandas machine-learning regression xgboost

logit regression and singular Matrix error in Python

I am trying to run a logit regression for the German credit data (www4.stat.ncsu.edu/~boos/var.select/german.credit.html). …

python-2.7 regression statsmodels

Comparing two linear models with anova() in R

I don't quite understand what the p-value in this output means. I don't mean p-values as such, but in this …

r regression linear-regression anova

Show confidence limits and prediction limits in scatter plot

I have two arrays of data as height and weight: import numpy as np, matplotlib.pyplot as plt heights = np.…

numpy matplotlib scipy regression seaborn

PCA first or normalization first?

When doing regression or classification, what is the correct (or better) way to preprocess the data? Normalize the data -> …

machine-learning normalization classification regression pca

ValueError: endog must be in the unit interval

While using statsmodels, I am getting this weird error: ValueError: endog must be in the unit interval. Can someone give …

python regression statsmodels

Can I draw a regression line and show parameters using scatterplot with a pandas dataframe?

I would like to produce a Scatterplot from a Pandas dataframe using the following code: df.plot.scatter(x='one', …

python pandas regression scatter-plot

R - Plm and lm - Fixed effects

I have a balanced panel data set, df, that essentially consists of three variables, A, B and Y, that vary …

r regression plm

Simple multidimensional curve fitting

I have a bunch of data, generally in the form a, b, c, ..., y where y = f(a, b, c...) …

statistics regression best-fit-curve