The statsmodels module is a Python library for statistical modeling and hypothesis testing. It provides tools for estimating statistical models, running statistical tests, and exploring data. Whether you’re working on linear regression, time series analysis, or generalized linear models, or exploring your data through statistical tests, statsmodels has you covered.
Basic Statistics and Linear Regression (Level 1)
statsmodels.api.OLS
- Definition: Ordinary Least Squares regression model.
- Example:
import statsmodels.api as sm
X = sm.add_constant(X) # Add a constant term
model = sm.OLS(y, X).fit()
model.summary()
- Definition: Provides a summary of regression model statistics.
- Example:
print(model.summary())
Categorical Variables and ANOVA (Level 2)
statsmodels.formula.api.ols
- Definition: Create a regression model using a formula interface.
- Example:
import statsmodels.formula.api as smf
model = smf.ols(formula='y ~ C(category) + X', data=df).fit()
statsmodels.stats.anova.anova_lm
- Definition: Perform analysis of variance (ANOVA).
- Example:
from statsmodels.stats.anova import anova_lm
anova_results = anova_lm(model)
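To see the formula interface and ANOVA working together, here is a small sketch on a synthetic DataFrame. The column names match the formula above, but the data and group effects are made up for the example. It passes `typ=2` to `anova_lm`, a choice of this sketch that is often preferred for unbalanced groups; the default is a Type I table.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Synthetic data (assumed for illustration): three groups with different means
rng = np.random.default_rng(1)
n = 150
df = pd.DataFrame({
    "category": rng.choice(["a", "b", "c"], size=n),
    "X": rng.normal(size=n),
})
group_effect = df["category"].map({"a": 0.0, "b": 1.0, "c": 2.0})
df["y"] = 1.0 + group_effect + 0.5 * df["X"] + rng.normal(scale=0.5, size=n)

# C(category) tells the formula parser to treat the column as categorical
formula_results = smf.ols("y ~ C(category) + X", data=df).fit()

# One row per model term plus a residual row
anova_table = anova_lm(formula_results, typ=2)
print(anova_table)
```

The `PR(>F)` column holds the p-value for each term.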
Logistic Regression (Level 2)
statsmodels.api.Logit
- Definition: Create a logistic regression model.
- Example:
import statsmodels.api as sm
model = sm.Logit(y, X).fit()
model.predict()
- Definition: Predict probabilities for logistic regression.
- Example:
y_pred = model.predict(X_new)
Time Series Analysis (Level 3)
statsmodels.tsa.arima.model.ARIMA
- Definition: Fit an ARIMA time series model.
- Example:
from statsmodels.tsa.arima.model import ARIMA
model = ARIMA(data, order=(1, 1, 1)).fit()
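statsmodels.graphics.tsaplots.plot_predict
- Definition: Plot in-sample predictions and forecasts from a fitted time series model. Results from statsmodels.tsa.arima.model.ARIMA have no plot_predict() method; in statsmodels 0.13 and later, use this standalone function instead.
- Example:
from statsmodels.graphics.tsaplots import plot_predict
plot_predict(model, start=10, end=20)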
Generalized Linear Models (Level 3)
statsmodels.api.GLM
- Definition: Fit a Generalized Linear Model.
- Example:
import statsmodels.api as sm
model = sm.GLM(y, X, family=sm.families.Binomial()).fit()
model.get_prediction()
- Definition: Get prediction results from a GLM.
- Example:
prediction = model.get_prediction(X_new)
Time Series Decomposition (Level 4)
statsmodels.tsa.seasonal.seasonal_decompose
- Definition: Decompose a time series into trend, seasonal, and residual components.
- Example:
from statsmodels.tsa.seasonal import seasonal_decompose
decomposition = seasonal_decompose(time_series, model='additive')
decomposition.plot()
- Definition: Plot decomposed time series components.
- Example:
decomposition.plot()
Non-Linear Least Squares (Level 4)
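scipy.optimize.curve_fit
- Definition: Fit a non-linear least squares model. Note that statsmodels.api does not export an NLS class, so sm.NLS(...) raises an AttributeError; scipy.optimize.curve_fit is a common substitute.
- Example:
from scipy.optimize import curve_fit
estimated_params, covariance = curve_fit(nonlinear_function, X, y, p0=params)
estimated_params
- Definition: Access estimated parameters from the non-linear model.
- Example:
print(estimated_params)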
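As far as I know, `statsmodels.api` does not export a class named `NLS`, so a non-linear least squares fit is commonly done with SciPy's `curve_fit` instead. The sketch below swaps in that technique. The exponential-decay form of `nonlinear_function`, the true parameters 2.5 and 1.3, and the starting values are all assumptions made for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def nonlinear_function(x, a, b):
    """Hypothetical model for this example: exponential decay a * exp(-b * x)."""
    return a * np.exp(-b * x)

# Synthetic data (assumed for illustration)
rng = np.random.default_rng(6)
x = np.linspace(0, 4, 60)
y = nonlinear_function(x, 2.5, 1.3) + rng.normal(scale=0.05, size=x.size)

# p0 holds the starting values for (a, b); a poor start can stall the optimizer
estimated_params, covariance = curve_fit(nonlinear_function, x, y, p0=[1.0, 1.0])
standard_errors = np.sqrt(np.diag(covariance))  # approximate standard errors

print(estimated_params)
print(standard_errors)
```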
These are a few examples of the functions and classes in statsmodels, grouped by category and complexity level.