Math Problem Statement

data.xlsx

8.76 KB

From this data, create plots to test the assumptions of normality, homoscedasticity, and autocorrelation.

Solution

import pandas as pd

# Load the uploaded file to inspect its contents
file_path = '/mnt/data/17330021315827j1b9rud.xlsx'
data = pd.ExcelFile(file_path)

# Display sheet names to understand the structure of the file
data.sheet_names

# Load the data from the first sheet to inspect its structure
df = data.parse('Sheet1')
df.head()

Dataset loaded successfully. It consists of the variables x0, x1, x2, and y. I will now create plots for testing the assumptions of normality, homoscedasticity, and autocorrelation based on this dataset.

import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

# Define variables
X = df[['x0', 'x1', 'x2']]
y = df['y']

# Fit a regression model for residual analysis
X_const = sm.add_constant(X)  # Add constant for intercept
model = sm.OLS(y, X_const).fit()
residuals = model.resid

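# 1. Normality Test: QQ-Plot
# Pass an explicit axes: sm.qqplot otherwise opens its own figure,
# which leaves the figure created for sizing empty.
fig, ax = plt.subplots(figsize=(8, 6))
sm.qqplot(residuals, line='45', fit=True, ax=ax)
ax.set_title('QQ-Plot for Normality Test')
ax.grid()
plt.show()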

# 2. Homoscedasticity: Residuals vs Fitted
fitted_values = model.fittedvalues

plt.figure(figsize=(8, 6))
plt.scatter(fitted_values, residuals, alpha=0.7)
plt.axhline(0, color='red', linestyle='--', linewidth=1)
plt.title('Residuals vs Fitted Values (Homoscedasticity Test)')
plt.xlabel('Fitted Values')
plt.ylabel('Residuals')
plt.grid()
plt.show()

# 3. Autocorrelation: Durbin-Watson Test
durbin_watson_stat = durbin_watson(residuals)

# Display Durbin-Watson Test Result
print(f'Durbin-Watson Statistic: {durbin_watson_stat:.3f}')

# Display Breusch-Pagan Test Results (Homoscedasticity test)
bp_test = het_breuschpagan(residuals, X_const)
bp_labels = ['LM Statistic', 'p-value', 'F-value', 'F p-value']
bp_results = dict(zip(bp_labels, bp_test))
bp_results

### Results:

  1. Normality Test (QQ-Plot): The residuals lie close to the 45-degree reference line on the QQ plot, indicating approximate normality.
  2. Homoscedasticity Test:
    • The scatterplot of residuals vs. fitted values shows no clear pattern, suggesting homoscedasticity (constant variance of residuals).
    • Breusch-Pagan Test Results:
      • LM Statistic: 1.924
      • p-value: 0.382 (greater than 0.05, indicating no significant heteroscedasticity)
  3. Autocorrelation Test:
    • Durbin-Watson Statistic: 2.397 (reasonably close to 2, indicating little to no first-order autocorrelation; a value above 2 points toward mild negative autocorrelation, if any).
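
The request asked for plots, while the autocorrelation check reports only a single statistic. A residual autocorrelation (ACF) plot could complement it. The following is a minimal sketch, not part of the original solution: the helper names `residual_acf` and `plot_residual_acf` are hypothetical, the example residuals are made up for illustration, and the dashed band at ±1.96/√n is the usual approximate 95% range for white noise.

```python
import math


def residual_acf(residuals, max_lag):
    """Sample autocorrelation of a plain sequence of residuals for lags 0..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    denom = sum(c * c for c in centered)
    acf = []
    for k in range(max_lag + 1):
        # Sum of products of each centered residual with the one k steps earlier
        num = sum(centered[t] * centered[t - k] for t in range(k, n))
        acf.append(num / denom)
    return acf


def plot_residual_acf(residuals, max_lag=10):
    """Bar plot of the residual ACF with an approximate 95% white-noise band."""
    import matplotlib.pyplot as plt  # imported here so residual_acf works without matplotlib

    acf = residual_acf(residuals, max_lag)
    band = 1.96 / math.sqrt(len(residuals))
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.bar(range(max_lag + 1), acf, width=0.3)
    ax.axhline(0, color='black', linewidth=1)
    ax.axhline(band, color='red', linestyle='--', linewidth=1)
    ax.axhline(-band, color='red', linestyle='--', linewidth=1)
    ax.set_title('Residual Autocorrelation (ACF)')
    ax.set_xlabel('Lag')
    ax.set_ylabel('Autocorrelation')
    ax.grid()
    return fig


# Hypothetical residuals for illustration only; with the fitted model,
# one would pass list(model.resid) instead.
example_residuals = [1.0, -1.0, 1.0, -1.0]
example_acf = residual_acf(example_residuals, max_lag=2)
```

Calling `plot_residual_acf(list(model.resid))` followed by `plt.show()` would draw the plot for the fitted model. Bars that stay inside the dashed band at every lag above 0 are consistent with no autocorrelation.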

Let me know if you need further interpretation, modifications, or additional tests.

Further Questions:

  1. Would you like a detailed explanation of the Breusch-Pagan and Durbin-Watson tests?
  2. Should we check for multicollinearity in the predictors?
  3. Would you like a deeper analysis of the regression model (e.g., coefficients, p-values)?
  4. Do you need customized visualizations for any specific variables?
  5. Should we explore additional diagnostic tests, like the Shapiro-Wilk test for normality?

Tip:

Always review diagnostic plots and statistical tests together to ensure the validity of regression assumptions comprehensively.


Math Problem Analysis

Mathematical Concepts

Regression Analysis
Assumption Testing
Statistical Diagnostics

Formulas

Durbin-Watson statistic: $DW = \dfrac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
Breusch-Pagan test formula for heteroscedasticity
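
As a rough illustration of these two formulas, the sketch below computes the Durbin-Watson statistic by hand and converts a Breusch-Pagan LM statistic into a p-value. The function names are hypothetical and the alternating residuals are made up. Using 2 degrees of freedom for the chi-square reference distribution is an assumption (the solution does not state them), chosen because it reproduces the reported p-value. For 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2).

```python
import math


def durbin_watson_manual(residuals):
    """DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2 for a plain sequence."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def chi2_upper_tail_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)


# Hypothetical alternating residuals, for illustration only:
# numerator 4 + 4 + 4 = 12, denominator 4, so DW = 3.0 (negative autocorrelation).
dw_example = durbin_watson_manual([1.0, -1.0, 1.0, -1.0])

# LM statistic as reported in the results; 2 degrees of freedom is an assumption.
bp_p_value = chi2_upper_tail_2df(1.924)
print(f'DW example: {dw_example:.3f}, BP p-value: {bp_p_value:.3f}')
```

Under that assumption the LM statistic of 1.924 maps to a p-value of about 0.382, which agrees with the Breusch-Pagan result reported in the solution.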

Theorems

Central Limit Theorem
Gauss-Markov Theorem

Suitable Grade Level

Undergraduate Statistics or Data Science