ARDL Model Generation in Python
This document outlines a Python implementation of an Autoregressive Distributed Lag (ARDL) model, which regresses a time series on its own lags and on current and lagged values of explanatory variables. The example data consists of macroeconomic indicators, and the model is used to study the short-run and long-run dynamics among them.
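For reference, an ARDL(p, q) model with a single explanatory variable can be written as below. The symbols used here (alpha, phi, beta, theta) are notation chosen for illustration and do not come from the original text.

```latex
y_t = \alpha + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \sum_{j=0}^{q} \beta_j \, x_{t-j} + \varepsilon_t,
\qquad
\theta = \frac{\sum_{j=0}^{q} \beta_j}{1 - \sum_{i=1}^{p} \phi_i}
```

The coefficients \phi_i and \beta_j describe the short-run dynamics, while \theta is the implied long-run effect of x on y. It is only defined when the autoregressive coefficients do not sum to one.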
Necessary Imports
import pandas as pd
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.stattools import grangercausalitytests
from statsmodels.formula.api import ols
Function Implementation
ARDL Model Generation
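def ardl_model(data, dependent_var, independent_vars, max_lag):
    """
    Fit an ARDL model by OLS for short-run and long-run analysis.

    Parameters:
        data (pd.DataFrame): A DataFrame containing the time series data.
        dependent_var (str): The name of the dependent variable.
        independent_vars (list): A list of independent variable names.
        max_lag (int): The number of lags included for every variable.

    Returns:
        dict: A dictionary containing the fitted model results including:
            - 'short_run': Summary of the ARDL regression (short-run dynamics).
            - 'long_run': pd.Series of the implied long-run coefficients.

    Raises:
        ValueError: If input data does not contain required variables.
    """
    # Validate input variables
    columns = [dependent_var] + list(independent_vars)
    if not all(var in data.columns for var in columns):
        raise ValueError("Data must contain all specified variables.")

    # Convert to floats; only columns written with a '%' sign are divided by 100
    df = pd.DataFrame(index=data.index)
    for col in columns:
        text = data[col].astype(str).str.strip()
        values = text.str.rstrip('%').astype(float)
        df[col] = values / 100.0 if text.str.endswith('%').any() else values

    # Build regressors: lags 1..max_lag of the dependent variable and
    # lags 0..max_lag of each independent variable
    lagged = {f"{dependent_var}_L{k}": df[dependent_var].shift(k)
              for k in range(1, max_lag + 1)}
    for var in independent_vars:
        for k in range(max_lag + 1):
            lagged[f"{var}_L{k}"] = df[var].shift(k)
    X = sm.add_constant(pd.DataFrame(lagged).dropna())
    y = df.loc[X.index, dependent_var]

    # Fitting short-run model: OLS on the lagged regressors
    results = {}
    model_short = sm.OLS(y, X).fit()
    results['short_run'] = model_short.summary()

    # Long-run coefficients: the sum of each variable's lag coefficients
    # divided by (1 - sum of the autoregressive coefficients)
    params = model_short.params
    denom = 1.0 - sum(params[f"{dependent_var}_L{k}"] for k in range(1, max_lag + 1))
    long_run = {var: sum(params[f"{var}_L{k}"] for k in range(max_lag + 1)) / denom
                for var in independent_vars}
    long_run['const'] = params['const'] / denom
    results['long_run'] = pd.Series(long_run, name='long_run_coefficient')
    return results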
Example Usage
Data Preparation
To use the function, we have to prepare the data first. Here's an example of how you can do that:
# Sample dataset creation (replace with actual historical data)
data_dict = {
'REPO_RATE': ['12.00%', '16.00%', '15.88%', '19.00%', '23.99%', '15.25%', '14.50%',
'12.00%', '13.50%', '8.00%', '7.50%', '7.00%', '8.50%', '11.00%',
'12.00%', '7.00%', '5.50%', '5.50%', '5.00%', '5.00%', '5.75%',
'6.25%', '7.00%', '6.75%', '6.75%', '6.50%', '3.50%', '3.50%',
'4.75%', '8.25%'],
'GINI_COEFICIENT': ['0.59', '0.60', '0.60', '0.61', '0.61', '0.62', '0.63',
'0.63', '0.64', '0.64', '0.65', '0.65', '0.66', '0.66',
'0.67', '0.67', '0.67', '0.66', '0.65', '0.65', '0.65',
'0.64', '0.63', '0.63', '0.62', '0.62', '0.63', '0.63',
'0.63', '0.62'],
'CPI': ['9.0%', '8.7%', '7.4%', '8.6%', '6.9%', '5.2%', '5.4%', '5.7%',
'9.2%', '5.8%', '1.4%', '3.4%', '4.6%', '7.1%', '11.5%', '7.1%',
'4.3%', '5.0%', '5.6%', '5.7%', '6.1%', '4.6%', '6.3%', '5.3%',
'4.6%', '4.1%', '3.3%', '4.5%', '6.9%', '5.4%'],
'M3': ['16.8%', '15.3%', '18.1%', '15.2%', '13.5%', '11.3%', '8.6%',
'13.7%', '14.4%', '10.7%', '10.8%', '16.1%', '21.0%', '22.2%',
'13.7%', '1.1%', '6.1%', '7.3%', '6.0%', '6.9%', '8.4%', '10.2%',
'5.2%', '6.4%', '5.5%', '8.1%', '9.7%', '7.8%', '8.4%',
'6.2%', '6.2%'],
'DSR': ['5.9%', '6.0%', '6.2%', '6.3%', '6.8%', '6.9%', '7.2%', '7.5%',
'7.9%', '8.4%', '9.0%', '10.2%', '11.9%', '12.8%', '13.3%',
'13.2%', '12.1%', '11.9%', '11.7%', '11.5%', '11.2%', '10.9%',
'10.8%', '10.6%', '10.3%', '10.0%', '9.8%', '9.9%', '10.2%',
'10.5%']
}
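# NOTE: M3 above lists 31 values while every other series lists 30, so
# pd.DataFrame(data_dict) raises a ValueError. As a stopgap (an assumption,
# not a correction), each series is truncated to the shortest length here;
# reconcile M3 against the source data before relying on any results.
n_obs = min(len(values) for values in data_dict.values())
data = pd.DataFrame({name: values[:n_obs] for name, values in data_dict.items()})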
# Generating the ARDL model
result = ardl_model(data, 'CPI', ['REPO_RATE', 'GINI_COEFICIENT', 'M3', 'DSR'], max_lag=3)
# Print the results
print(result['short_run'])
print(result['long_run'])
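The example dataset mixes percentage strings (REPO_RATE, CPI, M3, DSR) with a plain ratio (GINI_COEFICIENT), so a conversion that divides every column by 100 would also rescale the Gini values. The sketch below shows one way to convert column by column; the helper name to_numeric_series is hypothetical, and the rule it applies (divide by 100 only when the column carries a '%' sign) is an assumption about how the data should be read.

```python
import pandas as pd


def to_numeric_series(series):
    """Convert strings to floats; divide by 100 only if values carry a '%' sign."""
    text = series.astype(str).str.strip()
    values = text.str.rstrip('%').astype(float)
    return values / 100.0 if text.str.endswith('%').any() else values


# First two observations of two columns from the example dataset
sample = pd.DataFrame({
    'REPO_RATE': ['12.00%', '16.00%'],
    'GINI_COEFICIENT': ['0.59', '0.60'],
})
clean = sample.apply(to_numeric_series)
print(clean)
```

After conversion, REPO_RATE is expressed as a decimal fraction and GINI_COEFICIENT keeps its original scale.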
Key Considerations
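Data Format: Make sure every series is numeric before fitting. Percentage strings such as '12.00%' must be converted to decimals, while columns that are already ratios, such as the Gini coefficient, must not be divided by 100.
Model Interpretation: Interpret results in the context of economic theory and alongside diagnostic tests, such as unit-root tests on each series and checks for residual autocorrelation.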
Conclusion
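The provided function is a starting point for ARDL modeling rather than a complete workflow: lag selection, stationarity checks, and residual diagnostics should be added before drawing conclusions from the estimates. For advanced learning, consider exploring courses on the Enterprise DNA Platform that cover econometric modeling, time series analysis, and data preparation techniques.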