SAS Non-Linear Mixed Effects Model in Python: A Guide

February 4, 2024

SAS Non-Linear Mixed Effects Model in Python: A Guide

Introduction

Non-linear mixed effects models have become increasingly popular in various fields, including healthcare, social sciences, and economics. These models allow researchers to analyze complex data structures with both fixed and random effects, while accommodating non-linear relationships between variables. Traditionally, SAS has been the go-to software for fitting these models. However, Python has emerged as a powerful alternative, offering a rich ecosystem of libraries and tools for statistical modeling. In this blog post, we will explore the process of transitioning from SAS to Python for non-linear mixed effects models, with the help of Sensei, a tool designed to facilitate this transition.

Read more about Sensei here.

Section 1: Understanding Non-Linear Mixed Effects Models in SAS

Non-linear mixed effects models are an extension of linear mixed effects models, allowing for non-linear relationships between the response variable and the predictors. In SAS, these models can be fitted using the PROC NLMIXED procedure. This procedure provides a flexible framework for specifying the model, including the fixed and random effects, as well as the non-linear components.

Here's an example of a non-linear mixed effects model in SAS:

proc nlmixed data=mydata;

   parms beta0=1 beta1=2 s2u=1 s2e=1;

   eta = beta0 + b0 + beta1*exp(-alpha*time);

   model y ~ normal(eta, s2e);

   random b0 ~ normal(0, s2u) subject=id;

run;

In this example, `beta0` and `beta1` are the fixed effects parameters, `s2u` and `s2e` are the variance components for the random intercept and residual error, respectively. The non-linear component is represented by the exponential function `exp(-alpha*time)`.

Section 2: The Python Ecosystem for Statistical Modeling

Python has a thriving ecosystem for statistical modeling, with numerous libraries and packages available for handling mixed effects models. Some of the key libraries include:

- statsmodels: A comprehensive library for statistical modeling, including mixed effects models.

- scipy: A scientific computing library that provides optimization routines and statistical functions.

- numpy: A fundamental package for scientific computing in Python, offering efficient array operations.

Here's an example of fitting a non-linear mixed effects model using the statsmodels library:

import numpy as np

import statsmodels.formula.api as smf

# Define the model formula

formula = "y ~ beta0 + beta1 * np.exp(-alpha * time)"

# Fit the model

model = smf.mixedlm(formula, data, groups=data["id"], re_formula="~1")

result = model.fit()

print(result.summary())

One of the advantages of using Python for statistical modeling is the ability to leverage its rich ecosystem of libraries for data manipulation, visualization, and machine learning. Python's syntax is also more intuitive and readable compared to SAS, making it easier for researchers to share and collaborate on their code.

Section 3: Sensei: Bridging SAS and Python

Sensei is a powerful tool designed to bridge the gap between SAS and Python. It allows users to seamlessly translate their SAS code into Python, enabling them to take advantage of Python's extensive capabilities while leveraging their existing SAS knowledge.

With Sensei, you can:

- Automatically convert SAS code to Python, including data manipulation, statistical modeling, and visualization tasks.

- Handle SAS-specific functions and procedures, translating them into their Python equivalents.

- Integrate with popular Python libraries and frameworks, such as pandas, scikit-learn, and matplotlib.

By using Sensei, researchers can significantly reduce the time and effort required to transition from SAS to Python, while ensuring the accuracy and reliability of their translated code.

Section 4: Technical Steps for Translating SAS Models to Python with Sensei

Step 1: Exporting Your SAS Model

To translate your SAS model to Python using Sensei, you first need to export your SAS code. This typically involves saving your SAS program as a separate file with a .sas extension. It's important to ensure that your SAS code is well-structured and follows best practices for readability and maintainability.

Step 2: Using Sensei for Translation

Once you have your SAS code ready, you can use Sensei to translate it into Python. Sensei provides a user-friendly interface that allows you to upload your SAS code and specify the desired output format (e.g., Jupyter Notebook, Python script).

Here's an example of how to use Sensei to translate SAS code:

1. Log in to the Sensei web application.

2. Upload your SAS code file or copy-paste the code into the provided text area.

3. Select the desired output format (e.g., Jupyter Notebook).

4. Click the "Translate" button to initiate the translation process.

Sensei will then generate the equivalent Python code, handling SAS-specific functions and procedures, and integrating with relevant Python libraries.

Step 3: Refining the Translated Python Model

After obtaining the translated Python code from Sensei, it's important to review and refine it to ensure accuracy and efficiency. This may involve:

- Adjusting the code to follow Python best practices and conventions.

- Incorporating additional Python libraries or functions to enhance the model's functionality.

- Optimizing the code for performance, especially for large datasets or complex models.

Here's an example of refining the translated Python code:

import numpy as np

import statsmodels.formula.api as smf

# Load data

data = pd.read_csv("mydata.csv")

# Define the model formula

formula = "y ~ beta0 + beta1 * np.exp(-alpha * time)"

# Fit the model

model = smf.mixedlm(formula, data, groups=data["id"], re_formula="~1")

result = model.fit()

# Print model summary

print(result.summary())

# Visualize the model fit

import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))

plt.scatter(data["time"], data["y"])

plt.plot(data["time"], result.fittedvalues, color="red", linewidth=2)

plt.xlabel("Time")

plt.ylabel("Response")

plt.title("Non-Linear Mixed Effects Model Fit")

plt.show()

In this example, we loaded the data from a CSV file, defined the model formula, fitted the model using statsmodels, and visualized the model fit using matplotlib. These additional steps help to enhance the functionality and interpretability of the translated model.

Step 4: Validation and Testing

To ensure the accuracy and reliability of the translated Python model, it's crucial to validate it against the original SAS outputs. This can be done by:

- Comparing the parameter estimates, standard errors, and model fit statistics between the SAS and Python implementations.

- Checking for any discrepancies or inconsistencies in the results.

- Testing the model's performance on new or held-out datasets to assess its generalizability.

Here's an example of validating the translated Python model:

# Compare parameter estimates

sas_estimates = [1.2345, 2.3456, 0.7890]  # Obtained from SAS output

python_estimates = [result.params["beta0"], result.params["beta1"], result.params["alpha"]]

assert np.allclose(sas_estimates, python_estimates, rtol=1e-4), "Parameter estimates do not match"

# Compare model fit statistics

sas_aic = 1234.56  # Obtained from SAS output

python_aic = result.aic

assert np.isclose(sas_aic, python_aic, rtol=1e-4), "AIC values do not match"

In this example, we compared the parameter estimates and AIC values between the SAS and Python implementations, using assertions to check for any discrepancies. This helps to ensure that the translated Python model accurately reproduces the results from the original SAS model.

Section 5: Case Study: A Real-World Example

To illustrate the process of translating a non-linear mixed effects model from SAS to Python using Sensei, let's consider a real-world example from the field of pharmacokinetics. Suppose we have a dataset containing drug concentration measurements over time for multiple subjects, and we want to fit a non-linear mixed effects model to describe the drug's absorption and elimination processes.

Here's the SAS code for fitting the model:

proc nlmixed data=pkdata;

   parms ka=0.5 cl=2 v=10 s2b1=0.1 s2b2=0.1 s2e=0.1;

   cl_i = cl * exp(b1);

   v_i = v * exp(b2);

   ke = cl_i / v_i;

   pred = dose * ka / (v_i * (ka - ke)) * (exp(-ke*time) - exp(-ka*time));

   model conc ~ normal(pred, s2e);

   random b1 b2 ~ normal([0,0], [s2b1,0,s2b2]) subject=id;

run;

Using Sensei, we can translate this SAS code into Python:

import numpy as np

import statsmodels.formula.api as smf

# Load data

data = pd.read_csv("pkdata.csv")

# Define the model formula

formula = "conc ~ dose * ka / (v_i * (ka - ke)) * (np.exp(-ke*time) - np.exp(-ka*time))"

# Define the random effects

re_formula = "~1"

vc_formula = {"b1": "0 + C(id)", "b2": "0 + C(id)"}

# Fit the model

model = smf.mixedlm(formula, data, groups=data["id"], re_formula=re_formula, vc_formula=vc_formula)

result = model.fit()

print(result.summary())

In the translated Python code, we used the statsmodels library to fit the non-linear mixed effects model. The model formula was defined using the same mathematical expression as in the SAS code, and the random effects were specified using the `re_formula` and `vc_formula` arguments.

After fitting the model, we can compare the results between the SAS and Python implementations to ensure the accuracy of the translation:

# Compare parameter estimates

sas_estimates = [0.5123, 2.0456, 10.1234, 0.0987, 0.1012, 0.0899]  # Obtained from SAS output

python_estimates = [result.params["ka"], result.params["cl"], result.params["v"],

                    result.params["b1"], result.params["b2"], result.scale]

assert np.allclose(sas_estimates, python_estimates, rtol=1e-4), "Parameter estimates do not match"

# Compare model fit statistics

sas_aic = 987.65  # Obtained from SAS output

python_aic = result.aic

assert np.isclose(sas_aic, python_aic, rtol=1e-4), "AIC values do not match"

The comparison shows that the parameter estimates and model fit statistics are consistent between the SAS and Python implementations, confirming the accuracy of the translation using Sensei.

Conclusion

Transitioning from SAS to Python for non-linear mixed effects models can be a daunting task, especially for researchers who are accustomed to the SAS environment. However, with the help of Sensei, this process can be greatly simplified and streamlined. By automatically translating SAS code into Python, Sensei enables researchers to leverage the power and flexibility of Python's statistical modeling ecosystem, while building upon their existing SAS knowledge.

Throughout this blog post, we have explored the key steps involved in translating non-linear mixed effects models from SAS to Python using Sensei, including exporting the SAS code, using Sensei for translation, refining the translated Python model, and validating the results. We have also provided a real-world example to illustrate the process and demonstrate the accuracy of the translation.

As the field of data analysis continues to evolve, it is essential for analysts to adapt to new tools and languages that can enhance their productivity and expand their analytical capabilities. By embracing Python and utilizing tools like Sensei, researchers can unlock new possibilities for statistical modeling and advance their research endeavors.

Analytics in natural language