| Title: | Perform a Relative Weights Analysis |
|---|---|
| Description: | Perform a Relative Weights Analysis (RWA) (a.k.a. Key Drivers Analysis) as per the method described in Tonidandel & LeBreton (2015) <DOI:10.1007/s10869-014-9351-z>, with its original roots in Johnson (2000) <DOI:10.1207/S15327906MBR3501_1>. In essence, RWA decomposes the total variance predicted in a regression model into weights that accurately reflect the proportional contribution of the predictor variables, which addresses the issue of multi-collinearity. In typical scenarios, RWA returns similar results to Shapley regression, but with a significant advantage on computational performance. |
| Authors: | Martin Chan [aut, cre] |
| Maintainer: | Martin Chan <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.2.9000 |
| Built: | 2026-05-22 18:50:05 UTC |
| Source: | https://github.com/martinctc/rwa |
rwa()
Pass the output of rwa() and plot a bar chart of the rescaled importance values.
Signs are always calculated and taken into account, which is equivalent to setting the applysigns
argument to TRUE in rwa().
    plot_rwa(rwa)
| Argument | Description |
|---|---|
| rwa | Direct list output from rwa(). |
    library(ggplot2)  # provides the diamonds dataset
    library(rwa)

    # Use a smaller sample for faster execution
    diamonds_small <- diamonds[sample(nrow(diamonds), 1000), ]

    diamonds_small |>
      rwa(outcome = "price",
          predictors = c("depth", "carat", "x", "y", "z"),
          applysigns = TRUE) |>
      plot_rwa()
Takes a data frame and returns a version with all columns made up entirely of missing values removed.
    remove_all_na_cols(df)
| Argument | Description |
|---|---|
| df | Data frame to be passed through. |
This is used within rwa().
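The behaviour described above can be sketched in base R. This is a minimal illustration rather than the package's actual source, and the helper name drop_all_na_cols is hypothetical.

```r
# Hypothetical stand-in for remove_all_na_cols(); not the package's source.
drop_all_na_cols <- function(df) {
  # Keep a column if at least one of its values is not missing
  keep <- colSums(!is.na(df)) > 0
  df[, keep, drop = FALSE]
}

example_df <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, NA, NA),   # entirely missing, so it is dropped
  c = c("x", "y", NA)
)

cleaned <- drop_all_na_cols(example_df)
names(cleaned)
```

Columns with only some missing values (a and c here) are kept; only the column with no observed values is removed.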
This function performs a Relative Weights Analysis (RWA) and
returns a list of outputs. RWA provides a heuristic method for estimating
the relative weight of predictor variables in multiple regression. It fits
a multiple regression on a set of transformed predictors that are
orthogonal to each other but maximally related to the original set of
predictors. rwa() is optimised for dplyr pipes and shows
positive / negative signs for weights.
    rwa(
      df,
      outcome,
      predictors,
      applysigns = FALSE,
      method = "auto",
      sort = TRUE,
      bootstrap = FALSE,
      n_bootstrap = 1000,
      conf_level = 0.95,
      focal = NULL,
      comprehensive = FALSE,
      include_rescaled_ci = FALSE
    )
| Argument | Description |
|---|---|
| df | Data frame or tibble to be passed through. |
| outcome | Outcome variable, to be specified as a string or bare input. Must be a numeric variable. |
| predictors | Predictor variable(s), to be specified as a vector of string(s) or bare input(s). All variables must be numeric. |
| applysigns | Logical value specifying whether to show an estimate that applies the sign. Defaults to FALSE. |
| method | String specifying the method of regression to apply. Defaults to "auto", which selects the method based on the outcome variable. |
| sort | Logical value specifying whether to sort results by rescaled relative weights in descending order. Defaults to TRUE. |
| bootstrap | Logical value specifying whether to calculate bootstrap confidence intervals. Defaults to FALSE. |
| n_bootstrap | Number of bootstrap samples to use when bootstrap = TRUE. Defaults to 1000. |
| conf_level | Confidence level for bootstrap intervals. Defaults to 0.95. |
| focal | Focal variable for bootstrap comparisons (optional). Defaults to NULL. |
| comprehensive | Logical value specifying whether to run a comprehensive bootstrap analysis, including random variable and focal comparisons. Defaults to FALSE. |
| include_rescaled_ci | Logical value specifying whether to include confidence intervals for rescaled weights. Defaults to FALSE. |
rwa() produces raw relative weight values (epsilons) as well as rescaled
weights (scaled as a percentage of predictable variance) for every predictor
in the model. Signs are added to the weights when the applysigns argument
is set to TRUE. See https://www.scotttonidandel.com/rwa-web for the
original implementation that inspired this package.
This function is a wrapper around rwa_multiregress() and rwa_logit(),
automatically selecting the appropriate method based on the outcome variable
or the method argument.
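As a rough illustration of how automatic selection could work, the sketch below chooses between the two underlying functions by inspecting the outcome. The helper name guess_rwa_method and its detection rule (exactly two distinct non-missing values means a binary outcome) are assumptions made for this example; the rule rwa() actually applies may differ.

```r
# Hypothetical helper: guesses which underlying function an "auto" setting
# could dispatch to. The detection rule is an assumption for illustration,
# not the rule confirmed to be used by rwa().
guess_rwa_method <- function(y) {
  distinct_values <- unique(y[!is.na(y)])
  if (length(distinct_values) == 2) "rwa_logit" else "rwa_multiregress"
}

method_for_mpg <- guess_rwa_method(mtcars$mpg)  # continuous outcome
method_for_am <- guess_rwa_method(mtcars$am)    # 0/1 outcome

method_for_mpg
method_for_am
```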
rwa() returns a list of outputs, as follows:

- predictors: character vector of names of the predictor variables used.
- rsquare: the R-squared value of the regression model (multiple regression only).
- result: the final output of the importance metrics (sorted by Rescaled.RelWeight in descending order by default).
  - The Rescaled.RelWeight column sums to 100.
  - The Sign column indicates whether a predictor is positively or negatively correlated with the outcome.
  - When bootstrap = TRUE, includes confidence interval columns for raw weights.
  - Rescaled weight CIs are available via include_rescaled_ci = TRUE but are not recommended for inference.
- n: the number of observations used in the analysis.
- bootstrap: bootstrap results (only present when bootstrap = TRUE), containing:
  - ci_results: confidence intervals for weights
  - boot_object: raw bootstrap object for advanced analysis
  - n_bootstrap: number of bootstrap samples used
- lambda: lambda matrix from the RWA calculation.
- RXX: correlation matrix of all the predictor variables against each other. Not available for logistic regression.
- RXY: correlation values of the predictor variables against the outcome variable. Not available for logistic regression.
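To give a feel for what bootstrap confidence intervals on raw weights involve, here is a base R sketch using a simple percentile interval. It is not the package's implementation: the helper raw_weights is hypothetical, and the interval type rwa() uses is not stated here, so the percentile method is an assumption.

```r
# Hypothetical helper: raw relative weights (epsilons) for a linear model,
# using the eigen-decomposition approach of Johnson (2000).
raw_weights <- function(data, outcome, predictors) {
  rxx <- cor(data[, predictors])
  rxy <- cor(data[, predictors], data[[outcome]])
  eig <- eigen(rxx)
  lambda <- eig$vectors %*% diag(sqrt(eig$values)) %*% t(eig$vectors)
  beta <- solve(lambda) %*% rxy
  drop(lambda^2 %*% beta^2)
}

set.seed(123)
predictors <- c("cyl", "disp", "hp", "wt")
n_bootstrap <- 200   # kept small so the sketch runs quickly
conf_level <- 0.95

# Resample rows with replacement and recompute the raw weights each time
boot_weights <- replicate(n_bootstrap, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  raw_weights(mtcars[idx, ], "mpg", predictors)
})

# Percentile interval per predictor (rows of boot_weights are predictors)
alpha <- 1 - conf_level
ci <- t(apply(boot_weights, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2)))
rownames(ci) <- predictors
ci
```

Each row of ci holds the lower and upper bound for one predictor's raw weight. Raising n_bootstrap gives more stable bounds at the cost of run time.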
plot_rwa() for plotting results, rwa_multiregress() and
rwa_logit() for the underlying implementations.
    library(ggplot2)  # provides the diamonds dataset
    library(rwa)

    # Basic RWA (results sorted by default)
    rwa(diamonds, "price", c("depth", "carat"))

    # RWA without sorting (preserves original predictor order)
    rwa(diamonds, "price", c("depth", "carat"), sort = FALSE)

    # Plot results using plot_rwa()
    diamonds |>
      rwa("price", c("depth", "carat", "x", "y")) |>
      plot_rwa()

    # For faster examples, use a subset of data for bootstrap
    diamonds_small <- diamonds[sample(nrow(diamonds), 1000), ]

    # RWA with bootstrap confidence intervals (raw weights only)
    rwa(diamonds_small, "price", c("depth", "carat"),
        bootstrap = TRUE, n_bootstrap = 100)

    # Include rescaled weight CIs (use with caution for inference)
    rwa(diamonds_small, "price", c("depth", "carat"),
        bootstrap = TRUE, include_rescaled_ci = TRUE, n_bootstrap = 100)

    # Comprehensive bootstrap analysis with focal variable
    result <- rwa(diamonds_small, "price", c("depth", "carat", "table"),
                  bootstrap = TRUE, comprehensive = TRUE,
                  focal = "carat", n_bootstrap = 100)

    # View confidence intervals
    result$bootstrap$ci_results

    # Based on logistic regression (auto-detected from binary outcome)
    diamonds$IsIdeal <- as.numeric(diamonds$cut == "Ideal")
    rwa(diamonds, "IsIdeal", c("depth", "carat"))
This function performs Relative Weights Analysis (RWA) for binary outcome variables using logistic regression. RWA provides a method for estimating the relative importance of predictor variables by transforming them into orthogonal variables while preserving their relationship to the outcome. This implementation follows Johnson (2000) for logistic regression.
    rwa_logit(df, outcome, predictors, applysigns = FALSE)
| Argument | Description |
|---|---|
| df | Data frame or tibble to be passed through. |
| outcome | Outcome variable, to be specified as a string or bare input. Must be a numeric variable. |
| predictors | Predictor variable(s), to be specified as a vector of string(s) or bare input(s). All variables must be numeric. |
| applysigns | Logical value specifying whether to show an estimate that applies the sign. Defaults to FALSE. |
rwa_logit() returns a list of outputs, as follows:

- predictors: character vector of names of the predictor variables used.
- rsquare: the pseudo R-squared value (sum of epsilon weights) for the logistic regression model.
- result: the final output of the importance metrics.
  - The Rescaled.RelWeight column sums to 100.
  - The Sign column indicates whether a predictor is positively or negatively associated with the outcome.
- n: the number of observations used in the analysis.
- lambda: the lambda transformation matrix from the analysis.
    library(rwa)

    # Create a binary outcome variable
    mtcars_binary <- mtcars
    mtcars_binary$high_mpg <- ifelse(mtcars$mpg > median(mtcars$mpg), 1, 0)

    # Basic logistic RWA
    result <- rwa_logit(
      df = mtcars_binary,
      outcome = "high_mpg",
      predictors = c("cyl", "disp", "hp", "wt")
    )

    # View the relative importance results
    result$result

    # With sign information
    result_signed <- rwa_logit(
      df = mtcars_binary,
      outcome = "high_mpg",
      predictors = c("cyl", "disp", "hp", "wt"),
      applysigns = TRUE
    )
    result_signed$result
This function performs a Relative Weights Analysis (RWA) and returns a list of outputs.
RWA provides a heuristic method for estimating the relative weight of predictor variables in multiple regression.
It fits a multiple regression on a set of transformed predictors that are orthogonal to each other but
maximally related to the original set of predictors.
rwa_multiregress() is optimised for dplyr pipes and shows positive / negative signs for weights.
    rwa_multiregress(df, outcome, predictors, applysigns = FALSE)
| Argument | Description |
|---|---|
| df | Data frame or tibble to be passed through. |
| outcome | Outcome variable, to be specified as a string or bare input. Must be a numeric variable. |
| predictors | Predictor variable(s), to be specified as a vector of string(s) or bare input(s). All variables must be numeric. |
| applysigns | Logical value specifying whether to show an estimate that applies the sign. Defaults to FALSE. |
rwa_multiregress() produces raw relative weight values (epsilons) as well as rescaled weights (scaled as a percentage of predictable variance)
for every predictor in the model.
Signs are added to the weights when the applysigns argument is set to TRUE.
See https://relativeimportance.davidson.edu/multipleregression.html for the original implementation that inspired this package.
rwa_multiregress() returns a list of outputs, as follows:

- predictors: character vector of names of the predictor variables used.
- rsquare: the R-squared value of the regression model.
- result: the final output of the importance metrics.
  - The Rescaled.RelWeight column sums to 100.
  - The Sign column indicates whether a predictor is positively or negatively correlated with the outcome.
- n: the number of observations used in the analysis.
- lambda: the transformation matrix that maps the original correlated predictors to orthogonal variables while preserving their relationship to the outcome. Used internally to compute relative weights.
- RXX: correlation matrix of all the predictor variables against each other.
- RXY: correlation values of the predictor variables against the outcome variable.
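The quantities listed above (RXX, RXY, lambda, rsquare and the raw and rescaled weights) can be reproduced step by step in base R. The sketch below follows the eigen-decomposition approach of Johnson (2000) and is meant only as an illustration, not as the package's source. The column names Variables and Raw.RelWeight are assumptions, and the sign is taken from the predictor-outcome correlation, as the description of the Sign column suggests.

```r
predictors <- c("cyl", "disp", "hp", "wt")
outcome <- "mpg"

# Correlations among predictors (RXX) and between predictors and outcome (RXY)
RXX <- cor(mtcars[, predictors])
RXY <- cor(mtcars[, predictors], mtcars[[outcome]])

# lambda: symmetric square root of RXX, linking the original predictors
# to their orthogonal counterparts
eig <- eigen(RXX)
lambda <- eig$vectors %*% diag(sqrt(eig$values)) %*% t(eig$vectors)

# Standardised coefficients of the outcome on the orthogonal variables
beta <- solve(lambda) %*% RXY
rsquare <- sum(beta^2)

# Raw weights (epsilons) and weights rescaled to sum to 100
raw_weight <- drop(lambda^2 %*% beta^2)
rescaled_weight <- raw_weight / rsquare * 100

result <- data.frame(
  Variables = predictors,                   # assumed column name
  Raw.RelWeight = raw_weight,               # assumed column name
  Rescaled.RelWeight = rescaled_weight,
  Sign = ifelse(drop(RXY) >= 0, "+", "-")   # assumption: sign of the correlation
)
result

# The raw weights add up to the R-squared of the ordinary regression
fit <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)
all.equal(rsquare, summary(fit)$r.squared)
```

Because the squared entries in each column of lambda sum to 1, the raw weights partition the model R-squared exactly, which is why the rescaled weights sum to 100.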
    library(rwa)

    # Basic multiple regression RWA
    result <- rwa_multiregress(
      df = mtcars,
      outcome = "mpg",
      predictors = c("cyl", "disp", "hp", "wt")
    )

    # View the relative importance results
    result$result

    # With sign information
    result_signed <- rwa_multiregress(
      df = mtcars,
      outcome = "mpg",
      predictors = c("cyl", "disp", "hp", "wt"),
      applysigns = TRUE
    )
    result_signed$result