Package 'rwa'

Title: Perform a Relative Weights Analysis
Description: Perform a Relative Weights Analysis (RWA) (a.k.a. Key Drivers Analysis) as per the method described in Tonidandel & LeBreton (2015) <DOI:10.1007/s10869-014-9351-z>, with its original roots in Johnson (2000) <DOI:10.1207/S15327906MBR3501_1>. In essence, RWA decomposes the total variance predicted in a regression model into weights that accurately reflect the proportional contribution of the predictor variables, which addresses the issue of multicollinearity. In typical scenarios, RWA returns similar results to Shapley regression, but with a significant advantage in computational performance.
Authors: Martin Chan [aut, cre]
Maintainer: Martin Chan <[email protected]>
License: GPL-3
Version: 0.1.2.9000
Built: 2026-05-22 18:50:05 UTC
Source: https://github.com/martinctc/rwa

Help Index


Plot the rescaled importance values from the output of rwa()

Description

Takes the output of rwa() and plots a bar chart of the rescaled importance values. Signs are always calculated and taken into account, which is equivalent to setting the applysigns argument to TRUE in rwa().

Usage

plot_rwa(rwa)

Arguments

rwa

Direct list output from rwa().

Examples

library(ggplot2)
# Use a smaller sample for faster execution
set.seed(123)  # make the random subset reproducible
diamonds_small <- diamonds[sample(nrow(diamonds), 1000), ]
# The native pipe (R >= 4.1) avoids needing magrittr attached
diamonds_small |>
  rwa(outcome = "price",
      predictors = c("depth", "carat", "x", "y", "z"),
      applysigns = TRUE) |>
  plot_rwa()

Remove any columns where all the values are missing

Description

Takes a data frame and returns a version with every column that consists entirely of missing values removed.

Usage

remove_all_na_cols(df)

Arguments

df

Data frame to be passed through.

Details

This is used within rwa().
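
The behaviour described above can be illustrated in base R. This is a minimal sketch, not the package's source code: the helper name drop_all_na_cols() and the example data frame are hypothetical, and it assumes that "missing" means NA as reported by is.na().

```r
# Hypothetical stand-in for the documented behaviour; not the package's own code.
drop_all_na_cols <- function(df) {
  # Keep a column if it has at least one non-missing value
  keep <- colSums(!is.na(df)) > 0
  df[, keep, drop = FALSE]
}

example_df <- data.frame(
  a = c(1, 2, NA),      # partly missing: kept
  b = c(NA, NA, NA),    # entirely missing: dropped
  c = c("x", NA, "z")   # partly missing: kept
)

cleaned <- drop_all_na_cols(example_df)
names(cleaned)
```

Columns that are only partly missing are left untouched; only column b is dropped here.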


Create a Relative Weights Analysis (RWA)

Description

This function creates a Relative Weights Analysis (RWA) and returns a list of outputs. RWA provides a heuristic method for estimating the relative weight of predictor variables in multiple regression. It involves fitting a multiple regression on a set of transformed predictors that are orthogonal to each other but maximally related to the original set of predictors. rwa() is optimised for dplyr pipes and shows positive / negative signs for weights.

Usage

rwa(
  df,
  outcome,
  predictors,
  applysigns = FALSE,
  method = "auto",
  sort = TRUE,
  bootstrap = FALSE,
  n_bootstrap = 1000,
  conf_level = 0.95,
  focal = NULL,
  comprehensive = FALSE,
  include_rescaled_ci = FALSE
)

Arguments

df

Data frame or tibble to be passed through.

outcome

Outcome variable, to be specified as a string or bare input. Must be a numeric variable.

predictors

Predictor variable(s), to be specified as a vector of string(s) or bare input(s). All variables must be numeric.

applysigns

Logical value specifying whether to show an estimate that applies the sign. Defaults to FALSE.

method

String to specify the method of regression to apply. Valid values include:

  • "auto": automatically detect whether to use multiple regression or logistic regression based on the outcome variable provided.

  • "multiple": use multiple regression.

  • "logistic": use logistic regression.

sort

Logical value specifying whether to sort results by rescaled relative weights in descending order. Defaults to TRUE.

bootstrap

Logical value specifying whether to calculate bootstrap confidence intervals. Defaults to FALSE. Currently only supported for multiple regression.

n_bootstrap

Number of bootstrap samples to use when bootstrap = TRUE. Defaults to 1000.

conf_level

Confidence level for bootstrap intervals. Defaults to 0.95.

focal

Focal variable for bootstrap comparisons (optional).

comprehensive

Logical value specifying whether to run a comprehensive bootstrap analysis, including random variable and focal comparisons. Defaults to FALSE.

include_rescaled_ci

Logical value specifying whether to include confidence intervals for rescaled weights. Defaults to FALSE due to compositional data constraints. Use with caution.
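
To make the bootstrap arguments more concrete, the sketch below resamples rows with replacement, recomputes the raw weights each time, and takes percentile intervals. It is an illustration under stated assumptions rather than the package's implementation: raw_weights() is a hypothetical helper following the published Johnson (2000) steps, percentile intervals are assumed (the package may use a different interval type), and the mtcars columns are chosen only for convenience.

```r
# Hypothetical helper: raw relative weights, with the outcome in column 1.
raw_weights <- function(dat) {
  r_all <- stats::cor(dat)
  p <- ncol(dat) - 1
  rxx <- r_all[-1, -1, drop = FALSE]   # predictor intercorrelations
  rxy <- r_all[-1, 1]                  # predictor-outcome correlations
  eig <- eigen(rxx, symmetric = TRUE)
  # Symmetric square root of the predictor correlation matrix
  lambda <- eig$vectors %*% diag(sqrt(eig$values), nrow = p) %*% t(eig$vectors)
  beta <- solve(lambda, rxy)
  as.vector((lambda^2) %*% (beta^2))
}

set.seed(2024)
dat <- mtcars[, c("mpg", "wt", "hp")]
n_boot <- 200        # plays the role of n_bootstrap
conf_level <- 0.95

# One column of raw weights per bootstrap resample
boot_weights <- replicate(
  n_boot,
  raw_weights(dat[sample(nrow(dat), replace = TRUE), ])
)

# Percentile confidence intervals for the raw weights
alpha <- 1 - conf_level
ci <- t(apply(boot_weights, 1, stats::quantile,
              probs = c(alpha / 2, 1 - alpha / 2)))
dimnames(ci) <- list(names(dat)[-1], c("lower", "upper"))
ci

# Rescaled weights sum to 100 within every resample
rescaled <- 100 * sweep(boot_weights, 2, colSums(boot_weights), "/")
```

The last step shows the compositional constraint: because the rescaled weights always sum to 100, an increase in one necessarily lowers the others, so their intervals are not independent.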

Details

rwa() produces raw relative weight values (epsilons) as well as rescaled weights (scaled as a percentage of predictable variance) for every predictor in the model. Signs are added to the weights when the applysigns argument is set to TRUE. See https://www.scotttonidandel.com/rwa-web for the original implementation that inspired this package.

This function is a wrapper around rwa_multiregress() and rwa_logit(), automatically selecting the appropriate method based on the outcome variable or the method argument.
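
As a rough illustration of this kind of dispatch, the sketch below picks a method from the outcome vector. It is hypothetical: choose_method_sketch() is not a package function, and it assumes that an outcome with exactly two distinct non-missing values counts as binary. The rule rwa() actually applies may differ.

```r
# Hypothetical dispatch rule; the package's detection logic may differ.
choose_method_sketch <- function(y, method = "auto") {
  method <- match.arg(method, c("auto", "multiple", "logistic"))
  # An explicit choice always overrides auto-detection
  if (method != "auto") {
    return(method)
  }
  n_distinct <- length(unique(y[!is.na(y)]))
  if (n_distinct == 2) "logistic" else "multiple"
}

choose_method_sketch(mtcars$mpg)             # continuous outcome
choose_method_sketch(mtcars$am)              # 0/1 outcome
choose_method_sketch(mtcars$am, "multiple")  # explicit override
```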

Value

rwa() returns a list of outputs, as follows:

  • predictors: character vector of names of the predictor variables used.

  • rsquare: the R-squared value of the regression model (multiple regression only).

  • result: the final output of the importance metrics (sorted by Rescaled.RelWeight in descending order by default).

    • The Rescaled.RelWeight column sums up to 100.

    • The Sign column indicates whether a predictor is positively or negatively correlated with the outcome.

    • When bootstrap = TRUE, includes confidence interval columns for raw weights.

    • Rescaled weight CIs are available via include_rescaled_ci = TRUE but not recommended for inference.

  • n: indicates the number of observations used in the analysis.

  • bootstrap: bootstrap results (only present when bootstrap = TRUE), containing:

    • ci_results: confidence intervals for weights

    • boot_object: raw bootstrap object for advanced analysis

    • n_bootstrap: number of bootstrap samples used

  • lambda: lambda matrix from the RWA calculation.

  • RXX: Correlation matrix of all the predictor variables against each other. Not available for logistic regression.

  • RXY: Correlation values of the predictor variables against the outcome variable. Not available for logistic regression.

See Also

plot_rwa() for plotting results, rwa_multiregress() and rwa_logit() for the underlying implementations.

Examples

library(ggplot2)
# Basic RWA (results sorted by default)
rwa(diamonds, "price", c("depth", "carat"))

# RWA without sorting (preserves original predictor order)
rwa(diamonds, "price", c("depth", "carat"), sort = FALSE)

# Plot results using plot_rwa()
diamonds |>
  rwa("price", c("depth", "carat", "x", "y")) |>
  plot_rwa()


# For faster examples, use a subset of data for bootstrap
diamonds_small <- diamonds[sample(nrow(diamonds), 1000), ]

# RWA with bootstrap confidence intervals (raw weights only)
rwa(diamonds_small, "price", c("depth", "carat"),
    bootstrap = TRUE, n_bootstrap = 100)

# Include rescaled weight CIs (use with caution for inference)
rwa(diamonds_small, "price", c("depth", "carat"),
    bootstrap = TRUE, include_rescaled_ci = TRUE, n_bootstrap = 100)

# Comprehensive bootstrap analysis with focal variable
result <- rwa(diamonds_small, "price", c("depth", "carat", "table"),
              bootstrap = TRUE, comprehensive = TRUE, focal = "carat",
              n_bootstrap = 100)
# View confidence intervals
result$bootstrap$ci_results


# Based on logistic regression (auto-detected from binary outcome)
diamonds$IsIdeal <- as.numeric(diamonds$cut == "Ideal")
rwa(diamonds, "IsIdeal", c("depth", "carat"))

Create a Relative Weights Analysis with logistic regression

Description

This function performs Relative Weights Analysis (RWA) for binary outcome variables using logistic regression. RWA provides a method for estimating the relative importance of predictor variables by transforming them into orthogonal variables while preserving their relationship to the outcome. This implementation follows Johnson (2000) for logistic regression.

Usage

rwa_logit(df, outcome, predictors, applysigns = FALSE)

Arguments

df

Data frame or tibble to be passed through.

outcome

Outcome variable, to be specified as a string or bare input. Must be a numeric variable.

predictors

Predictor variable(s), to be specified as a vector of string(s) or bare input(s). All variables must be numeric.

applysigns

Logical value specifying whether to show an estimate that applies the sign. Defaults to FALSE.

Value

rwa_logit() returns a list of outputs, as follows:

  • predictors: character vector of names of the predictor variables used.

  • rsquare: the pseudo R-squared value (sum of epsilon weights) for the logistic regression model.

  • result: the final output of the importance metrics.

    • The Rescaled.RelWeight column sums up to 100.

    • The Sign column indicates whether a predictor is positively or negatively associated with the outcome.

  • n: indicates the number of observations used in the analysis.

  • lambda: the Lambda transformation matrix from the analysis.

Examples

# Create a binary outcome variable
mtcars_binary <- mtcars
mtcars_binary$high_mpg <- ifelse(mtcars$mpg > median(mtcars$mpg), 1, 0)

# Basic logistic RWA
result <- rwa_logit(
  df = mtcars_binary,
  outcome = "high_mpg",
  predictors = c("cyl", "disp", "hp", "wt")
)

# View the relative importance results
result$result

# With sign information
result_signed <- rwa_logit(
  df = mtcars_binary,
  outcome = "high_mpg",
  predictors = c("cyl", "disp", "hp", "wt"),
  applysigns = TRUE
)
result_signed$result

Create a Relative Weights Analysis (RWA)

Description

This function creates a Relative Weights Analysis (RWA) and returns a list of outputs. RWA provides a heuristic method for estimating the relative weight of predictor variables in multiple regression. It involves fitting a multiple regression on a set of transformed predictors that are orthogonal to each other but maximally related to the original set of predictors. rwa_multiregress() is optimised for dplyr pipes and shows positive / negative signs for weights.

Usage

rwa_multiregress(df, outcome, predictors, applysigns = FALSE)

Arguments

df

Data frame or tibble to be passed through.

outcome

Outcome variable, to be specified as a string or bare input. Must be a numeric variable.

predictors

Predictor variable(s), to be specified as a vector of string(s) or bare input(s). All variables must be numeric.

applysigns

Logical value specifying whether to show an estimate that applies the sign. Defaults to FALSE.

Details

rwa_multiregress() produces raw relative weight values (epsilons) as well as rescaled weights (scaled as a percentage of predictable variance) for every predictor in the model. Signs are added to the weights when the applysigns argument is set to TRUE. See https://relativeimportance.davidson.edu/multipleregression.html for the original implementation that inspired this package.
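
The calculation of raw and rescaled weights can be sketched in base R from the correlation matrix alone. This follows the published Johnson (2000) steps and is not a copy of the package's internals: relative_weights_sketch() is a hypothetical name, the Raw.RelWeight column name is an assumption, and rows with missing values are simply dropped.

```r
# Hypothetical helper illustrating the Johnson (2000) steps.
relative_weights_sketch <- function(df, outcome, predictors) {
  dat <- stats::na.omit(df[, c(outcome, predictors)])
  r_all <- stats::cor(dat)
  rxx <- r_all[predictors, predictors, drop = FALSE]  # predictors vs predictors
  rxy <- r_all[predictors, outcome]                   # predictors vs outcome

  # Lambda: symmetric square root of RXX, linking the original predictors
  # to their orthogonal counterparts
  eig <- eigen(rxx, symmetric = TRUE)
  lambda <- eig$vectors %*%
    diag(sqrt(eig$values), nrow = length(predictors)) %*%
    t(eig$vectors)

  # Standardised coefficients of the outcome on the orthogonal variables
  beta <- solve(lambda, rxy)

  # Raw weights (epsilons) and their rescaled percentages
  raw <- as.vector((lambda^2) %*% (beta^2))
  data.frame(
    Variables = predictors,
    Raw.RelWeight = raw,
    Rescaled.RelWeight = 100 * raw / sum(raw)
  )
}

fit <- relative_weights_sketch(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
fit

# The raw weights sum to the model R-squared
sum(fit$Raw.RelWeight)
summary(lm(mpg ~ cyl + disp + hp + wt, data = mtcars))$r.squared
```

Because the raw weights add up to the R-squared of the ordinary regression, dividing each by their sum gives the share of predictable variance attributed to each predictor.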

Value

rwa_multiregress() returns a list of outputs, as follows:

  • predictors: character vector of names of the predictor variables used.

  • rsquare: the R-squared value of the regression model.

  • result: the final output of the importance metrics.

    • The Rescaled.RelWeight column sums up to 100.

    • The Sign column indicates whether a predictor is positively or negatively correlated with the outcome.

  • n: indicates the number of observations used in the analysis.

  • lambda: the transformation matrix that maps the original correlated predictors to orthogonal variables while preserving their relationship to the outcome. Used internally to compute relative weights.

  • RXX: Correlation matrix of all the predictor variables against each other.

  • RXY: Correlation values of the predictor variables against the outcome variable.

Examples

# Basic multiple regression RWA
result <- rwa_multiregress(
  df = mtcars,
  outcome = "mpg",
  predictors = c("cyl", "disp", "hp", "wt")
)

# View the relative importance results
result$result

# With sign information
result_signed <- rwa_multiregress(
  df = mtcars,
  outcome = "mpg",
  predictors = c("cyl", "disp", "hp", "wt"),
  applysigns = TRUE
)
result_signed$result