stepsel.modeling.prep.helper
Helper functions for modeling preparation: small utilities that are shared by several other functions.
Functions
get_interaction_type | Get the interaction type of an interaction.
relevel_categorical_variable | Relevel a categorical variable.
parse_model_formula | Parse a model formula into its components.
recognize_variable_types | Recognize the types of the variables.
Module Contents
- stepsel.modeling.prep.helper.get_interaction_type(interaction: str, interaction_numerical_variables: list, interaction_categorical_variables: list)
Get the interaction type of an interaction.
- Parameters:
interaction (str) – The interaction to get the type of.
interaction_numerical_variables (list) – The numerical variables that are used in the interactions.
interaction_categorical_variables (list) – The categorical variables that are used in the interactions.
- Returns:
interaction_type – The interaction type. One of “numerical_numerical”, “categorical_categorical”, “numerical_categorical”, “categorical_numerical”.
- Return type:
str
- Raises:
ValueError – If the interaction does not contain exactly one ‘*’ character, or if either interaction variable does not appear in exactly one of the two lists.
Examples
>>> get_interaction_type("a * b", ["a"], ["b"])
"numerical_categorical"
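The documented behavior can be approximated with a short standalone sketch. This is not the stepsel implementation; `interaction_type_sketch` is a hypothetical name, and the whitespace stripping around each variable name is an assumption.

```python
def interaction_type_sketch(interaction, numerical, categorical):
    """Hypothetical re-implementation of the documented contract."""
    if interaction.count("*") != 1:
        raise ValueError("interaction must contain exactly one '*' character")
    kinds = []
    for name in (part.strip() for part in interaction.split("*")):
        in_numerical = name in numerical
        in_categorical = name in categorical
        # Each variable must be listed in exactly one of the two lists.
        if in_numerical == in_categorical:
            raise ValueError(f"{name!r} must be in exactly one of the two lists")
        kinds.append("numerical" if in_numerical else "categorical")
    # The order of the two parts follows the order in the interaction string.
    return "_".join(kinds)


result = interaction_type_sketch("a * b", ["a"], ["b"])
reversed_result = interaction_type_sketch("b * a", ["a"], ["b"])
```

Under these assumptions, the order of the variables in the interaction string decides between "numerical_categorical" and "categorical_numerical".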
- stepsel.modeling.prep.helper.relevel_categorical_variable(series: pandas.Series, new_order: list)
Relevel a categorical variable.
- Parameters:
series (pd.Series) – The categorical variable to relevel.
new_order (list) – The new order of the categories.
- Returns:
series – The relevelled categorical variable.
- Return type:
pd.Series
- Raises:
ValueError – If the new order is not a subset of the current categories. If the new order contains duplicates.
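The page gives no example for this function, so the following is only a sketch of the documented contract using plain pandas. `relevel_sketch` is a hypothetical name. Two things are assumptions and not taken from the stepsel source: that the input already has a categorical dtype, and that categories not named in `new_order` keep their relative order after the listed ones.

```python
import pandas as pd


def relevel_sketch(series: pd.Series, new_order: list) -> pd.Series:
    """Hypothetical re-implementation; assumes a categorical dtype."""
    current = list(series.cat.categories)
    if len(set(new_order)) != len(new_order):
        raise ValueError("new_order contains duplicates")
    if not set(new_order).issubset(current):
        raise ValueError("new_order must be a subset of the current categories")
    # Assumption: unlisted categories follow the listed ones, order preserved.
    remaining = [c for c in current if c not in new_order]
    return series.cat.reorder_categories(new_order + remaining)


s = pd.Series(["low", "high", "mid", "low"], dtype="category")
releveled = relevel_sketch(s, ["mid"])
new_categories = list(releveled.cat.categories)
```

Only the category order changes here; the values stored in the series stay the same.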
- stepsel.modeling.prep.helper.parse_model_formula(formula: str)
Parse a model formula into its components.
- Parameters:
formula (str) – The model formula to parse.
- Returns:
left_side_variables (list) – The variables on the left side of the formula.
interaction_variables (list) – The interaction variables on the right side of the formula.
non_interaction_variables (list) – The non-interaction variables on the right side of the formula.
- Raises:
ValueError – If the formula does not contain exactly one ‘~’ character.
Examples
>>> parse_model_formula("y ~ a + b + a * b")
(["y"], ["a * b"], ["a", "b"])
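A minimal sketch of how such a parser could work is shown here. It is an illustration of the documented inputs and outputs, not the library's code. `parse_formula_sketch` is a hypothetical name, and splitting terms on ‘+’ and normalizing interactions to the form "a * b" are assumptions.

```python
def parse_formula_sketch(formula: str):
    """Hypothetical re-implementation of the documented contract."""
    if formula.count("~") != 1:
        raise ValueError("formula must contain exactly one '~' character")
    left, right = formula.split("~")
    left_side_variables = [t.strip() for t in left.split("+") if t.strip()]
    interaction_variables = []
    non_interaction_variables = []
    for term in right.split("+"):
        term = term.strip()
        if not term:
            continue
        if "*" in term:
            # Assumption: normalize spacing so "a*b" becomes "a * b".
            parts = [p.strip() for p in term.split("*")]
            interaction_variables.append(" * ".join(parts))
        else:
            non_interaction_variables.append(term)
    return left_side_variables, interaction_variables, non_interaction_variables


parsed = parse_formula_sketch("y ~ a + b + a * b")
```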
- stepsel.modeling.prep.helper.recognize_variable_types(data: pandas.DataFrame, interaction_variables: list, non_interaction_variables: list)
Recognize the types of the variables.
- Parameters:
data (pd.DataFrame) – The data to recognize the variable types from.
interaction_variables (list) – The interaction variables whose types are to be recognized.
non_interaction_variables (list) – The non-interaction variables whose types are to be recognized.
- Returns:
dictionary – A dictionary containing the variable types.
- interaction_numerical_variables (list)
The numerical variables in the interaction variables.
- interaction_categorical_variables (list)
The categorical variables in the interaction variables.
- non_interaction_numerical_variables (list)
The numerical variables in the non-interaction variables.
- non_interaction_categorical_variables (list)
The categorical variables in the non-interaction variables.
- interaction_variables (list)
The interaction variables.
- Return type:
dict
- Raises:
ValueError – If an interaction variable is neither numerical nor categorical, or if a non-interaction variable is neither numerical nor categorical.
Examples
>>> recognize_variable_types(data, ["a * b"], ["a", "b", "c"])
{"non_interaction_numerical_variables": ["a"], "non_interaction_categorical_variables": ["b", "c"], "interaction_numerical_variables": ["a"], "interaction_categorical_variables": ["b"], "interaction_variables": ["a * b"]}
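The sketch below shows one way the returned dictionary could be assembled. It is not the stepsel implementation. `recognize_types_sketch` and `_kind` are hypothetical names, and the dtype rules (numeric dtypes count as numerical; categorical, object and string dtypes count as categorical) are an assumption about how types might be recognized.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype, is_object_dtype, is_string_dtype


def _kind(column: pd.Series) -> str:
    """Classify a column; the dtype rules here are assumptions."""
    if isinstance(column.dtype, pd.CategoricalDtype):
        return "categorical"
    if is_numeric_dtype(column):
        return "numerical"
    if is_object_dtype(column) or is_string_dtype(column):
        return "categorical"
    raise ValueError(f"{column.name!r} is neither numerical nor categorical")


def recognize_types_sketch(data, interaction_variables, non_interaction_variables):
    result = {
        "interaction_numerical_variables": [],
        "interaction_categorical_variables": [],
        "non_interaction_numerical_variables": [],
        "non_interaction_categorical_variables": [],
        "interaction_variables": list(interaction_variables),
    }
    # Collect each variable used in any interaction once, in first-seen order.
    interaction_names = []
    for term in interaction_variables:
        for name in (part.strip() for part in term.split("*")):
            if name not in interaction_names:
                interaction_names.append(name)
    for name in interaction_names:
        result[f"interaction_{_kind(data[name])}_variables"].append(name)
    for name in non_interaction_variables:
        result[f"non_interaction_{_kind(data[name])}_variables"].append(name)
    return result


# Illustrative data: "a" is numeric, "b" and "c" hold strings.
data = pd.DataFrame({"a": [1.0, 2.5, 3.0], "b": ["x", "y", "x"], "c": ["u", "v", "u"]})
types = recognize_types_sketch(data, ["a * b"], ["a", "b", "c"])
```

With this illustrative data, "c" appears only among the non-interaction variables because it is not part of any interaction term.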