stepsel.modeling.prep.helper
============================

.. py:module:: stepsel.modeling.prep.helper

.. autoapi-nested-parse::

   Helper functions for modeling prep.
   Small utilities that are shared by several other functions.


Functions
---------

.. autoapisummary::

   stepsel.modeling.prep.helper.get_interaction_type
   stepsel.modeling.prep.helper.relevel_categorical_variable
   stepsel.modeling.prep.helper.parse_model_formula
   stepsel.modeling.prep.helper.recognize_variable_types


Module Contents
---------------

.. py:function:: get_interaction_type(interaction: str, interaction_numerical_variables: list, interaction_categorical_variables: list)

   Get the type of an interaction.

   :param interaction: The interaction to get the type of.
   :type interaction: str
   :param interaction_numerical_variables: The numerical variables that are used in the interactions.
   :type interaction_numerical_variables: list
   :param interaction_categorical_variables: The categorical variables that are used in the interactions.
   :type interaction_categorical_variables: list

   :returns: **interaction_type** -- The interaction type. One of "numerical_numerical", "categorical_categorical", "numerical_categorical", "categorical_numerical".
   :rtype: str

   :raises ValueError: If the interaction does not contain exactly one '*' character.
       If an interaction variable is not in exactly one of the two lists.

   .. rubric:: Examples

   >>> get_interaction_type("a * b", ["a"], ["b"])
   "numerical_categorical"


.. py:function:: relevel_categorical_variable(series: pandas.Series, new_order: list)

   Relevel a categorical variable.

   :param series: The categorical variable to relevel.
   :type series: pd.Series
   :param new_order: The new order of the categories.
   :type new_order: list

   :returns: **series** -- The releveled categorical variable.
   :rtype: pd.Series

   :raises ValueError: If the new order is not a subset of the current categories.
       If the new order contains duplicates.


.. py:function:: parse_model_formula(formula: str)

   Parse a model formula into its components.

   :param formula: The model formula to parse.
   :type formula: str

   :returns: * **left_side_variables** (*list*) -- The variables on the left side of the formula.
             * **interaction_variables** (*list*) -- The interaction variables on the right side of the formula.
             * **non_interaction_variables** (*list*) -- The non-interaction variables on the right side of the formula.

   :raises ValueError: If the formula does not contain exactly one '~' character.

   .. rubric:: Examples

   >>> parse_model_formula("y ~ a + b + a * b")
   (["y"], ["a * b"], ["a", "b"])


.. py:function:: recognize_variable_types(data: pandas.DataFrame, interaction_variables: list, non_interaction_variables: list)

   Recognize the types of the variables.

   :param data: The data to recognize the variable types from.
   :type data: pd.DataFrame
   :param interaction_variables: The interaction variables whose types are recognized.
   :type interaction_variables: list
   :param non_interaction_variables: The non-interaction variables whose types are recognized.
   :type non_interaction_variables: list

   :returns: **dictionary** -- A dictionary containing the variable types, with the following keys:

             * **interaction_numerical_variables** (*list*) -- The numerical variables in the interaction variables.
             * **interaction_categorical_variables** (*list*) -- The categorical variables in the interaction variables.
             * **non_interaction_numerical_variables** (*list*) -- The numerical variables in the non-interaction variables.
             * **non_interaction_categorical_variables** (*list*) -- The categorical variables in the non-interaction variables.
             * **interaction_variables** (*list*) -- The interaction variables.
   :rtype: dict

   :raises ValueError: If an interaction variable is neither numerical nor categorical.
       If a non-interaction variable is neither numerical nor categorical.

   .. rubric:: Examples

   >>> recognize_variable_types(data, ["a * b"], ["a", "b", "c"])
   {"non_interaction_numerical_variables": ["a"], "non_interaction_categorical_variables": ["b"], "interaction_numerical_variables": ["a"], "interaction_categorical_variables": ["b", "c"], "interaction_variables": ["a * b"]}
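
The documented behaviour of ``parse_model_formula`` and ``get_interaction_type`` can be approximated in plain Python. The sketch below is not the stepsel implementation: the ``*_sketch`` names are hypothetical, and the parsing rules (terms separated by ``+``, an interaction marked by a single ``*``, type lists holding individual variable names) are assumptions inferred from the examples on this page.

```python
from typing import List, Tuple


def parse_model_formula_sketch(formula: str) -> Tuple[List[str], List[str], List[str]]:
    """Hypothetical stand-in that splits a formula such as "y ~ a + b + a * b"."""
    if formula.count("~") != 1:
        raise ValueError("The formula must contain exactly one '~' character.")
    left, right = formula.split("~")
    # Assumption: terms on both sides are separated by '+'.
    left_side_variables = [t.strip() for t in left.split("+") if t.strip()]
    terms = [t.strip() for t in right.split("+") if t.strip()]
    # Assumption: any right-side term containing '*' is an interaction.
    interaction_variables = [t for t in terms if "*" in t]
    non_interaction_variables = [t for t in terms if "*" not in t]
    return left_side_variables, interaction_variables, non_interaction_variables


def get_interaction_type_sketch(
    interaction: str,
    interaction_numerical_variables: List[str],
    interaction_categorical_variables: List[str],
) -> str:
    """Hypothetical stand-in that classifies a two-variable interaction."""
    if interaction.count("*") != 1:
        raise ValueError("The interaction must contain exactly one '*' character.")
    kinds = []
    for name in (part.strip() for part in interaction.split("*")):
        is_numerical = name in interaction_numerical_variables
        is_categorical = name in interaction_categorical_variables
        # Each variable has to appear in exactly one of the two lists.
        if is_numerical == is_categorical:
            raise ValueError(f"{name!r} must be in exactly one of the two lists.")
        kinds.append("numerical" if is_numerical else "categorical")
    return "_".join(kinds)


left, interactions, mains = parse_model_formula_sketch("y ~ a + b + a * b")
print(left, interactions, mains)
print(get_interaction_type_sketch(interactions[0], ["a"], ["b"]))
```

Chaining the two helpers this way mirrors how the formula components feed the interaction classifier; the real functions may tokenize formulas differently.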