stepsel.modeling.prep.interaction
Preprocessing functions for interaction terms.
Functions
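interaction_categorical_numerical(series1, series2) | Create an interaction term between a categorical and a numerical variable.
interaction_categorical_categorical(series1, series2) | Create an interaction term between two categorical variables.
interaction_numerical_numerical(series1, series2) | Create an interaction term between two numerical variables.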
Module Contents
- stepsel.modeling.prep.interaction.interaction_categorical_numerical(series1: pandas.Series, series2: pandas.Series)
Create an interaction term between a categorical and a numerical variable.
- Parameters:
series1 (pandas.Series) – The first series.
series2 (pandas.Series) – The second series.
- Returns:
interaction_df – A DataFrame with the interaction terms.
- Return type:
pandas.DataFrame
- Raises:
ValueError – Unless one (and only one) of the series is categorical and one (and only one) of the series is numerical.
Notes
The function creates one dummy variable per category of the categorical variable and multiplies each dummy by the numerical variable. The resulting columns are named “categorical_variable: category * numerical_variable”.
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from stepsel.modeling.prep import interaction_categorical_numerical
>>> categorical_series = pd.Series(np.random.choice(["A", "B", "C"], size=10), name="categorical").astype("category")
>>> numerical_series = pd.Series(np.random.normal(size=10), name="numerical")
>>> interaction_categorical_numerical(categorical_series, numerical_series)
   categorical: A * numerical  categorical: B * numerical  categorical: C * numerical
0                   -0.626453                    0.417258                    0.619825
1                    0.183643                   -0.720788                   -0.720788
2                    0.835979                   -0.632650                   -0.632650
...
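The behaviour described in the Notes above can be approximated in plain pandas. The following is a minimal sketch, not the library's implementation: the helper name `sketch_cat_num_interaction` is hypothetical, and it assumes the dummies are built with `pd.get_dummies` and that the column names follow the pattern given in the Notes.

```python
import pandas as pd


def sketch_cat_num_interaction(cat: pd.Series, num: pd.Series) -> pd.DataFrame:
    """Hypothetical stand-in for the behaviour described in the Notes."""
    # One 0/1 indicator column per category (assumed approach).
    dummies = pd.get_dummies(cat, dtype=float)
    # Multiply every indicator column row-wise by the numerical variable.
    interaction = dummies.mul(num, axis=0)
    # Apply the naming pattern "categorical: category * numerical".
    interaction.columns = [
        f"{cat.name}: {category} * {num.name}" for category in dummies.columns
    ]
    return interaction


cat = pd.Series(["A", "B", "A", "C"], name="categorical").astype("category")
num = pd.Series([1.5, 2.0, -0.5, 4.0], name="numerical")
result = sketch_cat_num_interaction(cat, num)
print(result)
```

In this sketch each row has at most one non-zero entry: the numerical value appears in the column of the row's own category and every other column is zero.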
- stepsel.modeling.prep.interaction.interaction_categorical_categorical(series1: pandas.Series, series2: pandas.Series)
Create an interaction term between two categorical variables.
- Parameters:
series1 (pandas.Series) – The first series.
series2 (pandas.Series) – The second series.
- Returns:
interaction – A Series with the interaction terms.
- Return type:
pandas.Series
- Raises:
ValueError – If series1 is not categorical. If series2 is not categorical.
Notes
The function creates an interaction term between the two categorical variables. The resulting Series is named “categorical_variable1 * categorical_variable2”, and each of its values has the form “category1 * category2”.
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from stepsel.modeling.prep import interaction_categorical_categorical
>>> categorical_series1 = pd.Series(["A", "B", "C"], name="categorical1").astype("category")
>>> categorical_series2 = pd.Series(["X", "Y", "Z"], name="categorical2").astype("category")
>>> interaction_categorical_categorical(categorical_series1, categorical_series2)
0    A * X
1    B * Y
2    C * Z
Name: categorical1 * categorical2, dtype: category
Categories (3, object): ['A * X', 'B * Y', 'C * Z']
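The naming scheme from the Notes above can be reproduced with string concatenation. This is a minimal sketch under assumptions, not the library's code: the helper name `sketch_cat_cat_interaction` is hypothetical, and it assumes the labels are joined with " * " and the result is cast back to a categorical dtype.

```python
import pandas as pd


def sketch_cat_cat_interaction(s1: pd.Series, s2: pd.Series) -> pd.Series:
    """Hypothetical stand-in for the behaviour described in the Notes."""
    # Join the category labels row by row as "category1 * category2".
    combined = s1.astype(str) + " * " + s2.astype(str)
    # Cast back to categorical and name the Series "variable1 * variable2".
    return combined.astype("category").rename(f"{s1.name} * {s2.name}")


first = pd.Series(["A", "B", "A"], name="categorical1").astype("category")
second = pd.Series(["X", "Y", "Y"], name="categorical2").astype("category")
combined = sketch_cat_cat_interaction(first, second)
print(combined)
```

Only the combinations that actually occur in the data become categories in this sketch. Whether the library also registers unobserved combinations is not stated in the documentation above.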
- stepsel.modeling.prep.interaction.interaction_numerical_numerical(series1: pandas.Series, series2: pandas.Series)
Create an interaction term between two numerical variables.
- Parameters:
series1 (pandas.Series) – The first series.
series2 (pandas.Series) – The second series.
- Returns:
interaction – A Series with the interaction terms.
- Return type:
pandas.Series
- Raises:
ValueError – If series1 is not numerical. If series2 is not numerical.
Notes
The function will create an interaction term between the two numerical variables. The interaction term will be named “numerical_variable1 * numerical_variable2”.
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from stepsel.modeling.prep import interaction_numerical_numerical
>>> numerical_series1 = pd.Series(np.random.normal(size=10), name="numerical1")
>>> numerical_series2 = pd.Series(np.random.normal(size=10), name="numerical2")
>>> interaction_numerical_numerical(numerical_series1, numerical_series2)
0   -0.626453
1   -0.720788
2   -0.632650
...
Name: numerical1 * numerical2, dtype: float64
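For two numerical variables, the interaction described above reduces to an element-wise product with a combined name. The sketch below is illustrative only: the helper name `sketch_num_num_interaction` is hypothetical, and using `pandas.api.types.is_numeric_dtype` for the type check is an assumption about how the documented `ValueError` might be triggered.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def sketch_num_num_interaction(s1: pd.Series, s2: pd.Series) -> pd.Series:
    """Hypothetical stand-in for the behaviour described in the Notes."""
    # Assumed validation: reject any input that is not numeric.
    for label, series in (("series1", s1), ("series2", s2)):
        if not is_numeric_dtype(series):
            raise ValueError(f"{label} is not numerical")
    # Element-wise product, named "variable1 * variable2".
    return (s1 * s2).rename(f"{s1.name} * {s2.name}")


x = pd.Series([1.0, 2.0, 3.0], name="numerical1")
y = pd.Series([2.0, -1.5, 0.0], name="numerical2")
product = sketch_num_num_interaction(x, y)
print(product)
```

Passing a non-numeric Series, such as one holding strings, raises `ValueError` in this sketch, mirroring the Raises entry above.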