stepsel.modeling.predict
========================

.. py:module:: stepsel.modeling.predict

.. autoapi-nested-parse::

   The :mod:`stepsel.modeling.predict` module includes functions for making predictions.


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/stepsel/modeling/predict/scoring_table/index


Classes
-------

.. autoapisummary::

   stepsel.modeling.predict.ScoringTableGLM


Package Contents
----------------

.. py:class:: ScoringTableGLM(scoring_table: pandas.DataFrame)

   Class for a scoring table and related functions.

   .. method:: from_glm_model(model: GLMResultsWrapper, adjusted_coeffs: dict = None)

      Create a scoring table from the model fit. Suitable for statsmodels GLM.

   .. method:: from_csv(path: str, \*args, \*\*kwargs)

      Create a scoring table from a csv file.

   .. method:: sql(format: bool = False)

      Return the SQL statement for the scoring table.

   .. method:: predict_linear(X: pd.DataFrame)

      Predict the linear part of the GLM model based on the scoring table and model matrix.

   .. method:: to_sql(\*args, \*\*kwargs)

      Upload the scoring table to the database.

   .. method:: to_csv(path: str, \*args, \*\*kwargs)

      Save the scoring table to a csv file.

   .. method:: _reconstruct_model_matrix_columns(scoring_table: pd.DataFrame)

      Reconstruct model matrix columns and model variables based on the scoring table.


   .. py:attribute:: required_columns
      :value: ['var1', 'var2', 'level_var1', 'level_var2', 'estimate']


   .. py:attribute:: scoring_table


   .. py:method:: from_glm_model(model: statsmodels.genmod.generalized_linear_model.GLMResultsWrapper, adjusted_coeffs: dict = None) -> ScoringTableGLM
      :classmethod:

      Create a scoring table from the model fit. Suitable for statsmodels GLM.

      :param model: Model fit of statsmodels GLM.
      :type model: GLMResultsWrapper
      :param adjusted_coeffs: Dictionary of adjusted coefficients. The default is None.
                              The format of the dictionary is
                              ``{variable_name: adjusted_coefficient}``, where
                              ``variable_name`` is the name of the variable in the model.
                              Example: ``{"ts_new9_g: 06": 0.20, "drpou_cpp_dop3: H": -1.74}``
      :type adjusted_coeffs: dict, optional

      :returns: Scoring table as a ScoringTableGLM object.
      :rtype: ScoringTableGLM

      .. rubric:: Notes

      Once uploaded to the database, the output of this function can be used to
      create a scoring SQL query.


   .. py:method:: from_csv(path: str, *args, **kwargs) -> ScoringTableGLM
      :classmethod:

      Create a scoring table from a csv file.

      :param path: Path to the csv file.
      :type path: str
      :param \*args: Additional positional arguments passed to pandas.read_csv().
      :param \*\*kwargs: Additional keyword arguments passed to pandas.read_csv().

      :returns: Scoring table as a ScoringTableGLM object.
      :rtype: ScoringTableGLM

      :raises ValueError: If the csv file does not contain the required columns.


   .. py:method:: __repr__() -> str

      Return the string representation of the ScoringTableGLM object.

      :returns: String representation of the ScoringTableGLM object.
      :rtype: str


   .. py:method:: _reconstruct_model_matrix_columns(scoring_table) -> tuple
      :staticmethod:

      Reconstruct model matrix columns, model features and model variables based on the scoring table.

      :param scoring_table: Required columns: var1, var2, level_var1, level_var2
      :type scoring_table: pd.DataFrame

      :returns: Tuple of two lists and a dict:
                the list of model matrix column names,
                the list of model features,
                and the dict of model variables with keys "numerical" and "categorical".
      :rtype: tuple

      :raises ValueError: If the type of a row in the scoring table is not recognized.


   .. py:method:: sql(intercept_name: str = 'Intercept', format: bool = False)

      Return the SQL statement for the scoring table.

      :param intercept_name: Name of the intercept variable, by default 'Intercept'
      :type intercept_name: str, optional
      :param format: If True, the SQL statement is formatted for better readability, by default False
      :type format: bool, optional

      :returns: SQL statement for the scoring table.
      :rtype: str

      .. rubric:: Uses

      * **self.scoring_table** (*pandas.DataFrame*) -- Scoring table with columns: var1, var2, level_var1, level_var2, estimate.

      .. rubric:: Notes

      If ``format=True``, a line break is added after each row, which is useful
      for manual inspection of the SQL statement. Pass the result to ``print()``
      to see the statement on the console with the line breaks rendered.


   .. py:method:: predict_linear(X: pandas.DataFrame) -> pandas.Series

      Predict the linear part of the GLM model based on the scoring table and model matrix.

      :param X: Model matrix as created by the prepare_model_matrix() function.
      :type X: pd.DataFrame

      :returns: Linear part of the GLM model.
      :rtype: pd.Series

      .. rubric:: Uses

      * **scoring_table** (*pd.DataFrame*) -- Required columns: var1, var2, level_var1, level_var2
      * **model_matrix_columns** (*list[str]*) -- List of model matrix column names as created by the _reconstruct_model_matrix_columns() function.


   .. py:method:: to_sql(*args, **kwargs) -> None

      Upload the scoring table to the database.

      :param \*\*kwargs: Arguments passed to the pandas.DataFrame.to_sql() function.

      .. rubric:: Uses

      * **self.scoring_table** (*pd.DataFrame*) -- Required columns: var1, var2, level_var1, level_var2, estimate


   .. py:method:: to_csv(*args, **kwargs) -> None

      Save the scoring table as a csv file.

      :param \*\*kwargs: Arguments passed to the pandas.DataFrame.to_csv() function.

      .. rubric:: Uses

      * **self.scoring_table** (*pd.DataFrame*) -- Required columns: var1, var2, level_var1, level_var2, estimate
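
The column check described for ``from_csv`` and the idea behind ``predict_linear`` can be illustrated with a minimal, pandas-only sketch. It is not the stepsel implementation. The helper names ``load_scoring_table`` and ``linear_part``, the sample variables ``region`` and ``fuel``, and the row layout (intercept stored as ``var1 == "Intercept"``, categorical main effects stored as a ``var1`` / ``level_var1`` pair) are assumptions made for illustration only. Only the required column names come from the ``required_columns`` attribute documented above.

.. code-block:: python

   import io

   import pandas as pd

   # Column names taken from the documented ``required_columns`` attribute.
   REQUIRED_COLUMNS = ["var1", "var2", "level_var1", "level_var2", "estimate"]


   def load_scoring_table(source, **kwargs) -> pd.DataFrame:
       """Hypothetical stand-in for ``from_csv``: read a CSV, then validate columns."""
       table = pd.read_csv(source, **kwargs)
       missing = [col for col in REQUIRED_COLUMNS if col not in table.columns]
       if missing:
           raise ValueError(f"Missing required columns: {missing}")
       return table


   def linear_part(table: pd.DataFrame, rows: pd.DataFrame,
                   intercept_name: str = "Intercept") -> pd.Series:
       """Sum the intercept and one main-effect estimate per categorical variable.

       Assumed layout (not confirmed by the source): the intercept row has
       ``var1 == intercept_name``; a main effect keeps the variable name in
       ``var1`` and its level in ``level_var1``; unlisted levels contribute 0.
       Numerical variables and interactions are ignored in this sketch.
       """
       is_intercept = table["var1"] == intercept_name
       intercept = table.loc[is_intercept, "estimate"].sum()
       result = pd.Series(intercept, index=rows.index, dtype=float)
       for variable, group in table[~is_intercept].groupby("var1"):
           lookup = group.set_index("level_var1")["estimate"]
           result = result + rows[variable].map(lookup).fillna(0.0)
       return result


   # Made-up scoring table; real tables come from a fitted model or a file.
   csv_text = (
       "var1,var2,level_var1,level_var2,estimate\n"
       "Intercept,,,,1.0\n"
       "region,,B,,0.5\n"
       "region,,C,,-0.25\n"
       "fuel,,diesel,,0.1\n"
   )
   table = load_scoring_table(
       io.StringIO(csv_text),
       dtype={"level_var1": str, "level_var2": str},
   )
   rows = pd.DataFrame(
       {"region": ["A", "B", "C"], "fuel": ["petrol", "diesel", "diesel"]}
   )
   scores = linear_part(table, rows)
   print(scores)

Reading the level columns as strings keeps level codes such as ``"06"`` from being parsed as integers, which would otherwise break the level lookup.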