
[DOC] two notebooks on Bayesian probabilistic regression #520

Open · wants to merge 2 commits into base: main

Conversation

meraldoantonio (Contributor)

Reference Issues/PRs

Provides example for Bayesian Conjugate Linear Regressor #500

What does this implement/fix? Explain your changes.

This contribution includes the first two notebooks in a planned series of four. These two notebooks cover:

  1. The general theory behind Bayesian Linear Regression.
  2. The conjugate prior method for solving Bayesian Linear Regression and its implementation via the BayesianConjugateLinearRegressor estimator (see the usage sketch below).

In addition, a lightweight utils.py and a small synthetic dataset are included.
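For orientation, a rough usage sketch of the estimator covered in notebook 2 (illustrative only; the exact import path, constructor arguments, and data are placeholders and may differ from the notebooks):

    # illustrative usage sketch - import path and arguments are placeholders
    import numpy as np
    import pandas as pd
    from sklearn.model_selection import train_test_split

    from skpro.regression.bayesian import BayesianConjugateLinearRegressor  # path assumed

    # tiny synthetic dataset, standing in for the CSV shipped with notebook 2
    rng = np.random.default_rng(42)
    X = pd.DataFrame({"x": rng.uniform(-3, 3, size=100)})
    y = pd.DataFrame({"y": 2.0 * X["x"] + rng.normal(scale=0.5, size=100)})

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    reg = BayesianConjugateLinearRegressor()  # default prior; arguments assumed
    reg.fit(X_train, y_train)

    y_pred = reg.predict(X_test)              # point predictions
    y_pred_proba = reg.predict_proba(X_test)  # posterior predictive distribution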

Does your contribution introduce a new dependency? If yes, which one?

None

What should a reviewer concentrate their feedback on?

Correctness of exposition

Did you add any tests for the change?

No

Any other comments?

No

PR checklist

No

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the skpro root directory (not CONTRIBUTORS.md). Common badges: code - fixing a bug or adding code logic; doc - writing or improving documentation or docstrings; bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR); maintenance - CI, test framework, release.
    See here for full badge reference
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.


@meraldoantonio (Contributor Author):

Hi @fkiraly,

This PR includes the first two example notebooks for the Bayesian estimators. I suggest we merge these first, and I can submit the remaining notebooks, which are still in progress, in a separate PR.

I haven’t made any major changes to these notebooks since your review in the old PR (#500).

Let me know your thoughts or if I need to make any changes to this PR, thanks!

"cell_type": "markdown",
"metadata": {},
"source": [
"This series of notebooks offers an in-depth exploration of the **Bayesian Linear Regression**. \n",
Collaborator comment:
"the" is superfluous

"Linear regression is a widely used model due to its simplicity and interpretability. \n",
"\n",
"\n",
"In its simplest form, it predicts a single target $t$ as the deterministic output of the function $y$, which in turn is a linear combination of input variables $\\mathbf{x} = (x_1, \\dots, x_D)^\\top$ and parameters $\\mathbf{w} = (w_0, w_1, \\dots, w_D)^\\top$:\n",
Collaborator comment:
I think introducing "t" is more confusing than helpful - it is basically the same as y from a rough semantic perspective, but a scalar rather than a function. I would not introduce it; instead, I would call the function that you currently call "y" a function "f", and then introduce data "y" later, to which the model is fit.

Further, I would use the notation $f(\mathbf{x} \mid \mathbf{w})$; the bar is common notation to separate inputs from parameters.
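For illustration, the suggested notation might render as follows (a sketch of the suggestion, not text from the notebooks):

$$
f(\mathbf{x} \mid \mathbf{w}) = w_0 + w_1 x_1 + \dots + w_D x_D
$$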

"\n",
"$$\n",
"\\begin{aligned}\n",
"p(t | \\mathbf{x}, \\mathbf{w}, \\beta) &= \\mathcal{N}(t | y(\\mathbf{x}, \\mathbf{w}), \\beta^{-1}) \\\\\n",
@fkiraly (Collaborator), Jan 25, 2025:

this notation for N is odd, I think incorrect?

What is it supposed to mean?

  • Should "t" be a free variable instead?
  • N usually stands for the distribution, not the pdf.
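For illustration, one way to write this so that $\mathcal{N}$ denotes the distribution rather than its pdf, with a free variable $y$ and the function written as $f$ (a sketch of the suggestion, not text from the notebooks):

$$
y \mid \mathbf{x}, \mathbf{w}, \beta \sim \mathcal{N}\left(f(\mathbf{x} \mid \mathbf{w}), \beta^{-1}\right)
$$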

" The second notebook delves into the concept of conjugate priors. By using conjugate distributions, we can derive posterior distributions analytically, simplifying Bayesian inference. This notebook also highlights how prior knowledge can influence the model and improve predictions in the presence of limited data.\n",
"\n",
"\n",
"3. **MCMC and Variational Inference** \n",
Collaborator comment:
there is also a fourth: the various strategies known as "approximate Bayes".

@fkiraly (Collaborator) left a comment:

Nice notebooks!

High-level, I think two points of improvement are most important:

  • it is crucial that a reader gets some signposting about what they can expect from the notebooks, otherwise notebook 1 may disappoint expectations. A typical reader - seeing the bayesian folder - will want to see how they can use skpro. However, that is not what notebook 1 is about.
    • it is still interesting, but perhaps say at the start that this is more of an explainer intro with code.
  • for code notebooks, the textbook style is discouraged; at the least, I would recommend a more telegraphic, bullet-point style. That makes the notebook less of a lengthy read.

On the mechanical side:

  • in notebook 2, I would suggest minimizing extraneous content that is added only for it. For instance, could we use one of the sklearn datasets instead of adding csvs to the folder (see the sketch after this list)?
  • we should also minimize "line-by-line" logic in the cells; 20 lines of data generation is not very useful from a didactic perspective. Can we outsource this to a data generation routine in the main repo? Or just use a dummy dataset from sklearn and split it up?
  • It is also important to split the vignettes more, e.g., have skpro vignettes separate from plotting cells.
  • for plotting distributions - at least univariate ones - you can use the skpro onboard plotting. If you think it is not sufficient, we should add more features to that.
    • multivariate is not yet supported, but that might be also a good small project?
  • Regarding the final summary, with advantages and disadvantages: conjugates are also typically simple, compute-efficient algorithms. A user may want to use evaluation to see how well they fare against more memory- or compute-heavy variants...
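As a concrete illustration of the dataset point above (a sketch of one possible approach, not code from the PR): load an sklearn dataset and split it, instead of bundling CSVs with the notebook.

    # illustrative sketch: replace the bundled CSV with an sklearn dataset
    from sklearn.datasets import load_diabetes
    from sklearn.model_selection import train_test_split

    X, y = load_diabetes(return_X_y=True, as_frame=True)
    X = X[["bmi"]]  # keep a single feature so the conjugate example stays simple

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)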

@fkiraly fkiraly added module:regression probabilistic regression module documentation Documentation & tutorials labels Jan 25, 2025
@fkiraly fkiraly changed the title [DOC] Finished first two Bayesian notebooks and their artefacts [DOC] two notebooks on Bayesian probabilistic regression Jan 25, 2025