Symbolic Regression with PyMC

dreycenfoiles · November 21, 2024, 5:19pm

Have there been any attempts to bring symbolic regression techniques to PyMC? I was looking into Bayesian symbolic regression and was only able to find this one paper and, if I understand correctly, the authors had to use an unusual type of MCMC sampler. I looked for implementations in any of the big PPLs but wasn’t able to find anything. If they exist, they are not publicized.

I strongly suspect that my best option is to use a non-parametric regression technique in a model during inference and, outside the model, use a symbolic regression toolkit like PySR on the conditioned non-parametric regressor. Has anyone attempted something like this before?

Thank you all for your time

drbenvincent · December 2, 2024, 10:49pm

You should check out the work of Miles Cranmer. He has a package for symbolic regression in Julia, SymbolicRegression.jl. He’s also got a package called PySR, which seems at first glance to be implemented in Python.

dreycenfoiles · December 3, 2024, 5:24pm

Thank you for the reply! I am familiar with PySR and I agree that it could be useful to me. The reason for my question is that PySR only works when you have clearly defined input and output data. However, I am more interested in doing symbolic regression with latent variables as a function of input data in a PyMC model and I’m not sure what the best way to accomplish that is. It seems like there’s no good way to include a symbolic regressor in a PyMC model. Therefore, my first thought is to use a nonparametric regressor like a Gaussian process and, after sampling, use PySR to get an approximate functional form of the GP outside of the PyMC model. I was wondering if anyone had attempted something like this and if they had any experiences they wanted to share.
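A minimal sketch of the two-stage workflow described above, under stated assumptions: the latent function is one-dimensional, the observations are counts with a Poisson likelihood, and only the posterior mean of the GP is passed to PySR. The function names `fit_latent_gp` and `distill_to_symbolic`, the priors, and the operator set are all hypothetical choices for illustration, not a recommended configuration.

```python
def fit_latent_gp(x, counts, draws=500):
    """Sample a latent GP f(x) under a Poisson likelihood.

    Returns the posterior mean of f at the observed inputs.
    Imports are local because PyMC and PySR are heavy to load.
    """
    import numpy as np
    import pymc as pm

    X = np.asarray(x, dtype=float).reshape(-1, 1)
    with pm.Model():
        # Illustrative priors on the kernel hyperparameters (assumptions).
        ls = pm.Gamma("ls", alpha=2.0, beta=1.0)
        eta = pm.HalfNormal("eta", sigma=2.0)
        cov = eta**2 * pm.gp.cov.ExpQuad(input_dim=1, ls=ls)

        # The GP is a latent variable; it is never observed directly.
        gp = pm.gp.Latent(cov_func=cov)
        f = gp.prior("f", X=X)
        pm.Poisson("counts", mu=pm.math.exp(f), observed=counts)

        idata = pm.sample(draws=draws, progressbar=False)

    # Average over chains and draws to get one value of f per input point.
    return idata.posterior["f"].mean(dim=("chain", "draw")).values


def distill_to_symbolic(x, f_mean, niterations=40):
    """Run PySR on (x, posterior mean of f), outside the PyMC model."""
    import numpy as np
    from pysr import PySRRegressor

    X = np.asarray(x, dtype=float).reshape(-1, 1)
    sr = PySRRegressor(
        niterations=niterations,
        binary_operators=["+", "-", "*", "/"],
        unary_operators=["exp", "sin"],
    )
    sr.fit(X, f_mean)
    return sr.sympy()  # best expression found, as a SymPy object
```

Calling `distill_to_symbolic(x, fit_latent_gp(x, counts))` would return a candidate closed form for the latent function. Fitting only the posterior mean discards the GP's uncertainty; one possible refinement is to run PySR on several individual posterior draws and check whether the recovered expressions agree.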

drbenvincent · December 22, 2024, 3:23pm

If your latent variables represent something that you can reason about, then you can just build a model that defines the functional relationship between each latent variable and its parents. That is not symbolic regression, in that you are not using PyMC to discover what the relationship is; you are defining it yourself. But if you want to explore various functional forms for the relationships between latent variables and their parents, you could build a function factory, evaluate a whole load of models, and then do model comparison or ensemble prediction, for example. Just some thoughts.
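The function-factory idea could be sketched roughly as follows. This assumes a single input `x`, two parameters per candidate form, and a Gaussian likelihood. The names `FORMS`, `build_model`, and `compare_forms`, along with the priors and the three candidate forms, are hypothetical and chosen only for illustration.

```python
# Candidate functional forms. Each takes the input x and two parameters.
# Only plain arithmetic is used, so the same callables work on floats,
# NumPy arrays, and PyTensor variables inside a PyMC model.
FORMS = {
    "linear": lambda x, a, b: a * x + b,
    "quadratic": lambda x, a, b: a * x**2 + b,
    "lorentzian": lambda x, a, b: a / (1 + x**2) + b,
}


def build_model(form, x, y):
    """Factory: return a PyMC model whose mean follows the given form."""
    import pymc as pm

    with pm.Model() as model:
        # Illustrative priors (assumptions, not tuned for any dataset).
        a = pm.Normal("a", mu=0.0, sigma=5.0)
        b = pm.Normal("b", mu=0.0, sigma=5.0)
        sigma = pm.HalfNormal("sigma", sigma=1.0)

        # The latent mean is a deterministic function of its parents.
        mu = pm.Deterministic("mu", form(x, a, b))
        pm.Normal("y", mu=mu, sigma=sigma, observed=y)
    return model


def compare_forms(x, y, forms=FORMS):
    """Fit one model per candidate form and rank them with ArviZ."""
    import arviz as az
    import pymc as pm

    traces = {}
    for name, form in forms.items():
        with build_model(form, x, y):
            # Pointwise log-likelihood is needed for LOO-based comparison.
            traces[name] = pm.sample(
                idata_kwargs={"log_likelihood": True}, progressbar=False
            )
    return az.compare(traces)
```

`compare_forms(x, y)` returns the ArviZ comparison table, which ranks the candidate forms by estimated out-of-sample predictive accuracy. The weights in that table could also serve as a starting point for ensemble prediction.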
