How to load data from .txt?

I understand this is a basic question that must be answered somewhere in the documentation, but after an hour of searching I still don’t see how to create a model with data loaded from a .txt file. I want to run a simple autocorrelation plot with PyMC3, and have a .txt / .csv that stores the data. What is the simplest way to create a PyMC3-callable object with this data?

Many thanks in advance.

By data do you mean observed data that you can assign to a random variable, or do you just want to store the results of the trace given by pymc3.sample?

In both cases, you can use the standard pickle module that ships with Python. To store any object in a file you use pickle.dump, and to load it back you use pickle.load.
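As a rough sketch of that round trip (the results dict and the file name here are just placeholders for illustration, not anything PyMC3-specific):

```python
import os
import pickle
import tempfile

# Placeholder for whatever object you want to keep (e.g. a trace);
# a plain dict is used here purely for illustration.
results = {"mu": [0.1, 0.3, 0.2], "n_draws": 3}

# Hypothetical file location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "results.pkl")

# Pickle files must be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(results, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == results)
```

Keep in mind that pickle.load only reads files that were written with pickle.dump; it will not parse a plain .txt of numbers.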

As I understand it, pymc3.sample requires a model built in PyMC3, so that is not what I want. I simply want to call the function pymc3.plots.autocorrplot, whose first parameter is trace, the result of a PyMC3 MCMC run. I already have the data from the MCMC stored in a .txt. How would I pickle.load the data so that I can call pymc3.plots.autocorrplot?

Thanks again for your time.

Two things:

  1. We have moved all the plotting functionality to arviz in the master branch, so autocorrplot will now be able to handle instances of xarray.DataArray. Maybe @colcarroll can give you some hints on how to load the data properly from a txt file onto a DataArray.
  2. For the old autocorrplot, you need to build a MultiTrace instance from the data in the txt file. In my opinion, the easiest way is to create a new pymc3.backends.NDArray trace instance, call its setup method to specify how many draws and chains there are, and then iterate over your data, calling record to add each point to the trace. Note that the functionality for saving and loading trace instances stores much more information than just the points in each chain, so each backend has custom save_trace and load_trace (or save and load) methods. You should use those (or just pickle.dump(trace)) instead of saving the points to a txt file yourself.

I would have to see the format of the text file to help you load it. Can you turn whatever is in the .txt into one or more numpy arrays? That would go a long way to computing autocorrelations.

I have no problem loading the .txt as a one-dimensional numpy array. In hindsight, I should have phrased the problem more simply: I just want to compute the autocorrelation of such an array.

Oh, then az.autocorr will do it (documentation).

import arviz as az
import numpy as np
az.autocorr(np.random.randn(100).cumsum())
# array([ 1.        ,  0.97315099,  0.941736  ,  0.91007784,  0.88133291, ... 
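If it helps, here is a minimal sketch of the whole path from a text file to autocorrelation values using only NumPy. It assumes the file holds one sample per line; the file name and the autocorr helper are made up for this example, and the helper's normalization may differ in detail from what az.autocorr does internally.

```python
import os
import tempfile

import numpy as np


def autocorr(x):
    """Normalized sample autocorrelation of a 1-D array for lags 0..len(x)-1."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    # Correlate the centered series with itself and keep the non-negative lags.
    acov = np.correlate(centered, centered, mode="full")[x.size - 1:]
    # Divide by the lag-0 value so that the result starts at 1.
    return acov / acov[0]


# Hypothetical input: write a random walk to a temporary .txt, one value per line.
rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), "chain.txt")
np.savetxt(path, rng.standard_normal(100).cumsum())

samples = np.loadtxt(path)  # 1-D array with 100 entries
rho = autocorr(samples)
print(rho[:5])
```

The same samples array loaded with np.loadtxt can be passed straight to az.autocorr if ArviZ is installed.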

Thanks. Was I correct in my guess that there is no simple way of doing it with PyMC3?