Etaoin

Reputation: 8734

Version control of Mathematica notebooks

Mathematica notebooks are, of course, plaintext files -- it seems reasonable to expect that they should play nice with a version-control system (git in my case, although I doubt the specific system matters). But the fact is that any .nb file is full of cache information, timestamps, and other assorted metadata. Scads of it.

Which means that limited version control is possible -- commits and rollbacks work fine. Merging, though, is a disaster. Mathematica won't open a file with merge markers in it, and a text editor is no way to go through a .nb file.

Has anyone had any luck putting a notebook under version control? How?

Upvotes: 78

Views: 8393

Answers (9)

userrandrand

Reputation: 171

You can use the Wolfram Engine with a Jupyter notebook in VS Code, and then use VS Code's git features and, optionally, the GitLens extension. For more details on getting VS Code to work with the Wolfram Language, see https://mathematica.stackexchange.com/questions/218935/how-do-you-run-wolfram-language-code-in-vscode .

Example of git diff view of two commits


How to see a git diff of 2 versions of a notebook in vscode

Without the gitlens extension

Using VS Code without the GitLens extension, you have two options. To check uncommitted changes since the last commit, use the Source Control section of the Source Control menu in the sidebar. To check the diff between two previous commits, right-click the first commit in the Timeline section of the File Explorer menu in the sidebar and select "Select for Compare", then right-click the second commit and click "Compare with Selected". You can filter local changes, such as file saves, out of the timeline to view only the git history.

Pros

  • No need to install an extra extension
  • I do not know why, but rendered image outputs (such as plots) are not shown when I use the GitLens extension, whereas they are shown with the Timeline option described above

With the gitlens extension

You can use the Commits section of the Source Control menu in the right sidebar, or you can use Search & Compare in the GitLens Inspect menu in the sidebar.

Pros

  • The search and compare option might be easier to use than the timeline section in the file explorer menu if there are many commits

  • Other features and visualizations. You can check some of the features of the extension on youtube.

Upvotes: 1

Thomas Young

Reputation: 1

You can use the .wls or .wl file format: create a new .wls or .wl script instead of a notebook.

It saves only code, chapter marks, comments, etc., without the extra timestamps, computed results, display styles, and so on.

For example, here is a .wl file in raw form:

(* ::Package:: *)

(* ::Chapter:: *)
(*some title*)

(*som comment*)
NIntegrate[Erfc@x,{x,-1,1}]

(* ::Input:: *)
(*(*other comment*)*)
(*Plot[Beta[1/2,x],{x,-2,3}]*)

Here is the same content as a .nb file in raw form:

(* Content-type: application/vnd.wolfram.mathematica *)

(*** Wolfram Notebook File ***)
(* http://www.wolfram.com/nb *)
...
(* Internal cache information:
...
NotebookDataPosition[158, 7]
...
WindowFrame->Normal*)

(* Beginning of Notebook Content *)
Notebook[{

Cell[CellGroupData[{
Cell["some title", "Chapter",...],

Cell[BoxData[
 RowBox[{
  RowBox[{"(*", ....], "Code",
 CellChangeTimes->{{3.921788175703182*^9, 3.921788181859129*^9}},
 CellLabel->"In[8]:=",...],
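Because the .wl format keeps each cell as a marker comment line followed by plain code, ordinary line-based tools can process it. As a rough illustration, the Python sketch below lists the cell style markers of the form `(* ::Style:: *)` seen in the .wl sample above. The function name is hypothetical, and the pattern is inferred only from that sample, so other marker shapes may exist that it does not match.

```python
import re

# Marker lines as they appear in the .wl sample, e.g. "(* ::Chapter:: *)".
# Assumption: the style name consists of letters only.
CELL_MARKER = re.compile(r"^\(\* ::([A-Za-z]+):: \*\)[ \t]*$", re.MULTILINE)


def cell_styles(wl_text: str) -> list[str]:
    """Return the cell style names found in .wl text, in file order."""
    return CELL_MARKER.findall(wl_text)


sample = (
    "(* ::Package:: *)\n"
    "\n"
    "(* ::Chapter:: *)\n"
    "(*some title*)\n"
    "\n"
    "NIntegrate[Erfc@x,{x,-1,1}]\n"
    "\n"
    "(* ::Input:: *)\n"
    "(*Plot[Beta[1/2,x],{x,-2,3}]*)\n"
)
print(cell_styles(sample))
```

The same line-oriented layout is what lets git diff and merge a .wl file one line at a time.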

Upvotes: 0

JP-Ellis

Reputation: 437

A new possibility is to use mathematica-notebook-filter which parses Mathematica notebooks and strips all output cells and metadata so that these are not committed into the version control system.

In the specific case of git, it is quite easy to integrate mathematica-notebook-filter through gitattributes filters, so that git automatically strips the output and metadata when files are staged and when diffs are calculated. You will need to have mathematica-notebook-filter installed and on your PATH (or adapt the configuration below to point to the binary) and add the following line to your ~/.gitattributes file (note that git only reads a global attributes file from that location if core.attributesFile is set to it; the default location is ~/.config/git/attributes):

*.nb    filter=dropoutput_nb

This instructs git to parse all files matching *.nb with the dropoutput_nb filter which is defined in your ~/.gitconfig as:

[filter "dropoutput_nb"]
    clean = mathematica-notebook-filter
    smudge = cat
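To make the role of the clean command concrete, here is a minimal Python sketch of a stand-in filter. It is not how mathematica-notebook-filter is implemented. It only removes CellChangeTimes options (the timestamp option visible in raw .nb files), and all function names are hypothetical. It assumes the option value is always a brace-delimited list and does not guard against the text appearing inside a string.

```python
import re
import sys

OPTION = "CellChangeTimes->"


def _balanced_end(text: str, i: int):
    """Return the index just past the brace-balanced list starting at i, or None."""
    if i >= len(text) or text[i] != "{":
        return None
    depth = 0
    for j in range(i, len(text)):
        if text[j] == "{":
            depth += 1
        elif text[j] == "}":
            depth -= 1
            if depth == 0:
                return j + 1
    return None


def strip_cell_change_times(text: str) -> str:
    """Remove every CellChangeTimes->{...} option together with one adjacent comma."""
    out = []
    pos = 0
    while True:
        start = text.find(OPTION, pos)
        if start == -1:
            out.append(text[pos:])
            return "".join(out)
        end = _balanced_end(text, start + len(OPTION))
        if end is None:
            # Unexpected shape: leave the remainder untouched.
            out.append(text[pos:])
            return "".join(out)
        head = text[pos:start]
        trailing = re.match(r"\s*,", text[end:])
        if trailing:
            # Option is followed by more options: drop its trailing comma.
            end += trailing.end()
            head = head.rstrip()
        else:
            # Option is last in the cell: drop the comma before it.
            head = re.sub(r",\s*$", "", head)
        out.append(head)
        pos = end


def run_filter(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Act as a git clean filter: read a notebook on stdin, write the cleaned text."""
    stdout.write(strip_cell_change_times(stdin.read()))
```

A script that ends by calling `run_filter()` could be set as the clean command for experimentation; the real tool strips output cells and other metadata as well.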

If, for some reason, you want to have a specific Mathematica notebook committed with all output and metadata, you can disable the filter in the project's .gitattributes file by adding:

notebook_file.nb    !filter

Disclaimer: I am the author of this tool. It is open source and feedback (both good and bad) is appreciated. Contributions are welcome on Github.

Upvotes: 4

Apo

Reputation: 141

There is a nice set of recommendations for how to use Git to do version control with Mathematica at Mathematica Stack Exchange. In short, the philosophy is to minimize use of .nb notebooks, and try to do most of the version control with .m packages (similar to what xuhdev and MMA user say in their answers). This seems quite sensible given the way notebooks are managed.

Upvotes: 14

xuhdev

Reputation: 9360

Well, my solution is not to track the notebook itself, but to track plain text files instead (not the "Notebook" flavor of plain text).

Whenever you have a notebook, you can use the "Save As..." menu to save the current file as a plain text file. When you need to load it, simply open it with Mathematica. Tracking this file is much nicer than tracking a notebook file. I'm unsure which features you may lose by using the plain text format rather than the Mathematica notebook format, but I haven't found any defects so far.

Reference: http://www.topbug.net/blog/2013/05/02/track-mathematica-source-files-with-version-control-systems/

Upvotes: 1

MMA user

Reputation: 31

Along the lines of what Simon and Kena were saying, when I have had Mathematica .nb's under version control, I often create a plain-text version of only the input code and save it with the same name but a .txt extension. While this doesn't directly solve the merging problem, it does make diff-ing work in a reasonable way and makes manual merging more obvious when I go back to edit the .nb's later. There are still some idiosyncrasies in this format, but it is MUCH easier to read than the raw .nb format.

To generate the text file, I just copy the notebook into a new blank notebook (with shortcuts, Ctrl-A,C,N,V), select the menu Cell->Delete All Output, copy the result (Ctrl-A,C), and paste the result into a plain text editor to save it. It takes surprisingly little time once you get the hang of it.

Upvotes: 3

Kena

Reputation: 6921

Not a solution to your merging problem exactly, but this is how we handle notebooks and source control in my team. Basically, we treat Mathematica notebooks the way we'd treat binary files. They're checked-in, but:

  • we always keep a PDF copy alongside the .nb (a backup for restoring the information in case we lose, for some reason, the ability to read .nb files. PDF is still a proprietary format, but a bit more widespread, and chances are Adobe and Wolfram won't both disappear at once)
  • we do not allow merges
  • we code-review only the final product (the rendered notebook) instead of the .nb file.

We mostly use Mathematica for small proofs, explorations and sidetracks, so the above procedure works fine for us (our main documentation is in LaTeX, which produces friendlier documentation for non-mathematicians/non-programmers)

Upvotes: 6

Michael Pilat

Reputation: 6520

It's recommended to disable the file outline cache, which is the metadata you're referring to when you look at the notebook with a text editor. As you discovered, it can cause merge conflicts if multiple parties are editing the same notebook.

This is easily disabled with the Option Inspector. In the Mathematica menu, go to Format->Option Inspector..., in the top-left set the scope dropdown to Selected Notebook, and search for FileOutlineCache in the search field. Set the option to False and save your notebook, and you should be all set.

Note that this can make opening notebooks a little slower, but unless the notebook is rather large, you probably won't notice the difference.
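If several people share a repository, a quick check for notebooks that still carry the cache can help. The Python sketch below is only an illustration with hypothetical function names. It assumes that a notebook saved with the cache disabled no longer contains the "Internal cache information" comment seen in raw .nb files, which is worth verifying against your Mathematica version.

```python
# Marker text taken from the comment block found near the top of raw .nb files.
CACHE_MARKER = "Internal cache information"


def has_outline_cache(notebook_text: str) -> bool:
    """Return True if the notebook text still contains the cache comment."""
    return CACHE_MARKER in notebook_text


def notebooks_with_cache(paths):
    """Return the subset of paths whose file contents contain the cache comment."""
    flagged = []
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            if has_outline_cache(handle.read()):
                flagged.append(path)
    return flagged
```

Calling `notebooks_with_cache` on the .nb files about to be committed would list the ones that were saved without the option set to False.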

Upvotes: 48

Ash

Reputation: 62145

You should only get merge markers if the source control system detects changes to a single line by multiple users.

The source control system adds markers to make it very clear where the conflicts are, and to force you to manually remove them (as you resolve each conflict). There is no way for a source control system to know how to do it automatically for you.
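As a small aid for that manual step, the following Python sketch lists the lines that look like git conflict markers, so they can be located before the file is opened in another program. The function name is hypothetical, and it assumes git's usual marker lines (including the `|||||||` line used by the diff3 conflict style).

```python
def find_conflict_markers(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        is_marker = (
            line.startswith(("<<<<<<< ", ">>>>>>> ", "||||||| "))
            or line == "======="
        )
        if is_marker:
            hits.append((number, line))
    return hits


merged = "a\n<<<<<<< HEAD\nb\n=======\nc\n>>>>>>> branch\nd\n"
for number, line in find_conflict_markers(merged):
    print(number, line)
```

An empty result means no marker lines were found; it does not mean the merge itself was semantically correct.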

If the file is text, but is designed to be read by a program only, it may have no end of line characters at all (or very long lines). Therefore if multiple people are working on such a file you'll get many merge conflicts.

I'm not familiar with the nb file format, but in general the solution to this problem is to ensure only one person is working on a file at a time (i.e., use an exclusive check-out mode for nb files).

Upvotes: 0
