Reputation: 25318
I'm trying to build a Jupyter kernel for a language that doesn't really support a REPL, and re-defining a variable or function throws an error in that language. Unfortunately that means that I can't just keep executing the code in the order the user submits it, but instead need to substitute it if they re-visit an older cell. Let's say the user has the following two cells:
Cell 1:
int foo = 1;
Cell 2:
vec4(foo);
In my ideal scenario, I just want to stitch the cells together into one virtual source file that is in cell order and then execute that. So the resulting virtual source file should be this:
int foo = 1;
vec4(foo);
Now let's say the user goes back to cell 1 and changes foo to 4. How can I find out that the user edited cell 1? Ideally I want to update the virtual source file to look like this:
int foo = 4;
vec4(foo);
Instead of this:
int foo = 1;
vec4(foo);
int foo = 4; // This would throw an error in the language compiler
I'm using this as my base and I've looked through the source but have been unable to find anything that would help me. Is there something that I missed? Anything else that I should be doing instead?
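The desired substitution behaviour can be sketched as follows. This is only an illustration of the goal, not a solution: `VirtualSource`, `cell_id` and `position` are hypothetical names, and the sketch assumes the kernel could somehow learn a stable identifier for each cell, which is exactly the missing piece.

```python
class VirtualSource:
    """Keeps one snippet per cell and renders them in cell order."""

    def __init__(self):
        # Hypothetical mapping: cell_id -> (position in notebook, code)
        self._cells = {}

    def submit(self, cell_id, position, code):
        # Re-submitting a known cell_id replaces its code instead of appending
        self._cells[cell_id] = (position, code)

    def render(self):
        # Stitch the snippets together in notebook order
        ordered = sorted(self._cells.values(), key=lambda item: item[0])
        return "\n".join(code for _, code in ordered)


source = VirtualSource()
source.submit("cell-1", 0, "int foo = 1;")
source.submit("cell-2", 1, "vec4(foo);")
source.submit("cell-1", 0, "int foo = 4;")  # the user edits cell 1
print(source.render())
```

With this shape, editing cell 1 rewrites its entry in place, so the rendered file never contains two definitions of foo.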
Upvotes: 2
Views: 3971
Reputation: 2836
Here is a possible solution using the messaging API (https://jupyter-client.readthedocs.io/en/latest/messaging.html#history).
import asyncio
from dataclasses import dataclass
from uuid import uuid4

from tornado.escape import json_encode, json_decode
from tornado.httpclient import AsyncHTTPClient, HTTPRequest
from tornado.websocket import websocket_connect

client = AsyncHTTPClient()
# Replace these with the values from your own running Jupyter server
session_id = 'faf69f76-6667-45d6-a38f-32460e5d7f24'
token = 'e9e267d0c802017c22bc31d276b675b4f5b3e0f180eb5c8b'
kernel_id = 'fad149a5-1f78-4827-ba7c-f1fde844f0b2'
@dataclass
class Cell:
    code: str
    index: int
    execution_count: int

# We keep track of all cells to maintain an up-to-date index
cells = []
async def get_sessions():
    url = 'http://localhost:8888/api/sessions?token={}'.format(token)
    req = HTTPRequest(url=url)
    resp = await client.fetch(req)
    print(resp)
    print(resp.body)

async def get_notebook_content(path):
    url = 'http://localhost:8888/api/contents/{}?token={}'.format(path, token)
    req = HTTPRequest(url=url)
    resp = await client.fetch(req)
    return json_decode(resp.body)

async def get_session(session_id):
    ses_url = 'http://localhost:8888/api/sessions/{}?token={}'.format(session_id, token)
    ses_req = HTTPRequest(url=ses_url)
    resp = await client.fetch(ses_req)
    return json_decode(resp.body)
# Return the notebook's cells as a list of Cell dataclass instances
def parse_cells(content):
    res = []
    # Iterate over the notebook cells, recording each one's position
    for index, c in enumerate(content['content']['cells']):
        # Markdown and raw cells have no execution_count key, so use .get()
        cell = Cell(code=c['source'], index=index,
                    execution_count=c.get('execution_count'))
        res.append(cell)
    return res
# Listen to all notebook messages
async def listen():
    global cells  # update the module-level list instead of shadowing it
    session_data = await get_session(session_id)
    notebook_path = session_data['notebook']['path']
    notebook_content = await get_notebook_content(notebook_path)
    # Parse the existing cells
    cells = parse_cells(notebook_content)
    # Connect to the kernel websocket to receive all messages
    req = HTTPRequest(
        url='ws://localhost:8888/api/kernels/{}/channels?token={}'.format(
            kernel_id,
            token))
    ws = await websocket_connect(req)
    print('Connected to kernel websocket')
    hist_msg_id = None
    while True:
        msg = await ws.read_message()
        if msg is None:
            # read_message() returns None once the connection is closed
            break
        msg = json_decode(msg)
        msg_type = msg['msg_type']
        # parent_header can be empty, so don't index it directly
        parent_msg_id = msg['parent_header'].get('msg_id')
        if msg_type == 'execute_input':
            # After a cell is executed, request the history
            # (only the last executed input)
            hist_msg_id = uuid4().hex
            await ws.write_message(json_encode({
                'header': {
                    'username': '',
                    'version': '5.3',
                    'session': '',
                    'msg_id': hist_msg_id,
                    'msg_type': 'history_request'
                },
                'parent_header': {},
                'channel': 'shell',
                'content': {
                    'output': False,
                    'raw': True,
                    'hist_access_type': 'tail',
                    'n': 1
                },
                'metadata': {},
                'buffers': []
            }))
        elif (hist_msg_id is not None and parent_msg_id == hist_msg_id
                and msg_type == 'history_reply'):
            # We receive the history of the last executed cell,
            # including its execution_count
            hist_msg_id = None  # we don't expect more replies
            # See message type 'history_reply':
            # https://jupyter-client.readthedocs.io/en/latest/messaging.html#history
            execution_count = msg['content']['history'][0][1]
            code = msg['content']['history'][0][2]
            # Update the existing cell; cells that were never executed
            # have execution_count None and are skipped
            for c in cells:
                if (c.execution_count is not None
                        and c.execution_count + 1 == execution_count):
                    c.code = code
                    c.execution_count = execution_count
                    print('# Cell changed: {}'.format(c))
if __name__ == '__main__':
    asyncio.run(listen())
Let me try to explain it ...
We keep track of all notebook cells and their indices in a list (cells) of Cell dataclass instances (code, index and execution_count).
We listen to every message from the desired session (the listen function).
When a cell gets executed, we request its history through the messaging API to get its code and execution_count.
We match the cell with the existing one through its execution_count and update it.
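The request and matching steps above can also be sketched as two plain functions, separated from the websocket plumbing so they are easier to test. The names `build_history_request` and `apply_history_reply` are hypothetical helpers, not part of any Jupyter library, and the matching rule is simply the one from the listing (previous execution_count plus one), returning the first cell that matches.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import uuid4


@dataclass
class Cell:
    code: str
    index: int
    execution_count: Optional[int]  # None if the cell was never executed


def build_history_request(n=1):
    """Build a history_request asking for the last n inputs."""
    return {
        'header': {'username': '', 'version': '5.3', 'session': '',
                   'msg_id': uuid4().hex, 'msg_type': 'history_request'},
        'parent_header': {},
        'channel': 'shell',
        'content': {'output': False, 'raw': True,
                    'hist_access_type': 'tail', 'n': n},
        'metadata': {},
        'buffers': [],
    }


def apply_history_reply(cells, content):
    """Update the cell matching the reply; return it, or None if no match."""
    if not content.get('history'):
        return None
    # With output=False each entry is (session, line_number, input)
    _session, execution_count, code = content['history'][0]
    for cell in cells:
        if (cell.execution_count is not None
                and cell.execution_count + 1 == execution_count):
            cell.code = code
            cell.execution_count = execution_count
            return cell
    return None


cells = [Cell('int foo = 1;', 0, 1), Cell('vec4(foo);', 1, None)]
reply = {'history': [[0, 2, 'int foo = 4;']]}  # hand-written sample reply
print(apply_history_reply(cells, reply))
```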
I know it's a very peculiar solution, but when the notebook communicates with the kernel through the messaging API, it doesn't include any information about cell identity, only the cell's code.
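To illustrate that limitation: in the messaging spec, the content of an execute_input message consists of code and execution_count, with nothing that identifies the cell. The sample message below is hand-written for illustration, and `extract_input` is a hypothetical helper.

```python
def extract_input(msg):
    """Return (execution_count, code) from an execute_input message."""
    content = msg['content']
    return content['execution_count'], content['code']


# Hand-written sample; a real message carries more header fields
sample = {
    'msg_type': 'execute_input',
    'parent_header': {'msg_id': 'abc123'},
    'content': {'code': 'int foo = 4;', 'execution_count': 2},
}

count, code = extract_input(sample)
print(count, code)
```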
Important note
This solution does not handle inserted or deleted cells. For that we would have to find a solution using the kernel history or something similar.
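One possible direction, sketched here and not tested against a live server, is to re-read the notebook content and diff the new list of cell sources against the tracked one. This assumes the notebook has been saved since the change, because the contents API only sees the saved file. `diff_cells` is a hypothetical helper built on the standard library's difflib.

```python
from difflib import SequenceMatcher


def diff_cells(old_codes, new_codes):
    """Return (inserted, deleted) cell indices between two source lists.

    Inserted indices refer to new_codes, deleted indices to old_codes.
    An edited cell shows up in both lists.
    """
    inserted, deleted = [], []
    matcher = SequenceMatcher(a=old_codes, b=new_codes, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ('delete', 'replace'):
            deleted.extend(range(i1, i2))
        if tag in ('insert', 'replace'):
            inserted.extend(range(j1, j2))
    return inserted, deleted


old = ['int foo = 1;', 'vec4(foo);']
new = ['int foo = 1;', 'float bar = 2.0;', 'vec4(foo);']
print(diff_cells(old, new))
```

The tracked Cell list could then be rebuilt with fresh indices whenever either list is non-empty.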
Upvotes: 4