Experiment with neurosymbolic reasoning: LLM as a parser plus a symbolic reasoner: 2025

Source: Lambda

Your main task is to experiment with semantic parsing using an LLM (GPT, Claude, Llama, whichever you prefer) to obtain symbolic rules and facts, which are then fed to a symbolic solver to solve the problem. The second task is to find examples where asking the LLM the question directly does not give a correct answer, but the pipeline above does. This may not be trivial: if you fail, you still pass the lab.
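To make the pipeline concrete, here is a minimal self-contained sketch of the idea. Note that the prompt and output formats in the actual repo differ: the "LLM output" below is hard-coded and the predicate names are invented for illustration.

```python
# Minimal illustration of the pipeline shape (NOT the repo's actual formats):
# an LLM parses text into facts, a symbolic step derives new facts, and the
# question is answered against the derived set.

# Pretend this is what the LLM returned for the text
# "John is the father of Mary. Mary is the mother of Ann."
llm_output = """
father(john, mary).
mother(mary, ann).
"""

def parse_facts(text):
    """Parse simple 'pred(a, b).' lines into (pred, a, b) triples."""
    facts = set()
    for line in text.strip().splitlines():
        line = line.strip().rstrip(".")
        if not line:
            continue
        pred, rest = line.split("(", 1)
        a, b = (t.strip() for t in rest.rstrip(")").split(","))
        facts.add((pred, a, b))
    return facts

def derive_grandparents(facts):
    """One handmade rule, applied by naive forward chaining:
    grandparent(X, Z) :- parent(X, Y), parent(Y, Z),
    where father/mother both count as parent."""
    parents = {(a, b) for (p, a, b) in facts if p in ("father", "mother")}
    derived = {("grandparent", a, c)
               for (a, b) in parents for (b2, c) in parents if b == b2}
    return facts | derived

facts = derive_grandparents(parse_facts(llm_output))
print(("grandparent", "john", "ann") in facts)  # True
```

The point of the symbolic step is exactly this kind of multi-hop inference: the answer is never stated in the text, but follows from the parsed facts plus a rule.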

Note: this lab can be done via a browser; there is no strict need to install anything. However, running the command-line tools is probably more efficient in the end.

How to proceed (updated):

Have a look at the llmpipe subfolder in the larger nlpsolver repo. Read the README.md file explaining the example files there. In particular (an excerpt):

  • nlpsimplecollect.py: cycles through the texts in a file (default `nlpsimpletext.py`), asks the LLM to parse each one, and stores the results in a file. Works with both GPT and Claude.
  • nlpsimpleconv.py: takes the file generated by `nlpsimplecollect.py` (default `nlpsimpletext_gpt_llmresults.txt`) and generates a new file by parsing and fixing the raw LLM output.
  • nlpsimpleprompt1.txt: used by `nlpsimplecollect.py` as a system prompt
  • nlpsimpletext.py: example input file used by `nlpsimplecollect.py`
  • nlpsimpletext1.py: another example input file for `nlpsimplecollect.py`
  • logicexample.txt: a very simple logic task built from the `nlpsimpletext.py` conversion output, plus a single handmade rule and a question. You can run it directly at logictools.org.
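To get a feel for what the conversion step has to deal with, here is a rough, simplified stand-in for the fixing logic. The real `nlpsimpleconv.py` does considerably more, and the raw output below is invented:

```python
FENCE = "`" * 3  # a literal markdown code fence, built up to keep this example readable

def fix_llm_output(raw):
    """Keep only lines that look like logic clauses, dropping markdown fences
    and LLM chatter (a rough, invented stand-in for what nlpsimpleconv.py does)."""
    clauses = []
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith(FENCE):
            continue
        if line.endswith(".") and "(" in line:
            clauses.append(line)
    return clauses

# Typical raw LLM output: polite chatter wrapped around a fenced code block.
raw = "\n".join([
    "Sure! Here is the parse you asked for:",
    FENCE + "prolog",
    "father(john, mary).",
    "mother(mary, ann).",
    FENCE,
    "Let me know if you need anything else!",
])
print(fix_llm_output(raw))  # ['father(john, mary).', 'mother(mary, ann).']
```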

NB! The material above was partially developed and explained in detail during the 22 April lecture: have a look at the recording.

What was recommended before (not wrong, just more complicated):

First, have a look at the gpt subfolder in the nlpsolver repo. The README.md file explains what is there: be sure to read it!

Use logifyprompt3.txt as an example, or as a source of ideas, for how to make an LLM parse English into logic. Alternatively, there is a newer subfolder with a newer and better experimental prompt for logification, logifyprompt6.txt, an improved LLM caller (for both GPT and Claude), nlpcollectgpt.py, and a converter/fixer for the latter's output: nlpconvcollected.py.

Second, either

  • modify the example prompt (or write your own) to produce input for either logictools or the online Problog,
  • or write a converter from the LLM-produced text (look at nlpconvcollected.py) to a suitable format.
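As a sketch of the converter option, the snippet below turns a hypothetical `subject | relation | object` line format (not the repo's actual format) into Prolog/Problog-style facts, normalizing names into valid atoms:

```python
import re

def to_clause(subject, relation, obj):
    """Turn a (subject, relation, object) triple into a Prolog/Problog fact.
    Atom names are lowercased and non-alphanumeric runs become underscores."""
    def atom(s):
        return re.sub(r"[^a-z0-9_]+", "_", s.strip().lower())
    return f"{atom(relation)}({atom(subject)}, {atom(obj)})."

# Hypothetical LLM output format: one 'subject | relation | object' per line.
llm_lines = [
    "John Smith | is father of | Mary",
    "Mary | is mother of | Ann",
]
clauses = [to_clause(*line.split("|")) for line in llm_lines]
print(clauses)
# ['is_father_of(john_smith, mary).', 'is_mother_of(mary, ann).']
```

Whatever intermediate format you choose, the main design question is the same: keep the LLM's job (free text to simple structured lines) easy, and do the strict syntax in deterministic code.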

If you choose Problog (which I recommend for this lab, since it will be more interesting and challenging), design the LLM parser prompt so that it can, at least for some cases, include (estimated) numeric confidences in the Problog rules / facts it generates. Notice the program nlpconvcollected.py in the repo: it converts and fixes the raw LLM output generated by gpt.py or nlpcollectgpt.py.
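Problog marks probabilistic facts with the `P::fact.` annotation. A minimal sketch of attaching parser confidences when assembling the program (the predicates, rules, and confidence values here are invented for illustration):

```python
def problog_fact(clause, confidence=None):
    """Render a fact as a Problog line; with a confidence it becomes a
    probabilistic fact using Problog's 'P::fact.' annotation."""
    clause = clause.rstrip(".")
    return f"{confidence}::{clause}." if confidence is not None else f"{clause}."

program = "\n".join([
    problog_fact("father(john, mary)", 0.9),  # parser was fairly sure
    problog_fact("mother(mary, ann)", 0.7),   # hedged statement in the text
    "parent(X, Y) :- father(X, Y).",
    "parent(X, Y) :- mother(X, Y).",
    "grandparent(X, Z) :- parent(X, Y), parent(Y, Z).",
    "query(grandparent(john, ann)).",
])
print(program)
```

The resulting text is a complete Problog program: paste it into the online Problog editor and it reports the probability of the queried fact.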

You do not need to use the gpt.py or nlpcollectgpt.py programs in the repo: it is perfectly OK to use a browser interface to an LLM, or an LLM other than GPT or Claude.

Finally

When you get sensible output from the LLM parsing process, use logictools or Problog to find the answer. Once you can make that work, please create several example rule / fact / question sets.
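A quick way to sanity-check Problog's answers by hand: probabilistic facts are independent in Problog's semantics, so a query with a single proof succeeds with the product of the probabilities of the facts it uses, and independent proofs combine via noisy-or. A tiny hand computation (the numbers are invented):

```python
# A query with a single proof that uses two probabilistic facts with
# confidences 0.9 and 0.7 succeeds only if both facts hold:
p1, p2 = 0.9, 0.7
p_single_proof = p1 * p2
print(round(p_single_proof, 2))  # 0.63

# If the same query had two independent proofs with success probabilities
# q1 and q2, the solver would report the noisy-or combination:
q1, q2 = 0.63, 0.2
p_two_proofs = 1 - (1 - q1) * (1 - q2)
print(round(p_two_proofs, 3))  # 0.704
```

Note that proofs sharing a probabilistic fact are not independent; Problog handles such cases exactly, so the hand computation above only matches when the proofs use disjoint facts.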

Then try to find examples where asking the LLM the question directly does not give a correct answer, but the neurosymbolic approach does. Do not worry too much if you fail.

Finally, prepare a small presentation of what you did, what worked out, and what did not.