no code implementations • 14 Nov 2023 • Melanie Mitchell, Alessandro B. Palmarini, Arseny Moskvichev
We explore the abstract reasoning abilities of text-only and multimodal versions of GPT-4, using the ConceptARC benchmark [10], which is designed to evaluate robust understanding and reasoning with core-knowledge concepts.
no code implementations • 13 Jun 2023 • Alessandro B. Palmarini, Christopher G. Lucas, N. Siddharth
The cost of search is amortised by training a neural search policy, reducing search breadth and effectively "compiling" useful information to compose program solutions across tasks.