no code implementations • EMNLP (spnlp) 2020 • Arindam Mitra, Sanjay Narayana, Chitta Baral
Successful application of Knowledge Representation and Reasoning (KR) in Natural Language Understanding (NLU) is largely limited by the availability of a robust, general-purpose natural language parser.
1 code implementation • ACL 2022 • Yiran Luo, Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral
We find that the original Who’s Waldo dataset compiled for this task contains a large number of biased samples that are solvable simply by heuristic methods; for instance, in many cases the first name in the sentence corresponds to the largest bounding box, or the sequence of names in the sentence corresponds to an exact left-to-right order in the image.
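The biased samples described above are solvable by trivial rules. A minimal sketch of the two heuristics the snippet names, with an illustrative data format (name lists and `(x, y, w, h)` boxes) that is our assumption, not the dataset's actual schema:

```python
# Hypothetical sketch of the two heuristics described above; the input
# format (names in sentence order, boxes as (x, y, w, h)) is assumed.

def largest_box_heuristic(names, boxes):
    """Link the first mentioned name to the index of the largest box."""
    areas = [w * h for (_, _, w, h) in boxes]
    return {names[0]: max(range(len(boxes)), key=areas.__getitem__)}

def left_to_right_heuristic(names, boxes):
    """Link names, in sentence order, to boxes sorted by their left edge."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0])
    return dict(zip(names, order))
```

A sample biased by either pattern is answerable without looking at faces at all, which is why such samples inflate benchmark scores.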
no code implementations • Findings (ACL) 2022 • Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
Our experiments compare the zero-shot and few-shot performance of LMs prompted with reframed instructions on 12 NLP tasks across 6 categories.
1 code implementation • 6 Jun 2024 • Aswin RRV, Nemika Tyagi, Md Nayem Uddin, Neeraj Varshney, Chitta Baral
This study explores the sycophantic tendencies of Large Language Models (LLMs), where these models tend to provide answers that match what users want to hear, even if they are not entirely correct.
no code implementations • 6 Jun 2024 • Divij Handa, Pavel Dolin, Shrinidhi Kumbhar, Chitta Baral, Tran Cao Son
Reasoning about actions and change (RAC) has historically driven the development of many early AI challenges, such as the frame problem, and many AI disciplines, including non-monotonic and commonsense reasoning.
1 code implementation • 26 May 2024 • Amir Saeidi, Shivanshu Verma, Aswin RRV, Chitta Baral
However, while RL-free methods deliver satisfactory performance, they require significant data to develop a robust Supervised Fine-Tuned (SFT) model and an additional step to fine-tune this model on a preference dataset, which constrains their utility and scalability.
no code implementations • 24 May 2024 • Yiran Luo, Joshua Feinglass, Tejas Gokhale, Kuan-Cheng Lee, Chitta Baral, Yezhou Yang
We first introduce two new quantitative measures ICV and IDD to describe domain shifts in terms of consistency of classes within one domain and similarity between two stylistic domains.
1 code implementation • 23 Apr 2024 • Amir Saeidi, Shivanshu Verma, Chitta Baral
Key observations reveal that alignment methods achieve optimal performance with smaller training data subsets, that they exhibit limited effectiveness on reasoning tasks yet significantly impact mathematical problem-solving, and that employing an instruction-tuned model notably influences truthfulness.
1 code implementation • 23 Apr 2024 • Mihir Parmar, Nisarg Patel, Neeraj Varshney, Mutsumi Nakamura, Man Luo, Santosh Mashetty, Arindam Mitra, Chitta Baral
Existing work investigating this reasoning ability of LLMs has focused only on a couple of inference rules (such as modus ponens and modus tollens) of propositional and first-order logic.
1 code implementation • 12 Apr 2024 • Agneet Chatterjee, Tejas Gokhale, Chitta Baral, Yezhou Yang
Recent advances in monocular depth estimation have been made by incorporating natural language as additional guidance.
1 code implementation • 1 Apr 2024 • Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Aflalo, Sayak Paul, Dhruba Ghosh, Tejas Gokhale, Ludwig Schmidt, Hannaneh Hajishirzi, Vasudev Lal, Chitta Baral, Yezhou Yang
One of the key shortcomings in current text-to-image (T2I) models is their inability to consistently generate images which faithfully follow the spatial relationships specified in the text prompt.
no code implementations • 17 Mar 2024 • Michael Saxon, Yiran Luo, Sharon Levy, Chitta Baral, Yezhou Yang, William Yang Wang
Benchmarks of the multilingual capabilities of text-to-image (T2I) models compare generated images prompted in a test language to an expected image distribution over a concept set.
no code implementations • 16 Feb 2024 • Divij Handa, Advait Chirmule, Bimal Gajera, Chitta Baral
We first present a pilot study on the state-of-the-art LLM, GPT-4, in decoding several safe sentences that have been encrypted using various cryptographic techniques and find that a straightforward word substitution cipher can be decoded most effectively.
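A word substitution cipher of the kind found easiest to decode can be sketched in a few lines; the vocabulary and codewords below are illustrative stand-ins, not the paper's actual material:

```python
# Minimal sketch of a word substitution cipher: each vocabulary word is
# replaced by a fixed codeword; unknown words pass through unchanged.

def build_mapping(vocab, codewords):
    """Pair each vocabulary word with its codeword."""
    return dict(zip(vocab, codewords))

def substitute(sentence, mapping):
    """Apply the word-level substitution to a whitespace-tokenized sentence."""
    return " ".join(mapping.get(w, w) for w in sentence.split())

def invert(mapping):
    """Invert the mapping to decode."""
    return {v: k for k, v in mapping.items()}
```

Because the substitution is deterministic and word-level, frequency and context cues survive encryption, which is consistent with an LLM decoding it more easily than character-level schemes.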
no code implementations • 7 Feb 2024 • Maitreya Patel, Sangmin Jung, Chitta Baral, Yezhou Yang
While LDMs offer distinct advantages, P-T2I methods' reliance on the latent space of these diffusion models significantly escalates resource demands, leading to inconsistent results and necessitating numerous iterations for a single desired image.
no code implementations • 30 Dec 2023 • Neeraj Varshney, Pavel Dolin, Agastya Seth, Chitta Baral
As Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications, their safety concerns become critical areas of NLP research.
no code implementations • 7 Dec 2023 • Maitreya Patel, Changhoon Kim, Sheng Cheng, Chitta Baral, Yezhou Yang
The T2I prior model alone adds a billion parameters compared to Latent Diffusion Models, increasing both the computational cost and the need for high-quality training data.
1 code implementation • 16 Nov 2023 • Mihir Parmar, Aakanksha Naik, Himanshu Gupta, Disha Agrawal, Chitta Baral
Assessing these models on long sequences is crucial since prior work in the general domain has demonstrated performance degradation of LLMs on longer texts.
no code implementations • 28 Oct 2023 • Neeraj Varshney, Agneet Chatterjee, Mihir Parmar, Chitta Baral
Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks; however, their large size makes their inference slow and computationally expensive.
1 code implementation • 27 Oct 2023 • Himanshu Gupta, Kevin Scaria, Ujjwala Anantheswaran, Shreyas Verma, Mihir Parmar, Saurabh Arjun Sawant, Chitta Baral, Swaroop Mishra
Finally, when pre-finetuned on our synthetic SuperGLUE dataset, T5-3B yields impressive results on the OpenLLM leaderboard, surpassing the model trained on the Self-Instruct dataset by 4.14 percentage points.
no code implementations • 23 Oct 2023 • Justin Payan, Swaroop Mishra, Mukul Singh, Carina Negreanu, Christian Poelitz, Chitta Baral, Subhro Roy, Rasika Chakravarthy, Benjamin Van Durme, Elnaz Nouri
With the evolution of Large Language Models (LLMs) we can solve increasingly more complex NLP tasks across various domains, including spreadsheets.
no code implementations • 2 Oct 2023 • Man Luo, Shrinidhi Kumbhar, Ming Shen, Mihir Parmar, Neeraj Varshney, Pratyay Banerjee, Somak Aditya, Chitta Baral
This work strives to understand the proficiency of LLMs in logical reasoning by offering a brief review of the latest progress in this area, with a focus on logical reasoning datasets, tasks, and the methods adopted to utilize LLMs for reasoning.
no code implementations • 8 Sep 2023 • Ayushi Agarwal, Nisarg Patel, Neeraj Varshney, Mihir Parmar, Pavan Mallina, Aryan Bhavin Shah, Srihari Raju Sangaraju, Tirth Patel, Nihar Thakkar, Chitta Baral
Though state-of-the-art (SOTA) NLP systems have achieved remarkable performance on a variety of language understanding tasks, they primarily focus on questions that have a correct and a definitive answer.
no code implementations • 1 Sep 2023 • Divyanshu Raj, Chitta Baral, Nakul Gopalan
In this work, we present an approach to identify sub-tasks within a demonstrated robot trajectory using language instructions.
1 code implementation • 16 Aug 2023 • Srija Macherla, Man Luo, Mihir Parmar, Chitta Baral
We introduce a unified score for the ADD system that takes into account the interplay between symptoms and diagnosis.
1 code implementation • 7 Jun 2023 • Maitreya Patel, Tejas Gokhale, Chitta Baral, Yezhou Yang
To quantify the ability of T2I models in learning and synthesizing novel visual concepts (a.k.a.
1 code implementation • 1 Jun 2023 • Man Luo, Zhiyuan Fang, Tejas Gokhale, Yezhou Yang, Chitta Baral
We investigate knowledge retrieval with multi-modal queries, i.e., queries containing information split across image and text inputs, a challenging task that differs from previous work on cross-modal retrieval.
1 code implementation • 25 May 2023 • Ujjwala Anantheswaran, Himanshu Gupta, Mihir Parmar, Kuntal Kumar Pal, Chitta Baral
We show that EDM3 helps to learn transferable knowledge that can be leveraged to perform Event Detection and its subtasks concurrently, mitigating the error propagation inherent in pipelined approaches.
no code implementations • 23 May 2023 • Man Luo, Xin Xu, Zhuyun Dai, Panupong Pasupat, Mehran Kazemi, Chitta Baral, Vaiva Imbrasaite, Vincent Y Zhao
In-context learning (ICL), teaching a large language model (LLM) to perform a task with few-shot demonstrations rather than adjusting the model parameters, has emerged as a strong paradigm for using LLMs.
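The few-shot prompting that ICL relies on amounts to string assembly; a minimal sketch, where the template and demonstrations are illustrative assumptions rather than the paper's actual format:

```python
# Sketch of assembling a k-shot in-context learning prompt: demonstrations
# are rendered with a fixed template, followed by the unanswered query.

def format_demo(x, y):
    """Render one input/output demonstration."""
    return f"Input: {x}\nOutput: {y}"

def build_prompt(demos, query):
    """Concatenate the demonstrations, then the query with a blank output."""
    parts = [format_demo(x, y) for x, y in demos]
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)
```

The model's parameters stay fixed; task adaptation happens entirely through which demonstrations are placed in the prompt, which is why demonstration selection is itself a research question.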
no code implementations • 20 May 2023 • Neeraj Varshney, Mihir Parmar, Nisarg Patel, Divij Handa, Sayantan Sarkar, Man Luo, Chitta Baral
Can state-of-the-art NLP models correctly reason over the contexts of such scenarios?
1 code implementation • 17 May 2023 • Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, Mutsumi Nakamura, Arindam Mitra, Santosh Mashetty, Chitta Baral
In the MTL setting, an instruction-tuned model trained on only 6% of downstream training data achieves SOTA, while using 100% of the training data results in a 3.69-point improvement (ROUGE-L 74.68) over the previous SOTA.
no code implementations • 8 May 2023 • Neeraj Varshney, Himanshu Gupta, Eric Robertson, Bing Liu, Chitta Baral
To initiate systematic research in this important area of 'dealing with novelties', we introduce 'NoveltyTask', a multi-stage task to evaluate a system's performance on pipelined novelty 'detection' and 'accommodation' tasks.
no code implementations • 2 May 2023 • Neeraj Varshney, Chitta Baral
Despite remarkable progress made in natural language processing, even the state-of-the-art models often make incorrect predictions.
no code implementations • 5 Mar 2023 • Kazuaki Kashihara, Kuntal Kumar Pal, Chitta Baral, Robert P Trevino
We propose a method called Next Paragraph Prediction with Instructional Prompting (NPP-IP) to predict thread structures while grounded on the context around posts.
no code implementations • 28 Feb 2023 • Tung Thai, Ming Shen, Mayank Garg, Ayush Kalani, Nakul Vaidya, Utkarsh Soni, Mudit Verma, Sriram Gopalakrishnan, Neeraj Varshney, Chitta Baral, Subbarao Kambhampati, Jivko Sinapov, Matthias Scheutz
Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance.
no code implementations • 20 Feb 2023 • Kuntal Kumar Pal, Kazuaki Kashihara, Ujjwala Anantheswaran, Kirby C. Kuznia, Siddhesh Jagtap, Chitta Baral
We also show that with a few examples, UTS can be adapted to novel unseen tasks and to the nature of the data.
1 code implementation • 16 Feb 2023 • Kevin Scaria, Himanshu Gupta, Siddharth Goyal, Saurabh Arjun Sawant, Swaroop Mishra, Chitta Baral
We introduce InstructABSA, an instruction learning paradigm for Aspect-Based Sentiment Analysis (ABSA) subtasks.
Ranked #1 on Aspect Extraction on SemEval-2014 Task-4
1 code implementation • 9 Feb 2023 • Anjana Arunkumar, Swaroop Mishra, Bhavdeep Sachdeva, Chitta Baral, Chris Bryan
In pursuit of creating better benchmarks, we propose VAIDA, a novel benchmark creation paradigm for NLP, that focuses on guiding crowdworkers, an under-explored facet of addressing benchmark idiosyncrasies.
1 code implementation • 23 Jan 2023 • Pratyay Banerjee, Shweti Mahajan, Kushal Arora, Chitta Baral, Oriana Riva
Along with text, these resources include visual content such as UI screenshots and images of application icons referenced in the text.
1 code implementation • 20 Dec 2022 • Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang
We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image.
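A metric of this kind reduces to checking detector boxes against the relation stated in the prompt. The sketch below is our illustrative reading of such a check (centroid comparison in image coordinates), not the released VISOR code:

```python
# Hedged sketch of a spatial-relationship check: given bounding boxes for
# the two named objects in a generated image, test whether the relation
# from the text prompt holds between their centroids.

def centroid(box):
    """Center of an (x, y, w, h) box."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)

def relation_holds(box_a, box_b, relation):
    """True if object A sits on the stated side of object B."""
    (ax, ay), (bx, by) = centroid(box_a), centroid(box_b)
    return {
        "left of": ax < bx,
        "right of": ax > bx,
        "above": ay < by,  # image y-axis grows downward
        "below": ay > by,
    }[relation]
```

Averaging this boolean over many prompts and generations yields an accuracy-style score for spatial faithfulness.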
1 code implementation • 7 Dec 2022 • Shailaja Keyur Sampat, Pratyay Banerjee, Yezhou Yang, Chitta Baral
'Actions' play a vital role in how humans interact with the world.
no code implementations • 7 Dec 2022 • Shailaja Keyur Sampat, Pratyay Banerjee, Yezhou Yang, Chitta Baral
'Actions' play a vital role in how humans interact with the world.
no code implementations • 23 Nov 2022 • Neeraj Varshney, Man Luo, Chitta Baral
Compared with the FiD reader, this approach matches its accuracy while utilizing just 18.32% of its reader inference cost, and also outperforms it by achieving up to 55.10% accuracy on NQ Open.
1 code implementation • 7 Nov 2022 • Maitreya Patel, Tejas Gokhale, Chitta Baral, Yezhou Yang
Videos often capture objects, their visible properties, their motion, and the interactions between different objects.
Ranked #1 on Counterfactual Planning on CRIPP-VQA
1 code implementation • 31 Oct 2022 • Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord, Ashish Sabharwal, Peter Clark, Ashwin Kalyan
Mathematical reasoning skills are essential for general-purpose intelligent systems to perform tasks from grocery shopping to climate modeling.
Ranked #1 on Mathematical Reasoning on Lila (OOD)
1 code implementation • 14 Oct 2022 • Himanshu Gupta, Neeraj Varshney, Swaroop Mishra, Kuntal Kumar Pal, Saurabh Arjun Sawant, Kevin Scaria, Siddharth Goyal, Chitta Baral
We show that even state-of-the-art models such as GPT-3, GPT-2, and T5 struggle to answer the feasibility questions correctly.
no code implementations • 14 Oct 2022 • Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral
Inspired by successful quality indices in several domains such as power, food, and water, we take the first step towards a metric by identifying certain language properties that can represent various possible interactions leading to biases in a benchmark.
no code implementations • 14 Oct 2022 • Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral
Evaluation of models on benchmarks is unreliable without knowing the degree of sample hardness; this subsequently overestimates the capability of AI systems and limits their adoption in real world applications.
no code implementations • 14 Oct 2022 • Swaroop Mishra, Bhavdeep Singh Sachdeva, Chitta Baral
Pretrained Transformers (PT) have been shown to achieve better Out-of-Distribution (OOD) robustness than traditional models such as Bag-of-Words (BOW), LSTMs, and Convolutional Neural Networks (CNNs) powered by Word2Vec and GloVe embeddings.
no code implementations • 11 Oct 2022 • Neeraj Varshney, Chitta Baral
Through comprehensive experiments in multiple task settings that differ in the number of models available for cascading (K value), we show that cascading improves both the computational efficiency and the prediction accuracy.
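Confidence-based cascading of the kind studied here can be sketched in a few lines; the models and thresholds below are illustrative stand-ins, not the paper's setup:

```python
# Sketch of model cascading: K models ordered cheapest-first; each answers
# only if its confidence clears a threshold, otherwise the input is
# deferred to the next model. The last model always answers.

def cascade(models, thresholds, x):
    """Return the first sufficiently confident prediction in the cascade.

    models: callables x -> (label, confidence), ordered by increasing cost.
    thresholds: confidence cutoffs for all models except the last.
    """
    for model, tau in zip(models[:-1], thresholds):
        label, conf = model(x)
        if conf >= tau:
            return label
    return models[-1](x)[0]
```

Efficiency gains come from easy inputs being resolved by cheap models, while accuracy can improve because uncertain predictions are overridden by stronger models.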
no code implementations • 10 Oct 2022 • Swaroop Mishra, Anjana Arunkumar, Chitta Baral
We find limitations in AUC; e.g., a model having a higher AUC is not always better at selective answering.
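One common formulation of the AUC being critiqued is the area under the risk-coverage curve; the sketch below is our assumed formulation (lower is better), not necessarily the exact variant the paper analyzes:

```python
# Sketch of risk-coverage AUC for selective answering: rank predictions by
# confidence, then average the cumulative error rate (risk) as coverage
# grows one example at a time.

def risk_coverage_auc(confidences, correct):
    """Mean risk over all coverage levels; 0 for a perfect ranking."""
    order = sorted(range(len(correct)), key=lambda i: -confidences[i])
    errors, risks = 0, []
    for k, i in enumerate(order, start=1):
        errors += 0 if correct[i] else 1
        risks.append(errors / k)  # risk at coverage k/n
    return sum(risks) / len(risks)
```

A single scalar like this can mask where on the coverage axis a model is strong, which is one way two models with the same AUC can behave very differently under a deployed abstention policy.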
no code implementations • 4 Oct 2022 • Man Luo, Shashank Jain, Anchit Gupta, Arash Einolghozati, Barlas Oguz, Debojeet Chatterjee, Xilun Chen, Chitta Baral, Peyman Heidari
Driven by this question, we leverage an indexing-efficient dense retriever (i.e., DrBoost) and introduce a LITE retriever that further reduces the memory of DrBoost.
no code implementations • 15 Jul 2022 • Shailaja Keyur Sampat, Maitreya Patel, Subhasish Das, Yezhou Yang, Chitta Baral
'Actions' play a vital role in how humans interact with the world and enable them to achieve desired goals.
no code implementations • 6 Jul 2022 • Man Luo, Sharad Saxena, Swaroop Mishra, Mihir Parmar, Chitta Baral
To the best of our knowledge, none of TQA datasets exist in the biomedical domain where tables are frequently used to present information.
1 code implementation • 15 Jun 2022 • Tejas Gokhale, Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Chitta Baral, Yezhou Yang
To be successful in single source domain generalization, maximizing diversity of synthesized domains has emerged as one of the most effective strategies.
4 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. 
Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. 
Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. 
Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu
BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.
1 code implementation • 25 May 2022 • Pruthvi Patel, Swaroop Mishra, Mihir Parmar, Chitta Baral
Large Language Models (LMs) have achieved state-of-the-art performance on many Natural Language Processing (NLP) benchmarks.
no code implementations • DeepLo 2022 • Neeraj Varshney, Swaroop Mishra, Chitta Baral
Curriculum learning strategies in prior multi-task learning approaches arrange datasets in a difficulty hierarchy either based on human perception or by exhaustively searching the optimal arrangement.
no code implementations • 1 May 2022 • Mihir Parmar, Swaroop Mishra, Mor Geva, Chitta Baral
In this work, we hypothesize that annotators pick up on patterns in the crowdsourcing instructions, which bias them to write many similar examples that are then over-represented in the collected data.
7 code implementations • 16 Apr 2022 • Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Shailaja Keyur Sampat, Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen, Chitta Baral, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi, Daniel Khashabi
This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions -- training models to follow instructions on a subset of tasks and evaluating them on the remaining unseen ones.
2 code implementations • Findings (NAACL) 2022 • Mihir Parmar, Swaroop Mishra, Mirali Purohit, Man Luo, M. Hassan Murad, Chitta Baral
Recently, instructional prompts have shown significant improvement towards multi-task generalization; however, the effect of instructional prompts and Multi-Task Learning (MTL) has not been systematically studied in the biomedical domain.
no code implementations • ACL 2022 • Swaroop Mishra, Arindam Mitra, Neeraj Varshney, Bhavdeep Sachdeva, Peter Clark, Chitta Baral, Ashwin Kalyan
Given the ubiquitous nature of numbers in text, reasoning with numbers to perform simple calculations is an important skill of AI systems.
1 code implementation • 30 Mar 2022 • Yiran Luo, Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral
We find that the original Who's Waldo dataset compiled for this task contains a large number of biased samples that are solvable simply by heuristic methods; for instance, in many cases the first name in the sentence corresponds to the largest bounding box, or the sequence of names in the sentence corresponds to an exact left-to-right order in the image.
1 code implementation • 17 Mar 2022 • Ravsehaj Singh Puri, Swaroop Mishra, Mihir Parmar, Chitta Baral
However, they can write alternate instructions to represent an instruction task.
1 code implementation • 16 Mar 2022 • Kirby Kuznia, Swaroop Mishra, Mihir Parmar, Chitta Baral
We show that LMs benefit from the summarized version of complicated questions.
no code implementations • Findings (ACL) 2022 • Tejas Gokhale, Swaroop Mishra, Man Luo, Bhavdeep Singh Sachdeva, Chitta Baral
However, the effect of data modification on adversarial robustness remains unclear.
no code implementations • SpaNLP (ACL) 2022 • Man Luo, Kazuma Hashimoto, Semih Yavuz, Zhiwei Liu, Chitta Baral, Yingbo Zhou
Among several interesting findings, it is important to highlight that (1) the generative readers perform better in long context QA, (2) the extractive readers perform better in short context while also showing better out-of-domain generalization, and (3) the encoder of encoder-decoder PrLMs (e.g., T5) turns out to be a strong extractive reader and outperforms the standard choice of encoder-only PrLMs (e.g., RoBERTa).
1 code implementation • ACL 2022 • Neeraj Varshney, Swaroop Mishra, Chitta Baral
Knowledge of questions' difficulty level helps a teacher in several ways, such as estimating students' potential quickly by asking carefully selected questions and improving quality of examination by modifying trivial and hard questions.
no code implementations • Findings (ACL) 2022 • Neeraj Varshney, Swaroop Mishra, Chitta Baral
In order to equip NLP systems with selective prediction capability, several task-specific approaches have been proposed.
no code implementations • 19 Jan 2022 • Man Luo, Arindam Mitra, Tejas Gokhale, Chitta Baral
We show that BM25 and our method can complement each other, and a simple hybrid model leads to further gains in the large corpus setting.
1 code implementation • Findings (ACL) 2022 • Neeraj Varshney, Pratyay Banerjee, Tejas Gokhale, Chitta Baral
Transformer-based models achieve impressive performance on numerous Natural Language Inference (NLI) benchmarks when trained on respective training datasets.
1 code implementation • 15 Oct 2021 • Hong Guan, Chitta Baral
Unlike previous work that simulates data from given probabilities and uses ML algorithms on them, we directly use the Quick Medical Reference (QMR) belief network, and apply Bayesian inference in the inference phase and Bayesian experimental design in the inquiry phase.
1 code implementation • Findings (ACL) 2022 • Tejas Gokhale, Abhishek Chaudhary, Pratyay Banerjee, Chitta Baral, Yezhou Yang
Analysis of vision-and-language models has revealed their brittleness under linguistic phenomena such as paraphrasing, negation, textual entailment, and word substitutions with synonyms or antonyms.
no code implementations • NAACL (ACL) 2022 • Man Luo, Shuguang Chen, Chitta Baral
Furthermore, we propose consistency and similarity constraints to promote the correlation and interaction between passage ranking and sentence selection. The experiments demonstrate that our framework can achieve competitive results with previous systems and outperform the baseline by 28% in terms of exact matching of relevant sentences on the HotpotQA dataset.
no code implementations • 16 Sep 2021 • Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
Our experiments compare the zero-shot and few-shot performance of LMs prompted with reframed instructions on 12 NLP tasks across 6 categories.
1 code implementation • Findings (EMNLP) 2021 • Kuntal Kumar Pal, Chitta Baral
Some possible reasons can be the tokenizers and pre-training objectives which are not specifically designed to learn and preserve numeracy.
1 code implementation • EMNLP 2021 • Man Luo, Yankai Zeng, Pratyay Banerjee, Chitta Baral
The visual retriever aims to retrieve relevant knowledge, and the visual reader seeks to predict answers based on given knowledge.
no code implementations • ICCV 2021 • Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral
In this work, we evaluate the faithfulness of V&L models to such geometric understanding, by formulating the prediction of pair-wise relative locations of objects as a classification as well as a regression task.
no code implementations • 1 Jul 2021 • Neeraj Varshney, Swaroop Mishra, Chitta Baral
However, our task leaves a significant challenge for NLP researchers to further improve OOD performance at each stage.
no code implementations • AKBC 2021 • Pratyay Banerjee, Swaroop Mishra, Kuntal Kumar Pal, Arindam Mitra, Chitta Baral
Two common approaches to this are (i) Use of well-structured commonsense present in knowledge graphs, and (ii) Use of progressively larger transformer language models.
1 code implementation • NAACL 2021 • Shailaja Keyur Sampat, Akshay Kumar, Yezhou Yang, Chitta Baral
Most existing research on visual question answering (VQA) is limited to information explicitly present in an image or a video.
1 code implementation • Findings (ACL) 2021 • Kuntal Kumar Pal, Kazuaki Kashihara, Pratyay Banerjee, Swaroop Mishra, Ruoyu Wang, Chitta Baral
We must read the whole text to identify the relevant information or identify the instruction flows to complete a task, which is prone to failures.
no code implementations • ACL 2021 • Ming Shen, Pratyay Banerjee, Chitta Baral
In this work, we propose Masked Noun-Phrase Prediction (MNPP), a pre-training strategy to tackle pronoun resolution in a fully unsupervised setting.
3 code implementations • ACL 2022 • Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi
Using this meta-dataset, we measure cross-task generalization by training models on seen tasks and measuring generalization to the remaining unseen ones.
1 code implementation • 13 Apr 2021 • Shailaja Keyur Sampat, Akshay Kumar, Yezhou Yang, Chitta Baral
Most existing research on visual question answering (VQA) is limited to information explicitly present in an image or a video.
no code implementations • EACL 2021 • Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, Chitta Baral
GQA (Hudson and Manning, 2019) is a dataset for real-world visual reasoning and compositional question answering.
1 code implementation • 28 Mar 2021 • Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, Chitta Baral
GQA (Hudson and Manning, 2019) is a dataset for real-world visual reasoning and compositional question answering.
no code implementations • 23 Mar 2021 • Pratyay Banerjee, Kuntal Kumar Pal, Fish Wang, Chitta Baral
Inspired by recent advances in natural language processing, we propose a novel solution to infer variable names in decompiled code based on Masked Language Modeling, Byte-Pair Encoding, and neural architectures such as Transformers and BERT.
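Byte-Pair Encoding, one of the components named above, splits rare identifiers into frequent subword units. The following is a minimal sketch of the standard BPE merge-learning loop, not the paper's implementation; the toy word-frequency input is an assumption for illustration.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn BPE merges from a word-frequency dict.

    `words` maps strings to counts; each word starts as a tuple of
    characters, and the most frequent adjacent symbol pair is merged
    into one symbol, repeatedly. Returns the list of learned merges.
    """
    vocab = {tuple(w): c for w, c in words.items()}
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for sym, freq in vocab.items():
            for a, b in zip(sym, sym[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the vocabulary with the chosen pair merged.
        new_vocab = {}
        for sym, freq in vocab.items():
            out, i = [], 0
            while i < len(sym):
                if i + 1 < len(sym) and (sym[i], sym[i + 1]) == best:
                    out.append(sym[i] + sym[i + 1])
                    i += 2
                else:
                    out.append(sym[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges
```

Applied to identifiers harvested from code, the learned merges let a masked language model score variable-name candidates at the subword level rather than treating every identifier as an unknown token.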
no code implementations • NAACL 2021 • Pratyay Banerjee, Tejas Gokhale, Chitta Baral
Recent work on unsupervised question answering has shown that models can be trained with procedurally generated question-answer pairs and can achieve performance competitive with supervised methods.
no code implementations • 17 Dec 2020 • Pratyay Banerjee, Chitta Baral, Man Luo, Arindam Mitra, Kuntal Pal, Tran C. Son, Neeraj Varshney
A recent work has shown that transformers are able to "reason" with facts and rules in a limited setting where the rules are natural language expressions of conjunctions of conditions implying a conclusion.
no code implementations • Findings (ACL) 2021 • Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral
Methodologies for training visual question answering (VQA) models assume the availability of datasets with human-annotated \textit{Image-Question-Answer} (I-Q-A) triplets.
3 code implementations • 3 Dec 2020 • Tejas Gokhale, Rushil Anirudh, Bhavya Kailkhura, Jayaraman J. Thiagarajan, Chitta Baral, Yezhou Yang
While this deviation may not be exactly known, its broad characterization is specified a priori, in terms of attributes.
1 code implementation • NeurIPS 2020 • Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Stefan Lee, Chitta Baral, Heni Ben Amor
Imitation learning is a popular approach for teaching motor skills to robots.
2 code implementations • EMNLP 2020 • Tejas Gokhale, Pratyay Banerjee, Chitta Baral, Yezhou Yang
In this paper, we present MUTANT, a training paradigm that exposes the model to perceptually similar, yet semantically distinct mutations of the input, to improve OOD generalization, such as the VQA-CP challenge.
no code implementations • 3 Sep 2020 • Samarth Rawal, Chitta Baral
Information Retrieval (IR) is the task of obtaining pieces of data (such as documents or snippets of text) that are relevant to a particular query or need from a large repository of information.
no code implementations • RepL4NLP (ACL) 2022 • Neeraj Varshney, Swaroop Mishra, Chitta Baral
In (IID, OOD) settings, we show that the representations learned by our calibrator result in an improvement of (15.81%, 5.64%) and (6.19%, 13.9%) over 'MaxProb' -- a selective prediction baseline -- on NLI and DD tasks respectively.
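The MaxProb baseline mentioned above answers only when the model's highest class probability clears a confidence threshold, and abstains otherwise. A minimal sketch of that rule and the resulting coverage/risk trade-off, assuming a simple list-of-probabilities interface:

```python
def maxprob_selective(probs, threshold):
    """MaxProb selective prediction: return the argmax class if its
    probability clears `threshold`, else abstain (return None)."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


def coverage_and_risk(prob_batches, labels, threshold):
    """Coverage = fraction of examples answered;
    risk = error rate among the answered examples."""
    answered, wrong = 0, 0
    for probs, y in zip(prob_batches, labels):
        pred = maxprob_selective(probs, threshold)
        if pred is not None:
            answered += 1
            wrong += int(pred != y)
    coverage = answered / len(labels)
    risk = wrong / answered if answered else 0.0
    return coverage, risk
```

A learned calibrator replaces the raw max-probability with a trained confidence score, but is evaluated with the same coverage/risk curve.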
no code implementations • 10 Aug 2020 • Swaroop Mishra, Anjana Arunkumar, Bhavdeep Sachdeva, Chris Bryan, Chitta Baral
A "state-of-the-art" model A surpasses humans on a benchmark B, but fails on similar benchmarks C, D, and E. What does B have that the other benchmarks do not?
no code implementations • 14 Jul 2020 • Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral
In order to stop the inflation in model performance -- and thus overestimation in AI systems' capabilities -- we propose a simple and novel evaluation metric, WOOD Score, that encourages generalization during evaluation.
no code implementations • 18 May 2020 • Swaroop Mishra, Arindam Mitra, Neeraj Varshney, Bhavdeep Sachdeva, Chitta Baral
However, there exists a strong need for a benchmark that can evaluate models on question-format-independent numerical reasoning, because (i) the numerical reasoning capabilities we want to teach are not controlled by question formats, and (ii) for numerical reasoning technology to have the best possible application, it must be able to process language and reason in a way that is not exclusive to a single format, task, dataset, or domain.
1 code implementation • 2 May 2020 • Swaroop Mishra, Anjana Arunkumar, Bhavdeep Sachdeva, Chris Bryan, Chitta Baral
The data creation paradigm consists of several data visualizations to help data creators (i) understand the quality of data and (ii) visualize the impact of the created data instance on the overall quality.
no code implementations • EMNLP 2020 • Pratyay Banerjee, Chitta Baral
The aim of all Question Answering (QA) systems is to be able to generalize to unseen questions.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Shailaja Keyur Sampat, Yezhou Yang, Chitta Baral
Understanding images and text together is an important aspect of cognition and building advanced Artificial Intelligence (AI) systems.
no code implementations • 7 Apr 2020 • Pratyay Banerjee, Chitta Baral
Open Domain Question Answering requires systems to retrieve external knowledge and perform multi-hop reasoning by composing knowledge spread over multiple sentences.
2 code implementations • EMNLP 2020 • Zhiyuan Fang, Tejas Gokhale, Pratyay Banerjee, Chitta Baral, Yezhou Yang
In videos that involve active agents such as humans, the agent's actions can bring about myriad changes in the scene.
no code implementations • 6 Mar 2020 • Chitta Baral, Pratyay Banerjee, Kuntal Kumar Pal, Arindam Mitra
The challenges inspired by Winograd's councilmen example, and recent developments such as the Rebooting AI book, various NLQA datasets, research on knowledge acquisition in the NLQA context, and their use in various NLQA models have brought the issue of NLQA using "reasoning" with external knowledge to the forefront.
no code implementations • ECCV 2020 • Tejas Gokhale, Pratyay Banerjee, Chitta Baral, Yezhou Yang
We propose our Lens of Logic (LOL) model, which uses question-attention and logic-attention to understand logical connectives in the question, and a novel Fréchet-Compatibility Loss, which ensures that the answers of the component questions and the composed question are consistent with the inferred logical operation.
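The consistency requirement above has a simple logical core for yes/no VQA: the answer to a question composed with a connective must equal the logical combination of the component answers. A minimal sketch of that check (illustrative only; the paper enforces it via a differentiable loss, not this boolean test):

```python
def compose_answer(op, ans1, ans2):
    """Expected yes/no answer to a composed question, given the
    component answers (True = 'yes', False = 'no')."""
    if op == "and":
        return ans1 and ans2
    if op == "or":
        return ans1 or ans2
    raise ValueError(f"unknown connective: {op}")


def is_consistent(op, ans1, ans2, composed_pred):
    """Does the model's answer to the composed question agree with
    the logical combination of its component answers?"""
    return composed_pred == compose_answer(op, ans1, ans2)
```

For example, if a model answers "yes" to "Is there a dog?" and "no" to "Is there a cat?", consistency requires "no" for "Is there a dog and a cat?".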
no code implementations • 26 Nov 2019 • Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Chitta Baral, Heni Ben Amor
In this work we propose a novel end-to-end imitation learning approach which combines natural language, vision, and motion information to produce an abstract representation of a task, which in turn is used to synthesize specific motion controllers at run-time.
no code implementations • 10 Nov 2019 • Pratyay Banerjee, Kuntal Kumar Pal, Murthy Devarakonda, Chitta Baral
In this work, we formulate the NER task as a multi-answer knowledge guided QA task (KGQA) which helps to predict entities only by assigning B, I and O tags without associating entity types with the tags.
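Decoding the type-free B/I/O tags mentioned above into entity spans is a standard routine; the sketch below shows one common convention (`B` starts a span, `I` extends it, `O` closes it) and is an illustration, not the paper's code.

```python
def bio_to_spans(tags):
    """Decode a B/I/O tag sequence into (start, end) token spans,
    with `end` exclusive. Entity types are omitted, matching a setup
    where the tags themselves carry no type information."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # close the previous span
                spans.append((start, i))
            start = i                  # open a new span
        elif tag == "O":
            if start is not None:
                spans.append((start, i))
                start = None
        # "I" simply extends the currently open span
    if start is not None:
        spans.append((start, len(tags)))
    return spans
```

In the KGQA formulation, the entity type comes from the question rather than the tag, so one tagged sequence can be decoded per type-specific question.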
no code implementations • 25 Sep 2019 • Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Chitta Baral, Heni Ben Amor
In this work we propose a novel end-to-end imitation learning approach which combines natural language, vision, and motion information to produce an abstract representation of a task, which in turn can be used to synthesize specific motion controllers at run-time.
no code implementations • 19 Sep 2019 • Arindam Mitra, Pratyay Banerjee, Kuntal Kumar Pal, Swaroop Mishra, Chitta Baral
Recently several datasets have been proposed to encourage research in Question Answering domains where commonsense knowledge is expected to play an important role.
no code implementations • 9 Aug 2019 • Arindam Mitra, Chitta Baral, Aurgho Bhattacharjee, Ishan Shrivastava
Qualitative relationships describe how increasing or decreasing one property (e.g., altitude) affects another (e.g., temperature).
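The core of such qualitative reasoning can be sketched as sign propagation: each influence is a polarity (+1 for "move together", -1 for "move oppositely"), and chained influences multiply. This is a minimal illustration under that assumption, not the paper's model.

```python
def compose(*polarities):
    """Net polarity of a chain of qualitative influences:
    the product of the link polarities (two negatives cancel)."""
    result = 1
    for p in polarities:
        result *= p
    return result


def effect(change, polarity):
    """Direction of the affected property ('up'/'down'), given the
    direction of the cause and the polarity of the relationship."""
    sign = 1 if change == "up" else -1
    return "up" if sign * polarity > 0 else "down"
```

For instance, with polarity -1 between altitude and temperature, increasing altitude yields a "down" prediction for temperature.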
no code implementations • WS 2019 • Samarth Rawal, Siddharth Rawal, Saadat Anwar, Chitta Baral
Analyzing social media posts can offer insights into a wide range of topics that are commonly discussed online, providing valuable information for studying various health-related phenomena reported there.
no code implementations • ACL 2019 • Pratyay Banerjee, Kuntal Kumar Pal, Arindam Mitra, Chitta Baral
Open book question answering is a type of natural language based QA (NLQA) where questions are expected to be answered with respect to a given set of open book facts, and common knowledge about a topic.
Ranked #26 on Question Answering on OpenBookQA
no code implementations • ACL 2019 • Ashok Prakash, Arpit Sharma, Arindam Mitra, Chitta Baral
Our end-to-end system built in such a manner improves on the accuracy of two of the available language model based approaches by 5.53% and 7.7% respectively.
no code implementations • 24 Jun 2019 • Somak Aditya, Yezhou Yang, Chitta Baral
Deep learning based data-driven approaches have been successfully applied in various image understanding applications ranging from object recognition, semantic segmentation to visual question answering.
no code implementations • 28 May 2019 • Tejas Gokhale, Shailaja Sampat, Zhiyuan Fang, Yezhou Yang, Chitta Baral
The process of identifying changes or transformations in a scene along with the ability of reasoning about their causes and effects, is a key aspect of intelligence.
1 code implementation • 1 May 2019 • Arindam Mitra, Peter Clark, Oyvind Tafjord, Chitta Baral
While in recent years machine learning (ML) based approaches have been the popular approach in developing end-to-end question answering systems, such systems often struggle when additional knowledge is needed to correctly answer the questions.
no code implementations • 22 Apr 2019 • Arindam Mitra, Ishan Shrivastava, Chitta Baral
We present two new datasets and a novel attention mechanism for Natural Language Inference (NLI).
no code implementations • 26 Feb 2019 • Samarth Rawal, Ashok Prakash, Soumya Adhya, Sidharth Kulkarni, Saadat Anwar, Chitta Baral, Murthy Devarakonda
To help automate the process, National NLP Clinical Challenges (N2C2) conducted a shared challenge by defining 13 criteria for clinical trial cohort selection and by providing training and test datasets.
no code implementations • 10 Dec 2018 • Somak Aditya, Rudra Saha, Yezhou Yang, Chitta Baral
We propose a framework that combines recent advances in knowledge distillation (teacher-student framework), relational reasoning and probabilistic logical languages to incorporate such knowledge in existing neural networks for the task of Visual Question Answering.
no code implementations • 23 Mar 2018 • Somak Aditya, Yezhou Yang, Chitta Baral
Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image.
1 code implementation • 22 Feb 2018 • Arindam Mitra, Chitta Baral
This paper is under consideration for acceptance in
no code implementations • 17 Nov 2016 • Somak Aditya, Yezhou Yang, Chitta Baral, Yiannis Aloimonos
We compile a dataset of over 3k riddles where each riddle consists of 4 images and a groundtruth answer.
no code implementations • 10 Nov 2015 • Somak Aditya, Yezhou Yang, Chitta Baral, Cornelia Fermuller, Yiannis Aloimonos
Specifically, commonsense reasoning is applied on (a) detections obtained from existing perception methods on given images, (b) a "commonsense" knowledge base constructed using natural language processing of image annotations and (c) lexical ontological knowledge from resources such as WordNet.
no code implementations • 6 Nov 2015 • Chitta Baral, Gregory Gelfond, Enrico Pontelli, Tran Cao Son
It also allows the specification of agents' dynamic awareness of action occurrences, which has future implications on what agents know about the world and other agents' knowledge about the world.
no code implementations • 19 Jun 2013 • Chitta Baral, Nguyen H. Vo
The broader goal of our research is to formulate answers to why and how questions with respect to knowledge bases, such as AURA.
no code implementations • 15 Jun 2013 • Saadat Anwar, Chitta Baral, Katsumi Inoue
Answering realistic questions about biological systems and pathways similar to the ones used by text books to test understanding of students about biological systems is one of our long term research goals.
no code implementations • 15 Jun 2013 • Saadat Anwar, Chitta Baral, Katsumi Inoue
However, we need to make extensions to the Petri Net model and also reason with multiple simulation runs and parallel state evolutions.