From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples (arxiv.org)
Tags: Language Models, Regression Analysis, In-Context Learning, Data Contamination, Non-Transformer Models, Online Learning
This study investigated how large language models (LLMs) like Claude 3 and GPT-4 perform on regression tasks using in-context examples without additional training. It showed that LLMs could carry out both linear and non-linear regression, matching or even outperforming supervised learning methods. The study also explored the impact of dataset familiarity and potential data contamination, finding that explicit knowledge of the dataset name did not significantly affect the LLMs’ performance. Additional investigations included the performance of non-transformer models and the effectiveness of LLMs on non-numerical regression tasks. The results suggest that LLMs are versatile tools capable of understanding and applying mathematical concepts learned during their pre-training phase.
Main Points
- LLMs are capable of regression tasks using in-context examples. LLMs can perform both linear and non-linear regression without being specifically trained for it, rivaling traditional supervised methods such as Linear Regression and Gradient Boosting (a minimal prompt sketch follows this list).
- Performance of LLMs improves with more in-context examples. Adding in-context examples raises performance, with highly capable models like Claude 3 and GPT-4 approaching near-optimal decision quality over time.
- Data contamination concerns are addressed by showing unchanged performance whether or not LLMs 'know' the dataset. An experiment comparing models' performance with and without knowledge of the dataset name showed no significant difference.
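As a rough illustration of the in-context regression setup summarized above, the sketch below builds a prompt from synthetic (x, y) pairs and asks the model to complete the output for a new input. The "Input:"/"Output:" labels and the synthetic data are assumptions for illustration, not the paper's exact prompt template.

```python
import random

# Minimal sketch of in-context regression: (x, y) pairs are serialized as text
# and the LLM is asked to complete the output for an unseen input.
def build_prompt(pairs, query_x):
    lines = [f"Input: {x:.2f}\nOutput: {y:.2f}" for x, y in pairs]
    lines.append(f"Input: {query_x:.2f}\nOutput:")
    return "\n".join(lines)

# Synthetic linear data: y = 3x + 2 plus a little Gaussian noise.
xs = [random.uniform(0, 10) for _ in range(20)]
pairs = [(x, 3 * x + 2 + random.gauss(0, 0.1)) for x in xs]

prompt = build_prompt(pairs, query_x=4.5)
print(prompt)  # send this to Claude 3 / GPT-4; the completion is the predicted y
```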
The article introduces function calling with Hermes-2-Pro-Mistral-7B, showing how to define functions and tools in the format used by language model APIs. Through detailed examples, it demonstrates how function calling can extend a language model's capabilities by letting it invoke specific, predefined functions.
Main Points
- Function Calling Capabilities: First we will define some functions/tools which the LLM will have access to. Here I use langchain to convert the Python functions into the tools format used by OpenAI. It's much faster than writing those JSON objects by hand. Note that Hermes-2-Pro-Mistral-7B also uses this same format! (A sketch of this conversion is shown below.)
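The article's own code is not reproduced in this summary; the following is a minimal sketch of how such a conversion can look using langchain's `convert_to_openai_tool` helper. The example function `get_stock_price` is a hypothetical stand-in, and the article may use a different helper or set of tools.

```python
from langchain_core.utils.function_calling import convert_to_openai_tool

def get_stock_price(symbol: str) -> float:
    """Return the latest price for the given ticker symbol."""
    # Placeholder body; a real implementation would call a market-data API.
    raise NotImplementedError

# Convert the plain Python function into the OpenAI tools JSON schema.
# Hermes-2-Pro-Mistral-7B expects tool definitions in this same format.
tool_schema = convert_to_openai_tool(get_stock_price)
print(tool_schema["function"]["name"])        # "get_stock_price"
print(tool_schema["function"]["parameters"])  # JSON schema derived from the type hints
```

Passing the resulting schema to the model (instead of hand-writing the JSON) is what makes this approach much faster than authoring tool definitions by hand.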