From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples (arxiv.org)
Language Models, Regression Analysis, In-Context Learning, Data Contamination, Non-Transformer Models, Online Learning
This study investigated how large language models (LLMs) like Claude 3 and GPT-4 perform on regression tasks using in-context examples without additional training. It showed that LLMs could carry out both linear and non-linear regression, matching or even outperforming supervised learning methods. The study also explored the impact of dataset familiarity and potential data contamination, finding that explicit knowledge of the dataset name did not significantly affect the LLMs’ performance. Additional investigations included the performance of non-transformer models and the effectiveness of LLMs on non-numerical regression tasks. The results suggest that LLMs are versatile tools capable of understanding and applying mathematical concepts learned during their pre-training phase.
Main Points
- LLMs are capable of regression tasks using in-context examples. LLMs can perform both linear and non-linear regression without being specifically trained for it, rivaling traditional supervised methods such as Linear Regression and Gradient Boosting.
- Performance of LLMs improves with more in-context examples. Adding in-context examples increases the performance of LLMs, with highly capable models such as Claude 3 and GPT-4 approaching near-optimal decision quality over time.
- Data contamination concerns are addressed by showing unchanged performance whether or not LLMs 'know' the dataset. An experiment comparing models' performance with and without knowledge of the dataset name showed no significant difference.
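The in-context setup described above can be sketched as follows: each (x, y) example is serialized into a plain-text prompt, and the model is asked to complete the output for a new input. The prompt template here is a hypothetical illustration (the paper's exact format may differ), and the ordinary-least-squares fit stands in for the supervised baselines the study compares against.

```python
# Minimal sketch of in-context regression, assuming a simple
# "Input: ... / Output: ..." prompt format (hypothetical, for illustration).
import numpy as np

rng = np.random.default_rng(0)

# Hidden target function the in-context examples are drawn from: y = 3x + 2.
xs = rng.uniform(-5, 5, size=20)
ys = 3 * xs + 2 + rng.normal(0, 0.1, size=20)

def build_prompt(xs, ys, x_query):
    """Serialize (x, y) pairs into a text prompt; the LLM would be asked
    to complete the final 'Output:' line."""
    lines = [f"Input: {x:.2f}\nOutput: {y:.2f}" for x, y in zip(xs, ys)]
    lines.append(f"Input: {x_query:.2f}\nOutput:")
    return "\n".join(lines)

prompt = build_prompt(xs, ys, x_query=1.5)

# Supervised baseline for comparison: ordinary least-squares linear regression
# fit on the same examples the LLM sees in context.
A = np.column_stack([xs, np.ones_like(xs)])
slope, intercept = np.linalg.lstsq(A, ys, rcond=None)[0]
baseline_pred = slope * 1.5 + intercept
print(f"baseline prediction: {baseline_pred:.2f}")  # close to 3*1.5 + 2 = 6.5
```

The paper's claim is that, given such a prompt, strong LLMs complete the final line with a value competitive with the least-squares prediction, despite receiving no gradient updates.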