The Race to Increase Context Length

What is Context Length?

When talking about large language models (LLMs), we use the term “context” to refer to the information that the model can take into account when generating new text or predictions. The thing that allows ChatGPT to remember all the interactions that you’ve had in a particular is the context. Each time you submit a follow up question or instruction, the entire history of your conversation is submitted as context for the LLM to take into account. And ChatGPT, just like every LLM, has an upper bound to the amount of context it can accept, usually represented as a number of tokens. We call this bound its “context length” or “context window length”.

Language models have always had a limitations on the amount of context they could accept. However, recent advancements have led to a steady increase in the context length supported by new versions of popular LLMs, creating new opportunities to leverage LLMs to address more use cases. By incorporating this expanded context, language models can better capture relationships across longer bodies of text, and in turn do a better job at understanding more complex inputs.

In this article, we will delve deeper into the significance of context length in language models, explore the challenges involved in increasing it, and examine the potential benefits for businesses and technology professionals. By understanding the importance of context length, we can grasp the transformative impact it has on the capabilities of language models and its potential to revolutionize various industries.

The Need for Longer Context Length

To truly understand human language and generate meaningful responses, language models rely heavily on context. Language is rich in meaning and is often influenced by the surrounding words, phrases, and sentences. The context provides crucial information for disambiguation, understanding idiomatic expressions, resolving pronoun references, and grasping the overall meaning of a text or conversation.

The limitation of shorter context lengths in older language models restricted their ability to capture the full depth and breadth of context. This limitation often resulted in models providing responses that were out of context or lacking in coherence. For example, a model with a shorter context might struggle to understand the meaning of a pronoun that refers to a distant noun mentioned several sentences ago. This can lead to incorrect or nonsensical responses.

This man cannot believe the garbage answer ChatGPT-2 just provided due to lack of context.

By increasing the context length, language models can better understand and interpret the relationship between words and phrases within a broader context. They can consider more relevant information and capture the dependencies and associations between different parts of a text. This enables them to generate more accurate and contextually appropriate responses.

Longer context length is particularly important in tasks such as natural language understanding, sentiment analysis, and machine translation. In these tasks, a deeper understanding of context is essential to correctly interpret the meaning, sentiment, or nuances of a text. For example, in sentiment analysis, longer context can help discern the underlying sentiment in a longer review or social media post, taking into account the full context and not just individual words or phrases.

Moreover, longer context length is crucial for tasks that involve multi-turn conversations. In dialogue systems or chatbots, understanding the conversation history is vital for generating coherent and relevant responses. With longer context, language models can better track the flow of the conversation, maintain proper context, and provide more interactive and engaging dialogue.

In short, increasing the context length in language models is essential for capturing the full richness and complexity of human language. By considering a broader context, these models can improve their understanding, generate more accurate responses, and ensure coherence in their interactions. The ability to comprehend and respond to language in a more contextual manner has significant implications for various applications, including customer support, content generation, sentiment analysis, and more.

Racing to Increase Context Length

Across the board, popular LLMs are all increasing the context window length with each. new version they release. Open AI doubled the context window from 1024 to 2048 tokens when it launched GPT-3, then jumped up to 8,000 tokens for GPT-4 with an extended context option that is capable of up to 32,000 tokens.

Llama2, with it’s permissive and comparatively open model supports 4,096 while PaLM from Google has jumped to an 8,000 token context window with its most recent release.

Not that kind of Llama

This trend shows no signs of slowing down with MosaicML recently announcing that their MPT-7B model can support up to 84k tokens.

Traditionally, language models required extensive pretraining on vast amounts of data to excel at specific tasks. However, with increased context length, foundational models can now address a wider range of use cases without the need for additional pretraining opening up many new possibilities in their use.

Implications for Businesses

Models with expanded context offer a wide range of benefits and new applications for LLMs to be used to deliver real, significant business value.

For instance, in the world of finance, these models with extended context can dive deep into a company’s financial history, spanning years. They spot trends and historical patterns that might have eluded us before. It’s akin to having a financial wizard by your side, pointing out potential risks and foreseeing future performance, helping you make smarter investment choices.

In healthcare, these models with a more expansive view can delve into a patient’s medical journey, analyzing diagnoses, treatments, and outcomes over time. It’s like having a virtual medical detective, connecting the dots between various aspects of a patient’s health, potentially revolutionizing disease diagnosis, treatment recommendations, and outcome predictions.

Moreover, expanded-context LLMs are versatile. They can also consider external factors that get fed in as additional context. In market analysis, for example, the context could contain historical sales data, economic indicators, social trends, and even the moves of competitors. The longer the context, the more data the LLM can take into consideration when generating its response. This panoramic view will enables businesses to make decisions with an edge, like adjusting pricing strategies or targeting specific customer groups, all based on a richer, more holistic understanding of the market.

In a nutshell, expanding the context length of these language models transforms data analysis and decision-making. It’s like turning on the lights in a dark room – suddenly, we see valuable insights and patterns that were previously hidden in the shadows. This leads to more accurate predictions and empowers us to make informed decisions that can shape the future of our endeavors.


The race to increase context length in LLMs has significant implications for businesses and technology professionals. The ability of foundational models to address more use cases without pretraining opens doors for innovation and optimization in various industries. With longer context lengths, businesses can leverage these models to streamline operations, enhance customer experiences, and much, much more. Technology professionals, on the other hand, are empowered to develop more advanced solutions and drive drive this technological innovation. How this race impacts the approaches of fine tunings vs other prompt engineering techniques like RAG and FLARE remains to be seen. However, for now it seems like context is king and you know what they say about playing the Game of Thrones. You win, or you die.

Chris Latimer

Chris Latimer is an experienced technology executive currently serving as the general manager of cloud for DataStax. His product leadership helped shape the Google Cloud API Management products as well as the data product suite powered by Apache Cassandra and Apache Pulsar at DataStax. Chris is based near Boulder, Colorado with his wife and three kids.

Add comment

Leave a Reply

Stay up to date on Gen AI

Subscribe and you'll receive regular updates on the latest happenings in the world of generative AI.

Follow us

Don't be shy, get in touch. We love meeting interesting people and making new friends.

Stay up to date with gen AI

Subscribe and you'll receive regular updates on the latest happenings in the world of generative AI.