Temperature is a parameter injected into the softmax function that lets users manipulate the output probabilities. It helps us control the creativity of a Large Language Model.

The range of the temperature parameter is defined as 0 to 1 in the OpenAI documentation [1]. For Cohere, temperature values fall within the range of 0 to 5 [2]. See the references below.


This is the original softmax function:

σ(z)_j = exp(z_j) / Σ_k exp(z_k)

When we add the temperature parameter T, each logit is divided by T before the exponentiation:

σ(z)_j = exp(z_j / T) / Σ_k exp(z_k / T)

Remember that z_j is the output of the neural network (a logit): it is a floating-point number. If you want to learn more about the softmax function, read here.
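The formula above can be sketched in a few lines of Python (a minimal implementation of softmax with temperature, using only the standard library):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature parameter.

    Each logit z_j is divided by the temperature T before
    exponentiation: T < 1 sharpens the distribution,
    T > 1 flattens it toward uniform.
    """
    scaled = [z / temperature for z in logits]
    # Subtract the max for numerical stability before exponentiating.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Note that at T = 1 this reduces to the original softmax, since dividing by 1 leaves the logits unchanged.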


  • As the temperature approaches 0, the output distribution becomes sharper: the probability of the most likely token approaches 1.
  • As the temperature increases, the output distribution becomes flatter (closer to uniform), reducing the differences between the probabilities of different tokens.

If we want repetitive, deterministic answers with no creativity at all, we can decrease the temperature. If we want more creative answers, we can increase it.


Here is an example. Imagine our corpus contains only 5 words: [“donut”, “cake”, “apple”, “juice”, “book”]

The next-token prediction for the sentence “At the table, there is a delicious” will be one of the words in the corpus.

These are the original results:

You can try different temperature values to see how the output changes.
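As a sketch of such an experiment, the snippet below applies softmax with three different temperatures to the 5-word corpus. The logit values are made up for illustration (they are not the actual model outputs from the original example):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature: T < 1 sharpens, T > 1 flattens."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

corpus = ["donut", "cake", "apple", "juice", "book"]
logits = [2.0, 2.5, 1.0, 0.5, -1.0]  # hypothetical logits, for illustration only

for t in (0.2, 1.0, 5.0):
    probs = softmax(logits, temperature=t)
    print(f"T={t}: " + ", ".join(f"{w}={p:.3f}" for w, p in zip(corpus, probs)))
```

With these assumed logits, a low temperature (0.2) pushes almost all probability onto “cake”, while a high temperature (5.0) spreads probability much more evenly, even giving “book” a realistic chance of being sampled.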

Listen to my YouTube video here.

Happy Learning! :)

References

  • [1] https://platform.openai.com/docs/api-reference/audio/createTranscription#audio/createTranscription-temperature
  • [2] https://txt.cohere.com/llm-parameters-best-outputs-language-ai/
  • [3] https://peterchng.com/blog/2023/05/02/token-selection-strategies-top-k-top-p-and-temperature/