Entropix Sampling
Oct 2024
An entropy-based sampling approach by xjdr: build a per-token sampling strategy for an LLM using information about the distribution of the logits¹.
Entropy and Var-entropy
- Entropy tells you how uncertain the model is, on average, about the next token.
- Var-entropy tells you how much that uncertainty varies across the candidate tokens, i.e. the spread of the per-token surprisal around the entropy.
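In symbols (matching what the reference implementation below computes, with everything converted to bits), for a next-token distribution p over the vocabulary:

H(p) = -\sum_i p_i \log_2 p_i, \qquad \mathrm{VarEnt}(p) = \sum_i p_i \left(\log_2 p_i + H(p)\right)^2

That is, var-entropy is the variance of the surprisal -\log_2 p_i under p; it is zero when every candidate token is exactly equally surprising.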
Entropix Sampling Approach
The sampling behavior is chosen per token from the levels of entropy and var-entropy; a simplified dispatch is sketched after the list.
- (⬇️ entropy, ⬇️ var-entropy) High confidence: the model returns the highest-probability token (greedy argmax).
- (⬆️ entropy, ⬇️ var-entropy) Consistently unsure: it either backspaces and resamples to get back on track, or emits an EOT token to prevent hallucination.
- (⬇️ entropy, ⬆️ var-entropy) Confident in multiple paths: it branches out, explores, and returns the most confident path.
- (⬆️ entropy, ⬆️ var-entropy) Randomness needed: the temperature is raised sharply and top_p is decreased to prevent gibberish.
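A minimal sketch of that dispatch, assuming un-batched logits for a single position; the thresholds, temperatures, and the function name sample_token are illustrative assumptions, and the backspacing/branching strategies are reduced to simple resampling:

import jax
import jax.numpy as jnp

ENT_THRESH = 2.0     # bits; assumed cut-off between "low" and "high" entropy
VARENT_THRESH = 2.0  # assumed cut-off for var-entropy

def sample_token(logits, entropy, varentropy, key):
    # logits: (vocab_size,); entropy and varentropy: scalars for this position
    low_ent = bool(entropy < ENT_THRESH)
    low_var = bool(varentropy < VARENT_THRESH)
    if low_ent and low_var:
        # Confident: return the highest-probability token
        return jnp.argmax(logits)
    if not low_ent and low_var:
        # Consistently unsure: the post backspaces/resamples or emits EOT;
        # simplified here to a resample at moderate temperature
        return jax.random.categorical(key, logits / 0.7)
    if low_ent and not low_var:
        # Confident on multiple paths: the post branches and keeps the best path;
        # simplified here to one sample from the unmodified distribution
        return jax.random.categorical(key, logits)
    # High entropy and var-entropy: raise temperature (the post also tightens top_p)
    return jax.random.categorical(key, logits / 1.5)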
Reference Implementation
import jax
import jax.numpy as jnp

LN_2 = 0.69314718056  # ln(2), for converting nats to bits

def calculate_varentropy_logsoftmax(logits):
    log_probs = jax.nn.log_softmax(logits, axis=-1)
    probs = jnp.exp(log_probs)
    entropy = -jnp.sum(probs * log_probs, axis=-1) / LN_2  # expected surprisal, in bits
    varentropy = jnp.sum(probs * (log_probs / LN_2 + entropy[..., None]) ** 2, axis=-1)  # variance of surprisal
    return entropy, varentropy
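As a quick check (a made-up five-token vocabulary, not from the original post), a peaked distribution should give low entropy, while a flat one reaches log2(vocab_size) bits with zero var-entropy:

peaked = jnp.array([10.0, 0.0, 0.0, 0.0, 0.0])
flat = jnp.zeros(5)
print(calculate_varentropy_logsoftmax(peaked))  # low entropy, low var-entropy
print(calculate_varentropy_logsoftmax(flat))    # entropy = log2(5) ≈ 2.32 bits, var-entropy = 0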
Footnotes
1. (and attention heads)