BLT's Entropy-based Patcher vs. Tokenizer Visualization

Enter text to see how each of the following methods segments it:

  1. Byte Latent Transformer (BLT): Entropy-based patching plot and the resulting patched text (using blt_main_entropy_100m_512w).
  2. Tiktoken (GPT-4): Text segmented into cl100k_base tokens.
  3. Llama 3: Text segmented by the meta-llama/Meta-Llama-3-8B tokenizer.
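
To make the first method concrete, here is a minimal sketch of entropy-based patching, assuming a simple global-threshold rule: a new patch starts wherever the entropy of the next-byte prediction exceeds a threshold. The helper names, the threshold of 2.0 bits, and the hand-written entropy values are all hypothetical and exist only for illustration. In the demo, the entropies come from the blt_main_entropy_100m_512w model, and its exact boundary rule and threshold are not stated here.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Entropy in bits of a discrete distribution (zero-probability terms are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def patch_starts(entropies: Sequence[float], threshold: float) -> List[int]:
    """Return the byte indices at which a new patch begins.

    Index 0 always starts a patch. Any later index i starts a new patch when
    the entropy of the prediction for byte i is above `threshold`.
    (Assumed rule for this sketch, not necessarily the demo's exact rule.)
    """
    if not entropies:
        return []
    starts = [0]
    for i in range(1, len(entropies)):
        if entropies[i] > threshold:
            starts.append(i)
    return starts


def split_into_patches(
    data: bytes, entropies: Sequence[float], threshold: float
) -> List[bytes]:
    """Cut `data` into patches at the boundaries chosen by `patch_starts`."""
    if len(data) != len(entropies):
        raise ValueError("need exactly one entropy value per byte")
    starts = patch_starts(entropies, threshold)
    ends = starts[1:] + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]


# Hypothetical per-byte entropies for b"hello world": high where the next
# byte is hard to predict (start of a word), low inside a word.
example_bytes = b"hello world"
example_entropies = [3.0, 0.5, 0.4, 0.3, 0.2, 2.5, 2.8, 0.6, 0.4, 0.3, 0.2]
example_patches = split_into_patches(example_bytes, example_entropies, threshold=2.0)
print(example_patches)
```

Unlike the two tokenizers, which look up segments in a fixed vocabulary, this kind of rule places boundaries based on how uncertain the model is at each byte, so the same byte sequence can be split differently depending on its context.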