Orthrus-Qwen3: up to 7.8× tokens/forward on Qwen3, identical output distribution
Fast, lossless LLM inference via dual-view diffusion decoding. - chiennv2000/orthrus