2024-10-25
Exciting news from @AIatMeta @Meta: the release of quantized Llama 3.2 models marks a significant leap in AI development. With up to 4x faster inference speeds and a 56% reduction in model size, developers can now enjoy enhanced efficiency without sacrificing accuracy. This breakthrough ... ensures quality and safety on resource-constrained devices. Kudos to Meta's collaboration with industry leaders ...
SiliconANGLE
Meta debuts “quantized” versions of Llama 3.2 1B and 3B models, designed to run on low-powered devices and developed in collaboration with Qualcomm and MediaTek
so today we're releasing new quantized versions of Llama 3.2 1B & 3B that deliver up to 2-4x increases in inference speed and, on average, 56% reduction in model size, and 41% redu...
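The size reductions quoted above come from quantization: storing weights in a low-bit integer format instead of float32/float16. As a rough illustration (this is a generic symmetric per-tensor int8 sketch, not Meta's actual Llama 3.2 quantization scheme, which uses more sophisticated techniques such as quantization-aware training), here is how int8 quantization shrinks a weight matrix about 4x relative to float32:

```python
# Hedged sketch: symmetric per-tensor int8 quantization of a weight
# matrix, showing the ~4x size reduction vs. float32 storage.
# Not Meta's pipeline -- just the basic idea behind the numbers.
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float32 weights to int8 using a single scale factor."""
    scale = np.abs(w).max() / 127.0          # map the max weight to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024)).astype(np.float32)
q, scale = quantize_int8(w)

print(f"fp32 size: {w.nbytes / 1e6:.1f} MB")  # 4x the int8 size
print(f"int8 size: {q.nbytes / 1e6:.1f} MB")
max_err = np.abs(w - dequantize(q, scale)).max()
print(f"max abs reconstruction error: {max_err:.4f}")
```

The round-trip error per weight is bounded by half the scale factor, which is why well-chosen quantization schemes can shrink models substantially with little accuracy loss.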