viva_tensor/metrics
Advanced Metrics for Quantization
Based on Qwen3-235B's analysis of state-of-the-art algorithms:
- MSE (Mean Squared Error)
- MAE (Mean Absolute Error)
- Cosine Similarity
- SNR (Signal-to-Noise Ratio)
- SQNR (Signal-to-Quantization-Noise Ratio)
- Perplexity Delta (for LLMs)
INSIGHTS FROM QWEN3:
- AWQ: protecting the 1% most salient weights drastically reduces error
- NF4: non-uniform quantiles (normal distribution) outperform uniform ones
- GPTQ: weighting the error by the Hessian improves accuracy
- Flash Attention: online softmax with shifting avoids overflow
Types
Per-layer metrics (for LLMs)
pub type LayerMetrics {
  LayerMetrics(
    layer_name: String,
    metrics: QuantMetrics,
    sensitivity: Float,
  )
}
Constructors
- LayerMetrics(layer_name: String, metrics: QuantMetrics, sensitivity: Float)
Complete quantization metrics
pub type QuantMetrics {
  QuantMetrics(
    mse: Float,
    mae: Float,
    rmse: Float,
    cosine_sim: Float,
    snr_db: Float,
    sqnr_db: Float,
    max_error: Float,
    p99_error: Float,
    outlier_pct: Float,
  )
}
Constructors
- QuantMetrics(mse: Float, mae: Float, rmse: Float, cosine_sim: Float, snr_db: Float, sqnr_db: Float, max_error: Float, p99_error: Float, outlier_pct: Float)
  Arguments
  - mse: Mean Squared Error
  - mae: Mean Absolute Error
  - rmse: Root Mean Squared Error
  - cosine_sim: Cosine Similarity (1.0 = perfect)
  - snr_db: Signal-to-Noise Ratio (dB)
  - sqnr_db: Signal-to-Quantization-Noise Ratio (dB)
  - max_error: Maximum absolute error
  - p99_error: 99th percentile of the error
  - outlier_pct: Percentage of values with error > 1%
Values
pub fn benchmark_metrics() -> Nil
pub fn compute_all(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
) -> QuantMetrics
Computes all metrics in a single call.
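As a rough illustration of what the basic error metrics measure, the sketch below computes MSE, MAE, RMSE and the maximum absolute error on plain `List(Float)` values. It does not use `tensor.Tensor`, since the tensor API is not shown on this page, and every function name ending in `_sketch` is hypothetical rather than part of `viva_tensor/metrics`.

```gleam
import gleam/float
import gleam/int
import gleam/list

/// Element-wise absolute errors |x - q|.
/// list.map2 stops at the shorter list if the lengths differ.
fn abs_errors(original: List(Float), quantized: List(Float)) -> List(Float) {
  list.map2(original, quantized, fn(x, q) { float.absolute_value(x -. q) })
}

/// Arithmetic mean; returns 0.0 for an empty list.
fn mean(values: List(Float)) -> Float {
  case values {
    [] -> 0.0
    _ -> float.sum(values) /. int.to_float(list.length(values))
  }
}

/// Hypothetical list-based MSE: mean of squared errors.
pub fn mse_sketch(original: List(Float), quantized: List(Float)) -> Float {
  abs_errors(original, quantized)
  |> list.map(fn(e) { e *. e })
  |> mean
}

/// Hypothetical list-based MAE: mean of absolute errors.
pub fn mae_sketch(original: List(Float), quantized: List(Float)) -> Float {
  abs_errors(original, quantized) |> mean
}

/// Hypothetical list-based RMSE: square root of the MSE.
pub fn rmse_sketch(original: List(Float), quantized: List(Float)) -> Float {
  case float.square_root(mse_sketch(original, quantized)) {
    Ok(root) -> root
    Error(_) -> 0.0
  }
}

/// Hypothetical list-based worst-case absolute error.
pub fn max_error_sketch(original: List(Float), quantized: List(Float)) -> Float {
  abs_errors(original, quantized) |> list.fold(0.0, float.max)
}
```

With `original = [1.0, 2.0, 3.0, 4.0]` and `quantized = [1.0, 2.0, 3.0, 5.0]`, only the last element differs, so the MSE and MAE are both 0.25, the RMSE is 0.5 and the maximum error is 1.0.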
pub fn compute_saliency(
  weights: tensor.Tensor,
  activations: List(List(Float)),
) -> List(Float)
Computes weight saliency from activations: Salience(w) = Var(activation) * w²
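The saliency formula can be sketched on plain lists as follows. This is a minimal illustration under two assumptions that the page does not confirm: each inner list of `activations` holds the activation samples for the weight at the same position, and the variance is the population variance. `saliency_sketch` and `variance` are hypothetical names.

```gleam
import gleam/float
import gleam/int
import gleam/list

/// Population variance of a list of samples; 0.0 for an empty list.
fn variance(samples: List(Float)) -> Float {
  case samples {
    [] -> 0.0
    _ -> {
      let n = int.to_float(list.length(samples))
      let mu = float.sum(samples) /. n
      let squared = list.map(samples, fn(x) { { x -. mu } *. { x -. mu } })
      float.sum(squared) /. n
    }
  }
}

/// Hypothetical sketch of Salience(w) = Var(activation) * w².
/// Assumes activations[i] are the samples associated with weights[i].
pub fn saliency_sketch(
  weights: List(Float),
  activations: List(List(Float)),
) -> List(Float) {
  list.map2(weights, activations, fn(w, samples) {
    variance(samples) *. w *. w
  })
}
```

For example, a weight of 2.0 whose activations are `[1.0, 3.0]` has variance 1.0 and saliency 4.0, while any weight whose activations are constant gets saliency 0.0.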
pub fn cosine_similarity(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
) -> Float
Cosine similarity: measures direction, not magnitude. 1.0 = same direction, 0.0 = orthogonal, -1.0 = opposite.
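For reference, the standard definition of cosine similarity between the original values x and the quantized values q is given below. How the function treats zero-norm inputs is not stated on this page.

```latex
\cos(x, q) = \frac{\sum_i x_i \, q_i}{\sqrt{\sum_i x_i^2} \, \sqrt{\sum_i q_i^2}}
```

Because both norms appear in the denominator, scaling q by any positive constant leaves the value unchanged, which is why this metric captures direction but not magnitude.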
pub fn error_percentile(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
  percentile: Float,
) -> Float
Error percentile (approximated via sorting)
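One common way to approximate a percentile by sorting is the nearest-rank method, sketched below on a list of precomputed absolute errors. It assumes `percentile` is given on a 0 to 100 scale, which this page does not specify, and the library may use a different interpolation rule. `percentile_sketch` is a hypothetical name.

```gleam
import gleam/float
import gleam/int
import gleam/list

/// Hypothetical nearest-rank percentile over a list of errors.
/// Assumes `percentile` is in the range 0.0 to 100.0.
/// Returns 0.0 for an empty list.
pub fn percentile_sketch(errors: List(Float), percentile: Float) -> Float {
  let sorted = list.sort(errors, by: float.compare)
  let n = list.length(sorted)
  // Nearest rank: ceil(p / 100 * n), converted to a 0-based index.
  let rank =
    float.truncate(float.ceiling(percentile /. 100.0 *. int.to_float(n)))
  let index = int.clamp(rank - 1, min: 0, max: int.max(n - 1, 0))
  case list.drop(sorted, index) {
    [value, ..] -> value
    [] -> 0.0
  }
}
```

For the errors `[0.4, 0.1, 0.3, 0.2]`, the 50th percentile under this rule is 0.2 and the 99th percentile is 0.4.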
pub fn find_salient_weights(
  saliency: List(Float),
  top_pct: Float,
) -> List(Int)
Identifies the top K% most salient weights
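A top-K% selection can be sketched by pairing each saliency score with its position, sorting in descending order and keeping the first k positions. The sketch assumes that the returned `List(Int)` holds weight indices and that `top_pct` is on a 0 to 100 scale. Neither is confirmed on this page, and `top_salient_sketch` is a hypothetical name.

```gleam
import gleam/float
import gleam/int
import gleam/list

/// Pairs each value with its 0-based position.
fn with_index(values: List(Float), index: Int) -> List(#(Int, Float)) {
  case values {
    [] -> []
    [first, ..rest] -> [#(index, first), ..with_index(rest, index + 1)]
  }
}

/// Hypothetical sketch: indices of the top `top_pct` percent of scores,
/// ordered from most to least salient. Assumes top_pct is in 0.0 to 100.0.
pub fn top_salient_sketch(saliency: List(Float), top_pct: Float) -> List(Int) {
  let n = list.length(saliency)
  // Number of weights to keep, rounded up.
  let k = float.truncate(float.ceiling(top_pct /. 100.0 *. int.to_float(n)))
  with_index(saliency, 0)
  |> list.sort(by: fn(a: #(Int, Float), b: #(Int, Float)) {
    // Arguments are swapped to sort in descending order of score.
    float.compare(b.1, a.1)
  })
  |> list.take(k)
  |> list.map(fn(pair: #(Int, Float)) { pair.0 })
}
```

With scores `[0.1, 0.9, 0.5, 0.3]` and `top_pct = 50.0`, the sketch keeps two weights and returns `[1, 2]`.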
pub fn mae(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
) -> Float
MAE: Mean Absolute Error
pub fn max_error(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
) -> Float
Max Error: the worst-case absolute error
pub fn mse(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
) -> Float
MSE: Mean Squared Error
pub fn outlier_percentage(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
  threshold: Float,
) -> Float
Percentage of outliers (error > threshold)
pub fn rmse(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
) -> Float
RMSE: Root Mean Squared Error
pub fn snr_db(
  original: tensor.Tensor,
  quantized: tensor.Tensor,
) -> Float
SNR: Signal-to-Noise Ratio in dB. SNR = 10 * log10(signal_power / noise_power)
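The page does not define the two power terms. Under the usual convention, assumed here, the signal power is the mean square of the original values x and the noise power is the mean square of the quantization error x - q, which is the MSE:

```latex
\mathrm{SNR}_{\mathrm{dB}}
  = 10 \log_{10} \frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}},
\qquad
P_{\mathrm{signal}} = \frac{1}{n} \sum_{i=1}^{n} x_i^2,
\qquad
P_{\mathrm{noise}} = \frac{1}{n} \sum_{i=1}^{n} (x_i - q_i)^2
```

As a worked example, a signal power of 1.0 with a noise power of 0.001 gives 10 * log10(1000) = 30 dB.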
pub fn theoretical_sqnr(bits: Int) -> Float
Theoretical SQNR (Signal-to-Quantization-Noise Ratio) for N bits: SQNR = 6.02 * N + 1.76 dB
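Evaluating the formula for a few common bit widths gives a feel for the scale. Each additional bit adds about 6.02 dB:

```latex
\begin{aligned}
N = 4:  &\quad 6.02 \times 4 + 1.76 = 25.84 \ \mathrm{dB} \\
N = 8:  &\quad 6.02 \times 8 + 1.76 = 49.92 \ \mathrm{dB} \\
N = 16: &\quad 6.02 \times 16 + 1.76 = 98.08 \ \mathrm{dB}
\end{aligned}
```

The formula is the classic result for a full-scale sinusoid passed through an ideal uniform quantizer, so measured SQNR on real weight distributions will generally differ from it.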