  1. SIMD transposes 1 - The ryg blog

    Jul 9, 2013 · For example, when implementing 2D separable filters, the “vertical” direction (filtering between rows) is usually easy, whereas “horizontal” (filtering between columns within the same …

  2. Speculatively speaking - The ryg blog

    Mar 4, 2013 · Luckily, that’s not too hard: converting data from AoS to SoA is essentially a matrix transpose, and our typical use case happens to be 4 separate 4-vectors, i.e. a 4×4 matrix; luckily, a …

  3. The ryg blog | When I grow up I'll be an inventor.

    Aug 8, 2024 · However, these are the passes right next to the input/output permutation, and combining the early special-case passes with the data permutation tends to solve problems in both: the data …

  4. SSE: mind the gap! | The ryg blog

    Apr 3, 2016 · If you want good SIMD performance, don’t lean on horizontal and dot-product style operations; process data in batches (not just one vec4 at a time) and transpose on input, or use a …

  5. The ryg blog | When I grow up I'll be an inventor. | Page 8

    For example, when implementing 2D separable filters, the “vertical” direction (filtering between rows) is usually easy, whereas “horizontal” (filtering between columns within the same register) is trickier – to …

  6. October 2024 – The ryg blog

    Oct 23, 2024 · 3 posts published by fgiesen during October 2024

  7. About | The ryg blog

    This is the blog of Fabian “ryg” Giesen. I work at RAD Game Tools in Kirkland/WA as a programmer. I also used to be active in the demoscene group Farbrausch and have written some useful tools and …

  8. What’s that magic computation in stb__RefineBlock? - The ryg blog

    Nov 8, 2022 · Back in 2007 I wrote my DXT1/5 (aka BC1/3) encoder rygdxt, originally for “fr-041: debris” (so it was size-constrained). A bit later I put up the source and Sean Barrett adapted it into “stb_dxt”, …

  9. Why does ASTC use ISE when almost nothing else does?

    May 29, 2026 · The ASTC texture compression format has its “integer sequence encoding” to send small integers with a uniform probability distribution within their range. When that value range is [0,2^k…

  10. A small note on SIMD matrix-vector multiplication | The ryg blog

    Feb 5, 2015 · Suppose we want to calculate a product between a 4×4 matrix M and a 4-element vector v: The standard approach to computing Mv using SIMD instructions boils down to ...