Using the most unhinged AVX-512 instruction to make the fastest phrase search algo

Disclaimers before we start: For those who don’t want to read/don’t care that much, here are the results. I hope after seeing them you are compelled to read. TL;DR: I wrote a super fast phrase search algorithm using AVX-512 and achieved wins up to 1600x the performance of Meilisearch. The source ...

Click to view the original at gab-menezes.github.io

Hasnain says:

Learned so much about so many things from this one. Gotta love when you see something like this a few thousand words into a post you already thought was quite interesting:

“Now that the boring stuff is behind us, let’s start the fun part. Again, just as a reminder on how the intersection works: we do two phases of intersection, one for the conventional intersection and another for the bits that would cross the group boundary, and in the end, we merge these two.

In this section, we will take a look at assembly, some cool tools to analyze this assembly, AVX-512, differences in the microarchitecture of AMD and Intel chips, emulation of instructions, and a lot more. So again, sorry to bother you with all of the previous stuff, but it was important.”
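For anyone who wants a concrete picture of the two-phase intersection the quote refers to, here is a minimal scalar sketch in Python. It is not the author's code and uses no AVX-512. It assumes token positions are packed into 16-bit groups, and the names `GROUP_BITS`, `to_groups`, `phrase_intersect`, and `from_groups` are all made up for illustration.

```python
GROUP_BITS = 16  # assumed group width, chosen only for illustration
MASK = (1 << GROUP_BITS) - 1


def to_groups(positions):
    """Pack token positions into {group_index: bitmask}."""
    groups = {}
    for p in positions:
        g, bit = divmod(p, GROUP_BITS)
        groups[g] = groups.get(g, 0) | (1 << bit)
    return groups


def phrase_intersect(lhs, rhs):
    """Return packed positions where an rhs token directly follows an lhs token."""
    out = {}
    # Phase 1: conventional intersection. For groups present on both sides,
    # shift lhs by one position and AND with rhs; the mask drops the bit
    # that would spill past the end of the group.
    for g in lhs.keys() & rhs.keys():
        hit = (lhs[g] << 1) & rhs[g] & MASK
        if hit:
            out[g] = out.get(g, 0) | hit
    # Phase 2: bits that cross the group boundary. The highest bit of lhs
    # group g pairs with the lowest bit of rhs group g + 1.
    for g, bits in lhs.items():
        if (bits >> (GROUP_BITS - 1)) & 1 and rhs.get(g + 1, 0) & 1:
            # Merge: OR the boundary hit into the same result map.
            out[g + 1] = out.get(g + 1, 0) | 1
    return out


def from_groups(groups):
    """Unpack {group_index: bitmask} back into a sorted list of positions."""
    return sorted(
        g * GROUP_BITS + b
        for g, bits in groups.items()
        for b in range(GROUP_BITS)
        if (bits >> b) & 1
    )


first = to_groups([3, 15, 40])   # positions of the first word
second = to_groups([4, 16, 50])  # positions of the second word
matches = from_groups(phrase_intersect(first, second))
```

Here position 4 is found by the first phase (3 and 4 share a group), while position 16 is only found by the second phase, because 15 is the last bit of group 0 and 16 is the first bit of group 1.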

Posted on 2025-01-27T08:25:55+0000