10. 01. 2026 William Calliari Development

SQLX’s WHERE-IN queries: The Good, the Bad and the Ugly

I love SQLX for many reasons, but one in particular is the possibility to validate syntax and return types at compile-time with the query! macros. At least that’s what I used to think…

For a recent project I wanted to perform a query to get some role data from from an OAUTH 2.0 custom role claim. The query in and of itself was quite simple, but it lead me down a rabbit hole into the internals of SQLX and some limitations of MariaDB. Let’s have a look at the query first:

SELECT *
FROM roles
WHERE roles.idp_group IN ?;

Looks simple, right? Even testing it out with DataGrip works, because in the prompt you can just insert the list you want. So now we just pass our slice of strings to the query and:

error: error returned from database: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '?' at line 1
 --> src/main.rs:3:5
  |
3 |     sqlx::query!("SELECT * FROM roles WHERE idp_group IN ?;", l);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: this error originates in the macro `$crate::sqlx_macros::expand_query` which comes from the expansion of the macro `sqlx::query` (in Nightly builds, run with -Z macro-backtrace for more info)
error: could not compile `sqlx_list` (bin "sqlx_list") due to 1 previous error

Weird. Why is MariaDB suddenly complaining here. The syntax should be right, if it just replaces the ? with the list I just passed. After some experimentation, to make sure I was not missing something trivial, I reached out to the developers of SQLX on their discord channel. They responded to me promptly that MariaDB (and also MySQL) do not support binding arrays like that.

So then it doesn’t work. Now we effectively have two options left: Run one query for each group and pray to god that no user has such a large number of groups that this would cause performance issues, or use QueryBuilder, but then forego the compile-time type and query checking from SQLX.

I quickly learned that the second option wasn’t really feasible. Rust’s type system is super strict and I ran into several problems when testing it. And considering that I want to use similar queries in the future (in much less-traversed parts than the auth code), this was a no-go.

So it’s the first option then? Well, not that either. This query will fetch the user auth on every request, and any slowdowns in this part of the code will haunt the user on every single request. I wouldn’t want any surprises coming my way from there.

So the only solution left: build a solution myself that does both!

The Implementation

The solution that I came up with was essentially another macro, one that generates the code for QueryBuilder while generating a separate “example” query that gets validated against the database at compile-time the way SQLX does it. I’ll quickly guide you through the steps to implement such a macro, but first we need to decide on what syntax to use.

The syntax I came up with had the following constraints: You specify a list of either strings or for-loops in the query. All the strings will be appended to the query, while the for-loops will use the QueryBuilder::separated function to append the items.

The trick here is that the body of the for-loop should function as a block, returning the last statement and therefore effectively working like a list-comprehension from python.

let groups = vec![String::new("/root")];
let roles = query_in_list!{
    RoleType,    
    "SELECT *"
    "FROM groups"
    "WHERE idp_group IN"
    for group in groups { group }
};

Now we just need our macro to actually parse this and then generate new, valid Rust source code. The parsing code took huge inspiration from the instrument macro of the tracing crate as I didn’t know how to do this prior to actually doing it. The logic is actually quite simple, like tracing, I just used the syn crate to parse the actual source code:

use syn::{
    Block, Ident, LitStr, Stmt, Token, Type,
    parse::{Parse, ParseStream, Result},
};

struct QueryInList {
    query_result_type: Type,
    query: Vec<QueryFragment>,
}

enum QueryFragment {
    Lit(LitStr),
    Loop { name: Ident, list: Ident, expr: Vec<Stmt> },
}

impl Parse for QueryInList {
    fn parse(input: ParseStream) -> Result<Self> {
        let query_result_type = input.parse::<Type>()?;
        input.parse::<Token![,]>()?;
        let mut fragments = vec![];
        while !input.is_empty() {
            if let Ok(literal) = input.parse::<LitStr>() {
                let mut content = literal.value();
                content.push(' ');
                fragments.push(QueryFragment::Lit(LitStr::new(&content, lit.span())));
                continue;
            }
            input.parse::<Token![for]>()?;
            let name = input.parse::<Ident>()?;
            input.parse::<Token![in]>()?;
            let list = input.parse::<Ident>()?;
            let block = input.parse::<Block>()?;
            fragments.push(QueryFragment::Loop {
                name,
                list,
                expr: block.stmts,
            });
        }
        Ok(QueryInList {
            query_result_type,
            query: fragments,
        })
    }
}

First we simply parse a type, followed by a comma from the input stream. Then we move on to the items of the query.

As you can see in the parse loop at line 24, I try to parse a string literal from the input stream. If that doesn’t work, I just assume it contains the for-loop and parse one after the other: the keyword for, an identifier that is used in the loop, the keyword in, another identifier for the iterable that is used, and finally a block expression that gives us our value for the query. I package everything up into a Vec and with that, we have our data prepared.

(I also push a space after each sqlx::QueryBuilder string literal, so you don’t have to remember to add the trailing white space after each line. You’re welcome.)

The next step now is to turn this data into an example query that we use to validate against the database. For that we just take our list of elements: if we encounter a string literal, we simply concatenate it, and if it’s a for-loop, we add the literal (?, ?) as an example, and keep track of how many placeholders we inserted:

With that done, all that’s left is to assemble all the pieces into valid Rust code, which we can output from our macro for the compiler to process. The generated code should look something like this:

impl QueryInList {
    fn example_query(&self) -> (usize, String) {
        let mut query = String::new(); 
        let mut placeholder = 0;
        for fragment in &self.query {
            match fragment {
                QueryFragment::Lit(lit) => query += &lit.value(),
                QueryFragment::Loop { .. } => {
                    query += " ( ?, ? ) "; 
                   placeholder += 2;
                }
            }
        }
        query.push(';');
        (placeholder, query)
    }
}

So I used the quote crate to generate the code. The great thing about quote is, we can do string interpolation on our source code, making it much easier to audit and understand later.

As a quick aside – a lot of Rust detractors might now say: “See, you need to pull in so many different crates. This is a huge supply chain risk”. Interestingly enough though, all the crates used (quote, syn and proc-macro2) come from the same person: David Tolnay (I’ve already sung his praises, so I’ll spare you that here). So the supply-chain risk is actually equivalent to one crate currently, and we would just degrade compile time and code re-usability. So please don’t just look at “big number” and assume that means the supply-chain risk is directly correlated. Much more is going into that.

Back to the topic at hand – let’s look at the generated source code:

use proc_macro::TokenStream;
use quote::quote;
use syn::parse_macro_input;

#[proc_macro]
pub fn query_in_list(input: TokenStream) -> TokenStream {
    let query_in_list = parse_macro_input!(input as QueryInList);
    let (placeholders, qe) = query_in_list.example_query(); 
    let ph = std::iter::repeat_n(true, placeholders);
    let query_result_type = query_in_list.query_result_type;
    let fragments = query_in_list.query.iter().map(|f| match f {
        QueryFragment::Lit(lit) => quote! { builder.push(#lit); },
        QueryFragment::Loop { name, list, expr } => quote! {
            builder.push(" ( ");
            let mut separated = builder.separated(", ");
            for #name in #list {
                separated.push_bind({ #(#expr)* });
            }
            builder.push(" ) ");
        },
    });

    quote! {{
        // We are generating an example query here, but discarding the result.
        // This is only used to run at compile-time against the database to
        // ensure type-safety.
        let _ = sqlx::query_as!(#query_result_type, #qe, #(#ph),*);
        let mut builder = sqlx::QueryBuilder::new("");
        #(#fragments)*
        builder
    }}
    .into()
}

So, what’s happening here? First we transform the fragments we collected from the macro into their respective code parts: Just pushing the string in case of a literal, otherwise creating the Separated and pushing all the values based on the for-loop. As you can see, I basically just paste the whole code block inside the curly-braces into the Separated::push_bind function, which actually works great.

As for our example query, we need to push the same number of arguments into the code as we have bindings in the query. For that I just created an iterator over true that yields exactly the right number. Surprisingly this works, and MariaDB doesn’t complain about any type mismatches either (put a pin in that, we’ll need that later again).

The Good

We can have our cake and eat it too (well mostly, we’ll come to that). We can validate our query at compile time, while leveraging the QueryBuilder to actually avoid the limitations of MariaDB. Nominally I’ve reached my goal, but the solution is still far from perfect, as I will explain in the next sections.

The Bad

While now we actually do have the type, in order to get the data itself out we still have to derive FromRow for the struct and pass through that code.

Must worse though, this solution is not 100% air-tight. Let me demonstrate this:

let groups = vec![String::new("/root")];
let roles = query_in_list!{
    RoleType,
    "SELECT *"    "FROM groups"
    "WHERE idp_group IN"
    for group in groups { group }
};

roles.query_as::<OtherType>(&mut db).await

Since the function query_as takes a mutable reference, we cannot return just the query because the builder, which keeps a reference to it, would be dropped at the end of the block. So between the macro and the actual use of the query it could still come to some type confusion.

The other alternative is to add another parameter to the macro: the buffer itself, hoping that the user (or an especially clever LLM) hasn’t thought about pre-filling it to avoid a compiler error for the type confusion.

The second solution could be the prettier one, as we can reuse the specified type twice, but for now, I opted for solution number one. This is definitely a point I need to come back to. But for now I don’t currently have a solution for this problem, other than hoping that the borrow checker will become smarter about this. I don’t think that will work though, as reordering drops like that can break a bunch of Rust code where the block is used to force e.g. a MutexGuard to be dropped early.

Edit (2025-12-30): Fixing the bad.

This was a real thorn in my side, but I couldn’t think of a good solution. Thankfully I was on the Chaos Communication Congress and my friend Isse suggested that I should talk to the people at the Rust assembly: they might know something that could be done. And in fact after some discussion, we came up with an idea, that while not the prettiest, resolves this problem for good.

For the uninitiated in Rust macros: Rust proc-macros work by being compiled first, and then being called by the compiler to generate more code. Right now they have the limitations that they need to live in a separate crate that can only export the macros themselves.

So first we move all the macro code into its separate crate, e.g. query_macros_impl and then create a separate crate query_macros that re-exports the macros alongside the new code.

Basically, what we came up with is the following: Create a wrapper type for the query builder that takes the row type as a generic parameter and exposes only one function, which is to get the QueryAs type, with the correct lifetime. The actual code is really simple:

use sqlx::{Database, FromRow, QueryBuilder};
use sqlx::query::QueryAs;

pub struct SealedQueryBuilder<'a, DB: Database, T: FromRow<'a, DB::Row>> {
    builder: QueryBuilder<'a, DB>,
    _marker: PhantomData<T>,
}

impl<'a, DB: Database, T: FromRow<'a, DB::Row>> SealedQueryBuilder<'a, DB, T> {
    pub fn new(builder: QueryBuilder<'a, DB>) -> Self {
        Self {
            builder,
            _marker: PhantomData,
        }
    }
    pub fn query_as(&'a mut self) -> QueryAs<'a, DB, T, DB::Arguments<'a>> {
        self.builder.build_query_as()
    }
}

We basically just create the struct, the generics can be copied almost 1:1 from the QueryAs macro and the QueryAs::fetch_all method. Then we create a function that internally calls build_query_as and… that’s it. We can now bring the type all the way through to the point where it’s being used.

The Ugly

I actually lied to you about the implementation earlier in order to not spoil the surprise in this section.

In the code that inserts the binding values there’s a edge case we didn’t cover. If the iterable is empty, we generate an invalid query and get a runtime error, so we’d have to implement custom logic before every use of the macro. This is quite counterintuitive and easy to forget. So we have two options again: Add a custom error that the user is required to handle, or make sure the query can run, even if one of the lists is empty.

I decided to go for the second option, and that’s where we’ll reach back to the pin we set earlier: remember how I said MariaDB won’t complain, even if we have a type mismatch in our query? That’s when it becomes useful again: If we just push a true for each for-loop, we can make sure the actual list is always (syntactically) valid and you’ll get an empty list back, which is what you’d probably expect anyways.

I want to also take this moment to formally apologize to any potential future co-workers who look at the queries executed in the database and then have to go down a rabbit hole on why every list contains a true at the beginning.

While that “feature” of MariaDB to silently ignore type mismatches is a blessing for fixing that edge-case, it’s also a curse – if you accidentally pass in a list of, say, IDs instead of strings, the validation will work just-fine™ while the query can never return an actual row. But since this this is not something I can influence at the moment, I have to (begrudgingly) accept this as part of the deal, not just for my macro, but for all of SQLX. Nevertheless, I’m happy I discovered it, so I can be aware of this behavior going forward.

These Solutions are Engineered by Humans

Did you find this article interesting? Does it match your skill set? Programming is at the heart of how we develop customized solutions. In fact, we’re currently hiring for roles just like this and others here at Würth IT Italy.

William Calliari

William Calliari

Author

William Calliari

Leave a Reply

Your email address will not be published. Required fields are marked *

Archive