What I had in mind was something along these lines: allowedColumns: Map[UserInpu...

simiones · on Aug 6, 2021

> - We can write literals which look like strings but have type SQL

What I'm not clear is: what prevents me from accidentally/stupidly doing:

  filter : SQL = "WHERE " + userInputCol + " = ?"

Is this special string handling some compiler magic that distinguishes literal strings from string variables? If so then I think that in a language that supports something like this you can indeed make a safe library. The main downside is that you need to work entirely with compile-time constructs - e.g. you can't use something like printf to take a compile-time format string and turn it runtime into a query; and you can't take queries from a separate file, they must be in source code. But these may be acceptable trade-offs.

Do you know of any library that implements this?

chriswarbo · on Aug 7, 2021

> Is this special string handling some compiler magic that distinguishes literal strings from string variables?

Ah, maybe I should have made it clearer that I was overloading the double-quote syntax, so we can write:

    "foo": String
    "bar": SQL
    "baz": UserInput
    "quux": Shell
    etc.

I was also relying on type inference to figure out which is which, and on '+' returning the same type as both arguments, e.g.

    +: String  -> String  -> String
    +: SQL     -> SQL     -> SQL
    +: Int     -> Int     -> Int
    +: Float   -> Float   -> Float
    +: List[T] -> List[T] -> List[T]
    etc.

This way, we see how your example fails to typecheck:

    // Code as written
    filter : SQL = "WHERE " + userInputCol + " = ?"

    // Right-hand-side must have type SQL, to match left-hand-side
    "WHERE " + userInputCol + " = ?": SQL

    // Resolving order-of-operations of the two '+' operations
    ("WHERE " + userInputCol) + " = ?" : SQL
  
    // Arguments to outer '+' have same type as return value, which is SQL
    "WHERE " + userInputCol : SQL
    " = ?" : SQL

    // Arguments to inner '+' have same type as return value, which is SQL
    "WHERE " : SQL
    userInputCol : SQL

We've inferred that userInputCol must have type SQL, so it will fail for String/UserInput/whatever.

> The main downside is that you need to work entirely with compile-time constructs - e.g. you can't use something like printf to take a compile-time format string and turn it runtime into a query; and you can't take queries from a separate file, they must be in source code.

Yep, although macros could help with that sort of thing, e.g. Haskell's quasiquotation https://wiki.haskell.org/Quasiquotation

A couple of Google hits for 'haskell quasiquote sql':

https://hackage.haskell.org/package/postgresql-simple-0.6.2/...

https://hackage.haskell.org/package/postgresql-query

> Do you know of any library that implements this?

Not completely. De-coupling double-quoted literal syntax from a single String type can be done with Haskell's OverloadedStrings feature, but that relies on a function 'fromString : String -> t', which is what we're trying to avoid https://hackage.haskell.org/package/base-4.6.0.1/docs/Data-S...

Scala's custom interpolators are similar, but they rely on a function from 'StringContext -> t', and StringContext is easily created from a String ( https://www.scala-lang.org/api/current/scala/StringContext.h... )

To ensure safety, we would need some way to encapsulate the underlying fromString/StringContext implementation to prevent it being called by anything other than the literal-expansion at compile-time.

Of course, if we're willing to use macros then it's pretty easy, like those quasiquote examples above.

Haskell's module system is famously mediocre, so it might be possible to do this overloading + encapsulation with Idris https://idris2.readthedocs.io/en/latest/reference/overloaded...

(Of course, Idris also has an incredibly expressive type system, and a very powerful macro system, AKA "elaborator reflection", so it can definitely be done; but I haven't figured out the cleanest way)