RPC frameworks are something that I never thought I would be particularly interested in… until I joined the bubble, where nearly everything used the same framework, which made RPC frameworks very appealing. But despite both my previous and current employers releasing two similar RPC frameworks (gRPC and Apache Thrift respectively), they are not really that commonly used in Open Source, from what I can tell. D-Bus technically counts, but it’s also a bus messaging system, rather than a simpler point-to-point RPC system.
On the proprietary software side, RPC/IPC frameworks have existed for decades: CORBA was originally specified in 1991, and Microsoft’s COM was released in 1993. Although these are technically object models rather than just RPC frameworks, they fit into the “general aesthetics” of the discussion.
So, what’s the deal with RPC frameworks? Well, in the general sense, I like to represent them as a set of choices already made for you: they select an IDL (Interface Description Language), they provide some code generation tool leveraging the libraries they select, and they decide how structures are encoded on the wire. They are, by their own nature, restrictive rather than flexible. And that’s the good thing.
Because if we considered the most flexible options, we’d be considering IP as an RPC framework, and that’s not good — if all we have is IP, it’s hard for two components developed in isolation to be able to talk together. That’s why we have higher level protocols, and that’s why even just using HTTP as an RPC protocol is not good enough: it doesn’t define anywhere close to the semantics you need to be able to use it as a protocol without knowing both client and server code.
And one of the restrictions that I think RPC frameworks are good for is making you drop the conventions of specific programming languages, or at least of whichever programming languages they didn’t take after. Because clearly, various RPC frameworks take their inspiration from different starting languages, and so their conventions feel more or less at ease in each language depending on how far it is from the starting language.
So for instance, if you look at gRPC, errors are returned with a status code and a detailed status structure, while in Thrift you declare specific exception structures that your interfaces can throw. The two options make different compromises, and they require different amounts of boilerplate code to feel at ease in different languages.
There are programming languages, particularly in the functional family (I’m looking at you, Erlang!) that don’t really “do” error checking: if you made a mistake somewhere, you expect that some type of error will be raised/thrown/returned, and everything else will fall behind it. So an RPC convention with a failure state and a (Adam Savage voice) “here’s your problem” long stack trace would fit them perfectly fine.
This would be the equivalent of having HTTP only ever return error codes 400 and maybe 500: client or server error, and that’s about it. You deal with it, after all it’s nearly always a human in front of a computer looking at the error message, no? Well…
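To make the difference concrete, here is a minimal sketch in plain Python. It is not real gRPC or Thrift code: `Code`, `Status`, `UserNotFound` and the two `get_user_*` functions are hypothetical names made up for illustration, and only the numeric values of the status codes are borrowed from gRPC.

```python
from dataclasses import dataclass, field
from enum import Enum


class Code(Enum):
    """A tiny subset of status codes, using gRPC's numeric values."""
    OK = 0
    NOT_FOUND = 5
    UNAVAILABLE = 14


@dataclass
class Status:
    """Status style: one generic structure shared by every call."""
    code: Code
    message: str = ""
    details: dict = field(default_factory=dict)


class UserNotFound(Exception):
    """Exception style: a specific type declared for this interface."""

    def __init__(self, user_id: str):
        super().__init__(f"no such user: {user_id}")
        self.user_id = user_id


# Hypothetical in-memory "service" data.
USERS = {"alice": "Alice Example"}


def get_user_status_style(user_id: str):
    """Return (status, value); the caller has to inspect the status."""
    if user_id not in USERS:
        return Status(Code.NOT_FOUND, "no such user", {"user_id": user_id}), None
    return Status(Code.OK), USERS[user_id]


def get_user_exception_style(user_id: str) -> str:
    """Return the value, or raise the exception declared for this call."""
    if user_id not in USERS:
        raise UserNotFound(user_id)
    return USERS[user_id]
```

The status style reads naturally in languages that return errors as values, while the exception style reads naturally in languages built around `try`/`except`. Whichever one the framework picked, the other family of languages ends up writing the adaptation boilerplate.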
Turns out that being specific, up to a point, about what your errors mean can be very useful, particularly when interacting at a distance (either physical distance, or the distance of not knowing the code of whatever you’re talking to), which is how HTTP 401 is used to trigger an authentication request on most browsers. If you wanted to go a step further, you could consider a 451 response as an automated trigger to re-request the same page from a VPN in a different country (particularly useful with GDPR-restricted news sources in the USA, nowadays).
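As a rough sketch of what “specific enough to automate” could look like, the function below maps a status code to the next step a client might take without a human reading the message. The action names are invented for this example, and the 451 branch is only the thought experiment from above, not something any browser does.

```python
from http import HTTPStatus


def next_action(status: int) -> str:
    """Map a response status to what an automated client could do next."""
    if status == HTTPStatus.UNAUTHORIZED:  # 401
        return "authenticate"
    if status == HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS:  # 451
        return "retry-via-other-region"
    if 400 <= status < 500:
        # Generic client error: nothing more specific is known.
        return "fix-request"
    if 500 <= status < 600:
        # Generic server error: the request itself may be fine.
        return "retry-later"
    return "proceed"
```

With only 400 and 500 available, the first two branches could not exist, and every failure would fall into one of the two generic buckets.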
Personally, I think this is the reason why the dream of thin client libraries, in my experience, stays a dream. While, yes, with a perfectly specified RPC interface definition you could just use the RPC functions as if they were a library themselves… that usually means that the calls don’t “feel” correct for the language, for any language.
Instead, I personally think you need a wrapper library that can expose the RPC interfaces with a language-native approach — think builder paradigms in Java, and context managers in Python. Not doing so leads, in my experience, to either people implementing their own wrapper libraries you have no control over, or pretty bad code overall, because the people knowing the language refuse to touch the unwrapped client.
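Here is a small sketch of the Python side of that idea, under the assumption of a lock service with two RPCs. `RawLockStub` is a hypothetical stand-in for a generated stub (it just keeps state in memory), and `remote_lock` is the kind of thin, language-native wrapper I have in mind.

```python
from contextlib import contextmanager


class RawLockStub:
    """Hypothetical stand-in for a generated RPC stub; no network involved."""

    def __init__(self):
        self.held = set()
        self.calls = []

    def AcquireLock(self, request: dict) -> dict:
        self.calls.append(("AcquireLock", request["name"]))
        if request["name"] in self.held:
            return {"ok": False}
        self.held.add(request["name"])
        return {"ok": True}

    def ReleaseLock(self, request: dict) -> dict:
        self.calls.append(("ReleaseLock", request["name"]))
        self.held.discard(request["name"])
        return {"ok": True}


class LockUnavailable(Exception):
    """Raised by the wrapper when the service refuses the lock."""


@contextmanager
def remote_lock(stub: RawLockStub, name: str):
    """Acquire the lock on entry and always release it on exit."""
    if not stub.AcquireLock({"name": name})["ok"]:
        raise LockUnavailable(name)
    try:
        yield
    finally:
        # Runs even if the body raises, so the lock is never leaked.
        stub.ReleaseLock({"name": name})
```

A caller then writes `with remote_lock(stub, "deploy"): ...` and never has to remember the matching release call, which is exactly the kind of thing people otherwise reimplement, slightly differently, in every codebase that touches the raw stub.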
This is also, increasingly, relevant for local tooling, because honestly I’d rather have an RPC-based interface over Unix Domain Sockets (which allow you to pass authentication information) than run command line tools as subprocesses and try to parse their output. And while for simpler services, signal-based communication or very simple “text” protocols would work just as well, there’s value in having a “lingua franca” to speak between different services.
I guess what I’m saying is that, unlike programming languages, I do think we should make, and stick to, choices on RPC systems. The fact that for the longest time most Windows apps could share the same basic IPC/RPC system was a significant advantage (nowadays there’s… some confusion, at least in my eyes, and that probably has something to do with the number of localhost-only HTTP servers that are running on my machines).
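For a feel of how little is needed, this is a toy sketch of a request/response exchange over a Unix domain socket, with the peer’s user ID read from the socket itself. The framing (4-byte length prefix plus JSON) and the `whoami` method are assumptions made up for the example, not any real framework’s wire format, and `SO_PEERCRED` is Linux-specific, so the helper returns `None` elsewhere.

```python
import json
import socket
import struct


def send_msg(sock: socket.socket, payload: dict) -> None:
    """Send one JSON message prefixed with its length (4 bytes, big-endian)."""
    data = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or fail if the peer goes away."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        buf += chunk
    return buf


def recv_msg(sock: socket.socket) -> dict:
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length))


def peer_uid(sock: socket.socket):
    """Return the peer's user ID where the OS exposes it (Linux), else None."""
    if not hasattr(socket, "SO_PEERCRED"):
        return None
    # struct ucred on Linux: pid (int), uid (unsigned), gid (unsigned).
    fmt = "iII"
    creds = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                            struct.calcsize(fmt))
    _pid, uid, _gid = struct.unpack(fmt, creds)
    return uid


def handle_one(server_sock: socket.socket) -> None:
    """Serve a single request on an already connected socket."""
    request = recv_msg(server_sock)
    if request.get("method") == "whoami":
        send_msg(server_sock, {"result": peer_uid(server_sock)})
    else:
        send_msg(server_sock, {"error": "unknown method"})
```

With `client, server = socket.socketpair()`, calling `send_msg(client, {"method": "whoami"})`, then `handle_one(server)`, then `recv_msg(client)` completes one round trip in a single process. The caller’s identity comes from the kernel rather than from anything the client claims, which is something a subprocess-and-parse approach cannot give you as cleanly.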
In the Open Source world, we don’t really seem to like the idea of taking options away (which was clearly visible when the whole systemd integration started), and that makes choices, and integrations, much harder. Unfortunately, that also means a significantly higher cost to integrate components together, and a big blame game when some of the bigger, not-open players decide to make some of those choices (cough application-specific passwords cough).
The first RPC system I came across was underlying NFS: https://en.wikipedia.org/wiki/Open_Network_Computing_Remote_Procedure_Call 🙂
I completely agree on the value of a shared RPC system. But I differ on the requirement of language native adaptors. My experience (albeit only inside Google) has been that this encourages very large, thick clients, and poor RPC design, such that you must use the client and are unable to call the RPC directly.
I used to be oncall for a distributed DB which has such a thick client, and having some sort of influence over clients’ aggregate behavior when sending calls over the wire was a blessing. The oft-told example: controlling low-level retry behavior, rather than allowing folks to issue retries at whichever pace they chose. Definitely useful.
There’s one additional factor: thick client libraries mean you can also have client-side caching of service metadata (think data-dependent stuff like “which host:port pair hosts the data for Customer $X”) which would otherwise be hidden behind a thick client proxy on the service-side — this further distributes the system.
That is merely in the neighborhood of Flameeyes’ point, but let me tie it together: standardizing the RPC options into one single framework isn’t relevant to the RPC caller; after all, they have a call to make and will do so whichever way the service is exposed. The benefit of that standardization is for infrastructure folks, and horizontal aspects like monitoring, distributed tracing, and security.
I think, at the time, it was called Sun-RPC? Also had neato things like service discovery built in.