Luca Casonato is the tech lead for Deno Deploy and a TC39 delegate.
Deno is a JavaScript runtime from the original creator of Node.js, Ryan Dahl.
[00:00:07] Jeremy: Today I'm talking to Luca Casonato. He's a member of the Deno core team and a TC39 delegate.
[00:00:06] Luca: Hey, thanks for having me.
[00:00:07] Jeremy: So today we're gonna talk about Deno, and on the website it says, Deno is a runtime for JavaScript and TypeScript. So I thought we could start with defining what a runtime is.
[00:00:21] Luca: Yeah, that's a great question. I think this question actually comes up a lot. Sometimes we also define Deno as a headless browser, or a JavaScript script execution tool. What actually defines a runtime? I think what makes a runtime a runtime is that it's implemented in native code.
It cannot be self-hosted. Like, you cannot self-host a JavaScript runtime. And it executes JavaScript or TypeScript or some other scripting language. It's essentially a JavaScript execution engine which is not self-hosted.
It maybe has IO bindings, but it doesn't necessarily need to. Maybe it allows you to read from the file system or make network calls, but it doesn't have to. I think the primary definition is something which can execute JavaScript without already being written in JavaScript.
[00:01:20] Jeremy: And when we hear about JavaScript runtimes, whether it's Deno or Node or Bun, or anything else, we also hear about it in the context of V8. Could you explain the relationship between V8 and a JavaScript runtime?
[00:01:36] Luca: Yeah. So V8, JavaScriptCore, and SpiderMonkey, these are all JavaScript engines. They're the low-level virtual machines that can parse your JavaScript code, turn it into bytecode, maybe turn it into compiled machine code, and then execute that code. But these engines do not implement any IO functions.
They implement the JavaScript spec as it is written, and then they provide extension hooks for what they call host environments, environments that embed these engines to provide custom functionality, to essentially poke out of the sandbox, out of the virtual machine. And this is used in browsers.
Browsers have these engines built in; that's where they originated from. And then they poke holes into this sandboxed virtual machine to do things like writing to the DOM, or console logging, or making fetch calls, and all these kinds of things. And what a JavaScript runtime essentially does is it takes one of these engines and then provides its own set of host APIs, essentially its own set of holes that it pokes into the sandbox. And depending on what the runtime is trying to do, the way it does this is gonna be different, and the sort of API that is ultimately exposed to the end user is going to be different.
For example, if you compare Deno and Node: Node is very loosey-goosey about how it pokes holes into the sandbox; it sort of just pokes them everywhere. And this makes it difficult to enforce things like runtime permissions, for example. Whereas Deno is much more strict about how it pokes holes into its sandbox.
Everything is either a web API or it's behind this Deno namespace, which means that it's really easy to find the places where you're poking out of the sandbox. And really, you can also compare these to browsers. Browsers are also JavaScript runtimes; they're just not headless JavaScript runtimes, but JavaScript runtimes that also have a UI. And there's a whole bunch of different kinds of JavaScript runtimes, and I think we're also seeing a lot more embedded JavaScript runtimes. For example, if you've used React Native before, you may be using Hermes as the JavaScript engine in your Android app, which is a custom JavaScript engine written just for React Native.
And this is embedded within a React Native runtime, which is specific to React Native. So it's also possible to have runtimes where the backing engine can be exchanged, which is kind of cool.
[00:04:08] Jeremy: So it sounds like V8's role, one way to look at it is it can execute JavaScript code, but only pure functions. I suppose you
[00:04:19] Luca: Pretty much. Yep.
[00:04:21] Jeremy: Do anything that doesn't interact with IO. So if you think about browsers, like you were mentioning, you need to interact with a DOM. Or if you're writing a server-side application, you probably need to receive or make HTTP requests, that sort of thing.
And all of that is not handled by V8. That has to be handled by an external runtime.
[00:04:43] Luca: Exactly. Well, there are some exceptions to this. For example, JavaScript technically has some IO built in within its standard library, like Math.random. Random number generation is technically an IO operation, so technically V8 has some IO built in, right? And getting the current date from the user, that's also technically IO.
So there are some very limited edge cases; it's not that it's purely pure. But V8, for example, has a flag to turn it completely deterministic, which means that it really is completely pure. And this is not something which runtimes usually have. This is a feature of an engine, because the engine is so low level that there's so little IO that it's very easy to make deterministic, whereas a runtime is higher level, has IO, and is much more difficult to make deterministic.
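The point about Math.random and Date being the engine's few ambient "IO" escape hatches can be made concrete: determinism means replacing those sources with pure code. The sketch below uses mulberry32, a well-known tiny seeded PRNG, as a stand-in; it is not what V8's deterministic mode actually does internally.

```javascript
// Math.random() and Date.now() depend on the outside world, which is
// why they count as IO. A seeded PRNG makes the "randomness" pure:
// same seed in, same sequence out, every run, on every machine.
function mulberry32(seed) {
  return function () {
    let t = (seed += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // always in [0, 1)
  };
}

const rngA = mulberry32(42);
const rngB = mulberry32(42);

// Two generators with the same seed are indistinguishable -- the
// program's behavior is now replayable, i.e. deterministic.
console.log(rngA() === rngB()); // true
console.log(rngA() === rngB()); // true
```

Swapping a stub like this in for Math.random is the program-level version of what an engine-wide deterministic flag does wholesale.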
[00:05:39] Jeremy: And, and for things like when you're working with JavaScript, there's, uh, asynchronous programming
[00:05:46] Luca: mm-hmm.
[00:05:47] Jeremy: So you have concurrency and things like that. Is that a part of V8 or is that the responsibility of the run time?
[00:05:54] Luca: That's a great question. So there's multiple parts to this. There's JavaScript promises and concurrent JavaScript execution, which is handled by V8. In pure V8, you can create a promise, and you can execute some code within that promise.
But without IO there's actually no way to defer time, which means that with pure V8 you can either create a promise which executes right now, or you can create a promise that never executes. But you can't create a promise that executes in 10 seconds, because there's no way to measure 10 seconds asynchronously.
What runtimes do is they add something called an event loop on top of the base engine. A very simple event loop, for example, might have a timer in it which, every second, looks at whether there's a timer scheduled to run within that second.
And if that timer exists, it'll go call out to V8 and say, you can now execute that promise. But V8 is still the one that's keeping track of which promises exist, and the code that is meant to be invoked when they resolve, all that kind of thing. But the underlying infrastructure that actually decides which promises get resolved at what point in time, the asynchronous IO, as this is called, is driven by the event loop, which is implemented by the runtime. Deno, for example, uses Tokio for its event loop. This is an event loop written in Rust; it's very popular in the Rust ecosystem. Node uses libuv, a relatively popular event loop implementation for C.
And libuv was written for Node; Tokio was not written for Deno. But yeah, Chrome has its own event loop implementation. Bun has its own event loop implementation.
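A toy version of the event loop described above might look like the sketch below. It is deliberately simplified and hypothetical, not how Tokio or libuv actually work (those poll OS events): the "runtime" keeps a timer queue and, on each tick, resolves the promises whose deadlines have passed, while the engine still runs the promise callbacks on its own microtask queue.

```javascript
// The engine (V8) tracks promises and their callbacks; this toy
// "runtime" decides *when* a promise gets resolved, by checking a
// timer queue on every tick of the loop.
class ToyEventLoop {
  constructor() {
    this.timers = [];
  }

  // A host API exposed to user code, playing the role of setTimeout:
  // it records a deadline and hands back an engine-level promise.
  sleep(ms) {
    return new Promise((resolve) => {
      this.timers.push({ due: Date.now() + ms, resolve });
    });
  }

  // One iteration of the loop: fire every timer whose deadline passed.
  tick(now = Date.now()) {
    const due = this.timers.filter((t) => t.due <= now);
    this.timers = this.timers.filter((t) => t.due > now);
    for (const t of due) t.resolve(); // "call back into the engine"
  }
}

const loop = new ToyEventLoop();
let done = false;
loop.sleep(10).then(() => {
  done = true; // runs on the engine's microtask queue after resolve
});

loop.tick(Date.now() + 20); // pretend 20ms have elapsed
```

Note the division of labor: the loop only calls `resolve`; the engine itself schedules and runs the `.then` callback afterwards, which is exactly the engine/runtime split described above.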
[00:07:50] Jeremy: So we, we might go a little bit more into that later, but I think what we should probably go into now is why make Deno, because you have Node that's, uh, currently very popular. The co-creator of Deno, to my understanding, actually created Node. So maybe you could explain to our audience what was missing or what was wrong with Node, where they decided I need to create, a new runtime.
[00:08:20] Luca: Yeah. So the primary point of concern here was that Node was slowly diverging from browser standards with no real path to reconverging. There was nothing that was pushing Node in the direction of standards compliance, and there was nothing that was forcing Node to innovate.
And we really saw this, because in the time between, I don't know, 2015 and 2018, Node was slowly working on ESM while browsers had already shipped ESM for like three years. Node did not have fetch; Node only got fetch last year, right? Six, seven years after browsers got fetch.
Node's streams implementation is still very divergent from standard web streams. Node was very reliant on callbacks. It still is; promises in many places of the Node API are an afterthought, which makes sense because Node was created in a time before promises existed. But there was really nothing that was pushing Node forward, right?
Nobody was actively investing in improving the API of Node to be more standards compliant. And so what we really needed was a new greenfield project which could demonstrate that actually writing a new server-side runtime is (a) viable, and (b) totally doable with an API that is more standards compliant.
Essentially, you can write a browser, like a headless browser, and have that be an excellent-to-use JavaScript runtime, right? And then there were some things on top of that, like TypeScript support, because TypeScript was incredibly popular, or still is incredibly popular, even more so than it was four years ago when Deno was created or envisioned. And the permission system: Node really poked holes into the V8 sandbox very early on, in ways that are gonna be very difficult for Node to ever reconcile.
Especially because some of the APIs that it exposes are just so incredibly low level that, I don't know, you can mutate random memory within your process. Which, if you want to have a secure sandbox, just doesn't work. It's not compatible. So there really needed to be a place where you could explore this direction and see if it worked.
And Deno was that. Deno still is that, and I think Deno has outgrown that now into something which is much more usable as a production-ready runtime. And many people do use it in production. And now Deno is on the path of slowly converging back with Node, from both directions. Node is slowly becoming more standards compliant, and depending on who you ask, this was done because of Deno, or some people say it had already been going on and Deno just accelerated it. But that's not really relevant, because the point is that Node is becoming more standards compliant. And the other direction is Deno becoming more Node compliant.
Deno is implementing Node compatibility layers that allow you to run code that was originally written for the Node ecosystem in the standards-compliant runtime. So through those two directions, the runtimes are sort of going back towards each other. I don't think they'll ever merge, but we're getting to a point here pretty soon, I think, where it doesn't really matter what runtime you write for, because you'll be able to run code written for one runtime in the other runtime relatively easily.
[00:12:03] Jeremy: If you're saying the two are becoming closer to one another, becoming closer to the web standard that runs in the browser, if you're talking to someone who's currently developing in node, what's the incentive for them to switch to Deno versus using Node and then hope that eventually they'll kind of meet in the middle.
[00:12:26] Luca: Yeah, so I think Deno is a lot more than just a runtime, right? A runtime executes JavaScript; Deno executes JavaScript, it executes TypeScript. But Deno is so much more than that. Deno has a built-in formatter, it has a built-in linter. It has a built-in testing framework, a built-in benchmarking framework.
It has a built-in bundler. It can create self-contained executables, like bundle your code and the Deno executable into a single executable that you can ship off to someone. It has a dependency analyzer. It has editor integrations. I could go on for hours (laughs) about all of the auxiliary tooling that's inside of Deno that's not a JavaScript runtime.
And also, Deno as a JavaScript runtime is just more standards compliant than any of the other server-side runtimes right now. So if you're really looking for something which is standards compliant, which is gonna live on forever, then, you know, you cannot kill off the Fetch API ever.
The Fetch API is going to live forever because Chrome supports it. And the same goes for local storage and, I don't know, the Blob API and all these other web APIs. They have shipped in browsers, which means that they will be supported until the end of time. And yeah, maybe Node has also reached that with its API, probably to some extent.
But yeah, don't underestimate the power of like 3 billion Chrome users that would scream immediately if the Fetch API stopped working, right?
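Because these APIs now exist as globals in Deno and in Node 18+ as well as in browsers, a standards-only snippet like the following runs unchanged across all of them, with no imports and no runtime-specific branches. Blob is used here as the example:

```javascript
// Blob shipped in browsers first; Deno and Node 18+ expose the same
// global, so this code is portable with no feature detection.
const blob = new Blob(["hello, ", "web standards"], { type: "text/plain" });

console.log(blob.size); // 20  (byte length of the concatenated parts)
console.log(blob.type); // "text/plain"

// Reading the contents is async, with the same API everywhere:
blob.text().then((text) => {
  console.log(text); // "hello, web standards"
});
```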
[00:13:50] Jeremy: Yeah. I think maybe what it sounds like also is that because you're using the API that's used in the browser, for the places where you deploy JavaScript applications in the future, you would hope that those would all settle on using that same API. So that if you were using Deno, you could host it at different places and not worry about, do I need to use a special API like maybe you would in Node?
[00:14:21] Luca: Yeah, exactly. And this is actually something which we're specifically working towards. I don't know if you've heard of WinterCG? It's a community group at the W3C that Cloudflare and Deno and some others, including Shopify, started last year. Essentially, we're trying to standardize the concept of what a server-side JavaScript runtime is and what APIs it needs to have available to be standards compliant.
And essentially making this portability written down somewhere, writing down exactly what code you can write and expect to be portable. And we can see that all of the big players that are involved in building JavaScript runtimes right now are actively engaged with us at WinterCG and are actively building towards this future.
So I would expect that any code that you write today which runs in Deno, runs in Cloudflare Workers, runs on Netlify Edge Functions, runs on Vercel's Edge Runtime, or runs on Shopify Oxygen is going to run on the other four of those within the next couple years here. I think the APIs of these are gonna converge to be essentially the same.
There's obviously always gonna be some nuances. Like, I don't know, Chrome and Firefox and Safari don't perfectly have the same API everywhere, right? Chrome has some Web Bluetooth capabilities that Safari doesn't, or Firefox has some, I don't know, non-standard extensions to the Error object which none of the other runtimes do.
But overall you can expect these runtimes to mostly be aligned. And I think that's really, really excellent, and that's really one of the reasons why one should consider building for this standard runtime: it just guarantees that you'll be able to host this somewhere in five years' time and ten years' time with very little effort.
Even if Deno goes under, or Cloudflare goes under, or, I don't know, nobody decides to maintain Node anymore, it'll be easy to run somewhere else. And I also expect that the big cloud vendors will ultimately provide managed offerings for the standards-compliant JavaScript runtime as well.
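The portable shape that these runtimes converge on is often expressed as a fetch-style handler: a plain function from a standard Request to a standard Response. How you register the handler differs per platform, so this sketch just calls it directly; the handler body is the runtime-agnostic part, and Request and Response are globals in browsers, Deno, Cloudflare Workers, and Node 18+.

```javascript
// A runtime-agnostic request handler built only on web-standard
// types. Platforms differ in how this gets registered (e.g. as an
// exported fetch handler), but the body needs nothing non-standard.
function handler(request) {
  const url = new URL(request.url);
  return new Response(`hello from ${url.pathname}`, {
    headers: { "content-type": "text/plain" },
  });
}

// Because Request and Response are ordinary values, the handler can
// be exercised directly, which also makes it trivially unit-testable:
const res = handler(new Request("http://localhost/demo"));
console.log(res.status); // 200
res.text().then((body) => console.log(body)); // "hello from /demo"
```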
[00:16:36] Jeremy: And this WinterCG group is Node a part of that as well?
[00:16:41] Luca: Um, yes, we've invited Node to join. Due to the complexities of how Node's internal decision-making system works, Node is not officially a member of WinterCG. There are some individual members of the Node technical steering committee who are participating. For example, James M. Snell is my co-chair on WinterCG.
He also works at Cloudflare. He's also a Node TSC member. Matteo Collina, who has been instrumental in getting fetch landed in Node, is also actively involved. So Node is involved, but because Node is Node, and Node's decision-making process works the way it does, Node is not officially listed anywhere as a member.
But yeah, they're involved, and maybe they'll be a member at some point. But yeah, let's see. (laughs)
[00:17:34] Jeremy: Yeah. And it sounds like you're thinking that's more of a governance or organizational aspect of Node than it is a technical limitation. Is that right?
[00:17:47] Luca: Yeah. I obviously can't speak for the Node technical steering committee, but I know that there's a significant chunk of the Node technical steering committee that is very favorable towards standards compliance. But parts of the Node technical steering committee are also not; they are either indifferent or are actively, I dunno if they're still actively working against this, but have actively worked against standards compliance in the past.
And because the Node governance structure is so open and lets all these voices be heard, that just means that decision-making processes within Node can take so long. This is also why the fetch API took eight years to ship. This was not a technical problem.
And it is also not a technical problem that Node does not have URLPattern support, or the File global, or that the Web Crypto API was not on the global object until like late last year, right? These are not technical problems; these are decision-making problems. And yeah, that was also part of the reason why we started Deno as a separate thing. You can try to innovate Node from the inside, but innovating Node from the inside is very slow, very tedious, and requires a lot of fighting.
And sometimes just showing somebody from the outside, like, look, this is the bright future you could have, makes them more inclined to do something.
[00:19:17] Jeremy: You gave the example of fetch taking eight years to get into Node. Do you have a sense of what the typical objection is to something like that? Like, I understand there's a lot of people involved, but why would somebody say, I don't want this?
[00:19:35] Luca: Yeah. So for fetch specifically, there were many different kinds of concerns. I can maybe list two of them. One of them was, for example, that the fetch API is not a good API, and as such, Node should not have it. Which is sort of missing the point: because it's a standard API, how good or bad the API is is much less relevant, because if you can share the API, you can also share a wrapper that's written around the API.
Right? And then the other concern was, Node does not need fetch, because Node already has an HTTP API. So these are both examples of concerns that people had for a long time, where it took a long time to either convince these people or to push the change through anyway. And this is also the case for other things like, for example, Web Crypto. Why do we need Web Crypto?
We already have Node's crypto module. Or why do we need yet another streams implementation? Node already has four different streams implementations; why do we need web streams? And, I don't know if you know this XKCD: there are 14 competing standards, so let's write a 15th standard to unify them all, and then at the end we just have 15 competing standards.
So I think this is also the kind of thing that people were concerned about. But I think what we've seen here is that this is really not a concern that one needs to have, because it turns out in the end that if you implement web APIs, people will use web APIs, and will use web APIs only for their new code.
It takes a while, but we're seeing this with ESM versus require: new code written with require is much less common than it was two years ago. And new code using, like, XHR, XMLHttpRequest, whatever it's called, compared to using fetch? Nobody uses that name.
Everybody uses fetch. And in Node, if you write a little script, you're gonna use fetch; you're not gonna use Node's http.get API or whatever. And we're gonna see the same thing with ReadableStream. We're gonna see the same thing with Web Crypto. We're gonna see the same thing with Blob.
I think one of the big ones where Node is still stuck, and I don't think this is one that's ever gonna get solved, is the Buffer global in Node. We have this Uint8Array global in all the runtimes, including browsers, and Buffer is a superset of that, but it's in global scope.
So it's this non-standard extension of Uint8Array that people in Node like to use, and it's not compatible with anything else. But because it's so easy to get at, people use it anyway. So those are also the kinds of problems that we'll have to deal with eventually. And maybe that means that at some point the Buffer global gets deprecated, though it probably can never get removed.
But yeah, these are the kinds of conversations that the Node TSC is gonna have to have internally in, I don't know, maybe five years.
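The Buffer situation is easy to see in code: Buffer really is a Uint8Array subclass with extra non-standard methods bolted on, and portable code can usually cover the same ground with Uint8Array plus the standard TextEncoder and TextDecoder, which exist in browsers, Deno, and Node:

```javascript
// Node's Buffer is a superset of Uint8Array: every Buffer *is* a
// Uint8Array, but with Node-only extras in the prototype chain.
const buf = Buffer.from("hi");
console.log(buf instanceof Uint8Array); // true

// Standards-only equivalent of Buffer.from(string) / buf.toString(),
// which works unchanged in browsers and other runtimes:
const bytes = new TextEncoder().encode("hi");
const text = new TextDecoder().decode(bytes);

console.log(bytes instanceof Uint8Array); // true
console.log(text); // "hi"
```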
[00:22:37] Jeremy: Yeah, so at a high level, what's shipped in the browser went through the standards approval process, people got it into the browser, and once it's in the browser, it's probably never going away. And because of that, it's safe to build on top of it for these server runtimes, because it's never going away from the browser.
And so everybody can kind of use it into the future and not worry about it. Yeah.
[00:23:05] Luca: Exactly. Yeah. And that's excluding the benefit that if you have code that you can write once and use in both the browser and the server-side runtime, that's really nice. That's the other benefit.
[00:23:18] Jeremy: Yeah. I think that's really powerful. And right now, when someone's looking at running something in Cloudflare Workers versus running something in the browser versus running it somewhere else, I think a lot of people make the assumption it's just JavaScript, so I can use it as is. But there are, at least currently, differences in what APIs are available to you.
[00:23:43] Luca: Yep. Yep.
[00:23:46] Jeremy: Earlier you were talking about how Deno is more than just the runtime. It has a linter, formatter, file watcher; there's all sorts of stuff in there. And I wonder if you could talk a little bit to the reasoning behind that
[00:24:00] Luca: Mm-hmm.
[00:24:01] Jeremy: Having them all be separate things.
[00:24:04] Luca: Yeah, so the reasoning here is essentially: if you look at other modern languages, Rust is a great example, Go is a great example. Even though Go was designed around the same time as Node, it has a lot of these same tools built in. And what it really shows is that if the ecosystem converges, like is essentially forced to converge, on a single set of built-in tooling, that built-in tooling becomes really, really excellent, because everybody's using it.
And also, it means that if you open any project written by any Go developer, any Rust developer, and you look at the tests, you immediately understand how the test framework works and you immediately understand how the assertions work. And you immediately understand how the build system works, and you immediately understand how the dependency imports work.
And you immediately understand, I wanna run this project and I wanna restart it when my file changes; you immediately know how to do that, because it's the same everywhere. And this kind of feeling of having to learn one tool and then being able to use it in all of the projects, being able to contribute to open source when you're moving jobs, whatever, between personal projects that you haven't touched in two years, being able to learn this once and then use it everywhere, is such an incredibly powerful thing.
People don't appreciate this until they've used a runtime or language which provides it to them. There's this saying in the Go ecosystem, I don't remember exactly how it goes, but it essentially implies that maybe not everybody likes the way gofmt formats code, but everybody loves gofmt anyway, because it just makes everything look the same.
And you can read your friend's code, your colleague's code, your new job's code, the same way that you read your code from two years ago. And that's such an incredibly powerful feeling, especially if it's well integrated into your IDE. You clone a repository, open that repository, and your testing panel on the left-hand side just populates with all the tests, and you can click on them and run them.
And if an assertion fails, it's the standard output format that you're already familiar with. It's a really great feeling. And if you don't believe me, just go try it out, and then you will believe me. (laughs)
[00:26:25] Jeremy: Yeah. No, I'm totally with you. I think it's interesting because with JavaScript in particular, it feels like the default in the community is the opposite, right? There are so many different build tools and testing frameworks and formatters, and it's very different than, like you were mentioning, a Go or a Rust, more recent languages where they just include all that, bundled in.
[00:26:57] Luca: Yeah, and I, I think you can see this as well in, in the time that average JavaScript developer spends configuring their tooling compared to a rest developer. Like if I write Rust, I write Rust, like all day, every day. and I spend maybe two, 3% of my time configuring Rust tooling like. Doing dependency imports, opening a new project, creating a format or config file, I don't know, deleting the build directory, stuff like that.
Like that's, that's essentially what it means for me to configure my rest tooling. Whereas if you compare this to like a front-end JavaScript project, like you have to deal with making sure that your React version is compatible with your React on version, it's compatible with your next version is compatible with your ve version is compatible with your whatever version, right?
this, this is all not automatic. Making sure that you use the right, like as, as a front end developer, you developer. You don't have just NPM installed, no. You have NPM installed, you have yarn installed, you have PNPM installed. You probably have like, Bun installed. And, and, and I don't know to use any of these, you need to have corepack enabled in Node and like you need to have all of their global bin directories symlinked into your or, or, or, uh, included in your path.
And then if you install something and you wanna update it, you don't know, did I install it with yarn? Did I install it with N pNPM? Like this is, uh, significant complexity and you, you tend to spend a lot of time dealing with dependencies and dealing with package management and dealing with like tooling configuration, setting up esent, setting up prettier.
and I, I think that like, especially Prettier, for example, really showed, was, was one of the first things in the JavaScript ecosystem, which was like, no, we're not gonna give you a config where you, that you can spend like six hours configuring, it's gonna be like seven options and here you go. And everybody used it because, Nobody likes configuring things.
It turns out, um, and even though there's always the people that say, oh, well, I won't use your tool unless, like, we, we get this all the time. Like, I'm not gonna use Deno FMT because I can't, I don't know, remove the semicolons or, or use single quotes or change my tab width to 16. Right? Like, wait until all of your coworkers are gonna scream at you because you set the tab width to 16 and then see what they change it to.
And then you'll see that it's actually the exact default that, everybody uses. So it'll, it'll take a couple more years. But I think we're also gonna get there, uh, like Node is starting to implement a, a test runner. and I, I think over time we're also gonna converge on, on, on, on like some standard build tools.
Like I think ve, for example, is a great example of this, like, Doing a front end project nowadays. Um, like building new front end tooling that's not built on Vite Yeah. Don't like, Vite's it's become the standard and I think we're gonna see that in a lot more places.
[00:29:52] Jeremy: Yeah, though I think it's tricky, right? Because you have so many people with their existing projects, and you have people who are starting new projects and they're just searching the internet for what they should use. So you're gonna have people on webpack, you're gonna have people on Vite, and I guess now there's gonna be Turbopack, I think is another one that's
[00:30:15] Luca: Mm-hmm.
[00:30:16] Jeremy: There's, there's, there's all these different choices, right? And I, I think it's, it's hard to, to really settle on one, I guess,
[00:30:26] Luca: Yeah,
[00:30:27] Jeremy: uh, yeah.
[00:30:27] Luca: like I, I, I think this is, this is in my personal opinion also failure of the Node Technical Steering committee, for the longest time to not decide that yes, we're going to bless this as the standard format for Node, and this is the standard package manager for Node. And they did, they sort of did, like, they, for example, node Blessed NPM as the standard, package manager for N for for node.
But it didn't innovate on npm. Like no, the tech nodes, tech technical steering committee did not force NPM to innovate NPMs, a private company ultimately bought by GitHub and they had full control over how the NPM cli, um, evolved and nobody forced NPM to, to make sure that package install times are six times faster than they were.
Three years ago, like nobody did that. so it didn't happen. And I think this is, this is really a failure of, of the, the, the, yeah, the no technical steering committee and also the wider JavaScript ecosystem of not being persistent enough with, with like focus on performance, focus on user experience, and, and focus on simplicity.
Like things got so out of hand and I'm happy we're going in the right direction now, but, yeah, it was terrible for some time. (laughs)
[00:31:41] Jeremy: I wanna talk a little bit about how we've been talking about Deno in the context of you just using Deno, using its own standard library, but just recently, last year, you added a compatibility shim where people are able to use Node libraries in Deno.
[00:32:01] Luca: Mm-hmm.
[00:32:01] Jeremy: And I wonder if you could talk to, like earlier you had mentioned that Deno has, a different permissions model.
on the website it mentions that Deno's HTTP server is two times faster than node in a Hello World example. And I'm wondering what kind of benefits people will still get from Deno if they choose to use packages from Node.
[00:32:27] Luca: Yeah, it's a great question. Um, so again, just to clarify what we actually implemented: what we have is support for you to import NPM packages. Um, so you can import any NPM package from npm, from your TypeScript or JavaScript ECMAScript module, um, that you already have for your Deno code.

Um, and we will, under the hood, make sure that it is installed somewhere in some directory globally, like pnpm does. There's no local node_modules folder you have to deal with. There's no package.json you have to deal with. Um, and there's no, uh, package.json versioning things you need to deal with.

Like, what you do is you write import cowsay from "npm:cowsay@1", and that will import cowsay with, like, the semver tag one. Um, and it'll do the semver resolution the same way Node does, or the same way npm does, rather. And what you get from that is that essentially it gives you this backdoor to call out to all of the existing Node code that's already been written, right?

Like, you cannot expect that Deno developers write, like, I don't know... there was this time when Deno did not really have that many third party modules yet. It was very early on, and, I don't know, if you wanted to connect to Postgres and there was no Postgres driver available, then the solution was to write your own Postgres driver.

And that is obviously not great. Um, (laughs) so the better solution here is, for these packages where there's no Deno native or, or web native or standards native, um, package yet that is importable with URL specifiers, to let users import them from npm. Uh, so it's sort of this backdoor into the existing NPM ecosystem.

And we explicitly, for example, don't allow you to create a package.json file or import bare Node specifiers, because we want to stay standards compliant here. Um, but to make this work effectively, we need to give you this little backdoor. Um, and inside of this backdoor, everything is terrible, right?

Like, inside there you can do bare specifiers, and inside there there's package.json and there's crazy Node resolution and __dirname and CommonJS. And all of that stuff is supported inside of this backdoor to make all the NPM packages work. But on the outside, it's exposed as these nice, ESM-only npm specifiers.
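The import Luca spells out looks like this in a Deno module (a sketch; running it requires the Deno CLI, and cowsay is just the example package he names):

```typescript
// Import an npm package directly via an npm: specifier -- no npm install,
// no package.json. Deno resolves the "@1" semver tag the way npm would.
import cowsay from "npm:cowsay@1";

console.log(cowsay.say({ text: "Hello from Deno" }));
```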
And the reason you would want to use this over just using Node directly is because, again, like, you wanna use TypeScript with no config necessary. You wanna have a formatter, you wanna have a linter, you wanna have tooling that does testing and benchmarking and compiling or whatever.

All of that's built in. You wanna run this on the edge, like, close to your users, in like 35 different, uh, points of presence. Um, it's like, okay, push it to your git repository, go to this website, click a button two times, and it's running in 35 data centers. Like, this is the kind of developer experience that you do not get...

You, I will argue that you cannot get with Node right now. Like, even if you're using something like ts-node, it is not possible to get the same level of developer experience that you do with Deno. And the speed at which you can iterate on your projects, like create new projects, iterate on them, is incredibly fast in Deno.

Like, I can open a folder on my computer, create a single file, main.ts, put some code in there, and then call deno run main.ts, and that's it. Like, I did not need to do npm install, I did not need to do npm init -y and remove the license and version fields from the generated package.json and set private to true and whatever else, right?

It just all works out of the box. And I think that's what a lot of people come to Deno for and, and then ultimately stay for. And also, yeah, standards compliance. So, um, things you build in Deno now are gonna work in five, ten years, with no hassle.
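The zero-config flow he describes can be sketched as (hypothetical file name; requires the Deno CLI):

```shell
# create a single TypeScript file and run it -- no npm install,
# no package.json, no tsconfig.json needed
echo 'console.log("hello");' > main.ts
deno run main.ts
```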
[00:36:39] Jeremy: And so with this compatibility layer or this, this shim, is it where the node code is calling out to node APIs and you're replacing those with Deno compatible equivalents?
[00:36:54] Luca: Yeah, exactly. Like, for example, we have a shim in place that shims out the node crypto API on top of the Web Crypto API. Some people may be familiar with this in the form of, um, Browserify shims, if anybody still remembers those. Essentially, in your front end tooling, you were able to import from node crypto in your front end projects, and then behind the scenes your webpack or your Browserify or whatever would take that import from node crypto and would replace it with a shim that exposed the same APIs as node crypto, but under the hood wasn't implemented with native calls. It was implemented on top of Web Crypto, or implemented in userland even.
And Deno does something similar. There's a couple edge cases of APIs where we do not expose the underlying thing that we shim to, to end users outside of the Node shim. So, like, there's some APIs... Node's nextTick, for example.

Um, like, to properly be able to shim Node's nextTick, you need to implement this within the event loop in the runtime. And you don't need this in Deno, because in Deno you use the web standard queueMicrotask to do this kind of thing. But to be able to shim it correctly and run Node applications correctly, we need to have this sort of backdoor into some ugly APIs, um, which natively integrate in the runtime but, yeah, allow this Node code to run.
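A very rough userland approximation of what he describes (hypothetical function name; Deno's real shim lives deeper in the runtime's event loop, as he notes, so this only covers the common case):

```javascript
// Approximate Node's process.nextTick using the web-standard
// queueMicrotask: the callback runs after the current synchronous
// task completes, before any timers fire.
function nextTickShim(callback, ...args) {
  if (typeof callback !== "function") {
    throw new TypeError("callback must be a function");
  }
  queueMicrotask(() => callback(...args));
}

const order = [];
nextTickShim((label) => order.push(label), "tick");
order.push("sync"); // recorded first: the microtask has not run yet
```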
[00:38:21] Jeremy: A, anytime you're replacing a component with a, a shim, I think there's concerns about additional bugs or changes in behavior that can be introduced. Is that something that you're seeing and, and how are you accounting for that?
[00:38:38] Luca: Yeah, that's an excellent question. So this is actually a great concern that we have all the time. And it's not even just introducing bugs; sometimes it's removing bugs. Like, sometimes there's bugs in the Node standard library which are there, and people are relying on these bugs being there for their applications to function correctly.

And we've seen this a lot, and then we implement this from scratch and we don't make that same bug, and then the test fails or the application fails. So what we do is, um, we actually run Node's test suite against Deno's shim layer. So Node has a very extensive test suite for its own standard library, and we can run this suite against our shims to find things like this.

And there's still edge cases, obviously. Like, maybe there's a bug which Node was not even aware of existing, um, where it's now standard, it's now intended behavior, because somebody relies on it, right? Like, the second somebody relies on some non-standard or some buggy behavior, it becomes intended.

Um, but maybe there was no test that explicitly tests for this behavior, so in that case we'll add our own tests to ensure that. But overall, we can already catch a lot of these by just testing against Node's tests. And then the other thing is we run a lot of real code: we'll try running Prisma, and we'll try running Vite, and we'll try running NextJS, and we'll try running, like, I don't know, a bunch of other things that people throw at us, and check that they work. If they work and there's no bugs, then we did our job well and our shims are implemented correctly. Um, and then there's obviously always the edge cases where somebody did something absolutely crazy that nobody thought possible, and then they'll open an issue on the Deno repo, and we scratch our heads for three days, and then we'll fix it.

And then in the next release there'll be a new bug that we added, to make the compatibility with Node better. (laughs) So yeah, running tests is the main thing: running Node's tests.
[00:40:32] Jeremy: Are there performance implications? If someone is running an Express App or an NextJS app in Deno, will they get any benefits from the Deno runtime and performance?
[00:40:45] Luca: Yeah, actually, there are performance implications, and they're usually the opposite of what people think they are. Like, usually when you think of performance implications, it's always a negative thing, right? It's always, like, a compromise: the shim layer must be slower than the real Node, right? It's not like we can run Express faster than Node can run Express.

And obviously not everything is faster in Deno than it is in Node, and not everything is faster in Node than it is in Deno. It depends on the API, depends on what each team decided to optimize. Um, and this also extends to other runtimes.

Like, you can always cherry pick results, like, I don't know, um, to make your runtime look faster in certain benchmarks. But overall, what really matters is... like, the first important step for good Node compatibility is to make sure that if somebody runs their Node code in Deno, or your other runtime or whatever, it performs at least the same.

And then anything on top of that is a great cherry on top. Perfect. But make sure the baseline is at least the same. And I think, yeah, we have very few APIs where there's a significant performance degradation in Deno compared to Node. Um, and we're actively working on those things.

Like, Deno is not a project that's done, right? We have, I think at this point, like 15 or 16 or 17 engineers working on Deno, spanning across all of our different projects. And we have a whole team that's dedicated to performance, um, and a whole team that's dedicated to Node compatibility.

So these things get addressed, and we make patch releases every week and a minor release every four weeks. So yeah, it's not at a standstill. It's, uh, constantly improving.
[00:42:27] Jeremy: Uh, something that kind of makes Deno stand out is its standard library. There's a lot more in there than there is in the Node one.
[00:42:38] Luca: Mm-hmm.
[00:42:39] Jeremy: Uh, I wonder if you could speak to how you make decisions on what should go into it.
[00:42:46] Luca: Yeah, so early on it was easier. Early on, the decision making process was essentially: is this something that a top 100 or top 1000 NPM library implements? And if it is, let's include it. And the decision making is still sort of based on that, but right now we've already implemented most of the low hanging fruit.

So things that we implement now have discussion around them, whether we should implement them. And we have a process where, well, we have a whole team of engineers on our side, and we also have community members that will review PRs and make comments, open issues and review those issues, to sort of discuss the pros and cons of adding any certain new API.

And sometimes it's also that somebody opens an issue that's like: I want, for example, an API to concatenate two Uint8Arrays together, which is something you can really easily do in Node with Buffer.concat, like the scary Buffer thing. And there's no standard way of doing that right now.

So we have a little utility function that does that. But in parallel, we're thinking about, okay, how do we propose an addition to the web standards that makes it easy to concatenate Uint8Arrays in the web standards, right? Yeah, there's a lot to it. Um, but it's really, um, it's all open. Like, all of our discussions for additions to the standard library and things like that,

it's all public on GitHub, in the GitHub issues and GitHub discussions and GitHub PRs. Um, so yeah, that's where we do that.
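The Uint8Array concatenation utility he describes can be sketched in a few lines (an illustrative userland version; the function name is just for this example):

```javascript
// Concatenate any number of Uint8Arrays into one, without Node's
// Buffer.concat -- only web-standard APIs are used.
function concatBytes(...chunks) {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset); // copy each chunk at its running offset
    offset += chunk.length;
  }
  return out;
}

const joined = concatBytes(new Uint8Array([1, 2]), new Uint8Array([3, 4, 5]));
```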
[00:44:18] Jeremy: Yeah, cuz to give an example, I was a little surprised to see that there is support for markdown front matter built into the standard library. But when you describe it as, we look at the top hundred or thousand packages, are people looking at markdown? Are they looking at front matter? I'm sure there's a fair amount that are, so that makes sense.
[00:44:41] Luca: Yeah, like, that one specifically was driven by, like, our team was just building a lot of little blog pages and things like that. And every time, it was either you roll your own front matter parser, or you look for one, and one has a subtle bug here and the other one has a subtle bug there, and we were really not satisfied with any of them.

So we rolled that into the standard library. We add good test coverage for it, add good documentation for it, and then it's just a resource that people can rely on. Um, and you then don't have to make the choice of, like, do I use this library to do my front matter parsing, or the other library?

No, you just use the one that's in the standard library. It's also part of this user experience thing, right? Like, it's just a much nicer user experience not having to make a choice about stuff like that. Like, completely inconsequential stuff, like which library do we use to do front matter parsing. (laughs)
[00:45:32] Jeremy: yeah. I mean, I think when, when that stuff is not there, then I think the temptation is to go, okay, let me see what node modules there are that will let me parse the front matter. Right. And then it, it sounds like probably ideally you want people to lean more on what's either in the standard library or what's native to the Deno ecosystem.
Yeah.
[00:46:00] Luca: Yeah. Like, one of the big benefits is that the Deno standard library is implemented on top of web standards, right? Like, it's implemented on top of these standard APIs. So, for example, there's Node front matter libraries which do not run in the browser, because the browser does not have the Buffer global.

Maybe it's a nice library to do front matter parsing with, but you choose it, and then three days later you decide that actually this code also needs to run in the browser, and then you need to go switch your front matter library. Um, so those are also kind of reasons why we may include something in the standard library. Like, maybe there's even a really good module already to do something,

um, but if there's a certain reliance on specific Node features, and we would like that library to also be compatible with web standards, we might include it in the standard library. Like, for example, the YAML parser in the standard library is a fork of, uh, of the Node YAML module,

and it's essentially that, but cleaned up and made to use more standard APIs rather than, um, Node built-ins.
[00:47:00] Jeremy: Yeah, it kind of reminds me a little bit of when you're writing a front end application, sometimes you'll use node packages to do certain things and they won't work unless you have a compatibility shim where the browser can make use of certain node APIs. And if you use the APIs that are built into the browser already, then you won't, you won't need to deal with that sort of thing.
[00:47:26] Luca: Yeah. Also, like, smaller bundle size, right? Like, if you don't have to ship that shim, that's less code you have to ship to the client.
[00:47:33] Jeremy: Another thing I've seen with Deno is it supports running web assembly.
[00:47:40] Luca: Mm-hmm.
[00:47:40] Jeremy: So you can export functions and call them from type script. I was curious if you've seen practical uses of this in production within the context of Deno.
[00:47:53] Luca: Yeah, there's actually a bunch of really practical use cases. So probably the most executed bit of WebAssembly inside of Deno right now is actually esbuild. Like, esbuild has a WebAssembly build. esbuild is something that's written in Go. You have the choice of either running it natively in machine code, as, like, an ELF process on Linux or on Windows or whatever,

or you can use the WebAssembly build, and then it runs in WebAssembly. And the WebAssembly build is maybe 50% slower than the, uh, native build, but that is still significantly faster than Rollup or, or, I don't know, whatever else people use nowadays to do JavaScript bundling. I don't know, I just use esbuild always. Um,

so, um, for example, the Deno website is running on Deno Deploy. And Deno Deploy does not allow you to run subprocesses, because it's this edge runtime which, uh, has certain security permissions that are not granted, one of them being subprocesses. So it needs to execute esbuild, and the way it executes esbuild is by running it inside WebAssembly.

Um, because WebAssembly is secure. WebAssembly is something which is part of the JavaScript sandbox. It's inside the JavaScript sandbox; it doesn't poke any holes out. Um, so it's able to run within a very strict security context. Um, and then other examples are, I don't know, you want to have an HTML sanitizer which is actually built on the real HTML parser in a browser.

We have an HTML sanitizer called, uh, ammonia, I don't remember exactly. There's an HTML sanitizer library on deno.land/x which is built on the HTML parser from Firefox. Uh, which ensures, essentially, that... like, if you do HTML sanitization, you need to make sure your HTML parser is correct, because if it's not, your browser might parse some HTML one way and your sanitizer parses it another way, and then it doesn't sanitize everything correctly.

Um, so there's the Firefox HTML parser compiled to WebAssembly. Um, you can use that to do HTML sanitization. Or the Deno documentation generation tool, for example, Deno Doc: there's a WebAssembly build for it that allows you to programmatically, like, generate documentation for your TypeScript modules.

Um, yeah, and also, you know, deno fmt is available as a WebAssembly module for programmatic access, and a bunch of other internal Deno components as well.
[00:50:20] Jeremy: What are some of the current limitations of web assembly and Deno for, for example, from web assembly, can I make HTTP requests? Can I read files? That sort of thing.
[00:50:34] Luca: Mm-hmm. Yeah. So WebAssembly, like, when you spawn WebAssembly, um, they're called instances, WebAssembly instances. It runs inside of the same VM, like the same V8 isolate, is what they're called. But it's like a completely fresh sandbox, sort of, in the sense that I told you that an engine essentially implements no IO calls, right, and a runtime does; a runtime pokes holes into the engine.

WebAssembly by default works the same way: there are no holes poked into its sandbox. So you have to explicitly poke some holes. Uh, if you want to do HTTP calls, for example: when you create a WebAssembly instance, you can give it something called imports, uh, which are essentially JavaScript function bindings which you can call from within the WebAssembly.

And you can use those function bindings to do anything you can from JavaScript. You just have to pass them through explicitly. And, yeah, depending on how you write your WebAssembly, like, if you write it in Rust, for example, the tooling is very nice, and you can just call some JavaScript code from your Rust, and then the build system will automatically make sure that the right function bindings are passed through with the right names.

And, like, you don't have to deal with anything. And if you're writing Go, it's slightly more complicated. And if you're writing, like, raw WebAssembly, like the WebAssembly text format, and compiling that to a binary, then you have to do everything yourself, right? It's sort of the difference between writing C and writing JavaScript.

Like, yeah, what level of abstraction do you want? It's definitely possible, though. And as for limitations: the same limitations as existing browsers apply. Like, the WebAssembly support in Deno is equivalent to the WebAssembly support in Chrome. So you can do, uh, many things, like multi-threading and stuff like that, already.

But especially around shared mutable memory, um, and having access to that memory from JavaScript, that's something which is a real difficulty with WebAssembly right now. Yeah, growing WebAssembly memory is also rather difficult right now. There's a couple inherent limitations right now with WebAssembly itself.

Um, but those will be worked out over time. And Deno is very up to date with the version of the standard it implements, um, through V8. Like, we're up to date with Chrome Beta essentially all the time. So, um, yeah, anything you see in Chrome Beta is gonna be in Deno already.
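The "poke holes through explicit imports" mechanism can be shown with a tiny hand-assembled module (hypothetical bytes for a module that imports one host function, env.log, and exports a run function that calls it; this sketches the general WebAssembly JS API, not anything Deno-specific):

```javascript
// A minimal wasm binary: imports "env.log" () -> () and exports
// "run", whose body just calls the imported function.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,                   // type: () -> ()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,             // import "env"...
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                   // ..."log" (func, type 0)
  0x03, 0x02, 0x01, 0x00,                               // one local func, type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" (func 1)
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b,       // body: call $log; end
]);

const calls = [];
const module = new WebAssembly.Module(bytes);
// The import object is the ONLY way this instance can reach the host:
// no import binding, no side effects outside the sandbox.
const instance = new WebAssembly.Instance(module, {
  env: { log: () => calls.push("host function called") },
});
instance.exports.run();
```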
[00:52:58] Jeremy: So you talked a little bit about this before: the Deno team, they have their own hosting platform called Deno Deploy. So I wonder if you could explain what that is.
[00:53:12] Luca: Yeah, so Deno has this really nice concept of permissions. Sorry, I'm gonna start somewhere slightly unrelated. Maybe it sounds like it's unrelated, but you'll see in a second it's not unrelated. Um, Deno has this really nice permission system which allows you to sandbox Deno programs to only allow them to do certain operations.

For example, in Deno, by default, if you try to open a file, it'll error out and say you don't have read permissions to read this file. And then what you do is you specify --allow-read. Um, you can either specify --allow-read by itself, and then it'll grant read access to the entire file system,

or you can explicitly specify files or folders or any number of things. Same goes for write permissions, same goes for network permissions, um, same goes for running subprocesses, all these kinds of things. And by limiting your permissions just a little bit, like, for example, by just disabling subprocesses and the foreign function interface, but allowing everything else, allowing reads and allowing network access and all that kind of stuff,

we can run Deno programs in a way that is significantly more cost effective to you as the end user, and, like, we can cold start them much faster than you may be able to with a more conventional container based, uh, system. So what Deno Deploy is, is a way to run JavaScript or Deno code on our data centers all across the world with very little latency.

Like, you can write some JavaScript code which serves HTTP requests, deploy that to our platform, and then we'll make sure to spin that code up all across the world and have your users be able to access it through some URL or some, um, custom domain or something like that. And this is very similar to Cloudflare Workers, for example.

Um, and, like, Netlify Edge Functions is built on top of Deno Deploy. Like, Netlify Edge Functions is implemented on top of Deno Deploy, um, through our subhosting product. Yeah, essentially Deno Deploy is, um, yeah, a cloud hosting service for JavaScript, um, which allows you to execute arbitrary JavaScript.
And there's a couple different directions we're going there. One is more end user focused, where, like, you link your GitHub repository and we'll have a nice experience like you do with Netlify and Vercel, where your commits automatically get deployed and you get preview deployments and all that kind of thing,

for your backend code, though, rather than for your front end websites. Although you could also write front end websites, obviously. And the other direction is more business focused. Like, you're writing a SaaS application and you want to allow the user to customize it. Like, you're writing a SaaS application that provides users with the ability to write their own online store,

um, and you want to give them some ability to customize the checkout experience in some way. So you give them a little, like, text editor that they can type some JavaScript into. And then, when your SaaS application needs to hit this code path, it sends a request to us with the code, and we'll execute that code for you in a secure way,

in a secure sandbox. You can, like, tell us: this code only has access to my API server and no other networks, to prevent data exfiltration, for example. And then you can have all this super customizable code inside of your SaaS application without having to deal with any of the operational complexities of scaling arbitrary code execution, or even just doing arbitrary code execution, right?

Like, this is a very difficult problem. Give it to someone else; we deal with it, and you just get the benefits. Yeah, that's Deno Deploy, and it's built by the same team that builds the Deno CLI. So, um, all of your favorite, like, Deno CLI, or, or Deno APIs are available in there.

It's just as web-standard as Deno: you have fetch available, you have Blob available, you have Web Crypto available, that kind of thing. Yeah.
[00:56:58] Jeremy: So when someone ships you their, their code and you run it, you mentioned that the, the cold start time is very low. Um, how, how is the code being run? Are people getting their own process? It sounds like it's not, uh, using containers. I wonder if you could explain a little bit about how that works.
[00:57:20] Luca: Yeah, yeah, I can give a high level overview of how it works. So the way it works is that we essentially have a pool of Deno processes ready. Well, it's not quite Deno processes; it's not the same Deno CLI that you download. It's like a modified version of the Deno CLI, based on the same infrastructure, that we have spun up across all of our different regions across the world, uh, across all of our different data centers.

And then when we get a request, we'll route that request. Um, the first time we get a request for a given deployment, that's what we call them, that, like, code, right, we'll take one of these idle Deno processes and we'll assign that code to run in that process, and then that process can go serve the requests. And these processes, they're isolated.

It's essentially a V8 isolate. Um, and it's a very, very slim, it's a much, much slimmer version of the Deno CLI, essentially, uh, where the only thing it can do is JavaScript execution. Like, it can't even execute TypeScript, for example; TypeScript we pre-process up front to make the cold start faster.

And then what we do is, if you don't get a request for some amount of time, we'll, uh, spin down that, um, that isolate, and, uh, we'll spin up a new idle one in its place. And then, um, if you get another request, I don't know, an hour later for that same deployment, we'll assign it to a new isolate. And yeah, that's a cold start, right?

Uh, if you have a deployment which receives a bunch of traffic, like, let's say you receive a hundred requests per second, we can send a bunch of that traffic to the same isolate. Um, and we'll make sure that if that one isolate isn't able to handle that load, we'll spread it out over multiple isolates, and we'll sort of load balance for you.

Um, and we'll make sure to always send to the point of presence that's closest to the user making the request, so they get very minimal latency. And we have these layers of load balancing in place. And I'm glossing over a bunch of, like, security related things here, about how these processes are actually isolated and how we monitor to ensure that you don't break out of these processes.
And for example, Deno Deploy, it looks like you have a file system, cuz you can read files from the file system, but in reality Deno Deploy does not have a file system. Like, the file system is a global virtual file system, which is, uh, yeah, implemented completely differently than it is in the Deno CLI.

But as an end user, you don't have to care about that, because the only thing you care about is that it has the exact same API as the Deno CLI, and if you run your code locally and it works there, it's also gonna work in Deploy. Yeah, so that's kind of a high level overview of Deno Deploy. If any of this sounds interesting to anyone, by the way, uh, we're, like, very actively hiring on Deno Deploy.

I happen to be the tech lead for the Deno Deploy product, so I'm always looking for engineers to join our ranks and build cool distributed systems. Deno.com/jobs.
[01:00:15] Jeremy: for people who aren't familiar with the isolates, are these each run in their own processes, or do you have a single process and that has a whole Bunch of isolates inside it?
[01:00:28] Luca: in, in the general case, you can say that we run, uh, one isolate per process. but there's many asterisks on that. Um, because, it's, it's very complicated. I'll just say it's very complicated. Uh, in, in the general case though, it's, it's one isolate per process.
Yeah.
[01:00:45] Jeremy: And then you touched a little bit on the permissions system. Like you gave the example of somebody could have a website where they let their users give them code to execute. how does it look in terms of specifying what permissions people have? Like, is that a configuration file? Are those flags you pass in?
What, what does that look?
[01:01:08] Luca: Yeah. So that product is called subhosting. It's, um, slightly different from our end user platform. Um, it's essentially a service where, like, you email us, we'll, um, onboard you, and then what you can do is you can send HTTP requests to a certain endpoint with an authentication token and

a reference to some code to execute. And then what we'll do is, when we receive that HTTP request, we'll fetch the code, spin up an isolate, execute the code, serve the request, return you the response, um, and then we'll pipe logs to you and stuff like that.

And part of that is also, when we pull the code to spin up the isolate, that code doesn't just include the code that we're executing, but also includes things like permissions and various other... we call this isolate configuration. Um, you can inspect it; this is all public.

We have public docs for this at Deno.com/subhosting. I think. Yes, Deno.com/subhosting.
[01:02:08] Jeremy: And is that built on top of something that's a part of the public Deno project, the open source part? Or is this specific to this sub hosting product?
[01:02:19] Luca: Um, so the underlying engine or underlying runtime that executes the code here, like all of the code execution, is performed by code which is public. It uses the Deno CLI, just with a bunch of stuff stripped out that it doesn't need. The orchestration code is not public.
The orchestration code is proprietary. and yeah, if you have use cases that where you would like to run this orchestration code on your own infrastructure, and yeah, you have interesting use cases, please email us. We would love to hear from you.
[01:02:51] Jeremy: separate from the, the orchestration, if it's more of an example of, let's say I deploy a Deno application and in the case that someone was able to get some, like malicious code or URLs into my application, could I tell Deno I only want this application to be able to call out to these URLs for just as an example?
[01:03:18] Luca: Yes. So it's, it's slightly more complicated, because you can't actually tell it that it can only call out to specific URLs, but you can tell it to call out only to specific domains or IP addresses. Which is sort of the same thing, but, uh, just a slightly different layer of abstraction. Yeah, you can do that.
The allow-net flag allows you to specify a set of domains, to allow requests only to those domains. Yes,
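On the command line, that looks roughly like this. The `--allow-net` flag syntax is real Deno CLI; the domain names and the `main.ts` file are placeholders for illustration:

```shell
# Allow outbound network access only to these two (made-up) domains;
# any fetch to another host will be denied at runtime.
deno run --allow-net=api.example.com,example.com main.ts
```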
[01:03:41] Jeremy: I see. So on the user-facing open source part, there are configuration flags where you could say, I want this application to be able to access these domains, or I don't want it to be able to use IO or whatever
[01:03:56] Luca: Yeah, exactly.
[01:03:57] Jeremy: their, their flags.
[01:03:59] Luca: Yeah. And, and on subhosting, this is done via the isolate configuration, which is like a JSON blob. And yeah, ultimately it's all sort of the same concept, just slightly different interfaces, because, like, the subhosting one needs to be a programmatic interface instead of, uh, something you type as an end user.
Right?
[01:04:20] Jeremy: One of the things you mentioned about Deno Deploy is it's centered around deploying your application code to a bunch of different locations. And you also mentioned the cold start times are very low.
Could you kind of give the case for wanting your application code at a bunch of different sites?
[01:04:38] Luca: Mm-hmm. Yeah. So the, the main benefit of this is that when your user makes a request to your application, um, you don't have to round trip back to wherever your centrally hosted application would otherwise be. Like, if you are a startup, even if you're just in the US, for example, it's nice to have points of presence not just on one of the US coasts, but on both of the US coasts, because that means that your round trip time is not gonna be a hundred milliseconds, but it's gonna be 20 milliseconds.
There's obviously always the problem here that if your database lives on only one of the two coasts, you still need to do the round trip. And there's solutions to this: one is caching, uh, that's the, the obvious sort of boring solution. Um, and then there's the solution of using databases which are built exactly for this.
For example, CockroachDB is a database which is Postgres compatible, but it's really built for, um, global distribution and built for being able to shard data across regions and have different, um, primary regions for different shards of your tables. Which means, for example, your users on the East Coast, their data could live on a database on the East Coast, and your users on the West Coast, their data could live on a database on the West Coast.
And your, like, admin panel needs to show all of them; it has an aggregate view over both coasts, right? Like, this is something which something like CockroachDB can do, and it can be a really great, um, great thing here. And we acknowledge that this is not something which is very easy to do right now, and Deno tries to make everything very easy.
So you can imagine that this is something we're working on, and we're working on, on database solutions. And actually I should more generally say persistence solutions, solutions that allow you to persist data in a way that makes sense for an edge system like this. Um, where the data is persisted close to users that need it.
[01:06:44] Luca: Um, and data is cached around the world. And you still have sort of semantics which, which are consistent with the semantics that you have when you're locally developing your application. Like, for example, you don't want your local application development to have strong consistency, but then in production you have eventual consistency, where suddenly, I don't know, all of your code breaks because your US West region didn't pick up the changes from US East, because it's eventually consistent, right?
I mean, this is a problem that we see with a lot of the existing solutions here. Like, specifically Cloudflare KV, for example. Cloudflare KV is, um, a system with single primary write regions, where there's just a bunch of caching going on. And this leads to eventual consistency, which can be very confusing for, for end user developers.
Um, especially because if you're using this locally, the local emulator does not emulate the eventual consistency, right? So this, this can become very confusing very quickly. And so anything that we build in, in this persistence field, for example, we very seriously weigh these trade-offs and make sure that if there's something that's eventually consistent, it's very clear, and it works the same way, the same eventually consistent way, in the CLI.
[01:08:03] Jeremy: So for someone, let's say they haven't made that jump yet to use a Cockroach. They, they just have their database instance in AWS East or whatever. Does having the code at the edge, where it all ends up needing to go to East, is that better than having the code be located next to the database?
[01:08:27] Luca: Yeah. Yeah. It, it totally does. Um, there's, there's different trade-offs here, right? Obviously, like, if you have an admin panel, for example, or a, a like user dashboard, which is very, very reliant on data from your database, and for every single request needs to fetch fresh data from the database, then maybe the trade off isn't worth it.
But most applications are not like that. Most applications are, for example, you have a landing page and that landing page needs to do AB tests. and those AB tests are based on some heuristic that you can fetch from the database every five seconds. That's fine. Like, it doesn't need to be perfect, right?
So you, you have caching in place, which, um, like, by doing this caching locally to the user, um, and still being able to programmatically control this, like, based on, I don't know, the user's user agent, or the IP address of the user, or the region of the user, or the past browsing history of that user as measured by their cookies or whatever else, right?
Being able to do these highly user-customized actions very close to the user means that, like, this is a much better user experience than if you have to do the roundtrip, especially if you're a, a startup or, or a, um, service which is globally distributed and, and serves not just users in the US or the EU but, like, all across the world.
[01:09:52] Jeremy: And when you talk about caching in the context of Deno Deploy, is there a cache native to the system or are you expecting someone to have, uh, a Redis or a memcached, that sort of thing?
[01:10:07] Luca: Yeah. So Deno Deploy actually has a built-in web cache API, um, which is also the web cache API that's used by service workers and, and others. And Cloudflare also implements this cache API. Um, and this is something that's implemented in the Deno CLI, and it's gonna be coming to Deploy this quarter. That's the native way to do caching. And otherwise you can also use Redis, you can use services like Upstash, or, uh, even, like, primitive in-memory caches, where it's just an LRU that's in memory, like a, like a JavaScript data structure, right?
Or even just a JavaScript map or JavaScript object with a, with a time on it. And every time you read from it, if the time is above some certain threshold, you delete the cache entry and go fetch it again, right? Like, there's many things that you could consider a cache that are not like Redis or, or like the web cache API.
So there's, there's ways to do that. And there's also a bunch of, like, modules (not in the standard library, sorry, in the third-party module registry) and also on NPM that you can use to, to implement different cache behaviors.
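The JavaScript-map-with-a-timestamp idea Luca sketches can be written in a few lines. This is a minimal illustration, not Deno Deploy's cache API; the class and names are made up for this example:

```typescript
// A minimal in-memory TTL cache: a Map plus a stored-at timestamp per entry.
// Note this is strictly a cache, never storage: the isolate holding it can
// be restarted or moved at any time, wiping everything.
type Entry<V> = { value: V; storedAt: number };

class TtlCache<V> {
  private entries = new Map<string, Entry<V>>();
  constructor(private ttlMs: number) {}

  get(key: string): V | undefined {
    const e = this.entries.get(key);
    if (!e) return undefined;
    // Past the threshold: delete the entry and make the caller re-fetch.
    if (Date.now() - e.storedAt > this.ttlMs) {
      this.entries.delete(key);
      return undefined;
    }
    return e.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, storedAt: Date.now() });
  }
}
```

On a miss the caller fetches from the origin (database, upstream API) and calls `set` before returning, so warm isolates answer most requests without a round trip.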
[01:11:15] Jeremy: And when you give the example of an in-memory cache, when you're running in Deno Deploy, you're running in these isolates, which presumably can be shut down at any time. So what kind of guarantees do users have that whatever they put into memory will still be there?
[01:11:34] Luca: None. Like, it's, it's a cache, right? The cache can be evicted at any time. Your isolate can be restarted at any time. It can be shut down. You can be moved to a different region. The data center could go down for maintenance. Like, this is something your application has to be built in a way that it is tolerant to, to restarts, essentially.
But because it's a cache, that's fine. Because if the cache expires, or, or the cache is cleared through some external means, the worst thing that happens is that you have a cold request again, right? And if you're serving, like, a hundred requests a second, I can essentially guarantee to you that not every single request will invoke a cold start.
Like, I can guarantee to you that probably less than 0.1% of requests will, will cause a cold start. This is not, like, an SLA anywhere, um, because it's, like, totally up to, to however the, the system decides to scale you. But yeah, it would be very wasteful for us, for example, to spin up a new isolate for every request.
So we don't; we reuse isolates wherever possible. Yeah. It's, like, it's in our best interest to not cold start you, um, because it's expensive for us to do all the CPU work to, to cold start an isolate, right?
[01:12:47] Jeremy: And typically with applications, people will put a, a CDN in front, and they'll use things like cache control headers to be able to serve straight from the CDN. Is that a supported use case with Deno Deploy, or is there anything that people should be aware of when they're doing that sort of thing?
[01:13:09] Luca: Yeah, so you can do that. Um, like, you could put a cache in front of Deploy, but in most cases it's really not necessary, um, because the main reason people use CDNs is essentially to, like, solve this global distribution problem, right? Like, you, you want to be able to cache close to users. But if your application is already executing close to users, the cost of, of serving something, like serving a request from a JavaScript cache, is, like, marginal.
It's so low. There's, there's like nearly no CPU time involved here. It's, it's network bandwidth, that's the, that's the limiting factor, and that's the limiting factor for all CDNs. Uh, so, so whether you're serving on Deploy or you have a, a separate CDN that you put in front of it, hmm, not really that big a difference.
Like, you can do it. But, I don't know, Deno.com doesn't, and Deno.land, like, they don't have a CDN in front of them. They're running bare on, on Deno Deploy and, yeah, it's fine.
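If you do put a CDN (or rely on the browser cache) in front, the mechanism is the same on Deploy as anywhere else: the handler sets a standard Cache-Control header on the response. A minimal sketch; the body and the max-age value are made-up illustrations, and the `Request -> Response` handler shape is the standard web-platform signature:

```typescript
// A handler whose responses can be cached by any shared cache in front of it.
function handler(_req: Request): Response {
  return new Response("<h1>hello</h1>", {
    headers: {
      "content-type": "text/html",
      // "public" allows shared caches (CDNs) to store it; max-age is seconds.
      "cache-control": "public, max-age=300",
    },
  });
}
```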
[01:14:06] Jeremy: So even for things like images, for example, something that somebody might store in object storage and put a CDN in front.
[01:14:17] Luca: Mm-hmm.
[01:14:18] Jeremy: are you suggesting that people could put it on Deno Deploy directly, or just kind of curious what your thoughts are there?
[01:14:26] Luca: Yeah. Uh, like if you have a blog and your profile image is, is part of your blog, right? And you can put that in your static file folder and serve that directly from your Deno Deploy application, like that's totally cool. Uh, you should do that because that's obvious and that's the obvious way to do things.
If you're specifically building, like, an image-serving CDN, go reach out to us, because we'd love to work with you. But also, um, like, there's probably different constraints that you have. Um, like, you probably very, very, very much care about network bandwidth costs, um, because that is, like, your number one primary cost factor.
So yeah, it's just, what's the trade-off? What, what trade-offs are you willing to make? Like, does some other provider give you a lower network bandwidth cost? I would argue that if you're building, like, an image CDN, then, even if you have to write your application code in Haskell or in whatever, it's probably worth it if you can get, like, a cent cheaper per-gigabyte transfer fees.
Just because, like, 100% of your, of your costs, um, is, is network bandwidth. So it's really a trade-off based on what, what you're trying to build.
[01:15:36] Jeremy: And if I understand correctly, Deno Deploy is centered around applications that take HTTP requests. So it could be a website, it could be an API, that sort of thing. And sometimes when people build applications, they have other things surrounding them. They'll, they'll need scheduled jobs. They may need some form of message queue, things like that.
Things that don't necessarily fit into what Deno Deploy currently hosts. And so I wonder for things like that, what you recommend people would do while working with Deno Deploy.
[01:16:16] Luca: Yeah. Great question. Unfortunately I can't tell you too much about that without, like, spoiling everything (laughs), but what I'm gonna say is, you should keep your eyes peeled on our blog over the next two to three months here. I consider message queues, like, especially message queues, they are a persistence feature, and we are currently working on persistence features.
So yeah, that's all I'm gonna say. But, uh, you can expect Deno Deploy to do things other than, um, just HTTP requests in the not-so-far future. And, like, cron jobs and stuff like that also, uh, at some point, yeah.
[01:16:54] Jeremy: All right. We'll look, we'll look out for that. I guess as we wrap up, maybe you could give some examples of who's using Deno, and what types of projects do you think are ideal for Deno?
[01:17:11] Luca: Yeah, yeah. Uh, Deno or Deno Deploy? Like, Deno as in all of Deno, or Deno Deploy specifically?
[01:17:17] Jeremy: I, I mean, I guess either (laughs)
[01:17:19] Luca: Okay. Okay. Okay. Yeah, yeah. Uh, let's, let's do it. So, one really cool use case, for example, for Deno is Slack. Uh, Slack has this app platform that they're building, um, which allows you to execute arbitrary JavaScript from inside of Slack, in response to, like, slash commands and, like, actions. I dunno if you've ever seen, like, those little buttons you can have in messages; if you press one of those buttons, that can execute some Deno code.
And Slack has built, like, this entire platform around that, and it makes use of Deno's, like, security features and, and built-in tooling and, and all that kind of thing. Um, and that's really cool. And Netlify has built edge functions, which is, like, a really, really awesome primitive they have for, for being able to customize outgoing requests, to even come up with completely new requests on the spot, um, as part of their CDN layer.
Uh, also built on top of Deno. And GitHub has built this platform called Flat, which allows you to, like, sort of, um, on cron schedules, pull data, um, into git repositories, and process that and post-process that and do things with that. And it's integrated with GitHub Actions, all that kind of thing.
It's kind of cool. Supabase also has an Edge Functions product that's built on top of Deno. I'm just thinking about others; like, those are, those are the obvious ones that are on the homepage. There's, I, I know, for example, there's an image CDN actually that serves images with Deno, like 400 million of them a day.
Kind of related to what we were talking about earlier. Actually, I don't know if it's still 400 million. I think it's more, um, the last data I got from them was, like, maybe eight months ago. So probably more at this point. Um, yeah. A bunch of cool, cool things like that. Um, we have, like, a really active Discord channel, and there's always people showcasing what kind of stuff they built in there; we have a showcase channel.
I think if you're really interested in what cool things people are building with Deno, that's, like, a great place to, to look. I think actually we maybe also have a showcase page. Do we have Deno.land/showcase? I don't remember. Oh yeah, we do: Deno.com/showcase, which is a page of, like, a bunch of projects built with Deno, or, or products using Deno, or, um, other things like that.
[01:19:35] Jeremy: Cool. if people wanna learn more about Deno or see what you're up to, where should they head?
[01:19:42] Luca: Yeah. Uh, if you wanna learn more about the Deno CLI, head to Deno.land. If you wanna learn more about Deno Deploy, head to Deno.com/deploy. Um, if you want to chat with me, uh, you can hit me up on my website, lcas.dev. If you wanna chat about Deno, you can go to discord.gg/deno. Yeah, and if you're interested in any of this and think that maybe you have something to contribute here, you can either become an open source contributor on our open source project, or if this is really something you wanna work on and you like distributed systems or systems engineering or fast performance, head to deno.com/jobs and send in your resume.
We're, we're very actively hiring and would be super excited to, to work with you.
[01:20:20] Jeremy: All right, Luca. Well thank you so much for coming on Software Engineering Radio.
[01:20:24] Luca: Thank you so much for having me.
Leaguepedia is a MediaWiki instance that covers tournaments, teams, and players in the League of Legends esports community. It's relied on by fans, analysts, and broadcasters from around the world.
Megan "River" Cutrofello joined Leaguepedia in 2014 as a community manager and by the end of her tenure in 2022 was the lead for Fandom's esports wikis.
She built up a community of contributing editors in addition to her role as the primary MediaWiki developer.
She writes on her blog and is a frequent speaker at the Enterprise MediaWiki Conference.
You can help edit this transcript on GitHub.
[00:00:00] Jeremy: Today I'm talking to Megan Cutrofello. She managed the Leaguepedia eSports wiki for eight years, and in 2017 she got an award for being the unsung hero of the year for eSports. So Megan, thanks for joining me today.
[00:00:17] Megan: Thanks for having me.
[00:00:19] Jeremy: A lot of the people I talk to are into web development, so they work with web frameworks and things like that. And I guess when you think about it, wikis are web development, but they're kind of their own world, I suppose. for someone who's going to build some kind of a site, like when does it make sense for them to use a wiki versus, uh, a content management system or just like a more traditional web framework?
[00:00:55] Megan: I think it makes the most sense to use a wiki if you're going to have a lot of contributors and you don't want all of your contributors to have access to your server.
also if your contributors aren't necessarily as tech savvy as you are, um, it can make sense to use a wiki. if you have experience with MediaWiki, I guess it makes sense to use a Wiki.
Anytime I'm building something, my instinct is always, oh, I wanna make a wiki (laughs). Um, so even if it's not necessarily the most appropriate tool for the job, my, my first thought is always, hmm, let's see, I'm, I'm making a blog. Should I make my blog in MediaWiki? Um, so, so I always, I always wanna do that. But I think anytime you're collaborating, pretty much, you always wanna do MediaWiki.
[00:01:47] Jeremy: And I, I think that's maybe an important point when you say people are collaborating. When I think about Wikis, I think of Wikipedia, uh, and the fact that I can click the edit button and I can see the markup right there, make a change and, and click save. And I didn't even have to log in or anything. And it seems like that workflow is built into a wiki, but maybe not so much into your typical CMS or WordPress or something like that.
[00:02:18] Megan: Yeah. Having a public ability to solicit contributions from anyone. so for Leaguepedia, we actually didn't have open contributions from the public. You did have to create an account, but it's still that open anyone can make an account and all you have to do is like, go through that one step of create an account.
Admittedly, sometimes people are like, I don't wanna make an account that's so much work. And we're like, just make the account. Come on. It's not that hard. but, uh, you still, you're a community and you want people to come and contribute ideas and you want people to come and be a part of that community to, document your open source project or, record the history of eSports or write down all of the easter eggs that you find in a video game or in a TV show, or in your favorite fantasy novels.
Um, and it's really about community and working together to create something where the whole is bigger than the sum of its parts.
[00:03:20] Jeremy: And in a lot of cases when people are contributing, I've noticed that on Wikipedia when you edit, there's an option for a, a visual editor, and then there's one for looking at the raw markup. in, in your experience, are people who are doing the edits, are they typically using the visual editor or are they mostly actually editing the, the markup?
[00:03:48] Megan: So we actually disabled the Visual editor on Leaguepedia, because the visual editor is not fantastic at knowing things about templates. Um, so a template is when you have one page that gets its content pulled into the larger page, and there's a special syntax for that, and the visual editor doesn't know a lot about that.
Um, so that's the first reason. And then the second reason is that there's this, uh, one extension that we use that allows you to make a clickable piece of text. It's called (https://www.mediawiki.org/wiki/Extension:CharInsert) CharInserts, uh, for character inserts. So I made a lot of these things that are sort of along the same philosophy as Visual Editor, where it's to help people not have to have the same burden of knowledge, of knowing every exact piece of source that has to be inserted into the page. So you click the thing that says, like, um, insert a pick and ban prefill, and then a little piece of JavaScript fires and it inserts a whole bunch of wiki text, and then you just enter the champions in the correct places in the prefills. Champions are, like, the characters that you play in, uh, League of Legends.
And so then you have, like, the text is prefilled for you, and you only have to fill in this outline. So Visual Editor would conflict with CharInserts, and I much preferred the CharInserts approach, where you have this compromise in between never interacting with source and having to have all of the source memorized.
So between the fact that Visual Editor like is not a perfect tool and has these bugs in it, and also the fact that I preferred CharInserts, we didn't use Visual Editor at all. I know that some wikis do like to use Visual Editor quite a bit, and especially if you're not working with these templates where you have all of these prefills, it can be a lot more preferred to use Visual Editor.
Visual Editor is an experience much more similar to editing something like Microsoft Word. It doesn't feel like you're editing code. And editing code is, I mean, it's scary. Like, when I said MediaWiki is for when you have editors who aren't as tech savvy as the person who set up the wiki,
for people who don't have that experience, I mean, if you just said, like, you have to edit a wiki, someone who's never done that before can be very intimidated by it. And you're trying to build a sense of community. You don't want to scare away your potential editors. You want everyone to be included there.
So you wanna do everything possible to make everyone feel safe, to contribute their ideas to the Wiki. and if you make them have to memorize syntax, like even something that to me feels as simple as like two open brackets and then the name of a page, and then two closed brackets means linking the page.
Like, I mean, I'm used to memorizing a lot of syntax because like, I'm a programmer, but someone who's never written code before, I mean, they're not used to memorizing things like that. So they wanna be able to click a button that says insert link, and then type the name of the page in the middle of the things that pop up there.
Um, so Visual Editor is, it's a lot safer to use. So a lot of wikis do prefer that. And if it didn't have the bugs with the type of editing that my wiki required, and if we weren't using CharInserts so much, we definitely would've gone for it. But, um, it wasn't conducive to the wiki that I built, so we didn't use it at all.
[00:07:42] Jeremy: And the, the compromise you're referring to, is it where the editor sees the raw markup, but then they can, there's like little buttons on the side they can click and they'll know, okay, if I click this one, then it's going to give me the text for creating a list or something like that.
[00:08:03] Megan: Yeah, it's a little bit more high level than creating a list because I would never even insert the raw syntax for creating a list. It would be a template that's going to insert a list at the very end. but basically that, yeah,
[00:08:18] Jeremy: And I, I know for myself, even though I do software development, if I click at it on a wiki and there's all the different curly brace tags, there's the square tags, and. I think if you spend some time with it, you can kind of get a sense of what it means. But for the average person who doesn't work with software in their day to day, do, do you find that, is that a big barrier for them where they, they click edit and there's all this stuff that they don't really understand?
Is that where some people just, they go, oh, I don't, I don't know what to do.
[00:08:59] Megan: I think the biggest barrier is actually clicking edit in the first place. So that was a big barrier to me, actually. I didn't wanna click edit in the first place, and I guess my reasons were maybe a little bit different, where for me it was like, I know that if I click edit, this is going to be a huge rabbit hole and I'm going to learn way too much about wikis and this is going to consume my entire life, and look where I ended up.
So I guess I was pretty right about that. I don't know if other people feel the same way or if they just, like, don't wanna get involved at all. But I think once people click edit, they're able to figure it out pretty well. I think there's, there's two barriers, or maybe three barriers. The first one is clicking edit in the first place.
The second one is if they learn to code templates at all. MediaWiki syntax is literally the worst I have encountered, other than programming languages that are literally parodies. So, like, the Whitespace language is worse (laughs https://en.wikipedia.org/wiki/Whitespace_(programming_language)), but, like, it's two curly braces for a template and it's three curly braces for a variable.
And, like, are you actually kidding me? One of my blog posts is, like, a plea to editors to write a comment saying the name of the template that they're ending, because MediaWiki, like, doesn't provide any syntax for what you're ending. And there's no, like, there's no indentation, so you can't visually see what you're ending.
So when I said the Whitespace language, that was maybe appropriate, because MediaWiki prints all of the white space, because it's really just, like, PHP functions that are put into the text that you're literally putting onto the page. So any white space that you put gets printed.
So the only way to put white space into your code is if you comment it out. So anytime you wanna put a new line, you have to comment out your new line. And if you wanna indent your code, you have to comment out the indents. So it's just, I, I'm not exaggerating here, it's, it's just the worst. Occasionally you can put a little bit of white space, because there's, like, some divisions in parser functions that get handled when it gets sent to the parser. But, I mean, for the most part it's just, it's just terrible. So if I'm, like, writing an if statement, I'll write the if, and then I'll write a commented-out endif at the end. So once an editor starts to write templates, like with parser functions and stuff, that's another big barrier. And that's not because, like, people don't know how to code; it's just because the MediaWiki language, and I use "language" very loosely, it's, like, this collection of PHP functions poured into this just disaster.
It's just, it's not good! (laughs) And the, the next barrier is when people start to jump to Lua, which is just, I mean, it's just Lua where you can write Lua modules and then, Lua is fine. It's great, it has white space and you can make new lines and it's absolutely fine and you can write an entire code base and as long as you're writing Lua, it's, it's absolutely fantastic and there's nothing wrong with it anymore (laughs)
So as much as I just insulted the MediaWiki language, like, writing Lua in MediaWiki is great (laughs). So for, for most of my time I was writing Lua. Um, and I have absolutely no complaints about that, except that Lua is one-indexed. But actually the one-indexing of Lua is fine, because MediaWiki itself is one-indexed.
So people complain about Lua being one-indexed, and I'm like, what are you talking about? If another language were used, then you'd have all of this offsetting when you go to your scripting language, because you'd have, like, the first argument from your template in MediaWiki going into your scripting language, and then you'd have to offset it to zero, and everyone would be, like, vastly confused about what's going on.
So you should be thankful that they picked a language that's one-indexed, because it saves you all of this headache. So anyway, sorry for that tangent, but it's very good that we picked a one-indexed language.
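The commented-out endif pattern Megan describes looks roughly like this. `{{#if:}}` is a real ParserFunction and `<!-- -->` HTML comments are stripped by the parser, which is what lets you hide newlines and label what you're closing; the `champion` parameter name is made up for illustration:

```wikitext
{{#if: {{{champion|}}}<!--
  newlines have to live inside comments, or they get printed onto the page
-->|Pick: {{{champion}}}<!--
-->}}<!-- end if: champion -->
```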
[00:13:17] Jeremy: When you were talking about the, the if statement and having to put in comments to have white space, cuz when I think about an if statement in most languages, the, the if statement isn't itself rendering anything; it's deciding if you're going to do something inside of the if. So, like, what, what would that white space do if you didn't comment it out, in the context of the if?
[00:13:44] Megan: So actually you would be able to put some white space inside of an if statement, but you would not be able to put any white space after an if statement. and there, most likely inside of the if statement, you're printing variables or putting other parser functions. and the other parser functions also end in like two curly braces.
And, depending on what you're printing, you're likely ending with a series of like five or eight, or, I don't know, some very large set of curly braces. And so what I like to do is I would like to be able to see all of the things that I'm ending with, and I wanna know like how far the nesting goes, right.
So I wanna write like an end if, and so I have to comment that out because there's no like end if statement. so I comment out an end if there, it's more that you can't indent the statements inside of the if, because anything that you would be printing inside of your code would get printed. So if I like write text inside of the code, then that indentation would get printed into the page.
And then if I put any white space after the if statement, then that would also get printed. So technically you can have a little bit of white space before the curly braces, but that's only because it's right before the curly braces and PHP will strip the contents right inside of the parser function.
So basically if PHP is stripping something, then you're allowed to have white space there. But if PHP isn't stripping anything, then all of the white space is going to be printed and it's like so inconsistent that for the most part it's not safe to put white space anywhere because you don't, you have to like keep track of am I in a location where PHP is going to be stripping something right now or not?
and I, I wanna know what statement or what variable or what template I'm closing at any location. So I always want to write out what I'm closing everywhere. And then I have to comment that because there was no foresight to put like an end-if clause in this white-space-sensitive language.
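A hypothetical sketch of the pattern she's describing, using MediaWiki's `{{#if:}}` parser function with HTML comments standing in for the end-if statements the language lacks (the parameter names here are invented):

```wikitext
{{#if: {{{region|}}}
| Region: {{{region}}} {{#if: {{{subregion|}}}
| ({{{subregion}}})
}}<!-- end if subregion -->
}}<!-- end if region -->
```

The line breaks sit at the edges of the parser-function arguments, where PHP strips them; without the comments there's no way to tell, several braces deep, which `}}` closes what.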
[00:16:22] Jeremy: Yeah, I, I think I see what you mean. So you have, if you're gonna start an, if you have the, if inside these curly braces, but then, inside the, if you typically are going to render some text to the page, and so intuitively you would indent it so that it's indented in from the if statement. But then if you do that, then it's gonna be shifted to the right on, on the Wiki.
Did I get that right?
[00:16:53] Megan: Yeah. So you have the flexibility to put white space immediately because PHP will strip immediately, but then you don't have flexibility to put any white space after that, if that makes sense.
[00:17:11] Jeremy: So, so when you say immediately, is that on the following line or is that
[00:17:15] Megan: yeah, so any white space before the first clause, you have flexibility. So like if you were to put an if statement, so it's like if, and then there's a colon, all of the next white space will get stripped. Um, so then you can put some text, but then, if you wanted to like put some text and then another if statement nested within the first if statement.
It's not like Lua where you could like assign a variable and then put a comment and then put some more white space and then put another statement. And it's white space insensitive because you're just writing code and you haven't returned anything yet.
it, it's more like Jinja (the Python templating language) than Python, for, for an analogy.
So everything is getting printed because you're in like a, this templating language, not actually a programming language. Um, so you have to work as if you're in a templating language about, you know, 70% of the time, unless you're in this like very specific location where PHP is stripping your white space because you're at the edge of an argument that's being sent there.
So it's like incredibly inconsistent. And every now and then you get to like, pretend that you're in an actual language and you have some white space, that you can indent or whatever. it's just incredibly
inconsistent, which is like what you absolutely want out of a programming language (laughs)
[00:18:56] Jeremy: Yeah, it's like you're, you're writing templates, but like, it seems like because of the fact that it's using PHP, there's weird exceptions to the behavior.
Yeah.
[00:18:59] Megan: Exactly. Yeah.
[00:19:01] Jeremy: and then you also mentioned these, these templates. So, if I understand correctly, this is kind of like how a lot of web frameworks will have, partials, I guess, where you'll, you'll be able to have a webpage, but it's made up of different I don't know if you would call them components, but you're able to build a full page that's made up of a bunch of different pieces.
So you could have a
[00:19:31] Megan: Yeah Yeah that's a good analogy.
[00:19:33] Jeremy: Where it's like, here's my table of contents, or here's my info box, or things like that. And those are all things that you would create a MediaWiki template for, and then somehow the, the data gets passed into those templates and the template decides how to, to render it out.
[00:19:55] Megan: Yeah.
[00:19:56] Jeremy: And for these, these templates, I, I noticed on some of the Leaguepedia pages, I noticed there's some HTML in some of them. I was curious if that's typical to write them with HTML or if there are different ways native to MediaWiki for creating these templates.
[00:20:23] Megan: Um, it depends on what you're doing. MediaWiki has a special syntax for tables specifically. I would say that it's not necessarily recommended to use the special syntax because occasionally you can get things to not work out fantastically if people slightly break things. But it's easier to use it.
So if you know that everything's going to work out perfectly, you can use it. and it's a simple shortcut. if you go to the help page about tables on Wikipedia, everything is explained, and not all HTML works, um, for security reasons. So there's like a list of allowed, things that you can use, allowed tags, so you can't put like forms and stuff natively, but there's the widgets extension that you can use and widgets just automatically renders all html that you put inside of a widget.
Uh, and then the security layer there is that you have to have a special permission to edit a widget. so, you only give trusted people that permission and then they can put whatever HTML they want there. So, we have a few forms on Leaguepedia that are there because I edited, uh, whichever widgets, and then put the widgets into a Lua module and then put the Lua module into a template and then put the template onto the page.
I was gonna say, it's not that complicated. It's not as complicated as it sounds, but I guess it really is as complicated as it sounds (laughs) . Um, so, uh, I, I won't say that. I don't know how standard it is on other wikis to use that much html, I guess Leaguepedia is pretty unique in how complicated it is.
There aren't that many wikis that do as many things as we did there. but tables are pretty common. I would say like putting divs places to style them, uh, is also pretty common. but beyond that, usually there's not too many HTML elements just because you typically wanna be mobile friendly and it's relatively hard to stay mobile friendly within the bounds of MediaWiki if you're like putting too many elements everywhere.
And then also allowing users to put whatever content inside of them that they want. The reason that we were able to get away with it is because despite the fact that we had so many editors, our content was actually pretty limited. Like if there's a bracket, it's only short team names going into it.
So, and short team names were like at most five or six characters long, so we don't have to worry about like overflow of team names. Although we designed the brackets to support overflow of team names, and the team names would wrap around and the bracket would not break. And a lot of CSS magic went into making that work that, we worked really hard on and then did not end up using (laughs)
[00:23:39] Jeremy: Oh no.
[00:23:41] Megan: Only short team names go into brackets.
But, that's okay. uh, and then for example, like in, uh, schedules and stuff, a lot of fields like only contain numbers or only contain timestamps. there's like a lot of tables again where like there's only two digit numbers with one decimal point and stuff like that. So a lot of the stuff that I was designing, I knew the content was extremely constrained, and if it wasn't then I said, well, too bad.
This is how I'm telling you to put the content . Um, and for technical reasons, that's the content that's gonna go here and I don't care. so there's like, A lot of understanding that if I said for technical reasons, this is how we have to do it. Then for technical reasons, that was how we had to do it.
And I was very lucky that all of the people that I worked with like had a very big appreciation with like, for technical reasons, like argument over. This is what's happening. And I know that with like different people on staff, like they would not be willing to compromise that way. Um, so I always felt like extremely lucky that like if I couldn't figure out a way to redesign or recode something in order to be more flexible, then like that would just be respected.
And that was like how we designed something. But in general, like it's, if you are not working with something as rigid as, I mean, and like the history of eSports sounds like a very fluid thing, but when you think about it, like it's mostly names of teams, names of players and statistics. There's not that much like variable stuff going on with it.
It's very easy to put in relational databases. It's very easy to put in fixed width tables. It's very easy to put in like charts that look the same on every single page. I'm not saying. It was always easy to like write everything that I wrote, and it's not, it wasn't always easy to like, deal with designs and stuff, but like relative to other topics that you can pick, it was much easier to put constraints on what was going to go where because everything was very similar across regions, across, although actually one thing.
Okay, so this will be like the, the exception that proves the rule. uh, we would transliterate players' names when we, showed them in team rosters. So, uh, for example, when we were showing the Hangul, the Korean players' names, we would show an English transliteration also.
Um, and we would do this for every single alphabet. But Hungarian players' names are really, really, really long. And so the transliteration doesn't fit in the table when we show the transliteration to the Roman alphabet. And so we couldn't do this, so we actually had to make a cargo table of alphabets that are allowed to be transliterated into the Roman alphabet, uh, when we have players' names in that alphabet.
So we had, like, Hangul was allowed and Arabic was allowed, and I can't remember the exact list, but we had like three or four alphabets that were allowed, and the rest of the alphabets were disallowed from being transliterated into, uh, the Roman alphabet. and so again, we made up a rule that was like a hard rule across the entire wiki where we forced the set of alphabets that were transliterated so that these tables could be the same size roughly across every single team page, because these Hungarian player names are too long (laughs)
So I guess even this exception ended up being part of the rule of everything had to be standardized because these tables were just way too wide and they were running into the info box. They couldn't fit on the side. so it's really hard when you have like arbitrary user entered content to fit it into the HTML that you design.
And if you don't have people who all agree to the same standards, I mean, Even when we did have people who agreed to all of the same standards, it was really, really, really hard. And we ended up having things like a table of which alphabets to transliterate. Like that's not the kind of thing that you think you're going to end up having when you say, let's catalog the history of League of Legends eSports,
[00:28:40] Jeremy: And, and so when, let's say you had a language that you couldn't transliterate, what would go into the table?
[00:28:49] Megan: uh, just the native alphabet.
[00:28:51] Jeremy: Oh I see. Okay.
[00:28:53] Megan: Yeah. And then if they went to the player page, then you would be able to see it transliterated. But it wouldn't show up on the team page.
[00:29:00] Jeremy: I see. And then to help people visualize what some of these things you're talking about look like when you're talking about a, a bracket, it's, is it kind of like a tree structure where you're showing which teams are facing which teams and okay,
[00:29:19] Megan: We had a very cool, CSS grid structure that used like before and after pseudo elements to generate the lines, uh, between the teams and then the teams themselves were the elements of the grid. Um, and it's very cool. Uh, I didn't design it. Um, I have a friend who I very, very fortunately have a friend who's amazing at CSS because I am like mediocre at css and she did all of our CSS for us.
And she also like did most of our designs too. Uh, so the Wiki would not be like anything like what it is without her.
[00:30:00] Jeremy: And when you're talking about making sure the designs fit on desktop and, and mobile, um, I think when you were talking earlier, you're talking about how you have these, these templates to build these tables and the, these, these brackets. Um, so I guess in which part of the wiki is it ensuring that it looks different or that it fits when you're working with these different screen sizes
[00:30:32] Megan: Usually it's a pure CSS solution. Every now and then we hide an element on mobile altogether, and some of that is actually MediaWiki core, for example, uh, nav boxes don't show up on mobile. And that's actually on Wikipedia too. Uh, well, I guess, yeah. I mean, being MediaWiki core. So if you've ever noticed the nav boxes that are at the bottom of pages on Wikipedia, they just don't show up on like en.m.wikipedia.org.
and that way you're not, like, loading but display-noneing elements on mobile. but for the most part it's pure CSS solutions. Um, so we use a lot of, uh, display flex to make stuff, uh, appropriate for mobile. Um, some @media rules. sometimes we display-none stuff for mobile. Uh, we try to avoid that because obviously then mobile users aren't getting like the full content.
Occasionally we have like overflow rules, so you're getting scroll bars on mobile, and then every now and then we sort of just say, too bad, if you're on mobile, you're gonna have not the greatest solution or not the greatest, uh, experience. that's typically for large data tables. so the general belief at Fandom was like, if you can't make it a good experience on mobile, don't put it on the wiki.
And I just think that's like the worst philosophy, because like then no one gets a good experience. And you're just putting less content on the wiki, so no one gets to enjoy it, and no one gets to like use the content that could exist. So my philosophy has always been like the, the core overview pages should be as good as possible for both PC and mobile.
And if you have to optimize for one, then you slightly optimize for mobile because the majority of traffic is mobile. but attempt not to optimize for either one and just make it a good experience on both. but then the pages behind that, I say behind because we like have tabs views, so they're like sort of literally behind because it looks like folders sort of, or it looks like the tabs in a folder and you can, like, I, I don't know, it, it looks like it's behind (laughs) , the, the more detailed views where it's just really hard to design for mobile and it's really easy to design for pc and it just feels very unlikely that users on mobile are going to be looking at these pages in depth.
And it's the sort of thing a PC user is much more likely to be looking at, and you're going to have like multiple windows open and you're gonna be tabbing between them and you're gonna be doing all of your research at PC. You absolutely optimize this for PC users. Like, what the hell, these are like stats pages.
It's pages and pages and pages of stats. It's totally fine to optimize this for PC users. And if the option is like, optimized for PC users or don't create it at all, what are you thinking To not create it at all, like make it a good experience for someone?
So I don't, I don't understand that philosophy at all.
[00:34:06] Jeremy: Did you, um, have any statistics in terms of knowing on these types of pages, these pages that are information dense or have really big tables? Could you tell that? Oh, most of the people coming here are on computers or, or larger screens.
[00:34:26] Megan: I didn't have stats for individual pages. Um, I accidentally lost Google Analytics access at some point, and honestly I wasn't interested enough to go through the process of trying to get it back. when I had it, it didn't really affect what I put time into, because it was, it was just so much what I expected it to be.
That it, it didn't really affect much. What I actually spent the most time on was looking, so you can, uh, you get URLs for search results. And so I would look through our search results, and I would look at the URL of the failed search results and, so there would be like 45 results for this particular failed search.
And then I would turn that into a redirect for what I thought the target was supposed to be. So I would make sure that people's failed searches would actually resolve to the correct thing. So if they're like typo something, then I make the typo actually resolve. So we had a lot of redirects of like common typos, or if they're using the wrong name for a tournament, then I make the wrong name for the tournament resolve.
So the analytics were actually really helpful for that. But beyond that, I, I didn't really find it that useful.
[00:35:48] Jeremy: And then when you're talking about people searching, are these people using a search box on the Wiki itself And not finding what they were looking for?
[00:36:00] Megan: Yeah. So like the internal search, so like if you search Wikipedia for like New York City, but you spell it C-I-Y-T, then you're not going to get a result. But it might say, did you mean New York C-I-T-Y? If like 45 people did that in one month, then that would show up for me. And then I don't want them to be getting, like, that's a bad experience.
Sure. They're eventually getting there, but I mean, I don't want them to have to spend that extra time. So I'm gonna make an automatic redirect from C-I-Y-T to C-I-T-Y.
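The fix she describes is an ordinary MediaWiki redirect page: the page at the misspelled title contains nothing but a redirect to the correct one, so the failed search resolves instantly. Using her example, the page at the typo title would contain just:

```wikitext
#REDIRECT [[New York City]]
```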
[00:36:39] Jeremy: And, and. Maybe we should have talked about this a little earlier, but the, all the information on Leaguepedia is, it's about all of the different matches and players, um, who play League of Legends. so when you edit a, a page on Wikipedia, all of that information, or a lot of it I think is, is hand entered by, by people, and on Leaguepedia, which has all this information about like what, how teams did in a tournament or, intricate stats about how a game went.
That seems like a lot of information for someone to be hand entering. So I was wondering how much of that information is somebody actually manually editing those things and how much is, is done automatically or programmatically.
[00:37:39] Megan: So it's mostly hand entered. We do have a little bit of it that's automated, via a couple scripts, but for the most part it's hand entered. But after being hand entered into a couple of data pages, it gets propagated a lot of times based on a bunch of Lua modules and the cargo extension. So when I originally joined the Wiki back in 2014, it was hand entered.
Not just once, but probably, I don't know, seven times for tournament results and probably 10 or 12 times for roster changes. It was, it was a lot. And starting in 2017, I started rewriting all of the code so that it was entered exactly one time for everything. Tournament results get entered one time into a data page and roster changes get entered one time into a data page.
And, for roster changes, that was very difficult because, for a roster change that needs to update the team history on a player page, which goes, from a join to a leave and it needs to update the, the like roster, change portal for the off season, which goes from a leave to a join because it's showing like the deltas over the off season.
And it needs to update the current team in the, player's info box, which means that the current team has to be calculated from all of the deltas that have ever occurred in that player's history and it needs to update. Current rosters in the team pages, which means that the team page needs to know all of the current players who are currently on the team, which again, needs to know all of the deltas from all of history because all that you're entering is the roster changes.
You're not entering anyone's current team. So nowhere on the wiki does it ever store a current team anymore. It only stores the roster changes. So that was a lot of code to write, and deciding even what was going to be entered was a lot, because all I knew was that I was going to single-source-of-truth that somehow, and I needed to decide what I was going to single-source-of-truth.
So I decided, um, that it was going to be this delta, and then deciding what to do with that, uh, how to store it in a relational database. It was, it was a big project. and I didn't have a background as a developer either. so this was like, I don't know, this was like my third big project ever. So, that was, that was pretty intense.
but it was, it was a lot of fun. so it is hand entered but I feel like that's underselling it a little bit.
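The single-source-of-truth idea she describes, storing only the deltas and deriving every "current team" from them, can be sketched in Python (the field names are hypothetical; the real wiki does this in Lua modules against Cargo tables):

```python
# Each roster change is stored exactly once as a delta: a join or a leave.
# Nothing stores a "current team"; it is always derived by replaying deltas.
changes = [
    {"player": "Alice", "team": "Cloud9", "action": "join",  "date": "2020-01-05"},
    {"player": "Alice", "team": "Cloud9", "action": "leave", "date": "2021-11-20"},
    {"player": "Alice", "team": "TSM",    "action": "join",  "date": "2021-12-01"},
    {"player": "Bob",   "team": "TSM",    "action": "join",  "date": "2019-06-03"},
]

def current_teams(changes):
    """Replay every delta in date order to find each player's current team."""
    current = {}
    for c in sorted(changes, key=lambda c: c["date"]):  # ISO dates sort correctly
        if c["action"] == "join":
            current[c["player"]] = c["team"]
        elif current.get(c["player"]) == c["team"]:
            del current[c["player"]]  # a leave clears the player's team
    return current

def current_roster(changes, team):
    """The team page's view: every player whose derived team is this one."""
    return sorted(p for p, t in current_teams(changes).items() if t == team)

print(current_teams(changes))          # both players derive to TSM
print(current_roster(changes, "TSM"))
```

Every surface she lists, the info box, the team roster, the off-season portal, is just a different projection of this one delta log.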
[00:40:52] Jeremy: Yeah, cuz I was initially, I was a little confused when you mentioned how somebody might need to enter the same information multiple times. But, if I understood correctly, it would be if somebody's changing which team they're on, they would have to update, for example, the player's page and say like, oh, this player is on this team now.
And then you would have to go to their old team and remove them from the roster there.
Go to the new team, add them to the roster there, And you can see where it would kind
[00:41:22] Megan: Yeah. And then there's the roster, there's the roster nav box, and there's like the old team, you have to say, like the next team. Cuz in the previous players list, like we show former team members from the old team and you have to say like the next team. Uh, so if they had like already left their old team, you'd have to say like, new team.
Yeah, there's a, there's a lot of, a lot of places.
[00:41:50] Jeremy: And so now what it sounds like is, I'm not sure this is exactly how it works, but if you go to any location that would need that information, which team is this player on? When you go to that page, for example, if you were to go to, uh, a team's page, then it would make a SQL query to figure out, I guess, who most recently had a, I forget what you called it, but like a join row maybe, or like a, they, they had the action of joining this team, and now, now there's a row in the database that says they did this.
[00:42:30] Megan: it actually looks at the ten-- so I have an in-between table called tenures. And so it looks at the tenures table instead of querying all the way through the joins and leaves table and doing like the whole list of deltas. yeah. So, and it's also cached, so it doesn't do the SQL query every time that you load the page.
So the only time that the SQL queries actually happen is if you do a save on the page. And then otherwise the entire generated HTML of the page is actually cached on the server. So you're, you're not doing that many database queries every time you load the page, so don't worry about that. but there, there can actually be something like a hundred SQL queries sometimes, when you're, saving a page.
So it would be absolute murder if you were doing that every time you went to the page. But yeah, it works. Something like that.
[00:43:22] Jeremy: Okay, so this, this tenures table is, that's kind of like what's the current state of all these players and where they are, and then.
[00:43:33] Megan: Um, the, the tenures table, caches sort of, or I guess the tenure table captures is a better word than caches um, every, join to leave historically from every team. Um, and then I save that for two reasons. The first one is so that I don't have to recompute it, uh, when I'm doing the team's table, because I have to know both the current members and the former members.
And then the second reason is also that we have a public api and so people can query that.
if they're building tools, like a lot of people use the public api, uh, for various things. And, one person built like, sort of like a six degrees of Kevin Bacon except for League of Legends, uh, using our tenures tables.
So, part of the reason that that exists is so that uh, people can use it for whatever projects that they're doing.
Cause the join, the join leave table is like pretty unfriendly and I didn't wanna have to really document that for anyone to use. So I made tenures so that that was the table I could document for people to use.
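The tenures table she describes, every join-to-leave interval per player and team, can be derived from the same delta log, roughly like this (a Python sketch with hypothetical field names; the real table lives in Cargo):

```python
def build_tenures(changes):
    """Pair each join with the following leave for the same player and team.

    An open-ended tenure (a join with no matching leave yet) gets end=None,
    which is what marks a player as currently on the team.
    """
    tenures = []
    open_joins = {}  # (player, team) -> start date of the unclosed join
    for c in sorted(changes, key=lambda c: c["date"]):  # ISO dates sort correctly
        key = (c["player"], c["team"])
        if c["action"] == "join":
            open_joins[key] = c["date"]
        elif key in open_joins:
            tenures.append({"player": c["player"], "team": c["team"],
                            "start": open_joins.pop(key), "end": c["date"]})
    # Anything still open is a current membership.
    for (player, team), start in open_joins.items():
        tenures.append({"player": player, "team": team,
                        "start": start, "end": None})
    return tenures

changes = [
    {"player": "Alice", "team": "Cloud9", "action": "join",  "date": "2020-01-05"},
    {"player": "Alice", "team": "Cloud9", "action": "leave", "date": "2021-11-20"},
    {"player": "Alice", "team": "TSM",    "action": "join",  "date": "2021-12-01"},
]
print(build_tenures(changes))
```

Querying this one flat table is much friendlier than walking the raw join/leave history, which is exactly why it's the table she chose to document for API users.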
[00:44:39] Jeremy: Yeah. That, that's interesting in that, yeah, when you provide an api, then there's so many different things people can do that even if your wiki didn't really need it, they can build their own apps or their own pages built on all this information you've aggregated.
[00:44:58] Megan: Yeah. It's nice because then when someone says like, oh, can you build this as a feature request? I can say no, but you can (laughs)
[00:45:05] Jeremy: Well you've, you've done the, the hard part for them (laughs)
[00:45:09] Megan: Yeah. exactly.
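For anyone curious what hitting that public API looks like: the Cargo extension exposes an `action=cargoquery` endpoint on the wiki's `api.php`. A sketch of building such a request in Python, where the endpoint URL, table name, and field names are assumptions, check the wiki's own API documentation for the real schema:

```python
from urllib.parse import urlencode

base = "https://lol.fandom.com/api.php"  # assumed endpoint for Leaguepedia
params = {
    "action": "cargoquery",              # Cargo's read-only query action
    "format": "json",
    "tables": "Tenures",                 # hypothetical table name
    "fields": "Player,Team,Start,End",   # hypothetical field names
    "where": 'Player="Faker"',
    "limit": "50",
}
url = f"{base}?{urlencode(params)}"
print(url)
# From here, any HTTP client (urllib.request, requests, ...) can fetch the JSON.
```

This is what makes projects like the six-degrees tool possible without any feature request landing on the wiki staff.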
[00:45:11] Jeremy: So that's cool. Yeah. that's, that's interesting too about the, the caching because yeah, I guess when you think about a wiki, most of the people who are visiting it are just visiting it to see what's on there. So the, provided that they're not logged in and they don't need anything specific to them. Yeah, you should be able to cache the whole response. It sounds like.
[00:45:41] Megan: Yeah. yeah. Caching was actually a nightmare with this in this particular thing. the, the team roster changes, because, so cargo, which I mentioned a couple times is the database extension that we used. Um, and it's basically a SQL wrapper that like, doesn't port 80% of the features that SQL has. so you can create tables and you can query, but you can't make, uh, like sub-select queries.
So your queries have to be like very simple. which is good for like most users of MediaWiki because like the average MediaWiki user doesn't have that much coding experience, but if you do have coding experience, then you're like, what, what, what am I doing? I can't, I can't do anything. Um, but it's a very powerful tool, still compared to most of what you could do with Media Wiki without this, basically you're adding a database layer to your software stack, which I mean, I, I, that's what you're doing, (laughs)
Um, so you get a huge amount of power from adding cargo to a wiki. Um, in exchange, it's, it's very performance heavy. It's, it's resource heavy. uh, it hurts your performance a lot. and if you don't need it, then you shouldn't use it. But frequently you need it when you're doing, difficult or not necessarily difficult, but like intensive things.
Um, anytime that you need to pull data from one page to another, you wanna use something like that. Um,
So cargo, uh, one of the things that it doesn't do is it doesn't allow you to, uh, set a primary key easily. so you have to like, just like pretend that one column in the table is your primary key, basically. it internally automatically sets one, but it won't be static or it won't be the same every time that you rebuild the table, because it rebuilds the table in a random order and it just uses an auto increment primary key.
So you set a column in the table to pretend to be your primary key. But editors don't know what, your editors don't understand anything about primary keys. And you wanna hide this from them completely. Like, you cannot tell an editor, protect this random number, please don't change this.
So you have to hide it completely. So if you're making your own auto increment, like an editor cannot know that that exists. Like this is back to when we were talking about like visual editor. This is like, one of the things about making the wiki safe for people is like not exposing them to the internals of like, anything scary like that.
So for example, if an editor accidentally reorders two rows and your roster change data like that has to not matter. Because that can't break the entire wiki. They, you can't make an editor like freak out because they just reordered two rows in, in the page. And you can't put like a scary notice somewhere saying, under no circumstances reorder two rows here.
Like, that's gonna scare people away. And you wanna be very welcoming and say like, it's impossible to break this page no matter how hard you tried. Don't worry. Anything you do, we can just fix it. Don't worry. But the thing is that everything's going to be cached. And so in particular, um, when I said I made that tenures table, one thing I did not wanna do was resave every single row from the join leave table.
So you had to join back to, sorry, I'm going to use, join in two different connotations. you had to join back to the join leave table in order to get like all of the auxiliary data, like all of the extra columns, like, I don't know, like role, date, team name and stuff. Because otherwise the tenures table would've had like 50 columns or something.
So I needed to store the fake primary key in the tenures table, but the tenures table is cached on the player page and the join leave table is on the data page, which means that I need to purge the cache on the player page anytime that someone edits the data on the data page. Which means that, so there's like some JavaScript that does that, but if someone like changes the order of the lines, then that primary key is going to change because I have an auto increment going on.
And so I had to like very, very carefully pick a primary key here so that it was literally impossible for any kind of order change to affect what the primary key was, so that the cache on the player page wasn't going to be changed by anything that the editor did, unless they were going to then update the cache on that player page after making that change.
If that makes sense. So after an editor makes a change on the news page, they're going to press a button to update the cache on the player page, but they're only going to update the player page for the one line that they change on the news page. These, uh, primary keys had to be like super invariant for accidental row moves, or also later on, like entire moves of separating a bunch of these data pages into like separate subpages because the pages were getting too big and it was like timing out the server because there were too many stores to the database on a single page every time you save the page.
And anyway, it took me like five iterations of making the primary key like more and more specific to the single line because my auto increment was like originally including every single line I was auto incrementing and then I auto incremented only when that single player was was involved. And then I auto incremented only when that player and the team was involved.
And then I reset the auto increment for that date. And it just got like more and more convoluted what my primary key was. It was, it was a mess.
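The invariant she eventually landed on, an auto increment that resets per player, team, and date, can be sketched like this (a Python sketch with hypothetical field names): reordering lines no longer changes any key, because the counter only disambiguates genuinely identical events.

```python
from collections import defaultdict

def assign_keys(page, rows):
    """Give each roster-change row a key that survives row reordering.

    The counter increments only within identical (player, team, date)
    tuples, so an editor moving rows around on the page cannot renumber
    them; only true duplicates of the same event share a counter sequence,
    and swapping duplicates is harmless because the rows are identical.
    """
    counters = defaultdict(int)
    keyed = []
    for row in rows:
        tup = (row["player"], row["team"], row["date"])
        counters[tup] += 1
        key = f"{page}_{row['player']}_{row['team']}_{row['date']}_{counters[tup]}"
        keyed.append({**row, "key": key})
    return keyed

rows = [
    {"player": "Alice", "team": "TSM", "date": "2021-12-01"},
    {"player": "Bob",   "team": "C9",  "date": "2021-12-01"},
]
a = assign_keys("News_2021", rows)
b = assign_keys("News_2021", list(reversed(rows)))  # an editor reordered the lines
# Each row gets the same key in both orderings.
```

Because the key depends only on the row's own contents, the cached copy on the player page stays joined to the right line on the data page no matter how the data page is rearranged.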
Anyway, this is just like another thing when you're working with volunteers who don't know what's going on and they're editing the page and they can contribute content, you have to code for the editor and not code for like minimizing complexity,
The editor's experience matters more than the cleanliness of your code base, and you just end up with these like absolute messes that make no sense whatsoever because the editor's experience matters and you always have to code to the editor. And Media Wiki is all about community, and the editor just becomes part of the software and part of the consideration of your code base, and it's very, very different from any other kind of development because they're like, the UX is just built so deeply into how you're developing.
[00:53:33] Jeremy: if I am following correctly, when I, when I think of using SQL when you were first talking about cargo and you were talking about how you make your own tables, and I'm envisioning the, the columns and the rows and, it's very common for the primary key to either be auto incrementing or some kind of GUID
But then if I understood correctly, I think what you were saying is that anytime an editor makes changes to the data, it regenerates the whole table. Is that did I get that right?
[00:54:11] Megan: It regenerates all of the rows on that page.
[00:54:14] Jeremy: and when you talk about this, these
data pages, there's some kind of media wiki or cargo specific markup where people are filling in what is going to go into the rows. And the actual primary key that's in MySQL is not exposed anywhere when they're editing the data.
[00:54:42] Megan: That's right
[00:54:44] Jeremy: And so when you're talking about trying to come up with a primary key, um, I'm trying to, I guess I'm trying to picture
[00:54:57] Megan: So usually I do page name underscore an auto increment. But then if people can rearrange the rows which they do because they wanna get the rows chronological, but some people just put it at the top of the page and then other people are like, oh my God, it's not chronological. And then they fix it and then other people are like, oh my God, you messed up the time zone.
And then they rearrange it again. Then, I mean, normally I wouldn't care because I don't really care like what the primary key is. I just care that it exists. But then because I have it cached on these player pages, I really, really do care what the primary key is. And because I need the primary key to actually agree with what it is on the data page, because I'm actually joining these things together.
and people aren't going to be updating the cache on the player page if they don't think that they edited the row because rearranging isn't actually editing and people aren't going to realize that. And again, this is burden of knowledge. People can't, I can't make them know that because they have to feel safe to make any edits.
It's bad enough that they have to know that they have to click this button to update the cache after making an edit in the first place. so, the auto increment isn't enough, so it has to be like an auto increment, but only within the set of rows that incorporate that one player. And then rearranging is pretty safe because they'd have to rearrange two pieces of news, including the same player.
And that's really unlikely to happen. It's really unlikely that someone's going to flip the order of two pieces of news that involve the same player without realizing that they're actually are editing that single player except maybe they are. So then I include the team in that also. So they'd have to rearrange two pieces of news, including the same player and the same team.
And that's like unlikely to happen in the first place. And then like, maybe a mistake happens like once a year. And at the end of the day, the thing that really saves us is that we're a wiki. We're not an official source. And so if we have a mistake once a year, like no one cares really. So we're not going for like five nines or anything.
We're going for like, you know, two (laughs) . Um, so
[00:57:28] Jeremy: so
[00:57:28] Megan: We were having like mistakes constantly until I added player and team and date to the set of things that I was auto incrementing against. and once I got all of those, it was pretty stable.
[00:57:42] Jeremy: And for the caching part, so when you're making a cargo query or a SQL query on one page and it needs to join on or get data from another page, it goes to this cache that you have instead of going directly to the actual table in the database. And the only way to get the right data is for the editor to click this button on the website that tells it to update the cache did I get that right?
[00:58:23] Megan: Not quite. So it, well, or Yes, you did sort of, it goes to the actual table. The issue here is that, the table was last updated, the last time that a page was saved. And the last time the data got saved was the last time that the page that contains the parser function that generates those rows got saved.
So, let me say that again. So, some of the data is being saved from the data page where the users manually enter it, and that's fine because the only time that gets updated is when the users manually enter it and then the page gets saved. But then these tenures tables are stored by my lua code on the player pages, and those aren't going to get updated unless the player page gets blank edited or null edited, or a save action happens from the player page.
And so the way to make a, an edit happen from the player page is either to manually go there and click edit, and then click save, which is called a blank edit because. Blank edited, you didn't do anything but you pressed save or to use my JavaScript gadget, which is clicking a button from the data page that just basically does that for you using the api.
And then that's going to update the table and then the database table, because that's where the, the cargo parser function is that writes to the database and updates the tables there. with the information, Hey, the primary key changed, because that's where the parser function is physically located in the wiki because one of them is on the data page and one of them is on the player page.
So you get this disconnect in the cache where it's on two different pages and so you have to press a save action in both of them before the table is consistent again.
[01:00:31] Jeremy: Okay. It be, it's, so this is really all about the tenure table, which the user will never mod or the editor will never modify directly. You need your code running on the data page and the player's page to run, to update the The tenure table?
[01:00:55] Megan: Yeah, exactly.
[01:00:57] Jeremy: yeah, it's totally hidden that this exists to the editor, but it's something that, that you as the person who put this all together, um, have to always be aware of, yeah.
[01:01:11] Megan: Right. So there was just so many things like this, where you just had to press this one button. I call it refresh overview because originally it was on a tournament page and you had to press, the refresh overview button to purge the cache on the overview page of the tournament. after editing the data and you would refresh, overview, to deal with this cache lag.
And everyone knew you have to refresh overview, otherwise none of your data entry is gonna like, be worth anything because it's not, the cache is just gonna lag. but every editor learned, like if there's a refresh overview button, make sure you press the refresh overview button, , otherwise nothing's gonna happen.
Um, and there is just like tons of these littered across the Wiki. and like to most people, it just like, looks like a simple little button, but like so many things happen when you press this button.
so it is, it is very important.
[01:02:10] Jeremy: Are there, no ways inside of media wiki to if somebody edits one page, for example, to force it to go and, do, I forget what you called it, like a blank save or blank edit on another page?
[01:02:27] Megan: So that wouldn't even really work because, we had 11,000 player pages. And you don't know which one the user just edited. so it, it's unclear to MediaWiki what just happened when the user just edited some part of the data page. and like the whole point here is that I can't even blank edit every single player page that the data page links to because the data page probably links to, I don't know, 200 different player pages.
So I wanna link, I wanna blank it like the five that this one news line links to. so I do that, through like HTML attributes, in the JavaScript,
[01:03:14] Jeremy: Oh, so that's why you're using JavaScript so that you can tell what the person edited because there isn't really a way to know natively in, in MediaWiki. what just changed?
[01:03:30] Megan: there's like a diff so I could, like, MediaWiki knows the characters that got changed, but it doesn't really know like semantically what happened. So it doesn't know, like, oh, a link to this just got edited and especially because, I mean it's like templates that got edited, not really like the final HTML or anything.
So Media Wiki has no idea what's going on. so yeah, so the JavaScript, uh, looks at the HTML attributes and then runs a couple API queries, and then the blank edits happen and then a couple purges after that so that the cache gets purged after the blank edit.
[01:04:08] Jeremy: Yeah. So it, it seems like on these Wiki pages, you have the html, you have the CSS you have the ability to describe these data pages, which I, I guess in the end, end up being rows in in SQL. And then finally you have JavaScript. So it kind of seems like you can do almost everything in the context of a a Wiki page.
You have so many, so
many of these tools at your, at your disposal.
[01:04:45] Megan: Yeah. Except write es6 code.
[01:04:48] Jeremy: Oh, still, still only es5.
[01:04:52] Megan: Yeah,
[01:04:52] Jeremy: Oh no. do, do you know if that's something that they are considering changing or
[01:05:01] Megan: There's a Phabricator ticket open.
[01:05:05] Jeremy: How, um, how, how many years?
[01:05:06] Megan: It has a lot of comments, oh a lot of years. I think it's since like 2014 or something
[01:05:14] Jeremy: Oh yeah. I, I guess the, the one maybe, well now now the browsers all, all support es6, but I, I guess one of the things, it sounds like media wiki, maybe side stepped is the whole, front end ecosystem in, in terms of node packages and build tools and things like that. is, is that right? It's basically you can write JavaScript and there, yeah,
[01:05:47] Megan: You can even write jQuery.
[01:05:49] Jeremy: Oh, okay. That's built in as well.
[01:05:52] Megan: Yeah .So I have to admit, like my, my front end knowledge is like a decade out of date or something because it's like what MediaWiki can do and there's like this entire ecosystem out there that I just like, don't have access to. And so I like barely know about. So I have this like side project that uses React that I've like, kind of sort of been working on.
And so like I know this tiny little bit of react and I'm like, why? Why doesn't MediaWiki do this?
Um, they are adding Vue support. So in theory I'll get to learn vue so that'll be fun.
[01:06:38] Jeremy: So I'm, I'm curious, just from the limited experience you've had, outside of,
MediaWiki, are, are there like specific things, uh, in your experience working with React where you're, you really wish you had in inside of Media Wiki?
[01:06:55] Megan: Well, really the big thing is like es6, like I really wish we could use arrow functions , like that would be fantastic. Being able to build components would be really nice. Yeah, we can't do that.
[01:07:09] Jeremy: I, I suppose you, you've touched a little bit on performance before, but I, I guess that's one thing about Wikis is that, putting what's happening in the back end, aside the, the front end experience of Wikis, they, they feel pretty consistent since they're generally mostly server rendered.
And the actual JavaScript is, is pretty light, at least from, from Wikis I've seen.
[01:07:40] Megan: Yeah. I mean you can add as much JavaScript as you want, so I guess it depends on what the users decide to do. But it's, it's definitely true that wikis tend to load faster than some websites that I've seen.
[01:07:54] Jeremy: Yeah, I mean, I guess when you think of a wiki, it's, you're there cuz you wanna get specific information and so the goal is not to necessarily reproduce like some crazy complex app or something. It's, It's, to get you the, the, information. Yeah.
[01:08:14] Megan: Yeah. No, that's actually one thing that I really like about Wikis also is that you don't have the pressure to make them look nice. I know that some people are gonna hear that and just like, totally cringe and be like, oh my God, what is she saying? ? Um, but it's actually really true. Like there's an aesthetic that Wikis and Media Wiki in particular have, and you kind of stick to that.
And within that aesthetic, I mean, you make them look as nice as you can. Um, and you certainly don't wanna like, make them deliberately ugly, but there's not a pressure to like go over the top with like marketing and branding and like, you know, you, you just make them look reasonably nice. And then the focus is on the information and the focus is on making the information as easy to understand as possible.
And a wiki that looks really nice is a wiki that's very understandable and very intuitive, and one where you. I mean, one, that the information is the joy and, you know, not, not the presentation, I guess. So it's like the presentation of the information instead of the presentation of the brand. so I, I really appreciate that about wikis.
[01:09:30] Jeremy: Yeah, that's a good point about the aesthetics in the sense of like, they have a certain look and yeah, maybe it's an authoritative look, , which, uh, is interesting cuz it's, like a, a wiki that I'll, I'll commonly go to for example, is there's the, the PC gaming Wiki. And when you look at how it's styled, it feels like very dated or it doesn't look like, I guess you could say normal webpages, but it's very much in line with what you expect a wiki to look like.
So it's, it's interesting how they have that, shared aesthetic, I guess.
[01:10:13] Megan: Yeah. yeah. No, I really like it. The Wiki experience,
[01:10:18] Jeremy: We, we kind of touched on this near the beginning, but sometimes when. I would see wikis and, and projects like Leaguepedia I would kind of wonder, you know, what's the decision between or behind it being a wiki versus something being like a custom CMS in, in the case of Leaguepedia but, you know, talking to you about how it's so, like wikis are structured so that people can contribute.
and then like you were saying, you have like this consistent look that brings the data to the user. Um, I actually, it gives me a better understanding of why so many people choose wikis as, as ways to present this information.
[01:11:07] Megan: Yeah, a a lot of people have asked me over the years why, why MediaWiki when it always feels like I'm jumping through so many hoops. Um, I mean, when I just described the caching thing to you, and that's just like one of, I don't know, dozens of struggles that I've had where, MediaWiki has gotten in the way of what I need to do.
Because really Leaguepedia is an entire software layer on top of MediaWiki, and so you might ask why. Why MediaWiki? Why not just build the software layer on top of something easier? And my answer is always, it's about the community. MediaWiki lends itself so well to community and people enjoy contributing to wikis and wikis. Wikis are just kind of synonymous with community, and they always have been. And Wikipedia sort of set the example when they launched, and it's sort of always been that way. And, you know, I feel like I'm a part of a community when I say a Wiki. And if it was just if it were a custom site that had the ability to contribute to it, you know, it just feels like it's not the same.
[01:12:33] Jeremy: I think just even seeing the edit button on Wikis is such a different experience than having the expectation, well, I guess in the case of Leaguepedia, you do have to create an account, but even without creating the account, you can still click edit and you can look at the source and you can see how all this information, or a lot of it, how it got filled in.
And I feel like it's kind of more similar to the earlier days of webpages where people could right click a site and click view source and then look at the HTML and the css, and kind of see how it was put together. versus, now with a lot of sites, the, the code has been minified or there's build tools involved so that when you look at view source on websites, it just looks crazy and you're not sure what's going on.
So I, I, I feel like wikis in some ways are, kind of closer to the, the spirit of, like the earlier H
T M L sites. Yeah.
[01:13:46] Megan: And the knowledge transfers too. If you've edit, if you've, if you've ever edited Wikipedia, then you know that like open bracket, open bracket, closed bracket. Closed bracket is how you link a page. and that knowledge transfers to admittedly maybe a little bit less so for Leaguepedia, since there, you need to know how all the templates work and there's not so much direct source editing.
it's mostly like clicking the CharInsert prefills. but there's still a lot of cross knowledge transfer, if you've edited one wiki and then change to editing another. And then it goes the other way too. If you edit Leaguepedia, then you want to go at it for the Zelda Wiki, that knowledge will transfer.
[01:14:38] Jeremy: And, and talking about the community and the editors. I, I imagine on Wikipedia, most of the people editing are volunteers. Is it the same with Leaguepedia in your experience?
[01:14:55] Megan: Um, yeah, so I was contracted, uh, or I was not contracted. My LLC was contract and then I subcontracted. Um, it changed a bit over the years, um, as people left. Uh, so at first I subcontracted quite a few people. Um, and then I guess, as you can imagine, as, there was a lot more data entry that had to be done at the start.
And less had to be done later on, as I, expanded the code base so that it was more a single source of truth, and less stuff had to be duplicated. And I guess it was, it probably became a lot more fun too, uh, when you didn't have to edit, enter the same thing multiple times. but, uh, a bunch of people, uh, moved on over the years.
and so by the end I was only subcontracting, three people. Um, and everyone else was volunteer.
[01:15:55] Jeremy: And and the people that you were subcontracting, that was for only data entry, or was that also for the actual code?
[01:16:05] Megan: No, that wasn't for data entry at all. Um, and actually that was for all of my wikis, uh, because I was. Managing like all of the eSports wikis. or one of them was for Call of Duty and Halo, uh, to manage those wikis. One of them was for, uh, just the Call of Duty Wiki. and then one of them was for Leaguepedia to do staff onboarding.
Oh
[01:16:28] Jeremy: okay. So this is, um, this is to help people contribute to all of these wikis. That's, that's what these, these, uh, subcontractors we're focusing on.
[01:16:41] Megan: Yeah,
[01:16:44] Jeremy: I guess that, that makes sense when we've been talking about the complexity, uh, what's behind Leaguepedia, but there's a lot that the editors, it sounds like, have to learn as well to be able to know basically where to go and how to fill everything out and Yeah.
[01:17:08] Megan: So basically, for the major leagues, in League of Legends, um, we required some onboarding before you could cover them because we wanted results entered within like, about one to four minutes. of the game centering, or sorry, of the games ending. Um, so that was like for North America, Korea, China, Europe, and for the, like for some regions, like the really minor ones, like second tier leagues in, like for example the national leagues in Europe, second tier or something, we kind of didn't really care if it was entered immediately.
And so anyone who wanted to enter could just enter, uh, information. So we did want the experience to be easy enough that people could figure it out on their own. and we didn't really, uh, require onboarding for that. There was like a gradation of how much onboarding we required. But typically we tried to give training as much as we could.
Um, it, it was sort of dependent on how fast people expected the results and how available someone was to provide training. so like for Latin America, there was like a lot of people who were available to provide trainings. So even like the more minor leagues, people got training there. for example, But yeah, it was, it was very collaborative.
and a lot of people, a lot of people got involved, so, yeah.
[01:18:50] Jeremy: And in terms of having this expectation of having the results in, in just a few minutes and things like that, is it, where are, are these people volunteers where they would volunteer for a slot and then there was just this expectation? Or how did that work?
work
[01:19:09] Megan: Yeah. So, um, a lot of people volunteered with us as resume experience to try and get jobs in eSports. Um, and some people just volunteered with us because they wanted to give back to the community because, we're like a really valuable resource for the community. And I mean, without volunteer contribution we wouldn't have existed.
So it was like understood that we needed people's help in order to continue existing. So some people, volunteered for that reason. Some people just found it fun to help out. so there's like a range of reasons to contribute.
[01:19:46] Jeremy: And, and you were talking about how there's some people who they, they really need this data in, in that short time span. you know, who, who are we talking about here? Are these like commentators? Are these journalists? I'm just curious who's, who's,
looking for this in such a short time span
[01:20:06] Megan: Well, fans would look for the data immediately. sometimes if we entered a wrong result, someone would like come into our discord and be like, Hey, the result of this is wrong. you know, within seconds of the wrong result going up. So we knew that people were like looking at the Wiki, like immediately.
But everyone used the data, commentators at Riot. journalists. Fans, yeah. like everyone is using it.
[01:20:33] Jeremy: and since it's so important to, like you're mentioning Riot or the tournament organizers, things like that. What kind of relationship do you have with them? Do they provide any kind of support or is it mostly just, it's something they just use
[01:20:54] Megan: I, so there is, um, I definitely talk to people at Riot pretty regularly. and we. we got like resources from them, so, they'd give us player photos to put up, and like answers to questions and stuff. but for the most part it was just something that they'd use.
[01:21:15] Jeremy: and, and so like now that unfortunately your, your contract wasn't renewed with Leaguepedia like where do you, I guess see the, the future of Leaguepedia but, but also all these other eSports wikis going, is this something that's gonna be more just community driven or, I'm, I guess I'm trying to understand, you know, how this, the gap gets filled.
[01:21:47] Megan: Yeah, I'm, I'm not sure. Um, they're doing an update to Media Wiki 1.39 next year. we'll see if stuff majorly breaks during that. probably no one's gonna be able to fix it if it does. Who knows? (laughs) um, yeah, I don't know. There's another site that hosts, uh, eSports wikis called Liquipedia
um, so it's possible that they'll adopt some of the smaller wikis. Um, I think it's pretty unlikely that they'll want to take Leaguepedia, um, just because it's too complicated of a wiki. but yeah, I, I, I don't know.
[01:22:31] Jeremy: it kind of feels like one of these things where I guess whoever is in charge of making these decisions may not fully understand the implications or, or what it takes to, to run such a, a large wiki. yeah, I guess it'll be interesting to, to see if it ends up being like you said, one, one big mess.
[01:22:58] Megan: Yeah. I got them through the 1.37 upgrade by submitting like three or four patches to cargo, during that time and discovering that the patches needed to be made prior to the upgrade happening. So, you know, I don't think that they're going to update cargo during the 1.39 upgrade and it's cargo changes that have the biggest disruption.
So they're probably safe from that. and, and I don't think 1.39 has any big parser changes. I think that's later, but yeah, there'll probably still be like a bunch of CSS changes and who knows if anyone's going to fix the follow up from that.
So, yeah, we'll see.
[01:23:46] Jeremy: Yeah, that's, um, that's kind of interesting to know too that, these upgrades to MediaWiki and, and to extensions like cargo, that they change so significantly that they require pull requests. Is that, like, is that pretty common in terms of when you do an upgrade of a MediaWiki that there there are these individual things you need to do and it's not just update package.
[01:24:18] Megan: well the cargo change was the first time that we had upgraded in like two and a half years or something. so that one in particular, I think it was expected that that one wasn't going to go so smoothly. generally updates go not that badly. I say with rising intonation, (laughs) , um, if you keep up to date with stuff, it's generally pretty okay.
Cargo is probably one of the less stable ones just because it's a relatively small contributor base, and so kind of crazy things happen sometimes. Um, Semantic Media Wiki is a lot more stable. Uh, but then the downside is that if you have a feature request for SMW it's harder to get pushed through.
But cargo still changes a lot. The big change with cargo, like the big problematic change with cargo was a tiny bug fix that just so happened to change every empty string value to nil in Lua,
You know, no big deal or anything, whatever.
[01:25:42] Jeremy: That, that's, uh, that's a good one right there.
[01:25:47] Megan: I mean,
I I don't know how no one noticed this for like a year and a half or something man,
It was a tiny bug fix.
[01:26:02] Jeremy: Mm.
[01:26:03] Megan: Like it was checked in as a bug fix and it really was a bug fix. I tracked down the guy who made the patch and I was like, I can't reproduce this bug. Can I just revert it? And he was like, I can't reproduce it either.
[01:26:21] Jeremy: Oh, wow. (laughs)
[01:26:23] Megan: And I was like, well, that's great. And I ended up just leaving it in, but then changing them back to empty string.
Um, when the extension was first released, null database values were sent to Lua as empty string due to a bug in the first place. Because null databases, null database values should just be nil in Lula. Like, come on here, . But they were sent as empty string.
And so for like five years, people were writing code, assuming that you would never get a nil value for anything that you requested from the database. So you can't make a breaking change like that without putting a config value that defaults to true.
[01:27:10] Jeremy: Yeah.
[01:27:11] Megan: So I added a legacy, nil value, legacy Lua, nil value as empty string config value or something, and, defaulted it to true and wrote in the documentation that it was recommended that you set it to false.
Or maybe I defaulted it to false. I, I don't remember what I set the default to, but I wrote in the documentation something about how you should, if possible, set this to false, but if you have a large code base, you probably need this . And then we set up Platform Ride to True, and that's the story of how I saved the shit out of our 1.37 upgrade this year.
[01:27:57] Jeremy: Oh yeah, that's, um, that's a rough one. Changing, changing people's data is very scary.
[01:28:05] Megan: Yeah, I mean, it was totally unintended. and I don't know how no one noticed this either. I mean, I guess the answer is that not very many people do the kind of stuff that I do working with Lua and Cargo in this much depth. but a fairly significant number of fandom Wikis do, and this would've just been an absolute disaster.
And the semi ironic thing is that, I, I have a wrapper that fixes the initial cargo bug where I detect every empty string value and then cast it to nil after I get my data from cargo. So I would've been completely unaffected by this. And my wiki was the primary testing wiki for cargo on the 1.37 patch. So we wouldn't have caught this, it would've gone to live
[01:28:56] Jeremy: Wow.
[01:28:58] Megan: So we got extremely lucky that I found out about this ahead of time prior to us QAing and fixed this bug
because it would've gone straight to live.
[01:29:10] Jeremy: that's wild yeah, it's just like kind of catastrophic, right? It's like, if it happens, I feel like whoever is managing the wikis is gonna be very confused. Like, why, why is everything broken? I don't, I don't understand.
[01:29:25] Megan: Right? And this is like so much broken stuff that it's like very difficult to track down what's going on. I actually had a lot of trouble figuring out what was wrong in the code base.
Causing this error. And I submitted an incorrect patch at first, and then the incorrect patch got merged, and then I had to like roll back the incorrect patch.
And then I got a merge conflict on the incorrect patch. And it, it was, it was bad. It took me three patches to get this right.
Um, But eventually, eventually I got there.
[01:30:02] Jeremy: Yeah. that's software, I guess ,
[01:30:06] Megan: Yeah.
[01:30:07] Jeremy: the, the, the thing you were trying to avoid all these years.
[01:30:10] Megan: Yeah,
[01:30:13] Jeremy: you're in it now.
[01:30:14] Megan: It really was, that was actually the reason that I went in, I got into the Wiki in the first place, um, and into e-sports. Uh, was that after Caltech, I wanted to like get away from STEM altogether. I was like, I've had enough of this. Caltech was too much, get me away, (laughs) .
And I wanted to do like event management or tournament organization or something.
And so I wanted to work in eSports. and that was like my life plan. And I wanted nothing to do with STEM and I didn't wanna work in software. I didn't wanna do math. I was a math major. I didn't wanna do math. I didn't wanna go to grad school. I wanted absolutely nothing to do with this. So that was my plan.
And somehow I stumbled and ended up in software.
[01:31:02] Jeremy: Well, at least you got the eSports part.
[01:31:05] Megan: Yeah, so that, that worked out. And really for the first couple of years I was doing like community management and social media and stuff.
Um, and I did stay away from software for about the first two years, so it lasted about two whole years.
[01:31:24] Jeremy: What ended up pulling you in?
[01:31:26] Megan: Um, actually, so when, when I signed back with Gamepedia, our developer just sort of disappeared and I was like, well, shit, I guess that's me now. (laughs)
So we had someone else writing all of our templates for a couple years, so I was able to just like make a lot of feature requests. and I'm very good at making feature requests.
If, if I ever have like, access to someone else who's writing code for me, I'm like, fantastic at just making a ton of like really minor feature requests, and just like taking off all of their time with like a billion tiny QA issues.
[01:32:09] Jeremy: You you are the backlog,
[01:32:12] Megan: Yeah, I really, um, I, there's another OSS project that I've been working on, um, which is a Discord bot and. We, our, our backlog just expands and expands and
[01:32:26] Jeremy: Oh yeah. You know what, I, I think I did look at that. I, I looked at the issues and, usually when you look at a, the issues for an open source project, it's, it's all these people using it, right? That are like, uh, there's this thing I want, but then I looked and it was all, it was all you. So I guess that's okay cuz you're, you're in the position to do something about it.
[01:32:47] Megan: The, the part that you don't know is that I'm like constantly begging other people to open tickets too.
[01:32:53] Jeremy: Really?
[01:32:55] Megan: Yeah. Like constantly. I'm like, guys, it can't just be me opening tickets all the time.
[01:33:04] Jeremy: Yeah. Yeah. If it was, if it was someone else's project, I would be like, oh, this is, uh, .
I don't know about this. But when it's your own, you know, okay. It's, it's basically like, um, it's like a roadmap I guess.
[01:33:20] Megan: Yeah. Some of them have been open for, for quite a long time, but actually a couple months ago we closed one that had been open since, I think like April, 2020.
[01:33:31] Jeremy: Oh, nice.
[01:33:32] Megan: That was quite an event.
[01:33:34] Jeremy: Yeah, it's open source, so you can do whatever you want, right? (laughs)
[01:33:41] Megan: We even have a couple good first issues that are actually good first issues.
[01:33:46] Jeremy: Yeah. Not, not getting any takers?
[01:33:49] Megan: No, we sometimes do. Yeah. I actually, we, so some of them are like semi-important features, but I like feel really bad if I ever do the good first issues myself because like somewhere else could do them. And so like, if it's like a one line ticket, I would just, I feel so much guilt for doing it myself.
[01:34:09] Jeremy: Oh, I see what you mean.
[01:34:10] Megan: I'm like, Yeah. so I just like, I can't do them. But then I'm like, oh, but this is really important. But then I'm like, oh, but we might get someone else who, and I just, I never know if I should just take the plunge and do it myself, so.
[01:34:22] Jeremy: Yeah. No, that's, that's a good point. It's, it's like, like these opportunities, right? For people to, and it could, it could make a big difference for them. And then for you, it's like, I could do this in 10 minutes or whatever.
Uh, I, I guess it all depends on how annoyed you are by the thing not being there.
[01:34:43] Megan: Right. I know because my entire background is like community and getting new people to onboard and like the potential new contributor is worth like 10 times, like, The one PR that I can make. So I should just like absolutely leave it open for the next year.
[01:35:02] Jeremy: Yeah. Yeah, no, that's a, that's a good way of, of looking at it. I mean, I I think when you talk about open source or, or even wikis, that that sort of community aspect is, is so, so important, right? Because if it's just, if it's just one person, then I mean, it kind of, it lives or dies with the one person, right?
It, it's, it's so different when you actually get a bunch of people involved. And I think that's something like a lot of, a lot of projects struggle with.
[01:35:38] Megan: Yeah. That's actually, as much as I'm like bitter about the fact that I was let go from my own project, I think the thing that I should, in a sense be the most proud of is that I grew my project to a place where that was able to happen in a sense. Like, I built this and I built it to a place where it was sustainable.
Although, we'll see how sustainable it was, (laughs) . but like I'm not needed for the day to day. and that means that like I successfully built a community.
[01:36:18] Jeremy: Yeah, no, you should be really proud about that because it's, it's not only like the, the code, right? Like over the years it sounds like you gradually made it easier and easier to contribute, but then also being able to get all these volunteers together and build a community on the discord and, and elsewhere.
Yeah, no, I think that's, I think that's really great to be able to, to do, do something like that.
[01:36:50] Megan: Thanks.
[01:36:53] Jeremy: I think that's, that's a good place to, to wrap up, but is there anything else you wanted to, to mention or do you want to tell people where to check out, uh, what you're up to?
[01:37:05] Megan: Yeah, I, I have a blog that's a little bit inactive for the past couple months, because I recently had surgery, but I, I've been saying for like five weeks that I will start, posting there again. So hopefully that happens soon. Uh, but it's river.me, and so you can check that out.
[01:37:27] Jeremy: Cool. Well, yeah, Megan, I just wanna say thanks for, for taking the time. This was, this was really interesting. the world of wikis is like this, it's like a really big part of the internet that, um, I use wikis, but I, I've never really understood kind of what's going on in, in terms of the actual technology and the community. so so thank you for, for sharing that.
[01:37:53] Megan: Yeah. Thanks so much for having me.
Victor is a software consultant in Tokyo who describes himself as a yak shaver. He writes on his blog at vadosware and curates Awesome F/OSS, a mailing list of open source products. He's also a contributor to the Open Core Ventures blog.
Before our conversation Victor wrote a structured summary of how he works on projects. I recommend checking that out in addition to the episode.
You can help edit this transcript on GitHub.
[00:00:00] Jeremy: This episode, I talk to Victor Adossi who describes himself as a yak shaver. Someone who likes trying a whole bunch of different technologies, seeing the different options. We talk about what he uses, the evolution of front end development, and his various projects.
Talking to just different people, it's always good to get where they're coming from, because something that works for Google at their scale is going to be different than what you're doing with one of your smaller projects.
[00:00:31] Victor: Yeah, the context. Of course in direct conflict with that statement, I definitely use Google technology despite not needing to at all, right? Like, you know, 99% of people who are doing, people like to call it indiehacking, or building small products, could probably get by with just Dokku. If you know, Dokku, or like CapRover. Those are two projects that'll be like, Oh, you can just push your code here, we'll build it up, like a little mini Heroku PaaS thing, and just go on one big server, right? Like 99% of the people could just use that. But of course I'm not doing that. So I'm a bit of a hypocrite in that sense.
I know what I should be doing, but I'm not doing that. I am running a Kubernetes cluster with like five nodes for no reason. Uh, yeah, I dunno, people don't normally count the controllers.
[00:01:24] Jeremy: Dokku and CapRover, I think those are where it's supposed to create a Heroku-like experience. I think it's based off of the Heroku buildpacks, right? At least Dokku is?
[00:01:36] Victor: Yeah, buildpacks has actually been spun out into like a community thing, so like Pivotal and Heroku, it's like buildpacks.io, they're trying to build a wider standard around it so that more people can get involved.
And buildpacks are actually obviously fantastic as a technology and as a, a process piece. There's not much else like them and, you know, that's obvious from like Heroku's success and everything. I know Dokku uses that. I don't know that CapRover does, but I haven't, I haven't really run CapRover that much.
They, they probably do. Like at this point if you're going to support building from code, it seems silly to try and build your own buildpacks. Cause that's what you will do, eventually. So you might as well use what's there.
Anyway, this is like just getting to like my personal opinions at this point, but like, if you think containers are a bad idea in 2022, you're wrong. You should, you should stop. Like you should, you should stop. Think about it. I mean, obviously there's not, um, I got a really great question at an interview once, which is, where are containers a bad idea?
That's probably one of the best like recent interview questions I've ever gotten cause I was like, Oh yeah, I mean, like, you can't, it can't be perfect everywhere, right? Nothing's perfect everywhere. So it's like, where is it? Uh, and of course the answer was networking, right? (unintelligible)
So if you need absolute performance, but like for just about everything else, containers are kind of it at this point. Like, time has borne it out, I think. So yeah, I always just like bias toward containers at this point. So I'm probably more of a CapRover person than a Dokku person, even though I have not used, I don't use CapRover.
[00:03:09] Jeremy: Well, like something that I've heard with containers, and maybe it's changed recently, but, but something that was kind of a holdout was when people would host a database, sometimes they would say, oh, we just don't wanna put this in a container. And I wonder if like that matches with your thinking or if things have changed.
[00:03:27] Victor: I am not a database administrator, right? Like, I read Postgres docs and I read the, uh, the Postgres documentation, and I think I know a bit about Postgres, but I don't commit, right? Like, so, and I also haven't, like, oh, managed X terabytes on one server that you are making sure never goes down, kind of deal.
But the stickiness for me, at least from when I've run, So I've done a lot of tests with like ZFS and Postgres and like, um, and also like just trying to figure out, and I run Postgres in Kubernetes of course, like on my cluster and a lot of the stuff I found around is, is like fiddly kernel things like sort of base kernel settings that you need to have set.
Like, you know, stuff like should you be using transparent huge pages, like stuff like that. But once you have that settled, containers are just processes with namespacing and resource control, right? Like, that's it. There are some other ins and outs, but for the most part, if you're fine running a process, so people ran processes, right?
And they were just completely like unprotected. Then people made users for the processes and they limited the users and ran the processes, right? Then the next step is now you can run a process and then do the limiting, the namespacing, and the cgroups dynamically. Like there, there's, there's sort of not a humongous difference, unless you're hitting something very specific.
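Victor's "containers are just processes with namespacing and resource control" point can be seen directly on any Linux box, no container runtime required. This is a minimal sketch using standard procfs paths, not anything from the episode:

```shell
# A container is "just" a process with its own namespaces and cgroup
# limits. Every ordinary Linux process already belongs to a set of
# namespaces, visible as symlinks under /proc:
ls /proc/self/ns
# Each link names a namespace type and its inode, e.g. pid:[4026531836].
readlink /proc/self/ns/pid
# A container runtime simply starts the process with *fresh* namespaces
# (via clone/unshare) and attaches it to a cgroup for resource limits.
```

Run inside a container and on the host, and the only real difference you'll see is which namespace inodes the links point at.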
Uh, but yeah, databases have been a point of contention, but I think Kelsey Hightower had that tweet, yeah, that was like, um, don't run databases in Kubernetes. And I think he walked it back.
[00:04:56] Victor: I don't know, but I, I know that was uh, was one of those things that people were really unsure about at first, but then after people sort of like felt it out, they were like, Oh, it's actually fine. Yeah.
[00:05:06] Jeremy: Yeah I vaguely remember one of the concerns having to do with persistent storage. Like there were challenges with Kubernetes and needing to keep that storage around and I don't know if that's changed yeah or if that's still a concern.
[00:05:18] Victor: Uh, I'd say that definitely has changed. Uh, and it was, it was a concern, depending on where you were. Mostly people who are running AKS or EKS or you know, all those other managed Kubernetes, they're just using EBS or like whatever storage provider is like offering for storage.
Most of those people don't actually have that much of a problem with, storage in general.
Now, high performance storage is obviously different, right? So like, so you'll, you're gonna have to start doing manual, like local volume management and stuff like that. it was a problem, because obviously CSI (Kubernetes Container Storage Interface) didn't exist for some period of time, and like there was, it was hard to know what to do for if you were just running a Kubernetes cluster. I think a lot of people were just using local, first of all, local didn't even exist for a bit.
Um, they were just using hostPath, right? And just like, Oh, it's on the disk somewhere. Where do we, we have to go get it, right? Or we have to like, sort of manage that. So that was something most people weren't ready for, especially if you were just, if you weren't like sort of a, a, a traditional sysadmin and used to doing that stuff.
And then of course local volumes came out, but I think they still had to be, um, pre-provisioned. So that's sysadmin stuff that most people, you know, maybe aren't, aren't necessarily ready for. Uh, and then most of the general solutions were slow. So like, I used Longhorn (https://longhorn.io) for a long time and Longhorn, Longhorn's great. And super easy to set up, but it can be slower and you can have some, like, delays in mount time. it wasn't ideal for, for most people.
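The hostPath-to-local-volume progression Victor describes looks roughly like this in manifest form. The names, paths, and node are illustrative, not from the episode:

```yaml
# Early approach: hostPath -- "it's on the disk somewhere", tied to
# whichever node the pod lands on, with no scheduling awareness.
apiVersion: v1
kind: Pod
metadata:
  name: pg-hostpath
spec:
  containers:
    - name: postgres
      image: postgres:15
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      hostPath:
        path: /mnt/disks/pg   # an admin must ensure this exists on the node
---
# Later approach: a pre-provisioned "local" PersistentVolume, which at
# least tells the scheduler which node the data lives on.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pg-local-pv
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-storage
  local:
    path: /mnt/disks/pg
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["node-1"]
```

The nodeAffinity block is exactly the "sysadmin stuff" he mentions: someone still has to provision the disk and pin the volume to the node by hand.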
So yeah, I, overall it's true. Databases, Databases in Kubernetes were kind of fraught with peril for a while, but it wasn't for the reason that, it wasn't for the fundamental reason that Kubernetes was just wrong or like, it wasn't the reason most people think of, which is just like, Oh, you're gonna break your database.
It's more like, running a database is hard and Kubernetes hasn't solved all the hard problems. Like, cuz that's what Kubernetes does. It basically solves a lot of problems in a very generic way. Right. So it just hadn't solved all those problems yet at this point. I think it's got decent answers on a lot of them.
So I, I mean, I don't know. I I do it. Don't, don't take what I'm saying to your, you know, PM meeting or your standup meeting, uh, anyone who's listening. But it's more like if you could solve the problems with databases in the sense before. You could probably solve 'em on Kubernetes now with a good understanding of Kubernetes.
Cause at the end of the day, it's all the same stuff. Just Kubernetes makes it a little easier to, uh, do it dynamically.
[00:07:50] Jeremy: It sounds like you could do it before, but some of the, I guess the tools or the ways of doing persistent storage were not quite there yet, or they were difficult to use. And so that was why people at the start were like, Okay, maybe it's not a good idea, but, now maybe there's some established practices for how you should run a database in Kubernetes.
And I, I suppose the other aspect too is that, like you were saying, Kubernetes is its own thing. You gotta learn Kubernetes and all its intricacies. And then running a database is also its own challenge. So if you stack the two of them together and, and the path was not really clear, then maybe at the start it wasn't the best idea. Um, uh, if somebody was going to try it out now, was there like a specific resource you looked at or a specific path to where like, okay, this is how I'm going to do it?
[00:08:55] Victor: I'll just say what I normally recommend to everybody.
Cause it depends on which path you wanna go right? If you wanna go down like running a database path first and figure that out, fill out that skill tree. Like go read the Postgres docs.
Well, first of all, use Postgres. That's the first tip there. But like, read those documents. And obviously you don't have to understand everything. You won't understand everything. But knowing the big pieces and sort of letting your brain see the mention of like a whole bunch of things, like what is TOAST?
Oh, you can do compression on columns. Like, you can do some, some things concurrently. Um, you know, what ALTER TABLE looks like. You get all that stuff kind of in your head. Um, and then I personally really believe in sort of learning by building and just like iterating. you won't get it right the first time. It's just like, it's not gonna happen. You're get, you can, you can get better the first time, right? By being really prepared and like, and leave yourself lots of outs, but you kind of have to like, get it out there. Do do your best to make sure that you can't fail, uh, catastrophically, right?
So this is like, goes back to that decision to like use ZFS as the bottom of this. I'm just like, All right, well, I, I'm not a file systems expert, but if I, I could delegate some of that, you know, some of that, I can get some of that knowledge from someone else. Um, and I can make it easier for me to not fail catastrophically.
For the database side, actually read documentation on Postgres or the whatever database you're going to use, make sure you at least understand that. Then start running it like locally or whatever. Again, Docker use, use Docker locally.
It's, it's, it's fine. and then, you know, sort of graduate to running sort of more progressively, more complicated versions. what I would say for the Kubernetes side is actually similar. the Kubernetes docs are really good. they're very large. but they're good.
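The "use Docker locally first" step can be as small as a throwaway compose file like this one. The image tag and credentials here are illustrative, not a recommendation from the episode:

```yaml
# docker-compose.yml -- a disposable local Postgres for learning.
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: localdev   # fine for a local sandbox only
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data   # data survives container restarts
volumes:
  pgdata:
```

From there, "graduating to more complicated versions" might mean adding a tuned config file, replication, or moving the same image into a cluster.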
So you can actually go through and know all the, like, workload, workload resources, know, like what a config map is, what a secret is, right? Like what etcd is doing in this whole situation. You know, what a kubelet is versus an API server, right? Like the, the general stuff, like if you go through all that, you should have like a whole bunch of ideas at least floating around in your head. And then once you try and start setting up a server, they will all start to pop up again, right? And they'll all start to like, you, like, Oh, okay, I need a CNI (Container Network Interface) plugin because something needs to make the services available, right? Or something needs to power the ingress, right? Like, if I wanna be able to get traffic, I need an ingress object.
But what listens, what does that, what makes that ingress object do anything? Oh, it's an ingress controller. nginx, you know, almost everyone's heard of nginx, so they're like, okay. Um, nginx has an ingress controller. Actually there's, there used to be two, I assume there's still two, but there's like one that's maintained by Kubernetes, one that's maintained by nginx, the company or whatever.
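The split Victor is describing, an Ingress object that does nothing by itself until a controller watches it, looks like this as a manifest. Host and service names here are made up for illustration:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  # Which controller should act on this object; the Ingress is inert
  # until a matching controller (nginx, traefik, ...) is running.
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-svc   # the Service that actually receives traffic
                port:
                  number: 80
```

You can apply this to a cluster with no controller installed and nothing will break, and nothing will route either, which is exactly the "what makes it do anything" question.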
I use traefik, it's fantastic. but yeah, so I think those things kind of fall out and that is almost always my first way to explain it and to start building. And tinkering iteratively. So like, read the documentation, get a good first grasp of it, and then start building yourself because you'll, you'll get way more questions that way.
Like, you'll ask way more questions, you won't be able to make progress. Uh, and then of course you can, you know, hop into Slacks or like start looking around and, and searching on the internet. Oh, one of the things that really helped me out early learning Kubernetes was Kelsey Hightower's, um, Kubernetes the Hard Way. I'm also a big believer in doing things the hard way, at least knowing what you're choosing to not know, right? Distributing file system deltas, right? Or like, changes to a file system over the network, is not a new problem. Other people have solved it. There's a lot of complexity there. But if you at least know the sort of surface level of what the thing does and what it's supposed to do and how it's supposed to do it, you can make a decision on, Oh, how deep am I going to go?
Right? To prevent yourself from like, making a mistake or going too deep in the rabbit hole. If you have an idea of the sort of ecosystem and especially like, Oh, here, like the basics of how I can use this thing, that's generally very good. And doing things the hard way is a great way to get a, a feel for that, right?
Cause if you take some chunk and like, you know, the first level of doing things the hard way, uh, or, you know, Kelsey Hightower's guide is like, get a machine, right? Like, so, like, if you somehow were like, Oh, I wanna run a Kubernetes cluster. but, you know, I don't want use necessarily EKS and you wanna learn it the hard way.
You have to go get a machine, right? If you, if you're not familiar, if you ran on Heroku the whole time, like you didn't manage your own machines, you gotta go like, figure out EC2, right? Or, I personally use Hetzner, I love Hetzner, so you have to go figure out Hetzner, DigitalOcean, whatever.
Right. And then the next thing's like, you know, the guide's changed a lot, and I haven't, I haven't looked at it in like, in years, actually a while since I, since I've sort of been, I guess living it, but it's, it's like generate certificates, right? So if you've never dealt with SSL and like, sort of like, or I should say TLS uh, and generating certificates and how that whole dance works, right?
Which is fascinating because it's like, oh, right, nothing's secure on the internet, except that we distribute root certificates on computers that are deployed in every OS, right? Like, that's a sort of fundamental understanding you may not go deep enough to realize, but if you are fascinated by it, trying to do it manually would lead you down that path.
You'd be like, Oh, what, like what is this thing? What is a CSR? Like, why, who is signing my request? Right? And it's like, why do we trust those people? Right? And it's like, you know, that kind of thing comes out and I feel like you can only get there from trying to do it, you know, answering the questions you can.
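The CSR dance Victor mentions can be walked through by hand with openssl. Self-signing at the end stands in for a real CA signing your request; the name and filenames are purely illustrative, and this assumes openssl is installed:

```shell
# Work in a scratch directory so nothing important gets overwritten.
cd "$(mktemp -d)"
# 1. Generate a private key.
openssl genrsa -out server.key 2048
# 2. Create a Certificate Signing Request: "please sign this public key
#    and bind it to this name".
openssl req -new -key server.key -subj "/CN=example.local" -out server.csr
# 3. A CA would normally sign the CSR; here we self-sign just to see
#    the shape of the output.
openssl x509 -req -in server.csr -signkey server.key -days 1 -out server.crt
# Inspect who the certificate claims to be.
openssl x509 -in server.crt -noout -subject
```

The "why do we trust those people" part is step 3: in the real world that signature comes from a CA whose root certificate shipped with your OS or browser, which is the whole trust anchor he's pointing at.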
Right. And again, it takes some judgment to know when you should not go down a rabbit hole. uh, and then iterating. of course there are people who are excellent at explaining. you can find some resources that are shortcuts. But, uh, I think particularly my bread and butter has been just to try and do it the hard way.
Avoid pitfalls or like rabbit holes when you can. But know that the rabbit hole is there, and then keep going. And sometimes if something's just too hard, you're not gonna get it the first time. Like maybe you'll have to wait like another three months, you'll try again and you'll know more sort of ambiently about everything else.
You get a little further that time. that's how I feel about that. Anyway.
[00:15:06] Jeremy: That makes sense to me. I think sometimes when people take on a project, they try to learn too many things at the same time. I, I think the example of Kubernetes and Postgres is a pretty good example, where if you're not familiar with how do I install Postgres on bare metal or a VM, trying to make sense of that while you're trying to get into Kubernetes is probably gonna be pretty difficult.
So, so splitting them up and learning them individually, that makes a lot of sense to me. And the whole deciding how deep you wanna go. That's interesting too, because I think that's very specific to the person right because sometimes you wanna go a little deeper because otherwise you don't understand how the two things connect together.
But other times it's just like with the example with certificates, some people they may go like, I just put in Let's Encrypt, it gives me my cert, I don't care, right? Then, and then, and some people they wanna know like, okay, how does the whole certificate infrastructure work, which I think is interesting. Depending on who you are, maybe you go, ahh, maybe it doesn't really matter, right?
[00:16:23] Victor: Yeah, and, you know, shout out to Let's Encrypt. It's, it's amazing, right? I think, singlehandedly, most of the deployment of HTTPS that happens these days, right? So many, so many of like internet providers and, uh, sort of service providers will use it, right?
Under the covers. Like, Hey, we've got you free SSL through Let's Encrypt, right? Like, kind of like under the, under the covers. which is awesome. And they, and they do it. So if you're listening to this, donate to them. I've done it. So now that, now the pressure is on whoever's listening, but yeah, and, and I, I wanna say I am that person as well, right?
Like, I use cert-manager on my cluster, right? So I'm just like, I don't wanna think about it, but I, you know, but I, I feel like I thought about it one time. I have a decent grasp. If something changes, then I guess I have to dive back in. I think it, you've heard the, um, innovation tokens idea, right?
I can't remember the site. It's like, um, do, like do boring tech or something.com (https://boringtechnology.club/). Like it shows up on sort of Hacker News from time to time, essentially. But it's like, you know, you have a certain amount of tokens and sort of, uh, we'll call them tokens, but tolerance for complexity or tolerance for new, new ideas or new ways of doing things, new processes.
Uh, and you spend those as you build any project, right? You can be devastatingly effective by just sticking to the stack, you know, and not introducing anything new, even if it's bad, right? And there's nothing wrong with the LAMP stack, I don't wanna annoy anybody, but like if you, if you're running LAMP, or if you run on a HostGator, right?
Like, if you run on so, you know, some, some service that's really old but really works for you isn't, you know, too terribly insecure or like, has the features you need, don't learn Kubernetes then, right? Especially if you wanna go fast. cuz you, you're spending tokens, right? You're spending, essentially brain power, right?
On learning whatever other thing. So, but yeah, like going back to that databases versus databases on Kubernetes thing, you should probably know one of those before you, like, if you're gonna do that, do that thing. You either know Kubernetes and you like, at least feel comfortable, you know, knowing Kubernetes is extremely difficult obviously, but you feel comfortable and you feel like you can debug it.
Little bit of a tangent, but maybe that's even a better, sort of watermark if you know how to debug a thing. If, if it's gone wrong, maybe one or five or 10 or 20 times and you've gotten out. Not without documentation, of course, cuz well, if you did, you're superhuman.
But, um, but you've been able to sort of feel your way out, right? Like, Oh, this has gone wrong and you have enough of a model of the system in your head to be like, these are the three places that maybe have something wrong with them. Uh, and then like, oh, and then of course it's just like, you know, a mad dash to kind of like, find, find the thing that's wrong.
You should have confidence about probably one of those things before you try and do both when it's like, you know, complex things like databases and distributed systems management, uh, and orchestration.
[00:19:18] Jeremy: That's, that's so true in, in terms of you are comfortable enough being able to debug a problem because it's, I think when you are learning about something, a lot of times you start with some kind of guide or some kind of tutorial and you follow the steps. And if it all works, then great.
Right? But I think it's such a large leap from that to something went wrong and I have to figure it out. Right. Whether it's something's not right in my Dockerfile, or my Postgres instance, uh, the queries are timing out. So many things that could go wrong, that is the moment where you're forced to figure out, okay, what do I really know about this thing?
[00:20:10] Victor: Exactly. Yeah. Like the, the rubber's hitting the road, it's, uh, you know, the car's about to crash or has already crashed. Like, if I open the bonnet, do I know what's happening, right? Or am I just looking at (unintelligible).
And that's, it's, I feel sort of a little sorry or sad for, for devs that start today because there's so much complexity that's been built up. And a lot of it has a point, but you need to kind of have seen the before to understand the point, right? So I like, I like to use front end as an example, right? Like the front end ecosystem is crazy, and it has been crazy for a very long time, but the steps are actually usually logical, right?
Like, so like you start with, you know, HTML, CSS and JavaScript, just plain, right? And like, and you can actually go in lots of directions. Like HTML has its own thing. CSS has its own sort of evolution sort of thing. But if we look at JavaScript, you're like, you're just writing JavaScript on every page, right?
And like, just like putting in script tags and putting in whatever, and it's, you get spaghetti, you get spaghetti, you start like writing, copying the same function on multiple pages, right? You just, it, it's not good. So then people, people make jQuery, right? And now, now you've got like a, a bundled set of like good, good defaults that you can, you can go for, right?
And then like, you know, libraries like Underscore come out for like, sort of like not DOM related stuff that you do want, you do want everywhere. And then people go from there and they go to like Backbone or whatever. It's because jQuery sort of also becomes spaghetti at some point and it becomes hard to manage and people are like, Okay, we need to sort of like encapsulate this stuff somehow, right?
And like the new tools or whatever is around at the same timeframe. And you, you, you like Backbone views, for example. And you have people who are kind of like, ah, but that's not really good. It's getting kind of slow.
Uh, and then you have, MVC stuff comes out, right? Like Angular comes out and it's like, okay, we're, we're gonna do this thing called dirty checking, and it's gonna be, it's gonna be faster and it's gonna be like, it's gonna be less sort of spaghetti and it's like a little bit more structured. And now you have sort of like the rails paradigm, but on the front end, and it takes people to get a while to get adjusted to that, but then that gets too heavy, right?
And then dirty checking is realized to be a mistake. And then, you get stuff like MVVM, right? So you get Knockout, like Knockout.js, and you got like Durandal, and like some, some other like sort of front end technologies that come up to address that problem. Uh, and then after that, like, you know, it just keeps going, right?
Like, and if you come in at the very end, you're just like, What is happening? Right? Like if it, if it, if someone doesn't sort of boil down the complexity and reduce it a little bit, you, you're just like, why, why do we do this like this? Right? and sometimes there's no good reason.
Sometimes the complexity is just like, is unnecessary, but having the steps helps you explain it, uh, or helps you understand how you got there. and, and so I feel like that is something younger people or, or newer devs don't necessarily get a chance to see. Cause it just, it would take, it would take very long right? And if you're like a new dev, let's say you jumped into like a coding bootcamp. I mean, I've got opinions on coding boot camps, but you know, it's just like, let's say you jumped into one and you, you came out, you, you made it. It's just, there's too much to know. sure, you could probably do like HTML in one month.
Well, okay, let's say like two weeks or whatever, right? If you were, if you're literally brand new, two weeks of like concerted effort almost, you know, class level, you know, work days right on, on html, you're probably decently comfortable with it. Very comfortable. CSS, a little harder because this is where things get hard.
Cause if you, if you give two weeks for, for HTML, CSS is harder than HTML kind of, right? Because the interactions are way more varied. Right? Like, and, and maybe it's one of those things where you just, like, you, you get somewhat comfortable and then just like know that in the future you're gonna see something you don't understand and have to figure it out. Uh, but then JavaScript, like, how many months do you give JavaScript? Because if you go through that first like, sort of progression that I, I I, I, I mentioned everyone would have a perfect sort of, not perfect but good understanding of the pieces, right? Like, why did we start transpiling at all? Right? Like, uh, or why did you know, why did we adopt libraries?
Like why did Bower exist? No one talks about Bower anymore, obviously, but like, Bower was like a way to distribute front end only packages, right? Um, what is it? Um, uh, yes, there's Grunt. There's like the whole build system thing, right? Once, once we decide we're gonna, we're gonna do stuff to files before we, before we push. So there's Grunt, there's, uh, Gulp, which is like Grunt, but like, Oh, we're gonna do it all in memory. We're gonna pipe, we're gonna use this pipes thing to make sure everything goes fast. Then there's like, of course, that leads to like the insanity that's webpack. And then there's like Parcel, which did better.
There's Vite, there's like, there's all this, there's this progression, but how many months would it take to know that progression? It, it's too long. So they end up just like, Hey, you're gonna learn React. Which is the right thing, because it's like, that's what people hire for, right? But then you're gonna be in React and be like, What's webpack, right?
And it's like, but you can't go down. You can't, you don't have the time. You, you can't sort of approach that problem from the other direction where you, which would give you better understanding cause you just don't have the time. I think it's hard for newer devs to overcome this.
Um, but I think there are some, there's some hope on the horizon cuz some things are simpler, right? Like some projects do reduce complexity, like, by watching another project sort of innovate. So like, React wasn't the first component-first framework, right? Like technically, I, I think, I think you, you might have to give that to like, to maybe Backbone, because like they had views, and like Marionette also went with that.
Like maybe, I don't know, someone, someone I'm sure will get in like, send me an angry email, uh, cuz I forgot, you know, MooTools or like, you know, Ember. Ember, they've also, they've also been around. I used to be a huge Ember fan, still, still kind of am, but I don't use it. But if you have these, if you have these tools, right?
Like people aren't gonna know how to use them. And Vue was able to realize that React had some inefficiencies, right? So React innovates the sort of component, so React introduces the component-based, component-first, uh, front end development model. Vue sees that and it's like, wait a second, if we just export this like data object, and of course that's not the only innovation of Vue, but if we just export this data object, you don't have to do this fine-grained tracking yourself anymore, right?
You don't have to tell React or tell your the system which things change when other things change, right? Like you, you don't have to set up this watching and stuff, right? Um, and that's one of the reasons, like Vue is just, I, I, I remember picking up Vue and being like, Oh, I'm done. I'm done with React now.
Because it just doesn't make sense to use React, because Vue essentially, either, you know, you could just say they learned from them, or they, they realized a better way to do things that is simpler and it's much easier to write. Uh, and you know, functionally similar, right? Um, similar enough that it's just like, oh, they boiled down some of that complexity and we're a step forward, and, you know, in other ways, I think.
Uh, so that's, that's awesome. Every once in a while you get like a compression in the complexity, and then it starts to ramp up again, and you get maybe another compression. So like joining the projects that do a compression, or like starting to adopt those, is really, can be really awesome. So there's, there's like, there's some hope, right?
Cause sometimes there is a compression in that complexity and you you might be lucky enough to, to use that instead of, the thing that's really complex after years of building on it.
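The Vue idea Victor described — export a plain data object and let the framework notice writes, instead of telling it what changed — can be sketched in TypeScript with a `Proxy`. This is a toy illustration of the concept, not Vue's actual implementation (the name `reactive` is borrowed from Vue 3's API, but the body here is a made-up sketch):

```typescript
// Toy sketch: wrap a plain data object so that any write automatically
// notifies the framework, with no manual change tracking by the caller.
function reactive<T extends object>(data: T, onChange: () => void): T {
  return new Proxy(data, {
    set(target, key, value, receiver) {
      const ok = Reflect.set(target, key, value, receiver);
      onChange(); // a real framework would schedule a re-render here
      return ok;
    },
  });
}

// "Component" state: just a data object, like Vue's `data`.
let renders = 0;
const state = reactive({ count: 0 }, () => renders++);

state.count = 1; // plain assignment; the wrapper sees it
state.count = 2; // renders is now 2, no watcher registered by hand
```

After the two assignments, the framework has been notified twice without the component ever wiring up watchers itself, which is the ergonomic win being described.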
[00:27:53] Jeremy: I think you're talking about newer developers having a tough time making sense of the current frameworks, but the example you gave of somebody starting from HTML and JavaScript, going to jQuery, Backbone, through the whole chain — that, that's just by nature of, you've put in a lot of time, right? You've done a lot of work working with each of these technologies. You see the progression.
Whereas if someone is starting new, just by nature of you being new, you won't have been able to spend that time.
[00:28:28] Victor: Do you think it could work? Again, the, the, the time aspect is like really hard to get, like, how can you just avoid spending time, um, to, to learn things? That's like a general problem. I think that problem is called education in the general sense.
But like, does it make sense for a, let's say a bootcamp or, or any, you know, school right? To attempt to guide people through the previous solutions that didn't work, right? Like in math, you don't start with calculus, right? It just wouldn't, it doesn't make sense, right? But we try and start with calculus in software, right?
We're just like, okay, here's the complexity. You've got all of it. Don't worry. Just look at this little bit. If, you know, if the compiler ever spits out a weird error, uh oh, like, you're, you're, you're in for trouble, cuz you, you just didn't get the, get the basics. And I think that's maybe some of what is missing.
And the thing is, it is like the constraints are hard, right? No one has infinite time, right? Or like, you know, even like, just tons of time to devote to learning, learning just front end, right? That's not even all of computing, That's not even the algorithm stuff that some companies love to throw at you, right?
Uh, or the computer sciencey stuff. I wonder if it makes more sense to spend some time taking people through the progression, right? Because discovering that we should do things via components, let's say, or, or at least encapsulate our functionality to components and compose that way, is something we, we not everyone knew, right?
Or, you know, we didn't know wild widely. And so it feels like it might make sense to touch on that sort of realization and sort of guide the student through, you know, maybe it's like make five projects in a week and you just get progressively more complex. But then again, that's also hard cause effort, right?
It's just like, it's a hard problem. But, but I think right now, uh, people who come in at the end and sort of like see a bunch of complexity and just don't know why it's there, right? Like, if you've like, sort of like, this is, this applies also very, this applies to general, but it applies very well to the Kubernetes problem as well.
Like if you've never managed nginx on more than one machine, or if you've never tried to set up a, like a, to format your file system on the machine you just rented because it just, you know, comes with nothing, right? Or like, maybe, maybe some stuff was installed, but, you know, if you had to like install LVM (Logical Volume Manager) yourself, if you've never done any of that, Kubernetes would be harder to understand.
It's just like, it's gonna be hard to understand. Overlay networks are hard for everyone to understand, uh, except for network people who like really know networking stuff. I think it would be better, but unfortunately, it takes a lot of time for people to take a sort of more iterative approach to, to learning.
I try and write blog posts in this way sometimes, but it's really hard. And so like, I'll often have like an idea, like, so I call these, or I think of these as like onion, onion style posts, right? Where you either build up an onion sort of from the inside and kind of like go out and like add more and more layers or whatever.
Or you can, you can go from the outside and sort of take off like layers. Like, oh, uh, Kubernetes has a scheduler. Why do they need a scheduler? Like, and like, you know, kind of like, go, go down. but I think that might be one of the best ways to learn, but it just takes time. Or geniuses and geniuses who are good at two things, right?
Good at the actual technology and good at teaching. Cuz teaching is a skill and it's very hard. and, you know, shout out to teachers cuz that's, it's, it's very difficult, extremely frustrating. it's hard to find determinism in, in like methods and solutions.
And there's research of course, but it's like, yeah, that's, that's a lot harder than the computer being like, Nope, that doesn't work. Right? Like, if you can't, if you can't, like if you, if the function call doesn't work, it doesn't work. Right. If the person learned suboptimally, you won't know Right. Until like 10 years down the road when, when they can't answer some question or like, you know, when they, they don't understand. It's a missing fundamental piece anyway.
[00:32:24] Jeremy: I think with the example of front end, maybe you don't have time to walk through the whole history of every single library and framework that came, but I think at the very least, if you show someone, or you teach someone how to work with CSS, and you have them — like you were talking about components before — you have them build a site where there's a lot of stuff that gets reused, right? Maybe you have five pages and they all have the same nav bar.
[00:33:02] Victor: Yeah, you kind of like make them do it.
[00:33:04] Jeremy: Yeah. You make 'em do it and they make all the HTML files, they copy and paste it, and probably your students are thinking like, ah, this, this kind of sucks
[00:33:16] Victor: Yeah
[00:33:18] Jeremy: And yeah, so then you, you come to that realization, and then after you've done that, then you can bring in, okay, this is why we have components.
And similarly you brought up manual DOM manipulation with jQuery and things like that. I, I'm sure you could come up with an example of — you don't even necessarily need to use jQuery. I think people can probably skip that step and just use the, the, the API that comes with the browser.
But you can have them go in like, Oh, you gotta find this element by the id and you gotta change this based on this, and let them experience the — I don't know if I would call it pain, but let them experience like how it was. Right. And, and give them a complex enough task where they feel like something is wrong, right? Or, or like, there should be something better. And then you could go straight to Vue or React. I'm not sure if we need to go like, Here's Backbone, here's Knockout.
[00:34:22] Victor: Yeah. That's like historical. Interesting.
[00:34:27] Jeremy: I, I think that would be an interesting college course or something that.
Like, I remember when, I went through school, one of the classes was programming languages. So we would learn things like, Fortran and stuff like that. And I, I think for a more frontend centered or modern equivalent you could go through, Hey, here's the history of frontend development here's what we used to do and here's how we got to where we are today.
I think that could be actually a pretty interesting class yeah
[00:35:10] Victor: I'm a bit interested to know you learned Fortran in your PL class.
I, think when I went, I was like, Lisp, and then some, some other, like, higher classes taught Haskell, but, um, but I wasn't ready for Haskell. Not many people are. But Fortran is interesting, I kinda wanna hear about that.
[00:35:25] Jeremy: I think it was more in terms of just getting you exposed to, historically this is how things were. Right. And it wasn't so much of like, you can take strategies you used in Fortran into programming as a whole. I think it was just more of like a, a survey of like, Hey, here's, you know, here's Fortran, and like you were saying, here's Lisp, and all, all these different languages, and like at least you, you get to see them and go like, yeah, this is kind of a pain.
[00:35:54] Victor: Yeah
[00:35:55] Jeremy: And like, I understand why people don't choose to use this anymore but I couldn't take away like a broad like, Oh, I, I really wish we had this feature from, I think we were, I think we were using Fortran 77 or something like that.
I think there's Fortran 77, a Fortran 90, and then there's, um, I think,
[00:36:16] Victor: Like old fortran, deprecated
[00:36:18] Jeremy: Yeah, yeah, yeah. So, so I think, I think, uh, I actually don't know if they're, they're continuing to, um, you know, add new things or maintain it or it's just static. But, it's, it's more, uh, interesting in terms of, like we were talking front end where it's, as somebody who's learning frontend development who is new and you get to see how, backbone worked or how Knockout worked how grunt and gulp worked.
It, it's like the kind of thing where it's like, Oh, okay, like, this is interesting, but let us not use this again. Right?
[00:36:53] Victor: Yeah. Yeah. Right. But I also don't need this, and I will never again
[00:36:58] Jeremy: yeah, yeah. It's, um, but you do definitely see the, the parallels, right? Like you were saying where you had your, your Bower and now you have NPM and you had Grunt and Gulp and now you have many choices
[00:37:14] Victor: Yeah.
[00:37:15] Jeremy: yeah. I, I think having the history context, you know, it's interesting and it can be helpful, but if somebody came to me and said, hey, I want to learn how to build websites, I wanna get into front end development, I would not be like, Okay, first you gotta start with MooTools or GWT.
I don't think I would do that, but I think at an academic level, or just in terms of seeing how things became the way they are, sure, for sure it's interesting.
[00:37:59] Victor: Yeah. And I, I think another thing — I don't remember who asked or why, why I had to think of this lately, um, but it was — knowing the differentiators between other technologies is also extremely helpful, right? So, what's the difference between esbuild and SWC, right? Again, we're, we're, we're leaning heavy front end, but, you know — sorry for context, of course not everyone is a front end developer — these are two different, uh, build tools, right?
For, for JavaScript, right? Essentially you can think of 'em as transpilers, but they, I think, you know, I think they also bundle, like — uh, generally, I'm not exactly sure if, if esbuild will bundle as well. Um, but it's like, one is written in Go, the other one's written in Rust, right? And sort of, there's, um, there's, in addition, there's Vite, which is like — Vite does bundle, and Vite does a lot of things.
Like, like there's a lot of innovation in Vite that has to do with like, making local development as fast as possible, and also getting like, you're sort of making sure as many things as possible are strippable, right? Or, or, or tree-shakeable — sorry, is the better term. Um, but yeah, knowing, knowing the, um, the differences between projects is often enough to sort of make it less confusing for me.
Um, as far as like, Oh, which one of these things should I use? You know, outside of just going with what people are recommending. Cause generally there are some people with wisdom who lead the crowd sometimes, right? So, so sometimes it's okay to be, you know, a crowd member, as long as you're listening to the, to, to someone worth listening to.
Um, and, and so yeah, I, I think that's another thing that is like the mark of a good project — or, or it's not exclusive, right? The condition's not necessarily sufficient — but it's like, good projects have the "why use this versus X" section in the README, right? They're like, Hey, we know you could use Y, but here's why you should use us instead.
Or we know you could use X, but here's what we do better than X. That might, you might care about, right? That's, um, a, a really strong indicator of a project. That's good cuz that means the person who's writing the project is like, they've done this, the survey. And like, this is kind of like, um, how good research happens, right?
It's like, most of research is reading what's happening, right? To knowing, knowing the boundary you're about to push, right? Or try and sort of like, make one step forward in, um — so that's something that I think the, the rigor isn't necessarily in software development everywhere, right?
Which is good and bad. but someone who's sort of done that sort of rigor or, and like, and, and has, and or I should say, has been rigorous about knowing the boundary, and then they can explain that to you. They can be like, Oh, here's where the boundary was. These people were doing this, these people were doing this, these people were doing this, but I wanna do this.
So you just learned now whether it's right for you, and sort of the other points in the space, which is awesome. Yeah. Going to your point, I feel like that's, that's also important. It's probably not a good idea to try and get everyone to go through historical artifacts, but just a, a quick explainer and sort of, uh, note on the differentiation could help, for sure. Yeah. I feel like we've skewed too much frontend. No more frontend discussion at this point.
[00:41:20] Jeremy: It's just like, I, I think there's so many more choices, where the, the mental thought that has to go into, Okay, what do I use next, I feel is bigger on frontend.
I guess it depends on the project you're working on, but if you're going to work on anything front end, if you haven't done it before or you don't have a lot of experience, there's so many build tools, so many frameworks, so many libraries, that — yeah, but we
[00:41:51] Victor: Iterate, yeah, in every direction. Like, the — it's good and bad, but frontend just goes in every direction at the same time. Like, there's so many people who are so enthusiastic and so committed, and, and it's so approachable, that like everyone just goes in every direction at the same time, and like a lot of people make progress, and then unfortunately you have to try and pick which, which branch makes sense.
[00:42:20] Jeremy: We've been kind of talking about some of your experiences with a few things, and I wonder if you could explain the, the context you're thinking of in terms of the types of projects you typically work on, like, what are they, what's the scale of them, that sort of thing.
[00:42:32] Victor: So I guess I've, I've gone through a lot of phases, right? In sort of what I use in, in my tooling and what I thought was cool. I wrote enterprise Java like everybody else. Like, like, people really don't talk about it, but like, it's like, almost — at some point it was like, you're either a Rails shop or a Java shop, for so many people.
And I wrote enterprise Java for a, a long time, and I was lucky enough to have friends who were really into, other kinds of computing and other kinds of programming. a lot of my projects were wrapped around, were, were ideas that I was expressing via some new technology, let's say. Right?
So, I wrote a lot of Haskell for, for, for a while, right? But what did I end up building with that? It was actually a job board that honestly didn't go very far, because I was spending much more time sort of doing, Haskell things, right? And so I learned a lot about sort of what I think is like the pinnacle of sort of like typed development in, in the non-research world, right?
Like, like right on the edge of research and actual usability. But a lot of my ideas, sort of getting back to the, the ideas question are just things I want to build for myself. Um, or things I think could be commercially viable or like do, like, be, be well used, uh, and, and sort of, and profitable things, things that I think should be built.
Or like if, if I see some, some projects as like, Oh, I wish they were doing this in this way, Right? Like, I, I often consider like, Oh, I want, I think I could build something that would be separate and maybe do like, inspired from other projects, I should say, Right? Um, and sort of making me understand a sort of a different, a different ecosystem.
but a lot of times I have to say like, the stuff I build is mostly to scratch an itch I have. Um, and or something I think would be profitable or utilizing technology that I've seen that I don't think anyone's done in the same way. Right? So like learning Kubernetes for example, or like investing the time to learn Kubernetes opened up an entire world of sort of like infrastructure ideas, right?
Because like the leverage you get is so high, right? So you're just like, Oh, I could run an AWS, right? Like, now that I, now that I know this — cuz it's like, it's actually not bad, it's kind of usable — like, couldn't I do that? Right? That kind of thing. Right? Or, um, I feel like a lot of the times I'll learn a technology and it'll, it'll make me feel like certain things are possible that, that weren't before.
Uh, like Rust is another one of those, right? Like, cuz like Rust will go from like embedded all the way to WASM, which is like a crazy vertical stack. Right? It's, that's a lot, That's a wide range of computing that you can, you can touch, right? And, and there's, it's, it's hard to learn, right? The, the, the, the, uh, the, the ramp to learning it is quite steep, but, it opens up a lot of things you can write, right?
It, it opens up a lot of areas you can go into, right? Like, if you ever had an idea for like a desktop app, right? You could actually write it in Rust. There's like, there's, there's ways — there's like, uh, and there's like, um, Tauri is one of my personal favorites, which uses web technology. But it's either I'm inspired by some technology and I'm just like, Oh, what can I use this on?
And like, what would this really be good at doing? or it's, you know, it's one of those other things, like either I think it's gonna be, Oh, this would be cool to build and it would be profitable. Uh, or like, I'm scratching my own itch. Yeah. I think, I think those are basically the three sources.
[00:46:10] Jeremy: It's, it's interesting about Rust, where it seems so trendy, I guess, in that lots of people wanna do something with Rust, but then a lot of them also are not sure, does it make sense to write in Rust? Um, I, I think the, the embedded stuff, of course, that makes a lot of sense.
And, uh, you, you've seen a sort of surge in command line apps, stuff like ripgrep and ag, stuff like that, and places like that. It's, I think the benefits are pretty clear in terms of, you've got the performance and you have the strong typing and whatnot. And I think there's sort of the in-between section that's kind of unclear, to me at least. Would I build a web application in Rust? I'm not sure. That sort of thing.
[00:47:12] Victor: Yeah. I would, I'd characterize it as kind of like, it's a toolkit, so it really depends on the problem. And I think we have many tools, and there's almost never a real reason to pick one in particular, right?
Like there's — cause it seems like, just most of, a lot of the work — like, unless you're, you're really doing something interesting, right?
Like, uh, something that like, oh, I need to, I need to, like, I'm gonna run, you know, billions and billions of processes. Like, yeah, maybe you want Erlang at that point, right? Like, maybe, maybe you should, that should be, you know, your, your thing. Um, but computers are so fast these days, and most languages have, have sort of borrowed — not borrowed, but like adopted features from others — that it's really hard to find a, a specific use case for one particular tool.
Uh, so I often just categorize it by what I want out of the project, right? Or like, either my goals or project goals, right? Depending on — and, or like business goals, if you're, you know, doing this for a business, right? Um, so like, uh, I, I basically, if I want to go fast and I want to like, you know, reduce time to market, I use TypeScript, right?
Oh, and also I'm a, I'm a, like, a type zealot, I, I'd say so. Like, I don't believe in not having types, right? Like, it's just like — there's, I think it's crazy that you would like have a function but not know what the inputs could be, and they could actually be anything, right? You're just like, and then you have to kind of just keep that in your head.
I think that's silly. Now that we have good — we, we have, uh, ways to avoid the, uh, ceremony, right? You've got like Hindley-Milner type systems, like, you have a way to avoid the — you can, you know, predict what types of things will be, and you don't have to write everything everywhere. So like, it's not that.
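Victor's point about avoiding the ceremony is concretely what TypeScript's inference gives you: types are checked everywhere but written almost nowhere. A small sketch:

```typescript
// No annotations below, yet everything is typed by inference.
const nums = [1, 2, 3];                 // inferred as number[]
const doubled = nums.map((n) => n * 2); // n: number, doubled: number[]

function add(a: number, b: number) {
  return a + b;                         // return type inferred as number
}

const total = add(doubled[0], doubled[1]);
// add("2", 3) would be a compile error: the inputs can't "be anything".
```

You annotate the boundaries (function parameters) and the compiler carries the types through everything else, which is why loose-but-typed TypeScript stays fast to write.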
But anyway, so if I wanna go fast, the, the point is that going back to that early, like the JS ecosystem goes everywhere at the same time. Typescript is excellent because the ecosystem goes everywhere at the same time. And so you've got really good ecosystem support for just about everything you could do.
Um, uh, you could write TypeScript that's very loose on the types and go even faster, but in general it's not very hard. There's not too much ceremony, and it's just like, you know, putting some stuff that shows you what you're using, and like, you know, the objects you're working with. And then generally, if I wanna like, get it really right, I, I'll like reach for Haskell, right?
Cause it's just like the sort of contortions, and again, this takes time, this not fast, but, right. the contortions you can do in the type system will make it really hard to write incorrect code or code that doesn't, that isn't logical with itself. Of course interfacing with the outside world. Like if you do a web request, it's gonna fail sometimes, right?
Like the network might be down, right? So you have to — you basically, you sort of wrap that uncertainty in your system to whatever degree you're okay with. And then, but I know it'll be correct, right? But — and correctness is just not important — most of, like, oh, I should — that's a bad quote. Uh, it's not that correctness is not important.
It's like, if you need to get to market, you do not necessarily need every single piece of your code to be correct, right? If someone calls some, some function with like, negative one, and it's not important, it's not tied to money, or it's like, you know, whatever, then maybe it's fine. They just see an error, and then like, you get an error in your backend, and you're like, Oh, I better fix that.
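The "wrap that uncertainty in your system" idea translates to TypeScript as well. A sketch with a hypothetical `Result` type (the type and the `parsePort` function are made up for illustration): the compiler then forces callers to handle the failure branch before touching the value, which is the Haskell-ish discipline being described.

```typescript
// Success and failure are both ordinary values; no exceptions escape.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// A fallible operation returns the uncertainty explicitly.
function parsePort(s: string): Result<number, string> {
  const n = Number(s);
  if (!Number.isInteger(n) || n < 1 || n > 65535) {
    return { ok: false, error: `invalid port: ${s}` };
  }
  return { ok: true, value: n };
}

const r = parsePort("8080");
// The discriminated union won't let you read r.value until r.ok is checked.
const port = r.ok ? r.value : 3000; // explicit fallback for the failure case
```

You choose per call site how much of that uncertainty to absorb — a fallback here, a surfaced error there — rather than letting a thrown exception decide for you.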
Right? Um, and then generally, if I want to be correct and fast, I choose Rust these days. Right? Um, these days. And going back to your point, a lot of times that means that I'm going to write in TypeScript for a lot of projects. So that's what I'll do for a lot of projects, cuz I'll just be like, ah, do I need like absolute correctness, or like some really, you know, fancy sort of type stuff?
No. So I don't pick Haskell, right. And it's like, do I need to be like mega fast? No, probably not. So I don't necessarily, don't necessarily need Rust. Um, maybe it's interesting to me in terms of like a long, long term thing, right? Like if I'm thinking, oh, but I want X — like, for example, tight, tight, uh, integration with WASM — if I'm just like, oh, I could see myself — but that's more of like, you know, for a fun thing that I'm doing, right?
Like, it's just like, it's, it's — you don't need it. That's premature, like, you know, that's a premature optimization thing. But if I'm just like, ah, I really want the ability to like, maybe consider refactoring some of this out into like a WebAssembly thing later, then I'm like, Okay, maybe, maybe I'll, I'll pick Rust.
Or like, if I, if I do want, you know, really, really fast, then I'll like, then I'll go Rust. But most of the time, it's just like, I want a good ecosystem so I don't have to build stuff myself most of the time. Uh, and you know, TypeScript is good enough. So my stack ends up being, a lot of the time, just in TypeScript, right? Yeah.
[00:52:05] Jeremy: Yeah, I think you've encapsulated the reason why there's so many packages on NPM, and why there's so much usage of JavaScript and TypeScript in general — is that it, it, it fits the, it's good enough, right? And in terms of, in terms of speed, like you said, most of the time you don't need Rust.
Um, and so TypeScript, I think, is a lot more approachable. A lot of people have to use it because they do front end work anyways, and so that kinda just becomes the — I don't know if I should say the default, but I would say it's probably the most common, in terms of when somebody's building a backend today. Certainly there's other languages, but JavaScript and TypeScript is everywhere.
[00:52:57] Victor: Yeah. Uh, I, I, I — another thing is like, I mean, I've sort of ignored the, like, unreasonable effectiveness of, like, Rails. Cause there's just a — there's tons of just like Rails warriors out there, and that's great. They're, they're fantastic. I'm not a, I'm not personally a huge fan of Rails, but that's, uh, that's to my own detriment, right?
In, in some, in some ways. But like, Rails and Django sort of just like — people who are like, I'm gonna learn this framework, it's gonna be excellent — they have a, they have carved out a great ecosystem for themselves. Um, or like, you know, even PHP, right? PHP and like Laravel, or whatever. Uh, and so I'm ignoring those, like, those pockets of productivity, right?
Those pockets of like intense productivity that people like, have all their needs met in that same way. Um, but as far as like general, general sort of ecosystem size and speed for me, um, like what you said, like applies to me. Like if I, if I'm just like, especially if I'm just like, Oh, I just wanna build a backend, Like, I wanna build something that's like super small and just does like, you know, maybe a few, a couple, you know, endpoints or whatever and just, I just wanna throw it out there.
Right? Uh, I, I will pick, yeah, TypeScript. It just like, it makes sense to me. I also think Node is a better VM or platform to build on than any of the others as well. So like — like, by any of the others, I mean Python, Perl, Ruby, right? Like, sort of in the same class of, of tool.
So I, I am kind of convinced that, um, Node is better than those as far as core abilities, right? Like threading, right, versus the just multi-processing, and like, you know, other, other, other solutions and like, stuff like that. So, if you want a boring stack, if I don't wanna use any tokens, right?
Any innovation tokens I reach for TypeScript.
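The "threading versus just multi-processing" point refers to Node's built-in `worker_threads` module, where languages like Python or Ruby have historically leaned on separate processes. A minimal sketch — the inline worker source and the `sumOnWorker` helper are made up for illustration:

```typescript
import { Worker } from "node:worker_threads";

// Worker body as a string: runs on a separate thread, posts its result back.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Offload a CPU-bound sum to a worker thread and await the answer,
// leaving the main thread's event loop free the whole time.
function sumOnWorker(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const w = new Worker(workerSource, { eval: true, workerData: n });
    w.once("message", resolve);
    w.once("error", reject);
  });
}

sumOnWorker(1_000_000).then((sum) => {
  console.log(`sum = ${sum}`);
});
```

Workers share memory only through explicit channels (messages, `SharedArrayBuffer`), which keeps the single-threaded JavaScript model intact while still using extra cores.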
[00:54:46] Jeremy: I think it's good that you brought up Rails and, and Django, because, uh, personally I've done, I've done work with Rails, and you're right in that Rails has so many things built in, and the ways to do them are so well established, that your ability to be productive and build something really fast is hard to compete with, at least in my experience, with what's available in the Node ecosystem.
Um, on the other hand, like I, I also see what you mean by the runtimes. Like with Node, you're, you're built on top of V8 and there's so many resources being poured into it to making it fast and making it run pretty much everywhere. I think you probably don't do too much work with managed services, but if you go to a managed service to run your code, like a platform as a service, they're gonna support Node.
Will they support your other preferred language? Maybe, maybe not,
You know that they will — they'll be able to run Node apps. So, but yeah, I don't know if it will ever happen, or maybe I'm just not familiar with it, but I feel like there isn't a real Rails of JavaScript.
[00:56:14] Victor: Yeah, you're, totally right.
There are, there are — it's, it's weird. It's actually weird that there — like, uh, but, but I kind of agree with you. There's projects that are trying it recently. There's like Adonis, um, there are backends that also do, like, will do basic templating, like Nest. NestJS is like really excellent.
It's like one of the best sort of backend projects out there. I, I — but like, back in the day, there were projects like Sails, which was like very much trying to do exactly what Rails did, but it just didn't seem to take off and reach that critical mass, possibly because of the size of the ecosystem, right?
Like, how many alternatives to Rails are there? Not many, right? And, and now — anyway, maybe let's say the rest of 'em sort of like died out over the years. But there's also like, um, hapi, uh, which is like also, you know, similarly, it was like angling themselves to be that, but they just never, they never found the traction they needed.
I think, um — or at least, to be as widely known as Rails is for, for, for the, for the Ruby ecosystem. Um, but also for people to kind of know the magic, cause, like, I feel like you're productive in Rails only when you imbibe the magic, right? You, you know all the magic context, and you know the incantations, and they're comforting to you, right?
Like you've, you've, you have the, you have the sort of like, uh, convention. You're like, if you're living and breathing the convention, everything's amazing, right? Like, like you can't beat that. You're just like, you're in the zone but you need people to get in that zone. And I don't think node has, people are just too, they're too frazzled.
They're going like, there's too much options. They can't, it's hard to commit, right? Like, imagine if you'd committed to backbone. Like you got, you can't, It's, it's over. Oh, it's not over. I mean, I don't, no, I don't wanna, you know, disparage the backbone project. I don't use it, but, you know, maybe they're still doing stuff and you know, I'm sure people are still working on it, but you can't, you, it's hard to commit and sort of really imbibe that sort of convention or, or, or sort of like, make yourself sort of breathe that product when there's like 10 products that are kind of similar and could be useful as well.
Yeah, I think that's, that's kind of big. It's weird that there isn't a Rails for NodeJS, but, but people are working on it, obviously. Like I mentioned, Adonis; there's, there's more. I'm leaving a bunch of them out, but that's part of the problem.
[00:58:52] Jeremy: On, on one hand, it's really cool that people are trying so many different things, because hopefully maybe they can find something that, like, other people wouldn't have thought of if they all stuck with the same framework. But on the other hand, it's... how much time have we spent jumping between all these different frameworks, when we could have had a Rails?
[00:59:23] Victor: Yeah, the, the sort of wasted time is, is crazy to think about. Uh, I do think about that from time to time. And you know, and personally I waste a lot of my own time. Like, just, just recently, uh, something I've been working on for a long time. I came back to it after just sort of leaving it on the shelf for a while and I was like, You know what?
I should rewrite this in Rust. I, I really should. And so I talked myself into it, and I'm like, You know what? It's gonna be so much easier to deploy. I'm just gonna have one binary. I'm not gonna have to deal with anything else. I'm just like, it'll be, it'll be so much better. I'll, I'll be a lot more confident in the code I write.
And then sort of going through it and like finishing this a, a chunk of it and the kind of project it is, is like I'll have a lot of sort of, different services, right? That, that, that sort of do a similar thing, but a sort of different flavor of a, of a thing, if that makes sense. And I know that I can just go back to typescript on the second one, right?
Like, I'm, I'm doing one and I'm just like, and that's what I've decided to do. Cause I'm just like, Yeah, no, this doesn't make any sense. Like, I'm spending way too much time, um, when the other thing is like, is good enough. And like, I think maybe, if you can, like, I don't know, just stay aware of, like, Oh, how much friction am I encountering, and maybe I should switch. Like if you know Rails and you know TypeScript, you should probably use Rails, if you're bought into the magic of Rails, right? And, and of course Rails is also another thing that has always had great support from platform as a service companies. Rails is always gonna be, you know, have great support, right? Because it's just one of those places where it's so nice and cozy that, you know, people who use it are just like, the people who don't want to think about the server underneath.
[01:01:03] Jeremy: I think that combination is really powerful. Like you were talking earlier about working with Kubernetes and learning how all that works and how to run a database and all that. And if you think about the Heroku experience, right? You create your, your Rails app. You tell Heroku I want a database and then you push it. You don't have to worry about pods or Docker or any of that. They take care of all of it for you. So I think that certainly there's a value to going deeper and, and learning about how to host things yourself and things like that, but I can totally understand if you have the money, uh, especially for a business, to say I don't wanna do this type of ops work, I don't want to learn how to set up a cluster, I just want to push it to a Heroku and be done with it.
[01:02:00] Victor: Yeah. No one gives you an award for learning how to, like, wrangle LVM, right? No one gives you that. They just like, you know, you either make it to market or you don't. Uh, and it's like, uh, like I mean, I'd love to hear about what you sort of optimize for, but I feel like it's all about what you want to optimize for.
Like, are you optimizing for time to market? Are you optimizing for a code base that people won't be able to mess up later? Right? Like, a lot of just, you know, seed stage startups, or like just early startups, or big companies, like, it doesn't matter, we'll rewrite anyway, right? Like, the eBay example was a great, was a great sort of indication of that: like, it will get rewritten. So maybe it doesn't make sense, maybe it's silly, to, to optimize for a strong code base at the beginning. Um,
[01:02:45] Jeremy: I think it, uh, at the beginning, especially if you don't have an established audience, like you're not getting any money, then pick something that the team knows and that, you know, um, or at least the majority does, because that, I think, makes the biggest difference in speed. Because let's, let's say you, you were giving an example of I would use Haskell if I need to be correct, and I would use Rust if I need to be fast. But if you are picking something everybody knows and you don't have some very specific requirement, for the most part, if you're using something you already know, it's going to be built faster, it's going to be easier to read and maintain, and it'll probably be more correct just because you're more familiar with that whole...
[01:03:50] Victor: So I, I agreed right up until the last point. I feel like correctness is one of those... if you use a tool that lets you be too sloppy, you can't stop people from being sloppy, right? Uh, like I think, and this is actually something I was thinking of earlier today, is like, I think writing good code is either people being disciplined or better systems, and of course it doesn't matter in every case, right? And so like, so in cases where, like, it's just not that important, and, and it's better to just let it error and then someone just goes and, like, fixes it, right? But if you do that too long, you can get spaghetti, right? You can get either spaghetti or you can get a code base that's suffering from a lot of technical debt. Uh, and it, it won't be a problem early on, but when it is, it's a big problem, right? And can drain a lot of, a lot of time. But 99% of the time, I agree.
You don't need anything other than like TypeScript or Rails or like Django, or you could, you could use Perl if you want, PHP obviously, like, you know, right? Like, you, you could get very far, very fast with those. And often it's like, not even necessary to go anywhere else. But the only little thing I'd say is just like, I find that it's, it's so hard to be correct if you're not getting any help from your compiler, right?
Like, for me, at the very least, right? Like, if you're not getting any help from the language, it's so hard to, like, write stuff that's correct, that doesn't ship with bugs in it. Right? There was, um, there's a whole period of time where everyone was getting really excited about writing tests that were like, Oh, make sure to, like, write a test with negative one.
Right? Like, just like, you know, like the next level test stuff was just like, Oh, but what if you like, you know, you gotta, I mean, and this is true, right? You have to think like, how could your system possibly be broken, right? Like, like thinking of how to break a system is hard. It's different from thinking of how to build a system, right?
It's a different skill set. But like some of those things you should really just be protected from. I think a big, uh, moment in my career was like seeing Option. I, I'd been lucky enough to have friends that were like exploring with stuff like, um, like Haskell, super early on, and like Common Lisp and sort of like, and reading Hacker News, shout out to AJ cuz like, that's his name.
But like, there's a, there's a person that was like, just kind of like, sort of like exploring the frontier. And then I would like hear a little bit and be like, Ooh, that's interesting. And like, kind of like, take a look. But Option coming in, like, I think in Java 8, I was like, wait a second, Option should be everywhere, right?
Because it's like, NPEs, Null Pointer Exceptions, should almost, like, they shouldn't really be a thing, right? Like, and then you are like, Oh, wait, Option should be here, but that means it has to be there, and it kind of like, it just infects everything. And normally stuff that infects everything is bad, right?
You're just like, Oh no, this is bad. I better take it out. But then you're like, Wait a second. But Option is right, because I don't know if that thing is not a null, actually, right? Like the language doesn't give me that. So then, you know, you kind of stumble upon non nullable types, right? As a language feature.
And so it's, it's really hard to quantify, but I think things like that can make a, a worthwhile difference in the base choice of language, as far as correctness goes, and in preventing bugs. But I also know that like, people are just blazing by in Rails, like, just like absolutely without a care in the world, and they're doing great and they, like, they have all the integrations and it's all, it's working out great for them.
But I personally just like, I'm just like, I have to, I feel the compulsion. I'm just like, I feel the compulsion and I'm just like, I need to at least do typescript and then I have a little bit more protection from myself. Uh, and then I can, and then I can go on. And it's also, it's like, it's also an excuse for me to like, write less tests as well.
Like a little bit like, you know, I'm just like, you know, there's, there's some, there's some assurance that I don't have to, like, go back and actually write that negative one test, like, the first time, right? In practice, like, technically you, you should, cuz like, you know, at run time it's, it is a completely different world, right?
Like TypeScript is like a compile time thing. But if you, if you write your types well enough, you, you're, you're protected from some amount of that. And I find that that helps me. Personally. So, so that's the, that's the one place I'm just like, ah, I do like that correctness stuff,
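The kind of compiler help Victor describes can be sketched in TypeScript with `strictNullChecks` enabled: a nullable type cannot be used as a plain value until the null case is handled, so the check cannot be forgotten. A minimal sketch:

```typescript
// With strictNullChecks on, `string | null` cannot be treated as a
// string until the null case is handled -- the compiler enforces it.
type User = { name: string; nickname: string | null };

function displayName(user: User): string {
  // `return user.nickname.toUpperCase()` here would be a compile
  // error under strictNullChecks: "Object is possibly 'null'".
  return user.nickname !== null ? user.nickname : user.name;
}
```

In a sense, that null branch is a "negative one"-style test the type system writes for you.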
[01:08:13] Jeremy: Yeah. Yeah. I, I think like, I, I do agree in a general sense with languages that have static type checking, where you know at compile time whether something can even run; that can make a big difference. Maybe correctness wasn't the right word, but if you work in an ecosystem, whether Rails or Django or something else, you kind of know all of the, the gotchas, I guess? If you're, if you're, let's say you're building a product with Haskell and you've never used Haskell before, I feel like yes, you have a lot of strong guarantees from the type system, but there are going to be things about the language or the ecosystem that you, you'll just miss just because you haven't been in it.
And I think that's what I meant by correctness, in that you're going to make mistakes, either logical mistakes or mistakes in structure, right? Because if you, if you think about a Rails app, one of the things that I think is really powerful is that you can go to a company day one that uses Rails, and if they haven't done anything too crazy, you have a general sense of where things are, to some extent.
And when you're building something from scratch in a language and ecosystem you don't understand, um, there's just so many scrapes and cuts you have to get before you're proficient, right? Um, so I, so I think that is more what I was thinking of, yeah.
[01:10:01] Victor: Oh yeah. I, I'd fully agree with that. Yeah, I fully agree with that. You don't know what you, what you don't know, right? When you, uh, when you start, um, especially with a new ecosystem, right? Because you just, everything's hard. You have to go figure out everything. You have to go try and decide between two libraries that do similar things, despite, you know, like knowing how it's done in another language.
But you gotta like figure out how it's done in this language, et cetera. But it's like, well, you know, at least decisions are easier elsewhere sometimes, right? Like, like in the database level or like, maybe the infrastructure level or, but yeah, I, I totally get that. It's just, most of the time you just want to go with that, uh, that faster, that faster thing, you know, Feels funny to say of course.
Cuz I never do this (laughs). I never... like, all my, all my projects are on, on essentially crazy stacks. But, but I, I try and, I try and be mindful about how much of my toil right now is even a good idea, right? Like, depending on my goals. Again, like going back to like that, it depends on what you're optimizing for, right? If you're optimizing for learning or like getting a really good fundamental understanding of something, then yeah, sure. If you're optimizing for, like, getting to market? Sure, that's a different answer. If you're, if you are optimizing for, like, being able to hire developers to work alongside you, right?
Like making it easy to hire teammates in the future, that's a different set of languages maybe. so yeah, I don't know. I kind of give the, the weasel answer, which is, you know, it depends , hmm right? But, um, yeah.
[01:11:32] Jeremy: Especially if you're, you're learning or you're doing personal projects for yourself, then yeah, if you, if you want to know how to use Haskell better, then yeah, go for it. Use, use Haskell, um, uh, or use Rust and so on. I think another thing I think about is the deployment. So if, personally, you are running a SaaS or you're running something that you deploy internally, then I think something like Rails, Django is totally fine, especially if you use a platform as a service; then there's so many resources for you. But if your goal is, to give you an example, like Mastodon, right? So we have the whole, Twitter substitute thing.
Mastodon is written in Rails and it has a number of dependencies, right? You have to have Sidekiq, which runs the workers, Elasticsearch for search, um, Postgres for the database, and Nginx and so on. And for somebody who's running an instance for a bunch of people, totally makes sense, right?
No big deal. Where I think it's maybe a little trickier is, and I don't know if this is the intent of Mastodon or ActivityPub, but some people, they wanna host their own instance, right? Um, rather than signing up for mastodon.social and having a whole bunch of people in one instance, they wanna be able to control their instance.
They wanna host it themselves. And I think for that, Rails, the, the resources that it requires are a little high for that kind of small usage. So, in an example like that, if I wanted something that I could easily give to you and you could host it, then something like a Go or a Rust I think would make a lot of sense. You can run the binary, right? And you don't have to worry about all the things that go around running a Ruby application.
So I think that's something to think about as well. And, and we talked about command line apps as well, right? If you're gonna build a command line app and you want it to run on Windows, well, the person on Windows is not gonna have Python or Ruby, so again, having it in Go or in Rust makes a lot of sense there. So that's another thing I would think about: who is it going to be given to, and who is going to deploy it as well.
[01:14:25] Victor: Yeah. That's, um, that's a great point, uh, because it makes me think of the sort of explosion of sysadmins writing Go when it first came out. I don't know if I imagined this or I think it was real, but like, up until then, like, most sysadmins would, they'd like obviously get to know their routers or their, you know, their switches and their, you know, their servers and like racking, stacking, doing all that stuff.
Languages and like frameworks can unlock a certain group of people or like unblock a certain group of people and like unlock their sort of productivity. So like Ansible was one of those first things that was like really sort of easy to understand and like, Oh, you can imperatively set this machine up.
But a side effect is you get a lot of sysadmins that know Python, right? So like, now a lot of like the sort of black art stuff is accessible to you. Like, or, sorry, I say accessible to you as in accessible to me as the non sysadmin, right? Cause I'm just like, Oh, I can run this like little script this person wrote, uh, in Python and it like, will do all this stuff, right?
That I, I would've never been able to do before. And maybe I learned a little bit more about that, about that system, right? And so I, I, I saw something similar in Go, where people were writing a bunch of tools that were just really easy to run, right? Really, really easy to run everywhere. Um, and that means easy to download, easy to, like, you know, everything's easier. And a lot of hard things got a lot easier, right? Uh, and this is the same with Rust. Like, I, I believe the library that most people use is like Clap. I've built a few things with Clap and it's like, it gives you excellent, uh, I guess you'd call them affordances, or like the ability to make a high quality CLI program with very little effort, right?
Uh, and so that means you end up writing really decent binaries, right? With like, good help texts and like reasonable like, you know, options and stuff like that. and then it's really easy to deploy to Windows, right? And like other, other platforms, uh, like you said, you don't have to try and bundle Python or, or whatever else the sort of interpreter class of languages. So yeah, I think that I'd agree that like just languages and, and, and sort of frameworks can, can unlock, easier creation of certain kinds of apps and certain sort of groups of people to share their knowledge or like to, to, to make a, a tool that's more usable by everyone else.
It could be like, kind of like a, multiplicative factor right. Just like, I made this really, really intense Python script, but like now, but to use it, you'd have to like install Python on Windows, like manage your environments, whatever. Like, I don't know if you're using pyenv, maybe you are, maybe you aren't.
Do you get the wheel? Like, what, what do you do with that? No, I'll just give you an executable, and you have an executable, and then now you can use all the tools that, like, normally work with an executable, or with something that produces output, and it's just faster for everybody, and everybody, like, just, you know, gets more value.
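The CLI ergonomics Victor attributes to Clap, declared options, short flags, parsing for free, have a rough analogue built into Node itself. A sketch in TypeScript using `util.parseArgs` (available since Node 18.3), with the argument list hardcoded here for illustration; a real tool would pass `process.argv.slice(2)`:

```typescript
import { parseArgs } from "node:util";

// Declare the options once; parseArgs handles long flags, short
// aliases, and option values, much like a Clap builder does in Rust.
const { values, positionals } = parseArgs({
  args: ["--verbose", "--out", "feed.xml", "episode-1"],
  options: {
    verbose: { type: "boolean", short: "v" },
    out: { type: "string" },
  },
  allowPositionals: true,
});

console.log(values.verbose, values.out, positionals);
```

It does not generate help text the way Clap does, but the declarative shape, and the single-binary story via `deno compile` or similar, gets at the same "easy to run everywhere" point.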
[01:17:17] Jeremy: Cool. Well, is there anything else you wanted to, to mention or, or talk about?
[01:17:26] Victor: I don't know. Oh yeah, I guess, I guess I could just like say my stack, right? Um, oh, I, I really love SvelteKit. I've been kind of all in on SvelteKit for the front end for a while now. It feels like I've used, um, I've used Nuxt, I've used, like, I've used a lot of frameworks, but I'm trying to think of, of frameworks that, like, do the, um... like I think, I think a local, if not global, maximum for front end development is the power of the front end component driven sort of paradigm plus server side rendering, right?
Because, like, what are the big advantages of using something like Rails, or like whatever else that, that's completely server side? It's that the pages are fast, the pages are always fast. It's there, but they don't have interactivity, right? We've taken a weird path to get here, and it looks really wasteful, and maybe it is really wasteful, but at this point we now have kind of both, kind of like glued and like hacked into one thing.
And I think that class of tools is, is a local maximum, if not, if not global. So, so yeah. So like, there's like Next, Nuxt, SvelteKit. There's, there's other solutions. There's Astro, like there's, there's... Astro's really recent.
Um, there's Ember, right? Shout out to Ember, right. People, people still pushing that forward as well, which is great. But yeah, so I, I use SvelteKit also, and this is again in like direct conflict to what we've talked about this entire time, which is like, use established things that get you there fast. But like, SvelteKit isn't at 1.0 yet, but it is excellent.
Like, I, I am more productive in it than I ever was with Nuxt. Um, and again, Nuxt has changed a lot since I've, you know, sort of made the switch, and like, you know, maybe it deserves a rethink, and like, revisiting it. But I'm so productive with SvelteKit, I just, like, I don't mind. And like, half the time I'll just, I'll just use SvelteKit, uh, and my database, and then be done. Like, no middle layer.
So like, no API layer. I just, like, stuff it into the SvelteKit app, and then use Postgres on the backend, and then I'm done. And, and I feel like that's been really productive, you know. Again, this is outside of the, the world where you use a Rails or whatever.
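The no-API-layer shape described here is roughly a server-only `load` function in a SvelteKit `+page.server.ts` that talks to the database directly. A sketch; the `query` stub below is a hypothetical stand-in for a real Postgres client such as `pg`'s `Pool.query`:

```typescript
// +page.server.ts (sketch) -- the stub stands in for a real pg client.
type Podcast = { id: number; title: string };

async function query(_sql: string): Promise<Podcast[]> {
  // In a real app this would hit Postgres; stubbed for illustration.
  return [{ id: 1, title: "Software Sessions" }];
}

// SvelteKit runs this on the server; whatever it returns is handed
// to the page component as `data` -- no separate API layer between.
export async function load() {
  const podcasts = await query("SELECT id, title FROM podcasts LIMIT 10");
  return { podcasts };
}
```

The page component then renders `data.podcasts` directly, which is the "SvelteKit plus my database and done" workflow.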
Um, so yeah. So that's, that's been my stack for a lot of the products I've done recently. So yeah, if I, if I had to, I guess, say something about like front end: like, give SvelteKit a try. It's pretty good. Uh, and obviously, like, databases, just use Postgres. Stop using other things. Don't, don't do that.
And like infrastructure stuff, I think Kubernetes is cool, but you probably don't need it.
Uh, I like Pulumi. I feel like no one... like, I've been recommending Pulumi for a long time over Terraform. It's just like, DSLs have limits, and those limits are a bad idea to have when, like, the rest of your time is spent with no limits, right?
With like just general computing. Right. So, and Pulumi is just like, you can do your general computing and infrastructure stuff too, and it's, I feel like it's, it's always, you know, been better, but, but anyway, yeah. That's like, that's kind of my stack
[01:20:26] Jeremy: So pulumi is um, it's a way to provision infrastructure, but is there a language for it?
[01:20:35] Victor: It integrates with the language you use. And Terraform has caught up in that respect, right? Cause you have that now. But how it works is still slightly different, right? Because, if I remember correctly, they still generate a Terraform file and execute that file. It's still a little bit different. And it's AWS' CDK as well, right? So, so the world has sort of caught up to what Pulumi is doing. But you know, I, I think it was like, I don't know, Terraform 12 or something like that where it was just like, we've added better for loops.
I'm like, okay, at this point, like, that's the indication that you now need general... like, DSLs can have for loops, but if you're starting to, like, pluck features from, you know, general computing languages... we have really good general computing languages right there.
You know, that was kind of my indication to be like, okay, Pulumi is the way, uh, for me. Um, again, this doesn't matter, cuz like at work you're gonna, you're probably using Terraform, like, you know, just, every, just like, there's, you know, everyone's using certain tools and you don't have a choice. Sometimes you have to use certain tools, but I personally have my, uh, have my pet likes and stuff.
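The for-loop point is the crux: in Pulumi, infrastructure is ordinary program code, so ordinary language constructs apply instead of DSL special cases. A sketch; `makeBucket` here is a hypothetical stand-in for a real Pulumi resource constructor like `new aws.s3.Bucket(...)`:

```typescript
// Hypothetical stand-in for a Pulumi resource constructor; a real
// program would call something like `new aws.s3.Bucket(name, args)`.
type Bucket = { name: string; versioned: boolean };
const makeBucket = (name: string, versioned: boolean): Bucket => ({
  name,
  versioned,
});

// Plain map/conditionals over environments -- no DSL looping syntax,
// just the general-purpose language you already use everywhere else.
const envs = ["dev", "staging", "prod"];
const buckets = envs.map((env) =>
  makeBucket(`assets-${env}`, env === "prod")
);
```

The same holds for functions, modules, tests, anything the host language offers, which is the argument against DSL limits.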
[01:21:49] Jeremy: How about for caching?
[01:21:53] Victor: Uh, KeyDB. I go into rabbit holes a lot. I call myself a yak shaver cause I shave a lot of yaks and it doesn't benefit anyone really except for me most of the time. But there are lots of Redis-alikes out there, and the best feature set right now is KeyDB's.
There's, there's one called Tendis, um, which is like, um, a little bit more distributed. There's like SSDB, which will do it off disk, which, I think because we have such fast disks now, is good enough for a bunch of applications, right? Especially if, like, your alternative was, like, you know, calls to a much farther away service.
There's Pelikan out of Twitter, so they have a whole, they've got like a caching, it's like a framework, kind of, right? Like they, they, they've sort of built a kernel of like really interesting caching, um, originally like sort of to serve their memcache workloads and stuff. But it's kind of grown in like, in lots of directions as well.
KeyDB was the most compelling, and still is, to, to me, from a resource usage standpoint. Multi threading, obviously, like, it is multi threaded, so it's, it's way faster, right? Um, and also it offers flash storage, using the SSD when it can. And, and those are game changers, right? And, and of course all the, you know, usual stuff, and clusters, right? It clusters without you, you know, paying Redis Labs any money or whatever. Um, which is, which is fine. You know, people... open source projects and, and businesses have to, you know, make money. That is a thing. But yeah, KeyDB is, is my... whenever I'm about to spin up Redis, I don't, and I spin up KeyDB instead. Uh, also, they were bought by Snap. Or, bought... hell of an acquihire.
I think if, if you... cuz I think sometimes that has like a negative, pejorative context to it. Like you didn't, like, oh, you didn't make a billion dollars, you just got acquihired or whatever. But hell of an acquihire. Um, and, and so all of it's like free now, like all of the, like all the, the premium features are becoming free.
And I'm like, this is, this is like, I won the lottery, right? Cause, um, you know, you get all the, the, the awesome stuff outta KeyDB for, for free. Um, so yeah, caching: KeyDB. I do KeyDB.
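Whatever the backing store, KeyDB, Redis, or in-process memory, the usual application pattern is cache-aside: check the cache, compute on a miss, store the result with a TTL. A minimal in-memory sketch; a real setup would put KeyDB behind the same interface via a Redis client library:

```typescript
// Minimal cache-aside with TTL expiry, kept in process memory.
type Entry<V> = { value: V; expiresAt: number };

class TtlCache<V> {
  private store = new Map<string, Entry<V>>();
  constructor(private ttlMs: number) {}

  getOrCompute(key: string, compute: () => V): V {
    const hit = this.store.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value; // hit
    const value = compute(); // miss: do the expensive work once
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
    return value;
  }
}
```

Because KeyDB speaks the Redis wire protocol, swapping this in-memory store for a networked one is a client-library change, not an application rewrite.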
[01:24:11] Jeremy: KeyDB. I haven't heard of that one.
[01:24:14] Victor: Oh yeah, it's, um, yeah it's like keydb.dev.
[01:24:17] Jeremy: Oh KeyDB.
[01:24:18] Victor: It's awesome. They did YC.
[01:24:23] Jeremy: Oh, it uses the Redis wire protocol
[01:24:28] Victor: Like, Redis is, is the leader, unless you're using memcached for some other reason, and then, like, obviously you have to use memcached, whatever. But, um, but yeah, Redis is like the sort of app external cache du jour for basically everywhere, and when I wanna run Redis, I run KeyDB.
[01:24:51] Jeremy: And for search, do you just in search in postgres or turn to something else?
[01:24:59] Victor: Oh, you've asked a dangerous question. So I recently did some, uh, some writing. So I, I, I, so recently, um, like this year, I've branched out and done a little bit more experiments in writing for companies that have an interesting you know developer product or sometimes where like, you know, my sort of like interest and stuff just aligned, right?
So like, uh, I've worked with, um, OCV, Open Core Ventures, um, which is, um, Sid's, if you know Sid from GitLab. That's his, um, his, uh, his fund. Uh, and then also Supabase, which does, um, you know, awesome stuff on Postgres. And, you know, it's fully open source; that, that company is amazing as well. And search has been a thing.
So Postgres has full text search, SQLite has full text search. They both have it built in. They're very good, and I think great approximations for, like, V1s at the very least, maybe even farther. Because a lot of the time, if someone's in your product and they're searching, something's wrong, usually, right?
Like, like, unless you have vast gobs of data, like, this means your UX is not good enough for something, right? Um, but, um, that said, I almost always start with Postgres full text search. And then there, there are, there's a huge crop of new search engines, right? So if we consider OpenSearch to be new, as in, like, the Amazon fork of, of Elasticsearch, there's that. There's a project called Meilisearch.
There's a project called Typesense. Um, there's Sonic, uh, there's, like, um, Tantivy, uh, which, which is like the engine that can be underneath. There's, like, Quickwit, which has, like, shifted to logging a little bit. Like, that's their, like, path to sort of, um, profitability, I, I think. I, I think they, they sort of shifted a little bit.
there's a bunch more that I'm, I'm missing. And so that's what I wrote about and had a lot of fun writing about for Supabase very recently. And this was, um, this was something I just had written down, right? So I was just like, I need to do a blog post. And I, I write on my blog a lot, so I'm just like, Alright.
I write up yak shaves to my blog a lot, and I'm, and I was just like, I need to try and just use some of these, right? Because there's so many and they all look pretty good. And they have to have learned... like, the gold standard is like, uh, Solr, right? Lucene, right? Like, it's like, it's like Solr and Lucene and, like, you know, that or whatever.
And, but a lot of times you just don't need, like, you don't necessarily need every single feature of Lucene. And so there are so many new projects that look decent. Uh, and so I got a chance to, to sort of... I was paid to do some of that experimentation, which is awesome, cause I would've done it anyway.
But it's nice to be paid to do it, on search stuff. And I actually have a project... I, I liked that so much that I made a project to try and get a more representative dataset. So I started a site called podcastsaver.com. I use the Podcast Index, right?
Which has a lot of sort of like podcast information. And, you know, if someone doesn't know about podcasts, there's like an RSS feed, right? Which is kind of like a, you can think of an XML-y, uh, format, where podcasts are just published as an RSS feed, and the RSS feed has links to where to download the actual files, right?
So it's really open, right? Um, and so I used, um, the structure of that to index, in multiple search engines at once, right? Running alongside each other, the information from the podcast index. This was fun for me cuz it was like an extension of that other project. It was a really good way to test them against each other.
Very fast, right? Like, or, or like in real time. So like right now, um, if you go to podcastsaver.com and you search a podcast, it will go to one of the search engines randomly. So right now there is Postgres FTS plus trigram. So, so there is, um, there's also a thing called, um, trigram search, another really good, like, um, sort of basic search feature.
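Trigram search, in Postgres via the `pg_trgm` extension, scores matches by overlap between three-character windows of the two strings. A rough TypeScript sketch of the idea; the real extension handles padding, normalization, and indexing with its own details:

```typescript
// Break a string into three-character windows, roughly the way
// Postgres's pg_trgm does (lowercased, space-padded at the edges).
function trigrams(s: string): Set<string> {
  const padded = `  ${s.toLowerCase()} `;
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= padded.length; i++) {
    grams.add(padded.slice(i, i + 3));
  }
  return grams;
}

// Similarity = shared trigrams / total distinct trigrams (Jaccard).
function trigramSimilarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  let shared = 0;
  for (const g of ta) if (tb.has(g)) shared += 1;
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}
```

Misspellings still share most of their trigrams, which is why trigram search tolerates typos that exact-match full text search can miss.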
And there's Meilisearch. So both of those are implemented right now. And there's actually a little nerds link, right? Which will show you how many, how many podcasts there are, right? So, so how many documents, essentially you can kind of assume there are. Um, and it'll show you how fast each search engine did, right?
At sort of returning an answer. Now it's a little bit of a problem, because you need to do some manual work to figure out whether the answer was good, right? If you're really fast but give a garbage answer, that's not good. But in general, like, so you can, you can actually use the nerd tab to control... you can, like, switch to only Postgres, uh, and I do that with, like, cookies, and you can, um, you can force it to go to Postgres, and you can see the quality of the answers for yourself.
But generally, it's pretty good from both. Like, it's not, it's not terrible from, from both. So I'm, I'm kind of like glossing over that part right now, but you can see the performance, and it's actually... it's like Meilisearch does a great job, right? Um, and you know, there's obviously some complexity in running another service, and there's some other caveats and stuff like that, but it's, it's pretty good.
And over time, I want to add more. So I wanna add, you know, at the very least, Typesense. Like, people have reached out, so like, I, I made a, a comment on this on Hacker News, and like, there's a long road ahead for that, and like, I honestly shouldn't be working on that, cuz I have other things that I'm, like, you know, I, I really should be full time on.
Um, but like, that's a thing I'm trying to, I'm trying to grow in the future a little bit more, cuz it's just like, it's so fascinating. Like, everything's so cheap. Like, compute is cheap, you know? Like, there's awesome projects out there with like really advanced functionality that we can just run, like, for free... or not, not for free, but like, you don't have to do the work to, like, build a search engine.
There's like five out there. So all you, the only thing that's missing is like knowing which one's the best fit for you and like, you can just find that out. Yeah.
[01:30:46] Jeremy: Are there any, I guess, early conclusions? In terms of, you like Meilisearch because of X, or?
[01:30:53] Victor: Yeah, the Supabase blog post was a little bit better in terms of, uh, takeaways. I can say that Meilisearch is definitely faster. Meilisearch was harder for me to load, and it took a little bit longer, cuz you know, you have to do the network call.
And to be fair, if you choose Postgres, it's in the database. So, like, your copying is a lot easier. Like, manipulating stuff is a lot easier. Um, but right now, when I look at the stats, Meilisearch goes way faster. It's almost always under a hundred milliseconds, right? And that's including, you know, um, that network round trip.
Um, but you know, Postgres is like, I don't know, I'm just so biased. Like, it is not a good idea to ever bet against Postgres, right? Like, obviously it doesn't make sense for Postgres to be better than purpose-built tools, um, because they are fully focused, right?
Like, they should be optimal, cuz they don't have any other sort of conflicting constraints to think about. But Postgres is very good. It's just so excellent, and it keeps moving. Like, it keeps getting better, every year, every quarter. It's hard to not bet on it. So yeah, I would say, based on pure performance of Podcast Saver right now, the data lends itself to saying pick Meilisearch. Unfortunately that data set is incomplete. I don't have Typesense up. I don't have all these other search engines up.
So it's limited. There was also, like in the Supabase post, you'll see support for, um, misspellings and stuff was different among search engines, so there's that axis as well. But if you happen to be running on Postgres, I really do suggest just giving Postgres FTS a try, even if it's just trigram search.
Like, even if you just do trigram search and do a sort of fuzzy search bar, cause that's probably what a V1 would look like. Anyway, try that, and then, if you need, like, crazy faceting or, you know, really advanced features, then jump off.
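For anyone curious what "trigram search" means here: Postgres ships it as the `pg_trgm` extension, which scores strings by how many 3-character chunks they share, so misspellings still match. The sketch below is a simplified pure-Python illustration of that idea, not `pg_trgm`'s actual code (its word-padding rules differ slightly), and the podcast titles are made up:

```python
def trigrams(text: str) -> set:
    """Break a string into overlapping 3-character chunks.
    pg_trgm pads each word with spaces; this sketch pads the whole string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a: str, b: str) -> float:
    """Set-overlap similarity in [0, 1], analogous to pg_trgm's similarity()."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

# A fuzzy search bar is then just "rank rows by similarity to the query":
titles = ["Software Sessions", "Software Engineering Radio", "Hardware Hour"]
query = "sofware sesions"  # note the misspellings
best = max(titles, key=lambda t: similarity(query, t))
```

In Postgres itself this is roughly `CREATE EXTENSION pg_trgm;` followed by ordering rows with its `similarity()` function, which is what makes a forgiving V1 search bar cheap to build.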
Uh, but I don't know, that's not that interesting, cause I feel like it already kind of confirms what I think. So other people need to do this. I need other people to please replicate, uh, and come up with better ideas than I have.
[01:33:20] Jeremy: But I think that's a good start in terms of when you're comparing different solutions, whether it's databases or, I don't know what you call these, but what do you call an Elasticsearch?
[01:33:32] Victor: Search engine.
[01:33:34] Jeremy: You go to open source projects or the company websites, and they'll have their charts and go, we're X times faster than Y.
But I think actually having a running instance, where they're going against the same data, I think that's helpful for anyone trying to compare something, thanks to someone having gone through the time. And I think that applies to a lot of other things too, not just search engines, where you could have, hey, I have my system and it's got, uh, I don't know, five different databases or something like that. I'm not sure the logistics of how you would do it,
[01:34:15] Victor: Like with Redis. Like, just all the Redis-likes, just run 'em all at the same time. Someone needs to do that.
[01:34:26] Jeremy: Could be you.
[01:34:27] Victor: Ahaha no! I do too much! Like, the Redis thing is obvious, right? Redis is easier, like comparing these Redis-likes, and there's some great blog posts out there that kind of do it. But a running service is a really good way of showing, like, oh, we hit this cache x times a second, with this, like, naturally random sort of traffic.
This is how it performed, this is how they performed against each other. These were, like, the resources allotted, or whatever. But yeah, that stuff's really cool. I feel like people haven't done it, or aren't doing it, enough.
[01:35:01] Jeremy: Yeah. I guess the thing about putting together one of these tests as well, especially when you make it live, is then you have to spend the time and spend the money to maintain it, right? And I think, uh, if somebody's not paying you to do it... yeah, you gotta want it that bad to put it together.
[01:35:22] Victor: Hey, but you know what? We can go full circle, just use Kubernetes.
It's easy if you just use Kubernetes, man.
[01:35:33] Jeremy: First you gotta learn... Where, where were we? First start with Postgres, then Kubernetes.
[01:35:42] Victor: Yeah. If you wanna use Kubernetes first, you start with Postgres and... It's like, what?
[01:35:49] Jeremy: So, learn these ten other things first then you can start to build your project.
[01:35:58] Victor: Yeah, it's silly, but I know people out there have the knowledge. I just feel like they just need to do some of this stuff, right? They just need to, like, have the idea, or just go try it. Uh, and hopefully we get more of that in the future.
Cause at some point, there's gonna be so much choice that you're like, how are you gonna decide? How does anyone decide these days? Right? You know, more people have to dedicate their time to trying out different things, but also sharing it. Cause I think inside companies, you do this, you do the bakeoffs, right?
Everyone does the bakeoffs to try and figure out, you know, within a week or whatever, whether they should use, let's say, Budibase or Appsmith, right? Like, the rest of the team has no idea what those are, right? But someone does the bakeoff. Maybe start sharing bakeoffs?
There it is. There's another app idea. I think of a lot of ideas, and this is another one, right? Just make a site where people can share their bakeoff results with certain products. And then that knowledge just being available to people is massively valuable.
And it kind of helps the products that are mentioned, because they can figure out what to change, right? It kind of makes the market more efficient, right? In that vague, uh, capitalistic sense, where it's like, oh, if everyone has a chance to improve, then we get a better product at the end of the day.
But, um, yeah, I dunno. Hopefully more people yak shave. Please, more people, waste your time. Uh, not waste, use your time to yak shave. It's fine.
[01:37:32] Jeremy: Well, I think you have something at the end of it. Sometimes you can yak shave and at the end it's kind of like, well, I played with it, and oh well.
Versus you having something to show for it.
[01:37:50] Victor: Yeah, that's true. Yeah. I won't talk about all the other projects that went absolutely nowhere.
But, uh, but yeah, I think you always feel selfish if you keep something you've learned to yourself. And I should rephrase this: like, I am definitely a selfish person. This is not altruism, right? It's just, at some point it feels like, man, someone should really know this other stuff, right? Like, if you've found something that's interesting, someone should know, cuz someone who's better at it will be like, oh no, this part and this part.
Like, everyone kind of wins, which is awesome. So, I dunno, maybe if more people have that feeling, they'll share some of their stuff. And maybe you do a thing and it doesn't help you, but then someone else comes along and they're like, oh, because I read this, I know how to do this.
And then, if they give that back too, it's, uh, it's pretty awesome. But anyway, that's all pie in the sky,
[01:38:57] Jeremy: I think in general, the fact that you are running a blog and, you know, you do your posts on Hacker News and so on, the fact that you're sharing what you've learned, I think is super valuable. And I think that goes for anybody who is learning a new technology or working on a problem. If you run into issues or things you get stuck on, for sure, yeah, you should share that. And the way I've heard it described: there's always someone on the internet just waiting to tell you why you're wrong.
[01:39:35] Victor: Oh yeah. Yeah.
[01:39:36] Jeremy: And provided that they're right. That can be very helpful. Right?
[01:39:40] Victor: Yeah. Yeah. I actually, I love it. I personally like it, because if you're a hacker in the, you know, Hacker News sense, that's excellent. That's like a free compiler, right?
It's like a free checker, right? If you just sit next to someone who is amazing at X,
and you just start bouncing ideas around X, and how to do whatever it is, off of them, you get it compiled.
They're just like, no, you can't do that cuz of X, Y, and Z. And you're like, oh, okay, great. I've just saved myself, like, you know, months of thinking I could do it, and now I know I can't do it. And the internet is great, cuz it gives you access to those people who, like, already know it. And if you realize that, oh, they've chosen to share some wisdom with me like that, or at least they're trying to, right?
Assuming they're correct. Like, even if they're not correct, um, it's pretty awesome. So I personally welcome that. Of course it doesn't feel good to be wrong, right? I don't like that. But, um, I love it when someone took the time to be like, no, your view on this is wrong because of this.
Or, like, you know, 99% of the time you don't need that, you should have just done this, right? Cause then I learn. A lot of my posts will have updates at the top, right? So, like, when I posted the thing about the throat mic to, like, Hacker News, people were like, this sounds terrible,
and I was like, I didn't think it was that bad, but, uh, I was like, you know, maybe I shouldn't use this, uh, all the time. But it was, like, obvious that, oh, I should have never made the post without including a sample of the audio at the top, right? So I, like, went back and added an update for that. And then people were discussing, like, oh, you should have used a bone conduction mic instead,
like, and all this other stuff that I just didn't think about. I'm like, oh, awesome.
And then, like, I update the post, I go on with my life. So anyway, more people please do that. And don't post it on Medium. Please don't do that. Stop, stop that. If you write software, please put your writing about software somewhere else, unless, I don't know, you have to or something.
[01:41:52] Jeremy: You've reached your article limit.
[01:41:57] Victor: Yeah, yeah. Oh, also, shout out to the Web Archive. The best way to get almost any article, right? I don't think people in the general populace know this.
But, like, 99% of the time, if you're trying to read an article, you just go to the Web Archive.
It's common knowledge for us, um, but it's not common knowledge for everybody else. And it just feels like they're making a lot of stuff available, and legally, right? Cuz, like, you know, the precedent right now, I think, is in favor of scraping, right? If you make a thing available to the internet, right?
LinkedIn got ruled against a while ago, but, like, if you make a thing available to the internet, uh, publicly available, without signing in or whatever, it is assumed public, right? So it's just like, yeah, whenever I read something and hit an article limit, I hop right on archive.today.
But I just feel like it's sad that developers put knowledge en masse into that particular place. It's not a trap, cause I don't dislike Medium, I don't have any, like, necessarily animosity towards Medium. But it's just, we should be the most capable of putting up something, like maintaining our own websites.
Right. If it's, like, the death of the personal website, why is it dying with developers? Like, we should be the most capable. We have no hope of the regular world putting out websites if it's hard for us.
[01:43:32] Jeremy: I mean, I think for stuff like Medium, maybe sometimes it's the technical aspect of not wanting to set up your own site. But I think a large part of it is the social aspect. Like, with Medium, you have discoverability, you have the likes system, if they even call it that. Um, I think that's the same reason why people can be happy to post on Twitter, right?
Um, but when it comes to posting on their own blog, it's like, well, I post and then nobody comes and sees it, right? Well, the thing is too, like, they could be seeing it, but you don't get the feedback, you don't get the dopamine hit of, like, oh, I got 40 likes on Medium or Twitter or whatever.
And I think that's one of the challenges with personal sites, where I totally agree with you, I wish people would do it and do more, but I also understand you are on a little bit of an island, unless you can get people to come and interact with you.
[01:44:44] Victor: There's another idea, right? Like, you know, can you build a self-hostable, but decentralized by default, Medium clone? That's, like, a personal site that you could easily host, you know, almost like WordPress, let's say, right? Um, but with enough metrics, with, like, the engagement stuff built in, even though it's not, like, powering a company, essentially, right?
Cause, like, the incentives behind building in the engagement, like pumping up engagement, make sense if you're running a company, cuz, you know, you're trying to get MAUs up so you can do your next round, or, like, make more revenue. I wonder... I don't know. Yeah, that is a great point, cuz it's like, you don't get the positive reinforcement if you don't have the likes and the things that a company would add, right?
Like, as opposed to just, oh, I set up nginx and, like, my site's up, or whatever. Like, not that anyone does that these days. But yeah, that's interesting. It's just, could you make doing it yourself really increase the engagement, or, like, you know, having that. Huh.
[01:45:56] Jeremy: I think sites have tried. I mean, it's not quite the same thing, but dev.to, if you've seen that, like, uh, they have, um, I can't remember what it's called, I think it's like a canonical link or something. But basically, you can post on their site, and then you can put the canonical link to your own website.
And so then, when somebody searches on Google, the traffic goes to your site. It doesn't bring up dev.to.
And then people can comment and like on dev.to, so I thought it was an interesting idea. I don't know how many people use it or take advantage of it, but that's one approach, anyways.
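For reference, the mechanism behind this is the standard `rel=canonical` tag. On dev.to you set a `canonical_url` field on the post (the field name matches dev.to's editor options, though the URL below is a placeholder):

```yaml
---
title: My Post
canonical_url: https://your-blog.example.com/my-post
---
```

The published dev.to page then includes `<link rel="canonical" href="https://your-blog.example.com/my-post">` in its `<head>`, which is what tells search engines to treat your own site as the original.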
[01:46:44] Victor: Yeah, that's actually, that's cool. I don't know enough about that space. I guess.
That sounds awesome. That sounds like actually, you know, useful and like a good middle ground right in like encouraging the ecosystem but also like capturing some of that, of that value, right?
In terms of, like, just SEO juice, I guess, if that's what you wanna call it. But that's awesome. I don't know, I've always thought of dev.to, and clearly I was, you know, at least wrong in part, as just, like, Medium 2.0, but more developer focused. Um, but I will find great blog posts on there, um, you know, more often than not, and it's just like, okay, yeah, that's awesome. Like, it works.
Uh, and this canonical link thing sounds actually, like, very good for, um, for everybody involved, so. Awesome. Sounds like they're good.
[01:47:36] Jeremy: Yeah, if people wanna check out what you're up to, what you're working on, where should they head?
[01:47:43] Victor: Oh God. Uh, well, like, I have my blog at, um, vadosware.io, so V-A-D-O-S-W-A-R-E. The biggest projects I'm working on right now... oh, I guess three. Um, uh, like, we mentioned Podcast Saver, which is cool. Uh, if you need to download podcasts, do that.
Um, I send out ideas. I send out ideas every week that I think are valuable, like things you could turn into a startup or a SaaS, and kind of focus on validating them. Cuz, like, one thing I've learned the hard way is that validating ideas is more important than having them.
Uh, cuz you can think something is good and it won't attract anybody. Um, or, you know, if you don't put it in front of people, it's not gonna take off. So I do that. I send that out at, like, unvalidatedideas.com. So that's, you know, that's the domain.
I also started, um, trying to highlight FOSS projects, cuz in yak shaving, what you do is you come across a lot of awesome free and open source projects that are just like, oh, this is a whole world, and, like, this is pretty polished, and it's pretty good, and I just bookmark them. So I was just like, I have so many bookmarks, it doesn't make sense that I hold all of them. Um, and, like, someone else should see this. So I send out, and this is, uh, new for me, a newsletter every day. So it's a daily newsletter for, like, free and open source projects that do, you know, do whatever, like, do lots of various things.
And that is at Awesome FOSS. So you can actually spell it multiple ways, but awsmfoss.com, so like, awesome without the vowels. Um, but also just, if you spell it normally, like a normal person, like, awesome the word, foss.com. Um, so that's going.
And then the thing that's actually, like, taking up all my time is Nimbus. Um, Nimbus Web Services is what I'm calling it.
Uh, it's not out yet, there's nothing to try there, but it is my attempt to host free and open source software, but give 10-30% of revenue back, so not profit, right, cause those can be different things, and, like, you know, see the movie industry for how that can go wrong, back to the open source projects that made the software that I'm hosting.
And I think there's more interesting things to be done there, right? Like, I can be more aggressive with that, right, if it works out. Cuz it just, you know, scales so well, you know, see Amazon, right? But yeah, so if you're interested in that, check out nimbusws.com.
And that's it. I've, I've plugged everything. Everything plugged.
[01:50:38] Jeremy: Yeah, that last one sounds pretty ambitious. So good luck.
[01:50:42] Victor: Thanks for taking the time.
Xe Iaso is the Archmage of Infrastructure at Tailscale and previously worked at Heroku.
This episode originally aired on Software Engineering Radio but includes some additional discussion about their blog near the end of the episode.
Topics covered:
Related Links
Transcript
[00:00:00] Jeremy: Today I'm talking to Xe Iaso. They're the Archmage of Infrastructure at Tailscale, and they also have a great blog everyone should check out.
Xe, welcome to Software Engineering Radio.
[00:00:12] Xe: Thanks. It's great to be here.
[00:00:14] Jeremy: I think the first thing we should start with, is what's a, a VPN, because I think some people they may have used it to remote into their workplace or something like that. But I think the, the scope of what it's good for and what it does is a lot broader than that. So maybe you could talk a little bit about that first.
[00:00:31] Xe: Okay. A VPN is short for virtual private network. It's basically a fake network that's overlaid on top of existing networks. And then you can use that network to do whatever you would with a normal computer network. This term has been co-opted by companies that are attempting to get into the, like, hide-my-ass style market, where, you know, you encrypt your internet information and keep it safe from hackers.
But, uh, so it makes it really annoying and hard to talk about what a VPN actually is. Because Tailscale, uh, the company I work for, is closer to the actual intent of a VPN, and not just, you know, hide your internet traffic, which is already encrypted anyway, with another level of encryption, and just make a great access point for, uh, three-letter agencies.
But are there use cases past that? Like, when you're developing a piece of software, why would you decide to use a VPN, outside of just, because I want my, you know, my workers to be able to get access to this stuff?
[00:01:42] Xe: So something that's come up, uh, when I've been working at Tailscale, is that sometimes we'll make changes to something. And it'll be changes to, like, the user experience of something on the admin panel, or something. So in a lot of other places I've worked, in order to have other people test that, you know, you'd have to push it to the cloud.
It would have to spin up a review app in Heroku, or some terrifying Terraform abomination would have to put it out onto, like, an actual cluster or something. But with Tailscale, you know, if your app is running locally, you just give, like, the name of your computer and the port number. And, you know, other people are able to just see it and poke it and experience it.
And that basically turns the, uh, feedback cycle from, you know, having to wait for the state of the world to converge, to: make a change, press F5, give the URL to a coworker, and be like, hey, is this Gucci?
They can connect to your app as if you were both connected to the same switch.
[00:02:52] Jeremy: You don't have to worry about, pushing to a cloud service or opening ports, things like that.
[00:02:57] Xe: Yep. It will act like it's in the same room, even when they're not. It'll even work if you're both at Starbucks, and the Starbucks has reasonable policies, like, holy crap, don't allow devices to connect to each other directly. So, you know, you're working on, like, your screenplay app at your Starbucks or something, and you have a coworker there, and you're like, hey, uh, check this out, and, uh, give them the link.
And then, you know, they're also seeing the screenplay editor.
[00:03:27] Jeremy: In terms of security and things like that, I mean, I'm picturing it kind of like we were sitting in the same room, and there's a switch, and we both plugged in. Normally when you do something like that, you kind of have full access to whatever else is on the switch. Uh, you know, provided that's not being blocked by a firewall.
Is there, like, a layer of security on top of that, that a VPN service like Tailscale would provide?
[00:03:53] Xe: Yes. Um, there are these things called access control lists, which are kind of like firewall rules, except you don't have to deal with, like, the nightmare of writing an iptables rule that also works in Windows firewall and whatever they use in macOS. The ACL rules are applied at the tailnet level, for every device in the tailnet.
So if you have, like, developer machines, you can put people into groups, as things like developers, and say that developer machines can talk to production, but people in QA can't, they can only talk to testing, and people on SRE have, you know, permissions to go everywhere, and people within their own teams can connect to each other. You can make more complicated policies like that fairly easily.
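As a concrete sketch of the kind of policy described here: Tailscale's policy file is JSON with comments allowed. The group members, tags, and ports below are invented for illustration, while the `groups`/`acls` structure with `action`/`src`/`dst` follows Tailscale's documented ACL format:

```jsonc
{
  "groups": {
    "group:dev": ["alice@example.com", "bob@example.com"],
    "group:qa":  ["carol@example.com"],
    "group:sre": ["dave@example.com"]
  },
  "acls": [
    // Developers may reach production machines (tagged tag:prod).
    { "action": "accept", "src": ["group:dev"], "dst": ["tag:prod:*"] },
    // QA may only reach testing.
    { "action": "accept", "src": ["group:qa"],  "dst": ["tag:testing:*"] },
    // SRE may go everywhere, on any port.
    { "action": "accept", "src": ["group:sre"], "dst": ["*:*"] }
  ]
}
```

Anything not explicitly accepted is denied, which is how "developer machines can talk to production, but QA can only talk to testing" falls out of a few lines.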
[00:04:44] Jeremy: And when we think about infrastructure for companies, you were talking about how there could be development infrastructure, production infrastructure, and you kind of separate it all out. When you're working with cloud infrastructure, a lot of times there's the, I always forget what it stands for, but there's, like, IAM.
There's, like, policies that you can set up with the cloud provider that say these users can access this, or these machines can access this. And I wonder, from your perspective, when you would choose to use that, versus use something at the network or the VPN level?
[00:05:20] Xe: The way I think about it is that things like IAM enforce permissions for more granularly scoped things, like can create EC2 instances, or can delete EC2 instances, or something like that. And that's just kind of a different level of thing. Uh, Tailscale ACLs are more, you know, X is allowed to connect to Y. Or, with Tailscale SSH, X is allowed to connect as user Y.
And that's really different than, like, the arbitrary capability things that IAM offers.
You could think about it as an IAM system, but the main permission that it's exposing is: can X connect to Y on port Z?
[00:06:05] Jeremy: What are some other use cases where if you weren't using a VPN, you'd have to do a lot more work or there's a lot more complexity, kind of what are some cases where it's like, okay, using a VPN here makes a lot of sense.
(The quick and simple guide to go links https://www.trot.to/go-links)
[00:06:18] Xe: There is a service internal to Tailscale called go, which is a clone of Google's so-called go links, where it's basically a URL shortener that lives at http://go. And, you know, you have go/something to get to some internal admin service, or another thing to get to, like, you know, the company directory in Notion or something. And this kind of thing you could do with a normal setup, you know. You could set it up, and have to do OAuth challenges everywhere, and, you know, have to make sure that everyone has the right DNS configuration so that it shows up in the right place.
And then you have to deal with HTTPS, um, because OAuth requires HTTPS for understandable and kind of important reasons. And it's just a mess. Like, there's so many layers of stuff that the barrier to get, you know, just a darn URL shortener up turns from 20 minutes into three days of effort, trying to, you know, understand how these various arcane things work together.
You need to have state for your OAuth implementation. You need to worry about what the hell a JWT is (sigh). It's just bad. And I really think that with something like Tailscale, everybody has an IP address. In order to get into the network, you have to sign in with your auth provider. Your auth provider tells Tailscale who you are.
So, transitively, every IP address is tied to an owner, which means that you can enforce access permissions based on the IP address and the metadata about it that you grab from the Tailscale daemon. It's just so much simpler. Like, you don't have to think about, oh, how do I set up OAuth this time? What the hell is an OAuth proxy?
Um, what is a Kubernetes? That sort of thing. You just think about, like, doing the thing, and you just do it, and then everything else gets taken care of. It's, like, kind of the ultimate network infrastructure, because it's both omnipresent and something you don't have to think about. And I think that's really the power of Tailscale.
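To make the "just a darn URL shortener" point concrete, here's a hypothetical minimal go-links redirector in Python. The link table and URLs are invented, and note what's missing: there is no OAuth, session, or JWT code, because in the setup described above, anyone who can reach the service at all has already authenticated to the tailnet:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Invented example links; a real deployment would persist these somewhere.
LINKS = {
    "wiki": "https://notion.example.com/company-directory",
    "grafana": "https://grafana.internal.example.com",
}

def resolve(path):
    """Map a request path like '/wiki' to its destination URL, or None."""
    return LINKS.get(path.lstrip("/"))

class GoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        dest = resolve(self.path)
        if dest is None:
            self.send_error(404, "no such go link")
            return
        # Plain 302 redirect; reachability over the tailnet is the auth.
        self.send_response(302)
        self.send_header("Location", dest)
        self.end_headers()

# To run, bind to the machine's tailnet address and serve, e.g.:
#   HTTPServer(("100.64.0.1", 80), GoHandler).serve_forever()
```

The whole "three days of arcana" collapses into a dictionary and a redirect handler once identity is handled at the network layer.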
[00:08:39] Jeremy: Typically, when you would spin up a service that you want your developers or your system admins to be able to log into, you would have to have some way of authenticating and authorizing that user. And so you were talking about bringing in OAuth and having your service understand that.
But I guess what you're saying is that, when you have something like Tailscale, that's kind of front loaded. I guess you authenticate with Tailscale, you get onto the network, you get your IP. And then, from that point on, you can access all these different services, which know, like, hey, because you're on the network, we know you're authenticated. And those services can just maybe map that IP, that's not gonna change, to, like, users in some kind of table. Um, and not have to worry about figuring out, how do I authenticate this user?
[00:09:34] Xe: I would personally more suggest that you use the, uh, whois lookup route in the Tailscale daemon's local API. But basically, yeah, you don't really have to worry too much about, like, the authentication layer, because the authentication layer has already been done. You know, you've already done your two-factor with Gmail or whatever, and then you can just transitively push that property onto your other machines.
[00:10:01] Jeremy: So when you talk about this whois daemon, can you give an example? I'm in the network, now I'm gonna make a service call to an application. What am I doing with this whois daemon?
[00:10:14] Xe: It's more of, like, an internal API call that we expose via tailscaled's, uh, Unix socket. But basically, you give it an IP address and a port, and it tells you who the person is. It's kind of like the Unix ident protocol, in a way, except completely not. And at a high level, you know, if you have something like a proxy for Grafana, you have that proxy for Grafana make a call to the local Tailscale daemon, and be like, hey, who is this person?
And the Tailscale daemon will spit back a JSON object, like, oh, it's this person on this device, and there you can do additional logic, like, maybe you shouldn't be allowed to delete things from an iOS device, you know, crazy ideas like that. There's not really support for, like, arbitrary capabilities in tailscaled at the time of recording, but we've had some thoughts. Would be cool.
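The proxy logic sketched here can be illustrated without a live tailnet. The JSON below is a canned stand-in for a whois response; the `UserProfile.LoginName` and `Node.Hostinfo.OS` field names are modeled on Tailscale's published response types, but treat the exact shape as an assumption, and the iOS rule is just the example from the conversation:

```python
import json

# Canned stand-in for what the tailscaled whois endpoint might return.
WHOIS_RESPONSE = json.loads("""
{
  "UserProfile": {"LoginName": "alice@example.com"},
  "Node": {"Hostinfo": {"OS": "iOS"}}
}
""")

def identify(whois):
    """Pull the authenticated user's login out of a whois response."""
    return whois.get("UserProfile", {}).get("LoginName", "unknown")

def may_delete(whois):
    """The 'no deletes from an iOS device' rule from the discussion."""
    os_name = whois.get("Node", {}).get("Hostinfo", {}).get("OS", "")
    return os_name != "iOS"
```

A Grafana proxy would fetch the real response from tailscaled's Unix socket for each connection, then gate requests on checks like these instead of running its own login flow.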
[00:11:17] Jeremy: would that also include things like having roles, for example, even if it's just strings, um, that you get back so that your application would know, okay. This person, is supposed to have admin access to this service based on what I got back from, this, this service.
[00:11:35] Xe: Not currently. Uh, you can probably do it via convention or something, but with what's currently implemented in the actual, like, source code and user experience, you can't do that right now. Um, it is something that I've been trying to think about different ways to solve, but it's also a problem
that's a bit big for me, personally, to tackle.
[00:11:59] Jeremy: there's, there's so many, I guess, different ways of doing it. That it's kind of interesting to think of a solution that's kind of built into the, the network. Yeah.
[00:12:10] Xe: Yeah. and when I describe that authentication thing to some people, it makes them recoil in shock because there's kind of a Stockholm syndrome type effect with security, for a lot of things where, the easy way to do something and the secure way to do something are, you know, like completely opposite and directly conflicting with each other in almost every way.
And over time, people have come to associate security or like corporate VPNs as annoying, complicated, and difficult. And the idea of something that isn't annoying, complicated or difficult will make people reject it, like just on principle, because you know, they've been trained that, you know, VPN equals virtual pain network and it, it's hard to get that association outta people's heads because you know, a lot of VPNs are virtual pain networks.
Like. I used to work for Salesforce and Salesforce had this corporate VPN where no matter what you did, all of your traffic would go out to the internet from their data center. I think it was in San Francisco or something. And I was in the Seattle area. So whenever I had the VPN on my latency to Google shot up by like eight times and being a software person, you know, I use Google the same way that others breathe and it, it was just not fun.
And I only had the VPN on for the bare minimum of when I needed it. And, oh God, it was so bad.
[00:13:50] Jeremy: like some people, when they picture a VPN, they picture exactly what you're describing, where all of my traffic is gonna get routed to some central point. It's gonna go connect to the thing for me and then send the result back. so maybe you could talk a little bit about why that's, that's maybe a wrong assumption, I guess, in the case of tailscale, or maybe in the case of just more modern VPN solutions.
[00:14:13] Xe: Yeah. So the thing that I was describing is what I've been lovingly calling the, uh, single point of failure as a service type model of VPN, where, you know, you have like the big server somewhere, it concentrates all the connections and, you know, like does things to make the computer feel like they've teleported over there, but overall it's a single point of failure.
And if that falls over, you know, like goodbye, VPN. everybody's just totally screwed. And in contrast, tailscale does a more peer-to-peer thing so that everyone is basically on equal footing. Everyone can send traffic directly to each other, and if it can't get directly to there, it'll use a network of, uh, relay servers, uh, lovingly called Derp and you don't have to worry about, your single point of failure in your cluster, because there's just no single point of failure.
Everything will directly communicate as much as possible. And if it can't, it'll still communicate anyway.
[00:15:18] Jeremy: let's say I start up my computer and I wanna connect to a server in a data center somewhere at the very beginning, am I connecting to some server hosted at tailscale? And then. There's some kind of negotiation process where after that I connect directly or do I just connect directly straight away?
[00:15:39] Xe: If you just turn on your laptop and log in, you know, to it signs into tailscale and gets you on the tailnet and whatnot, then it will actually start all connections via Derp just so that it can negotiate the, uh, direct connection. And in case it can't, you know, it's already connected via Derp so it just continues the connection with Derp and this creates a kind of seamless magic type experience where doing things over Derp is slower.
Yes, it is measurably slower, because, you know, you're not going directly, you're doing TCP inside of TCP. And, you know, that comes with a veritable minefield of lasers or whatever you call it. And it does work, though. It's not ideal if you wanna do things like copy large amounts of data, but if you just wanna ssh into prod and see the logs for what the heck is going on and why you're getting paged at 3:00 AM, it's pretty great.
[00:16:40] Jeremy: What you, you were calling Derp is it where you have servers kind of all over the world and somehow it determines which one's, I guess, is it which one's closest to your destination or which one's closest to you. I'm kind of
[00:16:54] Xe: It's really interesting. It's one of the most weird distributed systems, uh, type things that I've ever seen. It's the kind of thing that could only come outta the mind of an X Googler, but basically every tailscale, every tailscale node has a connection to all of the Derp servers and through process of, you know, latency testing.
It figures out which connection is the fastest and the lowest latency, and it calls that its home Derp. But because everything is connected to every Derp, you can have two people with different home Derps getting their packets relayed to other clients from different Derps.
So, you know, if you have a laptop in Ottawa and a laptop in San Francisco, the laptop in San Francisco will probably use the, uh, Derp that's closest to it. But the laptop in Ottawa will also use the Derp that's closest to it. So you get this sort of, like, asymmetric thing, and it actually works out a lot better in practice than you're probably imagining.
[00:17:52] Jeremy: And then these servers, what was the, the technical term for them? Are they like relays or what's
[00:17:58] Xe: They're relays. Uh, they only really deal with encrypted wire guard packets, and there's, no way for us at tailscale, to see the contents of Derp messages, it is literally just a forwarder. It, it literally just forwards things based on the key ID.
[00:18:17] Jeremy: I guess if tail scale isn't able to decrypt the traffic, is, is that because the, the keys are only on the user's devices, like it's on their laptop and on the server they're trying to reach, or
[00:18:31] Xe: Yeah. The private keys live and die with the devices they were minted on. And the public keys are given to the coordination server, and the coordination server spreads those around to every device in your tailnet. It does some limiting so that, like, if you don't have ACL access to something, you don't get the, uh, public key for it. The public key, not the private key. And yeah. Then, you know, you just go that way and it'll just figure it out. It's pretty nice.
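The coordination server's key-dropbox role, handing each device only the public keys its ACLs entitle it to, could be sketched like so. All names here are illustrative assumptions, not Tailscale's real internals:

```go
package main

import "fmt"

// device pairs a node name with the public key it uploaded.
type device struct {
	name   string
	pubKey string
}

// visiblePeers returns the public keys the given device should learn,
// filtered by an allow-list standing in for the real ACL engine.
func visiblePeers(d string, all []device, allowed map[string]map[string]bool) map[string]string {
	keys := make(map[string]string)
	for _, peer := range all {
		if peer.name == d {
			continue // a device already knows its own key
		}
		if allowed[d][peer.name] {
			keys[peer.name] = peer.pubKey
		}
	}
	return keys
}

func main() {
	fleet := []device{{"laptop", "pk1"}, {"prod-db", "pk2"}, {"printer", "pk3"}}
	acl := map[string]map[string]bool{
		"laptop": {"prod-db": true}, // laptop may reach prod-db, not printer
	}
	fmt.Println(visiblePeers("laptop", fleet, acl)) // prints "map[prod-db:pk2]"
}
```

The point of the filter is the limiting Xe mentions: a node that is not allowed to reach a peer never even learns that peer's public key.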
[00:19:03] Jeremy: When we're kind of talking about situations where it can't connect directly, that's where you would use the relay. what are kind of the typical cases where that happens, where you, you aren't able to just connect directly?
[00:19:17] Xe: Hotel wifi and paranoid network security setups. Hotel wifi is the most notorious one, because, you know, you have, like, an overpriced wifi connection. And if you're, like, I don't know, recording a bunch of footage on your iPhone, and because in 2022 the iPhone has the USB 2 connection on it, and, you know, you wanna copy that. You wanna use the network, but you can't. So you could just let it upload through iCloud or something, or, you know, do the bare minimum you need to get the data off. With Derp it wouldn't be ideal, but it would work. And ironically enough, that entire complexity involved with, you know, doing TCP inside of TCP to copy a video file over to your laptop might actually be faster than USB 2, which is something that I did the math for a while ago.
And I just started laughing.
[00:20:21] Jeremy: Yeah, that that is pretty, pretty ridiculous
[00:20:23] Xe: welcome to the future, man (laughs) .
[00:20:27] Jeremy: in terms of connecting directly, usually when you have a computer on the internet, you don't have all your ports open, you don't necessarily allow, just anybody to send you traffic over UDP and so forth. let's say I wanna send, UDP data to a, a server on my network, but, you know, maybe it has some TCP ports open. I I'm assuming once I connect into the network via the VPN, I'm able to use other protocols and ports that weren't necessarily exposed. Is that correct?
[00:21:01] Xe: Yeah, you can use UDP. you can do basically anything you would do on a normal network except multicast um, because multicast is weird.
I mean, there are thoughts on how to handle multicast, but the main problem is that, like, wireguard, which is what tailscale is built on top of, is a so-called OSI model layer 3 network, where it's at, like, you know, the IP address level, and multicast is a layer 2 or data link layer type thing.
And those are different layers, and you can't really easily put, you know, like, broadcast packets into IP. Uh, IPv4 thinks otherwise, but, uh, in practice, no, people don't actually use the broadcast address.
[00:21:48] Jeremy: so for someone who's, they, they have a project or their company wants to get started. I mean, what does onboarding look like? What, what do they have to do to get all these devices talking to one another?
[00:22:02] Xe: Basically you install tailscale, you log in with a little GUI thing, or on a Linux server you run tailscale up, and then you all log in to, like, a G Suite account with the same domain name. So, you know, if your domain is like example.com, then everybody logs in with their example.com G Suite account.
And, there is no step three, everything is allowed and everything can just connect and you can change the permissions from there. By default, the ACLs are set to a, you know, very permissive allow everyone to talk to everyone on any port. Uh, just so that people can verify that it's working, you know, you can ping to your heart's content.
You can play Minecraft with others. You can, you know, host an HTTP server. You can SSH into your development box and write blog posts with emacs, whatever you want.
[00:22:58] Jeremy: okay, you install the, the software on your servers, your workstations, your laptops, and so on. And then at, after that there's some kind of webpage or dashboard you would go in and say, I want these people to be able to access these things and
[00:23:14] Xe: Mm-hmm
[00:23:15] Jeremy: these ports and so on.
[00:23:17] Xe: you, uh, can customize the access control rules with something that looks like JSON, but with trailing commas and comments allowed, and you can go from there to customize basically anything to your heart's content. you can set rules so that people on the DevOps team can access everything, but you know, maybe marketing doesn't need access to the production database.
So you don't have to worry about that as much.
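A policy along the lines Xe describes, DevOps can reach everything while marketing has no rule granting the production database, might look roughly like this in the JSON-with-trailing-commas-and-comments format he mentions. The field names reflect my best understanding of Tailscale's ACL syntax and may differ between versions; treat this as an illustrative sketch, not authoritative syntax:

```jsonc
{
  // Groups map a name to a list of users.
  "groups": {
    "group:devops":    ["alice@example.com"],
    "group:marketing": ["bob@example.com"],
  },

  // Hosts give friendly names to addresses (hypothetical IPs).
  "hosts": {
    "wiki":    "100.64.0.12",
    "prod-db": "100.64.0.20",
  },

  "acls": [
    // DevOps can talk to every node on every port.
    {"action": "accept", "src": ["group:devops"], "dst": ["*:*"]},

    // Marketing can reach the wiki over HTTPS, and nothing else --
    // there is simply no rule granting them prod-db.
    {"action": "accept", "src": ["group:marketing"], "dst": ["wiki:443"]},
  ],
}
```

Because the rules are default-deny, leaving the database out of marketing's rules is all it takes to wall it off.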
[00:23:45] Jeremy: there's, there's kind of different options for VPNs. CloudFlare access, zero tier, there's, there's some kind of, I think it's Nebula from slack or something like that. so I was kind of curious from your perspective, what's the, difference between those kinds of services and, and tailscale.
[00:24:04] Xe: I'm gonna lead this off by saying that I don't totally understand the differences between a lot of them, because I've only really worked with tailscale. I know things about the other options, but, uh, I have the most experience with tailscale. But from what I've been able to tell, there are things that tailscale offers that others don't, like reverse mapping of IP addresses to people. Or there's this other feature that we've been working on where you can embed tailscale as a library inside your Go application, and then write an internal admin service that isn't exposed to the internet, but is only exposed over tailscale.
And I haven't seen a way to do those things with the others, but again, I haven't done much research. Um, I understand that zero tier has some layer 2 capabilities, but I don't have enough time in the day to look into it.
[00:25:01] Jeremy: There's been different, I guess you would call them VPN protocols. I mean, there's people have probably worked with IP sec in some situations they may have heard of OpenVPN, wireguard. in the case of tailscale, I believe you chose to build it on top of wireguard.
So I wonder if you could talk a little bit about why, you chose wireguard and, and maybe what makes it unique.
[00:25:27] Xe: I wasn't on the team that initially wrote, like, the core of tailscale itself. But from what I understand, wireguard was chosen because of the low overhead. Uh, it's literally: you just encrypt the packets, you send them to the other server, the other server decrypts them, and, you know, you're done. It's also based purely on the public key, um, the key pairs involved. And from what I understand, like, at the wireguard protocol level there's no reason why you would need an IP address at all in theory, but in practice you kind of need an IP address because, you know, everything sucks. But also wireguard is, like, UDP only, I think, at its core implementation, which is a step up from, like, AnyConnect and OpenVPN, where they have TCP modes.
So you can experience the, uh, glorious, trash fire of TCP in TCP. And from what I understand with wireguard, you don't need to set up a certificate authority or figure out how the heck to revoke certificates. Uh, you just have key pairs and if a node needs to be removed, you delete the key pair and you're done.
And I think that really matches up with a lot of the philosophy behind how tailscale networks work a lot better. You know, you have a list of keys and if the network changes the list of keys changes, that's, that's the end of the story.
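That "the key list is the network" idea reduces revocation to a map delete, which a tiny Go illustration makes concrete (toy types, not real code):

```go
package main

import "fmt"

// Sketch of the revocation model described above: there is no CA and no
// revocation list. Membership is just the set of public keys the
// coordination server hands out, so removing a node means deleting its
// key from that set and redistributing.
func revoke(peers map[string]string, name string) {
	delete(peers, name)
}

func main() {
	peers := map[string]string{ // node name -> public key
		"laptop":  "pk-laptop",
		"old-box": "pk-oldbox",
	}
	revoke(peers, "old-box") // node removed; nothing to expire or countersign
	fmt.Println("remaining peers:", peers)
}
```

Contrast this with certificate-based VPNs, where removing a node means publishing and distributing revocation state that every peer must check.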
Jeremy: So maybe one of the big selling points was just, what has the least amount of things, I guess, to deal with, or what's the simplest? When you're using a component that you want to put into your own product, you kind of want the least amount of things that could go wrong, I guess.
[00:27:14] Xe: Yeah. It's more like simple, but not, like, limiting. Like, for example, a set of tinker toys is simple in that, you know, you can build things and you don't have to worry too much about the material science, but a set of tinker toys is also limiting, because, you know, they're little wooden dowels and little circles made out of wood that you stick the dowels into. You know, you can only do so much with it.
And I think that in comparison, wireguard is simple. You know, there's just key pairs. There's just encryption. And it's simple in its, like, overall theory and its implementation, but it's not limiting. Like, you can do pretty much anything you want with it.
Jeremy: Inherently, whenever we build something, that's what we want, but that's an interesting way of putting it. Yeah.
[00:28:05] Xe: Yeah. It. It can be kind of annoyingly hard to figure out how to make things as simple as they need to be, but still allow for complexity to occur. So you don't have to like set up a keyboard macro to write if error not equals nil over and over.
[00:28:21] Jeremy: I guess the next thing I'd like to talk a little bit about is. We we've covered it a little bit, but at a high level, I understand that that tailscale uses wireguard, which is the open source, VPN protocol, I guess you could call it. And then there's the client software. You're saying you need to install on each of the servers and workstations.
But there's also a, a control plane. and I wonder if you could kind of talk a little bit about I guess at a high level, what are all the different components of, of tailscale?
[00:28:54] Xe: There's the agent that you install on your devices. The agent is basically the same between all the devices. It's all written in Go, and it turns out that Go can actually cross compile fairly well. So you have your, you know, your implementation in Go that is basically the same code, more or less, running on Windows, macOS, FreeBSD, Android, ChromeOS, iOS, Linux.
I think I just listed all the platforms. I'm not sure, but you have that. And then there's the sort of control plane on tailscale's side. The control plane is basically, like, control, uh, which is, uh, I think a Get Smart reference. And that is basically a key dropbox. So, you know, you authenticate through there. That's where the admin panel's hosted. And that's what tells the different tailscale nodes, uh, the keys of all the other machines on the tailnet. And also on tailscale's side there's, uh, Derp, which is a fleet of a bunch of different VPSs in various clouds all over the world, both to try to minimize cost and to, uh, have resiliency, because if both Digital Ocean and Vultr go down globally, we probably have bigger problems.
[00:30:15] Jeremy: I believe you mentioned that the, the clients were written in go, are the control plane and the relay, the Derp portion. Are those also written in go or are they
[00:30:27] Xe: They're all written and go, yeah,
go as much as possible. Yeah.
It's kind of what happens when you have some ex Go team members as the core people involved in tailscale. Like, there's a Go compiler fork that has some additional patches that Go upstream either can't accept, uh, won't accept, or hasn't yet accepted. For a while it was how we did things like trying to shave bytes off the binary size to attempt to fit it into the iOS network extension limit.
Because for some reason they only allowed you to have 15 megabytes of RAM for both, like, your application and working RAM. And it turns out that 15 megabytes of RAM is way more than enough to do something like OpenVPN. But, you know, when you have a peer-to-peer VPN engine, it doesn't really work that well.
So, you know, that's a lot of interesting engineering challenge.
[00:31:28] Jeremy: That was specifically for iOS. So to run it on an iPhone.
[00:31:32] Xe: Yeah. Um, and amazingly, after the person who did all of the optimization to the linker, trying to get the binary size down as much as possible, like replacing Unicode packages with something that's more efficient, you know, like basically all but compressing parts of the binary to try to save space, then the iOS, I think, 15 beta dropped and we found out that they increased the network extension RAM limit to 50 megabytes. And the look of defeat on that poor person's face. I feel very bad for him.
[00:32:09] Jeremy: you got what you wanted, but you're sad about it,
[00:32:12] Xe: Yeah.
[00:32:14] Jeremy: so that's interesting too. you were using a fork of the go compiler
[00:32:19] Xe: Basically everything that is built is built using, uh, the tailscale fork, of the go compiler.
[00:32:27] Jeremy: Going forward is the sort of assumption is that's what you'll do, or is it you're, you're hoping you can get this stuff upstreamed and then eventually move off of it.
[00:32:36] Xe: I'm pretty sure that, I, I don't know if I can really make a forward looking statement like that, but I've come to accept the fact that there's a fork of the go compiler. And as a result, it allows a lot more experimentation and a bit more control over what's going on. Like, I'm not, like, the most happy with it, but I understand why it exists and I've made my peace with it.
[00:33:07] Jeremy: And I suppose it, it helps somewhat that the people who are working on it actually originally worked on the, go compiler at Google. Is that right?
[00:33:16] Xe: Oh yeah. If, uh, there weren't ex go team people working on that, then I would definitely feel way less comfortable about it. But I trust that the people that are working on it, know what they're doing at least enough.
[00:33:30] Jeremy: I, I feel like, that's, that's kind of the position we put ourselves in with software in general, right? Is like, do we trust our ourselves enough to do this thing we're doing?
[00:33:39] Xe: Yeah. And trust is a bitch.
[00:33:44] Jeremy: um, I think one of the things that's interesting about tail scale is that it's a product that's kind of it's like network infrastructure, right? It's to connect you to your other devices. And that's a little different than somebody running a software as a service. And so. how do you test something that's like built to support a network and, and how is that different than just making a web app or something like that.
[00:34:11] Xe: Um, well, it's a lot more complicated for one, especially when you have to have multiple devices in the mix with multiple different operating systems. And I was working on some integration tests, doing stuff for a while, and it was really complicated. You have to spin up virtual machines, you know, you have to like make sure the virtual machines are attempting to download the version of the tailscale client you wanna test and. It's it's quite a lot in practice.
[00:34:42] Jeremy: I mean, do you have a, a lab, you know, with Android phones and iPhones and laptops and all this sort of stuff, and you have some kind of automated test suite to see like, Hey, if these machines are in Ottawa and, my servers in San Francisco, like you're mentioning before that I can get from my iPhone to this server and the data center over here, that kind of thing.
[00:35:06] Xe: What's the right way to phrase this without making things look bad? Um, it's a work in progress. It's really a hard problem to solve, uh, especially when the company is fully remote and, uh, like, the address that's listed on the business records is literally one of the founders' condos, because, you know, the company has no office.
So that makes the logistics for a lot of this. Even more fun.
[00:35:37] Jeremy: Probably any company that's in an early stage feels the same way where it's like, everything's a work in progress and we're just gonna, we're gonna keep going and we're gonna get there. And as long as everything keeps running, we're good.
[00:35:50] Xe: Yeah. I, I don't like thinking about it in that way, because it kind of sounds like pessimistic or defeatist, but at some level it's, it, it really is a work in progress because it's, it's a hard problem and hard problems take a lot of time to solve, especially if you want a solution that you're happy with.
[00:36:10] Jeremy: And, and I think it's kind of a unique case too, where it's not like if it goes down, it's like people can't do their job. Right. So it's yeah.
[00:36:21] Xe: Actually, if tailscale's, like, control plane goes down, I don't think people would notice until they tried to, like, reboot a laptop or connect a new device to their tailnet. Because once all the tailscale agents have all of the information they need from the control plane, you know, they just continue on independently and don't have to care.
Derp is also fairly independent of the, like the key dropbox component. And, you know, if that, if that goes down Derp doesn't care at all,
[00:37:00] Jeremy: Oh, okay. So if the control plane is down, as long as you had authenticated earlier in the day, you can still, I don't know if it's cached or something, but you can still continue to reach the relay servers, the Derp servers or your,
[00:37:15] Xe: other nodes. Yeah. I, I'm pretty sure that in most cases, the control plane could be down for several hours a day and nobody would notice unless they're trying to deal with the admin panel.
[00:37:28] Jeremy: Got it. that's a little bit of a relief, I suppose, for, for all of you running it,
[00:37:33] Xe: Yeah. Um, it's also kind of hard to sell people on the idea of here is a VPN thing. You don't need to self host it and they're like, what? Why? And yeah, it can be fun.
[00:37:49] Jeremy: though, I mean, I feel like anybody who has, self-hosted a VPN, they probably like don't really wanna do it. I don't know. Maybe I'm wrong.
[00:38:00] Xe: well, so a lot of the idea of wanting to self host it is, uh, I think it's more of like trying to be self-sufficient and not have to rely on other companies, failures dictating your company's downtime. And, you know, like from some level that's very understandable. And, you know, if, you know, like tail scale were to get bought out and the new owners would, you know, like basically kill the product, they'd still have something that would work for them.
I don't know if like such a defeatist attitude is like productive. But it is certainly the opinion that I have received when I have asked people why they wanna self-host. other people, don't want to deal with identity providers or the, like, they wanna just use their, they wanna use their own identity provider.
And what was hilarious was there was one, there was one thing where they were like our old VPN server died once and we got locked out of our network. So therefore we wanna, we wanna self-host tailscale in the future so that this won't happen again.
And I'm like, buddy, let's just take a moment and retrace our steps here. 'Cause I don't think you mean what you think you mean.
[00:39:17] Jeremy: yeah, yeah.
[00:39:19] Xe: In general, like I suggest people that, you know, even if they're like way deep into the tailscale, Kool-Aid they still have at least one other method of getting into their servers. Ideally, two. I, I admit that I'm, I come from an SRE style background and I am way more paranoid than most, but it, I usually like having, uh, a backup just in case.
[00:39:44] Jeremy: So I, I suppose, on, on that note, let's, let's talk a little bit about your role at tailscale. the title of the archmage of infrastructure is one of the, the coolest titles I've, uh, I've seen. So maybe you can go a little bit into what that entails at, at tailscale.
[00:40:02] Xe: I started that title as a joke that kind of stuck, uh, my intent, my initial intent was that every time someone asked, I'd say, I'd have a different, you know, like mystic sounding title, but, uh, archmage of infrastructure kind of stuck. And since then, I've actually been pivoting more into developer relations stuff rather than pure software engineering.
And, from the feedback that I've gotten at the various conferences I've spoken at, they like that title. Even though it doesn't really fit with developer relations work at all, it's like it fits because it doesn't, you know, in that kind of ironic way.
[00:40:40] Jeremy: I guess this would go more into the, the infrastructure side, but. What does the, the scale of your infrastructure look like? I mean, I, I think that you touched a little bit on the fact that you have relay servers all over the place and you've got this control plane, but I wonder if you could give people a little bit of perspective of what kind of undertaking this is.
[00:41:04] Xe: I am pretty sure at this point we have more developer laptops and the like than we do production servers. Um, I'm pretty sure that the scale of the production servers is in the tens, at most. Um, it turns out that computers are pretty darn efficient and, uh, you don't really need, like, a lot of computers to do something amazing.
[00:41:27] Jeremy: the part that I guess surprises me a little bit is, is the relay servers, I suppose, because, I would imagine there's a lot of traffic that goes through those. are you finding that just most of the time they just aren't needed and usually you can make a direct connection and that's why you don't need too many of these.
[00:41:45] Xe: From what I understand, I don't know if we actually have a way to tell, like, what percentage of data is going over the relays versus not. And I think that was an intentional decision, um, that may have been revisited. I'm operating based off of, like, six to 12 month old information right now. But in general, like, the only state that the relay servers have is in RAM.
And whenever you disconnect the server, the state is dropped.
[00:42:18] Jeremy: Okay.
[00:42:19] Xe: and even then that state is like, you know, this key is listening. It is, uh, connected, uh, in case you wanna send packets over here, I guess.
it's a bit less bandwidth than you're probably thinking. It's not, like, enough to max it out 24/7, but it is, you know, measurable, and there are some, you know, costs associated with it. This is also why it's on Digital Ocean and Vultr and not AWS. But in general it's a lot less than you'd think. I'm pretty sure that, like, if I had to give a baseless assumption, I'd say that probably about, like, 85% of traffic goes directly.
And the remaining is, like, the few cases in the hole punching engine that we haven't figured out yet. Like Palo Alto firewalls. Oh God, those things are a nightmare.
[00:43:13] Jeremy: I see. So it's most of the traffic actually ends up. Being straight peer to peer. Doesn't have to go through your infrastructure. And, and therefore it's like, you don't need too many machines, uh, to, to make this whole thing work.
[00:43:28] Xe: Yeah. it turns out that computers are pretty darn fast and that copying data is something that computers are really good at doing. Um, so if you have, you know, some pretty darn fast computers, basically just sitting there and copying data back and forth all day, like it, you can do a lot with shockingly little.
Um, when I first started, I believe that the Derp VMs were using, like, sometimes as little as one core and 512 megabytes of RAM as, like, a primary Derp. And, you know, we only noticed when there were some weird connection issues for people that were only on Derp, because there were enough users that the machine had run out of memory.
So we just, you know, upped the, uh, virtual machine size and called it a day. But it's truly remarkable how far you can get with very little.
[00:44:23] Jeremy: And you mentioned the relay servers, the, the Derp servers were on services like digital ocean and Vultr. I'm assuming because of the, the bandwidth cost, for the control plane, is, is that on AWS or some other big cloud provider?
[00:44:39] Xe: it's on AWS. I believe it's in EU central 1.
[00:44:44] Jeremy: You're helping people connect from device to device and in a situation like that. what does monitoring look like in, in incidents? Like what are you looking for to determine like, Hey, something's not working.
[00:44:59] Xe: there's monitoring with, you know, Prometheus, Grafana, all of that stuff. there are some external probing things. there's also some continuous functional testing for trying to connect to tailscale and like log in as an account. And if that fails like twice in a row, then, you know, something's very wrong and, you know, raise the alarm.
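The "fails twice in a row, then raise the alarm" logic is simple enough to sketch; something like this Go toy captures it (illustrative, not the actual probe code):

```go
package main

import "fmt"

// prober tracks consecutive failures of a functional probe, e.g. an
// end-to-end "log in as a test account" check.
type prober struct {
	consecutiveFails int
}

// observe records one probe result and reports whether to raise the
// alarm: two failures in a row, with any success resetting the count.
func (p *prober) observe(ok bool) bool {
	if ok {
		p.consecutiveFails = 0
		return false
	}
	p.consecutiveFails++
	return p.consecutiveFails >= 2
}

func main() {
	p := &prober{}
	results := []bool{true, false, false} // third probe is the second failure in a row
	for i, ok := range results {
		if p.observe(ok) {
			fmt.Printf("probe %d: raise the alarm\n", i+1) // prints "probe 3: raise the alarm"
		}
	}
}
```

Requiring two consecutive failures is a common way to keep a single flaky probe from paging anyone at 3:00 AM.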
But in general, a lot of our monitoring is kind of hard at some level because, you know, we're tailscale, and tailscale can't always benefit from tailscale to help operate tailscale, because, you know, it's tailscale. Um, so we're still trying to figure out how to detangle the chicken and egg situation.
It's really annoying.
Jeremy: there's the, the term dog fooding, right? Where they're saying, like, oh, we run, um, our own development on our own platform or our own software. But I could see when your product is network infrastructure, VPNs, where that could be a little, little dicey.
[00:46:06] Xe: Yeah, it is very annoying. But I'm pretty sure we'll figure something out. It is just a matter of when. Another thing that's come up is we've kind of wanted to use tailscale's SSH features, where you specify ACL rules to allow people to SSH to other nodes as various users.
But if that becomes your main access to production, then, you know, like, if tailscale is down and you're tailscale, how do you get in? Uh, there's been various philosophical discussions about this. It's also slightly worse if you use what's called check mode in SSH. With tailscale SSH without check mode, you know, the server checks against the policy rules and the ACL, and if it's okay, it lets you in, and if not, it says no. But with check mode, there's also this, like, eight hour quote-unquote lifetime, like having sudo mode on GitHub, where you do an auth challenge with your auth provider, and then, you know, you're given a, uh, hey-this-person-has-done-this-thing type verification.
And if that's down and that goes through the control plane, and if the control plane is down and you're tailscale, trying to debug the control plane, and in order to get into the control plane over tailscale, you need to use the, uh, control plane. It, you know, that's like chicken and egg problem level 78,
which is a mythical level of chicken egg problem that, uh, has only been foretold in the legends of yore or something.
[00:47:52] Jeremy: at that point, it sounds like somebody just needs to, to drive to the data center and plug into the switch.
[00:47:59] Xe: I mean, It's not, it's not going to, it probably wouldn't be like, you know, we need to get a person with an angle grinder off of Craigslist type bad. Like it was with the Facebook BGP outage, but it it's definitely a chicken and egg problem in its own right.
it makes you do a lot of lateral thinking too, which is also kind of interesting.
[00:48:20] Jeremy: When, when you say lateral thinking, I'm just kind of curious, um, if you have an example of what you mean.
[00:48:27] Xe: I don't know of any example that isn't NDAed. Um, but basically, you know, tailscale is getting to the point where tailscale is relying on tailscale to make tailscale function, and, you know, yeah. This is a classic ouroboros style problem.
I've heard a, uh, a wise friend of mine say that that is an ideal problem to have, which sounds weird at face value. But if you're getting to that point, that means that you're successful enough that, you know, you're having that problem, which is in itself a good thing, paradoxically.
[00:49:07] Jeremy: better to have that problem than to have nobody care about the product. Right.
[00:49:12] Xe: Yeah.
[00:49:13] Jeremy: kind of on that note, um, you mentioned you worked at Salesforce, uh, I believe that was working on Heroku. I wonder if you could talk a little bit about your experience working at, you know, tailscale, which is kind of more of an early startup, versus an established company like Salesforce.
[00:49:36] Xe: So at the time I was working at Heroku, it definitely didn't feel like I was working at Salesforce for the majority of it. It felt like I was working, you know, at Heroku. like on my resume, I listed it as Heroku. When I talked about it to people, I said I worked at Heroku, and Salesforce was this, you know, mythical Ohana thing that I didn't have to deal with unless I absolutely had to.
By the end of the time I was working at Heroku, uh, the Salesforce, uh, sort of started to creep in and, you know, we moved from tracking issues in GitHub issues, like we were used to, to using their, oh, what's the polite way to say this, their creation, which was like the moral equivalent of JIRA implemented on top of Salesforce.
You had to be behind the VPN for it. And, you know, every ticket had 20 fields and, uh, there were no templates. And in comparison, at tailscale, you know, we just use GitHub issues, maybe some things in Notion for doing longer-term tracking or Kanban stuff, but it's nice to not have, you know, all of the pomp and ceremony of filling out 20 fields in a ticket for like two sentences of: this thing is obviously wrong and it's causing X to happen.
Please fix.
[00:51:08] Jeremy: I, I like that, that phrase, the, the creation, that's a very, very diplomatic term.
[00:51:14] Xe: I mean, I can think of other ways to describe it, but I'm pretty sure those ways wouldn't be allowed on the podcast. So
[00:51:25] Jeremy: Um, but, but yeah, I, I know what you mean for sure where, it, it feels like there's this movement from, Hey, let's just do what we need. Like let's fill in the information that's actually relevant and don't do anything else to a shift to, we need to fill in these 10 fields because that's the thing we do.
Yeah.
[00:51:48] Xe: Yeah. and in the time I've been working for tailscale, I'm like employee ID 12. And, uh, tailscale has gone from a company where I literally knew everyone to, just recently, the point where I don't know everyone anymore. And it's a really weird feeling. I've never been in a small-stage startup that's gotten to this size before, and I've described some of my feelings to other people who have been there and they're like, yeah, welcome to the club. So I figure a lot of it is normal. from what I understand, though, there's a lot of intentionality to try to prevent tailscale from becoming, you know, like Google-style organizational complexity, unless that is absolutely necessary to do something.
[00:52:36] Jeremy: it's a function of size, right? Like as you have more people, more teams, then more process comes in. that's a really tricky balance to, to grow and still keep that feeling of, I'm just doing the thing, I'm doing the work rather than all this other process stuff.
[00:52:57] Xe: Yeah, but it, I've also kind of managed to pigeonhole myself off into a corner with devrel stuff. And that's been nice. I've been working a bunch with, uh, like marketing people and, uh, helping out with support occasionally and doing a, like a godawful amount of writing.
[00:53:17] Jeremy: the, the writing, for our audience's benefit, I, I think they should, they should really check out your blog because I think that the way you write your, your articles is very thoughtful in terms of the balance of the actual example code or example scripts and the descriptions and, and some there's a little bit of a narrative sometimes too.
So,
[00:53:40] Xe: Um, I'm actually more of a prose writer just by like how I naturally write things. And a lot of the style of how I write things is, I will take elements from, uh, the Socratic style of dialogue where, you know, you have the student and the teacher. And, you know, sometimes the student will ask questions that the teacher will answer.
And I found that that's a particularly useful way to help model understanding or, you know, like put side concepts off into their own little blurbs or other things like that. I also started doing those conversation things with, uh, furry art, specifically to dunk on a homophobe that was getting very angry at furry art being in, uh, another person's blog.
And it's occasionally fun to go into the, uh, orange website of bad takes and see the comments when people complain about it. oh gosh, the bad takes are hilariously good sometimes.
[00:54:45] Jeremy: it's good that you have like a, a positive, mindset around that. I know some people can read, uh, that sort of stuff and go, you know, just get really bummed out.
[00:54:54] Xe: One of the ways I see it is that a lot of the time algorithms are based on like sheer numbers. So if you like get something that makes people argue in the comments, that number will go up and because there's more comments on it, it makes more people more likely to, to read the article and click on it.
So, sometimes I have been known to sprinkle, what's the polite way to say this. I've been known to sprinkle like intentionally kind of things that will, uh, get people and make them want to argue about it in the comments. Purely to make the engagement numbers rise up, which makes more people likely to read the article.
And, it's kind of a dirty practice, but you know, it makes more people read the article and more people benefit. So, you know, like it's kind of morally neutral, I guess.
[00:55:52] Jeremy: usually that, that seems like, a sketchy thing. But I feel like if it's in service to, uh, like a technical blog post, I mean, why not? Right.
[00:56:04] Xe: And a lot of the times I'll usually have the like, uh, kind of bad take, be in a little conversation blurb thing so that people will additionally argue about the characterization of, you know, the imaginary cartoon shark or whatever.
[00:56:20] Jeremy: That's good. It's the, uh, it's the Xe Xe universe that they're, they're stepping into.
[00:56:27] Xe: I've heard people describe it, uh, lovingly as the xeiaso.net cinematic universe.
I've had some ideas on how to expand it in the future with more characters that have more different kind of diverse backgrounds. But, uh, it turns out that writing this stuff is hard. Like actually very hard because you have to get this right.
You have to get the right balance of like snark satire, uh, like enlightenment. And
it's, it's surprisingly harder than you'd think. Um, but after a while, I've just sort of managed to figure out as I'm writing where the side tangents come off, and which ones I should keep and which ones I should, uh, prune, and which ones can also help gain deeper understanding with a little Socratic dialogue: to start with, like, an incomplete assumption, an incomplete picture.
And then, you know, a question of, wait, what about this thing? Doesn't that conflict with that? And like, well, yes, technically it does, but realistically we don't have to worry about that as much. So we can think about it just in terms of this bigger model and, uh, that's okay. Like, uh, I mentioned the OSI model earlier, you know, the seven-layer OSI model. it's genuinely overkill for basically everything, except it's a really great conceptual model for figuring out the difference between, you know, an ethernet cable, the ethernet card, the IP stack, TCP, and, you know, TLS or whatever.
I have a couple talks that are gonna be up by the time this is published. Uh, one of them is my rustconf talk, or what was it called? I think it was called the surreal horrors of PAM or something, where I discussed my experience trying to debug a PAM module in Rust, uh, for work. And, uh, it's the kind of story where, you know, it's bad when you have a break point on dlopen.
[00:58:31] Jeremy: That sounds like a nightmare.
[00:58:32] Xe: Oh yeah. Like part of attempting to fix that involved going very deep. We're talking like an HTML frameset in the Internet Archive for SunOS documentation that was written around the time that PAM was used. Like it's bad enough where like everything in the frameset but the contents had eroded away through bit rot, and you know, you're very lucky just to have what you do.
[00:59:02] Jeremy: well, I'm glad it was you and not me. we'll get to hear about it and not have to go through the suffering ourselves.
[00:59:11] Xe: yeah. One of the things I've been telling people is that I'm not like a brilliant programmer. Like I know a bunch of people who are definitely way smarter than me, but what I am is determined and, uh, determination is a bit stronger of a force than you'd think.
[00:59:27] Jeremy: Yeah. I mean, without it, nothing gets done. Right.
[00:59:30] Xe: Yeah.
[00:59:31] Jeremy: as we wrap up, is there anything we missed or anything else you wanna mention?
[00:59:36] Xe: if you wanna look at my blog, it's on xeiaso.net. That's X-E-I-A-S-O dot net. Um, that's where I post things. You can see, like, the 280-something articles at time of recording. It's probably gonna get to 300 at some point, oh God, it's gonna get to 300 at some point. Um, and yeah, I try to post articles about weekly, uh, depending on facts and circumstances. I have a bunch of talks coming up, like one about the hilarious over-engineering I did in my blog.
And maybe some more, if I get back positive responses from calls for paper submissions.
[01:00:21] Jeremy: Very cool. Well, Xe thank you so much for, for coming on software engineering radio.
[01:00:27] Xe: Yeah. Thank you for having me. I hope you have a good day and, uh, try out tailscale, uh, note my bias, but I think it's great.
Jonathan Shariat is the coauthor of the book Tragic Design and co-host of the Design Review Podcast.
He's currently a Sr. Interaction Designer & Accessibility Program Lead at Google.
This episode originally aired on Software Engineering Radio.
[00:00:00] Jeremy: Today I'm talking to Jonathan Shariat. He's the co-author of Tragic Design and the host of the Design Review Podcast, and he's currently a senior interaction designer and accessibility program lead at Google. Jonathan, welcome to software engineering radio.
[00:00:15] Jonathan: Hi, Jeremy, thank you so much for having me on.
[00:00:18] Jeremy: the title of your book is Tragic Design. And I think that people can take a lot of different meanings from that. So I wonder if you could start by explaining what tragic design means to you.
[00:00:33] Jonathan: Hmm. For me, it really started with this story that we have in the beginning of the book. It's also online. Uh, I originally wrote it as a Medium article, and that's really what opened my eyes to, hey, you know, design is this kind of invisible world all around us that we actually depend on very critically in some cases.
And So this story was about a girl, you know, a nameless girl, but we named her Jenny for the story. And in short, she came for treatment of cancer at the hospital, uh, was given the medication and the nurses that were taking care of her were so distracted with the software they were using to chart, make orders, things like that, that they miss the fact that she needed hydration and that she wasn't getting it.
And then because of that, she passed away. And I still remember that feeling of just kind of outrage. And, you know, when we hear a lot of news stories, A lot of them are outraging. they, they touch us, but some of them, some of those feelings stay and they stick with you.
And for me, that stuck with me, I just couldn't let it go because I think a lot of your listeners will relate to this. Like we get into technology because we really care about the potential of technology. What could it do? What are all the awesome things that could do, but we come at a problem and we think of all the ways it could be solved with technology and here it was doing the exact opposite.
It was causing problems. It was causing harm, and the design of that, or, you know, the way that was built or whatever, was failing Jenny. it was failing the nurses too, right? Like a lot of times we blame the end user. So to me, that story was so tragic, something that deeply saddened me and was regrettable and cut short someone's, uh, you know, life, and that's the definition of tragic. and there's a lot of other examples with varying degrees of tragic, but, um, you know, as we look at the impact technology has, and then the impact we have in creating those technologies that have such large impacts, we have a responsibility to really look into that and make sure we're doing as best of a job as we can and avoid those as much as possible.
Because the biggest thing I learned in researching all these stories was, Hey, these aren't bad people. These aren't, you know, people who are clueless and making these, you know, terrible mistakes. They're me, they're you, they're they're people. Um, just like you and I, that could make the same mistakes.
[00:03:14] Jeremy: I think it's pretty clear to our audience where there was a loss of life, someone, someone died and that's, that's clearly tragic. Right? So I think a lot of things in the healthcare field, if there's a real negative outcome, whether it's death or severe harm, we can clearly see that as tragic.
and I, I know in your book you talk about a lot of other types of, I guess negative things that software can cause. So I wonder if you could, explain a little bit about now past the death and the severe injury. What's tragic to you.
[00:03:58] Jonathan: Yeah. still in that line of injury and death, the side that most of us will actually, um, impact in our work day-to-day is also physical harm. Like, creating the software in a car. I think that's a fairly common one, but also ergonomics, right?
Like when we bring it back to something less impactful, but still multiplied over the impact of a product, it can be quite big, right? Like if we're designing software in a way that's very repetitive or, you know, everyone's got that, that like scroll, thumb, scroll, you know, issue.
Right. if, uh, our phones aren't designed well, so there's a lot of ways that it can still physically impact you ergonomically. And that can cause you a lot of problems, arthritis and pain, but yeah, there's other ways that are still really impactful. So the other one is saddening or angering.
You know, that emotional harm is very real. And oftentimes it gets overlooked a little bit because, um, you know, physical harm is what is so real to us, but sometimes emotional harm isn't. But, you know, we talk about in the book the example of Facebook putting together this great feature, which takes your most liked photo and, you know, celebrates your whole year, your Year in Review, with the top photo from the year as the hero. they add some great, you know, well done illustrations behind it, of balloons and confetti and people dancing.
But some people had a bad year. Some people's most liked, engaged photo is because something bad happened, and they totally missed that. And because of that, people had a really bad time with this, where, you know, they lost their child that year. They lost their loved one that year, their house burnt down. Um, something really bad happened to them.
And here was Facebook putting that photo of their dead child up with, you know, balloons and confetti and people dancing around it. And that was really hard for people. They didn't want to be reminded of that, and especially in that way. and these emotional harms also come into play with anger.
You know, we talk about, well, one, you know, there's a lot of software out there that, um, tries to bring up news stories that anger us, which equals engagement. Um, but also ones that, um, use dark patterns to trick us into purchasing and buying and forgetting about that free trial, so they charge us for a yearly subscription and won't refund us.
Uh, if you've ever tried to cancel a subscription, you start to see their real colors. Um, so emotional harm and, uh, anger is a big one. We also talk about injustice in the book, where there are products that are supposed to be providing justice. Um, and you know, in very real ways, like voting or, you know, getting people the help that they need from the government, or, uh, for people to see their loved ones in jail.
Um, or, you know, you're getting a ticket unfairly because you were trying to read the sign and you couldn't understand it. so yeah, we look at a lot of different ways that design and the software that we create can have very real impact on people's lives, and in a negative way, if we're not careful.
[00:07:25] Jeremy: the impression I get, when you talk about tragic design, it's really about anything that could harm a person, whether physically, emotionally, you know, make them angry, make them sad. And I think the, the most liked photo example is a great one, because like you said, I think the people may be building something that, that harms and they may have no idea that they're doing it.
[00:07:53] Jonathan: Exactly. and I love that story because, not to just jump on the bandwagon of saying bad things about Facebook or something, no, I love that story because I can see myself designing the exact same thing. like being a part of that product, you know, building it, looking at the specifications the PM put together and the decks that we had, you know, like I could totally see that happening.
And just never, I think, never having the thought, because we're so focused on delighting our users and, you know, we have these metrics and these things in mind. So that's why, like, in the book, we really talk about a few different processes that need to be part of the product development cycle to stop, pause, and think about, like, well, what are the negative aspects here?
Like what are the things that could go wrong? What are the other life experiences that are negative that could be a part of this? Um, and you don't need to be a genius to think of every single thing out there. You know, like in this example, if they would have taken probably one hour out of their entire project, or maybe even 10 minutes, they might've come up with, like, oh, some people might've had a bad year, there could be a bad thing here.
Right. But, um, so you need that moment to pause, that moment to just say, okay, we have time to brainstorm together about how this could go wrong, or how the negative side of life could be impacted by this, um, feature. that's all that it takes. It doesn't necessarily mean that you need to do, you know, a giant study around the potential impact of this product and all the ways, but really just having a part of your process that takes a moment to think about that will just create a better product and better product outcomes. You know, if you think about all of life's experiences, and Facebook can say, hey, condolences, and, like, you know, show that thoughtfulness, that would have higher engagement, that would have higher, uh, satisfaction, right?
So they could have created a better outcome by considering these things, and obviously avoid the negative impact to users and the negative impact to their product.
[00:10:12] Jeremy: continuing on with that thought, you're a senior interaction designer and you're an accessibility program lead. And so I wonder, on the projects that you work on, and maybe you can give us a specific example, but how are you ensuring that you're not running up against these problems, where you build something that you think is going to be really great, um, for your users, but in reality ends up being harmful?
[00:10:41] Jonathan: Yeah, one of the best ways is, I mean, it should be part of multiple parts of your cycle. If you want a specific outcome out of your product development life cycle, um, it needs to be from the very beginning and then a few more times, so that it's not, you know, uh, I think, uh, programmers will all latch onto this, where they have the worst end of the stick, right?
And QA as well. Because, you know, any bad decision or assumption that's happened early on with, you know, the business team or the PM gets multiplied when they talk to the designer, and then gets multiplied again when they hand it off. And it's always the engineer who has to put the final foot down and be like, this doesn't make sense.
Or, I think users are going to react this way, or, you know, this is the implication of that assumption. So, um, it's the same thing, you know, in our team: we have it in the very early stage, when someone's putting together the idea for the feature or project we want to work on, it's right there. There's like a section about accessibility and a few other sections, uh, talking about looking out for this negative impact.
So right away, we can have a discussion about it when we're talking about what we should do about this and the different implications of implementing it. That's the perfect place for it. You know, like maybe when you're brainstorming about what we should do, maybe it's not okay there, because you're trying to be creative.
Right. You're trying to think. But at the very next step, when you're saying, okay, what would it mean to build this, that's exactly where it should start showing up in, you know, the discussion from the team. And it depends also on the risk involved, right? Which is attached to how much, uh, time and effort and resources you should put towards avoiding that risk. it's risk management.
So, you know, if you work, like some of my colleagues or some of my friends, in the automotive industry, and you're creating software and you're worried that it might be distracting, there might be a lot more time and effort. Or the healthcare industry. um, those might need to take a lot more resources. But if you're maybe building, um, you know, SaaS software for engineers to spin up resources, there might be a different amount of resources. It never is zero, uh, because you're still dealing with people and you'll impact them. And, you know, maybe that service goes down, and that was a healthcare service that went down because of your, you know. so you really have to think about what the risk is.
And then you can map that back to how much time and effort you need to be spending on getting that right. And accessibility is one of those things too, where a lot of people think that it takes a lot of effort, a lot of resources, to be accessible. And it really doesn't. It's just like tech debt, you know. if you have ignored your tech debt for five years, and then they're saying, hey, let's fix all the tech debt, yeah, nobody's going to be on board for that as much. Versus addressing it and finding the right level of tech debt that you're okay with, and when and how you address it, um, which is just better practice. That's the same thing with accessibility: if you're just building it correctly as you go, it's very low effort and it just creates a better product, better decisions.
Um, and it is totally worth the increased amount of people who can use it and the improved quality for all users. So, um, yeah, it's just kind of like a win-win situation.
[00:14:26] Jeremy: one of the things you mentioned was that this should all start at the very beginning, or at least right after you've decided on what kind of product you're going to build, and that's going to make it much easier than if you come in later and try to make fixes then. I wonder, when you're all getting together and you're trying to come up with these scenarios, trying to figure out negative impacts, what kind of accessibility needs you have, who are the people who are involved in that conversation?
Like, um, you know, you have a team of 50 people who needs to be in the room from the very beginning to start working this out.
[00:15:05] Jonathan: I think it would be the same people who are there for the project planning. like, um, on my team, we have our eng counterparts there, at least the team lead, if there's a lot of them, but you know, if they would go to the project kickoff, uh, they should be there.
you know, we have everybody in there: PM, design, engineers, um, our project manager, like anyone who wants to contribute, uh, should really be there, because the more minds you have on this, the better. and you'll tease out much, much more of all the potential problems, because you have a more, um, diverse set of brains and life experiences to draw from.
And so you'll get closer to that 80% mark, uh, where you can just quickly take a lot of those big items off the table, right?
[00:16:00] Jeremy: Is there any kind of formal process you follow or is it more just, people are thinking of ideas, putting them out there and just having a conversation.
[00:16:11] Jonathan: Yeah, again, it depends which industry you're in, what the risk is. So I previously worked in the healthcare industry, um, and for us to make sure that we get that right, and how it's going to impact the patients, especially since this was cancer care, and they were using our product to get early warnings of adverse effects,
our system of figuring out, you know, if that was going to be an issue was more formalized. Um, in some cases, uh, like actually in healthcare, and especially if it's a device, or in certain software circumstances it's determined by the FDA to be a certain category, you literally have a, uh, governmental version of this.
So the only reason that's there is because it can prevent a lot of harm, right? So, um, that one is enforced, but there's, there's reasons, uh, outside of the FDA to have that exact formalized part of your process. And it can, the size of it should scale depending on what the risk is. So on my team, the risk is, is actually somewhat low.
it's really just part of the planning process. We do have moments, um, when we're, uh, brainstorming what we should do and how the feature will actually work, where we talk about what those risks are and call out the accessibility issues. And then we address those. And then as we get ready to ship, we have another, um, formalized part of the process.
There we'll check if the accessibility has been taken care of and, you know, if everything makes sense as far as, you know, impact to users. So we have those places. but in healthcare it was much stronger, where we had to, um, make sure that we've tested it, it's robust, it's going to work, or we think it's going to work.
Um, we, you know, we do user testing, it has to pass that user testing, things like that, before we're able to ship it, uh, to the end user.
[00:18:12] Jeremy: So in healthcare, you said that the FDA actually provides, is it like a checklist of things to follow where you must have done this? As you're testing and you must have verified these, these things that's actually given to you by the government.
[00:18:26] Jonathan: That's right. Yeah. It's like a checklist and the testing requirements. Um, and there's also levels there. So, I've only done the lowest level. I know there's, I think, like two more levels above that. Um, and again, that's because the risk is higher and higher and there are stricter requirements there, where maybe somebody at the FDA needs to review it at some point.
And, um, so again, like mapping it back to the risk that your company has is really important. understanding that is going to help you avoid, you know, the bad impact and build a better product. And I think that's one of the things I would like to focus on as well.
And I'd like to highlight for your listeners that it's not just about avoiding tragic design, because one thing I've discovered since writing the book and sharing it with a lot of people is that the exact opposite thing usually, you know, in a vast majority of the cases, ends up being a strategically great thing to pursue for the product and the company.
You know, if you think about, that, that example with, with Facebook, okay. You've run into a problem that you want to avoid, but if you actually do a 180 there and you find ways to engage with people, when they're grieving, you find people to, to develop features that help people who are grieving, you've created a value to your users, that you can help build the company off of.
Right. Um, cause they were already building a bunch of joy features, right? Um, you know, and also user privacy. we see Apple doing that really well, where they say, okay, you know, we are going to do our ML on device, we are going to, you know, let users decide on every permission, and things like that.
And that, um, is a strategy. We also see that with something like T-Mobile. when they initially started out, they were like one of the nobody, uh, telecoms in the world. And they said, okay, what are all the unethical bad things that, uh, our competitors are doing? They're charging extra fees, you know, um, they have these weird data caps that are really confusing and don't make any sense, their contracts you get locked into for many years.
They just did the exact opposite of that. And that became their business strategy, and it worked for them. now they're like the top, uh, company. So, um, I think there's a lot of things like that, where you just look at the exact opposite and, one, you get to avoid the bad, tragic design, but you also see, boom, you see an opportunity that, um, becomes a business strategy.
[00:21:03] Jeremy: So, so when you referred to exact opposite, I guess you're, you're looking for the potentially negative outcomes that could happen. there was the Facebook example of, of seeing a photo or being reminded of a really sad event and figuring out can I build a product around, still having that same picture, but recontextualizing it like showing you that picture in a way that's not going to make you sad or upset, but is actually a positive.
[00:21:35] Jonathan: Yeah. I mean, I don't know what the solution was, but one example that comes to mind is some companies now, before Mother's Day, will send you an email and say, hey, this is coming up, do you want us to send you emails about Mother's Day? Because for some people that can be very painful. That's very thoughtful.
Right. And that's a great way to show that you, that you care. Um, but yeah, like, you know, uh, thinking about that Facebook example, like if there's a formalized way to engage with, with grieving, like, I would use Facebook for that. I don't use Facebook very often or almost at all, but you know, if somebody passed away, I would engage right with my, my Facebook account.
And I would say, okay, look, there's like, there's this whole formalized, you know, feature around, you know, uh, and, and Facebook understands grieving and Facebook understands like this w this event and may like smooth that process, you know, creates comfort for the community that's value and engagement. that is worthwhile versus artificial engagement.
That's for the sake of engagement. and that would create, uh, a better feeling towards Facebook. Uh, I would maybe like then spend more time on Facebook. So it's in their mutual interest to do it the right way. Um, and so it's great to focus on these things to avoid harm, but also to start to see new opportunities for innovation.
And we see this a lot already in accessibility, where so many innovations have come from just fixing accessibility issues. Like closed captions: we all use them, on our TVs, in busy crowded spaces, on videos that have no translation for us.

SEO is the same thing. You get a lot of SEO benefit from describing your images and making everything semantic and things like that, and that also helps screen readers. Different innovations have come about because somebody wanted to solve an accessibility need.
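As a toy illustration of that point, here's a rough sketch of the kind of check an accessibility audit might run for images missing alt text. This is a hedged example: a real audit would use an HTML parser or an accessibility linter, not a regex, and the function name here is made up.

```typescript
// Naive sketch: flag <img> tags that lack an alt attribute.
// Missing alt text hurts both screen readers and SEO, so one fix
// serves both audiences. Not production code; illustration only.
function imagesMissingAlt(html: string): string[] {
  const imgTags = html.match(/<img\b[^>]*>/gi) ?? [];
  return imgTags.filter((tag) => !/\balt\s*=/i.test(tag));
}

const page = '<img src="logo.png" alt="Acme logo"><img src="chart.png">';
console.log(imagesMissingAlt(page)); // flags the chart image only
```

The same single pass that helps a screen-reader user also surfaces the descriptive text a search crawler indexes, which is the "accessibility fix doubles as an SEO fix" dynamic described above.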
And then the one I love, I think the most common one, is readability: contrast and text size. Sure, there are some people who won't be able to read it at all, but it hurts my eyes to read bad contrast and bad text size too. So it just benefits everyone and creates a better design.

I'm the accessibility program lead, so I see a lot of our bugs, and there are so many issues caught by our audits and our test cases around accessibility that are just bad design and a bad experience for everyone. And so we're able to fix that, and it's just another driver of innovation. There are a ton of accessibility examples, and I think there are also a ton of these other ethical examples, around avoiding harm, where you can see it as an opportunity area. It's like, oh, let's avoid that. But then if you turn around, you can see there's a big opportunity to create a business strategy out of it.
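The contrast point is actually quantified in the WCAG guidelines: contrast ratio is computed from the relative luminance of the two colors, and WCAG AA asks for at least 4.5:1 for normal body text. A small sketch of that math:

```typescript
// WCAG 2.x relative luminance of an sRGB hex color, then the
// contrast ratio (lighter + 0.05) / (darker + 0.05).
function relativeLuminance(hex: string): number {
  const [r, g, b] = [1, 3, 5].map((i) => {
    const c = parseInt(hex.slice(i, i + 2), 16) / 255;
    // sRGB gamma expansion per the WCAG definition
    return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

function contrastRatio(fg: string, bg: string): number {
  const [lighter, darker] = [relativeLuminance(fg), relativeLuminance(bg)].sort((a, b) => b - a);
  return (lighter + 0.05) / (darker + 0.05);
}

console.log(contrastRatio("#000000", "#ffffff").toFixed(1)); // 21.0: black on white, the maximum
console.log(contrastRatio("#888888", "#ffffff").toFixed(1)); // 3.5: mid-gray on white fails the 4.5:1 AA bar
```

This is the concrete version of "readability benefits everyone": a color pair either clears the 4.5:1 bar for all readers or it doesn't, and fixing it helps low-vision users and tired eyes alike.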
[00:24:37] Jeremy: Can you think of any specific examples where you've seen that? Where somebody doesn't treat it as something to avoid, but actually sees it as an opportunity?
[00:24:47] Jonathan: Yeah, I think the Apple example is a really good one, where from the beginning they saw, okay, in the market there's a lot of abuse of information and people don't like that. So they created a business strategy around that, and it's become a big differentiator for them.

Right, like they have ML on the device, they have a lot of these permission settings. Facebook, meanwhile, was very much focused on using customer data, and a lot of it without really asking permission. And so once Apple said, okay, now all apps need to show what they're tracking,

and ask for permission to do it, a lot of people said no, and that caused about $10 billion of loss for Facebook. And for Apple, they advertise on that now: we're ethical, we source things ethically, and we care about user privacy. And that's a strong position, right?
I think there are a lot of other examples out there, like the accessibility ones I mentioned and others, but they're kind of overflowing, so it's hard to pick one.
[00:25:58] Jeremy: Yeah. And I think what's interesting about that too is, with the example of focusing on user privacy, or trying to be more sensitive around death and things like that, other people in the industry will notice that, and then they may start to incorporate those things into their own products as well.
[00:26:18] Jonathan: Yeah, exactly, like the example with T-Mobile. Once that worked really, really well and they ate up the entire market, all the other companies followed suit, right? Now those data caps are very rare, those surprise fees are a lot rarer.

There are no more deep contracts that lock you in, et cetera, et cetera. A lot of those things have become industry standard now. And it does improve the environment for everyone, because now it becomes a competitive advantage that everybody needs to meet. So I think that's really, really important.
So when you're going through your product's life cycle, you might not have the ability to make these big strategic decisions, like, we don't want to have data caps or whatever. But if you're at that Facebook level and you run into that issue, you could say, well, look, what could we do to address this?

What could we do to help with this and make it a robust feature? When we talk about a lot of these dating apps, one of the problems was a lot of abuse, where women were being harassed, or things happened after a date didn't go well. And so a lot of these dating apps have now differentiated themselves and attracted a lot of that market because they deal with that really well.

It's built into the strategy. It's oftentimes a really good place to start, too, because it's not something we generally think about very well, which means your competitors haven't thought about it very well, which means it's a great place to build product ideas off of.
[00:27:57] Jeremy: Yeah, that's a good point, because so many applications now are social media applications, or messaging applications, or video chat, that sort of thing. I think when those applications were first built, they didn't really think so much about what if someone is sending hateful messages, or sending pictures that people really don't want to see.

People doing abusive things. They just assumed that people would be good to each other and it would be fine. But in the last 10 years, pretty much all of the major social media companies have had to figure out, okay, what do I do if someone is being abusive, and what's the process for that?

And basically they all have to do something now.
[00:28:47] Jonathan: Yeah. And that's a hard thing: if that unethical or bad design decision is deep within your business strategy and your company's strategy, it's hard to undo. Some companies still have to do that very suddenly and deal with it. I know Uber and some other companies have had moments where almost suddenly everything comes to a head and they need to deal with it.

Or, with Twitter now trying to be acquired by Elon Musk, some of those things are coming to light. What I find really interesting is that these areas are really ripe for innovation. So if you're interested in a startup idea, or you're working in a startup, or you're about to start one (there are probably a lot of people out there thinking about side projects right now), a great way to differentiate and win a market against well-established competitors is to say, okay, what are they doing right now that is unethical and core to their business strategy? Doing that differently is really what will help you win that market. And we see that happening all the time. Especially with those established leaders in the market: they can't pivot like you can. So being able to say, we're going to do this ethically,

we're going to do this with tragic design in mind and do the opposite, that's going to help you find your traction in the market.
[00:30:25] Jeremy: Earlier, we were talking about how in the medical field there is specific regulation, or at least requirements, to try and avoid this kind of tragic design. I noticed you also worked for Intuit before, in financial services. I was wondering if there was anything similar there, where the government is stepping in and saying, you need to make sure these things happen to avoid these harmful outcomes.
[00:30:54] Jonathan: Yeah, I don't know. I didn't work on TurboTax; I worked on QuickBooks, which is accounting software for small businesses. And I was surprised: we didn't have a lot of those robust things. We just relied on user feedback to tell us when things were not going well. And I think we should have; I think that was a missed opportunity to

show your users that you understand them and you care, and to find those opportunity areas. So we didn't have enough of that, and there were things we shipped that didn't work correctly right out of the box, which happens, but had a negative impact on users. So it's like, okay, well, what do we do about that?

How do we fix that? And the more you formalize that and make it part of your process, the more you get out of it. Actually, this is a good pausing point, because I think this next bit will affect a lot of engineers listening. If you remember, in the book we talk about the Ford Pinto story, and I want to talk about that story and why I added it to the book.

The reason is that, one, I think this is the thing engineers deal with the most, and designers do too, which is: okay, we see the problem, but we don't think it's worth fixing. That's what we're going to dig into here. So hold on for a second while I explain some history about this car.
The Ford Pinto, if you're not familiar, is notorious because it was designed, built, and shipped with a knowingly unaddressed problem: if it was rear-ended, even at a pretty low speed, it would burst into flames, because the gas tank would rupture, and then oftentimes the doors would get jammed.

And so it became a death trap of fire and caused many deaths and a lot of injuries. In an interview, the CEO at the time said it almost destroyed Ford, very seriously would have brought the whole company down. During the design of it, and design here meaning in the engineering sense, they found this problem, and the engineers came up with their best solution,

which was a rubber block. And the cost was, I forget how many dollars, let's say it was like $9, or let's say $6, but this was back then, and the margin on these cars was very, very thin. It was very important to have the lowest price in the market, because the customers were very price sensitive. So they, they being the legal team, looked at some recent cases where a value had been placed on a life, and started to calculate: here's how many people would sue us, and here's how much it would cost to settle all of those.
And then here's how much it would cost to add this to all the cars. And it was cheaper for them to just go with the lawsuits, they found. I think why this is so important is because of the two things that happened afterward. One, they were wrong: it affected a lot more people, and the lawsuits were for a lot more money.

And two, after all this blew up and was about to destroy the company, they went back to the drawing board, and what did the engineers find? They found a cheaper solution. They were able to rework that rubber block, get it under the margin, and hit the mark they wanted to.
There's a lot of focus on the first part, because it's so unethical to put a value on life, do that calculation, and be like, we're willing to have people die. In some industries it's really hard to get away with that, but it's also very easy to get into it.

It's very easy to get lulled into this sense of, oh, we're just going to crunch the numbers and see how many users it affects, and we're okay with that. Versus when you have principles, and you have a hard line, and you care a lot more than you're expected to, and you really push yourself to create a more ethical, safer design that avoids tragic design, then there's a solution out there.

You actually get to innovation, you actually get to solving the problem, versus when you just rely on, oh, the cost-benefit analysis we did says it's going to take an engineer a month to fix this, and so on. If you have those values, if you have those principles and you say, you know what, we're not okay shipping this, then you'll find it:

okay, there's a cheaper way to fix this, there's another way we could address this. And that happens so often. I know a lot of engineers deal with hearing, oh, this is not worth our time to fix. And that's why you need those principles, because oftentimes you don't see the solution, but it's right there, just outside the edge of your vision.
[00:36:12] Jeremy: Yeah. With the Pinto example, I'm just picturing, obviously there wasn't JIRA back then, but you can imagine somebody filing an issue: hey, when somebody hits the back of the car, it's going to catch on fire. And going, well, how do I prioritize that? Right? Is this a medium ticket?

Is this a high ticket? It just seems insane, right? That you could make the decision like, oh no, this isn't that big an issue, we can move it down to low priority and ship it.
[00:36:45] Jonathan: Yeah. And that's really what principles do for you, right? They help you make the tough decisions. You don't need a principle for an easy one. And that's why I really encourage people in the book to come together as a team and come up with your guiding principles. That way it's not a discussion point every single time.

It's like, hey, we've agreed that this is something we're going to care about, this is something we're going to stop and fix. One of the things I really like about my team at Google is that product excellence is very important to us, and there are certain things we're okay with letting slip and fixing in a next iteration.

And obviously we make sure we actually do that. So it's not like we always address everything, but because it's one of our principles, we care more. We take on more of those tickets, and we make sure they're fixed before we ship.

And it shows the end user that this company cares and has quality. So you need a principle to guide you through those difficult things that aren't obvious on a decision-to-decision basis but that strategically get you somewhere important. It's like design debt or technical debt, where it's like, this chunk of code should be optimized. Nah. But grouped together with a hundred of those decisions?

Yeah, it's going to slow down every single project from here on out. So that's why you need those principles.
[00:38:24] Jeremy: So in the book there are a few examples of software in healthcare. And when you think about principles, you would think generally everybody on the team would be on board: we want to give whatever patient is involved good care, we want them to be healthy, we don't want them to be harmed.

Given that, and because you interviewed multiple people in the book and have a few different case studies, why do you think medical software in particular seems to have such poor UX, or so many issues?
[00:39:08] Jonathan: Yeah, that's a complicated topic. I would summarize it with maybe three different reasons. One, which I think is maybe a driving factor for some of the others, is that the way the medical industry works, the person who purchases the software is not the end user. It's not like you have doctors and nurses voting on which software to use.

And so oftentimes it's more of a sales deal, and then it just gets pushed out, and they also have to commit to these things. The software is very expensive, and in the early days especially, it had to be installed and maintained, and there had to be training.

So there was a lot of money to be made in that software, and the investment from the hospital was a lot, so they can't just say, oh, actually we don't like this one, we're going to switch to the next one. Once it's sold, it's really easy to just keep that customer.

There's very little incentive to really improve it unless you're selling them a new feature. So there are a lot of feature add-ons, because they can charge more for those, but improving the experience and that kind of thing, there's less of that. I think there's also just generally a lot less understanding of design in that field.

And, because there are sort of traditions around how things are done, they end up putting a lot of the pressure and responsibility on the end individuals. You've heard recently of that nurse who made a medication error and is going to jail for it. And oftentimes we blame that end person.
The nurse gets all the blame, or the doctor gets all the blame. Well, what about the software that made that confusing? What about the medication that looks exactly like this other medication? Or what about the pump tool where you have to type everything in very specifically, when the nurses are very busy?

They're doing a lot of work, they're doing 12-hour shifts, they're dealing with lots of different patients and a lot of changing things, and on top of that they have to worry about typing something a specific way. And yet when those problems happen, what happens? They don't go and redesign the devices. It's more training, more training, more training, and people can only absorb so much training.

So I think that's part of the problem: there's no desire to change, and they blame the wrong person, the end user. Lastly, I think it is starting to change. Because the government is pushing healthcare records to be more interoperable, meaning I can take my health records anywhere, a lot of the power comes from where the data is.

And so I'm hoping that as the government and people and initiatives push these big companies, like Epic, to be more open, things will improve. One, because they'll have to keep up with their competitors, and because more competitors will be out there improving things. I think the know-how is out there, but because there's no incentive to change, no turnover in systems, and the blaming of the end user,

we're not going to see a change anytime soon.
[00:42:35] Jeremy: that's a, that's a good point in terms of like, it, it seems like even though you have all these people who may have good ideas may want to do a startup, uh, if you've got all these hospitals that already locked into this very expensive system, then yeah. Where's, where's the room to kind of get in there in and have that change.
[00:42:54] Jonathan: yeah.
[00:42:56] Jeremy: Another thing you talk about in the book is how, when you're in a crisis situation, the way a user interacts with something is very different. I wonder if you have any specific examples for software where that can happen.
[00:43:15] Jonathan: Yeah. Designing for crisis is a very important part of every piece of software, because it might be hard for you to imagine being in that situation, but it definitely will still happen. One example that comes to mind: let's say you're working on cloud software, like AWS or Google Cloud.

There are definitely use cases and user journeys in your product where somebody would be very panicked. If you've ever been on call when something goes south and it's a big deal, you don't think right. When we're in crisis, our brains go into a totally different mode, that fight-or-flight mode.

We don't think the way we normally do. It's really hard to read and comprehend, and we might not make the right decisions. So thinking about that, going back to the cloud software example:

are you relying on the user reading a bunch of text about this button, or is it very clear, from the exact button copy you've crafted, how big it is, and where it sits in relation to other content, what exactly it does? Is it going to shut down the instance immediately, or at a delay? All those little decisions are really impactful.
And when you run them through the furnace of a user journey that involves a really urgent situation, you'll obviously improve that. You'll start to see problems in your UI that you hadn't noticed before, or problems in the way you're implementing things, because you're seeing it from a different angle.

And that's one of the great things about the systems in the book, around thinking about how things could go wrong and designing for crisis: it makes you think of new use cases, which makes you think of new ways to improve your product.

The improvement you make so that something is obvious enough to do in a crisis helps everyone, even when they're not in a crisis. So that's why it's important to focus on those things.
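One concrete version of crisis-proofing a destructive button, sketched below with made-up names, is the "type the name to confirm" pattern that GitHub and the big cloud consoles use, so that a panicked skim-read can't destroy the wrong resource:

```typescript
// Sketch of a "type the resource name to confirm" guard for a
// destructive action. All names here are illustrative, not a real API.
interface ConfirmState {
  enabled: boolean; // can the destructive button be clicked yet?
  label: string;    // explicit copy: verb + object, not just "OK"
}

function confirmShutdown(instanceName: string, typed: string): ConfirmState {
  // The button stays disabled until the user re-types the exact name,
  // forcing a moment of deliberate attention even under fight-or-flight.
  const enabled = typed.trim() === instanceName;
  return {
    enabled,
    // The label says exactly what will happen, and to what.
    label: `Shut down ${instanceName} immediately`,
  };
}

console.log(confirmShutdown("prod-db-1", "prod-db-1").enabled); // true
console.log(confirmShutdown("prod-db-1", "prod-db-2").enabled); // false
```

The design choice is the point: instead of relying on the user to read warning text, the interaction itself encodes the safety check, which is exactly the kind of clarity that helps in calm moments too.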
[00:45:30] Jeremy: And for someone working on these products, it's hard to trigger that feeling of crisis if there isn't actually a crisis happening. So I wonder if you could talk a little bit about how you try to design for that when it's not really happening to you, and you're just trying to imagine what it would feel like.
[00:45:53] Jonathan: Yeah, you're never really going to be able to fully do that, so some of it has to be simulated. One of the ways we're able to simulate it is through what we call cognitive load, which is one of the things that happens during a crisis, but also happens when someone's very distracted, using your product while multitasking.

They have a bunch of kids, a toddler constantly pulling on their arm, and they're trying to get something done in your app. One of the ways that has been shown to help test for that in user research is the foot-tapping method: you have the user do something else on the side, a second task,

something manageable, like tapping their feet or their hands, while they also have to do your task. You can build up those extra things they have to do while they're finishing the task you've given them, and that's one way to simulate cognitive load.

Some of the other things are really just listening to users' stories: finding out, okay, this user was in crisis, great, let's talk to them and interview them about it, if it was fairly recent, within the past six months or so. But sometimes you can't, and you just have to run through it and do your best.

And for those black swan events: even if you were able to put yourself into that exact position and panic, which you're not, that would still only be your experience. You wouldn't know all the different ways other people could experience it.

So at some point you're going to need to extrapolate a little bit, from what you know to be true, and from user testing and things like that, and then wait for real data.
[00:47:48] Jeremy: You have a chapter in the book on design that angers, and there were a lot of examples in there of things that are just annoying, or that make you upset, while you're using software. For our audience, could you share a few of your favorites, or the ones that really stand out?
[00:48:08] Jonathan: My favorite one is Clippy, because I remember growing up, writing documents, and Clippy popping up. And I was reading an article about it, and obviously, just like everybody else, I hated it. As a little character it was fun, but when you're actually trying to get some work done, it was very annoying.

And then I remember, a while later, reading this article about how much work the team put into Clippy. If you think about it now, it had a lot of what we're playing with in AI just now, around natural language processing: understanding what type of thing you're writing and coming up with contextualized responses. It was very advanced for the time, with animation triggers added on top of all that.

And they had done a lot of user research. I was like, what? You did research, and it still got that reaction? And by the way, I love how they later took Clippy out and highlighted that as one of the features of the next version of the Office software.

But I love that example because I see myself in it. You have a team doing something technologically amazing, doing user research, putting out a very polished product, but totally missing the mark. A lot of products do that, a lot of teams do that. And why is that? It's because they're putting the business's needs, or the team's needs, first, and the user's needs second.
And whenever we do that, whenever we put ourselves first, we become a jerk, right? If you're in a relationship and you're always putting yourself first, that relationship is not going to last long, or it's not going to go very well. And yet we do that in our relationship with users, where we're constantly going, hey, what does the business want?

The business wants users to not cancel here, so let's make it very difficult for people to cancel. That's a great way to lose customers. That's a great way to create dissonance with your users. Whereas if you focus on what you need to accomplish together with the user, and then work backwards from that,

you lower your chances of missing it, of getting it wrong, of angering your users. And you sometimes have to be very real with yourselves and your team, and I think that's really hard for a lot of teams, because we don't want to look bad.

But what I've found is that those are the people who actually get promoted. If you look at the managers and directors, those are the people who can be brutally honest, who can say, I don't think this is ready, I don't think this is good. I've done that in front of our CEO and things like that,

and I've always had really good responses, like, we really appreciate that you can call it out and call it like it is: hey, this is what we're seeing from users, maybe we shouldn't do this at all. At Google, that's one of the criteria we have for our software engineers and designers: being able to spot things we should stop doing.

And I think that's really important for the development of a senior engineer, being able to say, hey, I would want this project to work, but in its current form it's not good, and being able to call that out.
[00:51:55] Jeremy: Do you have any specific examples where something was very obvious to you, but wasn't to the rest of the team or to a lot of other people?
[00:52:06] Jonathan: Yeah, so here's an example from early in my career, when I finally got to lead a whole project. We were redesigning our business micro-site, I got assigned two engineers and another designer, and I got to lead the whole thing. I was like, this is my chance,

right? We had a very short timeline as well, and I put together all these designs. One of the things we aligned on at the time seemed really cool: I put together this design for the contact form where, essentially, it looks like a letter you're filling in, kind of like an ad-lib.

And give me a little bit of leeway here, because this was like 10 years ago. It was like a letter, and you would address it to our company, and it had all the things we wanted to get out of you, like your company size and your team, so our sales team could then reach out to you as a customer.

I designed it and showed it to the team, and everybody loved it. My manager signed off on it, all the engineers signed off on it, even though we had a short timeline; they were like, we don't care, that's so cool, we're going to build it. But as I put it through that test of, does this make sense for what the user wants, the answer kept coming back no.

So I had to go back in and pitch everybody and argue with them about not doing the cool idea I had wanted to do. Eventually they came around, and once we launched it, that form performed really well. And I think about: what if users had had to go through that really wonky thing?

The whole point of the website is to get people to this contact form. It should be as easy and as straightforward as possible. So I'm really glad we did that. And I can think of many, many more situations like that, where we had to be brutally honest with ourselves: this isn't where it needs to be, or this isn't what we should be doing.

And we can avoid a lot of harm that way, too, by saying, I don't think this is what we should be building.
[00:54:17] Jeremy: So in the case of this form, was it more like you had a bunch of drop-downs or selections where you would say, okay, these are the types of information that I want to get from the person filling out the form, as a company, but you weren't looking at it as, as the person filling out the form, this is going to be really annoying? Was that kind of the case?
[00:54:38] Jonathan: Exactly, exactly. So their experience would have been, they come to the end of this page, or to contact us, and it's a letter to our company. And we're essentially putting words in their mouth, because they're the ones filling out the letter. Um, and then you have to read and understand what that part of the page was asking you, versus a form, where it's very easy.
Well-known, bam, you're on this page, you're interested, so get them in there. So we were able to decide against that. And we also had to say no to a few other things, but we said yes to some things that were great, like responsive design: making sure our website worked in every single use case, which was not a hard requirement at the time, but was really important to us. It ended up helping us a lot, because we had a lot of business people on their phones, on the go, who wanted to check in and fill out the form and learn about us.
So that sales micro-site did really well, because I think we made the right decisions in all those areas, and those general principles helped us say no to the right things. Even though it was a really cool thing that probably would have looked great in my portfolio for a while, it just wasn't the right thing to do for the goal we had.
[00:56:00] Jeremy: So did it end up being more like just a text box? You know, a contact us fill in. Yeah.
[00:56:06] Jonathan: You know, with usability, if someone's familiar with something, it may be tired, everybody does it, but that means everybody knows how to use it. So usability constantly has that problem of innovation being less usable. Um, and sometimes the trade-off is worth it, because you want to attract people with the innovation, and they'll get over that hump with you because the innovation is interesting.
So sometimes it's worth it and sometimes it's not, and I'd say most times it's not. Um, and so you have to figure out when it's time to innovate and when it's time to do what's tried and true. And on a business microsite, I think it's time to do tried and true.
[00:56:51] Jeremy: So in your research for the book, and all the jobs you've worked previously, are there certain mistakes or just UX things that you've noticed that you think our audience should know about?
[00:57:08] Jonathan: I think dark patterns are one of the most common tragic design mistakes we see, because again, you're putting the company first and the user second. And if you go to darkpatterns.org, you can see a great list. There are a few other sites that have nice lists of them, and Vox Media actually did a nice video about dark patterns as well.
So it's gaining a lot of traction. But, you know, things like trying to cancel your Comcast service or your Amazon service, it's very hard. I think I wrote this in the book, but I literally researched the fastest way to remove your Comcast account.
I prepared everything. I did it through chat, because that was the fastest way. Not to mention, finding chat, by the way, was very, very hard. Even though I went in knowing I was going to do it through chat, it took me a while just to find the chat.
So once I finally found it, from that point to having them finally delete my account was about an hour. And I knew what to do going in, saying all the things to keep them from bothering me. So that's on purpose. They've purposely done that, 'cause it's easier to just say, fine, I'll take the discount thing
you're throwing in my face at the last second. And it's almost become a joke now that you have to cancel your Comcast every year so you can keep the costs down. Um, and Amazon too: trying to find "delete my account" is so buried. They do that on purpose. And a lot of companies will do things like make it very easy to sign up for a free trial, and hide the fact that they're going to charge you for a year,
hide the fact that they're automatically going to bill you, not remind you when it's about to expire, so they can surprise you or get you to forget about the billing subscription. Or, if you've ever gotten Adobe software, they're really bad at that. They trick you into what looks like a monthly subscription, but actually you've committed to a year.
And if you want to cancel early, they'll charge you like 80% of the year, and it's really hard to contact anybody about it. So it happens quite often. And the more you read into those different patterns, the more you'll start to see them everywhere. And users are really catching on to a lot of those things, and are responding
to those in a very negative way. Um, we recently looked at a case study where a company had a free trial, with the standard free trial kind of design. And then their test was really just focused on, hey, we're not going to scam you, if I had to summarize the entire direction of the second version. It was like, cancel any time,
here's exactly how much you'll be charged, and it'll be on this date, and five days before that we'll remind you to cancel, and all this stuff. That ended up performing about 30% better than the other one. And the reason is that people have been burned by that trick so much that every time they see a free trial, they're like, forget it,
I don't want to deal with all this trickery, I didn't even care that much about trying the product. Versus: we're not going to trick you, we really want you to actually try the product, and we'll make sure that if you don't want to move forward, you have plenty of time and plenty of chances to leave. And people respond to that now.
So that's what we talked about earlier in the show of doing the exact opposite. This is another example of that.
[01:00:51] Jeremy: Yeah, because I think a lot of people are familiar with, like you said, trying to cancel Comcast, or trying to cancel their New York Times subscription. And everybody gets so mad at the process, but I think they also maybe assume that it's a positive for the company. What you're saying is that maybe that's actually not in the company's best interest.
[01:01:15] Jonathan: Yeah. Oftentimes what we find with these dark patterns, or these unethical decisions, is that they look successful because, when you look at the most immediate metric you can track, it looks like it worked. Right? Like, for those free trials, it's: okay, we implemented all this trickery and our subscriptions went up.
But if you look at the end result, which is farther along in the process, it's always a lot harder to track that impact. But we all know it to be true when we talk to each other about these different examples: we all hate that.
And we all hate those companies and we don't want to engage with them, and sometimes we don't use the products at all. So yeah, it's one of those things that has a very real impact, but one that's harder to track. And oftentimes that's how these patterns become so pervasive: oh, page views went up, this was high engagement. But it was page views because people were refreshing the page trying to figure out where the heck to go. Right? So oftentimes they're actually less effective, but they're easier to track.
[01:02:32] Jeremy: So I think that's a good place to wrap things up. But if people want to check out the book, or learn more about what you're working on, or your podcast, where should they head?
[01:02:44] Jonathan: Um, yeah, just check out tragicdesign.com, and you can find our podcast on any podcasting software, just search "Design Review podcast".
[01:02:55] Jeremy: Jonathan, thank you so much for joining me on software engineering radio.
[01:02:59] Jonathan: alright, thanks Jeremy. Thanks everyone. And, um, hope you had a good time. I did.
This episode originally aired on Software Engineering Radio.
Randy Shoup is the VP of Engineering and Chief Architect at eBay. He was previously the VP of Engineering at WeWork and Stitch Fix, a Director of Engineering at Google Cloud where he worked on App Engine, and a Chief Engineer and Distinguished Architect at eBay in 2004.
Related Links: @randyshoup
The Epic Story of Dropbox’s Exodus from the Amazon Cloud Empire
Transcript: [00:00:00] Jeremy: Today I'm talking to Randy Shoup. He's the VP of engineering and chief architect at eBay.
[00:00:05] Jeremy: He was previously the VP of engineering at WeWork and Stitch Fix, and he was also a chief engineer and distinguished architect at eBay back in 2004. Randy, welcome back to Software Engineering Radio. This will be your fifth appearance on the show. I'm pretty sure that's a record.
[00:00:22] Randy: Thanks, Jeremy, I'm really excited to come back. I always enjoy listening to, and then also contributing to software engineering radio.
Back at QCon 2007, you spoke with Markus Völter, the founder of SE Radio, and you were talking about developing eBay's new search engine at the time.
[00:00:42] Jeremy: And kind of looking back, I wonder if you could talk a little bit about how eBay was structured back then, maybe organizationally, and then we can talk a little bit about the, the tech stack and that sort of thing.
[00:00:53] Randy: Oh, sure. Okay. Yeah. Um, so eBay started in 1995, just to orient everybody. Same as the web, same as Amazon, same as a bunch of stuff. So eBay was actually almost 10 years old when I joined, which seemed very old at the time. Um, so what was eBay's tech stack like then? eBay has gone through five generations of its infrastructure.
It was transitioning between the second and the third when I joined in 2004. Um, so the first iteration was Pierre Omidyar, the founder, over the three-day Labor Day weekend in 1995, playing around with this new cool thing called the web. He wasn't intending to build a business. He was just playing around with auctions and wanted to put up a webpage.
So he had a Perl backend, and every item was a file, and it lived on this little 486 tower or whatever he had at the time. Um, so that wasn't scalable, and it wasn't meant to be. The second generation of eBay's architecture was what we called V2, very creatively. That was a C++ monolith: an ISAPI DLL which, at its worst, grew to 3.4 million lines of code in that single DLL, and basically in a single class. Not just in a single repo or a single file, but in a single class.
So that was very unpleasant to work in, as you can imagine. Um, eBay had about a thousand engineers at the time, and they were really stepping on each other's toes and not able to make much forward progress. So starting in, I want to call it 2002, two years before I joined, they were migrating to the creatively named V3. The V3 architecture was Java, and,
you know, not microservices, we didn't even have that term; it was mini applications. So I'm actually going to take a step back. V2 was a monolith: all of eBay's code in that single DLL, and that was buying and selling and search and everything. And then we had two monster databases, a primary and a backup, big Oracle machines on hardware bigger than refrigerators, and that ran eBay for a bunch of years. Before we changed the upper part of the stack, we chopped up that single monolithic database into a bunch of domain-specific, or entity-specific, databases, right?
So a set of databases around users, sharded by user ID (I can talk about all that if you want); items, again sharded by item ID; transactions, sharded by transaction ID... I think when I joined, there were several hundred instances of Oracle databases spread around, but still that monolithic front end.
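The entity-aligned sharding Randy describes, where each entity type gets its own suite of databases and a record's home database is chosen from its ID, can be sketched roughly like this. The shard counts and database names here are made up for illustration; eBay's actual routing layer was certainly more involved.

```python
# Rough sketch of entity-aligned sharding. Each entity type (users,
# items, transactions) has its own suite of databases, and a record's
# shard is chosen deterministically from its ID. Counts are hypothetical.
SHARD_COUNTS = {"users": 16, "items": 32, "transactions": 16}

def shard_for(entity: str, entity_id: int) -> str:
    """Return the name of the database holding this entity instance."""
    n = SHARD_COUNTS[entity]
    return f"{entity}_db_{entity_id % n}"

# Every application that touches users routes through the same rule,
# which is how many different apps ended up sharing the same databases.
print(shard_for("users", 123456))  # users_db_0
print(shard_for("items", 98765))   # items_db_13
```

The core idea is just deterministic modulo-by-key routing; real systems layer logical-to-physical host mapping and rebalancing on top of it.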
And then in 2002, I wanna say, we started migrating to that V3 I was describing. So that was a rewrite in Java, again as mini applications. So you take the front end, and instead of having it be one big unit, it was these EAR files, if people remember back to those days in Java. Um, 220 different ones of those.
So, for example, one of them would be the search application, and it would do all the search-related stuff, the handful of pages around search; ditto for the buying area, ditto for the checkout area, ditto for the selling area...
220 of those. Um, and that was, again, vertically sliced domains. And then the relationship between those V3 applications and the databases was many-to-many. So, many of those applications interact with items, so they would interact with the item databases. Many of them would interact with users,
and so they would interact with the user databases, et cetera. Uh, happy to go into as much gory detail as you want about all that. But that's where we were: in the transition period between the V2 monolith and the V3 mini applications in 2004. I'm just going to pause there, so let me know where you want to take it.
[00:05:01] Jeremy: Yeah. So you were saying that it started as Perl, then it became C++, and it's kind of interesting that you said it was all in one class, right? So, wow, that's gotta be a gigantic...
[00:05:16] Randy: I mean, completely brutal. Yeah. 3.4 million lines of code. Yeah. We were hitting compiler limits on the number of methods per class.
[00:05:22] Jeremy: Oh my gosh.
[00:05:23] Randy: I'm, uh, uh, scared that I have that. I happen to know that at least at the time, uh, Microsoft allowed you 16 K uh, methods per class, and we were hitting that limit.
So, uh, not great.
[00:05:36] Jeremy: So it's just kind of interesting to think about how do you walk through that code, right? You have, I guess you just have this giant file.
[00:05:45] Randy: Yeah. I mean, there were different methods, but yeah, it was a monolith, it was a spaghetti mess. Um, and as you can imagine... Amazon went through a really similar thing, by the way. So this wasn't unique. I mean, it was bad, but we weren't the only people making that mistake.
Um, and just like Amazon, who were doing like one update a quarter (laughs) at that period, around 2000, we were doing something really similar, very, very slow updates. And when we moved to V3, the idea was to be able to make changes much faster. And we were very proud of ourselves that, starting in 2004, we upgraded the whole site every two weeks.
And we didn't have to do the whole site, but each of those individual applications I was mentioning, those 220 applications, would roll out on this biweekly cadence. Um, and they had interdependencies, so we rolled them out in dependency order. Anyway, lots and lots of complexity associated with that.
Um, yeah, there you go.
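The dependency-ordered rollout Randy mentions, where interdependent applications have to deploy in an order that respects who depends on whom, is essentially a topological sort. A minimal sketch, using a hypothetical dependency graph (the app names are invented; eBay's real deploy tooling is not described in detail here):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical app dependency graph: each app maps to the set of apps
# it depends on, which must roll out before it does.
deps = {
    "search": {"item-service"},
    "checkout": {"item-service", "user-service"},
    "item-service": set(),
    "user-service": set(),
}

# static_order() yields every app with its dependencies first.
rollout_order = list(TopologicalSorter(deps).static_order())
assert rollout_order.index("item-service") < rollout_order.index("search")
print(rollout_order)
```

With 220 applications, the same idea just scales up the graph; cycles in the graph would be a deploy-time error, which is exactly the kind of complexity Randy is alluding to.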
[00:06:51] Jeremy: The V3 that was written in Java, I'm assuming this was a complete rewrite? You didn't use the C++ code at all?
[00:07:00] Randy: Yeah. And, um, we migrated page by page. So in the transition period, which lasted probably five years, in the beginning all pages were served by V2, and in the end all pages were served by V3. Over time you iterate: you rewrite, and maintain in parallel, the V3 version of the XYZ page and the V2 version of the XYZ page.
Um, and then when you're ready, you start to test at low percentages of traffic: what does V3 look like? Is it correct? And when it isn't, you go and fix it. Ultimately you migrate the traffic over to be fully in the V3 world, and then you remove, or comment out, or whatever,
the code that supported that in the V2 monolith.
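Serving a small percentage of traffic from the new system and ramping up, as described here, is commonly done by hashing a stable request attribute into a bucket. This is a generic sketch of the technique, not eBay's actual mechanism:

```python
import hashlib

def serve_from_v3(user_id: str, v3_percent: int) -> bool:
    """Deterministically bucket a user into 0-99 by hashing, so the
    same user always sees the same version during the migration."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < v3_percent

# Ramp: start at a low percentage, verify correctness, then increase.
on_v3 = sum(serve_from_v3(f"user-{i}", 10) for i in range(10_000))
print(f"{on_v3 / 100:.1f}% of users routed to V3")  # close to 10%
```

Hashing (rather than random choice per request) keeps each user's experience consistent, which matters when the old and new pages differ in behavior.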
[00:07:54] Jeremy: And then you had mentioned using Oracle databases. Did you have one set for V2 and a separate set for V3 that you were trying to keep in sync?
[00:08:02] Randy: Oh, great question. Thank you for asking that. No, we shared the databases. Um, so again, as I mentioned, we had pre-broken-up the databases, "pre-demonolithed" them, that's my technical term, starting in, let's call it 2000. Actually, I'm almost certain it's 2000, 'cause we had a major site outage in 1999, which everybody who was there at the time still remembers.
It wasn't me, I wasn't there at the time, but you can look it up. Anyway, so yeah, starting in 2000, we broke up that monolithic database into those entity-aligned databases I was telling you about before. Again, one set for items, one set for users, one set for transactions, dot dot dot. Um, and those databases were shared:
shared between V2 using those things and V3 using those things. And so we completely decoupled the rewrite of the database, the data storage layer, from the rewrite of the application layer, if that makes sense.
[00:09:09] Jeremy: Yeah. So you had V2 connecting to these individual Oracle databases, which you said were for different types of entities, like items and users and things like that. But it was a shared database situation, where V2 was connected to the same databases as V3. Is that right?
[00:09:28] Randy: Correct. And also within V3, different V3 applications were connecting to the same databases. Again, anybody who used the user entity, which is a lot of them, was connecting to the user suite of databases, and anybody who used the item entity, which again is a lot, was connecting to the item databases, et cetera.
So yeah, it was this many-to-many relationship between applications in the V3 world and databases.
[00:10:00] Jeremy: Okay. Yeah, I think I, I got it because
[00:10:03] Randy: It's easier with a diagram.
[00:10:04] Jeremy: yeah. W 'cause when you, when you think about services now, um, you think of services having dependencies on other services. Whereas in this case you would have multiple services that rather than talking to a different service, they would all just talk to the same database.
They all needed users. So they all needed to connect to the user's database.
[00:10:24] Randy: Right exactly. And so, uh, I don't want to jump ahead in this conversation, but like the problems that everybody has, everybody who's feeling uncomfortable at the moment. You're right. To feel uncomfortable because that wasn't unpleasant situation and microservices, or more generally the idea that individual services would own their own data.
And only in the only are interactions to the service would be through the service interface and not like behind the services back to the, to the data storage layer. Um, that's better. And Amazon discovered that, you know, uh, lots of people discovered that around that same, around that same early two thousands period.
And so yeah, we had that situation at eBay at the time. Uh, it was better than it was before. Right, right. Better than a monolithic database and a monolithic application layer, but it definitely also had issues. Uh, as you can imagine,
[00:11:14] Jeremy: you know, thinking about back to that time where you were saying it's better than a monolith, um, what were sort of the trade-offs of, you know, you have a monolith connecting to all these databases versus you having all these applications, connecting to all these databases, like what were the things that you gained and what did you lose if that made sense?
[00:11:36] Randy: Hmm. Yeah. Well, I mean, why we did it in the first place is develop is like isolation between development teams right? So we were looking for developer productivity or the phrase we used to use was feature velocity, you know, so how quickly would we be able to move? And to the extent that we could move independently, you know, the search team could move independently from the buying team, which could move independently from the selling team, et cetera.
Um, that was what we were gaining. Um, what were we losing? Uh, you know, when you're in a monolith situation, If there's an issue, you know, where it is, it's in the monolith. You might not know where in the monolith. Um, but like there's only one place that could be. And so an issue that one has, uh, when you break things up into smaller units, uh, especially when they have this, you know, shared, shared mutable state, essentially in the form of these databases, like who changed that column?
What's the deal? Uh, actually, we did have a solution for that, or something that really helped us. More than 20 years ago, we had something we would now call distributed tracing. I actually talked about this way back in that 2007 episode, 'cause it was pretty cool at the time. Just like the spans one would create using modern distributed tracing, OpenTelemetry or any of the distributed tracing vendors,
we didn't use the term span, but it was that same idea, and the goal was the same: to debug stuff. So every time we were about to make a database call, we would log that we were about to make the call, and then it would happen,
and then we would log whether it was successful or not, we could see how long it took, et cetera. Um, and so we built our own monitoring system, which we called Central Application Logging, or CAL, totally proprietary to eBay. I'm happy to talk about whatever gory details you want to know, but it was pretty cool, certainly way back in 2000.
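CAL itself is proprietary, but the before/after logging pattern Randy describes, log the impending database call, then log its outcome and duration, can be sketched as a context manager. Names and log format here are invented for illustration:

```python
import time
from contextlib import contextmanager

@contextmanager
def db_call(app: str, database: str, statement: str):
    """Log before a database call, then log status and duration after:
    the same shape as a modern tracing span, without the term."""
    print(f"[{app}] -> {database}: {statement}")
    start = time.monotonic()
    status = "ERROR"
    try:
        yield
        status = "OK"
    finally:
        ms = (time.monotonic() - start) * 1000
        print(f"[{app}] <- {database}: {status} ({ms:.1f} ms)")

with db_call("search-app", "items_db_3", "SELECT ..."):
    pass  # the real database query would execute here
```

Aggregating these log lines across applications is what lets you answer both of Randy's questions: which apps talk to a given database, and which databases a given app talks to.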
That was our mitigation against the thing I'm describing: when something is weird in the database, we can back up and figure out where it might have happened. Or things are slow, what's the deal? 'Cause sometimes the database is slow for reasons.
And from an application perspective, I'm talking to 20 different databases, but things are slow: which one is it? CAL helped us figure out both sides of that. What applications are talking to what databases and what backend services, to debug and diagnose from that perspective.
And then, for a given application, what databases and backend services are you talking to, and debug that. And then we had monitors on those things, and we would notice when databases were seeing a lot of errors, or when databases were starting to run slower than they used to.
Um, and then we implemented what people would now call circuit breakers, where we would notice that, oh, everybody who's trying to talk to database 1, 2, 3, 4 is seeing it slow down. I guess 1, 2, 3, 4 is unhappy. So now flip everybody to say, don't talk to 1, 2, 3, 4, and, you know, just that kind of stuff.
You're not going to be able to serve some requests, but whatever, that's better than stopping everything. So I hope that makes sense. All these modern resilience techniques, we had our own proprietary names for them, but we implemented a lot of them way back when.
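A circuit breaker of the kind described, stop sending traffic to a backend that is clearly unhappy, then probe it again after a cooldown, can be sketched in a few lines. The thresholds and the half-open probe here are the generic textbook policy, not eBay's actual one:

```python
import time

class CircuitBreaker:
    """Minimal sketch: after `threshold` consecutive failures, stop
    calling the backend for `cooldown` seconds, then allow a probe."""

    def __init__(self, threshold: int = 5, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True            # closed: traffic flows normally
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None  # half-open: let a probe through
            self.failures = 0
            return True
        return False               # open: fail fast, skip the database

    def record(self, success: bool) -> None:
        if success:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()

breaker = CircuitBreaker(threshold=3, cooldown=30.0)
for _ in range(3):
    breaker.record(False)  # three straight failures open the breaker
print(breaker.allow())     # False: callers fail fast instead of hanging
```

Failing fast on a known-bad database is the trade Randy describes: you drop some requests, but you stop the slowness from cascading through everything that depends on that database.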
[00:15:22] Jeremy: Yeah. And, and I guess just to contextualize it for the audience, I mean, this was back in 2004. Oh it back in 2000.
[00:15:32] Randy: Again, because we had this, sorry to interrupt you because we have, the problem is that we were just talking about where application many applications are talking to many services and databases and we didn't know what was going on. And so we needed some visibility into what was going on.
Sorry, go ahead.
[00:15:48] Jeremy: yeah. Okay. So all the way back in 2000, there's a lot less, Services out there, like nowadays you think about so many software as a service products. if you were building the same thing today, what are some of the services that people today would just go and say like, oh, I'll just, I'll just pay for this and have this company handle it for me. You know, that wasn't available, then
[00:16:10] Randy: sure. Well, there. No, essentially, no. Well, there was no cloud cloud didn't happen until 2006. Um, and there were a few software as a service vendors like Salesforce existed at the time, but they weren't usable in the way you're thinking of where I could give you money and you would operate a technical or technological software service on my behalf.
Do you know what I mean? So we didn't have any of the monitoring vendors. We didn't have any of the stuff today. So yeah. So what would we do, you know, to solve that specific problem today? Uh, I would, as we do today, I would, uh, instrument everything with open telemetry because that's generic. Thank you, Ben Siegelman and LightStep for starting that whole open sourcing process, uh, of that thing and, and, um, getting all the vendors to, you know, respect it.
Um, and then I would shoot, you know, for my backend, I would choose one of the very many wonderful, uh, you know, uh, distributed tracing vendors of which there are so many, I can't remember, but like LightStep is one honeycomb... you know, there were a bunch of, uh, you know, backend, um, distributed tracing vendors in particular, you know, for that.
Uh, what else do you have today? I mean, we could go on for hours on this one, but like, we didn't have distributed logging or we didn't have like logging vendors, you know? So there was no, uh, there was no Splunk, there was no, um, you know, any, any of those, uh, any of the many, uh, distributed log, uh, or centralized logging vendor, uh, vendors.
So we didn't have any of those things. We did it like cavemen, you know: we built our own data centers, we racked our own servers, we installed all the operating systems on them. Uh, by the way, we still do all that, because it's way cheaper for us, at our scale, to do that.
But happy to talk about that too. Anyway, I don't know if this is where you want to go, but in 2022 the software developer has this massive menu of options. If you only have a credit card, and it doesn't usually cost that much, you can get a lot of stuff done through the cloud vendors, the software-as-a-service vendors, et cetera, et cetera.
And none of that existed in 2000.
[00:18:31] Jeremy: it's really interesting to think about how different, I guess the development world is now. Like, cause you mentioned how cloud wasn't even really a thing until 2006, all these, these vendors that people take for granted. Um, none of them existed. And so it just, uh, it must've been a very, very different time.
[00:18:52] Randy: Well, we didn't know. It was every, every year is better than the previous year, you know, in software every year. You know? So at that time we were really excited that we had all the tools and capabilities that, that we did have. Uh, and also, you know, you look back from, you know, 20 years in the future and, uh, you know, it looks caveman, you know, from that perspective.
But, uh, it was, you know, all those things were cutting edge at the time. What happened really was the big companies rolled their own, right. Everybody, you know, everybody built their own data centers, rack their own servers. Um, so at least at scale and the best you could hope for the most you could pay anybody else to do is rack your servers for you.
You know what I mean? Like there were external people, you know, and they still exist. A lot of them, you know, the Rackspaces you know Equinixes, et cetera of the world. Like they would. Have a co-location facility. Uh, and you, you know, you ask them please, you know, I'd like to buy the, these specific machines and please rack these specific machines for me and connect them up on the network in this particular way.
Um, that was the thing you could pay for. Um, but you pretty much couldn't pay them to put software on there for you. That was your job. Um, and then operating. It was also your job, if that makes sense.
[00:20:06] Jeremy: and then back then, would that be where. Employees would actually have to go to the data center and then, you know, put in their, their windows CD or their Linux CD and, you know, actually do everything right there.
[00:20:18] Randy: Yeah. 100%. Yeah. In fact, um, again, anybody who operates data centers, I mean, there's more automation, but the conceptually, when we run three data centers ourselves at eBay right now, um, and all of our, all of our software runs on them. So like we have physical, we have those physical data centers. We have employees that, uh, physically work in those things, physical.
Rack and stack the servers again, we're smarter about it now. Like we buy a whole rack, we roll the whole rack in and cable it, you know, with one big chunk, uh, sound, uh, as distinct from, you know, individual wiring and the networks are different and better. So there's a lot less like individual stuff, but you know, at the end of the day, but yeah, everybody in quotes, everybody at that time was doing that or paying somebody to do exactly that.
Right. Yeah.
[00:21:05] Jeremy: Yeah. And it's, it's interesting too that you mentioned that it's still being done by eBay. You said you have three, three data centers, because it seems like now maybe it's just assumed that someone's using a cloud service or using AWS or whatnot. And so, oh, go ahead.
[00:21:23] Randy: I was just going to say, well, I'm just going to riff off what you said, how the world has changed. I mean, so much, right? So. Uh, it's fine. You didn't need to say my whole LinkedIn, but like I used to work on Google cloud. So I've been, uh, I've been a cloud vendor, uh, at a bunch of previous companies I've been a cloud consumer, uh, at stitch fix and we work in other places.
Um, so I'm fully aware, you know, fully, personally aware of all that stuff. But yeah, I mean, there's this, um, you know, eBay is at the size where it is actually cost-effective, very cost-effective, uh, can't tell you more than that, uh, for us to operate our own, um, our own infrastructure, right?
So, you know, nobody would expect Google to use somebody else's infrastructure, right? Like that, that doesn't make any economic sense. Um, and, uh, you know, Facebook is in the same category. Uh, for a while, Twitter and PayPal have been in that category.
So there's like this class, you know, there are the known hyperscalers, right? You know, the Google, Amazon, uh, Microsoft, that are like cloud vendors, in addition to consumers internally, have their own, their own clouds. Um, and then there's a whole class of other, um, places that operate their own internal clouds in quotes.
Uh, but don't offer them externally. And again, uh, Facebook, or Meta, uh, you know, is one example. eBay's another. You know, Dropbox actually famously started in the cloud and then found it was much cheaper for them to operate their own infrastructure, again, for the particular workloads that they had.
Um, so yeah, there's probably, I'm making this up, let's call it two dozen around the world of these, I'm making this term up, mini-hyperscalers, right? Like self-hyperscalers or something like that. And eBay's in that category.
[00:23:11] Jeremy: I know this is kind of a, you know, a big what-if, but you were saying how once you reach a certain scale, that's when it makes sense to move into your own data center. And, uh, I'm wondering if, if eBay had started more recently, like, let's say in the last, you know, 10 years, I wonder if it would've made sense for it to start on a public cloud and then move to, um, you know, its own infrastructure after it got bigger, or if, you know, it really did make sense to just start with your own infrastructure from the start.
[00:23:44] Randy: Oh, I'm so glad you asked that. Um, the, the answer is obvious, but like, I'm so glad you asked that because I love to make this point. No one should ever, ever start by building your own servers and your own (laughs) cloud. Like, no. You should be so lucky (laughs), after years and years and years, that you outgrow the cloud vendors.
Right. Um, it happens, but it doesn't happen that often. You know, it happens so rarely that people write articles about it when it happens. Do you know what I mean? Like Dropbox is a good example. So yes, 100%, anytime, where are we, 2022? Any time in more than the last 10 years? Um, yeah, let's call it 2010, 2012.
Right. Um, when cloud had proved itself over, and you know, many times over, um, anybody who starts since that time should absolutely start in the public cloud. There's no argument about it. Uh, and again, one should be so lucky that over time you're seeing successive zeros added to your cloud bill, and it becomes so many zeros that it makes sense to shift your focus toward building and operating your own data centers.
That's it. I haven't been part of that transition. I've been the other way, you know, at other places, where, you know, I've migrated from owned data centers and colos into public cloud. Um, and that's the, that's the more common migration. And again, there are a handful, maybe not even a handful, of, uh, companies that have migrated away, but when they do, they've done all the math, right?
I mean, uh, Dropbox has done some great, uh, talks and articles about, about their transition and boy, the math makes sense for them. So, yeah.
[00:25:30] Jeremy: Yeah. And it also seems like maybe it's for certain types of businesses where moving off of public cloud makes sense. Like you mentioned Dropbox, where so much of their business is probably centered around storage or centered around, you know, bandwidth, and, you know, there's probably certain workloads where it's like, they need to leave public cloud earlier.
[00:25:51] Randy: Um, yeah, I think that's fair. Um, I think that's an insightful comment. Again, it's all about the economics at some point. You know, it's a big investment, uh, and it takes years, forget the money that you're paying people, but like just to develop the internal capabilities.
There are very specialized skill sets around building and operating data centers. So like, it's a big deal. Um, and, uh, yeah. So are there particular classes of workloads where you would, for the same dollar figure or whatever, uh, migrate earlier or later? I'm sure that's probably true. And again, one can absolutely imagine.
Well, take Dropbox in this example. Um, yeah, it's because like they, they need to go direct to the storage. And then, I mean, like, they want to remove every middle person, you know, from the flow of the bytes that are coming into the storage media. Um, and it makes perfect sense for, for them. And when I understood what they were doing, which was a number of years ago, they were hybrid, right? So they kept the top, you know, external layer, uh, in public cloud, and then the storage layer was all custom. I don't know what they do today, but people could check.
[00:27:07] Jeremy: And I'm kind of coming back to your, your first time at eBay. Is there anything you felt that you would've done differently with the knowledge you have now, but with the technology that existed then?
[00:27:25] Randy: Gosh, that's the 20/20 hindsight. Um, the one that comes to mind is the one we touched on a little bit, but I'll say it more starkly. If I could go back in time 20 years and say, hey, we're about to do this V3 transition at eBay, I would have had us move directly to what we would now call microservices, in the sense that individual services own their own data storage and are only interacted with through the public interface.
Um, there's a famous Amazon memo around that same time. So Amazon did the transition from a monolith into what we would now call microservices over about a four or five-year period, 2000 to 2005. And there was a famous Jeff Bezos memo from the early part of that, where, you know, there were seven requirements, I can't remember them all, but, you know, essentially it was: you may never talk to anybody else's database. You may only interact with other services through their public interfaces. I don't care what those public interfaces are. So they didn't standardize around, you know, CORBA or JSON or gRPC, which didn't exist at the time. You know, like they didn't standardize around any particular, uh, interaction mechanism. But you did need to, again, have this kind of microservice capability, that's modern terminology, um, uh, where, you know, the services own their own data and nobody can talk in the back door.
So that is the one architectural thing that I wish, you know, with 2020 hindsight, uh, that I would bring back in my time travel to 20 years ago, because that would help. That does help a lot. And to be fair, Amazon, um, Amazon was, um, pioneering in that approach and a lot of people internally and externally from Amazon, I'm told, didn't think it would work, uh, and it, and it did famously.
So that's, that's the thing I would do.
[00:29:30] Jeremy: Yeah. I'm glad you brought that up, because when you had mentioned that, I think you said there were 220 applications or something like that. At certain scales, people might think, like, oh, that sounds like microservices to me. But you, you mentioned that a microservice, to you, means it having its own data store.
I think that's a good distinction.
[00:29:52] Randy: Yeah. So, um, I've talked a lot about microservices for, for a decade or so. Yeah. I mean, several of the distinguishing characteristics: the micro in microservices is size and scope of the interface, right? So you can have a service-oriented architecture with one big service, um, or some very small number of very large services.
But the micro in microservice means this thing does, maybe it doesn't have one operation, but it doesn't have a thousand. The several, or the handful, or several handfuls of operations are all about this one particular thing. So that's one part of it. And then the other part of it that is critical to the success of that is owning the, owning your own data storage.
Um, so each service, you know, again, uh, it's hard to do this with a diagram, but like imagine, imagine the bubble of the service surrounding the data storage, right? So like anybody from the outside, whether they're interacting synchronously, asynchronously, messaging, whatever, HTTP, it doesn't matter, is only interacting with the bubble and never getting inside where the, uh, where the data is. I hope that makes sense.
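The "bubble" idea can be sketched in a few lines of Python. This is a hypothetical illustration, not eBay's actual code: the service keeps its storage private, and callers only ever go through the small public interface about one particular thing.

```python
class UserService:
    """A microservice 'bubble': small public interface, private data storage."""

    def __init__(self):
        # The data store lives inside the bubble; no other service
        # is ever handed a connection to it.
        self._store = {}

    # The only ways in: a handful of operations, all about one thing (users).
    def create_user(self, user_id, name):
        self._store[user_id] = {"id": user_id, "name": name}
        return dict(self._store[user_id])

    def get_user(self, user_id):
        user = self._store.get(user_id)
        return dict(user) if user else None


# Callers interact only with the interface, never the storage behind it.
svc = UserService()
svc.create_user("u1", "Ada")
print(svc.get_user("u1"))  # {'id': 'u1', 'name': 'Ada'}
```

Swapping the dict for a real database changes nothing for callers, which is exactly the point: nobody can "talk in the back door."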
[00:31:04] Jeremy: Yeah. I mean, it's kind of in direct contrast to before, when you were talking about how you had all these databases that all of these services shared. So it was probably hard to kind of keep track of, um, who had modified data. Um, you know, one service could modify it, then another service could go to get data out, and it's been changed, but it didn't change it.
So it could be kind of hard to track what's going on.
[00:31:28] Randy: Yeah, exactly. Integration at the database level is something that people have been doing since probably the 1980s. Um, and so again, you know, in retrospect it looks like a caveman approach. Uh, it was pretty advanced at the time, actually, even the idea of sharding, you know: hey, there are users, and the users live in databases, but they don't all live in the same one.
Uh, they live in 10 different databases or 20 different databases. And then there's this layer that, for this particular user, figures out which of the 20 databases it's in, and finds it and gets it back. And, um, you know, that was all pretty advanced. And by the way, all those capabilities still exist.
They're just hidden from everybody behind, you know, nice, simple, uh, software-as-a-service, uh, interfaces. Anyway, but that takes nothing away from your excellent point, which is, yeah, when there is this many-to-many relationship between, um, uh, applications and databases, uh, and there's shared mutable state in those databases, like, that's bad. You know, it's not bad to have state.
It's not bad to have mutable state. It's bad to have shared mutable state.
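That lookup layer, the thing that knows which of the 20 databases a particular user lives in, can be sketched as a minimal hash-based router. This is a hypothetical illustration, not eBay's actual implementation; real shard maps are often directory- or range-based rather than pure hashing:

```python
import hashlib

NUM_SHARDS = 20  # e.g. 20 user databases

def shard_for(user_id: str) -> int:
    """Deterministically map a user to one of the shards.

    Uses a stable hash (not Python's per-process hash()) so every
    application node computes the same answer for the same user.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Stand-ins for the 20 real databases.
databases = {n: {} for n in range(NUM_SHARDS)}

def save_user(user_id, record):
    # The routing layer: pick the right database, then write to it.
    databases[shard_for(user_id)][user_id] = record

def load_user(user_id):
    # Same routing on the read path: find which database the user is in.
    return databases[shard_for(user_id)].get(user_id)

save_user("alice", {"bids": 3})
print(load_user("alice"))  # {'bids': 3}
```

The key property is that the mapping is deterministic, so reads always land on the shard the write went to.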
[00:32:41] Jeremy: Yeah. And I think anybody who's kind of interested in learning more about, uh, you had talked about sharding and things like that, if they go back and listen to your, your first appearance on Software Engineering Radio, um, yeah, it kind of struck me how you were talking about sharding and how it was something that was kind of unique or unusual.
Whereas today it feels like it's very, I don't know if quaint is the right word, but it's like, um, it's something that, that people kind of are accustomed to now.
[00:33:09] Randy: Yeah. Yeah. Um, it's obvious. Um, it seems obvious in retrospect. Yeah. You know, at the time, and by the way, we didn't invent sharding. As I said, in 2007, you know, Google and Yahoo and, uh, Amazon, and, you know, it was the obvious thing, it took a while to reach it, but it's one of those things where once people have the, you know, brainwave to see, oh, you know what, we don't actually have to store this in one, uh, database.
We can, we can chop that database up into, you know, into chunks that look similar, that are self-similar. Um, yeah, that was, uh, reinvented by lots of, uh, lots of the big companies at the same time, again, because everybody was solving that same problem at the same time. Um, but yeah, when you look back, I mean, like, honestly, like everything that I said there, it's still like this, all the techniques about how you shard things.
And there's lots of, you know, it's not interesting anymore because the problems have been solved, but all those solutions are still the solutions, if that makes any sense, but you know,
[00:34:09] Jeremy: Yeah, for sure. I mean, I think anybody who goes back and listens to it, yeah, like you said, it's, it's very interesting, because it all still applies. And it's like, I think the, the solutions that are kind of interesting to me are ones where it's things that could have been implemented long ago, but we just later on realized, like, this is how we could do it.
[00:34:31] Randy: Well, part of it is, as we grow as an industry, we just, we discover new problems. You know, we, we get to the point where, you know, sharding of databases is only a problem when one database doesn't work anymore, you know, when the load that you put on that database is too big, or you want the availability of, you know, multiple databases.
Um, and so that's not a, that's not a day one problem, right? That's a day two, or day 2000, kind of problem. Right. Um, and so a lot of these things, yeah, well, you know, it's software. So like we could have done any of these things in older languages and older operating systems and with older technology.
But for the most part, we didn't have those problems, or we didn't have them at sufficient scale. Enough people didn't have the problem, you know, um, for us to have solved it as an industry, if that makes any sense.
[00:35:30] Jeremy: Yeah, no, that's a good point, because you think about when Amazon first started and it was just a bookstore, right? And the number of people using the site was, uh, who knows, it might've been tens a day or hundreds a day, I don't know. And, and so, like you said, the problems that Amazon has now in terms of scale are just, like, it's a completely different world than when they started.
[00:35:52] Randy: Yeah. I mean, probably, I'm making it up, but I don't think it's too far off to say that it's a billion times more. Their problems are a billion-fold what they were.
[00:36:05] Jeremy: The next thing I'd like to talk about is, you came back to eBay, I think, has it been about two years ago?
[00:36:14] Randy: Two years yeah.
[00:36:15] Jeremy: Yeah. And, and so, so tell me about the experience of coming back to an organization that you had been at, you know, 10 years prior, or however long it was. Like, how is your onboarding different when it's somewhere you've been before?
[00:36:31] Randy: Yeah, sure. So, um, like, like you said, I worked at eBay from 2004 to 2011. Um, and I worked in a different role than I have today. I worked mostly on eBay's search engine. Um, and then, uh, I left to co-found a startup, which was in the 99%, so the ones, you know, that didn't really do much. Uh, I worked at Google in the early days of Google Cloud, as I mentioned, on Google App Engine, and had a bunch of other roles, including, more recently, like you said, Stitch Fix and WeWork, um, leading those engineering teams.
And, um, so yeah, coming back to eBay as chief architect and, you know, leading the developer platform, essentially, a part of eBay. Um, yeah, what was the onboarding like? I mean, lots of things had changed, you know, in the, in the intervening 10 years or so. Uh, and lots had stayed the same, you know, not in a bad way, but just, you know, uh, some of the technologies that we use today are still some of the technologies we used 10 years ago. A lot has changed, though.
Um, a bunch of the people are still around. So there's something about eBay that, um, people tend to stay a long time. You know, it's not really very strange for people to be at eBay for 20 years. Um, in my particular team of let's call it 150, there are four or five people that have crossed their 20 year anniversary at the company.
Um, and I also rejoined with a bunch of other boomerangs, as the term we use internally goes, um, including the CEO, by the way. So, sort of bringing the band back together, a bunch of people that had gone off and worked at other places have, have come back for various reasons over the last couple of years.
So it was both a lot of familiarity, a lot of unfamiliarity, a lot of familiar faces. Um, yup.
[00:38:17] Jeremy: So, I mean, having these people who you work with still be there and actually coming back with some of those people, um, what were some of the big, I guess, advantages or benefits you got from, you know, those existing connections?
[00:38:33] Randy: Yeah. Well, I mean, as with all things, you know, imagine, I mean, everybody can imagine getting back together with friends that they had from high school or university, or, like, wherever people had some schooling at some point, and like you get back together with those friends, and there's this, you know, there's this implicit trust in most situations, you know, because you went through a bunch of stuff together and you knew each other, uh, you know, a long time.
And so that definitely helps, you know, when you're returning to a place where, again, there are a lot of familiar faces, where there's a lot of trust built up. Um, and then it's also helpful, you know, eBay's a pretty complicated place, and 10 years ago it was too big to hold in any one person's head, and it's even harder to hold in one person's head now. But to be able to come back and have a little bit of that, well, more than a little bit of that context about, okay, here's how eBay works.
And here, you know, here are the, you know, unique complexities of the marketplace, because it's very unique, you know, um, uh, in the world. Um, and so, yeah, no, I mean, it was helpful. It helps a lot. And then also, you know, in my current role, um, uh, my main goal actually is to just make all of eBay better, you know. So we have about 4,000 engineers, and, you know, my team's job is to make all of them better and more productive and more successful, and, uh, being able to combine
knowing the context about eBay, and having a bunch of connections to a bunch of the leaders there, um, combining that with 10 years of experience doing other things at other places, you know, that's helpful. Because, you know, now there are things that we do at eBay where, okay, well, you know, this other place has that same problem and is solving it in a different way.
And so maybe we should, you know, look into that option. So,
[00:40:19] Jeremy: So, so you mentioned just trying to make developers' work or lives easier. You start the job. How do you decide what to tackle first? Like, how do you figure out where the problems are, or what to do next?
[00:40:32] Randy: Yeah, that's a great question. Um, so, uh, again, I lead this thing that we internally call the velocity initiative, which is about just giving us the ability to deliver features and bug fixes more quickly to customers. Right. And, um, so how do I figure out that problem: how can we deliver things more quickly to customers and, you know, get more customer value and business value?
Uh, what I did, in collaboration with a bunch of people, is what one would call a value stream map. And that's a term from lean software and lean manufacturing, where you just look end to end at a process and, like, list all the steps and how long those steps take. So a value stream, as you can imagine: all these steps are happening, and at the end, there's some value, right?
Like we produced some, you know, feature, or, you know, hopefully got some revenue, or, like, helped out the customer and the business in some way. And so mapping that value stream, that's what it means. And, um, you look at that, and when you can see the end-to-end process, you know, and, like, really see it in some kind of diagram, uh, you can look for opportunities. Like, oh, okay, well, you know, if it takes us, I'm making this up, it takes us a week from when we have an idea to when it shows up on the site.
Well, you know, some of those steps take five minutes. That's not worth optimizing. But some of those steps take, you know, five days, and that is worth optimizing. And so, um, getting some visibility into the system, you know, looking end to end with a kind of view of the system, systems thinking, uh, that will give you the knowledge about, or the opportunities about, what can be improved.
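That "measure each step end to end, then optimize the biggest one" exercise can be sketched as a tiny value-stream calculation. The step names and durations below are made up purely for illustration:

```python
# Hypothetical value stream: each step and how long it takes, in hours.
value_stream = [
    ("idea approved",         4),
    ("code written",         40),
    ("code review",          48),  # waiting days for reviewers
    ("build + test",          6),
    ("deploy to production", 70),  # the slow, mostly-waiting part
]

# End-to-end lead time is just the sum of the steps.
total = sum(hours for _, hours in value_stream)

# The step worth optimizing is the longest one, not the five-minute ones.
bottleneck, worst = max(value_stream, key=lambda step: step[1])

print(f"end-to-end lead time: {total} hours")            # 168 hours
print(f"biggest opportunity: {bottleneck} ({worst} h)")  # deploy to production
```

Real value stream maps also separate "work time" from "wait time," which is usually where most of the lead time hides.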
And so that's, that's what we did. And we didn't talk with all 4,000, you know, uh, engineers, or all, you know, whatever, half a thousand teams or whatever we had. Um, but we sampled. And after we talked with three teams, we were already hearing a bunch of the same things, you know. So we were hearing about the whole product life cycle, which I like to divide into four stages.
I like to say there's planning: how does an idea become a project, or a thing that people work on? Software development: how does a project become committed code? Software delivery: how does committed code become a feature that people actually use? And then what I call post-release iteration, which is, okay, it's now out there on the site, and we're turning it on and off for individual users.
We're learning from analytics and usage in the real world, and, and experimenting. And so there were opportunities at eBay at all four of those stages, um, which I'm happy to talk about, but what we ended up seeing again and again, uh, is that the software delivery part was our current bottleneck. So again, that's: how long does it take from when an engineer commits her code to when it shows up as a feature on the site?
And, you know, before we started the work that I've been doing for the last two years with a bunch of people, um, on average at eBay it was like a week and a half. So, you know, it'd be a week and a half between when someone's finished and then, okay, it gets code reviewed, and, you know, dot, dot, dot, it gets rolled out.
It gets tested, you know, all that stuff. Um, it was, you know, essentially 10 days. And now, for the teams that we've been working with, uh, it's down to two. So we used a lot of, um, what people may be familiar with, uh, the Accelerate book. So it's called Accelerate, by Nicole Forsgren, Jez Humble, and Gene Kim, uh, 2018. Like, if there's one book anybody should read about software engineering, it's that.
Uh, so please read Accelerate. Um, it summarizes almost a decade of research from the State of DevOps reports, um, which the three people that I mentioned led. So Nicole Forsgren, you know, is, uh, is a doctor, uh, you know, she has a PhD in, uh, data science. She knows how to do all this stuff. Um, anyway, so, uh, when your, when your problem happens to be software delivery,
the Accelerate book tells you all the kind of continuous delivery techniques, trunk-based development, uh, all sorts of stuff that you can do to solve those problems. And then there are also four metrics that they use to measure the effectiveness of an organization's software delivery. So people might be familiar with, uh, there's deployment frequency:
how often are we deploying a particular application? Lead time for change: that's the time from when a developer commits her code to when it shows up on the site. Uh, change failure rate: when we deploy code, how often do we roll it back, or hot fix it, or, you know, there's some problem that we need to, you know, address.
Um, and then, uh, mean time to restore: when we have one of those incidents or problems, how quickly can we, uh, roll it back or do that hot fix? Um, and again, the beauty of Nicole Forsgren's research summarized in the Accelerate book is that the science shows that companies cluster. In other words, mostly the organizations that are not good at, you know, deployment frequency and lead time are also not good at the quality metrics of, uh, mean time to restore and change failure rate, and the companies that are excellent at, you know, uh, deployment frequency and lead time are also excellent at mean time to recover and, uh, change failure rate.
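As a rough sketch, those four metrics can be computed from a log of deployments. The data and field names here are hypothetical; the State of DevOps research defines the measurements more carefully than this:

```python
from datetime import datetime, timedelta

# Hypothetical deployment log: when code was committed, when it was
# deployed, and whether the deploy failed and had to be restored.
deploys = [
    {"committed": datetime(2022, 3, 1), "deployed": datetime(2022, 3, 4),
     "failed": False},
    {"committed": datetime(2022, 3, 5), "deployed": datetime(2022, 3, 8),
     "failed": True, "restored": datetime(2022, 3, 8, 2)},  # fixed 2h later
    {"committed": datetime(2022, 3, 9), "deployed": datetime(2022, 3, 11),
     "failed": False},
]

window_days = 14

# 1. Deployment frequency: deploys per day over the window.
deployment_frequency = len(deploys) / window_days

# 2. Lead time for change: average commit-to-production time.
lead_times = [d["deployed"] - d["committed"] for d in deploys]
lead_time = sum(lead_times, timedelta()) / len(lead_times)

# 3. Change failure rate: fraction of deploys needing rollback/hotfix.
failures = [d for d in deploys if d["failed"]]
change_failure_rate = len(failures) / len(deploys)

# 4. Mean time to restore: how quickly failed deploys were fixed.
mttr = sum((d["restored"] - d["deployed"] for d in failures),
           timedelta()) / len(failures)

print(deployment_frequency, lead_time, change_failure_rate, mttr)
```

With this toy data, lead time averages out to about 2 days 16 hours, one deploy in three failed, and the failure was restored in 2 hours, which is roughly the shape of numbers a "high performer" team would see.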
Um, so companies or organizations, uh, divide into these four categories: there are low performers, medium performers, high performers, and then elite performers. And, uh, eBay, on average at the time, and still on average, is solidly in that medium performer category. So, uh, and what we've been able to do with the teams that we've been working with is we've been able to move those teams to the high category.
So just super brief, uh, and I'll give you a chance to ask some more questions, but like, in the low category, all those things are kind of measured in months, right? So how often are we deploying? You know, measure that in months. How long does it take us to get a commit to the site? You know, measure that in months. Um, and then, sorry,
uh, the medium performers are like, everything's measured in weeks, right? So like, we deploy, you know, once every couple of weeks or once a week, uh, lead time is measured in weeks, et cetera. The, uh, the high performers' things are measured in days, and the elite performers' things are measured in hours.
And so you can see there's, like, order-of-magnitude improvements when you move from one of those clusters to another cluster. Anyway, so what we were focused on, again, because our problem was software delivery, was moving a whole set of teams from that medium performer category, where things are measured in weeks, to the, uh, high performer category, where things are measured in days.
[00:47:21] Jeremy: Throughout all this, you said the, the big thing that you focused on was the delivery time. So somebody wrote code and they felt that it was ready for deployment, but for some reason it took 10 days to actually get out to the actual site. So I wonder if you could talk a little bit about, uh, maybe a specific team or a specific application: where, where was that time being spent?
You know, you, you said you moved from 10 days to two days. What, what was happening in the meantime?
[00:47:49] Randy: Yeah, no, that's a great question. Thank you. Um, yeah, so, uh, okay. So we, we looked end to end at the process, and we found that software delivery was the first place to focus, and there are other issues in other areas, but we'll get to them later. Um, so then, to improve software delivery, we asked individual teams, we did something like, um, you know, some conversation like I'm about to say. So we said, hi, it looks like you're deploying kind of once or twice.
If I, if I told you you had to deploy once a day, tell me all the reasons why that's not going to work. And the teams are like, oh, of course. Well, the build times take too long, and the deployments aren't automated, and, you know, our testing is flaky, so we have to retry it all the time, and, you know, dot, dot, dot, dot, dot.
And we said, great, you just gave my team our backlog. Right? So rather than, you know, just coming and, like, complaining about it, um, and it's legit for the teams to complain, uh, you know, we were able, because again, the developer platform, you know, is part of my team, uh, we said, great, you just told us all your top, uh, issues, or your impediments, as we say, um, and we're going to work on them with you.
And so every time we had some idea about, well, I bet we can use canary deployments to automate the deployment, which we have now done, we would pilot that with a bunch of teams, we'd learn what works and doesn't work, and then we would roll that out to everybody. Um, so what were the impediments? It was a little bit different for each individual team, but in sum, the things we ended up focusing on, or have been focusing on, are build times, you know. So we build everything in Java still.
Um, and, uh, even though we're generation five, as opposed to that generation three that I mentioned, um, still, build times for a lot of applications were taking way too long. And so we, we spent a bunch of time improving those things, and we were able to take stuff from, you know, hours down to, you know, single-digit minutes.
So that's a huge improvement to developer productivity. Um, we made a lot of investment in our continuous delivery pipelines. Um, so making all the automation around, you know, deploying something to one environment and checking it there, and then deploying it into a common staging environment and checking it there, and then deploying it from there into the production environment.
And, um, and then, you know, rolling it out via this canary mechanism. We invested a lot in something that we call traffic mirroring, which is a thing we didn't invent. Other places have a different name for this; I don't know if there's a standard industry name. Some people call it shadowing. But the idea is: I have a change that I'm making, which is not intended to change the behavior.
Like lots of changes that we make, bug fixes, et cetera, uh, upgrading to new, you know, open source dependencies, whatever, changing the version of the framework. There's a bunch of changes that we make regularly, day to day, as developers, which are, like, refactorings, kind of, where we're not actually intending to change the behavior.
And so traffic mirroring was our idea of: you have the old code that's running in production, and you fire a request, a production request, at that old code and it responds, but then you also fire that request at the new version and compare the results. You know, did the same JSON come back, you know, between the old version and the new version?
Um, and that's, that's a great way, kind of from the outside, to sort of black-box detect any unintended changes in the, in the behavior. And so we definitely leveraged that very, very aggressively. Um, we've invested in a bunch of other things, but all those investments are driven by what the particular teams tell us is getting in their way.
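The mirroring check Randy describes can be sketched like this. The handler names are hypothetical, and a real system would mirror live traffic through a proxy or service mesh rather than in-process calls, but the comparison idea is the same:

```python
import json

def old_handler(request):
    # Current production code path; its response is what the user gets.
    return {"item": request["item"], "price": 100}

def new_handler(request):
    # Refactored version that is supposed to behave identically.
    return {"item": request["item"], "price": 100}

def mirror(request):
    """Serve from the old code, replay against the new code, diff the JSON."""
    primary = old_handler(request)  # the response actually returned
    shadow = new_handler(request)   # fired at the new version on the side
    matched = (json.dumps(primary, sort_keys=True)
               == json.dumps(shadow, sort_keys=True))
    if not matched:
        # In practice this would be logged/alerted, not printed.
        print(f"behavior drift on {request}: {primary} != {shadow}")
    return primary, matched

response, matched = mirror({"item": "vintage-camera"})
print(matched)  # True when old and new agree
```

Because the shadow response is thrown away, the comparison is safe to run on real production traffic, which is what makes the technique so effective at catching unintended changes.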
And there are a bunch of things that the teams themselves have, you know, been motivated to do. So my team's not the only one that's making improvements. You know, teams have, uh, moved from branching development to trunk-based development, which makes a big difference. Um, making sure that, uh, PR approvals and, like, um, you know, code reviews are happening much more regularly.
So, like, a thing that some teams have started doing is, like, immediately after standup in the morning, everybody does all the code reviews that, you know, are waiting. And so things don't drag on for, you know, two, three days, cause whatever. Um, so, you know, everybody kind of works on that much more quickly.
Um, teams are building their own automations for things like testing site speed and accessibility and all sorts of stuff. So all the things that a team goes through in the development and rollout of their software, they've been spending a lot of time automating and making leaner, making more efficient.
[00:52:22] Jeremy: So some of those, it sounds like the PR example, is really on the team. Like you're telling them: hey, this is something that you internally should change about how you work. For things like improving the build time, did you have like a separate team that was helping these teams, you know, speed that process up? Or what was that like?
[00:52:46] Randy:
Yeah, great. I mean, those two examples are, like you say, very different. So I'm going to start from: we just simply showed everybody, here's your deployment frequency for this application, here's your lead time for this application, here's your change failure rate, and here's your mean time to restore.
And again, maybe I didn't mention this before: all of the State of DevOps research and the Accelerate book prove that by improving those metrics, you get better engineering outcomes and you also get better business outcomes. So it's scientifically proven that improving those four things matters. Okay. So now we've shown teams: hey, we would like you to improve, you know, for your own good, but more broadly at eBay, we would like the deployment frequency to be faster.
And we would like the lead time to be shorter. And the insight there is, when we deploy smaller units of work, when we don't batch up a week's worth of work, a month's worth of work, it's much, much less risky to just deploy an hour's worth of work. Right? And the insight is, the hour's worth of work fits in your head.
And if you roll it out and there's an issue, first off, rolling back is no big deal, cause you've only lost an hour of work for a temporary period of time. But also, you never have this thing of, what in the world broke? Cause with a month's worth of work, there's a lot of things that changed and a lot of stuff that could break, but with an hour's worth of work, it's only the one change that you made.
So, you know, if something happens, it's pretty much guaranteed to be that thing. Anyway, that's the backstory. Um, and so yeah, we were just working with individual teams. The teams were motivated to see what's the biggest bang for the buck in order to improve those things.
Like, how can we improve those things? And again, some teams were saying, well, you know what, a huge component of that lead time between when somebody commits and when it's a feature on the site, a huge percentage of that, maybe multiple days, is waiting for somebody to code review. Okay, great.
We can just change our team agreements and our team behavior to make that happen. And then yes, to answer your question: the other things, like building the canary capability and traffic mirroring and the build time improvements, those were done by central platform and infrastructure teams, some of which were in my group and some of which are in peer groups in my part of the organization.
So yeah, providing the generic tools and generic capabilities, those are absolutely things that a platform organization does. Like, that's our job. Um, and you know, we did it. And then there are a bunch of other things like that, around team behavior and how you approach building a particular application, that are, and should be, completely in the control of the individual teams.
And we were definitely not being super prescriptive. Like, we didn't come in and say, alright, by next Tuesday we want you to be doing trunk-based development, and by the Tuesday after that we want to see test-driven development, dot, dot, dot. Um, we would just offer to teams: here's where you are, here's where we know you can get, because we work with other teams and we've seen that they can get there. Um, and then we just work together on, well, what's the biggest bang for the buck, and what would be most helpful for that team? So it's like a menu of options, and you don't have to take everything off the menu, if that makes sense.
[00:56:10] Jeremy: And how did that communication flow from you and your team down to the individual contributor? Like, I'm assuming you have engineering managers and technical leads and all these people sort of in the chain. How does it work?
[00:56:24] Randy: Yeah, thanks for asking that. Yeah, I didn't really say how we work as an initiative. So there are a bunch of teams that are involved. Um, and every Monday morning, it just so happens it's late Monday morning today, so we already did this a couple of hours ago, but once a week we get all the teams that are involved, both the platform, kind of provider, teams and also the product,
or we would say domain, like consumer, teams. And we do a quick scrum of scrums, like a big old kind of standup. What have you all done this week? What are you working on next week? What are you blocked by? That kind of idea. And, you know, there are probably 20 or 30 teams, again, across the individual platform capabilities and across the teams that consume this stuff, and everybody gives a quick update. And it's a great opportunity for people to say, oh, I have that same problem too.
Maybe we should, offline, try to figure out how to solve that together. Or: you built a tool that automates the site speed stuff? That's great, I would so love to have that. Um, so this weekly meeting has been a great opportunity for us to share wins, share the help that people need, and then get teams to help each other.
And also, similarly, one of the platform teams would say something like, hey, we're about to be in beta with, let's say, this new canary capability, I'm making this up. Anybody want to pilot that for us? And then you get a bunch of hands raised of, yo, we would be very happy to pilot that, that would be great.
Um, so that's how we communicate back and forth. And, you know, it's a big enough group that it's kind of engineering managers who are the level involved in that, typically. Um, so it's not individual developers, but it's somebody on most every team, if that makes any sense. So that's kind of how we do that communication back to the individual developers, if that makes sense.
[00:58:26] Jeremy: Yeah. So it sounds like you would have, like you said, the engineering manager go to the standup, and you said maybe 20 to 30 teams? I'm just trying to get a picture of how many people are in this meeting.
[00:58:39] Randy: Yeah. It's like 30 or 40 people.
[00:58:41] Jeremy: Okay. Yeah.
[00:58:42] Randy: And again, it's quick, right? It's an hour. So we just go boom, boom, boom, boom. And we've just developed a cadence. We have a shared Google Doc, and people write their little summaries, you know, of what they've worked on and what they're working on.
So we've, over time, made it so that it's pretty efficient with people's time, and pretty dense, in a good way, in terms of information flow back and forth. Um, and then also, separately, we meet in more detail with the individual teams that are involved. Again, trying to elicit: okay, where are you now?
Here's where you are. Please let us know what problems you're seeing with this part of the infrastructure, or problems you're seeing in the pipelines, or something like that. And we're constantly trying to learn and get better and solicit feedback from teams on what we can do differently.
[00:59:29] Jeremy: Earlier you had talked a little bit about how there were a few services that got brought over from V2 or V3, basically more legacy or older services that have been a part of eBay for quite some time.
And I was wondering if there were things about those services that made this process different, like, you know, in terms of how often you could deploy. Um, just, what were some key differences between something that was made recently versus something that has been with the company for a long time?
[01:00:06] Randy: Yeah, sure. I mean, the stuff that's been with the company for a long time was best in class as of when we built it, you know, maybe 15 and sometimes 20 years ago. Um, actually, there are fewer than a handful. As we speak, there are two or three of those V3 clusters or applications or services still around, and they should be gone, completely migrated away from, in the next couple of months.
So we're almost at the end of moving everything to more modern things. But yeah, I mean, again, stuff that was state-of-the-art 20 years ago, which was like deploying things once every two weeks. Like, that was a big deal in 2000 or 2004. Uh, and you know, that was fast in 2004 and is slow in 2022.
So, um, yeah, what's the difference? Um, a lot of these things, if they haven't already been migrated, there's a reason, and it's often because they're way in the guts of something that's really important. You know, this is a core part, I'm making these examples up and they're not even right, but like, it's a core part of the payments flow.
It's a core part of how sellers get paid. And those aren't real examples, the ones we have there are modern, but you see what I'm saying? Like, stuff that's really core to the business, and that's why it's lasted.
[01:01:34] Jeremy: And, uh, I'm kind of curious, from the perspective of some of these new things you're introducing, like improving continuous delivery and things like that: when you're working with some of these services that have been around a long time, is the rate at which the teams deploy, or the rate at which you find defects, noticeably different from services that are more recent?
[01:02:04] Randy: I mean, that's true of any legacy at any place, right? So, um, yeah, people legitimately have some trepidation, let's say, about changing something that's been running the business for a long, long time. And so it's a lot slower going, exactly because it's not always completely obvious what the implications of those changes are.
So, you know, we're very careful, and we test things a whole lot. And, um, you know, maybe we didn't write stuff with a whole bunch of automated tests in the beginning, and so there's a lot of manual stuff there. You know, this is just what happens when you have a company that's been around for a long time.
[01:02:51] Jeremy: Yeah. I guess, just to start wrapping up: in this process of you coming into the company and identifying where the problems are and working on ways to speed up delivery, is there anything that came up that really surprised you? I mean, you've been at a lot of different organizations. Is there anything about your experience here at eBay that was very different than what you'd seen before?
[01:03:19] Randy: No, I mean, it's a great question. I think the thing that's surprising is how unsurprising it is. Like, the details are different. Okay, you know, we have this V3, I mean, we have some uniquenesses around eBay. But, um, I think what is maybe pleasantly surprising is, all the techniques about how one notices the things that are going on, in terms of, again, deployment frequency, lead time, et cetera, and what techniques you would deploy to make those things better,
all the standard stuff applies. You know, all the techniques that are mentioned in the State of DevOps research and in Accelerate, and just all the known good practices of software development, they all apply everywhere.
Um, and that's, I think, the wonderful thing. So maybe the most surprising thing is how unsurprising, or how applicable, the industry-standard techniques are. I mean, I certainly hoped that to be true, but that's why, I didn't really say this, but we piloted this stuff with a small number of teams.
Exactly because, you know, we thought, and it turned out to be true, that they applied, but we weren't entirely sure. We didn't know what we didn't know. Um, and we also needed proof points, not just out there in the world, but at eBay, that these things made a difference. And it turns out they do. So.
[01:04:45] Jeremy: Yeah, I mean, I think it's easy for people to get caught up and think, like, my problem is unique or my organization is unique. But it sounds like, in a lot of cases, maybe we're not so different.
[01:04:57] Randy: I mean, the stuff that works tends to work everywhere. There's always some detail, but, um, yeah, I mean, all aspects of continuous delivery and the kind of lean approach to software. I mean, we, the industry, have yet to find a place where they don't work. Seriously, try to find any place where they don't work.
[01:05:19] Jeremy: If people want to learn more about the work that you're doing at eBay, or just follow you in general, where should they look?
[01:05:27] Randy: Yeah. So, um, I tweet semi-regularly at @randyshoup, so my name, all one word, R-A-N-D-Y-S-H-O-U-P. Um, I had always wanted to be a blogger. Like, there is a randyshoup.com, and there are some blogs on there, but they're pretty old. Um, someday I hope to be doing more writing. Um, I do a lot of conference speaking, though.
So I speak at the QCon conferences. I'm going to be at the Craft Conference in Budapest in a week and a half, as of this recording. Um, so you can often find me on Twitter or at software conferences.
[01:06:02] Jeremy: All right, Randy. Well, thank you so much for coming back on Software Engineering Radio.
[01:06:06] Randy: Thanks for having me, Jeremy. This was fun.
This episode originally aired on Software Engineering Radio.
[00:00:00] Jeremy: Today I'm talking to Ant Wilson. He's the co-founder and CTO of Supabase. Ant, welcome to Software Engineering Radio.
[00:00:07] Ant: Thanks so much. Great to be here.
[00:00:09] Jeremy: When I hear about Supabase, I always hear about it in relation to two other products. The first is Postgres, which is an open source relational database. And the second is Firebase, which is a backend-as-a-service product from Google Cloud that provides a NoSQL data store.
It provides authentication and authorization. It has a functions-as-a-service component. It's really meant to be a replacement for needing to have your own server and create your own backend; you can have all of that be done from Firebase. I think a good place for us to start would be walking us through what Supabase is and how it relates to those two products.
[00:00:55] Ant: Yeah. So we brand ourselves as the open source Firebase alternative.
That came primarily from the fact that we ourselves built it as the alternative to Firebase. So my co-founder Paul, in his previous startup, was using Firestore, and as they started to scale, they hit certain limitations, technical scaling limitations, and he'd always been a huge Postgres fan.
So we swapped it out for Postgres and then just started plugging in the bits that were missing, like the real-time streams. Um, he used a tool called PostgREST, with a T, for the CRUD APIs. And so
he just built, like, the open source Firebase alternative on Postgres, and that's kind of where the tagline came from.
But the main difference, obviously, is that it's a relational database and not a NoSQL database, which means that it's not actually a drop-in replacement. But it does mean that it kind of opens the door to a lot more functionality, actually. Um, which is hopefully an advantage for us.
[00:02:03] Jeremy: It's a hosted form of Postgres. So you mentioned that Firebase is different; it's NoSQL, people are putting in their JSON objects and things like that. So when people are working with Supabase, is the experience just, I'm connecting to a Postgres database, I'm writing SQL?
And in that regard, it's kind of not really similar to Firebase at all. Is that kind of right?
[00:02:31] Ant: Yeah. I mean, the other important thing to note is that you can communicate with Supabase directly from the client, which is what people love about Firebase. You just put the credentials on the client and you write some security rules, and then you just start sending your data. Obviously, with Supabase you do need to create your schema, because it's relational.
But apart from that, the experience of client-side development is very much the same, or very similar. The interface, obviously the API, is a little bit different, but it's similar in that regard. But I think, like I said, we are just a database company, actually. And the tagline just explains really well the concept of what it is: like a backend as a service. It has the real-time streams, it has the auth layer, it has the auto-generated APIs. So I don't know how long we'll stick with the tagline. I think we'll probably outgrow it at some point. Um, but it does do a good job of communicating roughly what the service is.
[00:03:39] Jeremy: So when we talk about it being similar to Firebase, the part that's similar is that you could be a person building the front end part of the website, and you don't necessarily need to have a backend application, because all of that could talk to Supabase, and Supabase can handle the authentication, the real-time notifications, all those sorts of things. Similar to Firebase, where basically you only need to write the front end part, and then you have to know how to set up Supabase, in this case.
[00:04:14] Ant: Yeah, exactly. And we love Firebase, by the way. We're not building an alternative to try and destroy it. It's kind of like, we're just building the SQL alternative, and we take a lot of inspiration from it. And the other thing we love is that you can administer your database from the browser.
So you go into Firebase and you can see the object tree, and when you're in development, you can edit some of the documents in real time. And so we took that experience and effectively built, like, a spreadsheet view inside of our dashboard, and also, obviously, have a SQL editor in there as well.
We're trying to create a similar developer experience, because that's where Firebase just excels: the DX is incredible. And so we take a lot of inspiration from it in those respects.
[00:05:08] Jeremy: And to make it clear to our listeners as well: when you talk about this interface that's kind of like a spreadsheet, I suppose it's similar to somebody opening up pgAdmin and going in and editing the rows. But maybe you've got another layer on top that just makes it a little more user-friendly, a little bit more like something you would get from Firebase, I guess.
[00:05:33] Ant: Yeah.
And, you know, we take a lot of inspiration from pgAdmin. pgAdmin is also open source, so I think we've contributed a few things, or are trying to upstream a few things, into pgAdmin. The other thing that we took a lot of inspiration from, for what we call the table editor, is Airtable.
Because Airtable is effectively a relational database, in that you can just come in and, you know, click to add your columns, click to add a new table. And so we just wanted to reproduce that experience, again backed by a full, dedicated Postgres database.
[00:06:13] Jeremy: So when you're working with a Postgres database, normally you need some kind of layer in front of it, right? A person can't open up their website and connect directly to Postgres from their browser. And you mentioned PostgREST before. I wonder if you could explain a little bit about what that is and how it works.
[00:06:34] Ant: Yeah, definitely. So yeah, PostgREST has been around for a while. Um, it's basically a server that you connect to your Postgres database, and it introspects your schemas and generates an API for you based on the table names, the column names. And then you can communicate with your Postgres database via this RESTful API.
So you can do pretty much most of the filtering operations that you can do in SQL, equality filters, you can even do full-text search over the API. It just means that whenever you add a new table or a new schema or a new column, the API just updates instantly. So you don't have to worry about writing that middle layer, which was always the drag, right?
Whenever you started a new project, it's like: okay, I've got my schema, I've got my client, now I have to do all the connecting code in the middle. Which is kind of, yeah, no developer should need to write that layer in 2022.
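To make the introspection idea concrete, here is a rough sketch of how PostgREST turns REST calls into SQL. The `todos` table and its columns are hypothetical, not from the conversation:

```sql
-- Given a table like this, PostgREST exposes it at /todos automatically:
create table todos (
  id      bigint generated always as identity primary key,
  user_id uuid not null,
  title   text not null,
  done    boolean default false
);

-- GET /todos?done=is.false&select=id,title
--   roughly becomes:
select id, title from todos where done is false;

-- PATCH /todos?id=eq.42  with body {"done": true}
--   roughly becomes:
update todos set done = true where id = 42;
```

Adding a column to `todos` makes it available through the API immediately, which is the "no middle layer" point Ant is making.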
[00:07:46] Jeremy: So this layer you're referring to: when I think of a traditional web application, I think of having to write routes, controllers, and create this sort of structure where I know all the tables in my database, but the controllers I create may not map one-to-one with those tables. And so you mentioned a little bit about how PostgREST looks at the schema and starts to build an API automatically.
And I wonder if you could explain a little bit about how it does those mappings, or if you're writing those yourself.
[00:08:21] Ant: Yeah, it basically does them automatically. By default, it will map every table, every column. When you want to start restricting things, well, there's two parts to this. There's one thing, which I'm sure we'll get into, which is: how is this secure, since you are communicating directly from the client?
But the other part is what you mentioned: giving, like, a reduced view of a particular bit of data. And for that, we just use Postgres views. So you define a view, which might have joins across a couple of different tables, or it might just be a limited set of columns on one of your tables, and then you can choose to just expose that view.
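A sketch of the pattern Ant describes, with hypothetical table names: instead of writing a controller, you define a view and expose only that:

```sql
-- Join two tables and hide everything except a few columns.
create view order_summaries as
select o.id,
       o.created_at,
       c.display_name as customer
from orders o
join customers c on c.id = o.customer_id;

-- PostgREST now serves GET /order_summaries just like a table,
-- while the underlying orders and customers tables stay unexposed.
```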
[00:09:05] Jeremy: So it sounds like, where you would typically create a controller and create a route, instead you create a view within your Postgres database, and then PostgREST can take that view and create an endpoint for it, map it to that.
[00:09:21] Ant: Yeah, exactly (laughs).
[00:09:24] Jeremy: And PostgREST is an open source project, right? I wonder if you could talk a little bit about its history. How did you come to choose it?
[00:09:37] Ant: Yeah.
I think Paul probably read about it on Hacker News at some point. Anytime it appears on Hacker News, it just gets voted to the front page, because it's so awesome. And we got connected to the maintainer, Steve Chavez, at some point. I think he took an interest in us, or we took an interest in Postgres, and we kind of got acquainted.
And then we found out that Steve was open to work, and this kind of shaped a lot of the way we think about building out Supabase as a project and as a company, in that we then decided to employ Steve full time, but just to work on PostgREST, because it's obviously a huge benefit for us.
We're very reliant on it. We want it to succeed, because it helps our business. And then as we started to add the other components, we decided that we would always look for existing tools, existing open source projects, before we decided to build something from scratch. So as we started to try and replicate the features of Firebase, and auth is a great example,
we did a full audit of all the authentication and authorization open source tools that are out there, and which one, if any, would fit best. And we found that Netlify had built a library called GoTrue, written in Go, which did pretty much exactly what we needed. So we just adopted that.
And now, obviously, we just have a lot of people on the team contributing to GoTrue as well.
[00:11:17] Jeremy: You touched on this a little bit earlier. Normally, when you connect to a Postgres database, your user has permission to basically everything, I guess, by default anyways. So how does that work? When you want to restrict people's permissions, make sure they only get to see records they're allowed to see, how is that all configured in PostgREST, and what's happening behind the scenes?
[00:11:44] Ant: Yeah. The great thing about Postgres is it's got this concept of row level security, which, actually, I don't think I'd ever really looked at until we were building out this auth feature. The security rules live in your database, as SQL. So you do, like, a create policy query, and you say: anytime someone tries to select or insert or update, apply this policy.
And then how it all fits together is our auth server, GoTrue. Someone will basically make a request to sign in or sign up with email and password, and we create that user inside the database. They get issued a UUID, and they get issued a JSON Web Token, a JWT, which, when they have it on the client side, proves that they are this UUID, that they have access to this data.
Then, when they make a request via PostgREST, they send the JWT in the authorization header. Postgres will pull out that JWT, check the sub claim, which is the UUID, and compare it to any rows in the database, according to the policy that you wrote. So the most basic one is: you say, in order to access this row, it must have a UUID column, and it must match whatever is in the JWT.
So we basically push the authorization down into the database, which actually has a lot of other benefits, in that as you write new clients, you don't need to have it live on an API layer or on the client. Everything is just managed from the database.
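The basic policy Ant describes might be sketched like this. The `notes` table and its columns are hypothetical; `auth.uid()` is Supabase's helper that reads the sub claim out of the JWT:

```sql
-- Hypothetical table: each row is owned by one user.
create table notes (
  id       bigint generated always as identity primary key,
  owner_id uuid not null,
  body     text
);

-- Without this, policies are not enforced on the table.
alter table notes enable row level security;

-- Only the row's owner (the UUID in the JWT's sub claim) may read it.
create policy "owners can read their notes"
  on notes for select
  using (owner_id = auth.uid());
```

Any select through PostgREST then only ever returns rows whose `owner_id` matches the authenticated user's UUID.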
[00:13:33] Jeremy: So the UUID, you mentioned that represents the user, correct?
[00:13:39] Ant: Yeah.
[00:13:41] Jeremy: Does that map to a user in Postgres, or is there some other way that you're mapping those permissions?
[00:13:50] Ant: Yeah. So when you connect GoTrue, which is the auth server, to your Postgres database for the first time, it installs its own schema. So you'll have an auth schema, and inside will be auth.users, with a list of the users. It'll have an auth.tokens table, which will store all the access tokens that it's issued.
And one of the columns on the auth.users table will be the UUID, and then, whenever you write application-specific schemas, you can just do a foreign key relation to the auth.users table. So it all gets into schema design, and hopefully we do a good job of having some good education content in the docs as well.
Because one of the things we struggled with from the start was: how much do we abstract away from SQL, away from Postgres, and how much do we educate? And we actually landed on the educate side, because, I mean, once you start learning about Postgres, it becomes kind of a superpower for you as a developer.
So we'd much rather have people discover us because we're a Firebase alternative for frontend devs, and then we help them with things like schema design and learning about row level security. Because ultimately, if you try and abstract that stuff away, it gets kind of crappy, and maybe not such a great experience.
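The foreign key pattern Ant mentions might look like this, assuming the auth schema GoTrue installs; the `profiles` table and its columns are illustrative:

```sql
-- Application-specific table joined to GoTrue's users table.
-- Deleting a user in auth.users cleans up their profile too.
create table profiles (
  id       uuid primary key references auth.users (id) on delete cascade,
  username text unique,
  bio      text
);
```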
[00:15:20] Jeremy: To make sure I understand correctly: so you have GoTrue, which is a Netlify open source project. That GoTrue project creates some tables in your database that have, like you mentioned, the tokens, the different users. Somebody makes a request to GoTrue, like: here's my username, my password.
GoTrue gives them back a JWT. And then, from your front end, you send that JWT to the PostgREST endpoint. And from that JWT, it's able to know which user you are, and then it uses Postgres' built-in row level security to figure out which rows you're allowed to bring back. Did I get that right?
[00:16:07] Ant: That is pretty much exactly how it works. And it's impressive that you garnered that without looking at a single diagram (laughs). But yeah, and obviously we provide a client library, supabase-js, which actually does a lot of this work for you, so you don't need to manually attach the JWT in a header.
If you've authenticated with supabase-js, then for every request sent to PostgREST after that point, the header will just be attached automatically, and you'll be in a session as that user.
[00:16:43] Jeremy: And the users that we're talking about, when we talk about Postgres' row level security, are those actual users in PostgreSQL? Like, if I was to log in with psql, could I actually log in with those users?
[00:17:00] Ant: They're not. You could potentially structure it that way, but it would be more advanced. It's basically just users in the auth.users table, the way it's currently done.
[00:17:12] Jeremy: I see. And the row level security is able to work with that table; you don't need to have actual Postgres users.
[00:17:23] Ant: Exactly. And it's basically Turing complete. I mean, you can write extremely complex auth policies. You can say, you know, only give access to this particular admin group on a Thursday afternoon between six and 8:00 PM. You can get really as fancy as you want.
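Ant's deliberately fancy example could be sketched like this. Everything here is hypothetical: the `audit_log` table, the `is_admin()` helper, and the use of Supabase's `auth.uid()`:

```sql
-- Admins may read the audit log, but only on Thursday evenings
-- between 18:00 and 20:00 (extract(dow ...) yields 4 for Thursday;
-- hour between 18 and 19 covers 18:00:00 through 19:59:59).
create policy "thursday evening admins"
  on audit_log for select
  using (
    is_admin(auth.uid())
    and extract(dow  from now()) = 4
    and extract(hour from now()) between 18 and 19
  );
```

Since a policy's `using` clause is just a SQL expression, any condition you can write in SQL can gate row access.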
[00:17:44] Jeremy: Is that all written in SQL, or are there other languages it allows you to use?
[00:17:50] Ant: Yeah, the default is plain SQL. Within Postgres itself, you can use,
I think there's a Python extension, there's a JavaScript extension, which is, I think, a subset of JavaScript. I mean, this is the thing with Postgres: it's super extensible, and people have probably got all kinds of interpreters.
So, yeah, you can use whatever you want, but the typical user will just use SQL.
[00:18:17] Jeremy: Interesting. And that applies to logic in general, I suppose, where if you were writing a Rails application, you might write Ruby, and if you're writing a Node application, you write JavaScript. But you're saying, in a lot of cases with PostgREST, you're actually able to do what you want to do, whether that's serialization or mapping objects, all through SQL.
[00:18:44] Ant: Yeah, exactly, exactly. And then obviously there's a lot of awesome other stuff that Postgres has, like PostGIS. If you've got a geo application, it'll load it up with geo types for you, which you can just use. If you're doing encryption and decryption, we just added pgsodium, which is a new and awesome cryptography extension.
And you can use all of these. They all add SQL functions which you can kind of use in any part of the logic, or in the row level policies. Yeah.
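For instance, a minimal PostGIS sketch, assuming the extension is available; the table, place names, and coordinates are invented for the example:

```sql
create extension if not exists postgis;

-- A geo-typed column plus a distance query; st_dwithin on
-- geography values measures in meters. The same predicate
-- could sit inside a row level policy.
create table places (
  id   serial primary key,
  name text,
  geom geography(point, 4326)
);

-- Everything within 5 km of central London:
select name
from places
where st_dwithin(geom,
                 st_makepoint(-0.1276, 51.5072)::geography,
                 5000);
```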
[00:19:22] Jeremy: and something I thought was a little unique about PostgREST is that I believe it's written in Haskell. Is that right?
[00:19:29] Ant: Yeah, exactly. And it makes it fairly inaccessible to me as a result. But the good thing is it's got a thriving community of its own, and there's people who contribute probably because it's written in Haskell. It's just a really awesome project, and it's an excuse to contribute to it.
But yeah, I think I did probably the intro course, like many people, and beyond that it's just, yeah, kind of inaccessible to me.
[00:19:59] Jeremy: Yeah, I suppose that's the trade-off, right? You have a really passionate community of people who really want to use Haskell, and then you've got, I guess, the group like yourselves that looks at it and goes, oh, I don't know about this.
[00:20:13] Ant: I would love to have the time to invest in it, but it's not practical right now.
[00:20:21] Jeremy: You talked a little bit about the GoTrue project from Netlify. I think I saw on one of your blog posts that you actually forked it. Can you sort of explain the reasoning behind doing that?
[00:20:34] Ant: Yeah, initially it was because we were trying to move extremely fast. So we did Y Combinator in 2020. And when you do Y Combinator, you get a group partner, as they call it, one of the partners from YC, and they add a huge amount of external pressure to move very quickly. And our biggest feature that we were working on in that period was auth.
And we just kept getting the question of, when are you going to ship auth? You know, and every single week we'd be like, we're working on it, we're working on it. One of the ways we could do it was we just had to iterate extremely quickly, and we didn't really have the time to upstream things correctly.
And actually, the way we use it in our stack is slightly different. They connected it to MySQL, we connected to Postgres, so we had to make some structural changes to do that. And the dream would be now that we spend some time upstreaming a lot of the changes, and hopefully we do get around to that.
But yeah, the pace at which we've had to move over the last year and a half has been kind of scary, and that's the main reason. But you know, hopefully now that we're a little bit more established, we can hire some more people to just focus on GoTrue and bring the two forks back together.
[00:22:01] Jeremy: It's just a matter of, like you said, speed, I suppose, because with PostgREST you chose to continue working off of the existing open source project, right?
[00:22:15] Ant: Yeah, exactly. Exactly. And I think the other thing is it's not a major part of Netlify's business, as I understand it. I think if it was, and if both companies had more resource behind it, it would make sense to obviously focus on the single codebase. But both companies don't contribute as much resource as we would like. For me, though, it's one of my favorite parts of the stack to work on, because it's written in Go and I kind of enjoy how it all fits together.
So yeah, I like to dive in there.
[00:22:55] Jeremy: What about Go, or what about how it's structured, do you particularly enjoy about that part of the project?
[00:23:02] Ant: So I actually learned Go through GoTrue, and I have a Python and C++ background. And I hate the fact that I rarely get to use Python and C++ in my day to day job; it's obviously a lot of TypeScript. When we inherited this code base, as I was picking it up, it just reminded me a lot of the things I loved about Python and C++. And the tooling around it as well I just found to be exceptional. You just do a small amount of config, and it makes it very difficult to write bad code, if that makes sense.
So the compiler will just boot you back if you try and do something silly, which isn't necessarily the case with JavaScript. I think TypeScript is a little bit better now, but yeah, it just reminded me a lot of my Python and C++ days.
[00:24:01] Jeremy: Yeah, I'm not too familiar with go, but my understanding is that there's, there's a formatter that's a part of the language, so there's kind of a consistency there. And then the language itself tries to get people to, to build things in the same way, or maybe have simpler ways of building things. Um, I don't, I don't know.
Maybe that's part of the appeal.
[00:24:25] Ant: Yeah, exactly. And the package manager as well is great. It just does a lot of the importing automatically, and makes sure all the declarations at the top are formatted correctly and are definitely there. So yeah, just all of that tool chain is really easy to pick up.
[00:24:46] Jeremy: Yeah. And I think compiled languages as well, when you have the static type checking by the compiler, you know, not having things blow up at run time, that's just such a big relief, at least for me in a lot of cases.
[00:25:00] Ant: And I just love the dopamine hit of when you compile something and it actually compiles. I lose that working with JavaScript.
[00:25:11] Jeremy: For sure. One of the topics you mentioned earlier was how Supabase provides real-time database updates, which is something that, as far as I know, is not natively a part of Postgres. So I wonder if you could explain a little bit about how that works and how that came about.
[00:25:31] Ant: Yeah. So Postgres, when you add replication databases, the way it does it is it writes everything to this thing called the write-ahead log, which is basically all the changes that are going to be applied to the database. And when you connect a replication database, it basically streams that log across.
And that's how the replica knows what changes to apply. So we wrote a server which basically pretends to be a Postgres replica, receives the write-ahead log, and encodes it into JSON. And then you can subscribe to that server over WebSockets. So you can choose whether to subscribe to changes on a particular schema or a particular table or particular columns, and even do equality matches on rows and things like this.
And then we recently added the row level security policies to the real-time stream as well. That was something that took us a while, cause it was probably one of the largest technical challenges we've faced. But now that it's in, the real-time stream is fully secure, and you can apply the same policies that you apply over the CRUD API as well.
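Supabase's realtime server speaks the replication protocol directly, but the same logical-decoding machinery can be poked at from plain SQL. A sketch of the idea, assuming the wal2json output plugin is installed and `wal_level = logical` is set; the slot name is invented:

```sql
-- Create a logical replication slot that decodes WAL into JSON:
select * from pg_create_logical_replication_slot('realtime_demo', 'wal2json');

-- After some inserts/updates happen elsewhere, read the decoded
-- change stream (each row's "data" column is a JSON change record):
select data
from pg_logical_slot_get_changes('realtime_demo', null, null);

-- Slots retain WAL until dropped, so clean up when done:
select pg_drop_replication_slot('realtime_demo');
```

A server that keeps a slot like this open and pushes each decoded change out over a WebSocket is essentially what's being described here.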
[00:26:48] Jeremy: So for that part, did you have to look into the internals of Postgres and how it did its row level security and try to duplicate that in your own code?
[00:26:59] Ant: Yeah, pretty much. I mean, it's fairly complex, and there's a guy on our team who, well, for him it didn't seem as complex, let's say (laughs). But yeah, that's pretty much it. It's effectively a Postgres extension itself, which interprets those policies and applies them to the write-ahead log.
[00:27:26] Jeremy: And this piece that you wrote, that's listening to the write-ahead log, what was it written in, and how did you choose that language or that stack?
[00:27:36] Ant: Yeah. That's written in Elixir, which is based on Erlang and very horizontally scalable. Any application that you write in Elixir can kind of just scale horizontally; the message passing can go into the billions and it's no problem. So it just seemed like a sensible choice for this type of application, where you don't know how large the WAL is going to be. It could just be a few changes per second, it could be a million changes per second, and then you need to be able to scale out. And Paul, who's my co-founder, originally wrote the first version of it, and I think he wrote it as an excuse to learn Elixir, which is probably how PostgREST ended up being Haskell, I imagine.
The Elixir community is still relatively small, but it's a group of very passionate and very highly skilled developers. So when we hire from that pool, everyone who comes on board is just, yeah, really good and really enjoys working with Elixir.
So it's been a good source for hires as well, just using those tools.
[00:28:53] Jeremy: with a feature like this, I'm assuming it's where somebody goes to their website. They make a web socket connection to your application and they receive the updates that way. How have you seen how far you're able to push that in terms of connections, in terms of throughput, things like that?
[00:29:12] Ant: Yeah, I don't actually have the numbers at hand. We have a team focused on maximizing that, obviously, but yeah, I don't have those numbers right now.
[00:29:24] Jeremy: One of the last things you've got on your website is a storage product, and I believe it's written in TypeScript. So I was curious: we've got PostgREST, which is in Haskell, we've got GoTrue in Go, we've got the real-time database part in Elixir.
And so with storage, how did we finally get to TypeScript?
[00:29:50] Ant: (Laughs) Well, the policy we kind of landed on was best tool for the job. Again, the good thing about being open source is we're not resource constrained by the number of people who are on our team; it's the number of people who are in the community and are willing to contribute. And for that, I think one of the guys just went through a few different options. We could have gone with Go, just to keep it in line with a couple of the other APIs.
But we just decided, you know, for everyone on the team TypeScript is kind of just a given. And again, it was kind of down to speed: what's the fastest way we can get this up and running? And I think if we used TypeScript, it was the best solution there. But yeah, we just always go with whatever is best.
Um, we don't worry too much about, you know, the resources we have, because the open source community has just been so great in helping us build Supabase. And building Supabase is like building five companies at the same time, actually, because each of these vertical stacks could be its own startup, like the auth stack and the storage layer and all of this stuff.
And you know, each does have its own dedicated team. So yeah, we're not too worried about the variation in languages.
[00:31:13] Jeremy: And the storage layer is this basically a wrapper around S3 or like what is that product doing?
[00:31:21] Ant: Yeah, exactly. It's a wrapper around S3. It would also work with all of the S3-compatible storage systems. There's a few, Backblaze and some others, so if you wanted to self host and use one of those alternatives, you could. We just have everything in our own S3 buckets inside of AWS.
And then the other awesome thing about the storage system is that we store the metadata inside of Postgres, basically the object tree of what buckets and folders and files are there. So you can write your row level policies against the object tree. You can say this user should only access this folder and its children, which was kind of an accident, we just landed on that. But it's one of my favorite things now about writing applications on Supabase: the row level policies kind of work everywhere.
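A hypothetical policy over that object tree, assuming a metadata table shaped like Supabase's `storage.objects` (with `bucket_id` and a slash-delimited `name` path) and the `auth.uid()` helper; the bucket name and folder convention here are invented for the example:

```sql
-- Restrict each user to reading files under their own
-- top-level folder, e.g. "<user-id>/photo.png":
create policy own_folder_only
  on storage.objects
  for select
  using (
    bucket_id = 'avatars'
    and (string_to_array(name, '/'))[1] = auth.uid()::text
  );
```

Because the storage API checks this table before serving a file, the same policy language gates both database rows and objects.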
[00:32:21] Jeremy: Yeah, it's interesting. It sounds like everything, whether it's the storage or the authentication, all comes back to Postgres, right? It's using the row level security, it's using everything that you put into the tables there, and everything's just kind of digging into that to get what it needs.
[00:32:42] Ant: Yeah. And that's why I say we are a database company. We are a Postgres company. We're all in on Postgres. We got asked in the early days, oh, would you also make it MySQL compatible, compatible with something else? But the amount of features Postgres has, if we just continue to leverage them, it just makes the stack way more powerful than if we try to, you know, go thin across multiple different databases.
[00:33:16] Jeremy: And that kind of brings me to, you mentioned how you're a Postgres company, so when somebody signs up for Supabase and they create their first instance, what's happening behind the scenes? Are you creating a Postgres instance for them in a container, for example? How do you size it, that sort of thing?
[00:33:37] Ant: Yeah. So it's basically just EC2 under the hood for us. We have plans eventually to be multi-cloud, but again, coming down to speed of execution, the fastest way was to just spin up a dedicated Postgres instance per user on EC2. We also package all of the APIs together in a second EC2 instance.
But we're starting to break those out into clustered services. So for example, you know, not every user will use the storage API, so it doesn't make sense to run it for every user regardless. So we've made that application code multi-tenant, and now we just run a huge global cluster which people connect through to access their S3 bucket, basically. And we have plans to do that for the other services as well.
So right now you get two EC2 instances, but over time it will be just the Postgres instance. And we wanted to give everyone a dedicated instance, because there's nothing worse than sharing database resource with all the users, especially when you don't know how heavily they're going to use it, whether they're going to be bursty.
So I think one of the things we just said from the start is everyone gets a Postgres instance, and you get access to it as well. You can use your Postgres connection string to log in from the command line and kind of do whatever you want. It's yours.
[00:35:12] Jeremy: So did I get it right that when I sign up and create a Supabase account, you're actually creating an EC2 instance for me specifically? So it's like every customer gets their own isolated CPU, their own RAM, that sort of thing.
[00:35:29] Ant: Yeah, exactly, exactly. And the way we've set up the monitoring is that we can expose basically all of that to you in the dashboard as well. So you have some control over the resources you want to use. If you want a more powerful instance, we can do that. A lot of that stuff is automated.
So if someone scales beyond the allocated disk size, the disk will automatically scale up by 50% each time. And we're working on automating a bunch of these other things as well.
[00:36:03] Jeremy: So is it where, when you first create the account, you might create, for example, a micro instance, and then you have internal monitoring tools that see, oh, the CPU is getting hit pretty hard, so we need to migrate this person to a bigger instance, that kind of thing?
[00:36:22] Ant: Yeah, pretty much exactly.
[00:36:25] Jeremy: And is that, is that something that the user would even see or is it the case of where you send them an email and go like, Hey, we notice you're hitting the limits here. Here's what's going to happen.
[00:36:37] Ant: Yeah.
In most cases it's handled automatically. There are people who come in and, from day one, they say, here's my requirements: I'm going to have this much traffic, and I'm going to have, you know, a hundred thousand users hitting this every hour. And in those cases we will over-provision from the start.
But if it's just the self service case, then it will start on a smaller instance and upgrade over time. And this is one of our biggest challenges over the next five years: we want to move to a more scalable, cloud native Postgres. The cool thing about this is there's a lot of different companies and individuals working on this and upstreaming into Postgres itself. So for us, we don't need to, and we would never want to, fork Postgres and, you know, try and separate the storage and the compute. Rather, we're gonna fund people who are already working on this, so that it gets upstreamed into Postgres itself and it's more cloud native.
[00:37:46] Jeremy: Yeah. So I think the, like we talked a little bit about how Firebase was the original inspiration and when you work with Firebase, you, you don't think about an instance at all, right? You, you just put data in, you get data out. And it sounds like in this case, you're, you're kind of working from the standpoint of, we're going to give you this single Postgres instance.
As you hit the limits, we'll give you a bigger one. But at some point you will hit a limit where just that one instance is not enough, and I wonder if you have any plans for that, or if you're doing anything currently to handle that.
[00:38:28] Ant: Yeah. So the medium-term goal is to do replication, horizontal scaling. We do that for some users already, but we manually set it up. We do want to bring that to the self serve model as well, where you can just choose from the start: I want replicas in these zones, in these different data centers.
But then, like I said, the long-term goal is that it's not based on horizontally scaling a number of instances; it's that Postgres itself can scale out. And honestly, at the rate at which the Postgres community is working, I think we'll be there in two years. And if we can contribute resource towards that goal, yeah, we'd love to do that. But for now, we're working on this intermediate solution of what people already do with Postgres, which is, you know, have your replicas to make it highly available.
[00:39:30] Jeremy: And with that, I suppose, at least in the short term, the goal is that your monitoring software and your team is handling scaling up the instance or creating the read replicas, so to the user it for the most part feels like a managed service. And then, yeah, the next step would be to get something more similar to maybe Amazon's Aurora, I suppose, where you just kind of pay per use.
[00:40:01] Ant: Yeah, exactly. Exactly. Aurora was kind of the goal from the start. It's just a shame that it's proprietary, obviously.
[00:40:08] Jeremy: right.
Um, but it sounds,
[00:40:10] Ant: The world would be a better place if Aurora was open source.
[00:40:15] Jeremy: Yeah. And it sounds like, as you said, there's people in the open source community that are trying to get there; it'll just take time. To do all of this, to make it feel seamless, to make it feel like a serverless experience even though internally it really isn't, I'm guessing you must have a fair amount of monitoring, or ways that you're making these decisions.
I wonder if you can talk a little bit about, you know, what are the metrics you're looking at, and what are the applications you have to help you make these decisions?
[00:40:48] Ant: Yeah, definitely. So we started with Prometheus, which is a, you know, metrics gathering tool. And then we moved to Victoria Metrics, which was just easier for us to scale out. I think soon a hundred thousand Postgres databases will have been deployed on Supabase, so definitely some scale, and this kind of tooling needs to scale to that as well. And then we have agents kind of everywhere, on each application, on the database itself, and we listen for things like the CPU and the RAM and the network IO. We also poll Postgres itself. There's an extension called pg_stat_statements, which will give us information about what are the intensive queries that are running on that box.
So we just collect as much of this as possible, which we then obviously use internally. We set alerts to know when we need to upgrade in a certain direction, but we also have an endpoint where the dashboard subscribes to these metrics as well, so the users themselves can see a lot of this information.
I think at the moment we do a lot of the RAM, the CPU, that kind of stuff, but we're working on adding more and more of these observability metrics so people can know more. Because it also helps with, let's say, you might be lacking an index on a particular table and not know about it.
And so if we can expose that to you and give you alerts about that kind of thing, then it obviously helps with the developer experience as well.
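The kind of "what's hammering this box" query that polling pg_stat_statements gives you looks roughly like this (column names per Postgres 13 and later; older versions call the timing column `total_time`):

```sql
create extension if not exists pg_stat_statements;

-- Top 10 query shapes by total time spent executing:
select query,
       calls,
       total_exec_time,                      -- cumulative ms
       total_exec_time / calls as mean_ms    -- average per call
from pg_stat_statements
order by total_exec_time desc
limit 10;
```

A query with huge `calls` and high `mean_ms` on a filtered column is often exactly the missing-index case mentioned above.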
[00:42:29] Jeremy: Yeah. And that brings me to something that I hear from platform as a service companies, where if a user has a problem, whether that's a crash or a performance problem, sometimes it can be difficult to distinguish between: is it a problem in their application, or is this a problem in Supabase? And I wonder how your support team kind of approaches that.
[00:42:52] Ant: Yeah, no, it's a great question. And it's definitely something we deal with every day. I think because of where we're at as a company, we actually have a huge advantage in that we can provide really good support. So anytime an engineer joins Supabase, we tell them your primary job is actually frontline support.
Everything you do afterwards is secondary. And so everyone does a four hour shift per week of working directly with the customers to help determine this kind of thing. And where we are at the moment is we are happy to dive in and help people with their application code, because it helps our engineers learn about how it's being used and where the pitfalls are, where we need better documentation, where we need education.
So that is all part of the product at the moment, actually. And like I said, because we're not a 10,000 person company, it's an advantage that we have, that we can deliver that level of support at the moment.
[00:44:01] Jeremy: What are some of the most common things you see happening? I would expect you mentioned indexing problems, but I'm wondering if there's any specific things that just come up again and again.
[00:44:15] Ant: I think the most common is people not batching their requests. So they'll write an application which, you know, needs to pull 10,000 rows, and they send 10,000 requests (laughs). That's a typical one for people just getting started, maybe. And then I think the other thing we faced in the early days was people storing blobs in the database, which we obviously solved by introducing file storage. But people would be trying to store, you know, 50 megabyte, a hundred megabyte files in Postgres itself, and then asking why the performance was so bad.
So I think we've mitigated that one by introducing the blob storage.
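The batching fix is usually a one-line change. In raw SQL the unbatched and batched versions look like this (PostgREST exposes the same idea over HTTP as `GET /items?id=in.(1,2,3)`); the table and ids are invented for the example:

```sql
-- The anti-pattern: one round trip per row.
select * from items where id = 1;
select * from items where id = 2;
-- ...repeated 10,000 times.

-- The fix: one round trip for the whole set.
select * from items where id = any(array[1, 2, 3]);
```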
[00:45:03] Jeremy: And you mentioned you have over a hundred thousand instances running. I imagine there have to be cases where an incident occurs, where something doesn't go quite right, and I wonder if you could give an example of one and how it was resolved.
[00:45:24] Ant: Yeah, it's a good question. I think, yeah, we've improved the systems since then, but there was a period where our real time server wasn't able to handle really large write-ahead logs. So there was a period where people would just make tons and tons of requests and updates to Postgres, and the real time subscriptions were failing. But like I said, we have some really great Elixir devs on the team, so they were able to jump on that fairly quickly. And now, you know, the application is way more scalable as a result. And that's just kind of how the support model works: you have a period where everything is breaking, and then you can just, you know, tackle these things one by one.
[00:46:15] Jeremy: Yeah, I think any, anybody at a, an early startup is going to run into that. Right? You put it out there and then you find out what's broken, you fix it and you just get better and better as it goes along.
[00:46:28] Ant: Yeah. And the funny thing was, this model of deploying EC2 instances, we had that in the first week of starting Supabase, just me and Paul. And it was never intended to be the final solution. We just kind of did it quickly to get something up and running for our first handful of users, but it's scaled surprisingly well.
And actually, the things that broke as we started to get a lot of traffic and a lot of attention were just silly things. Like, we give everyone their own domain when they start a new project, so you'll have your project ref dot supabase dot io or co. And the things that were breaking were like, you know, we'd run out of subdomains with our DNS provider. And those things always happen in periods of intense traffic.
So we'd be on the front page of Hacker News, or we had a TechCrunch article, and then you discover that you've run out of subdomains and the last thousand people couldn't deploy their projects. That's always a fun challenge, because you are then dependent on the external providers as well, and their support systems.
So yeah, I think we did a surprisingly good job of putting in good infrastructure from the start. But yeah, all of these crazy things just break when you get a lot of traffic.
[00:48:00] Jeremy: Yeah, I find it interesting that you mentioned how you started with creating the EC2 instances and it turned out that just worked. I wonder if you could walk me through a little bit of how it worked in the beginning. Like, was it the two of you going in and creating instances as people signed up? And then how did it go from there to where it is today?
[00:48:20] Ant: Yeah. So there's a good story about our first user, actually. Me and Paul used to contract for a company in Singapore, which was an NFT company, and so we knew the lead developer very well. We also still had the Postgres credentials on our own machines. And the other funny thing is, when we first started, we didn't intend to host the database.
We thought we were just going to host the applications that would connect to your existing Postgres instance. And so what we did was we hooked up the applications to the Postgres instance of this startup that we knew very well. And then we took the bus to their office, and we sat with the lead developer, and we said, look, we've already set this thing up for you.
What do you think? You know, when you think like, ah, we've got the best thing ever, it's not until you put it in front of someone and you see them contemplating it that you're like, oh, maybe it's not so good. Maybe we don't have anything. And we had that moment of panic of, like, oh, maybe this isn't great.
And then what happened was, he didn't, like, use us. He didn't become a Supabase user. He asked to join the team.
[00:49:45] Jeremy: nice, nice.
[00:49:46] Ant: That was a good kind of moment where we thought, okay, maybe we have got something, maybe this isn't terrible. So yeah, he became our first employee.
[00:49:59] Jeremy: And so, yeah, that case was, you know, the very beginning, where you set everything up from scratch. Now that you have people signing up, and you have, you know, I don't know how many signups you get a day, did you write custom infrastructure or applications to do the provisioning, or is there an open source project that you're using to handle that?
[00:50:21] Ant: Yeah, it's actually mostly custom. And you know, AWS does a lot of the heavy lifting for you; they just provide you with a bunch of API endpoints. So a lot of that is just written in TypeScript, fairly straightforward, and, like I said, it was never intended to be the thing that lasts two years into the business.
But it's just scaled surprisingly well. And I'm sure at some point we'll swap it out for some, I don't know, orchestration tooling like Pulumi or something like this. But actually, what we've got just works really well.
[00:50:59] Ant: Because we're so into Postgres, our queuing system is a Postgres extension called pg-boss. And then we have a fleet of workers, which we manage on ECS. So it's just a bunch of VMs, basically, which just subscribe to the queue, which lives inside the database,
and perform all the work, whether it be project creation, deletion, modification, a whole suite of these things. Yeah.
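The core pattern a Postgres-backed queue like pg-boss relies on is `FOR UPDATE SKIP LOCKED`, which lets many workers pull jobs concurrently without grabbing the same one. A simplified sketch, not pg-boss's actual schema:

```sql
-- A minimal jobs table; the shape is an assumption for illustration.
create table jobs (
  id      bigserial primary key,
  kind    text not null,             -- e.g. 'create_project'
  payload jsonb,
  state   text not null default 'queued'
);

-- Each worker runs this to claim exactly one job atomically.
-- SKIP LOCKED means a row another worker already locked is
-- silently passed over rather than blocked on.
with next_job as (
  select id
  from jobs
  where state = 'queued'
  order by id
  limit 1
  for update skip locked
)
update jobs
set state = 'active'
from next_job
where jobs.id = next_job.id
returning jobs.*;
```

The worker then performs the job (spin up an instance, delete a project, and so on) and marks the row completed in a follow-up update.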
[00:51:29] Jeremy: very cool. And so even your provisioning is, is based on Postgres.
[00:51:33] Ant: Yeah, exactly. Exactly (laughs) .
[00:51:36] Jeremy: I guess in that case, I think, did you say you're using the write-ahead log there in order to get notifications?
[00:51:44] Ant: We do use realtime, and this is the fun thing about building Supabase: we use Supabase to build Supabase. A lot of the features start with things that we build for ourselves. So for the observability features, we have a huge logging division, and we were very early users of a tool called Logflare, which is also written in Elixir.
It's basically a log sink backed by BigQuery. And we loved it so much, and became such Logflare power users, that we decided to eventually acquire the company. And now we can just offer Logflare to all of our customers as well, as part of using Supabase. So you can query your logs and get really good business intelligence on what your users are consuming from your database.
[00:52:35] Jeremy: The Logflare you're mentioning, though, you said that's a log sink, and that's actually not going to Postgres, right? That's going to a different type of store.
[00:52:43] Ant: Yeah, that is going to BigQuery, actually.
[00:52:46] Jeremy: Oh, big query. Okay.
[00:52:47] Ant: Yeah, and maybe eventually. This is the cool thing about watching the Postgres progression: it's bringing transactional and analytical databases together. So it's traditionally been a great transactional database, but if you look at a lot of the changes that have been made in recent versions, it's becoming closer and closer to an analytical database.
So maybe at some point we will use it, but yeah, BigQuery works just great.
[00:53:18] Jeremy: Yeah, it's interesting to see. Like, I know that we've had episodes on different extensions to Postgres, where I believe they change out how the storage works. So yeah, it's really interesting how it's this one database, but it seems like it can take so many different forms.
[00:53:36] Ant: It's just so extensible, and that's why we're so bullish on it. Because, okay, maybe it wasn't always the best database, but now it seems like it is becoming the best database. And at the rate at which it's moving, where's it going to be in five years? We're just very bullish on Postgres,
as you can tell from the amount of mentions it's had in this episode.
[00:54:01] Jeremy: yeah, we'll have to count how many times it's been said. I'm sure. It's, I'm sure it's up there. Is there anything else we, we missed or think you should have mentioned.
[00:54:12] Ant: No, some of the things we're excited about are cloud functions. So it's the thing we just get asked for the most. Anytime we post anything on Twitter, you're guaranteed to get a reply which is like, when functions? And we're very pleased to say that it's almost there. So, um, that will hopefully be a really good developer experience. Also we launched a GraphQL Postgres extension where the resolver lives inside of Postgres.
And that's still in early alpha, but I think I'm quite excited for when we can start offering that on the hosted platform as well. People will have that option to use GraphQL instead of, or as well as, the restful API.
[00:55:02] Jeremy: The common thread here is that PostgreSQL, you're able to take it really, really far, right? In terms of scale up, eventually you'll have the read replicas. Hopefully you'll have some kind of, I don't know what you would call it, Aurora, but it's almost like self provisioning, maybe not sharding, or however you'd describe it.
But I wonder as a, as a company, like we talked about big query, right? I wonder if there's any use cases that you've come across, either from customers or in your own work where you're like, I just, I just can't get it to fit into Postgres.
[00:55:38] Ant: I think, like, not very often, but sometimes we will respond to support requests and recommend that people use Firebase. It's rare,
but if they really do have large amounts of unstructured data, which, you know, document storage is kind of perfect for, we'll just say, you know, maybe you should just use Firebase.
So we definitely come across things like that. And like I said, we love Firebase, so we're definitely not trying to, uh, destroy it as a tool. I think it has its use cases where it's an incredible tool, yeah. And it provides a lot of inspiration for what we're building as well.
[00:56:28] Jeremy: All right. Well, I think that's a good place to wrap it up, but where can people hear more about you, and hear more about supabase?
[00:56:38] Ant: Yeah, so supabase is at supabase.com. I'm on Twitter at antwilson. Supabase is on Twitter at supabase. Just hit us up. We're quite active on there. And then definitely check out the repos at github.com/supabase. There's lots of great stuff to dig into, as we discussed. There's a lot of different languages, so kind of whatever you're into, you'll probably find something where you can contribute.
[00:57:04] Jeremy: Yeah, and we, we sorta touched on this, but I think everything we've talked about with the exception of the provisioning part and the monitoring part is all open source. Is that correct?
[00:57:16] Ant: Yeah, exactly.
And yeah, hopefully everything we build moving forward, including functions and GraphQL, will continue to be open source.
[00:57:31] Jeremy: And then I suppose the one thing I, I did mean to touch on is what, what is the, the license for all the components you're using that are open source?
[00:57:41] Ant: It's mostly Apache 2.0 or MIT. And then obviously Postgres has its own Postgres license. So as long as it's one of those, then we're not too precious. As I said, we inherit a fair amount of projects, so we contribute to and adopt projects. So as long as it's very permissive, then we don't care too much.
[00:58:05] Jeremy: As far as the projects that your team has worked on, I've noticed that over the years, we've seen a lot of companies move to things like the business source license or there's, there's all these different licenses that are not quite so permissive. And I wonder like what your thoughts are on that for the future of your company and why you think that you'll be able to stay permissive.
[00:58:32] Ant: Yeah, I really, really hope that we can stay permissive forever. It's a philosophical thing for us. You know, when we started the business, we were just very, as individuals, into the idea of open source. And you know, if AWS come along at some point and offer hosted supabase on AWS, then it will be a signal that we're doing something right.
And at that point, I think we just need to be the best team to continue to move supabase forward. And if we are that, and I think we will be, then hopefully we will never have to tackle this licensing issue.
[00:59:19] Jeremy: All right. Well, I wish you, I wish you luck.
[00:59:23] Ant: Thanks. Thanks for having me.
[00:59:25] Jeremy: This has been Jeremy Jung for software engineering radio. Thanks for listening.
Jason Swett is the author of the Complete Guide to Rails Testing. We covered Jason's experience with testing while building relatively small Ruby on Rails applications. Our conversation applies to just about any language or framework so don't worry if you aren't familiar with Rails.
A few topics covered:
- Listen to advice but be aware of its context. Something good for a large project may not apply to a small one
- Fast feedback loops help us work quicker and tests are great for this
- If you don't involve things like the database in any of your tests your application may not work at all despite your tests passing
- You may not need to worry about scaling at the start for smaller or internal applications
- Try to break features into the smallest pieces possible so they can be checked in and reviewed quickly
- Jason doesn't remember the difference between a stub and a mock because he rarely uses them
Related Links:
- Code with Jason
- The Complete Guide to Rails Testing
- Code With Jason Podcast
Transcript:
[00:00:00] Jeremy: today I'm talking to Jason Swett, he's the author of the complete guide to rails testing, a frequent trainer and conference speaker. And he's the host of the code with Jason podcast. So Jason, welcome to software sessions.
[00:00:13] Jason: Thanks for having me.
[00:00:15] Jeremy: from listening to your podcast, I get a sense that the size of the projects you work on they're, they're relatively modest.
Like they're not like a super huge thing. There, there may be something that you can fit all within your head. And I was wondering if you could talk a little bit to that first, so that we kind of get where your perspective is and the types of projects you work on are.
[00:00:40] Jason: Yeah. Good question. So that is true. Most of my jobs have been at small companies and I think that's probably typical of the typical developer because most businesses in the world are small businesses. You know, there's, there's a whole bunch of small businesses for every large business. And so most of the code bases I've worked on have been not particularly huge.
And most of the teams I've worked on have been relatively small, and sometimes so small that it's just me; I'm the only person working on the application. I don't really know any different, so I can't really compare it to working on a larger application. I did work at AT&T, so that was a big place, but I was at AT&T just working on my own solo project, so that wasn't a big code base either.
So yeah, that's been what my experience has been like.
[00:01:36] Jeremy: Yeah. And I think that's interesting that you mentioned most people work in that space as well, because that's basically where I fall as well. So when I listen to your podcast and I hear you talking about, like, oh, I have a rails project where I just have a single server, and you know, I have a database and rails, and maybe I have nginx in front, maybe redis, it's sort of the scale that I'm familiar with. Versus when I hear podcasts or read articles where they're talking about, oh, we have 500 microservices, or we have 200 instances of the application.
That's, that's not a space that I've, I've worked in. So I, I found it helpful to, to hear, you know, from you on your show that like, Hey, you know, not everybody is working on these gigantic projects.
[00:02:28] Jason: Yeah. Yeah. It's not terribly relatable when you hear about those huge projects.
And obviously, sometimes, maybe people earlier in their career can get the wrong idea about what's applicable to their situation. I feel like one of the most dangerous kinds of advice is advice that's good advice, but it's good advice for somebody else.
And I've been a victim of that, where I get some advice and maybe it's genuinely good advice, but it's not good advice for me, where I am, doing what I'm doing. And so I apply the advice, but it's not the right thing, and so it doesn't work out for me. So I'm always careful to, like, asterisk a lot of the things I say, where it's like, hey, this is good advice if you're in this particular situation, but maybe not for everybody.
And really the truth is I, I try not to give advice these days because like advice is dangerous stuff for that very reason.
[00:03:28] Jeremy: so, so when you mentioned you try not to give advice and you have this book, the complete guide to rails testing, would you not describe what's in the book as advice? I'm kind of curious what the distinction is there.
[00:03:42] Jason: Yeah, Jeremy, right after I said that, I'm like, what am I talking about? I give all kinds of advice. So forget I said that, I totally give advice. But maybe not on certain things, like business advice or anything like that. I do give a lot of advice around testing and various programming things.
So, yeah, ignore that part of what I said.
[00:04:03] Jeremy: Something that I found a little bit unique about rails testing was that a lot of the tests are centered around, I guess you could call it, a full integration test, right? Because I noticed when working with rails, if I write a test, a lot of times it's talking to the database; if I have an API or I have a website, it's actually talking to the API. So it's actually going through all the layers and spinning up a database and all that. And I wonder if you knew how that worked. Like, each time you run a test, is it creating a new database, so that each test is isolated? Or how does all that stuff actually work?
[00:04:51] Jason: Yeah, good question. First. I want to mention something about terminology. So I think one of the most important things for somebody who's new to testing to learn is that in our industry, we don't have a consensus around terminology. So what you call an integration test might be different from what I call an integration test.
The thing you just described as an integration test, I might call an acceptance test. Although I happen to also call it an integration test cause I use that terminology too, but I just wanted to give that little asterisk for the listener, because if they're like, wait, I thought an integration test was this.
And not that. Anyway, you asked how does that work? So it is true that with those types of rails tests, and just to add more terminology into the mix, they call those system tests or system specs, depending on what framework you're using. But those are the tests that actually instantiate a browser and, simulating user input, exercise the UI of the application.
And those are the kinds of tests that, like, show you that everything works together. And mechanically, how that works: one layer of it is that each test runs in a database transaction. So, you know, in order to run a certain test, maybe you need certain records, like a user. And then, I don't know, if it's a scheduling test, you might need to create an appointment and whatever. All those records that you create specifically for that test, that's happening inside of a database transaction. And then at the end of the test, the transaction is aborted, so that none of the data you create during the test actually gets persisted to the database. Then regarding the whole database: it's not actually creating a new database instance at the start of each test and then blowing it away.
It's still the same database instance; it's just that the data created inside of each test is not being persisted.
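For listeners who want to see what this looks like, the transaction-per-test behavior Jason describes is a one-line setting in RSpec (Rails' built-in Minitest setup enables the equivalent by default). This is just a configuration fragment, assuming a Rails app with rspec-rails installed:

```ruby
# spec/rails_helper.rb (fragment)
RSpec.configure do |config|
  # Run each example inside a database transaction that is rolled back
  # when the example finishes, so no test data is persisted.
  config.use_transactional_fixtures = true
end
```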
[00:07:05] Jeremy: Okay. So when you run what you would call, I guess you called it, an acceptance test, right? Where it's opening up your website, it's clicking through the website, creating records, things like that. That's happening in a database instance that's created, I guess, for all your tests, that all your tests get to reuse, and rails is automatically wrapping your test in a transaction.
So even if you're doing five or 10 database queries, at the end of all that, they all get rolled back because they're all within the same transaction.
[00:07:46] Jason: Exactly. And the reason why we want to do that is because of a testing principle that you want your tests to be runnable in any order. And the key thing is you want your tests to be deterministic. So deterministic means that the starting state determines the end state, and it's the same every time, no matter what.
So if you have tests A, B and C, it shouldn't be the case that you can run them in the order ABC and they all pass, but if you do it CBA, then test A fails. Because it should only fail if something's actually wrong; it shouldn't fail for some other reason, like the order in which you run the tests. And so to ensure that property of determinism, we need to make it so that each test doesn't leak into the other tests.
Cause imagine if that database transaction thing didn't happen. And it's only incidental that that's achieved via database transactions; it could conceivably be achieved some other way, that's just how this happens to work in this particular case. But imagine if no measure was taken to clean up afterward, and I ran a test and it generated an appointment.
And then the test that runs after that does some test that involves, like, doing a count of appointments or something like that. And maybe, coincidentally, my second test passes because I've always run the tests in a certain order. And so, unbeknownst to me, test B only passes because of what I did in test A. That's bad, because now the thing that's happening is different from what I think is happening.
And then if it flipped, and we ran test B and then test A, it wouldn't work anymore. So that's why we make each test isolated, so it can be deterministic.
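The order dependence Jason warns about can be sketched in plain Ruby with two "tests" that share mutable state. Everything here is hypothetical, but it shows why test B passing can silently depend on test A having run first:

```ruby
# A deliberately bad pair of tests sharing global state.
$appointments = []

def test_a_creates_appointment
  $appointments << { patient: "Alice" }
  raise "test A failed" unless $appointments.size == 1
end

def test_b_counts_appointments
  # Written assuming A already ran: it leans on A's leftover data.
  raise "test B failed" unless $appointments.size == 1
end

# Run in the usual order: both pass, hiding the dependency.
test_a_creates_appointment
test_b_counts_appointments
puts "A then B: both pass"

# Simulate running B first on a fresh run: it fails, even though
# nothing about B's own behavior changed.
$appointments = []
begin
  test_b_counts_appointments
rescue RuntimeError
  puts "B alone: fails"
end
```

Wrapping each test in a rolled-back transaction removes exactly this kind of leakage between tests.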
[00:09:51] Jeremy: And I wonder if you've worked with any other frameworks or any other languages, and if you found that the approaches in those frameworks or languages are similar to rails, like where it creates the transaction for you, does the rollback for you, and all of that.
[00:10:08] Jason: Good question. I have to plead ignorance. I've dabbled a little bit in testing and some other languages or frameworks, but not enough to be able to say anything smart about it.
[00:10:22] Jeremy: Yeah, I mean in my experience and of course there are many different frameworks that I'm not familiar with, but in a lot of cases, I I've seen that they don't have this kind of behavior built in, like, they'll provide you a way to test your application, but it's up to you if you want to write code that will wrap everything in a transaction or create a new database instance per test, things like that.
That's all left up to you. so I, I think it's interesting that that rails makes that decision for you and makes it to where you don't really have to think about that or make that decision. And for me personally, I found that really helpful.
[00:11:09] Jason: Yeah, it's really nice. It's a decision that not everybody is going to be on board with. And by that decision, I mean the general decision of rails to make a lot of decisions for you. And it may not be the case that I agree with every single decision that rails has made, but I do appreciate that that the rails team or DHH, or whoever has decided that rails is just going to have all these sensible defaults.
And that's what you get. And if you want to go tweak that stuff, I guess you can, but you get all this stuff this way, cause we decided what we think is the best way to do it. And that is how most people use their rails apps. I think it's great; it eliminates a lot of overhead. And then when I use some other technologies, I've done some JavaScript stuff, it's just astonishing how much boilerplate and how much energy I have to expend on decisions that don't really matter.
And maybe frankly, decisions that I'm not all that equipped to make, because I don't have the requisite knowledge to be able to make those decisions. And usually I'd rather just have somebody else make those decisions for me.
[00:12:27] Jeremy: We've been talking about the more high level tests, the acceptance tests, the integration tests. And when you're choosing how to test something, how do you decide whether it should be tested at that level, or if it should be more of a unit level test, something smaller?
[00:12:49] Jason: Good question. So I want to zoom out just a little bit in order to answer that question and come at it from a distance. So I recently conducted some interviews for a programmer job. I interviewed about 25 candidates, and the first step of the interview was this technical coding exercise. Most of the candidates did not pass; maybe, I don't know, five or six or seven of the candidates out of those 25 did pass. I thought it was really interesting: the ones who failed all failed in the same way, and the ones who passed all passed in the same way. And I thought about what exactly is the difference.
And the difference was that the programmers who passed, they coded in feedback loops. So I'll say that a different way, the ones who failed, they tried to write their whole program at once and they would spend 15, 20 minutes carefully writing the program. And then at the end of that 20 minutes, they would try to run it.
And unsurprisingly to me, the program would fail, like, on line 2 of 30, because nobody's smart enough to write that much code and have the whole thing work. And then the ones who did well: they would write maybe one line of code, run it, observe what happens, compare what they observed to what they expected to see, and if any corrections were needed, they made those corrections and ran it again.
And then only once their expectations were satisfied did they go and write a second line, and they would repeat that process again. That workflow of programming in feedback loops, I think, is super important. And I think it's what distinguishes, hmm, I don't exactly want to say successful programmers from unsuccessful programmers, but it certainly has a lot to do with speed.
Like, think about how much slower it is to try to write your whole program, run it, and see that it fails, and then try to find the needle in the haystack. It's like, okay, I just wrote 30 lines; there's a problem somewhere, I don't know where, and now I have to dig through and find it. It's so much harder than if you just write one line and you see a problem and you know that that problem lies in the line you just wrote.
So I say all that, because testing is just feedback loops automated. So rather than writing a line and then manually running your program and using your own judgment to compare what you observed to what you expected to see you write a test that exercises your code and says, I expect to see this when this happens.
And so the kind of test you write, now to answer your question, will depend first on the nature of the thing you're writing. But if we take kind of the typical case of, let's say, I'm building a form that will allow me to create a customer in a system, and I put in the first name, last name and email address of the customer. That's really basic, like, CRUD functionality. There's not a lot of complexity there. And so, to be honest, I might just not write a test at all, and we can get into how I decide when to write a test and when not to. But I probably would write a test, and if I did, I would write a system spec, to use the Rails RSpec terminology, that spins up a browser.
I would fill in the first name field with a first name, fill in the last name field with the last name, the email with an email, click the submit button, and then I would assert that on the subsequent page I see some indicator of success. And then if we think about something that's maybe more involved, like I'm thinking about some of the complicated stuff I've been working on recently regarding, um, coming up with a patient's balance in the medical system that I work on.
That's a case where I'm not going to spin up a browser to check the correctness of a number, cause that feels like a mismatch. I'm going to work at a lower level and maybe create some database records and say, when I create this charge and when I create this payment, I expect the remaining balance to be such and such.
So the type of test I write depends highly on the kind of functionality.
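A system spec for the kind of customer form Jason describes might look something like this sketch. The path helper, field labels, and success message here are assumptions, not from his codebase, and it only runs inside a Rails app with RSpec and Capybara configured:

```ruby
# spec/system/create_customer_spec.rb (sketch)
require "rails_helper"

RSpec.describe "Creating a customer", type: :system do
  before { driven_by(:rack_test) }  # headless driver; no JavaScript needed

  it "shows an indicator of success after submitting the form" do
    visit new_customer_path

    fill_in "First name", with: "Ada"
    fill_in "Last name",  with: "Lovelace"
    fill_in "Email",      with: "ada@example.com"
    click_button "Create Customer"

    # Assert on some indicator of success on the subsequent page.
    expect(page).to have_content("Customer was successfully created")
  end
end
```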
[00:17:36] Jeremy: So it sounds like in the case of something that's more straightforward, you might write a high level test, I guess, where you were saying, I just click this button and I see if the thing I expected to be created is there on the next page. And you might create that test from the start and then just start filling in the code, continually running that test, you know, until it passes.
But you also mentioned that in the case of something simple like that, you might actually choose to forego the test and just take a look, you know, visually: you open the app and you click that same button and you can see the same result. So I wonder if you could talk a little bit more about how you decide, like, yeah, I'm going to write this test, or no, I'm just going to inspect it visually.
[00:18:28] Jason: Yeah. So real quick, before I answer that, I want to say that it's not whether the test is straightforward or the feature is straightforward that determines which kind of test I write. Because sometimes the acceptance test that I write, which spins up a browser and everything, sometimes that might be quite an involved test in a complicated feature, or sometimes I might write a lower level test and it's a trivially simple one.
It has more to do with, um, what's the thing that I care about. Like, is it primarily a UI based feature, is that like the meat of it? Or is it a lower level, like, calculation type thing or something like that? That's kind of what determines which kind I write. But you asked when would I decide not to write a test.
So the reason I write tests is because it's just like cost prohibitive to manually perform testing, not just in monetary terms, but like in emotional pain and mental energy and stuff like that. I don't want to go back and manually test everything to make sure that it's still working. And so the ROI on writing automated tests is almost always positive, but sometimes it's not a positive ROI.
And so when I don't write a test, it's if these conditions are true: if the consequences of the feature breaking are really small, and the frequency of the usage is low, and the cost of writing the test is high, then I probably won't write a test.
For example, if there's some report that somebody looks at once every six months, and it's, like, maybe a front desk person who uses the feature, and if it doesn't work, then it means they have to instead go get the answer manually, and instead of getting the answer in 30 seconds, it takes them five minutes.
That's an extremely low cost to the failure. And it's like, okay, so I'm costing somebody maybe 20 bucks once every six months if this feature breaks. And let's say this test is one that would take, like, an hour for me to write. Clearly it's better just to accept the risk of that feature breaking once in a while, which it's probably not going to anyway. So those are the questions I ask when I decide. And to be clear, it's not like I run through all those questions for every single test I write. In the vast, vast majority of cases, I just write the test because it's a no-brainer that it's better to write the test. But sometimes my instincts tell me, like, hey, is this really actually important to write a test for?
And when I find myself asking that, then I say, okay, what are the consequences of the breakage? How hard is this test to write? All that.
[00:21:46] Jeremy: So you talked about the consequences being low, but you also talked about maybe the time to write the test being high. What are the types of tests that take a long time to write?
[00:21:58] Jason: Usually ones that involve a lot of setup. So pretty much every test requires some data to be in place, data either meaning database data or, like, some object structure or something like that. Sometimes it's really easy; sometimes the setup is extremely complicated. And that's usually where the cost comes in.
And then sometimes you encounter, like, a technical challenge. Like, oh, how do I download this file and then inspect the contents of this file? Like, sometimes you just encounter something that's technically tricky to achieve. But more frequently, when a test is hard to write, it's because the setup is hard.
[00:22:49] Jeremy: and you're talking about set up being, you need to insert a whole bunch of different rows into your database or different things that interact with one, another things like that.
[00:23:02] Jason: Exactly.
[00:23:03] Jeremy: When you're testing a system and you create a database that has all these items in it for you to work with, I'm assuming that what's in your test database is much smaller than what's in the real database. So how do you get something that's representative, so that if you only have 10 things in your tests, but in production there's thousands of them, you can catch that, hey, this isn't going to work well once it gets to production?
[00:23:35] Jason: Yeah, that's a really interesting question. And the answer is, I usually don't try to make the test database representative of the production database in terms of scale. Obviously the right data has to be there in order to exercise the test; that has to be true. But, for example, in production at this moment I know there's some tens of thousands of appointments in the database, but locally, at any given time, there are between zero and three or so appointments in any particular test. That's obviously nowhere near realistic, but it only becomes relevant in a great minority of cases. With regard to that stuff, the way I approach it, and I'm thinking through some of this for the first time right now, but obviously with performance in general, premature optimization is usually not a profitable endeavor. And so I'll write features without any thought toward performance, and then once things are out there and performing in production, observe the bottlenecks and then fix the bottlenecks, starting with what's the highest ROI.
And usually tests haven't come into the picture for me there. Cause, like, okay, the reason for tests, again, is so you don't have to go back and do that manual testing. But with these performance improvements, instead of tests, we have, like, application performance monitoring tools, and that's what tells me whether something is an issue. Or people just say, like, hey, this certain page is slow or whatever.
And so tests would be redundant to those other measures that we have that tell us if there's a performance problem.
[00:25:38] Jeremy: Yeah. So that sorta touches on what you described before, where let's say you were writing some kind of report, or adding a report, and when you were testing it locally it worked great, generated the report. Uh, then you pushed it out to production, somebody went to run it, and maybe because of an indexing problem or some other issue, it times out, or it doesn't complete, takes a long time. But I guess what you're saying is, in a lot of cases, the consequences of that are not all that high.
Like the person will try it. They'll see like, Hey, it doesn't work. Either you'll get a notification or they'll let you know, and then that's when you go in and go like, okay, now, now we can fix this.
[00:26:30] Jason: Yeah. And I think like the distinction is the performance aspect of it. Because like with a lot of stuff, you know, if you don't have any tests in your application at all, there's a high potential for like silent failure. And so with the performance stuff, we have other ways of ensuring that there won't be silent failure.
So that's how I think about that particular case.
[00:26:56] Jeremy: I guess another thing about tests is when you build an application, a lot of times you're not just interacting with your own database, you're interacting with third-party APIs. You may even be connecting to different pieces of hardware, things like that. So when you're writing a test, how do you choose to approach that?
[00:27:23] Jason: Yeah, good question. This is an area where I don't have a lot of personal experience, but I do have some. There's another principle in testing, part of the determinism principle, where you don't want to involve external HTTP requests and stuff like that in your tests. Because imagine if I run my test today and it passes, but then I run my test tomorrow and this third-party API is down, and my test fails. The behavior of my program didn't change; the only thing that's different is this external API is down right now. And so what I do for those is I'll capture the response that I get from the API. And I'll usually somehow, um, get my hands on a success response and a failure response and whatever kind of response I want to account for.
And then I'll insert those captured responses into my tests, so that on every subsequent run, I can be using these canned values rather than hitting the real API.
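One dependency-free way to sketch this canned-response pattern is to pass the HTTP transport into the client as a constructor argument. The class and endpoint here are hypothetical; in practice, Ruby projects often reach for gems like WebMock or VCR for exactly this job:

```ruby
require "json"

# A client that takes its HTTP transport as a dependency, so a test
# can substitute a canned response for the real network call.
class EligibilityClient
  def initialize(http)
    @http = http
  end

  def eligible?(patient_id)
    body = @http.get("/patients/#{patient_id}/eligibility")
    JSON.parse(body)["eligible"]
  end
end

# A canned success response, as if captured once from the real API.
fake_http = Object.new
def fake_http.get(_path)
  '{"eligible": true}'
end

client = EligibilityClient.new(fake_http)
puts client.eligible?(1)  # => true, deterministically, with no network
```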
[00:28:37] Jeremy: I think in, um, the description of your book, you mentioned a section on stubs and mocks, and I wonder, what you're describing here, which of those two things is it? And what's the difference?
[00:28:53] Jason: Yeah, it's such a tricky concept that I don't even trust myself to say it right. Every time I want to remind myself of the difference between mocks and stubs, I have to go back to my own blog post that I wrote on it and remind myself, okay, what is the difference between a mock and a stub? And I'll just say, I don't remember.
Because this isn't something that I find myself dealing with very frequently. It's something that people always want to know about at least in the rails world. But I'll speak for myself at least. I don't find myself having to use or wanting to use mocks and stubs very much.
I will say that both mocks and stubs are a form of test double. So a mock is a test double, and a stub is a test double. And a test double, it's like a play on "stunt double": instead of using a real object or whatever it is, you have this fake object. And sometimes that can be used to, like, trick your program into behaving a certain way, or it can be used to, um, gain visibility into an area that you otherwise wouldn't have visibility into.
And kind of my main use case for mocks and stubs, when I do use them, is that when you're testing a particular thing, you want to test the thing you're interested in testing. You don't want to have to involve all the dependencies of the thing you're testing. And so I will, like, stub out the dependencies.
So, okay, here's an example. I have a rare usage of stubs in my test suite. And dear listener, I'm going to use the word stub; don't give too much credence to that, maybe I mean mock, I don't remember. But anyway, I have this area where we determine a patient's eligibility to get a certain kind of medicine, and there's a ton that goes into it. There's these four different, like, coarse-grained determinations, and they all have to be a yes in order for it to overall be a yes, that they can get this medicine. It has to do mostly with insurance.
And then each one of those four coarse-grained determinations has some number of fine-grained determinations that determine whether it is a yes or a no. If I weren't using mocks and stubs in these tests, then in order to test one determination, I would have to set up the conditions.
This goes back to the setup work stuff we talked about. I'd have to set up all the conditions for the medicine to be a yes, in addition to the thing I'm actually interested in. And so that's a waste, because that stuff is all irrelevant to my current concern. Let me try to speak a little bit more concretely.
So let's say I have determinations A, B, C, and D. When I'm interested in determination A, I don't want to have to do all the setup work for determinations B, C, and D. And so what I'll do is I'll mock the determinations for B, C, and D. And I'll say for B, just have the function return true; for C, same thing, just return true; and for D, return true.
So it, like, short-circuits all that stuff and bypasses the actual logic that gives me the yes/no determination, and it just always gives me a yes. That way there's no setup work for B, C, and D, and I can focus only on A.
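The idea Jason describes can be sketched with a hand-rolled test double in plain Ruby. The class and method names below are hypothetical, not from his actual codebase, and in an RSpec suite you would more likely use `allow(...).to receive(...)` — but the principle is the same: stub the determinations you aren't testing so they need no setup.

```ruby
# Overall eligibility is a yes only if every coarse-grained
# determination answers yes.
class EligibilityChecker
  def initialize(determinations)
    @determinations = determinations
  end

  def eligible?
    @determinations.all?(&:eligible?)
  end
end

# A hand-rolled stub: always answers yes, with zero setup work.
class AlwaysYes
  def eligible?
    true
  end
end

# The one determination we actually want to exercise.
class InsuranceDetermination
  def initialize(insured:)
    @insured = insured
  end

  def eligible?
    @insured
  end
end

# Test determination A thoroughly; stub out B, C, and D.
a = InsuranceDetermination.new(insured: false)
checker = EligibilityChecker.new([a, AlwaysYes.new, AlwaysYes.new, AlwaysYes.new])
checker.eligible? # => false, driven entirely by A
```

Because B, C, and D always answer yes, the test's outcome depends only on the determination under test, with none of the setup work the real objects would require.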
[00:32:48] Jeremy: And I think it may be hard to say in this example, but would you, would you still have at least one test that would run through and do all the setup, do the checks for ABC and D and then when you're doing more specific things start to put in doubles for the others, or would you actually just never have a full test that actually did the complete setup?
[00:33:14] Jason: Well, here's how I'm doing this one. I described the scenario where I'm, like, thoroughly testing A under many different conditions, but stubbing out B, C, and D. Then I have another set of tests where I thoroughly test B and stub out A, C, and D, and so on. I have one thorough set for each of those. If you're asking whether I have one that, like, exercises all four of them, no.
I just have ones for each of the four individually, which is maybe kind of a trade-off, cause it's arguable that I don't have complete confidence because I'm never testing the four together. But there's the trade-off of, like, the setup work and all that that's necessary to get that complete confidence, versus the value of it. Because really it's just, like, a tiny bit of additional confidence that I would get from testing all those things together.
In that particular case, my judgment was that that was not worth it.
[00:34:19] Jeremy: Yeah. Cause I was thinking from the perspective of, sometimes I hear that people will have an acceptance test that covers... sometimes you hear people call it the happy path, right? Where everything lines up. It's like a very straightforward case of a feature. But then all the different ways that you can test that feature, they don't necessarily write tests for those; they just write one for the base case.
And then, like you said, you actually drill down into more specifics and maybe only test a smaller part there. But it sounds like in this case, maybe you made the decision that, hey, doing a test that's going to test all four of these things, even in the simplest case, is going to involve so much setup and so much work that maybe it's not worth it in this case.
[00:35:13] Jason: Yeah. And I'd have to go back and refresh my memory as to what exactly the scenario is for those tests, because in general I'm a proponent of having integration tests that make sure multiple things work together. You might've seen that GIF where it says, like, two unit tests, zero integration tests, and there's a cabinet with two doors.
Each door can open on its own. Or maybe it's drawers: each drawer can open on its own, but you can't open both drawers at the same time. And so I think it's not smart to have only unit tests and no integration tests. And so I don't remember exactly why I chose to do that eligibility test with the A, B, C, and D the way I did.
Maybe it was just cost-prohibitive to do it all together. Um, one thing that I want to comment on regarding mocks and stubs: there's a mistake that's made kind of frequently where people overdo it with mocks and stubs. They misunderstand the purpose. The purpose, again, is that you want to test the thing you're testing, not the dependencies of the thing.
But sometimes people stub out the very thing they're testing. And so they'll, like, assert that a certain method will return such and such value, but they'll stub the method they're testing so that the method is guaranteed to return the same value, and that doesn't actually test anything. So I just wanted to mention that as a common mistake to avoid.
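A contrived plain-Ruby illustration of that mistake (the `Discount` class here is hypothetical). Because the method under test has itself been stubbed, the assertion can never fail, no matter how broken the real logic is:

```ruby
class Discount
  def percentage
    # Imagine real pricing logic here.
    10
  end
end

discount = Discount.new

# The mistake: stub the very method we are about to assert on...
discount.define_singleton_method(:percentage) { 25 }

# ...then "verify" it. This passes even if the real implementation
# is wrong, so it tests nothing at all.
puts discount.percentage == 25 # => true, vacuously
```

A useful stub replaces a *dependency* of the code under test; the moment the stub replaces the code under test itself, the test is asserting against its own setup.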
[00:36:47] Jeremy: I wonder if you could maybe give an example of when you, you have a certain feature and the thought process you're going through where you decide like, yes, this is the part that I should have a stub or a mock for. And this is the part where I definitely need to make sure I run the code.
[00:37:07] Jason: Well, again, it's very rare that I will use a mock or stub, and it's not common that I'll even consider it, for better or worse. Like we're talking about, the nature of Rails tests is that we spin up actual database records and test our models with database data and stuff like that. In other ecosystems, maybe the testing culture is different and there's more mocks and stubs.
I know when I was doing some coding with Angular, there was a lot more mocking and stubbing. But with Rails it's kind of like everything's available all the time, and we use the database a lot during testing. And so mocks and stubs don't really come into the picture too much.
[00:37:56] Jeremy: Yeah. It's interesting that you mention that, because I work with some projects that use C# and ASP.NET, and a lot of times you'll see people say you should not be talking to the database in your tests. And they go through all this work to build probably the equivalent of a mock or a stub.
But then, when I think about that, I go, well, but I'm not really testing how the database is going to react. You know, are my queries actually valid, things like that. Because all this stuff is just not being run. In some other communities, maybe they have different ideas, I guess, about how to run tests.
[00:38:44] Jason: Yeah. And it's always interesting to hear expressions like, you should do this, or you shouldn't do that, or it's good to do this, it's bad to do that. And I think maybe that's not quite the right way to think about it. It's more like, well, if I do this, what are the costs and benefits of doing this? Cause it's like, nothing is exactly a good thing to do or a bad thing to do.
It's just, if you do this, this will happen as a consequence, and if you don't, this won't, and all that stuff. So people who don't want to talk to the database in their tests: why is that? What are the bad things they think will happen if you do that? The drawback, as it appears to me, is that it's slow to use the database. In any performance problem, usually the culprit is the database. That's always the first thing I look at. And if you're involving the database in all of your tests, your tests are going to be much slower than if you don't use the database. But the cost of not talking to the database is exactly what you said: you're not exercising your real application; you're missing an entire layer. And maybe that's fine.
I've never tried approaching testing in that way, and I would love to, like, get some experience working with some people who do it that way, cause I can't say that I know for an absolute fact that it doesn't work out. But to me, it just makes sense to exercise everything that you're actually using when the app runs.
[00:40:18] Jeremy: What's challenging, probably, for a lot of people is that if you look online for how to do testing in lots of different frameworks, you'll get different answers, right? And it's not clear what's gonna fit your situation. And, you know, to give an example, we've been talking about how Rails predominantly focuses on tests that talk to the database, and it wraps everything in a transaction, as we talked about before, so that you can reset the state and things like that.
I've also seen in other frameworks where they'll say, oh, you can run a database test, but you use this in-memory version of the database instead of actually talking to a real MySQL or Postgres instance. Or they'll say, oh, for this test we're going to use SQLite in place of the Postgres database you're actually using in production.
And it makes the setup, I suppose, easier. Um, and maybe it makes the tests run quicker, but then it's also no longer the same as what you're really running. So there's a lot of different approaches that people describe and take, and I think it can be hard for people to know, like, what makes sense for me.
[00:41:42] Jason: Yeah. And this is another area where I have to plead ignorance because again, I don't have experience doing it the other way. Logically, I feel like my way makes sense, but I don't have empirical experience doing it the other way.
[00:41:57] Jeremy: We've talked a little bit about how there's cases where you'll say, I'm not going to do this thing because it's going to take a lot of time and I've weighed the benefits. And I wonder if you could give some examples of things where you spent a lot of time on something, and then in hindsight you realized this really wasn't worth it.
[00:42:18] Jason: I don't think I have any examples of that, because I don't think it tends to happen very much. I really can't emphasize enough how rare the case where I choose not to write a test for something is. It's like a one-in-5,000 kind of thing. It's really not something I do frequently. The mistake is overwhelmingly in the opposite direction.
Like, maybe I will get lazy and I'll skip a test, and then I'll realize, oh yeah, this is why I write tests, because it actually makes everything easier. And uh, we get pain as a consequence when we skip tests. So that's usually the mistake I make: not writing a test when I should, rather than writing a test when I should not have.
[00:43:08] Jeremy: So then, in general, you said that not writing it is the mistake. How do you get people in the habit of writing the tests, where they feel like it's not this thing that's slowing them down or is in the way, but is rather something that's helping them with that feedback loop and is something that they actively want to do?
[00:43:33] Jason: Yeah. So to me, it's all about a mindset. There's a common perception that tests are something extra. Like, I've heard stories where somebody gives a quote for a project and then the prospective client asks, well, how much less would that be if we skip tests? And it's like, oh, it wouldn't be less.
It'd be like five times more, because tests are a time saver. So I want to try to dispel that notion. But even so, it can be hard to bring oneself to write tests, because it feels like something that takes discipline. But in my case, I don't feel like it takes discipline, because I remind myself of a true fact: the lazy and easy way to code is to code using tests.
And the harder, more laborious way to write code is to not use tests. Because think about what's the alternative to writing tests. Like we said earlier, the alternative is to manually test everything. And that's just so painful, especially when it's some feature where, like, I'm sure you have experience with this, Jeremy: you make a code change.
And then in order to verify that the thing still works, you have to go through, like, nine different steps in the browser, and only on that last step do you get the answer you're after. That's just so painful. And if you write a test, you can automate that. Some things that might present friction in that process are just, like, a lack of familiarity with how to write tests, and maybe um, a lack of an easy process for writing tests.
And just to briefly touch on that, I think something that can help reduce that is to write tests in the same way that I write code: in feedback loops. So we talked about writing one line, checking, writing another line, checking, that kind of thing. I write my tests in the same way. First I'll write the shell of the test, and then I'll run just the shell, even though it seems kind of dumb to just run the shell, cause you know it doesn't do anything. I do that just to demonstrate to myself that I didn't, like, make some typo or something like that. I'm starting from, like, a clean baseline. And then I'll write one line of my test. Maybe if I'm writing a system spec, I'll write a line that creates a user. Um, I know that nothing's going to happen when I run the test, but I'll run it just to see it run and make sure there's no errors.
And then I'll add a line that says log the user in, and then I'll run that, and so on, just one line at a time. There's this principle that I think is really useful when working, which is to separate the deciding what to do from the actually doing it. I think a lot of developers mix those two jobs of deciding what to do and doing it in the same step.
But if you separate those... so you, like, decide what you're going to have your test do, and then after that... So, like, maybe I'll open my test and I'll write in comments what I want to achieve, not in technical terms necessarily. I'll just write a comment that says create a user, write another comment that says log in, another comment that says click on such and such.
And then once I have those there, I'll go back to that first line and convert that to code. Okay, my comment that says create a user: I'll change that to the syntax that actually creates a user, and again, using the feedback loop, I'll run that. So that, you know, once I'm done writing all those comments that say what the test does, I'm now free to forget about it.
And I don't have to hold that in my mental RAM anymore, and I can clear my mental RAM. Now all my mental RAM is available to bring to bear on the task of converting my steps that I already decided on into working syntax. If you try to do both those things at the same time, it's more than twice as hard. And so that's why I try to separate them.
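That two-pass workflow might look something like this plain-Ruby sketch. The helper names are hypothetical stand-ins; in a real Rails system spec these steps would be things like `FactoryBot.create(:user)` and Capybara's `visit`:

```ruby
# Pass 1: decide what the test does, as comments only.
# Running the empty shell confirms a clean baseline:
#
#   # create a user
#   # log the user in
#
# Pass 2: convert one comment at a time to working syntax,
# running the test after each converted line. Hypothetical
# in-memory helpers stand in for real spec helpers:

def create_user
  { name: "Test User", logged_in: false }
end

def log_in(user)
  user.merge(logged_in: true)
end

# create a user
user = create_user
# log the user in
user = log_in(user)

user[:logged_in] # => true
```

The comments written in pass 1 survive as documentation of each step, so the "deciding" work never has to be redone while you focus on syntax.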
[00:48:04] Jeremy: So that's interesting. So it's like you're designing, I guess, the feature, what you want to build, in the context of the test first. Would that be accurate?
[00:48:19] Jason: That certainly can be the case. So much of this is context-dependent. I very regularly give myself permission to be undisciplined and to go on exploratory spikes. And so if I have a really vague idea about what shape a feature is going to take, I give myself permission to forget about tests, and I just write some code. Cause there's two reasons to write code.
You know, code is not only a work product; code is also a thinking medium. So I'll go into a different mode. I'll say, okay, I'm not trying to create a work product right now. I'm just using code as a thinking medium to figure out what I'm even going to do. So that's what I'll do in that case. And then maybe I'll write the test afterward. But if it's very clear what the thing is that I'm going to write, then I'll often write the test first, again, in those two phases of deciding what it's going to be and deciding how it works.
And I won't do a thing where, like, I write 10 test cases and then I go through one by one and write code to make them pass. Usually I'll write one test, make it pass, write a second test, make it pass, and so on.
[00:49:38] Jeremy: Okay. So the more exploratory aspect, I guess, would be when you're either doing something that you haven't done before, or it's not clear to you what the feature should be. Is that right?
[00:49:58] Jason: Yeah, like maybe it's a feature that involves a lot of details. There's, like, a lot of room for discretion. It could be implemented in more than one way. Like, how would I write a test for that if I don't even know what form it's going to take? There's decisions to be made, like, what is the route going to be that I visit for this feature?
What am I even going to call this entity and that entity, and stuff like that. And I think that goes back to my desire to not juggle and manage multiple jobs at the same time. I don't want to overly mix the design job with the testing job. Cause testing can help with design, but design in, like, a code structure sense.
I usually don't want to mix testing with UI design. And not even UI design: design in the highest sense, meaning, like, what even is this thing? How does it work, big-picture-wise, and stuff like that. That's not the kind of design that testing helps with, in my mind. The kind of design that testing helps with, again, is the code structure.
So I want to have my big-picture design out of the way before I start writing my tests.
[00:51:21] Jeremy: And in terms of the big-picture design, is that something that you keep all in your head, or are you writing that down somewhere? I'm just wondering what your process is.
[00:51:34] Jason: Yeah, it can work a number of different ways. In the past, I've done usability testing where I will do some uh, pen-and-paper prototypes and then do some usability testing with users. And then I will um, convert those pen-and-paper prototypes to something on the computer. The idea being, pen-and-paper prototypes are the cheapest to create and change.
And then the more you cement it, the more expensive it gets to change. So only once I'm fairly certain that the pen-and-paper prototypes are right will I put it into something that's more of a formal mock-up. And then once I have my formal mock-up, and that's been through whatever scrutiny I want to put it through, then I will do the even more expensive step of implementing that as a working feature.
Now, having said all that, I very rarely go through all that ceremony. Usually a feature is sufficiently small that all that stuff would be silly to do. So sometimes I'll start straight with the mock-up on the computer, and then I'll work off of that. Sometimes it's small enough that I'll just make a few notes in a note-taking program and then work off of that.
What is usually true is that our tickets in our ticketing system have a bulleted list of acceptance criteria. So we want to make it very black and white, very yes or no, whether a particular thing is done. And that's super helpful because, again, it goes back to the mixing of jobs and separating of jobs.
If we've decided in advance that this feature needs to do these four things, then if it does those four things, it's done and doesn't need to do anything more, and if it doesn't meet those four criteria, it's not done. Building the thing is just a matter of following the instructions. Very little thinking is involved.
[00:53:45] Jeremy: Depending on the scope of the feature, depending on how much information you have, uh, you could either do something elaborate, I suppose, where, you know, you were talking about doing prototypes or sketches and so on before you even look at code. Or there could be something that's not quite that complicated, where you have an idea of what it is and you might even play with code a little bit to get a sense of where it should go and how it should work.
But it's all sort of in service of getting to the point where you know enough about how you're going to do the implementation and you know enough about what the actual feature is to where you're comfortable starting to write steps in the test about like, these are the things that are going to happen.
[00:54:35] Jason: Yeah. And another key thing that might not be obvious is that all these things are small. So I never work... well, I shouldn't say never, but in general I don't work on a feature that's going to be, like, a week-long feature or something like that. We try to break them down into features that are at most, like, half a day.
And so that makes all that stuff a lot easier. Like, I used the number four as an example of how many acceptance criteria there might be, and that's a pretty representative example. We don't have tickets where there's 16 acceptance criteria, because the bigger something is, the more opportunity there is for the conceived design to turn out not to be viable.
And the more decisions there are that can't be made, because you don't know the later step until the earlier decision is made, and all that kind of stuff. So the small size of everything helps a lot.
[00:55:36] Jeremy: But I would imagine if you're breaking things into that small of a piece, then would there be parts that you build and you test and you deploy, but to the user, they actually don't see anything? Is that the approach?
[00:55:52] Jason: Definitely. We use feature flags. Like, for example, there's this feature we're working on right now where we have a page where you can see a long list of items. The items are of several different types. Right now you just see all of them all the time, but depending on who you are and what your role is in the organization, you're not going to be interested in all those things.
And so we want people to be able to have checkboxes for each of those types, to show or hide those things. This checkbox feature is actually really big and difficult to add. And so the first thing that I chose to do was to have us add just one single checkbox for one type. And even that one single checkbox is sufficiently hard that we're not even giving people that yet.
We coded it so that you get the checkboxes, and that one checkbox is selected by default. When you uncheck it, the thing goes away, but it's selected by default so that we can feature-flag that. So the checkbox UI is hidden; everything looks just the way it did before. And now we can wait until this feature is totally done before we actually surface it to users.
So it's the idea of making a distinction between deployment and release. Cause if we try to do this whole big thing, it's gonna take weeks. If we try to do the whole thing, that's just too much risk for something to go wrong. And then, like, we're going to deploy, like, three weeks of work at once.
That's like asking for trouble. So I'm a huge fan of feature flags.
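A minimal sketch of that deploy-versus-release separation in plain Ruby. The `Flags` store and the flag name are hypothetical; production Rails apps often use a gem such as Flipper with a database-backed store instead:

```ruby
# A toy in-memory feature-flag store.
class Flags
  @enabled = {}

  class << self
    def enable(name)
      @enabled[name] = true
    end

    def enabled?(name)
      @enabled.fetch(name, false)
    end
  end
end

# The checkbox UI ships in the deploy, but only renders once released.
def render_type_filters
  if Flags.enabled?(:type_filter_checkboxes)
    "[x] Insurance items"
  else
    "" # flag off: the page looks exactly as it did before
  end
end

render_type_filters                   # => ""
Flags.enable(:type_filter_checkboxes) # "release" without redeploying
render_type_filters                   # => "[x] Insurance items"
```

The code for the half-finished feature is deployed continuously in small pieces, while flipping the flag is a separate, instant, and reversible release decision.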
[00:57:35] Jeremy: Interesting. So it's almost like the foundation of the feature is going in. And if you were to show it to the user... well, I guess in this case it actually did have a function, right? You could filter by that one category.
[00:57:52] Jason: Oh, I was just going to say, you're exactly right. It wouldn't be a particularly impressive or useful feature, but what we have is complete. It's not finished, but it is complete.
[00:58:06] Jeremy: I'm not sure if you have any examples of this, but I imagine that there are changes that are large enough that I'm not sure how you would split them up until you get to, like you mentioned, half a day's worth of time. And I wonder if you either have examples of features like that, or a general sense of what you do if you can't figure out a way to split it up that small.
[00:58:34] Jason: I have yet to encounter a feature that we haven't been able to break up into pieces that are that small. So, unfortunately, I can't really say anything more than that, because I just don't have any examples of exceptions.
[00:58:49] Jeremy: For people listening, maybe that should be a goal at least: see if you can make everything smaller, see if you can ship as little as possible. You know, maybe you don't hit that half-a-day mark, but at least give it a try and see what you can do.
[00:59:10] Jason: Yeah. And the way I would characterize it maybe wouldn't be to ship as little as possible at a time, but to set a certain limit that you try not to go over. And it's a skill that I think can be improved with practice. You learn certain techniques that you can use over and over. Like, for example, one way that I split things up sometimes is we will add the database tables in one chunk, and we'll just deploy that. Cause that presents a certain amount of risk, you know: when you're adding database tables or columns or anything like that, it's always risky when you're messing with the structure of the database. So I like to do just that by itself. And it's kind of tidy most of the time, because it's not something that's naturally visible to the user; it's just a structural change.
So that's an example of the kind of thing that you learn as you gain practice breaking bigger things up into smaller pieces.
[01:00:16] Jeremy: So in that example, in terms of whatever issue-tracking system you use, what would you call that? Would you just call that setting up the schema for a future feature? I'm just kinda curious how you characterize that.
[01:00:35] Jason: Yeah, something like that. Those particular tickets don't have great names, because ideally each ticket has some amount of value that's visible to the user, and that one totally doesn't; it's a purely nuts-and-bolts kind of thing. So that's just a case where the name's not going to be great, but what's the alternative? I can't think of anything better, so we do it like that.
[01:01:02] Jeremy: You feel like that's lower risk, shipping something that's not user-facing first, than it is to wait until you have at least, like, one small thing that, you know, is connected to that change?
[01:01:19] Jason: Yeah. I had a boss in the past who had a certain conception of the reason to do deployments. And her belief was that the reason that you deploy is to deliver value to the user, which is of course true. But there's another really good reason to deploy, which is to mitigate risk. The further production and development are able to diverge from one another, the greater the risk when you do a deployment.
I remember one particular time at that job, I was made to deploy, like, three months of work at once, and it was a disaster. And I got the blame because I was the one who did the work, and quite frankly, I was really resentful that that had happened. And that's part of what informs my preference for deploying small amounts of work at a time.
I think it's best if things can be deployed serially. Like, rather than deploying in batches, just finish one thing, deploy it, verify it; finish the next thing, deploy it, verify it. I have a saying that it's better to be a hundred percent done with half your work than halfway done with a hundred percent of your work. For the hopefully obvious reason that, like, if you have 15 things that are each halfway in progress, now you have to juggle 15 balls in your head. Whereas if you have 15 things you have to do, and then you finish seven of them, then you can completely forget about those seven things that you finished and deployed and verified and all that.
And your mental bandwidth is freed up just to focus on the remaining work.
[01:03:10] Jeremy: Yeah, that makes sense. And also, if you are putting things out bit by bit and something goes wrong, then at least it's not all 15 things you have to figure out which it was. It's just the last thing you pushed out.
[01:03:26] Jason: Exactly. Yeah. It's never fun when you deploy a big delta and something goes wrong and it's a mystery what introduced the problem. It's obviously never good if you deploy something that turns out to be a problem, but if you deployed just one thing and something goes wrong, at least you can roll it back, or at the very least have a pretty decent idea of where the problem lies, so you can address it quickly.
[01:03:56] Jeremy: For sure. Well, I think that's probably a good place to leave off. But is there anything else about testing, or just software in general, that you thought we should've brought up?
[01:04:09] Jason: Well, maybe if I can leave the listener with one thing: um, I want to emphasize the importance of programming in feedback loops. It was a real eye-opener for me, when I was interviewing these candidates, to notice the distinct difference between programmers who didn't program in feedback loops and programmers who do. I have a post about it.
It's just called How to Program in Feedback Loops, I believe, if anybody's interested in the details. Cause I have, like, seven steps to that feedback loop. First you write a line of code, then you do this... I don't remember all seven steps off the top of my head, but it's all there in the blog post.
Anyway, if I could give just one piece of advice to anybody who's getting into programming, it's: program in feedback loops.
[01:05:00] Jeremy: Yeah, I think that's been the common thread, I suppose, throughout this conversation: whether it's writing the features, you want them to be as small as possible so you get that feedback of it being done and, like you said, take it off of your plate. Then there's being able to have the tests there as you write the features, so that you get that immediate feedback that this is not doing what the test says it should be doing.
So yeah, it makes a lot of sense that basically in everything we do, we try to get to a point where we get a thumbs up, we get a "this is complete." The faster we can do that, the better off we'll all be, right?
[01:05:46] Jason: Exactly. Exactly.
[01:05:50] Jeremy: If people want to check out your book, check out your podcast... I think you even have a conference coming up, right? Uh, where can they learn about all that?
[01:06:02] Jason: So the hub for everything is codewithjason.com. That's where I always send people. You can find my blog, my podcast, and my book there. And yeah, my conference, it's called Sin City Ruby. It's a Ruby conference. This will only be applicable, dear listener, if you're listening before March 24th, 2022. But yeah, it's happening in Las Vegas.
It's going to be just a small, intimate conference, and it's a whole different story, but I kind of put on this conference accidentally. I didn't intend to do a conference. I just kind of uh, stumbled into it. But I think it will be a lot of fun. So yeah, that's another thing that I have going on.
[01:06:49] Jeremy: What was it that, I guess, got you into deciding, this is what I want to do, I want to make a conference?
[01:06:58] Jason: Well, it started off as I was going to put on a class, but then nobody bought a ticket. And so I had to pivot. And so I'm like, okay, I didn't sell any tickets to this class; maybe I can sell some tickets to a conference. And luckily for me, it turns out I was right. Because I was financially obligated to a hotel where I had reserved space for the class,
I couldn't just cancel it. I had to move forward somehow. So that's where the conference came from.
[01:07:28] Jeremy: Interesting. Yeah, I'm always kind of curious how people decide what they want to attend, I guess. Like, you know, you said how you didn't get enough signups for your class, but you got signups for a conference. And, you know, for the people who are signing up and want to go, I wonder, to them, what is it about going to a conference that is so much more appealing than going to a class?
[01:07:54] Jason: Oh, well, I think in order to go to a class, the topic has to be of interest to you, and you have to be in, like, a specific time and place. The price point for that kind of thing is usually much higher than for a conference. Whereas with a conference, it's affordable to individuals; you don't have to get your boss's permission, necessarily, at least not for the money. You don't have to be a specific kind of person in a specific scenario in order to benefit from it. It's a much more general interest. So that's why I think I've had an easier time selling tickets to that.
[01:08:31] Jeremy: Mm, mm. Yeah, it's more of a, I wanna get into a room with a bunch of people and just learn a bunch of cool stuff, and not necessarily have a specific thing you're looking to get out of it, I guess.
[01:08:46] Jason: Yeah. There's no specific outcome or anything like that. Honestly, it's mostly just to have a good time. That's the main thing I'm hoping to get out of it. And I think that is the main draw for people they want to, they want to see their friends in the Ruby community form relationships and stuff like that.
[01:09:07] Jeremy: Very cool. Jason, good luck with the conference, and thank you so much for coming on Software Sessions.
[01:09:13] Jason: Thanks a lot. And uh, thanks for having me.
Swizec is the author of the Serverless Handbook and a software engineer at Tia.
[00:00:00] Jeremy: Today I'm talking to Swizec Teller. He's a senior software engineer at Tia, the author of the Serverless Handbook, and he's also got a bunch of other courses. And, I don't know, is it thousands of blog posts now? You have a lot of them.
[00:00:13] Swizec: It is actually thousands of, uh, it's like 1500. So I don't know if that's exactly thousands, but it's over a thousand.
I'm cheating a little bit. Cause I started in high school back when blogs were still considered social media and then I just kind of kept going on the same domain.
Jeremy: Do you have some kind of process where you're always thinking of what to write next? Or are you writing things down while you're working at your job? Things like that. I'm just curious how you come up with that.
[00:00:41] Swizec: So I'm one of those people who likes to use writing as a way to process things and to learn. So one of the best ways I found to learn something new is to kind of learn it and then figure out how to explain it to other people and through explaining it, you really, you really spot, oh shit. I don't actually understand that part at all, because if I understood it, I would be able to explain it.
And it's also really good as a reference for later. One of my favorite things to do is to spot a problem at work and be like, oh hey, this is similar to that side project I did once, or a weekend experiment I did, and I wrote about it, so we can kind of crib off of my method and use it. So we don't have to figure things out from scratch.
And part of it is, like you said, just always thinking about what I can write next. I like to keep a schedule, so I keep myself to posting two articles per week. It used to be every day, but I got too busy for that. When you have that schedule and you know, okay, on Tuesday morning I'm going to sit down and I have an hour or two to write whatever is on top of mind, you start spotting more and more of these opportunities. Like, a coworker asked me something and I explained it in a Slack thread, and we had maybe not an hour, but half an hour of back and forth, and you actually just wrote three or four hundred words to explain something. If you take those 400 words and just polish them up a little bit, or rephrase them in a way that's easier to understand for somebody who is not your coworker, hey, that's a blog post, and you can post it on your blog and it might help others.
[00:02:29] Jeremy: It sounds like taking the conversations most people have in their day to day, and writing them down in a more formal way.
[00:02:37] Swizec: Yeah, not even in a more formal way, but more in a way that a broader audience can appreciate. If it's super gnarly, detailed, deep in our infrastructure and our stack, I would have to explain so much of the stuff around it for anyone to even understand it that it's useless. But you often get these nuggets where, oh, this is actually a really good insight that I can share with others, and then others can learn from it. I can learn from it.
[00:03:09] Jeremy: What's the most accessible way, or the way that I can share this information with the most people who don't have all this context that I have from working in this place?
[00:03:21] Swizec: Exactly. And then the power move, if you're a bit of an asshole is to, instead of answering your coworkers question is to think about the answer, write a blog post and then share the link with them.
I think that's pushing it a little bit.
[00:03:38] Jeremy: Yeah, It's like you're being helpful, but it also feels a little bit passive aggressive.
[00:03:44] Swizec: Exactly. Although that's a really good way to write documentation. One thing I've noticed at work is if people keep asking me the same questions, I try to stop writing my replies in Slack and instead put it on Confluence or whatever internal wiki we have, and then share that link. And that has always been super appreciated by everyone.
[00:04:09] Jeremy: I think it's easy to have that reply in Slack and solve that problem right then. But when you're creating these wiki pages or these documents, how are people generally finding these? Because I know you can go through all this trouble to make this document, and then people just don't know to look, or where to go.
[00:04:30] Swizec: Yeah. Discoverability is a really big problem, especially what happens with a lot of internal documentation is that it's kind of this wasteland of good ideas that doesn't get updated and nobody maintains. So people stop even looking at it. And then if you've stopped looking at it before, stop updating it, people stop contributing and it kind of just falls apart.
And the other problem that often happens is that you start writing this documentation in a vacuum, so there's no audience for it, so it's not helpful. That's why I like the Slack-first approach, where you first answer the question in Slack. And now you know exactly what you're answering and exactly who the audience is.
And then you can even just copy-paste from Slack, put it in Confluence or a Jira board or wherever you put these things, spice it up a little, maybe fix some punctuation. And then the next time somebody asks you the same question, you can be like, oh hey, I remember where that is, go find the link and share it with them. And it kind of also trains people to start looking at the wiki.
I don't know, maybe it's just the way my brain works, but I'm really bad at remembering information, but I'm really good at remembering how to find it. My brain works like a huge reference network, and it's very easy for me to remember, oh, I wrote that down and it's over there. Even if I don't remember the answer, I almost always remember where I wrote it down, if I wrote it down. Whereas in Slack it just kind of gets lost.
[00:06:07] Jeremy: Do you also take more informal notes? Like, do you have notes locally that you look through, or something that's not a straight-up wiki?
[00:06:15] Swizec: I'm actually really bad at that. One of the things I do is that when I'm coding, I write things down. So I have almost like an engineering log book where almost everything I think about, problems I'm working on, I'm always writing down by hand, on a piece of paper. And then I never look at those notes again.
And it's almost like it helps me think it helps me organize my thoughts.
And I find that I'm really bad at actually referencing my notes and reading them later because, and this again is probably a quirk of my brain, but I've always been like this. Once I write it down, I rarely have to look at it again.
But if I don't write it down, I immediately forget what it is.
What I do really like doing is writing down SOPs. So if I notice that I keep doing something repeatedly, I write a, uh, standard operating procedure. For my personal life and for work as well, I have a huge, oh, it's not that huge, but I have a repository of standard procedures where, okay, I need to do X.
So you pull up the right recipe and you just follow the recipe. And if you spot a bug in the recipe, you fix the recipe. And then once you have that polished, it's really easy to turn it into an automated process that can do it for you, or even outsource it to somebody else who can work through it, so you don't have to keep doing the same stuff and figuring it out from scratch every time.
[00:07:55] Jeremy: And these standard operating procedures, they sound a little bit like runbooks I guess.
[00:08:01] Swizec: Yep. Run books or I think in DevOps, I think the big red book or the red binder where you take it out and you're like, we're having this emergency, this alert is firing. Here are the next steps of what we have to check.
[00:08:15] Jeremy: So for those kinds of things, those are more for incidents and things like that. But in your case, it sounds like it's more, uh, I need to get started with the next JS project, or I need to set up a Postgres database things like that.
[00:08:30] Swizec: Yeah. Or I need to reset a user to initial states for testing or create a new user. That's sort of thing.
[00:08:39] Jeremy: These probably aren't in that handwritten log book.
[00:08:44] Swizec: The wiki. That's also a really good way to share them with new engineers who are coming onto the team.
[00:08:50] Jeremy: Is it where you basically dump them all on one page, or do you organize them somehow so that people know where they need to go?
[00:09:00] Swizec: I like to keep a pretty flat structure, because I think the idea of categorization has outlived its prime. We have really good search algorithms now and really good fuzzy searching, so it's almost easier if everything is just dumped in and designed to be easy to search. There's a really interesting anecdote about, I think, professors at some school who realized they'd been trying to organize everything into files and folders.
And they were trying to explain this to their younger students, people in their early twenties, and the young students just couldn't understand. Why would you put anything in a folder? Like, what is a folder? Why? You just dump everything on your desktop, and then Command-F and you find it. Why would you even worry about what the file name is, or where the file is? Who cares? It's there somewhere.
[00:09:58] Jeremy: Yeah, I think I saw the same article. I think it was on The Verge, right?
I mean, I think that's right, because when you're using, say, a Mac, you don't go look for the application or the document you want to run. A lot of times you open up Spotlight and just type it, and it comes up.
Though I think what's also sort of interesting is, at least in the note-taking space, there's a lot of people who like setting up things like tags, and in a way that feels a lot like folders, I guess.
[00:10:35] Swizec: Yeah. The difference between tags and categories is that the same file can have multiple tags, but it cannot be in multiple folders. That's why categorization systems usually fall apart. You mentioned note-taking systems, and my opinion on those has always been that it's very easy to fall into the trap of feeling productive because you are working on your note or productivity system, but you're not actually achieving anything.
You're just creating work for work sake. I try to keep everything as simple as possible and kind of avoid the overhead.
[00:11:15] Jeremy: People can definitely spend hours upon hours curating what their note-taking system is going to be, the same way that you can try to set up your blog for two weeks and not write any articles.
[00:11:31] Swizec: Yeah. exactly.
[00:11:32] Jeremy: When I take notes, a lot of times I'll just create a new note in Apple Notes or in a markdown file and just write stuff. But it ends up being very similar to what you described with your log book, in that, because it's not really organized in any way, it can be tricky to go back and actually find useful information. Though I suppose the main difference is that when it's digital, sometimes if I search for a specific software application or a specific tool, then at least I can find those bits there.
[00:12:12] Swizec: Yeah, that's true. The other approach I like to use is called: the good shit stays. So if I can't remember it, it probably wasn't important enough. And especially these days with the internet, when it comes to details and facts, you can always find them. I find that it's pretty easy to find facts as long as you can remember some sort of reference to them.
[00:12:38] Jeremy: You can find specific errors or, like you say, specific facts. But I think if you haven't been working with a specific technology or in a specific domain for a certain amount of time, it can be hard to find the right thing to look for, or to even know if the solution you're looking at is the right one.
[00:13:07] Swizec: That is very true. Yeah. Yeah, I don't really have a solution for that one other than relearn it again. And it's usually faster the second time. But if you had notes, you would still have to reread the notes. Anyway, I guess that's a little faster, cause it's customized to you personally.
[00:13:26] Jeremy: Where it's helpful is that sometimes when you're looking online, you have to jump through a bunch of different sites to kind of get all the information together. And by that time you've lost your flow a little bit, or you've lost kind of what you were working on to begin with.
[00:13:45] Swizec: Yeah. That definitely happens.
[00:13:47] Jeremy: Next I'd like to talk about the Serverless Handbook. Something that you've talked about publicly a little bit is that when you try to learn something, you don't think it's a great idea to just go look at a bunch of blog posts. You think it's better to go to a book or some kind of larger or more authoritative resource. And I wonder what the process was for you. Like, when you decided, I'm going to go learn serverless, what was your process for doing that?
[00:14:23] Swizec: Yeah. When I started learning serverless, there weren't many good resources, or maybe I just wasn't good at finding them. That's one thing I've noticed with Google: when you're jumping into a new technical field, it's often hard to find stuff because you don't really know what you're searching for. And Google also likes to tune the algorithms to you personally a little bit.
So it can be hard to find what you want if you are, if you haven't been in that space. So I couldn't really find a lot of good resources, uh, which resulted in me doing a lot of exploration, essentially from scratch or piecing together different blogs and scraps of information here and there. I know that I spend ridiculous amounts of time in even as deep as GitHub issues on closed issues that came up in Google and answer something or figure, or people were figuring out how something works and then kind of piecing all of that together and doing a lot of kind of manual banging my head against the wall until the wall broke.
And I got through. I decided after all of that that I really liked serverless as a technology, and I really think it's the future of how backend systems are going to be built. I think it's unclear yet what kind of systems it's appropriate for and what kind of systems it isn't.
It does have pros and cons, but a lot of the very annoying parts of building a modern website or backend go away when you go serverless. So I figured I really liked this, and I'd learned a lot trying to piece it together over a couple of years.
And I felt like I was able to do that because I had previous experience with building full stack websites, building full stack apps, and understanding how backends work in general. So it wasn't like, oh, how do I do this from scratch? It was more, okay, I know how this is supposed to work in theory.
And I understand the principles. What are the new things I have to add to that to figure out serverless? So I wrote the Serverless Handbook basically as a reference or a resource that I wish I had when I started learning this stuff. It gives you a lot of the background of just how backends work in general: how databases connect, what the different databases are, how they work.
Then I talk some about distributed systems, because that comes up surprisingly quickly when you're going with serverless approaches, because everything is a lot more distributed. And it talks about infrastructure as code, because that simplifies a lot of the operations parts of the process, and then it talks about how you can piece it all together in the end to get a full product. And I approached it from the perspective of: I didn't want to write a tutorial that teaches you how to do something specific from start to finish, because I personally don't find those to be super useful. They're great for getting started, they're great for building stuff if you're building something that's exactly the same as the tutorial you found.
But they don't help you really understand how it works. It's kind of like if you just learn how to cook risotto, you know how to cook risotto, but nobody told you that, Hey, you actually, now that you know how to cook risotto, you also know how to just make rice and peas. It's pretty much the same process.
And if you don't have that understanding, it's very hard to transition between technologies, and it's hard to apply them to your specific situation. So I tried to avoid that and write more from the perspective of how I can give somebody who knows JavaScript, who's a frontend engineer or just a JavaScript developer, enough to really understand how serverless and backends work and be able to apply those approaches to any project.
[00:18:29] Jeremy: When people hear serverless, a lot of times they're not really sure what that actually means. I think a lot of times people think about Lambdas, they think about functions as a service. but I wonder to you what does serverless mean?
[00:18:45] Swizec: It's not that there's no server; there's almost always some server somewhere. There has to be a machine that actually runs your code. The idea of serverless is that the machine and the system that handle that stuff are invisible to you. You're offloading all of the DevOps work to somebody else so that you can fully focus on the business problems you're trying to solve.
You can focus on the stuff that is specific and unique to your situation, because, you know, there's a million different ways to set up a server that runs on a machine somewhere and answers API requests with JSON. And some people have done that thousands of times, while new folks have probably never done it.
And honestly, it's really boring, very brittle, and kind of annoying, frustrating work that I personally never liked. So with serverless, you can hand that off to a whole team of engineers at AWS or Google or whatever other providers there are, and they can deal with that stuff. And you can work on the level of: I have this JavaScript function.
I want this JavaScript function to run when somebody hits this URL, and that's it. That's essentially all you have to think about. So that's what serverless means to me. It's essentially cloud functions, I guess.
[00:20:12] Jeremy: I mean, there have been services like Heroku, for example, that have let people make Rails apps or Django apps and things like that, where the user doesn't really have to think about the operating system, or about creating databases and things like that. And I wonder, to you, if that is serverless, or if that's something different, and what the difference there might be.
[00:20:37] Swizec: I think of that as an intermediary step between on-prem, or handling your own servers, and full serverless, because you still have to think about provisioning. You still have to think of your server as a whole blob of things that runs together and lives somewhere.
You have to provision capacity. You still have to think about how many servers you have; on Heroku they're called dynos. You still have to deal with the routing, you have to deal with connecting it to the database. You always have to think about that a little bit, and you're still dealing with a lot of the framework-y stuff where, okay, I'm going to declare a route, and then once I've declared the route, I'm going to tell it how to take data from the request and pass it to the function that's actually doing the work. Whereas with full serverless, first of all, it can scale down to zero, which is really useful.
If you don't have a lot of traffic, you're not paying anything unless somebody is actually using your app. The other thing is that you don't deal with any of the routing or any of that. You're just saying: I want this URL to exist, and I want it to run that function. You don't deal with anything more than that.
And then you just write the actual function that's doing the work. So it ends up being a normal JavaScript function that accepts a request as an argument and returns a JSON response, or even just a JSON object, and the serverless machinery handles everything else, which I personally find a lot easier. And you don't have to have what I call JSON bureaucracy, where you're piping an object through a bunch of different functions to get from the request to the part that's actually doing the work. You're just doing the core interesting work.
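The shape Swizec is describing, a plain function in and JSON out, looks roughly like this. This is a minimal sketch using the standard AWS Lambda handler signature (event in, response object out); the greeting logic is made up for illustration.

```javascript
// A minimal AWS Lambda-style handler: a plain async function that takes
// the API Gateway event and returns a JSON response. No framework, no
// router, no middleware: the serverless machinery maps a URL to this.
const handler = async (event) => {
  const params = event.queryStringParameters || {};
  const name = params.name || 'world';
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message: `Hello, ${name}` }),
  };
};

module.exports = { handler };
```

The whole "backend" is the body of one function; routing, TLS, scaling, and process management all live outside it.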
[00:22:40] Jeremy: It sort of sounds like one of the big distinctions is that with something like Heroku, you may not have a server, but you have the dyno, which is basically a server. You have something that is consistently running.
Whereas with what you consider to be serverless, it's something that basically only launches when it's invoked, whether that's an API call or something else. The routing thing is a little bit interesting, because when I was going through the course, there are still routes that you write. It's just that you're telling Amazon's API Gateway how to route to your functions, which was very similar to routing to a controller action or something like that in other frameworks.
[00:23:37] Swizec: Yeah, I think that part is actually pretty similar, and it kind of depends on what kind of framework you end up building on. It can be very simple. I know with Rails it's relatively simple to define a new route; I think you have to touch three or four different files. I've also worked in large Express apps where hooking up the controller with all of the Swagger definitions, or OpenAPI definitions, and everything else ends up being six or seven different files that have to have functions that are named just right, and you have to copy-paste things around. I find that to be kind of a waste of effort.
With the Serverless Framework, what I like is you have this YAML file and you say: this route is handled by this function. And then the rest happens on its own. With Next.js or with Gatsby Cloud functions, they've gone even a step further, which I really like. You have the /api directory in your project and you just pop a file in there.
And whatever that file is named, that becomes your API route, and you don't even have to configure anything. In both of them, if you put a JavaScript file in /api called hello that exports a handler function, that is automatically a route, and everything else happens behind the scenes.
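With file-based routing, the file itself is the route configuration. A sketch in the style of a Next.js API route (the req/res handler signature follows Next.js conventions; the file path and response body are made up for illustration):

```javascript
// pages/api/hello.js: because this file sits in the /api directory and
// is named "hello", the framework serves it at /api/hello automatically.
// There is no router file or route declaration to write anywhere else.
function handler(req, res) {
  res.status(200).json({ message: 'hello' });
}

module.exports = handler; // in an actual Next.js project: `export default handler`
```

Renaming the file renames the route; deleting it removes the route. That is the entire "configuration".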
[00:25:05] Jeremy: So that's more a matter of the framework you're using and how easy it makes it to handle routing, whether that's a pain or not.
[00:25:15] Swizec: Yeah, and I think with the serverless frameworks, it's because serverless itself, as a concept, makes it easier to set this up. We've been able to get these modern frameworks with really good developer experience: Gatsby with Gatsby Cloud, Next.js with Vercel, and I think Netlify is working on it as well.
They can have this really tight coupling and integration between a web framework and the deployment environment, because serverless enables them to spin that up so easily.
[00:25:53] Jeremy: One of the things about your courses, and this isn't the only thing you focus on, but one of the use cases is basically replacing a traditional server-rendered application, a traditional Rails, Django, or Spring application, where you've got Amazon's API Gateway in front, which is serving as the load balancer.
And then you have your Lambda functions, which are basically what would be a controller action in a lot of frameworks. And then you're hooking it up to a database, which could be Amazon's, or any database, I suppose. And I wonder, in your experience having worked with serverless at your job or in side projects, whether that's something you would use as a default, or whether serverless is more for background jobs and things like that.
[00:26:51] Swizec: I think the underlying hidden question you're asking is about cold starts and API response times. One of the concerns people have with serverless is that if your app is not used a lot, your servers scale down to zero, so when somebody new comes on, it can take a really long time to respond.
And they're going to bail and be upset with you. One way I've solved that is using a more JAMstack-y approach. I feel like that buzzword is still kind of in flux, but the idea is that the actual front-end app, the client app, is running off of CDNs and doesn't even touch your servers.
So that first load of the entire client app is really fast, because it comes from a CDN that's running somewhere as close as possible to the user, and it's only the actual API calls that hit your server. For example, if you have something like a blog, most blogs are pretty static.
Most of the content is very static. I use that on my blog as well: you can pre-render everything that's static when you're deploying the project, and then it becomes just static files that are served from the CDN. So you get the initial article really fast. I haven't tested it in a while, but I think if you load one of my articles on swizec.com, if you look at the Lighthouse report where it gives you the series of screenshots, the first screenshot is already fully readable.
I think that means it's probably under 30 or 40 milliseconds to get the content and start reading. But then it rehydrates and becomes a React app, and when it's a React app, it can make further API calls to the backend. So usually on user interaction, like if you have upvotes or comments or something like that, only when the user clicks something do you then make an API call to your server. That then calls a Lambda or a Gatsby function or a Netlify cloud function, or even a Firebase function, which then wakes up and talks to the database and does things. And usually people are a lot more forgiving of that one taking 50 milliseconds to respond instead of 10 milliseconds. You know, 50 milliseconds is still pretty good.
And I think there were recently some experiments shared where they were comparing cold start times, and if you write your cloud functions in JavaScript, the average cold start time is something like a hundred milliseconds. A big part of that is because you're not wrapping an entire framework like Express or Rails into your function. It's just a small function. I think my biggest cloud functions have been maybe 10 kilobytes with all of the dependencies and everything bundled in, and that's pretty fast for a server to load, run, start Node.js, and start serving your request.
It's way fast enough. And then if you need even more speed, you can go to Rust or Go, which are even faster. As long as you avoid Java, .NET, C#, those kinds of things, it's usually fine.
[00:30:36] Jeremy: One of the reasons I was curious is because I was going through the REST example you've got, where it's basically going through Amazon's API Gateway to a Lambda function written in JavaScript, which then talks to DynamoDB and gives you a record back or creates a record. And I found that just making those calls, making a few calls, hopefully to account for the cold start, I was getting response times of maybe 150 to 250 milliseconds, which is not terrible, but it's also not what I would call fast either.
So I was just kind of curious, when you have a real app, are there things that you've come across where Lambda might have some issues, or at least there are tricks you need to do to work around them?
[00:31:27] Swizec: Yeah. So the big problem there is that as soon as a database is involved, that tends to get slow, especially if that database is not co-located with your Lambda. When I've experimented, it was a really bad idea to go from a Vercel API function and talk to DynamoDB in AWS; that goes over the open internet.
And it becomes really slow very quickly. At my previous job, I experimented with serverless and connecting it to RDS, which is the Postgres database service they have. If RDS is running in a separate private network from your functions, it immediately adds 200 or 300 milliseconds to your response times.
If you keep them together, it usually works a lot faster. And then there are ways of keeping them pre-warmed; usually that doesn't work as well as you would want. There are ways on AWS, I forget what it's called right now, but they now have some sort of automatic re-warming, if you really need response times that are smaller than a hundred or 200 milliseconds.
But yeah, it mostly depends on what you're doing. As soon as you're making API calls or database calls, you're essentially talking to a different server, and that is going to be slower on a Lambda than it is if you have a packaged server that's running the database and the server itself on the same machine.
[00:33:11] Jeremy: And are there any specific challenges related to, say, you mentioned RDS earlier. I know with some databases, like for example Postgres, when you have a traditional server application, the server will pool the connections. So it'll make some connections to your database and just keep reusing them.
Whereas with the Lambda, is it making a new connection every time?
[00:33:41] Swizec: Almost. So with Lambdas, I think you can configure how long they stay warm, but what AWS tries to do is reuse your Lambdas. So when the Lambda wakes up, it doesn't die immediately after that initial request. It stays alive for the next, let's say, one minute, or even 10 minutes.
And during that time, it can accept new requests and serve them. So anything that you put in the global namespace of your function will potentially remain alive between invocations, and you can use that to build a connection pool to your database, so that you can reuse the connections instead of having to open new connections every time.
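The warm-container trick works because the function's module is loaded once per container, so anything defined at module scope outlives individual requests. Here is a runnable sketch: the pool is a stand-in for a real client like pg.Pool, and the counter exists only to make the reuse visible.

```javascript
// Counts how many "connections" get opened. In a real Lambda this would
// be the cost of opening TCP connections to Postgres.
let connectionsOpened = 0;

function createPool() {
  connectionsOpened += 1; // stand-in for opening real DB connections
  return { query: async (sql) => ({ rows: [], sql }) };
}

// Module scope: this runs once per warm container, NOT once per request,
// so every invocation served by this container reuses the same pool.
const pool = createPool();

const handler = async () => {
  await pool.query('SELECT 1');
  return {
    statusCode: 200,
    body: JSON.stringify({ opened: connectionsOpened }),
  };
};

module.exports = { handler };
```

Two invocations on the same warm container share one pool, so the counter stays at 1. Ten truly simultaneous requests, however, spin up ten containers, each with its own pool, which is exactly the connection-explosion scenario Swizec describes next.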
What you have to be careful with is if you get actually simultaneous requests, not like 10 requests in 10 milliseconds, but 10 requests in the same millisecond, you're going to wake up multiple Lambdas, and you're going to have multiple connection pools running in parallel.
So it's very easy to crash your RDS server with something like AWS Lambda, because I think the default concurrency limit is a thousand Lambdas. And if each of those can have a pool of, let's say, 10 connections, that's 10,000 open connections to your RDS server. And you were probably not paying for a high enough tier for the RDS server to survive that. That's where it gets really tricky.
I think AWS now has a service that lets you kind of offload a connection pool, so that you can take your Lambda and connect it to the connection pool, and the connection pool keeps warm connections to your server. But an even better approach is to use something like Aurora, which is also on AWS, or DynamoDB, which are designed from the ground up to work with serverless applications.
[00:35:47] Jeremy: It's things that work, but you have to know sort of the little, uh, gotchas, I guess, that are out there.
[00:35:54] Swizec: Yeah, exactly. There's sharp edges to be found everywhere. part of that is also that. serverless, isn't that old yet I think AWS Lambda launched in 2014 or 2015, which is one forever in internet time, but it's still not that long ago. So we're still figuring out how to make things better.
And it's also, where you mentioned earlier whether it's more appropriate for backend processes or for user-facing processes, it does work really well for backend processes, because you have better control over the maximum number of Lambdas that run, and you have more patience for them being slow sometimes. And so on.
[00:36:41] Jeremy: It sounds like even for front end processes, as long as you know, like you said, the sharp edges, and you can do things like putting a CDN in front where your Lambdas don't even get hit until some later time,
there's a lot of things you can do to make it a good choice. I guess what I'm wondering is, when you're building an application, do you default to using a serverless type of stack?
[00:37:14] Swizec: Yes, for all of my side projects, I default to using serverless. Um, I have a bunch of apps running that way, even when serverless just means no servers at all. Like my blog doesn't have any cloud functions right now. It's all running from CDNs, basically. I don't know if you could even count this as a cloud function, but my email signup forms go to an API with my email provider.
So there are no servers there either, it goes directly from the front end. I would totally recommend it. If you are a startup that just got tens of millions of dollars in funding, and you are planning to have a million requests per second by tomorrow, then maybe not. That's going to be very expensive very quickly.
But there's always a trade off. I think that with serverless, it's a lot easier to build in terms of dev ops and in terms of handling your infrastructure, but it takes a bit of a mind shift in how you're building when it comes to the actual logic and the actual server system that you're building.
And then in terms of costs, it really depends on what you're doing. If you're a super huge company, it probably doesn't make sense to go serverless. But if you're that big, or if you have that much traffic, you hopefully are also making enough money to essentially build your own serverless system for yourself.
[00:38:48] Jeremy: For someone who's interested in trying serverless, like I know for myself when I was going through the tutorial, you're using the serverless framework and it creates all these different things in AWS for you, and at a high level I could follow. Okay, you know, it has the API Gateway, and you've got your Simple Queue Service and DynamoDB, and the Lambdas, all that sort of thing.
So at a high level, I could follow along. But when I log into the AWS console, not knowing a whole lot about AWS, it's creating a ton of stuff for you.
And I'm wondering, from your perspective, for somebody who's learning about serverless, how much do they need to really dive into the AWS internals and understand what's going on there?
[00:39:41] Swizec: That's a tough one, because personally I try to stay away as much as possible. And especially with the serverless framework, what I like is configuring everything through the framework rather than doing it manually. Um, because there's a lot of sharp edges there as well, where if you go in and you manually change something, then the serverless framework can't clean up anymore and you can have ghost processes running.
At Tia, we've had that as a really interesting challenge. We're not using the serverless framework, we're using something called CloudFormation, which is essentially
one lower level of abstraction than the serverless framework. We're doing a lot more work, we're creating a lot more work for ourselves, but that's what we have and that's what we're working with. These decisions predate me, so I'm just going along with what we have. And we wanted to have more control, because again, we have dev ops people on the team and they want more control, because they also know what they're doing. And we keep having trouble with, oh, we were trying to use infrastructure as code, but then there's this little part where you do have to go into the AWS console and click around a million times to find the right thing and click it.
And we've had interesting issues with hanging deploys, where something gets stuck on the AWS side and we can't take it back. We can't tear it down, we can't stop it. It's just a hanging process, and you have to wait like seven hours for AWS to go, oh, okay, if it's been there for seven hours, it's probably not needed, and then it kills it, and then you can deploy.
So that kind of stuff gets really frustrating very quickly.
[00:41:27] Jeremy: Sounds like maybe in your personal projects, you've been able to, to stick to the serverless framework abstraction and not necessarily have to understand or dive into the details of AWS and it's worked out okay for you.
[00:41:43] Swizec: Yeah, exactly. It's useful to know from a high level what's there and what the different parts are doing, but I would not recommend configuring them through the AWS console, because then you're going to always be in the AWS console. And it's very easy to get something slightly wrong.
[00:42:04] Jeremy: Yeah. I mean, I know for myself, just going through the handbook, just going into the console and finding out where I could look at my logs, or, um, what was actually running in AWS, it wasn't that straightforward. So even knowing the bare minimum, for somebody who's new to it, was a little daunting.
[00:42:26] Swizec: Yeah, it's super daunting. And they have hundreds, if not thousands, of different products on AWS. And when it comes to, like you mentioned, logs, I don't think I put this in the handbook, because I either didn't know about it yet or it wasn't available quite yet, but the serverless framework also lets you look at logs through the framework.
So you can say sls logs with the function name, and it shows you the latest logs. It also lets you run functions locally, to an extent. It's really useful from that perspective. And I personally find the AWS console super daunting as well, so I try to stay away as much as possible.
[00:43:13] Jeremy: It's pretty wild when you first log in and you click the button that shows you the services and it's covering your whole screen. Right. And you're like, I just want to see what I just pushed.
[00:43:24] Swizec: Yeah, exactly. And there's so many different ones and they're all they have these obscure names that I don't find meaningful at all.
[00:43:34] Jeremy: I think another thing that I found a little bit challenging was that when I develop applications, I'm used to having the feedback cycle of writing the code, running the application or running a test, and seeing, like, did it work? And if it didn't, what's the stack trace, what happened? And I found the process of going into CloudWatch and looking at the logs and waiting for them to eventually refresh and all that to be a little challenging. So I was wondering, in your experience, how are you able to get a fast feedback loop, or is this just kind of part of it?
[00:44:21] Swizec: I am very lazy when it comes to writing tests, or when it comes to fast feedback loops. I like having them, I'm just really bad at actually setting them up. But what I found works pretty well for serverless is, first of all, if you write your cloud functions in TypeScript, that immediately resolves the most common sources of bugs. It makes sure that you're not using something that doesn't exist.
It makes sure you're not making typos, makes sure you're not holding a function wrong, which I personally find very helpful, because I type pretty fast and I make typos. And it's so nice to be able to say, if it compiles, I know that it's at least going to run. I'm not going to have some stupid issue of a missing semicolon or some weird fiddly detail.
So that's already a super fast feedback cycle that runs right in your IDE. The next step is, because you're just writing the business logic function, and you know that the function itself is going to run, you can write unit tests that treat that function as a normal function. I'm personally really bad at writing those unit tests, but they can really speed up the actual process of testing, because you can go and you can be like, okay,
so I know that the code is doing what I want it to be doing if it's running in isolation. And that can be pretty fast. The next step, which is another level of abstraction and gives you more feedback, is that with serverless you can locally invoke most Lambdas. The problem with locally running your Lambdas is that it's not the same environment as on AWS.
And I asked one of the original developers of the serverless framework, and he said, just forget about accurately replicating AWS on your system. There are so many dragons there, it's never going to work. And I had an interesting example of that when I was building a little project for my girlfriend that sends her photos from our relationship to an IoT device every day, or something like that.
It worked when I ran sls invoke: it ran, it even called all of the APIs, and everything worked. It was amazing. And then when I deployed it, it didn't work, and it turned out that it was a permissions issue. I forgot to give myself a specific IAM role for something to work. So that's kind of a stair-stepping process of having fast feedback cycles. First, if it compiles, that means you're not doing anything absolutely wrong.
If the tests are running, that means it's at least doing what you think it's doing. If it's invoking locally, it means that you're holding the APIs and the third-party stuff correctly. And then the last step is deploying it to AWS and actually running it with a curl or some sort of request, and seeing if it works in production.
And that then tells you if it's actually going to work with AWS. And the nice thing there is, because the serverless framework does sort of incremental deploys, that cycle is pretty fast. You're not waiting half an hour for your CodePipeline or for your CI to run an integration test to do stuff.
One minute, it takes one minute and it's up and you can call it and you immediately see if it's working.
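The stair-stepped feedback cycle described here, types first, then unit tests against the plain business logic function, can be sketched like this. The signup example, its field names, and the validation rule are all made up for illustration:

```typescript
// The business logic is a plain, typed function that knows nothing about
// Lambda, so it can be unit tested directly without deploying anything.
type SignupEvent = { email: string };
type SignupResult = { ok: boolean; message: string };

function isValidEmail(email: string): boolean {
  // Deliberately simple check for the sketch, not production-grade validation.
  return /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(email);
}

function processSignup(event: SignupEvent): SignupResult {
  if (!isValidEmail(event.email)) {
    return { ok: false, message: "invalid email" };
  }
  return { ok: true, message: `signed up ${event.email}` };
}

// The cloud function is just a thin wrapper that parses input and maps the
// result to an HTTP-style response; everything above stays easily testable.
function handler(event: { body: string }): { statusCode: number; body: string } {
  const parsed = JSON.parse(event.body) as SignupEvent;
  const result = processSignup(parsed);
  return { statusCode: result.ok ? 200 : 400, body: JSON.stringify(result) };
}
```

TypeScript catches the typo-class bugs at compile time, the unit tests cover `processSignup` in isolation, and only the thin `handler` wrapper still needs a local invoke or a real deploy to verify.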
[00:47:58] Jeremy: Basically you're trying to do everything you can: static typing and running tests just on the functions. But I guess when it comes down to it, you really do have to push everything, update AWS, have it all run, in order to really know. And so I guess it's sort of a trade-off, right? Versus being able to, if you're writing a Rails application and you've got all your dependencies on your machine, you can spin it up and you don't really have to wait for it to push anywhere.
[00:48:36] Swizec: Yeah. But you still don't know if, what if your database is misconfigured in production?
[00:48:42] Jeremy: Right, right. So it's never the same as
production, it's just closer, right? Yeah. I totally get it. When you don't have the real services or the real databases, then there's always going to be stuff that you can miss. Yeah.
[00:49:00] Swizec: Yeah. it's not working until it's working in production.
[00:49:03] Jeremy: That's a good place to end it on, but is there anything else you want to mention before we go?
[00:49:10] Swizec: No, I think that's good. Uh, I think we talked about a lot of really interesting stuff.
[00:49:16] Jeremy: Cool. Well, Swiz, thank you so much for chatting with me today.
[00:49:19] Swizec: Yeah. Thank you for having me.
Alexander Pugh is a software engineer at Albertsons. He has worked in Robotic Process Automation and the cognitive services industry for over five years.
This episode originally aired on Software Engineering Radio.
Related Links
Enterprise RPA Solutions
Enterprise "Low Code/No Code" API Solutions
RPA and the OS
Transcript
You can help edit this transcript on GitHub.
[00:00:00] Jeremy: Today, I'm talking to Alexander Pugh. He's a solutions architect with over five years of experience working on robotic process automation and cognitive services.
Today, we're going to focus on robotic process automation.
Alexander welcome to software engineering radio.
[00:00:17] Alex: Thank you, Jeremy. It's really good to be here.
[00:00:18] Jeremy: So what does robotic process automation actually mean?
[00:00:23] Alex: Right. It's a very broad, nebulous term. When we talk about robotic process automation as a concept, we're talking about automating things that humans do, in the way that they do them. So that's the robotic: an automation that is, um, done in the way a human does a thing.
Um, and then process is that thing that we're automating. And then automation is just saying we're turning this into an automation, we're orchestrating this and automating this. And the best way to think about that is to think of a factory or a car assembly line. So initially, when we went in and automated a car factory assembly line, what they did is essentially replicate the process as a human did it. So one day you had a human that would pick up a door, put it on the car, and bolt it on with their arms. And so the initial automations that we had on those factory lines were a robot arm that would pick up that door from the same place, put it on the car, and bolt it on there.
Um, so the same can be said for robotic process automation. We're essentially looking at these, processes that humans do, and we're replicating them, with an automation that does it in the same way. Um, and where we're doing that is the operating system. So robotic process automation is essentially going in and automating the operating system to perform tasks the same way a human would do them in an operating system.
So that's, that's RPA in a nutshell,
Jeremy: So when you say you're replicating something that a human would do, does it mean it has to go through some kind of GUI or some kind of user interface?
[00:02:23] Alex: That's exactly right, actually. when we're talking about RPA and we look at a process that we want to automate with RPA, we say, okay. let's watch the human do it. Let's record that. Let's document the human process. And then let's use the RPA tool to replicate that exactly in that way.
So go double click on Chrome, launch that, click in the URL line and send keys in www.cnn.com or what have you, or ServiceNow, hit enter, wait for it to load, and then click, you know, where you want to fill out your ticket for ServiceNow, and send keys in. So that's exactly how an RPA solution, at the most basic, can be achieved.
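The recorded sequence above can be thought of as an ordered list of UI actions that the RPA tool replays verbatim. This sketch uses invented action names and targets, not any real RPA vendor's API:

```typescript
// Each step mirrors one thing the human was observed doing, in order.
type Step =
  | { action: "launch"; app: string }
  | { action: "sendKeys"; target: string; text: string }
  | { action: "waitFor"; target: string }
  | { action: "click"; target: string };

// The documented human process, captured step by step.
const fileTicket: Step[] = [
  { action: "launch", app: "chrome" },
  { action: "sendKeys", target: "urlBar", text: "servicenow.example.com" },
  { action: "waitFor", target: "pageLoaded" },
  { action: "click", target: "newTicketButton" },
  { action: "sendKeys", target: "descriptionField", text: "Printer is down" },
];

// A replay engine would walk this list and drive the OS; here we just
// render the script so the shape of the automation is visible.
function describe(steps: Step[]): string[] {
  return steps.map((s) =>
    s.action === "launch" ? `launch ${s.app}` :
    s.action === "sendKeys" ? `type "${s.text}" into ${s.target}` :
    s.action === "waitFor" ? `wait for ${s.target}` :
    `click ${s.target}`);
}
```

The point of the representation is that it replicates exactly what the human does, click for click, which is what distinguishes RPA from calling a backend directly.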
Now, any software engineer knows, if you sit there and look over someone's shoulder and watch them use an operating system, you'll say, well, there's a lot of ways we can do this more efficiently without going over here and clicking that. You know, we can use a lot of services that the operating system provides in a programmatic way to achieve the same ends, and RPA solutions can also do that.
The real key is making sure that it is still achieving something that the human does and that if the RPA solution goes away, a human can still achieve it. So if you're, trying to replace or replicate a process with RPA, you don't want to change that process so much so that a human can no longer achieve it as well.
That's something where, if you get a very technical and very fluent software engineer, they lose sight of that, because they say, oh, you know what, there's no reason why we need to go open a browser and go to, you know, the ServiceNow portal and type this in, when I can just directly send information to their backend,
which a human could not replicate. Right? So that's kind of where the line gets fuzzy. How efficiently can we make this RPA solution?
[00:04:32] Jeremy: I, I think a question that a lot of people are probably having is a lot of applications have APIs now. but what you're saying is that for it to, to be, I suppose, true RPA, it needs to be something that a user can do on their own and not something that the user can do by opening up dev tools or making a post to an end point.
[00:04:57] Alex: Yeah. And so this is probably really important right now to talk about: why RPA, right? Why would you do this when you could put on a server a really good API ingestion point, or a trigger, or a webhook that can do this stuff? So why would we ever pursue RPA?
There's a lot of good reasons for it. RPA is very, very enticing to the business. RPA solutions and tools are marketed as a low code, no code solution for the business to utilize, to solve their processes that may not be solved by an enterprise solution: the in-between processes, in a way.
You have, uh, a big enterprise finance solution that everyone uses for the finance needs of your business, but there are some things it doesn't provide for that you have a person doing a lot of. And the business says, okay, well, this thing this human is doing is really beneath their capability. We need to get a software solution for it, but our enterprise solution just can't account for it. So let's get an RPA capability in here. We can build it ourselves, and there we go. So there are many reasons to do that. Financially, IT might not have, um, the capability or the funding to actually build and solve the solution. Or it's at a scale that is too small to open up an IT project to solve for. Um, so, you know, a team of five is just doing this, and they're doing it for, you know, 20 hours a week, which is large, but in a big enterprise, that's not really, maybe, worth building an enterprise solution for. Or, and this is a big one, there are regulatory constraints and security constraints around being able to access this, or communicate some data or information, in a way that is non-human or programmatic. So that's really where RPA is correctly and best applied, and where you'll see it most often.
So what we're talking about there is in finance, in healthcare or in big companies where they're dealing with a lot of user data or customer data in a way. So when we talk about finance and healthcare, there are a lot of regulatory constraints and security reasons why you would not enable a programmatic solution to operate on your systems.
You know, it's just too hard. We're not going to expose our databases or our data to any other thing. It would take a huge enterprise project to build out that capability, secure that capability, and ensure it's done correctly. We just don't have the money, the time, or the strength, honestly, to afford it.
So they say, well, we already have a user pattern. We already allow users to talk to this information and communicate this information. Let's get an RPA tool, which for all intents and purposes will be acting as a user, and then it can just automate that process without us exposing it to queries or any other enterprise or programmatic, um, solution.
So that's really why, where, and when you would apply RPA: there's just no capability at the enterprise, for one reason or another, to solve for it.
[00:08:47] Jeremy: As software engineers, when we see this kind of problem, our first thought is, okay, let's build this custom application or workflow that's going to talk to all these APIs. And what it sounds like is, in a lot of cases, there just isn't the time, there just isn't the money, to put in the effort to do that.
And it also sounds like this is a way of being able to automate that, and maybe introducing less risk, because you're going through the same security, the same workflow, that people are using currently. So, you know, you're not going to get into things that they're not supposed to be able to get into, because all of that's already put in place.
[00:09:36] Alex: Correct. And it's an already accepted pattern. It's kind of odd to apply that kind of very IT, software engineer term to a human user, but a human user is a pattern in software engineering. We have patterns that do this and that, you know, databases and whatnot, and the user journey, the user permissions, security, and all that is a pattern.
And that is accepted by default when you're building these enterprise applications: okay,
what's the user pattern? And since that's already established and well-known, and all the walls are hopefully built around it to enable it to correctly do what it needs to do, it's saying, okay, we've already established that, let's just use that, instead of building a programmatic solution where we have to go and ask: do we already have an appropriate pattern to apply to it? Can we build it in a safe way? And then, can we support it? You know, all of a sudden we have the support teams that watch our Splunk dashboards and make sure nothing's going down with our big enterprise application.
And then you're going to build another capability. Okay, where's that support going to come from? And now we've got to talk about change access boards, user acceptance testing, and, uh, you know, UAT, dev, and production environments, and all that. So it becomes untenable, depending on your organization, to do that for things that don't justify the scale that needs to be thrown at them.
But when we talk about APIs: APIs exist for a lot of things, but they don't exist for everything. And a lot of times that's legacy databases, that's mainframe capability. And this is really where RPA shines and is correctly applied, especially in big businesses or highly regulated businesses, where they can't upgrade to the newest thing, or they can't throw something to the cloud.
They have, you know, their mainframe systems or their database systems that have to exist for one reason or another, until there is the motivation and the money and the time to correctly migrate and solve for them. So until that day, and again, there's no API to do anything on a mainframe in this bank or whatnot, it's like, well, okay, let's just throw RPA on it.
Let's, you know, let's have RPA do this thing, uh, in the way that a human does it, but it can do it 24/7. An example use case: you work at a bank, and there's no way that InfoSec is going to let you query against this database with your users that have this account, or your customers that have this. No way, in any organization at a bank,
is InfoSec going to say, oh yeah, sure, let me give you an OData query driver, and you can just set up your own SQL queries and do whatever. They're gonna say no way. In fact, how did you find out about this database in the first place, and who are you?
So how do we solve it? We go and say, okay, how does the user get in here? Well, they open up a mainframe emulator on their desktop, which shows them the mainframe. And then they go in, they click here, they put this number in there, then they look up this customer, then they switch this value to that value, and they say save.
And it's like, okay, cool. That, RPA can do, and we can do it quite easily. And we don't need to talk about APIs, and we don't need to talk about special access or doing queries that make, you know, InfoSec very scared. A great use case for that is, say a bank acquires a regional bank, and they say, cool, you're now part of our bank, but in your systems that are now going to be a part of our systems, you have users that have this value, whereas in our bank, that value is this value here. So now we have to go and change this one field for 30,000 customers to make it line up with our systems. Traditionally you would get an extract-transform-load tool, an ETL tool, to kind of do that. But for 30,000 customers, that might be below the threshold. And this is banking, so it's very regulated, and you have to be very, very intentional about how you manipulate and move data around.
So what do we have to do? Okay, we have to hire 10 contractors for six months, and literally what they're going to do, eight hours a day, is go into the mainframe through the emulator, and customer by customer, they're going to go change this value and hit save. And they're looking at an Excel spreadsheet that tells them what customer to go into.
And that's going to cost X amount of money for six months. Or, what we could do is just build an RPA solution, a bot essentially, that goes, and for each line of that Excel spreadsheet, repeats this one process: open up the mainframe emulator, navigate into the customer profile, change the value, then shut down and repeat.
And it can do that in one week, and it can be built in two. That's the dream use case for RPA, and that's really kind of, uh, where it would shine.
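The spreadsheet-driven bot described here is, at its core, a loop over rows that replays one fixed UI sequence per customer. In this sketch the `Emulator` interface, its methods, and the field name are hypothetical stand-ins for whatever surface the RPA tool actually exposes:

```typescript
// Hypothetical surface for driving a mainframe emulator through an RPA tool.
interface Emulator {
  openCustomer(id: string): void;
  setField(name: string, value: string): void;
  save(): void;
  close(): void;
}

// One spreadsheet row: which customer to open, and the value to set.
type Row = { customerId: string; newValue: string };

// One binary process per row: open the profile, change one field, save, close.
// No branching, no interpretation; just the recorded human sequence, repeated.
function runBot(rows: Row[], emulator: Emulator): number {
  let updated = 0;
  for (const row of rows) {
    emulator.openCustomer(row.customerId);
    emulator.setField("accountType", row.newValue);
    emulator.save();
    emulator.close();
    updated += 1;
  }
  return updated;
}
```

Because the per-row sequence is strictly binary, the same loop handles 30 customers or 30,000, running around the clock instead of through ten contractors.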
[00:15:20] Jeremy: It sounds like the best use case for it is an old system, a mainframe system, in COBOL maybe, uh, that doesn't have an API. And so, uh, it makes sense to, rather than go, okay, how can we get directly into the database,
[00:15:38] Alex: How can we build on top of it? Yeah,
[00:15:40] Jeremy: how can we build on top of it? Let's just go through the user interface that exists, and just automate that process. And, you know, the example you gave sounds very well-defined: you're gonna log in, and you're going to put in maybe this ID, here's the fields you want to get back,
and you're going to save those. And you didn't have to make any real decisions, I suppose, in terms of, do I need to click this thing or this thing. It's always going to be the same path to get there.
[00:16:12] Alex: Exactly. And that's really it: you need to be disciplined about your use cases and what those look like. And you can broadly say, a use case that I am going to accept has these features. One of the best ways to do that is to say it has to be a binary decision process, which means there is no dynamic or interpreted decision that needs to be made, or information that needs to be interpreted.
Exactly like that use case: it's very binary, either it is or it isn't. You go in, you journey into there, you change that one thing, and that's it. There's no, oh, well, this information says this, which means I then have to go do this. Once you start getting into those if-else processes, you're going down a rabbit hole, and it could get very shaky, and that introduces extreme instability into what you're trying to do.
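One way to enforce the binary-only discipline is to make the bot route anything it cannot decide deterministically to a human queue, rather than growing if-else branches inside the automation. A rough sketch, with invented statuses:

```typescript
// The bot only accepts the well-defined cases it was built for; anything
// ambiguous escalates to a person instead of guessing.
type WorkItem = { id: string; status: string };

function route(item: WorkItem): "bot" | "human" {
  if (item.status === "open" || item.status === "closed") {
    return "bot"; // binary: exactly one fixed action per known status
  }
  return "human"; // interpretation needed, outside the bot's contract
}
```

Keeping the escalation path explicit is what stops the one-in-ten exception the business mentions later from quietly knocking the bot over.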
And it also really expands your development time, because you have to capture these processes and you have to say, okay, tell me exactly what we need to build this bot to do. And for binary decision processes, that's easy: go in here, do this. But nine times out of 10, as you're trying to address this and solution for it, you'll find those uncertainties.
You'll find these things where the business says, oh, well, yeah, that happens one time out of 10, and this is what we need to do. And it's like, well, that's going to break the bot. You know, nine times out of 10 this bot is going to fall over. This is now where we start getting into the machine learning and AI realm.
And why RPA is sometimes classified as a subset of the AI or machine learning field, or as a pattern within that field, is because now that you have this bot, or this software that enables you to do a human process, let's enable that bot to do decision-making processes, where it can interpret something and then do something else.
Because while we could just build a big decision tree to kind of address every capability, you're never going to be able to do that, and also it's just a really heavy, bad way to build things. So instead, let's throw in some machine learning capability where it can just understand what to do. And that's, you know, the next level of RPA application: okay, we've gone throughout our organization, we found every kind of binary thing that can be replaced with an RPA bot. Okay.
Now, what are the ones that we said we couldn't do, because they had some of that decision-making that required too much dynamic, uh, intelligence behind it? And let's see if we can address those, now that we have this. And so that's the 2.0 in RPA: addressing those non-binary paths.
I would argue that especially in organizations that are big enough to justify bringing in an RPA solution to solve for their processes. They have enough binary processes, binary decision processes to keep them busy.
Some people kind of get caught up in trying to, right out of the gate, say, we need to throw in some machine learning, we need to make these bots really capable, instead of just saying, well, we've got plenty of work just changing the binary processes or addressing those. Let's just be disciplined and take that approach.
Uh, I will say, about RPA and bots: the best solution, or the only solution, when you talk about building a bot, is the one that you eventually turn off. So you can say, I built a bot that will go into our mainframe system and update this value, and that's successful.
I would argue that's not successful. When that bot is successful is when you can turn it off, because there's an enterprise solution that addresses it, and you don't have to have this RPA bot that lives over here and does it. Instead, your enterprise capability now affords for it. And so that's really, I think, a successful bot or a successful RPA solution: you've been able to take away the pain point, or that human process, until it can be correctly addressed by the systems that everyone uses.
[00:21:01] Jeremy: From the business perspective, you know, what are some of the limitations or long-term problems with leaving an RPA solution in place?
[00:21:12] Alex: That's a good question. Uh, from the business side, there isn't one, it's solved for. Other than servicing it and supporting it, there's no real issue with leaving it in place, especially if it's an internal system, like a mainframe. You guys own that. If it changes, you'll know it; if it changes, it's probably being fixed or addressed.
So there's no problem. However, that's not the only application for RPA. Let's talk about another use case here: your organization uses a bank, and you don't have an internal way to communicate with it. Your user literally has to go to the bank's website, log in, and see information that the bank is saying, hey, this is your stuff, right?
The bank doesn't have an API for that service, because that would be scary for the bank. They say, we don't want to expose this to another service. So the human has to go in there, log in, look at maybe a PDF, download it, and say, oh, okay.
So that happens in a browser, so it's a newer technology.
This isn't our mainframe built in 1980. You know, it's browser based, it's on the internet and all that, but that's still a valid RPA application, right? It's a human process. There's no API, there's no easy programmatic way to solution for it. It would require the bank and your IT team to get together and, you know, hate each other thinking about why this is so hard. So let's just throw a bot on it that's going to go and log in, download this thing from the bank's website, and then send it over to someone else. And it's going to do that all day, every day. That's a valid application. And then tomorrow the bank changes its logo, and now my bot is confused.
Stuff has shifted on the page. It doesn't know where to click anymore. So you have to go in and update that bot, because sure enough, that bank's not going to send out an email saying, hey, by the way, we're upgrading our website in two weeks. Not going to happen; you'll know after it's happened.
So that's where you're going to have to upgrade the bot, and that's the indefinite use of RPA: it's going to have to keep going until someone else decides to upgrade their systems and provide for a programmatic solution, which is completely outside the organization's capability to change. And so that's where the business would say, we need this indefinitely.
It's not up to us. And so that is an indefinite solution that would be valid, right? You can keep that going for 10 years, although I would say you probably need to get a bank that maybe meets your business needs a little more easily, but it's valid. And that would be a good way for the business to say, yes, this needs to keep running forever, until it doesn't.
[00:24:01] Jeremy: You brought up the case where the webpage changes and the bot doesn't work anymore. Specifically, you're giving the example of finance, and I feel like it would be basically catastrophic if the bot is moving money somewhere it shouldn't be, because the UI has moved around or the button's not where it expects it to be.
And I'm kind of curious what your experience has been with that sort of thing.
[00:24:27] Alex: You need to set organizational thresholds and say, if something is this impactful, or something could go this wrong, it is not acceptable for us to solve with RPA. Even though we could do it, it's just not worth it. Some organizations say that's anything that touches customer data; healthcare and banking especially say, yeah, we have a human process where the human will go and issue refunds to a customer, and that could easily be done via an RPA solution, but it's fraught with what-ifs. If it does something wrong, it's literally going to impact someone somewhere, their money or their security or something like that. So that definitely should be part of your evaluation, and as an organization, you should set that up early and stick to it, and say, nope, this is outside our purview; even though we can do it, it has these risks.
So I guess the answer to that is you should never get to that point. But now let's talk about, I guess, the actual nuts and bolts of how RPA solutions work, and how they can be made to not act on stuff when it changes, or if it does. RPA software, by and large, operates by exposing the operating system's or the browser's underlying models and interpreting them.
Right. So when we talk about something like a mainframe emulator, you have your RPA software on Microsoft Windows. It's going to use COM, the Component Object Model, to see what is on the screen, what is on that emulator, and it's going to expose those objects to the software and say, you can pick these things and click on that and do that.
When we're talking about a browser, what the RPA software is looking at is not only the COM of the browser itself, but also the DOM, the Document Object Model, that is the webpage being served through the browser. And it's exposing that and saying, these are the things that you can touch or operate on.
And so when you're building your bots, what you want to make sure is that the thing you're trying to access is truly unique, and that if the page changes, that one thing the bot is looking for will not change. So let's go back to the banking website, right?
We go in and we launch the browser, and the bot is sitting there waiting for the operating system to say, this process is running, which is what you wanted to launch, and it is in this state. You know, the bot says, okay, I'm expecting this kind of COM object to exist. I see it does exist, it's this process, and it has this kind of name. Cool, Chrome is running. Okay, let's go to this website. And after I've typed this in, I'm going to wait and look at the DOM and wait for it to return the expected webpage name. But they could change their webpage name, the title of it, right? One day it can say, hello, welcome to this bank, and the next day it says, bank website. All of a sudden your bot breaks; it's no longer finding what it was told to expect.
So you want to find something unique that conceivably will never change. And so you find that one thing in the DOM on the banking website, you know, this element or this tag, and say, okay, there's no way they're changing that. And so it says, cool, the page is loaded. Now click on this field, which is log in.
Okay, you want to find something unique on that field that won't change when they upgrade, you know, from Bootstrap to some other UI framework. That's all well and good. That's what we call the happy path; it's doing this perfectly. Now you need to define what it should do when it doesn't find these things, which is not to keep going or find something similar. It needs to fail fast and gracefully, and pass that failure on to someone, and not keep going. And that's kind of how we prevent that scary use case where it's like, okay, it's gone in, it's logged into the bank website, and now it's transacting bad things to bad places that we didn't program it for. Well, you unfortunately did not specify, in a detailed enough way, what it needs to look for.
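The fail-fast behavior Alex describes can be sketched in a few lines. This is a hypothetical simulation, not any vendor's API: the page is modeled as a plain dictionary of element IDs, and the bot raises the moment its unique anchor is missing, instead of guessing at something "close enough".

```python
# Hypothetical sketch of fail-fast element lookup in a bot.
# The "page" dict stands in for the DOM the RPA software exposes.

class ElementNotFound(Exception):
    """Raised so the bot stops and escalates instead of guessing."""

def find_unique(page: dict, element_id: str) -> str:
    # Fail fast: if the unique anchor is gone, the page has changed
    # and the bot must not keep going.
    if element_id not in page:
        raise ElementNotFound(
            f"Expected unique element '{element_id}' not on page; "
            "halting and passing failure to a human."
        )
    return page[element_id]

# Happy path: the anchor the developer chose is present.
page = {"login-field": "<input id='login-field'>", "title": "Bank"}
print(find_unique(page, "login-field"))

# Unhappy path: the bank redesigned the page; the bot stops cleanly.
redesigned = {"sign-in-box": "<input id='sign-in-box'>"}
try:
    find_unique(redesigned, "login-field")
except ElementNotFound as e:
    print("bot halted:", e)
```

The point is the explicit exception: nothing downstream of a missing anchor ever runs, which is exactly the property that keeps the bot from "transacting bad things to bad places".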
And if it doesn't find that, it needs to break, instead of saying this is close enough. And so, as in all things software engineering, it's that specificity, it's that detail, that you need to hook onto. And that's also where the low-code, no-code pitch comes in, the way RPA is sometimes marketed to the business.
It's just so often not the case, because, yes, it might provide a very business-friendly interface for you to build bots, but the knowledge you need to ensure stability and accuracy in building them is a familiarity that's probably not going to be had in the business. It's going to be had by a developer who knows what the DOM and COM are, how the operating system exposes services and processes, and how
JavaScript works, especially when we're talking about single-page apps and React, where you do have this very reactive DOM that's going to change. You need to be fluent with that and know not only how HTML tags work and how CSS classes will change stuff on you, but also how clicking on something as simple as a username input field in a single-page app will dynamically change that whole DOM, and you need to account for it. So it's traditionally not as easy as saying, oh, the business person can just click, click, click, and then we have a bot. You'll have a bot, but it's probably going to be breaking quite often. It's going to be inaccurate in its execution.
Say this is a business-friendly, user-friendly, non-technical tool. I launch it and it says, what do you want to do? And it says, let me record what you're going to do. And you say, cool.
And then you go about it: you open up Chrome and you type in the browser, and then you click here, click there, hit send, and then you stop recording. The tool says, cool, this is what you've done. Well, I have yet to see a solution that doesn't need further direction or defining on that process. You still need to go in there and say, okay, yeah,
you recorded this correctly, but you're not interpreting that field I clicked on correctly, or as accurately as you need to.
And if anybody hits F12 on their keyboard while they have Chrome open, they'll see how the DOM is built, and especially if this is using any kind of templated webpage software, it's going to have a lot of cruft in that HTML. So while, yes, the recording did correctly see that you clicked on the input box,
what it actually saw is that you clicked on the div that is four levels above it, which is the parent, and there are other things within that as well. And so the software could be clicking on that later, but other things could be in there, and you're going to get some instability.
So the human or the business bot builder, the roboticist, I guess, would need to say, okay, listen, we need to pare this down. But it's even beyond that. There are concepts that you can't get around when building bots that are unique to software engineering, and even though they're very basic, it's still sometimes hard for the business user to learn them.
And I'm talking concepts as simple as for loops, or loops in general, where the business of course has knowledge of what we would call a loop, but they wouldn't call it a loop, and it's not as precisely defined. So they have to learn that, and it's not as easy as just saying, oh yeah, do a loop. The business will say, well, what's a loop?
Like, I know conceptually what a loop could be, like a loop when I'm tying my shoe. But a loop is a very specific thing in software, with rules about what you can do and when you shouldn't do it, and that's something that, no matter how good your low-code, no-code solution might be, it's going to have to afford for.
And so a business user is still going to have to have some lower-level capability to apply those concepts, and I've yet to see anybody get around that in their RPA solutions.
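The loop concept Alex is pointing at is worth making concrete. A minimal sketch, with hypothetical invoice data: a loop in the software sense repeats the same steps once per item and then stops on its own, which is the part business users have to internalize.

```python
# A loop, in the software sense: repeat the same steps once per item,
# with a clear end condition. The invoice records are made up for
# illustration.
invoices = [
    {"id": "INV-1", "amount": 120.0},
    {"id": "INV-2", "amount": 80.0},
    {"id": "INV-3", "amount": 45.5},
]

processed = []
for invoice in invoices:          # runs exactly once per invoice
    processed.append(invoice["id"])

print(processed)       # every invoice handled, in order
print(len(processed))  # and the loop ended on its own: no infinite repeat
```

The bounded, predictable repetition is the whole idea; "do this for each row until there are no more rows" is what an RPA studio's loop block compiles down to.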
[00:33:42] Jeremy: So in your experience, even though these vendors may sell it as a tool that anybody can just sit down and use, you would want a developer to sit with them, or see the result, and try to figure out, okay, what do you really want this code to do?
Not just these broad strokes that you were hoping the tool was going to take care of for you? Yeah.
[00:34:06] Alex: That's exactly right. And every organization will come to that realization pretty quickly. The ahead-of-the-game ones have said, okay, we need a really good COE structure to this robotic operating model, where we can have a software engineering developer capability that sits with the business capability,
and they can marry with each other. Other businesses, who may take these vendors at their word and say, it's low code, meant for business, we just need to make sure it's on and accessible, and then our business people are just going to go in there and do this, find out pretty quickly that they need some technical guidance, because they're building unstable or inaccurate bots.
And whether they come to that sooner or later, they always come to it, and they realize that, okay, there's a technical capability needed. And this is not just RPA; this is the story of all low-code, no-code solutions that have ever existed. It always comes around to: while this is a great interface that makes concepts easy,
Every single time, there is a technical capability that needs to be afforded.
[00:35:26] Jeremy: For the. The web browser, you mentioned the DOM, which is how we typically interact with applications there. But for native applications, you, you briefly mentioned, COM. And I was wondering when someone is writing, um, you know, a bot, uh, what are the sorts of things they see, or what are the primitives they're working with?
Like, is there a name attached to each button, each text field?
[00:35:54] Alex: Wouldn't that be a great world to live in? There's not. As we build things in the DOM, people are getting much better about using uniqueness when they build those things, so that automation can latch on. But when things were built for the COM, or, you know, .NET for the OS, no one was thinking, oh yeah, we're going to automate this,
or, we need to make this button here unique from that button over there in the COM. They didn't care, you know, about different names. So yeah, that is sometimes a big issue when you're using an RPA solution. You say, okay, cool, look at this calculator app,
and okay, it's showing me the component object model this was built with. It's describing what it's looking at, but none of these nodes have a name. They're all, you know, node one, node 1.1, node two, or whatnot, or a button is just "button" and there's no uniqueness around it. And you see a lot of that in legacy, older software, and legacy here is things built in 2005, 2010.
You do see that, and that's the difficulty. At that point you can still solve for this, but what you're doing is using send keys. So instead of saying, okay, RPA software, open up this application and then look for this object in the COM and click on it, it can't; there is no uniqueness.
So what you say is, just open up the software and then just hit tab three times, and that should get you to this one place that was not unique, but we know if you hit tab three times, it's going to get there. Now, that's all well and good, but there are so many things that could interfere with that and break it.
And there's no context for the bot to grab onto to verify, okay, I am there. So any one thing, say a pop-up, essentially hijacks your send keys, right? The bot, yes, absolutely hit tab three times, and it should be in that one place. It thinks it is, and it hits enter. But in between the first and second tab, a pop-up happened, and now it's latched onto this other process. It hits enter, and all of a sudden Outlook is opening. The bot doesn't know that, but it's still going, and it's going to enter financial information into, oops, an email that it launched, because it thought hitting enter again would do the right thing. Yeah.
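That pop-up scenario can be simulated in a toy model. This is hypothetical, not any RPA vendor's API: the "focus sequence" is just a list of which window would receive each keystroke, and the contrast is between a bot that sends keys blindly and one that verifies its context before every key.

```python
# Hypothetical simulation of the send-keys problem. "Focus" is which
# window receives the next keystroke; a pop-up can steal it mid-run.

def blind_bot(focus_sequence):
    # Sends tab-tab-enter without ever checking where focus is.
    # Returns the window that receives the final enter.
    return focus_sequence[-1]

def checked_bot(focus_sequence, expected="mainframe"):
    # Verifies focus before every keystroke; halts the moment it drifts.
    for focus in focus_sequence:
        if focus != expected:
            return f"halted: focus stolen by {focus}"
    return expected

# No pop-up: both bots behave the same.
print(blind_bot(["mainframe", "mainframe", "mainframe"]))
print(checked_bot(["mainframe", "mainframe", "mainframe"]))

# A pop-up (say, Outlook) grabs focus between the first and second tab.
hijacked = ["mainframe", "outlook", "outlook"]
print(blind_bot(hijacked))    # the enter lands in Outlook: the scary case
print(checked_bot(hijacked))  # halts instead of typing into the wrong app
```

The blind bot has no way to even notice the hijack, which is why send keys without context checks is the least stable way to build these.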
That's where you get that instability. There are other ways around it, or other solutions,
and this is where you get into using lower-level software engineering solutioning instead of doing it exactly how the user does it. When we're talking about the operating system and Windows, there are a ton of interop services and assemblies that an RPA solution can access.
So instead of cracking open Excel, double-clicking on an Excel workbook, waiting for it to load, and then reading the information and putting information in, you can use, you know, the Office 365 interop service assembly, or whatnot, and say, hey, launch this workbook without showing the UI, attach to that process you know it is,
and then, using that assembly, just send information into it. The human user can't do that; they can't manipulate stuff like that. But the bot can, and it achieves the same end the human user is after, and it's much more efficient and stable, because the UI couldn't afford that kind of stability.
So that would be a valid solution. But at that point, you're really migrating into a software engineering, IT developer solution for something you were trying not to do that for. So why not just go and solve it with an enterprise or programmatic solution in the first place?
So that's the balance.
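The "write into the workbook without ever opening the application" idea can be sketched with the standard library. This is an illustrative stand-in, not the Office interop assemblies Alex mentions: it uses a CSV file in place of an Excel workbook, but the principle is the same, send the data straight to the file and skip the UI entirely.

```python
# Illustrative sketch: write rows straight into a data file the way an
# interop assembly writes into a workbook, with no application UI ever
# opening. CSV from the standard library stands in for the Excel interop.
import csv
import os
import tempfile

rows = [["account", "balance"], ["A-100", 2500], ["A-200", 975]]

path = os.path.join(tempfile.gettempdir(), "report.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)   # data goes in directly; nothing renders

# Reading it back is just as direct: no waiting for a window to paint
# 200,000 lines on screen.
with open(path, newline="") as f:
    print(list(csv.reader(f)))
```

Nothing here resembles the human's click-and-wait process, which is exactly the trade-off being discussed: faster and more stable, but no longer a replica of what the user does.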
[00:40:18] Jeremy: Earlier you were talking about how RPA needs to be something the person is able to do, and it sounds like in this case there still is a way for the person to do it. They can open up the Excel sheet, right? It's just that the way the RPA tool is doing it is different.
[00:40:38] Alex: Right, and more efficient and more stable, certainly. Especially when we're talking about Excel: you have a workbook with, you know, 200,000 lines? Just opening that, that's your day. Excel is just going to take its time opening and visualizing that information for you, whereas an RPA solution doesn't even need to crack that open.
It can just send data directly to that workbook, and that's a valid solution. And again, some of these processes might involve just two people at your organization, so it's not at a threshold where you need an enterprise solution, but they're spending 30 minutes of their day just waiting for that Excel workbook to open, then manipulating the data and saving it.
And then, oh, their computer crashed. So you can do an RPA solution; it's going to essentially build a more efficient way of doing it, and that would be using the programmatic solution. But you're right, it is doing it in a way that a human could not achieve. And that, again, is where the discipline and the organizational aspect of this comes in, where you're asking, is that acceptable?
Is it okay to have it do things in this way, that are not human, but achieve the same ends? And if you're not disciplined, that creeps, and all of a sudden you have an RPA solution doing things in a way the whole reason for bringing in RPA was to avoid. And that's usually where the stuff falls apart. IT all of a sudden perks their head up and says, wait, I have a lot of connections coming in from this one computer, doing stuff very quickly with, you know, a SQL query. What is going on? And so, all of a sudden, someone built a bot to essentially make a programmatic connection.
And it's like, you should not be doing that. Who gave you these permissions? Who did this? Shut down everything that is RPA here until we figure out what you guys went and did. So that's the dance.
[00:42:55] Jeremy: It's almost like there's this hidden API, or this API that you're not intended to use, but in the process of trying to automate this thing, you use it, and then if your IT is not aware of it, things just kind of spiral out of control.
[00:43:10] Alex: Exactly right. So a use case of that would be, we need to get California tax information on alcohol sales. We need to see what each county taxes for alcohol to apply to something. And so today the human users go into the California, you know, tobacco, wildlife, whatever website, and they look up stuff, and okay, that's very arduous.
Let's throw a bot on that. Let's have a bot do it. Well, the bot developer, a smart person, knows their way around Google, and they find out, well, California has an API for that. So instead of the bot cracking open Chrome, it's just going to send this REST API call and get information back, and that's awesome and accurate and way better than anything. But now, all of a sudden, IT sees connections going in and out, happening very quickly, and information coming into your systems in a way that you did not know was going to be happening. And so while it was all well and good, it's a good way for the people whose job it is to protect you, or know about these things, to get very angry, rightly so, that this is happening.
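The contrast between scraping a page and calling the API directly can be sketched briefly. The endpoint URL and JSON payload here are entirely made up for illustration; the real California service would differ. The request is only built, never sent, so the response is simulated.

```python
# Hedged sketch: calling a public API directly instead of scraping the
# page through a browser. The endpoint and field names are hypothetical.
import json
from urllib.request import Request

def build_request(county: str) -> Request:
    # One small, well-formed HTTP request replaces an entire browser
    # session: no Chrome, no DOM, no clicking.
    url = f"https://api.example.ca.gov/alcohol-tax?county={county}"
    return Request(url, headers={"Accept": "application/json"})

def parse_response(body: str) -> float:
    # The API hands back structured data; nothing on screen to scrape.
    return json.loads(body)["tax_rate"]

req = build_request("Alameda")
print(req.full_url)

# Simulated response body, standing in for what the server would return.
print(parse_response('{"county": "Alameda", "tax_rate": 0.0425}'))
```

This is also exactly the traffic pattern that makes IT perk up: a single machine issuing rapid, structured calls looks nothing like a human in a browser.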
That's an organizational challenge, an oversight challenge, and a developer challenge, because what you're getting into is the problem of having too-technical people build these RPA bots, right? So on one hand, we have business people who are told, hey, just crack this thing open and build it.
Well, they don't have enough technical fluency to actually build a stable bot, because they're just taking it at face value. On the other hand, you have software engineers or developers who are very technical and say, oh, this process? Yeah, okay, I can build a bot for that. But what if I used, you know, these interop services and assemblies that Microsoft gives me, and I can access it like that?
And then I can send an API call over here, and while I'm at it, I'm just going to spin up a server on this one computer that the bot can talk to. And so you have the opposite problem. Now you have something that is not at all RPA; it's just using the tool to manipulate stuff programmatically.
[00:45:35] Jeremy: So, as a part of all this, it's using the same credentials as a real user, right? You're logging in with a username and password. If the form requires something like two-factor authentication, how does that work, since it's not an actual person?
[00:45:55] Alex: Right. So in a perfect world, you're correct: a bot is a user. I know a lot of times you'll hear people say, oh, I have 20 RPA bots. What they're usually saying is, I have 20 automations being run for separate processes with one user's credentials on a VDI. So you're right.
They are using a user's credentials, with the same permissions as any user that does that process; that's why it's easy. But now we have these concepts, like two-factor authentication, which every organization is using, that should require something that exists outside of that bot user's environment. So how do you afford for that? In a perfect world, it would be a service account, not a user account, and service accounts are governed a little differently. A lot of times service accounts have much more stringent rules, but also allow for things like password resets not being a thing, or two-factor authentication not being a thing.
So that would be the perfect solution, but now you're dragging in IT, and if you're not structurally set up for that, that's going to be a long slog. So what some people will actually, literally do is have a business person keep the two-factor auth for that bot user on their phone,
and then they'll just go in and say, yeah, that's me. That's untenable. So what a lot of these, like Microsoft, for instance, allow you to do is install a two-factor authentication application on your desktop, so that when you go to log into a website and it says, hey, type in your password,
cool, okay, now give me that code that's on your two-factor auth app, the bot can actually launch that app, copy the code, paste it in there, and be on its way. But now you're having to afford for things that aren't really part of the process you're trying to automate; they're the incidentals that also happen.
And so you have to build your bot to afford for those things and interpret, oh, I need to do two-factor authentication. And a lot of times, especially if you have an entirely business-focused robotic operating model, they will forget about those things, or find ways around them that the bot isn't addressing, like having that authenticator app on a person's phone.
That's stuff that definitely needs to be addressed, and sometimes it's only found at runtime: oh, it's asking for a login, and when I developed it, I didn't need to do that because I had, you know, the cookie that said you're good for 30 days. But now, oh no.
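The desktop authenticator app Alex describes is just computing a time-based one-time password (TOTP, RFC 6238), and a bot with access to the shared secret can do the same math itself. A minimal sketch with the standard library, verified against the RFC's published test vector; treating the secret this way is an assumption about one possible setup, not a vendor feature:

```python
# Minimal RFC 6238 TOTP sketch: the same math a desktop authenticator
# app performs. A bot holding the shared secret can compute the code
# itself instead of a human reading it off a phone.
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, for_time=None, step: int = 30) -> str:
    key = base64.b32decode(secret_b32)
    counter = int((time.time() if for_time is None else for_time) // step)
    msg = struct.pack(">Q", counter)              # 8-byte big-endian counter
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                    # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{code % 1_000_000:06d}"              # 6-digit code

# RFC 6238 test vector: secret "12345678901234567890" (base32-encoded
# below); at Unix time 59 the 6-digit SHA-1 code is 287082.
secret = base64.b32encode(b"12345678901234567890").decode()
print(totp(secret, for_time=59))   # → 287082
```

Whether the bot should hold that secret at all is the real organizational question; computationally, though, this is the entire "incidental" step the bot has to afford for.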
[00:48:47] Jeremy: Yeah. You could have two-factor, you could have it asking you to check your email for a code, there could be a fraud warning. There are all sorts of, you know, failure cases that can happen.
[00:48:58] Alex: Exactly. And those things, when we talk about third-party provider vendors, like going back to the banking website: if you don't tell them that you're going to be using a bot to get their information or to interface with their website, you're setting yourself up for a bad time, because they're going to see that kind of runtime behavior that is not possible at scale by a user.
And so you run into that issue at runtime. But then, you're correct, there are other things you might run into at runtime that are, again, not part of the process; the business didn't think they were part of the process. It's just something they do that the bot actually has to afford for. That's part of the journey in building these.
[00:49:57] Jeremy: When you're building these bots, what are the types of tools that you've used in the past? Are these commercial packages? Are they open source? What does that ecosystem look like?
[00:50:11] Alex: Yeah, in this space we have three big ones: Automation Anywhere, UiPath, and Blue Prism. Those are the RPA juggernauts providing this software to the companies that need it. And then you have smaller ones that are trying to get in there, or provide stuff in a little different way. And you even have big juggernauts now trying to provide for it, like Microsoft with something like Power Automate Desktop.
Say three years ago, all of these RPA solution softwares operated in the same kind of way: you would install the software on your desktop, and it would provide you a studio to either record or define the process that was going to be automated on that desktop when you pushed play. And they all kind of operated the same way; they would interpret the COM or the DOM that the operating system, or things like the task scheduler, have traditionally exposed. Their real value proposition was the orchestration capability and the management of that.
So I build a bot to do this, Jim over there built a bot to do that. This RPA software not only enabled you to define those processes, but its real value was that it gave you a place where you could say, this needs to run at this time on this computer,
and I need to be able to monitor it, and it needs to return information, and all that kind of orchestration capability. Now, all of these RPA solutions actually exist, like everything else, in the browser. So instead of installing the application and launching it, with the orchestration capability installed on another computer that looked at these machines and ran stuff on them,
now it's all in the cloud, as it were, and in the browser. So I go to wherever my RPA solution is in my browser, and it says, okay, cool, you still need to install something on the desktop where you want the bot to run, and it deploys it there. But I define and build my process in the provided browser studio,
and then they give you the capability to orchestrate, monitor, and receive information on the bots you have running. And what they're now providing as well is the ability to tie other services into your bot so that it has expanded capability. So I'm using Automation Anywhere, and I built my bot, and it's going, and it's doing this or that.
And Automation Anywhere says, hey, that's cool, but wouldn't you like your bot to be able to do OCR? Well, we don't have our own OCR engine, but you, as an enterprise, probably do. Just use your Kofax OCR engine, or, hey, if you're really high speed, why don't you use your Azure Cognitive Services capability?
We'll tie it right into our software. And so when you're building your bot, instead of just cracking open a PDF and doing send-keys Ctrl+C and Ctrl+V to move stuff around, we'll use the OCR engine you've already paid for to understand it. And that's how they expand what they're offering into addressing more and more capabilities.
[00:53:57] Alex: But now we're migrating into territory where it's like, well, things have APIs; why even build a bot for them? You can just build a program that uses the API and let the user drive it. And that's where people kind of get stuck: they're using RPA on something that just as easily provides for a programmatic solution.
but because they're in their RPA mode and they say, we can use a bot for everything, they don't even stop and investigate and say, Hey, wouldn't this be just as easy to generate a react app and let a user use this because it has an API and IT can just as easily monitor and support that because it's in an Azure resource bucket.
That's where an organization needs to be clear-eyed and say, okay, at this point RPA is not the actual solution. We can do this just as easily over here, so let's pursue that.
[00:54:57] Jeremy: The experience of making these RPAs: it sounds like you have this browser-based IDE, there's probably some kind of drag-and-drop setup, and then you mentioned JavaScript. So does that mean you can dive a little deeper, and if you want to set up specific rules or loops, you're actually writing that in JavaScript?
[00:55:18] Alex: Not necessarily. So, again, the business does not know what an IDE is; it's a studio. But you're correct, it's an IDE. Whether we're talking about Blue Prism or UiPath or Automation Anywhere, they all have a different flavor of what that looks like and what they enable.
Traditionally, Blue Prism gave you a studio that was more shape based, where you use UML shapes to define or describe your process, whereas Automation Anywhere traditionally used essentially lines, or descriptors. So I say, hey, I want to open this file, and your studio would just show a line that said "open file".
Although now they all have a shape-based way to define your process: go here, here's a circle which represents this, let's do that. Or a way for you to define it more creatively, in a text-based way. When we talk about JavaScript, or anything like that, they all provide predefined actions, like "open a file" or "execute this", but all of them, at least last time I checked, also allow you a way to say, I want to programmatically run something I define.
And since they're all in the browser, it's, you know, JavaScript that you're going to be writing: hey, run this JavaScript, run this function. Previously, things like Automation Anywhere would let you write stuff in .NET, essentially, for that capability, but again, now everything's in the browser.
So yeah, they do provide a capability to introduce more low-level capability into your automation. That can get dangerous. It can be powerful and it can be stabilizing, but it can be a very slippery slope, where you have an RPA bot that does the thing, but really all it does is start up and then execute code that you built.
[00:57:39] Alex: Like what, what was the, the point in the first place?
[00:57:43] Jeremy: Yeah. And I suppose at that point, then anybody who knows how to use the RPA tool, but isn't familiar with that code you wrote, they're just, they can't maintain it
[00:57:54] Alex: you have business continuity and this goes back to our, it has to be replicable, or as close to the human process as you can make it. Because that's going to be the easiest to inherit and support. That's one of the great things about it. Whereas if you're a low level programmer, a dev who says, I can easily do this with a couple of lines of, you know, dot net or, you know, TypeScript or whatever.
And so the bot just starts up in executes. Well, unless someone that is just as proficient comes along later and says, this is why it's breaking you now have an unsupportable business, solution. that's bad Juju.
[00:58:38] Jeremy: you have the software engineers who they want to write code. then you have the people who are either in business or in IT that go, I don't want to look at your code.
I don't want to have to maintain it. Yeah. So it's like you almost, if you're a software engineer coming in, you almost have to fight that urge to, to write anything yourself and figure out, okay, what can I do with the tool set and only go to code if I can't do it any other way.
[00:59:07] Alex: That's correct. And that's the, it takes discipline. more often than not, not as fun as writing the code where you're like, I can do this. And this is really where the wheels come off is. You went to the business that is that I have this process, very simple. I need to do this and you say, cool, I can do that.
And then you're sitting there writing code and you're like, but you know what? I know what they really want to do. And I can write that now. And so you've changed the process and while it is, and nine times out of 10, the business will be like, oh, that's actually what we wanted. The human process was just as close as we could get nothing else, but you're right.
That's, that's exactly what we needed. Thank you nine times out of 10. They'll love you for that. But now you own their process. Now you're the one that defined it. You have to do the business continuity. You have to document it. And when it falls over, you have to pick it back up and you have to retrain.
And unless you have an organizational capacity to say, okay, I've gone in and changed your process. I didn't automate it. I changed it. Now I have to go in and tell you how I changed it and how you can do it. And so that, unless you have built your robotic operating model and your, your team to afford for that, your developer could be writing checks bigger than they can cash.
Even though this is a better capability.
[01:00:30] Jeremy: you, you sort of touched on this before, and I think this is probably the, the last topic we'll cover, but you've been saying how the end goal should be to not have to use the RPAs anymore
And I wonder if you have any advice for how to approach that process and, and what are some of the mistakes you've seen people make
[01:00:54] Alex: Mm Hmm. I mean the biggest mistake I've seen organizations make, I think is throwing the RPA solution out there, building bots, and they're great bots, and they are creating that value. They're enabling you to save money and also, enabling your employees to go on and do better, more gratifying work. but then they say, that's, it that's as far as we're going to think, instead of taking those savings and saying, this is for replacing this pain point that we had to get a bot in the first place to do so.
That's a huge common mistake. Absolutely understandable if I'm a CEO or even, you know, the person in charge of, you know, um, enterprise transformation. Um, it's very easy for me to say, ha victory, here's our money, here's our savings. I justified what we've done. Go have fun. Um, and instead of saying, we need to squirrel this money away and give it to the people that are going to change the system. So that, that's definitely one of the biggest things.
The problem with that is that's not realized until years later when they're like, oh, we're still supporting these bots. So it is upfront having a turnoff strategy. When can we turn this bot off? What is that going to look like? Does it have a roadmap that will eventually do that?
And that I think is the best way. And that will define what kind of processes you do indeed build bots for is you go to it and say, listen, we've got a lot of these user processes, human processes that are doing this stuff. Is there anything on your roadmap that is going to replace that and they say, oh yeah you know, in three years we're actually going to be standing up our new thing.
We're going to be converting. And part of our, uh, analysis of the solution that we will eventually stand up will be, does it do these things? And so yes, in three years, you're good. And you say, cool, those are the processes I'm going to automate and we can shut those off.
That's your point of entry for these things. Not doing that leads to bots running and doing things even after there is an enterprise solution for that. And more often than not, I would say greater than five times out of 10, when we are evaluating a process to build a bot for, easily five times out of 10, we say, whoa, no, actually there's, you don't even need to do this.
Our enterprise application can do this. you just need retraining, because your process is just old and no one knew you were doing this. And so they didn't come in and tell you, Hey, you need to use this.
So that's really a lot of times what, what the issue is. And then after that, we go in and say, Okay.
no, there's, there's no solution for this. This is definitely a bot needs to do this. Let's make sure number one, that there isn't a solution on the horizon six months to a year, because otherwise we're just going to waste time, but let's make sure there is, or at least IT, or the people in charge are aware that this is something that needs to be replaced bot or no bot.
And so let's have an exit strategy. Let's have a turn-off strategy.
Jeremy: When you have applications that are relatively modern, like you have a JIRA, a ServiceNow, you know, they must have some sort of API and it may just be that nobody has come in and told them, you just need to plug these applications together.
[01:04:27] Alex: And so kind of what you're hitting on and surfacing is the future of RPA. Whereas everything we're talking about is using a bot to essentially bridge a gap, moving data from here to there that can't be done, programmatically. Accessing something from here to there that can't be done programmatically.
So we use a bot to do it. That's only going to exist for so long. Legacy can only be legacy for so long, although conceivably, because we had that big COBOL thing, um, maybe longer than we'd all like, but eventually these things will be upgraded. and so either the RPA market will get smaller because there's less legacy out there.
And so RPA as a tool and a solution will become much more targeted towards specific systems or we expand what RPA is and what it can afford for. And so that I think is more likely the case. And that's the future where bots or automations aren't necessary interpreting the COM and the DOM and saying, okay, click here do that.
But rather you're able to quickly build bots that utilize APIs that are built in and friendly. And so what we're talking about there is things like Appian or MuleSoft, which are these kind of API integrators are eventually going to be classified as RPA. They're going to be within this realm.
And I think, where, where you're seeing that at least surfaced or moving towards is really what Microsoft's offering in that, where they, uh, they have something called Power Automate, which essentially is just a very user-friendly way to access APIs that they built or other people have built.
So I want to go and I need to get information to ServiceNow. ServiceNow has an API. Your IT can go in and build you a nice little app that does a little RESTful call to it, or a REST API call to it, gets information back, or you can go in and, you know, use Microsoft Power Automate and say, okay, I want to access ServiceNow.
And it says, cool. These are the things you can do. And I say, okay, I just want to put information in this ticket, and we're not talking about get or patch or put, uh, or anything like that. We're just saying, ah, that's what it's going to do. And that's kind of what Microsoft is, is offering. I think that is the new state of RPA: being able to interface in a user-friendly way with APIs. Cause everything's in the browser, to the point where, you know, Microsoft's enabling add-ins for Excel to be written in JavaScript, which is just the new frontier. Um, so that's, that's kind of going to be the future state of this, I believe.
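As a concrete sketch of the kind of API call being described here, the snippet below assembles a request for ServiceNow's Table API, which tools like Power Automate wrap behind a friendly form. The instance URL and field values are hypothetical placeholders, and a real instance would also require authentication:

```python
import json

# Hypothetical instance URL -- a real ServiceNow instance looks like
# https://yourcompany.service-now.com and requires credentials
# (basic auth or OAuth), which are omitted from this sketch.
INSTANCE = "https://example.service-now.com"

def build_incident_request(short_description: str, urgency: str = "2"):
    """Assemble the URL, headers, and JSON body for a POST to the
    ServiceNow Table API that would create an incident record."""
    url = f"{INSTANCE}/api/now/table/incident"
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = json.dumps({
        "short_description": short_description,
        "urgency": urgency,
    })
    return url, headers, body

url, headers, body = build_incident_request("Printer on floor 3 is offline")
print(url)  # https://example.service-now.com/api/now/table/incident
```

A low-code tool hides exactly this kind of call: the business user picks "create incident" from a menu and never sees the HTTP verb or the JSON underneath.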
[01:07:28] Jeremy: so, so moving from RPAs being this thing, that's gonna click through website, click through, um, a desktop application instead it's maybe more of this high, higher level tool where the user will still get this, I forget the term you used, but this tool to build a workflow, right. A studio. Okay. Um, and instead of saying, oh, I want this to click this button or fill in this form.
It'll be, um, I want to get this information from service now. And I want to send a message using that information to slack or to Twilio, or, um, you're basically, talking directly to these different services and just telling it what you want and where it should go.
[01:08:14] Alex: That's correct. So, as you said, everything's going to have an API, right? Seemingly everything has an API. And so instead of us, our RPA bots or solutions being UI focused, they're going to be API focused, um, where it doesn't have to use the user interface. It's going to use the other service. And again, the cool thing about APIs in that way is that it's not, directly connecting to your data source.
It's the same as your UI for a user. It sits on top of it. It gets the request and it correctly interprets that. And it does the same thing as your UI, where I say I click here and, you know, wherever, it says, okay, yeah, you're allowed to do that. Go ahead. So that's kind of the benefit to that.
Um, but to your point, the, the user experience for whether you're using a UI or API to build up RPA bot, it's going to be the same experience for the user. And then at this point, what we're talking about, well, where's the value offering or what is the value proposition of RPA and that's orchestration and monitoring and data essentially.
Jeremy: we'll take care of hosting these for you. we'll take care of where they're going to run, uh, giving you a dashboard, things like that.
[01:09:37] Alex: That's a hundred percent correct. It's it's providing a view into that thing and letting the business say, I want to no code this. And I want to be able to just go in and understand and say, oh, I do want to do that. I'm going to put these things together and it's going to automate this business process that I hate, but is vital, and I'm going to save it, the RPA software enables you to say, oh, I saw they did that. And I see it's running and everything's okay in the world and I want to turn it on or off. And so it's that seamless kind of capability that that's what that will provide.
And I think that's really where it isn't, but really where it's going. Uh, it'll be interesting to see when the RPA providers switch to that kind of language because currently and traditionally they've gone to business and said, we can build you bots or no, no, your, your users can build bots and that's the value proposition they can go in.
And instead of writing an Excel where you had one very, very advanced user. Building macros into Excel with VBA and they're unknown to the, the IT or anybody else instead, you know, build a bot for it. And so that's their business proposition today.
Instead, it's going to shift, and I'd be interested to see when it shifts where they say, listen, we can provide you a view into those solutions and you can orchestrate them in, oh, here's the studio that enables people to build them.
But really what you want to do is give that to your IT and just say, Hey, we're going to go over here and address business needs and build them. But don't worry. You'll be able to monitor them and at least say, yeah okay. this is, this is going.
[01:11:16] Jeremy: Yeah. And that's a, a shift. It sounds like where RPA is currently, you were talking about how, when you're configuring them to click on websites and GUIs, you really do still need someone with the software expertise to know what's going on. but maybe when you move over to communicating with API, Um, maybe that won't be as important maybe, somebody who just knows the business process really can just use that studio and get what they need.
[01:11:48] Alex: that's correct. Right. Cause the API only enables you to do what it defined, right? So ServiceNow, which does have a robust API, it says you can do these things, the same as a user can only click a button that's there that you've built and said they can click. And so you can't go off the reservation as easily with that stuff. Really what's going to become prime or important is, no longer do I actually have an Oracle server physically in my location with a database.
Instead I'm using Oracle's cloud capability, which exists on their own thing. That's where I'm getting data from. What becomes important about being able to monitor these is not necessarily like, oh, is it falling over? Is it breaking? It's saying, what information are you sending or getting from these things that are not within our walled garden.
And that's really where IT or InfoSec is, is going to be maybe the main orchestrator owner of RPA, because they're, they're going to be the ones to say you can't, you can't get that. You're not allowed to get that information. It's not necessarily that you can't do it, um, and you can't do it in a dangerous way, but it's rather, I don't want you transporting that information or bringing it in.
So that's, that's really, what's the what's going to change.
[01:13:13] Jeremy: I think that's a good place to wrap it up, but, uh, is there anything we missed or anything else you want to plug before we go?
[01:13:21] Alex: No. Uh, I think this was uh pretty comprehensive and I really enjoyed it.
Jeremy: Alex, thanks for coming on the show.
[01:13:28] Alex: No, thank you for having me. It's been, it's been a joy.
[01:13:31] Jeremy: This has been Jeremy Jung for Software Engineering Radio. Thanks for listening.
Josef Strzibny is the author of Deployment from Scratch and a current Fedora contributor. He previously worked on the Developer Experience team at Red Hat.
This episode originally aired on Software Engineering Radio.
You can help edit this transcript on GitHub.
[00:00:00] Jeremy: Today, I'm talking to Josef Strzibny.
He's the author of the book Deployment from Scratch, a Fedora contributor, and he previously worked on the Developer Experience team at Red Hat.
Josef, welcome to Software Engineering Radio.
[00:00:13] Josef: Uh, thanks for having me. I'm really happy to be here.
Jeremy: There are a lot of commercial services for hosting applications these days. One that's been around for quite a while is Heroku, but there's also services like Render and Netlify. Why should a developer learn how to deploy from scratch, and why would a developer choose to self host an application?
[00:00:37] Josef: I think that as web engineers and backend engineers, we should know a little bit more about how we run our own applications that we write. but there is also a business case, right?
For a lot of people, this could be, uh, saving money on hosting, especially with managed databases that can go, high in price very quickly. and for people like me, that apart from daily job have also some side project, some little project they want to, start and maybe turn into a successful startup, you know but it's at the beginning, so they don't want to spend too much money on it, you know?
And, I can deploy and, serve my little projects from $5 virtual private servers in the cloud. So I think that's another reason to look into it. And business wise, if you are, let's say a bigger team and you have the money, of course you can afford all these services. But then what happened to me when I was leading a startup, we were at somewhere (?) and people are coming and asking us, we need to self host their application.
We don't trust the cloud. And then if you want to prepare this environment for them to host your application, then you also need to know how to do it. Right? I understand, I completely get the point of not knowing it, because backend development alone can already be huge.
You know, you can learn so many different databases, languages, whatever, and learning also operations and servers. It can be overwhelming. I want to say you don't have to do it all at once. Just, you know, learn a little bit, uh, and you can improve as you go. Uh, you will not learn everything in a day.
[00:02:28] Jeremy: So it sounds like the very first reason might be to just have a better understanding of, of how your applications are, are running. Because even if you are using a service, ultimately that is going to be running on a bare machine somewhere or run on a virtual machine somewhere. So it could be helpful maybe for just troubleshooting or a better understanding how your application works.
And then there's what you were talking about with some companies want to self-host and, just the cost aspect.
[00:03:03] Josef: Yeah. for me, really, the primary reason would be to understand it because, you know, when I was starting programming, oh, well, first off there was PHP and I, I used some shared hosting thing, just some SFTP. Right. And they would host it for me. It was fine. Then I switched to Ruby on Rails and at the time, uh, people were struggling with deploying it and I was asking myself, so, okay, so you ran rails s like for a server, right. It starts in development, but can you just do that on the server for, for your production? You know, can you just rails server and is that it, or is there more to it? Or when people were talking about, uh, Linux hardening, I was like, okay, but you know, your Linux distribution has some good defaults, right. So why do you need some further hardening? What does it mean? What to change? So for me, I really wanted to know, uh, the reason I wrote this book is that I wanted to like double down on my understanding that I got it right.
[00:03:52] Jeremy: Yeah, I can definitely relate in the sense that I've also used Ruby and Ruby on rails as well. And there's this, this huge gap between just learning how to run it in a development environment on your computer versus deploying it onto a server and it's pretty overwhelming. So I think it's, it's really great that, that you're putting together a book that, that really goes into a lot of these things that I think that usually aren't talked about when people are just talking about learning a language.
[00:04:39] Josef: you can imagine that there are a lot of components you can have in these applications, right? You have one database, maybe you have more databases. Maybe you have a redis key-value store. Uh, then you might have load balancers and all that jazz. And I just want to say that there's one thing I also say in the book, like try to keep it simple. If you can just deploy one server, if you don't need to fulfill some SLA uptime, just do the simplest thing first, because you will really understand it. And when there is an error you will know how to fix it, because when you make things complex for you, then it will be kind of lost, very quickly. So I try to really make things as simple as possible to stay on top of them.
[00:05:25] Jeremy: I think one of the first decisions you have to make, when you're going to self host an application is you have to decide which distribution you're going to use. And there's things like red hat and Ubuntu, and Debian and all these different distributions. And I'm wondering for somebody who just wants to deploy their application, whether that's rails, Django, or anything else, what are the key differences between them and, and how should they choose a distribution?
[00:05:55] Josef: if you already know one particular distribution, there's no need to constantly be on the hunt for a more shiny thing, you know, uh, it's more important that you know it well and, uh, you are not lost. Uh, that said there are differences, you know, and there could be a long list: from goals and philosophy to who makes it, whether community or company, whether it's a rolling distribution or not, length of support, especially for security updates, uh, the kind of init system that is used, the kind of C library that is used, packaging format, package manager, and, what I think most people will care about, the number of packages and their quality or version, right?
Because essentially the distribution is a distribution of software. So you care about the software. If you are putting your own stuff on top of it, you maybe don't care. You just care about it being a Linux distribution and that's it. That's fine. But if you are using more things from the distribution, you might start caring a little bit more.
You know, another thing is maybe support for some mandatory access control, or in the, you know, world of Docker, maybe the most minimal image you can get, because a lot of times you will be building the Docker image from the Dockerfile. And I would say that the two main families of systems that people probably know are ones based on Fedora and those based on Debian. From Fedora, you have, uh, Red Hat Enterprise Linux, CentOS, uh, Rocky Linux.
And on the Debian side you have Ubuntu, which is maybe the most popular cloud distribution right now. And, uh, of course as a Fedora packager I'm kind of, uh, in the fedora world. Right. But if I can, if I can mention two things that I think make sense or are like an advantage of Fedora based systems. And I would say one is modular packages, because traditionally systems shipped for a long time with only one version of a particular component, like let's say postgresql, uh, or Ruby, uh, one big version.
So that means, uh, either it worked for you or it didn't, you know. With databases, maybe you could make it work. With ruby and python versions, usually you start looking at some version manager to compile your own version, because the version was old or simply not the same as the one your application uses. And with modular packages, this changed, and now in fedora and RHEL and all this, we now have several options to install. There are like four different versions of postgresql, for instance, you know, four different versions of redis, but also different versions of Ruby, python. Of course, still, you don't get all of the versions you want, so for some people it still might not work, but I think it's a big step forward, because even when I was working at Red Hat, we were working on a product called software collections.
This was kind of trying to solve this thing for enterprise customers, but I don't think it was particularly a good solution. So I'm quite happy about this modularity effort, you know, and I think the modular packages, I looked into them recently, are, are much better. But I will say one thing: don't expect to use them in the way you use your regular version manager for development.
So, if you want to be switching between versions of different projects, that's not the use case for them, at least as I understand it, not for now, you know, but for server that's fine. And the second, second good advantage of Fedora based system, I think is good initial SELinux profile settings, you know, SE Linux is security enhanced Linux.
What it really is, is a mandatory access control. So, on a usual distribution, you have discretionary permissions that users set themselves on their directories and files, you know, but this mandatory access control means that it's kind of a profile that is there beforehand, that the administrator prepares. And, it's kind of orthogonal to those other security, uh, boundaries you have there. So that will help you to protect your most vulnerable, uh, processes, because especially with SELinux, there are several modes. So there is, uh, MLS (multi-level security) mode that maybe an army would use, you know, but for what we use, what's like the default, uh, it's uh, something called targeted policy.
And that means you are targeting the vulnerable processes. So that means your services that we are exposing to external world, like whether it's SSH, postgresql, nginx, all those things. So you have a special profile for them. And if someone, some, attacker takes over, of your one component, one process, they still cannot do much more than what the component was, uh, kind of prepared to do.
I think it's really good that you have this high-quality settings already made because other distributions, they might actually be able to run with SELinux. But they don't necessarily provide you any starting points. You will have to do all your policies yourself. And SELinux is actually a quite complex system, you know, it's difficult.
It's even difficult to use it as a user. Kind of, if you see some tutorials for CentOS, uh, you will see a lot of people mentioned SELinux maybe even turning it off, there's this struggle, you know, and that's why I also, use and write like one big chapter on SELinux to get people more familiar and less scared about using it and running with it.
[00:12:00] Jeremy: So SELinux is, it sounds like it's basically something where you have these different profiles for different types of applications. You mentioned SSH, for example, um, maybe there could be one for nginx or, or one for Postgres. And they're basically these collections of permissions that a process should be able to have access to whether that's, network ports or, file system permissions, things like that.
And they're, they're kind of all pre-packaged for you. So you're saying that if you are using a fedora based distribution, you could, you could say that, I want SSH to be allowed. So I'm going to turn on this profile, or I want nginx to be used on this system. So I'm going to turn on this profile and those permissions are just going to be applied to the process that that needs it is that is that correct?
[00:12:54] Josef: Well, actually in the base system, there will be already a set of base settings that are loaded, you know, and you can make your own, uh, policy models that you can load. but essentially it works in a way that, uh, what's not really permitted and allowed is disallowed.
that's why it can be a pain in the ass. And as you said, you are completely correct. You can imagine it as, um, nginx as a reverse proxy, communicating with the Puma application server via a Unix socket, right? And now nginx will need to have access to that socket, to even be able to write to the Unix socket, and so on.
So things like that. Uh, but luckily you don't have to know all these things, because it's really difficult, especially if you're starting up. Uh, so there are set of tools and utilities that will help you to use SELinux in a very convenient way. So what you, what you do, what I will suggest you to do is to run SELinux in a permissive mode, which means that, uh, it logs any kind of violations that application does against your base system policies, right?
So you will have them in the log, but everything will work. Your application will work. So we don't have to worry about it. And after some time running your application, you've ran these utilities to analyze these logs and these violations, and they can even generate a profile for you. So you will know, okay, this is the profile I need.
This is the access to things I need to add. Once, after you do that, if, if there will be some problems with your process, if, if some attacker will try to do something else, they will be denied.
That action is simply not happening. Yeah. But because of the utilities, you can kind of almost automate how, how you make a profile and that way is much, much easier.
Yeah.
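The cycle Josef describes maps onto a handful of standard SELinux utilities on Fedora/RHEL-style systems. A rough command walkthrough (these need root, and the module name "myapp" is just an example):

```shell
# Put SELinux into permissive mode: violations are logged, not blocked.
sudo setenforce 0

# ... run your application for a while under normal load ...

# Review the AVC denials collected in the audit log.
sudo ausearch -m AVC -ts recent

# Generate a local policy module from those denials.
sudo ausearch -m AVC -ts recent | audit2allow -M myapp

# Load the generated module, then return to enforcing mode.
sudo semodule -i myapp.pp
sudo setenforce 1
```

After `setenforce 1`, anything the generated profile doesn't allow is denied again, which is the "almost automated" profile-building Josef mentions.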
[00:14:54] Jeremy: So, basically the, the operating system, it comes with all these defaults of things that you're allowed to do and not allowed to do, you turn on this permissive flag and it logs all the things that it would have blocked if you, were enforcing SELinux. And then you can basically go in and add the things that are, that are missing.
[00:15:14] Josef: Yes exactly right.
[00:15:16] Jeremy: the, next thing I'd like to go into is, one of the things you talk about in the book is about how your services, your, your application, how it runs, uh, as, as daemons. And I wonder if you could define what a daemon is?
[00:15:33] Josef: Uh, you can think about them as a, as a background process, you know, something that continuously runs In the background. Even if the virtual machine goes down and you reboot, you just want them again to be restarted and just run at all times the system is running.
[00:15:52] Jeremy: And for things like an application you write or for a database, should the application itself know how to run itself in the background or is that the responsibility of some operating system level process manager?
[00:16:08] Josef: uh, every Linux operating system actually has a so-called init system. It's actually the second process, after the Linux kernel, that starts on the system; it has a process ID of one. And it's essentially the parent of all your processes, because on Linux you always have parents and children, because you use forking to make new processes. And so this is your system process manager. But obviously, if systemd is your system process manager, you already trust it with all the system services, so you can also trust it with your application, right? I mean, who else would you trust? Even if you choose some other process manager, because there are many, essentially you would have to wrap that process manager up as a systemd service, because otherwise you wouldn't have this connection of systemd being the supreme supervisor of your application, right?
When, uh, one of your services struggles, uh, you want it to be restarted and continue. So that's what systemd can do for you, if you, you kind of design everything as a systemd service. Base packages like postgresql already come with systemd services, very easy to use. You just simply start it and it's running, you know. And then for your application, uh, you would write a systemd service, which is a little file.
There are some directives; it's kind of very simple and straightforward, uh, because before, before systemd, people were writing services with bash and it was kind of error prone, but now with systemd it's quite simple. They're just a set of directives, uh, that you learn. You tell systemd, you know, under what user it should run, uh, what working directory you want it to be running with.
Uh, is there an environment file? Is there a pidfile? And then, uh, a few other things, the most important being a directive called ExecStart, which tells systemd what process to start. It will start the process and it will simply oversee it and will look at errors and so on.
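As an illustration of the directives Josef lists, a minimal unit file for a hypothetical Rails/Puma application might look like this (the paths, user name, and ExecStart command are placeholders, not a prescribed layout):

```ini
# /etc/systemd/system/myapp.service
[Unit]
Description=My Rails application
After=network.target

[Service]
User=deploy
WorkingDirectory=/var/www/myapp
EnvironmentFile=/var/www/myapp/.env
ExecStart=/usr/local/bin/bundle exec puma -C config/puma.rb
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

With the file in place, `systemctl daemon-reload` followed by `systemctl enable --now myapp` starts the process and keeps it supervised across crashes and reboots.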
[00:18:32] Jeremy: So in the past, I know there used to be applications that were written where the application itself would background itself. And basically that would allow you to run it in the background without something like a systemd. And so it sounds like now, what you should do instead is have your application be built to just run in the foreground.
and your process manager, like systemd can be configured to, um, handle restarting it, which user is running it. environment variables, all sorts of different things that in the past, you might've had to write in your own bash script or write into the application itself.
[00:19:14] Josef: And there are also some other niceties about systemd, because, for example, you can define how reloading should work. So for instance, you've just changed some configuration and you want to achieve some kind of zero-downtime change, a zero-downtime deploy, you know. You can tell systemd how this can be achieved with your process, and sometimes it cannot be achieved directly. For instance, the Puma application server can fork processes, and it can restart those processes in a way that is zero downtime. But when you want to replace the whole Puma process itself, what do you do, right? And systemd has this nice thing called socket activation. With systemd socket activation, you can make another unit.
It's not a service unit, it's a socket unit; there are many kinds of units in systemd. And you basically make a socket unit that listens for those connections and then passes them to the application. So while the application is restarting, and it can be a completely normal restart, which means stopping and starting, it will keep the connections open, keep the sockets open, and then pass them on when the application is ready to process them.
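Sketched as unit files, this socket-activation setup might look like the following (the names and the port are hypothetical):

```ini
# myapp.socket -- holds the listening socket on the application's behalf
[Socket]
ListenStream=8080

[Install]
WantedBy=sockets.target

# The paired myapp.service receives the open socket from systemd, e.g.:
# [Service]
# ExecStart=/srv/myapp/bin/server
```

While myapp.service is being restarted, systemd keeps the socket in myapp.socket open and queues incoming connections until the service is ready again.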
[00:20:42] Jeremy: So it sounds like if, and the socket you're referring to these would be TCP sockets, for example, of someone trying to access a website.
[00:20:53] Josef: Yes, but it actually works with Unix sockets as well.
[00:20:58] Jeremy: So in that example, let's say a user is trying to go to a website and your service is currently down. You can actually configure systemd to let the user connect, wait for the application to come back up, and then hand that connection off to the application once it's back up.
[00:21:20] Josef: yes, exactly. That, yeah.
[00:21:23] Jeremy: So you're basically able to remove some of the complexity out of the applications themselves for some of these special cases, and offload those to systemd.
[00:21:34] Josef: Yeah, because otherwise you would actually need a second server, right? You would have to start a second server, move traffic there, upgrade or update your first server, and then exchange them back. With systemd socket activation you can avoid doing that and still have this final effect of a zero-downtime deployment.
[00:21:58] Jeremy: So this introduction of systemd as the process manager, I think this happened a few years ago, where a lot of Linux distributions moved to using systemd, and there was some, I suppose, controversy around that. And I'm kind of wondering if you have any perspective on why there are some people who really didn't want that to happen, you know, why that's something people should worry about or not.
[00:22:30] Josef: Yeah, there were, I think, a few things. One was, for instance, the system log, which suddenly became a binary format, and you need a special utility to read it. You know, I mean, it's more efficient, it's in a way better, but it's not the plain text which all administrators prefer or are used to. So I understand the concern, you know, but it's kind of like, it's fine.
You know, at least to me, it's fine. And the second thing people criticize is some kind of scope creep, because systemd is trying to do more and more every year. So some people say it's not the Unix way, that systemd should be very minimal in its scope and not do anything else.
It's partially true, but at the same time, the things that systemd went into, you know, I think they are essentially easier and nicer to use, and that includes the system services. I certainly prefer how it's done now.
[00:23:39] Jeremy: Yeah. So it sounds like we've been talking about systemd as being this process manager: when the operating system first boots, systemd starts, and then it's responsible for starting your applications or other applications running on the same machine. But then it's also doing all sorts of other things.
Like you talked about that socket activation use case, there's logging, I think there's also scheduled jobs. There are all sorts of other things that are part of systemd, and that's where some people disagree on whether it should be one application that's handling all these things.
[00:24:20] Josef: Yeah, yeah. You're right with the scheduled jobs, like replacing cron: you now have two ways to do it. But you can still pretty much choose what you use. I mean, I still use cron, so I don't see a problem there. We'll see how it goes.
[00:24:40] Jeremy: One of the things I remember I struggled with a little bit when I was learning to deploy applications is when you're working locally on your development machine, um, you have to install a language runtime in a lot of cases, whether that's for Ruby or Python, uh, Java, anything like that. And when someone is installing on their own machine, they often use something like a, a version manager, like for example, for Ruby there's rbenv and, for node, for example, there's, there's NVM, there's all sorts of, ways of installing language, run times and managing the versions.
How should someone set up their language runtime on a server? Like, would they use the same tools they use on their development machine or is it something different.
[00:25:32] Josef: Yeah, so there are several ways you can do it. As I mentioned before, with the modular packages, if you find the version there, I would actually recommend trying to do it with the modular package, because the thing is it's so easy to install, you know, and it's kind of instant. It takes no time on your server.
You just install it, it's a regular package. The same is true when building a Docker image, because again, it will be really fast. So if you can use it, I would just use that, because it's kind of convenient. But a lot of people will use some kind of version manager. Technically speaking, they can use only the installer part.
Like, for instance, chruby with ruby-install to install new versions. Right. But then you would have to reference the full paths to your Ruby, and that's very tedious. So what I personally do is just really set it up as if I am on a developer workstation, because for me the mental model of that is very simple.
I use the same thing, you know. And this matters, for instance, when you are referencing what to start in this ExecStart directive in systemd, because you have several choices. For instance, if you need to start Puma, you could be referencing the path that is in your user home, under .gem, the Ruby version number, bin, Puma, you know. Or you can use the version manager; they might have something like chruby-exec to run with the right version of Ruby, and then you pass it the actual Puma command and it will start it for you. But there's something else you can do.
And I think it's kind of beautiful. You can just start bash with a login shell, and then you just give it the bundle exec puma command that you would use normally after logging in. Because if you installed everything normally, you know, you have something in your .bash_profile that will load that environment, that will put the right version of Ruby on the path, and suddenly it works.
And I find it very nice. Because even when you are later logging in to your box, you log in as that application user, and suddenly you have the whole environment. You can start things as you are used to, you know, no problem there.
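The login-shell trick could look like this in the service unit (the paths and the Puma config file here are assumptions, not from the book):

```ini
# Fragment of a hypothetical myapp.service
[Service]
User=myapp
WorkingDirectory=/srv/myapp
# bash -l starts a login shell (loads .bash_profile, so the version
# manager sets up PATH), and -c runs the given command in it
ExecStart=/bin/bash -lc 'bundle exec puma -C config/puma.rb'
```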
[00:28:02] Jeremy: Yeah, something I've run into in the past is, when I would install a language runtime, like you were kind of describing, I would have to type in the full path to get to the Ruby runtime or the Python runtime. And it sounds like what you're saying is: just install it like you would on your development machine.
And then in the systemd configuration file, you actually start a bash login shell and run your application from that shell, so it has access to all the same things you would have in an interactive login environment. Is that right?
[00:28:40] Josef: Yeah, yeah. That's exactly right. So it will be basically the same thing. And it's kind of easy to reason about, you know. You can start with that and maybe change it later to something else, but it's a nice way to do it.
[00:28:54] Jeremy: So you mentioned having a user to run your application. And so I'm wondering how you decide what Linux users should run your applications. Are you creating a separate user for each application you run? Like, how are you making those decisions?
[00:29:16] Josef: Yes, I actually make a new user for my application. Well, at least for the part of the application that is the application server and workers, you know. So nginx might have its own user, postgresql might have its own user; I'm not trying to consolidate that into one user. But in terms of the Rails application, whenever I run Puma or whenever I run sidekiq, that will all be under the one application user.
And I will appropriately set the right access to the directories, so it's isolated from everything else.
[00:30:00] Jeremy: Something that I've seen also, when you are installing Ruby or some other language runtime, is that you have the libraries. In the case of Ruby, there's gems. And when you're on your development machine and you install these gems, these packages, they go into the user's home directory.
And so you're able to install and use them without having, let's say, sudo or root access. Is that something that you carry over to your deployments as well, or do you store your libraries and your gems in some place that's accessible outside of that user? I'm just wondering how you approach it.
[00:30:49] Josef: I would actually keep it next to my application. This kind of touches maybe on the question of where to put your application files on the system. So there is something called the FHS, the Filesystem Hierarchy Standard, you know, that Linux distributions use, of course with some little modifications here and there.
And this standard is basically followed by packagers and enforced in package repositories, but other than that, it's kind of up to you; you could use a different path. It says where certain files should go. So you have /home, we have /usr/bin for executables, /var for logs, and so on and so on.
And now, when you want to put your application files somewhere, you are thinking about where to put them, right? You have essentially, I think, three options. For one, you can put it in home, because as we talked about, I set up a dedicated user for that application, so it could make sense to put it in home.
Why I don't like putting it in home is because there is certain labeling in SELinux that kind of makes your life more difficult; it's not meant to be there. On a system without SELinux, I think it works quite fine. I also did it before, you know; it's not like you cannot do it.
You can. Then you have the kind of default web server locations, you know, like /usr/share/nginx/html or /var/www, and these will be prepared for you with all the SELinux labeling. So when you put files there, things will mostly work, and I also saw a lot of people do that for this particular reason. What I don't like about it is that nginx is just my reverse proxy, you know; it's not that I am serving the files from there.
So I don't like the location for this reason. If it will be just a static website, absolutely put it there; that's the best location. Then you can put it in some arbitrary new location that's not conflicting with anything else. You know, if you want to follow the Filesystem Hierarchy Standard, you put it in /srv, and then maybe the name of the application or your domain name, hostname, whatever you like.
So that's what I do now. I simply deploy from scratch to this location. And as part of the SELinux setup, I simply make a policy module, make a profile, and allow all these paths to work. And so, to answer your question where I would put the gems: they would actually go to this directory; it will be like /apps/gems, for instance.
[00:34:22] Jeremy: So there are a few different places people could put their application. They could put it in the user's home folder, but you were saying that because of the built-in SELinux rules, SELinux is going to basically fight you on that and prevent you from doing a lot of things in that folder.
And what you've chosen to do instead is create your own folder, which I guess you described as being somewhat arbitrary, just a folder that you consistently use in all your projects. And then you configure SELinux to allow you to run whatever you want to run from this custom folder that you've decided on.
[00:34:44] Josef: Yeah, you can say that. You do almost the same amount of work for home or some other location; I simply find it cleaner to do it this way. And in a way, I even fulfill the FHS suggestion to put it in /srv. But yeah, it's completely arbitrary; you can choose anything else. Sysadmins choose www or whatever they like, and it's fine.
It'll work. There's no problem there. And for the gems, actually, they could be in home, you know, but I just instruct bundler to put them in that location next to my application.
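Bundler can be told to do this with a checked-in local config; the resulting file is just a small YAML fragment (the vendor/bundle path is a common convention, an assumption here rather than something from the book):

```yaml
# .bundle/config -- created by: bundle config set --local path vendor/bundle
BUNDLE_PATH: "vendor/bundle"
```

With this in place, `bundle install` puts the gems under the application directory instead of the user's home directory.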
[00:35:27] Jeremy: Okay. Rather than, than having a common folder for multiple applications to pull your libraries or your gems from, uh, you have it installed in the same place as the application. And that just keeps all your dependencies in the same place.
[00:35:44] Josef: Yep,
[00:35:45] Jeremy: and the example you're giving, you're, you're putting everything in /srv/ and then maybe the name of your application. Is that right?
[00:35:55] Josef: Yeah.
[00:35:55] Jeremy: Okay. Yeah, because I've noticed, just looking at different systems, I've seen people install things into /opt, install into /srv, and it can just be kind of tricky, as somebody who's starting out, to know where am I supposed to put this stuff?
So basically it sounds like: just pick a place, and at least if it's in /srv, then sysadmins who are familiar with the standard filesystem hierarchy will know where to look.
[00:36:27] Josef: Yeah, yeah. /opt is also a common location, as you say. Or, you know, if it's actually a packaged web application on Fedora, it can even be in /usr/share. So it might not necessarily be in the locations we talked about before.
One of the things you cover in the book is setting up a deployment system, and you're using shell scripts in the case of the book. And I was wondering how you decide when shell scripts are sufficient and when you should consider more specialized tools like Ansible, Chef, Puppet, things like that.
[00:37:07] Josef: Yeah, I chose bash in the book because you get to see things without abstractions. You know, if I were using, let's say, Ansible, suddenly we are writing YAML files, and Ansible uses a lot of Python modules, and you don't really know what's going on at all times. So you learn to do things with Ansible 2.0, let's say, and then a new Ansible comes out and you have to redo what you did, you know, and I'd have to rewrite the book. But the thing is that with just bash, I can show literally just bash commands: okay, you run this and this happens. And another thing why I use it is that you realize how simple something can be.
Like, you can have a typical cluster with ssh and whatever in maybe 20 bash commands, so it's not necessarily that difficult, and it's much easier to actually understand it if it's just those 20 bash commands. I also think that learning a little bit more about bash is actually quite beneficial, because you encounter it in various places.
I mean, RPM spec files, how the packages are built: that's bash, you know. Language version managers like pyenv and rbenv: that's bash. If you want to tweak them, if you have a bug there, you might look into the source code and try to fix it; it will be bash. Then Dockerfiles are essentially bash, you know, and their entrypoint scripts might be bash.
So it's not like you can avoid bash. So maybe learning just a little bit more than you know, and being a little bit more comfortable, I think it can get you a long way, because even I am not some bash programmer, you know; I would never call myself that. Also consider this: you can have a full-featured Rails application up and running somewhere with maybe 200 lines of bash.
You can understand it in an afternoon, so for a small deployment, I think it's quite refreshing to use bash, and some people miss out by not just doing the first simple thing possible. But obviously, when you get more team members, more complex applications, or a suite of applications, things get difficult very fast with bash.
So obviously most people will end up with some higher-level tool. It can be Ansible, it can be Chef, it might be Kubernetes, you know. So my philosophy, again, is just to keep it simple. If I can do something with bash and it's like 100 lines, I will do it in bash, because when I come back to it after three years, it will work and I can directly see what I have to fix.
You know, if there's a postgresql update, a new location, whatever, I immediately know where to look and what to change. And with high-level tooling, you kind of have to stay on top of the new versions and updates. So yes, bash is very limited, but it's kind of refreshing for a very small deployment, like a side project.
[00:40:29] Jeremy: Yeah. So it sounds like from a learning perspective, it's beneficial because you can see line by line and it's code you wrote and you know exactly what each thing does. Uh, but also it sounds like when you have a project that's relatively small, maybe there, there aren't a lot of different servers or, the deployment process isn't too complicated.
You actually choose to start with bash, and then only move to something more complicated like Ansible or even Kubernetes once your project has gotten to a certain size.
[00:41:03] Josef: Yes, and you'll see it in the book. I even explain a multiple-server deployment using bash, where you can keep your components kind of separate. So your database has its own lifecycle, has its own deploy script, and your load balancer the same. And the same when you have application servers.
Maybe you have more of them. So the nice thing is that when you first write your script to provision and configure one server, then you simply write another supervising script that calls this single script in a loop, and you just change the server variable, the IP address or something.
And suddenly you can deploy to more servers. Of course, it's very basic, and it doesn't have any kind of parallelization to it or whatever, but if you have like three application servers, you can do it and you understand it almost immediately. You know, if you are already a software engineer, there's almost nothing to understand and you can just start and keep going.
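The supervising pattern can be sketched in a few lines of bash (the IP addresses and the body of `provision_server` are hypothetical placeholders, not the book's actual script):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical list of application servers
SERVERS=("192.0.2.10" "192.0.2.11" "192.0.2.12")

provision_server() {
  local server="$1"
  # A real script would run the single-server script over ssh, e.g.:
  #   ssh "deploy@${server}" 'bash -s' < provision.sh
  echo "provisioning ${server}"
}

# The supervising loop: call the single-server script once per server
for server in "${SERVERS[@]}"; do
  provision_server "${server}"
done
```

There is no parallelism here, which matches the spirit of the passage: one script per server, called in a loop.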
[00:42:12] Jeremy: And when you're deploying to servers a lot of times, you're dealing with credentials, whether that's private keys, passwords or, keys to third-party APIs. And when you're working with this self hosted environment, working with bash scripts, I was wondering what you use to store your credentials and, and how those are managed.
[00:42:49] Josef: I use a desktop application called Password Safe that can store my passwords and whatever, and you can also put SSH keys in it, and so on. And then I can simply back up these keys and passwords to some other secure physical location. But basically I don't use any online service for that. I mean, there are services for that, especially for teams, and in the clouds, especially the big clouds, they might have their own services for that. But for me personally, again, I just keep it as simple as I can. It's just on my computer, maybe my hard disk, and that's it. It's nowhere else.
[00:43:23] Jeremy: So would this be a case where, on your local machine, for example, you might have a file that defines all the environment variables for each server? You don't check that into your source code repository, but when you run your bash scripts, they read from that file and use it in deploying to the server?
[00:43:44] Josef: Yeah, generally speaking, yes. But I think with Rails there's a nice option to use their encrypted credentials. So basically you can commit all these secrets together with your app, and the only thing you need to keep to yourself is just one key. So it's much easier to store it and keep it safe, because it's just one thing, and everything else you keep inside your repository.
I know for sure there are other programs that work the same way and can be used with different stacks that don't have this baked in, because Rails has it baked in. If you are using Django, if you are using Elixir, whatever, they don't have it. But I know there are some programs, I don't remember the names right now, that essentially allow you to do exactly the same thing: just commit it to source control, but in a secure way, because it's encrypted.
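The Rails workflow looks roughly like this (the credential name `aws_secret` is a made-up example):

```shell
# Opens config/credentials.yml.enc decrypted in your editor; the file is
# encrypted with config/master.key, which stays out of version control
EDITOR=vim bin/rails credentials:edit

# In application code you would then read a value with:
#   Rails.application.credentials.aws_secret
```

The encrypted credentials.yml.enc is committed with the app; only master.key has to be kept secret.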
[00:44:47] Jeremy: Yeah, that's an interesting solution, because you always hear about people checking passwords and keys into their source code repository, and then, you know, it gets exposed online somehow. But in this case, like you said, it's encrypted and only your machine has the key, so that actually allows you to use the source code repository to store all of that.
[00:45:12] Josef: Yeah. I think for teams, you know, for more complex deployments, there are various tools, from HashiCorp Vault to some cloud providers' offerings, but you can really start simple and keep it very, very simple.
[00:45:27] Jeremy: For logging an application that you're, you're self hosting. There's a lot of different managed services that exist. Um, but I was wondering what you use in a self hosted environment and, whether your applications are logging to standard out, whether they're writing to files themselves, I was wondering how you typically approach that.
[00:45:47] Josef: Yeah, so there are lots of logs you can have, right? From the system log to your web server log, application log, database log, whatever. And you somehow need to stay on top of them, because when you have one server, it's quite fine to just log in and look around, but when there are more servers involved, it's kind of a pain, so people will start to look into some centralized logging system.
I think when you are more mature, you will look at things like Datadog, right, or you will build something of your own on the Elastic Stack. That's what we do on the project I'm working on right now. But I kind of think that there's some upfront cost in setting it all up, you know, and with the Elastic Stack you are essentially building your own logging application.
You can say there's a lot of work involved. I also want to say that you don't look into your logs all that often, especially if you set up proper error and performance monitoring, which is one of the first things I do on my projects.
So those are services like Rollbar and Skylight, and there are some that you can self-host, so if people want to self-host them, they can. But I find it kind of easier, even though I'm self-hosting my application, to just rely on a hosted solution like Rollbar, Skylight, or AppSignal, you know. And I have to say I've especially started to like AppSignal recently, because they kind of bundle everything together.
When you have trouble with your self-hosting, the last thing you want is to find yourself in a situation where your self-hosted logging and error reporting also went down and doesn't work, you know. So although I like self-hosting my application,
[00:47:44] Josef: I kind of like to offload this responsibility to some hosted providers.
[00:47:50] Jeremy: Yeah, so I think that in and of itself is an interesting topic to cover, because we've mostly been talking about self-hosting your applications, and you were just saying how logging might be something that's actually better to use a managed service for. I was wondering if there are other services, for example CDNs, where it actually makes more sense to let somebody else host it rather than hosting it yourself.
[00:48:20] Josef: I think that depends. Logging, for me, is obvious. And then I think a lot of developers kind of fear databases, so they'd rather have some kind of one-click database, you know, with replication and all that jazz taken care of. So I think a lot of people would go for a managed database, and although it may be one of those pricey services, it's also one that actually gives you peace of mind, you know. Maybe I would just point out that even though you get all these automatic backups and so on, maybe you should still try to make your own backup, just to be sure. You know, even if someone promised you something, your data is usually the most valuable thing you have in your application, so you should not lose it.
And some people will go maybe for a load balancer, because it may be easy to start with. Let's say on DigitalOcean, you know, you just click it and it's there. But if you go the opposite direction, if you, for instance, decide to self-host your load balancer, it can also give you more options for what to do with it, right?
Because you can configure it differently. You can even configure it to be a backup server if all of your application servers go down, which could be an interesting use case, right? If you mess up and your application servers are not running because you are just messing with them,
suddenly it's okay, because your load balancer just takes on the traffic. Right. And you can do that if it's your load balancer; the hosted ones are sometimes limited. So I think it comes down to this even for the database: maybe you use some kind of extension that is simply not available in the managed offering, and that kind of makes you self-host something. But if they offer exactly what you want and it's really easy, you know, then maybe you just do it.
And that's why I think I kind of like deploying to virtual machines in the cloud, because you can mix and match all the services, do what you want, and you can always change the configuration to meet your needs. And I find that quite nice.
[00:50:39] Jeremy: One of the things you talk about near the end of your book is how you, you start with a single server. You have the database, the application, the web server, everything on the same machine. And I wonder if you could talk a little bit about how far you can, you can take that one server and why people should consider starting with that approach.
[00:51:13] Josef: Uh, I'm not sure; it depends a lot on your application. For instance, I write applications that are quite simple in nature; I don't have so many SQL calls on one page and so on.
But the applications I worked on before, sometimes they are quite heavy, you know, and even with little traffic they suddenly need a beefier server. So it's a lot about the application, but there are certainly a lot of good examples out there. For instance, the team from the X-Plane flight simulator, they just deploy to one server, you know, the whole backend, all those players flying, because it's essentially simple. And they even use Elixir, which is based on the BEAM VM, which means it's great for concurrency, for distributed systems, great for multiple servers. But it's still deployed to one, because it's simple. And they use a second one only when they do updates to the service, and otherwise they go back to one.
Another one would maybe be Pieter Levels, a maker that already has like a $1 million business, and he has all of his projects on one server, you know, because it's enough. Why do you need to make it complicated? You can run a very profitable service and you might never leave one server; it's not a problem. Another good example, I think, is Stack Overflow. They have, I think, a page where they show you exactly what servers they are running. They have multiple servers, but the thing is they have only a few servers, you know. So those are examples that go against maybe the mantra of spinning up hundreds of servers in the cloud, which you can do.
It's easier when you have to do auto-scaling, because you can just go little by little, you know. But I don't see the point of having more servers. To me, it means more work. If I can do it with one, I do it with one. But I would mention one thing to pay attention to: when you are on one server, you don't want your background workers to suddenly exhaust all the CPU so that your database cannot serve your queries anymore, right? So for that, I recommend looking into control groups, or cgroups, on Linux. You create a simple slice, which is where you define how much CPU power and how much memory can be used, and then you attach it to some processes, you know. And this ties back to systemd services.
They actually have this one directive where you specify your cgroup slice. And then when you have this worker service, and maybe it even forks because it runs some utilities, right, to process images or whatnot, then it will all be contained within that cgroup. So it will not influence the other services you have, and you can say, okay, you know, I give the worker service only 20% of my CPU power, because I don't care if it's fast or not.
That's not important. What's important is that every visitor still gets their page, you know. And if they are waiting for some background process, they will wait, but your service is not going down.
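As systemd configuration, the 20% cap described here could be expressed with a slice unit (the names and limits are illustrative assumptions):

```ini
# background.slice -- caps everything assigned to this slice
[Slice]
CPUQuota=20%
MemoryMax=512M

# In the worker's service unit, attach it to the slice:
# [Service]
# Slice=background.slice
```

Forked child processes of the worker stay inside the same cgroup, so the limits apply to them as well.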
[00:54:34] Jeremy: Yeah. So it sort of sounds like the difference between, if you have a whole bunch of servers, then you have to have some way of managing all those servers, whether that's Kubernetes or something else. Whereas an alternative to that is having one server or just a few servers, but going a little bit deeper into the capabilities of the operating system, like the cgroups you were referring to, where you could specify how much CPU, how much RAM, and so on, for each service on that same machine to use.
So it's kind of changing it. I don't know if it's removing work, but it's changing the type of work you do.
[00:55:16] Josef: Yeah, you essentially maybe have to think about it more, in the sense of splitting up the memory or CPU power. But it also enables you to use, for instance, Unix sockets instead of TCP sockets, and they are faster, you know. So in a way it can also be an advantage for you, in some cases, to actually keep it on one server.
And of course you don't have a network round trip, so that's another saving. So the service will be faster as long as it's running and there's no problem. And for high availability, yeah, it's obviously a problem if you have just one server. But you also have to think about it this way: to be highly available with all your components, from load balancers to databases, you suddenly have a lot of things
You know, to take care and that set up might be complex, might be fragile. And maybe you are better off with just one server that you can quickly spin up again. So for instance, there's any problem with your server, you get alert and you simply make a new one, you know, and if you can configure it within 20, 30 minutes, maybe it's not a problem.
Maybe even you are still fulfilling your, uh, service level contract for uptime. So I think if I can go this way, I prefer it simply because it's, it's so much easy to, to think about it. Like that.
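The Unix-socket setup Josef mentions might look like this in nginx, assuming an app server such as Puma listening on a local socket; the socket path and upstream name are placeholders:

```nginx
# nginx talking to a local app server over a Unix socket instead of
# TCP -- no network round trip when both run on the same machine.
upstream app {
    server unix:/srv/app/tmp/puma.sock fail_timeout=0;
}

server {
    listen 80;
    location / {
        proxy_set_header Host $host;
        proxy_pass http://app;
    }
}
```

The app server would be configured to bind to the same socket path instead of a port.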
[00:56:47] Jeremy: This might be a little difficult to, to answer, but when you, you look at the projects where you've self hosted them, versus the projects where you've gone all in on say AWS, and when you're trying to troubleshoot a problem, do you find that it's easier when you're troubleshooting things on a VM that you set up or do you find it easier to troubleshoot when you're working with something that's connecting a bunch of managed services?
[00:57:20] Josef: Oh, absolutely. I find it much easier to debug anything I set up myself, and especially with one server it's even easier. Simply the fact that you built it yourself means that you know how it works, and at any time you can go and fix your problem. You know, this is the problem I found with services like the DigitalOcean Marketplace.
I don't know what they call these self-hosted apps that you can, like, one-click and have your Rails or Django app up and running. I actually used, when I wasn't that skilled with Linux and all those things, another distribution called TurnKey Linux. It's the same idea, you know: they pre-prepare the profile for you, and then you can just easily run it as if it's a completely hosted thing like Heroku, but actually it's your server and you have to pay attention. But I actually don't like it, because
you didn't set it up. You don't know how it's set up. You don't know if it has some problems, some security issues. And especially the people that come for these services then end up running something and they don't know. I believe they don't know, because when I was running it, I didn't know, right? So they don't even know what they are running.
So if you really don't want to care about it, I think it's completely fine. There's nothing wrong with that. But just go for Render or Heroku and make your life easier, you know?
[00:58:55] Jeremy: Yeah, it sounds like with the solutions where it's like a one-click install on your own infrastructure, you get the bad parts of both. Like, you get the bad parts of having this machine that you need to manage, but you didn't set it up, so you're not really sure how to manage it.
You don't have that team at Amazon who can fix something for you, because ultimately it's still your machine. So that could have some issues there.
[00:59:20] Josef: Yeah. Yeah, exactly. I wouldn't recommend it. Or if you really decide to do it, at least really look inside, you know. Try to understand it, try to learn it, then it's fine. But just to spin it up and hope for the best, that's not the way to go.
[00:59:37] Jeremy: In the book, you cover a few different things that you use, such as Ruby on Rails and nginx, Redis, Postgres. I'm assuming that for the applications you build and self-host, you want them to have as little maintenance as possible, because you're the one who's responsible for all of it.
I'm wondering if there are any other applications that you consider part of your default stack that you can depend on, where the maintenance burden is low.
[01:00:12] Josef: Yeah, so, that's exactly right. If I can, I would rather minimize the amount of dependencies I have. So for instance, I would think twice about using, let's say, Elasticsearch, even though I've used it before, and it's great for what it can do. If I can avoid it, maybe I will try to avoid it. You know, you can have decent full-text search with Postgres today.
So as long as it would work, I would personally avoid it. I think one relational database and, let's say, Redis is kind of necessary, you know. I've worked a lot with Elixir recently, so we don't use Redis, for instance. So it's kind of nice that you can limit the number of dependencies by just choosing a different stack.
Although then you have to write your application in a little different way. So sometimes, yeah, in such circumstances, this could be useful. You know, I think it's not difficult to run it, so I don't see a problem there. I would just say that with services like Elasticsearch, they might not come with a good authentication option.
For instance, I think Elasticsearch offers it, but not in the free version. You know, so I would just like to say that if you are deploying a component like that, be aware that you cannot just keep it completely open to the world, you know? And maybe if you don't want to pay for a version that has it, or the version you are using doesn't have it at all,
you could maybe build a tiny proxy that would just do authentication and pass these requests back and forth. This is what you could do, you know, but just don't forget that you might be running something unauthenticated.
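The "decent full-text search with Postgres" Josef mentions can be sketched like this; the `articles` table and its columns are hypothetical:

```sql
-- A stored tsvector column plus a GIN index gives workable full-text
-- search without running a separate Elasticsearch service.
ALTER TABLE articles
  ADD COLUMN search tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))
  ) STORED;

CREATE INDEX articles_search_idx ON articles USING GIN (search);

-- Query and rank matches:
SELECT title,
       ts_rank(search, websearch_to_tsquery('english', 'self hosting')) AS rank
FROM articles
WHERE search @@ websearch_to_tsquery('english', 'self hosting')
ORDER BY rank DESC;
```

Generated columns need Postgres 12 or later; on older versions a trigger can keep the `tsvector` column up to date instead.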
Jeremy: I was wondering if there are any other applications or capabilities where you would typically hand off to a managed service, rather than trying to deal with it yourself.
[01:02:28] Josef: Oh, sending emails. Not because it's hard, it's actually surprisingly easy to start sending your own emails, but the problem is the deliverability part, right? You want your emails to be delivered, and I think because of the amount of spam everybody's sending,
it's very difficult to get into people's inboxes. You know, you'll simply be flagged, you have some unknown address, and it would just not work. So actually building up some history for an IP address, it could take a while. It could be very annoying, and you don't even know how to debug it. You cannot really write to Google,
hey, you know, I'm just this nice little server, so just consider me. You cannot do that. So I think that's kind of a trouble. So I would say for email delivery, that's another thing where you should just go with a hosted option. You might still configure your server to be able to send emails. That could be useful.
For instance, if you want to do some little thing, like scanning your system log, and when you see some troublesome login or anything that shouldn't happen, maybe you just want an alert email to be sent to you that something fishy is going on. So you can still set that up, even on your server, not just your main application, which might have a nice library for that, you know, to send that email. But you will still need the so-called relay server to just pass your email on. Because building this trust in the email world, that's not something I would do, and I don't think as an independent maker or developer you really have the resources to do something like that. So email would be a perfect example for that, yeah.
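One way to set up the relay Josef describes, assuming Postfix as the local MTA; the provider hostname and credentials file are placeholders for whatever hosted email service you use:

```
# /etc/postfix/main.cf (excerpt) -- send all outgoing mail through a
# hosted provider's SMTP relay instead of delivering directly.
relayhost = [smtp.example-provider.com]:587
smtp_use_tls = yes
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
```

Local tools and cron jobs can then send alert mail as usual, and the provider's reputation handles deliverability.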
[01:04:22] Jeremy: Yeah, I think that's probably a good place to start wrapping up. But is there anything we missed that you think we should have talked about?
[01:04:31] Josef: I think we kind of covered it. Maybe we didn't talk much about containers, which a lot of people nowadays use. Maybe I would just like to point out one thing with containers: you can, again, take a very minimal approach to adopting them. You know, you don't need to go all in on containers at all.
You can just run a little service, maybe your workers, in a container. For example, if I want to run something as part of my application, and the ops team, the developers that develop this one component, already provide a Dockerfile, it's a very easy way to start, right? Because you just deploy their image and you run it, that's it.
And you don't have to learn what kind of different stack it is. Is it Java, is it Python, how would I run it? So maybe you care for your own application, but when you have to just take something that's already made and it has a Docker image, that's a nice way to start. And one more thing I would like to mention is that you also don't really need to use services like Docker Hub.
You know, most people would use it to host their built images as artifacts, so they can quickly pull them and start them on many, many servers and so on. But if you have just one server like me, and you want to use containers, you can just, you know, push the container directly.
Essentially, it's just an archive.
And in that archive there are a few folders that represent the layers, the layers you built, and the Dockerfile, and that's it. You can just move it around like that, and you don't need any external services to run your containers on this little server.
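A sketch of moving an image to a single server without a registry, along the lines Josef describes; `myapp` and `server` are placeholders, and this assumes Docker is installed on both ends:

```
# Build locally, then ship the image archive straight over SSH.
docker build -t myapp:latest .
docker save myapp:latest | gzip | ssh server 'gunzip | docker load'

# Start it on the server.
ssh server 'docker run -d --name myapp myapp:latest'
```

`docker save` writes exactly the archive of layers he mentions, so no Docker Hub account or private registry is involved.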
[01:06:18] Jeremy: Yeah. I think that's a good point, because a lot of times when you hear people talking about containers, it's within the context of Kubernetes, and you know, that's a whole other thing you have to learn. You have to learn not only how containers work, but how to deploy Kubernetes, how to work with that.
And I think it's good to remind people that it is possible to just choose a few things and run them as containers. Like you said, you don't even need to run everything as containers. You can just try a few things.
[01:06:55] Josef: Yeah, exactly.
[01:06:57] Jeremy: Where can people check out the book, and where can they follow you and see what you're up to?
[01:07:04] Josef: They can just go to deploymentfromscratch.com. That's the homepage for the book. And if they want to follow up, they can find me on Twitter. That would be @strzibnyj. I try to put updates there, but also some news from the Ruby, Elixir, and Linux worlds, so they can follow along.
[01:07:42] Jeremy: Yeah. I had a chance to read through the alpha version of the book, and there's a lot of really good information in there. I think it's something that I wish I had had when I was first starting out, because there's so much that's not really talked about. Like, when you go look online for how to learn Django or how to learn Ruby on Rails or things like that, they teach you how to build the application, how to run it on your laptop.
But there's this very large gap between what you're doing on your laptop and what you need to do to get it running on a server. So I think anybody who's interested in learning more about how to deploy their own application, or even how it's done in general, will find the book really valuable.
[01:08:37] Josef: Okay. Yeah, thank you. Thank you for saying that, it makes me really happy. And as you say, that's the idea. I really packed kind of everything you need into that book. And I just use bash, so it's easier to follow and keep it without any abstractions. And then maybe you will learn some other tools and you will apply the concepts, but you can do whatever you want.
[01:09:02] Jeremy: All right. Well, Josef thank you, so much for talking to me today.
[01:09:05] Josef: Thank you, Jeremy.
Michael Ashburne and Maxwell Huffman are QA Managers at Aspiritech.
This episode originally aired on Software Engineering Radio.
Transcript
You can help edit this transcript on GitHub.
Jeremy: [00:00:00] Today I'm joined by Maxwell Huffman and Michael Ashburne. They're both QA managers at Aspiritech. I'm going to start with defining quality assurance. Could one of you start by explaining what it is?
Maxwell: [00:00:15] So when I first joined Aspiritech, I was kind of curious about that as well. One of the main things that we do at Aspiritech, besides quality assurance, is we also give meaningful employment to individuals on the autism spectrum. I myself am on the autism spectrum, and that's what initially attracted me to the company. Quality assurance, in a nutshell, is making sure that products and software are not defective, that they function the way they were intended to function.
Jeremy: [00:00:47] How would somebody know when they've met that goal?
Michael: [00:00:50] It all depends on the client's objectives, I guess. Quality assurance testing is always about trying to mitigate risk. There's only so much testing that is realistic to do. You know, you could test forever and never release your product, and that's not good for business. It's really about balancing: how likely is it that the customer is going to encounter defect X? How much time and energy would be required to fix it?
Overall company reputation impact, there are all sorts of different metrics. And every customer is unique, really. They get to set the pace.
Maxwell: [00:01:30] Does the product work well? Is the user experience frustrating or not? That's always a bar that I look for. One of the main things that we review in the different defects that we find is customer impact,
and how much of this is going to frustrate the customers. And when we're going through that analysis: is this cost-effective or not? The client will determine whether it's worth the cost of the quality assurance and of the fix to the software to make sure that that customer experience is smooth.
Jeremy: [00:02:03] When you talk to software developers now, a lot of them are familiar with the idea that they need to test their code, right? They have things like unit tests and integration tests that they're running regularly. Where does quality assurance fit in with that? Like, is that considered a part of quality assurance, or is quality assurance something different?
Michael: [00:02:24] We try to partner with our clients, because the goal is the same, right? It's to release a quality product that's as free of defects as possible.
We have multiple clients that will let us know, and these are typically clients that we've worked with for a long time, that have sort of established a rhythm. They'll let us know when they've got a new product in the pipeline, and as soon as they have available software requirements, documentation, specs, user guides, that kind of thing, they'll provide that to us, to be able to then plan. Okay, you know, what are these new features? What defects have been repaired since the last build? Or, you know, it all depends on what the actual product is. And we start preparing tests even before there may be a version of the software to test. Now, that's more of what they call a waterfall approach, where it's kind of a back and forth: the client preps the software, we test the software.
If there's something amiss, the client makes changes, then they give us a new build. But just as well, we work in iterative design, or agile is a popular term, of course, where we have embedded testers that are, on a daily basis, interacting with client developers to verify certain parts of the code as it's being developed.
Because of course the problem with waterfall is, you find a defect, and it could be deep in the code, or some sort of linchpin aspect of the code, and then there's a lot of work to be done to try to fix that sort of thing. Whereas embedded testers can identify a defect, or even just a friction point, as early as possible. And so then they don't have to tear it all down and start over. It's just, oh, fix that while they're working on that part, basically.
Jeremy: [00:04:18] So I think there are two things you touched on there. One is the ability to bring QA early into the process. And if I understand correctly, what you were sort of describing is, even if you don't have a complete product yet, if you just have an idea of what you want to build, you start to generate test cases. And it almost feels like you would be a part of generating the requirements, like, what are the things that you need to build into your software, before the team that's building it necessarily knows themselves. Did I sort of get that right?
Maxwell: [00:04:55] I've been on projects where we've worked with the product from cradle to grave. A lot of them haven't gotten all the way to the grave yet, but for some of them, the amount of support they're offering has reached that milestone in the life cycle where they're no longer going to address the defects in the same way.
They want to know that they're there. They want to know what exists. But now there are new products that are being created, right? So we are engaged in embedded testing, which is testing certain facets of the code actively and making sure that it's doing what it needs to do.
And we can make that quick patch on that code and put it out to market. And we're also doing that at earlier stages, in earlier development, where before it's even a fully formed design concept, we're offering suggestions and recommending that, you know, this doesn't follow the design strategy and the concept design. So that part of embedded testing, or unit testing, can be involved at earlier stages as well, for sure.
Michael: [00:06:08] Of course, with those, you have to be very careful. You know, we wouldn't necessarily make blanket recommendations to a new client. A lot of the clients we have have been with us for several years, and so, you know, you develop a rhythm, a common vocabulary. You know, generally speaking, which goals weigh more than other goals, and things like that, from client to client, or even coder to coder.
It's only once we've really developed that shared language that, you know, we would say, by the way, such and such is missing from blankety blank. Say, a great example with a bunch of non-words in it, but I think you get the picture.
Jeremy: [00:06:48] When you're first starting to work on a project, you don't know a whole lot about it, right? You're trying to understand how this product is supposed to work. What does that process look like? Like, what should a company be providing to you? What are the sorts of meetings or conversations you're having, that sort of thing?
Michael: [00:07:08] We'll have an initial meeting with a handful of people from both sides and just sort of talk about both what we can bring to the project and what their objectives are, and, you know, the thing that they want us to test, if you will. And if we reach an agreement that we want to move forward, then the next step would be like a product demo, basically. We would come together, and we would start to fold in leads and some other analysts, you know, people that might be a good match for the project, say. And we always ask our clients,
and they're usually pretty accommodating, if we can record the meeting. You know, now everyone's meeting on Google Meet, virtually and so forth, and so that makes it a little easier. But a lot of our analysts, well, everyone has their own learning style, right? You know, some people are more auditory, some people are more visual.
So we preserve, you know, the client's own demonstration of what it's either going to be like, or is like, or what's wrong, or whatever they want us to know about it. And then we can add that file to our secure project folder, and anybody down the road that's being onboarded, that's an asynchronous resource that they can turn to, right? A person doesn't have to re-demonstrate the software to onboard them. Or sometimes, you know, by the time we're onboarding new people, the software has changed enough that we have to set those aside, actually, and then you have to do a live, in-person kind of deal.
Maxwell: [00:08:32] And you really want to consider, with individuals on the spectrum, the different analysts and testers do have different learning styles. We do want to ask for as many different resources as are available to accommodate for that, but also to be best enabled to be the subject matter experts on the product.
So what we've found is that what we're really involved in is writing test cases, and rewriting test cases to humanize the software, to really get at: what are you asking this software to do? That, in turn, is what the product is doing. A lot of the testing we do is black box testing, and we want to understand what the original design concept is.
So that involves the user interface design document, right, at the early stages of that, if available, or just that dialogue that Michael was referring to, to get that common language of: what do you want this product to do? What are you really asking this code to do? Having recordings or any sort of training material is absolutely essential to being the subject matter experts, and then developing the kind of testing that's required for that.
Michael: [00:09:48] And different clients have different amounts of testing material, so to speak. I mean, everything from, you know, a company that has their own internal test tracking software, and they just have to give us access to it and the test cases are already there, to a physical piece of paper with a checklist that they copied into Excel.
And now, like, these are the things that we look at. But of course there's always a lot more to it than that. That at least gives us a starting point to build off of, you know, testing areas and sections and sort of thematically related features, things like that. And then we develop our own tests on their behalf, basically.
Jeremy: [00:10:29] And when you're building out your own tests, what would be the level of detail there? Would it be a high-level thing that you want to accomplish in the software, or absolute step-by-step, click-by-click?
Michael: [00:10:42] You know, I hate to make every answer conditional, but it sort of depends on the software itself and what the client's goals are. One of our clients is developing a new screen-sharing app that's for developers: both work on the same code at the same time, but they can take turns typing, controlling the mouse, that sort of thing.
And although this product has been on the market for a while, we started out with one of those checklists and now have hundreds of test cases, based both on features that they've added, as well as weird things that we found. Like, sometimes you have to write a test case that tests for the negative, like the absence of a problem, right?
So you can make sure X connects to Y and the video doesn't drop, or if you answer the connection before the first ring is done, it successfully connects anyway, or any host of options. In our test cases for that project, we have a lot of screen caps and stuff, because a picture's worth a thousand words, as the cliche goes.
But we also try to describe the features, not just present the picture with an arrow, like, click here and see what happens. Because again, everyone has sort of different data processing styles, and some would prefer to read step-by-step instructions rather than try to interpret, you know, some colors in a picture.
And what does this even mean out of context?
Maxwell: [00:12:08] And lots of times you'll end up seeing test cases that seem like they could be very easily automated, because literally they're written all in code. And the client will occasionally ask us to do a test cycle scrub, or they'll ask us, okay, well, what can be automated within this?
Right. But one of the key things we really look at is to try to humanize that test case a little more, away from that basic automation. Lots of times that literally involves asking: what are you trying to get out of this test case? Because it's fallen so much into the weeds that you can no longer really tell what you're asking it to do. So lots of times we will help them automate them, but also just give it the proper test environment and the proper steps. You'd really be amazed how many test cases just do not have the proper steps and an actual expected result. And if it's written wrong at that basic manual level, you're not adding value.
So that's one thing where we really have found it adds value to the clients and to their test cycles.
Michael: [00:13:21] A lot of people ask about automation, because it's a very sexy term right now, and it certainly has its place, right? But you can't automate new feature testing. It has to be an aspect of the product that's mature, not changing from build to build. And you also have to have test cases that are mature, where every little virtual or otherwise T is crossed and I is dotted, or else you end up having to do manual testing anyway, because the computer just goes, oh, it didn't work. Because that's really all the automated process can do: either it passes or it doesn't. And so then we have to come in, and we have clients where we do plenty of that. Like, okay, they ran through the tests and these three failed, figure out why. And then they go in and start digging around, and oh, it turns out this is missing, or this got moved in the latest update, or something like that.
Jeremy: [00:14:12] That's an interesting perspective for testing in general, where it sounds like when a feature is new, when you're making a lot of changes to how the software works, that's when manual testing can actually be really valuable. Because as a person, you have a sense of what you want, and if things kind of move around or don't work exactly the way you expect them to, you kind of know what the end goal is. You have an idea of, like, yes, this worked, or no, this didn't. And then once that's solidified, that's when you said it's easier to shift into automatic testing. For example, having an application spin up a browser and click through things, or trigger things through code, things like that.
Michael: [00:14:58] And you have to get the timing just right, because the computer can only wait in so many increments, and if it tries to click the next thing too soon and the page hasn't finished loading, then it's all over. But that's actually the discernment that you were sort of referring to, using your judgment when executing a test.
That's where we really do our best work, and we have some analysts that specialize in exploratory testing, which is where you're just sort of looking around, systematically or otherwise. I personally have never been able to do that very well, but that's critical, because those exploratory tests are always where you turn up the weirdest combination of things.
Oh, I happened to have this old pair of headphones on, and when I switched from Bluetooth to the manual plug, you know, it just disconnected the phones, or the conference call altogether. And who does that, right? But there are all sorts of different kinds of combinations, and who knows what the end user is going to bring? They're not going to necessarily buy all new gear when they get the new computer, the new software, whatever.
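The timing problem Michael describes at the start of this answer is usually handled with an explicit wait: polling a condition with a deadline instead of sleeping a fixed amount. A minimal bash sketch (the function name and the simulated "page load" are illustrative, not from any real test suite):

```shell
#!/usr/bin/env bash
# Poll a command until it succeeds or a timeout (in seconds) expires,
# instead of clicking "too soon" after a fixed sleep.
wait_until() {
  local timeout=$1; shift
  local deadline=$((SECONDS + timeout))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      echo "timed out waiting for: $*" >&2
      return 1
    fi
    sleep 1
  done
}

# Usage: simulate something that becomes ready after a moment.
marker=$(mktemp -u)
( sleep 2; touch "$marker" ) &
wait_until 10 test -f "$marker" && echo ready   # prints "ready"
```

GUI automation frameworks provide the same idea as built-in explicit waits; the point is that the condition, not the clock, decides when to proceed.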
Jeremy: [00:16:05] I feel like there's been kind of a trend in terms of testing at software companies, where they used to commonly have in-house testing or in-house QA that was separated from development.
And now you're seeing more and more of the people on the engineering staff, the development staff, being responsible for testing their own software, whether that be through unit tests, integration tests, or even just using the software themselves. You're getting to the point where more and more people are engineers that maybe have some expertise or some knowledge in testing, and fewer people are specifically dedicated to test. And so I wonder, from your perspective, what is the role of a QA firm, or testers in general, in software development going forward?
Maxwell: [00:16:55] Having specialized individuals that are constantly testing and analyzing the components, and making sure that you're on track to make that end concept design come to life, really is essential, and that's what you get with quality assurance. It's like a whole other wing of your company that is basically making sure that everything you are doing with this product and with this software is within scope,
and whether you could be doing anything better as well. That's the other aspect of it, right? Because lots of times, when we find a component and we've broken something, or we've found a flaw in the design, we look at what that means, bigger picture, for the overall product. And we try to figure out, all right, well, this part of the functionality... is it worth it to fix it? Is it cost-effective, right? So lots of times quality assurance
comes right down to the cost-effectiveness of the different patches, and lots of times it's even about the safety of the product itself. It all depends on what exactly you're designing, but I can give you an example of a product that we were working with in the past, where we were able to get a component to overheat. Obviously that is a critical defect that needs to be addressed and fixed.
That's something that can be found as you're just designing the product. But a specialized division that's just focused on quality assurance, they're more liable, more inclined, and that is their directive: to find those sorts of defects. And I'll tell you, the defect we found that overheated this product, it was definitely an exploratory find. It was actually caught
off of a test case that was originally automated. So we definitely were engaged in a lot of the aspects of the engineering department with this product, but in the end it was exploratory testing.
It was out of scope of what they had automated, and that's what ended up finding it. That's where I really see quality assurance in this field within software engineering really gaining respect and gaining momentum: understanding that, hey, these are really intelligent, potentially software engineers themselves, whose key focus is testing your product and making sure that its design is within scope.
Michael: [00:19:36] It's helpful to have a fresh set of eyes, too. You know, if a person's been working on a product day in, day out for months on end, inevitably there will be aspects that become second nature, which may allow them to effectively skip steps in the course of testing some end result when they're doing their own testing. But you bring in a group of analysts who know testing, but don't know your product other than generally what it's supposed to do, and you sort of have at it, and you find all sorts of interesting things that way.
Jeremy: [00:20:13] Yeah, I think you brought up two interesting points. One of them is the fact that nowadays there is such a big focus on automated testing as a part of a continuous integration process, right? Somebody will write code, they'll check in their code, it'll build, and then automated tests will see that it's still working.
But those tests the developers wrote are never going to find things that no test was ever written for, right? So I think that whole exploratory testing aspect is interesting. And then Maxwell also brought up a good point: it sounds like QA can not just help find what defects or issues exist, but can also help grade how much of an issue those defects are, so that the developers can prioritize. Okay, which ones are really a big deal that we need to fix, versus what are things that, yeah, I guess it's a little broken, but it's not such a big deal.
Maxwell: [00:21:14] In a broader sense, there are certain whole areas of design right now. Uh, Bluetooth is a really big area that we've been working in. I'm the QA manager for the Bose client at Aspiritech, and Bluetooth is really a big thing that is involved in all of their different speakers.
So obviously if we find anything wrong with a certain area, you know, we want them to consider what areas they might want to focus more manual testing and less automation on, right? And we're always thinking feature-specific in that sense, um, to help the clients out as well.
And analysts that are on the spectrum, it's fascinating how they tend to be very particular about certain defects. And they can really find things that are very exploratory, but they don't miss the, uh, forest for the trees, in the sense that they still maintain the larger concept design, funnily enough, where they can let you know, you know, is Bluetooth really the factor in this that should be fixed here, or is it something else? It leads to interesting avenues, for sure.
Michael: [00:22:32] Yeah, Bluetooth is really a bag of knots in a lot of ways, you know, the different versions, different hardware vendors. We work with Zebra Technologies, and they make barcode printers and scanners and so forth. And you know, many of their printers are Bluetooth enabled. But you know, the question is, is it Bluetooth 4.2 or Bluetooth 4, is it backwards compatible?
And, uh, a certain, uh, rather ubiquitous computer operating system is notorious for having trouble with Bluetooth management, whether it's headphones or printers or whatever. And in that instance, because we want to, you know, we're not testing the computer OS, we're testing the driver for the printer, right?
So part of the protocol we wound up having to build into the test cases is like, okay, first go in and deactivate the computer's own resident internal Bluetooth hardware, then connect to this, you know, third-party USB dongle, install the software, make sure it's communicating, then try to connect to your printer. For a long time, an analyst would run into some kind of issue,
and the first question was always: are you using the computer's Bluetooth, or is it a third-party Bluetooth? Is discovery on? Is it Bluetooth Low Energy? Because you don't want to print using Bluetooth Low Energy, because it'll take forever, right?
And then the customer thinks, oh, this isn't working, it's broken, you know, not even knowing that there are multiple kinds of Bluetooth. And yeah, it's, uh, it's hairy for sure.
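The setup protocol Michael describes is essentially an ordered checklist where each step must succeed before the next one runs. A minimal sketch of that idea follows; the step names and the `run_checklist` helper are invented for illustration, not part of any real test framework.

```python
# Hypothetical sketch of the ordered Bluetooth setup checklist described
# above. Each step must pass before the next runs, so a failure report
# can name the exact precondition that was not met.

BLUETOOTH_SETUP_STEPS = [
    "deactivate internal Bluetooth radio",
    "connect third-party USB dongle",
    "install dongle software",
    "verify dongle is communicating",
    "connect to printer (classic Bluetooth, not BLE)",
]

def run_checklist(steps, execute):
    """Run steps in order; stop at the first failure and return
    (steps that passed, first failing step or None)."""
    completed = []
    for step in steps:
        if not execute(step):
            return completed, step
        completed.append(step)
    return completed, None

# Simulated run where the dongle software fails to install:
passed, failed = run_checklist(
    BLUETOOTH_SETUP_STEPS,
    execute=lambda step: "install" not in step,
)
```

Structuring the protocol this way means the "are you using the computer's Bluetooth?" question is answered by the checklist output itself rather than by interrogating the analyst after the fact.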
Jeremy: [00:24:00] Yeah. And then I guess, as a part of that process, you're finding out that there isn't necessarily a problem in the customer's software, but it's some external cause. So that when you get a support ticket or a call, then, you know, like, okay, this is another thing we can ask them to check. Yeah.
Maxwell: [00:24:18] Absolutely. And that's something that we've been, you know, definitely leveraged for, to help out, to try to resolve customer issues that come in as well, and try to set up a testing environment that mimics that. And we've occasionally integrated that to become part of our manual testing and some automated scenarios as well.
So those have been interesting scenarios, having to buy different routers and what have you. And once again, it gets back to the cost-effectiveness of it. You know, what is the market impact? Yes, this particular AT&T router or what have you, um, might be having an issue, but you know, how many users in the world are really running the software on this?
Right. And that's something that every company should consider when they're considering, uh, you know, a patch, um, in the software.
Jeremy: [00:25:14] And something you also brought up is, as a software developer, when there is a problem, one of the things that we always look for is a reproducible case, right? Like, what are the steps, um, you need to take to have this bug or this problem occur? And it sounds like one of the roles might be: we get in a report from a customer saying this part of the software doesn't work, um, but I'm not sure when that happens or how to get it to happen. And so, as a QA analyst, one of your roles might be taking those reports and then building a repeatable, um, test case.
Michael: [00:25:55] Absolutely. There's lots of times where clients have said, we haven't been able to reproduce this, see if you can. And you know, we get back to them after some increment of time. And sometimes we can, and sometimes we can't. You know, sometimes we have to buy special, uh, like headphones or some kind of, you know, try to reproduce the environment that the client was using, uh, in case there was some magic sauce interaction going on there.
Maxwell: [00:26:21] Our analysts on the spectrum, they are so particular in writing up defects, all the little details. And that really is so important in quality assurance: documentation for the entire process. That's one area where I think quality assurance really helps development in general, making sure that everything is documented, that it's all there on paper, and the documentation is solid and really sound. Um, so for a lot of these defects, we've actually come in and, I think, upped the standard a little bit, where you can't have the defect written where, you know, the reproducibility is one out of one.
And it turns out this was a special build that the developer was using, that no one else in the company was even using. It's a waste of time to track this defect down, and that's based on the fact that it was a poorly written-up report in the first place. So it can be fun to have to track down all the various equipment you need for it. And analysts are really well suited for writing those up and, uh, investigating these different defects, or errors, that we find. Sometimes they're not actually defects, they're just errors in the system.
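The write-up standard Maxwell describes, no one-out-of-one reproducibility, no defects filed against a private developer build, could be expressed as a simple intake gate. This is a hypothetical sketch; the field names and thresholds are invented to illustrate the idea, not taken from any real tracker.

```python
# Hypothetical defect-report quality gate. Field names are invented.
# The rule of thumb from the episode: reject write-ups that would waste
# triage time because the evidence behind them is too thin.

def report_is_trackable(report):
    """A report is worth tracking only if it has repro steps, was
    reproduced more than once, and came from a shared build."""
    has_steps = bool(report.get("repro_steps"))
    attempts = report.get("attempts", 0)
    reproduced = report.get("reproduced", 0)
    # "one out of one" is not enough evidence; require repeated repro.
    repeatable = attempts >= 2 and reproduced >= 2
    on_shared_build = not report.get("private_build", False)
    return has_steps and repeatable and on_shared_build

good = {"repro_steps": ["pair headset", "start update"],
        "attempts": 5, "reproduced": 4, "private_build": False}
bad = {"repro_steps": ["crashed once"],
       "attempts": 1, "reproduced": 1, "private_build": True}
```

The point is not the specific thresholds but that the standard is checkable: a reviewer (or a tracker hook) can reject a weak write-up before anyone spends time chasing it.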
Michael: [00:27:34] Uh, tell them about, like, the guide that Bose wound up using, the guide that we had made internally.
Maxwell: [00:27:40] Yeah. There have been so many guides that we've ended up creating, like terminal access shortcuts, uh, just different ways to, you know, access the system from a tester perspective, that have absolutely helped. Just documenting all these things that engineers end up using to test code, right? But lots of times these shortcuts aren't well documented anywhere. So what quality assurance, and what Aspiritech, has done is we come in and we really create excellent training guides for how to check the code, um, and what all the various commands are that have to be inputted, and how that translates to the more obvious user experience, which is, I think, a lot of times what ends up being lost. It ends up all being code language, and you don't really know what the user experience is. It's nice when we show the guides that we've created to the clients, because really we created them to make life easier for us, to make the testing easier and more translatable.
When you see all this different code, some of us are very well versed in it, but other analysts might not be as well versed in the code or that aspect of it, right? But once you humanize it and you sort of say, okay, well, this is what you're asking the code to do, then you have that other perspective of: I actually can think of a better way that we could potentially do this. So we've brought a lot of those guides to the clients, and they've really been blown away at how well documented all of that was, um, all the way down to the level of the, uh, GUIDs of all the systems. We have very good inventory tracking, and even being able to test and run internal components of the system. And that's why I bring up the GUIDs. A portion of the testing that we end up doing is the sort of tests that installers would be running, the sort of functionality that only installers of systems would be running. So it's still black box testing, but it's behind the scenes of what the normal user experiences, right? It's sort of the installer experience, for lack of a better word.
And even having that well documented, and finding errors in those processes, has been quite beneficial. I remember one scenario in which there was an emergency update method that we had to test, right? This was a type of method where, if someone had to run it, they would take it into the store and a technician would run it. So basically we're running software quality assurance on a technician's test for a system, and the way a technician would update the system. And what we found is that what they were asking the technician to do was a flawed and complex series of steps. It did work, but only one out of 30 times, and only if you did everything with a very particular timing.
And it just was not something that was user-friendly for the technician. So it's the kind of thing that we end up finding, and lots of times it requires the creation of a guide, because they don't have guides for technicians, to end up finding a defect like that.
Michael: [00:31:21] And the poor technician, you know, he's dealing with hundreds of different devices, whatever the field is, whether it's phones or speakers or printers or computers or whatever. And, you know, this guy is not working with the same software day in and day out the way we have to. Sometimes, again, because the developer is sort of building the tool that will do the stuff, you know, we're dealing with the stuff it's doing. And so in a lot of ways, uh, we can bring our own level of expertise to a product. Uh, we can surpass, you know, the developer even. It's not like a contest, right?
But just in terms of, you know, how many times is a developer installing it for the first time? Like Maxwell was saying, when we do out-of-box testing, we have to reset everything and install it fresh over and over and over and over again. And so we wind up being exposed to this particular, you know, series of steps that the end user might only see a couple of times. But you know, who wants their brand new shiny thing, especially if it costs hundreds of dollars... you know, you don't want to have a lot of friction points in the process of finally using it.
You know, you just kind of want it to work as effectively as possible.
Jeremy: [00:32:45] If I understood correctly, in Maxwell's example, you had a physical product, like let's say a pair of headphones or something like that, and you need to upgrade the firmware or perform some kind of reset. And that's something that, like you were saying, a technician would normally have to go through. But as QA, you go in and do the same process and realize, like, this process is really difficult, and it's really easy to make a mistake or, um, just not do it properly at all.
And then you can propose, like, either, you know, ways to improve those steps, or just show the developers, like, hey, look, I have to do all these things just to, you know, just update my firmware. Um, you might want to consider making that a little easier on your customers. Yeah.
Maxwell: [00:33:32] Absolutely. And the other nice thing about it, Jeremy, is, you know, we don't look at a series of tests like that as lower-level functionality just because, um, you know, it's more for a technician to have run it. It's actually part of the update testing.
So it's actually very intricate as far as the design of the product. If we find a defect in how the system updates, it's usually going to be a critical defect. Um, we don't want the product to ever end up being a boat anchor or a doorstop, right? So that's what we're always trying to avoid.
And in that scenario, it's one of those things where we don't exactly close the book on it once we figure out, okay, this was a difficult scenario for the technician, and we resolve it for the technician. Then we look at, in bigger scope, how does this affect the update process in general?
You know, does this affect the customer's testing, that suite of test cases that we have for those update processes? You know, it can extend to that as well. Uh, and then we look at it in terms of automation too, uh, to see if there's any areas where we need to fix the automation tests.
Michael: [00:34:46] It can be as simple as power loss during the update at exactly the wrong time. The system will recover if it happens within the first 50 seconds or the last 30 seconds, but there's this middle part where it's trying to reboot in the process of updating its own firmware, and if the power happens to go out then, you're out of luck.
That does not make for a good reputation for the client. I mean, the first thing a customer that's unhappy about that kind of thing is going to do is tell everybody else about this horrible experience they had.
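The failure window Michael describes can be stated as a simple classifier: a power cut in the first 50 seconds or the last 30 seconds is recoverable, and anything in between, during the mid-update reboot, bricks the device. A sketch of that model follows; the 180-second total update time is an assumed example value, not from the episode.

```python
# Sketch of the firmware-update power-loss window from the episode:
# recoverable in the first 50 s and the last 30 s, fatal in between.

def power_loss_outcome(t_cut, total_duration):
    """Classify a power cut at t_cut seconds into an update that
    takes total_duration seconds overall."""
    if t_cut < 0 or t_cut > total_duration:
        raise ValueError("cut time outside the update window")
    if t_cut <= 50:
        return "recoverable"   # update not yet committed
    if t_cut >= total_duration - 30:
        return "recoverable"   # new firmware already written
    return "bricked"           # mid-reboot: the worst-case window

# With an assumed 180-second update, probe a few cut times:
outcomes = {t: power_loss_outcome(t, 180) for t in (10, 90, 160)}
```

A model like this is also how you would drive an automated rig that cuts power at swept offsets, though as Maxwell notes below, it was a human hand that actually hit the window.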
Maxwell: [00:35:20] Right. And I can think of a great example, Michael. We had found an ad hoc defect. They had asked us to look in this particular area; there were very rare customer complaints of update issues, but they could not find it with their automation. We had one analyst that, amazingly enough, was able to pull the power at the exact right time, in the exact right sequence. And we reported the ticket, and we were able to capture the logs for this incident. And they must have run this through 200,000 automated tests, and they could not replicate what this human could do with his hands. Um, and it really amazed them after we had found it, 'cause they really had run it through that many automation tests. But it does happen where you find those.
Jeremy: [00:36:10] We've been talking about, uh, in this case you were saying this was for Bose, which is a very large company. And I think that when the average developer thinks about quality assurance, they usually think about it in the context of: I have a big enterprise company, um, I have a large staff, I have money to pay for a whole bunch of analysts, things like that.
I want to go back to where, Michael, you had mentioned how one of your customers was for a, uh, screen-sharing application. We had an interview with Spencer Dixon, who's the CTO at Tuple. I believe that's the product you're referring to. So I wonder if you could walk us through, like, for somebody who has a business that's, I want to say, probably maybe four or five people, something like that,
what's the process for them bringing on dedicated analysts or testers, given that you're coming in with no knowledge of their software? What's the process there like?
Michael: [00:37:13] First of all, not to kiss up, but the guys at Tuple are a really great bunch of guys. They're very easy to work with. We have like an hourly cap per month, you know, to try to not exceed a certain number of hours. That agreement helps to manage their costs. They're very forthcoming, and they really have folded us into their development process. You know, they've given us access to their, uh, trouble ticket software.
We use their internal instant messaging application to double-check on, you know, expected results, and is this a new feature, or is this something that's changed unintentionally? So when we first started working with them, there was really only one person on the project, and this person was in essence tasked with turning the Excel checklist of features into suites of test cases.
And, you know, you start with "make sure X happens when you click Y," and then you make that the title of a test case. And, you know, once you get all the easy stuff done, then you go through the steps of making it happen. They offered us a number of very helpful sort of starting videos that they have on their website for how to use the software; by no means are they comprehensive,
but it was enough to get us comfortable, you know, with the basic functionality, and then you just wind up playing with the software a lot. They were very open to giving us the ramp-up time that we needed in order to check all the different boxes, uh, both ones on their list and then new ones that we found. Because, you know, there's more than one connection type, right? It can be just a voice call, or there can be the screen sharing, and you can show your local video from your computer camera so you can see each other in a small box. And, you know, what order do you turn those things on?
And which one has to work before the next one can work? Or what if a person changes their preferences in the midst of a call? And, you know, these are things that, fortunately, Tuple's audience is a bunch of developers. So, uh, when their customers report a problem, uh, the report is extremely thorough, because they know what they're talking about.
And so the reproduction steps are pretty good, but still, sometimes we'll run into a situation that they've shared with us, and it's like, we can't make this one happen, and I don't know. I mean, getting back to the Bluetooth, they've even had customers where, uh, I guess one headset used a different frequency than another one, even though they were on the same Bluetooth version. And when the customer, whoever they are, changed from one headset to another, you know, the whole thing fell apart. And it's like, how do you even... 'cause you don't go to the store and look on the package and see, oh, this particular, uh, you know, headphone uses 48 kilohertz. At the outset, I didn't even know that that was a thing that could be a problem, right? You just figure Bluetooth has its band of the telecom spectrum, but, you know, anything's possible. So they gave us time to ramp up, you know, 'cause they knew that they didn't have any test cases. And, uh, over time, now there's a dedicated team of three people that are on the project regularly, but it can expand to as many as six, you know, because it's a sharing application, right?
So you tend to need multiple computers involved. And yeah, we've really enjoyed our relationship with Tuple, and we're eagerly awaiting, uh, a Windows version, because there's so many times when we'll be working on another project even, and, you know, talking with the person and saying, oh, I wish we could use Tuple, 'cause then I could click the thing on your screen and you could see it happen, instead of just, you know... Um, they are working on a Linux version, though.
I don't think that's a trade secret, so that's in the pipeline. We're excited about that. And these guys, they pay their bills in like two days. No customers do that. They're really something.
Jeremy: [00:41:14] I mean, I think that's sort of a unique case, because it is a screen-sharing application: you have things like headsets and webcams, and you're making calls between machines. So I guess if you're testing, you would have all these different laptops or workstations set up, all talking to one another. So yeah, I can imagine why it would be really valuable to have people or a team dedicated to that.
Michael: [00:41:40] And external webcams. And, you know, my Mac mini is a 2012, so it doesn't have the three-band audio port, right? It's got one for microphone and one for headphone. So that in itself is like, well, I wonder how many of their customers are really going to have an older machine. But it wound up being an interesting challenge, because then, if I was doing the testing, I had to have a microphone sort of distinct from the headphones.
And then that brings in a whole other nest of interactivity that you have to account for. Maybe the microphone's USB-based, you know, all sorts of craziness.
Jeremy: [00:42:19] I'm wondering if you have projects where you not only have to use the client, but you also have to set up the server infrastructure so that you can run tests internally. I'm wondering if you do that with clients, and if you do, like, what's your process for learning? How do I set up a test environment? How do I make it behave like the real thing? Things like that.
Maxwell: [00:42:42] So the production and testing equipment is what the customers have, right? To create that setup, we just need the equipment from them and the user guides, and the less information, frankly, the better in those setups, because you want to mimic what the customer's scenario is, right? You don't want to mimic too pristine of a setup, and that's something that we're always careful about when we're doing that sort of setup. As far as more of the integration and the, uh, sandbox testing bed, where you're testing a new build that's going to be going out for regressions or what have you, we'd be connected to a different server environment.
Michael: [00:43:24] And with Zebra Technologies, there's their Zebra Designer printer driver. Uh, they support Windows 7, Windows 8.1, Windows 10, and Windows Server 2012, 2016, and 2019, and in the case of the non-server versions, both 32-bit and 64-bit, because apparently Windows 10 32-bit is more common in Europe, I guess, than it is here.
And even though, you know, Windows 7 has been deprecated by Microsoft, they've still got a customer base, you know, still running it. Don't fix what ain't broke, right? Why would you update a machine if it's doing exactly what you want, you know, in your store or business or whatever it is? And so we make a point of executing tests in all 10 environments.
It can be tedious, because Windows 7, uh, 32 and 64, have their own quirks, so we always have to test those too. You know, Windows 8 and Windows 10, they're fairly similar, but, you know, they keep updating Windows 10, and so it keeps changing. And then, when it's time for their printer driver to go through the, uh, Windows logo testing, they call it, that's their hardware quality labs hardware certification, uh, that Microsoft has, which in essence means, when you run a software update on your computer, uh, if there's a new version of the driver, it'll download it from Microsoft's servers.
You don't have to go to the customer website and specifically seek it out. So we actually do, uh, certification testing for Zebra, uh, with that driver in all of those same environments, and then submit the final package for Microsoft's approval. And that's, uh, actually been sort of a job of ours, if you will, for several years now. And that's not something you take lightly when you're dealing with Microsoft. And actually, this sort of circles back to writing the guides, because, you know, there are instructions that come with the Windows hardware lab kit, but it doesn't cover everything, obviously. And we wound up creating our own internal Zebra printer driver certification guide, and it's over a hundred pages, because we wanted to be sure to include every weird thing that happens, even if it's only sometimes, and be sure you set this before you do this, because in the wrong order it will fail, and it won't tell you why, and all sorts of strange things.
And of course, when we were nearing completion on that guide, our contact at Zebra actually wanted a copy. Um, 'cause you know, we're not their only QA vendor, obviously, and so if there's anything that would help... And they have other divisions too, you know. They have a browser print application that allows you to print directly to the printer from a web browser without installing a driver, and that's a whole separate division. But overall, all these divisions, you know, have the same end goal as we do, which is, you know, sort of reducing the friction for the customer using the product.
Jeremy: [00:46:31] That's an example of a case where, you said it's like a hundred pages, so you've got these test cases basically ballooning in size. And maybe more specifically towards the average software project: as development continues, new features get added, and the product becomes more complex.
I would think that the number of tests would grow, but I would also think that it can't grow indefinitely, right? There has to be a point where it's just not worth going through, you know, X number of tests versus the value you're going to get. So I wonder how you manage that. As things get more complicated, how do you choose what to drop and what to continue on with?
Michael: [00:47:15] It obviously depends on the client. In the case of Zebra, to use them again, you know, when we first started working with them, they put together the test suites and we just executed the test cases. As time went by, they began letting us put the test suites together, 'cause, you know, we've been working with the same test cases and, you know, trying to come up with a system.
So we sort of spread out the use, instead of it always being the same number of test cases. Because what happens when you execute the same tests over and over again and they don't fail? That doesn't mean that you fixed everything. It means that your tests are worthless, eventually. So a couple of summers ago, they actually had us go through all of the test cases, looking at, uh, the various results, to evaluate: okay, if this is a test case that we've run 30 times, and it hasn't failed for the last 28 times, is there really any value in running it at all anymore?
So long as that particular functionality isn't being updated, because they update their printer driver every few months when they come out with a new line of printers, but they're not really changing the core functionality of what any given printer can do. They're just adding, like, model numbers and things like that.
So when it comes to, like, the ability of the printer to generate such-and-such a barcode on a particular kind of media, that only gets you so far. But when you have, you know... uh, some printers have RFID capability and some don't, and so then you get to kind of mix it up a little bit, depending on what features are present on the model.
So deprecation of worn-out test cases, uh, does help to mitigate, you know, the ballooning test suite. I'm sure Bose has their own approach.
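The deprecation review Michael describes, a test run 30 times with no failure in its last 28 runs is a retirement candidate, amounts to a filter over per-test run history. A sketch follows; the thresholds are the ones quoted in the conversation, while the data shape and test names are invented.

```python
# Sketch of the test-deprecation review described above: a test that has
# run many times without a recent failure is a candidate for retirement,
# as long as the feature it covers isn't being actively changed.

def deprecation_candidates(history, min_runs=30, clean_streak=28):
    """history maps test name -> list of booleans (True = passed),
    oldest run first. Returns tests safe to consider dropping."""
    candidates = []
    for name, results in history.items():
        if len(results) < min_runs:
            continue                      # too little evidence to judge
        if all(results[-clean_streak:]):  # no failure in the recent streak
            candidates.append(name)
    return candidates

# Invented run histories:
history = {
    "print_barcode_basic": [False, False] + [True] * 28,       # old failures only
    "rfid_encode":         [True] * 20 + [False] + [True] * 9,  # recent failure
    "new_model_smoke":     [True] * 5,                          # too few runs
}
```

Note the guard Michael adds verbally: a clean streak only justifies retirement while the covered functionality is stable, which is why the review was a periodic manual pass rather than a standing rule.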
Maxwell: [00:49:02] Absolutely. There are certain features that might also fall off entirely, where you'll look at how many users are actually using a certain feature. Like Michael was saying, you know, there might not be any failures on this particular feature, plus it's not being used a lot,
so it's a good candidate for being automated, right? So we'll also look at cases such as that, and we'll go through test cycle scrubs. We've had to do, um, a series of update matrices, where we progressively look at how much of the market has already updated to a certain version.
So if 90% of the market has already updated to this version, you no longer have to test from here to here as far as your update testing. So that's another way in which you slowly start to reduce test cases and coverage.
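The market-adoption rule Maxwell outlines, once roughly 90% of the installed base is at or past a version, stop testing update paths that start before it, could be sketched like this. The version numbers and adoption figures are invented examples, and the cumulative-adoption data shape is an assumption.

```python
# Sketch of pruning update-path coverage by market adoption: once enough
# of the installed base is at or beyond a version, update paths starting
# from older versions are dropped from manual testing.

def paths_still_worth_testing(adoption, latest, threshold=0.90):
    """adoption maps firmware version (int) -> fraction of the installed
    base at or beyond that version (cumulative, so it decreases as the
    version rises). Keep update paths from the newest version that the
    threshold still covers, up to (but not including) the latest."""
    floor = max(
        (v for v, frac in adoption.items() if frac >= threshold),
        default=min(adoption),
    )
    return [(v, latest) for v in sorted(adoption) if floor <= v < latest]

# Invented adoption data: 95% of devices are already at v3 or later,
# so update paths starting at v1 and v2 are dropped.
adoption = {1: 1.00, 2: 0.98, 3: 0.95, 4: 0.60}
paths = paths_still_worth_testing(adoption, latest=5)
```

As Maxwell stresses next, a cut like this still goes through a risk assessment; the threshold is a starting point for the scrub, not an automatic delete.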
But you're always looking at that with risk assessment in mind, right? And you're looking at, you know, who are the end users, and what's the customer impact if we're pulling away, um, or if we're automating this set of test cases.
So, you know, we go about that very, uh, carefully. But we've been gradually more and more involved in helping them assess what test cases are the best ones to be manually run, 'cause those are the ones that we end up finding defects in, time and time again.
So those are the areas where we've really helped. Because lots of times, if clients do have a QA department, you know, the test cases will be written more in an automation-type language. So it's like, okay, why don't we just automate these test cases to begin with?
And it'll be very broad scope, where everything is written as a test case for the overall functionality. And it's just way too much, as you're pointing out, Jeremy, and as features grow, that just continues on. It has to be whittled down in the early stages to begin with.
But that's how we help out, to finally, you know, help manage these cycles, to get them into a more reasonable manual testing cadence, and then having the automated section of test cases be, you know, the larger portion of the overall coverage, as it should be in general.
Jeremy: [00:51:29] So it sounds like there's this process of you working with, uh, the client, figuring out: what are the test cases that don't, or haven't, brought up an issue in a long time, or the things that get the most or the least use from customers, things like that. You look at all this information to figure out what are the things our manual tests can focus on, um, and try to push everything else, like you said, into some automated tests.
Michael: [00:51:59] So if, over time, we're starting to see these trends with older test cases or simpler test cases, you know, if we notice that there's a potential, we'll bring that to the client's attention.
And we'll say, we were looking at this batch of tests for basic features, and we happened to notice that they haven't failed ever, or in two years, or whatever. Would you consider us dropping those, at least for the time being, to see how things go? And, you know, that way we're spending less of their time,
So to speak, you know, on the whole testing process, because as you pointed out, like the more you build a thing, the more time you have to take, you know, to test it from one end to the other. but at the same time, uh, a number of our analysts are, um, OAST 508 trusted tester certified for accessibility testing, using screen readers and things like that.
uh, it's interesting how many web applications, you know, it just becomes baked into the bones. Right. And so, you know, you'll be having a team meeting talking about. yesterday's work. Um, and somebody will mention, you know, when I, when I went to such and such page, you know, because this person happened to use, a stylus to change the custom colors of the webpage or something like that.
Um, they'll say, you know, it really, it was not very accessible and there was light green, there was dark green, there was light blue, like I can, you know, and so I used my style sheet to make them. Red and yellow and whatever. and you see enough of that kind of stuff. And then that's an opportunity, to grow our engagement with the client, right?
Because we can say, by the by, you know, we noticed these things; we do offer this as a service, if you wanted to fold that in or, you know, set it up as like a one-time thing, even. You know, it all depends on how much value it can bring the client, right? You know, we're not pushing sales, trying to, oh, we'll always get more, whatever.
Um, but it's just about, when you see an opportunity for improvement of the client's product or, you know, uh, helping better secure their position in the market or, you know, however it works or could work to their advantage, you know, we sort of feel like it's our duty to mention it as their partner.
We also do data analysis, you know; we don't just do QA testing. I know that's the topic here, of course, but that is another way where, you know, our discerning analysts can find value for one of our products, or one of our clients rather. We do monthly, uh, call center, like help desk, call analysis.
We analyze that data in aggregate, and, you know, they'll find these little spikes, you know, on a certain day, say, or a cluster over a week, of people calling about a particular thing. And then we can say to the client, you know, did you push a new feature that day, or was it rainy that day?
Or, you know, I mean, it could be anything, and maybe the client doesn't care, but we see it, so we say it, and let them decide what to do with the information.
Jeremy: [00:55:08] The comment about accessibility is, um, really good, because it sounds like if you're a company and you're building up your product, you may not be aware of the accessibility issues. Um, you have it tested by someone who's using a screen reader, who, you know, sees those issues with contrast and so on.
And now the developer, they have like these, these specific, actionable things to do and potentially even, um, moved those into automated tests to go like, okay, we need to make sure that these UI elements have this level of contrast, things like that.
Michael: [00:55:45] Yeah. And there's different screen readers too. You know, the certification process, like with the government to become a Trusted Tester, uses one particular screen reader named ANDI; it's an initialism. Um, but there are others, and, you know, then it's on us to become familiar with, you know, what else is out there, because it's not like everyone is going to be using the same screen reader, just like not everyone uses the same browser.
Maxwell: [00:56:10] I think the clients realize that, yeah, we do have a good automation department, but is it well balanced with what they're doing manual QA wise? And I think that's where we often find that there's a little bit lacking that we can provide extra value for, or we can boost what is currently there.
Michael: [00:56:28] Our employees are quality assurance analysts. They're not testers. They don't just come in, read the script, then play Pokemon Go afterwards. We count on them to bring that critical eye, you know, and everyone's own unique perspective, uh, when they go to use any given product, you know. Pay attention to what's happening.
You know, even if it's not in the test case, you know, something might flash on the screen, or there might be this pause before the next, uh, thing kicks off that you are waiting for. And that happens enough times, and you kind of notice, like, there's always this lag right before the next step, you know. And then you can check that out with the developer, like, is this lag, do you guys care about this lag at all? And, you know, sometimes we find out that it's unavoidable, because something, you know, something under the hood has to happen before the next thing can happen.
Maxwell: [00:57:20] And even asking those questions, we've found out fascinating things, like, you know, why is there this lag every time when we run this test. You know, we never want to derail a client too much; we're always very patient for the answer. And sometimes we might not get the answer, but I think that does help build that level of respect between us and the developers, uh, that we really care what their code is doing. And we want to understand, you know, if there is a slight hiccup, what's causing that slight hiccup. It ends up being fascinating for our analysts as we are learning the product. And that's what makes us want to really learn, um, exactly what the code is doing.
Michael: [00:58:02] Though I'm not a developer, um, when I first started at Aspiritech, I worked on Bose as well, and I really enjoyed just watching their code scroll down the screen as, you know, the machine was booting up or the speaker was updating, because you can learn all sorts of interesting things about what's happening, you know, that you don't see normally.
There's all sorts of weird inside jokes, uh, in terms of like what they call the command or, you know, Oh, there's that same spelling error where it's only one T or, you know, things that you kind of, you kind of get to know the developers in a way, you know, like, Oh, so-and-so wrote that line.
We always wondered, because there's only this one T, and that word was supposed to have two T's, you know. And they say, oh yeah, we keep giving him a hard time about that, but now we can't change it. So we have fun.
Jeremy: [00:58:52] If people want to learn more about, what you guys are working on or about Aspiritech where should they head?
Maxwell: [00:58:58] www.aspiritech.org is our website. Head there; it'll give you all the information you need about us.
Michael: [00:59:06] We also have a LinkedIn presence, uh, that we've been trying to leverage lately, and, uh, talk to our current clients. I mean, they've really been our biggest cheerleaders, and the vast majority of our work has come from client referrals. was, is an example of that too.
You know, they were referred by a client who was referred, you know, we're very proud of that. You know, it speaks volumes about, about the quality of our work and the relationships that we build and, and, uh, you know, we have very little customer turnover in addition to very little staff turnover and that's because we invest in these relationships and then it seems to work for both sides.
Jeremy: [00:59:46] Michael, Maxwell, thanks for coming on the show.
Maxwell: [00:59:49] thank you so much, Jeremy. It's great talking to you.
Michael: [00:59:52] Thanks for having us
Scott is the Community Program Manager for the .NET team at Microsoft and the host of the Hanselminutes podcast.
This episode originally aired on Software Engineering Radio.
Transcript:
You can help edit this transcript on GitHub.
Jeremy: [00:00:00] Today I'm talking to Scott Hanselman. He's a partner program manager at Microsoft and the host of over 750 episodes of the Hanselminutes podcast. And he's got a YouTube channel with a lot of interesting videos; one of the series I really enjoy is one called Computer Stuff They Didn't Teach You. You're all over the place. Scott, welcome to Software Engineering Radio.
Scott: [00:00:22] Thank you for having me. Yeah. I was an adjunct professor for a while and I realized that in all of my career choices and all of my volunteerism and outreach, it's just me trying to get back to teaching. So I just started a TikTok. So I'm like the old man on TikTok now. But it's turned out to be really great.
And I've got to engage with a bunch of people that I would never have found on any other social platform. So whether you find me on my podcast, my blog, which is almost 20 years old now, my YouTube or my TikTok or my Twitter, it's pretty much the same person. I'm a professional enthusiast, but that's all done in the spare time, because my primary job is to own community for Visual Studio.
Jeremy: [00:01:02] That experience of being a teacher in so many ways is one of the big reasons why I wanted to have you on today, because I think in the videos and things you put out, you do a really good job of explaining concepts and just making them approachable. So I thought today we could talk a little bit about .NET and what that ecosystem is because I think there's a lot of people who they know .NET is a thing, they know it exists, but they don't really know what that means. So, maybe you could start with defining what is .NET?
Scott: [00:01:37] Sure. So Microsoft is historically not awesome at naming stuff. And I joke about the idea that 20 years ago, Microsoft took over a top-level domain, right? Thereby destroying my entire .org plan to take over the world. When they made the name .NET, they were coming out of COM and C and C++, which involved component object models.
And they had made a new runtime, while Java was doing its thing, and a new language and a new family of languages. And then they gave it a blanket name. So while Java was one language on one runtime, which allowed us to get very clear and fairly understandable acronyms like JRE, the Java Runtime Environment, and the JDK... Microsoft came out of the gate with: here's .NET and C# and VB and, you know, COBOL.NET and this and that. And this is a multilanguage runtime. In fact, they called it a CLR, a common language runtime. So rather than saying here's C#, it's a thing, they said here's .NET, and it can run any one of N number of languages.
The reality is that, if you look over time, Java ended up being kind of a more generic runtime, and people run all kinds of languages on the Java runtime. .NET now has C#, which I think is arguably first among equals as its language, although we also have VB and F#, as well as a number of niche languages. But .NET, the larger envelope name, as a marketing thing, remains confusing. If they had just simply said, hey, there's a thing called C#, it probably would have cleared things up. So, yeah, sorry, that was a bit of a long answer, and it didn't actually get to the real answer.
Jeremy: [00:03:24] So, I mean, you had mentioned how there was this runtime that C# runs on top of. Is that the same as the JVM? Like the Java virtual machine?
Scott: [00:03:36] Right. So as with any VM, whether it be V8 or the JVM, when you take C# and you compile it, you compile it to an intermediate language. We call it IL; Java folks would call it bytecode. And it's basically a processor-non-specific assembler, for lack of another word. If you looked at the code, you would see things like, you know, load a four-byte string into a pseudo register, that type of stuff.
And, um, you would not be able to identify that this came from C#. It would just look like this kind of middle place between apples and apple juice; it's kind of apple sauce. It's pre-chewed, but not all the way. And then what's interesting about C# is when you compile it into a DLL, a dynamic link library (or on Linux, something like a shared object), or where you compile it into an executable.
The executable contains that IL, but the executable doesn't know the processor it's going to run on. That's super interesting, because that means I could go over to an ARM processor or an Intel x86 or an x64 or whatever, an iPhone theoretically. And then there's that last moment, when that jitter, that just-in-time compiler, goes and takes it the last mile: that local runtime, in this case the CLR, the common language runtime, takes that IL.
Chews it finally up into apple juice. And then does the processor specific optimizations that it needs to do. And this is an oversimplification because the, the pipeline is quite long and pretty exquisite. In fact, for example, if you're on varying levels of processor on varying versions of x64, or on an AMD versus on a, uh, an Intel machine.
There's all kinds of optimizations you can do, as well as switches that we can do as developers, and config files to give hints about a particular machine. Is this a server machine? Is this a client machine? Is it memory constrained? Uh, does it have 64 processors? Can we prep this stuff, both with a naive JIT, a naive just-in-time compilation, or then over time learn, literally learning over the runtime cycle of the process, and go, you know, I bet we could optimize this better, then swap in, while it's running, a new implementation? So we can actually do a multi-layered JIT. So here's the: I want to start my application fast and run this for loop.
You know, this for loop's been running for a couple of thousand milliseconds, and it's kind of not awesome. I bet you we can do better. Then we do a second pass, swap out the function table, drop a new one in, and now we're going even faster. So our jitter, particularly the 64-bit jitter, is really quite extraordinary in that respect.
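A rough way to see this multi-layered (tiered) JIT from the outside, as a sketch assuming the .NET SDK is installed: `myapp` is a hypothetical project name, and `DOTNET_TieredCompilation` is the runtime's environment-variable switch for tiering on modern .NET.

```shell
# Create and build a small console app ('myapp' is a hypothetical name)
dotnet new console -o myapp
dotnet build myapp -c Release

# Default run: the tiered JIT starts with quick, naive code and
# re-JITs hot paths with better optimizations as the process runs
dotnet run --project myapp -c Release

# Disable tiering to force fully optimized compilation up front, for comparison
DOTNET_TieredCompilation=0 dotnet run --project myapp -c Release
```

With tiering off, startup does more compilation work up front; with it on, the runtime trades a fast naive first pass for learned optimizations later, exactly the swap-while-running behavior described above.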
Jeremy: [00:06:29] That's pretty interesting because I think a question that a lot of people might have is what's, what's the reason for having this intermediary language and not compiling straight to machine code. And what it sounds like, what you're saying is that, it can actually at runtime figure out like how can I run things more efficiently based on what's being executed in the program.
Scott: [00:06:50] Right. So when you are running a strongly typed language, that has the benefits and the speed of a strongly typed language, and you also choose to compile it to a processor-specific representation. We usually, as developers and deployers, think about things in terms of architectures, like this is for an ARM processor. But is this a Raspberry Pi? Is it an Android device? You really want to get down to the stepping, the individual versions, of, like, an Intel chip.
For example, I'm talking to you right now on an Intel. But it's a desktop class processor. That was the first one released many, many years ago. It doesn't have the same instruction set as a modern i9 or a Xeon or something like that. So we have to be able to know what instruction sets are available and do the right thing.
And then if you add in things like, um, SIMD or doing, you know, data parallelization at the CPU level, If that's available, we might treat a vector of floats differently than we would on another thing. So how much information up front do we know about deployment and is it worth doing all that work on the developer's machine?
And this is the important part: that may not necessarily reflect the reality of where it's going to get deployed. You know, my desktop does not represent a machine in Azure or Heroku or AWS. So getting up to the point where we know as much as we can, and then letting the runtime environment adapt to the true representation of what's happening on the final machine.
That doesn't mean that we can't go and do native compilation, or what's called NGen, native image generation, if we know a whole lot. But, um, it provides a lot of value if we wait until the end.
Jeremy: [00:08:34] Yeah, that's an interesting point too, in that if we are compiling down to an intermediate language on our own development machines, then we don't have to worry about like, having all these different targets right. Of saying I'm going to build for this specific type of ARM processor you know, all these different places that our application could possibly go because the runtime has taken care of that for us.
Scott: [00:08:58] Right. And I think that's one of those things where it's taking care of it for us, unless we choose to, it reminds me of the time. Uh, if you remember, when we went from, you know, mostly manual shift cars to automatic shift cars, but people who bought fancy Audis and BMWs were like, you know, when I'm going to the shops to get groceries, I'm cool with automatic, but I really like the ability to downshift.
And then we started to see these hybrid cars that had, you know, D and R for drive and reverse, and then a little move left and move right to take control. And like, I'm going to tell the automatic transmission on this car that I really want to run at higher RPMs. .NET tries to do that. It gives you both the ability to just say, you know, do what you're gonna do, I trust the defaults, but then a ton of switches, if you choose to really know about, uh, that ending environment.
Jeremy: [00:09:50] We've been talking about the common language runtime, which I believe you said is the equivalent of the Java virtual machine. You also mentioned how there are languages that are built to target the intermediate language that, that runtime uses like C# and VB and things like that. And you touched on this earlier before, but I feel like we haven't quite defined what .NET really is like, is it that runtime and so on?
Scott: [00:10:18] Okay. so .NET as an all encompassing marketing term, I would say is the ecosystem. Just as Java is a language and also an ecosystem. So when someone says, I know .NET, I would assume that they are at the very least familiar with C#. They know how to compile code and deploy code. They might be able to be a website maker, or they might be an iPhone developer.
That's where .NET starts getting complicated. Because it is an ecosystem. Someone could work. Let's say two people could work for 10 years in the ecosystem, and one could be entirely microservices and containers on the backend, in the cloud. And the other person could be Android and Unity and games and Xbox, and they are both .NET Developers, except they never touched each other's environment.
One is entirely a game and mobile app developer, and the other one is entirely a backend developer. That's where it's harder now. I think the same thing is true in the Java world, although I would argue that you'd be surprised how many places you can find .NET, whether it be in a Python notebook (you can run .NET and C# in an .ipynb notebook in Jupyter, um, on, you know, the Google cloud), or you can run it on a microcontroller, not a microprocessor, but literally on a tiny 256K microcontroller. All of that is .NET.
Now, .NET is also somewhat unique, uh, amongst ecosystems in that it has a very large, what we call BCL, base class library. One of the examples might be that, we know, with Python or with JavaScript, you say console.log, hello world; we've got the same thing. And then it's like, hey, I need a really highly optimized vector of floats with complicated math processing. Well, you have to go and figure out which one of those libraries you want to pull in, right? There's lots of options in the Python world, there's lots of options in the Java world, the JavaScript world, but then you have to go hunting for the right library. You have to start thinking.
Microsoft, I think is somewhat more prescriptive in that our base class library, the stuff that's included by default has everything from regular expressions to math, to, you know, representations of memory all the way up to putting buttons on the screen.
So you don't always have to go out to the third party until it's, uh, it's time to get into higher level abstractions. And that has made .NET historically attractive to enterprises because when you're picking a framework, And it has a whole ton of ecosystem things. Who do you throttle when it doesn't work? Right?
There's an old joke that people picked IBM because it gave them one throat to choke. Uh, people who are in enterprises typically will buy a Microsoft thing, and then they'll buy a support license, and then they can call Microsoft when it breaks. Um, now fast forward 20 years, and Microsoft has a balance between a strong base class library and a rich open-source environment.
So it's a little bit more complicated, but the historical context behind why Microsoft ended up in enterprises is because of that base class library and all of those libraries that were available out of the box.
Jeremy: [00:13:20] And that's sort of in direct contrast to something like say JavaScript where people are used to going to the node package manager and getting a whole bunch of different dependencies that have been developed by other people. It sounds like with .NET, the philosophy is more that Microsoft is going to give you most of what you'll need and you can sort of stay in that environment.
Scott: [00:13:46] I would say that that was the original philosophy. I would say that that was pre open source Microsoft. So maybe 10 and 15 years ago. I would think that that is not a philosophy that is very conducive to a healthy, open source ecosystem. But it was early on when it was like, here's a thing you kind of buy, you didn't really buy it, but like it comes with windows and you buy visual studio.
When I joined, about 13, 14 years ago, myself and others were really pushing open source, because we came from the open source community. And as we started introducing things like Azure and services, .NET is free, has been free, and is now open source; Visual Studio Code and Visual Studio Community are all free. So the stuff I work on, I actually have nothing to sell.
As such, um, it's better to have a healthy community. So we have things like NuGet; NuGet, N-U-G-E-T, is the same as NPM, is the same as Maven. It's our package manager environment, and it allows you to pull both Microsoft libraries and third-party libraries. And we want to encourage an open source community that then, you know, pays open source framework creators for their work, using things like GitHub Sponsors.
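As a small sketch of that package workflow, assuming the .NET SDK is installed (the package name here is just a well-known example, not a recommendation):

```shell
# Pull a third-party library from NuGet into the current project,
# analogous to 'npm install <pkg>' or adding a Maven dependency
dotnet add package Newtonsoft.Json

# Show the package references the project now carries
dotnet list package
```

The same command works for Microsoft-published packages and community ones alike; NuGet does not distinguish between them.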
People who are working in enterprises now need to understand that if you pick just the stuff that Microsoft gives you out of the box, you may be limited in the diversity of cool stuff that's happening. But if you go and start using open source technology, You might need to go and pay a support contract for those different tools or frameworks, depending on whether it's a, you know, a reimagination of ASP.NET or web framework.
And you say, I don't like the one Microsoft has. I'm going to use this one or an OAuth identity server that Microsoft does not ship out of the box. It's a missing piece. Therefore use this third party identity server.
Jeremy: [00:15:33] What are some things that you or Microsoft has done to encourage people to build open source stuff for .NET?
Scott: [00:15:43] Well, early on, Microsoft, uh, you know, attempted to open source .NET by creating a thing called Rotor, R-O-T-O-R, which was a way of saying, hey, Java, everyone in academia is using Java; we have one too. But it didn't exist. The only real open source licenses at the time were GPL, so Microsoft made a very awkward, this is about 20 years ago, very awkward overture, where it's like, here's a zip file with a weird license that Microsoft made, called the MSPL, the Microsoft permissive license.
And here's a kind of a neutered version of .NET that you could potentially use. And it wasn't open source; it was source opened. It's like, here's a zip file, go play with this. There were no takebacks. Fundamentally, Microsoft was not a member of the open-source community. They were just like, hey, look, we have one too.
When we started creating ASP.NET MVC, the model view controller stuff, which is kind of the Ruby on Rails for .NET, the decision was made by a bunch of folks on the team that I worked on, and encouraged by me and others who think like me: just open source the thing and do takebacks. It's not just, hey, you can watch us write code and do nothing about it; the takebacks are so important.
And the rise of GitHub and social coding enabled that even more. So now actually about 60% of .NET comes from outside Microsoft. It is truly collaborative. Now Microsoft used to patent a bunch of stuff and they'd patent stuff all over the place and everyone would be worried. I don't want to commit to this because it might be patented, but then they made the patent promise to say that any of the patents that covered .NET won't be things that will be enforced.
And then they donated .NET to the .NET Foundation, which is a third-party foundation, much like many other open source foundations, and .NET and all of the things within it, the logos and the code and the copyrights, are all owned by that. So if Microsoft, God forbid, were to disappear or, uh, stop caring about .NET, it is the community's, and it can be built and deployed and run entirely outside of Microsoft. So it's overtures like that to try to make sure that people understand that this isn't going anywhere; it's truly open source, and takebacks are such a big thing. And this has been going on now for many, many years. Most people who think .NET, and probably the folks that are listening to this show, are thinking of the big giant enterprise monolith that only runs on Windows that came out in the early 2000s.
Now it is a lightweight cross-platform open source framework that runs in containers and runs on Linuxes and raspberry pis. It's totally open and it's totally open source all the way down. The compilers, open source, all the libraries, everything.
Jeremy: [00:18:21] To your point of people, picturing .NET being closed source running only on windows. I think an important component to that is the fact that there have been multiple .NET runtimes right? There's been the framework there's been core. There's been mono. I wonder if you could elaborate a little bit on what each of those are and what role they fit in today.
Scott: [00:18:46] That is a very, very astute and important observation. So if I were to take Java bytecode, uh, I've taken Java source code and I've turned it into Java bytecode, there are multiple Java runtimes, Java VMs, that one could call upon. There's one from Oracle, and there's Azul, and there are different open source OpenJDK builds and whatnot.
There are even interpreters; you can run that through a whole series of them. You can imagine a pie chart of all the different Java VMs that are available. Now, certainly there's the 80% case, but there are lots of other ones. The same thing applies in the .NET ecosystem. There is the .NET Framework that shipped, and ships continually still, with Windows.
And the .NET Framework and its associated CLR, I would say, is, and I'm putting my fingers in quotes here, "a" .NET. But to your point, there are multiple .NETs, plural. Mono is a really interesting one, because it is an open source, clean room reimplementation of the original Windows-based .NET; clean room in that they did not look at the source code.
They looked at the interface. So if I said to you, Jeremy, um, here's the interface for `System.String`, write me an implementation of this interface, here are the unit tests, and you wrote yourself a Jeremy-specific `System.String`, and it worked, it passed all the tests. Now, you know, who knows, under load it might behave differently.
But for the most part, you made a, a clean room reimplementation of that. Mono did that for all of .NET proper, but then that became a new .NET. So for those that are listening, they can't see us or see a screencast. But imagine that I go out to the command line and with a Microsoft implementation of .NET, I make a new console app.
It says `Console.WriteLine`. Then I compile it into an executable. I can then run that with Mono. So I created it on one environment and I ran it with another. So I did the compilation, the first bit of chewing, and then finally the jitter would run through Mono, because the IL language is itself a standard that has been submitted to the standards organizations.
And it is a thing. So if you and I were, uh, you know, advanced computer science students in university, that might be an interesting homework assignment, right? Write an IL, you know, interpreter, or write an IL jitter. And if you can successfully consume IL that has been created by one of these many .NETs, you have passed the course.
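The compile-on-one-runtime, run-on-another demo can be sketched at the command line. This assumes both the .NET SDK and Mono are installed; `hello` is a hypothetical project name, and the exact output path varies by SDK version:

```shell
# Compile on one implementation of .NET...
dotnet new console -o hello     # template app calls Console.WriteLine("Hello, World!")
dotnet build hello -c Release   # C# source is compiled down to IL inside hello.dll

# ...then hand the same IL assembly to a different runtime's jitter
mono hello/bin/Release/net6.0/hello.dll
```

Because the IL format is published as a standard (ECMA-335, the Common Language Infrastructure), any conforming runtime can consume the same assembly.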
Then .NET Core is a complete reimplementation, open source, cross-platform, with no specific ties to Windows. And that's the .NET that, we're hoping, will be rebranded as just .NET. So five years from now, people won't think about .NET Core or .NET Framework; they'll just think about .NET.
It will run great on Windows, it'll run great on Linux, it'll run great right in the cloud, yada yada yada. That is kind of a third .NET. We're trying to unify all of those with the release of .NET 5, which just came out, and for .NET 5 we picked the number because it's greater than 4, which was the last version of .NET on Windows.
But we also skipped a number because, after .NET Core 3, it would have been confusing if we named it .NET Core 4; again, Microsoft not being good at naming stuff. You imagine two lines heading to the right, .NET Framework and .NET Core; they then converge into one .NET. The idea being to have one CLR, one common language runtime, one set of compilers, and one base class library.
So that if someone were to learn that. If we harken back to the middle of the beginning of our conversation, when I said that there was a theoretical .NET developer out there, who's a gamer and there's another one who's doing backend work. They should be using the same libraries. They should be using the same languages.
They should have a familiarity with everything up to, but not including, the implementation-specific details of their systems, so that if a Unity person, uh, decided to become a backend developer, or a backend developer decides to move over to make iPhone games, they're going to say, oh, this is all the System-dot this and that stuff I'm familiar with. That's a `System.String`. I appreciate that. As opposed to picking an iPhone- or Android-specific implementation of a library.
Jeremy: [00:23:09] If .NET core is the open source reimplementation of the .NET Framework, and you've also got, you were talking about how mono is this clean room implementation of it? What role does, does mono play, currently?
Scott: [00:23:25] Hmm, good question. So Mono was originally created by, uh, Miguel de Icaza and the folks that were trying to actually make an Outlook competitor on Linux. That then turned into the Xamarin, uh, company, X-A-M-A-R-I-N, and Xamarin, um, then was a different reimplementation, which allowed them to do interesting stuff like: can we take that IL and send it over to, I'm doing a thing called SGen, uh, and generate, you know, uh, code that would run on an iPhone, or code that would run on an Android? And they created a whole mobile ecosystem, and then Microsoft bought them.
So then, because Mono and .NET work on the same team, we're getting together right now and we're actively merging those together. Now, originally Mono was really well known for being just nice, solid, clean C code. You could basically clone the repo, run configure, run make, and it would build, while the Windows implementation of .NET Core, and .NET Core in general, wasn't super easy to build. It wasn't just a clone, configure, and make. A lot of work has been done to make the best parts of Mono be the best parts of .NET Core; as opposed to, like, disassembling one and, you know, taking them apart and making this frankenbuild, we want to make it so .NET Core is, uh, a re-imagining of both the original .NET Framework on Windows and the best of a cross-platform one.
So that is currently in process, in that .NET 5, .NET 6 wave. So this year, in, um, in November, we will release .NET 6, and that will also be an LTS, or long-term support, release. So those of you who are familiar with Ubuntu's concept of LTS, those long-term support versions, will have, you know, many years, three years plus, of support.
And by then the Mono and .NET merge will kind of be complete. And then you'll find that .NET Core will basically run everywhere that Mono would. It doesn't mean Mono as an ecosystem goes away; it just becomes another, uh, though not the supported, version of .NET, uh, at a runtime level.
Jeremy: [00:25:31] When we look at the Java virtual machine and we look at that ecosystem, there are languages like Scala and Clojure, things that were not made by the creator of the JVM. But are new languages that target that runtime.
And I wonder, from your perspective, why there was never that sort of community, with the common language runtime?
Scott: [00:25:56] I would think that because the interest from academia early on, 20 years ago, was primarily around Java, and Microsoft was slow on the uptake from an open source perspective, that, um, slowed down possible implementations of other languages. And then .NET got so popular, and C# evolved. C# is now on version nine; the idiomatic C# that you wrote in 2001 is not today's idiomatic C#, which includes, you know, records and async and await and interesting things that have shown up in C# and then been borrowed by other languages, async await being a great example, LINQ, language-integrated query, being another thing that didn't exist. That language has evolved fast enough that people didn't feel it was worth the effort. And it was really F# that picked an entirely different philosophy, in the functional way, by doing kind of an OCaml-style thing. And because C# and F# are so different from each other, they really fit nicely. They're as different as English and another language that isn't Latin-based, you know, meaning that they add flavor to the environment.
That doesn't mean that there aren't going to continue to be niche languages. Uh, but there were also things like IronRuby and IronPython, and they weren't really needed, because the Python and Ruby runtimes already worked quite well.
Jeremy: [00:27:16] We've been talking about .NET and sort of the parts that make up .NET, but I was wondering, other languages have niches, right? You have Go, maybe, for more systems-type programming, or Python for machine learning. Are there niches for .NET?
Scott: [00:27:33] Oh, yeah. I mean, I would generally push back on the idea of niche, though, because I don't want to use niche as a diminutive, right? I kind of like the term domain-specific, just in the sense of, you know, when in Finland, speak Finnish. That doesn't mean that we would say, hey, you know, Finnish didn't win.
Like, we're not all speaking it; it's not the language that won. But we don't want to think about things like that. There are great languages that exist for a reason, because people want to do different stuff with them. So, for example, a really well-thought-of language in the .NET space is F#, and F# is a functional language.
So for people who are familiar with OCaml and Haskell and things like that, and people who want to go and create a language that is, you know, very correct, very maintainable, very clear, that has a real focus on you managing the problem domain, as opposed to the details of imperative programming. Um, it has immutable data structures by default.
There's a lot more type inference than there is in C#. Um, a lot of people use F# in the financial services industry. and it also has some interesting language features like records and discriminated unions, uh, that, uh, C# people are just now kind of getting hip to.
Jeremy: [00:28:47] What I typically see or what I think people sort of believe when they think of .NET is they think of enterprise applications. And I'm wondering if you, have some examples of things that are sort of beyond that space.
Scott: [00:29:03] Sure. So when I think of .NET, like, in the past, I would think of big Windows apps with giant rows of tabs and text boxes over data. You know, if I go to my eye doctor and I see the giant weird application that he's running, that's multiple colors and a lot of grids and stuff, I think, oh, that must be written in .NET.
Those are the applications of 20 years ago. The other thing that I think people think about when they think about .NET is it's a thing that is tied to Windows and built on Windows and probably takes up gigabytes of space. .NET Core is meant to be more like Node and more like Go than it is to be like .NET Framework.
So for example, if I were to create a microservice using .NET Core, I go out to the command line and I type `dotnet new web` or `dotnet new empty` or `dotnet new console`. And then I can say `dotnet publish` and I can bring in a runtime. And we talked before about how you could keep things very non-specific until the last minute and let the runtime on the deployed machine decide.
Or you could say, you know, this is going to a Raspberry Pi; it's going to be a microservice on a Raspberry Pi. I want it to be as small as possible. So there are multiple steps we've gone through: we made this little microservice and we published it. Then we publish all of the support DLLs, the dynamic link libraries, the shared objects.
If it's going to Linux, all of the native bindings that provide that binding between the runtime and the native processor and the native operating system, all the system calls. That is right now about 150, 180 megs on .NET. But then we look at the tree of function calls. And we say, you know, this is a microservice.
It only does, like, six things. It looks at the weather; it calls, like, six native functions. Then we do what's called tree trimming or tree shaking, and we give it a good shake. And then we basically delete all of the functions that are never called. And we know those things, right? We can introspect the code and we can shake this thing, and we can get microservices down to 30, 40 megs.
Now again, you might say, well, compared to Go, or compared to writing that in C? Sure. But you're talking about still maintaining the jitter, the garbage collector, the strongly typed abilities of .NET. All of the great features of this cool environment, shaken down to, you know, 30, 40 megs.
And then you can pack that tightly into a container. Even better, if you imagine a multi-stage Docker build, the container doesn't include the compiler, doesn't include the SDK, and the layer above it includes the runtime itself. You might only deploy, you know, 10, 15 megs.
You can really pack nicely together a .NET microservice, and then a whole series of them, thousands of them potentially, into something like Kubernetes. That's a very fundamentally different thing than the giant .NET application that took two gigs on your hard drive.
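As a sketch, the trimming and self-contained publishing described here are driven by properties in the project file. These are standard .NET SDK MSBuild property names; the runtime identifier below is just an example target, not one from the show:

```xml
<PropertyGroup>
  <!-- Ship the runtime with the app instead of relying on a machine-wide install -->
  <SelfContained>true</SelfContained>
  <RuntimeIdentifier>linux-arm</RuntimeIdentifier>
  <!-- "Give it a good shake": drop library code the app never calls -->
  <PublishTrimmed>true</PublishTrimmed>
  <!-- Bundle everything into a single executable -->
  <PublishSingleFile>true</PublishSingleFile>
</PropertyGroup>
```

The same switches can also be passed on the `dotnet publish` command line instead of living in the project file.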
This is the cool part, though. A giant, we call it Windows Forms, or that WPF, that Windows Presentation Foundation application that often we find in large enterprises moves slowly, not runs slowly, but moves slowly from a, um, from a DevOps perspective, because the enterprise is afraid to update it, or they need to wait for Windows to get the version of .NET that they want, because .NET has historically been tied to Windows.
Which sucks. Then that application and the whole team behind it gets depressed. What if we could take the cool benefits that I just described around containers and backport them, so that that giant application at my doctor's office ships its own local copy of .NET, which it can then maintain, manage, and deal with on its own?
All in a local folder. Doesn't need to wait for Windows. That's really cool. Then they could do that pre-jitting, basically do a native image generation, and even create a single executable where inside is .NET. So then I give you my doctor's office dot exe. You double-click on it. You install nothing.
It simply unfolds like a flower into memory, including all of the benefits. Now, it's not generated native code at this point, right? It's .NET in an envelope that then blooms and then runs with all the native dependencies that it needs. That is pretty cool. And that actually really generates excitement in the enterprises, so that they can bring those older applications to modern techniques like DevOps, as well as a huge, huge perf speed-up, like a two X, three X perf speed-up.
Jeremy: [00:33:47] When you talk about a perf speed up, why would including the, the .NET runtime with the application result in a perf speed up?
Scott: [00:33:55] It's not that including the .NET runtime caused the perf speed-up. It's 20 years of improvement in the jitter, 20 years of understanding the garbage collector: rather than using the .NET that shipped with Windows, they can use the one that understands, again, a callback to the beginning, instruction sets that are newer and modern, SIMD, SSE2, all the different things that one could potentially exploit when on a newer system. We get huge numbers, and actually .NET itself, .NET 5, is 2 or 3x over .NET Core 3, which is 2 or 3x over what came before. So a huge perf push has been happening, and actually all the benchmarks are public. So if you go to TechEmpower, the TechEmpower benchmarks, you can see that we're having a wonderfully fun kind of competition, a bit of a thumb war with, you know, the Gos and the Javas and the, you know, Groovy on Rails of the world.
To the point where we're actually bumping up against the laws of physics. We can put out, you know, seven or 8 million requests a second. And then you notice that we all bundle at the top and .NET kind of moves within a couple of percentage of another one. And the reason is, is that we've saturated the network.
So without a 10 gigabit network, there's literally nothing, no more bytes, that can be pushed. So .NET is no longer in the middle or to the far right of the, uh, of the long tail of the curve. We're right up against anything else, and sometimes as much as 10x faster than things like Node.
Jeremy: [00:35:23] So it sounds like going from the .NET Framework to .NET Core and rebuilding that, uh, you've been able to get these giant speed improvements. You were giving the example of a Windows desktop application and saying that, uh, you could take this, that was built for the .NET Framework, um, and run it on .NET Core.
I'm wondering, is that a case where somebody can just take their existing project, take their existing code and just choose to run it? Or is, does there have to be some kind of conversion process?
Scott: [00:35:56] So that's a great question. It depends on if you're doing anything that is, like, super specific or weird. Like, if you were a weird application, then you might bump up against an edge case. The same thinking applies if you were trying to take an application cross-platform. For example, I did a tiny mini virtual machine and an operating system for an advanced class in computer science,
15, 20 years ago. I was able to take that application, which was written and compiled in .NET 1.1, and move it to .NET Core, run it in a container, and then run it on a Raspberry Pi, because it was completely abstract. It was using byte arrays and list of T and kind of pretty straightforward base class library stuff.
It didn't call the registry. It didn't call any Windows APIs. So I was able to take a 15-year-old app and move it in about a day. Another example: my blog, which is over 19 years old now, has recently been ported from .NET 2 to 4, then .NET Core, and now .NET 5. It runs in the cloud, in Linux, in a container, with 80% of the code the same. The stuff that isn't the same, that I had to actually consciously work around with the help of my friend Mark Downie, was, well, we called the registry. The registry is a Windows thing. It doesn't exist. So what's the, you know, what's the implementation? Is it a JSON file now? Is it an XML file? You know, when it runs in a container, it needs to talk to certain mapped file system things, while before it was assuming, okay, backslashes or a C: drive. So implementation details need to move over. Another thing that is interesting to point out, though, is that there are primitives, there are language and runtime primitives, that didn't exist before that exist now.
And one of the most important ones is a thing that's called span of T, `Span<T>`.
So something very simple, like some HTTP headers come in. You know, if you're doing 7 million of these a second, this can add up, right? And it's like, well, usually we take a look at that chunk of memory and we copy it into an array, and then we start using higher-level constructs to deal with that. Now I have a list of char, and then we for-loop around the list of char. That's kind of gross.
And then you find out that you're in the garbage collector a lot, because you're allocating memory and you're copying stuff around. You could take that older code, update it to use span of T, and represent that contiguous region of arbitrary memory differently: rather than an array, a span can point to native memory, or even memory managed on the stack, and then deal with stuff.
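A minimal sketch of that idea (this parser is hypothetical, not code from the show, and it assumes a modern .NET with C# 8+ span slicing): cutting a `ReadOnlySpan<char>` into pieces only adjusts offsets into the original buffer, so splitting a header allocates nothing until the final strings are materialized at the boundary.

```csharp
using System;

class HeaderParser
{
    // Split a "Name: value" header without allocating intermediate arrays
    // or substrings. Slice and Trim operate on the span itself.
    public static (string Name, string Value) Parse(ReadOnlySpan<char> header)
    {
        int colon = header.IndexOf(':');
        if (colon < 0) throw new FormatException("missing ':' in header");
        ReadOnlySpan<char> name = header[..colon].Trim();
        ReadOnlySpan<char> value = header[(colon + 1)..].Trim();
        // Materialize strings only once, when parsing is done.
        return (name.ToString(), value.ToString());
    }

    static void Main()
    {
        var (name, value) = Parse("Content-Type: text/html");
        Console.WriteLine($"{name} = {value}"); // Content-Type = text/html
    }
}
```

In a real hot path you would keep working with the spans and avoid even the final `ToString` calls, which is the "don't allocate any memory" trick described next.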
For example, parsing or maneuvering within an HTTP header with no allocations. So, a really great example, you'd asked for an example of an application that you couldn't believe: Virtual Desktop. Are you familiar with Virtual Desktop as it relates to the Oculus Quest? Okay. So the Oculus Quest is the Facebook VR thing, right?
And it's an Android phone on your face, and you can use it unplugged. And then people say, well, I've got this Android phone on my face; I'd really like to plug it in with a USB cable and use the power of my desktop. But then I've got to ship those bits across the USB and then have a client application that allows me to use it as a dumb client.
Okay. You would expect an application that someone would write to do that would be written in C or C++ or something really low level. Virtual Desktop, which allows you to actually see your Windows desktop and ship those bits across the wire, across the literal wire, the USB, or wirelessly, is written entirely in C#. And you can use really smart memory allocation techniques. And the trick is: don't allocate any memory. And C# is more than fast enough to have less than 20 milliseconds of jitter on frames, so that it is buttery smooth, even over wireless. And I did a whole podcast on that with Guy Godin, the Canadian gentleman who wrote Virtual Desktop.
Jeremy: [00:40:23] That's really interesting, because you're talking about areas where, typically when you think of a managed language like C#, you're able to use it in more environments than you might normally think of. And I also kind of wonder, you have the new Apple machines coming out; they have their ARM processors instead of the x86 processors. What is the state of .NET when targeting an ARM environment? Can somebody simply take that IL that's on one machine, bring it onto the ARM machine, and it just works?
Scott: [00:41:01] Yep. So if you're taking over the IL, and I'm on a Windows machine right now, I'm on an x64 machine, and I go `dotnet new console` and I go `dotnet run`, and it works on my x64 machine. And I say `dotnet publish`. I can then take that library over to the Raspberry Pi or an ARM machine and say `dotnet run`.
And because I'm saying `dotnet run` with the local .NET runtime, it will pick up that IL and it will just run. So, using my now 19, 20-year-old blog as an example, I have run that entire blog on a Raspberry Pi unchanged, once I had worked out the portability issues, like backslashes versus forward slashes and assumptions like that.
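The portable-IL flow Scott walks through can be sketched at the command line. A hedged sketch: the commands are the standard .NET SDK ones, but running them requires the SDK to be installed, and exact flags vary by SDK version:

```shell
# On the x64 dev machine: scaffold, run, and publish framework-dependent IL.
dotnet new console -o hello
cd hello
dotnet run                  # JIT-compiles and runs on this machine
dotnet publish -c Release   # emits portable IL plus a small host

# Copy the publish output to a Raspberry Pi that has the .NET runtime
# installed; its local runtime picks up the same IL and JITs it for ARM:
#   dotnet hello.dll
```

The point is that nothing in the published IL is tied to x64; the architecture decision is deferred to whichever runtime loads it.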
So once the underlying syscalls line up, absolutely. And we actually announced in July of 2020 that we will support .NET Core on the new Mac hardware. And there are requirements for .NET and .NET Core on, you know, Apple, like ARM64 and Big Sur, that would be required to get that runtime working.
But yes, the goal would be to have native processes, versus Rosetta 2 processes, working. And that work is happening.
Jeremy: [00:42:24] We were talking about, like, the typical enterprise application: you've got a Windows Forms application. Is it possible for somebody with an application like that to target ARM?
Scott: [00:42:36] So right now there are multiple layers. The compiler? Yes. The runtime? Yes. The application? Yes. At the point where, if I were going to run on Windows on ARM, like a Surface Pro X or something like that, we would need support for, I believe, Windows Forms, WinForms, and WPF. I don't know that those are working right now, but those apps still run, because, just like we have the Rosetta, you know, emulator that can emulate x86 stuff.
So for example, Paint.NET is a great .NET application that is basically Photoshop, but it's four bucks, um, that runs entirely on .NET Core, and it runs just fine on a Surface Pro X in an emulated state. And I believe that the intent, of course, is to get everything running everywhere. So yeah, I would be sad if it didn't. That's a little outside my group, but there are people that are working on that stuff. And by the way, the great thing about this: go and look at the GitHub issues, right? So, you know, I'm looking right now at a .NET issue, 4879, where they're talking about the plan of supporting .NET Core 3 and 5 on Rosetta 2 via emulation, and .NET 6 supported both native on the ARM architecture and on Rosetta 2.
And there's, in fact, a .NET runtime tracking issue that's called Support Apple Silicon, and they walk through where they're at, current status, on both Mono and .NET 6.
Jeremy: [00:44:00] Very cool.
Scott: [00:44:01] Yeah, it is cool. It's cool because you get to, like, actually see what we're doing, right? Like, before, you had to know somebody at Microsoft and then call your friend and be like, hey, what's going on?
We are literally doing the work in the open on GitHub. So if the tech journalists would pay more attention to the actual commits, they would get, like, scoops weeks before a blog post came out. Because you're like, I just saw that, you know, the check-in for, you know, Apple Silicon show up in .NET, but-da-da.
Jeremy: [00:44:26] Might get a little bit technical for the average journalist though.
One of the things that you've been talking about is how, with .NET Core, you have a lot of cross-platform capabilities. I wonder, from an adoption standpoint, what you've seen. Like, are there people actually running their services in Linux and Docker, Kubernetes, that sort of thing?
Scott: [00:44:48] Yeah .NET. Um, one of the things that we try to make sure that it works everywhere. So we've seen huge numbers, not only on Azure, but AWS has a very burgeoning, uh, .NET Community, as well as their SDKs all work on .NET. Same with Google cloud. we're trying to be really friendly and supportive of anyone because I want it to run everywhere.
So for example, at our recent .NET conf, which is our little virtual conference that we make, we had Kelsey Hightower from Google cloud, like their main Google cloud Kubernetes person. Learn.net in a weekend, and then deploy to Kubernetes on Google cloud. We're seeing a huge uptick on .NET 5. So that unified version where we took.net core and the.net framework and unified them, it's the most quickly adopted .NET ever.
I mentioned before that like 60% of the, commits are coming from outside the community. We're seeing big numbers, you know, stack overflow is moving to dot has moved to .NET core. They're actually showing their perf. And most of the games that you run are already running on .NET. And, you know, if you see a unity game, there's .NET and C# running in that a lot of the mobile games that you're seeing, um, one of the other things that we're noticing is that more and more students and young people are getting involved.
Um, so we're really trying to put a lot of work into that. One of the other things that's kind of cool that we should talk about before the end is a thing called Blazor. Have you heard about that?
Jeremy: [00:46:07] Yeah. Maybe you could explain it to our audience.
Scott: [00:46:09] So, just like, um, if we use Ruby on Rails as an example, I'd like to use other frameworks as an example that build cool stuff on top of it. So it's like, you pick your Ruby, whether it be JRuby or another Ruby or whatever, and then you've got your framework on top of that, and then other frameworks, and then you layer on your layer and your layer.
.NET has this thing called ASP.NET, Active Server Pages for .NET; it's a stupid old name. You can make web APIs that are JSON- and XML-based APIs. You can make standard model-view-controller type environments. You can do a thing called Razor Pages, where you can basically mix and match C# and HTML.
But what if you want to do those higher-level, enterprise applications and you just don't want to learn JavaScript? We've actually taken the CLR and compiled it to WebAssembly, and we can run the CLR inside V8. And if you actually hit F12 tools on a Blazor page, Blazor is B-L-A-Z-O-R, hit F12.
You go and you hit a Blazor page, and you can see the DLLs, the actual dynamic link libraries, coming over the wire. And then the intermediate language that we've been talking about so much on this podcast gets interpreted on the client side, which allows a whole new family of people that already knows C#.
Like those individuals that we spoke about at the beginning: you've got someone with 10 years' experience in Unity, and someone else who's got 10 years' experience on the backend and never got around to learning JavaScript. They can write C#; they can target the browser. And then the browser itself runs .NET, and then Blazor allows them to create SPA applications, single-page applications, much like, you know, Gmail and Twitter, and PWAs, progressive web apps.
So one could go and create an app, and rather than an Electron kind of an app, which is using JavaScript and a whole host of tools, they can make a Blazor app, have it run natively anywhere, and they can then pin it to their mobile phone or their taskbar on Windows, and they're suddenly a web developer. And it's a really great environment for those big enterprise applications or those more line-of-business type apps, because the data binding is just so clean, and then that backend can be anything.
So Blazor is a really great environment and a pretty interesting technical achievement, to get .NET itself running within the context of the browser.
Jeremy: [00:48:33] Does that mean that somehow you've been able to fit the .NET Core runtime in WebAssembly? I'm trying to understand how that works.
Scott: [00:48:40] Yep. That's exactly right. So remember how we mentioned before that this isn't the multi-gigabyte thing that everyone is so afraid of? You basically bring down, I think it's like 15 or 20 megs, which is, you know, a little bit big, but if you think about the homepage of The Verge, it's not that big, right?
I've seen PNGs bigger than the .NET, uh, runtime. Uh, once you bring that down, of course, if you put that on a CDN, that's handled. I think it's actually even smaller than that. It's not that big of a deal, because once it's down, it's cached, and that is really the .NET Core, you know, Mono runtime running within the context of WebAssembly, and then you're just calling local browser things.
But here's the cool part. Let's say you don't like any of that. Let's say I've described this whole thing, and you like the application model, you want to write C#, but you don't want to download it and, like, put a lot of work into getting the local CPU and the local V8 implementation to do that. You can actually not run the .NET Core runtime, pardon me, on the client side, but have it run on the server side, and then you open a socket between the frontend and the backend.
So then the events that occur on the client side get shuttled over WebSockets to a stateful server. So there's Blazor, which is client side, and then there's Blazor Server, which means it can run anywhere, on any device. So then it's more of a programming-model thing. You still don't have to write any JavaScript, because you'll be addressing the DOM, the Document Object Model, within the browser from the server side.
The messaging then goes magically over the WebSockets. You basically have a persistent channel, a bus, running between the frontend and the backend. So you get to choose.
Jeremy: [00:50:28] Yeah, we did a show on Phoenix LiveView, and it sounds like this is a very similar concept, except that you get the choice of having that run on the server or run locally in the client.
Scott: [00:50:40] Yeah. And I think the folks on Ruby on Rails are doing similar kinds of things. It's this idea of, you know, bringing HTML back, maybe getting a little less JavaScript-heavy, and really thinking about shoving around islands of markup, right? If I have a pretty basic, you know, master-detail page, and I click on something and I just want to send back a card,
does that really need to be a full-blown React, Vue, Angular application? The idea is you make a call to get some data, you ship the HTML back, and you swap out a div. So it's trying to find a balance between too much JavaScript and, uh, too many postbacks. And I think it hits a nice balance.
Blazor is seeing huge pickup, particularly in people who have found JavaScript to be somewhat daunting.
Jeremy: [00:51:27] I think we've been talking about things that are a little more cutting edge in terms of what's happening in the .NET ecosystem, things like WebAssembly. What else are Microsoft's priorities for the future of .NET?
Are you hoping to move into some more ahead-of-time compilation environments? For example, where you need something really small that can run on IoT-type devices. What's the future of .NET from your perspective?
Scott: [00:51:57] All of the above. If you take a look at the Microsoft .NET IoT, um, repository on GitHub, we've got bindings for dozens and dozens and dozens of sensors and screens and things like that. We have partnerships with folks like pi-top, so if you're teaching .NET, you can go inside of your browser, in a Jupyter kind of an environment, uh, and, um, communicate with actual physical hardware and call GPIO, general-purpose IO, directly. Uh, we're also partnering with folks like Wilderness Labs, which has a wonderful implementation of .NET based on Mono that allows you to write straight C# on a tiny microcontroller, pardon me, rather than a microprocessor, and create, you know, commercial-grade applications, like, uh, a thermostat, running .NET, which again takes those different people that I've mentioned before, who have familiarity with the ecosystem but maybe not familiarity with the deployment, and turns them into IoT developers.
So there's a huge group of people that are thinking about stuff like that. We've also seen 60-frames-a-second games being written in the browser using Blazor. So we are finding that it is performant enough to go and, you know, we've seen, you know, gradients and different games like that being implemented using the canvas, uh, with Blazor Server running. At its heart, the intent is for it to be kind of a great environment for everyone, anywhere.
Um, and then if you go to the .NET website, which is dot.net, D-O-T dot N-E-T, we can actually run it in the browser, and you can write the code and hit run, so you can learn .NET without downloading anything. And then there's also some excitement around .NET Interactive notebooks, which allow you to use F# and C# in Jupyter notebooks and mix and match them with different languages.
You could have JavaScript, D3.js, some SVG visualizations, using C#, all in the browser, entirely in the, uh, Jupyter notebooks ipynb ecosystem.
Jeremy: [00:54:00] So when you talk about .NET everywhere, it's really increasing the number of domains that you can use .NET in. And, like, maybe you were the person who built, uh, enterprise Windows applications before, but now you're able to take those same skills and, like you said, make a game or something like that.
Scott: [00:54:20] Exactly. It doesn't mean that you can't or shouldn't be a polyglot, but the idea is that there's this really great ecosystem that already exists. Why shouldn't those people be enabled to have fun anywhere?
Jeremy: I think that's a great note to end on. So for people who want to check out what you're up to, or learn more about .NET, where should they head?
Scott: [00:54:42] Well, they can find me just by Googling for Hanselman, Googling with Bing, if it makes you happy. Um, everything I've got is linked off of hanselman.com. And then if they're interested in learning more about .NET, just Google for .NET and you'll find dotnet.microsoft.com. Uh, within the download page, you'll see that we work on Mac, Linux, Docker, Windows, and, um, if you already have Visual Studio Code, just download the .NET SDK, go to the command line, type `dotnet new console`, and open it up.
And Visual Studio Code, VS Code, will automatically get the extensions that you need, and you can start playing around. Feel free to ask me any questions. We've got a whole community section of our website, whether it be Stack Overflow or Reddit or Discord or CodeNewbie or TikTok. We are there, and we want to make you successful.
Jeremy: [00:55:29] Very cool. Well, Scott, thank you so much for chatting with me today.
Scott: [00:55:34] Thank you. I appreciate your very deep questions and you clearly did your research and I appreciate that.
Shubheksha is a Software Engineer at Apple. She previously worked at Monzo and Microsoft.
Personal Links and Projects:
Other interviews about getting into open source:
Related Links:
Music by Crystal Cola.
Jeremy: This is Jeremy Jung, and you're listening to Software Sessions. Today I'm talking to Shubheksha Jalan about working with distributed systems, the Go programming language, and some of the lessons she's learned being a software engineer for the last three years. Previously, she worked as a backend engineer at Monzo Bank, and before that she was a software engineer at Microsoft.
Shubheksha, thank you so much for joining me today.
Shubheksha: Thank you so much for having me.
Jeremy: You picked up a lot of experience with distributed systems at Monzo bank. Could you start with explaining what a distributed system is?
Shubheksha: Yeah. So the main premise and the main difference, between like, a distributed and a non distributed system, is that a distributed system has more than one node. And when I say a node, I mean just computers, machines, servers. So it's basically a network of interconnected nodes that is being asked to behave as a singular entity and present itself as, as something that's not really disjoint.
And that's kind of where the trouble starts. Something that you hear a lot of people who work with distributed systems say is: just don't do it unless you really, really need to, because, like, things spiral out of control very, very quickly. Because, like, when you have a single machine, you don't have to think about, like, communication over the network.
You don't have to think about like distributing storage. You don't have to think about so many other things, whereas with like, when you have multiple machines, you have to think about coordination. You have to think about communication and like at what levels and like what sort of reliability you need and all of that.
So on and so forth. So it just gets super tricky.
Jeremy: Yeah. And when we talk about distributed systems, you mentioned how it's having multiple nodes. And I think a more traditional type of application is where you would expect to have a load balancer and some web servers behind the load balancer, and having those talk to a database. And that involves multiple nodes or multiple systems. But I guess, in that case, would you consider that to be a distributed system or not really?
Shubheksha: I think, yeah, like, if something is going over the network and, like, you have to sort of make those trade-offs, I think I would still call it a distributed system. And like right now, it's not just the big companies that are building and deploying distributed systems; it's pretty much everyone, even small startups, especially because the cloud has become so dominant now. Like, the cloud is essentially a giant distributed system that we sort of take parts of and deploy our stuff on, which is isolated from all of the other stuff that's also deployed on it. But from the perspective of a cloud provider, it is a big, big distributed system. And like, even when you, as a customer, are using a cloud provider, your code is deployed, you don't even know where, and it still works for you. Like, if you spin up multiple EC2 instances, for example, that is a distributed system.
Jeremy: It's almost not so much that the different nodes are doing different things I guess it's just the fact that you have to deal with the network at all. And that's where all of these additional troubles or, or challenges come in.
Shubheksha: Yeah, I think, yeah, that's probably a more accurate way to put it. Like, I can't really think of a system that you can like reasonably fit on a single machine right now. especially with the way, like we provision things, in the cloud. So yeah, by that definition, pretty much everything that we are building is a distributed system at some level.
Jeremy: Mmm you were saying earlier about how don't build one if you don't need to. But now it almost sounds like everything is one, right?
Shubheksha: Yeah. To some level. Yeah. Everything is one. Yeah.
Jeremy: When people first start getting started with having a system that involves the network, what are some common mistakes you see people make?
Shubheksha: A lot of people misunderstand the CAP theorem. The CAP theorem is basically one of the most popular theorems in distributed systems literature, and CAP stands for consistency, availability, and partition tolerance.
And there's a lot of debate around like what those terms actually mean. So I don't want to go into that right now because that'll probably take up the rest of our time. CAP theorem states that you can only have two out of those three things at a given point of time.
And a lot of people sort of take that to heart and take it way too literally whereas there's a great blog post about this, which we should definitely link in the show notes, but which argues that like network partitions are not really optional. There's no way you can have a network that's a hundred percent reliable, like that's not humanly possible.
And there's also like some confusion around like what the terms, availability and consistency mean, And that's also like a huge source of debate and confusion in the community. And like a lot of people when they, when they sort of start working. They don't really understand what that means in what context, because those terms are very, very overloaded in tech in general.
And like one thing means like five different things depending on the context. So yeah, it can be really hard to get started.
Jeremy: Mm, and since it's a little difficult to pin down, what CAP means, I guess, for somebody who's, who's starting out. what would you suggest they they focus on to make sure that when they build something, that it works, I guess is one way to put it.
Shubheksha: I think that's a very hard question to answer in a generalized manner. It depends on your requirements and what your end goal is: what sort of availability and reliability characteristics you're looking for from the system. For example, if it's a system that's storing some critical data, in that case consistency would be more valuable compared to availability. It's a trade off.
Like everything else. So it really depends on the use case and yeah, like what you hope to achieve with the system.
Jeremy: Do you have a example from your experience where you, you chose those trade offs and could kind of walk us through that process?
Shubheksha: Uh, no, I don't think most people would have to deal with this day to day. And if you do, it's probably not a very good sign.
Like at Monzo, we had like a pretty unified platform. We did not have to like, think about this, at like, that level and go into it like super deep.
Jeremy: If I understand correctly there are engineers who build infrastructure or build tooling so that the software engineers or the backend software engineers don't have to worry about is this going to be eventually consistent or is this being partitioned instead you have some kind of API that's layered on top of that. Is that correct?
Shubheksha: Yeah. So you don't have to worry about like partitioning or anything like that. Like that's just taken care of. you just write a database schema, you deploy it and like data gets populated. You're not super worried about how that's done.
Jeremy: Hmm. And when you give the example of a database schema, is this like a commercial product or an open source product? Or is this something built in house at Monzo?
Shubheksha: Oh no. This is Cassandra.
Jeremy: Oh okay. Then it's the Cassandra database developers. They are the ones who had to think about the CAP theorem and work out the trade offs that they wanted to decide on.
And you, as the engineer, you just have to know how to use Cassandra.
Shubheksha: Yeah. And like we deploy it ourselves as well. So it can get a little tricky there too. You can fine tune it, it has a lot of knobs. So you have to give consideration to that and configure it accordingly basically.
Jeremy: I don't know if this applies to Cassandra specifically, but sometimes when people talk about distributed systems they talk about things like eventual consistency of how in your database, the latest change may not always be there when you go out to read it as a, as a backend software engineer. Is that something that you have to to think about and work around?
Shubheksha: So that also depends to an extent on the configuration and the type of data. if you know some data might not be present when you're reading it, like you have to guard for it. But like a very common pattern that's used is like, if you don't find something in the database, then you just assume that it's not present at all.
So like you try to read by a particular ID and if it's not present, then you create a new one. we don't have to deal with eventual consistency at every step. I would say like, I'm sure Cassandra does something in the background to deal with that. But yeah, like usually the assumption we make is that like, yeah, if it's not found in the database, then you just go and create it.
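The read-by-ID convention described here can be sketched in Go. This is an illustrative in-memory stand-in, not Monzo's actual code; a real service would be reading from Cassandra:

```go
package main

import "fmt"

// store is an in-memory stand-in for a Cassandra table scoped to one service.
type store struct {
	rows map[string]string
}

// findOrCreate reads a row by ID; if the read misses, the service assumes
// the row simply doesn't exist yet and creates a fresh one.
func (s *store) findOrCreate(id, fresh string) (value string, created bool) {
	if v, ok := s.rows[id]; ok {
		return v, false // found: use the existing row
	}
	s.rows[id] = fresh // not found: create it
	return fresh, true
}

func main() {
	s := &store{rows: map[string]string{}}
	v, created := s.findOrCreate("user-1", "new-profile")
	fmt.Println(v, created) // first read misses, so the row is created
	v, created = s.findOrCreate("user-1", "ignored")
	fmt.Println(v, created) // second read finds the existing row
}
```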
Jeremy: Mm, okay. I guess another thing I would ask about is you have all these different nodes you're referring to in a distributed system, and they all need some way of communicating with each other. what are some, some good protocols or ways for different nodes to communicate?
Shubheksha: Something that I used to use at Monzo is protobufs. It's a project released by Google, and it basically specifies how different services will communicate over the wire. So that sort of standardizes it, and you still have to take care of some of the marshaling and unmarshaling.
But yeah, it does most of the job on its own, because when you're communicating, suppose you have two different services deployed on two different machines and you need a way to make them talk to each other, you basically need some sort of a language that both of them understand.
So like if one is speaking French, and if one is speaking German, they can't really talk. So like, protobufs are sort of like both of them are talking in English and they can understand what they're saying.
Jeremy: And protobuf is the serialization format. Right. So that's a way that you can get a message from one system to another. But how is the, the transport handled? Like, are you using web services or some kind of message bus?
Shubheksha: Oh, yeah. So that's mostly done over HTTP, via remote procedure calls. That's the most common thing I've seen. You basically have some sort of a wrapper around Go's HTTP primitives to suit the needs of your system.
Jeremy: Hmm. Okay. You have HTTP endpoints in each service and you send that service a protobuf message,
Shubheksha: Via a remote procedure call yeah.
Jeremy: If you're using HTTP, each node or each service has to know where the other machine is or how to reach the other machines, how is that managed in a in a system?
Shubheksha: that's sort of done, by some sort of service discovery mechanism, something that's super commonly used is Consul by HashiCorp. The job of that piece of software is to basically find out what service is deployed where. That's the other challenge with distributed systems, because you have so many things all over the place, you need to sort of keep track and have an inventory of like what is where so that if you get a request for a particular endpoint, you know where to send it.
you can use like a service discovery, uh, tool like Consul or you can use something like Envoy, which is a network proxy, and which sort of helps you uh do something similar.
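The idea can be illustrated with a toy in-memory registry. This is not Consul's actual API, just the shape of the mechanism: instances register an address when they come up, and callers resolve a service name to addresses:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// registry is a toy stand-in for a service-discovery system like Consul.
type registry struct {
	mu    sync.Mutex
	addrs map[string][]string
}

func newRegistry() *registry {
	return &registry{addrs: map[string][]string{}}
}

// register is called when an instance of a service is deployed.
func (r *registry) register(service, addr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.addrs[service] = append(r.addrs[service], addr)
}

// resolve returns the known instances for a service; a real system would
// also health-check them and drop dead ones.
func (r *registry) resolve(service string) ([]string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	addrs := r.addrs[service]
	if len(addrs) == 0 {
		return nil, errors.New("no instances for " + service)
	}
	return addrs, nil
}

func main() {
	reg := newRegistry()
	reg.register("payments", "10.0.0.7:8080")
	addrs, _ := reg.resolve("payments")
	fmt.Println(addrs)
}
```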
Jeremy: And from the backend engineer's perspective, if you are using Envoy or Consul or something like that, when you're writing out your endpoints, or which HTTP endpoint you're going to talk to, what does that look like? Is there some central repository you go to and you just put that URL in, and Consul or Envoy handles the rest?
Shubheksha: Oh, no. So like as soon as you deploy your service, basically, the platform will be aware of it. And all of the manifests that are associated with your service will get deployed and Envoy will pick the changes up.
For example, a new service, you need to make your proxy aware of it. So there will be some amount of configuration involved either you can do that by hand, or you can do that in an automated way.
Jeremy: So it's, it's almost like, adding your service to the system, making it aware that it exists. rather than having to know exactly who you're talking to, that's all handled by, some kind of infrastructure team or by the product.
Shubheksha: Yeah. So all of that, like, all of the platform components are basically deployed by a separate team.
Jeremy: Hmm. Okay. If you are outside of that team, a lot of this complexity--
Shubheksha: Is hidden away from you, yeah. And I have mixed feelings about that.
I feel like as a backend engineer, I would still want to know how things work. And like, part of that is just because I'm a curious person by nature. But like part of it is I, I genuinely think like developers should know where their code is running and how it's running and like the different characteristics of it, like making it too opaque is not really helpful. And like, I think it's needed to be like a holistic all around engineer basically.
Jeremy: Yeah, because I wonder how that impacts when there's a problem and you need to debug the problem. How do you determine whether it's something happening on the infrastructure side versus a bug in the code? That sort of thing.
Shubheksha: Yeah. So that can be a tricky one. You have to go through multiple layers of investigation to figure out at what level the problem is. You usually start with the code, and it also depends on your monitoring and alerting: if it's something really clear, like a node being down, then you know that yes, it is an infrastructure problem.
But if like the error rate is high. Then it is very likely to be a code problem, but yeah, like in some cases it can be very difficult to figure out like, what is actually going on and like, what is the symptom and what is the cause.
Jeremy: And in the context of a distributed system, are there specific tools or strategies you would use to try and figure out what's going on?
Shubheksha: So this really depends on the system and like how well it's monitored. Like. my main takeaway was that like observability and monitoring should be a first class citizen of the design process, rather than an afterthought that you just add in, after you're done building the entire system, like that goes a long, long way in helping people debug because like, as, as a team grows or as a company grows and as the system grows, it can get so unwieldy that it, it can be super hard to keep track of what's going on and what, what is being added where. That's one of my main, main takeaways that yes, you should always think about observability, right from the start rather than treating it like an afterthought.
And in terms of like investigation, I'm not really sure if there are like specific tactics I would use. Like that's just something. Like you watch and learn other people do. And like it's so specific to the technologies and the kind of platform that you're dealing with. But yeah, like the best thing I have found about like incidents and like on call and all of that is that you just, you have to watch a lot of people who are really good at it. Do it again and again, and again, ask them questions and like, see what workflows and mental models they have and like the kind of patterns that they have built over time. And just go from there.
It starts to make sense after a while. Initially it just seems like magic: you just keep staring and you're like, how did this person know that this is where they had to look, and that's where the bug was? I like to think about it as, you just have to form mental models. The more familiar you are with something, the stronger your mental models are. And you have to update them over time, with changes and all sorts of things, and they just stick after a while.
Jeremy: Do you have any specific examples of a time you were trying to debug something and some of the steps that you took and trying to resolve it?
Shubheksha: Oh yeah. I used to be on the on-call rotation at Monzo, so I was involved in lots of incidents, and usually we had pretty decent runbooks about what to do when a specific alert went off.
So we used to just usually follow the steps and if we were like completely out of our depth, then try a bunch of things. Throw darts randomly and see what sticks. Like sometimes you just don't know what else you can do. Like every outage is not like super straightforward, super simple. You have to like, look around, hit at like five different things and like see where it hurts basically. So yeah.
Jeremy: And in terms of the run books you were mentioning, would that be a case where there was a problem previously and then somebody, documented how they fixed the problems, then you would address that the same way next time?
Shubheksha: Usually yes, usually yes. Or something like if like something was being tested and there's a very clear process to like, test whether it's working or not. Something like that would be documented as a runbook as well.
Jeremy: Can you give us a little bit of a picture of the specifics? Are you SSHing into machines, or are you looking at a bunch of log files? What does that troubleshooting process often look like?
Shubheksha: I'm not sure if I'm supposed to talk about this, especially because I'm not at Monzo anymore, so yeah, I'd be a little cautious talking about that right now.
Jeremy: Okay. That's fair. You had mentioned observability being really important before. what are some examples of things that you can add to your services or add to your infrastructure to make this debugging process a lot easier?
Shubheksha: So one of the main things that I think works great in practice is error budgets: having some sort of definition of what is acceptable in terms of a particular service or a particular system returning errors. And if it crosses that threshold, then it's like, yes, this is our priority, we need to fix it.
That way a lot of the SRE (Site Reliability Engineering) work cannot be postponed indefinitely, which is what happens in a lot of companies that are trying to do SRE, but badly. So that's something that I really like as a philosophy. In terms of monitoring, what I'm trying to get at when I say that monitoring should be a first class citizen is that you should be very clear about the output that you expect from the system, or what healthy and not healthy look like for your system. Because if you have that, then you can derive metrics from it, and vice versa: if you're not able to come up with good metrics for your system, then have you looked closely enough at what the system is trying to achieve?
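The error-budget idea reduces to a simple check. A sketch with illustrative numbers, not any real objective:

```go
package main

import "fmt"

// budgetExceeded reports whether observed failures have burned through the
// error budget implied by an availability objective. A 99.9% objective means
// 0.1% of requests may fail before reliability work takes priority over
// feature work.
func budgetExceeded(total, failed int, objective float64) bool {
	if total == 0 {
		return false // no traffic, nothing to judge
	}
	allowed := float64(total) * (1 - objective)
	return float64(failed) > allowed
}

func main() {
	// One million requests against a 99.9% objective: ~1,000 failures allowed.
	fmt.Println(budgetExceeded(1_000_000, 800, 0.999))  // within budget
	fmt.Println(budgetExceeded(1_000_000, 1500, 0.999)) // budget blown: fix it first
}
```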
Jeremy: Yeah. I think that idea of error budgets is really interesting because I think a common problem people have is they'll log everything and there will be errors that happen, but people just expect it right? They go, Oh, it's flashing warnings, but don't worry about it. It normalizes it, but nobody really knows when it's an actual problem.
Shubheksha: Yup. Yeah. So basically, if you're logging everything and there are errors that are expected, then how do you differentiate whether an error is an expected error or an unexpected error? That's just a slippery slope that you don't really want to tread on.
And people get used to it. Like if a service keeps spewing errors and nobody really cares then if there's an actual outage, it's very easy to dismiss that yeah. that's what this service does and we have seen it before and it wasn't a problem. So yeah. It's probably fine.
Jeremy: Like you were saying, it's observing what you expect the system to do. So I guess, would that just be looking at. a normal day and looking at your error rate and seeing if people are able to successfully complete whatever they're trying to do on the system. and then if you actually start receiving, I guess, trouble tickets or you receive complaints from customers, that's when you can determine oh this error rate being this percentage and people complaining at this time means that this is probably a real issue. Something like that?
Shubheksha: Yeah, that's a very tricky question to answer, because it's hard to know what the right error budget is. That's a much harder question to answer than, you know, just having an error budget. So initially it's just playing around and seeing what sort of throughput the system has, what the expected load is, and all of those things, and coming up with some metric and tweaking it over time as you learn more about the system.
Jeremy: And with these systems, I'm assuming there are a lot of different nodes, a lot of different services. How do you debug or kind of work with this sort of system on your own local machine? Like, are you spinning up all these different services and having them talk or do you have some kind of separate testing environment? What does that look like?
Shubheksha: Yeah. Usually there is a separate testing or staging environment, which tries to mimic the production environment. And on your local computer, you can usually spin up a bunch of containers. Obviously it's nowhere close to like having an actual copy of the production environment, but for simple enough changes it can work.
But usually there is a testing environment which will give you most of the same capabilities as a production environment, where you can test changes. The other problem is keeping them in sync. That's also a really, really hard problem. So a lot of times staging environments exist, but even when you're deploying to production after testing in staging, it's like, fingers crossed:
I don't really know what's going to happen. We'll just deploy and see what breaks. Because yeah, the environments can diverge quite a bit.
Jeremy: In terms of the staging environment, would that be where you have a bunch of fake data that's similar to production and then you have tooling to have transactions that would normally occur in production and staging?
Shubheksha: So like mimicking stuff as much as possible to whatever extent. Yeah.
Jeremy: How would engineers coordinate if you have this staging environment, I don't imagine everybody has their own.
Shubheksha: Yeah, that can get tricky as well, because people can overwrite each other's changes if they're testing at the same time and they don't know, and they're deploying on the same service. So that's one benefit of microservices: if your service is small enough that multiple people will not be working on it at the same time, then you sort of reduce that contention.
But yeah, like if there are multiple people who are working on the same service, then they have to coordinate somehow to just figure out who's using the environment when and like testing stuff. So the other thing is like having some, a small subset of the staging environment given to like every single engineer, but like, yeah, that's, that's obviously not very simple to achieve, especially if you have like an engineering organization that is growing quite a bit.
Jeremy: How large was the engineering team at Monzo?
Shubheksha: I think it was about 150 engineers.
Jeremy: Hmm, 150. Okay. So in your experience, when you were working on a service, were you able to work on it in isolation where you would work out what the contract should be between services. So you didn't actually need to talk to the whole system for a lot of the time? Or what did that look like?
Shubheksha: I think most of the time it was like a single person working on a single service. And yeah, if you do, need to work on the same service with someone else, and usually it was, it was never more than two people in my experience then yeah you just, you coordinate with them and just give them a heads up that yes, you're doing this and you'll be deploying your changes.
Jeremy: And your service most likely is talking to other services. So would that be a part of this design process where you work out what the contract should be and that sort of thing?
Shubheksha: Ah, so usually that was pretty straightforward. Like it was a simple RPC. So you do not have to think about it too much. Like you just know that service A will talk to service B to fetch some data XYZ.
Jeremy: So I guess before you started working on the service, you might provide to the other team, or whoever owns that other service here is the protobuf message that I intend to receive and that sort of thing.
Shubheksha: Yes. So mostly, if service A is talking to service B, service B will have its own handlers and its own proto files and all of that. And I just reuse that in my service to talk to service B.
Jeremy: And then likewise, if somebody was talking to your service, you would provide them your proto files or schema.
Shubheksha: Yeah.
Jeremy: Another thing about having all these different services is I would imagine that they are all being updated at different times.
What are some of the challenges or things that you have to look out for when you're updating services in a distributed system?
Shubheksha: The biggest problem is dependencies: what depends on what. The dependency graph can be pretty difficult to map out, especially if you have a really large, sprawling system with dependencies all over the place. If you're deploying, like, four services, let's say, and A depends on B, which depends on C.
That's a problem in any distributed system. Like if you have like especially something modular, like just tracking the dependencies in that entire system can be a huge task. And like, it can inform a lot of other decisions, like in what order do you deploy things?
And if you want to remove something, it just leads to this rabbit hole, because if you want to delete a service, you need to first figure out if there's anything else depending on it; otherwise a bunch of things will break and you won't really realize that.
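The deploy-order question is essentially a topological sort of the dependency graph. A sketch in Go, with cycle detection since real dependency graphs can accidentally contain cycles; this is illustrative, not Monzo's tooling:

```go
package main

import "fmt"

// deployOrder returns services ordered so each one is deployed only after
// everything it depends on. deps maps a service to the services it depends
// on. ok is false if the graph contains a cycle.
func deployOrder(deps map[string][]string) (order []string, ok bool) {
	const (
		unvisited = iota
		visiting
		done
	)
	state := map[string]int{}
	var visit func(s string) bool
	visit = func(s string) bool {
		switch state[s] {
		case done:
			return true
		case visiting:
			return false // we came back to a node still on the stack: cycle
		}
		state[s] = visiting
		for _, d := range deps[s] {
			if !visit(d) {
				return false
			}
		}
		state[s] = done
		order = append(order, s) // dependencies are appended first, then s
		return true
	}
	for s := range deps {
		if !visit(s) {
			return nil, false
		}
	}
	return order, true
}

func main() {
	// A depends on B, which depends on C: deploy C, then B, then A.
	order, ok := deployOrder(map[string][]string{
		"A": {"B"},
		"B": {"C"},
		"C": {},
	})
	fmt.Println(order, ok)
}
```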
Jeremy: And how do you, keep track of that. Is there some kind of chart or some kind of document? Like how do people know what they can change?
Shubheksha: A lot of it is just in people's heads. You talk to people who have worked on the part that you're working on, and they'll usually be pretty familiar with the sort of dependencies that exist. I think we tried to map out all of the dependencies at once, and someone posted about this on Twitter as well.
But like, they actually tried to create a dependency graph of like all the services that we had and yeah, it was not a pretty picture,
Jeremy: Hmm. Because how many services are we talking about here? Is it like hundreds
Shubheksha: 1500
Jeremy: 1500 plus. Wow.
Shubheksha: Yep.
Jeremy: That's pretty wild because basically you're relying on maybe going into email or chat and going like, Hey, I'm going to update this service. Uh, which ones might this break?
Shubheksha: Yeah. So that can get tricky and that does make like onboarding people a little bit harder. Like there are there are trade offs, both good and bad ones when you have a system like that. It's not all bad. It's not all good.
Jeremy: 1500 is-- it seems like nobody at the company could even really know exactly what's happening at that scale. I wonder from your perspective, how do you decide when something should be split off you know into an individual service versus having a service take care of more things?
Shubheksha: Yeah, so that was usually pretty scoped, We try to let a service do one thing and do it well, rather than trying to like, put a bunch of functionality into the same thing. So that, that was usually the rule of thumb that we followed for every new service that we designed.
Jeremy: Hm. Cause it seems like a bit of a trade off in that. If there's a problem with that one service, then you have a very localized place on where to work on. But on the other hand, like you said, if you have this dependency graph, you might need to step through 10, 20, 30 services to find the root problem.
Shubheksha: Yeah, there were definitely code paths that can get that big and that long. So yeah.
Jeremy: In traditional systems, we think of concepts like transactions with a relational database. For example, you may want to update a bunch of different records and have that all complete as one action. When, when you're talking about a distributed system and you need to talk to 5, 10, 20, however many services, how do transactions work in distributed systems?
Shubheksha: I'm very tempted to say they don't (laughs). So yeah, this is a really, really interesting field within distributed systems itself, and distributed transactions are really, really hard. Transactions themselves are super hard, even on a single machine.
And then you like add all of this complexity and you're just like, yeah galaxy brain (laughs) . I'm not super familiar with this topic, but I've read a bit because it's just like super fascinating. And like, usually you try to offload all of that.
One way you can do it is via a managed service. You don't have to care like where your database is running, how it's running, you just use APIs, you throw stuff at it and you know, it's going to be stored.
There's a bunch of fine tuning and configuration you can do, yes, but you don't have to think about the nitty gritty of it. Distributed transactions, there are also different definitions for it. Like, what do you mean
when you say a distributed transaction? A transaction that's executing on multiple machines, or a transaction that's gathering data from multiple machines before returning it to you? A transaction that's, I don't know, split up to be executed on different machines? So yeah, it can get really complicated.
Jeremy: Yeah, I'm thinking from the perspective of Monzo is a bank, right? So, you might pull money out of an account and transfer it to somebody else like a very basic example. And maybe as a part of that process, you need to talk to 10 different services and let's say on the sixth service call, it fails, but those previous five have already executed whatever they're going to execute.
Shubheksha: Ah, right. I see what you mean. Do we roll all of that back? Or like, do we, yeah, so like we did not roll stuff back. But yeah, like you have to account for that in your error handling logic that what happens if it fails at this step and like XYZ has already happened?
Jeremy: Could you give like an example of one way you might handle something like that?
Shubheksha: Uh, let me think. I'm not sure I ever dealt with a situation like that, because all of the tables were scoped to the service that you were working on. So no other service could access the tables that you had within your service, and that simplified a lot of things. And usually there was a lot of correlation by IDs.
So like if you're generating flake IDs in one service and it fails and it doesn't generate an id, then it will not be found. And you know something has gone wrong. So you basically just like log and bail and stop proceeding. But obviously around the parts that move money we had a lot more robust error handling around that. Yeah.
Jeremy: Yeah. So I guess that's one thing that could help you track, as you were saying, having an ID that passes from service to service. So,
Shubheksha: So that's usually accomplished by go's context package. So you have some sort of a trace ID that's like passed through the lifetime of a request all the way down. Uh, so you know, that yes, like all of those calls were part of the same request.
Jeremy: Hmm. Okay. And then that would mean that you could go to some kind of logging or monitoring service and actually see, like, these are the five services that were trying to handle it, and these were the results. I guess there's two parts: there's figuring out which services and which calls were part of the same transaction.
And then like you were saying earlier if something goes wrong, how do we correct for it? And yeah, that sounds like a very, very challenging problem (laughs).
Shubheksha: Yeah. One of the main problems with distributed systems is that when things go wrong, you don't know what's going to break where. A lot of things break, and then you have to start searching for okay, what has gone wrong, where. Something in one part of the system might just be breaking something else in a completely different part of the system. And as your system grows, it's impossible to keep track of all of it in your head; a single person cannot do that.
So it can just be really hard to like, have a big picture view of the system and figure out what's going on, where
Jeremy: Mm, and it sounded like in your experience that that knowledge of what was going, where and how things interacted. A lot of that was sort of tribal in a way. Like there were people who, who knew how the pieces fit together. Um, but it's tough to really have this document or this source of truth.
Shubheksha: Yeah
Jeremy: Yeah, I can imagine that being like you were saying earlier, very difficult in terms of onboarding. Yeah.
Shubheksha: Yep.
Jeremy: I guess another thing about when you have distributed systems and you have all these different services. Presumably one of the benefits is so that you can scale up certain parts of the system.
I wonder from your perspective, how do you kind of figure out which parts of the systems need to be scaled up and what are some strategies for, if you're really hitting barriers, a performance in a certain part of the system, how you deal with that?
Shubheksha: Usually metrics. If you're constantly getting alerts that yes, like a particular node or like a particular part on a particular node is constantly hitting its memory and CPU usage, or, you know, you're going to be performing some sort of like heavy workload on a particular service and you scale it up preemptively or like an alert fires, and then you scale it up.
So that's like a very manual way to go about it. And in terms of performance constraints, a lot of that was handled by our infrastructure team, and I personally did not work on fine tuning performance as such on any of the parts of the system that I worked on. But what I saw other people doing was that you have to go chasing it: tracing, profiling, and figuring out where that bottleneck is.
And like, sometimes it can just be your own bug, like you're leaking goroutines or something like that, or you're leaking memory somewhere else. So stuff like that, usually you just have to keep profiling till you figure out what's going on and then fix it.
Jeremy: And for you as the person who's building the services and handing them off to the infrastructure team, was there any contract or specification written up where you said like, Hey, my service is going to do this. And I think it's going to be able to process this many messages or this many transactions a second.
And then that's how the infrastructure team can figure out how much to scale or how did that work?
Shubheksha: Yeah. So there was some amount of load testing that was done for like services that were really, really high throughput with the infrastructure team so that they like were in the loop and like they knew what was going on and they can scale up preemptively.
Jeremy: In your experience as a backend engineer, we've talked about the infrastructure team. Are there any other teams that you would find yourself working with on a regular basis?
Shubheksha: Not really, no, it was mostly backend and infrastructure. And in some cases security as well, if you needed their input on something particular that we were working with.
Jeremy: Would they provide some kind of audit to tell you, like, this is where potential issues might be or?
Shubheksha: Yeah. Like something like that, or like, if you just need their input on like a particular security strategy, like if you wanted to like store some, critical data or something like that, how do we do that? Or like, what would be the best way to go about it so stuff like that.
Jeremy: During your time at Monzo, you were writing systems in Go. What do you think makes Go a good choice for distributed systems, or for the type of work you were doing in general?
Shubheksha: One of the main things that I like about it is that it's super easy to pick up. This was my first job with Go, like my first full time job. I was learning Go a little before that, but I was able to pick it up very, very quickly. I think in terms of what attracted people to Go, this was definitely a huge part of it.
Then the fact that it was backed by Google, and it wasn't going to just vanish overnight, was also really helpful. And I think it really started from Docker. Docker sort of shot into the limelight, and it sort of made Go the de facto language of cloud native infrastructure, and it just kept catching on.
And then Kubernetes came along, and because Kubernetes was written in Go, everything else started to be written in Go. It has a lot of neat features that I think make it uh really, really easy to start building stuff with. One of them is definitely how good the standard libraries are and how easy it is to build on top of them.
And like first-class support for concurrency, which can be a huge, huge pain to implement on your own, and a lot of languages don't support those primitives out of the box. Like it's really good for those sorts of use cases as well. And yeah, it's just easy and fun to learn.
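Those out-of-the-box concurrency primitives, goroutines, channels, and `sync.WaitGroup`, can be sketched like this. The service names and the fake fetch are illustrative placeholders, not anything from the interview:

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll fans out one goroutine per input and collects results over
// a channel: a few lines in Go, but something that needs libraries or
// heavy machinery in many other languages.
func fetchAll(services []string) []string {
	results := make(chan string, len(services))
	var wg sync.WaitGroup
	for _, s := range services {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			results <- "fetched " + s // stand-in for a real network call
		}(s)
	}
	wg.Wait()
	close(results)

	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	out := fetchAll([]string{"svc-a", "svc-b", "svc-c"})
	fmt.Println(len(out)) // 3
}
```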
Jeremy: So basically it was easy for you to step in as a new employee and start learning how services work, just because the language was easy to pick up, um, and yeah, that built in support for concurrency. I wonder, are there any things, compared to your experience with other languages, that you didn't like about Go?
Shubheksha: Mmm, I think, yeah, this is, this is like a very common and controversial opinion, uh, in the Go community, but like sometimes it can feel very repetitive, especially the error handling bits where you're constantly checking for errors. I can appreciate explicit error handling, but like, yeah, sometimes it can just feel like you're writing the same block of code, like 50 times in the same file.
And that can be a little bit annoying. It's super verbose. Like it just, verbose, not in the Java sense, but more in the, you should know what you're doing sense, like, it tries to be as simple as possible rather than being as clever as possible and making things harder to understand
Jeremy: And the error example you were giving, I'm assuming that's because Go doesn't have exceptions. Is that right?
Shubheksha: Yes, Go does not have a concept of exceptions at all. You're supposed to explicitly check for errors at every single step and then decide what you're going to do. So in a way, it sort of makes you think harder about the ways in which your code fails and what you're supposed to do after that, which is a good thing.
Jeremy: Right. In some cases you have a very deeply nested call stack, and you might be accustomed to throwing an exception deep down and just handling it at the outer level. Um, but with Go, at every single level of the stack you would need to check the error: what do I want to return? Or... yeah, okay.
Shubheksha: So you have to bubble the error up. You need to make sure you're bubbling the right error up. You have to augment it with all of the information and the metadata that you need, which will help you debug the problem no matter what layer it occurred at, so that can definitely be sometimes tricky.
Jeremy: Yeah, I can definitely see, um, I guess, why it'd be good and bad. Maybe we should move on. You recently wrote a post about the lessons you learned in your third year of being a software engineer. I thought we could go through a few of those points and maybe get into where you're coming from with those. I think the first point you had mentioned was that just doing things, like working on projects and doing the work, isn't enough. You have to know how to market it as well.
Shubheksha: Yeah.
Jeremy: You had talked about marketing to the right people. Are you talking about people who are internal at your company or outside? And how do you find what you consider to be the right people?
Shubheksha: Oh, that's a really good point. I think it's a bit of both: internally, in terms of progression, and the stakeholders of your project, and all of the other people that you work with on a day to day basis. And externally, just to build a network and let people know what you're working on and what you're learning.
Like I'm a huge fan of like learning in public in general. That's just something that brings a lot of good things your way if you do it right.
Jeremy: Things like blog posts or conference talks, things like that. Another thing you mentioned was how titles matter. And I know that there's a lot of discussion where people say, oh, it doesn't really matter, but um, you were saying they really do matter. And from your perspective, how do the roles of titles change from company to company?
Shubheksha: I think it really depends on the size of the company as well. Uh, that's one of the main factors: how well established the company is, how many engineers they have, how much weight they place on titles internally. Like there are companies where there are meetings which are exclusively for people who have a certain level or a certain title.
So in that case, like, it's very obvious that if you don't get promoted, you're stuck, you're losing out, because you're literally not allowed to attend a certain set of meetings that you would probably benefit from. And in other ways it can also halt your progression. And when I say titles matter, I don't mean that just because you're a senior engineer, you know more. I don't believe in just counting years of experience; it's about the quality of the work that you have done, more than just the sheer amount of time that you've put in. But in terms of how people perceive you, what they expect from you and what they assume about you definitely shifts when you have a title attached to your name.
Jeremy: And in your career, or just in people's careers in general, how much weight or priority should people put into making sure that they have an impressive title or a high title? Especially when you think about, like you were saying, how it changes with the size of the organization. Um, somebody who's a CTO at a three person company, it's not the same as at a 5,000 person company. I wonder what your thoughts are on that.
Shubheksha: Uh, yeah, that's a really good question. I think, like what I mentioned in the next point, it's very easy to get caught up in the climb when it comes to career ladders, but that can also be very detrimental. And that relates directly to the fact that it's not just years of experience, it's actually the quality of those years of experience that counts towards how good you are.
If you run after it too much, it can just be like a second full time job. It can completely drain you. You'll stop learning, you'll stop actually doing what you enjoy, and you'll just blindly chase a title. That's not a good position to be in, because you have to balance it out.
At the same time, if you feel like you're just getting stuck, you're not getting promoted, you're not getting to the level you want to be at, it might just be time to change and see what's out there, and whether there's someone who will treat you better. But it really depends on what you value more, because a title change can bring you better job prospects, especially if it's a change from just software engineer to senior software engineer; then you just have a much wider pool of roles available to you. So it really depends on the stage of the career you're at and what you value personally.
Jeremy: In your post, you had talked about chasing promotions. What are some of the things that you might find yourself doing that are more going for the promotion rather than growing yourself, technically or, you know, skill wise?
Shubheksha: That depends on what your company values and what they base promotions on. So if they have a very rigid structure where it's a bunch of check boxes that you have to tick, then a lot of people will try to game that system, and they'll just try to do things, maybe at the expense of their growth as well as at the expense of what's good for the company, just to get ahead and get promoted.
So that's not beneficial for the employee or the company. But yeah, a lot of frameworks of that fashion, where you have to check a bunch of boxes in order to get promoted, often end up with people doing the things they're being incentivized to do, but in a very wrong way.
Jeremy: Right. So I guess it's looking at what your organization's process is and figuring out whether some of it is not the stuff you want to do, but you just kind of get it done and you can move on, or whether the process is so difficult and, um, so broken, basically, that it's worth just moving on to another organization. Your next point, I think, was about how sponsors are so important. And I wonder if you could explain a little bit about what you mean by a sponsor versus a mentor or somebody else?
Shubheksha: Yeah. So the difference between sponsors and mentors, the way I think about it at least, is that mentors are very passive in the way they help you, whereas sponsors are very active. Mentors will be there for you. They will help you. You can go ask them questions. Whereas sponsors will actively put your name in the hat for opportunities, help you connect with people, send opportunities that they think are a good fit your way, and they will be your advocate.
Rather than you having to ask your mentor to be your advocate. So I think that's the main difference. And there's also the question of the sponsor's reputation being at stake, in a certain sense: they basically take a chance on you, and they trust you enough to put their reputation on the line in order to do right and good by you, which is not the case with mentors.
Jeremy: It sounds like a sponsor would be someone who is vouching for you, right. Or is giving you recommendations.
Shubheksha: Yes, essentially. Yeah. Someone who vouches for you in rooms you're not in, is familiar with your work, and actively advocates for it.
Jeremy: Hmm. And that could be someone within your organization, for example, saying, I think that you would be a great fit for this project and telling other managers about that, or it could even be someone outside of your work, just saying if you're looking for a position or something, they might say, Oh, I know somebody who would be a great fit for this conference talk or this job, that, that sort of thing.
Shubheksha: Precisely. Yeah.
Jeremy: Do you have any specific advice for how people should... I don't know if seek out is the right word, but how to build up a network of sponsors?
Shubheksha: I get that a lot, and I have a blog post in the works for it, which I'm trying to think about. But it's really hard to put in words. It's just a lot of networking and reaching out to people, and slowly and steadily surrounding yourself with people that you can admire and look up to, and who will be willing to vouch for you.
So there's no direct, straight method to do it. I can't prescribe, like, if you follow steps one, two, three, then yes, you'll definitely have it. I think it just takes time and a lot of effort and patience, because I remember when I started, I was desperately looking for mentors, and I was just reaching out to people, and I just felt like I could do so much if I just had the guidance, and it was really, really hard to find.
So yeah, I completely empathize with people who are struggling with it at the moment. And it's really hard. A lot of people are super busy, they don't have time, and so it can just feel bad to ask for their time as well. So yeah, that's also something I definitely struggle with, but I don't have concrete steps at the moment. I will hopefully have something to say and publish it in a blog post.
Jeremy: Yeah. I mean, I wonder from your personal experience, I remember a few years ago you went on a few podcasts to talk about your experience, getting into open source and things like that.
You know, looking back on the process you went through, is that a path that you would recommend to other people, or do you have thoughts on how you might have approached that differently now?
Shubheksha: I think I got very lucky at some steps. Like, I did not set out, I did not plan it out uh that way, it just sort of happened. And yeah, so I don't think I would take a different path. Along the way, especially via open source, I met so many great people who have been amazing mentors and sponsors for me, but I did not seek them out.
The other thing that I realized was, you can't meet people and be like, oh, you're my mentor. That's a relationship that's built over time as you learn from each other. And the other thing is, yeah, it's a two way relationship. It's not just you leeching off the other person and trying to, you know, take away all of their knowledge and sort of embed it in your brain.
That's not how it works. Even if you have less knowledge compared to the other person, they can still learn something from you, and it's always good to have that mindset rather than, you know, the hero mindset, where you're like, oh my God, this person is God, and I want to learn everything from them. That doesn't help.
Like placing people on a pedestal, basically, that doesn't help. Not doing that eases the pressure on them as well, and it helps you be more comfortable too. So a lot of my mentor and sponsor relationships are like that, where we have a bunch of mutual interests and we talk about them and we learn from each other. A lot of them are not even formalized. Like, we never had a conversation where we were like, oh, we are a mentor and a mentee now, or a sponsor and a sponsee now. Yeah, you just sort of build those kinds of bonds with people, to where they feel comfortable vouching for you.
And you just take it from there.
Jeremy: That makes a lot of sense, just because relationships in general are very much... it's not like you meet somebody and you go, we are going to be friends, right?
Shubheksha: It just takes time. Yeah.
Jeremy: Yeah. So I can see how it'd be too difficult to pin down what are the steps you should take? I'm looking forward to the blog posts though.
Shubheksha: Yeah. Thank you.
Jeremy: I think the last thing you had mentioned in the post was you were talking about how programming got easier over time. And I wonder, were there any specific milestones you remember, of things that were really hard and then, all of a sudden, just clicked over the years?
Shubheksha: I think uh one of the main things I remember was just understanding containers and how they work. It was literally like a light bulb moment where I just felt like, yes, I finally understand what it is. And there have been times over the years where I have felt like that without actively trying. Something that I've experienced very frequently is: I learn about something, say right now, and I'll understand maybe 40% of it. And then I'll come back and revisit it, like from another topic or an adjacent topic, and I finally understand it. And maybe I've been working with it or around it in the intervening months, but it's not like I sat down and tried to learn about it or anything like that.
But when your understanding builds up during the intervening time, you sort of realize that you've not only learned the thing you were actually learning, but a lot of adjacent concepts also make sense. And it can be very hard to have this perspective when you're starting out, because you're just like, none of this makes sense, what is going on? But yeah, your understanding sort of grows and compounds in a similar way to money, I would say. And it's very hard to remember that when you're just starting out.
Jeremy: Yeah, I can definitely relate to that, where I think it's very difficult. And I think a lot of people want to be able to sit down and read a book or go through a course and go, okay, I now know this thing. Uh, but I have the same experience as you, where it's just working on problems that are not even hitting the main thing directly, but are related to it, and all of a sudden it might suddenly click. Yeah.
Shubheksha: Yeah.
Jeremy: Cool. Well, I think that's a good place to wrap up, but are there any other things you'd like to mention or think we should have talked about?
Shubheksha: No, I think we covered a lot of ground, and it was a really fun chat. Thank you so much.
Jeremy: Cool. Where can people follow you online to see what you're up to and what you're going to be doing next?
Shubheksha: They can follow me on Twitter. Uh, that's where I'm most active. Uh, my handle is scribblingon. I have a website, which is shubheksha.com, and that's where I post all of my writing.
Jeremy: Cool. Shubheksha, thank you so much for joining me today.
Shubheksha: Thank you so much.
Ryan is the author of Advanced React Security Patterns and Securing Angular Applications. He picked up his authentication expertise working at Auth0. Currently, he's a GraphQL developer advocate at Prisma.
Music by Crystal Cola.
Transcript
Jeremy: [00:00:00] Today, I'm talking to Ryan Chenkie. He used to work for Auth0, and he has a new course out called Advanced React Security Patterns.
I think authentication is something that often doesn't get included in tutorials, or if it's in there, there's a disclaimer that says, this isn't how you do it in production. So I thought it would be good to have Ryan on and maybe walk us through what are the different options for authentication and, how do we do it right?
Thanks for joining me today, Ryan.
Ryan: [00:00:30] Yeah, glad to be here. Thanks for inviting me.
Jeremy: [00:00:33] When I first started doing development, my experience was with server rendered applications, like building an app with Rails or something like that.
Ryan: [00:00:42] Yep.
Jeremy: [00:00:42] And the type of authentication I was familiar with there was based on tokens and cookies and sessions. And I wonder if for people who aren't familiar, if you could just walk us through at sort of a basic level, how that approach works.
Ryan: [00:01:01] Sure, yeah. So for those who have used the internet, the web I should say, for a long time, and who might be familiar with the typical round trip application, and many of these still exist: a round trip application is one where every move you make through the application or the website is a trip to the server, which is responsible for rendering the HTML for a page for you to see. On the server things are done: data is looked up and HTML is constructed, which is then sent back to the client to display on the page.
That approach is in contrast to what we see quite often these days, which is the single page application experience, where we're writing a React or an Angular or Vue application. It's not always the case that we're doing full single page, but in a lot of cases we're doing full single page applications, where all of the JavaScript and HTML and CSS needed for the thing gets downloaded initially, or maybe in chunks as movements are made.
But let's say initially it's downloaded, and then interacting with data from the server happens via XHR requests that go to the server, and you get some data back. So it's different than the typical round trip approach. In the round trip approach, what's historically done, what is still very common to do, and is still a very legit way to do auth, is you'll have your user log in. So username and password go into the box, they hit submit, and if everything checks out, you'll have a cookie be sent back to the browser, which lines up via some kind of ID with a session that gets created on the server.
And a session is an in memory kind of piece of data. It could be in memory on the server. It can be, in some kind of like store, like a Redis store, for example, some kind of key value store, could be any of those things. And the session just points to a user record, or it holds some user data or something.
And its purpose is that when subsequent requests go to the server, maybe for new data or for a new page or whatever, that cookie that was sent back when the user initially logged in is automatically going to go to the server. That's just by virtue of how browsers work: cookies go to the place from which they came, automatically.
That is how the browser behaves. The cookie will end up at the server. It will try to line itself up with a session that exists on the server, and if that is there, then the user can be considered authenticated, and they can get to the page or the data that they're looking for.
That's the typical setup and it's still very common to do. It's a very valid approach. Even with single page applications. That's still a super valid approach and some people recommend that that's what you do, but there are other approaches too these days.
There are other ways to accomplish this kind of authentication thing that we need to do, which I suspect maybe you want to get into next, but you tell me if we want to go there.
Jeremy: [00:04:03] You've mentioned how a lot of sites still use this cookie approach, this session approach, and yet I've noticed when you try to look up tutorials for working on React applications or SPAs and things like that, I've actually found it uncommon. At least, it seems like usually people are talking about JSON web tokens rather than the cookie and session approach.
I wonder if you had some thoughts on why that was.
Ryan: [00:04:37] Yeah, it's an interesting question. And I've thought a lot about this. I think what it comes down to is that especially for front end developers who might not be super interested in the backends, or maybe they're not concerned with it. Maybe they're not working on it, but they need to get some interaction with a backend going.
It's a little bit of a showstopper, perhaps, maybe not a showstopper, it's a hindrance. There are roadblocks put in place if you start to introduce the concept of cookies and sessions, because it then necessitates that they delve a little bit deeper into the server. I think of front end developers who want to make UIs and want to make things look nice on the client, but who need some kind of way to do authentication.
If you go with the JSON web token path, it can be a lot easier to get done what you need to get done. If you use JSON web tokens maybe you're using some third party API or something. And they've got a way for you to get a token, to get access to the resources there.
It becomes really simple for you to retrieve that token based on some kind of authentication flow, and then send it back to that API on requests that you make from your application. And all that's really needed there is for you to modify the way that requests go out. Maybe you're using fetch or axios or whatever; you just need to modify those requests such that you've got a specific header going out with them. So it's not that bad of a time. But if you're dealing with cookies and sessions, then you've got to get into session management on the server. You've got to really get into the server concepts in some way if you're doing cookies and sessions.
I think it's just more attractive to use something like JSON web tokens, even though, I think, when you get into... like, if you look at companies that have apps in production, and especially organizations that have lots of applications, maybe internally, for example, they're probably doing single sign-on. Maybe they've got tons of React applications, but if they're doing single sign-on especially, there's going to be some kind of cookie and session story going on there. So yeah, all that to say, ease of use is probably a good reason why, in tutorials, you don't see cookies and sessions all that often.
Jeremy: [00:06:51] It's interesting that you bring up ease of use, because when you look at tutorials, a lot of times there's just this assumption that you're going to get this JSON web token and you're going to send it to the API to be authenticated. But there's a lot of other pieces missing.
I feel like, to make an actual application that's secure, and going through your course, for example, there's the concept of things like the fact that with a JWT, a JSON web token, you can't invalidate a token. You can't force this token to no longer be used because it got stolen or something like that.
Or there's strategies that you discuss, like this concept of a refresh token alongside the JSON web token. And going through the course, I'm looking at this and I'm going, wow, this actually seems pretty complicated. This seems like I have to worry about more things than just holding onto a cookie.
Ryan: [00:07:50] Totally, yup. Yeah, I think that's exactly right. I think one of the selling features of JSON web tokens, especially if you're newer to the concept, let's say, one of the compelling reasons to use them, is that they're pretty easy to get started with, for all the reasons that I went into a second ago.
But once you get to the point where you've got to tighten up security for your application, you need to make sure that users aren't going to have these long lived tokens and be able to access your APIs indefinitely, and you want to put other guards in place. It requires that you do a bit of gymnastics to create all these things to protect yourself, things that cookies and sessions can just do for you, if you use that battle-tested, well trodden method of authentication and authorization. And that's one of the big arguments against JSON web tokens.
There's this really commonly cited article, which if I'm doing training, I'll drop in any session that I'm doing. I can probably get you the link, but it's joepie. I think I just Google joepie JWT, that's what I always Google when I need to find this. The title is Stop Using JWTs for Sessions. So this guy, Sven Slootweg, I think is his name, he's got a great article, a great series of articles actually, about why it is not a great idea to use JSON web tokens.
And I think there's a lot of validity to what he's saying. It essentially boils down to this: by the time you get to the point where you've got a very secure system with JSON web tokens, you will have reinvented the wheel with cookies and sessions, and without a whole lot of benefit. I think he's making the case for this in situations where, let's say, you've got a single page React application and you've got an API that is controlled by you, your own first party API that's just responsible for surfacing data for your application. In those situations, you might think you need JSON web tokens, but you would be able to do a lot with cookies and sessions just fine. Where JSON web tokens have value, I think, is when you have different services in different spots, on different domains, that you can't send your cookies to because they're on different domains.
Then JSON web tokens can make sense. At that point, you're also introducing the fact that you need to share the secrets for your tokens amongst those different services. Other third party services or domains that you don't control would need to know something about your tokens. I don't know if I've ever seen that actually in practice, where you're sharing your secret, but in any case, you would need to be able to validate your tokens at those APIs.
And so that becomes a case for why JWTs make sense, because for cross domain requests, you won't be able to use cookies. So there are trade-offs, but ultimately, if you've got a front end application and your own API, I think cookies and sessions make a lot of sense.
Jeremy: [00:10:50] I think that for myself, and probably for a lot of people, there is no clear, I guess you could say, decision tree for how you choose which of the two to use. And I wonder if you had some thoughts on how you make that decision.
Ryan: [00:11:05] Yeah, yeah, it's interesting. It's certainly a whole plethora of trade-offs that you would need to weigh in some way. It's one of the things that I see people frustrated with the most, I think, especially when you see articles like this, where they say stop using JWTs. The most common refrain that I've seen in response to these articles is that it is never explained *what* we should do then.
Like there's always these articles that tell you not to do something but they never offer any guidance about what you should do. And this article that I pointed out by Sven he says that you should just use cookies and sessions. So he does offer something else you can use.
But I think what it comes down to is you need to first ask yourself, what does my architecture look like? Do I have one or two or a couple of front end applications backed by an API? Do I have mobile applications in the mix, maybe communicating with that same API? And where's my API hosted, is it under the same domain or a different domain? Another thing is, do I want to use a third party provider? If you're rolling your own authentication, then sure, cookies and sessions would be sensible. But if you're going to go with something like Auth0 or Okta or another one of these identity providers that lets you plug authentication into your application, they almost all rely on OAuth, which means you are going to get a JSON web token or a different kind of access token.
It doesn't necessarily need to be a JSON web token, but you're going to get some kind of token instead of a cookie and session, because they are providing you an authentication artifact that is generated at their domain. So they can't share a domain with you, meaning they can't do cookies and sessions with you.
So the decisions then come down to things like whether you want to use third party services, whether you want to support multiple domains accessing your data, and, you know, are you comfortable with your sessions potentially being ephemeral? Meaning, if you're going to go with cookies and sessions and you are deployed to something that might wipe out your sessions on a redeploy or something, I would never recommend that you do that.
I would always recommend that you use like a redis key value store or something similar to keep your sessions. You would never want to have in-memory sessions on your server because these days with cloud deployments, if you redeploy your site, all your sessions will be wiped.
Your users will need to log in again, but it comes down to how comfortable you are with that fact. And then on the flip side, it's like, how comfortable are you with having to do some gymnastics with your JSON web tokens to potentially invalidate them?
One thing that comes up a lot with the applications I've seen that are using JSON web tokens is somebody leaves the company. Or somebody storms out in the middle of the day because they're upset about something, and they've been fired or whatever, but they have a JSON web token that is still valid. The expiry hasn't passed yet. So theoretically, they could go home, and if they have access to the same thing on their home computer, or maybe they're using a laptop of their own, if they still had access to that token, they could still do some damage if they're pissed off, right? Something like that. And unless you put those measures in place ahead of time, you can't invalidate that token.
You could change up the secret key that validates that token on your server, but now all your users are logged out, everyone has to log back in. So yeah, man, it just comes down to like a giant set of trade-offs and the tricky part about it... I think a lot of the reason that people get frustrated is just that all of these little details are hard to know ahead of time.
And that's one of the reasons that I wanted to put this course together, to help enlighten people as to the minutiae here. Because it's a lot of minutiae, like there's a lot of different details to consider when making these kinds of decisions.
Jeremy: [00:15:13] I was listening to some of the interviews, and an example of something that you often see is people going like, okay, let's say I'm going to use JWTs, but I don't know where to store them.
And then in your course, you talk about how it's probably best to store it in a cookie and make it HTTP only so that the JavaScript can't get to it, and so on and so forth. And then you're talking to, I think it was Ben (Awad), and he's like, just put it in local storage, what's the big deal? That's what you generally see in tutorials on the internet, but in this course you're getting these mixed messages of, is it okay to put it here or is it not?
Ryan: [00:15:55] Yeah, it's interesting, right? Because maybe it's just this common tack that bloggers and tech writers have where, when they write something, they say: do this and don't do this, I'm giving you the prescription about what to do. I find this often, and I like to go contrary to it.
Whereas I like to introduce options and develop a conversation around trade-offs and stuff like that. But I think where we see a lot of this notion that you're not supposed to put them in local storage, or that you're supposed to put them in a cookie or whatever, it's because we see these articles and posts about how it's so dangerous to put it in local storage and you should never do it.
And there's this pronouncement that it should be off limits. And the reality is that a lot of applications use local storage to store their tokens. And if we dig a little bit beneath the surface of this blanket statement that local storage is bad because it's susceptible to cross-site scripting, what we find is that yes, that is true. Local storage can be cross-site scripted, so it's potentially not a good place to keep your tokens. But if you have cross-site scripting vulnerabilities in your application, you've arguably got bigger problems anyway. Because let's say you do have some cross-site scripting vulnerability. That same vulnerability could be exploited to allow somebody to get some JavaScript on your page which just makes requests to your API on your behalf and then sends the results to some third party location.
So yeah, maybe your tokens aren't being taken out of local storage, but if your users are on the page and there's some malicious script in there, it can be running requests to your API. Your browser thinks it's you, because it's running in the browser in that session, and you're susceptible that way anyway. So the argument that local storage is not a big deal is that yes, cross-site scripting is bad, but you're not going to take your tokens out of local storage and be like, ah, I don't have to worry about cross-site scripting now, I'm good. You still have to manage that. And that's the other thing too, there's so many different opinions on this topic about what's right and what's wrong. Ultimately, it comes down to comfort level and your appetite for putting in measures that are hopefully more secure, things like putting it in a cookie or keeping it in browser state, but maybe with diminishing returns. There's maybe some point at which putting all that extra effort into your app to take your tokens out of local storage might not be worth it, because if you don't have that cross-site scripting vulnerability anyway, then maybe you are okay.
Jeremy: [00:18:40] And just to make sure I understand correctly. So with a cross-site scripting attack, that would be where somebody is able to execute JavaScript on your website, usually because somebody submitted a form that had JavaScript code in it. And what you're saying is, when you have your token stored in a cookie, the JavaScript can't access it directly, which is supposed to be the benefit. But it can still make requests to your API, get information that is private, and then send that information off to other URLs that don't need the token?
Ryan: [00:19:20] Pretty much, yeah. So if you put your token in a cookie, and it's got to be an HTTP only cookie, you need to have that flag set, then JavaScript can't access it. Your browser won't let you script against that cookie, whereas with regular cookies you can do `document.cookie` and get a list of the cookies that are stored for that browser session, I guess you would say, or for that domain or whatever.
So yeah, you got it right. If someone is able, though, to get some cross-site scripting going, let's say you are keeping your token in a cookie and someone gets some malicious script onto your page. They can start scripting things such that your browser requests your API on behalf of the user, without them clicking buttons or doing anything. It just says, make a POST request, make a GET request, whatever, to the API. The results come back, and then they just send them off to some storage location of theirs. I've never seen that in practice. In theory, that whole flow checks out, and it might be interesting to make a proof of concept of it. I've never actually seen it be done, but I'd be willing to bet a lot that it's happened in practice in some fashion.
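The cookie flags Ryan describes can be sketched as a `Set-Cookie` header. This is a minimal illustrative helper, not code from the course; frameworks like Express set these same flags through options on `res.cookie`. `HttpOnly` is what keeps the token out of `document.cookie`.

```javascript
// Sketch: building a Set-Cookie header with the protective flags discussed.
function sessionCookie(name, value) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    'HttpOnly',        // not readable from JavaScript via document.cookie
    'Secure',          // only sent over HTTPS
    'SameSite=Strict', // not sent on cross-site requests
    'Path=/',
  ].join('; ');
}

const header = sessionCookie('session', 'abc123');
// e.g. res.setHeader('Set-Cookie', header) in a Node http server
```

Note that, as discussed above, `HttpOnly` stops a script from reading the cookie but not from riding along on authenticated requests the browser makes.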
Jeremy: [00:20:39] So following the path of your course, it starts with cookies, it goes to JWTs, and then at the end of the advanced course it goes back to cookies and sessions. And so it sounds like, if you're going to be implementing something yourself and not using a third party service, maybe implicitly you're saying that implementing cookies and sessions probably has a lesser chance of you messing things up, if you go that route. Would you agree with that?
Ryan: [00:21:12] Yeah, I think so. I think especially if you follow the guidance put forth by the library authors of these things. Like maybe you've got an Express API: if you use something like `express-session`, which is a library for managing sessions, and you follow the guidance, you've got a pretty good chance of being really secure. Especially if you implement something like cross-site request forgery protection. If you have a CSRF token, and we can get into CSRF if you like, but if you have that protection in place, then you're in pretty good shape.
You're using a mechanism of authentication that has been around for a long time, and it has had a lot of the kinks worked out. JSON web tokens have been around for a while, I suppose, in technology years, but they're still fairly new. It's not something that's been around for a long, long time, maybe since 2014 or something like that, I want to say, and really only in popular usage in the last four years probably. So not as much time to be battle-tested.
I've heard opinions expressed before, by people who are not fans of JWTs and are able to see their potential weak spots, that in a few years' time, with this plethora of single page apps using JSON web tokens, we're going to see all sorts of fallout. They say people are going to figure out how to hack these things really easily and expose all sorts of applications that way. So, TBD on that, I suppose, but there's certainly potential for it.
Jeremy: [00:22:46] And so I wonder how things change, because I think we're talking about the case where you've got an application hosted on the same domain. So let's say you've got your React app and you've got your backend, both hosted behind the same URLs basically, or the same host name. Once you start bringing in, let's say, a third party that also needs to be able to talk to the same API, how does that change your thinking, or does it change it at all?
Ryan: [00:23:22] Well, I suppose it depends how you're running things. I mean, any time that I've had to access third party APIs to do something in my applications, it's generally a call from my server to that third party server. So in that flow, what it looks like is a request comes from my React application to my API that I control, and the user is validated as the request enters my application. And if it passes on through, if it's valid, then a request can be made to those third parties.
As for using third parties directly from the browser, I can't think of too many scenarios where you would be calling your API and third parties using the same JSON web token. I don't know that that would be a thing. In fact, that's probably a little bit dangerous, because then you're sharing your secret key for your token with third parties, which is not a great idea. When third parties are in the mix, generally it's a server to server thing for me.
But you know, maybe there's a way that you validate users both at your API and at the third party using separate mechanisms, separate tokens or something like that. I don't know, I haven't employed that setup strictly from the browser to multiple APIs. For me, it's always through my own server, which is arguably the way you want to do it anyway, because you're probably going to need some data from your server that you need to send to that third party, and you're probably going to want to store data in your server that you get back from that third party in some scenarios. So if you go through your own server, you're probably better off, I think.
Jeremy: [00:25:02] And let's say you're going from your own server and you need to make requests to your API. Does that mean that from your server side code you're receiving the cookie, and then you send requests from then on using that cookie, even though you're a server application?
Ryan: [00:25:23] Well, in those scenarios, so this is interesting, this is where people argue that JSON web tokens are a good fit, when you have server to server communication. That article that I mentioned, about how you should just use cookies and sessions and stop using JSON web tokens.
The argument boils down to the fact that JSON web tokens are not meant for sessions. They're not meant to be a replacement for a session, and that's how people often use them, as a direct one for one replacement for sessions. But that's not what they are. A token is an artifact that is produced at a point in time when you have proven your identity. It's basically a receipt that you proved your identity.
And it's like, I don't know how to think about this. Maybe like when you go to an event or something and you get a stamp on your hand so that you can leave and then come back in, it's kind of like that. And by the time that stamp washes off, it's expired, so you can't get back in, that kind of thing, right? It's a point in time kind of thing. Whereas a session would be more like, if you wanted to leave and come back from that event, every time you came back it's like, hey, I was here just a second ago, can you go to the back and check with your records and make sure that I'm still valid? And then they do that, and then they let you in. I don't know, maybe a bad analogy, whatever, but the point is that cookies and sessions versus tokens, it's a different world. They're not interchangeable, but people use them as if they are.
So this article argues that a good use case for JSON web tokens is when you have two servers needing to communicate with one another, needing to prove to one another that they should have access to each other, and maybe needing to pass messages back and forth between one another. That's sort of the promise of JSON web tokens: you can encode some information in a token easily, send it over the wire, and have it be proved on either end that the message hasn't been tampered with.
That's really the crux of it with JWTs. So getting back to your question there, yeah, that's a good use case for tokens: if you need to communicate with a third party API, you wouldn't necessarily be able to do that with cookies and sessions.
I mean, I don't know of a way to send a cookie between two servers. Maybe there's something there, I don't know of it. But with JSON web tokens it's easily done, because you just send it in the request header. So they go easily to these other APIs. And if you look at integrating various third party APIs, generally they're going to give you an access token.
It might not be a JWT, but they're going to give you an access token and that's what you use to get access to those APIs. So yeah, that's how I generally put it together myself.
Jeremy: [00:28:23] It sounds like in that case you would prefer having almost parallel types of authentication. You would have the cookie and session set up for your own single page application talking to your API. But then if you are a server application that needs to talk to that same API, maybe you use, like you were saying, some kind of API key that you just give to that third party. Similar to how, when you log in to GitHub or something and you need to be able to talk to it from a server, you can request a key. And so that's authenticating differently than you do when you just log in from the browser.
Ryan: [00:29:07] Yeah, I think that's right. I think the way to think about it is, if you've got a client involved, client being a web application or a mobile application, that's a good opportunity for cookies and sessions to be your mechanism there. But then yeah, if it's server to server communication, it's probably going to be a token.
I mean, I think that's generally the only way you'd see it be done these days anyway. Most of the third party services that I plug into, the access tokens they give you aren't JSON web tokens, from what I've seen. But if you're running your own services, for example, and you want to talk from your main API to other services that you control, then a JSON web token might be a good fit there, especially if you want to pass data encoded in the token to your other services. Yeah, that's what I've seen before.
Jeremy: [00:29:59] And I wonder in your use of JSON web tokens, what are some common things that you see people doing that they should probably reconsider?
Ryan: [00:30:11] Yeah, well, one is that it's super easy to get started with JSON web tokens if you use the HS256 signing algorithm, which is kind of the default one that you see quite often. That is what we call a symmetric way of doing validation for the token, because you sign your token with a shared secret.
So that shared secret needs to be known by your server and whichever other server you may be talking to, so that you can validate the tokens between one another. If your API is both producing the token, so signing it with that secret, and also then validating it to let your users get access to data, that's a super common thing to do, and maybe not such a big deal because you're not sharing that secret around. But as soon as you have to share that secret around between different services, things start to open up for that secret potentially getting out. And then whoever has that secret is able to sign new tokens on behalf of your users, use them at your API, and get into your API.
So the recommendation there is to instead use the RS256 signing algorithm, or some other kind of signing algorithm which is more robust. It's an asymmetric way of doing this, meaning that there's a public private key pair at play. It's kind of like an SSH key.
You've got your public side and your private side. And so, if you want to be able to push stuff to GitHub, for example, you provide your public side of your key, your private side lives on your computer, and then you can just talk to one another. That is the better way of doing signing, but it's a bit more complex.
You've got to manage a JSON web key set. There's various complexities to it. You've got to have a rotating set of them, hopefully, various things to manage there. So yeah, if you're going to go down the road of JSON web tokens, I'd recommend using RS256. It's more work to implement, but it's worthwhile.
One thing that's interesting is that, and I can't remember if this is something that's in the spec or not, there was this big issue that surfaced about JSON web tokens years back, a vulnerability. Because it was found out that if you set the algorithm, so there's this header portion of the token, it's like an object that describes things about the token.
And if you set the algorithm to `none`, I believe it is, most implementations of JSON web token validation would just accept any token. I think that's how it worked: you could set the algorithm to none, and then whatever token you sent would be validated, something like that. And so that issue has been fixed at the library level in most cases, but I think it's probably still out in the wild in some respects. There's probably still some service out there that is accepting tokens with the algorithm set to none. Not great, right? You don't want that to be your situation. So watch out for that.
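The fix for that vulnerability amounts to never trusting the algorithm the token claims for itself. A hedged sketch of the header check (the function names here are illustrative, not from any particular library):

```javascript
// Why verifiers must pin the expected algorithm: a token whose header says
// "alg": "none" carries no signature, and a naive verifier that trusts the
// header would accept it unchecked.
const b64 = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');

const decodeHeader = (token) =>
  JSON.parse(Buffer.from(token.split('.')[0], 'base64url').toString());

function checkAlgorithm(token, allowed = ['HS256', 'RS256']) {
  const header = decodeHeader(token);
  return allowed.includes(header.alg); // reject anything not on the allowlist
}

// An unsigned "alg: none" token has an empty signature segment
const unsigned = `${b64({ alg: 'none', typ: 'JWT' })}.${b64({ sub: 'attacker' })}.`;
const normal = `${b64({ alg: 'HS256', typ: 'JWT' })}.${b64({ sub: 'user' })}.sig`;

const rejected = checkAlgorithm(unsigned);
const accepted = checkAlgorithm(normal);
```

Modern JWT libraries make you pass the expected algorithms explicitly at verification time for exactly this reason.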
Other things that I can think about would be: don't store things in the token payload that are sensitive, because anyone who has the token can read the payload. They can't modify it, or rather, they can't modify the payload and expect to make use of the modified payload, but they can read what's in it.
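This point is easy to demonstrate: the payload is only base64url-encoded, not encrypted, so decoding it requires no secret at all. A quick sketch (the claims here are made up for illustration):

```javascript
// The JWT payload is plain base64url: anyone holding the token can read it.
const b64url = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');

const token = [
  b64url({ alg: 'HS256', typ: 'JWT' }),
  b64url({ sub: 'user-42', email: 'user@example.com' }),
  'fake-signature', // signature is irrelevant for reading the claims
].join('.');

// No secret needed to decode the claims
const claims = JSON.parse(
  Buffer.from(token.split('.')[1], 'base64url').toString()
);
```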
So you don't want secret info to be in there. If you're going to do a shared secret, make sure it's like a super long strong key, not some simple password. I think because JSON web tokens are not subject to, so I guess I'll back up a little bit. If you think about trying to crack a password and trying to brute force your way into a website, by guessing someone's password, how do you do that?
Well, you have a bot that is going to send every iteration, every variation of password it can think of until it succeeds. If you've got your password verification set up properly, you're going to use something like bcrypt, which is inherently slow at verifying passwords, because you want to deny bots the opportunity to guess passwords quickly. The slower you make it, within reason so that a human doesn't have a bad time with it,
the more secure it is, because a bot isn't going to be able to crack it. But if somebody has a JSON web token, they can run a script against it that does thousands and thousands of iterations per second, or however fast their computer is. And this is something that I've seen in real life in a demo: somebody was able to crack a JSON web token with a reasonably strong secret key in something like 20 minutes, because they had lots of compute power to throw against it.
They were just trying various iterations of secret keys, and it was successful because there's no limitation there on how fast you can try to crack it. And as soon as you crack it and you have that secret key, a whole world of possibilities opens up for a hacker, because then they can start producing new JSON web tokens for this application's users, potentially, and get into their systems very sneakily.
So yeah, make sure you use a long, strong key, or use RS256 as your algorithm. That's my recommendation, I think.
Jeremy: [00:35:25] I think something that's interesting about this discussion of authentication and JSON web tokens is that in the server rendered space, when you talk about Rails or Django or Spring, things like that, there is often support for authentication built into the framework. Like, they take care of: I try to go to this website and I'm not logged in,
so I get taken to the login page. And it takes care of creating the session cookie and putting it in the store, and it takes care of the cross-site request forgery issues, with a token for that as well. And so in the past, I think people could build websites without necessarily fully understanding what was happening, because the framework could kind of take care of everything.
And why do we not have that for, say, React or Vue, things like that?
Ryan: [00:36:25] Yeah, well, I think ultimately it comes down to the fact that these libraries like React and Vue are client side focused. I think we're seeing a bit of a shift when we think about things like Next.js, because there's an opportunity now, if you have a fully featured framework like Next where you get a backend portion, you get your API routes.
There's an opportunity for Next, and I hope they do this at some point in the future, to take a library like next-auth, there's this library called next-auth, very nice for integrating authentication into a Next.js application, which is a bit tricky to do. It's surprisingly a little bit more tricky than you would think.
If they were to take that and integrate it directly into the framework, then that's great. They can provide their own prescriptive way of doing authentication, and we don't need to worry about it as much. Maybe, you know, a plugin system for other services like Auth0 would be great. But the reason that we don't see it as much with React, Vue, Angular, et cetera, is because they don't assume anything about a backend.
If you're going to talk about having something built in for authentication, you need to involve a backend. That's got to be a component. So yeah, I think that's why.
Jeremy: [00:37:35] I think that, yeah, hopefully, like you said, as you have solutions like Next that are trying to hold both parts in the same place, we do start to see more of these things built in. Because I think there's so much uncertainty, I guess, in terms of when somebody is building an application, not knowing exactly what they're supposed to do and what are all the things they have to account for.
So hopefully that becomes the case.
Ryan: [00:38:06] I hope so. You know, Next is doing some cool things. I just saw yesterday, I think it was, that there's an RFC in place for them to have first-class Tailwind support, which means you wouldn't have to install and configure it separately. Tailwind CSS would just come along for the ride with Next.js.
So you could just start using Tailwind classes, which I think would be great. It's one of my least favorite things about starting up a Next project, configuring Tailwind. So if they extend that to be something with auth? Man, that'd be fun. That'd be good.
Jeremy: [00:38:37] At least within the JavaScript ecosystem, it felt like people were, well, I guess they still are, excited about building more minimal libraries and frameworks and kind of piecing things together, and getting away from something that's all inclusive, like say a Rails or a Spring or a Django.
And it almost feels like we're maybe moving back in that direction. I don't know if that's the case.
Ryan: [00:39:02] Yeah, I get that sense for sure. I think that there's a lot of people that are interested in building full stack applications. It feels like there was this, not split, that's not the right way to put it, but there was this trend four or five years ago to start kind of breaking off into being a front end developer or a backend developer, as opposed to somebody who writes a server rendered app.
In which case you have to be both, just by the very nature of that. So there's specialization that's been taking place. And I think a lot of people now, you know, there's people like me who, instead of specializing in one or the other, just did both. So I call myself a full stack developer.
And now there's maybe a trend to go back to being a full stack developer, but using both sides of the stack as they've been pieced out now, into a single page app plus some kind of API, and taking that experience and melding it back into the experience of a full stack developer.
I think we're seeing that more and more. So yeah, I don't know. I think also, if you're wanting to build applications independently, or at least if you want to have your head around how the whole thing works cohesively, you kind of need to have your head in all those areas.
So I think there's a need for it. And I think the easier it's made, while we're still able to use things like React, which everybody loves, you know, all the better.
Jeremy: [00:40:29] Something else about React, and something I liked about the course, is that more so than just teaching authentication, I feel like you, in some ways, teach people how to use React. Some examples I can think of are how you use React's context API to give components access to requests that have the token included, or using your authentication context state to know anywhere in your app whether you should be logged in or whether you're still loading, things like that. I wonder if there's anything else within React specifically that you found particularly useful or helpful when you're trying to build an application with authentication.
Ryan: [00:41:20] Yeah, in React specifically. So I come from mostly doing Angular work years back, to focusing more on React in the last few years. It's interesting, because I would argue Angular brings more out of the box that is helpful for authentication related things than something like React.
And maybe that's not a surprise, because maybe you think of React as a library as opposed to a framework, and Angular is a framework that gives you lots of features and is prescriptive about them. In any case, Angular has things built in like route guards, so that you can easily say this route should not be accessed unless some condition is met.
For example, your user is authenticated. It has its own HTTP library, which is used to send requests to a server, XHR requests, and it gives you an interceptors layer for them so that you can put a token onto a header to be sent. React is far less prescriptive.
It gives you the option to do whatever you want. But some things that it does do nicely, I think, which help for authentication, would be the `React.lazy` function, to lazily load routes in your application, or at least lazily load components at any time. That's nice, because one of the things that you hopefully want to do in a single page app, to help both with performance and with security, is you don't want to send all of the stuff that a user might need until they need it.
So an example of this would be, let's say you've got a marketing page, and you've got your application, maybe under the same domain, maybe under different subdomains or something like that, and you've got a login screen. There's no reason that you should send the code for your application to the user's browser before they actually log in.
And in fact, you probably shouldn't even put your login page in your React application, if you can help it. If you do have it in your application, at the very least you shouldn't ship all of the code for everything else in the application until the user has proved their identity. So to play this out a little bit: if the user did get all of the code for the application before they log in, what's the danger of that?
Well, there's hopefully not too much that's dangerous, because hopefully your API is well protected, and that's where the value is. That's where your data is; it's protected behind an API that's guarded. But the user still has access to all your front end code. Maybe they see some business logic, or they're able to piece together things about your application.
And a good example of this is, do you follow Jane Wong on Twitter? She finds hidden features in apps and websites through looking at the source code. I don't know exactly how she does it. I'd love to talk to her about how she does it. I suspect that she'll pull down the source code that gets shipped to the browser.
She'll maybe look through it, and then maybe she'll run that code in her own environment and change things in it to coax it to render various things that normally wouldn't render. So she's able to do that and see secrets about those applications. So imagine that playing out for some situation where you, in your front end code, have included things that are secret to the organization, that really shouldn't be known by anybody. I definitely recommend you don't do that.
That instead you, you hold secret info behind an API, but the potential is still there. If it, if not that, then at least the user is figuring out how your application works. They're really getting a sense of how your application is constructed. So before you even ship that code to the browser, the user users should be authenticated.
Before you send things that an admin should see versus, say, a regular user, you should make sure they're an admin, and React's lazy loading feature helps with that. It allows you to split your application up into chunks and then just load those chunks when they're needed. So I like that feature. Aside from that, one of the downsides of React, I suppose, is that for other things, like guarding routes or attaching tokens, you kind of have to do it on your own.
So yeah, React's lazy loading feature is super helpful for allowing you to split up your application. But aside from that, for other features, React is kind of lacking, I would say. There's not a ton there to help you.
You kinda have to know those things on your own. Once you do them a couple of times and figure it out, it becomes not so bad, but still some work for you to do.
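The chunk-splitting idea Ryan describes can be sketched in plain JavaScript. This is a hypothetical illustration, not the real `React.lazy` API: every name below is invented, and the simplified `lazy` helper just models the idea that code for a role shouldn't even be produced until the role is verified.

```javascript
// Hypothetical sketch of role-gated lazy loading. None of these names are a
// real API; they model deferred loading of a protected "chunk".

const users = { alice: 'admin', bob: 'viewer' };

function checkRole(username) {
  return users[username] || null;
}

// Mimics React.lazy-style deferral: the "chunk" is only produced
// the first time the loader is actually invoked.
function lazy(loader) {
  let chunk = null;
  return () => {
    if (chunk === null) chunk = loader();
    return chunk;
  };
}

const loadAdminPanel = lazy(() => ({ name: 'AdminPanel' }));

function renderForUser(username) {
  if (checkRole(username) !== 'admin') {
    return 'AccessDenied'; // the admin chunk is never loaded for this user
  }
  return loadAdminPanel().name;
}

console.log(renderForUser('bob'));   // AccessDenied
console.log(renderForUser('alice')); // AdminPanel
```

In a real React app the analogous shape would be `const AdminPanel = React.lazy(() => import('./AdminPanel'))`, rendered only inside a guard that has already confirmed the user's role, so the network request for the admin bundle never happens for other users.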
Jeremy: [00:46:04] Yeah. Talking about the hidden features and things like that, that's a really good point, in that I feel like with a lot of applications, you go to the website and you get this giant bundle of JavaScript. And in a lot of cases, companies are using things like feature flags, where there'll be code for something, but the user just never gets to see it.
And that's probably how you can just dig through and find things you're not supposed to see.
Ryan: [00:46:19] Yep. Exactly. You just pull down the source code that ends up in the browser and then flip on those feature flags manually, right? That's probably how she does it, I would assume. And it's very accessible to anybody, because, as I like to hammer home in my courses, the browser is a public client.
You can never trust it to hold secrets. You really need to take care not to expose things in the browser that shouldn't be exposed.
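To make the flag-flipping point concrete, here's a small, hypothetical sketch of the kind of feature-flag object that often ships in a front-end bundle. The names are invented; the takeaway is that anything in the browser can be toggled by the user in devtools, so flags can hide UI, but they can never protect data.

```javascript
// A hypothetical feature-flag map of the kind bundled into front-end code.
const featureFlags = { newDashboard: false, betaSearch: false };

// Returns the names of the flags that are currently switched on.
function visibleFeatures(flags) {
  return Object.keys(flags).filter((name) => flags[name]);
}

console.log(visibleFeatures(featureFlags)); // []

// Anyone can run this line in the browser console; nothing stops them.
// That's why the flag may only gate UI, while the API must still
// refuse requests the user isn't entitled to make.
featureFlags.newDashboard = true;
console.log(visibleFeatures(featureFlags)); // [ 'newDashboard' ]
```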
Jeremy: [00:46:44] I wonder, like you were saying, how in React you have to build more things yourself. You gave the example of how Angular has a built-in way to add tokens, for example, when you're making HTTP requests. I wonder if you think there's room for some kind of library or standard component set that people could use to have authentication or authorization in a React or a Vue, or if that just kind of goes against the whole ecosystem.
Ryan: [00:47:18] Yeah, I think there's room for it. I wouldn't be surprised, actually, if some of the things that I show in my course how to build yourself exist in an independent library. I think where it gets tricky is that it's often reliant on the context, and even on the backend that's being used, to make those things really work appropriately.
So I guess a good example is the Auth0 library that allows you to build auth into a React application. It's very nice because it gives you some hooks that are useful for communicating with Auth0 to, for instance, get a token, and then to be able to tell if your user is currently authenticated or not.
So they give you these hooks, which are great, but that's reliant on a specific service, a specific backend, a session being in place. Auth0 stores a session for your users. That's how they're able to tell if they should give you a new token. I mean, I love to see that kind of stuff, things that are pre-built to help you.
But I just don't know if there's a good way to do it in a sort of general purpose way, because it is quite context dependent, I think.
But I might be wrong. Maybe there's value in having something that you could have a plug-in system for. So, something generic enough that it gives you these useful things, like whether the user is currently authenticated, or if a route should be guarded, or something like this.
If you had that with the ability to plug in your context, like, I'm using Auth0, or I'm using Okta, or I'm using my own auth, whatever, maybe there's opportunity for that. That might be interesting.
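A provider-agnostic helper along the lines Ryan is imagining might look something like this in plain JavaScript. Everything here is hypothetical, not a real library: the generic questions (is the user authenticated, can they access a route) live in one place, and the provider adapter, whether it wraps Auth0, Okta, or your own auth, gets plugged in.

```javascript
// Hypothetical plug-in style auth helper; all names are invented.
// The generic logic only ever asks the adapter for the current user.
function createAuth(provider) {
  return {
    isAuthenticated: () => provider.getUser() !== null,
    canAccess: (route) => !route.protected || provider.getUser() !== null,
  };
}

// A fake in-memory adapter standing in for Auth0, Okta, or custom auth.
// A real adapter would call the service's SDK instead of a local variable.
function memoryProvider() {
  let user = null;
  return {
    getUser: () => user,
    login: (name) => { user = { name }; },
    logout: () => { user = null; },
  };
}

const provider = memoryProvider();
const auth = createAuth(provider);

console.log(auth.canAccess({ path: '/admin', protected: true })); // false
provider.login('alice');
console.log(auth.canAccess({ path: '/admin', protected: true })); // true
```

The design choice is the one Ryan hints at: the context-dependent part (which service, which session mechanism) is isolated behind a tiny adapter interface, so the route-guarding logic itself stays generic.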
Jeremy: [00:49:03] Would definitely save people a lot of time.
Ryan: [00:49:05] Yeah. Yeah, for sure. And that's why I love these prebuilt hooks by Auth0, for sure. When I'm using Auth0 in my applications, it certainly saves me time.
Jeremy: [00:49:14] The next thing I'd like to ask you about is you've got a podcast, The Entrepreneurial Coder, and, sort of in parallel to that, you've been working a quote unquote normal job, but also doing a lot of different side things, whether that's the Angular security book or recording courses for egghead. How are you picking and choosing these things, and what's your thought process there?
Ryan: [00:49:43] Yeah, that's a good question. I have always been curious about business and entrepreneurial things, and selling stuff, I suppose. I haven't been in business for a super long time, but I have been in business for a while now. And what I mean by that is, you can take it back to 2015, when I started at Auth0. At the same time that I was working there, I was also doing some side consulting, building applications for clients. And so since that point, I've been doing some side stuff. It turned into full time stuff back in 2017. I left Auth0 in late 2017 to focus on consulting.
Then in mid 2020, I realized that the consulting was good and everything like that, and I mean, I still do some of it, but it's difficult when you're on your own as a consultant to be learning. And that's something that I really missed. I found that my learning was really stagnating.
I learned a ton at Auth0 because I was sort of pushed to do so. It was a necessity of my job. As a consultant building applications, I mean, sure, I needed to learn various things here and there, but I really felt that my learning had stagnated. And I think it's because I wasn't surrounded by other engineers who were doing things, learning from them, and sort of being pushed to learn new stuff.
So yeah, an opportunity came up with Prisma, who I'm now working at. Everything kind of fell into place to make sense for me. I mean, I don't think I would take just any job. It was really a great opportunity at Prisma that presented itself.
And I said, you know, I do miss this whole learning thing, and I do want to get some more of that. So that's probably the biggest reason that I decided to join up with a company again. All that being said, the interest in business and in doing product work, like this React security course or my Angular book, came out of some kind of desire, I don't know when exactly it sprouted, to create a product, to productize my knowledge, to create something out of that and offer it to the world in a way where I could be earning some income from it. And there are various people I could point to that have inspired me along that journey. The most prominent one is probably Nathan Barry. I don't know if you're familiar with Nathan Barry. He's the CEO of ConvertKit, and I've followed his stuff for a long time. He was always very open about his eBooks and the products that he creates.
And that really inspired me to want to do something similar. And so, in trying to think of what I could offer and what I could create, authentication and identity stuff kind of made sense, because I acquired so much knowledge about it at Auth0. And I was like, you know, this probably makes sense for me to bundle up into something that is consumable by people and hopefully valuable. I sense that it has been valuable; certainly the response to the products has told me that. So yeah, I think that's the genesis there. And then the interest in business stuff, I've just always been curious, especially about online business, maybe less so with traditional brick and mortar kind of business. There's something alluring to it, because if you can create something and offer it for sale... I'm sure there are better feelings, but I was about to say there's no better feeling than to wake up to an email saying that you've made money while you've been sleeping, right? It's a cool feeling for sure. So yeah, I think that's probably where that comes from.
Jeremy: [00:53:34] And when you first started out, you were making courses for egghead and Frontend Masters. Does that feel very distinctly different from when you go out and make this React security course?
Ryan: [00:53:49] Yeah. Yeah, definitely, for a few reasons. I think the biggest, and probably the overarching thing, is that when you do it on your own and create your own course, the sense of ownership that you have over it makes it a different thing. I'm in tune quite deeply with the stuff I've created for React Security.
And even though I've done, what, three, I think, Frontend Masters courses, and I've been out to Minneapolis a bunch of times and know the people there quite well, there isn't the same sense of ownership over that material, because it's been produced for another organization. When it's your own thing,
it's like your baby, and you care more deeply about it, I think. So yeah, it's been a different experience for sure, for that reason. I think, too, getting down to the technicalities of it, there's the fact that you are responsible for all of the elements of production.
You might look at a course and think, okay, it's just a series of videos, but there's tons of stuff that goes into it in addition to that, right? There's all of the write-ups that go along with each video. There's the marketing aspects, like putting the website together. And not only that, there's the lead-up, the interest-building lead-up over the course of time through building an email list, and you collect email addresses through blog posts and social engagement.
It was building up interest on social over the course of time, like tweeting out valuable tidbits. There's just this whole complex web of things that go into putting your product together, which means that you need to be responsible for a lot of those things if you want it to go well, things you don't really need to think about if you're putting it on someone else's platform.
So there's trade-offs either way, definitely trade-offs. But, I mean, it depends what you want to do. I think if I were to do a product with some size to it, I'd want to do my own thing for sure.
Jeremy: [00:55:48] Are there specific parts of working on it on your own that are just the least fun?
Ryan: [00:55:57] Yeah, the parts that are the least fun. I suppose for me, it's probably editing. Editing the videos is the least fun. I've got this process down now where it doesn't take super long, but it's still a necessity of the process that is always there. And the way that I record my videos, I probably put this on myself, but editing takes quite a while. Yeah, I think that is not such a great part about it. What else? I guess there's the whole aspect of, if you record a video and you've messed something up and the flow of things doesn't work out, then you have to rethink the flow.
Anytime I've got to backtrack, I suppose that takes the wind out of my sails for a little while. Yeah, those are the two that stand out the most, I think.
Jeremy: [00:56:51] Yeah. I noticed that during some of the videos you would make a small mistake and correct it, like you would say, oh, we needed these curly braces here. I was curious if that was a mistake you made accidentally and decided, oh, you know, it's fine to just leave it in, or if those were intentional.
Ryan: [00:57:10] Definitely no intentional mistakes. I'd love to post a raw video sometime that isn't edited, because you'd laugh at it, I'm sure. Inevitably, in every video, what it looks like unedited is a bunch of starts and stops for a section till I finally get through, then a pause for a while, and then a bunch of starts and stops to get to the next section.
So like, "In this section we're going to look at", and I don't like the way that I said it, so I start again, "In this section we're gonna look at", and then there's like five of those until I finally get it. And so my trick with editing is that I will edit from the end of the video to the start, so that I don't accidentally edit pieces of the video that I'm just going to scrap anyway.
For a long time when I first started, I'd spend like 10 minutes editing a section, and then I'd get further along and be like, oh, I did this section in a better way later, and now I have to throw out all that editing that I just did. So if I edit from the back to the front, it means that I can find the final take first, chop it where the take starts, and then just get rid of everything before it, right?
So yeah, there's definitely no intentional mistakes, but sometimes if there's something small, like, oh, I mistyped a variable or something like that, I'll often leave that in. Sometimes it's a result of doing that early on in the video and then not realizing it until the code runs a few minutes later.
Internally I'm like, argh! But I don't want to go and redo the video, right? So I'm just like, okay, I'll leave that in, whatever. And some of those are good, I think. In fact, arguably, I probably have my videos in a state where they're a bit too clean, because I've heard a lot before that people don't mind seeing mistakes.
They actually kind of like it when they see you make mistakes and figure out how to troubleshoot them, right?
Jeremy: [00:59:02] The projects that you've done on your own have been books or courses. And I wonder if that's because you specifically enjoy making teaching material, or if this is more of a testing ground, like, I'll try this first, and then I'll try out a SaaS or something like that later.
Ryan: [00:59:22] Right. Yeah. I definitely enjoy teaching. I would say that I am passionate about teaching. Maybe I'm not the most passionate teacher you'd come across, I'm sure there are many others more passionate than I am, but I enjoy it quite a bit. And I think that's why I like the kind of work that I do,
the devrel work that I do at Prisma. I enjoy being able to offer my experience and have others learn from it. So at this point, if I'm making courses or books or whatever... and I don't know if I'll do another book. I mean, writing a book is a different thing than making a video-based course,
and I don't know if I would do another one. But in any case, I don't think it's a proving ground. I'll do other courses regardless of what other sorts of things I make. But speaking of SaaS, that is an area that I'd like to get into, for sure. As time wears on here, I'd love to start, and I've got a few side projects kind of simmering right now that would be in that direction.
So yeah, I think that's an ideal of mine: getting to a SaaS where you start hopefully generating recurring revenue, more predictable recurring revenue. Because with a course, you build up to a launch, and hopefully you have a good launch.
It's a pretty good amount of money that comes in at launch time, but then it goes away, and maybe you have some sales and you generate revenue as time goes on, but less predictably than perhaps you would with a SaaS. So yeah, that's kind of where my head's at with that right now.
Jeremy: [01:00:51] You mentioned how you probably wouldn't do another book again. I'm curious why that is, and how you feel that is so different from a video course.
Ryan: [01:01:02] Yeah, I think it's because I enjoy creating video-based courses more than I enjoy writing. I like both, but writing less so. And I also think that there's this perception when you go to sell something that if it's a book, it's worth a certain price range, and if it's a video course, it's worth a different price range that is higher.
I don't think those perceptions are always valid, because of the value that can come from a book if it's on a technical topic that is going to teach you something. If you pay $200, $250, $300 for that book, which in our world is a PDF, right, I think it's perfectly valid. People have this blocker, though, when it comes to that. Internally they're like, there's no way I'm going to pay that much for a PDF. But for the same information in a video course, people are often quite amenable. So there's that.
I happen to enjoy creating video courses more than I do books. I think that's the reason I probably wouldn't do a book again. And a book is harder to do than a video course. Writing a book is a difficult thing, for whatever reason. I couldn't tell you technically why or specifically why, but I've heard this from a lot of people, that making a book is harder than making a video course. Yeah.
Jeremy: [01:02:28] Yeah, I have noticed that it feels like everybody who tries to write a book always says it was way harder than they thought it was going to be, and it took way longer.
Ryan: [01:02:36] Yeah, for sure.
Jeremy: [01:02:37] I do agree with the value perception. I don't know what it is, but maybe it's just the nature of the fact that there's so much you can just go online and see in blog posts and stuff, that for some reason people don't value it as much when it's all assembled in written form.
But that's the pitch even on your course's landing page. You say, yeah, you can technically find all this information online, but just think about how much time you're going to spend looking for it and not knowing whether or not it's valid.
Ryan: [01:03:14] Yep. Absolutely. That's the big sell, I think, with online education. Rather than having you assemble all that knowledge yourself over the course of time, the value proposition is: I will show you my lessons learned and give them to you succinctly, in a nicely collated fashion.
Jeremy: [01:03:34] Another thing that I enjoyed about the course is that it was very to the point. Sometimes you'll see an advertisement for a video course and they'll say, oh, it's got 20 hours of content, but the fact that it's long doesn't mean you learn more.
If anything, you would think you'd want to spend less time to learn the same amount of material. So yeah, I definitely appreciated that, as well as the fact that you've got the transcripts underneath the video.
Ryan: [01:04:05] Oh, that's good to hear. That's awesome. Yeah.
Jeremy: [01:04:07] Cause it's like, I could watch a video and then get to the end, and maybe I'll remember some things, but other things I won't. And then to pick it back up, I have to go watch the video again. Whereas when you've got that transcript, that's pretty helpful.
Ryan: [01:04:23] That's great to hear. Yeah, I wanted to make a point of including transcripts, for accessibility reasons, for sure, but also because something that I have realized as time has gone on is that people do appreciate the written form. I always thought that maybe it's a little bit weird to read something that is the narration of a video, because it perhaps doesn't translate super well in print, but everything that I've heard about the transcripts has been positive, like you've said. So I'm glad to hear that people seem to like them.
Jeremy: [01:04:54] The last thing I'll ask is, when authoring courses, or even just when you're releasing a product on your own, what are some things that you think are really good ideas, and what are some things where you've realized, maybe not so much?
Ryan: [01:05:09] Yeah. So what I'd offer there is: it's a good idea to try to validate what you want to create as a product. If you're going for a sale, if you want to sell the thing and make a decent amount of money, it's a really good idea to try to validate it early on, and you can do that without much effort, I think, by doing smaller things ahead of time. Things like creating a couple of videos for YouTube and seeing what the response is there, or offering tidbits on places like Twitter and seeing how people respond to them. One of the ways that I validated these courses was by doing just that: tweeting out small, valuable pieces of content, writing blog posts, and seeing how people responded.
And then the biggest thing, though, for me, for this particular course, was that I created the free intro course, the React Security Fundamentals course. And after I saw the number of people signing up for it, I was like, okay, I think there's enough volume of interest here to tell me that if I was able to offer a sale to even a small percentage of that list, it would be worthwhile.
The other thing that I would recommend is that you build up an email list. You find some way to get in touch with your audience and keep in touch with them, and that's often through email. If you can build up an email list and offer valuable things to them over the course of time, they will have you in mind, and they will feel some need for reciprocity when it comes time for you to offer something for sale. Do all those things
well before you go to release your course. The last thing I want for anyone is to release a course and have nobody there to buy it. And, unfortunately, I think a lot of people have the assumption that if they just create it and put it up on the internet, it'll sell somehow. But the reality is you almost certainly need to have developed an audience beforehand.
At least if you do, that's the way you can make a meaningful impact, but also see financial results from going down that road. So building that audience, I think, is the key there.
Jeremy: [01:07:18] Cool, that's a good place to start wrapping up. If people want to check out your course or see what you're up to, where should they head?
Ryan: [01:07:26] Sure, yeah, thanks so much. You can go to reactsecurity.io, and I'll get you some links to put in the show notes or whatever, but reactsecurity.io. There's a link there for the advanced course. You can check me out on Twitter, I'm @ryanchenkie. And where else? I've got a YouTube channel under my name, Ryan Chenkie, and I've got some free stuff there. Yeah, check those places out and kind of find your way around.
Jeremy: [01:07:50] Cool. Well, Ryan, thank you so much for chatting with me today.
Ryan: [01:07:54] Thanks for having me. This has been great. I was happy to be able to do this.
Jeremy: [01:07:58] Thanks again to Ryan Chenkie for coming on the show. I'll be taking a short break, and there will be new episodes again next month. I hope you all have a great new year, and I'll see you then.
Timirah is an iOS developer, developer advocate, founder of the TechniGal LA meetup group, and instructor for O'Reilly and Coursera.
Music by Crystal Cola.
Transcript
Jeremy: [00:00:03] Today I'm talking to Timirah James. She's been an iOS developer for... how many years has it been?
Timirah: [00:00:09] Oh my goodness. Since I first started iOS, I want to say ooh (whispering) oh my gosh seven years.
Jeremy: [00:00:15] Seven years. Where did that time go?
Timirah: [00:00:17] My goodness. Oh yeah. Seven years. Wow.
Jeremy: [00:00:21] I think a lot of people listening have written software before, but maybe not for iOS. So I'm interested to hear what your experience is and how you think people should get into it.
Timirah: [00:00:31] Absolutely. So, yeah, and excuse me, we talked before we cut the podcast on about how I have this 8:00 AM rasp in my voice (laughs), so you'll have to excuse the rasp. But yeah, like we said, seven years with iOS development. The great thing, and the crazy thing, about my journey in iOS is I really went in with purpose and intention, right? I went into mobile development knowing that it was going to be something that was valuable not only to my career, but valuable to society as a whole, right? Valuable to the industry as a whole. So it goes all the way back to when I first started coding. I want to say I was 17 years old, and I started to get into web development, and I got bamboozled into getting into a high school internship thing to learn web design. And I loved it, and that was my first introduction to coding. And I was like, oh my goodness.
Like, this is amazing. Web development, web design. And then once I started to think about college and the route that I wanted to go, I was like, okay, hmm, there are so many other avenues and so many other doors in this industry. You know what? I know that I want to pursue computer science, but what is going to be my path?
So I started thinking about my career pathway immediately. I'm like, okay, going to college, then what? And at that time, I think we were just making the transition from flip phones, and the camera phones, and, oh my gosh, if you had the video phone... you'd call them the video phones.
And then the transition into, oh, now we're at iPod touches, now we're at iPhones, now we're like, okay, wow, this is the birth of the smartphone, the birth of the iPhone, and really seeing the transition. The phone became something that's way beyond just a communication device. Now it's media, now it's entertainment, now it's education, it's finance, it's all of these things. I always say you can leave your laptop at home, but if you leave your phone, you're like, oh my gosh, how am I going to function throughout the day? If I can build something for that, I can make a huge, huge impact. So I really went into it with intention; it was very much on purpose.
So I said, after college I'm going to go right into mobile development. I really think that I will make the biggest impact there and build things that will stick with people, that people will spend time with and enjoy, and that will live with users. I fell in love with the idea of the evolution of the phone and mobile development.
So I went to college and ended up dropping out in that first year of computer science, for a lot of reasons, one of them being that the university I went to wasn't a tech school, so to speak. Computer science was not one of their pillars, one of those things where they say, this is something we're focusing on, and we have resources, and we have gateways for mentorship and internships.
They were just like, hey, we've got computer science here, you know? And yeah, so I went and I took that opportunity, and it just wasn't what I thought it would be. I didn't have any support coming in, just learning computer science, and there were so many different aspects to it.
There were cultural aspects. There was the support in terms of what was being taught. And I just felt like the more I was in it, the more behind I felt. I felt like I was being pushed out. And yeah, so I left school and I was like, okay, where do I kind of go from here?
And again, I knew that I still wanted to do mobile development, so I continued that journey online. There were maybe some coding bootcamps that were really underground, just getting started, getting their feet wet and trying to figure it out, but none that were really visible, like, oh, this is an option. So I didn't have any bootcamp or coding school situation.
But I had so many resources online. I can remember my primary sources of education at that time: everything like Ray Wenderlich, Treehouse, Coursera, which is crazy, because I'm teaching courses for Coursera now. And in addition to that, my primary source was Stanford's iOS development course, for which they were offering all the lectures for free.
And I was upset, because I was like, wait a minute, the school I was going to was this private university, and here I could have had a Stanford education. Which I kind of ended up with, so, okay, I have a Stanford education in iOS development (laughs). So yeah, I kind of continued on from there and really went cold turkey.
I know that it's hard for a lot of people to adapt to being self-taught, in the real meaning of being self-taught, right, where I didn't really have a physical community or any source of mentorship around me saying, hey, you should learn this, or you should learn that.
I didn't even know where to look for that kind of guidance. I didn't have a bootcamp that was promising me a roadmap, a career ladder, what the important tech stacks for mobile development were. All that groundwork, I really did it myself, and I had the support of my mom.
I didn't work another job. I was living with my mom, and she's always been a single parent, so one paycheck. And it was really maybe less than a year before I started my professional career and landed my first role as an iOS engineer.
So, anything you want to learn, you want to make sure that you do the work, right? Even if you're at a coding bootcamp, you have to want to learn it for yourself. And then you have to look at the benefits. What are the benefits of learning something like this, like iOS development?
Okay, you know that it has a very strong Apple developer community around it, and then Apple, the entity itself, really prides itself on making sure that they give an abundance of resources to developers, via WWDC, which is their yearly conference.
And they make sure that they are flooding developers with new things, new information, and hands-on training in the best ways that they can. So there are a lot of resources out there. And in terms of iOS development itself, the benefit of iOS... again, iOS can be very niche, and that's something that probably turns people away.
They're like, why iOS development? With Android you have a plethora of different devices. You have, I guess, more scalability in terms of who's able to access your applications, right? You think about people in other countries. The iOS device is probably the number one device in the United States.
But outside of here, like, I've been to Europe, I've been to a couple of other places, and honey, they are Android down. It's all Android over there. And that has a lot to do with pricing and all these things. A lot of people are turned away just because it is such a niche, and I call it this King Apple theory, where you have to use an Apple language and build in an Apple IDE,
right, on an Apple machine, for Apple products. So they do a great job of making sure that ecosystem is tight, and the compatibility and conversions are very much limited outside of that ecosystem. So yeah, I think it's definitely worth it, because there is so much of a focus. There's a lot of benefit in that as well. So yeah.
Jeremy: [00:09:18] What would you say are the biggest reasons or benefits for choosing to develop for iOS versus Android?
Timirah: [00:09:24] I think when it comes to the benefits of iOS itself, it comes down to, for one, stability, right? Sometimes it is harder when you have so many devices to build for, right?
So you have this ecosystem of Apple products, but for iOS development specifically, it's just the iPhone. We have different versions of the iPhone, and of course there are multiple versions that keep upgrading as far as the operating systems, like iOS versions.
And that's basically the end of your worries when it comes to support and compatibility. But there are so many different operating systems, device models, and device sizes, and so on and so forth, when you're dealing with Android development. Right? And so something that might work or look good for six or seven out of 10 Android devices, you may have some users out there where the functionality isn't as optimal, you know, it isn't as reactive,
just because there's a slight difference in the operating system, or a slight difference in the device, the model. That wide range can be a bit of a hindrance there. iPhones keep continuing to upgrade and the iOS versions continue to go up, but you have less and less of a variety of devices that you have to really worry about. Right? And so there's a lot more consistency and, again, stability when you're dealing with iOS development. And some people see it as a blessing and a curse that there are so many different languages that you can utilize for Android development, as opposed to iOS development.
Again, you have to use an Apple language, unless you're doing a cross-platform situation, you know, and there's a handful of frameworks out there for that. But if you want a rich native experience, you're going to go native Swift, or you're going to go native Objective-C.
So, yeah, there's this aspect where it benefits you to remain in that consistency and that stability, because Apple knows that all of the development is weighted on those languages and on that IDE, which is Xcode. They put all of their focus into making sure that those things are as stable as possible.
As opposed to Android, where, again, there's a blessing and a curse in how much variety there is. So I think that's probably the main thing when it comes to that big difference and the benefit of building for iOS.
Jeremy: [00:12:24] It almost sounds like for an independent developer or for a smaller company, it might be easier to take on iOS development, because as you said, there aren't as many devices you have to make sure that your software works on. With iOS there is just Apple's operating system. There isn't a version by Samsung and by Huawei and by Google and so on. And then in the tools themselves, you said there's a lot of different options for languages and IDEs for Android, whereas for iOS, everything's more centralized, and Apple is really the main choice, right?
So they're forced to push all their efforts into that. It's almost like different philosophies, right? You get a bunch of options, or you have more limited choices, but maybe there's less for you to have to worry about, less for you to have to learn, and maybe you have some more focus there.
Timirah: [00:13:22] Absolutely. You know, Apple is doing everything to try and make sure that they are global. They're definitely one for global dominance, right? (laughs) I don't know why that sounded scary in a way, but, you know, Apple wants to be number one.
Yeah, of course they want the iPhone to be the number one device in the world. Right. But then there's this issue of pricing, of course. Like, okay, a thousand dollar phone is probably not popular everywhere. But America has this thing too. America has a culture of thinking differently, right?
Like, even down to, you ever had a friend where they text you their number and you're like, Oh, a green bubble came up? I've seen this among my friends.
Like if you text me and if I see a green bubble, I'm going to give you a side eye, you know? (laughs)
Like, you know, so it's even down to that, right? So beyond just developers, down to users: Oh, you have a green bubble. There's this thing, which is stupid and it's hilarious, but it is a thing. You know, if you want the top phone, you want top performance, you want the top camera, even if some of the features are very much copy-paste from Android. Sometimes when Apple is doing the rollouts, you see some of the features and you're like, Android... the Galaxy has always had this feature, you know, and now Apple is rolling it out like it's so new and it's so innovative. But because it's on the iPhone now, it is a new trend. Right? Just because it has set that status quo in the culture. There's this thing where centralization is just a big thing when it comes to iOS development.
Jeremy: [00:15:14] Earlier you were talking about how you might build your application with Swift, or you might also have Objective-C. Could you kind of go into why there are these two languages, and when you would use each?
Timirah: [00:15:29] Yeah. So, back in the day, Objective-C was created over 30 years ago, and that was the sole language for these past three decades for development in the Apple environment when it comes to building applications. There was just one language this whole time,
before 2014, and that's when Swift was born. And it kind of hit everybody by storm, because there was just no inkling. Like, we had no inkling that Apple would even produce something like that, a brand new language. So the Apple developer community was like, Oh, what's happening here?
They were just shaken up. The differences in the languages are very, very drastic. Objective-C is known for being a very ugly language. (laughs) Very verbose, very, I mean, double down on boilerplate. There's a lot of redundancy, right? And the benefit in a language like that is, you know, yes, it's ugly and hairy for a beginner.
I had to make lemonade in my mind in order to learn it. All of this redundancy is kind of teaching me syntax and teaching me logic, right? If I have to keep making a reference to this, if I have to keep adding the boilerplate code in, it's embedding this logic into me as a beginner. Like, Oh, this is what this actually means, every single time. Once you're an expert in that, it's just annoying. Okay? So people were tired of it. It was an ugly language, but it was the only language at the time. So in 2014, Swift was a breath of fresh air: a very simplified, straightforward, very clean, very legible, readable language when you compare it to Objective-C. So something you might do in Objective-C with like 30 lines of code, you might do in Swift with like eight or nine lines of code. So it was definitely like, Whoa, this is a huge, huge thing. And then Swift has gone through these six years, going through this huge rapid climb, just in popularity and in evolution.
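As a rough illustration of that difference in ceremony, here is a hypothetical sketch: a tiny greeter type, with the Objective-C version shown in comments for comparison next to its Swift equivalent. The type and method names are made up for the example.

```swift
// Objective-C splits a type across a header and an implementation file,
// with Smalltalk-style message syntax (sketched here in comments):
//
//   // Greeter.h
//   @interface Greeter : NSObject
//   - (NSString *)greet:(NSString *)name;
//   @end
//
//   // Greeter.m
//   @implementation Greeter
//   - (NSString *)greet:(NSString *)name {
//       return [NSString stringWithFormat:@"Hello, %@!", name];
//   }
//   @end

// The same thing in Swift: one file, no header, string interpolation.
struct Greeter {
    func greet(_ name: String) -> String {
        return "Hello, \(name)!"
    }
}

print(Greeter().greet("world"))  // Hello, world!
```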
So it went from like, now there's another option for building iOS applications, right, you can use this new language. Then, like the very next year, Swift becomes open source. Oh my goodness, what does this mean? We talked about Apple being very much closed off and very strict on their programming languages and their platforms.
And, you know, everything is being built by Apple. So there's this new layer of transparency, saying, Hey, we open sourced this language, so you can see everything, and not only just see everything, but also contribute and make it what you want it to be. So it opened it up to the community to build up and create the language of their liking.
And it was a huge deal. It made a huge statement in the Apple developer community, and it also attracted other people to the language, because they were curious. Right. And once it became open source, the flood gates kind of opened. Swift as a language transitioned into this, like, okay, now we see all these benefits of the Swift compiler and, you know, runtime. Can we use this in other areas, beyond iOS development? And the next thing you know, you're seeing Swift on the server, you know, server-side Swift frameworks. You're seeing people building web applications with Swift, and it just kind of grew a pair of wings. Ever since its birth, it's been this top 10 language to learn. If you've seen basically any tech publication or blog, like, top 10 languages to learn for 2020 or 2019 or whatever, Swift has always been on the list.
And a lot of people are like, okay, I understand why this language or that language, but why Swift? Why are they suggesting Swift? If I'm being recommended Swift as a top 10 language to learn, am I being encouraged to learn iOS development?
Is that where this is going, instead of acknowledging Swift the programming language in itself, and the benefits of the language and what you can do? So yeah, it kind of offered up this opportunity for Swift to, like I said, grow a pair of wings, and now you can actually be a full stack developer using purely Swift.
You know, you can build your native front end with Swift and you can build a backend with Swift, build a whole REST API. So yeah, there's this whole situation. So now you have Objective-C and you have Swift. There's this talk that I like to give, it's called Swift for Objective-C OGs, and basically it's encouraging people who are still in that Objective-C bubble to take the jump and learn Swift. Because every year during WWDC, you don't see Apple coming out and saying, Hey, here are some new updates with Objective-C, here's a new framework for Objective-C. No, there are no new updates when it comes to Objective-C. And I alluded to the fact that that may be due to Objective-C becoming even more of a legacy language, and maybe, I don't know how near or far that future or that destiny is, but maybe one day it will become deprecated.
Objective-C is something that a lot of enterprise companies still use, just because if you're Google or someone, your legacy code is built in Objective-C. Now there's this new language, right? And you see all the benefits of Swift and the compiler. So now there's this thing where, okay, you either hire developers who know Swift to help make the transition, or you train your developers that are working on the iOS products to learn Swift so they can help make the transition. But that process takes, you know, a couple of years to properly make that migration.
Even if you have to mix the two, it takes a while. So yeah, Objective-C is more of the, I don't know if you'd call it the grandfather, but definitely the father. And then Swift is the new baby that is taking over the world right now.
Jeremy: [00:22:38] So if you're making a new iOS application, you could do it entirely in Swift? You don't even need to look at Objective-C?
Timirah: [00:22:45] Absolutely. Yeah, you can make it just Swift. If you open Xcode and create a new project, it'll ask you, what language are you doing this in, Swift or Objective-C? You can do it either or. And also, let's say you're building an application in Swift, but there's this third-party library that's doing some cool, I don't know, animation or something, but it hasn't been converted to Swift and it's in Objective-C. You can use a bridging header to basically pull that library in, take advantage of that functionality from that third-party library, and continue to build your application in Swift.
So yeah, there's a way you can use both. They are definitely their own individual languages, but yeah, absolutely.
And Swift has this huge interoperability factor. And I talked about this in my O'Reilly course as well. That component is a huge deal, because you have other developers, not only people coming from the Apple community who are using Objective-C, but developers who are now trying to learn Swift who are coming from a Python background, or, you know, they're using it for different reasons.
And that element of Swift helps to bridge that gap. Right? So if you're learning Swift and you're still trying to use some Python libraries or integrate some Python modules, you can utilize that with Swift. Yeah. So I think it's supported with Python, and with C++ or C.
And that is a huge benefit when it comes to Swift. That's a big help when someone is not only building for mobile, but, like I said, coming from other backgrounds and trying to build something with Swift that may or may not have something to do with iOS development per se.
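One flavor of that interoperability, calling into C, is built into the language itself; for instance, functions from the C math library can be called from Swift with no wrapper code at all. A minimal sketch:

```swift
import Foundation  // pulls in the C standard library (math functions included)

// sqrt and pow here are the plain C math functions, imported
// directly into Swift with no hand-written bridging code.
let hypotenuse = sqrt(pow(3.0, 2) + pow(4.0, 2))
print(hypotenuse)  // 5.0
```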
Jeremy: [00:24:51] I want to go a little bit into Swift the language itself. If you're someone who's familiar with, like you said, Python or JavaScript or Java, how does Swift compare? What's similar and what's different?
Timirah: [00:25:07] At first glance, there are so many similarities in syntax alone when it comes to JavaScript, which is really friendly for people who are coming from JavaScript, or languages that JavaScript may be similar to as well,
right, even down to the var keyword. One thing about Swift, though, is it's very statically typed, and there's this element of type safety. Swift kind of prides itself on being a very safe language, and that contributes to its performance.
It's very much... the speed is probably what makes it so popular amongst developers. It's just like, Oh wow, this compiler is like, okay. Yeah (laughs) and that's due to it being such a safe language, right? The type safety and generics make it that safe language, and it prevents the user from basically passing in incorrect types. It will yell at you. It will definitely (laughs) yell at you, because it just prides itself on preserving memory and being very, very safe. There is one thing that I always talk about, which took me by surprise. I was shook af, okay, I was shook af. The main thing was this thing called optionals.
And I don't know if you've heard about Swift optionals. They're probably the most annoying thing you will come across when you're dealing with Swift. Basically, you identify optionals with these question marks and exclamation points, and I'm like, Oh my goodness, more punctuation. What does this mean?
I've never put a question mark in a line of code in my life. Okay? Exclamation points, you know, we have a different reference for exclamation points, right? Anytime I was coding before Swift, anytime I used an exclamation point, it was to say not something, right,
so not equal to something. But no, it didn't mean that at all. Basically, these optionals are attributed to type safety, to protecting values and checking for them before everything compiles. So that means if you have a value that may or may not be nil, Swift offers an opportunity for you to wrap that value and say, Hey, this may or may not be nil. We're going to check for that. But whether it's nil or not, this function is not going to break, and the app is not going to crash. Right? We're not even taking up that space. And then it gives you the option to unwrap that value and say, Hey, I think it's safe. This value is pretty much not going to be nil.
Yeah, and it forces you to declare that, and it forces you to constantly be thinking about the memory and thinking about type safety at all times, you know? We talked about Objective-C giving you that mentality of redundancy, of what was important, right? And what's important for Swift is performance. It's like, you know what, I'm not even (hits the mic) going to give you the opportunity to (laughs) slow me down or make this application crash based off of some value that you maybe forgot about, or that just became nil because you forgot to do X, Y, and Z.
I'm just going to wrap this thing for you, and I'll let you know what comes of it. Yeah, so that's a huge thing. It's not something that's unconventional when it comes to other programming languages, but the approach and the syntax of it is a little bit different, and it might throw off some people coming from other languages.
So I think that's a huge difference. You know, you see the syntax and you're like, yeah, this is familiar... whaaat, a question mark? That's one of the big differences as well.
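In code, the punctuation she's describing looks something like this (a minimal sketch; the variable names are hypothetical):

```swift
var nickname: String? = nil   // "?" declares an optional: a String or nil

// "if let" unwraps safely - the first branch only runs when a value
// is actually present, so nothing can crash:
if let name = nickname {
    print("Hi, \(name)!")
} else {
    print("No nickname set")
}

// "??" supplies a default instead of crashing on nil:
let display = nickname ?? "anonymous"
print(display)  // anonymous

// "!" force-unwraps: it asserts the value exists and crashes if it
// doesn't, which is the behavior the compiler nudges you away from.
// let risky = nickname!   // would crash here, since nickname is nil
```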
Jeremy: [00:29:18] So to see if I understand correctly, this optional type, the real problem it's trying to solve is: when you work with languages like JavaScript, and I'm sure this is the case with Python as well, there can be cases where you have a function and it takes in parameters, but one of those parameters could be nil or null.
And when you try to use that function, you'll get an error at runtime where it'll say something like undefined is not a function, or just a null reference exception or something like that. It'll just blow up. Yeah. And it sounds like what the optional type is attempting to do is to have you tell the compiler, through your code, that this value could have a value in it, or it could be nil.
And the compiler then knows if you've handled that case, the fact that it could be nil, or it could have something in it. Did I kind of get that right?
Timirah: [00:30:18] Absolutely. Absolutely. You know, and that's inside or outside of a function. Even if it's not within a function, you might just have this value that exists when the view first loads. You're like, okay, I need to protect this, and I need to know what this outputs, and I need to know it's going to check for it, making sure that the app does not crash based off of this one value. One of the examples I give is an Instagram situation, right?
Where, if you are on Instagram, then nine times out of 10 you have an Instagram account, but you might have zero followers. So with the number of followers, you might have a situation where it might be zero, or it might be nil.
But that doesn't mean that Instagram should crash because you don't have any followers or you don't have any pictures. Okay, i.e., the bots. The bots live because (laughs) that situation can exist. You don't have to have any pictures on Instagram.
You don't have to have any followers, right? So it can be zero, it can be nil. Instagram should not crash because of that. Your profile page should still load as well. So those values should always be protected. They may or may not be there, but that page does exist.
So yeah, that's the example that I always like to give as well.
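A minimal sketch of that Instagram scenario (the types and names here are hypothetical, just to show the shape of it): the follower count is declared optional, so a profile with no data still renders instead of crashing.

```swift
struct Profile {
    let username: String
    var followerCount: Int?   // may be a number, zero, or nil (no data yet)
}

func followersLabel(for profile: Profile) -> String {
    // Unwrap safely: if the count is nil, fall back to placeholder text
    // instead of crashing the whole profile screen.
    if let count = profile.followerCount {
        return "\(count) followers"
    }
    return "follower count unavailable"
}

let fresh = Profile(username: "new_account", followerCount: nil)
let zero  = Profile(username: "lurker", followerCount: 0)
print(followersLabel(for: fresh))  // follower count unavailable
print(followersLabel(for: zero))   // 0 followers
```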
Jeremy: [00:31:55] When someone is learning Swift and figuring out how to build iOS applications, and you go onto Google and you start searching for tutorials and things like that, are there certain features or certain ways of building applications that they're going to find that they should avoid, because there are newer ways of doing it now?
Timirah: [00:32:20] That is a great question, huh... things that are old, maybe outdated best practices when it comes to Swift... Hmm. One thing I will say about Swift is that Swift is constantly, I probably said this already, but constantly evolving, and a lot. And that is due to Swift being open source, and the community of maintainers and contributors around Swift.
They're very passionate. They're very progressive and aggressive, and when I say aggressive, I mean it in the dearest way. And so every couple of months, you see where we're at on Swift.
So Swift is six years old, and we're on Swift 5.3. Okay, so let's say that could be, what, a version a year? But then you have versions in between that, right? Like, oh, this is 1.2, this is 2.5. We're constantly being upgraded and updated. So let's say you're looking into Swift development. You're like, okay, I want to do iOS development.
Okay, here's a Swift tutorial. If it's from 2015 and this is Swift 3, it's no good. It's no good at all. And, you know, Apple does a great job of keeping up, right? So the latest version of Xcode out right now is Xcode 12, and with Xcode 12, I believe, I don't think it's compatible with anything earlier than Swift 4, certainly nothing under that. And then even the iOS version, like, okay, iOS 13 and up, you know, and we're on iOS 14. So everything is constantly bumping you up. If you come across some books, literally an entire book on Swift...
In 2014? No good. 2015? No good. 2016? No good. You want to look for resources that are the most updated you can possibly find, no earlier than 2019. So if you come across a tutorial, a course, a book, whatever, look at that date, because that's probably the best thing when it comes to being current, making sure that you're using a version of Swift that's actually compatible with this version of Xcode. And if not, Xcode will scream at you (laughs), it will not compile. And yeah, you want to stay current, stay forward, and stay progressive when it comes to Swift, because in six years Swift has risen so fast, and it's grown so much in terms of the community and what the language is.
So if you want to learn more, I would say one of the best resources would be a good friend of mine, Paul Hudson. He's excellent when it comes to keeping Swift content updated. He's written so many books on Swift, and he's like a machine when it comes to keeping things updated. He doesn't really let any of his content go outdated, so he continues to update his books, his courses, his website. He has a website called Hacking with Swift, that's the brand, so you can go to hackingwithswift.com and check out all of this cool stuff for iOS development, whatever you want to accomplish with iOS development.
And for learning Swift in general, Hacking with Swift is like Swift world. Okay. And you want to stay in tune with swift.org as well, because that's where you're going to get all the updates from the community on Swift the language and the progress itself, and what's new. I believe the latest thing I've seen on there is the update that Swift is now available on Windows.
So you can install Swift on your Windows machine and basically build applications with Swift. And again, that's attributed to the community behind that. So you can go to swift.org to learn more about that, learn about Swift's progress, learn how to get started. Another resource is raywenderlich.com. Ray Wenderlich was such a powerful resource to me coming up in my early days learning iOS development, with this very universally relatable, fun content: blog posts, books, and then they moved into video content as well. Great stuff, always updated. Whenever there's a new Swift framework or some new iOS tooling, whatever, it's on raywenderlich.com. And of course me, you know (laughs), huge, huge advocate. I've been working recently with O'Reilly to really provide more Swift presence and content on their online learning platform.
I've had a few live trainings on their platform, and they're really fun. If you want to spend a couple of hours with a live hands-on training on getting started with Swift, getting your hands dirty with iOS development itself, finding out all the nooks and crannies of Xcode and all the cool things that you can do beyond iOS development with Swift, I do a training with them every so often, and we're working on doing more things with them, hopefully more things with SwiftUI. Yeah. So there's more content out there than there has ever been before, but you want to make sure that you check the date on those things, so that you're learning the right things and you're on the right path.
Jeremy: [00:38:41] That seems really different for Swift than other languages, where you could bring up an article about Ruby that's four or five years old and a lot of it is still going to apply. But with Swift, it sounds like every year it's, Oh, we don't do it this way anymore.
Timirah: [00:38:58] Yeah. Yeah. Absolutely. Because Swift is still finding its groove, right? You know, despite the stability and all this stuff I'm hyping it up with, Swift is still young. Six years old is not a long time, and for it to be as progressive as it is, is weird in itself.
You know, I would have expected it to be anywhere between 6 to 10 years into iOS development before it moved on to, oh, now we're doing this, and we're on Windows, and use Swift with Linux. Like, before it spread its wings like that. But it just kind of took off.
And the popularity was just like a wildfire. So yeah, it's a really young language, so it's still finding its groove, and that is one of the cons too. People are like, oh, it's constantly changing, it's exhausting to keep up. But it keeps you on your toes. Like any language, you should always be on your toes, but with the more widely used, traditionally accepted languages, like JavaScript, you only have to worry about new frameworks, not so much changes to the language itself (laughs). So yeah, that is a downside. Some people are like, (sigh) you know, oh, we don't do it like this anymore. But Xcode and the Swift compiler itself do a great job of reminding you. It doesn't just tell you, this is wrong.
The warnings let you know what the alternative is to whatever it is that you're doing. They'll say, Hey, that was done in Swift 3, you probably meant X, Y, and Z. So it's very good at inference when it comes to that.
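A concrete example of that kind of warning: Swift 3 code counted a string's characters through a separate `characters` view, which later compilers flag as deprecated and offer a fix-it for, since `String` became a collection in its own right. (The exact warning text varies by compiler version.)

```swift
let title = "Hacking with Swift"

// Swift 3 style - a newer compiler warns something like:
//   'characters' is deprecated: Please use String directly
// let length = title.characters.count

// The replacement the compiler suggests: String is itself a
// collection now, so it can be counted directly.
let length = title.count
print(length)  // 18
```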
Jeremy: [00:40:45] So I guess that means if you have an application you've been building over the past few years, as Swift is making these changes, Xcode could actually tell you, here are all the things that you need to update for the new version of Swift. And I'm not sure, will it make the changes for you, or does it just tell you and then you still go ahead?
Timirah: [00:41:05] No, it lets you know. It's like, no, excuse me. (laughs) No, _you_ do it. So those are things that it'll say. It'll let you know before you run it: Hey, you need to change this. If you have an application you built with Swift 3 and you open it in the new version of Xcode, Xcode will give you, not necessarily errors that mean it won't run, but some things will be buggy, right? So it'll allow you to run it, because it is valid code, right?
But it will say, this is buggy, this is buggy, this has been deprecated so it's no longer there. And a lot of those things are deprecated because now they're built in, so it doesn't have to be programmed explicitly. Certain functions are now built in to the compiler itself.
So it's like, you know, you don't have to worry about this, you don't have to worry about that. That's automatic now when it comes to Swift 5. So yeah, it does a great job of letting you know that. It'll run, but you'll probably have some missing loops here and there.
But it will definitely give you those yellow warnings and suggestions, like, Hey, you should change this, this, this. And I do believe Xcode has an option where you can convert your code. It might pick the whole thing up and give you an alert and say, Hey, this is Swift 3, we noticed. Would you like to convert this to Swift 5? And there's an aspect where it can convert, I've seen, at least 90% of your code. So yeah, sometimes it'll just pop up and ask you, should you convert this?
Okay, press this button if you want this to be converted to the latest version of Swift, and then you have to go back and kind of fix some things, right? Yeah. So, like I said, Xcode does a great job of being that supervisor, and then the compiler itself is like the principal. (laughs) So yeah.
Jeremy: [00:43:29] Yeah, I think Swift is really in a unique place. You were talking about how you think it's finding its groove and it's getting to make all these changes, hopefully for the better, right? And I feel like, usually when a language is doing that, it's when not a lot of people are using it yet.
So you're making all these changes, but it's okay, because there aren't people with apps that hundreds of millions of people are using. But Swift is in this unique space where they're making all these changes, and because it is the language for iOS, you have millions of developers that are hitting the language hard and shipping apps.
So it's just a really unique position, I think, for Swift to be in.
Timirah: [00:44:16] Yes, a thousand percent, a thousand percent.
Jeremy: [00:44:20] Cool. So I think that's a good place to start wrapping up. Is there, anything else you wanted to mention or want to talk about where people can check out what you're up to?
Timirah: [00:44:30] I, first of all, I'm so glad that we, finally got this podcast in, for those of you listening, we've been trying to do this for like three months, trying to get this scheduled for three months. My life has been super busy, but I wanted to make sure that we, you know, got this chance to really talk and, and chat.
You know, Jeremy, you've been amazing. I love the podcast. So it was so important to me that I came on here and, you know, had this awesome conversation with you. About me, let's see. You can find me on Twitter at @TimirahJ. And like, oh, what's my Twitter? Oh, my name, TimirahJ. You can find me on Twitter.
Follow me on Twitter. mention me, I'll say, hi, I'll follow back. I do all of that. If you have any questions about, you know, getting started in iOS development, if you're starting your journey, if you want to learn more about, getting a job in iOS development, what the interview process is like.
If you wanna be involved in some of the things that I'm doing in terms of teaching iOS development or teaching with Swift, please feel free to reach out, I do have a Coursera course that is coming out very, very soon.
And yeah, I did this thing back when I first announced it where I was just like, oh, can you guess what language I'm going to do the course on? Because it's not Swift. But it's been a while. Coursera has changed the release date a couple of times.
So I said, you know what, I'm just going to tell everyone: it is actually on Flutter. Okay. And Flutter is a cross-platform framework that utilizes Dart, which is a language that was created by Google. That is something that I've gotten into within the past, like, year and a half. Flutter is like super fun.
You know, and it took me away from my native patterns and got me into cross-platform. And it's really fun. It's really great for people who are beginners in the cross-platform realm. And when we talk about how performant, like how fast, Swift is, of course there's some loss when it comes to cross-platform, in terms of performance and in terms of native tooling.
But Flutter is probably the closest that I've seen to getting that full-on experience, getting that reactive experience and getting a lot of the freedom, even down to things like animations. Just a lot of the cool things that you would be able to take advantage of with native mobile development.
Flutter really makes up for a lot of that in the cross-platform space. So Flutter is definitely the best bet. I'm going to be doing a Coursera course, or a series of Coursera courses, on Flutter, so look out for that. Shout out to Technigal LA, which is my meetup turned nonprofit for women of all ages who want to thrive in STEM. Anything from educational opportunities, employment opportunities, networking opportunities, we're doing it all virtually, but it's all free. So if you want to learn more about that, or if you want to get involved, if you want to come talk to some women, teach a workshop, or if you want to know where some other talented women are in STEM, you know, feel free to reach out. We are Technigal, so that's "technical" but with a G: T E C H N I G A L, and then LA, because I am based in Los Angeles. So yeah, you can check us out on Instagram and Twitter, or you can check us out on Meetup.
So yeah. Thank you so much. Jeremy, like this is, this is a lot of fun and I'm glad that we finally, finally, got a chance to, to get it in. Yeah, absolutely.
Jeremy: [00:48:26] Yeah, well, that means things are, things are happening. That's good.
Timirah: [00:48:29] Yeah. Yeah. It's a good thing to be busy. Yeah.
Jeremy: [00:48:35] I hope you enjoyed the conversation with Timirah. If you want to learn more about iOS development, we've got notes to all the resources we talked about in the show notes. The music in this episode was by Crystal Cola. Thanks again for listening.
Related Links:
This episode was originally posted on Software Engineering Radio.
Transcript
You can help edit this transcript on GitHub.
Jeremy: [00:00:00] Today I'm talking to Julie Lerman. She's a frequent conference speaker, an independent consultant and a Pluralsight course author. Today we're going to talk about her experience working on ORMs and Entity Framework.
Julie, welcome to software engineering radio.
Julie: [00:00:16] Hey, Jeremy. Thanks so much for having me.
Jeremy: [00:00:18] For those who aren't familiar with ORMs what are they and why do we need them?
Julie: [00:00:23] The first thing is the definition: ORM stands for object-relational mapper. It's a software idea, and an implementation in many respects, for taking the concepts that we describe as classes and transforming them into the structure of a relational database, so you can get the data in and out easily.
So the ORM takes care of that mapping and the transformation. So you don't have to worry about writing SQL. You don't have to worry about getting structured results from a database that are in rows and columns and then transforming them back into object instances. So the ORMs do that for you.
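As a rough sketch of what Julie is describing, here is roughly what that mapping looks like in Entity Framework Core terms. The `Customer` class, `ShopContext` name, and table layout are all illustrative, not from the episode:

```csharp
using Microsoft.EntityFrameworkCore;

// A plain class that the ORM maps to a Customers table.
public class Customer
{
    public int Id { get; set; }            // by convention, becomes the primary key
    public string Name { get; set; } = "";
}

public class ShopContext : DbContext
{
    public DbSet<Customer> Customers => Set<Customer>();
}

// The ORM writes the SQL and materializes objects for you:
// using var db = new ShopContext();
// db.Customers.Add(new Customer { Name = "Ada" });
// db.SaveChanges();                                          // ORM generates the INSERT
// Customer ada = db.Customers.Single(c => c.Name == "Ada");  // ORM generates the SELECT
```

The point is the one Julie makes: no SQL strings, and no hand-written code turning rows and columns back into `Customer` instances.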
Jeremy: [00:01:15] Some of the most popular languages used are object oriented. Why do we continue to store our data in a relational database when it's so different from our application code?
Julie: [00:01:26] There are certainly a lot of reasons for relational databases compared to document databases or object databases: relational databases for their structure and normalization and tuning. It's different kinds of storage needs, I think, than when you're storing documents. Sometimes I think more about getting the data back out. Finding data.
It's one thing to be able to store it in a way that makes sense. But then when you need to pull data together and create relationships that aren't naturally in those documents unless you have prepared and designed something like a document database for every possible scenario.
There's also the whole concept of event sourcing, where if you're just storing the events, then you can actually re-persist that data in different structures so that it's easier for the various ways you might access it. Also, relational databases have been around for a long time, and it's interesting that even with the popularity and extreme power of these other kinds of nonrelational databases, the NoSQL databases, relational databases are still, I can't say the source off the top of my head, but I've used it in conference slides (they update their data monthly), and I can see these graphs showing that still about 70% of systems out there are using relational databases. And in a lot of cases, there's so much legacy data too, right? That's all going to be in relational databases.
Jeremy: [00:03:18] Relational databases, they give us the ability to request just the data we need that matches a certain condition or we can combine different parts of our data and get just a specific thing that we need.
If we were dealing with a bunch of arrays of objects or documents it might be harder to query for a specific thing does that sort of match...?
Julie: [00:03:45] Yeah, especially that unplanned data access. And another interesting thing, and it's one of those things that I think you have to stop and think about to realize the truth in it, which is that in most systems, most of the work you're doing with the database is reading the database, much more than creating and storing data.
That's the bigger problem: getting the data back out and having the flexibility to do weird things. Whereas something like a document database, depending on how you split it up and keys and everything, it's just designed more up front. A relational database is a little easier to get at that weird stuff.
Jeremy: [00:04:37] You mentioned ad hoc queries. So with a relational database you can store all your information and then you can decide later what queries or information you want to get out, whereas you had hinted at with a document database you have to think upfront about what queries you want to make. Could you elaborate a little bit on that?
Julie: [00:05:02] I don't want to be generalizing too much, because there's all kinds of other factors that might come in especially when talking about huge amounts of data for like machine learning and things like that. There's pros and cons to both of them.
And, like I mentioned with event sourcing, there are ways to have your cake and eat it too. When I look, for example, at documentation for Azure Cosmos DB, that's the document database that I have the most familiarity with, one of the most important parts of the documentation is that upfront design. And that design bleeds into how you're modeling your data in your application so that you can store it fluidly into the document database. So it's really, really important. Now, I don't want to conflate this with the beauty of a system like a document database, how that aligns so much more nicely for persistence.
And most of your getting data in and getting data out when we're talking about domain driven design. Where you don't want to worry about your persistence, just focus on the domain and then we can figure out how to get the data in and out right? We've got patterns for that.
But it is interesting if something like a document database more naturally aligns with your objects right? So that flows nicely in. However, it's still really important to be considerate of how you're breaking up that data in the way you store it. What your graphs are going to be. Maybe it's easy to get it in this way. But then what about when you're getting it out? What about different perspectives that you have? I want to be really careful about, comparing and contrasting the document DB versus relational DB. I want to stop at some level of that because there's always going to be debate and argument about that.
However one of the nice things about an ORM is it does take away some of that pain. That we normally would have when we're writing applications and writing that data access layer where we have to say, okay, I want this. Now I have to figure out the SQL statement, or call a stored procedure, or however I'm going to do it.
Whether I'm getting data in or getting data out and then I have to transform my objects into the SQL right? Or I get data back. And then I have to read through that data and instantiate objects and stuff, all the values into those objects. That's the stuff that the ORM can...
Repetitive. Like bull (bleep), right? Writing all that stuff. So that's what the ORMs do. I think the next big question that comes up then is: yeah, but my SQL is always going to be better than some generic thing that an ORM is going to build for me, right? The general advice for that is that in many cases, with what the ORM does, the performance will be just perfectly fine. And then there are going to be the cases where the performance isn't good. So then you can use your own stored procedures, use your views.
But a lot of the ORMs, certainly the one I focus on, which is Microsoft's Entity Framework, have the ability to also interact with stored procedures and views, and do mappings to views and things like that. People don't realize that, right? But maybe 80% of the SQL that Entity Framework is going to write, whether it's for queries or for pushing data into the database, maybe 80% of that is perfectly fine. And then for the other 20%, do performance testing, go, yeah, I can do better, and fix those up.
Jeremy: [00:09:27] I want to step back a little and walk through some of the features of ORMs that you had mentioned. Uh, one of them was not necessarily having to write SQL so you have some alternative query language that the ORM is providing. So what are some of the benefits of, that?
Like why would somebody not just jump straight to SQL and would instead use this SQL-like query language?
Julie: [00:09:57] Well, again, I'd have to kind of focus on .NET, and I know you do a little bit of .NET programming, so you're probably familiar with LINQ, the query language that's built into C# and the .NET languages. It's a much simpler expression than SQL, and it's much more code-like. If you were writing a SQL string to put into your application (we want to use stored procedures and views, but let's say you're just writing raw SQL), you're writing a string, and you're crossing your fingers that you got it right, right? That there are no typos, but also that you've written it correctly. But if you're using something like LINQ, which is part of the language, then you've got all the advantages of the language syntax and IntelliSense: here's my object, what are my methods that are available, all popping up.
You don't have to worry about magic strings. I think that's really key, and you don't really have to know SQL concepts. So with Entity Framework, at the very beginning, the team actually came out of Microsoft Research, and the team that had created it started with a SQL-like language. But then LINQ was also created at Microsoft, by the C# team. LINQ was originally created for objects, so you can use it for writing queries just across your objects. I wouldn't say LINQ evolved, but a LINQ version that applied to Entity Framework was created.
And there was also LINQ to SQL which was a much lighter weight ORM that was also created within Microsoft, so you're using almost the same LINQ. A lot of things are shared some are more specific to entity framework.
There's another benefit. I use LINQ for all kinds of things in my code, not just for the database interaction with Entity Framework, I use LINQ just when I'm working with arrays. Working with data in objects in memory. So I've also got this common syntax. I already know how to use it. There's just some special features of it that are specific to using entity framework.
So there are all kinds of benefits, and the best benefit was that it enabled me to essentially forget SQL. I've forgotten how to write T-SQL. If I have to write raw T-SQL, I have to Google it after all these years of coding, but I don't mind.
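To make the contrast Julie describes concrete, here is a small hedged sketch. The `ShopContext`, `Customers` set, and `City` property are assumptions for illustration, not something from the episode:

```csharp
using System.Linq;

// With a raw SQL string, typos only surface at runtime:
// var sql = "SELECT * FROM Customers WHERE City = 'Portland' ORDER BY Name";

// With LINQ, the compiler and IntelliSense check the query as you type
// (assumes a ShopContext with a Customers DbSet whose entity has a City property):
using var db = new ShopContext();
var portland = db.Customers
    .Where(c => c.City == "Portland")
    .OrderBy(c => c.Name)
    .ToList();

// And, as Julie notes, the same operators work on plain in-memory collections:
string[] people = { "Julie", "Jeremy", "Luca" };
var js = people.Where(n => n.StartsWith("J")).ToList();
```

The second query never touches a database, which is the "common syntax" benefit: one set of operators for arrays, in-memory objects, and Entity Framework queries.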
Jeremy: [00:12:39] With LINQ this was a query language that was already built into C# and also VB.NET. All of the languages that run on the .NET runtime, and so the developer may already be familiar with this query language. And they can apply it to making queries to a database rather than having to learn SQL which would be an entirely new language to the developer.
Julie: [00:13:19] Yes. And not just that, but I don't have to learn T-SQL (Microsoft SQL Server) and PL/SQL (Oracle) and PL/pgSQL (Postgres). There are all kinds of different database providers for Entity Framework, and the same for other ORMs. I feel a little bad just focusing on EF, but that's my area of expertise.
But the providers are the ones that do that transformation. So entity framework iterates over the query a few times and creates a query tree. And then the query tree is passed on to the provider, which then does the rest of the transformation into its particular SQL. So, I actually go back and forth between SQL server, SQLite and PostgreSQL. SQL server's what I'm most familiar with but I can use SQLite and Postgres and there's a lot of generic, common stuff right?
So as a matter of fact, the way LINQ for Entity Framework is built, that's just one thing, and then the providers figure it out. So if you need something that's really specific to SQL Server that doesn't work in Postgres, or something in Postgres that doesn't work in SQLite, there might be an extension that lets you tap into that particular thing. Or call a stored procedure. Except you wouldn't do that in SQLite, because it doesn't have stored procedures.
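Swapping providers the way Julie describes is, roughly, a one-line change in EF Core. A hedged sketch; the connection strings are placeholders, and `UseNpgsql` assumes the Npgsql EF Core provider package is installed:

```csharp
using Microsoft.EntityFrameworkCore;

public class ShopContext : DbContext
{
    // The model and your LINQ queries stay the same; only the provider,
    // and therefore the SQL dialect it generates, changes.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
    {
        options.UseSqlServer("Server=.;Database=Shop");   // generates T-SQL
        // options.UseSqlite("Data Source=shop.db");      // generates SQLite SQL
        // options.UseNpgsql("Host=localhost;Database=shop"); // generates PostgreSQL SQL
    }
}
```

Each provider receives the query tree Julie mentions and translates it into its own dialect.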
Interestingly tapping back to talking about document databases. There is a provider for Azure CosmosDB. So we've got an object, relational mapper. That maps to a non-relational database. So the reason that exists is because there was a lot of people that said I really am familiar with entity framework. I've been using it for years. We're going to be using CosmosDB. Can I just use that? So that exists.
And Azure has a SQL, I'll just say SQL-like, language for accessing it. There are other APIs that are used also, it's not all SQL, but because it has the SQL construct, the providers kind of already know how to build SQL.
So it's just: we'll build the SQL that works for Azure. It's not solving the problem that we have with relational databases, which is that we've got this rows-and-columns structure and we need to do this translation all the time. The problem that provider is solving, by linking Entity Framework and the document database, is: I already really know how to use Entity Framework.
And here I am just, I'm just doing some queries and persisting some data. Why do I have to learn a whole new API?
Jeremy: [00:16:27] You're saying that even with a document store, you might be able to use the same query language that's built into your ORM. With a relational database you have joins and things like that allow you to connect all your tables together and do all these different types of queries.
Whereas with a document database, I usually associate those with not having joins. Is that also the case with this document database that you're referring to with, with CosmosDB?
Julie: [00:17:02] I've written only tiny, minimal SQL for accessing Cosmos DB (it used to be called Document DB) directly. I can't remember off the top of my head how that SQL works out if you've got to pull multiple documents together. I don't know. But what's nice is that in your code, you don't have to worry about that stuff. Oh God, inner joins and outer joins, oh my God, I can't write that stuff anyway. So LINQ makes sense to me. I know it really well, so I don't have to worry about that.
Jeremy: [00:17:43] It sounds like, at a high level, you get this query language that's built into the language and is more similar to the other code you would write. And you also get these compile-time guarantees, auto-completion, what things you're allowed to call, versus SQL, where, like you were saying, you have this giant string you hope you got right, and you don't really know until you query. So there are a lot of benefits with an ORM. You were also talking about how, when you don't have an ORM, you have to write your own mapping: after you've done your SQL query, you get some data that's in the form of just a set of records, and you have to convert those into the objects that are native to your application. So basically the ORM is taking care of a lot of the, you could say, repetitive or busy work that you would normally have to do.
Julie: [00:18:47] To be fair, there are mappers, programs or APIs that are mappers, like AutoMapper for .NET. So with the ORM I don't have to worry about the connection string, creating connections, and all that kind of stuff either. But with something like AutoMapper, you can have a data model that really matches your database schema, and you can have a domain model that matches the business problem you're trying to solve. And then you can use something like AutoMapper to say, okay, here's what customer looks like in my application, and here's what customer looks like in this other data model that's going to go to my data, that matches my database really easily.
And AutoMapper lets you say: this goes with this, this goes with that, blah, blah, blah. So yeah, that stuff does exist. And there are also other, lighter ORMs. Entity Framework is a big thing; it does a lot. There are certainly lighter-weight ORMs that don't do as much, that don't take away as much of that work.
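The "this goes with this" configuration Julie mentions looks roughly like the following AutoMapper sketch. The `Customer`/`CustomerRow` classes and their property names are invented for illustration:

```csharp
using AutoMapper;

// Domain class, shaped for the business problem:
public class Customer { public string FullName { get; set; } = ""; }

// Data class, shaped to match a hypothetical database schema:
public class CustomerRow { public string full_name { get; set; } = ""; }

// Configure the "this goes with this" pairs once, then reuse the mapper.
var config = new MapperConfiguration(cfg =>
    cfg.CreateMap<Customer, CustomerRow>()
       .ForMember(r => r.full_name, opt => opt.MapFrom(c => c.FullName)));

IMapper mapper = config.CreateMapper();
CustomerRow row = mapper.Map<CustomerRow>(new Customer { FullName = "Julie" });
```

Unlike a full ORM, this only handles the object-to-object translation; connections and SQL are still your problem.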
However, if you need to squeeze every little millisecond out of your performance, there's an open-source ORM called Dapper, and it was built by the team that writes Stack Overflow, because they definitely needed to squeeze every millisecond of performance out of their data access. There was no question about that.
So Dapper takes care of some of the stuff, some of that creating the connections and materializing objects for you, but you still write your own SQL, because they needed to tightly control the SQL to make sure. Because it's more direct, it doesn't have lots of things happening in between like Entity Framework does. And there are lots of dials in Entity Framework where you can impact how much work it's doing and how much resources it's using.
Jeremy: [00:21:09] So my understanding with, whether you would use entity framework versus one of these smaller ORMs, or more basic mapper is, it sounded like performance. And so it sounds like maybe you would default to using something like entity framework and then only dropping down to something like dapper, if you were having performance issues. Is, is that where your perspective comes from?
Julie: [00:21:41] I'm not sure I would wait until I'm having performance issues. I think people sometimes will make that decision in advance. But an interesting plan of attack, so that you can benefit from the simplicity of using Entity Framework and not have to write the SQL, et cetera, is, for example, using a CQRS pattern.
So you're separating the commands and the queries: separating pushing data into the database from the queries. Because it's usually the queries where you need to really, really focus on the performance and worry about the performance. Some people will use CQRS and let Entity Framework take care of all the writing and let Dapper take care of all the reading.
So they're designing their application, not the business logic, but the layer of the application that's taking care of the persistence, so that those are completely separate. So: I'm writing, so I'm going to spin up an Entity Framework context and say go ahead and save changes. Or: I'm doing a query, and I'm writing my own SQL because I trust it, so I'm going to spin up Dapper, execute that SQL, and let it create my objects for me.
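A minimal sketch of that split, under stated assumptions: `ShopContext`, `Order`, `OrderSummary`, and `connectionString` are all hypothetical names, and the read side assumes the Dapper and Microsoft.Data.SqlClient packages:

```csharp
using Dapper;
using Microsoft.Data.SqlClient;

// Write side: spin up an EF context and let it generate the INSERT/UPDATE.
using (var db = new ShopContext())
{
    db.Orders.Add(new Order { CustomerId = 42, Total = 99.95m });
    db.SaveChanges();
}

// Read side: hand-tuned SQL through Dapper, which still materializes objects.
using (var conn = new SqlConnection(connectionString))
{
    var summaries = conn.Query<OrderSummary>(
        @"SELECT o.Id, c.Name, o.Total
          FROM Orders o JOIN Customers c ON c.Id = o.CustomerId");
}
```

The design choice is the one Julie describes: the two sides never mix, so swapping the read side later isn't traumatic.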
Jeremy: [00:23:10] It sounds like it might be better to decide that up front figuring out, what is the amount of load I'm going to get in my application. What are the types of queries I'm going to be using for writing and reading data? And then from there, figuring out, should I use dapper? Should I use entity framework?
Julie: [00:23:35] Yeah, and there's some amount of testing and exploration, you would need to do proof of concept stuff. I think you might want to do first because I think entity framework query performance surprises people a lot. Like it's a lot better than people presume it's going to be.
So yeah, I would definitely do some proof of concept before going that route cause it's certainly a lot easier to just use one thing all the way through.
However, if you've got things broken up, applying separation of concerns, then it's not as traumatic to say: for this bit, it's running a little slowly, let's just use Dapper. If it's separated out, then it's easier to just change over to that bit in your infrastructure.
Jeremy: [00:24:30] Are there workloads in a general sense that you would say that entity framework performs worse than others? For example, An app with a heavy write workload. Would that be an issue in entity framework? Are there examples of specific types of applications?
Julie: [00:24:52] I don't know if it's workload so much as structure, you know, schema, or what it is you're trying to get at. Sometimes it's how your model is designed, right? There are just some places where Entity Framework is just not going to be able to do something as well as your own SQL. I had an example; this was years ago, so Entity Framework was not in its infancy, but it was like EF4. I had a client who had gone down kind of a not-great path of implementing Entity Framework in their software, which was used across a big enterprise. They had created one big, huge model for everybody to use.
So that one model was responsible no matter what you were doing; you had to use that model. Just pulling that model into memory was a lot of work for Entity Framework, and it created a lot of problems for the teams who were stuck using it. Because of the way they had set it up, they had this common query that would only return one row. They were looking for one row, but because of the way they had designed this model, Entity Framework had to jump through all kinds of hoops to create the SQL, to get at this one piece of data, to write this query. And in Oracle it was taking two minutes to execute. And they're like, what do we do? And I looked at it and I said, just write a stored procedure. It's too late to change this model, because everything's dependent on it.
So they wrote a stored procedure, and the stored procedure took nine milliseconds. They were like, okay, we can pay you for the whole three days; you can go home now if you want. That was worth it.
Jeremy: [00:26:46] And so it sounds like that's not so much the choice of not using entity framework as it is-- you're still using entity framework, but you're using it to make a request to a stored procedure in the database. Similarly to how you could probably send raw SQL from entity framework.
Julie: [00:27:05] Yes. So you can still take advantage of that. So we're still using entity framework and saying, Hey, I don't want you to write the SQL, just execute this for me. And so that's still taking advantage of the fact that it knows it's SQL Server. It knows what the connection string is, et cetera.
It's just not writing the query. And we can take advantage of views in the same way: we can actually map our objects to views, so we're still executing the view, and it's going to still materialize the data for us, which is really nice.
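In EF Core terms, the two things Julie describes look roughly like this. `GetCustomerById`, `OrderSummary`, and the view name are hypothetical:

```csharp
// Execute a stored procedure, but let EF materialize the entities
// (GetCustomerById is a hypothetical procedure):
var customer = db.Customers
    .FromSqlInterpolated($"EXEC GetCustomerById {id}")
    .AsEnumerable()
    .Single();

// Map a keyless entity to a database view; you query it with LINQ,
// and EF knows not to write to it:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<OrderSummary>()
        .HasNoKey()
        .ToView("vw_OrderSummaries");
}
```

Either way, EF still knows the provider and the connection; it just isn't generating the SQL itself.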
Jeremy: [00:27:44] Yeah. So when you were talking about deciding whether or not to use entity framework, when you use entity framework, you could still be using a mix of the LINQ query language. Maybe raw SQL commands. Like you said, views, stored procedures. It's not this binary choice where you're saying, I'm going to write everything myself or I'm going to leave everything to Entity Framework.
Julie: [00:28:11] Absolutely. I mean, that's the way we approach programming overall, right? I've got all of these wonderful arrows in my quiver to choose from, and I'm not stuck with one. I'm just leveraging my knowledge and my experience and making a decision: hey, right tool for the job, right? That's what we've been saying for decades, not just in programming, using the right tool for the job. Just because you want to use Entity Framework doesn't mean you have to use it all the way through. It's like using C# for this, and maybe I'll use NodeJS for that, or maybe I've got some really complex mathematical thing to do, and for that particular thing I'll use F# or some other functional programming language.
Jeremy: [00:29:08] So we did a show back in 2007 about ORMs and I know that you've had a lot of experience both working with ORMs and probably working with applications without them just writing SQL directly. And so I would imagine there's been a lot of changes in the last decade, or maybe even couple decades.
What are some of the big things that you think have changed with ORMs in the last, 10 or 20 years?
Julie: [00:29:40] I think something we've been talking about, which is [that] NoSQL databases have reduced the need for ORMs, right? Just slice that right off the top. There are so many scenarios where people don't need relational databases to solve their problem. So this is interesting, right? Because I'm not sure that the answer is about what's happened with the ORMs. I think the answer is more what's evolved in the way we persist data and the way we perceive data. Machine learning, right? The idea of event sourcing, right? Talk about right tool for the job: here's my data, but for this scenario it's going to be more effective and efficient for me if that data is stored in relational form; for this scenario it'd be more effective if it's stored as documents; for this scenario it'd be more effective if it's text, like just a text file. Oh, I forgot XML. Just kidding. Or JSON, right?
I think that's what's impacted ORMs [more] than the ORMs themselves. And Entity Framework is still going strong and evolving, and in very, very wide use in the .NET space. Some of the ORMs that were more popular are not really around anymore, or haven't been updated in a while.
Dapper gets its love, of course, from the Stack Overflow team. So I really think it's more about the change in how we store data that's reduced people's need for ORMs. But still, whatever that resource was, 70% of applications are still using relational.
And then there are the people who, no matter what, will never use an ORM. They will always write their own SQL. They have beautiful stored procedures and views and finely tuned databases, and they'll write the SQL for you, these people. Just don't ask me to do it, because I'll probably never write SQL that's as good as what Entity Framework, with its algorithms and command trees, can figure out on my behalf. But there are plenty of database professionals who can definitely do that.
Jeremy: [00:32:37] For somebody who is working with a relational database and let's say they worked with entity framework or they worked with hibernate, or any number of ORMs a decade ago.. Is there anything that's changed where you would say, Hey, this is different. Maybe you should give this a second look?
Julie: [00:32:59] I apologize for not being able to speak to Hibernate and how that has evolved, but when Entity Framework first came out, and this was very different from NHibernate, the .NET version of Hibernate, Entity Framework did not have any appreciation or respect for our objects.
It was really designed from the perspective of storing relational data, and it wasn't testable. It just didn't have any patterns. Interestingly, I've spent a lot of time researching how Entity Framework aligns with good model design, based on how we design domain-driven design aggregates and some of the things that we put in place to protect our entities from people doing bad things with them. Entity Framework at the beginning did not give us any way to do that. It didn't just lock your data access into Entity Framework.
It locked your entire application in. All your entities and your classes had to be completely tied to Entity Framework in order for Entity Framework to work. Now that's all decoupled, right? Entity Framework also had this big visual modeler, where in order to create the model, you had to start with an existing database.
Now it's like: look, here are my classes. I don't want to design a database; I'm designing software, right? I'm solving business problems. I have my domain model. So you build your domain model and then say, hey, Entity Framework, here's my domain model, here's my database. Oh, and by the way, I know that by default you're going to want this to go there and this to go there.
But I need to tell you there are a few things I want you to do a little differently. So we have all of that, and Entity Framework has nothing to do with our domain logic. So it's gotten better and better and more respectful of our code, our domain logic and our business logic, staying out of the way.
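A small sketch of what "tell Entity Framework to do a few things differently" can look like in EF Core, while the domain class stays persistence-ignorant. The class, table name, and length are illustrative:

```csharp
// The domain class: no EF base class, no persistence attributes;
// even private setters are fine.
public class Customer
{
    public int Id { get; private set; }
    public string Name { get; private set; } = "";

    private Customer() { }                        // used by EF when materializing
    public Customer(string name) => Name = name;
}

// The "few things I want you to do differently" live in the mapping,
// inside the DbContext, not in the domain class:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Customer>(e =>
    {
        e.ToTable("tbl_customers");               // override the default table name
        e.Property(c => c.Name).HasMaxLength(100);
    });
}
```

Nothing in `Customer` knows that a database exists, which is the domain-driven-design point Julie makes next.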
And that's been a big deal for me as somebody focused on domain driven design, which is: as I'm building my software, I do not want to think about my data persistence, right? It has nothing to do with my domain. I like to joke, unless I'm building the next Dropbox competitor, right? So data persistence has nothing to do with my domain.
So as each iteration of Entity Framework comes out, I look and do some experiments and research to see how well its defaults understand or map correctly, like get my data in and out of my database without me having to tweak Entity Framework's mappings, but more importantly, without me having to do anything to my domain classes in order to satisfy Entity Framework. And with each iteration it's gotten better and better and better, which is nice.
Now, one bit of pushback I had in the early days when I was talking about Entity Framework and DDD was: well, you're mapping your domain directly to the database using Entity Framework. You shouldn't do that. You should have a data model and map that. But what's interesting is Entity Framework becomes the data model, right? The mappings of Entity Framework become the data model.
So I'm not doing it directly. That is the data model. So it is solving that problem. It's giving me that separation. And there are still some design, some aggregate patterns, that Entity Framework, for example, won't satisfy in terms of the mappings, or maybe just some business problems or storage problems that you have. There are sometimes when it just makes sense to have something in between, even if Entity Framework is still in between that, like I have a specific data model, because it just handles it differently. So you don't have to change your domain. I'm very passionate about that. Don't change the domain to make Entity Framework happy. Never, never, never, never.
Jeremy: [00:37:45] Could you give a specific example, from before, of the objects you would have to create in order to satisfy Entity Framework? What were you putting into your objects that you didn't want to put in?
Julie: [00:38:01] That I didn't want to? Well, one of the things that Entity Framework's mappings could not handle is if you wanted to completely isolate or encapsulate a collection in your entity. In your class, if you wanted to completely encapsulate the collection, Entity Framework couldn't see it. Entity Framework only understood, for example, full collections. But maybe you want to make it so that you can only iterate through it and enumerate through it. So that's what you would do: in your class, you would say, oh, I'm going to make it an IEnumerable so nobody can just add things to it and go around my business logic. Which is, if you want to add something, you have to go over to this business logic to make sure all your invariants are satisfied, or whatever else needs to happen.
So Entity Framework literally could not see that. And so, yeah, we struggled with finding ways to work around that, adding stuff, or just giving it up and saying, okay, I'm exposing this here, please don't ever use it that way. But you know, you have to tell the whole team: don't ever use it that way.
Right. But with Entity Framework Core 3 they changed that. So you can write your class the way you want to, and you can use the patterns that we use to encapsulate collections and protect them from abuse and misuse, and Entity Framework is still able to discover them and understand how to do all the work that it needs to do.
So that's just an example.
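The encapsulated-collection pattern being described might look something like this in C# (a sketch with made-up entity names; EF Core 3 and later can discover and populate the private backing field by convention):

```csharp
using System;
using System.Collections.Generic;

public class LineItem
{
    public string Product { get; set; }
    public int Quantity { get; set; }
}

public class Order
{
    // The real collection lives in a private backing field.
    private readonly List<LineItem> _lineItems = new List<LineItem>();

    // Callers can only enumerate; there is no Add or Remove
    // to go around the business logic.
    public IEnumerable<LineItem> LineItems => _lineItems;

    // The only way to add an item is through a method that
    // enforces the invariants first.
    public void AddLineItem(LineItem item)
    {
        if (item.Quantity <= 0)
            throw new InvalidOperationException("Quantity must be positive.");
        _lineItems.Add(item);
    }
}
```

Earlier versions of Entity Framework could not map a collection exposed only as IEnumerable, which is the limitation discussed above.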
Jeremy: [00:39:50] What you wanted to do in your domain object is make this collection be an IEnumerable, which makes it so that the user of that class is not allowed to add items to it?
Julie: [00:40:03] Right. Developers using it. Add or remove
Jeremy: [00:40:08] right. Add or remove. And, uh, so you wanted to be able to control the API for the developers. Say that you want to receive this object and you can, you can see what's in it, but you can't add or remove from it.
Julie: [00:40:24] Well, you can, but not directly through that property. Right. You'd have to use a method that has the controls. I refer to it as a ADD DD ADD driven design, right? Because I am a control freak. I want to control how anybody's using my API.
Jeremy: [00:40:44] You wanted to control how people use the API, but if you did that within your object, then Entity Framework would look at that IEnumerable and say, I don't know what this is, so I can't map this.
Julie: [00:40:56] It just completely ignores its existence.
So that was one evolution of Entity Framework that you can do that now. And all kinds of other things.
Jeremy: [00:41:09] Are there any other specific examples you can think of, of things that used to be either difficult to do or might frustrate people with entity framework [that] are addressed now?
Julie: [00:41:20] Along those same lines of encapsulating properties, right? So encapsulated collections was really, really hard. Like, it wasn't hard, it was impossible. But if you wanted to either encapsulate properties, or not even expose properties at all, you just wanted to have private fields.
You couldn't do anything like that either. But now, with the more modern EF Core, you could literally have an object, a type, defined with fields, no properties exposing those fields outside of the API, but you can still persist data. Entity Framework is still aware of them. Or, depending on how locked down they are, like if you made them totally private, you'll have to tweak the configuration in Entity Framework. You have to tell Entity Framework: hey, this field here, I want you to pay attention to it. So there are some conventions where it will automatically recognize it.
There are some ways of designing your type where Entity Framework won't see it, but you can override that and tell Entity Framework about it. So I think that's a really interesting thing, because there are a lot of scenarios where people want to design their objects that way. And you know, again, so much of that is about protecting your API from misuse or abuse, but not putting the onus on the users of your API, right?
The API will let you do what you're allowed to do, and it won't let you do what you're not allowed to do.
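A sketch of what that configuration can look like in EF Core (the entity and field names here are hypothetical; `HasField` is the fluent API call for pointing EF Core at a backing field):

```csharp
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Tell EF Core to read and write the private _name field directly,
    // instead of going through the Name property.
    modelBuilder.Entity<Customer>()
        .Property(c => c.Name)
        .HasField("_name");
}
```

By convention EF Core will find fields named `_name`, `_Name`, or `m_name` for a `Name` property on its own; the explicit configuration is for cases that fall outside those conventions.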
Jeremy: [00:43:04] You had mentioned earlier another big difference is that it used to be, that you would create your database by writing SQL, and then you would point entity framework to that to generate your models?
Julie: [00:43:17] Oh, yes. The very first iteration, that was the only way it knew how to build a model. Oh, wait, there was a way to build the model, a data model, in the designer. That's right. But then everything's dependent on that model, or on the database, instead of: no, I want to write my domain logic first. Now with Entity Framework... not everybody has a brand new project, right? But say you get a brand new project and there is no existing database. You can ask Entity Framework to generate the database based on the combination of its assumptions from reading your types and the extra configurations you add to Entity Framework, and create the database.
So I hear all the DBAs going: no, no, no, no, no, that's not how it works. But this is the simple path. So first you can have it create the database, and then as you make changes to your domain, Entity Framework has this whole migrations API that can figure out what has changed about your domain, and then figure out what SQL needs to be applied to the database in order to make those changes. So that's one path. A twist on that path, of course, is that instead of having those migrations create the database or update the database, you can just have it create SQL and hand it to the professionals and say, this is sort of what I need.
If you can do this, then just make it the way you want it to be, as long as the mappings still work out. The other way is if you don't have a greenfield application and you do have a database already: you can reverse engineer, just like we did at the very beginning, but it's a little different.
You can reverse engineer into a data model, which is your classes and a context, the Entity Framework configuration. So the classes, I refer to those as a stake in the ground, because then you can just take those classes and make them what you want them to be, and then just make sure that you're still configuring the mappings so everything lines up. So there are two ways to go still. But at the beginning... I forgot, even, that the designer, the entity data model designer, had a way to design a model visually and then create both the database and the classes from it. But that went away really quickly.
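The two directions described here roughly correspond to the EF Core command-line tooling; the commands below are a sketch, with placeholder migration name, connection string, and database:

```shell
# Model-first: record a model change as a migration, then apply it
dotnet ef migrations add AddCustomerEmail
dotnet ef database update

# Or generate the SQL instead, to hand to the database professionals
dotnet ef migrations script

# Database-first: reverse engineer entity classes and a DbContext
dotnet ef dbcontext scaffold \
    "Server=.;Database=Shop;Trusted_Connection=True;" \
    Microsoft.EntityFrameworkCore.SqlServer
```

The scaffolded classes are the "stake in the ground": generated once, then owned and reshaped by the developer as long as the mappings still line up.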
So it was essentially: point to the database, the whole huge database, and create one whole huge data model from it. And that's how everything would work.
Now, A) we don't do that, and also we still don't have one big, huge data model, right? Separation of concerns again. So we have a data model for this area of work and a data model for that area of work, and they might even point back to their own databases, right? And then you start talking about microservices, et cetera.
Jeremy: [00:46:35] Yeah. So if I understand correctly, it used to be where you would either use this GUI tool to build out your entities or your database, or you would point Entity Framework at an existing database that already had tables and
Julie: [00:46:53] And all of that UI stuff is gone. Some third party providers make some UI stuff, but as far as Entity Framework is concerned, you know, Microsoft is concerned, we've got our classes and we've got our database. There's no UI in between.
Jeremy: [00:47:12] It sounds like the issue with that approach was that it would generate your domain models, generate your classes, but it would do it in a way that probably had a whole bunch of entity framework specific information in those classes?
Julie: [00:47:30] In your classes, yeah. So many dependencies in your classes. Now, none of that. They started moving away from that in the second version of Entity Framework, which was called EF4, not 2,
because it was aligning with .NET 4. But now we've got EF Core, which came along with .NET Core, which is cross platform and lightweight and open source.
Jeremy: [00:48:00] So it sounds like we're moving in the direction from having these domain models. These classes have all this entity framework specific stuff to creating just basic or normal classes. Like the kind of code you would write if you weren't thinking about entity framework or weren't thinking about persistence specific issues.
And now with current ORMs, like entity framework core, those are able to map to your database, without, I guess the term could be polluting all your classes.
Julie: [00:48:38] And to be fair, Entity Framework was the evildoer at the beginning of Entity Framework, right? Hibernate didn't do that, NHibernate didn't do that. They totally respected your domain classes, and it was completely separate. There was a big, well, I'll just say a big stink made by the people who had been using these kinds of practices already and were using tools like Hibernate. And Entity Framework came around and said: oh, here's the new ORM on the block from Microsoft, everybody needs to use it. And they looked at it and they're like: but, but, but... because Entity Framework was inserting its logic into the domain classes.
And the Entity Framework team got a lot of big lessons from the community that was really aware of that. And it wasn't right. It was all new to me. I had never used an ORM before, so I was like, la di da, this is cool, this is nice. And all these other people were going: this is terrible, why is Microsoft doing this? I don't understand. So eventually it evolved, and I learned so much. I learned so much because of them. I mean, these are really smart people I have a lot of respect for. So I was like, okay, they obviously know what they're talking about, so I've got to figure that out. That's how I ended up on the path of learning about DDD. Thanks to them.
Jeremy: [00:50:09] So it sounds like there have been three versions of Entity Framework. There's the initial version that tried to mix all these things into your domain models. Then there was the second version, which you said was called Entity Framework 4, that fixed a lot of those issues. And now we have Entity Framework Core.
I wonder if you could explain a little bit about why a third version was created, and what the key differences are.
Julie: [00:50:40] Yeah. So this was when Microsoft had taken the .NET Framework and rewritten it to be cross platform and open source and use modern software practices. And at the same time, the same thing was done with Entity Framework. They kept many of the same concepts, but they rewrote it, because the code base was like 10 years old and also just in old thinking.
So they rewrote it from scratch, keeping a lot of the same concepts, so people who were familiar, you know, who'd been using Entity Framework, wouldn't be totally thrown for a loop. And they rewrote it with modern software practices, a much more open API in terms of flexibility, and open source.
And because it was also part of .NET Core, cross-platform. So that was huge. And then they've been building on top of that. So I think it's interesting how you say there were three. So there was the first iteration of Entity Framework, and then this next one, where they really honored separation of concerns.
So that was EF4 through EF6, and then they kind of started fresh with EF Core, and that went EF Core 1, and then 2 and 3, and no 4, and now it's 5. So, skipping a number again. So EF6 is still around, because there's millions of applications that are using it, and they're not totally ignoring it.
As a matter of fact, they brought EF6 on top of .NET Core, so EF6 is now cross-platform. But it's still in so many applications, and they've been doing little tweaks to it, but EF Core is where all the work is going now.
Jeremy: [00:52:39] And it sounds like Core is not so much a dramatically different API for developers, but is rather an opportunity for them to clean up a lot of technical debt and, provide that cross platform capability, at least initially.
Julie: [00:52:57] And enable it to go forward and do more, to be able to achieve more functionality that people have wanted, but that just literally wasn't possible with the earlier stack.
Jeremy: [00:53:12] Do you have any examples of a big thing that people had wanted for a while? They were finally able to do?
Julie: [00:53:19] And this is really, really specific now, because this is about the newest version of Entity Framework, EF Core 5, that's coming out. One thing that people have been asking for since the beginning of EF time was, when you're eager loading data... So there's a way of bringing related data back in one query.
There's actually two ways, but there's one way, a method called Include. So Include is so that you can bring back graphs of data, instead of, you know, go get the customers, now go get their orders, and now we need to merge them together in memory in our objects. The Include method says: please get me the customers and their orders. And Include transforms that into like a join query or something like that, brings back all of that different data, and then builds the objects and makes sure they're connected to each other in memory.
That's a really important thing to be able to do. The problem is, it was all or nothing. Whatever related data you were bringing back in that Include method, you would just get every single one. So, you know, if you said customer, include their orders, there was no way to avoid getting every single one of their orders for the last 267 years.
Jeremy: [00:54:42] You're able to do an include, so you're able to join a customer to their orders, but in addition to that also have some kind of filter, like a where.
Julie: [00:54:52] On that child data, I'll say quote, you know, I'm doing air quotes, that child data, that related data. So we could never do that before. So we had to come up with other ways to do that.
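The filtered include that arrived in EF Core 5 looks roughly like this (a sketch; the entity and property names are made up for illustration):

```csharp
// Before EF Core 5: Include was all or nothing.
// Every order for every customer comes back.
var all = context.Customers
    .Include(c => c.Orders)
    .ToList();

// EF Core 5: a Where inside the Include filters the related data,
// and the filter is translated into the generated SQL.
var cutoff = DateTime.UtcNow.AddYears(-1);
var recent = context.Customers
    .Include(c => c.Orders.Where(o => o.PlacedOn >= cutoff))
    .ToList();
```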
Jeremy: [00:55:04] Yeah. So it was, it was possible before, but the way you did it was maybe complicated or convoluted.
Julie: [00:55:12] And it writes good SQL for it too, better than me. Maybe not better than, many of my really good DBA friends, but definitely better than I would.
Jeremy: [00:55:25] Probably be better than the majority of developers who know a little bit of SQL, but, uh, are definitely
Julie: [00:55:33] Not DBAs yeah. I mean, that's one of the beauties of a good ORM, right?
Jeremy: [00:55:39] I guess it, it, um, doesn't quite get you to being a DBA, but it, um, levels the playing
Julie: [00:55:46] Well, I can hear my DBA friends cringing at that statement, so I will not agree with it. It doesn't level the playing field, but what it does is enable many developers to write pretty decent SQL. How's that, my DBA friends?
Jeremy: [00:56:10] I guess that's an interesting question in and of itself is that we're using these tools like ORMs that are generating all this code for us and we as a developer may not understand the intricacies of what it's doing and the SQL it's writing. So I wonder from your perspective, how much should developers know about the queries that are being generated and about SQL in general, when you're working with an ORM?
Julie: [00:56:42] I think it's important, if you don't have access to a data professional, to at least understand how to profile and how to recognize where performance could be better, and then go seek a professional. But yeah, profiling is really important. So if you've got a team where you do have a DBA, somebody who's really good with it and understands it, they can be doing the profiling, right? Capturing it. And, you know, if they're using SQL Server, they might just want to use SQL Server Profiler, and there's all kinds of third party tools.
There's some capability built into Visual Studio, if you're using that, and there's all kinds of ways to profile, whether you're profiling the performance across the application or you want to hone in on just the database activity. Because sometimes, if say there's a performance problem: is it happening in the database? Is it happening in memory? Is Entity Framework trying to figure out the SQL? Or is it happening in memory after the data has come back? You know, the database did it really fast, the data's back in memory, and now Entity Framework is chugging along, for some reason having a hard time materializing all these objects and relationships.
So it's not just about not always about profiling the database. And there are tools that help with that also.
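For seeing the SQL that EF Core generates without a separate profiler, there is also a built-in logging hook, `LogTo`, added in EF Core 5 (sketched here with a placeholder connection string):

```csharp
protected override void OnConfiguring(DbContextOptionsBuilder options)
{
    options
        .UseSqlServer("Server=.;Database=Shop;Trusted_Connection=True;")
        // Print every command EF Core sends to the database.
        .LogTo(Console.WriteLine);
}
```

This only shows what SQL is being run; timing and execution plans are still the territory of the database-side tools mentioned above.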
Jeremy: [00:58:18] You have the developers writing their queries in something like entity framework. And you have somebody who is more of an expert, like a DBA who can take a look at how are these things performing. And then if there are issues they can dive down into those queries, tell the developer, Hey, I think you should use a stored procedure here or a view.
Julie: [00:58:44] Yeah. Or even kind of expand on that. Unless you're solo or a tiny little shop, this is how you build teams. You've got QA people. You've got testers, you've got, data experts. You've got people who are good at solving certain problems. And everybody works together as a team. I'm such a Libra. Why can't we all get along?
Jeremy: [00:59:10] Yeah. And it sounds like it takes care of a lot of the maybe repetitive or tedious parts of the job, but it does not remove the need for, like you said, all of these different parts of your team, whether that's, the DBA who can look at the queries or maybe someone who is more experienced with profiling so they could see if object allocation from the ORM is a problem or all sorts of different, more specific, issues that they could dive into.
A lot of these things that an ORM does, you were saying it's about mapping between typically relational databases to objects in applications, and something you hear people talk about sometimes is something called the object-relational impedance mismatch. I wonder if you could explain a little bit about what people mean by that, and whether you think that's a problem?
Julie: [01:00:11] Well, that is exactly what ORMs are aiming to solve. The mismatch is the mismatch between the rows and columns of your database schema and the shape of your objects. So that's what the mismatch is, and it's an impedance to getting your data in and out of your application.
So the ORMs are helping solve that problem instead of you having to solve it yourself by reading through your objects and transforming that into SQL, et cetera, et cetera, all those things we talked about earlier. I haven't heard that term in such a long time. That's so funny to me.
It's funny that we talked about it a lot, like in the EF world. It was like, what's EF for? It's to take care of the impedance mismatch, duh. Okay, yeah, we never said duh. Just kidding. But we talked about that a lot at the beginning of Entity Framework's history, and yeah, I haven't had anybody bring that up in a long time.
So I'm actually grateful you did.
Jeremy: [01:01:15] So maybe we've gotten to the point where the tools have gotten good enough that people aren't thinking so much about that problem, because ideally Entity Framework has solved it?
Julie: [01:01:28] Entity framework and hibernate and dapper.
Jeremy: [01:01:32] I think that's a good place to wrap things up. If people want to learn more about what you're working on or about entity framework or check out your courses where should they head?
Julie: [01:01:46] Well, I spend a lot of time on Twitter and on Twitter. I'm Julie Lerman and I have a website and a blog. Although I have to say, I tweet a lot more than I blog these days. but my website is thedatafarm.com and I have a blog there and of course look me up on Pluralsight.
Jeremy: [01:02:08] Cool. Well, Julie, thank you so much
Julie: [01:02:11] Oh, it was great. Jeremy, and you asked so many interesting questions and also helped me take some nice trips into my entity framework past to remember some things that I had actually forgotten about. So that was interesting. Thanks.
Jeremy: [01:02:26] Hopefully... Hopefully good memories.
Julie: [01:02:28] All good. All good.
Jeremy: [01:02:31] All right, thanks a lot Julie.
Julie: [01:02:31] Yep. Thank you.
John Doran is the CTO of Phorest, an application for managing salons and spas.
We discuss:
- Transitioning a desktop application to a SaaS
- Struggling with outages and performance problems
- Moving away from relying on a single person for deployment
- Building a continuous integration pipeline
- Health monitoring for services
- The benefits of Docker
- Using AWS managed services like Aurora and ECS
This episode originally aired on Software Engineering Radio.
Transcript
Jeremy: [00:00:00] Today I have John Doran with me. John is the director of engineering at Phorest, a Dublin based SaaS company that processes appointments for the hair and beauty industry. He previously worked as a technical lead at Travelport Digital, where he supported major airlines and hotel chains with their mobile platforms.
I'll be speaking with John about the early days of their business, the challenges they faced while scaling and how they were able to reshape their processes and team to overcome these challenges. John, welcome to software engineering radio.
John: [00:00:29] Hey Jeremy, thanks so much for having me.
Jeremy: [00:00:31] The first thing I'd like to discuss is the early days of Phorest, to just give the listeners a little bit of background. What type of product is Phorest?
John: [00:00:40] Sure. So Phorest is essentially, um, it's salon software focused on the hair and beauty industry. And it didn't actually start off as that. Back in 2003, it was actually a messaging service built by a few students at Trinity College, one of whom was Ronan Percevel. Ronan is actually our current CEO. So in 2003, that messaging service was supporting, um, nightclubs, dentists, various small businesses around Dublin, and the guys were finding it really hard to get some traction in those different industries. So Ronan actually went and worked as a hair receptionist in a salon.
And what he learned from that was that through using messaging on the platform, they were able to actually increase revenue for salons and actually get more money in the tills, which was a hugely powerful thing. So from there, they were able to refocus on that particular industry. They built supplementary features and a product around that messaging service.
So in 2004, it became kind of a fully fledged appointment book, and from there they integrated that appointment book with the messaging service. So by 2006, I guess you could classify Phorest as full salon software. It had things like stock take, financial reporting, and staff rostering. That full salon software system was pretty popular in Ireland, and actually between 2006 and 2008 they became the number one in the industry in Ireland. So what that meant was, you know, the majority of salons in Ireland were running on the Phorest platform. And that was actually an on-premise system, so all the data would have been stored locally in the hair salon.
And there was no backend.
Jeremy: [00:02:30] Just so I understand correctly, you say it was running on premise, and it was an appointment system. So is this where somebody would come into the salon and make an appointment, and they would enter it into a local computer, and it would just be stored there?
John: [00:02:46] Exactly. So what Ronan figured out throughout his time working in the salon was that actually sending customers text messages to remind them about their appointments really helped cut down the no-show rates, meaning that customers did turn up for their appointments when they were due, and the staff members didn't have to sit around waiting for customers to walk in.
So as Phorest, I guess, as a company developed, we moved into building extra features around that core system, which is an appointment book, which manages the day-to-day roster of a hairstylist. So we built email and marketing retention tools around that. I guess a really important point about Phorest's history is when the recession hit in 2008 in Ireland, we, uh, moved into the UK.
As we were kind of the number one provider in Ireland, we felt that when the recession hit, we needed to move into the UK. But being on premise meant there was a lot of friction actually installing the system into the salons. So in 2011, they actually took a small seed round to build out, I guess, the cloud backend.
Well, once the cloud backend was built, it took about a year to get it off the ground and released. Um, and as the company gained traction in the UK, they migrated all of their on-premise customers onto the cloud solution.
Jeremy: [00:04:07] I guess you would say that when it was on premise, a lot of the engineering effort or the support effort was probably in keeping the software, working for your customers and just addressing technical issues or questions and things like that. And that was probably taking a lot of your time. Is that correct?
John: [00:04:25] Precisely. The team was quite small, so we had five engineers who were essentially building out the cloud backend, and one engineer who was maintaining that Delphi on-premise application.
So what was happening was, our CEO Ronan was actually the product owner at that time, and the guys were making pretty drastic and kind of quickfire decisions in terms of features being added to the product, based on, you know, getting a certain customer in that really needed to pay the bills. And some of those decisions, I guess, made the product a bit more complex as it grew. But it certainly was a big improvement over the on-premise solution.
Jeremy: [00:05:03] Hmm. So the on-premise solution you said was written in Delphi, is that correct?
John: [00:05:08] Yeah,
Jeremy: [00:05:09] when it was first started, was it just a single developer?
John: [00:05:13] Exactly. Yeah. So it was literally put together by some outsourcers and a single developer managing it. There were no real in-house developers.
There was, you know, a little bit of turnover there. But when that small seed round came in, the guys put together the foundations of the cloud-based backend, which was a Java, kind of classic, n-tiered application with WebSockets to update the appointment screen if anything changed on the backend.
And, um, you would kind of consider it a majestic monolith, as such.
Jeremy: [00:05:44] When you started the cloud solution. Were you maintaining both systems simultaneously?
John: [00:05:49] Yeah, so, um, for a full year the guys were building out that backend, and at the same time there was one guy who was literally maintaining and fixing bugs on that Delphi application.
And just to give you an example, um, one of the guys who was actually working on support, he actually went and taught himself SQL, and he used to tunnel into the salons at nighttime to fix any database issues.
Jeremy: [00:06:17] Oh, wow.
John: [00:06:17] Yeah. So it was, you know, hardcore stuff. Um, another big thing, and one of the big reasons we needed to become cloud based, was, you know, as people move online, it's quite common to book your cinema or something else online. Ronan could see that trend coming for online bookings, and we needed to be cloud-based to build out that online booking system.
And just to give you an idea of the scale: last year we would have processed over 2 billion euros worth of transactions through the system. So it's really growing, and, um, you know, it's huge scale at the moment. But I guess, looking back at the past, the guys had built a great robust system getting us to that 10,000 salon mark, particularly in the UK, but that would have been the point where the guys would have started seeing some, you know, shakiness in terms of stability and the speed at which you could deliver new features.
Jeremy: [00:07:22] You were saying the initial cloud version took about a year to create?
John: [00:07:26] Exactly. Yeah.
Jeremy: [00:07:27] And you had five engineers working on it after the seed round? At that time, when you first started
working on the cloud version of the application, did you have a limited rollout to kind of weed out defects? Or how did you start transferring customers over.
John: [00:07:46] So there were definitely some customers reluctant to move across. We did it, I guess, gradually. There was a lot of reluctance; people were quite scared of their data not being stored in their salon. So it was quite hard to get some of those customers across, and only two weeks ago we actually officially stopped supporting it; the final two customers have finished up. So, you know, it took us a good seven years to finish that transition.
Jeremy: [00:08:12] Uh, so it was a very gradual transition where you actually, what did you ask customers whether they wanted to move or how did you...
John: [00:08:21] Oh, yeah. It was a huge, huge sales and team effort to get people across the line. But I would say the majority of people either would have churned, or the more forward-thinking people would have moved across.
I, you know, they would have been getting new features and a better service.
Jeremy: [00:08:36] Right. So it was kind of more of a marketing push from your side to say, Hey, if you move over to our cloud solution, you'll get all these additional capabilities. But it was ultimately up to them to decide whether they want it to.
John: [00:08:47] Yeah. So, um, you know, some companies, they kind of build that product under a different name and try and sell it, but at Phorest we actually kept the UI very similar. So it wasn't very intrusive to the users. It was just kind of seen as an upgrade with, um, I guess, less friction.
Jeremy: [00:09:06] Right. Right. I want to talk a little bit about the early days. You said you spent about a year to build the MVP; at that point, let's say after that year had passed, were you still able to iterate quickly in terms of features, or were you having any problems with performance or downtime at that point?
John: [00:09:28] So in 2012, when the cloud-based product launched, particularly in the UK, once we hit about a thousand customers, we started to see creaking issues in the backend: lots of JVM garbage collection problems, lots of, uh, database contention, and lots of outages.
So we got to a point where we were throwing hardware at the problem to make things a little bit faster. So, what our problems were: we relied a lot on a single person to do a lot of the deployments. It wasn't really a team effort to ship things. It was more so a developer finishes the code and hands it off to be shipped.
Maybe at the end of the month we would ship. I guess the big problem was the stability. So essentially what happened was, in terms of the architecture, we were introducing caches at various levels to try and, um, cope with performance. So, uh, a layer of caching on the client side was introduced, uh, memcached was introduced, uh, level 2 Hibernate caching, always just, you know, really focusing on fixing the immediate problem without looking at kind of the bigger picture. As I said, I mentioned that 2,000 salons as a marker; I guess once we hit like 1,200.
The guys had to introduce, uh, the idea of silos, which was essentially: 1,000 customers are going to be linked to a specific URL, and that URL will host the API returning back the data that they need. And then the other silo would service the other, you know, 200, growing to, say, a thousand businesses.
So essentially, if you think about it, you've got, I guess, a big point of failure. If that server goes down, there's no load balancing between servers, and those two servers are at their biggest size possible. So I guess a big red flag was the cost, uh, I guess, implications of that; you know, it was the largest instance type on Amazon, at the RDS and EC2 level.
Jeremy: [00:11:31] The entire system was on a single instance for each silo?
John: [00:11:35] Yeah. So if you imagine, um, when you log in, you'll get returned a URL for a particular silo. So what would happen then would be X businesses go to that silo and Y businesses go to the other silo, and what that did was basically load balance the businesses at kind of a database level.
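The login-time silo routing John describes can be sketched roughly as follows. Everything here is hypothetical: the silo URLs, the capacity of 1,000 businesses per silo, and the range-based assignment are assumptions for illustration, not Phorest's actual scheme.

```python
# Hypothetical sketch of login-time silo routing: each business is pinned
# to one silo URL, so database load is partitioned per ~1,000 tenants.
SILO_URLS = ["https://silo1.example.com", "https://silo2.example.com"]
SILO_CAPACITY = 1000  # businesses per silo (assumed)

def assign_silo(business_id: int) -> str:
    """Pin a business to a silo via simple range partitioning on its id."""
    index = (business_id - 1) // SILO_CAPACITY
    return SILO_URLS[min(index, len(SILO_URLS) - 1)]

def login(business_id: int) -> dict:
    """On login, return the API base URL the client should use from then on."""
    return {"business_id": business_id, "api_base": assign_silo(business_id)}
```

Note the trade-off this makes visible: the assignment is static, so a silo is a shared-fate unit; if its server goes down, every business pinned to it is down.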
Jeremy: [00:11:55] You were mentioning how you had like different caching layers, for example, memcached and things like that, but those were all being run on the same instance. Is that correct?
John: [00:12:05] Um, they would have been hosted by Amazon.
Jeremy: [00:12:07] Oh okay. So those would have been, uh, Amazon's hosted services.
John: [00:12:10] So, yeah. Yeah. It's kind of like, when you build that MVP or that initial stage of your product, you're focusing on building features.
You're focusing on getting bums on seats, and it was at that point, that 1,200 to 2,000 salons, where we felt that pain, that scaling pain.
Jeremy: [00:12:30] So in a way, like you said, you were doing multitenancy, but it was kind of split per thousand customers.
John: [00:12:38] Yeah, exactly. So if you imagine, if, if a failure happened on one of those servers, there is no fault tolerance.
If a deployment goes wrong in terms of, like, uh, putting an instance in service, those thousand customers can't make purchases, their customers can't make online bookings, there's no appointments being served, you can't run transactions through the till. So, uh, it caused huge, huge friction.
Jeremy: [00:13:04] Right. Uh, what were the managed services you were using in combination with the EC2 instance?
John: [00:13:12] So, um, a really good decision at the start, when the guys moved to cloud, was making a big bet on Amazon in terms of utilizing them for RDS, EC2, caching. But there was no deployment stack, no deployment, uh, infrastructure as code.
It was all, I guess, manually done through the Amazon console, which is something we later addressed, and which we'll chat about; but it was all heavily reliant on Amazon.
Jeremy: [00:13:38] And you had mentioned that you were relying on one person to do deployment. Was, was that still the case at this time?
John: [00:13:46] Yeah, so up until, I guess 2014. Um, It was all reliant on one guy who, who literally had to bring his laptop on holidays with them and tether from cafes. If something went down to deploy new code, he was the only guy who knew how to do it. So it was, it was a huge pain point and bus factor.
Jeremy: [00:14:08] So it sounds like in terms of how the team was split up, there was basically, you have people working on development and you have a single person in the sort of ops role.
John: [00:14:21] Yeah. And, uh, essentially when this kind of thing happens, where the people who write the code don't ship it, you get all sorts of problems in terms of dependencies and tangles and, uh, you know, just knowledge silos. And also, you know, because the guys were working kind of in their own verticals, uh, different areas of the product.
There was no consistency in terms of the engineering values, how they were working, practices, procedures, you know, deployments, that sort of stuff. It was all very isolated. So, um, people did their own thing. So, uh, you could imagine trying to hire someone new would be quite hard, because, um, you know, it'd be very, very different for someone coming in depending on which engineer you talked to.
That makes sense?
Jeremy: [00:15:10] Yeah. Was this a co-located team, or was this a remote team?
John: [00:15:16] Most of the guys were actually in Dublin. Um, one or two people traveled a little bit, worked remotely, and a couple of people did actually move abroad. So it was predominantly based in Dublin, but, um, some people traveled a bit.
Jeremy: [00:15:28] In terms of processes, for someone knowing how to deploy or how to work on a feature, it was mostly tribal knowledge? It's more just trying to piece together a story from talking to different people. Is that correct?
John: [00:15:42] Precisely. So, um, you had no consistency in languages or frameworks. Um, except I would say that that initial part of the platform was extremely consistent, uh, in terms of the patterns used, uh, I guess the way it communicated with the database and, you know, how the API was built. It was extremely strong and is still the heart of the organization. So, say for example, there were a lot of really good, uh, integration and unit tests there, but they got abandoned for a little while and we had to bring them back to life, to enable us to start moving faster again and to give us a lot more confidence.
Jeremy: [00:16:32] Hmm. So it sounds like maybe the initial version, in the first year or so, had a pretty solid foundation, but then... I'm not sure if it was the team that grew or just the rate of features. Uh, would you say that?
John: [00:16:47] I would say it was a combination of the, the growth of the company in terms of the number of customers on it, and the focus on delivering features.
So focusing on feature development rather than thinking about scalability.
Jeremy: Being extremely aware of that, how fast were you gaining customers at that time? Was this a steady increase or large spikes?
John: You're talking 30% annually. So, 30% annually, and a really, really low churn rate as well.
Jeremy: [00:17:17] So what would you feel was the turning point where it felt like your software had, or your business had to fundamentally change due to the number of customers you had?
John: [00:17:28] So it was essentially those issues around stability and cost, which were unsustainable for the business: customers complaining, uh, our staff not being able to do their job. So, you know, part of Phorest's core values and mission is to help the salon owner grow their business and use the tools that we provide to do that.
And if people are firefighting, uh, and not being able to support our customers, to help them send really great marketing campaigns to boost their revenue... if we're not doing that, if we're just firefighting, the company would have been pointless. So we weren't fulfilling our mission; we were coping with outages and panicking all the time. The costs, again, were unsustainable, and, you know, the team was just, I guess, uncomfortable with the state we were in. So the turning point would have been, I would say, in like 2014, when we essentially hired in some people who had more experience in,
I would say, highly scalable systems, and people who cared a little bit more about quality and best practices. So when you hire three or four people like that, you bring in a different way of thinking; you kind of hire these different values. You know, when you try to talk to a team and try and get these things out, they're normally quite organic. If you bring people in from maybe a similar company, or from a different industry but with similar experience, you kind of get that for free. And that's what Phorest did. So, um, basically in 2014 and since, we've invested heavily in hiring, and hiring the right people in terms of how they operate and how they think, but also bringing that back to our values and, um, what we try to do.
Jeremy: [00:19:28] Do you think that bringing in, you know, new new people, new talent is really one of the largest factors that allowed you to make such large changes to change your culture and change your way of thinking?
John: [00:19:41] The other thing would be, I would say, the trust, um, that Ronan, the CEO, and the leadership team at Phorest have, um, and their openness to change.
Um, I think that a lot of other organizations would be quite scared of this type of change, in terms of heavily investing in the product to make it better. Just from experience and talking to people, you know, it would have been very easy to not invest, uh, you know, and just leave the software ticking along with bugs and handling the downtime. But it was about the organization and their values: their value is around really helping the salon owners, not spending that time firefighting.
Jeremy: [00:20:28] So it sounds like within two years or so of, of launch was when you, uh, decided to, to make this change.
John: [00:20:38] Yeah. So, um, you know, it's not an, not an easy one to make because you know, it's really hard to find talent.
Um, and we were lucky to really get some great people in. And it wasn't about making radical change at the start. You know, it started from the foundations. So it was things like: let's get a continuous integration server going here, guys, and let's bring back all the broken tests and make sure they're running, so that we can have a bit more confidence in what we ship.
We, you know, introduced code reviews and pull requests back into things, and a bit more collaboration, and getting rid of those pockets of knowledge, um, you know, the reliance on individuals.
Jeremy: [00:21:21] I do want to go more into those a little bit later, but before that, when you were having performance issues or having outages before all these changes, how were you finding out?
Was it being reported by users or did you have any kind of process, um, you know, to notify you?
John: [00:21:41] So the quite common thing was basically the phones, which would light up. Um, there was very, very little transparency about what was going on in the system. It got to a stage where we actually installed a physical red button on the support floor which, uh, texted everyone in the engineering team.
Jeremy: [00:22:00] Oh, wow. Okay.
John: [00:22:02] Yeah.
Jeremy: [00:22:02] One of the things that we often hear is when a system has issues like this, it's difficult to free up people to fix the underlying problems, um, due to the time investment required. And as you mentioned, all the firefighting going on, how did you overcome this issue?
John: [00:22:24] So I guess, you know, beforehand it was a matter of: restart the server, let's keep going with our features. But we were really bad at stopping to think about, um, you know, what really happened here, and, you know, maybe let's write down an incident report and gather some data about what actually happened under the hood. And a few key questions could be raised from that:
you know, what are we going to do to stop this from happening again? Why didn't we know about it before the customers? And, you know, what were the steps we took to reproduce and actually fix this issue? And what are the actions that are going to happen, and how are we going to track that those actions do happen after the issue?
Jeremy: [00:23:08] Let me see if I understand this correctly: you actually did build sort of a process for when you would have incidents, to figure out, okay, what happened?
John: [00:23:17] That was the first step I would say. Yeah. So let's figure out what happened and how, and it was just about gathering data and getting information about what was, what was really going on.
So it let us identify, you know, common things that happen. Maybe usually we would just, you know, restart a server and forget, or fail over a database and forget; you know, everything's back to normal bar a couple of errors. But as we started gathering that data, we started to see common problems. Maybe, you know, our deployment process isn't good enough and it's error-prone, or this specific message broker isn't fault tolerant, or the IOPS in the database are too high at this time due to these queries happening.
But after we got that data, you know, uh, and we started really digging deep into the system, we realized that this isn't something that you could just take two days in your sprint to fix. So, coming back to your question on finding that time to fix things, we kind of had to make a tough call.
When we looked at everything, to say, you know: let's stop feature work, let's stop product work, and let's fix this properly.
Jeremy: [00:24:26] Okay. Yeah. So, so basically you got more information on, on why you were having these problems, why you were having downtime or performance issues and started to build kind of a picture of, and realize that, Oh, this is, this is actually a very large problem.
And then as a company, you made the decision that, okay, we're going to stop feature development. To make sure we have enough resources to really tackle this problem.
John: [00:24:55] Precisely. And, um, from the product side of things, you know, this was a big, big driving factor in it. You know, we wanted to build all these amazing features to help salons to grow, but we just couldn't actually deliver on them.
And we couldn't have any predictability in the way we delivered them, because of that firefighting and, you know, because we were sidetracked so much. There was no confidence in the release cycle, in stability, or in what we could actually deliver. So, um, yeah, it was a pretty hard decision for us to make in terms of, uh, the business,
because we had a lot of deliverables and commitments to customers and to, you know, our sales team. So we had to make that call.
Jeremy: [00:25:36] You were mentioning earlier about how you started to bring in a continuous integration process before you had done that. You also mentioned that there were tests in the system initially, but then the tests kind of went away.
Could you kind of elaborate on what you meant by that?
John: [00:25:53] Yeah, so. As I said, like the kind of the core system was built with a lot of integrity and a lot of good patterns. For example, uh, a lot of integration tests, uh, running against, uh, the APIs, uh, were written and maybe were written against a specific feature, but they were never run as a full suite.
So, um, what would happen was there'd be maybe one or two flaky ones. And, um, you know, because there was no continuous integration server, it was easy enough for a developer to run the specific tests for the functionality that they were building. But because the CI wasn't there, there was no full suite run.
So when it came time to actually do that, we realized, you know, 70% of them were broken.
Jeremy: [00:26:40] so they, they were building tests as they developed, but then they were maybe not running those, uh,
John: [00:26:47] before commit
or merge
Jeremy: [00:26:49] right. And so adding the continuous integration process, having, uh, some kind of build process, really forced people to pay attention to whether those tests were working or not
John: [00:27:01] Exactly. Um, and just a kind of a step on from that was, you know, um, a huge delay in getting stuff to test, because we relied on that one guy to build stuff. Um, and actually that was, you know, done from a little Linux box on the engineering floor, um, which was quite temperamental.
You'd be quite delayed in even just getting stuff into people's hands, and that's kind of what the core of software development is all about, right? You know, getting what you build into people's hands. And we just couldn't do it.
Jeremy: [00:27:34] Just because the, the process of actually doing a build and a deployment was so difficult when you added the continuous integration process.
Uh, were there other benefits that you maybe didn't expect when you started bringing this in?
John: [00:27:50] So, um, I, I guess I mentioned the deployments is a big one. I think that. People started to see real, um, benefit in terms of their workflow. I guess, along with the continuous integration, there was, uh, more, more discipline in terms of, uh, how we worked.
So the CI server introduced a better workflow for us, and it helped us see real clarity, uh, in terms of the quality of the system: where we had coverage and where we didn't. And, um, it also helped us break up the system a little bit. So I mentioned the majestic monolith. When we went to look at it, there were five application servers sitting in one repo, and the CI server and some crafty engineering helped us split that up quite well,
breaking the repo into multiple application servers.
Jeremy: [00:28:45] Hmm. So, so actually bringing in the continuous integration actually encouraged you to rearchitect your application and in certain ways, and break it down into smaller pieces.
John: [00:28:56] Exactly. Yeah. And really it was all about confidence, um, and being able to test and then know that we weren't regressing.
Jeremy: [00:29:03] What do you, you think people saw in terms of the pain or the challenges from that sort of monolith set up that you think sort of inspired them to break it up?
John: [00:29:13] The big one was a bug fix in one small, area of the system meant the whole stack had to be redeployed, which took hours and hours. The other thing would have been the speed of development in terms of navigating around a pretty large code base and the slowness of the test suite to run, which was around 25 minutes.
When we, when we started and got them all green,
Jeremy: [00:29:36] the pain of running the tests and having it possible to break so many things with just one change, maybe encourage people to, to shrink things down. So they wouldn't have to think so much about the whole picture all the time.
John: [00:29:50] Exactly. We started to see, you know, a small fix or a small feature breaking something completely non-related. A typical example would have been an HTTP connection configuration on a client, um, breaking completely unrelated areas of the system.
Jeremy: [00:30:06] Okay. One thing I'd like to talk about next is the monitoring. Uh, you mentioned earlier that it was really just phone calls would come into support and you even had the, the big red button, you could press uh, what did you do to add monitoring to your application?
John: [00:30:25] That's pretty, uh, important to mention that, you know, we talked about making a decision to stop to down tills (?) and start fixing stuff. So that's, that's when, uh, we, we started, you know, looking at the monitoring and everything else, like continuous integration, bringing back tests, but at kind of a key point of, uh, of this evolutionary project was, was the monitoring.
Um, so we did a few things. We upgraded our systems to be using New Relic, to help us find errors. It was there, but, um, it wasn't being utilized in a good enough way, so we used the APM (Application Performance Monitoring) there. We looked at CloudWatch; I mean, we introduced CloudWatch metrics to help us watch traffic, to help us see slow, uh, transactions.
Um, Logentries helped us a lot, uh, in terms of spotting anomalies in the logs. Pingdom was actually a really surprisingly good addition to the monitoring. Um, it simply just calls any health check endpoint you want, and it has some nice Slack and messaging integrations. That was great for us.
It's helped us a lot. We did a couple of other things, like, um, some small end-to-end tests that would, um, give us kind of a heartbeat for how the system was running. Um, and they also gave us the kind of confidence that we would know about an issue before a customer, allowing us to get rid of that red button.
Jeremy: [00:31:53] All of these are managed services that, that you either send logs to or check health end points on your system.
Did you configure them somehow to text your team, or send messages to your team, when certain conditions were met, or...
John: [00:32:11] so we, we, we started with just like a simple Slack channel that would, uh, would send us any kind of dev ops related issues into, into there.
And that's kind of what helped us change the culture a little bit, in terms of being more aware of the infrastructure and the operations. And Pingdom was great for setting up a team of people who would get notifications for various parts of the system. And for our CloudWatch, um, alarms, we set up a little Lambda function that would just forward on, uh, any critical messages to text us.
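A Lambda like the one John mentions could look roughly like this. CloudWatch alarms typically publish to an SNS topic, and the Lambda parses the SNS event; the `CRITICAL` naming convention here is an assumption, and the actual SMS publish call (which in real code would go through boto3's SNS client) is stubbed out so the sketch stays self-contained.

```python
import json

CRITICAL_PREFIXES = ("CRITICAL", "SEV1")  # hypothetical alarm-naming convention

def format_alert(alarm: dict) -> str:
    """Turn a CloudWatch alarm payload into a short text message."""
    return f"{alarm['AlarmName']}: {alarm['NewStateValue']} - {alarm['NewStateReason']}"

def handler(event, context=None):
    """Forward critical CloudWatch alarms (delivered via SNS) as texts."""
    sent = []
    for record in event.get("Records", []):
        # SNS wraps the alarm JSON as a string in the "Message" field
        alarm = json.loads(record["Sns"]["Message"])
        if alarm["AlarmName"].startswith(CRITICAL_PREFIXES):
            sent.append(format_alert(alarm))
            # In real code: sns.publish(PhoneNumber=..., Message=...) via boto3
    return sent
```

Filtering on a naming convention like this keeps the noisy, non-critical alarms in Slack while only the pager-worthy ones become texts.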
Jeremy: [00:32:44] And before this, you said it was basically just user calls and now you are actually shifting to kind of proactively identifying the problems. Yeah.
John: [00:32:54] Yeah, exactly. There were some really small alerts there, but nothing as comprehensive as this. We actually, um, we changed some of the applications: we introduced health endpoints to all of them, so they would report on their ability to connect to the message broker, their ability to connect to a database, any dependencies that they actually needed; we would check them as part of pinging that endpoint. So if you hit any of our servers, any new or older ones, they would all have, like, a /health endpoint, and that would give you back a JSON structure, uh, and give us a good insight into how healthy that component was.
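A /health endpoint of that shape can be sketched as below. The dependency names, the JSON layout, and the 200/503 status convention are illustrative assumptions, not Phorest's actual format; each check is just a callable returning True or False, stubbed here instead of opening real connections.

```python
import json

def check_database() -> bool:
    # In real code: open a connection and run something like SELECT 1
    return True

def check_message_broker() -> bool:
    # In real code: open and close a connection to the broker
    return True

DEPENDENCY_CHECKS = {
    "database": check_database,
    "message_broker": check_message_broker,
}

def health() -> tuple:
    """Handler for GET /health: report each dependency and an overall status."""
    results = {name: check() for name, check in DEPENDENCY_CHECKS.items()}
    healthy = all(results.values())
    body = json.dumps({"status": "ok" if healthy else "degraded", "checks": results})
    # 503 when degraded, so an external poller like Pingdom sees a failure
    return (200 if healthy else 503), body
```

Returning a non-200 status on degradation is what lets a dumb external poller drive the alerting without parsing the JSON at all.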
Jeremy: [00:33:33] Yeah. And if there was a problem and you were trying to debug issues, were you mostly able to log into these third-party services, like log entries or new Relic to actually track down the problem?
John: [00:33:46] Yeah. So again, those services gave us that information, but it would always come back to, you know, whether you needed to get into a server. And a big thing, which we'll talk about, is Docker.
Um, we don't have SSH access into those servers, so we rely on those third parties to give us that information. But in the past, maybe we would have had to get in and, you know, look at the processes and take dumps. With Logentries and New Relic, we were able to do that stuff without needing to.
Jeremy: [00:34:17] So previously you might have someone actually SSH into the individual boxes and look at log files and things like that.
John: [00:34:25] Exactly. So it's quite easy when you've got one server, but, as we'll discuss, when you've got many small containers it's extremely complicated.
Jeremy: [00:34:34] Next I'd like to talk about, uh, since you mentioned Docker, uh, how did you make the decision to move to Docker?
John: [00:34:42] So it was something our CTO was really aware of and he really wanted us to explore this.
The big benefits for us were that shift in mindset, of one guy not being responsible for deployments, but us actually developing and using Docker in our day-to-day workflow, and the cost implications as well. The fact that, instead of having that, say, 8xlarge running one application server, we could have 12 containers running on much smaller EC2 instances. So it was that idea of being able to maximize, uh, CPU and memory that was a huge, huge benefit for us, that we saw.
Jeremy: [00:35:24] So the primary driver was almost your AWS bill, or your...
John: [00:35:30] Big time. Yeah. Portable applications that, um, you know, had much less maintenance. We didn't have to go in and worry about, as we mentioned earlier, these kinds of siloed tech stacks. We didn't need to worry about a Ruby environment or a PHP environment or a Java JVM install. It was just the container.
And that was a hugely big, an important thing for us to do and really kind of well thought out by our CTO at the time.
Jeremy: [00:35:59] So you mentioned, like, Ruby containers and JVMs and things like that. Does your application actually have a bunch of different frameworks and a bunch of different languages?
John: [00:36:09] Yeah. So, um, as we split out that monolith, uh, we also, I guess, started building smaller domain-specific (not micro, I'd say) services responsible for areas of the system, uh, like our online booking stack. So if you go to any of our customers, um, you know, you can book at their point of sale system in the salon, but you can also book on your phone, and we have a custom domain for every one of those salons. So it's like phorest.com/book/foundationhair.
Um, if you click on that, you're going to be brought to the online booking stack, which is a Rails app, actually, with an Ember.js frontend. So, um, the system, as we started splitting it apart, became more and more distributed, and Docker was great for us in terms of consistency and that portability, particularly around different tech stacks.
Jeremy: [00:37:01] Migrating to Docker, made it easier for you to both develop and to deploy using a bunch of different tech stacks.
John: [00:37:08] Exactly.
Jeremy: [00:37:09] When running through your CI pipeline, would you configure it to create an image that you would put into a private registry such as Amazon's elastic container registry?
John: [00:37:18] Yeah. So we made the mistake of building and hosting our own registry at the start. Uh, we quickly realized the pain of that around three, four months in, which was actually around the same time as Amazon released ECR.
So I guess the main reason we did that ourselves was because we were early adopters, and we paid a little tax on that, but we moved to ECR. So our typical application pipeline is: build, unit tests, maybe integration and acceptance tests, build a container. And then for some of those applications, they run acceptance tests against the container running on the CI server, then push to the registry.
And after it's pushed to the registry, then we would configure deployment and trigger it.
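The pipeline John walks through can be sketched as an ordered list of stages. Everything here is illustrative: the stage names and the stubbed `run_stage` function are assumptions, not Phorest's actual build scripts, and a real pipeline would shell out (e.g. to `docker build`, `docker push`, and an ECS deploy call) where the stub just prints.

```python
def run_stage(name: str) -> bool:
    """Stub for one pipeline stage; real code would shell out here."""
    # e.g. subprocess.run(["docker", "build", ...], check=True)
    print(f"running: {name}")
    return True

# Hypothetical stage list matching the described flow:
PIPELINE = [
    "build",
    "unit tests",
    "integration/acceptance tests",
    "build container",
    "acceptance tests against container",
    "push image to registry (ECR)",
    "trigger ECS deployment",
]

def run_pipeline() -> bool:
    """Run stages in order; all() short-circuits at the first failure."""
    return all(run_stage(stage) for stage in PIPELINE)
```

The point of the ordering is that the image pushed to the registry is the exact artifact the acceptance tests ran against, so the deployment trigger at the end needs no further build step.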
Jeremy: [00:38:02] Do you have a manual process where somebody watches the pipeline go through and you make the call to push or not? Or is it a automated process?
John: [00:38:13] Automated. So, um, we built a small kind of deployment framework again, because we were early adopters of Amazon's ECS, uh, their container service.
Uh, so we built a small, um, deployment stack, which allowed us to essentially configure new services in ECS and deploy new versions of our applications through CI to our, uh, ECS clusters. So it was all automated, using an infrastructure-as-code solution, CloudFormation. So when we were looking back at the problems in the good old days, uh, you know, we saw that one was that, you know, things were just configured in the AWS console.
We knew we needed infrastructure as code, and we needed repeatability and the ability to recreate stuff. So we used CloudFormation, which is essentially something very similar to Terraform. Um, and we do use Terraform for managing some of our clusters and some other things.
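As a rough illustration of what replaced the console clicks, here is a heavily trimmed CloudFormation-style template built as a Python dict and serialized to JSON. The service name, image, and properties are hypothetical, and a real template would carry far more configuration (networking, IAM roles, load balancer wiring); the two resource types shown are real CloudFormation types.

```python
import json

def ecs_service_template(service_name: str, image: str, desired_count: int) -> str:
    """Emit a trimmed, hypothetical CloudFormation template for an ECS service."""
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "TaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    # One container per task; memory in MiB
                    "ContainerDefinitions": [
                        {"Name": service_name, "Image": image, "Memory": 512}
                    ]
                },
            },
            "Service": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "DesiredCount": desired_count,
                    # Ref ties the service to the task definition above
                    "TaskDefinition": {"Ref": "TaskDefinition"},
                },
            },
        },
    }
    return json.dumps(template, indent=2)
```

Because the template is plain data, it can be versioned, reviewed, and re-applied, which is exactly the repeatability John says the console-driven setup lacked.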
Jeremy: [00:39:16] Okay. So you maybe initially moved from having someone manually going in and creating virtual machines to more of an infrastructure-as-code approach.
John: [00:39:27] Exactly. Yeah.
Jeremy: [00:39:28] You, you had mentioned that one of the primary drivers of, of using Docker was performance, did you start creating performance metrics so that you knew how much progress you were making on that front?
John: [00:39:41] Yeah, so essentially that the effort to kind of make our infrastructure more reliable it's it was a set as kind of a set of steps to get there. And we started with API level testing to make sure that anything we change under the hood, it didn't break the functionality. And we also wrote a bunch of performance tests, particularly around pulling down appointments, creating appointments and sending large, large volumes of messages.
We knew we couldn't have any regressions there. So we use Gatling to do those performance tests, and we would run them from the continuous integration server, and we do various types of soak testing to make sure we weren't taking any steps backwards.
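Their actual tests were written in Gatling, a Scala load-testing DSL. Purely to illustrate the idea of failing a build on a latency regression, here is a hedged sketch in Python: the request function is a stub, and the 100 ms p95 budget is an invented threshold, not anything from the episode.

```python
import time
import statistics

def fetch_appointments() -> None:
    """Stand-in for the real HTTP call a load test would make."""
    time.sleep(0.001)

def soak_test(requests: int = 200, p95_budget_ms: float = 100.0) -> dict:
    """Issue many requests, compute latency percentiles, flag regressions."""
    latencies = []
    for _ in range(requests):
        start = time.perf_counter()
        fetch_appointments()
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    p95 = latencies[int(0.95 * len(latencies)) - 1]
    return {
        "p95_ms": p95,
        "mean_ms": statistics.mean(latencies),
        "ok": p95 <= p95_budget_ms,  # the CI gate: fail the build if breached
    }
```

Wiring a check like this into CI is what gives the "pulse" John describes: any change that pushes the tail latency past the budget fails before it ships.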
Jeremy: [00:40:23] So each time you would do a deployment, you would run performance tests to ensure that you weren't getting slower, or you weren't having any problems from the new deployment.
John: [00:40:34] Yeah. I would say, though, that this effort, and we called it Project Darwin internally, had a few goals: it was all about, you know, becoming fault tolerant, being more scalable, reducing Amazon costs. And during Project Darwin, we didn't just take our 1,200 to 1,500 salons and drop them onto Docker.
There were so many changes under the hood that these performance tests were key to giving us, uh, a pulse on how we were doing. But, um, I guess when we were done with project Darwin and everything was on Docker and everyone was much, much happier, um, we just run those performance tests ad hoc and as part of some release pipelines.
Jeremy: [00:41:21] Hmm. Okay. So initially you were undergoing a lot of big changes and that was where it was really important to, uh, check the performance metrics with, with every build.
John: [00:41:32] Exactly. Yeah.
Jeremy: [00:41:33] Uh, what were some of the, the big challenges? Cause you mentioned you were changing a lot of things. What were some of the big challenges moving your existing architecture to Docker and to ECS?
John: [00:41:47] There were a couple. There's two huge ones. So one was state. Getting state out of those big servers was extremely hard. We needed to remove the level two cache, because we needed to turn that one server into smaller, load balanced containers. We needed to remove the state because we didn't want somebody at one computer terminal fetching their appointments, and then on their mobile app looking at different data.
So we had to get rid of state. And the challenge there was that MySQL performance just wasn't good enough for us. So, um, we actually had to look really hard at migrating to Amazon Aurora, which is what we did. Again, coming back to cost, uh, Aurora is much more cost-effective compared to the system beforehand, which was provisioned for peak load.
So we would have provisioned IOPS for Friday afternoon, the busiest time that the salons were using the system, and we were paying the same amount on a Sunday night. Compared to Aurora, where you're paying for the IOPS you actually use, plus the additional performance benefits from how Amazon rebuilt the storage engine there.
So that's the caching side of things. The other big challenge was the VPC. We needed to get all of our applications into a VPC to be able to use the latest instance types on Amazon, uh, and also for our applications to be able to talk securely to the Aurora database. So those two were definitely the biggest challenges, along with the MySQL setup.
Jeremy: [00:43:19] It sounded like you had to pay for your peak usage, whereas with Aurora it automatically scales up and down. Is that correct?
John: [00:43:29] Um, no. You're actually charged per read and write. So that would be the big difference.
Jeremy: [00:43:34] Oh I see. Per read and write. Okay. So it's just per operation. So you don't actually have to think about what your needs are going to be.
It kind of just charges you based on how it's used.
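To make the billing difference concrete, here is a toy comparison with completely made-up rates (not actual AWS pricing): under the provisioned model you pay for peak capacity around the clock, while the per-request model only charges for I/O actually performed.

```javascript
// Illustrative cost comparison only -- the rates passed in are
// hypothetical, not real AWS prices.

// Provisioned model: you pay for peak capacity 24/7,
// even on a quiet Sunday night.
function provisionedMonthlyCost(peakIops, ratePerIopsMonth) {
  return peakIops * ratePerIopsMonth;
}

// Per-request model: you pay only for the I/Os actually performed.
function perRequestMonthlyCost(totalIosPerMonth, ratePerMillionIos) {
  return (totalIosPerMonth / 1e6) * ratePerMillionIos;
}
```

A system that is busy only on Friday afternoons performs far fewer I/Os over a month than its peak capacity implies, which is where the per-request model wins.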
John: [00:43:45] The other really nice thing was, uh, looking back at our incident reports, a really common issue would have been, hey, the database has run out of storage, and Aurora actually autoscales its storage engine.
Jeremy: [00:43:56] You mentioned removing state from your servers and you, you mentioned removing the level two cache, can you kind of explain sort of at a high level, what that means to those who don't understand that?
John: [00:44:10] Sure. So in the Java world, when you have an ORM framework like hibernate, essentially when you query the database, hibernate will store that data, um, in its level two cache. And what that means is that it doesn't need to hit the database for every query.
And that was the solution for Phorest as we were in that MVP slash early days. But it wasn't the solution for us to become fault-tolerant.
Jeremy: [00:44:39] So someone makes a query to an ORM, in this case it's hibernate, and, uh, in the server's memory, it would retrieve the results from there instead of from the database.
John: [00:44:55] Yeah, exactly. Okay. And that's what I was coming back to around, um, creating an API for a list of appointments. If you had two servers deployed with them using an L2 cache, you would get different state because of the cache.
Jeremy: [00:45:12] Did you put a different cache in place, or did you remove the cache entirely?
John: [00:45:17] So we removed that cache entirely, but we did have a REST cache, which was Memcached, and that's distributed. And we use cache keys based on, uh, entity versions. So that was distributed and worked well with multiple containers behind a load balancer.
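The entity-version trick John mentions can be sketched like this. The names are hypothetical and a plain Map stands in for Memcached, but the key idea carries over: a write bumps the entity's version, so stale cache entries are simply never looked up again and no explicit invalidation is needed.

```javascript
// Sketch of version-based cache keys (all names hypothetical).
// Every write bumps the entity's version, so an outdated entry's key
// is never requested again -- eviction reclaims it eventually.

function cacheKey(entityType, entityId, version) {
  return `${entityType}:${entityId}:v${version}`;
}

// Tiny in-memory stand-in for a distributed cache like Memcached,
// just to show the lookup flow.
const cache = new Map();

function getCached(entityType, entityId, version, loadFn) {
  const key = cacheKey(entityType, entityId, version);
  if (!cache.has(key)) cache.set(key, loadFn()); // miss: load from DB
  return cache.get(key);
}
```

Because the key changes with the version, every container behind the load balancer sees the same, current data, which is exactly why this works where a per-server L2 cache did not.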
Jeremy: [00:45:34] So you removed the, the cache from the individual servers, but you do have a managed Memcached instance that you use.
John: [00:45:42] Yeah, exactly. And getting rid of that level two cache, our performance tests told us that MySQL just wasn't performant enough, whereas Aurora was much better at handling those types of queries, some large joins. It was a big, big relational database.
Jeremy: [00:45:58] So we we've talked about adding in continuous integration, monitoring performance metrics, uh, Aurora Docker. Did any of these processes require large changes to your code base?
John: [00:46:13] To be honest, not really. It was more plumbing things together and a lot of orchestration from a human point of view. So, um, people being aware of how all this stuff works and, uh, essentially making sure that we all knew we were on the same page.
I'd say the biggest, uh, piece of engineering and coding work was the deployment and infrastructure scripts. So provisioning the VPCs, writing the integrations with ECS, uh, that sort of thing. But, um, in terms of actual coding, it wasn't too invasive.
Jeremy: [00:46:47] I think that's actually a positive for a lot of people, because I believe there are people who think they need to do a big, uh, rewrite if they have, you know, performance problems or problems keeping track of the status of their system.
But I think this is a good case study that shows that you don't necessarily need to do a rewrite. You just need to put additional processes and checks in place and, um, maybe change your deployment process to kind of get the results that you want.
John: [00:47:21] It's about the foundations as well. If you have some really strong people at the start who, you know, pave some good, uh, roads there in terms of good practices, like, just for example, a really good, uh, database change management setup, and good packaging of the code.
Really good packaging of the code, so it was quite easy for us to split out five services from that big monolith. It's about the foundations at the start, because it would be quite easy to build an MVP with some people who write, you know, 1,000-line PHP scripts, and the product works, and that's a different case study, because, you know, you can't fix that essentially.
Jeremy: [00:48:04] Right. So it's because the original foundation was strong that you were able to undergo this sort of transformation.
John: [00:48:12] truly yeah
Jeremy: [00:48:13] Adopting all of these processes, did they resolve all of the key problems your business faced?
John: [00:48:21] When we look back and we see that, you know, all of our systems are running on Docker, we see a huge cost benefit.
So, uh, that problem was certainly solved. We were able to see issues before our customers did, so we have better transparency, uh, in the system. No longer were we dependent on one big server; a thousand customers were no longer dependent on one big server. Um, so it meant that we do have really good fault tolerance on those containers.
If one of them dies, ECS will literally kill it and, uh, bring up a new one. Uh, it will also do some auto scaling for us. Say on a Monday morning, you maybe have eight containers running, but on a Friday, maybe it'll auto scale to 14. So that's been groundbreaking for us in terms of how we work. We went from shipping somewhere between monthly and quarterly to daily.
And something I use as a, uh, a team health metric right now is our frequency of deployment. And I'd say we're hitting about 25 deployments a week now, which compared to the olden days is great. We always want to get better at it. I would say those have been really amazing things for us, but also in terms of the team, it's a lot easier for us now to hire a new engineer, um, and bring them in, because of this consistency.
And also, um, I guess, uh, we're not relying on these pockets of knowledge anymore. So again, around hiring, it's a lot easier for someone to come into the system and know how things work. And I think in terms of hiring as well, when you talk about this kind of setup, you know, there's some good stuff happening there.
Jeremy: [00:50:10] It sounds like you have a better picture in terms of monitoring the system, you brought your costs down significantly. The deployment process is much easier. The existence of the containers and ECS is kind of serving the purpose of where people used to have to monitor the individual servers and bring them up themselves.
But now you've sort of outsource that to Amazon, to take care of. Uh, does that sound, does that all sound correct?
John: [00:50:42] Yeah. Spot on.
Jeremy: [00:50:43] And I find it interesting too, that you had mentioned that improving all of your processes actually made it easier to bring new people in. And that's because you were saying things are now more clearly defined in terms of what to do, rather than all of this information kind of being tribal in a sense.
John: [00:51:06] Yeah. Like a typical example would be, hey, uh, let's redeploy this, uh, bug fix. And previously, you know, it might be a Capistrano deploy, or, uh, you know, oh, you need to get SSH keys to this thing, and you need to log in here, and you need to build this code on this local machine and try and ship it up.
And that just all goes away. Um, particularly with Docker, that continuous integration pipeline just sets a really good set of standards and things that people should find quite easy and repeatable.
Jeremy: [00:51:40] And, uh, so now in terms of deployment, you can use something like CloudFormation, and you have the continuous integration process that can deploy your software without somebody having to know specifically how that part works.
John: [00:51:58] Exactly. So I would say if we wanted to create a new service responsible for some new functionality in Phorest, say a Spring Boot application, a Java application, they can simply provide a Dockerfile and get that deployed to dev, staging, or production with, I would say, 10 lines of YAML configuration.
So you could go from initial setup of a project to production in a day, if you wanted to. Just zero friction there, I would say.
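As an illustration of what such a setup might look like, this is a hypothetical sketch of the kind of ~10-line per-service YAML John describes, not Phorest's actual configuration; every field name here is invented for the example.

```yaml
# Hypothetical per-service definition -- invented names, not Phorest's
# actual pipeline. A new service supplies a Dockerfile and roughly this.
service:
  name: booking-service
  dockerfile: ./Dockerfile
  cluster: production
  cpu: 512
  memory: 1024
  desired_count: 2
  autoscaling:
    min: 2        # quiet Monday morning
    max: 14       # busy Friday afternoon
  healthcheck: /health
```

A CI pipeline would turn something like this into an ECS task definition and service, which is where the "Dockerfile plus 10 lines of YAML to production" experience comes from.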
Jeremy: [00:52:29] It really makes the onboarding a lot easier then
do you think your team waited too long to change their processes?
Or do you think these changes came at just the right time?
John: [00:52:42] I would say if we had waited any longer, it could have been detrimental to, I guess, the health of the business.
I think that the guys did a great job in terms of getting us to a certain point, but we would have risked technical decay, I would say, and, uh, really, uh, harming the organization if it had gone any further. I would say it was a lot of work to do this, and it could have been easier if we had paid more attention to technical debt and making the right decisions earlier on. So maybe saying no to that customer who wants a bespoke piece of functionality. Well, you have to do what you have to do.
Jeremy: [00:53:24] So you would say maybe identifying earlier on just how much the current processes were causing you problems.
If you had identified that earlier, um, you think you might have made the decision to try and make these changes, uh, at an earlier time.
John: [00:53:44] Yeah. So the guys earlier were making really good decisions, but maybe they didn't have the experience with, you know, higher-scale solutions and systems.
So it's about hiring the right people at different stages of where the product is evolving, I would say.
Jeremy: [00:54:00] Given what you've gone through with the migration of Phorest, what advice would you give to someone building a new process? What, what can they do to keep ahead of either technical debt or any of the other issues you face?
John: [00:54:18] I think it's actually, uh, a people and cultural thing along with tech decisions. So everybody needs to be really aligned in terms of the decisions that they're making, rather than letting people go on an individual basis. I think there needs to be good leadership in terms of getting a group of people thinking the same way.
I reckon the technical currency is extremely important. And as your system grows, you need to be able to look back and identify areas of pain, and by pain, I mean, you know, speed of deployment, uh, speed of development, ability to adapt and change your software. So if you notice that a feature that used to maybe take a week now takes two weeks,
you know, you probably need to take a really hard look at that area of the system and figure it out. Could it be simplified? Um, and why is it taking so long?
Jeremy: [00:55:21] Basically identifying exactly where your pain points are, um, so that you can really focus your efforts and have an idea of what you're really going for.
John: [00:55:31] Yeah. You need to build, um, an environment of trust. And I would also say that you need to be able to be calm, confident, and okay with failure, in terms of taking risks sometimes and saying no to features and customers, to be able to push back on leadership and make sure that you're really evolving the system the right way.
Uh, not just, uh, becoming a feature factory.
Jeremy: [00:56:01] Yeah. It's always going to be a kind of balance on, you know, how much can you pull back, but still stay competitive in whatever space you're in.
John: [00:56:12] Yeah. So what we're doing right now, based on those lessons, is we try to do a six to eight week burst of work.
And we would always try and take a week or two of wiggle room between that and starting something new, to look back at what we just built and make sure we're happy with it, but also look at our technical backlog and see if there's anything there that's really painful, you know? Just, for example, this week we noticed an issue with a lot of builds failing on our CI because of, uh, how it was set up to push Docker images.
So they would fail, and that was actually a real pain point for us just over the last couple of months, because maybe a deployment which should take 20 minutes was taking 40, because you'd have to re-trigger it. So that's an example of us looking at what was high value and making sure we just fix it before we start something new.
Jeremy: [00:57:08] So making sure that you don't kind of end up in the same situation where you started, where these technical issues sort of build without people noticing them. Instead, in shorter iterations, doing sort of a sanity check and making sure, like, everything is working and we're all going in the right direction.
John: [00:57:27] Yeah. It's about the team. And I mentioned before, it's about, you know, the leadership and a group of people together talking through common issues, and, you know, maybe meeting every two, three weeks, talking about some key metrics in the system. Why is this too high? Why is this too low? You know,
kind of through your peers you can really see the pain points, and they'll more than likely tell you them.
Jeremy: [00:57:51] When you look back at all the different technologies and processes you adopted, did you feel that any of them had too much overhead for someone starting out? What was your experience in general?
John: [00:58:04] So some people just didn't like doing code reviews. Some people just really felt that they could just push what they needed, and that it was almost a judgment on them, the code review process, which it totally wasn't. I would say, uh, some people found Jenkins and continuous integration a bit, you know, what's the point?
So we had some, you know, some pain points there. Um, but as we got to Docker, people started seeing the benefits of these things: you know, fewer bugs going into production, uh, fewer things breaking, people being able to go home nice and early in the evening and not being woken up in the middle of the night with, uh, you know, an outage call.
Those were all the benefits, and that's reaping the rewards of thinking like this.
Jeremy: [00:58:56] Your team was bringing on a bunch of new things at once. What was your process for adopting all these new things without overwhelming your team?
John: [00:59:06] So it was starting at the foundation. The continuous integration and the code reviews were incrementally brought in, and we had regular team meetings to discuss pros and cons.
And it was really important for people to have input on those things, rather than for us to just implement them. They would have failed if we hadn't done it like that. It took time, and I would say we're still not in a perfect world, but it's about group consensus and making sure that everyone's bought in to what we're trying to achieve.
Jeremy: [00:59:39] So basically getting everyone in the same room and making sure they understand what exactly the goal is. And everyone's on the same page.
John: [00:59:47] Yeah. So we try to make a big effort, uh, particularly for people who are working remotely, to get them all in the same room once a quarter. We talk about our challenges, talk about our goals, talk about our values, and make sure we're all on the same page.
And sometimes we tweak them, and, you know, that's how we feel it's best to do it.
Jeremy: [01:00:09] Finally, I want to talk about what's next for, for Phorest. What are the remaining pain points you have now. And what do you plan on tackling next?
John: [01:00:20] So right now we have 4,000 salons on our platform. We're really happy with the state of the infrastructure to get us to maybe 8,000-10,000 salons, but we need to be really conscious of the company's growth and our goals.
So we need to make sure that we can scale at a much bigger level, and we also need to make sure that our customers aren't affected by, uh, our growth. We're looking at serverless for any kind of newer pieces of the product, to see if it can help us reduce costs even more and help us stay agile in terms of our infrastructure and how we roll out. A couple of years ago, when we launched into the USA, we noticed it doubled our overhead in terms of infrastructure, operations, and deployment.
And as we grow in the US, we need to be really conscious of not repeating, I guess, uh, mistakes from the past.
Jeremy: [01:01:15] So you're mostly looking forward to additional scaling challenges and possibly addressing those with serverless or some other type of technology.
John: [01:01:27] Yeah. So, um, one area in particular will be our SMS sending.
So that's kind of the plan for the next six to eight months: to make sure that we can continue to scale at the growth rate of SMS and email sending, which is huge in the platform.
Jeremy: [01:01:44] Um, you said so far you've been experiencing 30% growth year over year. And you said when you moved to the US, you actually doubled your customer base?
John: [01:01:56] I'd say we doubled our, uh, overhead in terms of infrastructure, managing (?) deployments. We're still very early stage in the US, and that's our big focus for the moment. But as we grow there, we need to be, I guess, more operationally aware of how it's going over there. It's a much bigger market.
Jeremy: [01:02:17] To kind of cap it off, how can people follow you on the internet?
John: [01:02:21] Sure. So you can grab me on Twitter at JohnWilDoran, J O H N W I L D O R A N. And if you ever wanted to reach out and talk about any of this type of stuff, I'd love to meet up with you. So feel free to reach out.
This episode was originally on Software Engineering Radio.
Transcript
Jeremy: Today, I'm speaking with Sumit Kumar. Sumit is the head of engineering at Share Now, which is previously car2go. He's also the creator of an open source plugin called leaflet-geoman, which provides drawing tools for the leaflet mapping library.
I'll be speaking with Sumit about his experience working with leaflet and how developers can build mapping applications with it. Sumit, welcome to software engineering radio.
Sumit: Hello. Thanks for having me.
Jeremy: So the first thing I'd like to start with is for people who aren't familiar with leaflet, what is it and what types of projects would people build with it?
Sumit: So leaflet is basically a mapping library, a JavaScript library for mobile friendly interactive maps. That's how they say it. And if you want a map and you don't want to use Google maps, for example, for whatever reason, leaflet is basically an open source alternative to Google maps and other mapping providers. Of course, you still need a map itself, so basically the images of a map, where you can have OpenStreetMap or any other provider. But with leaflet, if I had to describe it, you basically create the layers on top of the map. So that means polygons, markers, points of interest, to show any data that you might have, and to zoom around the map and stuff like this. So it's the library around the map itself, and it gives developers really good tools to do their own mapping solutions.
Jeremy: I'm sure a lot of people are familiar with Google maps. What are some of the main reasons you might not choose it?
Sumit: So in big corporations, a lot of companies don't want to be tied in. There are licensing reasons. Especially in Germany or in Europe, there are data protection reasons. There is simply a stigma attached to Google in that sense. Then, Google uses a custom format for the geo data, while with leaflet you can use an open standard called GeoJSON.
You can use it with Google too, as far as I know, but it basically will transform everything to the Google format, and it's much more expensive. That is also a reason, especially for startups or individual developers with side projects: if they have a big volume, if they expect bigger traffic, then Google maps is quite expensive.
Jeremy: So what are some examples of sources people would use?
Sumit: Yeah, that's a good question. Normally people use OpenStreetMap, which is free to use. But I think the OpenStreetMap default maps are not very pretty. So for all the projects that I do, even if it's open source, I use Mapbox. Mapbox is a company that, I would say, is built on top of leaflet.
Last time I checked, even the core maintainers of leaflet work at Mapbox. So the company grew out of leaflet, from my understanding, and it's highly compatible with leaflet, and they provide beautiful maps. This is not free, so I pay for great maps, even for my open source projects.
But you can use basically any provider you want. So there is also HERE maps, a European provider. There is Google maps, of course, you can use that. I'm not entirely sure about Apple maps, if you can use that with leaflet. But any provider where you can fetch the tiles, you can implement into leaflet. And leaflet is a mapping tool, which means it's not only about satellite maps or street maps. You can also use indoor maps. You can use a map of the Moon or Mars. You can use maps of video games or Game of Thrones, it doesn't matter in the end.
Any image that you can take of an area can be a map. And this could even be a (blueprint) a map of your house. You could even use that as a digital map and create whatever you need on top of it.
Jeremy: And in that case would you have a JPEG or a static image and then you reference that so someone's able to click around that?
Sumit: I've never done a tile layer myself, but I've seen an application built with the library that I maintain. And this is a SaaS application for construction sites. So they fly a drone on top of the construction site and take some HD photo of it, or even 4K, I'm not entirely sure.
And then they create a map out of it, that they then draw on top of, on the construction site, which is really interesting. And as far as I know, it's quite easy for them to make a tile layer out of it. The biggest problem with a tile layer is always zooming in and out of the map. So you need different distances basically.
And of course the file size. So if you do an HD photo, and you open a map, and it has to load multiple HD photos, this creates quite some load and is slow for the user. So this optimization is the hardest part. But other than that, if you research how to do it, I'm pretty sure there are tools that make it quite easy.
Jeremy: When you talk about tiles for a map is it where you're starting with a really high resolution image and then you're cutting it into a bunch of pieces so as the person zooms around the map they're not having to load that entire image?
Sumit: It's optimized for exactly this. So it loads only as many tiles as are visible on screen. So if you zoom in, then it only loads the tiles that are visible. These are usually, I would say, six images or so, and they should be quite optimized in size.
If you have a huge image that covers a big area, and you zoom out, then instead of loading 1,000 tiles, you load only six tiles that show more area but in less detail. The load for the user is basically always the same, no matter which zoom level. And you start with a high resolution image and then cut it down.
But again, I've never done it myself, so I'm not 100% sure on that.
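For readers curious about the mechanics being described: standard "slippy map" raster tiles, the scheme OpenStreetMap uses, put the world on a 2^z by 2^z grid at zoom level z, so zooming out trades many detailed tiles for a few coarse ones. A minimal sketch of that math (the conversion formula is the standard web-mercator one):

```javascript
// Slippy-map tiling: at zoom z the world is a 2^z x 2^z grid of tiles.
// Zooming out by one level quarters the number of tiles covering an area.

function tilesAcrossWorld(zoom) {
  return 2 ** zoom; // tiles per axis; total tiles = (2 ** zoom) ** 2
}

// Standard web-mercator conversion from longitude/latitude to the
// x/y tile that contains it at a given zoom level.
function lonLatToTile(lon, lat, zoom) {
  const n = 2 ** zoom;
  const x = Math.floor(((lon + 180) / 360) * n);
  const latRad = (lat * Math.PI) / 180;
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  return { x, y, zoom };
}
```

A tile server then serves each tile as a small fixed-size image at a URL like `/{zoom}/{x}/{y}.png`, which is why the load on the client stays roughly constant at any zoom level.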
Jeremy: When people talk about map sources, sometimes they talk about raster tiles and sometimes they talk about vector data. What's the main difference between the two and when would you use each?
Sumit: Oh, the vector data itself, if you are a front end developer, you can think of it like an SVG. So it's an image versus an SVG. An image has fixed dimensions, width and height, and if you zoom in, the quality gets lower. If you zoom in with an SVG, it's rendered by the browser, so it has infinite scalability in a sense. And with vector maps, it's the same. I'm not sure if that is a bit of a stretch to mapping experts, but the good thing for me, if I use vector maps, is you can zoom in and out very fluently. There aren't particular steps to the next image; it's a smooth transition.
And especially if you use something like canvas mode in leaflet, it's a much smoother experience for the user. But these are not individual images anymore. So I'm not talking necessarily about the tiles right now, but about the layers you put on top. So let's say you have markers of Tesla superchargers, for example, and you put them on the map.
These can be individual DOM elements, and if you have 6,000 of them, it creates a lot of load for the browser. But if you have something like canvas mode, it's drawn inside one element, and everything is smooth and performant again. You have the downside that you cannot interact with the DOM elements individually, of course.
But I might be getting a bit outside of your question right now, because with vector maps themselves, if you refer to tiles, there might even be some additional advantages that I'm not sure about yet. What I do know is that Uber, for example, created their own library, also on top of open source products. I think it was on top of Mapbox. And they use only vector tiles like this, because they have huge data needs and they need very performant maps.
And if you use leaflet just out of the box and you have big, big, big data, then you might get into performance problems. Problems that I have myself but have not yet solved in particular apps that I've built.
Jeremy: And when you're talking about these big sets of data, is this the mapping data in terms of things like streets and locations, or is this more overlayed on top of that?
Sumit: That's definitely overlays on top of that. So I can give some examples. I'm working in the mobility sector at a car sharing company; that's also how I got into leaflet 6-7 years ago, because they needed a geospatial data management tool, basically, which I built. And the data that they use are points of interest, electric charging stations. We have parking zones, drop zones for the cars. We separate the city into polygons to track demand in specific areas, of course. And then you have other companies like (masque?), which is a local logistics company. They have zones where their ships drive and stop at the harbors. Tesla, for example, has supercharger stations, which is small data compared to that. And there's ridiculously detailed data if we are talking autonomous vehicles: then you need data like separating the road into different lanes, for example, one going in one direction, the other in the other direction. You have parking spots, and this for a whole city or a whole country. This is data that is so big and so much, your browser would just crash if you tried to display it in leaflet alone.
Jeremy: Hmm. And so in the very simplest case, if someone is trying to put up a list of parking locations or superchargers, what does the API look like for leaflet? Are people calling functions that are adding these one by one, or is it syncing to a collection? What does that look like?
Sumit: So there are multiple options in leaflet, which is quite cool. So normally what I do is I try to use the data always in GeoJSON. So I might store it in a different format, but in general the APIs I build around leaflet use GeoJSON, and in leaflet you can add GeoJSON simply.
So there is a GeoJSON method where you just put in data and it displays it on the map. You can also create your own shapes. That means a circle marker, a circle, a polygon, a line, a marker. Then you basically give it two coordinates and it creates the shape on the map.
With GeoJSON, it's a wrapper for everything. So if you provide GeoJSON, it can be markers, can be polygons, can be lines, and leaflet will just add everything. It depends on your needs or the source of data that you have. And particularly with my library, where I create the shapes so the user can draw the shapes him- or herself, I need all of those functions, basically.
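A FeatureCollection like the one Sumit describes is plain data; the coordinates and property names below are invented for the example. Only the final commented line touches leaflet itself (`L.geoJSON` is the method he refers to, and it needs a browser and a map to run):

```javascript
// Building a GeoJSON FeatureCollection by hand -- plain data, no
// leaflet required. GeoJSON coordinates are [longitude, latitude].

function makeFeatureCollection(features) {
  return { type: 'FeatureCollection', features };
}

// A point of interest (e.g. a charging station) ...
const point = {
  type: 'Feature',
  properties: { name: 'Supercharger' },
  geometry: { type: 'Point', coordinates: [13.405, 52.52] },
};

// ... and a zone. Polygon rings must close: first position == last.
const zone = {
  type: 'Feature',
  properties: { name: 'Parking zone' },
  geometry: {
    type: 'Polygon',
    coordinates: [
      [[13.4, 52.5], [13.41, 52.5], [13.41, 52.51], [13.4, 52.5]],
    ],
  },
};

const data = makeFeatureCollection([point, zone]);

// In the browser, leaflet renders the whole mixed collection in one call:
// L.geoJSON(data).addTo(map);
```

This is why GeoJSON works as "a wrapper for everything": one call handles markers, polygons, and lines alike, dispatching on each feature's geometry type.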
Jeremy: And, you were saying that GeoJSON is a good option for storing the locations of different things, storing shapes. How about data that changes often, like if you have a car sharing application you might have live locations of where cars are. Would GeoJSON be suitable for something like that?
Sumit: So I'm sure there are people that disagree with me. I had discussions actually with developers from Uber and also some, let's say, mapping experts from other companies, and the opinions vary. I can only speak for myself. Even though JSON or GeoJSON might not be the most efficient storage format out there, it's basically a standard in modern APIs to use JSON or GeoJSON as the format to exchange data.
And I like to not have a lot of transformations with my data, so I store it as much as I can in GeoJSON. It works fine for me. I don't have any restrictions or downsides from that. I don't build applications on the scale of Uber, but Uber also told me they store in GeoJSON and they don't have a problem with it either.
They like to use open standards and I agree with that. And if we are talking live data it doesn't matter how frequent the location, for example, of the marker is updated. Let's say every half a second, you update the location of a moving car.
You simply store the two coordinates into it, and you can build it so that you store the entire GeoJSON again, or you store only the changed coordinate, really a subset of the actual data. It doesn't matter how you build it. It's so small in comparison that it will only make a difference if you update, I don't know, a thousand or a million entries all at the same time.
Then of course your network is gonna slow down a little bit, but this has nothing really to do with GeoJSON or JSON, this is any data. If you have that many data updates you should look at something like a message queue, like RabbitMQ or Kafka or something like this. But if you do it just over a REST API, I would personally batch the requests.
So every second or every two seconds, I would update all the entries that have been changed. If we're talking about a front end or something and batch everything together.
Jeremy: So to make sure I understand correctly, if you were getting the locations of cars, and you query the backend API, you would get a GeoJSON file, which is basically a regular JSON file that has a bunch of elements that might have the latitude and longitude of each car. And as you got updates, maybe you get an update every second or every few seconds. You would receive new GeoJSON files from the server and it would just have the elements that had changed or new elements?
Sumit: So, there are two different points in, let's say, an application stack where you have these updates. The first one is the car sends an update to your backend. And if we are talking about our company, for example SHARE NOW, or companies that use my open source product, these might be, for example, 10,000 cars always connected to some sort of backend, and they send their location maybe every second. For example, let's say you have a network hiccup and they all reconnect at the same time. You have a huge spike because 10,000 cars send their location at once, and this is an area where we at least had the experience: don't use just HTTP, use a message queue, so we can handle all of the data. Because the cars do not only send a location, they also send is the window down or up, motor start and motor stop events; everything that happens in a connected car is sent to a server.
And this can be multiple events per second if we're talking 10,000 to 100,000 vehicles, or scooters or whatever it is. This load should be handled by a message queue like Kafka or RabbitMQ. Once it's in the backend and you just, in quotation marks, want to display it on a front end, then we are talking a different story here, because the front end doesn't have a message queue like this we can use. If you want to do it really in a real-time sense, you can use WebSockets, where you send incremental updates as they come in to the front end and change your data. It can be basically an ID and a new latitude and longitude, and you connect the data on the front end.
It can be new JSON. It's completely dependent on how you build it. I normally send some metadata with the JSON that is, for example, a name of the license plate, name of the car, maybe the model, is this a Mercedes or BMW or whatever. And maybe even some other data depending on the project, depending on the data that you get.
There is maybe a lot of metadata involved. Then I don't send the entire JSON, I just send the updated information. But if you are not using a WebSocket, if you're using HTTP, then there is no push from the server. The browser has to fetch updated information. So in this case, what I would do is, let's say I display Berlin on a map, taking as an example SHARE NOW and car2go here. Of course this can be applied to any company out there.
If we are talking 2000 cars, for example, and I want to show them in real time, I will simply fetch the list of all the cars every second or every two seconds. So there is no push or anything. I would just do a simple interval request.
Jeremy: And so you're doing this fetch on this interval, and is it taking into account what it chooses to display? It's just taking that entire document and showing everything that you're sending, or is it performing a diff, comparing the elements you already had and the new elements?
Sumit: Yeah, that's a good question. So I do a diff, but it depends on the project. So I am building a little SaaS product around geo management basically, and there I use diffs. I made the mistake of not diffing at first, and the reason is, if you redraw the layers on a map, it's a lot of performance overhead.
And if you do this every second while you are maybe creating another layer on the map, or interacting with the map in some other way, this is a performance problem. Also, the map constantly rerenders your data, and the map simply gets laggy. If you want to do anything else on the map it gets laggy, and it's not good performance.
So I use diffing that if out of 2000 cars only two move, then they will get updated and the rest is just thrown away.
Jeremy: And internally. So when you're talking about using this diff, does that mean on the leaflet side or on the JavaScript side, do you have this GeoJSON document and then you're updating that GeoJSON document and leaflet just figures out that only those two elements changed and it doesn't touch the rest of the page?
Sumit: Yeah, that would be great. But sadly, as far as I know it's not like that. So you give leaflet something to draw and it will draw it. So I do the diff basically on the business application side, on the business logic side; I do this myself. When I get the payload, I diff it against the payload that is already there, do a comparison with lodash, and remove everything else.
And, then I have to remove the layers first that are currently in leaflet, and then I feed them the new ones. And I also have my own IDs associated with everything. So I can compare also individual shapes on the map or layers on the map.
Leaflet has IDs that they create when you add something to the map. But they are not consistent. So if you add a marker to the map it has an ID. If you want to add the same marker to the map again with a different location it gets a different ID.
You can get the current one on the map and update that one. That is also possible, but then you would have to buy in completely into the leaflet logic, including the IDs and how to handle everything. And I like to separate... I like to have leaflet just as a dumb drawing tool and not as the source of truth for my IDs and business logic and all of that.
So I do this outside. I do the diffing outside. I do the storing to the database outside, and leaflet is just a component. I feed it data, it draws it, and that's, that's about it.
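A dependency-free sketch of that diffing step, assuming each feature carries its own ID under `properties` as described. The function name is made up, and lodash's deep comparison is stood in for by `JSON.stringify`, which is enough for plain JSON payloads built the same way each time:

```javascript
// Given the previous and the new payload (arrays of GeoJSON features,
// each with its own id in properties), return only the features that
// are new or whose content changed. Everything else can stay untouched
// on the map, so leaflet does not redraw 2000 unchanged cars.
function changedFeatures(prev, next) {
  const prevById = new Map(
    prev.map((f) => [f.properties.id, JSON.stringify(f)])
  );
  return next.filter(
    (f) => prevById.get(f.properties.id) !== JSON.stringify(f)
  );
}
```

For each feature this returns, you would then look up the old leaflet layer by your own ID, remove it, and add the new one.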
Jeremy: Mm. So in situations where you have live data and you have a lot of data changing, it sounds like you actually aren't using GeoJSON in that case, you have your own collection outside of leaflet. And you're using that along with your own diffing to determine, which elements you should find inside of leaflet based off of a key.
And then, moving just the ones that you've found that have changed?
Sumit: No, it's GeoJSON. It's all GeoJSON that I store. For example, the SaaS tool that I built has an API where you can fetch the layers that are displayed on the map, and this is all GeoJSON. And internally, I only use GeoJSON. That means if you put it into leaflet, internally in leaflet it's also transformed to something else, of course.
But from the map, you can just say, okay, give me these layers and make them GeoJSON basically. And then this is how my whole application uses it. But GeoJSON has a properties block, and in there you can add all the metadata you want. So in there, I have the IDs that are basically the reference to my database.
I have a gravitational center of a specific polygon, for example. I have descriptions, names; anything the user would like to see I store in the properties of the GeoJSON and use that everywhere. So it's still all GeoJSON.
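The `properties` block is part of the GeoJSON specification (RFC 7946) and may hold arbitrary JSON. A made-up example of the kind of layer described here (all property names are illustrative, not a standard):

```javascript
// A single stored layer: the geometry plus application metadata in
// the properties block.
const layer = {
  type: "Feature",
  geometry: {
    type: "Polygon",
    coordinates: [[[13.3, 52.5], [13.5, 52.5], [13.4, 52.6], [13.3, 52.5]]],
  },
  properties: {
    dbId: "layer-42",          // reference back to the database entry
    name: "Service area north",
    description: "Cars may be parked here",
    center: [13.4, 52.533],    // precomputed gravitational center
  },
};
```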
Jeremy: You have let's say a single GeoJSON document for all of your car locations. And if you go inside that GeoJSON find the key of a car that you want to move, you can update the location through the GeoJSON?
Sumit: That's actually a pretty good question, and it just depends a bit on the application that you're building. So GeoJSON allows you to group layers together. That means if you have one polygon, cool, you can have that. If you have two or three or a hundred, you can group them in what's called a FeatureCollection.
You can store it this way, that's completely fine. Then you have big payloads, but you have to get into the GeoJSON and get the specifics out of it. In the SaaS application, I have the use case to count and limit the layers that a user can draw on a map, and this means I want these to be separate database entries.
I can also choose to share one layer versus the other with the public or with a colleague or whatever, and to have this granular permission system I need these to be separate database entries. That means each layer on the map is, for me personally, a different GeoJSON. But again, this is highly customized to my use case.
I'm sure there are users out there who, if they fetch my data as GeoJSON through the API, would not like to have a thousand different GeoJSONs in an array or a collection for all the markers. They would just like to have one big GeoJSON file that holds all of the data.
This is easy to do. You can just combine them. That's fine. And I will do this on the API level. But for my use case, I store them separately.
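Combining separately stored layers at the API level really is that easy, since a FeatureCollection is just a wrapper object. A sketch (the helper name is made up):

```javascript
// Each layer is stored as its own GeoJSON Feature (separate database
// entries). At the API level they can be merged into one
// FeatureCollection for clients that prefer a single document.
function combine(features) {
  return { type: "FeatureCollection", features };
}
```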
Jeremy: Okay. So when we're talking about GeoJSON in a lot of cases you actually choose to have a separate GeoJSON document for each marker or each thing that you're going to put onto the map.
Sumit: Exactly.
Jeremy: We were talking about how you're bringing in data on the backend and then you're bringing it in as GeoJSON on the frontend.
Is the GeoJSON, is that something that you're converting at the API level or are you storing your data as GeoJSON on the back end as well?
Sumit: This is the embarrassing part of the application. I've been building this application for a long time, and there is tech debt of course, and that is one thing.
So, what got me up and running fast: I use Firebase as data storage, or Firestore it's called. And this just helps me ramp up an application quite fast. And I still use it even after many months, I think a year now, that I'm building this application. And in there, you have a limitation on the documents, on how you store them.
The document format is limited. So I cannot have nested arrays, but GeoJSON has quite a lot of nested arrays, especially if you have MultiPolygons. That means multiple polygons belonging to one layer, acting as one layer, basically with holes in it, for example. This requires multiple levels of nested arrays.
You can't just store this in Firestore. You have to store it as a JSON string. So basically this means I store my GeoJSON as strings currently. That means on a database level I can't do any queries. Anything I do with the data, I first fetch it via cloud functions or a node script or whatever.
I fetch the data first. Then I do the calculations on top of it. For example, is a point inside this GeoJSON or are these polygons near the user or whatever? I do all the calculations and then I serve the data as GeoJSON of course. So I store it as GeoJSON in a sense, but it's a string.
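A sketch of the workaround described here; the document field name is made up:

```javascript
// Firestore rejects deeply nested arrays, so a MultiPolygon's
// coordinates (arrays of arrays of arrays) cannot be stored as-is.
// Workaround: serialize the whole GeoJSON to a string on write and
// parse it again on read -- at the cost of losing database-level
// geo queries.
function toFirestoreDoc(feature) {
  return { geojson: JSON.stringify(feature) };
}

function fromFirestoreDoc(doc) {
  return JSON.parse(doc.geojson);
}
```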
And if I would have to do it again, or if I look into the roadmap for the future, I would probably go to something like MongoDB, where you can... I'm not sure if it's native GeoJSON support, but the important part is that if you build an application that basically does a lot of geo queries and stuff like this...
If it's made to handle spatial data, use a database that can do geo queries on a database level. That means I give the database a coordinate, like my location, and do a query like: give me all polygons where this point is inside the polygon. If you can do this on a database level, it adds a lot of performance, because otherwise, if we are talking thousands or millions of layers, you have to fetch everything and calculate this on the backend side.
And this is expensive. So I would go to a database that allows geo queries. Firebase has this in a very, very limited format. I asked about this on Stack Overflow; I saw the question yesterday again. I asked this about two years ago, 2016, and back then they answered that they haven't exposed their geo queries yet, and they still haven't.
So I'm not sure when they will come out with it, but if I needed it, I would switch databases; I will not wait for that. And if I would start over, I would choose a database beforehand that allows me to do that.
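For what it's worth, MongoDB does accept GeoJSON geometries directly and can index them with a 2dsphere index, which enables exactly the "which polygons contain this point" query described here. A sketch, with made-up collection and field names:

```javascript
// "Give me all polygons that contain this point" as a MongoDB filter.
// Assumes documents store a GeoJSON geometry in a `geometry` field
// covered by a 2dsphere index; collection/field names are illustrative.
const userLocation = { type: "Point", coordinates: [13.405, 52.52] };

const query = {
  geometry: {
    $geoIntersects: { $geometry: userLocation },
  },
};

// With the Node.js driver this would run as something like:
//   db.collection("layers").find(query).toArray();
```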
Jeremy: And for your work at share now, how are you storing the geo data there?
Sumit: So, I have not touched that particular product in quite a while. But back then, when we built it, it was GeoJSON, but we used MongoDB. And I'm not entirely sure if you can simply, without any transformation, store GeoJSON in MongoDB, but MongoDB has geo queries. So I know that much, but I can't tell you right now off the top of my head if you have to transform it, or if MongoDB accepts GeoJSON but transforms it internally. That I don't know.
Jeremy: We're talking about how you can have an application with a lot of data but depending on where the user is looking on the map they don't necessarily need to bring in all that data at once. What are the strategies you use to deal with that?
Sumit: That is a good one. So what geo queries often allow you to do is if I look on a map you can define the borders by basically the top left, top right, bottom left, bottom right corner of your screen. So if I have a map on my screen leaflet and any mapping library can basically give you the boundaries that you are currently looking at.
And these are these four coordinates. And if your database supports geo queries, you can basically say give me all layers that are inside these boundaries and then you can just fetch that data. This is how I would do it from the top of my head. Honestly, I've not built it like this because I've not had to deal with those big datasets.
And there might be downsides to this. If the user for example moves the map how many queries do you do? Then you have a moving target, basically. But I'm sure there are ways to solve this. So this would be my first clue on how to achieve this.
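A sketch of the viewport-query idea using those four boundary values; leaflet's `map.getBounds()` really does expose `getSouth()`, `getWest()`, `getNorth()`, and `getEast()`, but the endpoint and parameter names here are invented:

```javascript
// Build a bounding-box query URL from the map's current viewport.
// In the browser you would pass `map.getBounds()`; here any object
// exposing the same four methods works.
function boundsQueryUrl(bounds) {
  const params = new URLSearchParams({
    south: bounds.getSouth(),
    west: bounds.getWest(),
    north: bounds.getNorth(),
    east: bounds.getEast(),
  });
  return `/api/layers?${params}`;
}

// In leaflet, refetch when the user stops moving the map:
//   map.on("moveend", () => fetch(boundsQueryUrl(map.getBounds())));
```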
Jeremy: When you're working on a leaflet application, do you have any experience integrating a leaflet map with other frameworks like React or Vue?
Sumit: That is a very good question. I get this quite often on the open source library as well. And with mapping libraries like leaflet you are basically back to the roots of HTML and JavaScript where you interact with the DOM elements directly.
And with the introduction of frameworks like Angular and React and now Vue we moved past this. These are all abstractions to the DOM element. We say create this div and create this div. We don't do this anymore. But with leaflet you still at least a little bit have to do this in the sense that, for example, you tell it the div that it should render in and you call a specific element directly, sometimes through the leaflet API of course, but it doesn't have this reactive nature like our frameworks today do.
So what happens is that there are a lot of abstractions to mapping libraries. So there is react-leaflet, vue-leaflet, all these abstractions, to use the mapping library in the same reactive way, in the same thinking mode of the framework, which is good for anyone that is starting out.
I personally like to use it bare bones. I like to interact with leaflet directly, with the API it provides, so I have full control over it. Maybe I'm an advanced user in a sense. But the abstractions that are out there, they limit me, because I always have to go through them. If there is a new feature or an edge case I want to tackle or something, I always need the buy-in of the maintainer of that abstraction, and this is something I don't want.
Sometimes I build my own abstraction but normally I basically build a component in Vue and it doesn't matter if it's React or Angular or Vue. I build a component that owns the map itself. And as a property I put the GeoJSON in there, for example. And the component does all the rest, maybe the diffing even, the rendering, the updating, the watching of the properties.
But you have to build your watchers yourself and stuff like this. Not every developer is as comfortable with this, especially if you started programming in the world of React, so you're not used to this barebones coding.
But if you use other libraries that interact with the DOM elements directly, then it's a good exercise to connect these two worlds. Another use case where you could use these skills is charting libraries. So if you display charts, for example on a dashboard, it's the same topic there.
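A framework-agnostic sketch of the pattern described here: one component owns the map and is the only place that touches leaflet. The class and method names are made up, and the leaflet object `L` is injected so the logic is testable; in Vue, `setData` would be called from a watcher on a `geojson` prop.

```javascript
// A "dumb drawing" map component: it owns the leaflet instance and
// exposes one method that replaces the displayed GeoJSON layer.
class MapComponent {
  constructor(L, map) {
    this.L = L;      // injected leaflet (or a stub in tests)
    this.map = map;  // the leaflet map instance
    this.layer = null;
  }

  // Remove the previous GeoJSON layer (if any) and draw the new data.
  setData(geojson) {
    if (this.layer) this.layer.remove();
    this.layer = this.L.geoJSON(geojson).addTo(this.map);
  }
}
```

Diffing, database writes, and IDs live outside this class, matching the "leaflet as a dumb drawing tool" approach.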
Jeremy: So basically you recommend keeping your framework code separate from your interactions with leaflet. You'll use leaflet's built in APIs, not use a wrapper just because you want to make sure you have full control and not be limited by that wrapper.
Sumit: Exactly. And if there is a new version of leaflet coming out, I want to use it immediately. It probably has some updates that I want to put in. And again, I can control the user experience much, much better if I can interact with leaflet directly through the APIs.
I wouldn't necessarily say I would recommend it. It's the way how I do it. And I think if you build an advanced, a big application, I think you are better off using the leaflet API directly. If you just want to display a map and maybe a marker on top of it to show the location of your business on your business website, then it's totally fine to use a wrapper, right?
That's easy. You will have it done in less than an hour and that's it. But if you build a big application to handle geospatial data. Then it's a different story. I would say remove the abstraction and go directly with the library.
Jeremy: For these components that are from your framework do those end up rendering surrounding the map? Like you would have a sidebar or a top bar or something like that?
Sumit: Yeah, also a good question. So on Geoman I use it. The SaaS product is basically also the demo website for the open source library, and it's called Geoman. And there, I have a map component, as I said, and it's basically fullscreen. From the DOM side, of course, I have stuff around it.
I have a sidebar in a sense, and a footer and whatnot. But with CSS I just display it differently. So the map is for me always front and center. It's the biggest element. And it should be basically like Google Maps: it should be the one big background thing, with the header and the sidebar just on top of the map.
And you can build this in different ways. You could even add this to leaflet, you could do everything in leaflet, you could even add your sidebar as a map element on the map, but I don't do this. I have the map container and I just overlay my other DOM elements on top of the map; I want to keep this really separate from the map itself. The map component is definitely one separate component for me that I interact with. Yeah, because of lock-in. Again, let's say I want to change mapping providers at one point; instead of changing one component, I would have to change, I don't know, the whole application. So I try to keep this as separate as possible.
Jeremy: So if you were to look at the HTML the DOM nodes that have all your controls are your sidebars, things like that, they would be completely independent of the leaflet DOM node, but you use CSS maybe with a fixed position or something to appear on top of the map, is that correct?
Sumit: Yeah, that's true. I would make one exception though. I built an open source library, as you mentioned, for drawing, basically, layers on the map, markers, lines, whatever. These buttons I add to leaflet, because it's a leaflet plugin; that means any user of leaflet that wants to use my plugin just imports it.
And it adds these buttons to the leaflet map. I basically used the leaflet framework to display this, but anything else, like for example, an address search, the list of the layers that is currently on there, an export button for downloading the GeoJSON, changing the tiles from street view to satellite view, this stuff you can all add to the map.
And a lot of people do that, but I personally keep this out of the map. I have this separately, again because in my SaaS application I would like to have more control over the user experience. And maybe I have something like a paywall in front of it, or an upsell button, or I need to limit this based on user permissions.
And I simply feel more comfortable doing this in a framework like Vue instead of leaflet itself.
Jeremy: You were talking about how you built a plugin.
Can you talk a little bit about what makes sense to be built as plugins versus built outside of them?
Sumit: Yeah, also a very good question. I thought of that as well. Sometimes my thoughts went: if you have the stuff I just mentioned, an address search and the list of layers and stuff like this, what if everything of that is a plugin for leaflet? I could open source everything and just add everything modularly to leaflet. But it's a big lock-in into the leaflet ecosystem. So this is all possible, and there are plugins, I think, for all of that: there are plugins for an address search, there are plugins to switch tiles, plugins to draw, to export data, for all of that.
These plugins exist already, and they add this to the leaflet map. Again, I think if you have a use case of just having a small map showing some small stuff, or you want to create an MVP, meaning a small app that can do something, where it's not a product that you might sell to someone...
You can do all of that and it will be completely sufficient, and it's probably less work to just use a plugin to add this to the map. But if you have a product and you want to control the design aspects of it a lot, and maybe animations and stuff like this, I think for every developer it's more convenient to not fiddle in leaflet code for that. Leaflet has its own CSS also, so you have to overwrite this or compete with it over the style sometimes here and there. And keeping it out just means the data that is displayed and sent between these components has to flow through somewhere. If you build a big application, you probably have an application state somewhere, like in React, what is it called? Redux, for example. A Redux store where you have your data stored on the front end. If you use this as your source of truth, you can use it for all the components that you display on top of the map that are not inside leaflet. And it's much easier using that ecosystem than recreating everything in leaflet, because if you do that, you basically don't need a framework anymore.
You build everything in the leaflet library and you interact with the DOM again directly. So the abstraction that we mentioned before you would have to build this for basically everything, right? And there is no reactive thing anymore. You don't get the benefit from React anymore if you add everything directly in leaflet.
Jeremy: If you look at the leaflet website, there's a long list of plugins. Is there a set of plugins that you typically use when you know you're building a leaflet application?
Sumit: No, somehow I'm ignorant in that sense. No, not even the address geocoding thing; I also built this on my own. There are many, many good plugins there, honestly. What I can tell you is there are plugins that we use at SHARE NOW or car2go, and there are plugins that I constantly see people use together with the library that I am providing, and these are something like, what's it called? A marker grouping tool. It's...
Jeremy: Clustering.
Sumit: Clustering. Yes, thank you. Marker cluster is a plugin that is used a lot, and we use this as well. As the name implies, it clusters the markers when you zoom out, so you don't have 10 million markers on a map; it clusters them together.
It's also good for performance, but also for usability. So this is something that is used a lot. And then of course everything that creates heat maps and data visualization is also used a lot, at least in the circle of open source maintainers or open source users that I'm interacting with a lot.
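For reference, the clustering plugin in question is Leaflet.markercluster. A minimal sketch of how it is typically wired up in the browser (the coordinates are made up):

```javascript
// Positions of several markers (made-up coordinates around Berlin),
// given as [latitude, longitude] pairs the way leaflet expects them.
const positions = [
  [52.52, 13.40],
  [52.53, 13.41],
  [52.51, 13.39],
];

// In the browser, with the Leaflet.markercluster plugin loaded,
// nearby markers collapse into numbered clusters as you zoom out:
//   const cluster = L.markerClusterGroup();
//   positions.forEach(([lat, lng]) => cluster.addLayer(L.marker([lat, lng])));
//   map.addLayer(cluster);
```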
But honestly, if you are using leaflet, look at the plugin library and see what is there before you build it yourself. Try the plugin and see if it fits your needs, because you don't have to reinvent the wheel. For me, this might be a bit of a sickness; it's fun for me to recreate the wheel.
But of course it's a waste of time if you have it already there.
Jeremy: Yeah. When we talk about displaying things on the map. We were talking about cars earlier. And you're saying you display them in the form of markers. A lot of times applications want to show information that's associated with those markers, whether that's a label on the map or something else like that.
How do you typically approach that? Keep all that information grouped together and keep it updated.
Sumit: So I saw different implementations. As always, with the metadata that you store with your layers, you can choose to have it as its own entity outside of the GeoJSON, or you add it to your GeoJSON. And so the question is: is the marker itself your source data, your main entity? That means, do you associate the style for a marker with the marker, or do you store the style separately and store the ID of the marker in the style? That is basically an architecture question that you have to solve.
An example that might resonate more with everyone: we have cars, right? So we have cars, and the cars have metadata like license plate, which tires they have, which SIM cards they use; they have a lot of data, much more than the geo data. And so, is the location of the car just a property of the car? Or is everything from the car metadata of my location data?
And this is a distinction you have to decide on your own, depending on what your complete architecture landscape and ecosystem looks like. For me, I decided many times that the location data is the source of truth, let's say, and I put a lot of metadata in. That is because if everything the user does and everything my API does revolves around the location data, I can just as well use that as my main source of truth.
So if we go back to styling: styling in particular I would store as metadata. So you have a marker and it should resemble a car, so I want a different icon. I maybe want a direction; if we are talking apps like Uber, for example, you see the direction the car is heading. So I might have a degree value so I know how to rotate the icon on the map. I need a color, and maybe I want to display the lane that the car was driving, the route. So I want to display that, and all of this data I store as metadata. And if the user changes that, I change it in the metadata of that marker. That means if any team is fetching the data from the API, they also have the information on how to display it. And this was, for me and for us, also very important. So styling I would definitely recommend storing with the geodata as metadata. Everything else, like the fuel level of the car, that decision you have to make on your own, architecture-wise.
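A sketch of reading display metadata out of `properties` at render time. The property names are invented, and rotating a marker icon actually needs an extra plugin (for example Leaflet.RotatedMarker), so treat `rotationAngle` as illustrative:

```javascript
// Derive display options from the metadata stored in a feature's
// properties block, so any client fetching the GeoJSON knows how to
// render it the same way.
function markerOptions(feature) {
  const { color = "#333", heading = 0 } = feature.properties;
  return {
    color,                   // e.g. stroke or icon color
    rotationAngle: heading,  // degrees, to point the car icon
  };
}

// In the browser this would plug into leaflet's GeoJSON factory:
//   L.geoJSON(data, {
//     pointToLayer: (f, latlng) => L.marker(latlng, markerOptions(f)),
//   });
```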
Jeremy: If you wanted to make a customizable user interface in terms of somebody chose, they wanted to see what's the speed of the car or something like that and they want to be able to turn it off and on, you would store what the user chose in the same place.
Sumit: What the user chose in the sense of, okay, I want to see the fuel level. I want to see the license plate.
Jeremy: They want to be able to turn it on and off.
Sumit: Yeah, so that is a big use case. I think that's the use case for everyone that displays markers. Let's use Tesla superchargers, for example. You have the location of the supercharger, and if you click on one, you have the address, the exact location, how many are there, how many are currently free, and all of this data: can you eat there, are there restrooms, et cetera.
So Tesla, they don't give you the data as GeoJSON. They give you a collection of superchargers basically an array with objects. And just the location of the supercharger is a GeoJSON.
That means you have a big collection of JSON data, and a subset property of each entry is a GeoJSON. That is also how we do it at car2go and SHARE NOW: GeoJSON is just a part of a bigger dataset. But in Geoman, for example, I reversed that. I use GeoJSON as the main source and have basically most of the data as metadata inside the GeoJSON.
It doesn't make a big difference to anyone. It's just a preference of how you manage your data. And again, what is your main entity? Is your entity a car that has a location property, or is your main entity a marker, and being a car is just a property of that marker? It could also be a plane or a user.
Because I wrote an application where everything revolves around location data, I chose the location as my main entity, and that is the GeoJSON. The fact that it's a car is just metadata, a property of it with a specific icon, basically. My application doesn't care if it's a plane or a car at the end of the day.
Jeremy: I guess another thing I'd like to ask about is when people use maps, a lot of times they're on their phones as well as just being on the desktop. what's your approach or what are things to watch out for when building a site that needs to work on both desktop and mobile?
Sumit: Yeah, very tricky. Very tricky. So, two things. If you just want to display it, so again, the use case where you have a location of a business or something that you would like to display, that is no problem at all. Leaflet is very mobile friendly, and if you just display a marker, the user can zoom in and out and can easily scroll through.
So if your thumb goes on the map and you scroll... you might have seen this with Google Maps, right? In Google Maps, to zoom or to move the map, you need to use two fingers. They do that so you can easily scroll through the website without your scrolling being interrupted by moving the map instead.
So these kinds of functionality to make it mobile friendly are there in all of these libraries. So this is an easy thing; you don't have to do anything, it just works out of the box. But if you create more advanced interactions with an app, like, for example, a user should be able to draw a polygon...
Then it gets trickier. Geoman and also my library leaflet-geoman both work on mobile. I think that is also one reason why it's getting quite popular, because from my research, it's the only drawing tool that works on mobile. But if you use your thumb on a small screen, your drawings get less precise. So I have features like pinning; that means if you click near a different marker, it just snaps them together. So it assumes you want to place them on top of each other. That's a really cool use case, especially on desktop where you can easily move two markers. But on mobile, it's not that easy.
So. I personally think it's possible, but it's not a good way to do that except for let's say you have an iPad Pro with a pen. Then it's really cool. I was really impressed when I used my own library for the first time on an iPad Pro with a pen, because then you can draw really precisely. It's a lot of fun.
It's easy to do. That's really cool. So if you have advanced interactions on mobile, then you will get into situations with click events and stuff like this where it gets maybe a little bit more tricky. And use cases where you hover the mouse over the map and you display things, or you give the user a hint of what will happen when they click: you can't do this on mobile.
It's basically a different experience and, in a sense, limited functionality if we are talking about drawing. So this you should keep in mind. If it's strictly about displaying data, you will not have a problem.
Jeremy: And in terms of the UI, it may need to be significantly different, right? Between desktop and mobile, do you effectively, hide a lot of UI elements or build two components depending on what size the viewport is?
Sumit: Yeah. Yeah, of course. So in geoman, I use the sidebar, for example, right? If you display it on mobile, you don't see the map anymore and it's just a lot of data. So the more data you show and the more metadata you want to display, the trickier it gets to display this on mobile. And the map should be front and center, especially in a tool that revolves around it.
So that means if you open geoman.io on your phone, you only see the map and you have a small icon on top that slides in the sidebar, for example. It's still not perfect. I think using my particular app on mobile is a limited use case anyway. But of course, you have the same problem with any website that displays a lot of data.
You have to hide some, and you might have to basically reduce some data as well, like data that you just don't need on mobile. You will reduce maybe the table columns, you know, stuff like this, to make it as easy as possible for the user on mobile. And they can still open a desktop view if they want to on mobile.
But it's such a limited use case that, I think, as long as it looks good and it has the basic information, everyone is fine. If you go into an advanced mode, then a desktop or a tablet might be the way to go for the user.
Jeremy: Yeah. We've been talking about geoman. And you've mentioned that you have a leaflet plugin called geoman-leaflet. Can you go into a little bit about what geoman is and why you decided to create geoman-leaflet?
Sumit: Yep. Yeah, sure. So just so you know, it's not a superhero or something like this. It derived from geo management, and I just noticed later that it sounds like some sort of superhero. Anyway, I have quite some experience now in solving this for companies and seeing the problems inside companies.
And not only in the mobility space; this ranges from communication providers to, again, construction sites and logistics companies and everyone that has geospatial data. There are two categories, I would say, that I'm trying to help. One is: they have their own application, they have their own team that works on this, or multiple teams, and they use open source products or paid products and they need more functionality.
So they probably use my open source library if they use leaflet for creating data. And the open source library basically helps you create and edit this data. So you need to create a polygon. You can create this, of course, by just providing the coordinates. But if you have a user and they need to, for example, create a polygon around a building or around a block or a city or whatever, my library is there to help them easily create these polygons on the map.
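Creating a polygon "by just providing the coordinates" ultimately means building a GeoJSON feature. A small sketch of that standard shape (not geoman's internal code; the helper name is made up):

```javascript
// Build a GeoJSON Polygon feature from a list of [lng, lat] pairs.
// GeoJSON requires the ring to be closed (last point equals first),
// so we close it if the caller didn't.
function polygonFromCoords(coords) {
  const ring = coords.slice();
  const first = ring[0];
  const last = ring[ring.length - 1];
  if (first[0] !== last[0] || first[1] !== last[1]) {
    ring.push([...first]);
  }
  return {
    type: "Feature",
    properties: {},
    geometry: { type: "Polygon", coordinates: [ring] },
  };
}

// A triangle around a (made-up) block:
const feature = polygonFromCoords([[0, 0], [4, 0], [4, 3]]);
```

Leaflet can render such a feature directly with `L.geoJSON(feature).addTo(map)`.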
They can create markers, circles, rectangles, whatever it is. They can also edit each specific vertex, add vertices in between, and they can move the entire shape. They can cut holes in polygons. They can, of course, remove the polygon. So all of these drawing and editing features are there. And one product I'm currently thinking about, I'm just collecting feedback, I've not built it yet, is a pro version of that which has even more advanced drawing and editing features. Some companies need to have polygons that are adjacent to each other and cover a complete area like a city, and to have maybe a hundred or 200 polygons that make up the city.
So just think of it like sales territories or something like this. And there should not be an overlap or a hole between these polygons, so they need to stick together. Now, they can do this already with the open source library that I'm providing, but if they want to change the shape of one polygon, they would have to change all of the adjacent polygons as well, which is quite a lot of work.
And I could build a tool, for example, where you just draw a new border and it automatically calculates the new size of each of those polygons and adapts them, because you basically tell it: hey, I don't want any overlap and I don't want any hole in it. And these are such advanced, niche drawing tools that require weeks and months of work to build that I want to wait and see if people or companies need this, and I'm thinking about providing basically a pro version of the open source library. So that means I have the open source library that I constantly maintain for basic drawing and editing needs. I wouldn't say it's basic, it's quite advanced already. Then we have really advanced tools where I think a pro version of the library would be helpful for some companies.
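The adjacent-territories idea (change one border, update every polygon that shares it) can be reduced to a toy version: when a shared vertex moves, apply the same move to every ring containing it, so no gap or overlap appears along the common edge. This is only a sketch of the concept, not the tool being described:

```javascript
// rings: array of polygon rings (arrays of [lng, lat] pairs).
// Move every occurrence of vertex `from` to `to`, across all rings,
// so that a border shared by adjacent polygons stays shared.
function moveSharedVertex(rings, from, to) {
  const same = (a, b) => a[0] === b[0] && a[1] === b[1];
  return rings.map((ring) =>
    ring.map((pt) => (same(pt, from) ? [...to] : pt))
  );
}

// Two "territories" sharing the edge from [2,0] to [2,2]:
const left = [[0, 0], [2, 0], [2, 2], [0, 2]];
const right = [[2, 0], [4, 0], [4, 2], [2, 2]];
const moved = moveSharedVertex([left, right], [2, 2], [2.5, 2]);
```

A production version would also have to handle vertices that coincide only approximately, and re-validate that no ring self-intersects after the move.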
And then we have the second bucket of companies that don't build their own application. They just want an easy service, like MailChimp, for example. Instead of managing email, they want to manage geospatial data. So they want a service where they can create data, where they can put data in through an API, and where all the teams and clients and apps can fetch the data again. And this is where the SaaS application comes in, and it's called geoman. There you have a studio where you have your map, or you can create multiple map-like projects, and then you create, for example, one map that displays your supercharging network.
You have one map that displays your sales territories or your parking spots. You can create one map that displays airline routes, whatever it is, whatever your data is. And for me, the powerful thing that would solve a lot of problems inside companies is that any user can draw this.
It can be a working student that creates parking spots, for example, in a city. Then you have your developers that not only consume the data, but they can write the data through the API and it's updated everywhere on every consumer. And you can attach the metadata to it to put everything in.
Jeremy: To summarize, there's geoman, the application, which is where somebody can draw shapes or add data, probably import data from something like say a CSV or an API or something like that.
And then, if you had an application that wanted to consume the data, geoman itself has an API for people to get data back out.
Sumit: Exactly.
Jeremy: and then geoman-leaflet is the leaflet plugin that you built and use within the geoman application.
Sumit: Exactly. And it's open source for anyone that wants to build basically their own application to manage their data. And, yeah, so I get asked a lot, for example, why do I open source this? Because it seems like it's one of the best plugins out there. I don't want to say anything false here.
And I'm not sure about that, but users tell me they switched to it, for example, from other libraries. And I also haven't seen other libraries that provide the same functionality. So it's one of the more advanced ones. And people ask me: why is it open source? Why is it not part of the geoman suite?
That would give people a reason to use geoman. And the reason is simply that I know geoman will not cover all the use cases that are out there. People have very specific needs with geo data. And I like open source. I love contributing to it. I love the interaction with the community, and I think it's a great open source product.
I personally benefit a lot from open source as well, specifically with leaflet, and I thought geoman as a platform can be a way for me to earn money and invest more time into the open source product, so that I can give back again. You know what I mean?
I've maintained this library for four years now. I have not earned a single euro with it, and it costs a lot of time, and I feel bad if I don't maintain it for a month or something. But if the platform is successful, then I have more time to develop it. It serves multiple purposes: not only for me personally, but also I can maintain the library much better and add many more features for anyone that wants to build their own application with the open source tools.
Jeremy: Very cool. So hopefully at some point you'll get to spend more time, in your day job working on geoman and working on geoman-leaflet.
Sumit: Would love to. Yeah.
Jeremy: To start wrapping up, for people who are learning leaflet are there common mistakes you see people make or suggestions you have for people who are learning leaflet now?
Sumit: Yeah. If you go through the docs and the standard stack overflow routine that we all use every day, you will get quite far. What I see a lot, especially with frameworks and beginners with leaflet: they struggle sometimes to make the mental distinction between these two universes, these two ecosystems.
So if you use a reactive framework, like react, vue, or Angular, just know that leaflet itself is not part of that. It interacts with the DOM directly, as mentioned before. So don't expect this reactivity where you just provide something and everything changes on the map.
This is something where I see a lot of questions happening. And then, on the other hand, the business side. Think of leaflet and mapping libraries like this: they provide you the tools to display a map and data on top of it, but the business logic behind it, like how do you store it, in what data format, or "I want the tile to be red if it's overlapping something else," this stuff you have to build yourself or use a plugin for it. It won't create business logic for you. It's basically a dumb way of rendering data. Not that dumb, but I hope you get what I mean, right? You have business logic that is specific to you, and you have to build this yourself. The library will not do this out of the box for you. I see these questions a lot on the open source repository too, where people just expect it to do things, but they could easily do this in three lines of code. They just haven't wrapped their head around what exactly leaflet is doing for you and what not. So if you have any business logic, this is something you have to build yourself, and it's honestly quite easy. But if you have any questions, I'm happy to help, not only for my open source library. I'm on Twitter and everywhere, so I'm happy to help out.
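The reactivity point can be made concrete: a framework wrapper has to diff declarative state itself and translate the result into imperative Leaflet calls. A minimal sketch of that sync step (the marker ids and function name are illustrative, not any wrapper's real API):

```javascript
// Given the ids currently on the map and the ids the app state wants,
// compute which layers to add and which to remove. A React/Vue wrapper
// would run this on every state change, then call map.addLayer(...) /
// map.removeLayer(...) imperatively, because Leaflet won't react on its own.
function diffMarkers(current, next) {
  const cur = new Set(current);
  const nxt = new Set(next);
  return {
    toAdd: next.filter((id) => !cur.has(id)),
    toRemove: current.filter((id) => !nxt.has(id)),
  };
}

const result = diffMarkers(["a", "b"], ["b", "c"]);
```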
Jeremy: Cool. Are there any specific projects that people should look at if they're trying to learn how an application should be laid out in leaflet?
Sumit: So there are demo projects for basically all the plugins and also for leaflet itself. You can look at geoman.io, which is, let's say, a more advanced use case. Everything from Uber, of course, is very advanced, especially in data visualization. They have amazing tools, and it's all open source also.
So if you look at demo pages from Uber and from Mapbox you will get a sense of where it can go. And then I would personally just look into the DOM and see what they do and how they do it. Maybe they have an open source repository where I can take a look.
And also, for example, if you want to write your own plugin for leaflet, look at the open source code. That's how I started. I had no idea how to create a plugin for leaflet or how to manage geospatial data. I just didn't know. All I had was that I needed functionality that no plugin had; leaflet-draw was back then the only one. I basically scanned their code, looked at how they do it, and tried to recreate it and then build my own architecture with it. But the code was very similar in the beginning. So just look. That's what open source is there for: look into the code, learn from that, clone it, adapt it, and grow from there.
Jeremy: Yeah. How about in terms of a full application? You mentioned looking at geoman.io, but geoman itself is not open source, is it?
Sumit: No. Yeah, so I'm not sure where to look there. What I do sometimes... okay, you have stuff like geojson.io. It's a small utility to edit GeoJSON. I've built the same on geoman.io. But yeah, geojson.io is a different one, from Mapbox, and this is open source. It's a smaller application that you can take a look at.
What I can also recommend: if you, for example, want to create a leaflet map and you don't know how, the demos are not enough for you, and you want to see some real use cases with React, for example, what I do is use the advanced search functionality on GitHub. So I look at who uses leaflet and who uses react, and maybe I search for the initialization code from leaflet and filter by it.
And then you find a lot of applications that basically use it like this, and then I scan around and try to find someone that uses the same stack as me and see how they do it. And this gives me a lot of inspiration.
Jeremy: Yeah, that's a great tip. Not just for leaflet, but for anytime you're trying to learn a new library. Yeah, it's really helpful.
Sumit: Yeah. The GitHub search functionality is underrated, I think in that sense.
Jeremy: Cool. well before we finish up, is there anything that you think we should have mentioned or we should have talked about?
Sumit: I think we had quite an awesome overview. There is not much to mention. If you are into geospatial data, or if you have to solve a problem like this for your company or a client or whatever, I think it's quite an interesting field. It's quite niche also, but yeah, the user experience is amazing, and you have such a big impact if you build something nice on top of maps. And of course it's a field that is very, very future-proof. It doesn't matter if we're talking drones, autonomous vehicles, sales... everything in the mobility sector specifically, but everything around us needs more and more location data, because everything is connected. And I think it's a field where you as a developer, if you have these skills, are very well equipped for the future. So don't hesitate to get into it and code a bit around it. I don't think you will regret it.
Jeremy: Cool. And for people who want to follow you or see what's going on with geoman, where should they go?
Sumit: So, I'm on too many platforms, I'd say, but I'm very active on Twitter. My handle is tweetsofsumit. There I'm the most active. So if there is anything you would like to ask, or if you want to follow me and see where geoman is going, the open source library, or also even how to create a business (I'm not an expert in it, I just share my journey), I will share everything on Twitter. And if I have, for example, a YouTube video or even a guest appearance on a podcast like this one, I will share everything there. So I think that is the best way. And of course, you can go to my website, which is raum.sh. Raum is the German word for room. raum.sh is my personal website, where I also post occasional updates here and there.
Jeremy: Very cool. Well, thank you so much for talking to me today, Sumit.
Sumit: Thanks for having me, Jeremy. It was nice talking to you.
Taylor Thomas is an Engineer at Azure, the core maintainer of the Kubernetes Package Manager Helm, and a member of the Krustlet team.
Timestamps
Related Links
This episode is also posted on the Rustacean Station feed. Check it out for episodes all about Rust!
Transcript
You can help edit this transcript on GitHub.
Jeremy: [00:00:00] Hey, this is Jeremy Jung. This episode, I'm talking to Taylor Thomas about running WebAssembly on the server with Rust. He's an engineer on the Azure team. The core maintainer of Helm, which is a package manager for Kubernetes. And he's currently working on Krustlet, which runs WebAssembly applications within Kubernetes.
Also, during our conversation, you're going to hear us talk about WASM. That's shorthand for WebAssembly. All right. I hope you enjoy my talk with Taylor. Taylor thanks for joining me today.
Taylor: [00:00:30] Thank you for having me, Jeremy. This is my first Rust related podcast so it's exciting for me. I'm a fairly new rustacean all things considered, so I'm happy to be here.
Jeremy: [00:00:41] For people who aren't familiar with the world of kubernetes and containerization all these different things. Could you start by explaining at a high level what Kubernetes is?
Taylor: [00:00:55] Yeah, so Kubernetes is a container orchestrator. And you've probably at least heard of Docker if you're in the technology world, even if you haven't used it. But Docker is a technology that made something called containers popular and useful. Those technologies have been around for a while inside of the Linux kernel. That's why they call it a container: everything is inside of this thing and contained in this process. It uses cgroups and kernel namespaces. There's a couple of different things under the hood that are going on. But basically it allows you to create an artifact that can be bundled up into something called an image.
And that image can be passed around and then be used multiple times. So often people will compare those to VMs. VMs were a big revelation, right? Because you didn't have to literally go put in a new blade if you needed a new server, or reboot something. You could instead just say, okay, I want a new VM.
But you still had to install a whole operating system. It was like spinning up a new computer. And so containers made that even more simple, because instead of having to do that, it's using the same shared underlying kernel calls and things underneath the hood, but everything's isolated. And so if you want to spin up three different instances of an nginx server, all you'd have to do is create three containers that are all running at the same time, and those will all have the same specification that you basically baked into there: an immutable artifact. If you want to create a new one, it has a new hash and a new version. So this was really good for people, but the problem is: how do you orchestrate it across everything?
And that's where Kubernetes came in. So Kubernetes takes that and says, okay, well, we have a huge fleet of nodes out there. How do I schedule each of these containers properly so that they're either not in the same place, or they meet certain requirements, or I have a certain number of them?
It takes care of all that, including some of the underlying networking connections, so that you can connect all your containers to each other in a distributed and actually self-healing way. It goes through loops, and if something goes wrong, it will try to heal that container and make it come back to a normal running state.
And so it is a very powerful technology. It's caught on... maybe it's caught on a little too much in some people's opinions, but it's a useful tool for containers that came about, and a lot of people are using it for underlying infrastructure projects.
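The reconcile loop described above can be sketched as a toy model: compare the desired replica count with what's actually running and decide what to do. This is only an illustration of the control-loop idea, not Kubernetes code; the names are made up for the example.

```javascript
// Toy reconcile step: given a desired replica count and the list of
// currently running instances, return the corrective action a
// controller would take on this pass of the loop.
function reconcile(desired, running) {
  if (running.length < desired) {
    return { action: "start", count: desired - running.length };
  }
  if (running.length > desired) {
    return { action: "stop", count: running.length - desired };
  }
  return { action: "none", count: 0 };
}

// Two of three instances died; the loop schedules two replacements.
const plan = reconcile(3, ["nginx-a"]);
```

Kubernetes runs loops like this continuously, which is why a crashed container "comes back to a normal running state" without anyone intervening.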
Jeremy: [00:03:16] It reminds me a little bit of when you're using something like AWS and it has auto-scaling for VMs. And let's say that you have an application and it runs on virtual machines. And you would be able to tell AWS as more people are accessing my application I want you to create new VMs [and] run new instances of my application.
Something like Kubernetes is able to do something similar, but maybe at a more generic level of being able to figure out. Okay. How many machines do I have access to? And anytime I'm asked to run something, I'll go and find the right machine to run it on. And it's always going to be in the context of these containers. Did I kind of get that right?
Taylor: [00:04:15] Yes, it's very much a generic tool across all these different things. And nowadays you can run it with windows or Linux, or mostly windows and Linux. There are some limitations there, but yes, it's a very generic tool in terms of being able to connect what you want and use what you want to make this distributed platform easy.
And like you said, it will scale things. If you've done something in AWS where you have the elastic things that scale, it's a similar idea. There's something in Kubernetes that you can set up so that it will automatically scale to make sure that you have the right amount of capacity for what you're doing.
Jeremy: [00:04:50] And I think this takes away the need for the developer to understand where their applications are going to run. You give some kind of configuration to Kubernetes and say, here's my app, and then it just figures out where to run it and how many instances to run and that sort of thing.
Taylor: [00:05:12] Yeah, it's really meant as a dev ops or SRE kind of tool, so that instead of them having to custom tailor machines, they already have these Docker images in place and can just run them, like you said. You just give it the configuration. And it is still involved, it's not a magic bullet, but you don't have to care as much about where it goes. If you've configured everything properly, it just kinda does it.
Jeremy: [00:05:36] I wonder if you could paint a picture of, where containers fit in between just running a full machine versus running a single process on a computer.
Taylor: [00:05:47] Essentially a container is just a running process, a single process. And because it's a single running process, you can run multiple of them on a machine. The tools that you need installed, all the binaries, all those different things are encapsulated inside of this container.
And so that way you can run 10 of them on a machine instead of spinning up 10 servers. So it allows a little bit more density, and obviously you still have to be careful about noisy neighbor problems and other things that normally happen in infrastructure, but it allows for much more condensed setups. And the spin up is much quicker. If you have a small image, getting a new one and spinning it up is easily under 30 seconds every time. Bigger images can take longer, but even then it's still faster than provisioning a full VM for what you're trying to do. And so this fits a lot of different kinds of services that don't have very large requirements on the system.
And even if you do have large requirements, you can use Kubernetes in a way that can be helpful.
Jeremy: [00:06:45] And so you're talking about how these containers are a process running on the system. So for example, if I were to in windows, look at the task manager or in Linux, just run ps, I would see individual processes for each of these containers that are being run.
Taylor: [00:07:05] Yes. It really depends on the implementation, but essentially that's what's going on. With Docker there's some other underlying details we don't need to get into, but essentially you would see processes that are spun up and doing the work. They are just processes underneath the hood, instead of being a full operating system.
Jeremy: [00:07:23] Next, I want to talk a little bit about, WebAssembly, because I know that's an important part of the krustlet project that you work on. Could you explain a little bit about what WebAssembly is at a high level?
Taylor: [00:07:37] Yeah, so WebAssembly, as you can probably guess from the name, is a tool that was originally designed for the web. Now, I think the original creators didn't necessarily intend it to stay that way, but that's the name that it has, and that's what it's used for. Multiple companies and big websites use WebAssembly. And the idea behind WebAssembly is that you can create a binary that can then be consumed. So it's compiled code that can then be run in the browser, giving you the speed of compiled code while still being sandboxed inside of the browser sandbox environment.
So this allows for some very performant things to be done. You have things like rendering tools. An example I always give is Autodesk, a maker of CAD tools and rendering tools. They have a lot of online things that run WebAssembly, because it gives them the performance needed
to be able to do those more complex tasks. And so that's where WebAssembly started, but the idea is WebAssembly could be used anywhere. And when you pull WebAssembly out of just the browser, it allows you to have kind of a universal interface. They've defined this, and there's a working group, and it's still very much a work in progress, but it's called WASI, which stands for WebAssembly System Interface.
And that interface is a definition of basically how it can interact with the system. And the nice thing about it is it's still that sandboxed model: you have to grant explicit permission to do anything. So if you want it to be able to access a file somewhere, you have to grant access to that file ahead of time, before you start it. It's not there yet, but I assume when we get to some of the networking socket support, you will also have to grant specific permission for that. Whereas with containers, you have a little bit of a different security model, because you're still running into the normal Linux security things that you have to deal with.
There's ways to break out of a container. There's not really a way to break out of a WebAssembly module in the same way, because you're running a compiled code thing somewhere, instead of basically a shim over specific binaries or things that are being run.
So that's the main difference, and this allows things to be run on any system. One of the things that we saw with Docker, and that we were really hoping for when we had Docker, was that it could run anywhere. But if we're being honest with ourselves, it can't really run anywhere. Docker and containers in general are a Linux tool. Now, there's people at Microsoft and elsewhere who have made Windows containers a thing. They work well, there's some really cool work they've done, and I have nothing to say against that. They've done great work. But really, if you have an nginx container, you can't run that on a windows machine.
You can run it on a windows machine, but it's technically running a Linux VM behind the scenes. Same thing on a Mac. And so it isn't truly this ideal of write once, run anywhere. Now, obviously there's still technical challenges there. You can't do it completely, but you can come close. So if I compile a WASI-compatible
WebAssembly module on my Mac, I can pass it to you and you can run it on a windows machine, on a Linux machine, on a Raspberry Pi. It doesn't matter, and it'll be the exact same code. So that is a very powerful thing that we saw in WebAssembly, and it fits fairly well inside of this container space and what we could do with it.
Jeremy: [00:11:12] Help me to understand a little bit more about what WebAssembly actually is. Does WebAssembly itself have byte code, or some kind of language that you're passing to a runtime? An example I can think of is the Java virtual machine, where people can write many different types of languages, such as, I believe, Clojure and Scala. They can write in a language that generates Java byte code and is run by the Java virtual machine.
And so you can run those applications anywhere that you can run the Java virtual machine. Is WebAssembly similar? Is there a language for WebAssembly that is the target for, say, rust or C or different languages that you want to run in WebAssembly?
Taylor: [00:12:06] That's a really good question. It is similar in the sense that you write code that can then be interpreted anywhere, and there are various WASM runtimes. The reference implementation that we're using within the Krustlet project is called Wasmtime, and that's the one that's following the WASI spec; it's essentially the reference implementation of the WASI specification.
And that one can run on any of these systems that you've mentioned, and it has a byte code that it interprets in that way. There's a lot of technical details we could dive into there that I'm not a huge expert on, I know the basics, but there's some that are just-in-time compilers, right?
So JIT. And there's some that compile in a pre-compilation step that can happen beforehand. All of these things can happen, but the thing is, this is much more lightweight, small, and constrained than the JVM would be in this case. Right? So it's a much smaller and compressed use case, but it does have the similarity that it is byte code being interpreted by some sort of runtime.
Jeremy: [00:13:09] And this byte code... you've been talking about how there are different implementations of the WebAssembly runtime, and you gave Wasmtime as an example. Does that mean that as long as you target the WebAssembly bytecode, any of these different runtimes could run that code?
Is that how it works?
Taylor: [00:13:35] Yeah. And that's why the WASI specification is kind of the thing that's making it work on the server side rather than just in the web: because we need those specifications for how to interface with different things on the machine. And so there's also this other thing called interface types, which defines how you can interchange different data types between different parts of applications.
To be clear, this is an area that's still under heavy development. For example, Wasmtime and WASI in general haven't even finalized their network specifications, so you have to kind of do workarounds and things to get networking in place. Now that's coming, but this is very bleeding edge in that sense.
Jeremy: [00:14:17] And so this byte code can run in the browser, in whatever WebAssembly runtime is built into the browser. And then it can also run outside the browser in a runtime such as Wasmtime. And it sounds like the distinction there is that when you're running outside of the browser, you need some kind of consistent API to be able to access the file system, access the network, things like that,
which would normally be handled by the browser. But once you take it outside of the browser, you need some common interface that knows how to make system calls to open a file on windows versus open a file on Linux. And that's what a runtime like Wasmtime would do by implementing what WASI describes. Did I get that right?
Taylor: [00:15:10] Yeah, that's a great summary of how this works.
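The "same byte code, any conforming runtime" point can be demonstrated with the smallest possible example: the bytes below are a complete, valid WebAssembly module exporting an `add` function, and those identical bytes run unchanged in a browser, in Node, or in a standalone runtime like Wasmtime. This sketch uses the standard JavaScript `WebAssembly` API rather than WASI, which is enough to show the portability idea.

```javascript
// A complete WASM module written out as raw bytes, following the
// WebAssembly binary format: it defines and exports one function,
// add(i32, i32) -> i32.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  // type section: one function type, (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // function section: one function, using type 0
  0x03, 0x02, 0x01, 0x00,
  // export section: export function 0 under the name "add"
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  // code section: body is local.get 0, local.get 1, i32.add
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

const mod = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(mod);
const sum = instance.exports.add(2, 3); // 5
```

Compiling Rust or C to a `.wasm` target produces the same kind of byte stream, just much larger; the runtime never cares which language it came from.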
Jeremy: [00:15:14] Cool. So one of the questions that I think people often have is: when you work with languages like rust or go, they can be compiled ahead of time, and you get a binary that you can run directly on your operating system. For languages like those, what are the benefits of using WebAssembly to run an application versus just using that binary?
Taylor: [00:15:42] It just depends on, I guess, the situation you're trying to do with it. Like there's certain tools, like the idea is what we have with, with WASM right now, isn't meant to replace Docker entirely, I think it opens up new cases for people who aren't, who aren't like already into Docker or need to, or have some specific things.
But also this makes there, there's a certain amount of portability and security that comes from using WASM. obviously if you build a binary and then you build it for each system, you have the, the native things all built in all ready to go, which has a distinct advantage over some, like having to work through an interface, however, that security model is what gives us, a lot of hope for the future of this, because you have to explicitly grant permissions.
And it's compile once: I don't have to compile my Windows version, I don't have to worry about cross-compiling toolchains or the different VMs that you have to spin up to build one for ARM and for Mac and for Linux and for Windows, all the different targets.
You don't have to worry about that with WASM; you just build the one binary and it's ready to go. It's also very, very small. Even without optimizations, you're talking like a meg or two megs for a simple application, as opposed to a full binary. Rust binaries are fairly small; Go binaries are larger because they compile in the runtime. But even though Rust binaries are small, this is even smaller. And if you strip it out, you can get it under a meg, depending on what it's doing.
Now that size really matters with something like Kubernetes.
So if you start with a container and you pull down an image, some of those images, even if you're using the very slim ones, are still 20, 30, 40 megs. Now that's not a big deal, but there are some of these bigger ones; especially when people are doing some of the bigger applications, they're close to a gig.
Now that's not recommended; I know people say that's not a recommended practice. Yes, I know it's not. But in practice, what actually happens in reality is people will do that. And so when you pull down a new version, even though they've cached certain layers and things are the same, it still takes a while to pull the new version and start it up.
Whereas WASM is just these tiny modules. We haven't done anything super big yet, but I'm guessing that even if it's huge, it's only a few megs. And so you're pulling this down and it's very quick and very fast. And there are the added security benefits on top of that, where you're not having to deal with the same security layers as what is available inside of a container. That very explicit grant on your security surface
is a very powerful thing for us inside of Kubernetes on the server side.
Jeremy: [00:18:29] And that security piece is a little interesting. When you're building the WebAssembly application, are you explicitly building in that this application will have permissions to the file system at these paths? Or what does that look like?
Taylor: [00:18:48] So it's not built in at compile time. Now, there is one; when we get to talking a little bit more about Krustlet, we have another implementation. There's one that was started by Capital One called wascc, which is the WebAssembly Secure Capabilities Connector. There are lots of W's in this space.
And this wascc runtime actually does things called capabilities, where you're supposed to explicitly say, I need this capability and I'm signed to have this capability. But with WASM by default and the WASI spec, you grant it at runtime. You say, I'm going to give you access to this file, or I'm going to give you the permission to do this thing. So those things are done at the very beginning of your runtime, not when you're compiling it.
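As a concrete sketch of those runtime-granted permissions, the Wasmtime CLI takes the directories a module may touch on the command line when you launch it; the module name here is hypothetical:

```shell
# Nothing is granted at compile time. At launch, pre-open ./data so the
# module can see that directory and nothing else on the file system.
wasmtime run --dir=./data my-module.wasm
```

Anything outside the pre-opened directory is simply invisible to the module.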
Jeremy: [00:19:33] Hmm, interesting. And then, so that would be some kind of configuration, maybe, that would say, okay, when you run this WASM application, only give access to these permissions. And like you were saying before, that's much simpler than something like a Docker container, where your permissions model is actually based off of a Linux operating system.
Taylor: [00:19:57] Yes, it's much simpler in that sense. I mean, there's more overhead involved if you were spinning this up completely manually; you have to say, okay, I need to give access to this directory and I need to give access to this thing. But Kubernetes kind of takes care of that for you.
And we do that inside of Krustlet with what we're doing, to kind of abstract some of that away so you don't have to do it. So, I mean, it's like most security tradeoffs, right? The most secure thing is a server that's not connected to the internet and is inside of a locked room, and you can only access it in that room with a badge.
That's like the most secure you can get, but that computer is kind of useless. So that's the idea behind the security tradeoff: now you're just being explicit about what you're granting, instead of having all these implicit things of, oh, I can access the network and I can access this thing and this thing, which comes by default with running a process on an operating system.
Jeremy: [00:20:50] When you were talking about being able to run the WASM application anywhere, that sounds like a big benefit, because normally when you have a build server for, for example, an open source project, you'll see that they have to target all these different operating systems. They may have to build their application for six or eight different OSes.
And in the case of using WebAssembly, they could just build it once and it would technically run on any of those, it sounds like. We've been talking about WebAssembly on the server. Before we get into how Krustlet runs WASM applications, if I had a WASM application and I just wanted to run it on my machine, what does that look like?
Is there some kind of package manager, or how am I running these applications?
Taylor: [00:21:43] So there's a couple of different ways that these can be shared around. There's no specific implementation right now. The thing that we started with inside of Krustlet itself is we use the OCI specification, which is the exact same thing that containers use; it's just stored as a different artifact type.
There's some work by the people working on wascc; they have something called Gantry. And there are a couple other people who are looking into how you're supposed to store modules. The idea behind all this is we have to figure out what exactly we want to do, so it's a little bit up in the air. We have a tool that was built by somebody on the team called wasm-to-oci, and it does the work of pushing that to a container registry that supports arbitrary artifacts. And so what you can do is take that and pull it down, and it just pulls down the compiled module file.
It's always something.wasm, and then you can use whatever runtime you want to run it. So you can download and install Wasmtime, or there's Wasmer, or there's Wasm3, which is kind of optimized for embedded devices. Those are the kinds of things that you can do.
Just, you have to choose the runtime that you want to use. And then you have to grab the module from somewhere, wherever that might be right now, which is still, like I said, a little bit of a loosey goosey kind of thing.
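Concretely, the flow he describes might look like this from a shell. The registry reference is hypothetical, and the wasm-to-oci flags shown are assumptions to check against the tool's own help output:

```shell
# Pull the compiled module out of an OCI registry that accepts
# arbitrary artifacts (reference and flags are illustrative).
wasm-to-oci pull registry.example.com/demo/hello:v1 -o hello.wasm

# Run it with whichever WASI-capable runtime you have installed.
wasmtime hello.wasm
```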
Jeremy: [00:23:10] And if I was just in the context of running an application on my computer, without involving Kubernetes or anything like that, would it be where I get a single file that's the WASM application, and then I pass that into, say, Wasmtime or something like that, just at the command line? And that's how I would run it?
Taylor: [00:23:32] Yeah, that's exactly how it would work locally. Now, obviously, we're working on building this out to make it even more fully featured. The ideal would be that you can have an application that can be easily swapped across machines. I mean, imagine if you had something like Notepad, or some sort of editor, and you could use it there and then just take that same application and have it somewhere else,
without worrying about what kind of system it is. You could have your Raspberry Pi 4 running a random desktop somewhere, and you could pass it over to a server and do a virtual desktop session. You could do anything crazy that you wanted to by passing that around.
Jeremy: [00:24:09] Cool. Yeah. I mean, it's this idea of having a truly universal binary, I guess: having the ability to copy this application anywhere and run it without having to worry about basically anything. You just need to have a WASM runtime.
Taylor: [00:24:26] Yeah. And that gives it quite a bit of power, right? Because you could jump from an edge device (I'm using edge in the loosest possible term, because it could mean anything), any type of edge device, all the way to a server I'm running in a data center. It can be anything along that whole spectrum of different servers that can run this, and you can pass it around.
And you can even start to think about how you could maybe hot swap it, right? You could point at one running implementation of it, and then when you don't have access to that because you lose internet, you could point at a local instance running it.
Jeremy: [00:24:57] Very cool. Now maybe we should go a little bit more into Krustlet. I know you've talked a little bit about what it is, but maybe you could go into a little more detail about what Krustlet is.
Taylor: [00:25:11] Yeah. So Krustlet stands for Kubernetes Rust kubelet. Lots of Ks. The main idea behind Krustlet was that we wanted to reimplement the kubelet, which is basically the binary that runs on a Kubernetes node and connects it to the cluster. And we wanted to write it so we could do WebAssembly. Now, the reason we wrote it in Rust was a couple of things. Number one, since you're probably listening to this because you like Rust and you do Rust: Rust has probably the best WebAssembly support for server-side things.
Most languages at this point actually have WebAssembly support, but it's mostly geared towards the browser. WASI, though, is something you can easily add in. You just use rustup and do rustup target add wasm32-wasi, and that's how simple it is to start compiling WASI-compatible WASM binaries in Rust.
And so that is a very powerful and useful tool to have around, if you can do it that easily. If you look at some of the other examples, like C or C++, you have to customize Clang properly, or download the already preconfigured toolchain and use that Clang. There are a couple of other languages that support it too, but Rust was a fully featured language that we could use it with. But also, Rust has caught our attention for a while because of its application inside of distributed systems and compute. Normally it's looked at as a systems engineering language, but can we use it in these cloud applications, with this idea of cloud native being the buzzword right now? Can it be used in a cloud native way? A secondary goal here was to prove that it could be.
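The two commands he mentions look like this in practice (the target was named wasm32-wasi at the time of this conversation):

```shell
# One-time setup: add the WASI compilation target.
rustup target add wasm32-wasi

# Cross-compile the current crate; the module lands in
# target/wasm32-wasi/release/<crate-name>.wasm
cargo build --release --target wasm32-wasi
```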
Plus, you add on all of the safety and security and correctness features inside of Rust, and it really helps us out. We wrote several blog posts about that, which I can send around or link if you send out show notes. But the main idea is that it has prevented us from shooting ourselves in the foot.
Go has really easy concurrency. Sometimes that's one of the things I most miss about using Go for some of this: if I want to do concurrent work, it's quite simple to set up, it's built into the language. But the thing is, there were bugs that Rust caught where we'd sit there and get mad, like, why are you getting mad at me, borrow checker? I've done everything right. And then you realize, oh, it caught that down the line: if I were to do this, I'd have two people trying to read the same data. And that has been very useful and exciting for us, because it's stopped us from doing those things. So there's the guarantee that when your Rust code compiles, it will be correct. Even if it's not necessarily the right code, it's at least correct.
And you're not going to have weird data access issues inside of your code. So that was the secondary reason we found it very powerful. And also, its expressiveness, with traits and how generics work, gave us a good deal of flexibility inside of Kubernetes.
Having dealt extensively with both the Kubernetes Rust client and the Kubernetes Go client, the Rust client is much more ergonomic due to how traits and generics work. And so that was just something that we really enjoyed coming over. But like I said, the main thing was that it had such good WASM support already built in, and really, most of the WASM runtimes are being written in Rust right now.
So it makes sense for us to be in this space. That's why this project was created: to have the ability to easily run WASM inside of Kubernetes. So that's why we used Rust and why we came up with Krustlet.
Jeremy: [00:29:07] Yeah, that makes a lot of sense. It reminds me of a conversation I had with Armin Ronacher in an earlier episode. He works at Sentry on different debugging tools and things like that, and the reason they chose to use Rust in parts of Sentry is because there are a lot of existing tools, a lot of existing crates in Rust, related to compilers or to reading debug files and things like that. And in your case, the community had already done a lot of work in WebAssembly, and that's why it made sense to choose Rust. So I think it's interesting how there are these certain niches that people have built up and continue to cultivate, and continue to bring additional projects into because of that base. It's very cool.
You had mentioned a little bit earlier about how rust has really good support for WebAssembly. And so it's easy to get something up and running in WebAssembly using rust. but you had also mentioned with WASI that there are only certain features of the system that are implemented.
Like you had mentioned that there aren't network calls implemented yet. And I wonder, when you're writing a Rust application, do you have to do anything special when you're trying to target WASI? For example, if I were to make a network call in my Rust application and I try to compile it to run in WebAssembly, is the command line tool going to tell me, you're trying to use an API that doesn't exist?
Like, how does that part work?
Taylor: [00:30:52] That right now is a very rough edge. It just depends. I haven't tried to compile in a network call directly yet, but most of the time, if you're doing something that can't be compiled, the compiler will go in and say, when it's trying to link things, I can't find anything that links this together or that lets me compile this in, and it'll error out there.
And sometimes it's very obtuse. It just depends on what it is. Like I said, it's a very rough edge right now when you're compiling, because it is so new. But for the most part, you actually write things the same way. There are some things that you might need to pull in to make sure that you don't do something incorrectly, or that you have the correct things attached to the data structure, or whatever it might be for the specific case.
There are actually some good examples inside of Wasmtime and a couple other places. But for the most part, you write code pretty much how you normally would.
Jeremy: [00:31:45] So it sounds like currently, like you said, you pretty much write code as normal, you run it through the command line utility, and then you may get a helpful error message, or you may get one that's really hard to decode, and then you basically start digging around the internet trying to find out what might be wrong.
Taylor: [00:32:05] Yeah, that's kind of how it goes now. Luckily, people are very responsive to this in the community, and we're working with them to work around it. And there are possibilities of using networking. So, the wascc examples: that's one of the reasons we chose to use wascc, because it has networking support built in.
Now, the way it works is kind of just working around the problem. You have a capability that's built for the native system. And so if you're doing it on a Mac, you can either load it from something like a dylib file or an object file of some kind, or you can have it compiled in, which is what we do in Krustlet.
And so when we build it for Windows, it gets that compiled part for Windows; when we build it for Mac, it's going to have that compiled thing in there. So it's just working around the problem by creating a component of the system that is then linked by wascc to be able to forward calls around.
And so when something calls the networking thing, that call gets forwarded to the WebAssembly module, which handles it and passes it back out. There's no actual networking inside of the module itself. So it works around it. It's a bit different of a model, actually a better model, as opposed to the Wasmtime implementation in Krustlet, which works more like a standard container.
I use that term very, very loosely. It's more like: I have a process, I run that process for as long as it wants, and then it's done, as opposed to responding to specific action or actor calls that come in from a host.
Jeremy: [00:33:38] So Wasmtime and wascc are two different WebAssembly runtimes. When you're talking about these processes running as actors, would that be a case where, if you want to run a number of processes and you want them to communicate with one another, that's when you would choose wascc?
I'm trying to understand when you would use one runtime versus the other.
Taylor: [00:34:04] Yeah. So right now, if you were to go download Krustlet and try it, if you were using Wasmtime, the only way to communicate is through data in files, because it can access files. So you could mount a shared volume and pass data off for processing, but it can't do communication very well until WASI finalizes how it's going to do networking.
If you want to do the full thing with wascc: the thing with wascc is you can use it entirely outside of Kubernetes as its own thing. It's very similar to a functions-as-a-service kind of tool. You write your things in whatever language, compile them to a WASM module, and then it handles connecting all those things together.
And the capabilities model around it gives it an additional security layer. Things are signed, so all your modules have to be signed, and then you have to say that a module has access to a specific capability, whether that's a capability another WebAssembly module is exposing or whatever. You can glue all these together so that you can pass calls around.
And it also has an ad hoc networking tool called lattice to connect multiple nodes together, so it can run entirely outside of Kubernetes. But it also has a bunch of tooling and things around it, so when you use it, you're buying into that system, and you have to know that that's what you're doing.
Wasmtime is meant to be, like I said, following the WASI specification. So as soon as WASI gets networking, we'll implement the networking stuff, because we want to make sure we have the vanilla option for how you glue it together. If you want more of this quick functions kind of thing, then wascc is an amazing tool.
And we implemented that because we've been collaborating with them for a long time. So we've had that implementation there so that it can show another way wascc can be used, while also helping people who want to try this right now and hack around to do things with networking.
Jeremy: [00:36:00] Does that mean there is some specific API call, I guess, that you're using within your wascc actors that allows them to communicate with the other actors? But it's not general network calls; it's a very constrained API. Is that right?
Taylor: [00:36:16] Yeah, there's an underlying thing they call waPC, the WebAssembly Procedure Calls protocol. Basically, it's the protobuf of this world. It's a message protocol: here's how I'm sending a message back and forth.
And each actor, as it's called, is a WebAssembly module that can respond to specific events. So it registers specific event handlers for those events, and when the underlying host that's running all these gets an event, it dispatches it to the actors to run.
Jeremy: [00:36:49] And so if you were using that with Krustlet, then the way that the actors communicate with one another, or communicate with the host, would that be configured automatically by Krustlet, or what is the...
Taylor: [00:37:04] To an extent.
Jeremy: [00:37:05] Okay.
Taylor: [00:37:06] That's why I was saying that it can be used more fully featured outside of the Kubernetes world, but it has a distinct place inside of Krustlet as well. And so Krustlet will do some of the configuration, up to a point: it makes sure all the capabilities and stuff are configured. But we're still trying to define, for example, if somebody wants to add other capabilities, how do they define that?
Normally you'd do that with a free-form block inside of a Kubernetes configuration called an annotation. So we're thinking, well, do we do it with an annotation, or do we do it with another tool? We don't know yet. But right now it does that base configuration for you: it'll set up and make sure you have HTTP server access, that you have access to do logging, and all those things.
It sets that up for you. But we still have to define a safe way for a user to say, I need this capability.
Jeremy: [00:37:56] When you've been developing Krustlet, you've mentioned how there's a lot of existing WebAssembly capability in Rust, or projects built in Rust. Are there other parts of the Rust ecosystem, or specific crates, that helped you speed up your development a lot?
Taylor: [00:38:15] Yeah, there are. In terms of development tools, I have really loved cargo expand when doing some of the macro stuff. Async things do a lot of additional macro work that builds stuff out or wraps things, so it's kind of nice to see that. I also learned about a bunch of different tools for how to debug stack overflows, because we were accidentally pinning some stuff too early, and it was causing a stack overflow that we didn't discover until Windows, because there's a smaller stack there. There were some really interesting things on the nightly compiler, like how to print type sizes, that were really helpful in identifying things that could have gone wrong. And looking at our other tools, we've really enjoyed the Kubernetes crate, it's called kube, and that one has some really awesome tools in it that are very helpful. I think a lot of them are things that people have heard of: serde, or SER-dee, I can never... I feel like that's
Jeremy: [00:39:17] Right.
Taylor: [00:39:17] that's a constant debate in the Rust community.
Jeremy: [00:39:19] I, yeah, I thought it was Serde, but I don't know.
Taylor: [00:39:23] And I think it's serde because it's serialize-deserialize. But anyway, yeah, that's one we've used a lot of, and it has been very helpful. Obviously the Tokio runtime has been helpful for us as well. But really, it just depends; there aren't any other crazy tools we've used outside of that. I do have to say a huge thanks to those who are working on the kube crate and some of these other things. We've been able to contribute back as we've found bugs and other stuff.
I love the rustls crate as well, which has been very helpful for Windows, because then we don't have to have an OpenSSL dependency. So those are the different things that have been, I guess, really helpful to us, as I've looked through and seen all the different things that we've done in our code.
Jeremy: [00:40:11] At the start of the episode, you mentioned you're relatively new to Rust. The languages you had used previously, would that be Go, or what are some of the other languages you have a lot of experience with?
Taylor: [00:40:23] Yeah, I am a Rustacean by way of Go. I've kind of moved all over the place. I've done a lot of Python, like for glue code when I've done some more SRE work. I did Node back when it was starting to become a thing, and then moved on to Go.
And then I've been doing some Rust since then. So yeah, I come from Go; that's my main background before this, a lot of Go, because I was in the Kubernetes space really heavily. Well, I still am. And so that's where I got my Go experience from. So that's my background: coming from Go to Rust is really the current thing that's happened.
Jeremy: [00:40:59] And since you have experience with Go, I'm wondering, are there things that you miss from Go, either in the language itself, the runtime, or even just the ecosystem, that you wish Rust had?
Taylor: [00:41:16] Yeah, there are a few things we've run into here. Overall, I am very pleased with Rust and would choose it for a lot of different cloud projects, depending on what you're into. I think Go, and this is totally personal opinion here, is very well suited for smaller, true microservices or similar very small, constrained tools, because it's quick to get started.
It has such a constrained vocabulary. And one of the things I particularly appreciate, and some people hate this and some people like it, is that generally there is one way to do something in Go. Sometimes two, sometimes three, but most times there is a way to do it.
And when coming to Rust, there were 40 different ways to do the exact same thing. And so that was something that I missed. The other thing, which I kind of mentioned before, is that the concurrency story is much better in Go. Now, it doesn't provide the same (not security, but correctness) guarantees that Rust provides, but it's so much easier to get started and to pass things back and forth. And as a relatively new Rustacean, I kind of get frustrated by the fact that there are two or three async implementations.
Jeremy: [00:42:40] Tokio, async-std.
Taylor: [00:42:43] Yeah, those are the two that I know. I think there was one more, right?
Jeremy: [00:42:46] I think there's smol, I think that's how you say it.
Taylor: [00:42:49] Yeah, Smol
Jeremy: [00:42:50] I don't know.
Taylor: [00:42:50] It's supposed to be like a tiny yeah. But yeah, it's so, and I know that's a common complaint and people have done a lot of work on those, but it's very frustrating because you like get bought into that specific implementation and it's like, let's choose one and make it part of the standard library, or make it the blessed crate or whatever it is just so we can all standardize and not have to like have weird hook ins to each run time and. all those those kinds of things, so that that's been fairly difficult to, to like work with sometimes, most of it's just otherwise like ergonomics.
I know we're going to be writing a blog post on this soon, but inside Krustlet we were doing a state machine graph. It's kind of like a cross between a traditional state machine and walking a graph, which we were doing to encapsulate the logic inside of Krustlet a little bit better,
because it was turning into monster functions that were just ridiculous. And because of that, we started to run into some of these things. In Go, you have interfaces, and with those interfaces you can just keep calling through and chaining through. But because of the type safety that comes through the trait system and things here in Rust, it made it really difficult to find a way to just iterate.
And we finally found a way around it. It was a little bit clunky, and like I said, we're going to write a blog post on it to explain everything. But there are just some of those things where the boundaries Rust puts on you make some things very difficult to figure out, to make it work in a way that's both rusty and also readable to someone trying to write it.
That was one of those things. I just see some of those rough edges sometimes, and I'm not sure if there's a way to solve that. That could just be me airing my grievances, but that's something that I miss. There's a little bit more flexibility that comes from the Go model, with some of the things that we've run into here.
But like I said, I feel that that's worth it, given the security and correctness that we've gotten in return.
Jeremy: [00:44:49] So you mentioned one of the things with Rust is that there are so many different ways to do the same thing. And as you were learning the language, I wonder, what was your approach to figuring out the ergonomic way to do things? And I guess, how to pick up Rust in general?
Taylor: [00:45:09] All right, it was a combination of things. Clippy, for one. I mean, everybody loves Clippy; some people get really mad at Clippy, but Clippy at least tells me, hey, you're right, but you could do this more efficiently, or you could avoid an extra allocation, or those kinds of things.
And learning to read the compiler messages. You get trained by other compilers: when other things fail, you look at where the compile error was, what line, and then you go there, because normally it doesn't give you very much information. But with Rust, you have to go through and read it: oh, okay,
it's telling me this went out of scope here, or was passed here, and you need to go do this or that. All of those things can be very useful when you're reading a Rust compiler error message.
The other thing, and I think this is one of the bottlenecks, is that I had to go to an experienced Rustacean, or a couple of experienced ones, to get help. And that's what I love about the Rust community: people are very willing to help. There's the Rust mentors page,
which I know was mentioned at RustConf recently, that I hadn't heard about. But the big problem is that that's still kind of a bottleneck. Whereas with other languages, I've been able to find at least semi-clear examples of: this is a good way to do it, or this is how these things are handled.
I mean, an example of this is to_string versus to_owned, which technically do almost the same thing when you call them on (is there a proper name for ampersand-string? The string slice, right?) a &str.
Really, what it turned into is that there used to be a difference, but now it's just about clarity. If I just need this as a String, then I use to_string. But if I have one and need an owned copy, then I use to_owned, even though at this point I think they call the same underlying logic.
And so learning those kinds of things I had to learn from people I didn't there wasn't like something could say, Hey, like here's the history of this? Or. even in the documentation, which rust's documentation of the functions is really good. It doesn't say like, you should use this here or here there there's no specific suggestions. And so, if there's one thing that I hope we can improve is maybe how we can document like the intermediate level cause getting started there's stuff, but then otherwise you're kind of bottlenecks at specific, like asking other rustaceans, Hey, how do I do this thing?
And that makes it a little bit harder for people.
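For readers following along, the `to_string` versus `to_owned` distinction Taylor describes can be sketched like this (a minimal illustration, not code from Krustlet):

```rust
// Both produce an owned String from a string slice (&str).
// `to_string` goes through the ToString/Display machinery, while
// `to_owned` comes from the ToOwned trait; in current Rust the
// str case is specialized, so they end up doing essentially the
// same work, as Taylor notes.
fn main() {
    let slice: &str = "hello"; // the "ampersand string" / string slice

    let a: String = slice.to_string(); // via ToString
    let b: String = slice.to_owned();  // via ToOwned

    assert_eq!(a, b); // same resulting owned String
    println!("{a}");
}
```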
Jeremy: [00:47:40] I'm not sure what the, what the solution for that is. Whether that's, like you said some other intermediate guide or, but then again, it's kind of like, How do you, determine what to put in there and how do people get directed to that versus the relationship you're talking about, where you're, you're talking to experienced people and they just have all that context in their head and they can tell you, it's a hard problem to solve for sure.
Taylor: [00:48:05] Well, yeah, and like I said, the compiler's fantastic. A lot of times you learn from the compiler messages, and that's how I finally learned that when you give a 'static constraint on a trait, you're not saying it has to be static, it just has to fit into a 'static, which was like a mind blown moment right there.
And I was like, Oh, that's cool. Okay. I get it now. But before I'm like, I don't want to make this static. That's going to like bloat the size of this. And I like, I can't have this allocated on a stack. Like, this is huge, but it's like, no, It's just saying this needs to fit in a static. And I'm like, okay. And I learned that I think from a compiler, I could be wrong, but I think that the compiler said this doesn't fit in something or whatever.
And I'm like, wait a second. And I, and I dug into it and found it. So I think that a lot of the work that people discussed this at rustconf about making the compiler, just kind of guide you through things and anticipate it is quite impressive. And I'm very, very happy with that. So maybe that is that intermediate way that eventually get there.
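The `'static` bound Taylor mentions can be demonstrated with a short sketch (illustrative names, not from any real project):

```rust
use std::fmt::Debug;

// A `'static` bound on a generic parameter does not force the value
// to live forever; it only requires that the type owns its data
// (contains no short-lived borrows), so it *could* be kept around
// indefinitely -- i.e. it "fits into a 'static".
fn store<T: Debug + 'static>(value: T) -> Box<dyn Debug> {
    Box::new(value)
}

fn main() {
    // An owned String satisfies `'static` even though it is created
    // at runtime and dropped normally -- no static allocation involved.
    let boxed = store(String::from("hello"));
    println!("{:?}", boxed);

    // A borrow of a local would NOT compile here:
    // let s = String::from("oops");
    // store(s.as_str()); // error: `s` does not live long enough
}
```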
But I'm just saying that I think there's some tooling there, but when people get in, that learning curve is just a little bit, I like to describe it as logarithmic, right? Like, you have that initial punch over the top.
And then once you understand that you can be quite capable of producing things quickly.
Jeremy: [00:49:17] You were talking about some of the issues you ran into, or roadblocks where you needed to get more intermediate help or talk to more experienced people. I wonder, when you first started, how did you first start learning Rust?
Did you go through the Rust programming book? Did you start making small projects? I'm curious how you approached that.
Taylor: [00:49:42] Yeah, I did a combination of both. I tried to do some of Rust by Example, the Rust book, just some of the basics. And then I tried reimplementing some parts of different projects that I had in the past, or things I just wanted to try.
And then I went on with trying to like actually implement in a real project, which was Krustlet. And you can see if you actually look at the Krustlet code, you can see where like, Oh, they must have been new there and we've been going through and cleaning that up as we go along. But having that real project and something I could like go towards, that's how I learn best.
It's just having like an actual goal of either reimplementing something or, Or, or finding like an actual project to go, like dig into. And that's how, that's how I work, but it's a little bit different for every person
Jeremy: [00:50:33] Was there a moment, I guess, where you felt like you were really struggling and then once you pass this point that, rust clicked for you or was it, did it feel pretty straightforward as you were going through the process?
Taylor: [00:50:47] It was a little bit gradual, more than a specific moment where I was like, oh, everything clicked. I do have those moments, like I mentioned before, like when I understood what a 'static constraint on a trait means, that was like, oh, mind blown, I finally get it. But it was more of a, once I started actually doing more complex traits or trait bounds.
I think when I finally felt I was getting it was when I could do something like that, where T is this plus this, and N is this plus this, like with the long where-clause constraints at the end of a function, and I understood what it was doing and why I did it that way. And I think also a combination of implementing some of the traits, like AsRef and AsMut, those kinds of things, so that I could pull out, or convert, or have a wrapper type.
And I'm like, okay, I'm finally getting how all this glues together. that's when I that's, when I noticed, I think, okay, like I can, I can do this now. Like I can put together some, some cool things. but yeah, it comes, each thing comes with its own accomplishments. Like I had mentioned before, we had the state machine that we've just finished and we're cleaning up right now.
And that, that state machine thing was like, okay, like we finally got something that worked. I think it's recently where I felt like, okay, I feel like I can actually be a good contributor to the community and maybe even start mentoring others properly because I have the knowledge to do it because now, now we create something new that as far as we can tell, like, people haven't done something like this in rust before, outside of just toy things.
And so like, we're, that's why I'm excited. I wish we had had like that today. So I could say, Oh, here's this blog post, but I'm really excited for that, because we're going to talk about like all the work we built on from people in the community who had posted about it and all these things in it. Okay. We managed to get something that works.
It still has rough edges. It still has things, but we've got something that works well for us. And so that's been fairly recent. And so I think there are just moments where I keep understanding more, but for me it was more of a gradual turning, and then all of a sudden I realized, oh, I think I've gotten to the point where I know it. It wasn't like a click. It was more: oh, I just realized I actually know what I'm doing now.
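The kind of trait bounds and conversion traits Taylor describes (where-clauses, AsRef, wrapper types) might look something like this hypothetical sketch:

```rust
// A wrapper type exposing its inner data through AsRef, plus a
// generic function with a where-clause bound -- the "T is this plus
// this" style Taylor mentions. All names here are illustrative.
struct Wrapper {
    inner: String,
}

impl AsRef<str> for Wrapper {
    fn as_ref(&self) -> &str {
        &self.inner
    }
}

// Accepts anything that can be viewed as a str.
fn shout<T>(value: T) -> String
where
    T: AsRef<str>,
{
    value.as_ref().to_uppercase()
}

fn main() {
    let w = Wrapper { inner: "krustlet".into() };
    println!("{}", shout(w.as_ref()));  // via the wrapper's AsRef
    println!("{}", shout("plain str")); // a plain &str works too
}
```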
Jeremy: [00:52:56] Yeah. Yeah, no, that, that makes sense. You've been mentioning how Krustlet, has some rough edges.
And I know on the project page, it mentions how it's highly experimental. What do you think are the, the big parts that are, that are currently missing for somebody who would want to go in and actually host their application using Krustlet?
Taylor: [00:53:16] Well, one of them is mostly out of our hands, which is networking in WASI. We're trying to work with the community on that, and we're going to see if there's a way we can help solidify it and jump in and work on it. But we're getting close. So, and no one can hold us to this, but we're hoping to get towards a 1.0 release towards the end of the year, beginning of next year.
and so like around the holiday season, things will slow down, whatever. So that's why we don't know it could be January, February, and that's what we're hoping for. And really the big things that we have are, we have basic volume support. but we don't have cloud volumes support. So we're going to be looking in a way of how we can make sure every provider can, can use this.
We're trying to solidify the API at the same time, because we have people who are writing other providers, and a provider is just something that is an implementation of a runtime. And so we've written ours for WASM, but we have another person who's a core maintainer, his name's Kevin.
and he's been, recently made a core maintainer of the project as well. And he's working on one for containers. so he's just moving the container implementation stuff over to rust because of all the security benefits and things. And so we need volume support for all the providers, and then we're going to try, and then we're going to figure out a way we can abstract the networking, probably using the same interfaces that Kubernetes has already defined, so that we can have networking implementations more, more readily available and connected into these things from, the rest of the Kubernetes world.
And then after that, we need to have like a real demos like we have demos, but we need like some, like, I want, like, here's a real application as far in so far as you can make it real and take that and put together, some bootstrapping things. So it's easier for people to set it up. We want to make it as one click as possible to set it up.
And so those are kind of the things we're looking at, and where people can look for rough edges. But if you want it, for example, to trigger data pipeline processing, you can use just wasCC, or you can glue together wasCC and a WASI provider, which is the wasmtime one. You can glue both of those together, or have two of them running, and you can trigger a data chain using an HTTP call and then process the data using one.
You can do that right now. We had an intern on our team over the summer, and she did some work on Raspberry Pis, using a Raspberry Pi Kubernetes cluster and then using Krustlet to read soil sensors. You can do some fun things with it. It's just not all the way there, and that's partially because of where things are. This is bleeding edge, and that's why we put the big warning sign on the readme.
Like, please don't run this in production. This is so new, not just the project itself, but also the technology around it. And so those are kind of the steps we have. At this point, I maybe would remove the highly from Krustlet's description if I really wanted to, because now it's just: this is an experimental project.
That's kind of the goal for the future here, but we're not that far out from having something more it's a 1.0 where people can actually people can start using it in a real way and maybe not perfectly, but in a real way.
Jeremy: [00:56:27] Last year, Solomon, the CTO of Docker, tweeted that if WASI and WASM had existed in 2008, then they wouldn't have needed to create Docker. And I'm wondering, from your perspective, thinking about the future of WASM and Krustlet, do you think that
running applications in WASM could become the default for server side applications in the future?
Taylor: [00:56:59] It's a possibility. I wouldn't peg it entirely for sure. I think it will be a mix. People are just getting on board with Kubernetes stuff and containers, and sometimes, like I said, I think some people go way too far into it and don't think; they just hear Kubernetes as a buzzword and want to do it.
But people are still just barely getting to that thing. So it'll be a long time if it does become the default. But I think it has the ability to, reach a very specific audience right now and have that grow. a lot of these constrained environments, like edge computing, you can't run Docker on there it's too much, too much overhead.
They just can't handle it. But you could run WASM modules. It also allows you to pack things in more tightly, because if everything's a small process, just this tiny little thing with tiny little binaries, you can run a lot more than you could with containers right now.
So I wouldn't say that it's going to, it's not a container killer, nor is that our current goal. Like we didn't, we didn't think that, like, we don't want to disparage that other technology. That's, that's something we still use a lot and effectively. but I, I do think that there is going to be some takeover of that space, at least in a small measure with all this stuff from wasm and WASI.
Because of its portability and the ideal that we'd be closer to a write once, compile once, and run anywhere kind of situation. People say it's a pipe dream, and I think we'll never get completely there. Even with WASM there are still constraints, there are still things that will be in place, but this makes it easier.
And having, if, if every language gets to the point where you can do it with rust, where you just say, here's my Wasi target and build it. That to me sounds like a very powerful way of doing it and not just for server side. I think that can reinvent a lot of things with normal applications as well. Just because of how portable they are and how then, applications could be tied to you instead of just like being tied to a computer or whatever it might be.
So there are some really interesting ideas here. It's just, those ones are a little bit further out. But I do think even in the short term we'll start seeing some good applications where WASI-compatible WASM binaries will be a better choice than using a container.
Jeremy: [00:59:17] And it sounds like maybe in the short term, you were talking about edge computing, and that might be something where you have a CDN running application code at their edge nodes, something like Cloudflare Workers, or Fastly's equivalent. Are you thinking that might be where these things start?
Taylor: [00:59:39] I think that's where it's already started in one sense. And that's one of the reasons we chose Kubernetes is we think there's more beyond Kubernetes and server side on my team. That's that's our belief that we believe there's more to that, but everybody is getting into trying to do Kubernetes and have this Kubernetes has become an API layer that people understand that a lot of people use and enabling WASM through that API gives people a reason.
We want people to start using this and say, Hey, like, why doesn't my insert language of choice? Have the support for WASM. I like WASI binaries. Can we please get that? And then as people do that, we start getting more motion around it.
Jeremy: [01:00:19] I know when I talk to people about WebAssembly, sometimes what I'll hear is they'll say, well, I'm happy writing JavaScript in the browser, why do I care about WebAssembly? And so, like you say, if there are more use cases for running it other than just in the browser, then that might inspire other languages, like Python or Ruby, or who knows what other languages, to focus on getting them to work on WebAssembly. So I think that's pretty exciting. And I think that's a good place to start wrapping up. If people want to learn more about Krustlet, or about what you're working on, where should they head?
Taylor: [01:01:00] I would definitely start with the actual project site. So that's deislabs/krustlet on GitHub. There are also some posts that we have in various places. I wish there was one amalgamation of all of this, but you can look at, it's deislabs.io/posts, I believe.
Let me just double check that.
Jeremy: [01:01:23] and we can probably get the krustlets specific, posts and then put those in the show notes as well.
Taylor: [01:01:28] Yeah, and I can send those, but yeah, it's deislabs.io/posts. that's posts for all of our projects, but you'll see some, at least three blog posts there about Krustlet. one was around our, our stack and heap allocation problems that we had and some lessons learned there. so those are, those are some other places you can go.
We have some other posts that hopefully we can send in the show notes that kind of give an overview of the different things that the reasoning behind this, if you're more interested at this from a high level, like a business perspective, we have a post for that we have some other things about the security things we got from it that we've posted around.
So there are lots of sources of information there, but if you really want to get started and look at the project and install it and try it out, go ahead and check out deislabs/krustlet on GitHub. That one will have the docs and everything you need to get started.
Jeremy: [01:02:19] Very cool. Taylor, thank you so much for talking to me today. It's been interesting learning about Krustlet and WASM. Kubernetes and all of that. And I think it's going to be very interesting to see where it goes in the future.
Taylor: [01:02:33] Well, thank you very much for having me. And hopefully everyone finds at least some of this interesting and useful.
Paul Smith is a Software Engineer at GitHub and the creator of the Lucky web framework. He previously worked at heroku and thoughtbot and has experience building applications using Rails and Phoenix. He's also the creator of the Bamboo e-mail package and the co-creator of the ExMachina test data package for Elixir.
This episode originally aired on Software Engineering Radio.
Transcript:
You can help edit this transcript on GitHub.
Jeremy: Today I'm talking with Paul Smith.
Paul is the creator of the lucky web framework and he currently works at GitHub. Today, we're going to talk about the crystal programming language and the lucky web framework. Paul, welcome to software engineering radio.
Paul: Thank you so much. Happy to be here.
Jeremy: There are a lot of languages for software developers to choose from. What excited you about crystal?
Paul: Yeah, that's really interesting because when I first saw Crystal, I actually was not interested at all. it basically looked like Ruby to me. And so I just think, okay, so it's a faster Ruby. And typically if I want to learn a new language and want something that feels really different, that pushes the boundaries on things.
I started getting more interested in compile time guarantees. I worked at thoughtbot previous to github and previous to Heroku and people were starting to get really into typed languages. Um, some people were starting to get into Haskell, which is like, you know, the, the big one that, I guess is probably one of the more type safe, but also hard to use languages.
Um, but also Elm, which has a good focus on developer happiness and productivity and explaining what's going on. And as they were talking about, how they were writing fewer tests and it was easier to refactor, uh, it started becoming clear to me that that's something I want. Um, one of the things somebody said was, if the computer can check the code for you let the computer do that rather than you, or rather than a test. so I started to get really interested in that. I was also interested in elixir, um, which is another fantastic language. I did a lot of work with elixir. I built a library called bamboo, which is an email library. And another called ex machina, which is what a lot of people use for creating test data. Um, so I was really into it for awhile.
And at first I'm like, wow, I love functional. And then I realized like. I can do a lot of, like a lot of the stuff I like about this I can do with objects. I just need to rethink things so that it uses objects rather than whatever random DSL
Jeremy: Cause I mean, when you think about functions, right? Like you've got this big bucket of functions and you got to pass in all the parameters right? Whereas, you know, in a lot of cases, I feel like if you have those instance variables available in the object, then the actual functions can be a lot simpler in some ways.
Yeah.
Paul: Totally. That's like a huge focus: making the object small so that it doesn't have too much. But that's how I began to feel with elixir, I'm like, I just have 50 args and most of them I don't care about. Like, I want to look at what's important to this method.
It's, you know, this argument. But with functions you're like, which thing's important? Is it the first thing? Probably not. That's probably just the thing I'm passing everywhere. And so I liked that ability to kind of focus in and know, this object has these two instance variables everywhere.
Jeremy: Yeah. It's kind of interesting to get your perspective because, it seemed like you were pretty deep into elixir if you had created, bamboo and ex machina and stuff like that, so it's kind of
Paul: Yeah. I was like way gung ho and, and then I started missing objects. And luckily with crystal and ruby, you still get a lot of the functional stuff. Like you can pass blocks around. Um, that's functions. You can use functions. But it's not the other way in Elixir, you can't use objects. It just doesn't exist.
And then the type safety. I'm just like, I still run into so many errors and it was so frustrating. I don't want to do that.
The main benefit I got out of elixir compared to rails, um, which is what I had been using and still use a lot of, was speed. That was really big. Um, in terms of bugs caught about the same, mostly because it's still for the most part dynamically typed language with very few compile time guarantees. Um, so I'd still get the nil errors. I'd still mess up calls to different functions and things like that. And so that's where I ran into crystal. It has the nice syntax. I like from elixir and Ruby. It's also very, very fast. Faster than go in some benchmarks.
So it's quick. Plenty fast for what I need.
And it has those compile time guarantees, like checking for nils. That's a huge one. and it also makes the type system very friendly. So it does a lot of type inference. And very powerful macros so you can reduce some of the boiler plate.
And so that's when I kind of started getting into crystal: seeing that with Elixir I still got a lot of the bugs I was running into with rails, but I liked the speed. And I didn't want to use Haskell, and Elm doesn't exist on the backend. So I started looking at crystal.
Jeremy: And so it sort of sounds like there's this spectrum, right? You have Ruby and you have elixir, where you don't necessarily specify your types, so the compiler can't help you as much. And then you've got Haskell, which is very strict, right? You have a compiler that helps you a lot. Um, and then there are kind of languages in between, like, for example, Java and C and things like that. They've been around for quite some time. How does crystal sort of compare to languages like those?
Paul: Yeah, that's a great question, cause I did look at some of those other ones. TypeScript, for example, is huge. Kotlin was another one that I had looked at, because it's Java but better, basically. That's the way it's pitched, and so far everyone that's used it has basically said that. And also looking at rust. What it came down to was how powerful the type system was. So crystal has union types, which can be extremely helpful, um, and it catches nil. Java does not have a good way to do that. Um, Kotlin does. But also boilerplate, and the macro system: crystal's is extremely powerful. Elixir also has a very powerful macro system.
But crystal's is type safe, which is even more fantastic. So basically what that let me do with lucky, it was build even more powerful type safe programs. And we can kind of get into that once we, we talk about lucky and how that was designed. Um, but basically with these other languages, a lot of what we do in lucky just simply wouldn't be possible or wouldn't be possible without a significant amount of work and duplication.
Jeremy: You covered a few things there. One of the things was, macros, what are are macros?
Paul: Yeah. This is like a confusing thing. It took me a while to get what it is. But, uh, in Ruby, for example, they have ways of metaprogramming that are not done at compile time. For most compile time languages, compiled languages, I should say, you need macros to de-duplicate things, and basically what a macro does is it generates code for you.
The way I think about it is basically you've got a method, or a macro, but it looks like a method. It has code inside of it. And it's like you're copy pasting whatever's inside of that macro into wherever you called it from. So in other words, rails has has_many, like has many users, has many tasks, and that's generating a ton of code for you.
So that's how Ruby does it. Um, in crystal, has_many would be a macro, and it would literally generate a ton of code and copy paste that into wherever you called it. Um, so it's just a way to reduce boilerplate.
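Crystal's syntax aside, the "macro generates code at the call site" idea Paul describes can be illustrated in Rust's `macro_rules!` (a sketch with made-up names, not Crystal's actual macro system):

```rust
// A minimal declarative macro that generates an accessor method.
// Invoking it inside `impl Task` is like copy-pasting the generated
// `fn` into that spot, similar to how Paul describes `has_many`.
macro_rules! has_field {
    ($name:ident, $ty:ty) => {
        fn $name(&self) -> &$ty {
            &self.$name
        }
    };
}

struct Task {
    title: String,
}

impl Task {
    has_field!(title, String); // expands to `fn title(&self) -> &String { ... }`
}

fn main() {
    let t = Task { title: "write docs".into() };
    println!("{}", t.title());
}
```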
Jeremy: So in the case of dynamic languages, like Ruby, when you talk about Metaprogramming, that's having I guess, a function that is generating code at runtime, right? And the macro is sort of doing something similar except it's generating that code at compile time. Is that kind of the distinction?
Paul: That's the way I look at it. there are people much smarter than me that probably have a more specific answer about what the differences are, but in my mind and in practical usage, that's what it comes down to in my mind.
Jeremy: Let's say there's a problem in that code, what do you get shown in the debugger?
Paul: Debugging macros is definitely harder than debugging your regular code, for that exact reason: it is generating code. So what crystal does, uh, there are different ways of doing this, but I like crystal's approach: it'll show you the final result of the code, and it'll point to the line in the generated code that caused the issue and tell you which macro generated it. Now, it's still not ideal, because that code isn't code you wrote, it's code that the macro generated, but it does allow you to see what the macro generated and why it might be an issue.
Part of that can be solved by writing error messages and error handling as part of the macro. So, in other words, making sure, if you're expecting a string literal, you can have a check at the top that checks for it to be a string literal. I wouldn't use them by default, but it's great for, I think a framework where you have a lot of boiler platey things that you're literally typing in every single model or every single controller, and that people kind of get used to. It's well tested. It has nice error messages. In my own personal code though, I pretty much never used macros. They're only in the libraries that I write.
Jeremy: Another thing you mentioned is how crystal helps you detect Nils or nulls. Um, how does, how does the language do that?
Paul: It actually uses union types for that, some languages that have this, they'll have an optional type, which is basically a wrapper around whatever real type, like an optional string, optional int, and you have to unwrap it. The way crystal does it is you would say string or nil, and there's a little bit of syntactic sugar.
So you can just say string with a question mark at the end. But that gets expanded to string or a nil type. Um, so then within that method, the compiler knows that this could be a string, could be a nil, and there's a little bit of sugar there where the compiler, if you say, if whatever variable you have, it's going to know that within that, if it is not nil and in the else it is.
So there's a little bit of sugar there as well. Um, but that's basically how they handle it. And there are ways to force the compiler, to just say, Hey, this thing is not nil: you can call not_nil! on it. I would avoid that, because maybe the compiler's right and it really is nil. Or maybe you change the method later and then it can become nil, and you're going to get a runtime error there.
But it does have those escape hatches. Cause sometimes you just need the quick and dirty and you can, if you need to.
Jeremy: So as long as you don't tell the compiler that, then you will actually have a compile error. If you have a method that takes in, let's say, some type of object or a nil, and you don't account for the fact that it could be nil, then the compiler actually won't let you compile. Is that correct?
Paul: That is correct. So for example, if you just had a method that's like print email, and it accepts a user or nil, now, I'm not saying I would do that, but let's say that it does. And you tried within that method to do user.email to print the user's email. Um, it's going to fail and tell you that nil does not have the method email.
And so you need to handle that. And then, yeah, you're forced to either do an if, or, for example, you can use try, which is basically a method that says: call a method on this object unless it's nil; if it's nil, just return nil. But yes, it kind of forces you to do that.
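Paul contrasts Crystal's union types with the wrapper style other languages use; for comparison, here is what that wrapper approach looks like with Rust's `Option` (a sketch mirroring the hypothetical print email example, not Crystal code):

```rust
struct User {
    email: String,
}

// The compiler will not let you reach `.email` through the Option
// directly; the None (nil) case must be handled, much as Crystal
// forces an `if` on a `User | Nil` union.
fn print_email(user: Option<&User>) -> String {
    match user {
        Some(u) => u.email.clone(),
        None => String::from("no user"),
    }
}

fn main() {
    let u = User { email: "paul@example.com".into() };
    println!("{}", print_email(Some(&u)));
    println!("{}", print_email(None));
}
```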
Jeremy: And in crystal, how do you handle errors? Because a lot of different languages, they'll have things like exceptions or they may have result types. What's sort of the the main way in crystal?
Paul: I'd say I'd group it into two types of errors, where you still have runtime exceptions, because things do break. Not everything is in a perfect world inside your type system: databases go down, you know, redis falls over or whatever. So you still have runtime exceptions, and then you have the compile time errors, which we kind of just talked about.
But in terms of how those runtime exceptions are handled it's I don't want to say exactly the same as Ruby, cause there probably are some subtle differences, but extremely similar to Ruby and that you're not passing around errors. It's so, it's not like go where you are explicitly handling errors at every step.
Um, you raise it and you can rescue that error, kind of like a try catch in other languages, and you can also just let it bubble up and rescue at a higher level, which I personally prefer. Because not every error is something that I care about, and forcing me to handle every single error everywhere means that it is harder as a reader of the code to tell which errors I should care about, because they're all treated as equal.
So I like that in crystal, I can say this particular error, in this particular method, I want to handle in a special way. And somewhere up above the stack, I can just say: anything else, just print a 500, log it, send it to Sentry.
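The "handle one error specially, let the rest bubble up to a catch-all" pattern Paul prefers can be sketched in Rust with `?` propagation (illustrative names; Crystal itself uses raise and rescue rather than Result types):

```rust
use std::fmt;

// A made-up error standing in for "redis falls over" style failures.
#[derive(Debug)]
struct ServiceDown(&'static str);

impl fmt::Display for ServiceDown {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} is down", self.0)
    }
}

fn fetch_count() -> Result<u32, ServiceDown> {
    Err(ServiceDown("redis"))
}

// `?` passes the error upward instead of handling it here,
// similar to letting an exception bubble up.
fn handler() -> Result<String, ServiceDown> {
    let n = fetch_count()?;
    Ok(format!("count: {n}"))
}

fn main() {
    // One place near the top of the stack decides what to do,
    // like a catch-all rescue that logs and returns a 500.
    match handler() {
        Ok(body) => println!("200: {body}"),
        Err(e) => println!("500: {e}"),
    }
}
```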
Jeremy: Yeah, so it's very similar to, like you said, Ruby, or any other language that primarily relies on exceptions. Like I think Java for example, probably falls into the same category.
Paul: probably. I haven't used it in quite some time, but I imagine it would be similar.
Jeremy: You had mentioned that that crystal is like pretty, pretty fast compared to other languages. what are the big. benefits you've gotten from that raw speed?
Paul: The biggest benefit I would say is not having to worry so much about rendering times. In rails, for example, you can spend a ton of time in the view. Even though everyone says databases are slow, they're not that slow; in something like rails, active record takes a huge amount of time to instantiate every single record.
So how does this play out in real life? You could, for example, in lucky, if you wanted to load a thousand records and print them on the page, probably do that in a couple hundred milliseconds, which is a totally reasonable response time. The same thing in rails would be many seconds, which is not reasonable in my opinion.
And this can be really helpful, partly because it just means your apps are faster, people are getting the response as quickly. But also because you have a lot more flexibility. I've built internal tools where they want to have the ability to search all of the inventory or products or whatever else and they want to have like a select all or be able to select everything.
And in rails, you can't just render all 1000 products cause it basically falls over and you can try and cache stuff. But then that gets complicated. Um, so you kind of have to paginate. But when you paginate that makes it hard to select things across multiple pages, it's then you need some kind of JavaScript to remember which ones you selected across pages, and it just balloons the complexity, right?
If you know, hey, we only have eight or nine hundred products, we're not going to suddenly have 20,000, then in Lucky you just render them all, put them all on the same page, give them all checkboxes, and it's in the user's hands in 200 milliseconds and you're done. You just removed most of that complexity. So those are some of the ways that speed is playing out. And I think one key difference there is some people think speed is just about scalability: how many people can be using this? The speed improvements I care about are the ones where, even if you have one request per day, I want that request to be insanely fast. And so that's kind of what you're getting with Lucky and Crystal.
Jeremy: When you talk about web applications, you know, with Lucky being a web framework, a lot of people point out that a lot of the work being done is IO, right? It's talking to the database, it's making network calls. But I guess you're saying that rendering that template, those are things where actually having a fast language really does make a big difference.
Paul: It does. Yeah. I think the whole database IO thing, a lot of times that's what people say when they're working with a slow language. If you have a fast one, it's not as big of a deal. This was the same with Phoenix and Elixir: I loved how quickly it could render HTML. That was huge.
Jeremy: And like you said, that opens up options in terms of, not having to rely on caching or pagination or things like that.
Paul: Yeah. This is huge. I mean, an example from work: we just announced GitHub Discussions, and I'm on that team. One of the big things we were trying to get working was performance of the discussions show page. You could have hundreds of comments on that page, and we were finding that most of the time was actually spent rendering the views and calling methods on the different objects to render things differently, and that was in the seconds. And we can't cache those reliably, because there are so many different ways to show that data. If you're a moderator, you get certain buttons. If you're an unverified user, like someone who just signed up, you see a different thing. If you're not signed in, you see a different thing. So you can't reliably cache those. We had a lot of cool techniques to kind of get that down, but this is something that, if it were written in Lucky, just would not have been an issue.
Jeremy: And GitHub in particular is written in Ruby, is that correct?
Paul: It is. Yeah. It's using Ruby on Rails, and I'm not trying to knock Rails. I really love Rails. I mean, I've been using it for 12 years. I like Ruby. But hey, if there's something that could be even better, I'm open to that.
Jeremy: For sure. You have used Rails for 12 years. How would you say your productivity compares in Ruby versus in Crystal?
Paul: I think that's tricky. It's kind of better and worse, and what I mean by that is: in Crystal, I'm more productive. Crystal does have compile times, and we can talk about that; they're not the fastest, they're not the slowest. But I do find that I can write more code and then compile once, and it kind of just tells me where the problems are. I have a lot more confidence, and I spend a lot less time banging my head on why isn't this thing working?
And it's because I passed the wrong type somewhere. However, Ruby has a massive ecosystem, so there are things that exist in Ruby that I would have to rewrite in Crystal. And so, no matter how productive I am in Crystal, that for sure is not as productive as requiring the gem and then just using it.
So the hope with Lucky, though, is that we're building up enough things that you don't have to be rewriting everything. And the community has also really stepped up and written a number of libraries that are super helpful for web development. For example, somebody just wrote webdrivers.cr, which makes it so that it can automatically install the version of ChromeDriver that matches the version of Chrome that you have installed.
So you don't have to manage that at all. That's something that was in Ruby for a while, and it will be in Lucky, probably in the next release. So yeah, I think it's better. It's one of those things that will get better with time.
Jeremy: So in terms of the actual language, productivity in Crystal sounds like basically a net positive, but it's more the community aspect, how many libraries are available, where a lot more time is taken.
Paul: I think so. And then just the initial ramping up. It is a new language, and so there aren't as many Stack Overflow questions and answers, and there aren't as many tutorials. So there are definitely some things there. But like I said, those are things we're working on, especially for 1.0 of Lucky: trying to make sure we have really good guides and really good error messages.
We tried to borrow a little bit from Elm. Not specific error messages, but just the idea that an error message should say something human-readable and understandable, and if possible, guide you in the right direction of what you probably want to do, or at least point you to documentation to make it easier.
So we're trying to help with that as much as we can.
Jeremy: I kind of want to move next more into your experience building Lucky. You know, you were a Rails developer for many years. Are there any specific major pain points, I guess, in Rails or in your previous web development experience that you wanted to address with Lucky?
Paul: Yeah, there were. Some more specific than others, and some easier to solve, in the sense that the solution either works or it doesn't, and others that are a little bit more abstract. So I'll talk about some of the specific things. I've often said that I'm into type safety. I don't think that's quite true, and I think, especially if you haven't used Lucky, it just doesn't click what that means or why it matters. Because you just think, oh, so, you know, don't tell me if I pass an integer instead of a string. Like, who cares? I'm not seeing those kinds of errors.
What I'm most interested in is compile-time guarantees, whether that's with a type or some other mechanism. And that's there not just to prevent bugs, but to help you as a developer spot problems right away and give you a nice error so you know what to do about it. So, for example, one of the things that I've seen in basically every framework I've ever used, regardless of whether it is type safe or not, is that you need to use an HTTP method, a verb, and a path.
So, for example, if you want to delete a user, you would have /users/1, where one is the ID. The tricky part is you have to use the HTTP method DELETE for it to do the delete action. But sometimes you forget that and use a regular link, and you wonder why the heck it just keeps showing you this thing instead of deleting it. The particularly insidious one is when you have an update and a create: one uses POST, one uses PUT. If you have an update form and you forget to set the method to PUT, you get all kinds of routing errors, because it says, hey, this doesn't exist. And you wonder, why doesn't this exist? I can see it right here. I've got the route, I've got everything.
Oh, it's because I forgot to set the HTTP method to PUT. It just wastes time. So that's one of those things where we wanted a compile-time guarantee in Lucky. And so, I don't want to go too in depth here, but basically what we did was we made every controller into a single class that handles the routing and also the response.
Jeremy: If I understand correctly, when you have a page and you want to link to a specific user on that page, then you would use this link function, and you would pass in the class that corresponds to showing a user, and then you would pass parameters into that function, like, for example, the ID of the user.
And if you didn't do that, then you would have an error at compile time, not runtime.
Paul: Correct.
Jeremy: you. You wouldn't need to like start the website and then go to the page and have it, basically explode, which I guess is typically what you would expect from most web frameworks.
Paul: Or, what's worse, it wouldn't explode. It would just generate the wrong link, and you would have to remember to click that link, or write an automated test that clicks that link. So it's really easy for bugs to sneak in, and this just completely prevents that class of bug. It also makes life easier, because if you forget a parameter while you're developing, from the start, instead of just generating something with a nil ID, it's going to say, hey, you forgot this.
It just saves a lot of debugging time, and I think it's also more intuitive. If you've ever used Rails helpers, or Phoenix helpers, any of these, man, the conventions: is it singular, is it plural? Does it have the namespace or not have the namespace? In Lucky, that's gone. You just call the action, the one that you created; you call that exactly as is.
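A sketch of what that single-class approach looks like, based on Lucky's documented routing style (the Users::Delete action and its body are illustrative, not from a real app):

```crystal
# The action class owns both the HTTP verb and the path.
class Users::Delete < BrowserAction
  delete "/users/:user_id" do
    # ... find and delete the user, then redirect
    redirect to: Users::Index
  end
end

# In a page, you link to the action class itself rather than a
# hand-written path:
#
#   link "Delete", to: Users::Delete.with(user.id)
#
# You can't pick the wrong HTTP method, because the action already
# knows it's a DELETE, and forgetting the id fails at compile time,
# because `with` requires the :user_id route parameter.
```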
Jeremy: It sounds like this is maybe a little more explicit, I guess?
Paul: Yeah, it's a little more explicit, but I hesitate. I've heard a couple of things in the programming community. One: Rails started as convention over configuration, which was huge, because you had to learn the convention, but at least once you did, you knew how other Rails projects were laid out. And another one I hear is explicit over implicit.
I don't buy into either of those in particular, because sometimes implicit is better, sometimes explicit is better. As a quick example, I don't hear anyone arguing to bring back the old Objective-C, where you had to manually retain and release memory. That is technically more explicit.
But does anyone want to do that? No. So I don't think it's explicit over implicit; you have to think about it. Everything needs to be judged in its own context. And what I think is even better than convention over configuration is intuitive over convention, meaning you don't even think about it.
There doesn't even need to be a convention, because you're literally just calling the thing that you created, like anything else. There's nothing special about it. It's a class just like any other class, and you call a method on it just like any other method.
I think it's tricky, because it's also easy to say explicit over implicit and make your code super hard to follow. And it's like, yes, it's more explicit, but also I just wrote 20 lines of code instead of one, and those 20 lines could differ because I do it differently than the other guy or girl.
Jeremy: Another thing about Lucky that's a little different is that for templating, instead of having somebody write HTML and embed template language code in it, you instead have people write Crystal code.
So could you explain why you made that decision and what the benefits are?
Paul: Yeah, sure. So a lot of things with Lucky, actually, I kind of did not want to do, or they were definitely not how I started doing things, and it just moved in that direction based on the goals. And I think that's part of what makes Lucky different: we don't say, here's how I want to do it.
We say, here's what I want to do, and I want it to be easy, simple, and bug free. So what we started with was using templating languages, just like you'd use in almost anything, where you write your HTML and then you interpolate values in it. At the time I wrote Lucky, and this may have changed now, you could not use a method that accepted a function, or a block as it would be called in Crystal, and have that output correctly in the template. I think it just blew up. I don't remember; this was two or three years ago. The other problem I was having was it's not just a template. Any bigger framework also has partials, or fragments, or includes, or whatever you want to call them. It also has layouts, where you can inject different HTML in different parts of your HTML layout. And those are all things that a person has to learn when they're learning your framework: what the methods are for generating a partial, for calling a partial, or for injecting stuff in different layers of the layout. And it's also more stuff that I have to write. And with Lucky, there was already a lot to write. We were building the ORM and the automated test drivers and the router and everything. So I can't afford to just do stuff like everyone else does it if it's not pulling its weight.
So eventually I started experimenting with building HTML using classes and regular Crystal methods. Some of the requirements for me when I was building it were: it had to match the structure of HTML, and it had to be very easy to refactor, meaning I can pull something out into a new method and it just works.
So easy refactoring. And then I also needed to be able to do layouts with it. The reason for that is Elm also uses code to generate HTML. However, it is not approachable to a newcomer. If, for example, you have a designer and they pull up an Elm view and try to look at what that generates?
No way. I mean, I'm a programmer, and I still don't know what it generates without really looking through the Elm. And that's partly because you are generating data objects: arrays of arrays, or maps, or whatever else. So I didn't want that. It has to be approachable to people, and look and be structured like HTML.
And we were actually able to do that. I don't know if I need to go into huge detail, but basically you can say: hey, I want a div, inside of that I want an h1, underneath that I want another div. And you're not building arrays and maps and anything else. What that provides is actually a lot of things that I did not think of.
One: super easy refactoring. If you have a link on a particular page and you don't want to copy it over and over and over, extract a method and call it like any other method. There's nothing to learn. It's just a method, like anything else. It can accept arguments, just like anything else. Your conditionals work.
You can extract that into a component, which is basically another class, and it tells you explicitly, here's what I need to run, and it renders the thing. You always have the correct closing tag. I have been bitten so many times by shifting stuff around and forgetting a closing tag, and my whole page looks wonky and I have to go through layers of indentation.
That just doesn't happen here. If you forget an end (you have a do/end when you're creating these blocks), it blows up and says, hey, you're missing one. And the coolest part is you just add an end in there, run the Crystal formatter, and it re-indents everything perfectly. And then on top of that, as if that wasn't enough:
I just loved how easy it was to refactor and use. You don't have to split up your code from your template. In Rails, you would have a helper: so you've got your template, but then you might have a helper in a totally separate file. Here, if you've got something that pertains to just that page, you can just extract a method.
It's right there. But this also made it so we can do layouts without any special work. Your layout is basically a class. You would say, here's my class with the head; it renders the head, renders the HTML body or whatever, and then it calls a content method or a sidebar method or whatever else. And your page, say one that renders a list of users, inherits from that class and implements a content method or a sidebar method. So when that's rendered out, it just calls those methods. We got all of that for free. If you look at our view rendering code, it's 50 lines, because basically we use a macro, give it a list of tags, like paragraph, h1, h2, whatever, and generate a bunch of methods.
And that's basically it. So from an implementation perspective, it's extremely simple. Plus, you get all these niceties: refactoring is super easy, and it's super easy to tell what a page needs to render. At the top of the page, you just say, I need a user, I need a paginator, I need a current user.
So you know what that page needs. You don't get that with a template. And you get all the power of Crystal for rendering layouts however you want. That all basically came for free. So it was kind of happenstance that templates weren't working, and this has worked out better. A lot of people, when they see this, they're like, what the heck is this? I hate it. And I always just say, give it a try. Just give it a try for a little bit. So far, one person has said, okay, I don't like it, and you can use templates if you want; we've actually built that in. But everybody else is like, now that I've used it, I love it.
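Roughly what the two pieces described above look like: a macro that stamps out one method per tag name, and a page class built from plain Crystal methods. This is an illustrative sketch, not Lucky's actual source; the class and helper names are hypothetical.

```crystal
# A macro generates a method per tag, which is roughly how a
# 50-line implementation is possible.
macro define_tags(*tags)
  {% for tag in tags %}
    def {{tag.id}}
      @io << "<{{tag.id}}>"
      yield
      @io << "</{{tag.id}}>"
    end
  {% end %}
end

# A page declares what it needs at the top, and content is just
# method calls and do/end blocks.
class Users::IndexPage < MainLayout
  needs users : Array(User)

  def content
    h1 "All users"
    ul do
      users.each do |user|
        li { user_link(user) }
      end
    end
  end

  # Refactoring is extracting a private method, like anywhere else.
  private def user_link(user)
    link user.name, to: Users::Show.with(user.id)
  end
end
```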
Jeremy: What it sounds like is, in a lot of JavaScript frameworks, for example React, there's this concept of components, right? And so you can create what look like new HTML tags, but really have some other HTML in them. Let's say you have a list of businesses; maybe you have a component that would have all the business details in it. It sounds like in the case of Lucky, you kind of can do the same thing. It's just that your component would be in the form of a Crystal class. And so there isn't any new syntax, and you're not mixing different languages, like HTML and JavaScript. Instead, everything is just using Crystal.
Paul: Exactly. You have two options. You can extract a private method, because sometimes it's just a small thing you want to extract that's only used by one page; just do a method. If not, extract a class. And the cool part about all of this is that you don't need to restructure anything, meaning you can start with everything in one method, in your content method, and then you can pull out just a little bit into a private method.
And then if that's not enough, cool, pull that out into a class. So you're not forced into pulling out classes all over the place if you don't need one.
It really worked out well, because it also makes testing easier. You can pull out a class component that just does one thing, and you can instantiate just that component and test just that HTML. And once again, this is very easy, because it's a class; you call it and run it like any other class.
And that's been a big goal of Lucky: trying to reduce, and this also comes down to the whole convention over configuration thing, how do we make it so there is no convention? It's just intuitive. If you know how to extract and refactor a Crystal class, you know how to extract and refactor stuff for a page in Lucky automatically. And I mean, of course, there's still some degree of learning and experimentation, but it's the same paradigms. If you want to include methods in multiple pages, use a module, just like any other module. So that was very much a goal.
And that's part of other parts of Lucky too. For example, querying. In something like Rails, the model is for creating, updating, reading, everything. In Lucky, you do create a model, and we use macros to actually generate other classes for you, but you have a query object that is a class.
Jeremy: What am I passing into my query object? What does that look like?
Paul: Let's say you have a User model. By default, it generates a User::BaseQuery, so basically you have this new object namespaced under the model. And by default, the generators generate another file.
And basically what that does is it creates a new class called UserQuery that inherits from that User::BaseQuery class. What you would do in your controller action, or anywhere, is say UserQuery.new. By default, that just gives you a new query that would query everything in the database, unless of course you overrode initialize and did something else; then it would use that scope. So if you wanted to filter down further, for example if you wanted the name to be Paul, it would be UserQuery.new.name("Paul"), with "Paul" as a string. Because Lucky generates methods for every column on the model, with compile-time guarantees. So if you typo that method, it's going to blow up. If you renamed the column later, it's going to blow up. If you accidentally give it nil, it's going to blow up and tell you to use something else. But that's how you would do it.
You say .name("Paul"). We also have type-specific criteria and methods. You can do things like .age.gt(30), gt for greater than. And so you have this very flexible query language that's all completely type safe. So in your scopes, if you wanted to do something like recently_published for a post, inside that method you would do something like published_at.gt(1.week.ago).
And you can chain that. So you could do PostQuery.new.recently_published.authored_by("Paul") or whatever. So that's basically how it works. You just have these methods that are chained, that you can build upon in pretty much any way you want.
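Putting those pieces together, the query API spelled out verbally above looks roughly like this. The User and Post models and their columns are hypothetical, and the method names follow the patterns described, so treat it as a sketch of the shape, not exact code:

```crystal
# Columns become chainable, type-checked methods on the query class.
users = UserQuery.new
  .name("Paul")  # typoing the column or passing nil fails at compile time
  .age.gt(30)    # type-specific criterion: greater-than on a number column

# A reusable scope is just a method on your query class.
class PostQuery < Post::BaseQuery
  def recently_published
    published_at.gt(1.week.ago)
  end

  def authored_by(name : String)
    author_name(name)  # hypothetical column
  end
end

posts = PostQuery.new.recently_published.authored_by("Paul")
```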
Jeremy: In a lot of applications now, people use JavaScript frameworks, whether it's React or Vue or Angular. What does integrating with JavaScript libraries and frameworks look like in Lucky?
Paul: I think it's easier than a lot of them, in the sense that you can generate a Lucky project in different modes. So when you initialize a project, you can use just the command line with some flags, or the default, which is to walk you through a wizard. It will ask, do you want API only, in which case, you know, it won't even have HTML pages, or the default, which is a full app.
What that does is it generates a Webpack config for you. It sets up your public assets and images so that they can be copied and fingerprinted. And so, out of the box, it already has a basic Webpack setup that handles CSS and handles most of the ES6 JavaScript type stuff that people typically like.
That's just handled out of the box. If you want to include React or Vue, you would include that just like in any other Webpack project, in terms of building it. And it's actually a little simpler: we use Laravel Mix on top of Webpack, which is basically a thin JavaScript layer that calls Webpack under the hood.
If you want a full single-page app, that's also totally supported. You would basically have just one HTML page that, you know, has the basic HTML and body tags, and within that you mount your app. So whatever that is for your framework; in Vue, it might be just a tag, like a main app div.
And then in your JS, you would initialize that tag with your app. And we have fallback routing so that you can do client-side routing if you want. It's not particularly well documented, which is the biggest problem. Some people are helping with that, because a number of people have done React and Vue.
And so hopefully those will be fleshed out a little bit more, but it's totally supported. In the long term, though, we've got plans to make it so you don't even need those types of frameworks quite as much. Since we already have class components and a bunch of other things, I'm working on a way to add type-safe interactivity to HTML.
So you're not writing the JavaScript; you're writing Crystal for the most part. It can interface with JavaScript, and you can use React and Vue inside of it. But a lot of your simple open/close, anything like that, is going to be handled client side, yet written with Crystal. Server interactions will be sent over an Ajax request, but will also be type safe when you call the actions and do all the HTML rendering, similar to Livewire for Laravel or LiveView by Phoenix, but with some differences. That's not done yet, but it will be, and I think it's going to be really exciting. I've got a proof of concept up locally, and it's really awesome.
Jeremy: We had a previous episode on LiveView, and I think the possibilities of things like that are really interesting: being able to not have this sort of separation between your JavaScript front end and your server back end, yet still be able to have the kind of interactivity people expect.
Paul: Yeah, I think it could be cool. And that's also where speed comes into play. When you're doing interactions like that, you don't want to wait. Even a hundred, even 50 milliseconds is noticeable for those types of interactions. And so Phoenix is also fast; it has a really fast template language that basically gets compiled down to Elixir, and that helps a lot.
I do think there are some big flaws that I've seen in some other implementations. Well, I don't want to say flaws, that sounds a little overly harsh, but things that, for me personally, are just deal breakers. And one of those is that some client-side interactions have to be instantaneous. They just have to be. If I click on my avatar on the top right, I expect the menu that has settings and log out to be instant.
If there's any kind of latency in the network and it takes even 200 milliseconds, that's going to be a weird interaction, and it's going to feel like your app is broken. And of course that's exacerbated for people not in your country. This is another problem: people are building these things and deploying servers only in their own country.
Put a VPN in front of your computer to simulate Australia, or even the UK: 400 milliseconds. You just can't do that for a settings menu or for opening a modal. And so there needs to be some way to do those interactions instantaneously. The same guy that wrote Livewire for Laravel also built Alpine.js.
It looks a little bit like Vue, but it doesn't have a virtual DOM; it operates on the DOM that you generate. That's what it uses for client-side interactivity. So you can do the server-side stuff, and I mean, if latency's there, if you're submitting a comment, look, there's no way around it.
You've got to hit the server. But if you're opening and showing something, a menu, a tab, a modal, that's instantaneous and is handled by Alpine. So Lucky is actually going to use that, along with our own server-rendered stuff, to do client-side interactions instantaneously.
Jeremy: So Alpine, it's a JavaScript front-end framework, you said, similar to Vue but without the virtual DOM. And it sounds like what you're planning is to be able to write Crystal code and have that generate Alpine code. Is that right?
Paul: That's correct. Because it's mostly inline, it can't do everything. But most of what I want from client-side interactions are typically super simple things: I want to open and close something, I want to show tabs. And those are things that Alpine's incredibly good at, because you don't need a separate JavaScript file.
We can just generate something that uses Alpine's x- attributes: an x-on:click that toggles open to true or false, and an x-if or x-show depending on whether it's open or not. Those are things that we can very easily generate on the back end and make type safe, because we can say, you know, this has to be a boolean, and here's the action.
And all those things are then type safe. But you can still do JavaScript if you want, so you can still use JavaScript functions in there with your Alpine if you need to.
Jeremy: Yeah. The distinction between that and a LiveView or a Livewire is, my understanding is, with those solutions you're shipping over basically diffs of your HTML, and that's how it's determining what to change. Whereas you're saying you may still have some of that, but there are certain interactions where you just want to run JavaScript locally on the person's client, and you should still be able to do that, even if you are doing this sort of sending diffs over the wire for other things.
Paul: Yeah, exactly. Alpine's made for that. The biggest key differentiator from Livewire and LiveView is the type safety: all those nice things that you get in Lucky, you're going to get also for your client-side interactions. So if you have an action and you have a typo or something, it's going to blow up.
It's going to tell you if you forget something, or if you've used the wrong type. And this is something that's very hard in the front-end world, because you either have to run an automated test to make sure you catch these, or, the worst, you have to open up the console. Because, why isn't this working?
I don't know. Now I have to dig into the console. It's not even where you typically want to see logs. And so being able to shift that to where you're used to seeing errors, before you even have to open the browser, I think that's going to be a huge deal.
Jeremy: I think on the server side, testing is pretty well understood, in terms of, you know, especially if you have API endpoints, or you have just regular server code; people know how to test that. But on the client side, there are so many different ways of doing it.
It feels like a lot of them involve spinning up browsers, and it can get kind of complicated. So yeah, it'll be interesting to see if you can shift more of that to the server environment that a lot of people are used to.
Paul: Yeah, I think it will be cool. We'll see how it goes. And yeah, I do think there's definitely complexity that comes with moving it to JavaScript, especially if you have a single-page app, because then you need to spin up the server and an API when you run your Cypress tests or whatever. Or a lot of people mock the API, which sometimes is fast but can get out of sync, in which case you lose confidence in your tests.
So having it in one spot is, I think, really great. And we do have the capability to run browser tests built into Lucky, because I think it is still good to have at least a couple of smoke tests for your critical paths, to test the happy path. But I mean, if you can write fewer of those, that's great, because they take forever to run.
Jeremy: For sure. Yeah. In Lucky, there are a lot of features that in other frameworks would not usually be included. For example, there's authentication, and you have this setup check script to see if your app has all of its dependencies, things like that.
I wonder if you could explain how you decided what sorts of features should exist in the framework versus being something you'd leave to the user to decide.
Paul: I think, for one thing, if there's no downside, only upside, and almost everyone would benefit from it, I want to include it. So that's, for example, the system check script. We also have a setup script, and that's what we tell people to use, instead of saying, first install Yarn, and then run your migrations, and blah, blah, blah.
No, our documentation doesn't even mention that. It's just: run the setup script. And the idea there is it serves as kind of a best practice. It pushes you into things, saying, hey, put stuff that you need in here. Then we layer on the system check, which also runs before setup, and also every time you boot the development environment, where it'll check: hey, do you have a process manager?
Which you need. It'll check whether Postgres is installed and running, because that's required. So if you go back to that criteria: it's useful to pretty much everyone, meaning, if Postgres isn't running, the app's not going to work, so everyone would need to know that. And it doesn't really have a downside.
If you don't want it for whatever reason, you just delete it, or stop running it. That's not a huge downside; that's, you know, one click. So that's part of why that's included. I don't like spending time on things that aren't delivering actual real value.
So I don't like spending time figuring out why my local environment is not working, or why it was working and now suddenly isn't. And with something like a system check, that makes teams happier in the sense that, let's say all of a sudden somebody adds a new search capability and it requires Elasticsearch, and I do a git pull from master to do my feature. As soon as I boot the app, if they've added something to system check that says, hey, you need Elasticsearch, it's going to tell me. It's not going to just blow up.
It's going to be like, hey, you need Elasticsearch now, install that and run it. These are the types of things that I really think are gonna save a lot of time. In terms of auth, that's another one of those where so many people want it, and it should be easy and simple, not like five different ways to do it. But not everyone wants it, which is why we made it optional.
You choose in the wizard. Like, if you don't want auth, fine. I guess most people generate it with auth. I know I do, cause I need it. And the thing is, we also changed how auth works, in the sense that it's mostly generated code. It's not just a bunch of calls to some third party library. So what that means is it is easy to modify.
So if you want to add email confirmations or invitations or anything else like that. You're not mucking around in some third party library. It's code generated in your app that you can see and modify. So it doesn't lock you into anything. It's very flexible and it helps you get off the ground running.
And that's why that was included. Uh, and I'm sure we're going to have other stuff that may be included or at least an option of being included in the future.
Jeremy: Yeah. I think one of the conversations that people are having now, particularly in the JavaScript ecosystem, is that
you end up pulling in a lot of different dependencies, and you end up having to make a lot of different decisions. And so it's interesting to see Lucky kind of move back in the direction of, say, a Rails, of trying to include the things that you think most people building an app are going to need.
Paul: Yeah, it's a little more in that direction. I think on the flip side, Rails is starting to include so much that people are starting to get almost mad, and it's like, so much that you're like, what is this? What is happening? So we want to strike a balance there, and part of that is being very careful about what is included.
I think some of the things that are included in Rails could just as easily be added after the fact, meaning 20 minutes of work and you can add it. Those are the types of things I probably would not include in Lucky. If it's 10, 20, 30 minutes to, you know, add it and modify your app, and only 50% of people even want it,
We're probably going to just say, here's a guide on how to do it and make it easy, but not do that as a generator, if that makes sense.
Jeremy: What's an example of something like that that would be pretty easy to add in after the fact and doesn't necessarily need to be included?
Paul: Um, well, in Rails 6, it's coming up. They have this Action Mailbox thing that handles inbound emails. I'm pretty sure by default that is included. I could be wrong, so don't quote me on that. But I've been seeing a lot of Twitter stuff lately of people being super pissed about it, so I think it's there.
Um, that's something I definitely wouldn't include, because I think I've written one app ever that uses inbound emails. I mean, GitHub does too, but I have not written that, and a lot just don't have that. So it's odd to include it, especially given the fact that it's not particularly hard to set up yourself, based on what I've seen. Or Action Text is another one, where it has ways of making rich text editing easier. That might be something, too, where it could be added on later. That one, I think, has at least a little bit more merit, because I think it's fairly common at some point to be like, we need a rich text editor.
Um, but those are the kinds of things that I would probably push off. And it's not a best practice thing either. Meaning, I think it's smart that Rails has Active Record by default and chooses a database for you, because it's best practice to just use Active Record, right? And you're gonna have the best time using Active Record.
Cause that's what everyone uses. So including that makes sense. But yeah, with something like Action Mailbox, it's like, what's the benefit in including it?
Jeremy: Yeah. Just because the majority of people who are writing applications will never need that inbound email feature. As opposed to your example of authentication, where probably the majority of applications people are building will have authentication in them.
Paul: Exactly. Yeah. And it's something that's hard to add, meaning it touches so many parts of your application, and because we are generating stuff, it's not easy to add after the fact. But then there's stuff that is easy to add and easy to remove. That's another criteria: how easy is it to remove? So we include a few default CSS styles, but they're super easy to remove.
It's basically like, you go to your application.css, and it says delete everything below this line. You delete it, and you're done. But it's nice, because it makes the app look decent, and not like a horrific, ugly thing when you start, and it's easy to remove. So that's something, for example, that we also include by default.
Jeremy: That's also, I think, the distinction between something that's generated code, or configuration that the user can see. I mean, I think your setup scripts and your system check scripts, one of the things that makes those more straightforward is the fact that they are in your code base, and they're bash scripts, right?
So, if you want to modify it or you want to remove it, they're kind of right there. Whereas something like an Action Text or Action Mailbox is probably in, like, the Rails gem, right? It's in the library, so you don't even see it in your code base. I guess that would be the distinction there.
Maybe.
Paul: Yeah. Or you might, but you don't know why it's there or what it does. Yeah. Another concern is, how many things does it hook into? So, for example, one of the big things is, like I said, the default styles. How many places does that hook into? Just one: you go to your main CSS file and delete it. But there's a way to do that that I don't particularly like.
I've seen some people, for example, use Bootstrap, or any framework, it doesn't matter what it is. The problem with those is it also modifies the generated HTML in the scaffold, cause by default it's adding classes like column three, medium button, blah, blah, blah. If you don't want to use Bootstrap, you have to remove Bootstrap and manually go through all of the generated HTML files to remove the Bootstrap classes.
And so that's a key difference too: how easy is it to remove? We really want to only add things that are easy to remove or really hard to add.
Jeremy: What does the adoption of Lucky look like? Do you know of people using it in production currently?
Paul: Yeah. I don't have exact numbers, which I think is good, because it reduces anxiety a lot. Not knowing is like, is it going up? Is it going down? But people are using it in production, a lot, from the very early days of Crystal. One of our core team members, Jeremy, he's been using it at work for two and a half, three years.
And they've had great success with it. They replaced some of their Rails apps and microservices with Lucky, originally for the performance boost. And I think this is common: they stay for all the nice type safety and the reliability they get. It's hard to explain with just words, but then you use it and you see an error, and we try and make them nice.
Not all of them are, but we try and make them nice, and people go, oh, this is nice. Or people are annoyed that they see this compiler error, and then realize, oh wait, it actually did catch a bug. So they're having great success. Big performance boost, like they reduced their number of servers by something like 70%, and their response times got cut down 60 or 70%.
So yeah, they're having great success, and then a few other people are building client projects using Lucky. I don't know what they are; some people just can't say, to the public, unfortunately. But yeah, people are using it in production, which is really exciting.
Jeremy: Looking at the Crystal community, what does that look like? You know, is it pretty active? What are your thoughts on the community?
Paul: Yeah, it's quite active. They've got quite a few corporate sponsors, so they're making decent money to help fund development. They're aiming for 1.0. I don't know exactly when, but they did a blog post saying it's going to be soon, and I've talked to them in person about it, but I don't know how much I was supposed to say. But soon.
Um, which is fantastic, because then you're not going to have to deal with the breaking changes, which have definitely been happening the last two years. And I think that's good, because the language is improving and changing things. But once 1.0 hits, people are going to be able to jump in, and they're not going to have to update their apps every three months or whatever.
Um, but yeah, a lot of participation, and the sponsorship money goes a long way. A lot of the development is based in Argentina and the dollar is super strong over there. so meaning if you've got corporate sponsorship in dollars over here, that goes a really long way towards the development. Um, and they're all super nice.
I've talked to a lot of them in person. Super nice, super smart guys. The community itself, in terms of forums and chats, that's where I'm a little hesitant. It's active, but I think not particularly welcoming for newcomers. Just really strong personalities. Very smart, but very strong personalities. And I would say
it may be better to come to the Lucky chat rooms. We're very strict about our code of conduct, and not about nitpicky things, but just in general: that, you know, you talk to people with respect and empathy, and we're not the type of people where you come with a question and we're like, well, did you Google it?
We're going to try and help you. And so I think it's a very welcoming community. And even if you're not using Lucky, feel free to hop on our chat room. If you go to the Lucky website, there's a link.
And, uh, yeah, we're pretty nice over there. So things are moving forward. We're trying to get to 1.0 around the same time as Crystal, maybe a little after, but I think that'll be a big milestone.
Jeremy: It's interesting talking about the community, because I think when you think about Ruby, one of the big parts that attracts people is not just the language or the framework, but, you know, having an inclusive community, having people that are really friendly.
So it's good to hear that Lucky is striving to do that. Why is there that divide?
Paul: Uh, I'm not entirely sure. I mean, part of it is, I am a sensitive person, and so I am kind of trying to create the community that I want, which may actually be way more upbeat and positive, just because I want new people to feel comfortable. And I think maybe part of it is, with Crystal, they don't have that much time, and so it's easier to brush stuff off. Some of it could be just that they don't care about the same things that I personally do.
There's nothing actively bad going on. It's just, rather than things being just okay or average, I want them to be exceptional. A place where it's like, don't worry, you can say something even if you feel it's dumb. We're not going to pile on.
We're going to be like, Hey, it's fine and here's maybe an alternative. so yeah, I mean, go to the crystal rooms. I still do. I still get help. There's a lot of really smart people. Um, you just gotta put on like a little thicker skin and be prepared for like, why do you want to do this? Have you tried this other thing?
Have you done this other thing? In a way it's a good thing, because they're making sure that you've tried your different options and you're not just asking to do something that's a horrible idea. But it can make people, I think, feel like their idea is getting attacked or whatever.
Um, so that's what I mean by part of it is just like, if you're sensitive, that's gonna come off as probably harsher than it was intended. Um, but you can still get a lot of help.
Jeremy: Yeah, I guess it's just trying to find the right level of, yeah. I don't know what the word would be, but, yeah. Making people feel comfortable.
Paul: Yeah, I do have a really high bar for that because like I am sensitive and I grew up when I learned to program all online with books and with forums. And I remember how hard it was as a new developer that didn't know best practice and people would be like, why are you even trying to do that?
It's so stupid. And it's like, dude, I've been programming for like six months, calm down. And I think it's common. I mean, that happened in the Ruby forums, it happened in the Rails forums. It's a common thing across the internet and various communities. So it may not even be that Crystal is particularly bad.
It's probably a lot like most communities, but we just want ours to be exceptional. Um, in terms of making people feel welcome and you know, if someone has a bad idea and air quotes bad because maybe it's a great idea and we just don't have the context, but if it is a bad idea, we're not going to say, why are you doing that?
Blah, blah, blah. First, let's help you solve your problem and then talk about how might this be better? Maybe there's a better way to do it. And. It just feels a lot better. People are more accepting to have your feedback when you're not just immediately jumping on them and say, why are you even trying to do that?
Um, and so I think that's important. Uh, yeah.
Jeremy: Yeah. I mean, I think that probably applies to really all projects, right? Like, they could all kind of stand to learn from some of that, and kind of see it from the other person's perspective, who doesn't have all the same knowledge that you've been building up. And maybe they can bring you a new perspective as well that you didn't even think about.
Yeah.
Paul: Yeah, totally. I mean, we've changed a lot of stuff in Lucky that I was pretty sure about, and people asked if it could be done differently, shared their use case, and it's like, oh yeah, I made a mistake. And so it's good for everyone. Like, if you show a little bit of vulnerability and openness, you're much more likely to learn.
And you're much more likely to learn new and novel things because the people with, the strongest opinions are often the ones that have that opinion based on some principle they read about or a talk or something else. It's the quiet people that are like, Hey, can we try doing this like a little differently?
And you're like, whoa, I've never thought of this, because no one else has. But you're new, you came up with this great new, innovative idea, and you felt comfortable sharing, because we're not just shooting people down constantly. And so, yeah, I wish more communities did that in general, because it's mutually beneficial.
Jeremy: That's kind of a good place to start wrapping up, but where should people go if they want to learn more about Lucky?
Paul: First place, luckyframework.org. That's the main website. It has guides, it has blog posts that you can follow or subscribe to, with new announcements, and it has a link to our chat room, as well as the GitHub. So that's where I'd go. Feel free to hop on the chat room anytime. We're all really helpful and try to be nice, so people shouldn't hesitate to run in there if they have problems, or if there's stuff that's confusing. Feel free to open an issue on Lucky. We have a tag that's like, improve error experience, so we have dedicated stuff just to do that. In fact, if you start a Lucky project and you get a compile time error when you're fresh on a project, it says: hey, if you're still stuck, go to our chat room and ask for help.
Everyone should feel free to do that.
Jeremy: Very cool. and how can people follow you and see what you're up to.
Paul: @PaulCSmith on Twitter. Probably the best way to do it right now. Maybe one day I'll have a blog or something, but right now it's Twitter.
Jeremy: Cool. Well, uh, Paul, thank you so much for coming on the show.
Paul: Yeah. Thanks for having me. I really enjoyed it.
Henri is a frequent conference speaker and organizer of the Toronto Web Performance and JAMStack meetups.
Music by Crystal Cola: 12:30 AM / Orion
Transcript
Jeremy: [00:00:00] Hey, this is Jeremy Jung, and you are listening to Software Sessions. This episode, I'm talking to Henri Helvetica. He's a freelance developer with a focus on performance engineering. He's also involved in the Toronto Web Performance and JAMstack meetup groups. And we discuss why images and performance are so tightly tied together.
We also went deep into what life after JPEG might look like with the introduction of formats like WebP. And we talk about tools that can help you during your web performance journey.
Henri is a big runner, so I asked him if he started his day with a run.
Henri: [00:00:36] Thank you for the introduction, and good morning. With regards to a run, I wanted to go first thing in the morning, and, as we were talking about getting up early just moments ago, I have my alarm set for 6:30.
I tend to sort of open my eyes up around quarter to six, and figure out like how this run is going to go, but it was raining this morning. so I was a little upset, went and looked outside. The rain stopped by 7:00 AM. I was thinking, okay, maybe I should head out now. And as I was getting ready to head out, the rain started again and I thought to myself, okay, it's not going to happen simply because I knew I was doing this podcast.
So I wanted to be back in time and fresh. And afterwards, I think I'm going to watch this Microsoft event they have online, MS Create. Yeah, the run does not look good today.
And it's funny, I was speaking to Burke Holland from Microsoft. He said that he sent the clouds my way, knowing that it would force me to stay in.
Jeremy: [00:01:34] They are a big cloud company, right?
Henri: [00:01:36] That's a good one. I like that.
That was good. I'm going to have to keep that one.
Jeremy: [00:01:42] You're, you're pretty deep into the web performance space. What are some of the biggest mistakes you see people making on the front end?
Henri: [00:01:52] I mean, web performance, I consider it a bit of a dark art. There's lots involved, and much of it may not seem very clear to the average developer at times. But with any auditing that takes place, whether it be web performance or accessibility or UX overall, you're always going to have some low hanging fruit, and one of those fruits is image management. I think you tend to find a lot of people disregarding the importance of making sure that images are set properly as a resource loading on your page. And it's important for a number of reasons, most notably the fact that it's absolutely going to be the heaviest resource on your page, barring video. And, you know, video in the last couple of years, especially this year, has become a lot more prominent, so that's a bit of a different conversation, because you could quite often find pages with no videos. So I didn't want to go too deeply into that topic. But you will find images 99.9% of the time, and images are challenging. Image management has become a lot more complicated, for a number of reasons. Retina screens brought in a particular challenge with regards to how to select the right image. And then, more than ever, people are really paying attention to connectivity, understanding that connectivity may vary over a five minute period: what was 4G at the start of your walk might suddenly downgrade to very poor 4G, or even moderate 3G. Then you might go into your home and back out. And so you have varying connectivity that, ultimately, the site doesn't care about. It's like, hey, just load this image.
You have these things to take into consideration, and luckily you have some very brilliant engineers out there that are trying to make these accommodations. So I would certainly say images have been, are, and will likely continue to be one of the bigger challenges in terms of low hanging fruit.
Jeremy: [00:04:15] I want to go a little more into images. If you have the most basic case, let's say you're not building a single page application, you're building a traditional, like, just a document website. What are some of the ways that you should be treating images? You mentioned retina. How do we ensure we're only sending retina assets to people with retina devices? Should we be loading images in a lazy fashion? What are some of the best practices there?
Henri: [00:04:47] The ultimate best practice at this point, and it's a bit of a cop out, would be to outsource the work. And I say that simply because I think it's become enough of a challenge that there are some companies out there that are solely set up to do that work for you.
Obviously people like Cloudinary and there's a bunch of others as well, that have been very upfront and outspoken in their need to let people know that this is the kind of work that they do. And, that's their specialty now.
Barring that, there are a number of ways you can look at managing images. One of the earliest revelations that came along the way, when dealing with retina versus non-retina, and obviously formats as well, was the picture tag and srcset, and the ability to pinpoint what you wanted to send under what conditions.
And that was fantastic at the time, and everyone felt like, okay, we've come up with a great solution. But along the way, what ended up happening is that you had these ever growing blocks of code. And I believe it was Brad Frost who once posted a screenshot of a code block just to handle retina, non-retina, et cetera.
And it was so huge. He just sat there and was like, I'm not going to do this, there's no way, I need this massive block of code just to serve a couple of images under the right conditions. Things like that came about, and obviously as much as they did work, there was a sense that, things need to be a bit more simplified.
I mean, people are still working on that. But what that also didn't do is take into consideration things like network conditions. And so you could have this amazing, beautifully scripted block of code for images, but that still didn't take into consideration whether or not you were getting proper bandwidth and good round trip times and whatnot. And that's where things like the Network Information API came around, and whether or not you want to serve particular images under particular conditions. And that's where it starts to get pretty complicated.
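For readers following along, the kind of HTML block being described here looks roughly like this. A minimal sketch: the file names and pixel sizes are made up for illustration, and real-world blocks grow much larger once art direction and formats are added.

```html
<!-- Density switching: the browser picks the 1x or 2x asset
     based on the screen's device pixel ratio -->
<img src="cat-800.jpg"
     srcset="cat-800.jpg 1x, cat-1600.jpg 2x"
     alt="A cat">

<!-- Width switching: srcset lists the available widths, sizes describes
     the layout slot, and the browser picks the smallest asset that covers it -->
<img src="cat-800.jpg"
     srcset="cat-400.jpg 400w, cat-800.jpg 800w, cat-1600.jpg 1600w"
     sizes="(max-width: 600px) 100vw, 600px"
     alt="A cat">
```

Multiply this by every image, every art-direction breakpoint, and every format, and you get the ever-growing block Brad Frost balked at.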
Jeremy: [00:07:23] And this code you were referring to, this is all JavaScript?
Henri: [00:07:27] Oh no, no. This is all just classic HTML. I mean, we've not even gone into the JavaScript element yet. But no, this is all just straightforward HTML that was there for you to manage images as best as possible.
And like I said, the code block could grow very quickly if you want to have all your options, or conditions, I should say, covered. But again, none of that really delved into the idea that, oh, we have varying network conditions too. And that threw an additional curveball into what seemed like very simple, rudimentary work, you know, loading up an image of a cat. But it's not the case anymore.
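The Network Information API mentioned above exposes a rough estimate of the current connection. A hedged sketch of the idea: `navigator.connection` is only available in some (mostly Chromium-based) browsers, and the element id and file names here are hypothetical.

```html
<img id="hero" src="hero.jpg" alt="Hero image">
<script>
  // navigator.connection is the Network Information API; it is not
  // universally supported, so feature-detect before using it.
  const conn = navigator.connection;
  if (conn && (conn.saveData ||
               conn.effectiveType === 'slow-2g' ||
               conn.effectiveType === '2g')) {
    // On a slow connection, or when the user has Data Saver on,
    // swap in a lighter asset instead of the full-size image.
    document.getElementById('hero').src = 'hero-lowres.jpg';
  }
</script>
```

Because support is partial, this only ever works as a progressive enhancement: browsers without the API simply load the default asset.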
Jeremy: [00:08:15] In CSS, for example, there are things like media queries, where we could say the device's screen is this size, so I'm going to send them this type of image. Are these the types of things you're talking about when you're talking about serving different images to different browsers and devices?
Henri: [00:08:34] With regards to different browsers specifically, the picture tag was actually a bit of a revelation there, because we had situations where, at one point, there was kind of a fractured landscape of image support. You may remember, at the time, when WebP was starting to make its way into the conversation.
Even though WebP is like 11 years old, I feel like, even to this day, a lot of people are like, WebP? What's that? I'm not sure. And I remember a couple of years ago, when I went on this sort of image format management discussion at conferences, there were people who had no idea what WebP was.
To go back in history, when WebP really started to be introduced by Google, and supported in browsers with the Blink engine, there was a moment in time where Mozilla felt that we had not extracted all we could out of the JPEG. So their sentiment was that a sudden introduction of WebP might've been premature.
And in fact, there was a blog post that described their decision to maintain their support of, and additional research into, getting better compression out of the JPEG. A blog post that has since vanished, but I think if you go to the Wayback Machine, you might be able to find it. And then, on top of that, you had the idea that the JPEG and JPEG 2000 and JPEG XR were still out there floating around for people who wanted to experiment and really dive in a little more. Because at the time, you had CDN companies like, say, Akamai that were working with big retailers, and they obviously had a lot invested in making sure that they could squeeze all the data they could out of certain assets like images. So you could have, say, a website... I remember in one of my talks, I gave this example, Forever 21. I can talk about a company that's gone bankrupt. So it's not like I have stock in that company.
Jeremy: [00:10:44] Yeah. They're not going to come get you now.
Henri: [00:10:46] Exactly, exactly. Right. It's like, here's a couple of pennies, man. Give me some stock. But I remember in my talk, I showed that in devtools, in different browsers, you saw the JPEG 2000 being served, and you saw the JPEG XR being served. You saw, obviously, in Mozilla's case, a JPEG being served.
Now, I believe in Chrome you were getting the WebP. So the picture tag definitely helped with that, you know, for people who really wanted to be very focused in trying to serve the best format and most compressed option possible. Very disciplined delivery of assets, because that's what it is at the end of the day: trying to be as disciplined as possible and trying to find the absolute best possible solution to lessen the load.
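That per-browser format negotiation was done with `type` hints on picture sources. A rough sketch of the pattern from that era; the file names are hypothetical, and which source each browser picked depended on its format support at the time.

```html
<picture>
  <!-- The browser walks the sources in order and uses the first
       MIME type it supports -->
  <source type="image/webp" srcset="photo.webp"> <!-- Chrome / Blink -->
  <source type="image/jp2"  srcset="photo.jp2">  <!-- JPEG 2000: historically Safari -->
  <source type="image/jxr"  srcset="photo.jxr">  <!-- JPEG XR: historically IE / old Edge -->
  <img src="photo.jpg" alt="Fallback JPEG for everything else">
</picture>
```

The fallback `img` is what Firefox would have received in the scenario described, since it supported none of the newer formats at the time.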
Jeremy: [00:11:38] Is WebP kind of the equivalent of a JPEG? It's another lossy image format, but perhaps more efficient?
Henri: [00:11:46] So WebP is a very interesting format. So, history: WebP came from a video format. WebP is actually a product of WebM. Some of the more interesting and more data efficient image formats actually all stem from video, which is really interesting.
So the HEIC came from the HEVC. And then, very future conversation, AV1 birthed the AVIF. And again, that's video to still image. But let's get back to WebP: it was made for the web, essentially. In terms of visual fidelity, it may not be the best format, but in terms of what is best for the web, resources being transmitted down the wire, WebP makes a great case. And I'll list a couple of features real quickly. Obviously a very aggressive codec: compression is 10, 20, 30% better than PNGs, and better than JPEGs. Its chroma subsampling is locked at 4:2:0. For those who may or may not know, chroma subsampling basically has to do with, it's sort of like the removal of certain color information that may not be super perceivable to the eye.
And so the fidelity remains, to an extent, and essentially they removed some data that, you know, the average person wouldn't really catch. And it also had transparency, which made it a lot more attractive, because obviously the JPEG didn't have that.
And at one point, actually, and I've mentioned this a few times, I was lucky enough to have a conversation with a Chrome PM, just about four years ago. He had mentioned to me that with WebP, they had specifically the PNG format in their sights, as they felt that, feature for feature, the two aligned well enough that they could replace every PNG on the net with a WebP.
I also say that because WebP came in two flavors: lossless and lossy. Obviously, the lossy one being the most attractive, but there is also a lossless option. So for those who really want to hang on to that fidelity and refuse to let it go, there's a lossless format as well.
So WebP on paper was an attractive format. But early on, some of the challenges were encoding and support. For people who are just so used to PNGs and JPEGs, and God forbid GIFs, the majority of the software out there had just endless support for those three, even SVGs. But for WebP, not the case. There was some work involved in trying to get WebP into the ecosystem, but it wasn't going to happen while some of the major software outfits weren't supporting it. Take, for example, Photoshop: there's a potentially outdated plugin that's been around, now not even supported by the original company that put it out. And you can go through the litany of other popular software outfits that may or may not be supporting it to this day.
Jeremy: [00:15:15] In terms of the best practice for images, you said there's a picture tag that will let you use a different format depending on the user's browser. So if you were using Chrome now, I suppose it could send the user a WebP, but if they were using Safari, maybe it would have to send them a JPEG or PNG.
Henri: [00:15:39] Yeah, it's funny you should mention Safari. I can go back and finish up my WebP story. WebP was slowly gaining, and I do mean slowly gaining, some recognition. I don't want to call it popularity, but some recognition.
Mozilla had doubled down on the JPEG. You had the WebP and JPEG, for sure, and then whatever else you wanted to use that was on the fringes of popularity, like JPEG 2000 and JPEG XR, 2000 being supported by Apple, and XR being supported by Microsoft. Then, a significant moment in format history: Mozilla had a moment of clarity, and they reopened the bug to provide support for WebP. Which was a bit of a, I don't want to say a shocker, but they had sort of decided, hey, you know what, we probably need to do this.
So they reopened that bug. Also significant, in a sense, is that for like a week, WebKit, specifically Safari, supported WebP. And it was a very bizarre moment, and it got pulled ASAP. It was like grand opening, grand closing, literally.
And I could send you the link; there was an article about it. No one knew what was going on. But at one point, we'll say about a year to two years ago: first of all, you had Chrome and the Blink engine supporting WebP. You had Mozilla, who had finally announced that it was in Nightly. Also, somewhere in between, Edge had moved over to Chromium, and they sort of quietly announced that they had WebP support. So at the top of 2020, you had three of the four majors all supporting WebP. And then hell froze over about a month and a half ago, at WWDC, at Apple headquarters.
And they announced WebP support for, Safari 14. So basically in about a couple of years, you went from one to four of the majors supporting web P. and it is significant in a sense that, A well known, image researcher, developer, engineer, Cornell. I can never pronounce his last name, but, so I won't, but, I'm a big fan of his work. He's actually the author of imageoptim. image, optimization tool. He put out this tweet saying that he felt some point next year, WebP could essentially be the only format you need. and it actually does make some sense, because if you're going to have the four majors on board and, and all the other browsers who are running off the blink engine, You know, we could, we could see the, the web P format climb significantly in, in presence on the web.
Because as of right now, if you go to the HTTP archive, WebP is still in relatively trace amounts, on the web. and again, for a number of reasons from like tooling to developer, knowledge. It's going to be pretty interesting to see what happens.
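As a concrete illustration of serving WebP only to browsers that support it (the `<picture>` fallback Jeremy brought up), here is a minimal sketch; the file names are illustrative:

```html
<!-- The browser uses the first <source> whose type it supports, so
     WebP-capable browsers fetch the WebP and everyone else falls back
     to the JPEG in the <img>. -->
<picture>
  <source srcset="/img/hero.webp" type="image/webp">
  <img src="/img/hero.jpg" alt="Hero">
</picture>
```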
Jeremy: [00:19:09] WebP, it sounds like it's going to be able to replace JPEGs and PNGs because it has that lossless option. How about GIFs? Are we going to be able to have animated pictures in WebP?
Henri: [00:19:22] The big WebP proponents will tell you yes. And, I mean, I'm going to go back to that statement you made with regards to why the WebP is going to be replacing the JPEG and the PNG. Part of the reason, and I think specifically why it feels it can replace the PNG, is that people will still reach for the PNG as a lossless format for that transparency element, and no other image format out of the classic four had transparency as a lossy option. So now, back to what you were saying with respect to the animated GIFs: as much as I'm not a huge fan of the animated GIF, we have to live with the fact that people love them.
Jeremy: [00:20:13] And they definitely love them.
Henri: [00:20:15] They absolutely... like, imagine Twitter with no animated GIFs. It's almost like it'd be empty. But it is interesting because, for the most part, the animated GIF has been replaced by the MP4. Quite often, when you go into dev tools and look at the entrails of this GIF you believe you sent off, it's actually an MP4 now. That was done for a number of reasons, specifically storage, because MP4 versus GIF in terms of storage is a huge difference in size, and some of these GIF farms realized that very early. You can have storage costs balloon out of control just because you want to carry a GIF. That was part of it. I've actually never... I mean, I shouldn't say never: I've seen an animated WebP once. I'm assuming it probably loops as well.
So on paper, yes, there would probably be an argument for that to take place. But also, for that to take place, the services that are out there giving you these GIFs will have to do the encoding on their own, so you can just drag this animated image and hopefully it will be a WebP. That'll be the one early challenge.
And again, I get back to the question of whether we as just everyday persons have the tools and the encoding capabilities, either on our phones or our computers, to just say, oh yeah, I want a WebP. So there is a bit of developer education that's going to have to take place, let alone consumer education, right? The average person knows what a GIF is. Devs don't know what WebPs are. So I don't imagine individuals will. So there's going to be that hurdle along the way, I think. But on paper, yeah, it would probably be an apt replacement.
Jeremy: [00:22:25] That is interesting, how you mentioned that a lot of the times where we would usually use a GIF, we now use an MP4. Because that makes me wonder: when someone is in a Discord room, for example, and there's animated images everywhere, even if somebody thinks they're using GIFs, those may actually all be MP4s.
Is that right?
Henri: [00:22:47] Absolutely. I'll tell you a quick story. I remember one of my earliest talks on images, I'd mentioned that, and the next day I ran into a developer, and a speaker actually, who was at my talk. And they mentioned how they didn't believe me. And they went into dev tools.
And... so, for everyone who can't see, I just made a sort of mind-blown face. And they told me, like, hey, I was suspicious of that comment, and I went into dev tools, and they had no idea.
Jeremy: [00:23:27] Yeah.
Henri: [00:23:29] I actually felt good for a minute, you know. But yeah, that's what's happening.
And again, we're talking about the ability to still have the animation but save on storage. And even though a GIF may have that classic choppy look and feel, and you're like, oh, that's gotta be a GIF, it's just the choppy look and feel as an MP4. It's being done because they do have the capabilities.
Now, getting back to that WebP conversation: whether or not they'll be able to get all that encoding done, and suddenly their terabytes or petabytes of GIFs are going to be turning into WebP, I'm not sure, but we'll see.
Jeremy: [00:24:15] Yeah, but it sounds like if they're being turned into MP4 files, to the end user it really doesn't matter.
Henri: [00:24:24] Ultimately, that is the challenge, right? As developers, we're making sure that we are as disciplined as possible, but the end user doesn't care. Is it looping? Does it work? Can I post it on my page? That's it. Very early on, in fact, there was a situation with Facebook, who have been very aggressive in exploring performance opportunities and how to save data. Quite often they're very early adopters.
There was a point where they were starting to serve WebP, and they found out people were often just dragging stuff to the desktop to share with friends somehow. And they realized that the WebP was being supported in the browser, but nowhere else. And so there were some complaints.
And then at one point, I think they were trying to do something with Chrome where, when you drag the WebP out of a window, it would be encoded into a PNG by the time it got to your desktop. Little things like that. But what that described is that the user experience had to be absolutely seamless.
People do not care. They just want to know that the image went from their window to the desktop, or that they can share it with a friend in an iMessage or whatever it is, and that's it. That's always part of the challenge, right? Making sure that the users can have a very seamless experience in sharing, in social media.
Jeremy: [00:25:54] Yeah, that's interesting because it reminds me of an iPhone. In a lot of cases, if you take a photo, it's not a JPEG, it's a HEIC format, and you send that to someone who can't open it. And you're kind of like, what the heck is going on?
Henri: [00:26:10] Absolutely. It's funny you should mention that, because I remember when HEIF and, you know, that whole ecosystem was being introduced at this one WWDC. It might've been two years ago, and you just saw Twitter kind of, not explode, but just going through this, like, what's HEIF? What's HEIF, or what's going on?
And I remember I gave a talk within like two weeks about HEIF, and nothing happened. Even within the Apple ecosystem, that format wasn't even supported by Safari. I think it may have been supported by maybe Image Capture, but it had limited support outside of the iOS ecosystem.
Now, I don't want to get into Apple's business, but if you take your phone and you go into, say, a timed shot, like three or ten seconds, whatever, it comes up as a JPEG, which is weird. And I believe the front-facing camera doesn't do a HEIC shot either. But if you do the normal back camera shot that's not burst, I believe it comes up as a HEIC. It's super weird. And it's very bizarre because now you're talking about them being the only ones adopting HEIC, or HEIF, and now providing support for WebP, which is super interesting. But that may also have to do with the fact that there are patents around HEIF and HEIC, and that's something that I've come to discover.
And that's why the support for open source formats like WebP, and a few of the others like AVIF that I talked about, is significant: because I think the opportunities are there to bypass those royalty payments.
Jeremy: [00:28:06] The current encodings that we use now, are any of those patent encumbered? Like, are the browser vendors paying royalties for those?
Henri: [00:28:16] With respect to the WebP, certainly not. None of the browsers are supporting HEIC, so there's probably no payments there. That's part of the reason why, I believe, some of the future formats that are patented may be challenged. They'll still make some money, but I don't think they'll see the sort of windfalls that they have in the past, just because there's so much support behind open source formats. You know, I'll give you a quick example, and let me know if I'm going off course here, because I have all this stuff racing through my head. So, AV1: it's an open source video format, and AVIF is the image format that's born from AV1.
AV1 is being supported by two to three dozen companies, all companies that have a very vested interest in video. And all the browser vendors are in, because actually Chrome, Mozilla, and Cisco were three of the founding partners in this.
And Apple and Safari joined. Does that mean that Apple, and Safari specifically, is going to support AVIF? Not necessarily, but at least we see the early interest. And I don't see why Safari won't keep a very close eye on that. Now, there are some people out there who would make the argument that they won't.
Okay, I get it. But the fact that they joined the consortium very early is, I think, hopefully telling of their interest. So, that being said, there will be support long-term, I think, for open source formats. Although there's one particular format sort of lurking in the background right now, which is called JPEG XL.
And this is another open source format. A couple of years ago, when the JPEG was celebrating an anniversary, I think it was the 25th anniversary actually, JPEG.org, the organization, put out a call for papers to see what was out there, to see if people were interested in improving the JPEG as it stood. Because, again, on its 25th anniversary, the JPEG was a little long in the tooth, and it's like, what else can we do? So a couple of companies came together. Seven submissions were made, actually. Two were picked.
And the two companies that were selected were Cloudinary and Google. Cloudinary had played around with this one format called FUIF, which is a free universal image format. And Google had been toying around with this one format called PIK. PIK was working in the background because they had also believed that the JPEG was getting a little old and could use some updates. Long story short, the two projects kind of came together into one, and it became the JPEG XL. And it's been moving along; you could actually play with it right now. But, again, not to get into the entrails, it's not been adopted by a single browser yet, but they're certainly working on it.
And given that you have two image powerhouses coming together, I think there may be something bubbling on that end. And the JPEG XL is being touted as the one format for all your needs. So imagine, potentially, something like the WebP, but with added features and support that would also let it replace something like an SVG. Which is very interesting, because the SVG, as a vector format, has particular features that a raster format can never have. But the JPEG XL feels like, boom, they have that covered, and a few other features that I don't want to get into the entrails of too, too much. But it's kind of fun out there right now, you know?
Jeremy: [00:32:16] It's interesting because we've had the same formats in the browser for such a long time. We've had the JPEG, the GIF, the PNG, the SVG. It seems after all these decades, we're finally getting to the point where we might start seeing new formats take over.
Henri: [00:32:36] Yep. And people have to realize that there are so many things that go into having so many updates, say, in the last three or four or five years. Part of it is computing power; there's so much that goes into being able to have these formats readily available. In the early days of the WebP, one of the challenges was the fact that it was CPU intensive.
But, you know, again, you're talking about the WebP being 11 years old. In six years, CPU power can change quite a lot, from a handheld device to your laptop, right? So who knows what's taking place on the enterprise side. But it's funny you should mention the fact that the formats are all old.
In my talks, I mention this quite often, and I just want to remind people that, like you said, the GIF, the SVG, the PNG, and the JPEG: combined, you're looking at easily like a hundred years, which is crazy. I joke around in my talks that it's like older than the Rolling Stones. But that's very important.
We've had HTTP/1 and 1.1 for almost 30 years, and in the last three or four, we've gone to H2, and now we're talking H3. If you're looking at the early days of the web, no one knew that the web was going to be consumed in greatest amounts on handheld devices, and handheld devices with moderate power.
I think we're lucky enough that we have some iPhones and high-end Androids and whatnot, but the average individual is looking for a deal, and the deals happen with moderate devices, and those have particular CPU hurdles. So we've needed to make some changes along the way.
Formats being one of them, and protocols being the other, but that's a separate story.
Jeremy: [00:34:38] I would think that currently a lot of the devices have dedicated hardware to decode specific formats, like, for example, H.264. And maybe now that things like WebP are gonna be in the browser, there would be dedicated hardware to decode that as well.
Henri: [00:34:56] And these are things that will probably come along. But, you know, you still have an absolute trusty workhorse, as I like to call it, in the JPEG. That's always going to be there. I mean, the JPEG has by far the best support out there. It's the one that's being delivered by digital cameras, by our phones.
So there's less of a concern, but eventually, yes, you do want some hardware decoding. Like, for example, when I brought up the AV1: I remember a talk from an engineer at YouTube, talking about them experimenting with the AV1 and then realizing that something like 10 to 12% of their users run very old devices, and they had no idea.
And so these are things that you have to take into consideration. If they're finding out that users are on old devices through trying to serve next-generation video, those users are still going to be on those old devices being served next-generation still images.
These are, again, part of the curve balls that engineers are being pitched quite often.
Jeremy: [00:36:18] Yeah, so it seems like this, I guess, dream that we had of, hey, everything is going to be moving to WebP, or maybe everything will be moving to AV1, is probably not going to come true for quite some time. Because, like you said, there are still going to be a significant percentage of people who need to use the old formats, because of their old devices or because they're buying low-cost devices.
So it doesn't seem like we're going to be able to get away from the image tag that says: here, we'll give you the JPEG, we'll give you the WebP, and so on.
Henri: [00:36:56] Yeah, that's definitely going to be, you know, it's like that rough transitional period, right? Where everything you needed was being supported by everything. And now you have to get into that transition where it's like, okay, well, maybe it's the hardware that needs to change. Okay, now we have to make sure that people know that they should request a WebP, and things like that.
So along the way to nirvana, we have to get a bunch of things sorted out, from a user standpoint to a developer standpoint to hardware. So yeah, it goes past browser support, and obviously that's one of the hurdles, without a doubt. But once you have the browser support that you need, you still have a few legs to go down to make sure that you can have that sort of perfect situation, where it's like, okay, boom.
Now we have the support that we want, we have the browsers, and then there's the education that needs to take place.
Jeremy: [00:38:07] We've been talking a lot about things that are coming, and I want to bring us back to some of the things that developers can do now to improve performance, or at least improve perceived performance.
And one of the things we had mentioned earlier was the concept of image lazy loading. Like, should we be loading all the images on page load, or should it be as the user scrolls? And I wonder, from your perspective, how should web developers approach that, and what are the tools they should use for that?
Henri: [00:38:43] I would say you actually should be lazy loading. I mean, in an ideal world, you don't request resources that you're not going to see. A while back, and I don't know if I could dig this up, there was a study indicating that something like two thirds of resources were below the fold on average. And on top of that, only about one out of two users went to the bottom of the page.
So yeah, you ended up having a bunch of resources below the fold that quite possibly aren't going to be needed.
So the advent of lazy loading came about, again, from wanting to make sure that you're not hampered by a page having to load, say, 10 megs of resources just so that you can look at the first page and a half of information. That being said, lazy loading became a bit of a priority.
And as you may know, Chrome has natively added lazy loading. As of right now, I believe it's in stable, if not then for sure in Canary. And obviously there are some libraries out there that would help you out with that process as well.
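The native lazy loading Henri mentions is a single HTML attribute; a minimal sketch, with an illustrative file name:

```html
<!-- Native lazy loading: the browser defers fetching this image until
     the user scrolls near it. width/height reserve layout space so the
     page doesn't shift when the image arrives. -->
<img src="/img/photo-42.jpg" alt="Below-the-fold photo"
     width="800" height="600" loading="lazy">
```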
And again, I get back to the idea of being disciplined as a developer, and making sure that... hey, did you want a snack? Because I can give you a snack, or I can take you to the all-you-can-eat, you know? And the all-you-can-eat is what we don't need, since you only want a snack. If that analogy made sense. But yeah, I mean, it's super important, you know?
I mean, for example, I think I tweeted this a couple of days ago: someone who's at an agency sent me a site that they had just pushed, and it had gone live, and I'm like, okay, let me take a quick look. It was like 11 megs on first load. It was like 99 images, 89 of them lossless, and everything loaded in one shot.
And it was a fairly long page. I mentioned this to them in a quick communication: like, hey, there's a bunch of other issues, but you guys should be using some lazy loading. Because, again, it's the question of whether or not an individual is going to go right down to the bottom of the page, and on average that's not the case.
And lazy loading is going to help you manage those assets, for the best user experience possible.
Jeremy: [00:41:08] So with that particular page as an example, it sounds like one of the things they should do is conditionally load different qualities of images based on the device you're using. And the other thing would be to do some form of lazy loading, so that you don't load every image on the page, but instead you load them as the user gets close to them. And this is all being done just through HTML tags? Like, there's no JavaScript?
Henri: [00:41:39] So, barring the native implementation, there are actually a couple of ways that you can set that up. Prior to the native implementation of lazy loading, we were using the Intersection Observer API, which was kind of like,
I describe it as fake lazy loading. You could actually indicate, through Intersection Observer, sort of how far from the viewport you want particular assets to load. And that became native to the browsers before lazy loading did. Now, it had uses beyond images, but people were starting to use it for images specifically.
That was certainly available. Now, outside of using JavaScript, I don't believe you were able to set that up. However, you had mentioned potentially using images in a low quality. Did you mention that at some point?
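The Intersection Observer approach described above might look roughly like this; the `data-src` convention and the 200px margin are illustrative choices, not a standard:

```javascript
// "Fake lazy loading": images start with a placeholder and carry the real
// URL in data-src; we attach the src only when they approach the viewport.

function lazyLoad(img) {
  img.src = img.dataset.src;       // swap in the real URL
  img.removeAttribute('data-src');
}

// Guarded so the sketch is a no-op outside a browser.
if (typeof IntersectionObserver !== 'undefined' && typeof document !== 'undefined') {
  const observer = new IntersectionObserver((entries, obs) => {
    for (const entry of entries) {
      if (entry.isIntersecting) {
        lazyLoad(entry.target);
        obs.unobserve(entry.target); // load once, then stop watching
      }
    }
  }, { rootMargin: '200px' });       // begin loading ~200px before visibility

  document.querySelectorAll('img[data-src]').forEach((img) => observer.observe(img));
}
```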
Jeremy: [00:42:48] Yeah. So if you are on, let's say you're on a phone, and the device size is small, maybe you don't need that full 4K resolution image. So it sounded like there was some way, within just using HTML, that you would be able to select different images for different devices.
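What Jeremy describes, letting the browser pick a smaller file for a smaller screen with HTML alone, is what the `srcset` and `sizes` attributes provide; a sketch with illustrative widths and file names:

```html
<!-- The browser picks the smallest candidate that satisfies the layout
     width times the device pixel ratio, so a small phone never fetches
     the 1600px file. -->
<img src="/img/players-800.jpg"
     srcset="/img/players-400.jpg 400w,
             /img/players-800.jpg 800w,
             /img/players-1600.jpg 1600w"
     sizes="(max-width: 600px) 100vw, 800px"
     alt="Baseball players">
```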
Henri: [00:43:11] Oh, so we're talking about maybe using the Network Information API, potentially. We're getting into, I guess, JavaScript, not so much HTML. But, you know, you could actually start to conditionally provide particular image qualities.
Depending on the network conditions, obviously. Now, what that could be is, I don't know, let's talk about baseball. You may have a front page where it's like, major league baseball under a crisis, and then you have a bunch of players on the page in an image. But that might be under ideal conditions, say 4G, powerful phone, whatever.
But just say they're under less-than-ideal network conditions: instead of having the image of the players, you might throw up the MLB logo. Three colors, made it an SVG. It might be two kilobytes instead of like the 43 that it was for the image of the players.
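Henri's MLB example could be sketched with the Network Information API roughly as follows; the URLs, the element id, and the 2G cutoff are hypothetical, and browser support for `navigator.connection` is partial:

```javascript
// Pick a tiny SVG logo on slow connections, the full photo otherwise.
function pickHeroImage(effectiveType) {
  const slow = effectiveType === 'slow-2g' || effectiveType === '2g';
  return slow ? '/img/mlb-logo.svg' : '/img/players.jpg';
}

// Guarded so the sketch is a no-op outside a browser, or where the
// Network Information API is unsupported.
if (typeof document !== 'undefined' &&
    typeof navigator !== 'undefined' && navigator.connection) {
  const hero = document.querySelector('#hero');
  if (hero) hero.src = pickHeroImage(navigator.connection.effectiveType);
}
```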
So the Network Information API is going to allow you to do stuff like that, certainly. There was a point as well, and this was again a JavaScript implementation, where, I mean, you know what people sort of know from Medium, where they give you that blurred image, and then it does this little swap to give you something a bit more high quality.
I mean, people have their say about that. Some love it, some don't, because it's actually an extra request, et cetera, et cetera. Facebook also played aggressively with that, where I think their blurred image was like one or two kilobytes or something, and then they would swap in the proper image. That was more of a user experience situation, because they wanted to let people know that, hey, this image here is coming up.
You can stick around, but, you know, that blurred image gave the user just the information they needed to know: hey, there's an image, I can wait the extra half second for it to load up, instead of getting a blank page.
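The blur-up swap Henri attributes to Medium and Facebook (tiny placeholder first, full image swapped in once downloaded) might be sketched like this; the `data-full` attribute and `blurred` class are illustrative names:

```javascript
// Apply the finished swap: point the on-page <img> at the full-size URL
// and drop the CSS class that blurs the placeholder.
function applySwap(img, fullSrc) {
  img.src = fullSrc;
  if (img.classList) img.classList.remove('blurred');
}

// Kick off an off-screen request for the full image, swapping only once
// it has fully downloaded so the user never sees a half-loaded image.
function blurUp(img, fullSrc) {
  const loader = new Image();
  loader.onload = () => applySwap(img, fullSrc);
  loader.src = fullSrc;
}

// Guarded so the sketch is a no-op outside a browser.
if (typeof document !== 'undefined') {
  document.querySelectorAll('img[data-full]').forEach((img) => blurUp(img, img.dataset.full));
}
```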
Jeremy: [00:45:19] This has all been specifically about images. Are there any other common mistakes, either on that particular page or on other websites, that you often see?
Henri: [00:45:31] I mean, in terms of additional challenges, some of the challenges you'll see are a bigger user experience issue. So, in terms of prioritizing what needs to load on the first load, above the fold: these are things that you're pretty much going to see once the page loads, but at a particular level. That's why the film strips are very important, so you can see a frame-by-frame level of what's loading on the page, and then you can make some adjustments. In terms of another low-hanging fruit, I'd probably say the next one might be making sure that you keep your requests down.
And again, that partly has to do with some lazy loading, because at that point you can make sure that you're not making, say, 300 requests in one shot when really you can just keep it at like a hundred, 150, I dunno. But you'll know only when you see the page load yourself.
I think that's certainly important. And I'm talking about low-hanging fruits here; I don't want to get into the deep entrails that good people like Andy Davies and Patrick Meenan get into. But yeah, I think these are some of the immediate decisions that you can make.
Just making sure that your first load is as quick as possible. And that's by keeping some of the requests down, and making sure that you're not stuffing that first load with a bunch of, I shouldn't say needless resources, but some resources are probably on the fence about whether or not they should be right there. You could have them load below the fold, and still deliver the information that you planned on providing.
Jeremy: [00:47:20] In the case of the first load, would that usually be because when you first load the page, there's scripts that are running, or there's images that are loading, that aren't really content? I guess, that aren't really text somebody needs to read, and things like that? And those are the things that you would load later, somehow? I guess this is what you're saying?
Henri: [00:47:43] So, it's a great question, because this is the kind of conversation I've had with designers, and it's why I always believe that designers should certainly be aware of the performance conversation. Because, you know, they'll sit there and make this page mockup, and they're like, boom, boom, boom, boom, boom, boom.
This is going to look amazing. But really and truly, I think part of that push-pull is: what is most important here? What kind of asset, what information can we have come in below the fold? What can come in a little later in the load? Because, again, all that research is around the snap of the finger: making sure that page loads up right away, so the user can kind of scan to see what's going on.
And then, at that point, start to make their way down the page. You do not want that page to load and have this sort of blank space, and then that turns into that blank stare, and then who knows what happens after that? You really want that information to be like, boom, cattle prod.
It's right there. They're scanning the headline, looking at a couple of photos, and then they start to load up the rest. And that little bit of time, as seemingly insignificant as it is, is sometimes a world to a page when it comes to loading resources.
Jeremy: [00:49:09] So that's almost making it a part of the design: can you make a page where you don't have to load all the surrounding images and surrounding chrome just to see the content? And even before you talk about performance, it's more of a: when somebody first gets to this page, how do we make sure they see what they need to see?
Henri: [00:49:30] Absolutely. Absolutely. And, you know, you bring this up just a week or two after I listened to, or watched, an amazing talk. It was actually with regards to cnn.com during 9/11, and it was phenomenal how the tech lead talked about how they stripped the page bare in order to just keep the details that they needed. Because they had actually tried one version of the page with what they felt was sort of the essentials.
And then they kept stripping it and stripping it, and eventually it just became one image, a logo, and text. Now, granted, that's also because you're under so much stress from all the traffic. But the idea remains: what do you need to deliver in terms of information?
What is going to load as quickly as possible? That's what we're going to go with. In essence, almost 20 years later, we're still dealing with that. Because now, even though we don't get that sort of 9/11-level stress to a site, you still have the fact that you have devices and varying networks. And, you know, it'll never translate quite the same, from that kind of traffic to a network being that throttled.
But the fact remains that we can predict that users are going to be on their devices, with varying network connectivity and varying power as handhelds. And so you try to do your best to manage that element, and the fact that you do have a team of developers and designers who sometimes have dueling opinions.
Jeremy: [00:51:27] As a user, and maybe this is more common amongst developers, we sometimes like to see the lite pages. You gave the example of CNN, or NPR, where they just give you the content. And those are often separate versions from the normal site. But you're sort of saying: can we take some of the lessons from those lite sites and just apply them to our design in general?
Henri: [00:51:53] Absolutely. Someone had heard me talk about the idea that lite sites existed, and I brought up the fact that lite.cnn.io is up right now. You can go check it out, and it'll have the same information as cnn.com, but cnn.com will have all the media, images and videos and X, Y, and Z, and ads.
Whereas the lite site is (pssh) text. Not a single image. And apparently this became a bit of a standard after 9/11, and that site is up all the time. And again, I keep mentioning 9/11 because obviously that was a very unique situation, but the fact remains that in times of hurricanes, any kind of meteorological crisis, or anything like that...
Storms, whatever you want. These sites are still going to be very important, because infrastructure is going to be down and you won't be able to access sites as readily as usual. And the fact that you have a site with just bare-bones info is fantastic.
Jeremy: [00:53:00] Another thing I want to talk a little bit about is that Chrome being slow, or being a memory hog, is kind of this running joke. And I'm wondering, in your perspective, is that the fault of Chrome, or is it the types of applications and websites that developers are putting on it?
Henri: [00:53:20] I mean, this is mildly above my pay grade. You know, I keep a number of windows open as well. There's probably a little bit of both going on. Applications have never been more demanding, that's without a doubt, and in turn that's a testimony to the fact that browsers have never been more capable, with a bunch of features, you know? I've been a big proponent and fan of browser engineers, because, and I should pull this up, someone on Twitter said that the browser is by far the greatest piece of software out there.
I think there's a strong case for that to be absolutely true. I've gone out and said that, on your phone, through your browser, you could probably take care of everything you need to do: banking, you can save money, you can go out and make money. I mentioned that you could find a date, buy clothes for that date, rent the car for the date, order food for the date, all through a browser. And if you had told us that 10, 15 years ago, people would be like, yeah, whatever.
But in 2020, today, July 19th, it's happening. And browsers have had to make these sorts of accommodations. So, whether or not Chrome is to blame in terms of being a memory hog, I'll let the engineers debate that. But I think you have to tell the story that the browser is basically a bit of a Swiss army knife right now.
We have made demands, from the browser that have generally been met. And, I think we've benefited from that, 10 fold, and, and again, If you didn't have a laptop or anything like that, the browser on your device would be able to, to save you.
And, you know, we could have had this, this conversation on the browser, on the phone, let alone that, you know, the fact that we actually are having it on a desktop right now, you know, so, you know, kudos to the engineers out there. I'm not slamming you guys.
Jeremy: [00:55:31] Yeah, it's pretty incredible just what you can do in the browser, and the fact that it has become so extensive that now we have Electron, right? Where people are making websites basically designed to be desktop applications.
Henri: [00:55:47] Yeah. I mean, I don't know if you saw, someone yesterday or the day before recreated either MacOS 8 or some Mac application in Electron. It was pretty fascinating. But yeah, absolutely.
Jeremy: [00:56:02] And what makes a lot of this possible is JavaScript, right? And there's a lot of different JavaScript frameworks that people use, like React and Vue and so on. And from your perspective, what are the additional things you should be thinking about from a performance perspective when you're working in heavy JavaScript code bases?
Henri: [00:56:25] You know, could the world exist without JavaScript? It could, but it would be challenging. And I think we have to accept that it's definitely here to stay. I mean, I shouldn't say here to stay; I don't think it was ever not invited.
I just think that there was sort of this liberal use of JS without really understanding how powerful it was and how caustic it could be to the experience, the user experience. Thank the Lord for people like Alex Russell, who are out there to remind us of the fact.
But that's really it. And I think that as more and more frameworks come around, the idea is to not really shun their availability, because again, people are spending some time to research and create a library or a framework that they feel is needed.
It's when it's being used and deployed: can we use it as efficiently as possible? That's it at the end of the day. Name the resource; someone's out there using it liberally. Can you make sure that, in employing this resource, you send it down the wire or make use of it in the most disciplined fashion? That's what it comes down to in the end, and that is the research that needs to take place when using these resources, especially JavaScript, like I said, and like you mentioned, given its availability and everything that it can do for us.
Jeremy: [00:58:05] You talk about being in dev tools all the time. Are there specific parts of dev tools, or tools outside of dev tools, that you use to try and identify areas where there are performance problems?
Henri: [00:58:20] So, dev tools. I mean, auditing a site can seem a little boring at times, because you are essentially looking for very specific details. I personally love dev tools because, as scary as it can look at times, and there are still many parts of it that I tend to forget or get lost in, it will provide a lot of the information that you need on a page's health and what's taking place on the page. Now, with respect to tools in particular that I like, again, I'll say it: I love dev tools.
Obviously, it would be impossible to do a proper performance audit without having something like WebPageTest handy: webpagetest.org, a product of the great mind of Patrick Meenan, a tool that's been called the Cadillac of performance. And again, not the prettiest to the naked eye, but a treasure trove of detail and information.
And Patrick Meenan has done incredible work in making that available. So yeah, webpagetest.org is certainly something you need to keep around. I'm going to throw in Lighthouse, and I say that specifically because, if you take Lighthouse from version one and two right through to version six, a very considerable amount of work and research has gone into what we are seeing right now in Lighthouse: the information it provides, the recommendations, and the links they provide with the recommendations. They'll tell you, hey, we see that in these 10 resources you might be able to save X amount of kilobytes, or megabytes even, if you do this. And that has helped in what I believe is a maturation in understanding performance.
And what you're also seeing right now through Lighthouse is people working to reach the mythical score of a hundred. You have people sharing their performance scores, and all the other audits it takes into consideration, on Twitter, saying, you know, I have 89 right now.
I'm totally working on the rest. This was not happening prior to Lighthouse. Having a performance conversation was kind of like speaking to a wall of bricks, but now people are openly and willingly sharing their scores, which means that they are looking into ways of improving their site's performance. Whether or not they get into the deep entrails is a separate story.
But Google has been able to create this platform where the low to mid hanging fruit will give you a good idea of what it's like to look after performance. So I would say certainly dev tools, certainly WebPageTest. And again, I think Lighthouse has been very important in raising the bar of performance.
Jeremy: [01:01:48] Yeah, I think Lighthouse is interesting because it almost gamifies it. You get this score, like you said, you can post it in a tweet and say, hey, look at the score I got. There's also a target for people who make frameworks and things like that, where if you have a static site generator, you could say, oh, the default template that you get from my generator is going to get you this score in Lighthouse. So I think it is very powerful to have that target or that goal that everybody can see.
Henri: [01:02:23] Yup, absolutely. And it has been gamified, and I know that term is used quite often, and I definitely agree. I mean, I don't want to make it sound like it's purely a game, but it is something that people have been able to look at and say, okay, this is where I want to be.
You know, like, this is the 10 second hundred meter; that's how I qualify. I've been running 10 fives; we're going to get it down to 10 soon. I'm seeing these conversations come out of individuals I least expected to, and when that kind of adoption takes place, I think it needs to be discussed and, to an extent, applauded.
Jeremy: [01:03:07] I think in your work, it seems like you have a pulse on a lot of the different APIs that are available in the browser. Are there any that you think are being underutilized? Things that people could be using that they aren't?
Henri: [01:03:24] The thing about these APIs is that quite a few are likely being underutilized, but what is underutilized, ultimately? I think the bigger picture is: can you look at your personal goals with this particular site? Are they being met, and if they're not, what are the avenues to meeting this goal that you have in mind?
And then you'll step back and look at all your options. And, you know, for every particular API that's available, there's probably a bit of a downside. Some of it might be complexity of use, because some are just not used often since they require a bit more work, and sometimes people are like, oh, I don't feel like spending an afternoon having to do this. I get back to the idea around animations, right? It could be the animated GIF, it could potentially be an MP4, it could actually be an animated CSS or SVG animation, but the CSS or SVG animation is going to take a bit more work, and you're like, eh, or are you going to just jump in and say, oh, F it, I'm just going to use this GIF for this animation, even though by the naked eye you could tell that it could have been done in CSS, right?
So the same thing is happening with particular APIs, where it's like, eh, using it on a page is probably going to take a little bit of work, which means if there's a mistake, it's like, I don't want to go in there and have to do all this debugging of X, Y, and Z. You start to look at some of these other options that might be a little easier.
And then that might even mean that you rewrite a particular part of the page, because you've suddenly discovered that you could strip this out and make it a bit more bare and easier overall to manage. So that's part of the challenge with APIs. In fact, I remember, I think it was Steve Souders who talked about the User Timing API.
So yeah, the User Timing API: a great tool, but developer adoption not so great, because it just meant that there was a bit more work taking place, and a lot of developers, they found, didn't want to get into that. And so that's why you start to have some of these other tools, like SpeedCurve, that'll show you what needs to be improved and where you can make some of these improvements, hopefully without the challenge of having to refactor some code, when most developers don't want to.
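As a concrete illustration of the work the User Timing API asks of developers, here is a minimal sketch. The mark and measure names are invented for the example; the `performance` global works both in browsers and in recent Node versions.

```javascript
// Mark two points in the code, then measure the span between them.
// The resulting entries are what monitoring tools can collect later.
performance.mark("widget-start");

for (let i = 0; i < 1e6; i++) {} // stand-in for the real work being timed

performance.mark("widget-end");
performance.measure("widget-render", "widget-start", "widget-end");

const [entry] = performance.getEntriesByName("widget-render");
console.log(`${entry.name} took ${entry.duration.toFixed(2)} ms`);
```

The friction Henri describes is visible even here: every measurement needs its own marks, names, and collection code, which is exactly the work many teams skip.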
Jeremy: [01:05:58] Right. I think it's sort of a general belief that you only do the things that you need to, and if you have a simpler way of doing it, where you import some package from NPM and let some JavaScript library do it for you, then you'll do that until it becomes a problem.
Henri: [01:06:15] Yep, absolutely. And that's where, I'm sure you remember, there was this sudden push for vanilla JavaScript, because people were just jumping in and grabbing all these libraries to do some of the simplest work, importing a particular library of, you know, a hundred, 200K, when it could have been like 30K of just vanilla JavaScript.
You know, so, as you said, people just don't like to do the heavy lifting.
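To make that concrete, a debounce helper is one hypothetical example of the kind of utility people often pull an entire library in for, even though the vanilla JavaScript version is a few lines:

```javascript
// Vanilla debounce: collapse a rapid burst of calls into a single call
// that fires `waitMs` after the burst stops.
function debounce(fn, waitMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), waitMs);
  };
}

// Three calls in quick succession produce a single invocation.
let calls = 0;
const onResize = debounce(() => { calls += 1; }, 20);
onResize();
onResize();
onResize();
setTimeout(() => console.log("calls:", calls), 100); // → calls: 1
```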
Jeremy: [01:06:42] Yeah, I mean, I think we can all identify with, if we don't have to do the work, then we don't want to do the work.
Henri: [01:06:49] And that's it. I mean, I never fault developers for that. That's where we're at, you know, trying to make sure that people can do a bit of the work and get some of the reward without going heavily into the entrails.
Jeremy: [01:07:02] For sure. I think that's a good place to start wrapping up, but are there any other things you think I should have asked, or anything you thought we should have talked about?
Henri: [01:07:13] I mean, not really. It's just, you know, this is a fun conversation, and hopefully not to the audience's detriment; sometimes my mind starts to run in these different directions the minute I make one statement, and I'm like, oh, oh, oh, I kind of want to mention this too.
I mean, we did spend a bit of time on the images side, but it is something that I've found fascinating over the last few years, and I've followed quite a few developers and engineers who have been deep into that research. So anyone who wasn't into images, I certainly do want to apologize.
But if you are, I hope you enjoyed it, and enjoyed that little triggering from my Bluetooth that's telling me, dude, it's kind of like, oh man, what's his name? The comedian who had the wrap-it-up box.
Jeremy: [01:08:03] Oh, really? No, I hadn't heard of that.
Henri: [01:08:06] Oh man. why am I forgetting his name? Oh, anyways, whatever.
Jeremy: [01:08:10] Yeah, it's trying to play you off the stage.
Henri: [01:08:12] Exactly, exactly. It's like the big hook keeps missing my neck. But you know, again, there are so many things around performance that can be discussed. I recently listened to a podcast which featured an engineer from Facebook, and they were talking for an hour about HTTP/3 and QUIC, and I thought that was fascinating. And that's some of the heavy duty lifting that's taking place, for very obvious reasons, and it's certainly going to play well into the future, especially this future that we have right now, which is going to be heavily laden with media: images and certainly video.
For obvious reasons: people are working from home now, students are going to be learning from home, so there's going to be a lot of streaming taking place. And that's the next hurdle in dealing with media management and dealing with performance.
Jeremy: [01:09:02] Yeah, I think my hope as a developer is that these new things, whether they be HTTP/3 or the new video codecs and formats, I'm hoping that we get APIs or supporting libraries that make it almost transparent to us. So that I just say, I want to post this video, and somebody else, probably the people who were on that podcast, has done the hard work of making sure that it goes over the right transport,
gets encoded in the right video format, split up if it needs to get split up, that sort of thing. That's my hope, anyways.
Henri: [01:09:43] Yeah. I mean, it's certainly everything that we'd like to see, because as it becomes seamless and, like you said, effortless for us, ideally that can also be transmitted down to the user. Because ultimately they're the ones having to, you know, follow along with the online class.
They're the ones who have to do the live, remote yoga workshop or whatever it is. I think there are a lot of discoveries that are going to take place in the next little while, because our world is going to be about consuming a lot more video content.
This was predicted to be happening two to three years from now. Obviously this pandemic basically cut that to a third; we are more than ever dependent on the transmission of video across the net. And I think it's incredible that even our conversation is taking place right now in such a clear and seamless fashion. Back to school is about a month from now, and we're about to see the net pulling carts, you know? And it's very interesting to see how that's going to take place, because we're talking about the pandemic hitting mid-March, when there was still confusion as to how this was going to play out for the rest of the year. Now we're getting ready for what might be an entire year of remote everything.
Let's see what happens.
Jeremy: [01:11:16] For sure. This is another big test for the web.
Henri: [01:11:21] Yep, totally. You know, the web is just there wiping its forehead, like, oh my God, what just happened, and how am I going to make my way through this? So no, man, it's super interesting, and I'm sure a year from now we're going to have another discovery, some engineering feat that's hopefully going to make things better for everyone involved.
Jeremy: [01:11:46] For sure. If people want to see what you're doing next, I know you do a lot of conference talks and things like that. Where should they check you out?
Henri: [01:11:55] I would say exclusively Twitter; I'm a big fan of the platform. So, Henri Helvetica: H E N R I, and Helvetica as you know it to be spelled. I'm planning to have this 1.0, 2.0 release of work that I've wanted to do, probably in September. I've shared this privately with a few people, but a blog is coming.
I'm probably going to do a series of short YouTube videos talking about little things from browsers, to a short podcast on performance, a light conversation again, as we just did, but probably a little shorter. But yeah, Twitter is certainly the place to find me, whether in public or in my DMs as well.
Jeremy: [01:12:44] Awesome. I'm looking forward to seeing the YouTube channel and the podcast.
Henri: [01:12:49] Absolument. And Jeremy, man, I definitely want to thank you for, A, your patience, because like I said, my AV hurdles weren't the most fun. So this is a conversation that's long in the making, and thanks for having me on the show.
Jeremy: [01:13:04] Yeah, thanks for coming on. It's been a real pleasure Henri.
Henri: [01:13:07] Merci, merci. And when I get my thing going, I'll be sure to ping you to have you as a guest, because I'll tell you about this other little project I have as well.
Jeremy: [01:13:17] Nice, I'm looking forward to it.
Henri: [01:13:19] Yeah, man, it's going to be fun.
Jeremy: [01:13:20] That's going to do it for this episode. If you're interested in learning more about web performance, we have all the tools and APIs we discussed in this episode in the show notes. And if you have questions, I'm sure that Henri would love to hear from you.
The music in this episode was by Crystal Cola. Thanks again for listening and I'll see you next time.
Pete is the senior security researcher at Brave Software and the co-chair of the W3C Privacy Interest Group.
Music by Crystal Cola: 12:30 AM / Orion
Transcript
You can help edit this transcript on GitHub.
Jeremy: [00:00:00] In this episode of Software Sessions, I'm talking to Pete Snyder about the many ways websites track us, how ad blockers like uBlock Origin work, and the process of developing web standards with privacy in mind. We start by discussing his role as a senior privacy researcher at Brave Software.
Pete: [00:00:18] Brave is kind of interesting, or unique, as a startup in that we have a proper research lab. I think our research team is seven or eight people right now. Those are people who do research in the form of published papers, but also research that ties back into product in some way.
My research responsibilities are to figure out new ways that you can improve browser privacy, address tracking on the web, and solve the kinds of problems that Brave is interested in solving. I have one foot in engineering world and one foot in publishing world.
Jeremy: [00:00:48] Why is academic research important in this space?
Pete: [00:00:52] My gut feeling is that what's useful about academic research is that it changes the incentives and it gives you a chance to do things that are more novel and particularly things that are less tied to a short term ROI cycle. That is particularly useful for things that have watchdog functions over industry, things that are more difficult to monetize but more useful to average web users.
That's not to say there aren't people who try to build businesses around privacy or responsible computing but the incentives don't always work that way. What's really neat about doing a research focused computing career is you can do things that don't have to make somebody money in the short term. You can pick more oddball projects. The things that might not come to fruition right away.
Jeremy: [00:01:36] And is there a key difference in how you approach a problem when you're doing it in an academic context versus as a product for a company?
Pete: [00:01:46] Sure. So they go both ways. If I'm working on something at Brave, the emphasis is on correctness and certainty, and knowing that when we ship it to 10 million people or whatever, it's not going to break, it's going to do what it says on the tin, and it's going to be a material improvement over the state of things before we shipped that feature.
And that's really different than if you're trying to come up with a research project, where, sometimes for good and sometimes for bad, the emphasis is not necessarily on one hundred percent correctness, but on novelty, on figuring out some way to solve a problem in a way that it hasn't been tackled before.
And so you'll read research papers that say it works 95% of the time and that'll be sufficient or compelling for a research paper. But you wouldn't want to ship something that breaks 1 out of 20 websites if you're actually making a product. The goals are different, but also the success criteria are different.
Jeremy: [00:02:39] So it sounds like you can tackle things where it wouldn't be good enough for a product yet. But it's something that if you were working on it within the context of a company, they might say: Oh, we're not going to do that because it just doesn't seem like it's going to work.
Pete: [00:02:54] Yeah, exactly. So, maybe because certainty of success isn't there, or there isn't a one or two step obvious path to being a product. Maybe it conflicts with the current business goal or whatever else. But yeah, you have much more latitude in terms of products you can choose and kind of problems you want to tackle.
If you're writing research papers, and not that I'm some incredible researcher or anything, but if you try to do successful research, it doesn't reward you to solve that final 5% of the problem.
There's no benefit, well, not none, but there's only a small benefit of going from 95% to 99% success or accuracy. On a product you have to grind out as close to a hundred as you can get.
Jeremy: [00:03:37] And do you have examples of things where you worked on it in a research context and it actually became a part of a product?
Pete: [00:03:46] Sure, yeah, a couple of things. One is a research paper that we wrote at Brave called SpeedReader. SpeedReader is a different way of doing a reader mode in a browser. Right now, if you use any of the reader modes in popular browsers, you download the page, you render some subset of it,
you throw some JavaScript at it, and it extracts the sections that it thinks are useful, then presents a new page to you. That's not a hundred percent correct; Chrome's DOM Distiller does something slightly different, but to approximation, you render the page and then you extract stuff out of it. Brave's SpeedReader does something different.
It intercepts at the network layer, examines the text HTML, does the analysis there, and then feeds that back to the rendering engine. And so there are a bunch of nice benefits there. There's a privacy improvement, in that you're executing less code and talking with fewer third parties. There's a performance improvement as well, in that you don't have to do the initial displaying and tear all that stuff down and build it back up. So that was the research paper that we published at WWW, 2018 or 2019, I don't remember, but either a year or two ago, and it's now in beta in Brave. That's maybe the oldest one. The most recent one was a project that I did last summer with a student from North Carolina, Quan Chen, on figuring out ways that we can do blocking better.
Right now, if you're using a privacy tool in a browser, in most cases you're downloading a big list of things that should get blocked. They look kind of like regular expressions. It says: yes, block this; no, don't block that. And it's a useful thing.
But it has the trade-off that it's very easy to circumvent. Somebody can just change the URL, move it to a different domain, inline it in the page, whatever else. And so the approach that we took in this paper is: let's not focus on the URL. Let's build signatures of the execution paths of these scripts, and we can use that as the oracle to identify this as known bad or known good. That machinery ended up being very complicated, and it isn't something we want to ship to all of our users because of the performance hit. It's something that we use for generating filter lists that we then download to users regularly.
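To sketch what those list entries look like in practice, here is a drastically simplified, hypothetical matcher for two pieces of EasyList-style syntax (`||` anchors a rule at a domain, `^` matches a URL separator). Real blockers support far more, such as exception rules and resource-type options, and this is not any real blocker's implementation:

```javascript
// Translate a tiny subset of EasyList-style filter syntax into a RegExp.
function ruleToRegex(rule) {
  let src = rule
    .replace(/[.+?{}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*")                // "*" is a wildcard
    .replace(/\^/g, "(?:[/?:]|$)");      // "^" matches a URL separator
  if (src.startsWith("\\|\\|")) {
    src = "^https?://([^/]+\\.)?" + src.slice(4); // "||" anchors at a domain
  }
  return new RegExp(src);
}

const blocked = ruleToRegex("||tracker.com^");
console.log(blocked.test("https://tracker.com/pixel.gif")); // true
console.log(blocked.test("https://sub.tracker.com/a.js"));  // true
console.log(blocked.test("https://example.com/page"));      // false
```

The weakness Pete points out is visible here: rename the domain and the rule no longer matches, even if the script's behavior is identical.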
Jeremy: [00:05:40] Existing projects you were saying are just looking at a list of URLs. And you said using something like regular expressions to figure out if the URL it's pulling is on that list. The part I wasn't clear about is the new way that you were describing worked.
Pete: [00:05:57] Yeah. The alternative approach that we came up with is to instead not care about where the code came from, or even how the code is structured (so, if it's obfuscated in some way), but instead to look at the DOM and JavaScript operations that the code executes, sequence those, and use that as the identifying signature of the code.
There's some cleverness in there that makes that particularly difficult to do in JavaScript versus other languages. But at a high level, it was saying: let's identify things based on their behavior, not on their source.
Jeremy: [00:06:28] And so would that be where the browser would have to load the script, see how it would affect the DOM, and then, based on that, you would determine whether or not this was something that was probably showing you an ad or trying to track you, that sort of thing?
Pete: [00:06:43] Yes. The way the project works, toe to tip, is: there are these long, long, long lists of things that people previously have identified as being tracking related or ad related. Those are things like EasyList and EasyPrivacy and the uBlock Origin lists and all this kind of stuff.
And so you can throw those at the web, and you get a nice labeled data set: these things are tracking and ad related, these things are benign. So you can execute those files, or load those pages, and get signatures of how that code operates in the page. And so now you have your ground truth signatures of what known bad code does and what known good code does. Then you can run that against a bunch of things that you don't know the labels of, and you can rebuild those labels on top of this code that people have examined before.
And so you can do a couple of things with that. You can either use that to build even more lists in an automated way. You can use it to do code rewriting since some parts are good and some parts are bad. You can use it for online blocking things like that.
Jeremy: [00:07:38] You're basically looking at things that people have identified as bad behavior or tracking behavior. And then, instead of having a human curate the list, you could have your code load things that it hasn't seen before and figure out: oh, this looks like this thing that I've seen before that somebody said was bad, and so I'm going to make a new list based on that.
Pete: [00:08:08] Yeah, exactly. For the Alexa 1,000 or Alexa 10,000, the most popular sites on the web, those have a lot of eyes on them, and so things that are tracking related get picked up pretty fast. But for the long, long, long tail of the web, that stuff is barely examined, or at least has a lot fewer eyes on it.
And so this is a way that you can use the code that people have looked at to identify code that fewer people have looked at.
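A toy sketch of the idea, with invented operation names, might look like this: fingerprint a script by the ordered sequence of operations it performs while running, then compare unlabeled scripts against signatures built from list-labeled ones. The real system described in the paper is far more involved; this only illustrates the matching concept:

```javascript
// Build a signature from the ordered log of DOM/JS operations a script
// performed while it ran (the operation names here are invented).
function signature(callLog) {
  return callLog.join(" > ");
}

// Ground truth: signatures of scripts already labeled as tracking-related
// by human-curated lists like EasyList / EasyPrivacy.
const knownTracking = new Set([
  signature(["document.cookie:read", "canvas.toDataURL", "xhr.send"]),
]);

// An unlabeled script from the long tail of the web: different URL,
// possibly obfuscated source, but the same observed behavior.
const observed = ["document.cookie:read", "canvas.toDataURL", "xhr.send"];
const verdict = knownTracking.has(signature(observed)) ? "block" : "allow";
console.log(verdict); // → block
```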
Jeremy: [00:08:32] On a broad sense, how deeply are people being tracked and do you think people are aware of just how deeply they're being tracked?
Pete: [00:08:41] So in the first case: unimaginably. The amount of web surveillance and offline surveillance that people undergo is unimaginable, a large amount. And in the second case: very little. You'll find these tools like Brave, or the new version of Safari, or AdBlock Plus, or uBlock Origin, any of these.
Good tools by people who are sincerely trying to reduce this stuff. And they'll put a little number in the URL bar, and it'll say 10 trackers on this page or whatever. And you'll go to some news site and it'll have 95 or whatever. And that's just the known bad stuff. I think people have very little understanding of how atrocious the situation is.
Jeremy: [00:09:24] And what are some of the ways that people are being tracked by their browser?
Pete: [00:09:29] Oh, well, in most cases the tracking isn't being done by the browser. That's not always the case; Chrome absolutely does observe things about what you're doing and send them to the Google mothership. But in general the tracking isn't happening because of the browser itself,
but rather by things the browser is loading, because the web pages tell it to do so. So there's a whole long tail, from the extremely boring kind of thing everybody understands, which has been around for 20 years, to more weirdo stuff.
By far the most common method people are tracked with is still: I just drop a cookie. I drop a cookie on one site, I fetch the same image on another site, and so the cookie gets resent, and that's my way of learning that the same person visited site A and site B.
Web browsers like Safari and Brave will never send third party cookies, with a very small number of exceptions. Firefox and Edge have a kind of complicated system for determining when they send third party cookies, but do a good job of not sending them to the worst offenders. Then things get slightly more sophisticated.
Instead of dropping cookies, maybe what you'll do is throw storage into other places where people don't usually look for it. Right now there are at least five or six different APIs you can use to have persistent storage in the browser through JavaScript. Then there's a whole long tail of ways that things can get cached that also turn into persistent identifiers. That's maybe the second weirdest, or second most understood. And then there's a whole bunch of places where the browser is implicitly keeping universal global state that you wouldn't necessarily think of as a tracking vector, but any time you have global state, you have the mechanism you need for tracking.
And the most frustrating example is something called HSTS tracking, or HSTS cookies. HSTS is an abbreviation for a header that a website can send you that says: always automatically upgrade this request to an encrypted version, to an HTTPS version, even if I requested it over HTTP.
In general, what would happen is: I make a request to some website I like, and it's going to be HTTPS; HSTS instructions are not respected over HTTP, generally. So I make a request to a website I like, and it sends back this HSTS instruction that says: good, now that we have a secure conversation, I want you to never, ever communicate with me over anything but a secure channel. So we've got this one secure communication, and we're going to use it as the kernel of trust to build the rest of our communication on.
And so that instructs my browser, every time I visit the site again, to automatically add the S, to make it HTTPS. And the same thing is true for any sub-requests as well, or at least it was in general, until people started coming up with countermeasures.
Jeremy: [00:11:59] If I understand correctly, you make a request to a URL and it tells your browser, in the future you try to go to this URL and you don't put in HTTPS to automatically go to HTTPS instead. And the part that I don't quite follow is how is that used to uniquely identify you?
Pete: [00:12:19] Oh, okay. Step one is: I make a request to your website over a secure connection. Then on your website you have 26 different images, you know, an A, a B, a C, a D, an E, and so my browser will make new requests for each of those images. And those images, in this configuration, are each hosted on a different subdomain of your website.
The A image off a.you.com, that kind of thing. So now, if those are all requested over a secure channel, your server can decide, for each request, whether to send the HSTS instruction or not. So I'll get back those images, and I'll have, more or less, 13 new HSTS instructions.
And those will be different for me than they will be for you or for anybody else; it's just flipping a coin enough times. That's the setting-the-identifier step. And then, a week later or whatever, you want to identify me again. I've cleared my cookies and everything, so I think I'm not identifiable, but I come back to your site, and now you'll have me request the same 26 images, but over HTTP, not a secure channel. And now my browser will upgrade, more or less, 13 of those images. Your server will look to see which 13 images got upgraded, and that will be unique to me versus everybody else in the world.
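The 26-image scheme Pete walks through can be simulated in a few lines. This is a model of the attack, not real browser or server code: each subdomain that received a `Strict-Transport-Security` header stores one bit of the visitor's ID, and watching which plain-HTTP requests the browser later auto-upgrades reads the bits back:

```javascript
const SUBDOMAINS = 26;

// Setting phase: for visitor `id`, send the HSTS header (e.g.
// "Strict-Transport-Security: max-age=31536000") only on subdomains
// whose bit is set. The browser remembers each upgraded subdomain.
function hstsSubdomainsFor(id) {
  const upgraded = new Set();
  for (let bit = 0; bit < SUBDOMAINS; bit++) {
    if ((id >> bit) & 1) upgraded.add(bit);
  }
  return upgraded;
}

// Reading phase: serve the same 26 images over plain HTTP and observe
// which ones the browser upgrades to HTTPS on its own.
function recoverId(upgraded) {
  let id = 0;
  for (const bit of upgraded) id |= 1 << bit;
  return id;
}

const browserState = hstsSubdomainsFor(8675309); // survives clearing cookies
console.log(recoverId(browserState)); // → 8675309
```

With 26 subdomains there are 2^26, about 67 million, distinct IDs, which is why the "more or less 13 upgrades" Pete mentions is enough to single out a visitor.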
Jeremy: [00:13:28] I see. Wow. So it's a feature that had good intent, but the way people are actually using it is to build a fingerprint, right? They know, for each URL, which ones they told you to upgrade to HTTPS and which ones not.
And like you said, even if you clear your cookies or whatever, the URLs that should be upgraded based on HSTS are still stored in your browser.
Pete: [00:13:56] Yep. And so there's a long tail of these kinds of things where they were added to the web platform or to browsers or to the internet infrastructure for largely completely benign or really desirable reasons. But because of the way they've been implemented or because of the way clever people have misused them they become tracking vectors.
HSTS is not at all the only one. It's just the one that is kind of the most galling because it's supposed to be helping people's security and ends up hurting their privacy or can hurt their privacy.
Jeremy: [00:14:24] Right. So in the past you were talking about classic tracking that makes use of cookies, and that's something that gets stored in the user's browser. And my understanding is that, for cookies in the past, in order to track somebody while you're on a site, that site makes a request to another domain, right? To a tracking domain. And as you go from site to site, those other sites use a cookie from that same tracking domain. So that's what you'd consider a third-party cookie. Is that correct?
Pete: [00:15:00] Yeah. So I would go to your site. Your site is benign.com, and your site includes an image from tracker.com. I get the response back from tracker.com, and it tells me to save a cookie from tracker.com. I go to some new website, it also requests tracker.com, and now tracker.com can link those views.
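The classic third-party flow Pete just walked through can be reduced to a toy simulation. The names (tracker.com, the sites) are illustrative, and real trackers ride on Set-Cookie headers rather than a Python dict, but the linking logic is the same.

```python
import itertools

class Tracker:
    """A hypothetical tracker.com: hands out one ID cookie per browser and
    logs which first-party site each embedded request came from."""
    def __init__(self):
        self._ids = itertools.count(1)
        self.log = []                       # (tracker_cookie, first_party_site)

    def embedded_request(self, cookie_jar, first_party):
        if "tracker.com" not in cookie_jar:         # first sighting: set cookie
            cookie_jar["tracker.com"] = f"uid-{next(self._ids)}"
        self.log.append((cookie_jar["tracker.com"], first_party))
        return cookie_jar["tracker.com"]

tracker = Tracker()
my_jar = {}                                  # one browser's cookie store
tracker.embedded_request(my_jar, "benign.com")
tracker.embedded_request(my_jar, "news.example")   # different site, same cookie
# The tracker can now link both page views to the same person:
assert [site for uid, site in tracker.log
        if uid == my_jar["tracker.com"]] == ["benign.com", "news.example"]
```

Blocking third-party cookies breaks this by refusing to store or send the `tracker.com` entry when the request is embedded in someone else's page.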
Jeremy: [00:15:16] And so now you were saying a lot of browsers like Brave and Safari and Firefox are starting to block third party cookies. What are some of the ways that sites are working around that?
Pete: [00:15:30] So there's a long tail. Some people have been moving to these more esoteric tracking things. HSTS tracking is one thing people in the wild were doing, particularly against Safari, when Safari started blocking third-party cookies. They've also been moving to different types of identifiers in the browser.
So maybe I don't store something in a third-party cookie; instead I set a cookie in the first-party cookie jar, and then I just append that to my requests. Things like Google Analytics do that. That's called riding on the first-party cookie jar, because even though the code is served by, say, Google Analytics or any number of other tracking scripts, the cookie is actually living in your cookie jar, the cookie jar associated with your origin. So those are two, and then it just kind of gets more oddball from there.
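Riding on the first-party cookie jar can be sketched like this. The cookie name `_visitor_id` and the beacon URL are made up for illustration (Google Analytics' real first-party cookie is `_ga`, set by a script running in the page's own context), but the mechanics are as Pete describes.

```python
from urllib.parse import urlencode
import uuid

def analytics_snippet(first_party_cookies):
    """What a tracking script running in the page's *own* context does:
    the ID lives in the first party's cookie jar, not the tracker's, so
    blocking third-party cookies alone doesn't stop it."""
    if "_visitor_id" not in first_party_cookies:
        first_party_cookies["_visitor_id"] = uuid.uuid4().hex
    # The ID is then appended to a request URL the tracker can read:
    return "https://collect.tracker.example/beacon?" + urlencode(
        {"cid": first_party_cookies["_visitor_id"]})

jar = {}                          # cookie jar for the first-party site
url1 = analytics_snippet(jar)
url2 = analytics_snippet(jar)     # a later page view on the same site
assert url1 == url2               # same ID rides along on every beacon
```

Because the cookie belongs to the site you're visiting, countering this requires different tools, such as partitioning storage or truncating the lifetime of script-set cookies, rather than simple third-party cookie blocking.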
So there's browser fingerprinting, if you're familiar with this, which is finding things that are semi-identifying or semi-unique in your browser and building up enough of them; very quickly, I can identify a large number of people uniquely. It's like Guess Who: you just split the population in half enough times. That's done extremely commonly on the web. It's very, very common.
And then there's a kind of thing called bounce tracking. It's not that new, but it's an increasingly common kind of thing, where different browsers will only let you set third-party state if you've visited that party in a first-party context. And so websites will play these games where they just forward you through a long chain of first parties before you get to where you want to go.
And now all those parties can set third-party cookies in iframes and things like that. I could go on and on; there's an endless number of ways these things are done. Getting rid of third-party cookies is definitely an extremely helpful thing, but it's not the end-all be-all of web privacy.
Jeremy: [00:17:05] One of the things you mentioned was browser fingerprinting what are some of the ways that, people's browsers get fingerprinted?
Pete: [00:17:12] Sure. At a high level browser fingerprinting is looking for a bunch of things that are going to be different between people's browsers. They don't have to be unique to a single person, but there'll be minor configuration differences or subtleties that are different between my browser and your browser and somebody else's browser.
For example, English is a very common language on the web, so if I know someone's language, maybe that identifies them a little. Say 1 out of 20 people on the web speak English, or something like that; some number less than a hundred percent, anyway.
And so now I've cut out a large portion of the web. Then I look and see: is this person running a Mac? That's going to cut the search space down too. And I look at how many devices they have plugged into their machine. That'll shrink it down further. What are the peculiarities of their graphics card; when it performs different drawing operations, does it draw a line in a slightly different way than another graphics card would?
That'll shrink it down further. Does this system have odd fonts installed? That shrinks it further, et cetera, et cetera, et cetera. And if you have enough of these kinds of things, you can pretty quickly put a lot of people in a bucket of size one.
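The "split the population in half enough times" idea is just information theory: an attribute shared by a fraction p of users contributes -log2(p) bits of identifying information, and bits from independent attributes add up. A sketch with made-up fractions (these are not real measurements):

```python
import math

def bits(fraction_sharing_value):
    """Identifying information, in bits, of an attribute value shared by
    the given fraction of users. Halving the population = 1 bit."""
    return -math.log2(fraction_sharing_value)

# Illustrative fractions only, chosen to show how the bits accumulate:
attributes = {
    "language: en":     0.5,    # half the population
    "platform: macOS":  0.15,
    "unusual font set": 0.01,
    "canvas quirk":     0.05,
}
total = sum(bits(p) for p in attributes.values())
needed = math.log2(8e9)   # bits needed to single out 1 of ~8 billion people
print(f"collected ~{total:.1f} bits; ~{needed:.1f} bits singles out one person")
```

Four attributes with these fractions already yield around 15 bits; roughly 33 bits suffice to put every person on Earth in a bucket of size one, which is why a modest list of quirks identifies people so quickly.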
Jeremy: [00:18:20] And one of the things you mentioned is that you can identify what devices are plugged into your computer and maybe what your graphics card is. What are some things you think people would find surprising about how much your browser is actually sending to the server that you're visiting?
Pete: [00:18:39] So it depends on the browser that you're using, but that's a good question. I expect people would be surprised that, without any permission, a webpage you don't trust can, in some browsers, enumerate all the devices you have installed: the labels, the type, this kind of thing.
They can learn about what kind of network connection you're on and whether you're plugged in. If you're in Chrome, the most popular browser, they can learn about the kinds of network errors you're observing, which can be very identifying if you're moving between networks with different kinds of DNS configurations. If somebody has a two-factor authentication device, like a hardware key, they can learn some things about the hardware key, like the strength of it.
Even though you wouldn't expect a website to be able to access that automatically. And those are just things the browser is intending for websites to be able to access; that's not cleverness, that's just: there is an API that will tell me specifically this. And then there's a long list of other things you can figure out with a moderate amount of cleverness as well.
Jeremy: [00:19:36] I think a lot of people are familiar with the fact that when they go to websites, they'll be asked for permission to use a microphone or GPS, things like that. For some of these other things you're referring to, that a site is able to figure out, such as the devices that are connected,
is that something where the person is giving permission, or is that something that just happens?
Pete: [00:19:58] Yeah, so that just happens. So, I represent Brave on the W3C, and I co-chair one of the privacy groups in the W3C, the horizontal review group that reviews specs for privacy. And that's something we're working on with the working group that authors that spec, the media capture and streams group, to improve that API.
But that is just the way the API works in the standard right now. The website says: tell me about the devices the machine supports. It gets back a list of every device the machine knows about that has an AV component. And then the website says, okay, I would like to access this one.
And then you get the permission dialogue. But without any permission, you can learn all the devices and all the labels and the categories and this sort of thing. I should say that the spec looks like it's getting much better; that working group has been really good to work with and really receptive to those concerns. But that's the way it ships right now in Chrome and Edge. Brave makes some modifications to it; I don't know about the other browsers. But that's the way the spec is written.
Jeremy: [00:20:59] And that's interesting, because you're saying the standard is currently being written, but a lot of these different browsers already have an implementation of it.
Pete: [00:21:09] Yeah. So this is another one of those good intentions that turned out to have unintended consequences. It used to be the case that when you were writing a web standard, a bunch of people would get together and work on the standard, and then when it was done, people were supposed to implement it.
And then, and this is a rough history, I'm sure I'm going to get some of the details wrong, but to a rough approximation, something like CSS 2 happened, where you ended up with a standard that was basically unimplementable. It had all these subtleties that hadn't actually been worked out, because nobody had implemented it yet.
There are other cases too, and CSS 2 might not have been the tipping point, but it was definitely a famous example. Then the CSS 2.1 standard came out, and when people started implementing it, it had to get revised in certain ways to make it actually work in the world.
That's a hand-waving simplification, but people thought: this is not great, we need to actually build these things as we're talking about them, to make sure they work in the world. And so then you got into this prefix situation.
If you're a web developer, you'll be familiar with how, until pretty recently, you had all these prefixed extensions: rounded corners, but WebKit rounded corners, and Microsoft rounded corners, and Mozilla rounded corners. And you had similar things in the DOM.
And there's still some hangover, where a bunch of specs in most browsers are implemented twice, once WebKit-prefixed, things like that. So, understandably, people thought: this is not great, now I have to write my code four different times. And so right now, if you're trying to get a standard finished in the W3C (I'm less familiar with other standards organizations), you need to have two working implementations, two independent implementations.
They don't necessarily need to be shipping unflagged. The way it's supposed to work, in the best case, is that they're running in the browser behind some flag you flip, or only enabled for some set of websites, but that's not always the case. Shipping things as they're being designed in the standards body absolutely begs the question, right? If you find out during review that there's a problem with the spec, well, now a whole bunch of websites have already started depending on that functionality and have it baked in. That's a real pickle, and something we fight with a lot during these reviews. But yeah, that's the less-than-ideal situation things have come to at this point. I think it's getting better, but that's generally how things are done right now.
Jeremy: [00:23:20] That's kind of interesting, because it sounds like you have the W3C planning things that will go into the browser, but in order for them to become a standard, they need to already be in the browser. Which also means, like you said, that people are already using them. I wonder what the negotiation or the back and forth looks like in terms of, let's say, Chrome is already using a certain feature and you say: we'd like you to change this feature for this reason.
They'll say: well, we already have thousands of sites that are using this, how are we going to change it? What does that back and forth look like?
Pete: [00:23:59] Yeah. So, the first thing to keep in mind is that the standards body is not a legal organization. The standards body can't make anybody do anything. They can say something's out of standard, they can remove it, but people listen to a standards body if they want to listen to a standards body, and they don't if they don't.
So in that sense, a standards body works by trying to make it mutually beneficial for people to go along with it and providing resources that maybe an organization wouldn't have if they weren't in the standards body and some benefit in terms of interoperability, that sort of thing to strengthen the platform in general.
So that being said, there's not an easy answer. Sometimes you can find clever games to play with the way the existing APIs work that reduce the amount of information exposed, but without breaking the function signatures or the expected flow of the program. There are several different tiers of review in the W3C, and some happen earlier than others.
What we've been trying to do is push review earlier in the process, to try to catch these things while they're still in the prototype stage instead of at web scale. But yeah, there's no way the standards body can go to Mr. and Mrs. Google, or Mr. and Mrs. Apple or whoever else, and say: you must do a thing. Unfortunately that's just the case.
Jeremy: [00:25:13] And you were saying, how you're the co-chair of the privacy interest group on the W3C. How much power would you say that that group has? Do you have examples of times where somebody has tried to push a feature through and you've rejected it on the basis of privacy and there's actually been changes made or the feature has been dropped all together?
Pete: [00:25:39] I should say, in terms of power: we're a subset of the W3C, and the W3C, as an organization, at the end of the day has maybe some moral authority; people respect it. And so there's some soft, in quotation marks, power there.
And there's a lot of expertise that people respect in the W3C, and a lot of mutual interest between the browser vendors in having a web that's friendly for developers and friendly for users. So there's no authority, but browsers are sometimes interested in what the W3C has to say because of those other reasons. So power is a funny word to use there, though I take your point.
To point out very specific changes that have been made, I'll talk about the WebRTC one, the enumerateDevices API, because that's the one we just mentioned. Right now, partially because of interest among people in the working group, and partially because of reviews that PING has done (PING is the Privacy Interest Group), there are immediate changes that look like they're going to go into the API.
None of this is a hundred percent; this is all in the GitHub issues right now. But this is the direction things look to be going, best guess: there will be a number of ways of shrinking the amount of information websites can access by default. So, for example, without a permission, maybe a site doesn't see that this person has 18 different microphones.
They just see there's at least one microphone. And then when the website asks for permission, it can learn the number, the details, things like that. So that's a wonderful thing. I think the working group would agree it's not the dream outcome, but it's dramatically better than what was there earlier.
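The intermediate redaction Pete describes (report only that at least one device of each kind exists; reveal counts and labels after permission) could look roughly like this. This is a sketch of the direction, not the actual spec text, and the device records are simplified.

```python
def enumerate_devices(devices, has_permission):
    """Sketch of a redacted device-enumeration API: before permission,
    collapse the list to one anonymized entry per device kind; after
    permission, return the full list with labels."""
    if has_permission:
        return devices
    kinds = {d["kind"] for d in devices}
    return [{"kind": k, "label": ""} for k in sorted(kinds)]

devices = [
    {"kind": "audioinput", "label": "USB Mic A"},
    {"kind": "audioinput", "label": "USB Mic B"},
    {"kind": "videoinput", "label": "FaceTime HD Camera"},
]
before = enumerate_devices(devices, has_permission=False)
# Pre-permission, a site learns only "at least one mic, at least one camera":
assert before == [{"kind": "audioinput", "label": ""},
                  {"kind": "videoinput", "label": ""}]
assert enumerate_devices(devices, has_permission=True) == devices
```

The point of the design is that the pre-permission view carries almost no fingerprinting entropy, while the post-permission view still supports legitimate device pickers.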
And then we're also working with the working group, and they're doing great work, on figuring out where we can go that's even better. That looks like: the website doesn't see anything by default; if the website wants to see devices, it calls a thing, and the browser prompts you with a list of devices.
And then, if you agree, that information gets passed. That's an example that comes to mind. There's a long list; if it's of interest to your listeners, I could also point you to the endless list of privacy issues we've raised and the back and forth that happens on them.
But sometimes they're very large things like that. Sometimes they're very small things like you're leaking fingerprinting bits here and let's figure out a way to sort through that.
Jeremy: [00:27:44] One of the things I find interesting about your position is you work for brave, which is a browser vendor, and you have Microsoft with Edge. You've got Google with Chrome, Apple with Safari, and Mozilla with Firefox. And I would imagine that all of these different companies, they all have their own goals.
They all have their own things that they want. I wonder from your perspective, what are the kinds of roles that each of these companies play and where do they butt heads and where are they on the same page?
Pete: [00:28:19] I want to only say things I'm very confident about. All of these organizations, and particularly the people sitting in these committees and working groups representing them, have an interest in the web.
They see there's something unique about the web that is appealing and desirable and positive, that's not on other platforms. They may not a hundred percent agree on what those positive things are, but there's something that appeals to us about the web that doesn't exist on other platforms.
And so there's mutual interest there. I also think that all these people care about privacy, they care about making sure things are accessible to people with different needs on the web. They care about making sure APIs work well for people who speak different languages and come from different backgrounds and these sorts of things.
At the end of the day, the people who choose to spend their time in these long meetings working with each other, we have very similar interests and we're all pushing the same way. Where they differ is the prioritization of those interests. Brave is absolutely like: we think there's something super duper wrong, kind of fundamentally wrong.
Yeah, maybe that's too strong, but: the web has really gone sideways, and the privacy violations are endemic and really horrible, and intolerable. I think other people would say, yes, privacy violations are bad, but also we want to make sure we don't break the ecosystem that exists to fund the web as it is today.
And so for them, privacy is just one among many different interests, including making sure advertising dollars can fund websites, things like that. And then there are other people who sit at other points on that spectrum and have different interests.
So I think we're all pushing the web in the same direction, are interested in making sure it flourishes, but what flourishing means probably differs between different people in different organizations.
Jeremy: [00:29:58] Something that sometimes comes up and maybe it's a little more front of mind because Apple's worldwide developer conference happened recently is that people have a perception of Safari not implementing a lot of features that other browser vendors either implement or want to implement.
I think a lot of times they say they're doing it in the name of privacy. And on the other hand, you have developers saying: we want all of these different features, because we want to be able to build progressive web applications, websites that are similar to apps.
And I wonder from your perspective, how do you balance these two goals?
Pete: [00:30:40] So I think that's a really interesting example you brought up for a couple of reasons. I bet we're thinking about the same tweet that went around and the same people blowing off steam, and I can totally understand their frustrations. But I should say two things first before going into the guts of your question.
One is that most of those things are not standards, they are proposals. As much as we in the web community like to treat them as standards because they're implemented in the most popular browser, they are not standards; nobody's agreed to them. They're proposals. The second thing is that I think there were 16 or 17 or 18 different things on that list.
I don't remember the full thing, but I remember looking through it and thinking: Brave takes the additional step of removing these things from Chromium before we ship Brave. And I'm completely sympathetic to the idea that in the vast majority of those cases, maybe all of them (I just don't remember the full list), those are really privacy-risking features.
And the permission models around them are not well defined, they haven't been well reviewed, and the risk is really significant. Look, Apple's got more money than anybody knows what to do with. Apple's not not-implementing because they're lazy. They may be pursuing a different strategy.
But I also know that the people in those committees have a sincere, strong, heartfelt interest in privacy. So I understand the frustration of the web community, but I find the privacy story there compelling.
Jeremy: [00:31:58] And I think it's also maybe important to think about the fact that as soon as you put those into the browsers, it's going to be extremely difficult to remove them.
Pete: [00:32:08] Yeah. I mean, the web congeals around any feature that gets in there, and the moment you put something in, it becomes extremely difficult to pull it out. That's something we deal with at Brave a lot, because we think the way a lot of APIs work is inappropriate and intolerable, and we have to be very clever about the ways we can modify behavior that websites already expect to exist in a certain form.
Jeremy: [00:32:32] I think I know about the tweet you're referring to, and I don't remember all the specific features but, I wonder from your perspective, are these features that you think shouldn't exist or is it more that the way that people want to implement them now wouldn't be done in a privacy conscious way.
Pete: [00:32:50] Hmm, that's a good question. So I also don't remember the full list, but I can pull out some examples. I think there are kind of three tiers. Some things just seem like bad ideas we should not do, or at least not do without pretty fundamentally rethinking how they exist.
Some are things that make more sense as operating system features or native app features than as website features. And some are things that, yeah, maybe would actually be very useful on the web, if we could figure out how to do them responsibly.
A lot of this stuff has its roots not in things typical websites need to do, but in the union of a bunch of weird things that happened. One is that Firefox OS happened for a while, and so a bunch of things got pushed into the web platform, some of which got yanked out later. ChromeOS is another one, PWAs, things like this.
And a lot of these things are really different from what we think of as websites. It's worth thinking about where those lines are and whether they should be firm, that sort of thing. The example that sticks out in my head is that a while ago, a standard got shipped in Gecko, in Firefox, and in Chrome that allows websites to read the amount of ambient light in the room.
The website could read whether it's very bright or very low; I don't remember the granularity, but some steps in between. And of course, the very first place this got used was in tracking scripts, to fingerprint people. Same with the battery API: there was an API that allowed websites to see whether you had a full battery, a low battery, that sort of thing.
You can imagine why that'd be a nice feature in an app. But you can also imagine it gets sucked into the fingerprinting scripts immediately and starts harming and targeting people. And so, yeah, there's definitely a part of the web that says let's just permission-prompt everything, or use a number of different kinds of proposals, which concern me, to restrict this stuff or allow it on the web in a responsible way. The web as it is, without adding more functionality, has so many deep privacy issues that I feel very nervous about pushing for new functionality unless privacy is really treated as a first-class citizen in those standards.
Jeremy: [00:34:56] Yeah. And it sounds like, where we're at now, there are already so many different ways you can be fingerprinted, and every time a new feature is added to the browser, it just gets easier and easier to track someone.
Pete: [00:35:11] Yup. I think that's exactly right. And there's also cases where adding a new feature undoes a privacy protection somebody else has added in somewhere else. It's good to be very cautious before throwing new powerful features into the platform.
Jeremy: [00:35:23] Another thing that you had mentioned when we first talked about doing this interview was you had said that Brave is based on chromium. And you said that you had a somewhat semi adversarial relationship with upstream chromium. I wonder if you could elaborate on that?
Pete: [00:35:44] Sure. That was kind of a silly, goofball way to put it; it states things too strongly. The Chromium developers have been very receptive to questions we have, and trying to upstream stuff has been a positive experience for us.
But it is the case that the vast majority of Chromium developers are Google employees, and of course Chromium is shipped in a lot of ways with Chrome in mind. I don't think it's a malicious thing, but there's a whole lot of stuff in the Chromium code base that assumes Google: which servers get talked to, account information stuff, Safe Browsing things, an incredibly long list. Including, and maybe this is what I had in mind when I said adversarial (poor choice of words), a couple of features Chrome ships that let them basically enable a feature only on certain origins. They call it field trials, this kind of thing.
So if the Chromium folks want to test out a feature, they can say only these three or four or five partner websites get it. And sometimes they'll ship a feature ungated, not flagged, and then use this mechanism to turn it off.
So they'll ship some new experimental feature and then say: but we're not going to allow it on any sites; the field trial list is empty. That's their way of making sure sites don't get it. Well, if you're building a browser that wants to put firm lines between itself and Google's data collection servers, you don't get that information.
And so, all of a sudden, the weirdo experimental feature is enabled globally in your version of Chromium. There's a long list of things like that. There are also other choices in the platform that make certain things we'd like to do difficult; I could go into those examples if it's of interest. I don't think that's adversarial, that was a silly choice of words, but it does mean there are different interests being pursued in the code base that are not always Brave's. It's not always as privacy-focused as Brave would like.
Jeremy: [00:37:40] I'm not sure if you would have an answer to this, but, when Brave was deciding, what rendering engine to use whether that's, Chromium's Blink or WebKit, or something else. Why, why make the decision to use chromium as a base?
Pete: [00:37:57] So this predates me at the company, so I can only think through some of these things; I don't want to say something I'm not sure about. The early Brave folks considered a bunch of different engines, and Brave started as an Electron app. Basically, when there was an extremely small number of developers at the company and it was extremely early days, everything was done on top of stock Chromium. It allowed the company to iterate really quickly, try a bunch of new things, and do some of the things it knew it wanted to do that were easier at that level, rather than trying to maintain a large patch set against Chromium.
And there's probably some path dependency from that. We're no longer an Electron app; we're a proper Chromium project. That's part of it. I don't know the particulars of why Electron was selected and not a Gecko option or a WebKit option, so I couldn't say exactly what tipped the scale one way or the other.
Jeremy: [00:38:46] Something you mentioned was that private mode or incognito might be something interesting to talk about so could you elaborate on what you were thinking there?
Pete: [00:38:55] The battle over what private browsing mode or incognito mode is and what it's supposed to do.. I think nobody has a single story for what it's actually supposed to be.
In some browsers that basically means your storage doesn't persist after you close the browser. And that's all it means. The browser operates exactly the same way. Local storage operates the same way, et cetera, et cetera, except you have a separate cookie jar and a separate set of state that goes away when you close all your private browsing windows.
For a long time, that was the textbook definition, or whatever was agreed on. But you can see over time, in standards bodies and in implementations, there's been some recognition that users, or at least some users, have a different expectation of what private means.
It can connote something beyond just "the state goes away." And so there's been a slow drip of new privacy features into the private browsing windows of major browsers. So in Firefox, by default, if you open a private browsing window you're in strict mode rather than default mode for tracking protection; it does slightly different things.
Chrome changes the operation of some APIs that let you query your storage quota, to prevent sites from detecting whether you're in private browsing mode, et cetera, et cetera, things like this. But I think it's interesting, because it seems like a recognition that users want more privacy and are reaching for whatever buttons are in front of them.
Even if what guarantees are being made by those buttons aren't totally clear.
Jeremy: [00:40:26] Yeah, that's a good point because when I think of private mode or incognito mode, I think of your first example where it just means that it's going to clear whatever was stored on the computer like cookies or your history, things like that. And what you're saying is that now the opinions have shifted to where maybe private mode should be blocking trackers or maybe it should be... I think the example you gave was preventing sites from finding out certain things about your computer or your browser. That's a perspective that I didn't realize people thought but that makes a lot of sense.
Pete: [00:41:05] And maybe this is a positive thing. My impression, and I wouldn't go to war over it, is that it's become a little bit of a testing ground. We know fewer people use private browsing mode than the typical mode.
So we can be slightly more experimental in the kinds of privacy-related features we test out in private browsing mode. If that's the case, it means more and more stuff gets turned on by default over the medium term. I think it's probably a good thing for the web.
Jeremy: [00:41:34] One of the things you touched on earlier was that when you're trying to preserve privacy, when you're blocking things that could be used to track you or blocking certain browser features, one of the side effects is that it can break websites.
What are some common examples of where that can happen, and how are browser vendors in general, Brave included, trying to work around it?
Pete: [00:42:04] Sure. So the most common, or goofball, way that can happen is: say you're using some ad blocker, you're pulling in some filter list, and it says you should block everything with /ad/ in the URL, something like that. And some website, for whatever reason, has something that's not an ad at a URL like that.
Now you're blocking something you don't intend, and there might be a script the page depends on for its execution. Given the size of these filter lists, you could easily be dealing with hundreds of thousands of rules, maybe even 200,000, if you're using a tool like Brave or uBlock Origin. The possibility of false positives is very high. So that's the simplest case that can happen. But then it gets more complicated. Brave blocks third-party storage by default; there's a very small number of exceptions that we make to unbreak websites.
But by default we just block all third-party storage. So if you're in an iframe, you don't get to store stuff, you don't get cookies, if you're a third-party request, stuff like that. And in the vast majority of cases that works just fine.
People don't usually care about the stuff that's going on in iframes on a page, and when they do, it doesn't usually need to touch storage. But you can imagine some places that'll break: someone embeds a video, and that video wants to store its state, something like that.
That requires some cleverness to deal with. And then a third example: when COVID started becoming a popular concern, people wanted to look at maps and see where COVID was spreading. These sites would usually render the maps via SVG or via canvas operations, and Brave by default (no longer, but at the time) was blocking certain canvas and SVG operations because we knew they were being used by fingerprinters. So all three of those cases are privacy protections that ended up breaking things that, at least in these cases, aren't privacy harming.
Probably even more so than doing privacy stuff, my job at Brave is figuring out how to do that privacy stuff in a web-compatible way, how to break fewer websites, so people can use Brave without having to drop shields and drop those protections. And each of those different things warrants a different response.
So one has been to adopt a strategy that the uBlock Origin project takes. The uBlock Origin project is fantastic, and all credit to those folks; it is really fantastic work.
Instead of just guessing yes, I allow the resource, or no, I block it, they'll sometimes replace it with a different thing that maintains the API signatures but nulls out the tracking behavior. And that's been a really useful approach for unbreaking websites.
If we can figure out what pages expect, like the functions they expect to be in place, we can replace them with less painful stuff. And I can talk afterwards about our research project over the summer with Michael Smith, a student who's visiting from UCSD, to leverage this, if that's of interest.
Jeremy: [00:44:56] Are you replacing something in the JavaScript code that's running, or are you replacing some browser API that it's trying to get access to?
Pete: [00:45:06] Sometimes both. So in the simplest cases, Google Analytics provides some functions, or triggers some events on load, and if you block Google Analytics, some things will never load. So instead of blocking Google Analytics, you say: here's the request for Google Analytics; instead, I'm going to return this thing that does nothing but trigger its load event, and it doesn't actually touch the network or anything like that. So you're replacing the resource instead of requesting it.
But you might also see that some code that does something nasty is inline, so I don't get a chance to modify the request, and I want to somehow modify its behavior. And so I'm going to, I mean, sometimes this stuff gets really gross, but I'm going to overwrite some structure the page expects to be there, or I'm going to throw a stack trace and look up to see if I'm in the inline code; if I am, I take path A, and otherwise I take path B. All these kinds of gross things. The web is a messy place, and there's a whole bunch of tricks like that that have to get pulled. So we pull a bunch of that stuff from the uBlock Origin project, and we generate some of our own, for fingerprinting stuff.
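The replacement-script idea can be sketched roughly like this. It's an illustrative stub in the spirit of uBlock Origin's neutered scripts, not the actual replacement either project ships; the function names mirror the old `window.ga` analytics API, but treat the specifics as assumptions:

```javascript
// Sketch of a privacy-preserving resource replacement: instead of
// blocking an analytics script (which breaks pages that call its API),
// serve a stub that keeps the API surface but does nothing.
function makeAnalyticsStub() {
  const noop = () => {};
  const ga = function (...args) {
    // Pages sometimes pass a ready-callback, e.g. ga(function (t) {...});
    // invoke it so page logic waiting on it still runs.
    for (const arg of args) {
      if (typeof arg === "function") arg(undefined);
    }
  };
  ga.create = noop;           // tracker creation becomes a no-op
  ga.getByName = () => null;  // no trackers exist
  ga.getAll = () => [];       // ...so the list is empty
  ga.loaded = true;           // some pages poll this before continuing
  return ga;
}

// In a real replacement this would be assigned to window.ga.
const ga = makeAnalyticsStub();
ga("send", "pageview"); // no network request, no tracking
```

The key property is that every call the page might make still succeeds, so page scripts that depend on the analytics object keep running.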
And this is something we've been able to pull from research that I'm really proud of us shipping: in the same sort of way that uBlock Origin said it shouldn't just be yes or no, that we should have some middle road that allows us to be more clever,
we've taken the same approach with fingerprinting protection. So instead of just saying yes, the API is allowed, or no, the API goes away, we now do something we call farbling, where we break the assumption websites have that the features are going to operate in a fixed way across browsers, by adding a little bit of noise to the API response.
So if you're doing some canvas operations, we'll, with very low probability, modify a pixel here or there, or flip a bit, like the lowest bit in the color channel for a pixel, that kind of thing. So instead of just blocking the API to protect people, we have this more web-compatible way where all the APIs still work, but we remove their identifiability by having them always do something different between sites and between sessions.
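The pixel-flipping idea can be sketched roughly like this; the PRNG, the probability, and the function names are illustrative, not Brave's actual farbling implementation:

```javascript
// Sketch of "farbling": perturb canvas pixel data (RGBA byte array)
// with a tiny amount of deterministic noise, so readback differs per
// site/session but is visually indistinguishable to a person.
function mulberry32(seed) {
  // Small seeded PRNG so the noise is reproducible for a given seed.
  return function () {
    seed |= 0; seed = (seed + 0x6D2B79F5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function farblePixels(pixels, seed, probability = 0.05) {
  const rand = mulberry32(seed);
  const out = pixels.slice();
  for (let i = 0; i < out.length; i++) {
    if (i % 4 === 3) continue;             // leave the alpha channel alone
    if (rand() < probability) out[i] ^= 1; // flip the lowest color bit
  }
  return out;
}
```

In a real browser this would hook the canvas readback APIs (getImageData, toDataURL), so a fingerprinter hashing the pixels gets a different value per site and per session.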
Something that we're working on right now, and we're actually working with a student from North Carolina who's prototyping this for us over the summer, another research intern named Jordan Stock who's doing great stuff: we're looking into a third option for local storage for remote frames.
So instead of a frame either getting storage, yes, or not getting storage, no, we want this middle option where the frame gets what looks like normal storage for the execution of the page, but by the time the top frame is closed, that storage goes away. A lot of this stuff, the web compatibility game, is figuring out ways of breaking the binary choice and sneaking more cleverness into the platform.
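The ephemeral third-party storage idea can be sketched roughly like this; the class and method names are illustrative, not Brave's implementation:

```javascript
// Sketch of ephemeral third-party storage: an iframe sees a working
// storage API, but the backing data is scoped to the lifetime of the
// top-level page and discarded when it closes.
class EphemeralStorage {
  constructor() { this.data = new Map(); }
  setItem(key, value) { this.data.set(key, String(value)); }
  getItem(key) { return this.data.has(key) ? this.data.get(key) : null; }
  removeItem(key) { this.data.delete(key); }
}

class TopFramePage {
  constructor() {
    // One ephemeral store per third-party origin, per top-frame visit.
    this.thirdPartyStores = new Map();
  }
  storageFor(frameOrigin) {
    if (!this.thirdPartyStores.has(frameOrigin)) {
      this.thirdPartyStores.set(frameOrigin, new EphemeralStorage());
    }
    return this.thirdPartyStores.get(frameOrigin);
  }
  close() {
    // Closing the top frame throws the third-party storage away.
    this.thirdPartyStores.clear();
  }
}
```

The embedded video example works under this model: the player can stash its state while the page is open, but nothing persists across visits for cross-site tracking.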
Jeremy: [00:47:43] So when you're referring to a frame and the local storage going away, could you elaborate on what you mean by that?
Pete: [00:47:50] Oh, sure. So I'm on a typical website: you have your one frame, which is just this document object, and there's a bunch of DOM structure that hangs off of that. But one of those things off it might be an iframe, which is itself its own contained document structure.
And that can be nested infinitely deep. This is usually referred to as the first party and the third party, or the local frame and the remote frame. There's some overloading of terms, because in some browsers remote frames are also remote processes, in the way that an operating system understands.
But typically the local frame is a frame that has the same eTLD+1, which means effective top-level domain plus one, which is the level of domain you can register if you go to Hover or whatever. And so all the frames that have the same eTLD+1 as the top frame are local frames; anything else is a remote frame or a third-party frame.
And so some browsers will use this as a heuristic for saying: local frames the user trusts, so I'm going to allow them to store cookies and local storage and that kind of thing; remote frames deserve less trust, so I'm going to block storage, or partition storage, or do something possibly clever with storage. Not all browsers do that, but it's increasingly common.
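The party classification Pete describes can be sketched as below. Real browsers consult the full Public Suffix List to find the effective TLD; the tiny hardcoded suffix set here is only for illustration:

```javascript
// Sketch of first- vs. third-party frame classification by eTLD+1.
// A real implementation would use the Public Suffix List, which has
// thousands of entries; this set is just enough for the example.
const PUBLIC_SUFFIXES = new Set(["com", "org", "co.uk"]);

function etldPlusOne(hostname) {
  const labels = hostname.split(".");
  // Find the longest known public suffix, then keep one more label.
  for (let i = 1; i < labels.length; i++) {
    const suffix = labels.slice(i).join(".");
    if (PUBLIC_SUFFIXES.has(suffix)) {
      return labels.slice(i - 1).join(".");
    }
  }
  return hostname;
}

function isThirdParty(topFrameHost, frameHost) {
  // Frames sharing the top frame's eTLD+1 are "local"/first-party.
  return etldPlusOne(topFrameHost) !== etldPlusOne(frameHost);
}
```

So cdn.example.com inside www.example.com counts as first-party (both are example.com), while tracker.org embedded in the same page is third-party.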
Jeremy: [00:49:05] I see. And I think you were explaining how you could have, let's say, an embedded iframe, and it could use browser local storage, but maybe as soon as you click to another page, that local storage goes away. Is that kind of what you were...
Pete: [00:49:22] Yeah, so that's the approach Brave is taking. There's another privacy group in the W3C called the Privacy Community Group, which is kind of like the sibling group to the group I co-chair. I co-chair the review group that reads everybody else's specs and tries to improve the privacy of what other organizations or working groups are working on; the Privacy CG is where browser vendors go to introduce new features. Brave is involved in both, and there's a lot of overlap between the two.
Jeremy: [00:49:50] Earlier you were talking about how people could be fingerprinted, how they could be identified by seeing how things render, whether that's on a canvas or SVG. And the way you said you were dealing with it, which I found interesting, is that it sounded like you were adding additional information: your video card might render something a certain way, but then you would add things that make it render differently than the video card normally would, and that's how you remove it as an identifying factor. I also wonder, you mentioned a research project at UCSD, and I didn't quite catch what exactly that was.
Pete: [00:50:35] Yeah, so in that order. The first one, the adding-information part: this approach came out of two research papers a while back. One is a paper called PriVaricator, led by my current boss Ben Livshits, who's a professor at Imperial in London. And the second was a paper called FPRandom, fingerprint random.
Both of those papers introduced this technique, or played with it; Brave is the first one to productize it, to include it in a popular shipping browser. But yeah, the approach is to break the assumption that there's something unique about this browser that I can identify across sites.
And so we randomize some of these features, or we add an extremely subtle amount of noise that'll confuse fingerprinters but look indistinguishable to users. We do it in a way that's random but deterministic per first party, per session. So if you close the browser, you get a new fingerprint, and if you go to a new site, you get a new fingerprint.
And so that prevents things from being linked. It's been a nice way of taking academic research and figuring out a way to use it in a shipping privacy protection.
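The "random but deterministic per first party, per session" property can be sketched by deriving a noise seed from a per-session key plus the site's eTLD+1. FNV-1a and the names here are illustrative; a real browser would use a cryptographic hash:

```javascript
// Sketch of deriving a deterministic-but-unlinkable fingerprinting seed:
// random per browser session, stable per first party within a session.
function fnv1a(str) {
  // 32-bit FNV-1a hash (illustrative; not cryptographically strong).
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Generated once at browser startup; discarded when the browser closes,
// which is what gives you a fresh fingerprint every session.
const sessionKey = Math.floor(Math.random() * 2 ** 32);

function farblingSeed(topFrameEtldPlusOne) {
  // Same site + same session => same seed (consistent within a visit);
  // new session or different site => different seed (no cross-linking).
  return fnv1a(sessionKey + "|" + topFrameEtldPlusOne);
}
```

A seed like this is what would feed the noise generator in the canvas example: consistent enough not to break a site mid-visit, but useless for linking you across sites or sessions.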
Jeremy: [00:51:36] Cool. So that was something that, in your role at Brave, or I guess Brave as a company, decided to look into from a research perspective, and then because the research went well, you were able to move it over to the product side.
Pete: [00:51:50] Oh, well, I wish that was the case. These are papers that predated Brave's work. It was a situation where we knew we had a problem: most sites were breaking because of our fingerprinting protections, and we didn't want to leave people less protected. And so research was one place we could start digging for a solution.
So you asked about the project at UCSD this summer. There's a student who's visiting, Michael Smith; he's a fantastic student and a fantastic hacker. His project is, I mentioned before the way uBO, uBlock Origin, does these resource replacements. As you might imagine, these things are very difficult to generate. They take a lot of stepping into the debugger and manually figuring out how these large JavaScript blobs operate, particularly what subset of the functionality you need to maintain to unbreak the pages. It's extremely tedious and it doesn't scale well. So the approach that Michael and I are working on (Michael's doing the hard work) is to see if we can automatically generate these things through a combination of browser instrumentation, a system we call PageGraph, which allows you to deterministically, offline, see the interaction of different elements of a page,
AST analysis (the AST, or abstract syntax tree, is a parsing step in executing JavaScript, or in parsing any language), and then code rewriting, to identify the parts that are privacy harming and rewrite just those parts. Then we can programmatically generate these privacy-preserving resource replacements in a way that can be automated, instead of requiring the heroic amount of manual intervention they currently do.
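The final rewriting step can be sketched as below: given spans that the instrumentation flagged as privacy harming, splice in inert replacements. The span format, offsets, and replacement text are assumptions for illustration; the real pipeline works over ASTs, not raw character offsets:

```javascript
// Sketch of offline script rewriting: dynamic analysis (PageGraph-style
// instrumentation) has identified which character spans of a script do
// privacy-harming work; replace just those chunks, leaving the rest of
// the script, which the page may depend on, intact.
function rewriteScript(source, harmfulSpans) {
  // Sort spans and rebuild the source, substituting each flagged chunk.
  const spans = [...harmfulSpans].sort((a, b) => a.start - b.start);
  let out = "";
  let cursor = 0;
  for (const { start, end, replacement } of spans) {
    out += source.slice(cursor, start);
    out += replacement;
    cursor = end;
  }
  out += source.slice(cursor);
  return out;
}

// Hypothetical input: keep init() and render(), neuter the cookie write.
const original = 'init(); document.cookie = trackingId(); render();';
const rewritten = rewriteScript(original, [
  { start: 8, end: 39, replacement: "/* tracking removed */" },
]);
```

The output keeps the page-critical calls and drops only the flagged tracking statement, which is exactly the property a resource replacement needs.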
Jeremy: [00:53:23] So if I understand correctly, currently, when you use something like uBlock Origin and you go to a website, and let's say that website loads a script that has privacy implications, has some issues with tracking, but its behavior is still needed for the website to work, uBlock Origin will replace parts of the JavaScript source code so that the site still works, but it blocks whatever tracking behavior the script was going to have. Is that correct?
Pete: [00:53:53] Yeah, except it's not that it fetches the resource and then does some rewriting on the fly. It just preloads, like, this is the privacy-preserving version of the Google Analytics script, that kind of thing. Brave does the same thing. By the way, we had an intern last summer, Anton, who's now a full-time employee at Brave and is phenomenal.
But yeah, Brave does the exact same thing out of the box. We preload all the same resource replacements, we're generating our own, and we do this in the same way.
Jeremy: [00:54:23] And then in the research project that you're currently working on, the goal is for the browser to be able to load these third-party scripts and, on the fly, figure out if there's something that should be blocked or changed in the script. Did I get that right?
Pete: [00:54:41] That's mostly it, though it's slightly different, and the reason is that JavaScript, because it's so dynamic, is difficult to statically analyze. You have to execute it and see what it actually does, in a lot of cases, to deal with all sorts of corner cases and aspects of the language, because things can get aliased, functions get bound, and there's dynamic code execution through evals and stuff like that.
And so the difficulty is: you hand me some JavaScript, and I can't reason about it in a fundamental way and say these are the seven places where it's going to write a cookie, or do a network request, or touch your local storage, or whatever.
So that's one problem there. The way we solve it is we have this heavily modified version of Brave that includes a feature we call PageGraph, which allows us to, among other things, say: okay, these are the 18 parts of the JavaScript code that actually ended up touching local storage, or doing a network request, or whatever else.
And so we use that for de-aliasing the values of the JavaScript. Then offline, once we have those, we can programmatically rewrite the code by analyzing where those places are and replacing those lines of code, those chunks of the file, with privacy-preserving alternatives.
And at that point we have our resource replacement, automatically. So the process is offline: we crawl the web and generate a whole bunch of these things beforehand, so we can preload them in the Brave browser, or share them with the uBlock Origin project or anybody else.
But the appealing thing is, if we do all this work over the summer and this research project is successful, which I think it will be, we have a way of automatically doing the stuff that before would take a pretty awesome amount of manual labor to do.
Jeremy: [00:56:28] And so it sounds like you have this special version of the Brave browser, and you could automate it to visit a bunch of websites, pull all the scripts, and see what they do to the page, and then basically get a list of: hey, these are all the scripts that we think have issues.
And we saw what they did, and this is the part we need to remove or change. And then you can ship that to users, either in uBlock Origin or in the Brave browser itself.
Pete: [00:56:57] Yeah, that's exactly right. And most of that stuff already exists through fantastic tools that people like Google have made. Puppeteer is a really fantastic system from Google that allows you to automate browsers, interact with sites, and understand what browsers are doing.
I mean, it's phenomenal, but it also doesn't answer all your questions. It's very difficult, using Puppeteer or any system, to understand that this script modified this file, and that file then requested this image, and that image, whatever, these complicated chains of interaction.
That's extremely difficult to understand online in Puppeteer, or particularly after the fact, just looking at the end result of the page. And so PageGraph is this system that allows us to, with extremely high fidelity, trace every single one of these operations in the page and then stitch them together in a graph, in the sense of edges and nodes, not a plot.
Jeremy: [00:57:49] Yeah, I think that's really interesting, because I know in one of your other papers or presentations you talked about EasyList, which is the list of trackers and ads that uBlock Origin and a lot of other systems use to decide what to block. And that sounds like a very time-intensive process, where you have all these different people visiting sites themselves and figuring out, oh, these are the things that should be blocked.
Whereas with your research now, it would be more like we could have the computer go and browse the internet for us, figure out what needs to be blocked, and save a lot of time in the future in terms of figuring out what to block.
Pete: [00:58:33] Yeah, I think that's true, although with two complications. First I want to say that EasyList is a fantastic project, and there's a bunch of related child projects: there's EasyList, there's EasyPrivacy, there's a bunch of region-specific EasyLists, that sort of thing. And one of the core maintainers of EasyList, who goes by the online handle Fanboy, is part of our team at Brave.
He's fantastic. He's a full-time Brave employee, and his job at Brave is to maintain filter lists, both for Brave and to benefit the larger community. And so things like EasyList are, on one hand, phenomenal. I think people just completely underappreciate it: there are four core maintainers of EasyList, and without these four people doing the things they do, the web would be an infinitely more miserable place. The fact that the web hangs on the evening, after-hours jobs these people have, at least until recently, when they started being supported commercially, is totally underappreciated and fantastic.
Those lists are also deeply imperfect. They're full of heuristics, like the /ad/ example. There's lots of stuff that gets broken, and there's lots of, in quotation marks, dead weight: rules that were useful five years ago, but it's now very difficult to know whether they're still useful, given the size of the web.
And so it tends to just amass rules over time. None of that is a criticism of the maintainers, who are fantastic, or of the community around them that contributes to the lists; it's just the nature of the beast. Brave's approach, and some other researchers' approach, has been: can we use these labels that these people have generated as high-confidence data to start reasoning about the rest of the web?
So it wouldn't be a replacement for EasyList. You'd still need some human in the loop somewhere to make some of these assessments, but can we force-multiply what that person is able to do through automation, or machine learning, or different types of tooling?
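The kind of heuristic rule discussed here, like the /ad/ example, can be sketched with a trivial substring matcher. This is illustrative only; real filter engines such as uBlock Origin's support anchors, wildcards, domain options, and exception rules:

```javascript
// Sketch of filter-list-style URL matching and how it false-positives.
// Patterns here are made up for illustration, not taken from EasyList.
const blockRules = ["/ad/", "/analytics/", "tracker"];

function shouldBlock(url) {
  return blockRules.some((pattern) => url.includes(pattern));
}

shouldBlock("https://cdn.example.com/ad/banner.js");      // true: an actual ad
shouldBlock("https://example.com/ad/press-release.html"); // true: a false positive
shouldBlock("https://example.com/app.js");                // false: left alone
```

The second URL shows the failure mode Pete describes: benign content that merely lives under an /ad/ path gets blocked, and if the page depends on it, the page breaks.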
Jeremy: [01:00:24] Yeah, that makes a lot of sense. I thought it was really surprising how few people were maintaining such a gigantic list. I think you had said there were something like 300,000 entries, or I don't remember how many entries were on EasyList, but it...
Pete: [01:00:41] I think it's around 75,000, though I haven't looked recently. I know that Ryan's been doing some cleanup, but it's close to a hundred thousand in just EasyList; I think it's 70-something thousand. And then there's EasyPrivacy, and a number of other lists too. So I couldn't tell you the concatenated size, but large, very, very large.
Jeremy: [01:00:58] So you've been centered on the privacy side and the tracking side, and I wonder, in your work, if you had any visibility into the people who want all these things to happen, like the advertisers that want to be able to do the tracking. Has this sort of tracking actually been really effective for them? And on the flip side, I wonder how much of these ads are even being seen by real people. Could there be ad fraud going on, in the sense that computers are just looking at these ads and it's not really us looking at them?
Pete: [01:01:37] Yeah, we're now stepping somewhat out of my area of, in quadruple ironic quotation marks, expertise. But I can share what I know, or my impression from doing what I do. Which is that, yeah, absolutely, fraud is completely endemic.
To the degree that people have no idea how much there is. The numbers I've seen for online ad fraud are anywhere from 10 to 50%. These are not numbers you should hold me to, but simply to understand the magnitude of the problem: enormous. And the number of middle players in these markets makes it extremely difficult for any one party to understand what's going on.
There's a phrase that gets thrown around called the Lumascape. After this call I can try to find you an image, but it's this kind of 18-step-deep flow chart of how advertising markets work. And that was five or six or seven years ago, when that image was made.
But yeah, they're extremely dense, and you wouldn't recognize the names of the vast majority of players in these markets; nobody would, unless you were an employee of that company. So yeah, ad fraud is an enormous problem, and it doesn't seem like it's going to get better anytime soon.
This system seems like it's definitely on its way out and kind of getting worse. One thing that's really neat about Brave is that a number of the people who work at Brave have histories in ad-tech markets and have playfully said they're repenting: their work at Brave is their apology for what they did earlier.
One person who works in BizDev, Luke, is just a phenomenal dude and incredible at what he does, but he used to do this kind of stuff, helping to build tracking systems and understanding how they work. Johnny Ryan is somebody who does policy work at Brave too.
He used to work at PageFair. He talks a lot to enforcers, people on the political side who do CCPA- and GDPR-kind of things, to make sure that regulators are actually enforcing them. And his sense is that the amount of fraud and the amount of tracking is just unimaginable.
So yeah, the problem is well established. In terms of whether it's actually profitable, I'm sure that's very deeply debated. I know Google has some numbers saying that if you remove the behavioral component from tracking and do just contextual ads,
so ads that know where they appear but not who's looking at them, their numbers suggest the profits drop by something like 50%. I don't remember the exact numbers, but something of that magnitude. I know some people are extremely skeptical of those numbers, and of course Google is not an unbiased actor, but those are the numbers they've shared.
And on the other hand, I know there are numbers that get pointed to saying that the amount gained by marketers and people placing ads is negligible to negative when you remove the behavioral component, because there's so much fraud in the market that behavioral tracking actually ends up having a negative return.
So all that is to say: I deeply don't know. I know that the system relies on things that seem abhorrent to me, but there's a diversity of opinions on whether it's actually useful, or useful for what it claims to do.
Jeremy: [01:04:45] From site to site, the effectiveness of the ads you see, how relevant they are, can vary really wildly, right? And we're never really sure why certain ads are being shown to us. The example that a lot of people give is on Amazon, right?
Where you buy something, and then all of the suggested items are for the thing you already bought, and people joke, oh, this targeting isn't very good. But on the other hand, you have platforms like Instagram, where I've heard the advertisements are actually very effective.
They tend to show people things they actually might be interested in buying, and people actually go through and click. But it's interesting, because like I was saying, I don't know why some things seem effective and some things don't. It could be that they have tons of tracking information and still do a bad job of deciding what to show you.
Pete: [01:05:44] I have the same uncertainty about this stuff. I shouldn't hazard a guess; I honestly don't know. The usefulness of these things, I'm really dubious, or really uncertain, about. I doubt it, but I couldn't say confidently that it's definitely not the case.
And I've heard the same kinds of success stories, and the same kinds of catastrophe stories too. Two things here. One is, and this might be of interest, there's this kind of famous story, not of the success of tracking, but of the harm of it. I can send you a link to the story if it's of interest, but there's a famous case of a family getting advertisements from Target.
These are paper advertisements from Target. So the family starts getting advertisements sent to them for prenatal kinds of things, cribs, this kind of thing. And the father doesn't understand why this is happening; the parents don't understand what's happening.
It turns out that the daughter is pregnant and has been looking up information about how to care for the expected child, and advertisers knew it before the rest of the family did. So I guess that's a story where maybe it was effective, but also morally reprehensible.
And then the other thing I wanted to say: this is maybe a chance to describe how Brave does this differently than everybody else. I think Brave does two things differently. One is that there is no tracking: no information about you or your browsing ever leaves the device.
And so this has two benefits. One is that your device is going to have a lot more information about you than any third party, because it sees every website you visit, so it can do a better job of understanding what might actually be useful to you. And second is that Brave flips the incentive structure.
Right now on the web, the vast majority of ads are not going to be of interest to you, and all the ads come with these horrible side effects: hurting your performance, violating your privacy, carrying the risk of malware, et cetera, et cetera. Nobody wants to look at them; that's why ad blockers are popular.
Brave's approach is different. We'll pay you to look at ads; Brave incentivizes you to look at ads. It gives you a reason to look at ads, it gives marketers a reason to prioritize your attention, and it breaks the privacy and performance harms and the security risk.
And it arguably can provide much better ads than some tracking-based third party does. So I think there's something clever about what Brendan and Brian Bondy came up with, in terms of the way Brave goes about these things compared to how other marketers have.
Jeremy: [01:08:05] We've been talking about how advertisers are using tracking to hopefully show you something they think you'll be interested in, right? And a lot of the research you're doing is to try to prevent a lot of that tracking.
So if you do that, and you show someone banner ads, how are you going to ensure those ads are relevant to the person when you can't track them?
Pete: [01:08:29] Two things. One is that Brave never puts an ad in the page. Whenever you see an ad through Brave, it's very clearly not related to the page; it's in a notification, to make sure we're not putting ads against publishers who don't want them, and for a whole bunch of other reasons, to prevent that kind of brand confusion and other nasty side effects. And Brave doesn't track you, in the sense that your information never leaves your device. Even if we wanted to, we couldn't learn anything about our users in any capacity like that.
No bits hit our servers that describe your browsing behavior. But on the device, the device is constantly learning and saying: oh, it looks like you're looking at shoes, looks like you're looking at cryptocurrency, looks like you're looking at airline flights, or whatever it may be.
And so the device has a great deal of information that might be able to say, maybe you would like to see an ad about shoes, or maybe you'd like to see an ad about vacations, or whatever it may be. And so it's not tracking in the sense that nobody's looking at you.
It's your own device seeing what your device already sees, but it does have the kinds of information that might let it actually show you stuff you might want to see. The other thing is, you mentioned users understanding why they're getting the ads they're getting, and being able to control it.
I think this is a totally underappreciated concern in almost all of machine learning, where you have these extremely complex, deeply nested structures that arrive at decisions that are completely opaque, even to machine learning experts, let alone to typical internet users.
People's lives are being guided by these unauditable black boxes. Brave's commitment is: we are committed to allowing people to understand and see the model, to edit the model, to partition the model, to add in or remove certain interests.
We're not at a threshold yet where it makes sense to do that, but it is a commitment Brave has made; it is absolutely in the plans. Black boxes like that terrify me, and Brave is not going to become one of them.
Jeremy: [01:10:29] And you had mentioned how Brave, the browser, is not going to add ads to a site that doesn't have them. Does that mean that sites that do show ads would have some relationship with Brave, where they say we want to show ads in Brave, and that's what has to happen in order for advertising to show?
Pete: [01:10:50] Right now there are a couple different types of ads that Brave serves. The main one is notification ads. By default you see zero ads, you don't see anything, but if you say, yes, I want to start getting paid to look at ads, you can say show me between one and five ads an hour, and that many times an hour you'll get a notification,
like the notification you get when you receive an email or whatever, and it'll say, maybe you're interested in shoes, or whatever it might say. That's the predominant way you see ads in Brave products. You also sometimes see, and get compensated for, ads on the new tab page: if you open up a new tab and you haven't disabled it, you may see an ad there, and you similarly get compensated for that.
I should say that for Brave ads, the user gets 70% and Brave gets 30%. So it's like the inverse of the Apple app store, right? Then there's a third tier of ad that Brave considered, but does not ship and is working through the details on, if we do ship it, called publisher ads. That's where a website could affirmatively say, yes, Brave, please add ads in these locations on my website. We don't do that now. If we ever did do it, it would be only with the affirmative consent of the website.
But there's a bunch of difficulties there that have kept us from shipping it. It's mostly privacy concerns: we don't want the site hosting the ad to be able to learn about the user based on the ad that Brave places in the site. Like, that would be a way of just really enabling a lot of the same tracking that's happening right now.
So we do not do publisher ads right now. We are thinking through ways that we might be able to do it in the future in a privacy preserving way. But right now the only ads that get shown are notification ads and new tab page ads.
Jeremy: [01:12:28] I see. So, the new tab page would be something very similar to when you create a new tab in Firefox and they have like a list of suggested sites. Something like that.
Pete: [01:12:39] Yeah. So right now... like, Chipotle advertised with Brave for a while. And so I think it was like one out of three times you opened up the new tab page, it would have, you know, an image of a delicious burrito or whatever in the lower right hand corner. It would say Chipotle or whatever. Attractive images, that kind of thing.
It's not executing code. It's not doing animations. It looks attractive, but it's just an image. And if you don't like it, you can turn it off. But if you like it, then Brave will pay you to look at it.
Jeremy: [01:13:01] Interesting. Yeah, it sort of reminds me a little bit of, back in the past, there were desktop applications that people could install and, I think they paid you. I don't remember if it was to click on the ads or, or just to see the ads. and this sort of sounds like a bit of, a modern kind of version of that.
Pete: [01:13:20] I think that's true, although a bunch of things distinguish it. One is that, like, the BonziBuddys of the world ended up becoming malware vectors. Two is that they didn't have the kind of information that would be useful to actually send you the kinds of ads you were interested in.
They just pulled from a stock catalog, and they were extremely obtrusive. I'm not aware of any that paid you actual money, or like a significant amount of money. I mean, they might've existed, I couldn't say that they didn't. One of the things that I want to say, though, that I think is exciting to me about the Brave model is there is a sincere, honest question about, like, how does content on the web get funded? And advertising is not the only part, but it's a significant part of how content on the web is funded currently.
Brave's approach is different. Right now, if you enable ads, by default that money goes to the websites you visit. I have mine configured to show me five ads an hour. And at the end of every month, Brave keeps track, on the browser, not on the server... the browser keeps track of the distribution of your viewing time across the sites that you visit.
And if you don't change anything, by default Brave will just send your ad earnings to those sites. The sites that are involved, that are verified sites, they get revenue very similar to, if not greater than, the revenue that they would get for you looking at an ad that was in an iframe on their page, but without the privacy harm, without all the nasty side effects.
So I think this can be a really powerful way of funding the open web, but without all the horrible stuff that comes with it currently.
Jeremy: [01:14:49] Yeah. I mean, I think what you're seeing with a lot of news publications, and even just people doing blogs and things like that, is a lot of people are moving towards a subscription model, right? Where you pay me five bucks a month and you can see my articles. And I think what's tricky is that, you know, the web is so broad, right? You visit so many different sites a day, and so it's hard to imagine paying a monthly fee for every single site you visit. So I'll be interested to see how that kind of model works out in the future.
Pete: [01:15:26] Yeah. And I should say too that it's all in exploratory stages, but an idea that Brave is considering and may prototype at some point is: can we have some sort of system where, if someone opts into the Brave system, then Brave can be the way that you just automatically pass through those paywalls?
Using the cryptocurrency you could pay with in Brave as a way of saying, I don't want to have a subscription to a million different sites. If I'm in Brave, then I just automatically do these invisible microtransactions to fund the sites that I'm viewing. I think there's something compelling about that.
Jeremy: [01:16:02] Yeah, for sure. everybody loves complaining about paywalls.
Pete: [01:16:07] Yeah, no joke.
Jeremy: [01:16:09] Cool. Well, I think that's a good place to start wrapping up. Is there anything else you think I should have asked, or that you wanted to mention?
Pete: [01:16:16] Nothing else comes to mind. I think this has been really enjoyable. Well, actually, I can say two things. One is that, if any of your listeners are interested in privacy and web standards, it's a forum that could absolutely use more voices and more people, a greater diversity of opinion than people who work at browser vendors or ad tech companies.
And so if any of your listeners are interested in those sorts of things, I would encourage them to get involved. They can send me a message, or they can just go to the issues themselves, but it would be fantastic to have more people involved there. And the second one is, I imagine that a large portion of your listeners are people who write software for a living, or who are considering careers in writing software for a living.
This is a little bit soapboxy, but that's a powerful thing and a privileged position for many people. And it's worth thinking really, really well through the morality of the kinds of causes you're spending your nine to five supporting.
Jeremy: [01:17:06] And where can people, if they want to see what you're working on currently, where can they check out what you're doing?
Pete: [01:17:12] Ah, so I have a website called peteresnyder.com where I have my publications and my research interests. A lot of the publications I work on at Brave get published at brave/research. I write pretty regularly for the Brave blog about new privacy features that are coming out in Brave.
And I will be writing an additional set of articles on the Brave blog about standards work and the direction that privacy interests are taking in standards. Also, I'm on Twitter at PES10k.
Jeremy: [01:17:41] Cool. I think you gave everyone a lot to think about in terms of privacy and in terms of what's going on in their browsers. So thank you so much for talking to me today.
Pete: [01:17:50] Thank you very much, Jeremy. This has been super fun. I appreciate it.
Jeremy: [01:17:53] Thank you for listening to my chat with Pete. You can get show notes and a transcript for this episode at softwaresessions.com. The music in this episode was by Crystal Cola.
If you enjoyed the show, make sure to tell someone else about it. All right, I'll see you next time.
Vlad is a Pluralsight course creator and the author of Unit Testing: Principles, Practices, and Patterns.
We discuss:
This episode originally aired on Software Engineering Radio.
Related Links
Transcript
You can help edit this transcript on GitHub.
Jeremy: [00:00:05] Hey, this is Jeremy Jung for Software Engineering Radio. Today I'm talking with Vladimir Khorikov. Vladimir is the author of the book Unit Testing: Principles, Practices, and Patterns. He's a Microsoft MVP, and he's the author of many Pluralsight courses, including Applying Functional Principles in C#. Today we're going to be talking about functional programming in enterprise applications. Vladimir, welcome to Software Engineering Radio.
Vladimir: [00:00:28] Thank you for having me.
Jeremy: [00:00:29] The first thing I want to talk about is sort of what functional programming means to you, because it means different things to different people. To you, what are the core principles of functional programming?
Vladimir: [00:00:41] If I were to describe functional programming in just a couple of words, I would say that functional programming is programming without hidden inputs and outputs. And that's basically it. There are several examples of those. So the most prevalent example of a hidden output is mutation.
So let's say that you have a method that takes in some integer and then increments that integer by one. What it can do is it can return that incremented integer back as the return value, but it can also mutate some global state with that integer. And by the way, by hidden, I mean that this information is not present in the method's signature.
So it's not present in the method's arguments, and it is not present in the method's return value. So in this example, when you have this increment method, if it returns a value, then it communicates what it does pretty clearly. So it is honest about what it does. But if it instead mutates global state, then this would be an example of a hidden output, because that output is not present in the method signature.
And to understand what this method does, you have to drill down into that method and see what's actually going on, because this information is not present in the signature itself. So that would be an example of a hidden output. A hidden input is similar to that. Instead of taking that integer as a parameter, as an argument, the method can also refer to some global state. So, for example, some static property or field, or it can refer to some external service to request that integer, and then increment it and put it into some global state. So that would be an example of a hidden input.
Also, reaching out to external systems such as the database or APIs would be a hidden input. And the simple DateTime.Now would also be an example of a hidden input, because that input always changes. It basically refers to the system's low-level API, for example the Windows API or the Linux API, to get this input value. So that's another example of a hidden input. Another example of a hidden output would be something like exceptions. Exceptions are a hidden output because, when you throw an exception and that exception is caught somewhere up the call stack, this exception also is not present in the method's signature.
And you basically introduce another hidden pathway for your program flow that is not present in the method signature. So these are common examples of hidden inputs and outputs, and functional programming is about avoiding those hidden inputs and outputs. A mathematical function, a pure function, would be something that accepts a value as an argument, returns a value, and doesn't do anything other than that.
Jeremy: [00:04:06] Okay, so let's break down a few of those. The first thing is hidden outputs: if you pass something into a function and the thing that you pass in gets changed, then that in effect is a hidden output, because there is no way of telling just from the method signature whether that behavior is possible.
Vladimir: [00:04:30] Correct. Yes.
Jeremy: [00:04:31] And so you're saying that the alternative to that is to make sure that the function is pure so that when you pass something in, if you are going to make a change, it would not be to what you passed in, but it would be something that you're returning back.
Vladimir: [00:04:50] Yeah. So instead of mutating the state of some existing object, what you need to do instead, in functional programming is you need to create a new object with the required properties. So instead of mutating the object that is passed in, you need to create a new object with new properties and return it back.
And with example, with a number increment that that's basically it, because when you increment the number by one and return it back, you're not, Changing the input parameter because it's a constant, you cannot change it. What you do instead is you create another number and return it back.
Jeremy: [00:05:26] And when we think about objects that we pass in, or we think about collections, a lot of times the objects and collections we work with are mutable, right? Like, we can have a list type in C#, for example, and we may want to add something to the list or remove something from the list. If we are instead creating a new list every time we want to change something, what are the performance implications of that?
Vladimir: [00:05:55] Well, yeah, definitely. So there are trade offs to functional programming. And one of the most common tradeoffs to, any immutability is this trade off of, Always creating new objects instead of mutating new ones. And, that's actually the reason why object oriented programming, has become so, so popular in the past.
Because if, you know, functional programming actually was introduced before object oriented programming. but why it didn't take off is because, computers back then were not as powerful as now. And so it was very costly to do functional programming with all those memory allocations, with all those new object creation.
So it was very costly to do so. And what we had had instead is we started to operate at the lower right level, of our programs. And we started to think of, , in terms of, memory allocations. So, but now we're kind of getting back to the roots and starting to, to do more of what we did back then.
There are definitely trade offs here. And if the performance requirements for your application are strict, then there are some limitations, and probably you will not be able to implement some functional programming approaches.
But in most line-of-business applications, that's not something you need to worry about. So if you write some framework, for example ASP.NET, not an application, but the server itself, Kestrel, then you do need to worry about that. But in most enterprise level applications you don't, so performance is not the biggest concern.
The biggest concern is usually the complexity and the uncontrollable growth of complexity. And what functional programming allows you to do is reduce that complexity, at the expense of code that's maybe not as performant as it could be otherwise.
Jeremy: [00:07:56] So would you say that in the average application that the developer should default to making things immutable, is that a reasonable default for most developers?
Vladimir: [00:08:09] I would say so, yes. If it is possible, then you definitely need to default to creating immutable classes, immutable objects by default, it is not always possible and one of the limitations why is in object oriented languages it's pretty hard to create new objects based on the existing objects. So, for example, if you take F#, there is a really nice language feature where you, where you can take data structure or an object, and, create a new object based on that, existing object, but with the mutation, of, with the addition of new properties to that object. So you can say, for example, old object with, some property equals a new value and some other proper property equals other new value.
And what it will do it will not mutate the existing object, but it will create a new object with those two properties changed and then, but all the old properties they will remain the same. And this is a really nice feature that helps you to default to immutability. Unfortunately, in object oriented languages like C#, we don't have such features and so it's not always feasible to do so.
What I recommend you do instead is, if you have some value or a simple class, you can wrap it into a value object, which is immutable. But for all other classes, such as entities, for example a user or a company, that is usually something that you need to mutate; that is an example of an entity.
It has its own inherent identity. For such classes it's usually not feasible to make them immutable, but what you can do is you can keep them mutable, but put as much logic as possible into those immutable value objects. And this way you can keep the separation between complexity and mutability.
So your objects will be either mutable or complex, but not both. You keep the separation between complexity and mutability, because when you combine the two, you start to have these problems with ever increasing, unmanageable complexity.
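The F# copy-and-update expression Vladimir describes (`{ oldObject with Property = newValue }`) has a close analogue in Python's frozen dataclasses with `dataclasses.replace`, shown here as a rough sketch of the idea:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Position:
    x: float
    y: float

old = Position(x=1.0, y=2.0)

# "Copy-and-update": build a new object with one property changed;
# the original is left untouched.
new = replace(old, x=5.0)

assert old == Position(1.0, 2.0)   # original unchanged
assert new == Position(5.0, 2.0)   # changed copy
```

The point is the same in either language: instead of mutating in place, you derive a new immutable value from an old one.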
Jeremy: [00:10:28] And you were referring to the concept of entities and value objects. Is that right?
Vladimir: [00:10:34] Value object, yes.
Jeremy: [00:10:36] And so what, what is the distinction between those two? You were saying that a value object should be immutable and an entity could be mutable. Like how do you decide which is which is, which?
Vladimir: [00:10:48] Yeah. these two concepts are, from the domain driven design, but they are actually applicable to an application in which even if you don't follow, domain driven design principles, it, it is handy to refer to your objects in this way anyway. The main distinction between them is that an entity is something that is trackable, by your application has an internal, inherent identity.
An example I often give is: let's say you have a dollar bill, so you have a Money class, and this money represents a dollar bill. This dollar bill would be a value object in most systems, because in most cases you don't really care if you have the exact same dollar bill as you had before.
So let's say if you give someone a $1 bill and they give you back another $1 bill, you don't care if it was the same piece of paper as before, because for you they are interchangeable. And that is one of the most important properties of value objects: they are interchangeable. But that also depends on the domain model, on your context. For example, if you create a system that tracks those dollar bills, dollar notes, then in this case those bills, those pieces of paper, become entities, because you do care about each individual piece of paper. So, for example, you can have the number on that dollar bill as the ID of your entity, and then you track where it goes throughout its lifetime.
So it becomes an entity because it starts to have its own identity, and you cannot exchange one $1 bill for another, because you do care about the history.
Jeremy: [00:12:42] Does this mean that pretty much anything that's an entity would also exist in some kind of permanent store, like a database?
Vladimir: [00:12:50] Yes, exactly. That's because you as I said you care about the history of that entity and what it usually manifests in as you need to persist that entity in the database. And then look after the changes in this entity.
Jeremy: [00:13:05] And you were kind of referring to the fact that these entities could act as wrappers to value objects. So I want to kind of give an example of, let's say I have a a ride sharing company, and I'm keeping track of my, my fleet of vehicles. So I have cars that are driving around the city and they're all reporting back to me, their position.
And each of my cars has an ID which parts of my, data would be an entity and which part would be a value object?
Vladimir: [00:13:38] Yeah. That's a good example of where you can apply this entity value, object separation. So the car itself would be an entity because you need to track it. That's, the primary indication that it is an entity. It has an ID, and the position itself would be a value object because you can replace it with another object, with the same content and two positions of the same type and content, they are interchangeable for you. So yeah, that's a good example.
Jeremy: [00:14:08] And so the individual updates, they would have the ID of the vehicle and the actual position. That could be a value object. And as I'm receiving multiple updates from each of my vehicles, it's reusing the same ID. And, in my database, I might want to keep a history of, you know, all the locations that my car went.
So with those historical positions, would those be their own entity, or what would those be?
Vladimir: [00:14:41] So the positions themselves would not be their own entity in this case, what would be an entity is the historical record. So in this case, you wouldn't have the position itself as an entity with an ID, but you would still have some, let's say, vehicle history record or something like that, as another entity that would, also contain the position as a value object.
So, you will have this kind of nesting here where you still have the same value object, but the entity would be a different entity.
Jeremy: [00:15:15] I see. And so the reason that we have the value object is that it sort of tells our system that this concept of a location is identical, whether it's in the context of being an update, I'm getting from a vehicle versus, a historical position. They're really both the same concept, so I can use the same value object for both.
Vladimir: [00:15:40] Yes, exactly. it is much easier to reuse this position value object between these two concepts. And also, another reason why you wouldn't want to introduce a value object for these two positions is because it is much easier to keep consistency, in this way. So let's say you have other properties in your vehicle other than the position itself.
So yeah. Let's say vehicle has, a license plate. And let's, let's say it has two numbers of its license plate, and it also has, two properties that display the position of the vehicle. So X coordinate. And. Y coordinate let's say. Why you would want to introduce value objects is to reduce complexity. Because when you have, those four properties as just four properties, then the number of permutations between them is higher. Then it would if you group, those properties into separate value objects.
So if you group the two properties that belong to the license plate into a value object, and you also group, the two coordinates into the position value object, then you will have only two properties inside that vehicle. And the number of permutations between them is much lower. It's just two, whereas the number of fermentation between the four objects is going to be well, mu much larger, much larger number (laughs).
So, that's a good way to think of complexity of your system when you reduce, the number of properties that you need to keep in mind your software automatically becomes much simpler because it becomes much easier to keep the consistency because, for example, when you accept a position, what you need to do is you need to just, check the correctness of that position on its own without It's connection to other properties of the vehicle and the same for the license plate, you don't need to validate that license plate against the, the position coordinates. So yeah, the validation becomes simpler and just maintenance overall becomes much simpler too.
Jeremy: [00:17:46] And so it sounds like rather than having vehicle update type, for example, and making that object be responsible for it, the validations of the license plate and the validations of the position information. Instead, you break out those concepts into their own objects so that those objects, when you create them, they validate whether or not it's valid.
And so as long as you can successfully create one. You pass it into the constructor for the vehicle update entity and you, you know that they're correct because, they were already validated before you passed them in.
Vladimir: [00:18:26] Yes, exactly, and that is one of the benefits of functional programming is that when you keep your objects immutable, your value objects immutable, you only need to validate those objects once. When you create them after that, you are, you can be sure that those objects remain in the consistent, in the correct state, so you don't need to validate them afterwards.
Jeremy: [00:18:49] Do you find that in development code wise, is it easier to reason about what's happening when you're creating these objects and then creating the entity and then passing these objects into the entity versus having a larger constructor for the entity?
Vladimir: [00:19:08] Well, it depends on the use case. So, yeah, usually what I like to do is I like to, Do it hierarchically. So you first create, lower level objects such as, as you said, value objects. Then if you have some other value object that consists of those lower level of value objects, you create that value object and then you pass that you, you kind of create that structure where you go bottom up from the lower level objects to the higher level objects, and you create them sequentially one by one, and on the top at the top of that pyramid, you have the entity itself, which you can create just by passing those already validated value objects into that entities constructor.
And you don't need to do much else. So the entity itself becomes a much simpler to maintain.
Jeremy: [00:20:01] And these examples we've been giving the value objects, they've been able to validate themselves. Like for example, the position, there's only a fixed set of numbers that are valid for the position, and so we could validate that without talking to an external store or a database or anything like that.
How about when you have a case where. To see if something is valid. You need to talk to an external API or talk to a database. Like for example, if there was some kind of, driver registration or like a license that's associated with the car and you needed to talk to, some kind of. State API or, city information to find out if that car update is valid, where, where would that exist in your application?
Vladimir: [00:20:50] Yeah. So there is kind of debate into where you should put this logic. I strongly recommend not putting it inside your domain model because your domain model shouldn't talk to any external systems. So you should make that domain model as functionally pure as possible. Let me step back for a second and, describe what I mean by that.
So what you want to do in your application, especially if you try to, adhere to functional programming, you don't actually want to make all your code immutable because that's usually impossible. when you create an application, you do want that application to mutate some state because otherwise that application would be useless.
For example, if you create a user or if you update the vehicle information, you do want to, you do want that information to be eventually updated. So the vehicle record and the database, it should be updated. What do you need to strive for is the so-called functional architecture. And this architecture is, where you have some sort of a sandwich between, where you first gather the information required for the business operation.
Then you delegate, that, information. So you pass that information. And delegate the work to the domain model. And then when the domain model, completes its work, you persist the results of that work to the database. And so what you can do here is you can make your domain model as purely functional as possible, and this way you kind of push the mutable operations to the edges of your system, to the edges of your business operation.
Because as I said, you cannot, skip those mutable operations altogether, but you can kind of, work around them. and with this approach, with this functional architecture, what you achieve is you achieve the separation between, the two important concerns. So the domain model is responsible for your, domain logic. And, I like to call it, immutable and the rest of your system becomes mutable shell. So it is a dumb shell that is responsible for communicating with external systems. primarily. And so, the two responsibilities here are the domain modeling and orchestration and you don't want to mix them together because, when mixed together, they overcomplicate your code.
You know, you don't, you don't want to do that because, the code becomes much harder to maintain. So, yeah, so this is the functional architecture, this kind of sandwich where the beginning at the top of your business operation, it talks to external systems.
Then it passes control to your domain model, which is as purely functional as possible. And then at the bottom, all the decisions made by the domain model and communicates it to external systems, including your database.
In some cases, the flow of your domain logic itself depends on what kind of response you get from external systems. So in your example, this response would be whether or not the vehicle with this number already exists in the database. And if it does, you cannot register a new vehicle with the same number.
What you can do is you can delegate some decisions to the controller, for which you have to reach out to external systems such as the database. The alternative would be to delegate this responsibility for communication with the database to the domain model itself. This way you would keep the controller simple, but you would make the domain model impure, and that is, in my opinion, the worse option. Even though you kind of keep the controller simple and you kind of keep all the domain logic inside the domain model, it's still not the best approach, because you want to separate the domain model from this responsibility of communicating with external systems. When these two responsibilities are combined together, that's when you start to have an overcomplicated code base.
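The impure-pure-impure sandwich Vladimir describes can be sketched as follows. This is an illustrative Python outline, not the episode's own code; the repository type and its method names are invented:

```python
# The "sandwich": impure gather -> pure decision -> impure persist.

def compute_new_position(position, dx, dy):
    # Pure domain logic: no I/O, result depends only on its arguments.
    return (position[0] + dx, position[1] + dy)

def handle_move_request(repo, vehicle_id, dx, dy):
    # 1. Impure: gather the inputs from the outside world.
    position = repo.load_position(vehicle_id)
    # 2. Pure: delegate the decision to the domain model.
    new_position = compute_new_position(position, dx, dy)
    # 3. Impure: persist the decision made by the domain model.
    repo.save_position(vehicle_id, new_position)
    return new_position

class InMemoryRepo:
    """Stand-in for a real database, hypothetical for this sketch."""
    def __init__(self):
        self.data = {"car-42": (0.0, 0.0)}
    def load_position(self, vehicle_id):
        return self.data[vehicle_id]
    def save_position(self, vehicle_id, position):
        self.data[vehicle_id] = position

repo = InMemoryRepo()
assert handle_move_request(repo, "car-42", 3.0, 4.0) == (3.0, 4.0)
```

Because `compute_new_position` is pure, it can be tested with plain values, while all the I/O is squeezed into the thin `handle_move_request` shell.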
Jeremy: [00:25:09] If I understand correctly, it sounds like anything that has to do with external systems, even if it's a part of validation, those calls should be made at the outside layer. Like for example, in your application controller or in an object that's receiving messages from an external system.
And that is where you would make the call to check in your database to see, for example, if this was an ID that already existed, or if, you know, we needed to talk to some city's registration system to see if this vehicle is licensed. We would do all of that outside of the value objects. We would do that more in our controller, or in some area outside of all of the internal domain objects.
Vladimir: [00:25:59] Yes, that's correct. So, your, domain model you cannot delegate this responsibility of maintaining consistency that spans across the whole database. So you only need to delegate to our domain model. the consistency requirements that span, the objects themselves. So ideally it should stay within a single entity or, in domain driven design terms an aggregate an aggregate is a group of related entities.
If you're consistency requirements spans more than one aggregate or one entity, then yes, it's usually, it should be attributed to the controller itself. So, and, they check for uniqueness, let's say user, email, uniqueness or vehicle number uniqueness. That's an example of such a validation. Yes.
Jeremy: [00:26:49] Let's say you have a value object that is a, vehicle and before we checked with the database, we didn't know if it was a valid new vehicle or not because we didn't know if it was already existing or if it's truly new. And at the controller we make a request to the database and we find out that this vehicle doesn't already exist yet, so it's a valid request to create a new vehicle.
Would we then need to create a new type that says like, this is a valid, vehicle creation request, because if we hadn't done that check yet, then we, we wouldn't actually know if it was valid yet. I don't know if that makes sense?
Vladimir: [00:27:32] Yeah, it does. So what you're talking about is you want to have, let's say, an idempotent request that would either create a new vehicle or update an existing one if it already exists. Correct?
Jeremy: [00:27:46] Sure. Yeah.
Vladimir: [00:27:47] Yeah, I wouldn't create a new request for that. So if it is a business requirement to do that in one go, which it sounds like it is here, then I would just attribute this logic to the controller.
So first check if a vehicle with this ID exists, and then go from there: either update its position using this value object or create a new vehicle using the same value object. In this case, you will work with the same position value object anyway, because you will use it both for creation and for updating the vehicle position. So it's just a matter of what to do next once you see whether the vehicle exists or not.
Jeremy: [00:28:36] And so after we've completed this validation and we've created our value object, then when we persist it to our database, that would also be in the controller. Is that correct?
Vladimir: [00:28:47] Yes. The creation of the vehicle, that part would be the domain logic part. So if we are talking about the sandwich architecture again, the functional architecture, then the top part of the sandwich would be reaching out to the database to see if the vehicle exists.
The middle part, the domain model part, would be the creation of the new vehicle or the update of the existing vehicle. And then the bottom part would be saving that vehicle to the database. Yes.
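As a sketch of this impure-pure-impure sandwich (all class and method names here are hypothetical, not from a real code base), the controller could look roughly like this:

```csharp
public sealed record Position(double Latitude, double Longitude);

// Pure domain model: creating or moving a vehicle touches no I/O.
public sealed class Vehicle
{
    public string Id { get; }
    public Position Position { get; }

    private Vehicle(string id, Position position) { Id = id; Position = position; }

    public static Vehicle CreateNew(string id, Position position) => new(id, position);
    public Vehicle MoveTo(Position position) => new(Id, position);
}

// Assumed persistence abstraction.
public interface IVehicleRepository
{
    Vehicle FindById(string id);   // returns null when the vehicle is unknown
    void Save(Vehicle vehicle);
}

public sealed class VehicleController
{
    private readonly IVehicleRepository _repository;

    public VehicleController(IVehicleRepository repository) => _repository = repository;

    public void RegisterPosition(string vehicleId, Position position)
    {
        // Top slice (impure): reach out to the database.
        Vehicle existing = _repository.FindById(vehicleId);

        // Middle slice (pure): create a new vehicle or move the existing one.
        Vehicle vehicle = existing is null
            ? Vehicle.CreateNew(vehicleId, position)
            : existing.MoveTo(position);

        // Bottom slice (impure): persist the outcome.
        _repository.Save(vehicle);
    }
}
```

The controller stays thin: all database communication happens at the edges, and the decision in the middle is a pure function of what the top slice fetched.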
Jeremy: [00:29:22] And I want to kind of step back to the start of our conversation, where you were talking about hidden inputs and hidden outputs. One of the things you described as a hidden output is an exception, for example when an error occurs. You may not know that an exception is something that a method can produce. Walk us through why that's an issue, or if that is an issue.
Vladimir: [00:29:53] The issue here is not the exception per se, but how you use those exceptions. If you use exceptions for validation, then it does become an issue. What I'd like to see with regard to exceptions is that exceptions are for exceptional situations only. And what I mean by that is situations that you did not expect to happen in your application.
Validation is, by definition, an expected situation, because you do expect your clients or your users to enter incorrect values or send you incorrect data and so on. And so when you validate that data, you do expect it, at least sometimes, to be incorrect. What you can do is validate that data and then, if it is incorrect, throw an exception saying that this data is incorrect. But that has the drawback of this hidden output I mentioned earlier, where you create another path in the program flow that is not present in the method signature.
So let's say you have a void method that accepts some data and then throws an exception. You cannot be sure what it does, so you cannot be sure how you react to those exceptions, because it could be that the exceptions are caught in the method that calls this validate method, but it could also be several layers up the call stack. And so it becomes much harder to debug this application and much harder to understand what it does.
Whereas with something like a result type, what you can do is explicitly return the result of the validation from that method, and this way you will make the signature honest. An honest signature is something I also like to relate to purity: a purely functional method would have an honest signature, because it tells you explicitly what it does, what inputs it has and what outputs it produces. The drawback with the result type is that you often need to write more code, because without it you can basically just make several validations, call several validate methods, and then if something is wrong you will catch an exception in the catch part of your application. But with results, you have to process each output separately. So you have the first result, which you need to process, and the second one, and so on. Another issue here is that, at least in object-oriented programming languages like C#, it becomes simpler to omit that result, because there is nothing preventing you from just forgetting about it and ignoring it. And yeah, that could be an issue, and that is one of the trade-offs of this functional programming approach. But again, I think the benefits of this approach outweigh the costs, because you're becoming much more explicit about the values those methods return.
And it is much more maintainable in the long run. And by the way, this issue with ignorable outputs is only present in OOP languages like C#, for example. If you take F#, you cannot just ignore an output which is non-void, or non-unit in F# terms; you have to pipe it into a special function, ignore.
So you can always see that you are ignoring something, because you see that the return value is piped into that ignore function.
Jeremy: [00:33:44] For those who aren't familiar with result types, can you give a brief explanation of what they are?
Vladimir: [00:33:51] Yeah, sure. A result is an explicit representation of the outcome of some operation. Let's say you're performing some operation where you're trying to save something in an external system. Let's say that you want to update your profile on Facebook and you're using the Facebook API for that.
So you're doing some API calls to the Facebook API, and this operation obviously may fail. One of the ways to deal with that failure is, as I said before, to use exceptions, but that's not the best way to deal with it, because it is often not obvious where exactly you process those failures.
A better way would be to catch those potential exceptions from that API at the lowest level possible, and then wrap them into an explicit structure such as a result class. The result class is a simple structure that tells you whether or not the operation succeeded. So it has such fields as: is it a failure, or is it a success? And if the operation was supposed to return a value, then you can also make this result class generic, and if it is successful, it can contain a value with the result of the operation.
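A minimal sketch of such a result class in C# might look like the following (real libraries, such as Vladimir's own CSharpFunctionalExtensions, offer a richer API; this is only an illustration of the shape he describes):

```csharp
// Non-generic result: did the operation succeed, and if not, why?
public class Result
{
    public bool IsSuccess { get; }
    public bool IsFailure => !IsSuccess;
    public string Error { get; }

    protected Result(bool isSuccess, string error)
    {
        IsSuccess = isSuccess;
        Error = error;
    }

    public static Result Ok() => new Result(true, null);
    public static Result Fail(string error) => new Result(false, error);

    public static Result<T> Ok<T>(T value) => new Result<T>(true, value, null);
    public static Result<T> Fail<T>(string error) => new Result<T>(false, default, error);
}

// Generic variant that also carries the value of a successful operation.
public class Result<T> : Result
{
    public T Value { get; }

    internal Result(bool isSuccess, T value, string error)
        : base(isSuccess, error)
    {
        Value = value;
    }
}
```

A method signature like `Result<User> CreateUser(string email)` then states both possible outcomes explicitly, instead of hiding the failure path behind a thrown exception.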
Jeremy: [00:35:12] The benefit of using the result type is that your function signatures, they become very explicit in telling you that this call that you're going to make, it could succeed or it could fail. And this result type is going to tell you, whether it succeeds or fails and that way, you know, to write code to, to account for both cases.
Vladimir: [00:35:33] Exactly. Yes.
Jeremy: [00:35:34] One of the things about exceptions is that when an exception occurs, there's a lot of information that's kind of embedded with it, generally a full stack trace, for example. Whereas with a result type, you may not have any additional information on why something failed. How do you deal with that, or are there cases where you would say it does make sense to use an exception instead?
Vladimir: [00:35:59] Yeah, I'm not saying that you shouldn't use exceptions at all, because there are use cases for them. One of those use cases, well, actually the only use case, is where you have a situation that you don't know how to deal with. For example, if that Facebook API returns an error that you didn't expect.
So let's say that when you wrote the software, you expected some set of errors from the Facebook API, let's say that the user doesn't exist. But if it returns some other, obscure error, you don't necessarily know how to deal with that error.
And in this case, because you don't know how to deal with it, it is preferable to throw an exception. This would be an example of an unexpected situation for which exceptions are appropriate, because exceptions represent unexpected situations in code, and you shouldn't catch them.
You should only catch them at the topmost level of your call stack. And the only way you should react to them is to log what exactly happened and then basically crash the application. Or if it's some background process, you need to just restart it all over again.
Because otherwise, if you're trying to continue working after this exception took place, what you can run into is an inconsistent state, where your program entered some state where it's kind of still working, but it may become inconsistent and even save some data into the database, which will become much harder to deal with. You want to avoid that inconsistency as much as possible, and as soon as that inconsistency takes place, you want to stop everything and basically crash your application. That would be the only use case for exceptions. And in this case, you do still have the call stack, which you can log somewhere and deal with somehow later.
But if it is an expected situation, let's say the Facebook API returned that this user doesn't exist, then you do know how to deal with that, and you basically don't need the exception stack, because you can just process this error from Facebook, turn it into a result object, and then return back to the caller, and that caller can in turn show some friendly error message to the user.
Jeremy: [00:38:42] So basically, when you're working with external APIs, like the Facebook API, you may make an HTTP request, and maybe it times out, and the HTTP library that's built into C#, I believe it would throw an exception. And what you're saying is that you would know ahead of time that there may be times where my request times out or fails, so I'm going to catch this exception and then return a result type that explains what the failure was, rather than just throwing that exception and catching it somewhere else.
Vladimir: [00:39:19] Yes, but that depends on whether or not you know how to deal with it, even if you expect some timeout. Yeah, let's say that calls to the Facebook API may time out from time to time. So you need to see whether or not you can deal with those errors, because if you cannot, then even if you expect those situations to happen, you basically cannot do anything about them.
And so you need to throw an exception anyway, and that could be because, let's say, the Facebook API is essential for this operation and you cannot proceed without a response from Facebook. But if it is not essential, let's say a user updates their profile and you want to update their Facebook profile simultaneously as well, but if you cannot do that, it's still fine, you can proceed further.
So in this case, you can see that this Facebook call was a failure, but you know that it is not essential for this business operation, and you can just ignore that result and move on. Another example: let's say you write an ORM such as Entity Framework. In this ORM, the lack of a connection to the database would be an exceptional situation, because you cannot do anything about that.
And you don't know how the user of your library will react to that exception. So in this case, because you as a library writer do not know how to deal with that exception, you also need to throw an exception, and then the library user, such as yourself or myself, can decide whether or not this operation was essential for us and whether or not we can proceed with that exception or not.
So let's say that when you're trying to save something to the database, for example when you try to write a log entry to the database, it is preferable that the log is successful, but if it's not, it's not a big deal. And so in this case, you need to catch that exception that the library throws, transform it into a result class, and then process it up the call stack.
But if it is essential for your application, let's say you're saving not a log entry, but the user itself, then even if you know that this ORM can throw an exception, you cannot do anything about that, and so you shouldn't process that exception. You should basically allow it to pop up to the upper layers where it will be logged and the application will crash.
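What Jeremy described, catching an expected HTTP failure at the lowest level and turning it into an explicit result, could be sketched like this (the gateway class, the URL, and the endpoint are hypothetical; the Result record is a tiny local stand-in for the result class discussed above):

```csharp
using System.Net.Http;
using System.Threading.Tasks;

// Tiny local stand-in for the result class described earlier.
public sealed record Result(bool IsSuccess, string Error)
{
    public static Result Ok() => new(true, null);
    public static Result Fail(string error) => new(false, error);
}

// Hypothetical wrapper around a Facebook profile update.
public sealed class FacebookGateway
{
    private readonly HttpClient _client = new HttpClient();

    public async Task<Result> UpdateProfileAsync(string userId, string payload)
    {
        try
        {
            var content = new StringContent(payload);
            HttpResponseMessage response = await _client.PostAsync(
                $"https://graph.facebook.example/{userId}", content); // placeholder URL

            return response.IsSuccessStatusCode
                ? Result.Ok()
                : Result.Fail($"Facebook returned {(int)response.StatusCode}");
        }
        catch (HttpRequestException ex)   // expected: network-level failures
        {
            return Result.Fail(ex.Message);
        }
        catch (TaskCanceledException)     // HttpClient timeouts surface this way
        {
            return Result.Fail("Request timed out");
        }
    }
}
```

The caller then decides, per business operation, whether a failed result is ignorable (logging) or fatal (saving the user), exactly as discussed above.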
Jeremy: [00:41:58] Another thing that you sometimes talk about in the context of functional programming is this idea of how object-oriented languages usually have a null concept, where instead of returning the object you expect, you return a null. And that could be because you couldn't find the element; it could be any number of reasons. What are the drawbacks of returning a null?
Vladimir: [00:42:25] Yeah, it's the famous billion-dollar mistake that all object-oriented programming languages have in them. The problem we have now is that they make all your code dishonest. Because, what do they do with your code? Let's say that you return a user from some method, and that user is a class.
In C#, all classes are nullable, so you can return not an object of that class but null, and that would still be a valid program from the compiler's perspective. The problem with that is that you cannot differentiate between a nullable user and a non-nullable user when you see a method that returns that user.
What it actually does is return a special class, which you could call user-or-null, because it may be either a user or null. And so when you want a non-nullable user, there is no way for you to express that, because, as I said, all classes are nullable by default. And yeah, that's the problem, because that introduces another hidden output that you cannot see just by looking at the method signature.
Jeremy: [00:43:36] One of the things that you've done as a way to kind of push back against that is this concept of a Maybe or an optional type. Could you explain what those are and how they're used?
Vladimir: [00:43:47] Yep. I think what I did in this course, yeah, I'm trying to remember, it was several years ago. Yeah. So what you can do instead is there is a good tool called Fody NullGuard that you can use to basically inject null checks into all your methods and properties. What it will do is check all input arguments for nulls for you, and also check output return values for nulls as well. It's a very good tool; I try to use it in as many of my projects as possible, but it's not always possible, let's say that. What it does is it helps you approximate your code to a world where nulls do not exist.
So if you try to return a null where your method returns a user, your method will automatically throw an exception because of these automatic null checks. And to avoid that, you will need to use a Maybe, or an option as in F#. A Maybe is a special struct that you can use to explicitly tell the clients of your code which parts of your inputs or outputs can be null.
And because it is a Maybe, it itself cannot turn into a null, because it is a struct, and structs in C# are not nullable. It becomes sort of a nice trick to avoid these null problems, because if you want to make your return value, your user, nullable, you have to wrap it in a Maybe of user, and you cannot do that otherwise, because if you try to return null without that Maybe, your method will throw an exception.
But if you do use the Maybe, then your null will be automatically transformed into an instance of that Maybe type, and your code will proceed. The validation here is not as strict as in functional programming languages, because this issue with the null will not be caught at compile time.
But still, it's close enough, because even though the compiler will see this code as valid, I mean the code where you return a user but return null instead, it will still fail at runtime. So you will have close to functional guarantees here as well.
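A minimal sketch of such a Maybe struct (names and members are illustrative; libraries like CSharpFunctionalExtensions provide a fuller version):

```csharp
using System;

// Being a struct, Maybe<T> itself can never be null, so a signature like
// Maybe<User> is the only way to express "possibly no user".
public readonly struct Maybe<T> where T : class
{
    private readonly T _value;

    private Maybe(T value) => _value = value;

    public bool HasValue => _value != null;
    public bool HasNoValue => !HasValue;

    public T Value =>
        HasValue ? _value : throw new InvalidOperationException("No value present");

    // Implicit conversion lets `return user;` silently wrap a (possibly
    // null) reference into a Maybe, as described above.
    public static implicit operator Maybe<T>(T value) => new Maybe<T>(value);
}
```

A method declared as `Maybe<User> FindByEmail(string email)` then tells the caller explicitly that the lookup can come back empty, while a plain `User` return type, guarded by the injected null checks, guarantees a non-null instance.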
Jeremy: [00:46:17] Sounds like it's similar to the result type in the sense that with the result type we were saying we would wrap an object in a result, and what that would do to the method signature is it would say that this function you're going to call it could succeed or it could fail.
And similarly, this Maybe type, it sounds like it's wrapping your object. It's wrapping your response and telling you that your response could have something in it or it could have nothing in it, and it's making it explicit as a part of the return type.
Vladimir: [00:46:48] Yes, exactly. The Maybe type gives you the same benefits as the result type, so it makes the method signature explicit. But in addition to that, if you use this Fody NullGuard library, it also gives you some runtime guarantees that you will not actually have a null where you return a non-Maybe user.
So if you return just user, then you have this guarantee that it will actually be a non-null instance. Because if you try to return null, then the application will throw an exception.
Jeremy: [00:47:21] Something that C# added recently in C# 8 is non-nullable reference types. So it has a compile-time check to see if you could possibly be using something that's null. Is that a good substitute for this Maybe type, or what are your thoughts on that feature?
Vladimir: [00:47:42] It's a nice move in the right direction, but I don't think it is a good enough substitute, because those checks only give you compiler warnings. They are not compiler errors. But that's not a big issue, because you can turn those warnings into errors by setting up a couple of things in Visual Studio.
The main issue here is that it doesn't catch all those situations where you may have nulls. And so you still may have some issues with nulls even though C# 8 tells you that everything is fine. So that's basically my concern: it's not as strict as the Maybe type.
Jeremy: [00:48:27] It kind of gives you some protection, but there are some cases it doesn't catch. So it may give you this false sense of security.
Vladimir: [00:48:36] Yes, exactly.
Jeremy: [00:48:37] Another thing that you bring up in some of your courses is that when data is coming in from an outside source, let's say you have an API and somebody sends you data via JSON, or you get data via a message queue, you tend to create a separate DTO, a data transfer object, rather than use the entities or the value objects that you've created internally.
Why do you make the decision to do that?
Vladimir: [00:49:11] Yeah, that's a very important thing to do, in my opinion, because you need to maintain the separation between data contracts and your domain model. This is important because if you are using the same domain classes as these input data structures, then you may run into several problems.
So let's say that you have a controller that is responsible for user creation. One way to represent the data that clients send you when they try to register a new user would be to use the same user domain object as you have in the domain model.
Let's say that it has a username, a password, and maybe some other properties that map one-to-one to the properties the client sent you. The first issue here is a potential security hole, because you may introduce some properties to your domain class in the future that you don't want the client to set. Let's say you introduce a flag saying that this user is an admin, say a boolean isAdmin flag. If you introduce it to your domain model, then it becomes a potential security problem, because now your clients can send this flag as well, and it will be deserialized into that domain model.
And if you save it as-is into the database, then you will basically create another admin in your system without knowing it. So that's one problem. Another problem here is that when you use your domain classes like that, you are setting those domain classes in stone, so to speak, because you often need to maintain backwards compatibility with the clients.
And what that means is you cannot refactor those classes as often as you may need. So let's say, for example, your user has a name property and you want to split it into first name and last name. Because you want to maintain backwards compatibility, you cannot just do that: the old clients of your application will break, because they will not know about that split and will continue to send you just one name. In this case it becomes problematic, because now you have to maintain the old name property, but you also need to add the first name and the last name, and then somehow correlate between the two, maybe transform name into first and last name using some rules.
And you don't want to do that. Instead, what you need to do is have a separate layer of data contracts, DTOs, data transfer objects, where you have as many versions of those data contracts as you want. So if you decide to split the name into first and last names, you don't need to modify the old DTO.
You can create a new endpoint that accepts a new DTO, version two, let's say, that will have the first and the last name, and then you will do the conversion between these two endpoints. At the first endpoint, which still has the first version of the DTO, you can do the conversion between your domain model and the old data structure.
And so in this way, you are free to refactor your domain model without worrying about whether that makes backward-incompatible changes for your clients. You sort of decouple the data contract from your internal domain model. You want the internal domain model to move as fast as you want, so you want to be able to refactor it, but you want to keep the data contracts backward compatible.
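The two problems above, over-posting and versioning, could be sketched like this (all DTO and property names are hypothetical):

```csharp
// Wire contracts: versioned, and deliberately narrower than the domain
// model (no IsAdmin flag a client could over-post).
public sealed class RegisterUserDtoV1
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public sealed class RegisterUserDtoV2
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

// Domain model is free to evolve; only the mapping code changes.
public sealed class User
{
    public string FirstName { get; }
    public string LastName { get; }
    public string Email { get; }
    public bool IsAdmin { get; }   // never settable from a DTO

    public User(string firstName, string lastName, string email, bool isAdmin = false)
    {
        FirstName = firstName;
        LastName = lastName;
        Email = email;
        IsAdmin = isAdmin;
    }

    // The old contract keeps working: correlate the single name with the
    // new split using some rule.
    public static User FromV1(RegisterUserDtoV1 dto)
    {
        string[] parts = dto.Name.Split(' ', 2);
        return new User(parts[0], parts.Length > 1 ? parts[1] : "", dto.Email);
    }

    public static User FromV2(RegisterUserDtoV2 dto) =>
        new User(dto.FirstName, dto.LastName, dto.Email);
}
```

Each endpoint deserializes only its own DTO version and converts it at the boundary, so the domain model never depends on the wire format.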
Jeremy: [00:53:06] It's almost like the difference between when you have a class, you have a public interface, and then you have the private implementation. And when you use data transfer objects, you expose an interface that you want to keep the same, but you want to be able to modify how that's handled internally in your system.
And so having these DTOs, it makes sure that you can make as many changes as you want internally in your system without affecting what your API looks like to the outside.
Vladimir: [00:53:41] Yes, exactly. It's a good analogy. Yes.
Jeremy: [00:53:43] The one thing I can think about as far as DTOs and converting them to internal domain objects: it sounds like you could potentially have a lot of conversion code. How should you plan for that, and where should that exist in your application?
Vladimir: [00:54:02] So, my view has evolved since that course. I think what I did is I created extension methods on top of Result, and it's still a good way to do that. Let's say that when you create a user, you need to validate the first and last name, let's say an email, and a couple of other properties.
And what you can end up with is a lot of code that does validation. So you are creating a value object first, and then you need to make sure that a user with the same email doesn't exist in the database, and then you need to create another value object and validate it. And so it creates a lot of if-else statements that clutter your code base.
What you can do instead is follow the so-called railway-oriented approach, which was introduced by Scott Wlaschin. What I did is I basically adapted this approach from F# to C#. You can introduce extension methods that will drastically simplify all those if statements. It will help you reduce the number of lines of code by a factor of three without losing any readability. And for simple validations, it's still a good approach, but there is a nice way of dealing with validation in ASP.NET, and that is validation attributes.
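The railway-oriented extension methods he mentions could look roughly like this (a sketch following Scott Wlaschin's terminology, not a specific library; the Result<T> record here is a minimal local definition):

```csharp
using System;

// Minimal local result type for the sketch.
public sealed record Result<T>(bool IsSuccess, T Value, string Error)
{
    public bool IsFailure => !IsSuccess;
    public static Result<T> Ok(T value) => new(true, value, null);
    public static Result<T> Fail(string error) => new(false, default, error);
}

public static class ResultExtensions
{
    // Run the next step only if we are still on the success track;
    // otherwise carry the error forward unchanged.
    public static Result<TOut> Bind<TIn, TOut>(
        this Result<TIn> result, Func<TIn, Result<TOut>> next) =>
        result.IsSuccess ? next(result.Value) : Result<TOut>.Fail(result.Error);

    // Transform a successful value without introducing a new failure path.
    public static Result<TOut> Map<TIn, TOut>(
        this Result<TIn> result, Func<TIn, TOut> map) =>
        result.IsSuccess ? Result<TOut>.Ok(map(result.Value)) : Result<TOut>.Fail(result.Error);

    // Side effect on the failure track, e.g. logging.
    public static Result<T> OnFailure<T>(
        this Result<T> result, Action<string> handler)
    {
        if (result.IsFailure) handler(result.Error);
        return result;
    }
}

// Hypothetical usage: each step short-circuits on the first failure,
// replacing a chain of if-else statements.
//   Result<string> outcome = ValidateEmail(input)
//       .Bind(CheckEmailIsUnique)
//       .Map(CreateUser)
//       .OnFailure(Console.WriteLine);
```

The chain reads top to bottom like the happy path, while every failure automatically switches to the error track, which is where the factor-of-three reduction in validation code comes from.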
And what I said in that course is that validation attributes are nice, but they kind of don't play well with value objects. And so if you want to really adhere to domain-driven design principles or functional principles, then you need to switch from those annotations to this railway-oriented approach.
But you actually can combine the two. So you can combine the approach with annotations and still have this validation logic in value objects. One of the biggest disadvantages of having those validations in annotations is that you are duplicating the validation. You want to keep those validation rules inside your domain model, because they are an essential part of it. When you put, let's say, regular expression validation attributes on top of your DTOs, you are duplicating those rules between the two parts of your system. So now you have a value object with the same rules, and also that same rule existing in the data annotations.
To combine the two, you can actually create your own custom annotations that delegate those checks to the value objects, but still work exactly the same way as the regular annotation attributes, meaning that you can declaratively put them on top of your DTO properties and they will work very well.
So you will still reduce the number of lines of validation code drastically, but you will keep this nice declarative approach that you had with annotations. And I have a blog post on my website, we can link it in the show notes, where I show this approach in detail.
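A custom annotation that delegates to a value object could be sketched like this (the Email value object, its rule, and the attribute name are hypothetical; only `ValidationAttribute` is the real ASP.NET extension point):

```csharp
using System;
using System.ComponentModel.DataAnnotations;

// Value object: the single source of truth for the validation rule.
public sealed class Email
{
    public string Value { get; }

    private Email(string value) => Value = value;

    // Returns null when valid, an error message otherwise.
    public static string Validate(string candidate) =>
        string.IsNullOrWhiteSpace(candidate) || !candidate.Contains("@")
            ? "Email is invalid"
            : null;

    public static Email Create(string candidate) =>
        Validate(candidate) == null
            ? new Email(candidate)
            : throw new ArgumentException("Invalid email");
}

// Custom attribute that delegates the check to the value object, so the
// rule is not duplicated between the DTO and the domain model.
public sealed class ValidEmailAttribute : ValidationAttribute
{
    protected override ValidationResult IsValid(object value, ValidationContext context)
    {
        string error = Email.Validate(value as string);
        return error == null ? ValidationResult.Success : new ValidationResult(error);
    }
}

// DTO keeps the declarative annotation style, but the rule lives in Email.
public sealed class RegisterUserDto
{
    [ValidEmail]
    public string Email { get; set; }
}
```

ASP.NET's model binding invokes the attribute automatically, so the DTO stays declarative while the domain model remains the only place the rule is written.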
Jeremy: [00:57:25] Just to make sure I understand correctly: so what you're describing is, in ASP.NET, when you have a model for a DTO, you can put annotations on it. You can have your property, and above the property you could say something like, the max length is 50, so this person's name can't be more than 50 characters. And what ASP.NET is able to do is, if you were to create a form and you used that property in the form, if somebody typed in a name and put in 80 characters, ASP.NET, using that annotation, would be able to automatically create an error, and you would be able to put that next to the field. And I think what you're saying is that you can keep those validation rules inside the domain objects that you create, or I think you called them value objects, and you're able to still write an annotation that just refers to the validation that exists in your value object rather than using the built-in data annotations.
Vladimir: [00:58:37] Yes. Yes, exactly. And that's a nice way to combine the two because it sort of combines the best of the two worlds. You still have your validation rules in one place.
Jeremy: [00:58:48] What's your approach when you have a code base that has exceptions, passes back nulls, and the calls to the database are sort of mixed in with the objects?
Like how do you start that process of bringing in more functional concepts or just bringing in more concepts that are easier to follow and to understand?
Vladimir: [00:59:09] Yeah, that's a great question, and yeah, it's a tough one. It depends a lot on the specifics of the project, the team, and the management. It's one thing if the project doesn't evolve much and it's just some project in maintenance mode where you don't need to introduce a lot of new features. In this case, I actually don't recommend that you do much, because it will most likely not pay off in the long run.
But if it is a project that is actively developed, then it's a different story, and in this case you need to come up with some refactoring approach, some refactoring strategy. There are a couple of approaches here. In domain-driven design, for example, Eric Evans wrote a great piece where he talked about this approach that involves bubble contexts. A bubble context is something that you create inside a legacy code base that adheres to all the good principles. So you have a nice separation between the domain logic and the orchestration, and your domain logic is ideally purely functional. And you cannot refactor the whole application at once.
And I actually don't recommend that you rewrite your application either, because it's not a good idea in most cases. You still want to start somewhere, and where you can start is by creating these bubble contexts. Let's say that you have some new feature, or you need to modify an existing feature, and this feature is somewhat not too connected to the rest of the system.
So you can start to isolate this functionality into the bubble context and surround that bubble context with an anti-corruption layer. That anti-corruption layer is basically a repository that converts your good and clean domain model into the messy legacy database structure, and converts it back into your nice and clean domain model.
And what you can do is start expanding that bubble context. You can gain more and more territory with new features, with new refactorings. Eventually, what you want is to come to the point where your bubble context becomes the main part of your application, and it's the legacy part that is surrounded by the anti-corruption layer.
This pattern is also called the strangler pattern, where you strangle the legacy part, cut off slices of functionality from it, and transform and refactor them into your bubble context.
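An anti-corruption layer of the kind described above might be sketched as a repository that translates in both directions (the legacy row shape and all names here are invented for illustration):

```csharp
// Shape dictated by the messy legacy schema outside the bubble.
public sealed class LegacyCustomerRow
{
    public string NAME_FULL;
    public string MAIL;
}

// Clean model inside the bubble context.
public sealed class Customer
{
    public string FirstName { get; }
    public string LastName { get; }
    public string Email { get; }

    public Customer(string firstName, string lastName, string email)
    {
        FirstName = firstName;
        LastName = lastName;
        Email = email;
    }
}

// The anti-corruption layer: all knowledge of the legacy structure is
// confined to this translation code.
public sealed class CustomerRepository
{
    // Inbound: legacy row -> domain object.
    public Customer ToDomain(LegacyCustomerRow row)
    {
        string[] parts = row.NAME_FULL.Split(' ', 2);
        return new Customer(parts[0], parts.Length > 1 ? parts[1] : "", row.MAIL);
    }

    // Outbound: domain object -> legacy row.
    public LegacyCustomerRow ToLegacy(Customer customer) =>
        new LegacyCustomerRow
        {
            NAME_FULL = $"{customer.FirstName} {customer.LastName}",
            MAIL = customer.Email
        };
}
```

Because the translation lives in one place, the bubble's domain model can keep evolving while the legacy schema stays untouched until its slice is strangled.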
You need to first define the building blocks of your domain model, and those building blocks are usually value objects: the easiest classes to create in your application, say as simple as an email value object or a customer name value object. When you do that, you can put the domain logic that relates to those emails and customer names into those value objects, start using these value objects from the rest of your system, and then go from there, so you can build a hierarchy of objects. Let's say you have another object that consists of those smaller building blocks, those smaller value objects. So you do that, and then, as I said previously, you can proceed to your entities and refactor them, so that instead of separate properties on the entity, you start to have properties defined as value objects. You are attributing more and more logic from that entity to those value objects, and the entity itself becomes simpler. And then, from that level, you step even further and push the domain logic down from controllers to those entities. Because what you usually have in such legacy systems is the anemic domain model, where your domain logic is separated from the domain data. So data is separate from the logic, and we can talk about that a bit too.
But the main drawback in this system is that it's hard to maintain encapsulation. It's hard to maintain consistency inside the domain data, because it's separated from the logic that works upon that data, and the logic itself usually lives in something like services. And so you can push that logic from services down to entities.
And so what you have is sort of a cascade of logic that you push further and further down. And the further down you can push it, the better, because the easier it will be to work with. And the problem with the anemic domain model is, well, there is actually a nice dichotomy between anemic domain models and functional programming, because an anemic domain model is about separating data from functions, from operations that work upon that data. But functional programming is kind of the same: it's also about separating data from operations that work upon that data. The big difference between the two is that in functional programming the data is immutable, and it is a big deal because it's impossible to corrupt immutable data; you basically cannot end up with something you didn't construct deliberately. But anemic domain models, although they exhibit similar properties to the functional approach, the biggest difference is that the data inside those domain models is mutable, and you can never know who mutates that data and how they do it. And so it becomes impossible to enforce restrictions on everyone who mutates that data in such an environment.
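The value-object idea described above can be sketched in a few lines. This is a hypothetical illustration rather than code from the episode (Vladimir's examples are in C#; JavaScript is used here, with `Object.freeze` standing in for real immutability): an `Email` value object that validates itself on construction and compares by content, used from a `Customer` entity.

```javascript
// Hypothetical sketch: an immutable Email value object that owns its
// own validation, so an invalid email can never exist in the system.
class Email {
  constructor(value) {
    // The domain logic lives on the value object, not in a service.
    if (!/^[^@\s]+@[^@\s]+$/.test(value)) {
      throw new Error(`Invalid email: ${value}`);
    }
    this.value = value;
    Object.freeze(this); // immutable: nothing can corrupt it after creation
  }
  equals(other) {
    // Value objects compare by content, not by identity.
    return other instanceof Email && other.value === this.value;
  }
}

// The entity holds value objects instead of raw strings, so the
// entity itself becomes simpler: it can only ever hold valid data.
class Customer {
  constructor(name, email) {
    this.name = name;   // could itself be a CustomerName value object
    this.email = email; // an Email instance, guaranteed valid
  }
  changeEmail(newEmail) {
    this.email = newEmail; // only a constructed (hence valid) Email can get here
  }
}

const customer = new Customer("Alice", new Email("alice@example.com"));
console.log(customer.email.equals(new Email("alice@example.com"))); // true
```

This is the cascade in miniature: validation that would otherwise sit in a controller or service is pushed down into the smallest building block, and everything above it gets simpler.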
Jeremy: [01:05:18] Given all the things we've talked about, if people want to kind of see an example of a lot of these things in action, are there any code bases that they can take a look at that are open source or any good examples that you can point them to?
Vladimir: [01:05:35] So, if we are talking about C#, then I would recommend my Pluralsight course. It's called Applying Functional Principles in C#, or something like that. I actually have trial codes for Pluralsight, so if you want one, just reach out to me. We can put my email address in the show notes.
They will give you, I think it's 30 days of unlimited access to Pluralsight, so you can watch all my courses and more during that time. Also, if we're talking about F#, I would highly recommend Scott Wlaschin's books on this topic. He has a great site called F# for fun and profit, and it has a section with books in it, where one of the books is basically a collection of articles from the site itself.
But the other book is about Domain Driven Design combined with the functional approach, and it's a really great book. It explains how to do Domain Driven Design in a functional programming language like F#.
Jeremy: [01:06:38] And, where should people go if they want to see more about what you're working on and follow you?
Vladimir: [01:06:45] The best place to go is my website. It's called enterprisecraftsmanship.com. And yeah, you will find all the links there.
Jeremy: [01:06:55] Cool. Well, Vladimir thank you so much for coming on the show.
Vladimir: [01:06:59] Thank you for having me.
Jeremy: [01:07:00] I hope you enjoyed the conversation with Vlad. You can get the show notes and a transcript for this episode at softwaresessions.com. Alright see ya.
Brian is a Senior Developer Advocate at GitHub and was previously a Developer Advocate at Netlify.
Music by Crystal Cola: 12:30 AM / Orion
You can help edit this transcript on GitHub.
Jeremy: [00:00:00] This is Jeremy Jung and today I'm talking to Brian Douglas, he's a senior developer advocate at GitHub, the host of JAMstack radio and the creator of open sauced, an application to help new contributors to open source.
Brian welcome to Software Sessions.
Brian: [00:00:14] Hey Jeremy, thanks for having me on.
Jeremy: [00:00:16] The first thing I want to get into is. What's the biggest barrier for people getting into open source?
Brian: [00:00:23] Yeah, that's a good question. I think the barrier for open source is something I found or discovered right off the bat. I've been developing for over seven years now, seven years ish. And getting into open source can be daunting, especially if you don't know where to get started.
So I think the biggest barrier is actually onboarding, and it's just knowing: is the CONTRIBUTING.md the proper place to go, or is there some other secret channel somewhere, a Slack group or something else, where you could actually build a relationship with the project? I think a lot of us leverage these tools that are open source and go years of leveraging them without even knowing who's contributing to them, who's powering them.
What community is involved in the project. So just knowing where to start is usually the hardest part. I think that we do a good job as a developer community. There are guides on how to contribute, like open a pull request, manage your commits and stuff like that.
But there's no guide of how to say hello when it comes to giving your first open source contribution.
Jeremy: [00:01:27] Not knowing where to start, even if there is a CONTRIBUTING.md and there's issues out there. People are like, I don't know which one to pick up. I don't know who to talk to first. It's just awkward, I guess.
Brian: [00:01:38] Yeah. Each project is different. So there's no centralized CONTRIBUTING.md file everybody is sourcing from. One project could say, okay: git clone, git checkout a new branch, git push origin, and that's it. And some of them don't even have CONTRIBUTING.mds.
Some of them are just READMEs. Then you go to the README and there could be missing information. Some projects don't have READMEs. Some projects have READMEs and websites and documentation and Slack groups. So not knowing the balance of how to actually get involved in the project.
And I think what it really comes down to is if I started a new job the first thing I'm gonna get is a step by step okay. Here's your laptop. Here's how to do this thing. Here's how to clone the repo. Here's who to talk to. A lot of projects don't have that. Like they don't have like area owners, plugin owners, who's on the review team. Who's on the triage team. How big is the contributing group?
You can go into the GitHub repo and discover it all. But it'd be nice if someone just gave you a piece of paper or one file to get all that information. I think we've sort of grown out of the CONTRIBUTING.md and we need something else.
Jeremy: [00:02:49] When you are looking at an open source project, there's all these different issues whether it's bugs or feature requests. And it can be hard to know which things are suited for your skill level. And what do you think is the solution for that for somebody trying to pick out that issue that would work for them.
Brian: [00:03:08] Yeah. And I mean, the easy answer is they have labels like good first issue and documentation. Some people don't know this: if you go to any GitHub repo, like github.com/nodejs/node, and add /contribute to the URL, you can see all the issues that are up for grabs, and you can jump into them.
I don't actually recommend doing all that first and going to labels. I think the very first step is actually talking to a person. So find the quickest place you can get synchronous communication: Hey, I'm looking to contribute, I've been using this thing for six months on this project, I just want to give back, I had this idea for a feature. Open an issue, ask questions on the issue, or, now that we have a feature at GitHub called discussions, go into the discussions. But also limit the amount of back and forth you have to do asynchronously, and just go directly to the source, which is the person who is on call already to chat with you.
A lot of projects have discords now. So find the discord link and then jump in there and say hello, because your experience is going to be completely different when you're actually talking to somebody and asking questions synchronously in discord.
The chat scrolls so it doesn't matter if you say a random question or you ask a question that's been asked a hundred times. Someone will give you a link, but it's better to do that than to be the person on the issue asking the same question for the fourth time. Or asking the wrong question at the wrong time.
I think that's a little daunting as well. If you don't know how the project, the underlying secret sauce of the project is actually laid out for you.
Jeremy: [00:04:45] When I think about open source, I think about communication being asynchronous, right? Going through issues, emails, mailing lists, things like that. And you're saying if you're first starting out, actually the best thing would be to find more synchronous communication, find that Discord room or Gitter or whatever it is, where you can actually have a conversation with someone.
Brian: [00:05:10] Yeah. And to be fair I'm catering this more towards a beginner opensource contributor. If you're experienced, do the regular thing, reference the issue, open up the PR, and no need to look for a synchronous communication if you know how to solve the problem.
GitHub itself is like 50 million developers worldwide. There's not 50 million developers doing open source. Let's just be clear on that. So there's a big difference between the users on GitHub who are just shipping code, like normal building websites for their companies or mobile apps or whatever it is to the people actually contributing the code that's powering all that stuff and powering GitHub as well.
I'm using this term called unintentional gatekeeping. I've been thinking about this a lot and I want to write a blog post on this because it's around the flow of information. So if I happen to be in the right Slack channel or the right discord, I have more information than the person who's not there.
Because there's more information flowing through there than there are publicly on issues, because issues are treated as like a statement of work. You're declaring that this is the way it's working or declaring that it's broken and next steps to reproduce it or whatever it is.
And same thing with PRs. Like you're declaring this is the work I've done. This is the next steps. You review it. It's very robotic. But when you have that relationship that's built in a Slack channel and like, this is similar if you go to meetups or if you happen to know somebody from college or high school or whatever it is very similar. That relationship is like a relationship that helps give you that extra edge.
And I think when we talk about things like tech, there's definitely a lot of conversation about diversity, especially today. So when we have diversity of backgrounds, diversity of culture, of where people are coming from, you tend to find that a lot of the newer, smaller startups have a monoculture.
And people are well aware, but there are VCs who are telling us it doesn't matter when you first start out: just build a product, get all your friends in the same room, build only with the culture fits and stuff like that. And then let's move on, and we'll figure it out when the company is much bigger and has much bigger issues.
So I say all this because it's an opportunity for people to become quote unquote of the culture of the open source project by just being in the room and listening. And if you find out the maintainer or the contributors are not your cup of tea, you can just move away to another project, or fork the project and create a new project that's similar but has a different culture.
Like, with open source, a lot of these things are MIT licensed; there's no limitation on you trying things out, maybe copying code and creating your own project, and seeing if there's growth in that approach. I don't recommend that, but if that's your approach, definitely try it out.
Jeremy: [00:07:56] Another thing that I often hear when people talk about wanting to get into open source is they have trouble finding someone to mentor them or help them through the process. I wonder what are your thoughts on how we can improve that experience?
Brian: [00:08:13] Yeah, this is more on the maintainer. Individuals who are managing the projects. I mentioned the onboarding experience. There's obviously opportunity for them to have better onboarding. Have some clear steps of what your expectations are for people to contribute to the project. Not just how to clone it and open a PR but more of like, how do you report an issue?
Is there a template for reporting issues that can guide the person into actually asking the right question, as opposed to a free-for-all where your issues turn into Stack Overflow? GitHub issues are not the best place to ask questions. You could do that, but Stack Overflow is an entire platform built for that reason. So how do you kick people from issues to Stack Overflow instead?
We didn't talk about what sort of code I write, but I do a lot of JavaScript, and there's this one library called ExpressJS; it just builds quick servers for your websites and web apps. And ExpressJS actually did ship something really recently, I think back in April: they merged in this new quote unquote feature, a guide, which is called the triager guide. Are you familiar with this at all?
Jeremy: [00:09:16] I'm not.
Brian: [00:09:17] Yeah. So, basically what they're doing is they're instead of saying, Hey, We have a lot of issues go ahead and pick one up and like merge it. Or not even merge it just open the PR and we'll go back and forth for like weeks or months. And then we eventually merge it.
Instead, they're saying if you want to intro into express, you don't have to know anything about express. You don't have to use express. We have this role called the triage role and it's literally a team in the org that you can join if you just raise your hand and your job is to triage issues.
So if someone provides an issue. If they don't provide reproduction steps, you kick it back and say, Hey, can you provide reproduction steps? So if you don't know how to do it, then the maintainer probably won't know how to do it. Or maybe they do, but that's a lot of time for them.
So joining a triage role, having an opportunity to label issues, to mark things as ready for review or ready to contribute or good first issue or whatever it is-- Express has a lot of issues, and there's a lot of time spent trying to figure out: is this valid or is this not?
They're actually taking help from the open source community, giving them a badge, which is the triage role in the project, so it shows up on their profile. That's great for prospective employers: like, Hey, you're in the org, we're using Express, you have access to the maintainers, maybe we can get our features in there.
That's eye opening and it's eye opening that I have not seen that at all until very recently. So me personally, for my project, I just launched a triage role. Cause I want people to be able to have an introduction into my project, which is a react app without needing to know react, like all you have to do is know how to answer questions or how to find information.
If you don't, there's other people on the team that can help guide you. And we have a discord as well that can guide you to actually getting things shipped.
Jeremy: [00:11:02] I've noticed when you watch people's livestreams for coding and who work on open source projects-- a lot of the time they spend is actually on issue triage. Is on looking through all these GitHub issues, figuring out which ones are valid and which ones are not. And so I think that's an interesting idea of getting people started there so that they get to see the process of open source without necessarily needing to jump straight into the technical details. That's an interesting path to get more involved.
Brian: [00:11:36] Yeah, I like it. I'm curious what Twitch streamers you watch as well, cause I've been trying to collect a list myself. But I like watching people do open source, actually. I think Jason Lengstorf is doing some open source right now on his Twitch stream; I'll catch the VOD later.
But, yeah, I think that's actually a good thing you brought up as well. Cause I've been doing some Twitch streaming myself and trying to figure out: what is the purpose of live coding on Twitch? Is it to give webinar-type tutorials like screencasts? Is it to interview, like what we're doing on a podcast, but as a Twitch stream? And where I found my niche, what I like to do on Twitch streams, is actually to do exactly that: triage issues.
I'm actually gonna be live streaming later today. And I've been doing some Sketch, some UI building. I'm not a designer, at all, but I took a course last fall and learned how to use Sketch to build some UI templates, to not have to rely on somebody else to actually get me across the goal line for shipping projects.
So I'm going to spend 90 minutes building out some UI, and actually trying Figma for the first time too. Cause Figma, it's sort of like the GitHub for designers. I'm not sure if that's the summary of their product, I don't work there, but yeah. So basically I'll be doing Figma.
I'll be building some UIs and some wireframes, just to sort of figure out the next steps, cause I've got a backlog of features I want to add to my project, but I don't know how to tell people this is how we're going to work on it. Cause I have, I think, 17 contributors at this point. I was going to say team, but they're contributors; we do have teams and squads or whatever. But yeah, it's easier for me to get everything in my head, the vision for the project, out onto a UI, and let individuals know.
But back to the question I first asked: I'm curious, who are you watching on Twitch?
Jeremy: [00:13:19] Yeah, so I just, dabble a little bit. One of the people that I find interesting is, Suz Hinton and sometimes it's like issue triage type stuff, but also sometimes she works on more hardware type projects, in sort of the intersection between working with JavaScript, but actually working with, physical hardware and actually wiring stuff up. I don't watch a lot of streams but things that are interesting for me is being able to see someone's thought process. Cause often when you watch a streamer they're talking through their process and what they're thinking and whether it's doing triage or whether it's working on a bug or a feature. You get to see how somebody works in a way that you wouldn't from a screencast. With a screen cast or a lecture they're very well prepared. They've been practicing, whatever they're trying to teach, whereas in a stream it's more this is something that they haven't done before, and you're just going through that process with them.
Brian: [00:14:25] yeah, Suz Hinton, I guess. Is it nope cat or no op cat?
Jeremy: [00:14:30] Yeah, noopkat, I think that's right. Yeah.
Brian: [00:14:32] Yeah, yeah. I hadn't heard the word said out loud. I say noop and nope, noop, I interchange them, it's like--
Jeremy: [00:14:38] They're probably all right.
Brian: [00:14:41] Yeah. So I've actually watched Suz and I like her style and I like the projects that she works on too as well. I've yet to catch her live in a while. So I don't know if she stopped streaming.
But yeah, similar to yourself I like seeing the thought process and the people walking through how to build things. Because at the moment a lot of us are working from home. Especially in the state of California. And we're not sitting next to our coworkers anymore and asking those questions.
I think Twitch has, in the last three months, sort of exploded with live coders, live coders as in people who live stream and code at the same time. Because I think a lot of people figured out, like, Hey, I need to have community and I'm not getting it through my team Slack channel. So it's been an interesting transition, as far as a whole other culture that's growing on Twitch at the moment.
Jeremy: [00:15:35] For sure. And then for yourself as the person who is doing the streaming, what do you get out of it and what are you looking for?
Brian: [00:15:43] Yeah, I mean, it just comes down to community. I started the stream, mainly because I wanted to have a place to start throwing my ideas out there for the project I'm working on right now, which is open sauce and I started streaming two years ago.
I'd heard of people doing live coding on Twitch, but it wasn't very popular at all. Only a handful of people were doing it and I even talked to some people at Twitch about it. Some people who were familiar with this space and were knowledgeable and so I started doing it, but I didn't really have the proper equipment.
I had my Mac and I was just streaming from my Mac. And nowadays you gotta have proper lighting and step up the game a bit and a green screen as well which I'm sort of sitting in front of it at the moment.
I was doing that, but I sort of fell off because my daughter was born just a couple months after that. So I took time off from work, but also took time off from coding, just to enjoy some paternity leave.
But anyway fast forward to very recently which is a couple of months ago I started streaming again focused on trying to build an open source project and just have a place to write code consistently cause my day job is developer advocate and I don't have any long standing projects that I work on a regular basis.
A lot of stuff I work on just sort of ships complete, and then we don't touch it unless there's something broken. So once I'm done actually shipping something, I just move on to the next thing. So there's nothing that I can feel proud of that I continue to work on, or where I do the latest and greatest things like Vue JS or whatever. I shipped Vue projects, but I move on, and they just work until I have to do maintenance on them.
So I wanted to have a consistent place where I could just talk about a story of a project that I was working on, which again, I keep mentioning it, which is open sauced. And since then I've actually built a community of quite a few developers interested in the same problem I'm trying to solve which is open source onboarding.
Jeremy: [00:17:32] Let's get a little bit into open sauce. We were talking about a lot of different troubles that people have when they're trying to get into open source. How is open sauce trying to address those?
Brian: [00:17:42] Yeah, you teed it up for me earlier with the whole trouble getting into open source: it's onboarding. So we're building a platform to provide structured onboarding for open source projects. That means me connecting with maintainers and projects to add a simple YAML file to the project.
So that way anybody who navigates to the project on open sauce, can have a good, like step by step process of who to talk to, how to get involved in the project, where to go for the synchronous communication.
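As a sketch of what such a simple YAML file might contain, something like the following; note that every field name here is invented for illustration, and the actual open sauced schema may be entirely different:

```yaml
# Hypothetical onboarding file; field names are made up for illustration.
name: my-project
maintainers:
  - octocat                             # who to talk to
chat:
  discord: https://discord.gg/example   # where synchronous communication happens
onboarding:
  - Read CONTRIBUTING.md and the triage guide
  - Say hello in the Discord before picking up work
  - Start with issues labeled "good first issue"
```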
The other thing is that tracking projects is something you can do on GitHub, but it's not really built for that, at least today. Hopefully GitHub gets on board and adopts a lot of the features that we're building. But the goal is not to track projects you're already a part of, or even track projects that you're working on.
It's more tracking projects that you want to quote unquote stalk. A GitHub star is sort of that: you hit a star, a like or whatever, on a project, and it goes into a list. And usually most people just forget about that list, because you just add a star and that's about it. It's sort of like Instagram: you just add a like and you move on, and that's what GitHub stars have become.
So it's hard to track things you're interested in based on stars. You could watch projects, but when you start watching projects it basically becomes a signal-to-noise problem. With a very popular project, you don't know what you're looking at, so you fall off immediately, cause it's too much information or not enough information, one or the other.
Yeah, so basically there's not a lot of tools. So a couple of years ago, actually, when we met, I had just shipped a little side project for myself, which was essentially bookmarking issues on GitHub that I wanted to work on.
So my day job is at GitHub. I've actually seen internally this feature be built a couple of times, but we sort of backed away from it cause it didn't really solve the right problem. And so as soon as I joined GitHub, I was like, Oh, maybe I won't work on this thing anymore. So I took a bit of a break and then noticed that we weren't shipping it.
So I just picked it up recently, but essentially you could find the project you're interested in. Find the issues that you're interested in and mark them to save. And then manage the note taking. So if you want to take notes on like this is the maintainer to talk to, this is an example that I can leverage to solve problems or help triage things, even like--
So I was trying to contribute to the open source project GraphiQL. It's a little playground to test out GraphQL queries, and trying to contribute to that was actually pretty hard. There was a lot of context I'd missed. And at the time the project itself was transitioning from Facebook's org to the GraphQL Foundation.
But also, pretty much everybody who was becoming a maintainer on the project was transitioning in and newly owning the project. So there was a rough transition, moving from org to org, but also the maintainers becoming acclimated to the project as well.
They were all familiar with it, but now they owned it, so they were all trying to figure out the best practices and how to clean it up. That's when I was trying to contribute. So I was looking at the issues and I'm like, man, I think I could solve this one, but then I'd find the bug's actually invalid because it seems to be fixed.
So they had a ton of old issues just sitting there that were invalid because of the transition, because they had fixed a bunch of stuff. They were still getting acclimated; no one had gone through and closed out a bunch of old issues, and closing out a bunch of issues automatically, with no reason or question, sends the wrong signal to the users. So they just sat. And I tried working on an issue that was invalid.
And I discovered, when I commented on the issue with my thought process, that one of the maintainers came back to me and was like, Hey, this is actually invalid. All that backstory I just told you, he told me right there on the issue. And he's also like, Hey, we also have a Discord that you should come and chat in if you want to work on anything else. And I was like, Oh, okay, that's weird; in the repo there was nothing about a Discord. They've since added it. But then I was able to get all that context, the conversation and the questions, like: what is happening with this project? Where can I help out? All in Discord. So that's sort of the summary.
That story is a summary of what I'm trying to accomplish. No one like myself needs to go into a project and be confused, knowing they have the skills to actually fix problems but not knowing where to start or how to approach it, because the way I do code at GitHub is different than the way GraphiQL is doing it in their repo.
So those are the high level goals, with some other features we're trying to work on, but we're always taking ideas. If you go to opensauce.pizza, that's the actual website that's live. github.com/opensauced also exists, for anybody who wants to contribute or just ask questions.
Open ideas, cool ideas, or bad ideas, it doesn't matter. Open up a discussion. We'd love to hear what problems people are facing in open source.
Jeremy: [00:22:39] Do you envision this being something where the list of projects is curated or is it more somebody can pick any project on GitHub?
Brian: [00:22:48] There are projects that do curation for open source projects. GitHub has the explore feature: you can sign up for a newsletter, and you get a bunch of projects every night or every week, I forget what the cadence is. And then Changelog has a nightly list of the most popular projects: here's a list of them, check them out. And then Code Triage was another project too, where you can be sent a curated list of Ruby projects or JavaScript projects as well. We do want to have curation as a feature. But this is more: there's a repo or a library that you're using; you add the library or the repo, just the URL, to open sauce, to your dashboard. And this is all login through GitHub, using your own data. The backend is all the open source repo.
So when you log in, you click the create-the-repo button, and it starts tracking all your notes and all the issues, in the repo itself, all open source. Once you've done that, you have a nice tracking issue to say: okay, I've looked at this issue; this issue doesn't work, invalid or whatever; I closed this. We also track your contributions, so if you do any PRs they'll show up in the list. In addition to that, it also tracks your issue contributions: if you comment on an issue, it shows that in the list as well. So in the eyes of open sauce, nontechnical contributions are contributions. That's another thing that I stand on: just because you don't have a green square for that day doesn't mean you didn't do anything.
The platform itself-- to answer your original question: no curation today; curation in the future, maybe. It is on the roadmap, but it's not actually realized in a plan. The focus really is around: if I already know the project I want to get involved in, can I just take it to open sauce and get all the information I need, digested?
So I can just click through steps one, two, three, to hammer down on that onboarding experience. Like, there's a project called Babel; they do transpiling of JavaScript between different versions. One of the best things you can do if you want to contribute to Babel is use Babel.
I did mention triaging is another thing you can do, but if you already know how to do it and you're ready to start: use Babel, use Babel plugins, build a Babel plugin. Try going that far and seeing how it actually works under the hood and how you interact with the actual Babel core library.
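For a sense of what "build a Babel plugin" means in practice, here's a minimal sketch (a hypothetical example, not from the episode): a Babel plugin is just a function returning a visitor object that walks the AST. To keep the sketch runnable without installing @babel/core, the last lines exercise the visitor directly against a hand-built node.

```javascript
// Hypothetical minimal Babel plugin: renames every identifier `foo` to `bar`.
// A plugin is a function returning an object with a `visitor`; Babel calls
// each visitor method with a `path` wrapping an AST node.
function renameFooToBar() {
  return {
    name: "rename-foo-to-bar",
    visitor: {
      Identifier(path) {
        if (path.node.name === "foo") {
          path.node.name = "bar";
        }
      },
    },
  };
}

// Normally you'd run this through @babel/core, roughly:
//   require("@babel/core").transformSync("foo + 1", { plugins: [renameFooToBar] });
// Here we exercise the visitor directly with a hand-built fake path:
const fakePath = { node: { type: "Identifier", name: "foo" } };
renameFooToBar().visitor.Identifier(fakePath);
console.log(fakePath.node.name); // "bar"
```

Even a toy plugin like this forces you through the same mental model (nodes, visitors, paths) that real Babel contributions rely on, which is the point of the "use it before you contribute to it" advice.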
So that's a recommendation, and it's a recommendation I'm actually trying to work on with that team, hopefully. I talked to them months ago, but I haven't really picked up the conversation, because I wanted to focus on actually getting a dashboard working. But I would like to see, whether it's Webpack or Babel or something else, as part of my onboarding experience: build a simple tool or a clone or a hello world to actually get my brain wrapped around it.
So that way you can confidently go in there and answer questions around, how is this broken for this user? And how can I fix it in the context of what I know?
Jeremy: [00:25:39] So it sounds like coming up with-- I'm not sure what you would call it, almost like an exercise of before you contribute to this project here's a well-defined thing that you can build so you have an idea of how to tackle a real problem.
Brian: [00:25:56] Yeah. Yeah. And I think it's easier for some projects than others. But I think that's on the maintainer to say, Hey, here's the contributor guide, but in addition to the contributor guide, here are the actual action items to get yourself up to speed. Whether that's building something on your own or just cloning one of the example repos and walking through it, those are all possibilities.
But it's up to the maintainer, not everybody has to have the same sort of step or guides or not everybody's working on projects on the web. But as long as you have the steps, that's all that matters. So if someone actually knows what the step is to actually get started, that's helpful.
And like, we're talking about, at the moment we're currently in like an existential crisis, or at least America is. And there's a lot of people who have been underserved by their leaders and their community leaders and even the higher levels of government. And you go into cities and there's a difference. Take LA County. LA has one of the largest police forces in the United States. LA has one of the worst public school systems in the United States. I know we're talking about a political issue so I won't go too deep into that, but really what it comes down to is information sharing.
So if somebody who is in LA County is working towards life skills or just growing their career or whatnot, if they have to go to the public school system there, they're going to miss out on a lot. Like, there's going to be a lot of information they just don't know. And if you happen to be just one county over, which is Orange County, then you're in such a better experience, and it's such a step up.
And I think that it comes down to like, if I want to contribute to open source and I wanted to level up my skill in my career am I getting the right information by contributing to this project? Or even using this project? I think that should be a decision that we should make as far as contributing to projects.
If there are not people going in there and contributing to that free-form, free-flowing information, and there happen to be few people who are managing the project, whether it's good or bad, that should be eye opening. Cause then you have one or two points of failure. Like, one person gets sick or has a kid or takes time off, then it's down to the one person left over to actually contribute. And there's nobody else in the entire developer community that has the knowledge to actually contribute back.
This is maybe not popular to talk about, but Facebook has a lot of open source projects that we are building entire products, our features, our companies on. Yeah, but the only people who work on them are technically Facebook employees. So is that really open source? And I know things like React, they do have contributors outside, but the individuals making all the decisions are internal Facebook employees.
And I know they have the best interest of the open source community. I'm picking on them because they just happen to be the example I have on top of my head, but it doesn't seem like information is really flowing back and forth. And maybe I could be corrected too. I'm happy to be corrected on that. And if there's information in the React community that allows people to onboard a lot easier, then I'm all for hearing it.
I'll probably do my research after this podcast, cause I pulled it out of thin air and picked on them without having any sort of backing statement. But anyway, regardless, there are projects that do not have information flowing that we're supporting or leveraging in our projects. So whether it's React or not, we should take a hard look at: is there a proper onboarding for anybody to basically jump in there and get things done?
Jeremy: [00:29:37] Yeah, I think that's an interesting point in terms of when you have companies, whether it's Facebook or any other company you have people who are being paid to work on these open source projects and ultimately the company that's paying, they want to get something that's of value to their own company. And, whether it's a benefit to the rest of the open source community is it may or may not be front of mind. So I think that that's an interesting sort of, I don't know if you'd call it a problem, but a discussion to have. How much control do companies have over the software we use and is it too much, and on the flip side, it's like, if it's not companies doing it, then it's volunteers doing it. Maybe that's an issue too, right? Like that we're relying on so much software that's being worked on by people for free.
Brian: [00:30:36] That's the thing that I like discussing too, as well, which is not just onboarding but the decentralization of open source, like its future. This might be counterintuitive to everything I said before, but when you talk about working for free, there is money being funneled into open source.
And again, I apologize for picking on projects that people love and leverage in their projects day to day, but look at a project like Webpack, and I only pick on them cause I know them and I use them, and I know the maintainers as well. But you see the project is making half a million dollars a year just in Open Collective.
So that's the one location that I've looked at and can sort of cite today, cause I've just looked at it recently. But that actually pays contributors on contract to help solve and squash bugs. So when you look at that, that's awesome. Actually, hats off to them, and I think we should see more of that. I don't think that's a bad thing, just to be clear.
But what about projects like Rollup or Parcel or all these other bundlers and packagers and stuff like that? Those are all valuable projects, but they're not getting the same sort of funding. Are we voting by the dollars that we donate as well? And that's another question that was asked, and like, I'm not here to say that's wrong or bad. I'm happy to fund other people doing open source. Cause I think it's not about true open source, like the Richard Stallman, open-source-everything type of deal.
Basically what I'm getting at is that we should put our dollars where our mouth is, but also we should put our money in the things that are actually providing value and providing information and providing access to all developers as well. The best thing about this is that you have all these bootcamp grads, all these college students, coming out the gate, leveled up, and ready to ship day one, which is great.
There's no month-long process of like, oh, you're only stuck doing bugs or reading articles. You can actually ship code day one, because you had a GitHub account while you were in college, you have access to all open source technology. So if you want to build a quick website or a Minecraft server, whatever it is, you know how to interact with Stack Overflow and forums and answer questions to get your job done.
And that information sharing has exploded the ability for us to grow our developer community, to be able to hire developers and train them quick. And like, all bootcamp grads are only two years behind anybody else, because they just need two years of experience to actually get up to speed. Because the web, the mobile, everything, like, code changes quickly. Well, not all code's the same, but I can speak for the web. The web moves quickly. So you're only two years behind the last person.
Jeremy: [00:33:19] Yeah, I think that's really great that more people are getting exposed to the idea of what open source is and having the skills to be able to contribute. And what I also think is interesting is Dan Abramov, who's on the react core team and he's also the creator of Redux.
He was talking on a podcast and he was saying he has all these projects that he no longer maintains that he used to work on and he feels a little guilty about it. But he was also saying that if somebody comes in and takes over those projects, some of them, when he was working on them, he was working on them at Facebook.
So he was getting paid to work on them. And when you have somebody come in who's coming in on a volunteer basis-- I'm not sure the word he used. It's almost like they've been tricked, I guess, is what he was saying. I was working on this thing, getting paid for by my employer, and somebody else is coming in and taking that on for free. And so there's this interesting imbalance in terms of the people who are getting paid to work on it and the people who aren't.
Brian: [00:34:25] I mean, it's a challenge, cause there's a lot of people actually able to-- the GraphiQL founder, or sorry, maintainer, I was talking to, he's getting paid full time to work on GraphiQL. So there isn't a balance. Like, he definitely is a knowledge holder, but I think that's a testament-- like, I spent at least two minutes dogging Facebook, but also it's a testament to Facebook that they actually value putting open source maintainers there full time, to support the community, and also even open sourcing it in general.
Like, there was a time when I first got into programming where you didn't open source stuff, just because, like, it didn't make any sense. I talked to people at Pinterest and they open sourced some things. Like, they had a very similar front end framework, which they called Denzel. And like, maybe it was open source. I don't remember, but I'd never even heard of it until I talked to someone at Pinterest.
Facebook put the time into actually promoting it, putting a conference on, and actually getting people to care about it and saying, like, hey, this is actually the way to do it. They get value because then it's easier to get hired at Facebook, despite the fact that they don't actually use React in their interviews. But it is a leg up, like, you're knowledgeable, and Facebook is hiring React developers or JavaScript developers, or even doing JavaScript at Facebook.
So I would say that for the person who's sidling up against Dan and working with him and getting feedback from him, he's actually getting mentorship directly from Dan. And I would say that's not a monetary value. That experience, that relationship that you get, is invaluable, to be quite honest.
And I said, I was talking about the whole LA County thing and the information sharing. Like, the more the information is shared, the more value there's going to be. Like, the information I've gotten for free from just doing open source, being involved in the community, and going to meetups is invaluable. I would not be here today without it.
But if I relied on someone to tell me that, or on reading blog posts on my own, or on figuring that out myself, I would not be here today. So I would say, like, yes, it would be nice to have a six figure salary to work on open source every day and triage a bunch of issues. That'd be amazing. But also the fact that Dan's accessible, and makes himself accessible, I think, is what makes the biggest difference. Dan is a figurehead for the React community. But the fact that I can go to React, open up a PR, and get Dan or Brian Vaughn or somebody else from the team to actually review my stuff and give me feedback and tell me what's up and make me feel comfortable, that's a big deal.
Jeremy: [00:36:57] For sure. The way we learn the quickest is when it's from somebody who knows more than us, or has come before us and is able to teach us. And like you said, I think that can be really invaluable, for sure.
Brian: [00:37:07] Yep.
Jeremy: [00:37:08] Another thing I want to talk about is you've been a developer advocate for GitHub and previously for Netlify. And I know in the past you had mentioned you had been a little hesitant to take on the developer advocate role because you were really interested in coding and engineering work.
Have you ever thought about going back into a more engineering focused role or what keeps you in the advocacy role?
Brian: [00:37:37] So what keeps me there is my paycheck. So I'm paid as a senior developer. That was the whole deal for me to go to GitHub. And that's helpful.
Also, I love community. I love interacting with the community and having opportunities to be out there. I miss being able to just put on my headphones and write some code all day, go to lunch, come back, write some code all day, and then have maybe a meeting once a week, do a standup once a day. I do miss that, and I do miss that solitary time, but also, I mean, I am a pretty outgoing person and happy to have those conversations.
So like, there is a balance of doing that. And I think a lot of devrel folks, they come and go. Not like they quit devrel, but they do go work on a project for a while just to get back in the right head space, to be able to actually talk about devrel.
Like, one of my biggest fears from doing developer advocacy full time is that, not working on a project full time on a regular basis, your skills start to not keep up. Because, as I mentioned, you're only two years behind the last thing that came out. So if you're not constantly trying new things out and seeing what's out there, then it might be harder to get an engineering job later on.
I've mostly given up on the dream of climbing the engineering ladder. And I've only made that decision recently, because I think I get a better feeling around writing code when it's my own code, but also open source. So another reason why I even have Open Sauced is because it was a project for me to have long-standing code. Like, learn how to write tests, learn how to use hooks in React when everybody was transitioning. Like, I had a project to leverage, and there's no pressure to ship. There's no PM pushing me, like, hey, we should have had this last week.
Like, basically, instead of sitting and writing code, I watch a lot of tech videos. I do a lot of screencasts. I do a lot of Twitch videos as well. So I have more freedom and less pressure to ship things, mainly because I don't have a project that needs to be shipped constantly. So I tend to build, and I like that pattern for developer advocacy, and I recommend this for anybody: build a project that you can actually use to leverage your skills and keep that going.
So whatever it is, if it's your sourdough bread making app or whatever it is to tell you when to feed your starter-- which, I mean, I mention that because I actually want to build that-- but anyway, build something like that so that you can leverage it and talk about it on a regular basis. And I think most devrel folks have that app for them, and I think Open Sauced is mine. So yeah, I guess the original question was, like, yeah, I do have feelings around doing full time engineering, but I'm actually pretty content with my role today and my access to information and leveling up my skill sets.
I am not spinning up Kubernetes clusters, nor do I even know how to do that anymore. I've done it before, but, like, it's going to take me a bit to figure that out again. Like, just give me the one that's working in your repo and I'll go from there. And that's my approach to code: write it quick, get it done, and maybe write a test.
Jeremy: [00:40:41] Yeah, that's cool. You may not, day to day, be diving in really deep on coding, but you keep that for yourself, for your own personal project, so that you keep your skills up. Plus you get to work on your own terms at your own pace, so you don't lose that joy or the fun of just building things.
Brian: [00:41:04] Yeah, and I mean, to be clear too, I do have projects that I do maintain. It's just that these are projects that I only maintain, like, twice a year for updates, and I'm just basically having Dependabot update them for me. And then every now and then we'll add a new feature or answer a question or something like that, but all closed source stuff, to make my devrel a lot easier.
Jeremy: [00:41:26] Cool. I know we're running up on time, and I just wanted to ask you one more question. Five years ago you were a new developer. You moved from Florida to the Bay Area. You attended a lot of meetups and community events, and now you're on the other side: you're the one giving talks, giving presentations, and talking to new developers. How do you feel like things have changed?
Brian: [00:41:52] Yeah, I mean, it's changed a lot. And I think, asking the question now at the time that we're in currently, I envision it's gonna change even more in the next year. But I would say when I first got into programming, jQuery was definitely a legitimate place to put all your JavaScript. CoffeeScript was probably the next level above it. And they were pretty legit things to use. I know a lot of JavaScript developers from 10 years ago are probably cringing at me saying that, but you didn't have to know a whole lot. I think we had a lot of stuff that we just took for granted, and we've seen a lot of security vulnerabilities because of that.
So I think now-- I feel like the developer space has just leveled up in being educated in things like security and progressive web apps. So with that being said, there's a lot to learn, so you can't be counted on to know everything. And that's the other thing about being a developer advocate. It's like, no one knows everything.
There's no pressure for me to get back into engineering full time so I can know everything, cause no one does. No one's perfect at backend orchestration of servers and spinning them up in containers. And even on the front end, doing CDNs, like, no one's really an expert on that.
And I think people are really focused on things like the JAMstack, where you can just pick and choose and leverage tools and free accounts that get your stuff mostly done. I think that's been a big change as well. And I think I've ridden the wave in that change, where I now have an entire project where I have no database.
My database is literally github.com. And could I have done that as easily five years ago? Probably? I roughly did it four years ago, but four years ago as a junior developer. So that goes to show we're transitioning: if you want to build something on top of a third party API or whatever, there's a lot of tools for you to use free. And I think there's a lot of VCs and a lot of founders and a lot of open source projects that are really looking at the space, looking at this sort of mock regurgitation of developer tools and how anybody has access to anything. And it's been super fascinating to see that.
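The "github.com as my database" idea can be sketched as reading app data straight from GitHub's public REST API instead of running a database of your own. The endpoint shape below is GitHub's real issues API; the owner and repo names are placeholders, not Brian's actual project setup.

```javascript
// Build the URL for GitHub's "list repository issues" REST endpoint.
// State defaults to open issues; no auth token needed for public repos
// (subject to unauthenticated rate limits).
function issuesUrl(owner, repo, state = "open") {
  return `https://api.github.com/repos/${owner}/${repo}/issues?state=${state}`;
}

// In an app you would fetch this and render the JSON directly,
// treating GitHub as the data store:
//
//   const issues = await fetch(issuesUrl("some-owner", "some-repo"))
//     .then((res) => res.json());

console.log(issuesUrl("octocat", "hello-world"));
// → https://api.github.com/repos/octocat/hello-world/issues?state=open
```

The trade-off is the usual one for third party APIs: you skip operating a database entirely, but you inherit GitHub's rate limits and data model.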
Jeremy: [00:43:53] Cool. Well, I know you got to get off to a meeting, so I just wanna say thanks for chatting with me today, Brian.
Brian: [00:43:58] Cool. Thanks, Jeremy. Looking forward to seeing what comes out.
Jeremy: [00:44:02] I hope you enjoyed the chat with Brian. If you're interested in getting into open source, you can check out the show notes at softwaresessions.com. I've got links to Brian's project, Open Sauced, and a link to where he does his Twitch streams. The music in this episode is by Crystal Cola. Alright, I'll see you next time.
Lauren is a Software Engineer on the React Organization's Web Core team at Facebook and was previously an Engineering Manager at Netflix.
If you enjoyed this discussion with Lauren, be sure to check out her episode on the Changelog.
Music by Crystal Cola: 12:30 AM / Orion
Transcript
You can help edit this transcript on GitHub.
Jeremy: [00:00:00] Hey, this is Jeremy. Usually when software developers are talking career progression, it moves in the direction from being a software engineer to becoming an engineering manager. And today I'm talking to Lauren Tan who moved in the opposite direction. She was an engineering manager at Netflix, and she recently made the decision to become a software engineer at Facebook.
We discuss why she made that decision and the differences between being a software engineer, a technical lead and an engineering manager. We also discuss what it means to be a senior software engineer and the ways that you can increase your impact and your influence in a software engineering role. I really enjoyed the conversation with Lauren and I hope you do as well.
Hey Lauren, thanks for joining me today.
Lauren: [00:00:42] Hi Jeremy. It's such a pleasure to be here. Thanks for having me.
Jeremy: [00:00:45] If we look back at 2015, you're moving from Australia to Boston, you're starting your first senior developer role at the DockYard consultancy. How did you get into this position, where you decided, I'm going to leave this country I live in and I'm going to start this senior developer role in Boston?
Lauren: [00:01:03] A long time ago, I never really planned to leave Australia, let alone come to America. And I kind of trace this back to how I got my career started in technology, where it really started as a hobby, creating silly applications.
In fact, one of my earliest introductions to programming was through Excel, making elaborate spreadsheets and writing Visual Basic, or VBA. It was something that I never really planned to do. Long story short, after college I started a startup called The Price Geek with one of my classmates.
And at the time I was getting really interested in essentially exploring more of this hobby that I had of programming and potentially exploring the idea of turning that into a career. So the year or two that I worked on that startup was really fun. We learned a lot about product development, about the business side of things, how to manage your money and how to get funding and financing.
That was all really interesting. And near the end of the startup, when we were basically throwing in the towel, I realized that I enjoyed it so much. Despite the fact that my degree was in finance and not computer science, I enjoyed it so much that I thought to myself, wow, it would be amazing if I could keep programming as a career.
So I was very fortunate to get a first job in Australia as a software engineer. And I had started writing a bunch of blog posts and sharing them on Twitter and on Medium. And slowly but surely, I got people reading them. And there was a point where one of the creators of the JavaScript framework that I was writing about got in touch with me to say: hey, would you be interested in coming to speak at one of our conferences? And of course, I was totally taken aback, because first of all, I had never even been to a tech conference at that point, let alone spoken at one. So I had totally no idea what I was doing. But I was convinced by them to apply. So I did, and I'm very grateful that they did that.
And doing all of this, essentially, I started to get the attention of some of the people working in America. The CEO of that consultancy, DockYard, reached out to me and asked if I would be interested in working there. And at the time they were pretty well known in the field of building Ember applications and Ruby on Rails applications. And so I thought it would be pretty interesting to go and work there and learn from some of the people that I really looked up to in that community. And that was the start of my career in Boston. And really, it was a difficult decision to move. I think moving anywhere is difficult, but the move from Melbourne to Boston was exceptionally hard because it's a totally different country. It's so far away. My family and my friends wouldn't even be in the same time zone anymore, on opposite ends of the world, really. So that was particularly difficult. And of course the Boston weather, it's terrible. And part of the reason why I was like, I need to maybe live somewhere else, is because of the terrible Boston winter that I experienced in 2015.
Jeremy: [00:04:33] That makes a lot of sense how you ended up in California.
Lauren: [00:04:36] Right. I was like-- I need to go somewhere warm.
Jeremy: [00:04:40] One of the other guests I'm going to have on, Swyx-- he often talks about learning in public, which you were doing with your blog posts which got you noticed.
So I think that's good advice for software developers in general that putting yourself out there and sharing knowledge can really make these opportunities come to you.
Lauren: [00:05:01] I think it can, but I also want to say that I think that developers that learned their craft during the time that I started I think we were very fortunate in the sense that the web was a bit of a simpler place back then. People would build applications just literally using HTML, CSS, and vanilla JavaScript back then. You might just consider using jQuery or Backbone, or MooTools even.
A single page application really wasn't the norm. I think today is a very different world because software development-- I don't know if it's gotten more complex, but I think at least in the world of front end development it's gotten much more difficult to just get started.
Not saying that you can't build an app with just HTML, CSS, and vanilla JavaScript. But if you want to get a job doing it, then there is a bit of a higher bar I think. So I will say learning in public can be very helpful. But I also don't want to lie and disguise the fact that the environment has changed.
Times have changed and things are getting slightly more complicated and complex to build and that just means that there's a bit of a higher hill to climb.
Jeremy: [00:06:18] If you are going to make a site, you have so many options you have React, Vue, Ember, Svelte. There's all these different frameworks and do I use Javascript? Do I use TypeScript? It's definitely a lot more-- I don't know how you'd describe it. Intimidating, I guess.
Lauren: [00:06:39] It shows the evolution of how front end development has improved, in a way. It's a mindset shift, I think, in the industry, where previously, like 10 years ago, it was still okay to just build what people might call enriched documents. Really, documents sprinkled with some interactivity. But these days you're often building interactive applications that warrant a framework like React or Angular or Svelte or Vue. So I think maybe the problems that we're trying to solve have also changed, and they warrant more complex solutions.
Because I don't think the answer is to say like: Oh, we just need to get rid of all the complexity. The complexity exists for a reason. I think if I had advice for someone who was coming up in the industry, I would say, don't get intimidated by all these different technologies. And honestly, it probably doesn't really matter in the grand scheme of things which one you pick as long as you pick one and then you don't shut yourself off to learning from the others as well.
Because frameworks will come and go, but the knowledge that you acquire from using these frameworks will hopefully stay with you for a long time. And so those are much more transferable than knowing every single detail about the React API or something like that.
Jeremy: [00:07:56] Yeah, I think that's good advice. And I also wonder, when you started-- you had experience building applications in things like Rails. There are a number of frameworks where you can build a front end using primarily server side code, not necessarily build a single page application. People starting out, is that still something they should look at or do you think they should jump straight to single page applications?
Lauren: [00:08:24] I feel like it depends on your goal and hopefully if you're learning to program, hopefully you also have a project or some kind of motivation for learning those technologies. You should hopefully use the right tool for the job. And if you're building something that really doesn't require a lot of interactivity, then maybe a single page application is overkill, even though it might be beneficial for you to learn.
So I think it depends on your goal. If your goal is purely just for educational purposes, then by all means, choose the fanciest technology stack and learn away. But if you're actually trying to get a project going off the ground, I feel like it's probably not that useful to bike shed on like, do we use Svelte or do we write our own thing, or do we just use server side rendered templates?
I think those are all fun as technologists to debate and think about. But they're just in my opinion obstacles for actually trying to do what you're trying to set out to do. So that's a bit of a roundabout way of saying that I think it depends on your goals. Is it to learn or is it to build something that you know you can get out the door really quickly. And depending on what goal you have, I think my suggestion would be slightly different.
I think fortunately, if your goal is mainly just to learn, then any one of those single page application frameworks are great to pick up. My only suggestion would be, again, like not to tie yourself too closely to just one framework, even though one may seem like the incumbent. The one that every company is hiring for and that's fine.
Maybe you start there, but don't let that limit you from learning everything else. Because again, like there are a lot of concepts from the different frameworks that often make their way into other frameworks as well.
Jeremy: [00:10:17] That kind of reminds me of how, when you first started, you were very focused on Ember, and now you're deeply involved in React. You don't have to feel like you're tied to just the one you start with.
Lauren: [00:10:29] Absolutely. And I think in the tech community there are a lot of these people who say: Oh, you know, don't bother learning a framework. Just learn the fundamentals. In spirit, I agree with that principle. I think that you should learn the fundamentals. But I also agree that actually learning a framework first is not a bad thing. In fact, it helps you.
Sometimes you don't need to peel away all the layers of abstraction straight away because that can be very overwhelming. And I think single page applications, there are a lot of tutorials online that you can follow and you can get something working. And that is your basis for starting to then peek under the hood to say: Oh, how actually does that work?
Why did I use a component here instead of making a component that does this other thing? I think of it more like the onion of knowledge, really. I don't know what a good analogy is, but like an onion in the sense that there are layers that you peel away, and you slowly understand what the frameworks and the languages are doing.
And in fact, I see even today, like my career and the stuff that I'm doing is continually peeling the layers. Maybe today I may not be working on writing an application anymore. I might be working on the infrastructure that powers the tools that allow this application to be made.
But I wouldn't have been able to have gotten here if I had not been building applications before. So you go deeper and deeper. But you can't go deeper without a strong foundation. So my advice is start with what's comfortable. Start with something that's easy to learn and use that as a foundation for going deeper into the technologies and the areas of programming that you're interested in.
Because maybe you'll find that front end development is not for you and maybe you'll realize that actually I prefer back end development. And that's perfectly fine. There's no one path in this industry which is pretty cool. So I would say keep it broad, learn as much as you can, and then follow what interests you and what excites you.
Jeremy: [00:12:35] A lot of people when they're learning, it's hard to stay motivated unless you're building something that you can see. I think in that respect if using a framework like React or Phoenix or Rails-- if it's going to get you to the point of being able to see something working that will keep you motivated, keep you moving, then it makes a lot of sense to start there.
Lauren: [00:12:57] Yeah. I totally agree. There are a lot of great concepts in these frameworks that will apply in other areas as well. Again, whether you use this framework or that framework or no framework, there are still a lot of programming patterns that you can learn. Which is why if I were to start learning how to code again, I would still start from the same place. I would still pick a framework and go with that and then figure out how it works.
Jeremy: [00:13:22] I want to take us back to your time at DockYard. I believe your title was senior developer. What do you think made you a senior developer or did you feel like one at the time?
Lauren: [00:13:34] I think that's a great question. I think my general viewpoint on this is that I don't think we have agreed upon standards for what we deem senior, and I don't want to be the gatekeeper of what determines someone as a senior engineer. But I certainly didn't feel like a senior developer, at least in my definition of what I thought a senior developer should be at the time.
And at the time, I think I had a fairly naive impression of what a senior developer was. My thought was that a senior developer is essentially the person who is the best programmer. Who knows every single API by heart. Who is a genius at all the internals of every library that they use. Just technical, technical, technical chops.
But interestingly, the more I worked there and the more I interacted with others, people who had the same title, the more I realized that my viewpoint of what makes somebody a senior developer or engineer was totally off. And today I feel like the technical chops are just a small part of the skillset and the toolset of a senior engineer.
And if that's the only thing that you're bringing to the table, then-- it's not necessarily a bad thing, but I think you're doing yourself a disservice by not flexing those other muscles. Which is a huge lesson I learned when I took on the role of a manager. But yeah, I definitely didn't feel like I was a senior developer back then.
Maybe today I feel more like a senior developer but I think everyone has this different definition. But at least in my definition, I think I feel pretty confident in saying that. Yes, I am actually a senior developer.
Jeremy: [00:15:22] So what would you say were the key differences then? Because you were saying that it's beyond just the technical aspect, but what are those pieces that make you feel comfortable saying that you're senior now?
Lauren: [00:15:36] First of all, it was a mindset shift for myself. I can't pinpoint a specific point in time where it happened, but I certainly recognize it today, where I essentially no longer feel the need to rush into writing code. Whereas in the past, the moment you get a project, you're like-- all right, I need to write this proof of concept. You just focus on writing code, and your impact is all about the raw output of your keyboard, essentially.
And that was the wrong mindset to have because what I learned over the years in working on different projects and in different companies is that oftentimes the most impactful things were not actually the result of code.
It could be a conversation that you have with your customer or your client to find out that an assumption you had made was incorrect. There are things you can ask in a question and get an answer to in 30 minutes, or you could spend days and weeks building something, bring it back and show it to them, and then they tell you: this is not what I wanted.
I learned that lesson very painfully because I was one of those people who would just rush into writing code. My viewpoint was if I don't have to talk to anyone then I'm succeeding. But that was totally incorrect and it was a tough lesson to go through, but I think a lesson that I sorely needed. It's definitely affected the way I operate today.
I think today I don't shy away from talking to people. In fact, I will go out of my way sometimes to have conversations with people even when it's going to disrupt the time that I enjoy of writing code because I know how impactful conversations like that can be especially when you're trying to do things that are maybe not very certain or get more context or even prioritize things.
I think another aspect of being a senior developer is knowing when to say yes to things and when to say no to things. I don't think there's a decision tree for when to say no or yes. I think it's very much based on intuition, your understanding of the context and the problems you're trying to solve, and also organizational challenges that may come up. Prioritization is something I feel like we don't often talk about. Because again, if your mindset is that your impact is based on the pure output of your coding, then you're not going to be in a position where you can say: hold on, before I jump straight into writing some code, let me actually speak with my manager and challenge the idea. Wait, is this actually the best way to do it? Do we even need to write any code to solve this problem? Maybe it's an organizational problem. If I were to distill it down, I think it's the realization that my output is not just code anymore.
And I think that for me was the point where I could say to myself: I am a senior engineer. Even though maybe I'll join a company and not immediately be an expert in all of the proprietary tools that they have, which is expected. How can anyone be an expert in technologies they haven't used? And there's certainly no expectation, I think, at any company you join that you will immediately be the foremost expert on something they do within the company. So the thing that you bring from position to position or project to project is really those core skills: your understanding of fundamental programming, things that are transferable, but also the organizational chops that are equally if not more important than those foundational skills.
Jeremy: [00:19:34] Earlier in your career, like at DockYard, did you feel like you had the authority to ask those questions, to challenge your manager or go directly to the customer? Was there anything that was stopping you from doing that then?
Lauren: [00:19:34] I think there were a number of challenges for sure. The agency relationship with customers makes it a little bit difficult because as an agency or a software consulting company you are not always in a position to question or challenge the client because at the end of the day, the client's paying you to build something very specific and sure, maybe you can point out the flaws in their plan or deficiencies but ultimately the contract that you signed states that you have to deliver a certain product by the end of a certain timeline.
That was probably the systemic challenge there, but I also didn't feel empowered to do that anyway, even if that hadn't been the case. There were a number of challenges, but I certainly didn't have the right examples either. For example, if some of the more senior people in the company had been doing that and setting a good example, then I think others would have followed as well. But I don't really feel like we were in a position to do so.
So I think that made it more complex. And I think once I started joining companies like Netflix or Facebook, where I currently work, that dynamic and the expectations also changed, because now I'm in a position where my job is not just to blindly output code because someone said so, but to be a problem solver.
And so I think it's a very different relationship. And I think if you are in a software consulting role or a software agency role then a lot of what I'm saying may not necessarily apply because you're not always in a position to go and question your client or customer.
Maybe you might find a customer that is awesome enough to let you do that and be receptive of the feedback as well. But that's not often the case, especially on projects where it's like super tight deadline, just deliver something in two to three months. So context is everything.
Jeremy: [00:21:42] Yeah. That's an interesting point about how when you're working at an agency, you're ultimately telling the customer: I will do this thing for you. It's written down in a contract, whereas for a more traditional company, it's really dependent on the culture of that company and maybe that's something that when you're interviewing or learning more about that company, you would want to figure out how much agency or how much control will you have as an engineer being in that company.
Lauren: [00:22:14] Yeah. I think that's a great point. And it's a question I will often ask when I've done interviews as an interviewee in the past: asking people, especially the engineers on the team that are interviewing me, for examples of times where they were empowered to say no to certain things. And I think the way that they answer those questions will tell you a lot about the culture of the company.
I often find, as a meta point, that asking your interviewers for examples is always very helpful for me in trying to reverse engineer what the culture is really like versus how it's advertised. Because sometimes, and it's not ideal, there's a disconnect between what is stated and what really happens. And I think there's no better way to learn that than to ask for examples.
Jeremy: [00:23:08] What are some questions you would ask to reverse engineer that?
Lauren: [00:23:14] Oh, I have so many. The things that come to mind are, like I just mentioned-- can you share an example of a time where you were empowered to say no? Or, tell me about a time where you disagreed with a manager and were given the autonomy or freedom to go and explore the solution you were proposing. Things along those lines, where it gives the interviewer a chance to show that the culture truly is what they claim it is. If I think of more later I'll bring them up as well. Or I can share some thoughts in writing later as well.
Jeremy: [00:23:52] When you ask those kinds of questions, reading people's body language or just the way they respond you can infer a lot of information, beyond what's being said.
Lauren: [00:24:06] Yeah. Especially in the times where maybe that person is unable to give you an example and instead they'll talk about it more generally, which for me is a bit of a smell. It means that maybe they don't practice what they preach. And I think what you just said is very important.
I think the way that the person answers that question tells you a lot, even if they don't come right out and say: "No, I am not empowered to say no." Just how the person answers, what they say and what they don't say, is also important. But I also say that with the somewhat small caveat that maybe that person just hasn't had that situation come up. Maybe it's not that they haven't been empowered to say no; they've just never had to say no. That's not necessarily a bad mark, I would say. So a lot of judgment applies to how you interpret those answers. But again, they can be so subjective. So I don't know if there's a clear cut way to say, "Oh, this company is definitively bad or good."
And then I think to make things more challenging, depending on the size of the company you're going into a lot of the culture will really depend on the immediate team that you're on. And in fact probably the manager that you have is a bigger indicator of the culture than the general company-wide culture.
So it really depends. But if you have the hiring manager in your interview panel and you're given time to ask questions, then I would definitely bring lots of really hard questions and really get a sense of whether this manager will be the right person to support you in your career.
And, sort of going off on a tangent here, but I think my own experience being a manager has also taught me that there are lots of different kinds of managers and it's not like one is better than the other. I think there is some kind of matching that you have to do on your own as you understand what kind of support you need.
For example, if you're still early in your career, maybe you do need a manager who is very technical, who can give you a lot of technical feedback that helps you grow, at least technically. But once you get more senior in your career, maybe that kind of manager would no longer be as beneficial to you as they would be to someone earlier in their career.
And instead, maybe you might look for someone who is more of a sponsor. Someone who goes out and finds really difficult problems and says, "Hey, can you solve this?" And maybe that's what you need in your career. So I think spending the time to introspect and think about if I had the perfect manager, what would they be like?
And then go backwards from there and say what questions do I need to ask in order to determine if this manager would be that person? And obviously it's not perfect, you can never really know for sure until you start working with them. But it can at least give you more confidence if you're interviewing at lots of different places and you're trying to make a decision on where you ultimately land.
Jeremy: [00:27:17] Yeah. That's an interesting point about finding a manager fit. After DockYard, you moved to Netflix as a senior software engineering lead. In that role, what were you looking for in a manager?
Lauren: [00:27:34] So I joined Netflix not as a lead per se. It wasn't an official title, but more so an unofficial title after a period of time. I guess LinkedIn probably doesn't capture that very accurately. In terms of what I was looking for in my manager at that point, when I had just joined the company, I was at the point where I really wasn't necessarily looking so much for just raw technical chops.
I wasn't looking for a manager who was a better coder than me. I think the thing that I was excited about was Netflix's culture of freedom and responsibility and context, not control and all those things that they write on their culture memo. And I can actually safely say that pretty much most of it, if not all of it is true and they do apply in practice.
So I was very excited about going into a company where the culture was so different from anything I had ever experienced. And I wanted to learn the things that I had started to think would define being at the level I wanted to be at. You know, not someone who is only good at programming, but someone who brings a lot of impact to the team they work on, whether it's a contribution in the form of code, or an architecture document, or even a comment or some feedback that you've given someone. Because when I first joined, I didn't feel like I was in that position yet.
But after about a year and a half, I don't remember exactly, I did start to feel like I was getting the hang of the culture and also the technology at Netflix, to the point where I was very comfortable saying yes when my manager came to me and said: "Hey, would you like to be the lead for the team?"
I was like, yeah, absolutely. In fact, I already felt like I was operating like a lead. So this was more just a recognition that I was already operating at that capacity. So I think that my manager at the time was definitely very supportive and they looked out for opportunities for me.
And they were never really prescriptive about certain things like they may have had different opinions from me from time to time but they weren't afraid to say, you know what go try what you think is right. And then let's compare notes and see what turns out to be better.
And that was always very encouraging, because it creates this almost psychological safety of going and trying different things that people don't necessarily immediately agree with. If you can prove that something is better with a prototype or a document or whatever it might be, you're given the autonomy and flexibility and the space to go and explore that and then come back and say, you know what? This was either a good idea, or a bad idea, or inconclusive. But I think that was, for me, something that I really enjoyed about that part of my career.
Jeremy: [00:30:36] And it sounds like your manager gave you the opportunity to explore having more influence, having more control over the types of work you were doing, and how you were doing it and at that time in your career, that's really what you were looking for.
Lauren: [00:30:55] Yeah. I don't recall exactly when, but there have definitely been times where I've had what I would call the programmer's midlife crisis. Where you start questioning what you're doing and the way that you've been doing things and the purpose, and you start to look up from the keyboard, like, hold on a minute.
I can get this project done, but is this really the right thing to be doing? And I think the more senior you get, the more that urge will come to you and you start thinking more about, Hm. The moments where you say to yourself like, hold on a minute, something feels off.
And I think the turning point for a lot of people will be when you'll start turning those thoughts into action and instead of just saying, hold on a minute in your mind and then just continuing anyway, you start actually going forward and talking to people and say, hold on, here's something that doesn't sit quite well with me. Let's talk about it.
And in fact, I think one of the things I started to recognize once I was operating in that lead capacity even though maybe I didn't have the title just yet, was that actually I was spending less time coding.
And initially it felt kind of awkward. I was like, why am I in all these meetings? Why do I feel like my output has dropped a lot? And it was true. If the only output that you're measuring is my code, then it definitely dropped quite a lot. But the impact I was having on the team and the projects that I was on definitely outweighed that. It wasn't a net loss, because oftentimes when you have someone operating in a lead capacity, it means that in a way they're giving away those problems that are maybe more difficult to solve, and allowing others to learn from them, not hogging all the difficult pieces to themselves, which tech leads sometimes do instead of giving opportunities to others to grow. Creating those opportunities is actually a responsibility of a tech lead.
So going back to your question of what I needed from my manager at the time, more so than a lot of other things, it was being put in an environment where I could really flex those nontechnical skills. You know? Like if a manager is like a gardener, creating the right conditions in the environment so that I could not just thrive, but also evolve and grow and broaden my branches. It's a weird analogy.
Jeremy: [00:33:38] And, we've stepped around it, but I think the title of someone being a lead is a little fuzzy to a lot of people. Some people think a lead is the same thing as a manager. And it sounds like what you're saying is, in your case, a lead was someone who is able to ask questions to figure out what should actually be built. They're able to decide who should work on these things after you've decided what needs to get built, and, we haven't mentioned this, but potentially help the people who are building these things if they get stuck. Would you say those are the three primary things that a lead does?
Lauren: [00:34:21] Um, at a high level, I think that's pretty accurate. To be a bit more granular, I would say it also depends on the kind of tech lead that you want to be, or maybe another way to put this might be the kind of tech lead that your team needs. Because the truth is, at least from my perspective, just like managers have different archetypes, tech leads also have different archetypes, and it really just depends on the kind of project that you're working on. I would say though, as a minimum for me, at least from the technical side of things: yes, even though I wasn't writing a lot of code anymore as a lead, I was still reviewing a lot of code.
In fact, I would probably say I reviewed more code than I wrote. I think that was also part of the dawning realization that-- hey, you know what? You can contribute in forms that aren't just writing the code. And then slowly the universe expands: oh, if I step back just a little bit, I start to see the forest of what impact as an engineer is. And it was the realization that I had been focusing only on this technical tree and not growing all these other skills that are also really important. So I think tech leads are typically the people who are seen as the best engineer and get pushed into the lead position.
But I would say that tech leads are interesting in the sense that you're not a manager, you typically don't have reports, and you don't have any authority so to speak, over anyone. So all you really have typically, I feel is the influence that you've earned throughout your career in that company.
And that kind of social capital, if you will, means that people will start to listen to you because you've been around, you know your way around, and you've proven that you can handle large projects and grow other engineers. So I think being a tech lead can actually be more challenging in some ways than being a manager, because it's blurring the lines, I guess.
I think as a tech lead you're in this awkward gray zone between engineer and manager. You're not quite a manager. You're obviously still an engineer, but you're in a position of greater influence and, not really authority, but more respect is typically given to you.
And so you're in this awkward position. Where it again, it comes down to what your team needs. And maybe like for example, if I was to join a new team and I was the tech lead for that team and if it was a team of one or two people, then obviously the expectations and the way I would do my job would be very different from me joining as a tech lead on a team of 12 engineers. It's a very different set of variables that you have to learn how to tweak.
And again, it just depends on the makeup of the team as well. If I joined a team of 12 very junior engineers, my approach would be very different versus joining a team of 12 extremely senior engineers. It's all very fuzzy. I don't think there's any one way to do your job as a tech lead, or as an engineer, or as a manager. And maybe it sounds like a bit of a cop out answer, but I do think that a lot of questions can be distilled down to the age old answer: it depends.
Obviously just saying it depends and nothing else is a bit of a cop out. But I can say that there are different circumstances, and some may require more involvement from you as a tech lead at the architecture level, and some less. And in some, instead of worrying too much about architecture, maybe the problems are more around organizational challenges or headcount or constraints. I would imagine things like that are something the tech lead should be handling as well.
Like that example I shared with you of joining a team and being one of two engineers. Maybe one of the first things in my job would be to point out to leadership that-- hey, I've just joined this project and it's clearly very ambitious, but there's only two of us, and the timeline that we're going to work on is way too unrealistic.
So I actually need to campaign to my manager to say, this is why we need another one or two engineers on this project. And so that's why I think it's a bit tricky, because it really depends on the team that you're on.
Jeremy: [00:39:05] That's a really good point in terms of the size and the experience and the actual project that you're tackling. I think that's why people have so much trouble understanding what it is a tech lead does. Because from what you're describing, it's a completely different job from person to person.
Lauren: [00:39:24] Yep. Yeah. It's very context dependent, because you're straddling the line between manager and engineer, between manager and individual contributor. And so you have to sometimes wear the manager hat even though you're not a manager, and sometimes you have to wear the engineer hat.
But I think knowing when to switch hats is really important. And if the expectation that maybe someone else has set for you is that you are wearing the technical hat most of the time, then that's the expectations that you work towards. But I think for most companies, especially the bigger ones, I think there is an expectation that you also wear the project management hat, the organizational hat where you go and raise problems like that as well.
Jeremy: [00:40:10] So we've been talking about tech leads and managers and how the role of a tech lead is so fuzzy. After your role as a lead at Netflix, you moved on to becoming a manager.
What would you say are the key differences between being a manager and a tech lead? How did your job change, how did your role change?
Lauren: [00:40:33] So I do want to make the distinction that even though I said earlier that as a tech lead, you're in-between a manager and an individual contributor. I do want to say there are a lot of manager specific things that tech leads don't get exposed to.
It was definitely a big jump even going from tech lead to manager, let alone from a non-tech-lead engineer to manager. And I think a lot of those challenges were in the form of problems which I had never really thought about.
Like, maybe I would have said-- we need more people on this project. But I wouldn't have then gone on to say, all right, I need to spend the next three months looking for the perfect hire to join the team. Because that was the job of the manager, to really think about the people and the conditions of the overall team that they're supporting.
Whereas as a tech lead your sphere is slightly more constrained, to maybe a project or two and the results of those projects, whereas as a manager the expectations move to the organizational level, and your success is really determined by the work that your team ultimately does or doesn't do.
Maybe it sounds kind of subtle when I describe it that way, but I will say that it was definitely a very different job when I became a manager. The first thing is what I think is a very false conclusion that I may have harbored a long time ago, and I think a lot of people share the same sentiment, which I want to go on record and say I disagree with. And that view is essentially that becoming a manager is a promotion. In some ways, maybe it is a promotion: maybe financially you might get paid more, and you might have more opportunities to have certain kinds of impact, depending on the company that you're in.
But I will say for the most part having that mindset that management is a promotion is not the right one to have because I think it disguises the fact that when you go from engineering to manager you're basically going from very senior engineer who was very good at their job to going to become a baby manager who knows nothing about their job.
So it is very different. They're definitely two distinct tracks, and the skills you've used to be successful as an engineer, like 99% of them, are probably not going to translate that well to being a manager. You're not going to be expected to write as much code or even do code reviews. Instead, your role is really to ensure that you have the right people on your team, and also the right environment where those people can thrive, do their best work, and achieve the goals that have been set out for your team, and even to shape the goals that your team should be working on. So it was definitely a very different career change.
And I think even though I had expectations going in, that it'd be very different. It was a totally different experience doing it, if that makes sense. Like I was expecting one thing. I knew it would be different, but I didn't realize it'd be that different. And I remember as a manager, I was spending so many hours just looking through LinkedIn or reaching out to people on Twitter, and asking them like, "Hey, would you want to come work on my team?"
Because as a manager, your biggest lever for impact is getting to pick who is on the team. It may sound simple, right? Like you're just hiring. But I would say it's actually a very, very high leverage activity. If you find that person who fills a gap in your team, where maybe there's a certain technology skill or organizational skill that your team doesn't have that you want to have, and you're able to fill that position, and not just that, but create an environment where that person is happy staying for a while, then you've really done a great job. Because now you have a strong, solid dream team capable of doing the awesome work you need to achieve the vision that you have for your team.
And then you also have to balance that with the often difficult work of what is often called talent retention, though I don't really like that term, because I don't think so much about retaining people. That sounds to me like they're constantly trying to escape and you're just trying to hold them back.
I think it's more about creating an environment where people are attracted and they want to stay, not because they're handcuffed, but because they choose to stay: because you're a great manager, the team is good, and the work is impactful. If anyone listening is also going through that transition, or you've just become a manager, I'll say that for me the biggest challenge to overcome in the initial couple of months of making this transition was really understanding that it was a completely different job, and then changing everything I did for that new reality, and not trying to go back to the skills and activities that I had been relying on as a tech lead.
Jeremy: [00:46:10] You gave a few examples of what a manager does that wouldn't go to a tech lead like hiring and if your team needs more resources, making that pitch to get more money, and also creating an environment where people want to stay there. Are there any other specific examples for the types of things that would only be a manager's role versus something that a tech lead would do?
Lauren: [00:46:36] Yeah. So this is not going to be an exhaustive list for sure, but I can at least point out the things that are not immediately obvious, not unless someone explicitly says it. But I will say actually, depending on the company, I think one of the biggest jobs of a manager is actually the flip side of hiring, which is firing.
And that's a really tough one. You never go into a team or hire someone or work with someone on your team expecting that you'll one day need to fire them. But as a manager, that is something you think about on a constant basis. Not because you like firing people just to fire people, or, I don't know, maybe at certain companies they do things like stack ranking and there's an expectation of that. I've been fortunate not to have worked in companies like that. But if you are a manager, I would say that you're often the person who does a lot of unglamorous things like that. At least, it seems unglamorous to me. The hard work of recruiting and hiring and speaking to candidates and selling them on your team.
And if you do write code it's most likely going to be the very boring parts of the code base. Like adding tests or writing a little script that does a certain thing. So you're not going to be working on those things that you thought were exciting or the things that may have even attracted you to software engineering in the first place.
So very different job. And, there is going to be so many other things that you may not be aware that managers have to do. Like, I don't really like how people will phrase this, but a lot of managers will say like they provide air cover or they shield their team from shit.
I think there's some truth to that in the spirit of what it means. But there are obviously different ways of approaching it. And personally for me, instead of thinking of it like that, like I'm shielding my team from shit, I think maybe there is shit coming our way, but my job as a manager is also to stop that shit before it comes to my team, if that even makes sense. And so that often means talking to people in different positions, different parts of the company, people who are higher up, like a VP or a director, and convincing them that this path isn't the right one.
And the truth is that a lot of individual contributors won't see that. Not because they are ignorant, but because if your manager is doing a good job of that, you just don't see it. And sometimes managers can be a bit flippant, I think, and say, Oh yeah, I shield all the shit so you don't have to, which, again, in spirit captures the outcome of what that is. But I think it also doesn't quite accurately portray how the manager goes about doing it, cause there are many different ways. Yes, maybe you can just shield and keep your team unaware of everything, but that's not necessarily a great way to run your team either, because your reports won't trust you very much if you're always withholding the truth out of a desire to shield them from shit.
Instead, maybe the better approach is to let people know: you know what, there are rocky things that are coming. There are things that the company we work at doesn't do so well, and that's okay. We'll figure it out. But not to completely hide it. I think that's the part which I am not a big fan of.
It's a bit of a cop out to me if the manager just keeps things from their team because of that mindset, or because of the belief that by doing so they are helping their team, when in fact I think it's actually making the team worse off.
Jeremy: [00:50:25] That's an interesting perspective because ultimately, if you are shielding your team and the things that they're being shielded from are just shifting elsewhere in the organization, that's not really solving the root problem.
Lauren: [00:50:39] Right. Yeah, and I think it can also be very powerful for managers to point out areas which need help. And then instead of feeling like the manager has to solve all of those problems I think we-- we talk a lot about the management parts of the job, but not the leadership parts of the job.
Leadership is really more about influence and the way you conduct yourself and how others perceive your behavior versus management, which is more like-- I think a role that you play. So things like hiring and firing are obviously the role of a manager.
But getting people excited about a vision, and getting people to do certain things even though you're not explicitly bending their arm to do it, is a part of the job that is not often talked about or even taught-- like, how do you do that? It's not something that you can just read a book and do. It's something that you build up over time, through trial and error and maybe some intuition.
A realization that I had over the past two and a half years as a manager was that leadership is not solely within the domain of the manager. It sounds silly to say this, but I had to be a manager in order to realize it. It wasn't as evident to me until I became a manager that, hey, hold on, there are a lot of things that I'm doing as a manager that I didn't have to be a manager to do. So when I started to think about the different parts of the job I was doing, I realized, hold on, there are the parts that I like, right? A lot of the leadership side of things I really enjoyed.
And then the other parts which I maybe didn't enjoy as much. And I realized like, Oh, hold on. Actually I don't necessarily have to be a manager to practice these skills. That was actually the realization that I needed to maybe go back to be an engineer again.
But I certainly don't regret the time I spent as a manager cause I was exposed to so many different kinds of problems that I'd never ever had to face as an engineer. Hiring, having to let people go. Dealing with the sometimes unreasonable demands of different organizations that we were working with and balancing that all.
And another thing that maybe managers don't talk about is oftentimes people will come to you with problems that you can't solve. And these are maybe personal problems, emotional problems. And if you're a very empathetic person, then I think the job of a manager gets really difficult because people come to you with lots of problems that you can't solve. And if you're an engineer, you probably want to try and solve all the problems and it can be very frustrating.
I guess I'll sum it up all by saying that being a manager is a totally different job from being an engineer, even a tech lead. It's totally different. It's not a promotion. I don't consider it a promotion. And I think if anybody chooses to do it, I think you learn a lot and hopefully you enjoy that transition as well.
But personally for me, I didn't enjoy it. That doesn't make being a manager bad. It just means that it wasn't for me.
Jeremy: [00:54:00] And now that you're at Facebook as a software engineer again, What's the thing that you enjoy most about being a software engineer as opposed to being a manager?
Lauren: [00:54:11] I think, when it came down to it, it was really a reflection of why I was in the tech industry in the first place. I think the simple way to put it is that, as I mentioned at the start of this conversation, programming started out as a hobby for me. And it was something that I would spend all my free time just working on.
I would have these shower thoughts essentially of programming. And I realized I've been so fortunate that I was able to turn something that was purely a hobby into a full time career. And when I was reflecting at the end of 2019 about the next couple of years of my career, I did really start to think that there was a lot of programming in being an engineer that I really missed.
And also me realizing that other part, which I mentioned earlier, which is that there's a lot of things I was doing as a manager that they were not things that only managers could do. But you may have to become a manager to learn the skills, which just sounds kind of weird.
But I think it was that realization that Hey, one, I can go back to do what I love, which is programming. And two, I can also bring back all these lessons that I've learned as a manager and basically supercharge myself as an engineer and be so much more impactful.
Not because I'm going to write all the code and solve all the problems, but because I know how to inspire. I know how to influence. I know how to communicate. I know how to get things done and get other people to help out with those problems. And I think that was for me, the realization that I could have my cake and eat it too, I guess. And I think that I'm very fortunate in that.
I think at Facebook they think very heavily about career paths as an engineer, as a manager. And I think the company does a pretty good job at stating that one is not superior to the other. In fact, there's more or less an identical leveling track for engineers and managers and also very similar in terms of compensation.
So there's not a penalty for you if you become an engineer. It's not like you're going back. It's seen more as you are just hopping over to the other parallel track. And one of the blog posts that I want to call out here that really helped me think about my career this way is a blog post by Charity Majors called the engineer/manager pendulum.
And she does an amazing job of articulating this hidden career path of jumping between the engineering track and a management track every couple of years. And she does a way better job than I think I can to explain why it's an interesting career path to take, but it certainly inspired me to start thinking more critically about what I wanted out of my job.
And then finally mustering the courage to go and interview again because I don't actually know anyone that I can think of who actually enjoys interviewing. I don't. I think it's one of those evils that we put up with. So there is some courage you have to muster up often just to interview and go look elsewhere. But I think her blog posts really spurred me to take action on it.
Jeremy: [00:57:34] The interviewing problem is-- that sounds like maybe a job for a manager?
Lauren: [00:57:43] Yes. I think, yes, it is part of the manager's job, but I think as engineers, we can also do a lot to at least point out the problems.
Maybe we're not the ones to fix them, but we can at least say, Hey, this interview panel that I'm on-- I've looked at the other interviewers on the panel and you can call out things that aren't quite right, that don't sit right to you.
Maybe the panel isn't very diverse, or maybe the interview goes on for eight whole hours. There are things like that you can still do to influence that process, and even influence the questions that get asked. I haven't been a part of an interview panel here yet, but if I understand it correctly, I think that engineers have a lot of influence over the kinds of questions that set the standard for the different interviews that we have.
And so that's one way as well, to have a lot of impact and influence over the interview process. And make sure that the questions that we're asking are relevant, realistic, but also ensures that we keep that standard of engineering quality that we want, which is always a fine balance to strike.
I could probably talk to you for another whole two hours just on the topic of interviewing, so I won't go into that right now.
Jeremy: [00:59:01] Yeah. It's definitely something that everybody has an opinion and everybody agrees it needs to be better, but for some reason we as an industry just haven't gotten there yet.
Lauren: [00:59:12] I think my short answer to this is that I don't think there is a perfect solution. Which is why we haven't as an industry adopted something that's better. It's a process that is very lossy and there's just no way to really tell in a short frame of time what a person will be like working on a job.
And there are many ways to solve it. None of which I think is better than another. So that's all I'll say about that topic. Don't get me started.
Jeremy: [00:59:43] Yeah, I think that anybody listening to this maybe the big takeaway would be regardless of what your role is, even if you are just a regular software engineer, look for what are the places where you can ask questions, whether that's what type of work you're doing, whether the technology you're using is right, or do you have the right people to do it?
What are ways that you can really improve your team situation without necessarily having to change titles.
Lauren: [01:00:16] Yeah. I think that's a great way to put it. The way I would summarize my learnings over the years is that, for me, it all stems from one root realization: your job is not to write code. Code is merely a side effect of your job.
I think your job is really to solve problems, and there are many ways to solve problems. Realizing that is, to me, step zero in terms of growing more senior in your career. And the other thing I'll say is that as you get more senior, things will get more ambiguous, and you have to learn how to deal with that uncertainty and ambiguity, and accept that sometimes there isn't an answer.
And that's okay. I think those are the two big lessons that I've learned.
Jeremy: [01:01:10] That's interesting because I think as engineers, a lot of people feel like as they learn more things will get less ambiguous. But it sounds like as you progressed in your career, things are actually getting more ambiguous and that's how you know you're progressing.
Lauren: [01:01:24] Exactly. Yeah. I think you can even see it in the code I write. I've seen this in myself as well. When you're not a junior engineer anymore, but you're not really senior, and you know enough to be dangerous, you start dreaming up these-- I'm very guilty of this in my past-- these weird abstractions that you think will save you a lot of time. But when you look at them a couple of months later, you realize this was totally the wrong abstraction to have picked, and it's actually slowing the team down.
That is often because again, you're trying to feel your way around and explore and learn, and write better code in your mind. But I find myself these days, it's like trying to write the simplest possible code and delaying the point of abstraction as much as possible and writing a lot of comments about all right-- This could be better, but I'm not gonna make it more abstract right now because this is just a one off case and we don't actually know for sure if it's going to happen again. So that's, I think, the part of recognizing the ambiguity of things. And there are a lot of things that have subtly changed about my behavior.
I used to be all about talking about best practices and talking about, Oh, this is an anti pattern, or, so and so said, we shouldn't do it this way. Or you try to read the tea leaves of someone's tweets into Oh yeah, Dan Abramov says, don't do this. So this is now law and we cannot break this law.
But I think a painful but necessary part of growth is realizing that nothing is really an absolute (only the Sith deals in absolutes) and being comfortable with that. Like I was saying, there's often no single best answer, and picking the right tool for the job, the right solution, takes a lot of patience and communication with your team.
Jeremy: [01:03:20] For sure. Yeah. Dan Abramov's example is actually really funny cause he is the creator of Redux, right? And he has this tweet where somebody is describing like how somebody put Redux into their application because Dan said to do it and he replies to the tweet and says this is the reason I'm going to hell.
Lauren: [01:03:41] Yup. Yeah. Dan Abramov is a really smart person and someone I really enjoy working with. I think it's all part of our growth of realizing that the things that maybe we all believed were best practices a year ago are probably now anti-patterns, which is why I just shy away from saying this is the best practice and we must do it this way. And taking a more case by case basis to things.
And again, this all ties back to being comfortable with ambiguity, right? Because if you don't have these laws, so to speak. Then you're introducing a lot of ambiguity in your code because now maybe people have a lot of uncertainty about, Oh, do I use this in this situation or that?
And instead of you saying, Oh, you should always use this thing. You're now saying, right, let's evaluate it on a case by case basis.
And that's okay. Maybe it's going to slow it down a little bit but in the long run, it actually makes us faster and more resilient to change. Especially if product requirements change and suddenly all the abstractions that you dreamed up are now totally irrelevant.
It's a very interesting industry to be in. I think software is changing all the time, and the way we build software has to reflect that. Instead of trying to build these very rigid architectures and constructs-- which maybe in certain scenarios are warranted.
Like, if you're writing code that will never be updated for the next 30 years, then it probably makes sense to get it right from day one. But if it's something that's constantly being improved and evolved, then maybe you don't jump into pouring the concrete where the concrete doesn't belong just yet.
Jeremy: [01:05:22] Yeah, I think that's a good note to end it on. Where can people follow you?
Lauren: [01:05:27] The best place to follow me will be on Twitter. My handle is sugarpirate_. You can also follow me on LinkedIn, or add me on Facebook but Twitter is probably your best bet if you're trying to get ahold of me.
Jeremy: [01:05:42] Lauren, thank you so much for chatting with me today.
Lauren: [01:05:44] Yes. And thank you Jeremy. It was really fun talking to you. And see ya everyone!
Jeremy: [01:05:47] That's it for my chat with Lauren. You can get show notes and a transcript for this episode at softwaresessions.com. And if you enjoyed the show, let someone else know about it. The music in this episode is by Crystal Cola. See you next time.
Swyx is a senior developer advocate at AWS, an instructor at Egghead, and the author of The Coding Career Playbook.
Music by Crystal Cola: 12:30 AM / Orion
Transcript
You can help edit this transcript on GitHub.
Jeremy: I did a computer science bachelor's.
Swyx: [00:00:45] Nice.
Jeremy: [00:00:46] It's interesting seeing how you learned, because when I went through school, I wasn't super passionate, I think, particularly because a lot of it was data structures and algorithms and stuff like that, and it was a little bit disconnected from when I first started, where I was like, I'm going to make games.
I'm going to make cool GUIs. And then when I get to school, there's none of that. It's really on me; I should have been seeking that stuff out on my own. It wasn't until a few years after I had started working that I really started enjoying the process, enjoying learning about the technologies and building stuff. Looking at what you were doing, I definitely should have been doing that when I was going through school.
Swyx: [00:01:34] Well, I mean, you're still figuring out what you want when you're still in college, Yeah. I went to school for finance and I no longer do that (laughs). But, yeah, I don't, I mean, don't live life with too many regrets it's not worth it.
I fell into this way of learning because of other people. All I'm doing is trying to spread the message and there will be more beneficiaries of this than me. I'm definitely lacking a lot of things that you learned in college.
I'm trying to make up for it. I really want to take an OS course. I want someone to force me to do a basic operating systems course. And I don't know what a syscall is, and I don't know the details on memory allocation and all that, like, but on some level, it doesn't matter because it depends on what part of the stack you want to work in. But I just don't have the option available to me if I wanted to go further down the stack. I just don't.
I still personally do wish that I did a CS degree. I'm just saying I definitely did not catch up with what you already know just from my bullshit web dev stuff. But, it's enough to get a job, which is absurd. This is the only career where, it's high paid and you can get up there in like three months ish. Maybe you won't be amazing. You're not going to be Jeff Dean or something at Google. But you can get by decently and you get paid the same as a doctor or a lawyer. And that's ridiculous.
Jeremy: [00:03:01] I found that to be pretty insane. Though I will say, you were talking about how you can get a job in just three months or whatever, but your background is really not the three-month boot camp, right? You had a much longer tail in terms of all the things that you learned at your previous jobs. You said you had used Haskell, right? That was before you went to the bootcamp.
Swyx: [00:03:27] Yeah, I mean, I'm definitely not speaking for myself in terms of the three month thing. But I have seen my fellow bootcamp people get good jobs. Obviously there's a failure rate, as some people don't make it. But that just doesn't happen in medicine or law.
Jeremy: [00:03:39] Yes. Instead, it's six years, eight years, and then like you've talked before, when people learn something, it's normal for them to share what they've learned.
Swyx: [00:03:48] Also we'll promote you for it if you do a great job. It's just fun.
Jeremy: [00:03:54] I think the first time I heard about you was-- I read Hacker News relatively regularly, and I remember you had made a post saying: I used to be in finance, I'm going to do this boot camp, and so I'm doing this podcast where I interview people in my class. And just a few years later, now you're all over the place. You've got all these blog posts, you're moderating the React subreddit, you were recently a senior dev at Netlify, and I was like, this is crazy. This guy who was talking about going into a bootcamp just a few years ago is doing so many things. I think there's a lot of lessons people can learn from your experience about starting their careers, learning how to learn, and deciding how to progress in general.
Swyx: [00:04:45] Thank you. I don't know what to say to that. Yeah. I'm trying to write it down. Basically I have a lull between jobs; I will have joined Amazon by the time this comes out. This is the only time I feel like I can write a career advice book, because after this I want to focus on other things.
I guess I had a relatively fast trajectory. I finished my boot camp at the end of 2017. Started my first dev job in January of 2018 and then just got hired at Amazon at an L6 level in March of 2020. That's a relatively fast trajectory for anyone.
I'm not completely new to programming. I had done some code before, but I also attribute a lot of that to the ability to just learn very quickly in public. And to a lot of people, I think, it's an alien concept. They vaguely know it's a good idea, but I think they don't know how good. Just the relative rarity of people doing it means that being part of that population makes you stand out. And that's very beneficial for careers, even before the current situation we are in, which is that now everything's online.
So your professional profile doesn't have a physical presence anymore. You don't have to become a celebrity, a lot of people are like, ah, you have to be an influencer. No. It's more about just like having a place that you call home online. And you as a developer have an abnormal amount of control over that and you should exploit that as much as possible.
Jeremy: [00:06:25] Yeah. When you talk about learning in public you talk about exploiting that, right? It almost makes it sound like the fact that you are learning in public is a big benefit to you. You're getting things from people, whereas, I think a lot of people when they think about, Oh, I'm going to write a blog post, or I'm going to make this tweet or whatever. It's like I'm going to be helping other people, but maybe a big part of it is the opposite as well.
Swyx: [00:06:52] Yeah, I mean, look it's a plus if it helps other people, but it's completely self centered (laughs). I think that's good that means that you have the motivation to stick in this thing for the long haul. I think a lot of people get started blogging or whatever, and they don't see much immediate result. And then they get discouraged. And that's because they base their self validation on others. It's not worth anything if no one else reads the blog or likes or shares it or whatever. And that's not very healthy in terms of the way that you should approach your learning. So you should learn for learning's sake. And then if other people benefit, that's a plus. It's not an act of altruism. It genuinely is the fastest way to learn. And you're growing in your knowledge but also your network. And it turns out that your network is also super important for your career. So it comes hand in hand and I don't have to separate them so I just do them together.
Jeremy: [00:07:45] I think one of the things about this concept of learning in public, a lot of people are not really sure where to start. They start with a blog or they start with a Twitter account and I think what a lot of people run into myself included is, you make a post and you don't have a lot of traction in terms of people viewing it. So you don't know if it's helpful to people and you're not really sure if you're writing the right thing. And so I wonder in your opinion, how should you approach that? How do you decide what to write and how do you get it in front of people so that they can help you out and help you learn and go from there?
Swyx: [00:08:28] Oh my God. I have so many responses to that. So I'm going to rattle off a few quick ones. First of all, it's not always about writing. You can also do speaking. You can do cheat sheets, organizing things instead of writing a blog post. I don't want people to equate learning in public with writing a blog. That's not the only category. In fact, I have five categories. I have a talk on learning in public, and I go through a bunch of them there. It's not just "blog more," although blogging is the minimal viable thing and it scales extremely well, so I do recommend that. The other thing is-- oh God, so many responses.
Okay, let's just talk about the immediate first thing that you should do. I also really want to stress that it's something that you do for yourself, right? Whatever you're interested in, you should write about, even if no one else reads it. It's fine. What we're really talking about here is an optimization step on top of the formal act of writing down what you learned and organizing your thoughts around what you already know.
So the optimization step is: how do you get attention when you have no following, right? I call this the cold start problem. It's a little bit of a chicken and egg. Everybody wants feedback. Even I want feedback, right? And that helps me load the trigger for the next action, which is the next talk, the next blog post, the next podcast interview. And that's hugely motivating. But when you're just getting started, you don't have that yet. So you need to find yourself in a situation where people have no choice but to respond to you, right? And those situations exist.
The way that I phrase this is, "pick up what they put down." Whoever it is you look up to, they are experts in their field. They're also extremely busy, but they also have things that they want feedback on, and things that they don't have time to do. So if you follow them closely and you want to help, just pick up on what they put down. It literally is: hey, they have a new project out. Go try it out. Try the demo; there's probably something wrong there. Go fix the demo. If they have a new book out, go read the book, give feedback, whatever. Then you slowly work yourself into becoming a trusted collaborator, because I guarantee you, no matter how popular that person you follow is, no one picks up on all their stuff, right? We all have our own stuff to do. But if you have that time on your hands and you want guaranteed feedback, that's the way you do it. Because they need to be responsive to their early adopters. And guess what? That's you, right? It's unfair, but it's a hack. I literally call it a hack. They have to respond to you, because that's just the contract of: "Hey, I put out something. Someone gives me feedback, I have to respond to them." And it really works for any sort of project. It could be an open source library. It could be a demo, a product, or a book. It really depends. Even if it's a talk-- they just did this new talk? Do a summary, right? Do it like a bullet point: this is what I learned. And then they'll immediately reshare it, because you added value for them.
The one thing that you have that they don't, absent any other knowledge, is the beginner's mind. They're the expert. They've been in this for so long that they've lost the ability to relate to beginners. But you, as a recent beginner, have the ability to communicate across that knowledge gap, because you're bringing people along with you. The more in their heads you can get, the better. It's a really good hack, because they already have followings, and they're likely to start to see you as a collaborator, especially if you prove to be a good one. And that will kickstart your following in a huge way. I didn't really think about this when I was starting out, but if I were starting from scratch, that's exactly what I would go for. And it's pretty logical to see a straight path to, like, okay, I will draft off of this other person.
Jeremy: [00:12:01] I think the summary is a pretty good example because there's so many really great conference talks but if you look at the YouTube view counts for conference talks they're usually relatively low compared to a lot of other content, right?
But there's so much knowledge in these videos and if you can extract all of the high level facts or the big takeaways and just summarize that for people that helps so many other people, right? They can just look at the summary and figure out like, Oh, do I want to watch the video? Do I need to watch the video? That makes a lot of sense to me.
Swyx: [00:12:35] So Wikipedia calls this the 90-9-1 rule. It's like a 90/10 law. A lot of internet consumption is completely passive: 90% of people just view or read and never say a word, 9% will comment, and only 1% actually create. So you automatically vault into the top 10% just by commenting on something.
And if you create something based on their work then they just have to respond to you. There's just no choice. You can guarantee yourself not only some response but you also get readership from people that you care about, which is actually the real thing.
Like, I don't really participate in this gaming of follower counts or anything. But I care about connecting with people who I can learn from, who are my mentors, and who I might work with in the future. That's really the main reason that I participate online, and I think that's healthy. Ultimately, Naval Ravikant, who's one of these VC types, says it's better to be rich and unknown than it is to be famous and poor. Or something like that.
You should only have a network to the extent that it helps you get shit done. And once you start taking on more than that, and start basing your self worth and your income and your livelihood on being an influencer or a celebrity, then you start being a product, and you start being controlled by your audience. Anyway, that's way far out from where we started, which was just talking about getting started. I'm just saying, play for the right reasons. Because this is a hugely rewarding thing, cause people are out there wanting to look for and connect with good collaborators.
But if you start getting gamified, then you start to go down a really dark path. The point being, the best way to get started is through that, if you care about getting engagement.
One of the people that I helped to mentor-- their first blog post was about explaining man pages, like for the Linux bash command. And I'm like, sure, but I don't get up in the morning and go, "I really wish I could read a blog post about man pages." It's not something that I really want. It's good to write about things that you're super interested in. It's fine. Just don't expect that to be the most popular thing, or to get immediate feedback on it. But if it's something that's new, something being put out by someone influential in the community, and you want to collaborate and jump in, that's a pretty sure bet.
Jeremy: [00:14:54] When you talk about jumping in and providing feedback or summaries, things like that. What's the best way to get that out there? Are you making a blog post and emailing the person? Are you @ing them on Twitter? What does that typically look like for you?
Swyx: [00:15:15] Typically it's going to be one of the social media platforms. You want other people to see it as well. So email doesn't really work for that. It's going to be a Reddit comment. It's going to be a hacker news comment. It's going to be Twitter reply or something.
I even leave good comments on YouTube, just cause I want to encourage content creators that I like to keep doing good stuff. So, I dunno. It's wherever that person hangs out the most. And for a lot of developers it is Twitter. But it is wherever you want to be.
If you want to host it on your own page, that's fine. Just send a link to it and people can reshare it and then you can start a newsletter of: here's the five coolest things that I did. And eventually, people will start noticing that you're doing that curation work and you're providing good summaries.
That's a really good way to bootstrap an audience. I think the point is though, that's a lot of other-centered learning. You know what I mean? You're reacting to what other people do and think. That's a good way to start. But ultimately you want to be more inward, like self-directed: what do you need? What do you focus on? And direct things that way. There's such a huge wealth of information out there, right? How do you survive among the deluge of-- you're on Hacker News a lot, right? There are 20 new posts every day, and you can't follow up on everything. So it's more about understanding what you want out of developer communities.
At the same time there's one global developer community, but then there's also a billion small little ones. Which one of those do you really want to plug into? Target the ones that are most helpful to you at that point in time.
So I think the third learning in public thesis is that whatever it is that you're interested in and want to learn-- if you put out content based on that, they will find you. Cause you're going from passive, and then slightly active commenter and remixer of content, to someone who creates stuff.
And so you will be imperfect. You will put out stuff that you're not proud of. But, people will correct you because that's how the internet works. If there's someone wrong on the internet, they'll come and correct you. And once you've gotten things wrong in public, you'll never forget it. You just learn really quickly based on that. That's the whole thesis right there.
Jeremy: [00:17:29] Yeah. And if I understand correctly, it sounds like maybe when you first start out, I think you were calling it, picking up what others put down, right?
Swyx: [00:17:40] I really want a shorter word for that, but it's six words. So I have this thesis that every slogan should come down to two words. But I can't reduce that any further.
Jeremy: [00:17:52] So does learn in public count?
Swyx: [00:17:55] Yeah, because the "in" there is a conjunction.
Jeremy: [00:17:57] Got it.
Swyx: [00:17:59] So learn, like always learn, and then public just reminds you that there's a choice. By default, we're trained to learn in private. And I'm not advocating for living life a hundred percent public. But it's possible to go from zero to 5% and see a lot of benefits. I'm an outspoken advocate of that because it's benefited my own career so much.
Jeremy: [00:18:18] You start with remixing or summarizing or trying to provide feedback on things that other people make. And then maybe the next thing that you should do is figure out what you're interested in and just write about those things or, make videos about those things however you want to do it.
Put out takeaways of what you learned and bring that to communities where people who work on that type of project are, whether that's Reddit, Twitter, Hacker News and so on. And that's how you start building up this community for yourself, where there's people who are working on the same problems you are, who can provide feedback, and that will help you learn whatever you're trying to learn.
Swyx: [00:19:05] Yeah, the first example that I really started doing this with was back in 2018, when Dan Abramov, who's like one of the most vocal members of the React core team, presented essentially the future of React at a conference, JSConf Iceland, in March.
So that day it was live streamed. And then everyone was talking about it. This was game-changing for the React world. And I wanted to be better at React. So what I did was I stayed up that whole night, transcribed his talk, walked through the entire demo that he did with all the source code, commented every part of the source code, and then posted it the next day.
Obviously he read through it, right? It was about his talk. And so did everyone else that follows him, right? Cause I was the first one out with a full analysis. And I was a scrub back then. I didn't know anything. I got some things wrong and I was corrected. If people want to look that up, it's right there. Just Google for my React Suspense walkthrough.
I think that that's what you're doing now, right? You saw that I was working on a book. Then you were like, Hey let's have a chat on this podcast. And of course I have to respond and now I'm here and I'm not doing it out of a sense of obligation. I was just like, Hey, that's really nice that someone noticed and is willing to have me on his podcast. I will do whatever I can to provide a good interview and share whatever I can help with, you know?
And so I think it's just a mutual exchange of value. Even though you may look up to that person and you're like, what do I have to offer that guy? And, you do, you have a lot. Sometimes it's just your energy and your enthusiasm and then the platform that you're building and, everyone can do that, you know? It's awesome.
Jeremy: [00:20:47] Yeah. One of the things you talk about is deciding what to bet on in terms of technologies, as a part of your career, as a part of the things that you want to work on. What's your strategy for betting on technologies?
Swyx: [00:21:08] Yeah, so there's a whole chapter on this. First, my own strategy-- I guess it changed over time, but I do like to be a little bit earlier on things than others. But it really depends what your risk preference is, right? The earlier you are, the riskier the project is, because it's going to be rougher, right? It's going to be less well tested. It just might flat out not work, and you might have invested a bunch of time and money in it.
First of all, I think people overestimate downsides. You can learn a lot from a failure and still reapply it on the next thing. But I definitely prefer to be earlier. And there are a lot of other people like that. So Charity Majors, who is the CTO of Honeycomb-- I actually interviewed her for a thing on my blog. She's fond of saying that betting early on a technology that's emerging and clearly doing well made her that person.
She was the MongoDB person. For me, I was the React and TypeScript person for two years. And that really establishes your domain. Even though that's not all of what you are, being about a certain technology that's earlier on means people flock to you, because you're kinda that community discussion point. That's very beneficial for your career. If you're early in your career and you're not really sure what to bet your career on, something that's emerging that you feel has got a lot of potential-- I think that's really something worth betting on in terms of your own projects.
I like to use what thoughtbot does. I don't know what they call it-- I haven't been able to find a good source-- so I call it a strategy, like an innovation credit. Basically, you have this tech stack that you're familiar with, right? But in every project you're allowed to try new things on one part of the tech stack, and everything else stays the same. It contains the risk on the project because, for thoughtbot, they're an agency, so they have to deliver by a deadline.
If they pick something that's too volatile and it just blows up their project, then they don't meet their client deadline, right? And that's unacceptable. The core idea is that you should have one stack that you're very familiar with, that you can get most things done with, and then keep innovating, because you're definitely not at the best possible point in your tech adoption curve. Things are changing so quickly. That's the managing risk section, where you give yourself an innovation credit and you only pick one or two new technologies to use. Then the other thing that I covered in that chapter-- cause I've already written this chapter-- is a little bit about, you know, what's missing.
So for me, problems last longer than solutions. The core problem of writing apps on the web lasted before React. It will last after React. Our understanding of the problem is what will last, and it's more important than the actual solution to that problem.
So I think when you pick technologies-- if you have that mental list built of the problems that really need solving then you can really start spotting when a new technology comes along that solves the thing that you really need better than what we have right now, then go for it.
Otherwise it's just an endless parade of names and logos, right? And you can't tell one from the other. But if you have an evaluation criteria that involves something that you really experience and use then that's really helpful. You're not just looking through hacker news for hacker news sake, you're actually using it to skim and see if something's come up that really solves a problem that you have.
So one way I do it-- because it's hard to know what problems you have when you only know one system, you often need to learn a competitor, a competing framework. Not full immersion, just dabble. If you're in Rails, you should look at Django. If you're in Django, you should look at Laravel, something like that. See how the other half lives. I call this exposure therapy, because you'll probably find that a lot of things are the same. You'll probably find that some things are worse. You're like, why do you do that? It's so easy in my thing. But you'll also find some things that are way easier in other platforms than yours. And you're like, why don't we have that? And that's probably a good question. Just cause no one's done it yet. That's an opportunity for you. You can go write that thing, or you might say, "Oh, this is so core that I might actually switch stacks just to have that for this particular problem when it comes up."
That's how you start to get into right tool for the job, by having that exposure. So other languages, other frameworks. That's really powerful, especially as you get more senior. To know that the thing that you started with is probably not the end-all solution, and there's probably better out there, and you just have to be exposed. And it's pretty easy to get exposure. I listen to SE Radio and SE Daily, and I watch conference talks, and I don't have to go through the full tutorial to understand the ideas behind what they're teaching. I skim a lot of things, but then go deep on some. That's pretty important. Have you heard of Redwood JS?
Jeremy: [00:26:04] I have. Yeah, yeah.
Swyx: [00:26:06] There's this guy on Twitter who was like, yeah, I looked at Redwood. It's not as full featured as Rails. I don't think it's going to work out. And I'm like, dude, Redwood just launched this year and Rails is 16 years old. Like, it's going to be worse. I'm sure Rails looked like shit compared to whatever came before it at the point of Rails' inception.
It's very common in technology: usually a lot of things are worse when a new thing comes out. A lot of things are worse than their predecessors. They're worse in every way but one, and that one-- depending on how critical a real need it solves-- is the one that actually works out. And then the rest of the ecosystem comes in and fills out the rest. So I think that's super important.
Jeremy: [00:26:47] For people who aren't familiar with Redwood, could you give a really brief explanation of what it is?
Swyx: [00:26:54] Yeah. It's Tom Preston-Werner's project. He's a co-founder of GitHub, so he's super well off and he still wants to code. It's his attempt at building the Rails for JavaScript that everyone wants, but doesn't exist. There are other attempts at it.
Meteor was the most recent one that didn't really work out. It's his attempt and he's building on top of a whole new stack of serverless technologies and other technologies.
To me, there's too many innovation credits in that stack. Too many things that are immature. But he's taking a longterm view because he's a billionaire so he can do whatever he wants. But, I think it's an interesting idea.
I think it has innovations for React. I had a blog post about how Redwood is actually the first framework that comes up with single file components for React. But I do think that a lot of technologies inside of it are still too immature. It uses Netlify, and I used to work at Netlify. I even think Netlify is still not mature enough for the ambition of what Redwood becomes. But I don't diss it. I don't dismiss it right away, because I know this is what year one looks like in a project. It's shaky.
Jeremy: [00:28:02] I mean, if you think about Rails, I think what excited people about Rails initially was that demo DHH, the creator of Rails, gave: create a blog in 15 minutes.
Like you were saying, if you compared it to-- I think it probably would have been Spring, the web framework on the Java side-- that probably was a lot more mature. But I think what excited people about Rails was DHH showing you, hey, you can build this blog and it has so little code compared to what came before. You can get up and running really quickly. And so when there are new technologies, for me, I'm looking for what the big benefit is going to be, more so than here's this thing that we had in another language and now it's in this language. That's a little bit less interesting, but I was curious what your thoughts were on that.
Swyx: [00:29:00] Yeah, I mean, like I said, it solved a real problem that people had, which was the verbosity of Spring. I've never messed with Spring, but compared to that 15 minute demo, everything looks like crap. It was really a good demo. And I'll be honest, I don't think Redwood meets that bar yet for the 15 minute demo.
I think it's true that it resonated with a lot of people just because it demonstrated that something was possible that people didn't really know was possible with more opinions and the flexibility of Ruby and all that. He definitely got it right.
I also want to talk about the people factor. When you bet on technologies-- technologies are driven by people. Yes, Redwood isn't there in year one, but because of the caliber of the team that's working on it, people are more confident betting on it. They're like, oh, I've seen this guy build GitHub, and GitHub is a pretty big deal, you know?
The people behind the projects are the projects. You know what I mean? The code almost doesn't matter. Are these people solid maintainers of things? Do they respond to ideas? Do they push out features in a reasonable fashion?
I think a lot of people treat technologies as faceless things that are just logos and GitHub projects. But really there are people behind them, and they're working on these things, and they have motivations and they have dreams. They have other things that they want to work on, and evaluating technologies involves all of that.
What's the context of this project? How did it spring up? What problem was it created to solve? How does it plug in with the rest of the ecosystem, given the people that are working on the project?
That's how I think about the people factor. And as part of the people factor there's also you. You are one of the potential people. And I think a lot of people treat technology as a hands off thing. Like, Oh, it's not ready now. I'll just give it a couple of years and come back and look at it.
They treat technology statically, but it's like a living organism that will just get better-- maybe, hopefully, without your involvement. But you can be a big part of that improvement, and tying your career to an early part of that development is actually hugely important for some people's early careers.
So I have this list: Jesse Frazelle with Docker, Charity Majors with MongoDB, Ryan Florence with React, Kelsey Hightower with Kubernetes. All these people did not start those projects, but they got involved super early and made them what they are today. They could have just as easily made that call and said, okay, it's not ready now, I'm just going to go away and wait a few years. But then they wouldn't have a name in the industry. So in terms of betting on technologies with your career, there's that active, you-could-get-involved component, which I think a lot of people don't see themselves as qualified for, or don't even think about, cause they're just like, oh, someone's going to do it for me, I'm just gonna lean back and read tutorials when it's out. That's valid and it's a perfectly fine strategy. I do that a lot. But I'm just saying, if you want to make a big bet, get involved. The best way to predict the future is to create it, you know?
Jeremy: [00:32:00] When we talk about bets, where we're talking in terms of like which one do we think is going to succeed, and I guess what you're saying is that potentially you could be a person that helps it succeed, right?
Swyx: [00:32:12] Yeah. Look, it could fail and you could spend a whole big chunk of time on nothing. But everyone remembers the passion and quality of work that you did on that project, and that transfers, man. It doesn't go away just because the project failed. People have a long memory, and your association with people will outlast companies and projects. The downside of betting on tech is not that high.
I have one more point on this, which is values. Values are destiny, in the sense that there's the code, and inside of that there's the people, and inside of the people there's the values. And the values are what drive the ultimate destiny of the project, because they inform every single decision that they make-- in terms of code, but also in community maintenance, in terms of marketing or whatever. Bryan Cantrill, who was CTO at Joyent and super involved in the early NodeJS community, has this list of values, and every language, framework, library, whatever-- every developer community embodies some of these values to different degrees, and there's a preference stack of values, right?
Some prefer simplicity over security, like if you have to make a trade off, which one are you going to choose? Are you going to choose the less secure option, but it's simpler so you get more adoption? Are you going to choose more security at the possible risk of less adoption?
So picking values that you agree with in technologies is like the ultimate first world problem. Like, ahh, so many technologies-- I'm going to pick the one where I share the same values.
The original creator of Node left Node, the original creator of Express left Express, and they all went to Go because they shared those values more.
Bryan Cantrill himself left JavaScript to go to Rust. People pick communities because they're going to have a much easier time making decisions together. You're picking up a technology at a point in time, but then you also have to live with that technology for, I dunno, 10 years, hopefully, if you're lucky. That's a community of values that you're buying into.
As developers we don't like to talk about these soft, wishy washy things. But it's real when your PR gets rejected cause the maintainers don't agree with you. If you just fundamentally don't agree with the maintainers, that means you don't have the same values, and you probably want to pick different projects that reflect your values.
Jeremy: [00:34:40] And when you're talking about values, that's how that community runs itself? Is that what you're talking about or what do you mean specifically?
Swyx: [00:34:48] Yeah. It governs every decision and every action that the community makes. It could be: how do we RFC for new features? Is it an open process? Is it fully democratic-- everyone has a vote-- or do only maintainers have a vote? How do we fund development? How much corporate shilling are we going to allow? What happens when one part of the maintainer team starts being a bad actor? What's the process for removal of that person? Or if the community is just split halfway, how do we resolve conflicts? Stuff like that.
For new projects, it doesn't matter. But when money is at stake-- well, it's not just money, it's that entire companies are built on this thing-- it gets really, really heated. What license do we adopt? Oh my God, we started this open source thing and it was just for fun, but now Amazon's coming in and competing with our thing. We need to change our license. How do we do that while not screwing over our existing customers? These things happen. And there's no right answer, but having a clear understanding of the community and the maintainers' core values means that you can help predict or resolve those issues ahead of time. And as long as you agree with them, you're probably going to have an easier time than if you disagree with them and have to be dragged kicking and screaming along. I'm so sorry-- super long winded and not very relevant for most day to day decision making, but for the really core ones, you're gonna have to think about all this stuff, right?
Jeremy: [00:36:17] Yeah. I mean, I think when a lot of people think about picking technologies they're thinking more-- does it do the thing that I want? And, is it popular? Are there a lot of stars on GitHub or something like that. But there's a lot more nuance.
Swyx: [00:36:34] Stars are a proxy. At first it's like, okay, the quality of the project. But after that it's like, okay, this maintainer's done cool shit before, he's a celebrity, so we're just going to star everything else that he does. It's not very indicative of anything.
Jeremy: [00:36:50] Right. Looking more at some of the surrounding things, like you said: the values, how do they treat contributions, looking at the people who are a part of it, what did they work on before, looking at the community-- are there people actively using it, and are they helping one another, and things like that. And that helps you build this picture of whether this is something that's worth checking out or not.
Swyx: [00:37:17] Yeah. Ultimately, knowing what you want out of technology is gonna guide you to the right decisions every time. Because there's a technology for every type of use case, and every kind of person, personality and community out there. And there's no way that you're fit for all of them. So you just gotta figure out what you want.
Jeremy: [00:37:36] And how about deciding what things to jump off of? One of the things I know that you worked with previously was Meteor, which still exists but is not as popular as it once was. What are the decision points where you decide, okay, maybe I'm going to go try something else now?
Swyx: [00:37:56] Yeah. Meteor is a slightly complicated thing, because I know Scott Tolinski and some small part of the people that I know still use and love Meteor. The problem I had with it is twofold. The most immediate problem was that Meteor chose this very antagonistic approach to package management.
They invented their own package manager, and basically everything that you use has to be within that ecosystem or else, you know? And that's very hostile to the rest of NPM and JavaScript in general. And that was very annoying. It's like, oh, I need a Meteor version of this. Sorry, whatever. And then the other thing was that it was too opinionated. It had its own front end framework on top of having its own backend conventions. And those were unnecessary layers to debug, unnecessary magic over and above existing technologies, which I already did not know well. So the solution was to strip away that abstraction, go one level lower, learn those things well, and return to Meteor if I ever needed it. But then I never needed it. So that's why I went away from it.
And ultimately I think for my early career it was the right choice, partly mercenary-- biased towards whatever the job market wants, and the job market wasn't asking for Meteor. So I didn't go for it, but I did have a contract. I did a freelance thing for a Meteor consultancy once, and yeah, it was great. Meteor does a lot of things well if you stay within its path. I guess it's like that Rails idea. But once you want to do something that it doesn't plan for, then you have a hard time bending things. At least that was my impression at the time. I've never gone back, and I know it's under new ownership now. But in terms of jumping off: when you start feeling that frustration, there's probably something else out there that fits you better, or you can just jump down one level of abstraction and roll things yourself, and that's perfectly fine as well.
For example right now, I'm in a weird transition between React and Svelte, right? I served as a moderator of the React Reddit community for two years. I have recently stepped down from that and I'm ramping up my efforts on the Svelte community side of things.
And I straddle them, cause React has a lot of company demand, but I think Svelte solves problems that I deal with in a really elegant way. But React is still important for cross platform. So it's kinda this weird, if X then I'll use one thing, if Y I'll do the other thing.
I don't have a one-hammer-fits-all policy anymore. But coming back to your original question, the reason for jumping off was realizing that there's better out there and it doesn't have to be this hard. So yeah.
Jeremy: [00:40:38] Okay. So the big decision point there is in the projects that you work on: even if you have a technology that you know well, once you start feeling like you're fighting it, or it's just very difficult to do what you're trying to do, that's when you start looking at other options and seeing, like, okay, is there this other thing that I can jump to when I do hit these types of problems?
Swyx: [00:41:03] Yeah. A hundred percent.
Jeremy: [00:41:05] Another thing I'd like to ask about is, you've left Netlify and you're going to join Amazon. What was the process like for you and I'm assuming you interviewed with a lot of different companies, how did you decide what was going to be the right fit for you?
Swyx: [00:41:20] Ah, it wasn't a public search. It was actually more that I just put the word out with different companies that I was interested in, and then other people heard. I didn't do a wide search. And honestly, Amazon had it from the beginning, because about a year ago I wrote this idea down for how to do offline apps with GraphQL. It was a gist. I kinda drew out the idea and I was like, I don't have time to work on this, but I'm just gonna put it out there as an idea. And then last November, Amazon announced it at re:Invent as a feature.
Then I was like, oh, interesting. Somebody at Amazon thinks the way I do. And I was like, this is fascinating. I never thought that I would agree with Amazon on anything, but someone thought about it enough not only to agree with me, but then also to build it to Amazon standards, which is a significant investment. Because they never close anything; they always support things all the way back. That was the point at which I reached out to Amazon. And I think the process was just more to see if it was a cultural fit. I interviewed at other companies as well, and the cultural fit wasn't necessarily there as much as it was at Amazon.
There was also definitely an attraction or validation-- I'm not too shy to admit it-- in that working for a FANG company is a big deal on a resume. I always worked at small, scrappy startups, and the way that we think about enterprise users and customers is wholly different from a big cloud enterprise. That's real enterprise, you know? And I was basically interested in learning more about how the big boys do it. I don't know if that's a valid concern or not, but there's always this imposter syndrome of: I don't have a traditional background, I never worked at a big FANG company. So I think this was just a nice attraction. I'm also very drawn to the idea that Amazon is now open for front end developers as well. As a front end developer myself, I never really felt welcome on the AWS console. And now that it's investing so heavily in all these front end services, I thought that was very interesting.
And it's, out of the box, more full featured than what I was working on at Netlify. I basically thought of the job as exactly what I was doing at Netlify, but with infinitely more to learn. And in terms of what I wanted to be learning and what I wanted to be sharing with people, I thought Amazon was a good fit. I expect Amazon to outlive me. So it's about knowing things that are long lasting. I call this Lindy compounding. Do you know what the Lindy effect is?
Jeremy: [00:44:10] No, I don't.
Swyx: [00:44:12] The Lindy effect is-- you're probably more familiar with it in terms of the phrase: the longer something has been around, the longer you can expect it to last. You know what I mean? Sometimes people put it in a bad sense, like, if you've been waiting for the bus for two hours, you can expect it to--
Jeremy: [00:44:31] To not come.
Swyx: [00:44:33] To not come for another two more hours, right. And then it grows instead of declining the longer you wait. But the Lindy effect is important for the things that we work on in technology. A lot of things that we do have a very short half-life. It has value now and then it declines, right? And it's fine. Everything declines. But the longer we can make something last, the more we can build on top of it.
And we can grow by compounding upon the work that we did before. That's a fundamentally sensible way to run things. That's why I'm very interested in things that last, and that's why I write. That's why I try to do podcasts-- cause people a year from now, two years from now, can still come by this thing and still get value out of it. There's a lot of people who pivot into, oh, here's the news of the week. That lasts a week. Exactly a week. That's not even your half-life, that's your full life. And it's terrible. The idea of Lindy compounding is that you compound by working on things that last for a long while. And one way to bet on things that last for a long while is just to look around for things that have been around for a while. So it's like the diametric opposite of betting on new technologies. I've done that part. Let's do the things that have been very stable and very long lived for now. I think that's a very interesting way to compound the skills that I have. Because your knowledge of S3 will probably last your entire lifetime. And that's the thing. That's awesome.
Jeremy: [00:46:01] I was talking to Daniel Vassallo on another episode, and he worked at AWS for a while, and he was saying there are certain services within Amazon that you can totally rely on because they've been around so long-- like the example you gave, S3, right? Or DynamoDB. And so once something has been around long enough, people can trust it, and that further puts it in stone so people can keep using it. And I wonder-- one of the other things that I thought was interesting was you were talking about how Amazon previously wasn't really, I don't know what the word would be, friendly to front end developers, but it's--
Swyx: [00:46:45] Yeah, I'd say that. Yeah.
Jeremy: [00:46:46] Yeah. And you go to that AWS console. And I would argue not even just for front end developers, but for full stack or backend developers. As somebody who has worked with Rails, we're accustomed to being able to go to Heroku and push our code, and then Heroku puts it all up and sets up the database and the web server and all that stuff, and it's all abstracted away. But when you go to Amazon and you look at the menu, there's a hundred services or whatever, and you're kinda like, I don't know where to start. And then there's, what's it called?
Swyx: [00:47:25] IAM
Jeremy: [00:47:26] Yeah, all the IAM policies, where you have to set up the security and you have to hook up all the different services together. Is Amazon going to be able to move into the space of a Heroku or a Netlify or a Vercel-- these companies that are building on top of Amazon and providing these really nice developer experiences? It sounds like you think Amazon's going to try and move into that space.
Swyx: [00:48:01] Yeah, I think it's about two years into this move. It's apparently going well. That's what I'm told. I'll really see when I join, but, yeah. I'm a little bit casting my lot in and saying I'll be a part of that, you know? Like I said before, you have to be an active participant in the thing that you're trying to promote.
I don't kid myself that Amazon will ever compete with Heroku or Netlify or Vercel on the limits of developer experience. Amazon will never win a design award. No one will ever gush about the console or anything. It doesn't have to, it just has to be good enough.
And then there's the other benefits. There's a trade off to all that startup awesomeness, which is that you don't have the majority of the rest of the platform, and when you need to scale, you can't, you just have to migrate. And that's, that's unfortunate, right? That's just how startups have to make their way with things. I think Amazon has some potential as the only underlying layer to let people onboard and scale as they get to a certain size. So I think that's an interesting thing to bet on. I'm personally not even betting on that. I just think it's fascinating that Amazon's even trying in the first place, and I'd like to be a part of that. But then also just learn everything that I can and share with people. Because people fundamentally want to learn Amazon. I've just realized this after a while, despite all of its messiness and complications. Yes, onboarding sucks, and yes, the initial developer experience sucks, but you know what else also constitutes the developer experience? Whether the service sticks around forever. Whether it has really good uptime, whether it has predictable pricing that never goes up, it only goes down. Amazon has a 20 year track record of that. So I'm like, yes, you can always do a lot better on the initial experience, but it's not just the initial experience, it's also the subsequent experience of: I need this thing and, oops, the platform that I chose doesn't have it. And that's fine. That's a trade off that everyone has to make for themselves. But it's nice to have a platform that just has everything by default. And the thing that you suck on is the initial experience, which is fixable, right? So that's how I view it. I may regret these words later.
Jeremy: [00:50:35] It'll be interesting from your perspective once you get into it, right? And then you actually start to see, okay, what is the process of getting a feature built at Amazon? And getting to figure out, okay, are there ways that we can improve that onboarding or improve that developer experience? Or are there just a bunch of things happening that you're not aware of yet, and those things are what make it so difficult to have that great experience from the start?
Swyx: [00:51:09] Yeah, I mean, on the company process side of things, I'll tell you, everyone in the Valley, everyone in startups idealizes the way that Amazon works, right? Like, have you heard of the two pizza teams?
Jeremy: [00:51:22] I have. Yeah.
Swyx: [00:51:23] Right? Have you heard of the, the six pager, the Amazon PR six pager thing?
Jeremy: [00:51:27] So is that where, like, if you have a new feature, then you write six pages?
Swyx: [00:51:33] You write the press release before you actually code.
Jeremy: [00:51:37] Got it. Yeah.
Swyx: [00:51:37] Right? So these are just accepted. These are best practices, which, I'll be honest, at Netlify, we didn't even do that. But everyone understands that these are just good ways of working.
I'm fascinated, I just want to see it in practice. I want to go to the source and just see. Because as badly as Amazon does a lot of things, they get a few things right. And this is what Steve Yegge, I don't know if you've heard of him, probably not, but Steve Yegge has this Google Platforms Rant.
I'll send that to you because everyone should read it. It's amazing. But basically he's like, yeah, Google does everything right, Amazon does everything wrong except for one thing. And then he lists a bunch of things. That's a little bit out of date now, but I think fundamentally true that Amazon is very strictly run according to a small set of principles that I personally strongly identify with.
So the interview was completely not a challenge for me because I love this stuff. Which people make fun of, cause one of the principles is just ridiculous. One of them was like, "be right a lot." Like, great.
Jeremy: [00:52:48] Okay?
Swyx: [00:52:49] Yeah. Yeah. It's like-- Do I choose to be wrong? No. But anyway, I think they live by the principles and one of them is the two pizza team thing. This is like the six pager thing. Even as an outsider, I've already been inundated by like, yes, this is a real thing. And yes, we live by that. And customer obsession is interesting. Because it doesn't show up in the design for sure, but it shows up in like other things that matter, like uptime and pricing and, availability zones and what have you. It's just like a new thing for me and I'm just joining in everything's new and I've gone from working like-- the biggest company I worked at, it was 200 people and now it's like 800,000, you know? I definitely look at this as like, I don't even know fully what I'm getting myself into. And that's exciting to me. Cause I definitely think of that as a learning experience.
Jeremy: [00:53:46] Yeah, that's a good point in that when someone's looking into taking another job or taking another position, your background is, you've worked at a lot of startups that have been relatively small teams. And so when somebody is thinking about making a move, like they might consider Oh, maybe I do want to try like a giant enterprise, or I want to try a FANG. Just so that I get to see how people there work differently than what I'm accustomed to. Because you may go in and you may hate it, right? But you won't really know until you jump in and see how things work. So that's an interesting sort of, additional factor to think about when you're picking a job.
Swyx: [00:54:27] Yep, yep. For sure. I'm also viewing this as me generalizing a bit at this chapter. I guess one of the questions in people's tech careers is: how much of a specialist versus a generalist do I want to be? And for me, I think the advice I have landed on is, when in doubt, specialize. Otherwise you don't really know how to get past the trough of "Oh, I don't know anything" to the mastery level where you start making the tutorials that other people depend on. I understand the front end field enough. Obviously I don't know everything, but it's enough that I can probably figure out how to do whatever I need to do at any given stage. I want to generalize a bit beyond just being a front end guy, to front end and serverless and AWS stuff, and possibly anything beyond that.
Maybe one way to phrase this is that you have a lot of lateral transfer opportunities within a large company compared to a small company where you're just asked to perform one function. I think that's an interesting way to think about career trajectory as well.
Cause a lot of people think about, should I do startup versus big co? Definitely the household recognition helps, the brand name recognition helps. Once you've been through such a rigorous interview-- like, I don't know if you've interviewed at Google, but Google does do very rigorous interviews.
So like, once you are ex-Google, no one really questions your ability to code. So that's great. And then you can do other more speculative stuff with your career. Whereas if you're always from a small startup, you're always going to be interviewed with a whiteboard question for the rest of your life. Which is fine, but it's annoying.
Jeremy: [00:56:17] For somebody who is coming out of university or coming out of a boot camp-- given your experience, do you think it's better for them to start with a startup or do you think it's better for them to shoot for an enterprise or a FAANG?
Swyx: [00:56:34] Ooh... It really-- I don't know. That's an interesting question. I'll talk about my personal preference. I probably would have preferred a big co to start with, because the validation question, especially coming out of a bootcamp, of whether you're qualified goes away once you join a big co, whereas at a startup, the question always sticks around.
I did not have that opportunity. I interviewed with Google 15 times. But actually, yeah, 15 times. There were two on-sites and they just packed a lot in there. Actually, no, not 15 times, nine times. I'm sorry, I don't know where the 15 came from. That's almost like a superficial thing, because at the end of the day it just matters where you grow the most. And it really depends on your personality and how much you fit with the team. A good mentor at a small company beats a crappy mentor at a huge company, you know what I mean?
So, it's not always about career image, it's also about how much you can grow at that specific position and how much you're excited and fueled by it. For sure it's not gonna be your last job, you know? So you do have to think about that from day one that you land on that first job. You're also thinking about: do I enjoy this? What else exists out there? How can I keep sharpening my skills and broadening my interests to really find the thing that I want? Cause it's probably not going to be on the first date that you find someone that you're gonna marry for the rest of your life, you know? It's kinda weird, cause it's the same process of finding a match, and we just have a lot fewer tries at it than in real life dating. We're expected to find our first job and then just fall into it.
But, really there's a lot more out there and we might not find the right fit for awhile. I do think a lot of people in their mid careers there's like, some sort of mid career crisis that some developers go through, because they realized they're not that passionate about the thing that they've been typecast into from junior to all the way to senior. Like keeping a wide focus wide angle of what else is out there is important, especially as you start out. I don't care that much where people start. I just care that it's a good environment for them to grow. And that they start growing their network as well. Cause I think when you're early on it really is about your coding ability, but as you go almost senior, it's less and less about your coding ability and more about the communities that you're involved in, the architectural decisions that you've been through and understood. And the people that you know and can hire to to work with you is also important.
Jeremy: [00:59:08] Yeah. And so when you're at your positions, like you joined a new job, what are the ways that you can optimize your time there? Like whether that's trying to make sure that you can learn the most as possible from your coworkers or try to get mentorship or what are the strategies that you would ask people to consider?
Swyx: [00:59:31] All right. So I split this as junior dev versus senior dev. I haven't written the senior dev part, but I have some ideas. The junior dev part: this is your time to not know anything. You should ask as many air-quotes "stupid" questions as possible, because that's the responsibility of your managers, to train you. And some people have screwed up. Some people have destroyed their production database on day one of the job, and guess what, it's not your fault. It's the fault of the managers for not building a resilient system to recover from that, or to prevent you from being able to do that in the first place.
The other strategy I like to encourage people to use is to pair program. There's just so much that you can pick up just from osmosis, of having a mind meld while you code. Someone can watch over you while you code, or you watch over them coding, right? And then you can see all the little tips and tricks that they do, and that levels you up very quickly. In fact, some companies like Pivotal actually always pair program. And that's very high stress, cause it's a lot of talking.
Jeremy: [01:00:27] That sounds exhausting.
Swyx: [01:00:29] Also, it's just that you don't have distractions, cause someone's looking over your shoulder. But it was great. I pair programmed a few times at Netlify and it was awesome.
And then once you start being able to get other people in your head, that's one way of emulating them. You start having a senior developer in your head, right? And you go like, what would they do? What would X do? What would I do? When you come to a similar situation.
And that's valuable, at least for the code review, where when you send in code, you can step back from yourself and have an out of body experience, and go like, okay, now I am the person reviewing my code. What are they likely to say?
And you can review your own code proactively and show them that you're learning. Show them that you're picking up not just the ability to code, but the ability to integrate with and mind meld with the rest of your team. And that's a really awesome way of thinking about finding a groove.
Then there's also trying to add value, and essentially I try to phrase it as: you should be a problem sink, not a problem generator. This is hard to do at the start, but basically, people should be able to give you problems and you solve them.
There should be no new problems generated from your work, otherwise you're just adding problems. The buck stops with you in that sense. And so the more capable you are, the more people will realize that when they have stuff to do, they can assign it to you and it will be done. That's probably the most immediate value. You can be proactive about that too. So you don't always have to sit there and just wait for things to come your way. You can actually ask to be assigned to projects that you want to work on.
Often it's doing the things that nobody else wants to do that ends up making you indispensable, because A) it has to be done anyway, and B) nobody else wants to do them. So if you're the one doing them, then you own it, and you're indispensable now because you're the person that owns that. And guess what? You can make anything interesting, even the most mundane things. Let's talk about tests. The best way to start a new job is to write tests, because you cannot break the code base. In fact, you're doing the opposite of breaking the code base. You're making it more resilient. You're not just writing tests, you're also removing and changing outdated tests. Your coworkers will be grateful. There are never enough tests, because whenever it's shipping versus writing more tests, you ship, right? And then you go on to the next thing. So as much as we all like to say that we believe in tests, we don't always test everything. And as someone new, you can always contribute more tests, you can learn the code base, and you can learn the product. You probably, as a user, have used the product that you're working on, but you only see things from your perspective, and you don't see all the hundreds of edge cases that you're also handling for, let's say, enterprise clients.
Writing tests is a nonintrusive, noninvasive way of contributing value to the code base. I think that's a highly profitable thing. I have some links from people who used this strategy at Netflix and Ionic and Paytm, and I'm sure there are a lot of other companies where people have used this and they just didn't see my tweet at the time.
But I basically said: write tests when you're trying to join a code base. Once you find your groove, you basically start to go from reactive, where things come to you and you work on them, to proactive, where you start to ask to be assigned to strategically important projects, because your profile will rise according to the projects you work on.
That's one way of ensuring that you have impact. Because if you look at all the engineering ladders out there-- so I actually went through and collected all the public engineering ladders that people put out, also blog posts-- but basically, if you look at all of them, they all evaluate engineers based on business impact, which is weird because you don't really have any control over that. You just write code. But you have control over the projects that you work on, and it's up to you to position yourself. It's a little bit of politics. How much do you understand what the company's core business is, and how much do you support that effort?
And obviously you contribute more value if you work on the company's most essential projects, so you want to do that. Then there's all the other meta-learning stuff around the actual work. So there's reading technical books, reading framework source code, learning more languages, following people over projects.
So the projects that you use are probably just projects to you, but there are people who work behind them; start following them. You should also be working on useful side projects. So basically I have this whole long list of things which are probably super unrealistic for any real person to do, but these are just ideas that people can pick off, like: hey, I feel like I should be doing more. Here's a list.
I do strongly encourage people to do talks, because doing talks is the ultimate skin in the game of learning, because your face is there when you do the talk. When I say do talks, obviously there are no conferences right now, but you can do meetups. You can do brown bag lunches at work, or talk to your family, or to your dog, or to your YouTube channel. I don't care. The process of going through and teaching and speaking really solidifies it in your brain. It's that whole learning in public process, but you can do it at work, and people really appreciate you for that.
In fact, Matt Gerstman, one of my friends in New York, started the JavaScript Guild at Dropbox. He got it to a point where he organized an internal conference for a thousand of his colleagues, and they all flew in, and he did talks, and he definitely raised his profile at Dropbox just because of doing that.
And it wasn't his job. He just did it. But like, he's now viewed as the internal JavaScript expert, which is a huge position to be in for a company like Dropbox. That's, that's awesome. There's other things like guest writing for industry sites, blogging, answering questions.
Ooh. So, answering questions: you don't know what you don't know. And one way to get past that is to answer other people's questions, because then you get exposure to the things that other people who are not you run into, and then you're like, oh, I did not know that. You know what I mean? There's no way I could have figured that out if I didn't answer other people's questions. So, on the React subreddit we have 500 Q&As every month of beginner questions, and if you want to get good at React, just go in and answer every question (laughs) right?
You scale by the number of people who are throwing stuff at you, and so if work isn't challenging enough, there are unlimited questions on Stack Overflow and Twitter and Reddit and GitHub issues, and you should just dive in and help out.
Jeremy: [01:06:42] For sure. Yeah. That was an interesting twist in terms of the JavaScript Guild or the internal conference at Dropbox where you've been talking about this idea of learning in public, but learning in public could also include the company that you work for, right?
Swyx: [01:07:00] Learn at work. I mean, some companies are tiny, so there's not much public to do, but when you're the size of Dropbox, I think they had like 10,000 people then. Yeah. It's the same thing (laughs).
Jeremy: [01:07:10] Yeah that's cool. One of the things about your career and your learning is that you have picked up so many different technologies and learned so many different things. When you were learning, what were the resources that were the most helpful to you?
Swyx: [01:07:28] Wow. That's broad (laughs).
Jeremy: [01:07:31] Maybe it's too broad (laughs).
Swyx: [01:07:32] More recently I think I just really got into a groove of reading technical books from cover to cover. I think people don't do this enough. Senior developers spend years of their life working on a book and then they sell it for peanuts.
You can buy their expertise and just read it. That's amazing. I learned TypeScript that way. I learned CSS that way. I learned DynamoDB that way now, cause I just did Alex DeBrie's DynamoDB book. Yeah, technical books are super underrated, because no one has the patience to sit through a long form book.
But as an intermediate developer, that's what you gotta do, because tutorials are targeted at beginners, right? Beginners want to go from zero to hello world as fast as possible. And then experts don't need a tutorial. They just need to know what changed. Like, give me the changelog and I can figure it out. But the intermediate people, they know some things, they don't know everything. So what's the best way to fill in the gaps? Read the docs or read the book, I don't care.
Yes, you will find a lot of things that you already know, and you can breeze past those, but you'll probably find a lot of holes that you weren't really sure about. You think, I'll get to this someday, but you never go back to it. Read technical books, or read the entire docs, or read the entire source code if you're ambitious. I don't recommend that for everything, but books you can definitely handle, right? Because books are made for you to skim through; it'll probably take you a week, two weeks, whatever. That will be a very high leverage thing, because the amount of hours that went into preparing a book is so different from podcasts or any of the popcorn stuff that we consume on a day to day basis, right? Again, books are Lindy compounding. They took a while to create, and they'll probably last longer than the hack-a-day blog posts that we get. Does that make sense?
Jeremy: [01:09:19] Yeah. Yeah. No, that's actually a really good example, because when I look online for how to do something, like you said, you'll find all sorts of blog posts, and a lot of the time they stay at the very beginner level, and sometimes you're not even sure if they're up to date or if what they're teaching is still relevant.
Swyx: [01:09:40] Yeah. Cause Google wants you to believe that all knowledge is one Google search away. But you get to a blog post and you're like, this doesn't exactly match up, cause the version's a little bit out of date. What do I do now? You're totally lost because you're not learning from first principles.
You're just copying, pasting. So what a book or some form of long focus study gets you is the conceptual understanding to figure out anything that you need in terms of coming from first principles instead of just following instructions. And that's when you become more of a software engineer rather than a software user.
Like, I had this post, "The Day I Became a Software Engineer." My job title was software engineer, but I was not a software engineer, because I was still following other people's instructions. I didn't really have a conceptual connection with the thing that I was doing. But once I started to look at source code, once I started to understand conceptually what I was doing and figuring it out from first principles, then you're really doing software engineering, right? You're not just following instructions, you're not using other people's software the way they tell you to use it. You're really interacting with it on a conceptual level.
Jeremy: [01:10:40] Yeah, books are a really good example because, when I'm looking for things online, a lot of times I wish there were more, intermediate resources or maybe even advanced resources because you'll have a thousand posts on how to make a blog or something like that but maybe not a post on how to do this complicated thing in graphQL. I agree with you I think books are definitely underrated. A lot of people don't even think to go see if there is a book they can go read. I'm wondering from your perspective, are there also other things that you think people could be creating? Whether that's, more technical blog posts or courses. Like what do you think that people should be creating or should exist in the world that would help people that are at that intermediate level or that advanced level?
Swyx: [01:11:34] Workshops are a thing. One resource that definitely helped me was Frontend Masters. They basically do books in video form: they just have the creator of the technology teach a four hour, eight hour workshop on their technology, and that covers it.
Obviously it doesn't have as much room for nuance as a book does, but, different formats will appeal to different people. I definitely respect that. Some people just can't sit still for a whole book. So you can definitely create different forms of content for that.
It's just that books are well understood. They've existed for, I dunno, 4,000 years, and we know how to deal with books. The other forms of media are a lot newer.
I have this theme of open source knowledge as well. That's a principle that I'm expanding on. So, for example, my React and TypeScript cheat sheet that I started, it's not a book. It's a repo that anyone can contribute to, and it basically just gets better over time. A book is something that one person or two people work on for an extended period of time, and then there's a definite end date, and then they ship it. And then maybe it's out for a while, and it gets updated at some point, but it's not as live, it's not as current as something that's always on, like Wikipedia, which is just constantly updated, right?
Wikipedia is like the ultimate version of open source knowledge. We used to have encyclopedias curated by expert teams, and those things just got destroyed, at a fraction of their cost, by Wikipedia, right? Because everyone could contribute.
And so that's the idea of open source knowledge. We have open source code and we understand that by open sourcing code, more people can look at it. More people can contribute and write issues and it just is better after many eyeballs are on it. But why don't we open source knowledge?
And books are closed source knowledge, right? I'm the author of this book, I will make all the decisions. But maybe there should be something more collaborative, that everyone can mark up, that just gets better over time, and that everyone can access.
So I do like this idea of a more collaborative form of books. It's still a book basically, I just call it open source knowledge.
Jeremy: [01:13:45] Yeah. Maybe that could be the the docs for a project, right? The docs for a framework where the docs are so good that they get you the knowledge that you would have gotten from a book, but it's in the form of a git repo that other people can contribute to.
Swyx: [01:14:05] I mean, people tend to think of docs as something that is officially maintained by the team. But you can only write for so many audiences at once, and you're going to make some people unhappy. There's an unlimited space for community docs, essentially X for Y: like MongoDB for JavaScript people, MongoDB for Ruby people, and then it just really focuses in on those use cases and does a good job of it.
Again, not to go back to my example too much, but it's the one I'm most familiar with: when I was trying to learn TypeScript, the React docs did not have TypeScript docs because they supported Flow. And TypeScript did not have React documentation because TypeScript was focused on being TypeScript. So it's that intersection of people like me and the technologies that I wanted to learn.
There's an unlimited space for those. Every project has only one official documentation, but there's this unlimited space for unofficial documentation for different audiences, and people can do that (laughs).
Jeremy: [01:15:01] Yeah. That's an interesting point about people all bring different context, right? Like they've worked with different languages or different frameworks before. And when somebody writes a book, like they have to sort of sit down and decide okay, what is my audience going to be?
I'm going to write this book and there are going to be people where this is great for them and there's going to be people where the level is too high or the level is too low and it's impossible to satisfy everyone. I like your idea of there being different communities building their own public knowledge of this is what we as a group learned, and it may not match you exactly if you just find it on Google, but there is going to be a community that does understand that and can contribute and it can be a living document, like you said.
Swyx: [01:15:54] Yeah. In fact, I think we could all do a better job of saying at the start who this is for, and if it's very clearly not for you, then the reader can move on and we can all be happy. But everyone tries to write for everybody, and it just ends up a random mishmash of stuff. And as readers, we're wasting time, cause we're trying to evaluate whether this is the right thing for me or not. We should always just say: this is for X. If you're not an X, here are other resources that you can go to. We'd probably be better off as a community (laughs).
Jeremy: [01:16:23] Yeah. It reminds me of looking up a tutorial for I don't know, rust, and then you get to the tutorial and they go like, okay, this is what a variable is, or this is what a for loop is.
Swyx: [01:16:41] (laughs) Like, ahhh! Come on! Exactly. The TypeScript tutorial, when I started, was teaching people ES6 JavaScript, which I didn't need cause I already knew it. It was trying to colocate these things at once, trying to be beginner friendly or whatever. And I was like, I'm sorry, I'm not a beginner, and this doesn't help me at all.
It just argues for a diversity of docs, and more people should be writing everything. So there's an unlimited amount of things to get people involved in contributing, but also learning in public.
It's a beneficial thing for them and for the community. I can't say enough good things. Okay, so the point I wanted to make was that the other feature of open source knowledge is that because it's evergreen, it's always updating.
It starts to have Lindy compounding effects, right? So it's not just a one off blog post. You're actually maintaining it and keeping it up to date. I like that as a concrete form of learning in public, cause you're not creating and throwing away your work. And that's really awesome.
Oh, the other thing is, having things in your life where you know that you cannot possibly overshoot is a luxury. It means that you can invest infinitely in that thing; more is always better. Again, this is how much I admire Amazon's principles. A lot of people ask Bezos, what do you think is going to change in the next 10 years? And for him, everything's going to change. What's more interesting is what is not going to change. Those are the things that you can invest in, because everything else, if you invest in them and then things change on you, your investment is wasted. But the things that don't change are the things that you can really invest in for a lifetime, and that's kinda what he did.
So for him, the example was, like, the dumbest one, which was that customers are always going to want lower prices and faster delivery. They're not ever going to say, hey, I want higher prices, or I want slower delivery. And based on that he built Amazon and Amazon Prime. For me, this idea that everyone can do a little bit more in public is something I cannot overshoot (laughs).
Jeremy: [01:18:33] Yeah, sure.
Swyx: [01:18:35] So I can just say it all day long and not get sick of it. Cause I know that the more I can get people to do it, the more it would be better for them and better for me, frankly. As a participant in the dev community, I would want to see more people doing it. So, having things you can't overshoot is really good.
Jeremy: [01:18:49] Right cause there's never going to be a point where people are going to say like, Oh, there's too many tutorials. There's too many guides.
Swyx: [01:18:56] There are people who say that, and I think that can be true. But you're probably paying too much attention to beginner level stuff. Because I get it, like I'm an intermediate or advanced person and I see a bunch of beginner tutorials all day long on my Reddit, but it is helpful for someone out there who is a beginner. And all I have to do is ignore that and find better sources of info for myself. You just have to get better at curating your own info stream.
Jeremy: [01:19:23] Yeah. It's not that there's too many, it's that maybe the fact that there are a lot makes it harder to find the ones that you care about. And that's, like you said, more of a curation or a search problem. And I'm not, I'm not sure how, how we solve that.
Swyx: [01:19:39] Look, don't center your entire basis of learning on tutorials all the time. That's it. If your learning strategy is that someday the perfect tutorial will come along and I will become a better developer because of it... just stop, that's not how you get better. Go read books. I'll tell you, there's not enough books, you know? So yeah.
Jeremy: [01:20:02] Yeah, that's a good point. Like you can't always hope that somebody is going to have put in the work to do that tutorial or answer that Stack Overflow question for you. You do need to build that solid base. And then, once you figure it out, then hopefully you'll write the tutorial.
I think that's a good place to start wrapping up, but is there anything else that you wanted to mention or wanted to plug?
Swyx: [01:20:27] No, I'm still working on this book. It was supposed to be a two week project and then I started really getting into it. So I don't really have a book to plug. I just hope that people try to go from a hundred percent private to putting something out there.
Now more than ever, you can start to market yourself as a senior developer. If you're a junior, you want to market yourself as a senior. And you want to disconnect your income and your influence from your hours. That's wealth, basically. You're building wealth so that everything that you leave out there in public works without your presence being required anymore. As developers, a lot of us sell our time for coding, right? But we never really think about the people who buy our services with money and use that to do something more valuable than what we have. They use it, for example, to write sites and apps, and we should try to build for ourselves as well, in whatever form.
For me, it's been content. I've been relatively successful with writing and speaking, but for others it might be an app, like a side project or whatever. But we should always think about how to make the most use of our developer talents. And it's almost a guarantee that you're being undervalued by your employer, because that's just fundamentally how this works. They pay you less than they get from you, which is a fair trade, don't get me wrong, but you should always be thinking about how you can improve on that. So hopefully people can figure that out for themselves. I'm still figuring it out.
Jeremy: [01:21:59] So it sounds like maybe in a year or two we'll be seeing Swyx, Incorporated?
Swyx: [01:22:05] I mean, I'm committed to Amazon for at least four years. I think this book thing has some legs though. I've been blogging for about a year, but people understand that I put out quality work.
When I announced I was launching this book, I didn't have a book to sell, but people understood that I was going to put out something of quality. Basically, what I'm trying to say is I sold an empty PDF for $4,000. There's something there where everyone can have some sort of side hustle where they monetize their ability to write and to teach.
I've done some egghead tutorials and that brings in, I think, like $500 a month. I haven't really done that much, but you could do something as an instructor. As a professional developer, someone finds your coding ability of value, and you can definitely share that more widely.
Jeremy: [01:22:55] For sure. Yeah. And then that also hopefully builds your ability to communicate and to write and that can apply to any type of position or any type of work you do in the future.
Swyx: [01:23:08] Yeah. Yeah. Yeah. The trap that I find myself in is that I don't want to give everyone the impression that they need to go down this path because this is very much like a person who writes books and is a developer relations person. And the vast majority of developers are not that, but I think their own careers and their own learning can be enhanced by doing this stuff, even if they don't have direct financial benefit from it.
Like, I call some of these blog posts that I do friend catchers, which are basically things that earn you friends in your sleep. Like this podcast is a friend catcher, because after I'm done recording this, you're going to put it out and people a year from now will hear about it.
They'll know who I am, they'll contact me. And that's decoupled from my time, right? So this is a good use of time because of that, and I think everyone can think about ways in which their workflow can be improved by having some decoupling of their income from their time. That's all I have to say on that.
Jeremy: [01:24:05] Yeah, that makes a lot of sense. I mean, like you said, whether it's podcasts or conference talks or blog posts, that sort of thing. Those are the kinds of things that can keep helping people long after you've finished them and can get you in touch with new people, so that makes a lot of sense.
Swyx: [01:24:24] Oh, I got one more for you. Sorry, I know, I don't want to take too long to close out. But basically I had this whole section on strategy, like business strategy. As a developer you spend a lot of time learning about the art and science of coding, the art and science of creating software.
But you should also spend some time learning about the business of software, like how people make money off of your work. So I have a section on the basic business models, like advertising, agencies, marketplaces, SaaS. People should understand the economic imperatives of what these things are, not least because you want the company that you joined, and have options or RSUs in, to do well. But you also want to be able to put yourself on strategically important projects that will have impact, right? You want to be able to suggest features and say, hey, this would be really easy for me to implement. It's not technically your job, because your PM is supposed to do that, or your CEO will do that, but you, as someone who touches the code, have enormous power over what the final output actually looks like, because you're like, oh, this is low hanging fruit, I can actually put that in there. You control so much, because you control the tech stack and you estimate the cost of relative projects; you can actually say, this is really easy compared to the relative impact it's going to have on the business. So I think everyone should understand business strategy to some extent, cause it directly affects your career. You want to be in something that is strategically important to the company. You want to invest in megatrends that are going to last your entire career, so you're at least early on them.
So yeah, these are all things that we discuss. I realized this only like halfway through, because I was like, oh, I'm in this weird spot of being in a really good position to write about this, because I'm a developer. Plus I used to invest in tech stocks from a hedge fund point of view, and I have a finance degree and all that. I should probably write this down, cause no one else talks about this, right? I've looked at other career books and I was like, yeah: preparing a portfolio, writing your resume. No one actually talks about strategy.
Jeremy: [01:26:19] Yeah. It's the next step: when people work on a project, they want to build something that's going to be useful to the customer, but then you take it one step above that and ask, what is the thing that is going to bring my company money, so that I can pitch this to my manager or my technical lead?
Swyx: [01:26:41] And I think there's a founder in all of us. We're all looking for what we want to do next, and obviously there's important strategy there. Look, I'm not the authority on any of this, right? I'm just some random dude writing my thoughts out at this point in my career.
And I hope 10 years from now I will think I'm a total idiot and disagree with the majority of what I say, cause then that means I will have grown. But I hope that these topics are at least brought up in people's minds and they can at least start exploring them for themselves and figuring out what they decide. And I'd love to have this discussion with anyone on this.
Jeremy: [01:27:16] Very cool. Swyx, thank you so much for chatting with me today. I think there's a lot for people to unpack and a lot for people to think about in terms of learning in public and what to do in their careers. And I think people are gonna really enjoy it.
Swyx: [01:27:31] Yeah, we went two hours. Holy shit (laughs). All right. Thanks, Jeremy.
Jeremy: [01:27:38] And that was my chat with Swyx. If you want to see a good example of someone who documented their learning process, I highly recommend you check out a post by a previous guest, Federico Pereiro, about how he writes back ends. Many people from around the internet reviewed what he wrote and gave him productive suggestions.
Alright, that's gonna do it for this episode. We talked about a lot, so don't forget to check out the software sessions website for transcripts. All right. See ya.
Philip Kiely is the author of Writing for Software Developers and has written for companies like Smashing Magazine, CSS-Tricks, and Twilio.
Music by Crystal Cola: 12:30 AM / Orion
Transcript
You can help edit this transcript on GitHub.
Jeremy: [00:00:35] This episode, I'm talking to Philip Kiely. He's the author of the book "Writing for Software Developers." We usually talk about code on this show, but knowing how to communicate with developers is just as important. We discuss topics like making articles easily skimmable, writing with a specific audience in mind, and crafting emails that people will actually read.
We also talk about his experience creating, promoting, and launching his book. And what he learned from his many interviews with people like Cassidy Williams and Patrick McKenzie. If you want to get better at writing or you've ever considered launching a product, I think you'll enjoy this conversation with Philip.
Philip, you recently graduated with a computer science bachelor's from Grinnell College, and I'm sure you did a lot of research. When you have something new that you're trying to learn, how do you get started?
Philip: [00:00:48] As a developer, I spend a lot of time getting error messages while I'm coding. The first thing I do whenever I get an error message is I just plop it into Google, see what comes up, see what people are saying on Stack Overflow, GitHub issues, that sort of stuff.
I also have a background in journalism. Having worked at the Grinnell College Scarlet and Black newspaper, I have a lot of experience going out and just talking to people, doing interviews. And that's why that felt like a very natural thing to include in the book.
It's also something I've been fortunate to do in articles. For example, I wrote an article in Smashing Magazine about remote part time teams using agile development methodologies. As part of that, I did a quick email interview with one of my professors at the time as well as a different professor who had written the book that we were using as the textbook in this class about software development.
Bringing in those outside perspectives allowed me to write an article that was so much better than anything I could come up with from my own experience. So really when you're looking for things to use in your own writing look broadly and don't just look at the first page of Google results.
Ask specific questions, whether it be of a search engine or of an experienced person in the field. And then cross check that information against each other to make sure that it all matches up and presents the information in a way that's orderly and meaningful to the reader.
Jeremy: [00:02:19] The people you're reaching out to most likely don't know you and you're asking them for advice or for an interview. How do you get these people to say yes and agree to help you out?
Philip: [00:02:31] It depends on what I'm asking for. In the case of the article that I was mentioning earlier, I was asking for maybe 5-10 minutes of their time to do a quick quoted answer to one or two very specific questions that I knew they had off the top of their head, because I had just read a whole book about it, and that is actually a pretty easy ask. Most people who are in the public do receive a lot of email but don't necessarily receive a ton of these sorts of requests on a daily basis. And if you make yours compelling and specific, they are pretty likely to respond to you.
I'm nowhere near the public profile of the people I've been reaching out to. But over the past week with this book launch, I've gotten a ton of inbound emails so it's been really interesting to consider this question from the other side. Considering which of these emails I'm taking the time to write very detailed responses to. Which ones I'm shooting back a real quick one liner. And which ones I'm just pretending I never saw and sticking it in the trash.
I think the key is making my emails to these people who I'm trying to get interviews from very short, very specific, and very relevant to their interests; showing that I'm not just reaching out to them because they're a big name or because I had an idle inclination, but because I'm very invested in a project that I need help with.
In terms of finding interviews for the book, it was the same process, just magnified, because instead of asking for a quick email exchange, I'm asking for 30-60 minutes of their time. I'm asking to ask them some really tough questions about their work and have them think through it and give answers that are going to get published and have a bunch of people read them.
I'm also implicitly asking them to endorse the work that I'm doing and the product itself with their presence in it. I have the picture of everyone who I interviewed on the sales page, for example. And with that the pitch is a little different. It's about investing credibility. The people who I interviewed for this book were kind enough to invest some of their credibility in this project and if the book was total garbage they would have gotten a negative return on that investment and future projects that they're involved with.
On the other hand, if I put in a lot of work editing and producing a great book then they're going to get a positive return on investment and to some small degree be even more credible for their involvement with this project.
I never state that explicitly, but throughout the process both the first short cold email, the scheduling, the interview itself, and then keeping them up to date on the progress of the project after the interview... I just have to deliver to them professional correspondence that gives them confidence that this project is going to have a positive return on the credibility that they invest in it.
Regarding the interview itself... I want to make sure that I'm asking them questions that people haven't asked them before or at least asking them these questions in new ways that allow them to say new things that they haven't said publicly in the past. Both because it's more interesting for them and it allows me to create a better product in doing so.
I'm going to inform them that I've read their work and not just by saying, "Hey, I've read your books" or "I've read your articles", "I've listened to your podcast". I'm going to say something specific that I liked about a previous thing that they've done. Or I'm going to reference that I just interviewed someone for the book that they know from a previous interaction. By establishing that credibility and proof that I'm going to be very invested in the project I can make them more likely to respond.
The final thing is that it really is a volume game as well. I sent almost a hundred pitches to the sort of people who you would expect to see on the list next to the 11 who did say yes. A number of people were just too busy to respond, have a policy against doing interviews, or for any number of reasons didn't want to appear in the book, and that's totally fine; that's their prerogative.
What I would encourage people to say is: okay, if I need 10 interviews for the thing that I'm trying to do, I'm going to send a hundred emails. If I'm writing an article and I only need one or two interviews, I'm going to send 10 emails. If you get more interviews than you're looking for, that's great. I was only planning on putting eight in this book and I got 11, and they all add a ton to the product. It takes a lot of time, because you are sending specific, targeted emails at scale, but it was also worth it.
Jeremy: [00:07:18] So basically when you want to talk to someone... do your research, read their work, learn a little bit about their background and show them that in your email so it'll be more personal and it will make it clear that you're not just shooting emails out to everybody and hoping for a response.
Philip: [00:07:32] Then the other approach that you can take is more like the approach that I used pitching you to come on this show. I think I wrote you about a three-sentence email. The first sentence was: "Hey, I just graduated from college and made $20,000 selling a book." The second sentence was: "I'm doing a podcast book tour." The third sentence was: here's how you can contact me.
When you're doing cold email like that it's almost like a non-negative form of clickbait. You're trying to intrigue the person who you're reaching out to say: Hey, I'm going to give you some information that might be interesting to you given what I know about your background. I'm going to include links so that you can verify that I'm not some scammer. Then here is the exact actionable steps that you can take with this information if you're interested.
Jeremy: [00:08:22] It was very straightforward. Then if you click the link, you see the landing page and all the people that you interviewed, whether that's Patrick McKenzie of Stripe or Cassidy Williams from Netlify. And I think anybody who sees that landing page and all the people that you interviewed will probably think there's something to the book. So there's two different strategies for getting someone to notice your email.
Philip: [00:08:45] Right. You don't need that. I didn't have any of that social proof when I was getting the people who I was interviewing to interview with me. But if you have it, you should definitely use it, cause it makes it way easier.
Jeremy: [00:08:58] And when I received your email, I previously saw the post on Hacker News about your book. It was really easy to make that connection because I saw the same title in your email and I didn't have to think too hard about it.
Philip: [00:09:13] All credit for this title goes to my mother. I wanted to call this book the "Technical Content Development Handbook," and "Writing for Software Developers" is a much better title because it more accurately describes what the book is about, who the audience is, and how they're going to use it. But also, when you're sending cold email like this, every single bit of this very, very short email counts a lot. Having every aspect of your product optimized so that people can understand it quickly is really helpful when you're sending cold email.
Jeremy: [00:09:45] You had an interview with Cassidy Williams, and she was saying developers don't like being sold to. I think developers, in a broader sense, like knowing the specifics of what they're getting.
They want that title that says this is what this book is about. So that was a smart move.
Philip: [00:10:07] Yeah. Her interview definitely influenced how I created the sales page for this book by highlighting the interviews that I did, providing a free sample chapter, answering a bunch of made up questions in my frequently asked questions section. All of that was designed to do exactly the same thing that I do when I'm writing articles for clients that then use them for content marketing. I'm just providing a bunch of information and trusting that the people in my target audience are smart and engaged and want to find valuable things on the internet and that all I have to do to make the sale is provide them with the information they need to make their own decisions.
Jeremy: [00:10:49] Now that you're on the other end looking at inbound emails, what gets you to click versus just keep scrolling?
Philip: [00:10:59] It's a little difficult, because there are so many different things that can make me interested. I'm still new to this, so every email I receive is in some way incredibly exciting. Check back in 10 years and see if that's still the case. But I think that a confident, well written email about just about anything will pique my interest, as long as someone demonstrates that they've put a bit of thought and effort into the communication. I feel somewhat of an obligation to respond with an equal amount of thought and effort.
Jeremy: [00:11:35] Instead of emails, how about when you're researching something? You're trying to learn how to do something and you see the list of tutorials on Google. Do you have a heuristic or something that you look at to decide: this one doesn't look good, I'm going to keep scrolling?
Philip: [00:11:50] Absolutely. And it depends on why I'm doing research. If I am programming myself and I want to solve an issue like I was talking about at the beginning of this episode I'm just going to be clicking through things as fast as I can, looking for code samples and trying it and seeing what works.
On the other hand, when I'm trying to learn a new skill or get some general background knowledge because I'm about to write about a topic, I take a much more methodical approach, where first I click through the first few results, find something, read it, and then I read all of the things that article references. And then I read all of the things that those articles reference. It's more of a depth-first search than a breadth-first search.
And by doing that, I rely on the efforts that other authors have made to curate good information in the text or in a further reading or sources section at the end. And I make the conscious effort to do the same thing in my own writing, including in most articles some combination of in-text citations or a list of further reading at the end. That list both covers the exact topics that I used in the article and maybe includes all of the sources that I've cited, as well as interesting things on tangential topics that might not have made it into the article proper but are going to be really useful to someone embarking on a similar journey, trying to figure out a topic.
Jeremy: [00:13:22] There's two very different modes of research. You're saying one is I am working on a problem right now and I want to know how to solve it. And for that you're basically scrolling through and looking for code samples that you can copy and paste.
And the other would be trying to get some kind of deeper knowledge and the way you search through things is very different. I think it's interesting to think about it that way.
Philip: [00:13:50] It is and it's very important to think about your audience's mode when you're writing. A lot of people talk about picking an audience, finding and defining exactly who you want to write for, and figuring out things like: How much programming experience does my potential reader have?
Those questions are very, very important and can be difficult to address because as a writer you get to define your own audience. You write for someone with a specific background and that person comes and finds your work if you do a good job of distributing it. And people without that background might come across it but don't read it.
However when you're thinking about your audience's mode that is something that you have even more control over because you can write the exact same topic with the exact same background two very different ways depending on whether or not you're trying to get someone to sit down, work through an entire sample application with you, and really develop a broad and deep understanding of a particular topic.
Or you're just trying to get someone in and out and get them a code sample and get them a quick explanation of why they might be experiencing an error and what they can do in the future to resolve it and then get them back to their work.
I think great publishers understand this difference and they're going to hire people to write technical content or they're going to write technical content themselves with an eye towards intentionally creating both types of content and trying to avoid hanging out in the middle where they have an article that's difficult to parse and doesn't provide sufficient depth.
Jeremy: [00:15:28] When I'm looking for a specific code sample, a lot of times I'll look for links to Stack Overflow, because I know that if I go to Stack Overflow then it's going to be a short code snippet and I'm not going to spend extra time trying to find the thing that I'm looking for.
How do you decide whether to go to Stack Overflow vs click on a long blog post?
Philip: [00:15:56] Stack Overflow is an incredibly important part of the technical content ecosystem online. When I think about my competitors, I don't think about the people who I'm competing against to maybe write an article for a client or land on the front page of Hacker News every day.
I'm competing against the vast amount of free technical content already available on sites like Stack Overflow, and what Stack Overflow is really great for is exactly this quick answer stuff that we've been talking about. But the place where I think blog posts can shine is when someone is not asking how, but instead asking why or what or who or any of these other questions.
Because if all you need is a code sample it's probably because you already know how to use it. You already know exactly what you want to do. Bringing back my original point about asking specific questions: Sometimes you don't yet have a specific question and your goal in research is not to find an answer, it's to find a better question.
And a lot of times, going to longer form resources like blog posts or articles or even books is a great way of asking better questions, which is a really, really critical first step in research that is easy to overlook because it might not feel like progress. It's kind of like when you're programming and you've been stuck on an error message for a while and then you change something, and you get another error message. But at least it's a new message, and that's really exciting, and that feels like progress. The same thing with asking a newer, better, more specific question can be achieved by consulting longer form resources first. Then you say, okay, great: I know that I just need to sort this database index by date ascending rather than descending, and that's my issue, that's what's causing all these problems. I can look up the SQL query to do that on Stack Overflow.
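The ascending-versus-descending change described here really is a one-word edit in SQL. As a minimal sketch, using Python's built-in sqlite3 module (the `posts` table and its columns are made up purely for illustration):

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (title TEXT, published TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [("Second", "2020-02-01"), ("First", "2020-01-01"), ("Third", "2020-03-01")],
)

# Flipping ASC to DESC is the whole one-word fix.
ascending = [
    row[0] for row in conn.execute("SELECT title FROM posts ORDER BY published ASC")
]
descending = [
    row[0] for row in conn.execute("SELECT title FROM posts ORDER BY published DESC")
]

print(ascending)   # ['First', 'Second', 'Third']
print(descending)  # ['Third', 'Second', 'First']
```

ISO-formatted date strings sort correctly as plain text, which is why `ORDER BY` on a TEXT column works here.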
Jeremy: [00:17:51] Yeah, that's a great point, because Stack Overflow is not always the best place to understand why you would do something a certain way. Somebody will answer with a code snippet, and if you copy and paste that, it'll probably work. But sometimes what gets lost is: is that the thing that you really should have been doing?
If you're not quite sure what you want it makes a lot of sense to have that be a long form blog post or a book or something like that.
Philip: Absolutely.
Jeremy: [00:18:23] Julia Evans creates zines, they're these illustrated guides on technical concepts. One thing she recommends is that people write for one person.
How do you decide who that person is and what's the best way to let the reader decide if they are that person you're writing for?
Philip: [00:18:45] That's a great question, because once you figure that out, everything else is pretty easy. A lot of the time I write for myself in the past. I write the guide or the article or the book that I wish I had been able to read before starting out on a project.
But sometimes I think about, okay, I have a friend who wants to do something, I have a coworker who struggled with this. Just figuring out someone from my life who I can think about as an example of the audience. And again, that's sometimes, but not always, me. That can be very helpful for answering this question, and based on my interviews, it's what a lot of other people do as well.
Jeremy: [00:19:29] When somebody comes to the article, do you expect them to scroll through and figure out if it's for them? Or do you think there should be a more explicit sign post saying: "Hey, for this tutorial you should probably know these things."
Philip: [00:19:44] I don't really believe in categorizing tutorials as "This is a beginner tutorial," because maybe it has something really useful and someone might overlook it if they consider themselves above that. On the flip side, "This is an advanced tutorial" could scare away someone who is ready for it but just isn't confident in their work yet.
What I think of as one of the implicit markers of a tutorial is, first, the setting up section, where you talk the reader through what they're going to need to configure in their environment in order to run your sample code. If you spend a lot of time walking through the exact minutiae of here's how you set up a virtual environment in Python, here's how you pip install a package, that's going to be a really strong signal that this article is for a beginner level reader. An advanced reader can go ahead and skip that, assume that their environment is configured correctly, and move on to the thing that they're looking for, and you haven't really bothered them by including that setup information, as long as it isn't pages and pages of it.
On the other hand, if I was writing about a very advanced topic I might not include very substantial guidelines on how to get set up both because it's going to save some space that I'm able to instead assign to describing this complicated thing I'm talking about, but also because if someone isn't able to figure out the setup steps that I provide, then they're probably not ready for the advanced article itself.
I think that the setup section of any article provides a very important form of positive gatekeeping in terms of implicitly informing people whether or not an article is written with an audience in mind that they were part of.
Other than that, I think there's a focus on very specific titles. Not clickbait, but explaining exactly what you're doing, how you're doing it, and why you're doing it, both in the title, the summary, and the first couple paragraphs, is a great way of letting readers know whether or not they're in the audience. And that even ties back to my own book, because the title "Writing for Software Developers" lets anyone who's not a software developer know that they're probably not going to get a lot of value out of this book.
Jeremy: [00:21:59] Rather than being this gatekeeper you let the reader infer it through the title, introduction, the setup steps, and hopefully they can decide from there whether it makes sense to keep going.
Philip: [00:22:16] Exactly. The other thing about technical content in general is that it's an opportunity to provide people with a pathway into the field. People talk about how computer science is one of the easiest things to learn without going to college. I'm fortunate to have gone to a great college with a very strong CS program, but even still, I learned a lot of the things that I use in my actual job and in my work outside of the classroom by using technical content. So I think it would be counterproductive to explicitly tell anyone: "Hey, this isn't for you."
Jeremy: [00:22:53] When you are writing an article how do you decide the scope for it? To give you an example: You mentioned you had listened to the episode with Courtland Allen. He talks about a lot of different things that are interesting.
For example: How he made his site stand out with the design, how he scaled his site, the technical stack. If you were going to write an article based on something like that, how would you decide whether to put it all in one or split it up into pieces?
Philip: [00:23:30] If I'm faced with a very large project like that, I would definitely split it up into pieces. I mean, there are so many interesting aspects of Indie Hackers. I'm sure I could write an entire article just about the CSS within the upvote and downvote buttons. You can get very, very deep and very, very specific, and there's a lot of value in doing that.
On the other hand, one thing that was consistently mentioned in the interviews is that if you write a series of articles, people aren't necessarily going to read the whole series. So what I would focus on is picking out a bunch of the most interesting aspects and trying to create connected but self-contained pieces of the puzzle.
For example, this is something that I've been doing for the past few months with Smashing Magazine. Every so often I've been publishing a Django tutorial, and they all fit together into a broader understanding of how to use the framework. But I'll also very intentionally create them so that they are individually consumable and do not rely on any other bits of the series in order for you to get the code samples up and running, or for you to get an understanding of what the topic of the article is and what the main takeaways are.
I think I would do something very similar. I would take the smallest piece that I think will make an article, because I'm always surprised by how much bigger every topic ends up being once I start writing about it. I would find the most interesting nuggets from this massive project, and I would build individually packaged tutorials around each of them. And that's where I'd start.
If you knew it was going to be a book or a video course or something that was delivered as a continuous whole, that's where I'd invest some time in talking about, okay, here is the overall structure. Here's how all of these things connect to each other, and here's how you can identify patterns in your own projects or your own work that match patterns in this example and will lead you to the specific pieces of content that I've developed that might address similar issues to the things that you're facing.
Jeremy: [00:25:43] You would default to splitting things up into as small a piece as possible while still being valuable on its own. And then if you were going to try a larger project you would have to know upfront whether that's a book or a video course.
Philip: [00:26:01] Exactly. And the thing with technical content is you have to trust your audience a lot. For example, the first half of the book is about writing a 2,000-word technical tutorial for clients, and I talked about this for three reasons.
The first is that writing a 2,000-word article for a client is a great, achievable first challenge for people who want to get into this.
Then there was also a large amount of demand from publishers for this exact format and style of writing.
But the main reason is that this is what I have experience in, and that experience transfers very easily. Everything we've talked about in this interview about finding an audience, figuring out your scope, figuring out how to make your code samples complete and runnable: all of that applies whether you're writing a book, a README, documentation, or a wiki. The skills are really transferable.
Similarly, when I'm writing technical content, I have to assume that the people who are reading it are smart, they're engaged, and they're going to take the specific context that I've presented them. They're going to take their own context and they're going to figure out how to bridge that gap.
Jeremy: [00:27:15] You had an interview with Chris, who's a community manager at DigitalOcean, and he mentioned how readers will skim code and images and gloss over long paragraphs.
Does that match with how you navigate content or is that more specific to when you're searching for a certain type of thing?
Philip: [00:27:35] That interview had a lot of surprising things in it. And that was one of the most surprising to me because I spend a lot of my time when I'm writing focusing on crafting not necessarily sentences that my literature professors would consider to be well-executed, but at least competent English phrasing. And I spent a lot of time thinking about the readability of my paragraphs. Starting at one idea, flowing into another one, and to hear that a lot of people were skipping them was kind of disheartening. But then I did think about how I read content myself. I'm fortunate to be a really fast reader. If I'm trying to read a book with another person, it can be a little unfortunate because I'll get all the way to the end and they'll want to talk about chapter three. The way that I skim articles is I just read the text but really, really fast.
Chris had actual statistics based on using Hotjar on scotch.io to see how people were actually interacting with the site. And in aggregate, a large portion of the people were just scrolling down to the first code sample, copying it, and when I think about the mode of: I need to solve a particular issue in this application so that I can commit it and go eat lunch, then absolutely that's what I'm going to do. As a technical content writer, there are two things I can do to address that.
One is to increase the skimmability of my content, as he was talking about: maintain the same focus on good writing, but put more white space in there, more paragraph breaks, more bulleted lists, numbered lists, quick summaries. Because if those are going to be really useful to someone, I should absolutely include them. And it's going to do nothing to detract from the experience of someone who wants to really read and sit with the content.
And then the second thing is when I'm considering how to write this stuff, I do need to focus a lot more on putting in code samples that are complete, runnable, and understandable without the context of the article around them. Even if that makes those code snippets a little longer.
And it makes describing them a little more difficult, because I have to parse through eight lines of code, including some import statements or configuration. But for the majority of the people in get-it-done mode who are looking at the article, if that's going to help them copy it into their application more easily, then that's an investment worth making.
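To illustrate the trade-off Philip describes, here is a hypothetical snippet written to be copyable on its own. The imports and the configuration constant add a few lines, but nothing in it depends on context outside the block; the file name and settings keys are invented for the example:

```python
# A self-contained sample: the imports and configuration make it longer,
# but it runs as-is without the article's surrounding context.
import json
from pathlib import Path

CONFIG_PATH = Path("settings.json")  # hypothetical config file name

def load_settings(path=CONFIG_PATH):
    """Read settings from a JSON file, falling back to built-in defaults."""
    defaults = {"debug": False, "page_size": 20}
    if path.exists():
        defaults.update(json.loads(path.read_text()))
    return defaults

print(load_settings())
```

A reader in get-it-done mode can paste the whole block into their project; a reader in learning mode still gets a complete picture of the configuration being assumed.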
Jeremy: [00:30:14] Let's say I'm going through a tutorial and I follow the code as it's described, but it's missing import statements or steps to install prerequisites. That's a lot of time sunk. You end up googling how to complete the tutorial that you're currently working on. I can understand why that's frustrating.
Philip: [00:30:36] Exactly, and that's something that is worth investing in avoiding. One of the best examples I've ever seen of this is Mark McGranaghan's Go by Example, where you can take every single code example on that entire site, copy it, paste it into an online Go interpreter or your own Go environment, hit run, and there you go. It works exactly the way it says on the site. The value of that is incredible.
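In that same spirit, a Go by Example-style entry pairs a tiny but complete program with inline commentary. A Python equivalent might look like this; the example itself is ours, not taken from that site:

```python
# A complete, paste-and-run example in the Go by Example spirit:
# every line needed to execute it is right here.
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()

# Counter tallies hashable items; most_common sorts by frequency.
counts = Counter(words)
print(counts.most_common(1))  # [('the', 3)]
```

Nothing is elided, so the snippet behaves identically whether it is read in context or pasted cold into an interpreter.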
One of the places we can see a real investment being made in that is with Stripe's experimental documentation that they published a couple months ago where instead of just showing people: "Here's the single function you write to integrate the payments API." They created sample applications. Not just one sample application. But the same sample application with three different backends, four different frontends, two different frameworks, all this different stuff. And you can mix and match these and download these sample applications.
And then they also did the two column design that Mark McGranaghan talked about where you have the code on the right, the words on the left and they're matched up, and you actually click through the steps in Stripe's documentation and it zooms around in the code example that you're looking for, highlighting the blocks of code that are relevant to the information that you're taking a look at.
By doing something like this, you're creating more than just a piece of technical content. You're creating an immersive experience where it's a guided walk through a complete piece of sample code and that makes it really, really easy to figure out how to bring in the thing that you need into the application that you're working on.
Or if you're not in get it done mode, you're in learning mode. Then it's really easy to take a broader look at the application and see how it's structured, see the decisions that went into it in the high level, and understand how you can structure things similarly in the future.
Jeremy: [00:32:38] It's letting the consumer of the content, the person who's looking at the tutorial decide whether all they need is the code and they understand just by looking at the code or whether they need that extra sort of guidance that's going to show up on the left side.
Philip: [00:32:53] Exactly. And making something like that is not a small investment. I don't even want to speculate how many hours of engineering time went into that. If you look at the fully loaded costs, they probably spent hundreds of thousands of dollars of engineering time on this demo. Which is worth it for them, because if it gets people to integrate Stripe into their applications, then Stripe's gonna make some money.
Individuals such as myself can't necessarily make a similar scale investment in technical content. I write almost exclusively about JavaScript, HTML, CSS, Python, Django, because those are the languages that I know really well and that I feel confident teaching people.
And if someone asked me to write a sample application in 10 different languages and frameworks, I just straight up would not be able to do a good job of that. When you're an individual thinking about how to make meaningful investments in your work and you don't have those sorts of resources, what you can do, even if you're not doing it in a bunch of different languages, is focus in on one. You can make a really great sample application that's complete, runnable, and configured out of the box, and that's going to provide a smaller, niche audience of readers that same value. And then what's great is it's going to be free on the internet, so someone else can come along and say, hey, I really like this, they did a really interesting thing here, I'm going to do something similar in my framework and language of choice. And that's how great ideas and practices can propagate around the internet.
Jeremy: [00:34:31] When I'm trying to learn something, whether it's searching online or even if it's trying to do something in a company. The way that I typically learn and I think a lot of other people learn is they see an example of something somebody has done before. And that could be a sample project or it could be the way a certain feature was built in a code base at work, and just by seeing how somebody has already done it, it's not a full copy and paste, but it gives you this guide for how you can build something similar. And I think that allows people to be able to build something new a lot quicker than, just looking at API documentation.
Philip: [00:35:17] Exactly. There's no new code. We're all standing on each other's shoulders. And I think that's one of the things that's so great about technical content is even though you know a lot of it's paywalled or a lot of it you sell to publishers who release it for free with advertising or content marketing.
Ultimately we're all helping each other out to build things and it's about creating a corpus of knowledge across the entire internet that everyone benefits from.
Jeremy: [00:35:44] I'm interested to see how we can improve how we share that knowledge going forward. Is there something beyond the current tutorials we have, like full-featured applications or templates, that people could use to learn from?
Philip: [00:36:01] I think that an underappreciated source of this knowledge is open source repositories. You can learn a lot just by going and inspecting the actual source code of big open source projects. Now, open source projects have a ton of responsibility already. The developers and maintainers and volunteers are under a massive amount of pressure to keep everything running and up to date, and to continue providing new features that people are asking for, often with very little funding.
So I can't just go ahead and say: "Hey, open source maintainers. You've got to add a bunch of documentation." because that's just not a realistic thing to ask for.
But I think that one underexplored area that can be pretty interesting is taking a piece of open source code and writing, either as a separate thing or as a contribution to the repository directly, an explanation of how it works. And I can give an anecdote about this. Well over a year ago, I was working on this application with my friend, and I wanted to render an iPhone in CSS so that I could stick an image inside of it. I found an open source library that provided these beautiful CSS renderings, and the one issue is that they weren't resizable. So I created my own fork of the library and wrote a Python script that used SCSS and some variables and some other things to go through and automatically convert all of these non-resizable CSS devices into resizable ones. I used them and just sort of forgot about it.
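A very rough sketch of that kind of rewrite might look like the following. This is not the actual script or library; the selector, the pixel values, and the choice of a single `--scale` custom property are all invented for illustration:

```python
# Hypothetical sketch: turn fixed pixel values in a CSS rule into
# expressions scaled by a single --scale custom property.
import re

css = ".device { width: 370px; height: 800px; border-radius: 54px; }"

def make_resizable(rule):
    """Replace every '<N>px' with 'calc(<N> * var(--scale, 1px))'."""
    return re.sub(r"(\d+)px", r"calc(\1 * var(--scale, 1px))", rule)

print(make_resizable(css))
```

With a transform like this, setting `--scale: 0.5px` on a container would shrink every dimension of the rendered device proportionally, which is the resizability the fork was after.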
A year later, I decide: hey, I want to write an article for CSS Tricks. When have I done something cool with CSS before? Oh hey, this time with this open source library. I pull out the code, and I realize that a year ago I had no clue what I was doing and was a terrible programmer. So I rewrite the whole thing. Which is a very common occurrence: when I pull something out from the past to use in an article, the first thing I do is decide it's terrible and rewrite it. But that's an important part of my own learning process, and then of sharing that learning with my audience. So anyway, here I am. I have this great code. I write an article about it. And CSS Tricks is thrilled. They publish it.
What's super great is first off, this is increasing the visibility of an open source product, both the original one and then the modified version that I created.
Secondly, anyone who comes to this modified version of the product and wants to know more about it can go look at this article and read more about how it was created and thus really understand how to use it.
The third thing is that when people read technical content, as we've been talking about this whole episode, the first thing they do is they take it and they copy it and they paste it into their own applications. Whether that's personal projects, things for work, whatever. So when you're publishing technical content, the responsible thing to do is to make all of the source code open source under some very unrestrictive license. All of my clients do this. Either they'll release it themselves as open source. There'll be a disclaimer in the footer. Or what happens a lot is I create an open source thing and then I write an article based on the thing.
I sell the article to the client, but not the thing itself that's not considered part of the work product. And while this might seem like a technicality, it is an important thing to think about in terms of how you're delivering value not only to the client or to your employer, but also to the reader. Because if you're going to create something useful they've got to be able to use it. Open source can help.
Jeremy: [00:39:36] That's an interesting example of having this open source library, and having this tutorial or document that describes how it works and helps other people learn how they can build something similar. And at the same time, you got paid, by CSS Tricks in this case.
So it's like everybody benefits.
Philip: [00:39:55] Exactly. It's an indirect model of promoting and funding open source software. And in this case this library is a very small scale thing. It's not something I've dedicated a ton of time to. And the original creators of the library used it as content marketing for their product.
But ultimately, I don't know if this model is scalable. I don't think you could support the development of Django by just writing a ton of tutorials about Django. But I think that at the smaller scale it is just another way that individual developers are able to monetize our own and other people's contributions to open source software in order to create useful things and pay the bills.
Jeremy: [00:40:44] Yeah. I think an interesting possibility that I don't think I've seen before is, there are some open source applications, like for example GitLab, which is an open source project written in Ruby on Rails. I wonder if somebody could, based on GitLab's code, write a book or a series of articles explaining: this is how we wrote a real-world application in Rails. Because I think when you have a code base that is as large as theirs is, telling somebody to just look at the source code can be pretty daunting.
I think there could be room to come in and walk people through parts of the source code and have something that they can learn and take to build their own projects with. I think that's a really interesting possibility.
Philip: [00:41:38] Absolutely it is. There are two great examples that I can give about the marriage of open source software, technical content, and a sustainable business model.
The first is Taylor Otwell's work with Laravel which is a PHP web development framework where he and a team have managed to not only create really great open source products, really great technical content. But also charge for various value adds that larger corporations, larger teams who are using these products are able to invest a lot in because they rely very heavily on this as a foundational part of their infrastructure and stack. Which in turn goes back to fund open source developments that make this whole product better for everyone.
Another great example is Adam Wathan's work with Tailwind UI. Tailwind is an open source CSS library that is very, very popular. It's kind of like Bootstrap in my mind, although I know that power users might disagree. The way that this is monetized is through something called Tailwind UI, which is a collection of components that you can buy either an individual or a team license to, which was an inspiration for the team license in my own product, actually.
You can buy an individual or team license, and it's, I think, $300 or $600. This gives you access to a ton of components and a ton of technical content about how to use these components, and Tailwind in general, to make your websites look great. And if you think about it, $300 is what, three hours of developer time? Maybe less if you are hiring very experienced developers and you have fancy offices somewhere. So it's a total no-brainer to purchase a piece of not just technical content, but also reusable example code that you and your development team can use to create something that looks great and drives business results.
One of the things that I drew inspiration from them as well as Daniel Vassallo's book is the idea of price discrimination with a corporate license. When you buy an ebook, it's kind of like buying a physical copy of a book. You're not supposed to email a copy of it to everyone on your team. But I figured that some companies have documentation teams or might otherwise want to purchase a copy of this for everyone working there. But I didn't want them to have to go buy a bunch of copies and so I created a team license. A new tier of the product on Gumroad. I added a one page PDF saying, here are the three ISBN numbers that you can share with as many people in your company as you want and set the price a hundred dollars higher, figuring, Hey, this is a huge markup. This is amazing. So all that work took me about 10 minutes.
So far, seven people have bought corporate licenses, which is really exciting and is a massive return on that time invested. So that's another facet of thinking about your audience during the product development cycle: how different people in different situations might use your product, and then creating something that adjusts to their needs. Because for a company, it's worth the hundred dollars to know that they have permission to use this book however they see fit, and to share it with as many people as they want.
Having that kind of license both helps me out because it made me all of this extra money. It also helps out the people who are buying it because it lets them use this content in a way that they might not have been able to otherwise to better achieve their goals.
Jeremy: [00:45:12] That's a great point. The threshold for what somebody thinks is expensive is very different for a person versus a company. If I look at something and it's a hundred bucks and I'm buying it for myself, it's a little expensive for me to buy. But if it's a business paying for it, then it doesn't even register.
They'll put it on a card. You don't have to think about it. It makes a lot of sense to price differently depending on who's buying.
Philip: [00:45:35] Exactly. It allows me to provide the value to the people who are able to pay for it while also keeping the main product cheaper for everyone else because I'm not trying to hit a midpoint between individual and corporate use.
Ultimately what both of these examples show to me is that where a lot of the money is in technical content is first, you make something great, and then you save people time. So you've made some really great open source thing that people want to use. And then you take the businesses who can really invest in saving their teams some time and you say, Hey, I'm going to make you a great resource. It's going to cost a few hundred dollars, but it's going to increase your development velocity so much that you won't care at all. I know that both of these people have also done a lot of other stuff in technical content, written great books, built a big audience doing interesting work in public. And so I'm not sure I can attribute all of their success to this business model or even the majority of it. But I think that this business model is the sort of thing that I'm looking at very closely as I consider longer term more sustainable work in technical content development.
Jeremy: [00:46:46] Adam Wathan with Tailwind is a great example because Adam has put together screencasts, he's put together tutorials, and the quality is so incredibly high. Just going through those tutorials I learned a ton about CSS beyond just learning about Tailwind. Then you have this product that helps you move even quicker it's a very easy decision.
Philip: [00:47:11] Exactly. And this goes back to what we were quoting earlier from Cassidy Williams saying developers don't like to be sold to. We don't like to be sold to but we love to be taught interesting things. And that's why content marketing, really good content marketing does so well in the developer space. And it's actually something that gave me some pause and some doubts when I was thinking about launching my book is I was thinking I have what, 20 Twitter followers? My website gets a nice dozen hits a day. Half of them are from my friends. So I was thinking, do I have the established credibility to launch a product like this?
And are people gonna feel like I've created something that is valuable, or are they going to feel like I'm selling them something? Part of that was solved using the investment of credibility from my interview subjects I was talking about earlier, but part of it was an experiment. Everyone says you have to build an audience before you launch a product. My hypothesis was: I can build an audience by launching a product.
It happened to succeed. It's a single data point. I don't know if it's repeatable. I can't guarantee it'll work for anyone else, but a lot of people might look at a business model like Adam's, and say, okay, this is great for him, but he already made however much money selling books. He already invested years and years in building this open source library. I need money now. I can't do that. But I think that an iterative approach where you build a thing, you launch a thing, you use the momentum, you build a bigger thing, you launch a bigger thing might be possible and might be a way to introduce more people to this business model and this space.
Jeremy: [00:49:02] Like you said, a lot of people do say you should build an audience first. I had a conversation with Ben Orenstein, he did a lot of public work in terms of, conference talks, blog posts, he ran a podcast for a long time, mostly in the Ruby on Rails community.
And when you ask him how did you build your audience? He said well, I was just helpful to people for a decade. And you took a very different path where you didn't have that following, but you were able to have a successful launch out the door.
And what you were saying makes sense in terms of having other people vouch for you. Having people like Patrick McKenzie, post on Twitter and say like, Hey, there's this, book, writing for developers, I think it's really great. And if you like what I post on Twitter, or you like my blog posts, I think you're really gonna like this. For me as a developer or just as a person, when I see something like that, then I'll go like, Oh yeah, I trust Patrick. I'm definitely gonna check out this book. So I clicked the link and then I get the landing page and that's the second step, right?
You get the person to the landing page, And then what's on there is attractive enough where you go to the next step, click the buy button right? So I think that social proof or having somebody else make the recommendation is pretty huge.
Philip: [00:50:24] It really is, and I will not sit here and pretend that I would have sold the number of books that I did if I didn't have the buy in from these very successful people with the interviews and the promotions.
However, there are some steps that I made that I think anyone can replicate for their own products that I'd like to talk about sort of going through the funnel.
So at the top of the funnel, like we've been talking about, it's Twitter and Hacker News, and I had no following on Twitter, but no one has any following on Hacker News. The concept of a follower doesn't exist. Every single post goes into new. So for a week I focused on, I'm going to create a great Hacker News post.
I revised it like I would revise an article for a client, or even the book itself. I wrote a top-level comment that I posted immediately after posting the Show HN. I looked through all the data and statistics to see: okay, I'm going to want to launch this on a Tuesday at 8:00 AM central time, so that people in Europe are going to be awake to see it, because it's afternoon there. I'm going to get that East Coast and Midwest morning traffic, get it to the top by the time the West Coast wakes up, and then they're all going to see it. And that's where I'm going to get the massive influx in traffic.
So taking the time to really consider how you present yourself on these platforms matters, and that did rely on the fact that I've been a member and user of this platform for years, so I know it well. I know the sort of unwritten rules and what the audience is going to respond well to and what it won't.
To contrast that with Product Hunt, where I posted the same thing and got 10 upvotes and no views: having an understanding of the platform that you're launching on is really, really important, and is the difference between staying at the top of Show HN for a day versus getting ignored.
That's the top of the funnel: creating a post that really targets the specific audience of the platform that you're using. From there, my sales page, I spent a lot of time working on that. I based it on the advice from Nathan Barry's book Authority, which covers a lot of this stuff, and made sure that I included not just the social proof, but also things like a free chapter, so that people can look at the formatting and the content and make sure it's something that they want to read.
I focused on having a good bio listing my clients, which is a form of social proof that I earned for myself over the last 15 months or so writing for people. And then with the Gumroad page itself, I spent a lot of time thinking about the layout and the pricing, eventually landing on $36 as the right price, because on the one hand, the book teaches a lot of valuable skills. On the other hand, a lot of the people who might be interested in reading it are going to be students like me or other early-career people, which is why I also put on the website that anyone who can't afford it can just send me an email and I'll send them a copy for free. So I've sent out dozens and dozens of free copies as well.
And all of that just serves to get more Twitter followers, get more people excited about the book, and spread the word. Even small things like making the book DRM-free. A lot of people worry about, what if someone's going to pirate my book? Well, I'm worried about exactly one thing: what if someone goes and lists it on Amazon? That's why I've registered the copyright, so I can issue a takedown notice. Other than that, small authors such as myself have to worry about things like obscurity much more than things like piracy. I'm mostly concerned about getting the word out there about my ideas and my content, rather than making sure that every single person who looks at the book has done so in the approved manner.
Looking at all of that, there are all of these little steps that you can consider in a launch that are going to provide a big impact on the way that your product is received. People are going to be able to perceive that you are really, really invested in this thing that you've created and you've thought through all these things, and thus the thing that you're actually selling them is pretty good too.
In summary, absolutely. The people who I interviewed were a massive factor in driving sales, and I'm incredibly grateful to them for that. But none of that would have meant anything if the underlying product hadn't had that time investment that anyone can make.
Jeremy: [00:54:51] The way that I found out about the book was through Hacker News. It was seeing the post, and the title was very to the point: it's writing for software developers. I do writing, being a software developer, and I would like to get better at that. I click on the comments, I see your post, which was very targeted to Hacker News. It understood that people want to hear about the process, or they would want to hear about Patrick, who is well known on Hacker News, and you put a quote from him in your post. It was very effective at piquing the average reader's interest and getting them to click.
Philip: [00:55:31] Throughout the lifetime of that post on Hacker News, it generated almost 10,000 page views on my site. I'd imagine more, because I think the average Hacker News user has more trackers and analytics blockers than the average web user. But as a lower estimate, 8,000 or so people looked at the page and converted at remarkably high rates.
To anyone who wants to launch something technical, content related, or anything that the audience might like: investing some real time and effort in a great Hacker News launch is really worthwhile for sure.
Jeremy: [00:56:10] Another person that you interviewed for your book is Daniel Vassallo. Something that he's been really effective at is distilling thoughts into tweets. As you start to think about the next step of building your audience, how do you approach writing tweets versus more long form content?
Philip: [00:56:33] Yeah. This is something I've really been thinking about and struggling with over the past few days because I have this brand new audience and it's the biggest platform I've ever had. Over 750 people follow me on Twitter and that's a sign of trust that I take very, very seriously.
I want to focus on quality tweets over quantity and focus on creating threads to get around the 280 character limit. Because I mean, let's face it, I'm a long form sorta guy. I speak and write in a lot of words. And so what I focus on is taking something I've been thinking about or working on that day, distilling it into something actionable, and then providing that on Twitter and just hoping that people like it.
The other thing that I've been focusing on as I engage and grow this new audience is responsiveness. I'm trying to answer email really, really quickly. I'm not necessarily always doing a great job. Anytime someone responds to me on Twitter, I want to be answering them right back or at least liking the tweet or something.
I've sent so many direct messages. I actually had to look up Twitter's guidelines on how many you could send a day because I'm DMing people who follow me, who have interesting profiles and saying, Hey, you know, what do you want to read about on here? What do you want to know about the topics that I'm working with?
All of that is in the pursuit of figuring out what people want and then figuring out what I can provide and operating in the intersection of that.
Jeremy: It'll be interesting to see how you grow that audience and how you use Twitter in general.
Philip: [00:58:15] Yeah. The thing that I wake up thinking about every morning and go to bed thinking about every night is at this point, more than 20,000 people have looked at the writing for software developers landing page, which is super awesome, and ~560 of them have bought a copy of the book, which is also super awesome.
My question is, how do I get the rest of those people interested in this book? Or, if it's not the right thing for them, what is the right thing for them? What will get them contributing to this online community, creating their own technical content for publishers, for themselves and their own learning, for the companies that they work for, for open source projects, and just expanding this idea? Because one thing that Angel Guarisma said in our interview, right at the beginning, that's been a real motivation for me over the past few months, is that there was a lack of a canon, or accepted literature, on technical content development. There are not a lot of books besides this one and a few others about the things that we do. There was that sense of responsibility as I was writing this book: living up to the title, writing for software developers.
I don't think I've made it there yet, but in my pursuit of this field, I have the opportunity to be definitive, and that's something that I take very seriously and want to do a good job of and take my time and do right. But in order to do that, I have to engage with the people who like what I've done so far and ask them what's next?
What can I make for you that will help you make the things you want to make? That's an incredibly privileged position to be in and I'm very grateful for it and I just want to live up to it.
Jeremy: [01:00:08] Earlier we were talking about when Chris told you that people skim through articles that surprised you. Is there anything else from the interviews that really surprised you?
Philip: [01:00:25] One thing I was surprised by is that many of the people who I interviewed were so strongly against advertising, which at the time I was a little confused by. I've never really minded ads all that much but I've also grown up steeped in them. Today with the sharp decrease in the number of advertisers on the market right now and thus the decrease in rates a lot of publications that have been relying on advertising are struggling to find sponsors and get the sort of revenue that they've been previously accustomed to. That was surprisingly prescient of my interview subjects. That's just what comes from all of those years of experience and seeing dips in advertising rates in the past. I've always thought of it as a very solid foundation to build a business on where in fact, it turns out it is not. That was one thing.
I wasn't necessarily surprised, but it was really interesting to see the level of passion and commitment that everyone puts into the work that they do. That was very inspiring to me, because sometimes it can feel like, okay, I've just got to churn out another article. Just got to get something that's going to get the editor's approval, be decent, and that people won't hate.
And seeing the amount of effort that all of my interview subjects put into their own work, rekindled my passion for the field and really reminded me that every single article that I create has the opportunity to be my best work and that I should not waste that opportunity.
Jeremy: [01:02:06] One thing about your journey learning how to write and publish professional content is that it all happened while you were in university. I don't think there are many people who think, I'm going to write professional technical content. What was the path to finding out that was an option and deciding that it was something you wanted to do?
Philip: [01:02:29] So the great thing is that on the internet, no one knows you're a cat. That's a famous quote and I think that in technical content I have a picture and a byline and maybe a short bio, but that usually shows up at the bottom of the article, not the top.
And the great thing about publishing with some of the big names that I've gone with is their aura around my work gives the work the ability to stand for itself and people are able to judge the work rather than the author.
How I got into it was almost by accident. I started out doing some freelancing for a company based in Germany who was trying to localize their site for a United States audience. I rewrote their web copy in native English, to improve the perception of the site.
And then I noticed that they publish technical content. I asked if they wanted to pay me for some tutorials about how to use Django and they said absolutely. I wrote a couple of tutorials and it was a lot of fun. Then I thought, Hmm. If this one tiny company out in Germany publishes technical content maybe other companies do too, that could be cool. So I went around and I started looking. I found FloydHub, I found Smashing Magazine. I found all of these places.
This is a metaphor that I use in the book. Imagine if a college kid, being me, woke up one day and said, you know what? For my first attempt at journalism, I'm going to write the headline story in the New York Times. That would just go nowhere.
But what's amazing about technical content is because there's so much demand for it even the biggest publishers are willing to give you the time of day if you write a good pitch. So after doing some research and writing some outlines I started pitching people and people said yes a lot more than they said no.
I've done a lot of other attempts at freelancing and entrepreneurship and such before this. Building products, all of that stuff. And reaching out to two people to have one say yes instead of reaching out to a hundred people to have one say yes is a massive change of pace.
This is what I would wake up and do at 5:00 AM before work over the summer, because just that cycle of feedback and validation was really, really inspiring and got me to commit to creating technical content as an important facet of my overall work.
Jeremy: [01:05:00] What's interesting is a lot of people when they see a page like Smashing Magazine or CSS tricks, not even just people in school, but people who work professionally. Most of them probably think to themselves there's no way I could write for this publication. You have shown that it's a lot more accessible than people think.
Philip: [01:05:22] Yeah. I think unfortunately a lot of people struggle with imposter syndrome working in technology and that's a huge issue in our field. It limits the people coming in. It limits the diversity of the field. It limits the things that we can accomplish together. But I do hope that seeing, Hey, this college guy was able to publish with insert big name here is helpful for people as they decide that they want to either do something similar in technical content or the equivalent in their own field.
Jeremy: Cool. I think that's a good place to start wrapping up, but is there anything that you felt we should have talked about or you wanted to mention?
Philip: No, I'm really happy with everything we talked about.
Jeremy: If people want to check out writing for software developers or see what you're up to where should they head?
Philip: [01:06:16] So my central hub on the internet is my website, which is philipkiely.com the book is available at philipkiely.com/wfsd. I maintain an email list that you can sign up for on that website. I'm also increasingly active on Twitter and occasionally post content to YouTube. I'm working on a video right now going through my first week's analytics on this book and looking at where the traffic came from and who converted. I think that's going to be a really interesting thing for people who are interested in their own product launch to look at. And I'd be really thrilled if you purchased a copy of the book or followed my other work.
Jeremy: Thanks for chatting Philip and congratulations on the launch of writing for software developers
Philip: Well, thanks, Jeremy. It's been a great time talking to you today.
Jeremy: That's it for my chat with Philip you can get the transcript for this episode at softwaresessions.com. See ya!
Sara Leen is a localization programmer for XSEED Games on titles like Ys, Trails in the Sky, and Corpse Party. She got her start reverse engineering and fan translating games.
We discuss:
This episode originally aired on Software Engineering Radio.
Related Links:
Music by Crystal Cola: 12:30 AM / VHS Skyline
Transcript
You can help edit the transcript on GitHub.
Jeremy: Hey, this is Jeremy Jung. This episode, I'm talking to Sara Leen, a localization programmer for XSEED Games. Sara has over a decade of experience localizing games from Japanese to English. And we talk about how games are different than other types of software, storing and extracting text, porting games to different platforms, and rewriting a game from scratch.
A game written in an obscure programming language called... can you guess it? Hot Soup Processor.
Pretty good right? This was a fun episode and I hope you enjoy it.
Jeremy: [00:00:35] The first thing I want to start with is you have experience with building other types of software. I was kind of taking a look at your resume. What makes games different than working on a typical software application?
Sara: [00:00:48] With games, there's a lot of things that you have to worry about that you otherwise wouldn't like the frame rate of the application becomes very important. For example, you need to make sure that the timers of the application are coded just so that the game will always run at a steady pace. And there are various applications of timers like delta time that are usually used to ensure this goes perfectly smooth.
Jeremy: [00:01:17] When you talk about timers, I've heard of this concept called a game loop. Is that what you're referring to?
Sara: [00:01:27] Essentially a game is made of graphics, audio, and input. But all of these have to be checked every frame. And so you have your timer going so that it is basically looping through the game 60 times a second, or whatever other frame rate your game is running at (a lot run at 30). There will be various steps in this process, like game logic and drawing the various models. And of course you have to keep the sound system updating.
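The loop Sara describes can be sketched in a few lines of Python. This is a simplified illustration, not any particular engine's loop (the function name and structure are mine); real engines also track delta time rather than assuming every frame fits its budget:

```python
import time

def run_game_loop(max_frames, frame_rate=60):
    """A minimal fixed-rate game loop. max_frames caps the run so this
    example terminates; a real game loops until the player quits."""
    frame_duration = 1.0 / frame_rate
    completed = []
    for frame in range(max_frames):
        start = time.monotonic()
        # Each frame: poll input, advance game logic, draw, update audio.
        completed.append(frame)
        elapsed = time.monotonic() - start
        if elapsed < frame_duration:
            # Sleep off the remainder so the loop holds a steady pace.
            time.sleep(frame_duration - elapsed)
    return completed
```

The key property is that every frame takes (at least) the same amount of wall-clock time, which is what keeps the game running at a steady pace.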
Jeremy: [00:02:01] As opposed to a web application where a lot of things are in response to something, right? Somebody's going to a webpage, clicking a button. With a game it sounds like there is this consistent loop that's tied to the frame rate. Is that correct?
Sara: [00:02:20] Right. And instead of reacting to things, you are checking the state, of course, so many times per frame, and you will still have to respond to the user input, but it's all going to be one part of this large set of functions.
Jeremy: [00:02:38] So you're in this loop and let's say somebody was pressing a button on their controller or moving their mouse that would affect the state. And when you're in that loop it's going to look at that state to determine what to do next in the loop?
Sara: [00:02:58] Yes, exactly. For example, in your input system you are usually going to be saving, every frame, the state of the input system at the moment: is this button being pushed? Was this button pushed last frame? By comparing it against the previous frame, you know what the user is doing right now, and then many parts of the game logic will have various branches that check exactly what the user is inputting and what the current context is.
It's certainly more complex than the idea of I push a button and it calls this function.
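The frame-over-frame comparison Sara describes is how a game distinguishes a button being held from a button being newly pressed. A minimal sketch (the button names and dictionary shape are invented for illustration):

```python
def pressed_this_frame(current, previous):
    """Buttons that are down now but were up last frame ('just pressed').
    current and previous map button names to booleans."""
    return {b for b, down in current.items() if down and not previous.get(b, False)}

prev = {"jump": False, "fire": True}
curr = {"jump": True, "fire": True}
# "jump" was just pressed this frame; "fire" is merely being held down.
print(pressed_this_frame(curr, prev))  # -> {'jump'}
```

Each frame, the game logic consults this set (and the raw held-down state) instead of waiting for a callback, which is the inversion Jeremy and Sara are describing.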
Jeremy: [00:03:34] That sounds like an entirely different model of thinking. The structure of the application has to be almost completely different.
Sara: [00:03:42] Absolutely.
Jeremy: [00:03:43] In terms of skillsets working on a game versus working on a regular application. What parts are really different and which parts are kind of the same?
Sara: [00:03:54] Regardless of what you're working on you're going to be working with a lot of external APIs, but these are probably going to be entirely different for a game versus other software.
For example, you need to be familiar with APIs like DirectX and OpenGL and all of these typically have different ways of handling graphics, handling audio, handling input.
There's also SDL of course that you could use and depending highly on the programming language as well there may be different ways of handling certain things that would be different from handling a normal application.
But most importantly you need to be familiar with graphics and audio APIs above all. I think those are typically less important when you're working on normal software because you're more concerned about how to build your window and how to get all of the elements displaying in the proper positions.
Jeremy: [00:04:51] With a typical application the actual act of rendering to the screen is abstracted away from you so you don't have to worry about how to draw to this screen. It's more like, I want a button or I want a window and something else is handling that for you.
Sara: [00:05:08] Yes, that's pretty much it. And when it comes to video game graphics, you have to consider a lot of different things like differences between graphics cards. Can this graphics card handle square textures or does it need a different shape? As well, of course, as being concerned with the quality of how it's drawing. Like is this texture supposed to be filtered? Will this graphics card filter it correctly? The APIs simplify that, but you're working at a lower level.
Jeremy: [00:05:36] And when you were giving examples of APIs you were talking about DirectX and OpenGL and SDL. How low a level are we talking? If I wanted to draw something to the screen at the most base level what kind of work am I looking at?
Sara: [00:05:55] Well, for SDL in particular this is somewhat of a simplified process because it handles all of the initialization of devices and such for you.
But with DirectX and OpenGL, you have to actually get the list of devices you have to initialize each process that you need. Like you need Direct3D, you need DirectInput, you need DirectSound, and then you have to make sure that you have all the device information that you need, including stuff like the current resolution or the capabilities of the device. Device capabilities is a big thing when it comes to that.
And then to actually draw the image, you are going to need to create a texture buffer. You're going to need a back buffer for the screen, and then you will have to load the texture and you will have to draw the texture to the buffer and then present that to the screen.
Jeremy: [00:06:53] When you refer to the buffer, at kind of a high level, can you kind of explain what that's doing?
Sara: [00:06:58] Essentially you have the screen that you're drawing to, and you usually don't want every single draw command to go directly to the screen. It's better to have a buffer that you can operate with so that you can display things in the correct order at all times. You can have them on the correct Z level, and of course, any sort of processing you need to do, you can do before it's drawn to the screen. And so everything goes onto the buffer after it's ready more or less.
Jeremy: [00:07:29] And when something is in the buffer is that what you plan to actually have the video card render? Or is that after it's already been rendered? Where is that in the stage?
Sara: [00:07:43] Well, rendering is basically the last step of presentation, but it depends. If you are drawing something 3D naturally you are going to have to do some processing on that via the graphics card. And the APIs allow for things like that, but typically rendering to the screen itself is always going to be the very last step of the frame.
Jeremy: [00:08:06] And then elsewhere in your application even though you're putting things into the buffer, you're deciding that there are certain frames that you don't need to render?
Sara: [00:08:17] Yeah, and of course that depends highly on the game, but often there may be no difference. So you don't really need a new frame or there might be a case where the game is overall running a little too slow. So skipping some of these frames will get you more CPU cycles free to work with.
Jeremy: [00:08:36] We were talking about the loop earlier, where all the logic is happening at a set rate. But in terms of how smooth the game looks to people, you may have the game logic running at the same rate but only choose to render certain frames in the buffer, so that it still plays the same, it's just a little choppier?
Sara: [00:09:03] Yeah and essentially you need your game logic working at the highest that your gameplay can really go. But when it comes to the graphics, it's a little more arbitrary for sure. There are situations where you might intentionally want your game logic running at 120 frames per second and then the player input and such will still come at that speed regardless of the actual refresh rate of the display.
Jeremy: [00:09:32] Interesting. I want to talk a little bit about how games can have a lot of binary assets. Typical software development you would use git as a version control system. Would you use the same for games or is there a different system that's more suitable?
Sara: [00:09:49] Well naturally, this depends on the company. Quite a few do use git and simply store the binary assets even though there is no real comparison system for them. There are also various internal systems that companies may use that are specially made for handling their specific assets. Essentially version control is not that different in the gaming industry. It just occasionally uses different applications.
Jeremy: [00:10:22] And so when somebody uses git even with the binary assets, it's not an issue. You just can't see the diffs properly.
Sara: [00:10:32] No, it's typically not a big deal because it's not difficult to look at two different pictures and visually compare them. And generally you're going to be changing the binary assets less as development goes on rather than more, whereas the code that's always going to be changing.
Jeremy: [00:10:52] The title you use is localization programmer. What would you say is the main role of a localization programmer?
Sara: [00:11:02] Essentially, when a game is being localized from one language to another, obviously the goal is that you get all of the text inserted and you make sure that it all looks correct. At a base level, this is generally the accepted role of a localization programmer. You are getting all of the text into the game. You are making all of the code work with the correct fonts and whatever other asset changes you may need that are different from the original language. However, this process demands a lot of skills that you wouldn't normally need. For example, if you are working with Japanese games, all of the comments and everything are going to be in Japanese, of course.
Jeremy: [00:11:43] When you're translating games, do you usually have access to the original source code?
Sara: [00:11:49] I usually do, but there are plenty of cases where we are working on a game and the original developer is actually the one to handle the localized version. It depends a lot on how much other companies feel they are able to release to other parties, whether it's because of rights issues or they're just kind of nervous about their code quality, which I mean, I would be.
Jeremy: [00:12:12] And when you get access to the code do you usually have access to the original developers or are you on your own?
Sara: [00:12:21] I would say that completely varies. You never know exactly, but it's usually going to be the same with each partner. Like, they will usually be more than happy to help you if they have the developers on hand to help you. But game companies are constantly developing new titles themselves.
So when someone like me is working on a game, we may not have as much access to the developers or they would be working on the game.
Jeremy: [00:12:48] You were talking earlier about being able to extract text or bring text into the game. What are some examples of ways that the text would already be stored and how would you bring that out?
Sara: [00:13:03] This also varies a lot and it's very interesting. Many games use script files, essentially lists of commands for their events, et cetera. And in these cases, as long as you know the format of the script file, you're usually going to be fine. But some games like to hard code their text, and yet others have been in binary formats that have been compiled in some way, assembled rather.
And when that happens, you often have to take a look at the code and figure out exactly what this format is, which may or may not be documented, and basically do the process in reverse: disassemble it.
Jeremy: [00:13:46] The first example you gave was script files. So would that be a plain text file and you could replace that text with instead of Japanese put in English?
Sara: [00:13:57] Essentially, yes, like when there's a script file format, you're going to see various commands, like there will be message and then it will have the text for a message box, that kind of thing. And when that is the case, it's usually pretty easy to change since it is just text. But you might run into some encoding issues.
Jeremy: [00:14:17] By encoding issues, would that be where let's say you replaced the Japanese with English and then the game itself tries to load the texts from the script file and it can't due to a different text encoding?
Sara: [00:14:29] Not so much that, as you might get garbled text, what in the fan community that I used to work in we would call cave speak.
Jeremy: [00:14:38] Could you explain what you mean by that?
Sara: [00:14:40] Imagine that you are playing a game in Japanese, but your system doesn't have Japanese fonts loaded. The font not being there for Japanese, you might instead see gibberish like ekvgh, that kind of thing. And that is cave speak.
If the encoding is very specific to the game, or the font simply doesn't have other symbols, you might get complete gibberish in whatever font was already there. In Japanese, it's called mojibake.
The way that encodings work, of course, is that the text is interpreted from its binary or hexadecimal format into characters we can understand. And since there are a lot of different encodings out there, sometimes you don't know exactly what you're going to get. So, for example, if a game was made for an older system, it may have a completely custom encoding where a value like "2" means "a", and in that situation, just replacing the text, you have no idea what the game's going to output. You're going to have to change it all yourself.
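Once a custom table like the one Sara describes has been reverse engineered, decoding is a simple lookup. This is a hypothetical sketch; the byte-to-character mappings are invented for illustration, not taken from any real game:

```python
# Hypothetical reverse-engineered table: byte 0x02 maps to 'a', and so on.
CUSTOM_TABLE = {0x02: "a", 0x03: "b", 0x04: "c", 0x10: " "}

def decode_custom(data):
    """Decode game bytes with the custom table; unknown bytes are
    rendered as hex escapes so they stand out for investigation."""
    return "".join(CUSTOM_TABLE.get(b, f"\\x{b:02x}") for b in data)

print(decode_custom(bytes([0x02, 0x03, 0x10, 0x04])))  # -> "ab c"
```

Writing the matching encoder (the reverse mapping) is what lets you insert translated text back into such a game.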
Jeremy: [00:15:55] So the original text in the script files, or perhaps you're talking about the binary files, they're not necessarily using a standard encoding like ASCII or Unicode. They may have made up their own encoding format.
Sara: [00:16:12] Yes, and this certainly varies. When it comes to Japanese, you'll usually be able to find that it's in the standard Japanese encoding Shift JIS. And Shift JIS largely supports English letters, but because Japanese text tends to include characters that are all the same width, it doesn't often account for the fact that in English you have very thin letters like the letter I. And so when you simply insert the text into those games you might get a situation where all of the letters are spaced really far from each other. And that does not look good.
Jeremy: [00:16:52] You were talking about how there's the example of script files or the example of code being hard coded or even binary files. What is the end result you'd like to get this text out for a translator to work on?
Sara: [00:17:25] Typically the text you want to get into an Excel sheet because that's the industry standard for translation and keeping texts like that. Sometimes the developer will provide this themselves and that will of course, make the process of translation a little easier because somebody like me can work while the game is being translated but usually it's more a case of what is the translator comfortable working with?
Are they comfortable working with script files? Do they need an Excel sheet? Most of them do and that's fine. But obviously this results in more steps. You need to create a tool chain to both extract the text and get it into a sheet, and then you have to make the call of whether you can get that text directly back into the game or if you would need to include more information, like the exact lines it's from.
Jeremy: [00:18:21] The simplest would be the script file where you might build a tool to convert the script files to rows in an Excel spreadsheet. A translator works on that to translate it to English. They give the spreadsheet back to you and you run some kind of application to convert their spreadsheet back into the script files.
Sara: [00:18:43] Yes, exactly. Although granted if a game is very small, you might just do it all by hand.
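The extract-and-reinsert roundtrip Jeremy describes might look like this in miniature. The `message "..."` command syntax is a made-up script format for illustration; real formats vary per game, as Sara notes:

```python
def extract_messages(script_lines):
    """Pull every message command out of a hypothetical script format
    into (line_number, text) rows, ready for a spreadsheet."""
    prefix = 'message "'
    rows = []
    for number, line in enumerate(script_lines):
        if line.startswith(prefix) and line.endswith('"'):
            rows.append((number, line[len(prefix):-1]))
    return rows

def reinsert_messages(script_lines, translated_rows):
    """Write translated text back onto the recorded line numbers."""
    out = list(script_lines)
    for number, text in translated_rows:
        out[number] = f'message "{text}"'
    return out

script = ['camera pan_left', 'message "konnichiwa"', 'wait 30']
rows = extract_messages(script)        # -> [(1, 'konnichiwa')]
print(reinsert_messages(script, [(1, 'Hello!')]))
# -> ['camera pan_left', 'message "Hello!"', 'wait 30']
```

Keeping the line number alongside each string is what makes the reinsertion step automatic instead of manual.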
Jeremy: [00:18:49] In terms of these script files are the lines of dialogue or the text-- are they in their original context? If somebody in a game was having a conversation would all of the sentences be grouped together or is it sometimes just spread all over the place?
Sara: [00:19:14] This varies a lot. Sometimes you won't even know which character is speaking unless you actually play through the game while you're working on the script. And obviously this poses some interesting challenges because in general, Japanese is not as context heavy and context apparent as English is.
You can't always tell who's talking just by the tone of their voice, so to speak.
Jeremy: [00:19:41] Going back to when you have something where the text is hard coded. Is that a case where you basically have to do it by hand or have you found a process to deal with that?
Sara: [00:19:53] There are a variety of ways that this can be approached too, which is part of the beauty of games. It's an art form. But what I personally like to do is simply make something that extracts all of the apparent strings, like surrounded by quotes from all of the code and gives that information along with the line number. And just like with any other script, you can usually automate this.
Jeremy: [00:20:20] So you create your own version of a script even though they didn't exist originally.
Sara: [00:20:29] Yeah. And then you have this list of all the strings in the game, and you can easily put those into a sheet.
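A string-extraction pass over hard-coded text can be as simple as a regular expression that records line numbers, as Sara suggests. This naive sketch ignores escaped quotes and comments, which a real tool would have to handle:

```python
import re

STRING_RE = re.compile(r'"([^"\\]*)"')  # naive: no escape handling

def extract_strings(source_lines):
    """List (line_number, string) pairs for every double-quoted
    literal found in the source code."""
    found = []
    for number, line in enumerate(source_lines, start=1):
        for match in STRING_RE.finditer(line):
            found.append((number, match.group(1)))
    return found

code = ['print_message("Welcome!")', 'x = 1', 'title = "Chapter 1"']
print(extract_strings(code))  # -> [(1, 'Welcome!'), (3, 'Chapter 1')]
```

The same (line number, string) pairs then feed the spreadsheet for the translator and the reinsertion tool afterward.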
Jeremy: [00:20:37] When you work on localization, how much of the code base or the game engine do you feel like you need to understand?
Sara: [00:20:45] It depends on the scope of the task, but typically I would say that I don't personally need to understand much of the game to start working on it. But by the time the project is finished, I should understand almost everything, except for the raw graphics APIs being used. Even then, I might need to, depending on how the game is functioning and whether we need to do something differently, like making an older game use widescreen.
Jeremy: [00:21:15] And what's your strategy when you first start work on a project? You get the source files and you need to find out where the text is stored and how it's stored. What's your strategy for figuring that out?
Sara: [00:21:30] First I will look through all the files I received and make sure that the text isn't somewhere extremely obvious. And if it's not obvious, I will then start looking at the source code and getting it to compile. And I will look through that to see exactly what files it's loading, and from there I can usually figure out where the text is stored.
Jeremy: [00:21:52] In terms of when you get a project, do you just look at the code, or do you figure out how to run it and all that first?
Sara: [00:22:01] Usually I prefer to start by looking at the code itself and getting that to compile, but typically a developer will provide a working copy of the game that you can use to see exactly how everything functions. I do like to know what kind of game I'm working on and the basics of the gameplay, the basics of the setting, because that will help me motivate myself to work on it. It will help me get into the right mindset for it.
Jeremy: [00:22:31] You were talking about how there were easier ways of localizing in terms of script files and the hardest being hard coded or binary. What are some ways that if a developer is making a game that they can facilitate the easiest localization?
Sara: [00:22:51] The easiest localization would definitely be: first, you want to make script files for everything, whether it's just the list of strings that the game is using or a proper script file for all the events. And then you want to make sure that the game is capable of loading different sets of these based on what language is currently in use.
And if the game is already capable of loading different folders or different files for different versions of the scripts then it shouldn't be too hard to localize, but you also need to make sure that there are easy ways to work with the graphics and of course, absolutely all of the assets can be loaded differently based on language.
Another problem, of course, is when it comes down to stuff like fonts. Because to make a very pretty looking font, you probably want to make the font graphical rather than using a TrueType or OpenType font. And of course, the challenge with that is including every character you could need for every language, as well as issues like the font being badly spaced in certain languages. To make a game as easy to localize as possible, you probably want to use Unicode and Unicode fonts and just include everything you possibly can.
Jeremy: [00:24:17] And because Unicode covers most languages used in the world, you won't have the text encoding problems.
Sara: [00:24:25] Exactly. You would completely skip that step. You wouldn't need to worry about the fonts. You would just be able to put in whatever language you need and you won't get tripped up by stuff like, in Japanese games, Shift JIS is the encoding used, right? And if you try to insert accents like an E with the little dash over it (é), it will instead draw a Japanese symbol if you don't change the encoding.
Jeremy: [00:24:52] You were talking about graphical fonts, so would that be a raster image, like a giant PNG or a JPEG that has all the different characters on it?
Sara: [00:25:06] Yes, exactly. Usually this also has some sort of table containing the geometry of the font so that you can draw it with proper spacing and everything.
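A geometry table like the one Sara mentions might look something like this; the field layout is a guess at a typical bitmap-font format, not any specific engine's:

```cpp
#include <cstdint>

// A graphical (bitmap) font is a texture atlas plus a geometry table:
// one entry per character telling the renderer where the glyph lives
// in the atlas and how far to advance the pen after drawing it.
struct GlyphMetrics {
    uint16_t x, y;          // top-left corner of the glyph in the atlas
    uint16_t width, height; // size of the glyph in pixels
    int16_t  advance;       // horizontal pen advance, for proportional spacing
};

// Total pen advance for a run of glyphs, used to measure a string
// before drawing so text can be centered or right-aligned.
int measureRun(const GlyphMetrics* glyphs, const int* indices, int count) {
    int w = 0;
    for (int i = 0; i < count; ++i)
        w += glyphs[indices[i]].advance;
    return w;
}
```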
Jeremy: [00:25:18] And then when you're talking about assets, that would be like textures. For example, if you had a sign in the game, that would be something that somebody would have to go into Photoshop and have a different version for each language?
Sara: [00:25:34] Yes, and ideally you want to keep all of the original layered versions of this with the text on a separate layer, because that will make it as easy as possible to change the text that's on it. But of course that isn't always available. So when it's not, you need somebody who's actually able to remove the text and put something new in.
There's also things like title screens of course, and any sort of special menu texts you have like that's drawn in a different font or has a special design.
Jeremy: [00:26:06] Because something in Japanese versus something in English can look significantly different when you're doing a logo. So you need someone with the artistic expertise to be able to recreate something that looked like the other thing, right?
Sara: [00:26:22] Yes. And of course Japanese logos are often written entirely in Japanese, so you will want to come up with a localized title for these games and then put it in the correct localized language, and that will involve basically designing a new logo that looks similar. It gets the same point across which is the big thing in localization. You want to get the point across.
Jeremy: [00:26:47] One of the games you worked on was a game where you decided that the game's code needed to be rewritten. Can you talk a little bit about what the game is and why you decided that you needed to rewrite the game?
Sara: [00:27:00] Well, in this case, we are talking about a game called Corpse Party. Corpse Party's origins were as an RPG Maker game on an obscure Japanese computer (the PC-98). Obscure over here at least. And when the developers revisited it many years later, they decided to remake it in a programming language called Hot Soup Processor, or HSP for short. HSP was taught in many Japanese schools because it was free and it was made for people wanting to get into things like this.
Now Hot Soup Processor resembles basically the child of Java and C and in this way, everything is a little different. I wouldn't even compare it to C, honestly more like BASIC. And as a result, your game looks more like a scripting language than a programming language. And the bigger problem is that you don't have as much control over the graphics and sound APIs for games that are coded in HSP, and so to do many things with the graphics properly or get the timers right it may be better to change over to something else. Especially if you're going to put this game on something like Steam, because Steam does not have a version of their API for Hot Soup Processor. I know. Shocking.
Jeremy: [00:28:30] About it being like BASIC, is there a high reliance on GOTO statements?
Sara: [00:28:38] Unfortunately, yes, there is.
Jeremy: [00:28:40] Was this statically typed? What was it like to kind of work in that language?
Sara: [00:28:49] Well, when it comes to typing. You don't always know what type it's going to be. You don't know if this variable is necessarily an integer or if it's a float, and the compiler of the scripts largely doesn't care, and obviously that means you can run into some typing issues because you have completely misunderstood what a variable actually was. That definitely gave me some headaches.
Jeremy: [00:29:16] So it's sort of like JavaScript or Ruby where you can define a variable and in that variable, you could put in a string, you could put in a number, you could put in pretty much anything, and there isn't really a way to check. I guess it just runs through the interpreter and then you find out at runtime whether something's going to blow up.
Sara: [00:29:37] Pretty much, I definitely would compare it to Ruby in that regard.
Jeremy: [00:29:42] What was your strategy for rewriting a game? Were you porting function by function? How did you get started?
Sara: [00:29:50] Of course, I started by just copying all of the code and then I began rewriting each function one at a time. Trying my best to understand the purpose of each function as I was going along. I knew what some of them did already because we actually inserted the translation before all of this, but the thing is that I was never going to be sure exactly what functions relied on each other and what variables were needed and how exactly they needed to be typed until everything was already basically put together. And so I honestly was pretty scared during this process at times because I had no idea if it was going to work until it worked.
Jeremy: [00:30:36] You were rebuilding it function by function but because it wasn't really clear which functions relied on other functions you couldn't run the game. There wasn't this iteration of getting part of it working and moving on to the next part, you had to build everything and then you found out if there were issues?
Sara: [00:30:58] Yes, exactly. Because for the most part with games, you are not building the code level by level. You are programming something that will be able to run all of the levels.
Jeremy: [00:31:09] Wow, that sounds terrifying.
Sara: [00:31:11] It is.
Jeremy: [00:31:12] Were you able to on a function by function level, have an idea of what the function was supposed to return so you could at least test certain functions?
Sara: [00:31:26] Yes. There were definitely cases like that. However, the most complex parts of the game are often more in the graphics and audio engines and such if you're working at this kind of level. And I wasn't always sure how the graphics functions worked. No fault to the developers, just that it was so much.
Jeremy: [00:31:46] Since you didn't quite understand how the graphics functions worked, when you rewrote that portion of the game did you take a different approach rather than trying to recreate what they had done?
Sara: [00:31:59] Well, I initially tried to recreate what they'd done, although for certain API functions, et cetera, I basically took my best guess and so when the game was finally running, it definitely didn't look or sound right. At first, there were problems, and these were entirely problems with how I had understood the code rather than anything to do with the game itself of course.
Jeremy: [00:32:24] How long did it take from getting the code to actually being able to run the game at all?
Sara: [00:32:31] The original version of the game pretty much worked out of the box; to compile it, I just needed the version of the HSP tools that the game had been made for. But when it came to actually making my version of the game run, it took about a year.
Jeremy: [00:32:50] Wow. So a year before you could launch the game. That's intense.
Sara: [00:33:00] Believe me, I hated myself for a lot of that project. I was like, why didn't I just use the original code? Why did I put myself through this? But the result was that the game runs very well.
Jeremy: [00:33:12] One of the common pieces of advice people get when they work on software is quick iteration in terms of build the thing, see if it works, then you can build the next thing. But in this case it seems like that wasn't possible to do.
Sara: [00:33:32] Yeah. Unfortunately. With games everything is going to be more tangled and depending on the way you approached creating the game you can end up with either something that's very clean and parts don't depend too much on each other, more abstracted, or it can be a plate of spaghetti. And just like any application, this is a matter of designing the application before you start building it.
Jeremy: [00:34:01] Something you managed to do when you rewrote the game was make it platform agnostic. What are some examples of ways that when you write a game, you make something that can be easily run on different platforms?
Sara: [00:34:15] Regardless of the platform, you are going to be dealing with different graphics and audio APIs, because you're not going to run DirectX on a PS4. And essentially the approach you need to take with this is mostly related to creating wrapper functions for all of the graphics stuff you need to do, so that you can easily change these functions and not have to change any other part of the code.
For example, you want to have functions for drawing a sprite. You want to have functions for drawing some text, and you want these to be things that you can easily change the contents of without changing any part of the actual games code directly.
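The wrapper-function approach could be sketched as a thin rendering interface; this is a toy illustration where a log stands in for the real platform draw calls:

```cpp
#include <string>
#include <vector>

// Thin platform-abstraction layer: gameplay code only calls these wrappers,
// so swapping DirectX for a console's graphics API (or OpenGL for a PC port)
// means reimplementing this one layer, not touching the game code.
struct Renderer {
    std::vector<std::string> log; // stand-in for real draw calls in this sketch

    void drawSprite(int id, float x, float y) {
        (void)x; (void)y; // a real backend would use these in its draw call
        log.push_back("sprite " + std::to_string(id));
    }
    void drawText(const std::string& text, float x, float y) {
        (void)x; (void)y;
        log.push_back("text " + text);
    }
};
```

Gameplay code written against this interface compiles unchanged on every target; only the body of each wrapper differs per platform.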
Jeremy: [00:35:00] And the parts that are going to be different from platform to platform, would you say that's the graphics, the sound, the input that sort of thing?
Sara: [00:35:10] Yes. Basically everything that deals with IO, aside from the fact that on pretty much any platform you're going to be able to use fopen() without any issue.
Jeremy: [00:35:21] When you rewrote the game, I think you had said that when you got the original version of the game, you had replaced the text. Was that hard coded into the game or was that through scripting files?
Sara: [00:35:34] Mostly scripting files, but it took me quite a bit of effort to understand these scripting files. There definitely was some hard coded text in the game. Mostly related to the menu functions. Chapter names for example that you would encounter in the menus were hard coded.
Jeremy: [00:35:52] And did your rewrite load in the original script file format?
Sara: [00:35:58] Yes, actually. Although interestingly enough with this particular game, all of the original scripts were in Japanese. Even the commands and this is not common. It does happen, but usually the commands and such will be roughly in English because when you're working with programming, you're going to be writing some of it in English no matter what language you speak.
But since all of these scripts were completely in Japanese, I decided to replace all of the commands such that the game could still load the original format, but it was using something sort of new in that the script commands were simply translated.
Jeremy: [00:36:37] And when you refer to a script command could you give an example of what some of those would be?
Sara: [00:36:42] Stuff like message, wait, display picture. Move character to the left, stuff like that, or check a variable.
It's a lot like programming. It's just that you are working in a language that is created for the game and is being used by the game without compilation usually.
Sometimes there's assembly involved, but when you assemble the script like this, you are just converting the commands to one, two, three, et cetera to make sure that you know what the commands are without having to use as much text and space.
Jeremy: [00:37:19] So it's kind of like an enum, is that a good example?
Sara: [00:37:24] Yeah. Essentially
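The command-to-number assembly step could be sketched like this; the opcodes and names are invented for illustration, not HSP's or any real engine's:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// "Assembling" a script replaces textual commands with small numeric
// opcodes, much like an enum, so the runtime can switch on them quickly
// and the script file takes less space.
enum class Opcode : uint8_t {
    Message = 1,
    Wait = 2,
    DisplayPicture = 3,
    MoveCharacterLeft = 4,
    CheckVariable = 5,
};

// Translate one textual command to its opcode; returns 0 for an
// unknown command so the assembler can report an error.
uint8_t assembleCommand(const std::string& name) {
    static const std::unordered_map<std::string, Opcode> table = {
        {"message", Opcode::Message},
        {"wait", Opcode::Wait},
        {"display_picture", Opcode::DisplayPicture},
        {"move_character_left", Opcode::MoveCharacterLeft},
        {"check_variable", Opcode::CheckVariable},
    };
    auto it = table.find(name);
    return it != table.end() ? static_cast<uint8_t>(it->second) : 0;
}
```

Translating the commands, as Sara did, just means swapping the keys in this table while keeping the numeric values the same.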
Jeremy: [00:37:25] And those commands would use the English letters but they would be written in Japanese words?
Sara: [00:37:33] They were actually written in Shift JIS, in full Japanese.
Jeremy: [00:37:36] Wow.
Sara: [00:37:37] I never encountered any other project that did that, although I understand it must've made the script files really easy to understand for them.
Jeremy: Yeah. It's interesting how we sort of make the assumption when we look at code that there's going to be English in it. I guess sometimes you get surprised.
Sara: [00:37:55] Yeah, for the most part, there aren't many programming languages where you can do everything in another language, but if you're creating the language yourself, obviously you have plenty of control over that and you can do anything.
Even though we sort of take for granted that there's going to be English there it doesn't have to be, and I find that extremely interesting.
Jeremy: [00:38:17] I don't know that it happens very often, but I think a lot of languages do support Unicode for variable names and stuff like that, I just am not sure how many people actually take advantage of that.
Sara: [00:38:28] Yeah. When it comes to seeing localized variable names in source that are not in the normal character set? I would say I hardly ever see it. I can only think of one occasion, and you will usually instead find that it is written in English characters but is in that language, which in the case of Japanese is what they call romaji.
So you would have the Japanese written entirely in letters. And you find that a lot when you're localizing Japanese games.
Jeremy: [00:39:01] You said after a year you were able to run the game, but there were bugs or there were different issues. How were you able to track down what bugs were in the game? Was it a completely manual process where you just had to keep playing and keep trying things?
Sara: [00:39:19] Essentially, as well as having testers within the company also handle that sort of testing, what is called quality assurance. Of course, we just call it QA. But by having people iterate through the game many times we would find all of the bugs with it, essentially like you would any other program. You just have to have people using it.
Jeremy: [00:39:41] Were there any bugs or odd behavior in the original version that the game was relying on that you had to reproduce?
Sara: [00:39:51] I wouldn't say there were any bugs that I had to reproduce, but since I had copied the game's code as closely as possible, some of the original bugs did carry over initially and I was later able to fix a lot of them.
Jeremy: [00:40:08] We've been talking about the process of making games and rewriting games. What is the language that you rewrote the game in?
Sara: [00:40:18] C++.
Jeremy: Is that pretty typical for the projects you work on?
Sara: I would say that the vast majority of professional games are written in C++.
Jeremy: [00:40:31] What makes that the default choice that people would go to?
Sara: [00:40:38] Frankly, it is widely supported. You have C++ compilers for consoles, you have them for computers, you have them for all kinds of different devices. Whereas if you create a game in Ruby, you have no idea if you're going to be able to put that on a PlayStation.
Jeremy: So it's just the fact that everybody's using it and there's so many tools, so much support.
Sara: [00:41:04] Yeah, exactly. As well as the fairly robust standard libraries and since these are available on so many platforms, it's just always been the choice ever since it came to be. Originally back in the days of like the Nintendo Entertainment System, games were primarily written in raw assembly. But obviously that's not very practical, especially for the scale of games these days.
So it eventually changed to C and then C++. There have been efforts to adopt C# in more platforms, but it's kind of a slow going because it's just so deep-rooted.
Jeremy: Typically when I hear about C# and games it's within the context of the Unity engine. It seems like that would make it easier for people to get started and work on projects but I don't know what the trade offs are relative to C++.
Sara: [00:41:58] Typically, I would say that C++ is simply the easiest to optimize compared to other options like this because of the nature of how it's compiled. With C#, most of the code is managed code, and native code is easier to optimize, whereas the managed code that makes up most of C# simply isn't something you can optimize much beyond how the runtime handles everything.
It's a global optimization, whereas in a case like this, optimizations specific to the various functions in a game are better than global optimization. The trade-off is simply performance: how much you can pull out of the system before it starts choking.
Jeremy: [00:42:50] And one of the big differences between the two languages is that C# has a a garbage collector. For games, is that something that's seen as not desirable?
Sara: [00:43:02] Rather than having garbage collection, you need to handle all of the games memory very, very strictly. You need to know exactly how much memory everything is going to take at any given time. The reason for this is because consoles typically have much more severe memory limitations than a computer does, and if you start running over that memory, you just don't have a program running anymore. It's problematic.
Whereas on PC, there are things like virtual memory to pick up if you end up running out of memory somehow, and as a result, if you have any sort of garbage collection, you're going to be wanting to do it yourself. Frankly, all memory that you're allocating, you should be handling yourself. You should have your own memory manager in the game.
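A minimal version of the do-it-yourself memory management Sara describes is a bump allocator over a fixed arena; this sketch is illustrative, not from any shipped game:

```cpp
#include <cstddef>
#include <cstdint>

// The game grabs one big block up front (the console's budget) and hands
// out pieces itself. Exceeding the budget fails loudly at alloc() instead
// of crashing unpredictably later, and there is no GC pause, ever.
class Arena {
public:
    Arena(uint8_t* buffer, size_t size) : base_(buffer), size_(size), used_(0) {}

    void* alloc(size_t bytes) {
        size_t aligned = (bytes + 7) & ~size_t(7); // 8-byte alignment
        if (used_ + aligned > size_) return nullptr; // over budget
        void* p = base_ + used_;
        used_ += aligned;
        return p;
    }
    void reset() { used_ = 0; }        // e.g. free everything at end of a level
    size_t used() const { return used_; }

private:
    uint8_t* base_;
    size_t size_, used_;
};
```

Real engines layer pools and lifetime categories (per-frame, per-level, permanent) on top of this basic idea.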
Jeremy: You said it's probably true for all platforms, but on consoles it's more important just because of the memory restrictions.
Sara: [00:43:58] Yes. And of course that is also why when you are porting a game to a console, you have to be very considerate of things like memory space and of course video RAM. It was more important back in the older days of systems like the Super Nintendo and the Nintendo 64 but it still remains important today. If you start running out of memory on console, there's nothing you can do about it.
Jeremy: [00:44:27] You've ported some games from low spec hardware like the PlayStation Portable to the PC. What are some of the unique challenges related to porting a game?
Sara: [00:44:38] First of all, when it comes to games on an older system or a weaker system such as the PlayStation Portable the game is made exactly for the hardware specifications and you are at the mercy of those specifications at first. For example, the PlayStation Portable, its resolution is quite small. It is something like 480 by 272 and while that is almost 16:9 it's not quite.
When it comes to PC games people want options. They want the ability to use different resolutions and customize their controls and the original game is probably not going to have supported this and more importantly if you're working in a game at a low resolution like that, that is not going to look good on a PC screen.
With these projects we take as many of the original PSD assets and such as we can and just redo the game's assets in HD, and that obviously results in a much prettier game.
But that also assumes that you actually made the assets in a higher resolution to begin with. Otherwise, they're going to have to be remade, and that is a whole different ballpark of problems.
Jeremy: [00:46:00] Yeah, their resolution is so low that if you were to use the original assets, they would just look super blurry or blocky if you were to port it straight to the PC.
Sara: [00:46:10] Exactly right. And frankly, that would make the game look a lot worse as a whole and less people would play it because at that point you're just playing something that looks old and you don't want it to look old because you're creating a new version of it. You want it to look new and shiny and sparkly.
Jeremy: [00:46:30] How does the process of porting a game differ from your experience rewriting Corpse Party? How much of it is a rewrite versus something different?
Sara: [00:46:43] When you're porting the game, hopefully it's in a language that you can still use. And if that's the case then the first step is going to be ripping out all of the stuff that you can't use.
For example, when you're working on a console you have to use the console developer's development kit in everything. You have to use their software, and while usually this supports the same programming languages that you would be able to use on PC anyway, you won't be able to use most of the graphics functions and the audio functions, and you're just going to have to do all of that over.
Instead of copying like I did when I was porting from one language to another, I had to basically take guesses at what all of these functions did and then reproduce them as best I could. And occasionally this would result in something I didn't understand, just like when I was working on the original Corpse Party game, but more often I would find that I missed a functionality and that something wouldn't change color when it was supposed to, or something wouldn't play from the correct speaker, or things like that.
One of the last-minute issues I caught when I was working on my first port was realizing there were audio files that were supposed to be looping and crossfading, and I had forgotten to implement any of this because I didn't realize what it was. When it comes to porting to PC, there are two main concerns. One is making sure all the functionality is correct in the first place. The second is making sure that the game runs well on a PC, because there are different threading models and different timer models that simply don't convert well to a PC. That's the case with changing platforms in general.
You want to make sure that everything runs as it should, even though it's on a different platform, which I know I'm just speaking in circles there, but at the same time, you have to realize if you just put a game directly on PC and you don't change anything it is going to run a lot slower than it could because it was optimized for something else.
Jeremy: [00:49:04] When you talk about different threading models, could you give an example of something running on a console versus on a PC? What are these different threading models or different choices you'd be looking at?
Sara: [00:49:18] Well, for example let's say that you are programming a game that is working on a Nintendo Wii. You will find that you have to handle the graphics at this time and the audio at this time and threading is often a better choice when it's available for simple performance reasons.
However, it's also the case that a lot of older games don't use any sort of threads at all. Everything is just a single thread going through that single game loop and you have to deal with the results as they come. But when it comes to a PC game, you of course want threads. You want to be able to use more of the CPU, and instead of targeting the exact CPU count and types that you have on the console, where this console happens to have a dedicated audio processor and stuff like that, you want to use something more general on a PC that'll work on anything.
Jeremy: [00:50:20] So in terms of concurrency models on a console, the software development kit for that console may already have ways that you're required to do certain things. Like you were saying, a certain way to do graphics, a certain way to do audio, and once you bring it to PC, then you have more control over how you do concurrency, how you do graphics. And so you may choose to do it completely differently than how it originally worked.
Sara: [00:50:52] Yes, and I think in many cases that's preferable because obviously concurrency in something like a game is best used for optimization and making sure that everything runs smoothly without any hitches. But on a console, you are optimizing to such a specific scenario that this optimization does not carry over well and is actually harmful when you are copying it directly to a PC.
Jeremy: [00:51:18] Are there any specific APIs for example OpenGL that you would choose when you want to make something that you can consistently use across platforms?
Sara: [00:51:29] I would choose OpenGL or possibly SDL because surprisingly SDL exists for quite a lot of platforms. That said, I would say that Vulkan is becoming something that would be very useful in that sort of way.
Jeremy: [00:51:43] In terms of the projects that you've worked on, the example you gave with PlayStation Portable, were those games using some APIs specific to the console to do graphics, rendering, audio, that sort of thing?
Sara: [00:51:59] Yes. The APIs for these things are typically provided by the development kit and whether they resemble an existing API or not is pretty much up to the console creator's choice. For example, if you're working on a Nintendo 3DS, you can probably use OpenGL straight up with just a few changes. Although the optimization is still going to be a nightmare by comparison because it is completely different.
Jeremy: [00:52:26] When you're working on one of these ports have you ever had to go the other way around where you had a less constrained environment and then had to move it to a more constrained environment?
Sara: [00:52:40] So far, I would say no, because for the most part, PC is about the least constrained you'll get. But I still try to make sure that when I'm working on a game, it'll run on as many systems as possible, and that means trying to make it run on a toaster if I could. And when your PC is a potato, you don't really expect a fully 3D game to work right.
But sometimes they can. Sometimes if you are very careful about the way you optimize the game and take a lot of constraints in what you're doing, you will make a game that looks a lot better than it should on an older PC. And I try to make that a goal of mine. I try to make sure that these games run in older environments. Up until last year I was even making a point of supporting Windows XP with all of my games, and that would be more of an issue at times than you would think.
Jeremy: [00:53:36] What's an example of something that you would have to do specifically for such an old operating system?
Sara: [00:53:42] For supporting old operating systems you essentially are limited when it comes to your compilers and your libraries. You can't use newer versions of libraries because they weren't made for older systems. The windows kernel changed very dramatically with Windows Vista and it's only been evolving more and more since.
And you are limited to using this older version of the kernel that is largely still supported, but there are often behaviors that are just a little different between systems, and you won't expect them. Like you might be trying to get the size of your window rectangle and it will be off by one. Small things like that. Or you'll be drawing text, and Vista might support outlining the text and XP might not.
Jeremy: [00:54:32] So there's subtle bugs or small features that don't exist. So when you have to take into account whether you'll have those or not it just makes everything more complicated.
Sara: [00:54:44] Yes, exactly. And you don't always have the ability to expect it because often the functions will look the same, but the behavior is simply different.
Jeremy: [00:54:56] And these are operating system level calls that you're referring to?
Sara: [00:55:01] Yes, largely. Although there is also the matter of APIs for things like graphics that don't exist on older operating systems. For example, you can't use DirectX 10 or later on XP.
Jeremy: [00:55:15] One of the things about porting is we were talking about the things that you can't reuse. Are you able to use most of the core game loop and the logic as-is without modification?
Sara: [00:55:27] Well, for the most part, most of the game code is going to stay intact like that. You will have to alter stuff like the coordinate calls for various textures because all of a sudden you're worried about different resolutions. Games running on consoles generally run on a single resolution that is their target. And if you are using a different resolution on your TV, the console is scaling the output to that.
But when you are working on PC, you need to account for that and you need to account for different aspect ratios because while it's easy to say, Oh, I'm going to make my game this resolution, you might upset the people who have ultra wide monitors or the people who are still using old 4:3 monitors. They exist and ideally you want to support all of these scenarios as smoothly as you can. When you're creating a game, you can account for this better than if you're working on a game that already exists.
Jeremy: [00:56:23] It sounds like a lot of work in terms of finding the sections of code that are expecting a specific resolution, a specific aspect ratio. And earlier you were talking about how your assets may change as well, because you need UI elements that can be rendered at a much higher resolution or at a different shape. It seems like there's a lot of work that needs to go into it.
Sara: [00:56:51] Yes. A lot of the games that I worked on earlier in my career were older games that were made for 4:3 and in the process of working on these I realized obviously these games need to be widescreen, and so there were a lot of different approaches that could be taken. I could have converted the entire game to widescreen, but then what about the purists?
So I made this approach where I could arbitrarily in the code set any graphic to be aligned to the left or right of the screen or the center. Essentially by dividing the screen into regions that were still 4:3. So for example, I would draw a UI element that was always in the bottom right of the screen on the right, and when the aspect ratio increases, this would still be drawn on the right of the screen in the corner.
And by separating out HUD elements like this it gives a fairly consistent experience between different platforms without having to limit the user to a specific aspect ratio.
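The region-anchoring approach could be sketched like this, with positions authored against a 4:3 virtual width and re-based at draw time (a simplified model of what Sara describes):

```cpp
// HUD positions are authored against a 4:3 virtual screen. At draw time
// each element's x is re-based against the real screen width according
// to its anchor, so corner elements stay in the corners at any aspect
// ratio and centered elements stay centered.
enum class Anchor { Left, Center, Right };

// offsetX is the element's x position within the 4:3 virtual screen.
int anchoredX(Anchor a, int offsetX, int screenWidth, int virtualWidth) {
    switch (a) {
        case Anchor::Left:   return offsetX;
        case Anchor::Center: return (screenWidth - virtualWidth) / 2 + offsetX;
        case Anchor::Right:  return screenWidth - virtualWidth + offsetX;
    }
    return offsetX;
}
```

For example, with a 640-wide 4:3 virtual screen drawn on an 854-wide 16:9 display, a right-anchored element authored at x = 600 lands at x = 814, still in the bottom-right corner.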
Jeremy: [00:58:00] You've worked on a lot of different games that have been made a long time ago or more recently. What are some of the big differences code-wise you've seen in the oldest projects you've worked on versus the newer ones?
Sara: [00:58:15] Well, I actually got my start in fan translation, working on things in assembly, disassembly rather. And obviously the challenges for that are very apparent. You're working with very raw code that you don't have any information on. But as I've gone along, I worked on various Japanese indie games as well as Japanese professional titles.
As the time has gone along, I feel that on average these games are more abstracted and more object oriented. And back then there was very little object orientation. There were just these loops and loops upon loops. Nowadays, you'll have various objects that might be inherited from other objects.
So a video game's character might be a single generic object that has most of the basic behaviors. And then on top of that, you will have the class for the player character or the class for a dragon. And they have their own unique stuff to them. And obviously the game loop has changed a lot over the years as a result of things like this.
Jeremy: [00:59:27] I think that matches how software development in general has evolved because there used to be primarily procedural code, right? And now we're seeing more object oriented code, more code related to functional programming. Maybe that parallels the general software community.
Sara: [00:59:49] Yes, I believe so, yeah. In this regard, the evolution does seem pretty parallel. The only limiting factor is that adoption of new programming languages is very slow.
Jeremy: [01:00:01] I'm not even sure what they would have used before C++
Sara: [01:00:07] C. But before that, they really did just program games in basic assembly. And occasionally in BASIC.
Jeremy: [01:00:15] You worked on fan translations where you would have to reverse engineer the code. You said that was assembly?
Sara: [01:00:27] Yes. Because when it comes to these games that are already compiled and you're working on them as a fan, you're not going to have any access to the source code ever.
So there are two options available to you from there. You can either learn how to work with the code that's already compiled by learning the assembly and learning how to debug it.
Or you can in some cases to an extent decompile the code if it was made in an existing framework that's well understood like .NET or some other programming languages that exist. But usually you're going to be working with the raw assembly when you're working as a fan and you need a strong understanding of the processor that the game was developed for and the exact sort of quirks to expect.
Jeremy: [01:01:18] When you're working as a fan you somehow have to determine where the text is in that game, right? Even though everything is raw assembly, how would you even know where to start?
Sara: [01:01:35] Well, initially you would have a debugger that works with the game and you would have to step through step by step until you actually see the text being drawn. And then you basically start stepping back from there to see how the text was loaded. So it's all a process of first finding something that uses the function you need to edit and then working your way back.
Jeremy: [01:02:01] That sounds very, very time consuming.
Sara: [01:02:05] It absolutely is, but it's an interesting process because you learn a lot about how different styles are used in different things.
Jeremy: [01:02:16] Not only would you have to find the text, but you would have to find a way to re-insert it without the application breaking.
Sara: [01:02:26] Yeah, and in some cases the text would be stored in the application itself, but usually it's going to be in some sort of container you don't have any specifications for. And then it might be in some encoding you don't know. It might even be encrypted and obviously all of the functions for you to fix that are right there, but how do you put it back in? It's easier to extract something than it is to create an archive that works exactly as the one before.
Jeremy: [01:02:55] That's interesting. You have these files where you don't know how they're encoded but you're looking through the assembly code to maybe get a hint for how they were generated so you can modify or generate them yourself.
Sara: [01:03:09] Exactly. And when it comes to compression and encryption, it definitely is much easier to get a file that's already encrypted or encoded or compressed open than it is to make a new one, because you have to worry about the size of the file. You have to worry about getting the optimization of the compression right. You mess up one part of the encryption key and everything's wrong.
Jeremy: [01:03:36] That sounds like pretty intense work for a fan project.
Sara: [01:03:40] Yeah. I actually got my start working on just simple script engines in games like Furcadia of all things when I was young. But as it went along I discovered things like emulation and all these interesting games in a language I didn't understand. And all I could think was, how could these games be in English? Is there a way? And sure enough, there actually was.
Jeremy: [01:04:05] That's really impressive and very cool. For somebody who's interested in modifying or localizing games, how would you recommend someone today get started?
Sara: [01:04:20] Well, I would find a project that you like on something like GitHub and simply try changing things and improving things in that first. Make it your own. Essentially with something like this, you need to get comfortable working in other people's code. It's not about learning to write a program yourself. It's about learning to be comfortable in other people's shoes and being able to cope with the decisions they make.
Jeremy: [01:04:52] I feel like general software development, a lot of it is also that right? So many of the projects people work on they're not brand new. They're something that a team has been working on for years and you're coming into that environment, living with the decisions that were made, and figuring out where to go from there.
Sara: [01:05:16] Yes. With normal game development, you'll find all sorts of tutorials that teach you how to build a basic platform and game engine or such, but it won't necessarily tell you why it's built that way, and you can make games that way, but you can't really work with somebody else's games from tutorials like that.
It has to all be about putting in the footwork yourself and just trying. You just have to try and you just have to keep doing it until it gets right. Just like any other skill.
Jeremy: [01:05:43] After all the years of experience you've had are there certain things that you do really differently now than what you did before?
Sara: [01:05:55] Oh, definitely. For one, I used to do absolutely everything by hand even stuff like text. I used to copy the text and paste it into files as I needed it and rinse and repeat. And when I was working as a fan translator, sometimes early on I would change the pointers for the text manually when I needed more space.
And that is just an awful idea. Automate as much as you can, make sure that whatever you're doing, you're doing it the easiest way you can. But at the same time, don't sacrifice quality for this. But typically you won't.
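The manual pointer editing Sara describes is exactly the kind of step that lends itself to a script. As a rough illustration only (not from any specific project — the 32-bit little-endian pointer width, the NUL terminators, and the Shift-JIS encoding are all assumptions for the sketch), a small helper could repack a list of translated strings and recompute their pointer table in one pass, so longer replacement text never requires hand-editing offsets:

```python
import struct

def rebuild_text_block(strings, text_base, encoding="shift_jis"):
    """Pack NUL-terminated strings into one blob and recompute the
    32-bit little-endian pointer table that references them.

    text_base is the address the blob will live at inside the ROM/file;
    each pointer is text_base plus that string's offset in the blob."""
    pointers = []
    blob = bytearray()
    for s in strings:
        pointers.append(text_base + len(blob))   # offset of this string
        blob += s.encode(encoding) + b"\x00"     # NUL-terminated entry
    table = b"".join(struct.pack("<I", p) for p in pointers)
    return table, bytes(blob)
```

If a translated line grows or shrinks, every later pointer shifts automatically instead of being patched by hand.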
Jeremy: [01:06:38] So basically find the things that you are doing that are repetitive. Identify the things that you can automate and build that script or build whatever functionality you need.
Sara: [01:06:52] Yes, exactly. Because if you're spending a lot of time pulling the text out or something like that, you obviously don't need to be. There are cases where you might want to do that, like if there's exactly one or two hard coded strings or just a single menu, but if the project you're working on is at any sort of scale that makes that an hours-long job? Do not do it manually. Please.
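To make that concrete, batch text extraction at its simplest is the idea behind the Unix `strings` tool: scan a binary for runs of printable characters and record where each one starts. This is only a minimal sketch — a real game would use its own encoding (often Shift-JIS for Japanese titles) and container format, neither of which is known here, so printable ASCII stands in as the simplest case:

```python
def extract_strings(data, min_len=4):
    """Scan a binary blob for runs of printable ASCII at least
    min_len bytes long, returning (offset, text) pairs."""
    results = []
    start = None  # offset where the current printable run began
    for i, b in enumerate(data):
        if 0x20 <= b <= 0x7E:          # printable ASCII byte
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                results.append((start, data[start:i].decode("ascii")))
            start = None
    if start is not None and len(data) - start >= min_len:  # run at EOF
        results.append((start, data[start:].decode("ascii")))
    return results
```

The recorded offsets are what matter: they are what a re-insertion script would later use to patch the translated text back in.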
Jeremy: [01:07:20] I think that applies broadly as well. How much pain do you need to feel before you just automate it and write that script?
Sara: [01:07:32] Yes, exactly. The beauty of being a programmer is you shouldn't need to do much math because you can make the program do it for you.
Jeremy: [01:07:40] Cool. I think it's a good place to start wrapping up, but is there anything else that you thought we should have mentioned or talked about?
Sara: [01:07:50] I suppose another thing I would like to mention is if you're getting into localization programming for a specific language, like Japanese to English or something like that you should probably try to learn as much of the languages you can along the way. So not just the programming language, but the actual language as well.
If you are working on Japanese games a lot, knowing Japanese makes your work so much easier.
Jeremy: [01:08:15] What are some of the ways that you've found that knowing the language really helped you a lot?
Sara: [01:08:21] Even though all of the text for a programming language is just gonna be in, like, ASCII, you're going to be dealing with the fact that all of the comments are going to be in another language. A lot of the function names are going to be in another language. You're going to see a file named Zako and you're not going to know that means a small-fry enemy.
Jeremy: [01:08:42] Yeah, so just helping you get as many hints and as much context as you can.
Sara: [01:08:49] Exactly. Because when you're working on someone else's code in another language, you are not going to be working with documentation you can necessarily read. All of the documentation is absolutely going to be in that language.
Jeremy: [01:09:03] I guess code comments as well if those exist.
Sara: [01:09:07] They do usually.
Jeremy: [01:09:09] For people who are interested in checking out what you're working on or what you're up to how can they follow you?
Sara: [01:09:17] First of all, obviously you should keep up with XSEED's work in general if you want to keep up with what I'm doing. But I also have Twitter @SaraJLeen and I also occasionally do Twitch streams, not of programming, but of games in general. And my Twitch username is @saralene. Sort of a corruption of my name.
Jeremy: [01:09:45] You work in a very unique field of software development and I really enjoyed the conversation. Sara. Thank you so much for sharing your experience, translating and porting games.
Sara: [01:09:55] Thank you for giving me a chance to talk about the technical side of things.
Jeremy: Yeah, it was really fun thanks again for coming on the show.
Sara: My pleasure.
Jeremy: That's it for my chat with Sara. I hope you check out some of her projects like Corpse Party and Trails in the Sky. She put in a lot of work to make sure they run well on old PCs so don't worry if you've got a slow computer. As usual, a transcript for this episode is available at softwaresessions.com. Alright. See ya.