George talks about his experience streaming during Lexember.
NOTE: This episode was written and recorded in the middle of the D&D OGL debacle. The way it was resolved changes some calculations slightly, but I’m still a bit perturbed by it.
Aidan Aannestad comes on the show to talk about information structure, including topic and focus and how they can be realized in language.
Links and Resources
George will be streaming word creation for Lexember every Saturday of the month at 1 PM US Central Time! You can check it out here!
Today, Logan Kearsley joins us to talk about whistled registers, and to let us know about his whistle synthesizer that can help you make one.
Links and Resources:
George discusses word substitutions people use to avoid Internet censorship, and how that could be applied in worldbuilding.
One of the interesting things you find in internet spaces is the presence of content filtering and the attempts to get around them. On the one hand, the people who have control of a given space have impressive control over the language that is allowed to be used on their platforms. Yet, on the other hand, many of their tools are fairly easy to circumvent, especially if there aren’t expensive human reviewers involved.
The result of this is a really interesting environment for a weird kind of taboo avoidance. People avoid certain words not because of any genuine belief that it’s wrong to say them, but because there are people in power who have an effective means to ban those words, and a lot of their replacement strategies have a clear eye to keeping the meaning clear while avoiding the automated filters. This could be really interesting to think about for conlangers working in modern or science fiction settings, where the same kinds of filtering tools might be present, though I have a thought about how it could even extend into less technological fantasy settings.
Before we get to that, Conlangery is entirely supported by our patrons on Patreon. You can become a member at patreon.com/conlangery. You can get early access to episodes and even see the scripts for these shorts before they are recorded. Go to patreon.com/conlangery to pledge your monthly amount.
This topic came to me as I was musing about the kinds of taboo avoidance I see on TikTok. I’ve been on TikTok for about a year now, and in that time, I’ve observed an interesting phenomenon of word replacement to avoid censorship. TikTok is known to do a lot of algorithmic enforcement of their community guidelines, and a combination of creators getting videos downgraded or removed along with maybe some technological superstition has led a lot of people to put together some interesting strategies to avoid potential censorship.
One very ubiquitous term you’ll hear or see is unalive. It seems that TikTok doesn’t like terms referring to death, so a lot of creators have used unalive as a substitute for die, kill, and even suicide. Note that this collapses the semantics quite a bit, though context will usually pick up that load. You can talk about someone who unalived, someone who unalived someone else, or someone who unalived themself. The meaning remains very clear, with an intuitive derivation. I’ve often mused about how I never see tabooing of terms relating to violence, and this still isn’t quite that, but it does include violence-related terminology. It is interesting that TikTok apparently censors words related to death enough for this euphemism to catch on.
In a lot of other avoidance strategies, it’s often more about how words are spelled in captions, which are easier for the app to censor than spoken words. Sex is replaced with seggs, people put random spaces into words in their captions, or follow 1337 conventions of replacing letters with similar-looking numbers or symbols, like 1 for i or the euro sign for e. One user seems to get by with mostly adding diacritic marks to vowels in banned words. Like unalive, it’s aimed at preserving the meaning while avoiding word filters. I even see people use “clock app” or the clock emoji in place of TikTok, presumably in case the site suppresses its own name to suppress criticism.
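As a rough illustration of why these spelling tricks work, here is a minimal Python sketch of a naive keyword filter and a slightly stricter one. The block list, the look-alike table, and the function names are all assumptions made up for this example; nothing here reflects how TikTok or any real platform actually filters captions.

```python
import unicodedata

# Hypothetical block list; real platforms do not publish theirs.
BANNED = {"die", "kill"}

# Assumed look-alike mappings, for illustration only.
LEET = str.maketrans({"1": "i", "3": "e", "€": "e", "0": "o"})


def naive_filter(caption: str) -> bool:
    """Return True if any whitespace-separated token is on the block list."""
    tokens = caption.lower().split()
    return any(tok.strip(".,!?") in BANNED for tok in tokens)


def normalize(caption: str) -> str:
    """Strip combining diacritics, then map look-alike symbols to letters."""
    decomposed = unicodedata.normalize("NFD", caption.lower())
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_marks.translate(LEET)


def stricter_filter(caption: str) -> bool:
    """Apply the same block list after undoing simple disguises."""
    return naive_filter(normalize(caption))


for caption in ["they kill him", "they k1ll him", "they kíll him",
                "they k i l l him", "he unalived"]:
    print(caption, naive_filter(caption), stricter_filter(caption))
```

Under these assumptions, the plain filter misses `k1ll` and `kíll`, the stricter one catches both, and inserting spaces or switching to a new word like unalive still gets past either. That is roughly the arms race the creators are playing.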
I did encounter one avoidance strategy that didn’t really aim to keep meaning clear. For a while, I saw people replacing sex work with accounting and sex worker with accountant in order to talk about sex worker rights issues. Sometimes, they would call out the taboo avoidance with star emojis, but not always. And as always, this may be said out loud or may only be replaced in the captions. This strategy seems to also be related to a more complex tactic of telling allegorical stories — basically satire aimed at talking about something that’s likely to get removed or deemphasized by the platform.
As I alluded before, there are differences in how people implement all of these strategies, with some people saying the replacement out loud, while others only replace it in the captions. This, of course, can cause an accessibility issue when the captions don’t match the speech, but there seem to be cases where the app actually will not transcribe a particular word, indicating that it’s banned from captions.
Another place I have encountered some interesting word avoidance in the face of technology is on the Chinese Internet. It’s been a while since I read much about Chinese netizen language, so some of this is definitely out of date, but it’s still interesting.
You may know that China exercises a significant amount of censorship on online speech. This is a system that they’ve built up over the years, but it includes a mix of blocking select foreign sites, keyword filtering of social media, and human reviews of online content. The avoidance strategies I’ve seen mostly revolve around using homophones or near-homophones, which works very well in Mandarin Chinese, since you can find homophonous characters pretty easily.
A lot of what I saw around when I was paying attention to these things were actually more mocking replacements. Around the aughts, one of the slogans of the Chinese government was 和谐社会, meaning “harmonious society”. People mocking the slogan online replaced the characters of 和谐 “harmonious”, with a homophone (河蟹) meaning “river crab”. This escalated to incorporate a second slogan, 三个代表, “the three represents”, transformed into 带三个表, “wearing three watches”. This, of course, led to photoshopped images of a river crab wearing three watches, which was popular for a while.
But there is more straightforward, non-political taboo replacement. This is not episode 13, so I will let you go look up the grass mud horse and the French-Croatian squid to figure out the “obscene” phrases they are replacing.
There are a lot of things that you can do with these sorts of replacement games. Obviously, this is something worth thinking about if you have some sort of science fiction or modern day world where these kinds of forces are likely to be present on different Internet-like platforms. You can be thinking in terms of your writing system and what can be replaced with what.
An idea that came to me was how this could apply in a fantasy setting. For instance, in the book Tigana, an enchantment is placed over the titular princedom on the Peninsula of the Palm by a foreign conqueror. People not from Tigana cannot hear or retain its name or many of the cultural products from the princedom, instead referring to it as Lower Corte. In the story, children of people from Tigana found each other through songs or other cultural knowledge they learned from their parents.
However, what if you took that basic premise, but applied some of that TikTok euphemism logic to it? Could they twist the name into something similar that outsiders could hear and retain? If it’s a transparent name, maybe they could use synonyms — perhaps the country is named the equivalent of Rose Kingdom, and various flowers end up substituted, or a description like Thorn-stemmed Kingdom. This all depends on how you decide the enchantment works, of course, and that’s all up to what limits you decide to put on it.
Of course, you can also take some inspiration from the mocking nature of some of the Chinese examples above, and come up with some fun, punny ways people refer to the ruling class or the official government. Who is making fun of the government? Why? What are the things they hit on in their satire?
In any case, exploring ways that people obfuscate words in an online context or some similar censorship situation can really help you tie language into politics and culture in your world. What things are censored? Why are they censored? What motivates people to talk about them anyway? How effective is the censorship? There’s a wealth of issues to explore this way.
Happy Conlanging!
George talks about some interesting terms he encountered in his most recent job, and how you can pay attention to language around you at work for inspiration.
Welcome to Conlangery, the podcast about constructed languages and the people who create them. I’m George Corley.
Today, I want to continue my occasional “listen like a conlanger” series talking about how you can think like a conlanger at work. Language is everywhere at the workplace, and by having language on your mind as you work, you may be able to improve your craft and distract yourself from the drudgery of capitalism.
Conlangery is entirely funded by our patrons on Patreon. To make a monthly contribution, go to patreon.com/conlangery. Patrons get early access to episodes and get to see the scripts for these shorts as I am working on them. I’m also considering what other kinds of perks I can add, so if you have suggestions, let me know.
Some of you may know that I was working at Google as a transcriptionist for the past two years. Now that that contract has ended, I want to talk a little bit about my experience there. I know that many people bemoan corporate language, but the way people around you talk at your job can be an interesting thing to think about. There can be a lot of jargon floating around with unusual and interesting origins, and it can reveal something about the subculture that exists within the company.
I’d like to start by talking about a few of the metaphors I encountered at Google. Many of these are likely common at other tech companies and throughout the business world, but I find them interesting.
First, there are metaphors that seem to be related to the technology business. It’s common for people to ask for a high-level summary of some system or process. High-level here means a surface-level or abstract and simplified explanation. As far as I can tell, this seems to be related to terminology for programming languages.
A high-level programming language is designed to be easier for a human programmer by giving abstracted options that are closer to how we understand the program, taking care of the technical aspects under the hood. Python is a good example of a high-level language. Conversely, a low-level programming language, like Assembly, is much closer to the detailed instructions, down to managing the computer’s memory within the code.
So it makes sense that these would end up getting extended to things like an explanation. Ultimately, high-level and low-level are related to the cognitive metaphor of DEPTH IS DETAIL, which you see in phrases like in depth and deep dive. It’s interesting to me, as these terms seem to conflict with other English height metaphors that relate to hierarchy. It’s entirely plausible that a high-level meeting will have two very different meanings: either a meeting of people who are in high positions in their company, organization, or government, or a meeting where topics are covered in a broad and general way. I expect that both apply simultaneously a lot of the time, since powerful people don’t have time for details, but they could clash.
Another technology metaphor I heard was bandwidth. Most of us are familiar with the computer networking sense of bandwidth, where it refers to the capacity of a connection to carry data. Apparently this has caught on in the business world to mean someone’s ability to complete tasks. In an environment where people are expected to be juggling multiple tasks at once, this metaphor is useful. We might criticize that environment for the stress that it causes, of course, but perhaps that’s a discussion for another podcast.
In both of these, I can think of a couple of things to ask about your conlang. First is what kind of technology is common in your world? Many languages have farming metaphors, because until very recently, most people were farmers to some degree. There are also car metaphors, and now, information technology metaphors are becoming common. Second, what kind of things do your speakers work with all the time? This is especially good if you want to mention specific subcultures within certain professions. Alchemists might make chemical metaphors. Astronomers might take to star metaphors.
Another term related to the business is Dogfood, which refers to internal testing where employees are opted into features that are still under development. I see competing origins for this online leading to different companies, but the phrase “Eating your own dogfood” seems to be involved, in a sense of using your own products, which may have originated with actual dogfood companies. Google’s internal material indicated the term was used because it’s a dog-friendly company — employees are encouraged to bring their dogs to work — but I’m pretty sure that this didn’t start with them. Nevertheless, there’s actually an interesting extension here, as an even earlier development testing stage is called Fishfood, which is a more limited pool of opted-in employees.
What interests me here is the complexity. A common saying catches on in the business world, then inspires a technical term with possible reinforcement from the company’s culture, then another extension is added on by slapping a different animal metaphor on. That kind of chain reaction is hard to think of in a conlang, especially if you’re going word by word, but can happen fairly often in natural languages.
I really wish I could give examples of internal codenames, but I’m really uncertain about whether I can mention those. Most of the ones that I know are for internal data products or backend data systems, and I don’t think they’re publicly known. I will just say that there are a variety of names: some are pop culture references, some are generic and descriptive, and some of them, I’m not sure what the name came from.
And of course, there was a plethora of technical terminology. As a transcriptionist, it wasn’t really my job to understand that beyond what was necessary to put the right word in. Occasionally, interesting little things would pop up. For instance, the database query language spelled as SQL (for Structured Query Language) was usually pronounced as “sequel”, which is apparently the most common pronunciation everywhere, though there were those who pronounced the letters S Q L out, especially non-native speakers. If you think about it, it’s clear how dependent that kind of situation is on particulars of our culture. It requires a highly literate culture with an alphabetic script using initialisms. How would this sort of thing work differently under an abjad? Or an abugida?
I don’t have one overarching lesson here, except to say that this is another case of using the world around you as inspiration. Think about terminology you come across at work, what its origins are, and how it connects to work culture. How might different professional or work environments in your conworld contribute to the terminology there? For instance, I still need to come up with some technical terminology for magic and magical universities in my current conworlding project, and I’m going to have to think about these things before I come up with finalized terms. I know that different cultures will view these things differently, but I’m going to have to sort out where those differences are.
This episode, George gives a short discussion of the idea of language as having infinite fractal complexity, and what this means for conlangers building fictional worlds.
Special Mention: Resources on the Line 3 protest: Stop Line 3, Center for Protest Law and Litigation, Sierra Club Fact Sheet, Line 3 Legal Defense Fund
Welcome to Conlangery, the podcast about constructed languages and the people who create them. I’m George Corley.
Today, I’m going to talk a little about the realities of what naturalistic conlangers are trying to simulate. What does it mean for a language to look natural or realistic, and can a conlanger actually create something as complex as a natural language? I’m going to suggest that you ultimately can’t, but I also think that you don’t have to. Most people’s goals in conlanging will not really approach that, and I’m going to talk a little bit about how to decide what you really need out of your conlang.
Instead of doing my normal Patreon pitch, I wanted to draw attention to something I think is important to point to. You may have heard of Line 3, the pipeline that is being built in Minnesota to bring tar sands from Alberta into Wisconsin. This pipeline is going through Anishnaabe land. It has the potential to pollute waters through much of the United States, and it’s going to contribute greatly to climate change. I would encourage you guys to go to stopline3.org. I’m also going to link a couple of other resources in the shownotes, and I have decided to make a small donation to the Center for Protest Law and Litigation, which is providing some legal defense funds to people who are protesting the pipeline. I would encourage people to learn about what’s going on here.
Of course, there are many important reasons to oppose Line 3. It’s going to have huge ecological impacts. It’s going to impact water in a huge area. It’s going to contribute to climate change. And it is going through treaty land. I, personally, feel the need to highlight it in this podcast specifically because we, as conlangers, often draw inspiration from indigenous languages, and I, myself, have drawn inspiration from the Nishnaabemowin language, also known as Ojibwe, so it seems kind of wrong to take that inspiration and not care about the issues of the actual people who speak those languages. But this is up to you guys as individuals, what you want to do to support this cause. I just want to raise some awareness and let you guys know that I’ll make that small donation. Thank you.
Now on to natural languages. Many of us conlangers have a goal of creating a language that at least looks like a natural language, and people do succeed at that in varying degrees. In some ways, it’s not so difficult. There is a reason that ANADEW, A Natlang Already Did it Except Worse, is such a common term in the community. You have to almost deliberately go out to the edge to come up with some grammar or phonology that really looks impossible for humans to come up with naturally. The lexicon can be a bit harder, but with some work, you can avoid relexing and come up with realistic senses for words.
What is difficult, and likely impossible, is to come up with the massive amount of variation in language. You can work on dialects and registers all you want, but you won’t really get to the complexity we see in the real world. The reason for that, I’m going to propose, is that in the real world, natlangs have infinite fractal complexity.
What do I mean by infinite fractal complexity? Let’s start with your language. For the sake of this exercise, let’s assume a language that is relatively unified and not part of a dialect continuum. That language can naturally be divided into a number of dialects, based either on geography or on social divisions, though most likely both. But those dialect divisions are not hard lines, and there is variation within each dialect. You can subdivide and subdivide until you get to the idiolects of individual people.
You might assume that idiolects are atomic, but they’re not. Even there, you’re going to run into variation. Most people code switch between several different varieties, even if they are monolingual. Not only are there different registers for different situations, but you subtly change the way you speak in individual interaction. Sociolinguists often model this by simply taking statistics on how often one variant of a word or construction is used as opposed to others and reporting the percentage. Those percentages change depending on who you’re talking to, and it may even change over the course of a conversation.
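To make the percentage idea concrete, here is a small Python sketch of that kind of tally. The observations are invented for illustration (one hypothetical speaker alternating between -ing and -in' with two different listeners), and `variant_rates` is just a name chosen for this example, not a standard sociolinguistics tool.

```python
from collections import Counter, defaultdict

# Invented data: (listener, variant used) for one hypothetical speaker.
observations = [
    ("boss", "-ing"), ("boss", "-ing"), ("boss", "-ing"), ("boss", "-in'"),
    ("friend", "-in'"), ("friend", "-in'"), ("friend", "-ing"), ("friend", "-in'"),
]


def variant_rates(obs):
    """Return, per listener, the percentage of tokens using each variant."""
    counts = defaultdict(Counter)
    for listener, variant in obs:
        counts[listener][variant] += 1
    return {
        listener: {v: 100 * n / sum(c.values()) for v, n in c.items()}
        for listener, c in counts.items()
    }


rates = variant_rates(observations)
for listener, by_variant in rates.items():
    print(listener, by_variant)
```

With this made-up data, the same speaker uses -ing 75% of the time with the boss and only 25% of the time with the friend. For a conlang, even a rough table like this for one or two characters can stand in for a lot of variation you never fully specify.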
Add to this the fact that every speaker knows thousands, even tens of thousands of words, and each speaker may have slightly different understandings of their meanings. There are also thousands of collocations, idioms, and combinations involved. And of course, infinitely many unique sentences that can be constructed.
All of this extends back into history as well. Historical linguistics attempts to classify languages into neat family trees, but this is still an abstraction. In reality, every reconstructed proto-language is really just an approximation of a messy collection of different dialects. There were many branches off of our languages that we will never know, and others that were reabsorbed into another, surviving branch, possibly leaving traces behind. Words take their own individual journeys, branching out through derivation or hopping across languages in unique patterns reflecting trade routes or migrations. We even have mysterious words that may be from languages that we otherwise know nothing about.
So far, I think I’ve impressed on people the fact that it is impossible for anyone to construct a language that truly approaches the complexity of what happens in natural languages. The question, I guess, is “What do we do about it?”
The seeds of this short are in my interview with Lauren Gawne. We talked a little bit there about determining how much of a language you need for worldbuilding. Lauren told me that Aramteskan mainly just needed enough fleshed out for the needs of the book, though she did go beyond that a bit. My own current project mostly requires a number of naming languages, with some of them related to others. That led me to do some significant historical work on sound changes, in order to create families, but only to get me to phonology and basic morphology. Depending on how the story goes, I might actually do some significant grammar work on two of the languages, but that really depends on whether I decide to include any dialogue in those languages.
To understand what my goal should be, I had to think about what these languages were for. This is a story set in what is, for lack of a better term, magic grad school. I have two protagonists and three other viewpoint characters, all from different cultures, and probably from three different language families. There are also other characters: students and professors from a variety of cultures, plus locals to the area the university is in, which I plan to be speaking another language. Being in something resembling a modern academic environment, I intend for characters to be citing names from various cultures, too, potentially including names from significantly earlier times.
All of these factors led me to think that I needed the basic structure of at least two language families with enough sound changes and morphology to make names, and possibly a few extra isolates with room to expand on. In addition, at most two languages might need to generate sentences for this story, but it remains to be seen whether I’ll actually have to go that far. My historical work, for now, is limited to sound changes. I may need to add a few grammaticalized forms for the more synthetic languages, but I’m not going to go all out on verb forms or syntax evolution until I decide I need to make sentences.
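For anyone who wants to try the same sound-changes-only approach, here is a minimal Python sketch of applying an ordered list of changes to proto-forms. The three rules and the proto-words are entirely hypothetical, invented for this example; they are not taken from my project or from any real language family.

```python
import re

# Hypothetical ordered sound changes: (regex pattern, replacement).
SOUND_CHANGES = [
    (r"p(?=[aeiou])", "f"),             # p becomes f before a vowel
    (r"(?<=[aeiou])t(?=[aeiou])", "d"),  # t voices to d between vowels
    (r"[aeiou]$", ""),                  # word-final vowels are lost
]


def evolve(word: str, changes=SOUND_CHANGES) -> str:
    """Apply each sound change in order and return the daughter form."""
    for pattern, replacement in changes:
        word = re.sub(pattern, replacement, word)
    return word


for proto in ["patako", "tapi"]:
    print(proto, ">", evolve(proto))
```

Order matters here: swapping the second and third rules would change which vowels are still around to trigger voicing. Running a different rule list over the same proto-vocabulary is a cheap way to get related naming languages.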
There isn’t really a standard shape for what you need here. The complexity is still something to think about, but precisely what parts of that complexity you need to simulate is something you need to consider when you’re building. For one world, you may decide that you will only ever need names for one culture, and so you just make one naming language. For another, you might decide that you want a few phrases in an archaic form of a language, so build one language out to build those sentences, and maybe do a little bit of historical work to get modern names or a bit of a modern version of a language.
But what if you want two characters from the same linguistic culture communicating in the language? Well, then you want to have that one language built out, but you also want to think about not just broad strokes diversity, but the particular idiolects of these two characters. You want to ask if they come from different regions. You want to ask if their social class, gender, or other social identifications affect how they speak. Perhaps most importantly for this one dialogue, you want to work out just how these two particular characters would speak to each other. That involves some thinking about register, politeness, and relative social standing, but also about the emotional state of these particular characters and their own tendencies to adhere to norms or not. That sounds like a lot of detailed conlanging work, but it’s specific to the two characters in the dialogue. Some of it has implications that could affect other things down the road, but you only need enough framework there that you can expand later if you need to. You can focus on the differences between these characters, and along the way, work out some basic ideas of how to expand things in the future.
Now, there are plenty of conlangers out there who enjoy going deep into a language without some particular application in worldbuilding, and there are always people who don’t mind building more than you need. If you are building a language for a work of fiction, though, I hope these ideas are helpful to you. The bottom line is, language is fantastically complex in myriad ways, but you don’t need to deal with all of that complexity at once. Consider what you need right now, and where you need room to expand in the future.
George talks to Jasper Charlet about his opera, Heyra, which is written entirely in the Carite language and is currently in crowdfunding.
Top of Show Greeting: Opaki Aŋkuati
Links
William comes back on the show to tell us all about the category of Associated Motion.