I.
In the fall of 2017, I attended a computer science class at Columbia University in New York City. The topic of the course was natural language processing (NLP), and the lecture was about representing words using computational tools. It would be a year before the first large language models (LLMs) were released, but already the field of linguistic AI was abuzz. Neural networks offered the promise of astonishing improvements over previous approaches in computational linguistics. Not only could they help us raise the bar on old tasks like classifying and understanding pieces of text, but recent advancements in domains like text summarization hinted at an even grander promise: soon, computers would be able to produce, or “generate,” language in a manner indistinguishable from human discourse.
While the rest of the world was mostly oblivious to this nascent technology, tech influencers were already pairing Promethean promises with dire warnings. AI would remake the world. In 2014, during an aerospace summit at MIT, Elon Musk cautioned that humans would be “summoning the demon” through AI, suggesting that a power was being manifested that would refuse to obey any master. Abstaining from outright demonology, Kathleen McKeown, my professor at Columbia, also resorted to the vocabulary of mysticism. During one particular class, McKeown, one of the old hands in the discipline, was introducing the concept of “word embeddings”—sequences of numbers that represent words in many dimensions and provide a fundamental building block for language models. I distinctly remember how McKeown presented a slide with the title “Word embeddings are magic.”1 “They work, but we don’t quite understand why,” she explained.
Today, state-of-the-art computational semantics are based not only on the famed “attention heads” of the “Transformer” LLM but also on this effective but opaque technique that McKeown was describing, the “embedding” of words. While the attention head helps an LLM know what to attend to in a given piece of text, embeddings provide the fundamental representation of all linguistic elements that the model relies on to make sense of natural languages. In this sense, embeddings are much like the bit, with the important distinction that they are explicitly semantic. If two words are similar, they will be represented by similar embeddings, i.e., similar sequences of floating-point numbers. By contrast, words with similar bit representations are not necessarily similar or related at all.
Back at McKeown’s lecture, my head was spinning. Not because the subject was difficult for me (which it certainly was), but because the representation of lexical structures of meaning through arbitrary numeric sequences seemed so utterly foreign to me. In her slides, McKeown showed how word embeddings enabled a strange sort of lexical algebra. By adding and subtracting word embeddings, we could inductively explore conceptual relationships that usually were mapped using natural language or laborious systems of deductive logic. Without any pre-given “ontology” of concepts, this simple arithmetic could show how various words were related to each other. A slide displayed an example that is today canonical: king − man + woman = queen.2 I followed this perplexing performance with increasing fascination. After all, I was—somewhat naively—used to thinking of language, discourse, and meaning as almost entirely qualitative and often quite intimate fields of study, whose depths were explored more under the guidance of authors such as Foucault and Derrida than under the tutelage of your computer science professor.3 McKeown’s lecture filled me with both dread and awe. If semantic structures could be so effectively quantified, we would soon witness entirely new forms of science, poetry, literature, and even governance and power.
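To make the arithmetic concrete, here is a minimal sketch of this kind of embedding algebra, using the open-source gensim library and a set of pretrained GloVe vectors (the particular vectors are my assumption; any pretrained word embeddings would illustrate the same point, and this is not McKeown’s own code):

```python
# A minimal sketch of "lexical algebra" on word embeddings, assuming pretrained
# GloVe vectors downloaded via gensim. The exact similarity scores will vary.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # 50-dimensional GloVe embeddings

# king - man + woman ≈ ?
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # typically the nearest remaining word is "queen"

# Similar words get similar embeddings: cosine similarity is high for related pairs
print(vectors.similarity("king", "queen"))   # relatively high
print(vectors.similarity("king", "carrot"))  # much lower
```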
The production of embeddings is relatively straightforward. By letting a neural network predict words from their context, its internal structures (i.e., embeddings) end up reflecting real relationships in language. First, a model is shown a sentence like “the cat sat on the …” Then, using embeddings for each of these “context” words, it predicts the appropriate “target word,” which here might be “mat.” If the wrong word is predicted, the error is used to nudge the embeddings in the direction that would have made the right word more likely (a technique known as “gradient descent”). By repeating this task thousands, millions, and now even billions of times, models develop the embeddings or “weights” that—along with some other important mechanisms—power the modern LLM.
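As a rough illustration of this training loop, the sketch below uses PyTorch to nudge randomly initialized embeddings towards predicting a target word from its context. It is a toy, CBOW-style setup with an invented five-word vocabulary, not the architecture of any actual LLM:

```python
import torch
import torch.nn as nn

# Toy vocabulary and a single training example: context -> target,
# as in "the cat sat on the ..." -> "mat".
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_id = {w: i for i, w in enumerate(vocab)}

embedding_dim = 16
embeddings = nn.Embedding(len(vocab), embedding_dim)   # one embedding per word
output_layer = nn.Linear(embedding_dim, len(vocab))    # scores over the vocabulary

optimizer = torch.optim.SGD(
    list(embeddings.parameters()) + list(output_layer.parameters()), lr=0.1
)
loss_fn = nn.CrossEntropyLoss()

context = torch.tensor([word_to_id[w] for w in ["the", "cat", "sat", "on", "the"]])
target = torch.tensor([word_to_id["mat"]])

for step in range(100):
    context_vector = embeddings(context).mean(dim=0, keepdim=True)  # average context embeddings
    scores = output_layer(context_vector)                           # predict the target word
    loss = loss_fn(scores, target)                                  # penalty for wrong predictions
    optimizer.zero_grad()
    loss.backward()            # compute which way to correct the embeddings and weights
    optimizer.step()           # gradient descent: nudge them in that direction

with torch.no_grad():
    prediction = output_layer(embeddings(context).mean(dim=0, keepdim=True))
print(vocab[prediction.argmax().item()])  # after training, the model predicts "mat"
```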
Eight years after McKeown’s lecture, LLMs and their embeddings have entered the daily lives of innumerable people, whether directly through the graphical interfaces of chat services such as ChatGPT, Claude, and Gemini, or indirectly through the backend of search engines, mapping apps, and the like. Embeddings are everywhere, not just in language: Spotify listens, clicks during a browser session, likes on a dating app, and any other sequence of digital events can be represented as embeddings. The models underlying these services tend to be far more complex than those presented by McKeown, but many of the principles are the same: most AI relies on embeddings and similar digital structures, “magic” that works but that remains opaque even to those who develop the technology. At the same time, the sense of awe in the face of this magic has become nearly ubiquitous.
II.
The magical vocabulary invoked by Professor McKeown and the satanic prophecies proclaimed by Musk draw on much older arcana surrounding not just science, but information theory and the computational sciences in particular.4
Cheerleaders and alarmists alike seem to think of AI in terms that are reminiscent of French mathematician Pierre-Simon Laplace’s thought experiment about an omniscient demon.5 “Laplace’s demon” knew the exact positions and momenta of all the particles in the world and could, therefore, predict the future with perfect accuracy. Somewhat similarly, AI is often depicted as a “god trick” that establishes a gaze that sees everything while being nowhere.6 Much like Laplace’s demon, a sufficiently large LLM is thought to converge towards perfect knowledge of things themselves, expanding the frontier of scientific innovation and socioeconomic optimization. In this conception, the world is seen as a game, much like Go or chess, which can eventually be “won” (as AI agents have already done in the case of these games). If sociologists did their best in the 1990s to convince us that we live in a “risk society” in which all actors strive to minimize future “losses,”7 AI offers the ultimate solution to this state of constant worry and malaise. AI-driven information systems can use data gathered from our smartphones and other devices to anticipate everything, finding the optimal solution to any utility function. Increasingly, fridges, cars, light bulbs, and other mundane machines are also recruited into this vast network of informers, as they are made “smart” and brought into the fold of ubiquitous digital surveillance. Laplace’s demon is the fear and fantasy of total control.
![](https://images.e-flux-systems.com/e_flux_figure1_COPY.jpg,1600)
In a very simple language model, each word has one embedding. In this figure, with embeddings from a pre-LLM model known as Word2Vec, each word is represented by fifty numbers, which are shown as a range of colors. In contemporary LLMs, the embeddings are more complex, as they are usually context-dependent: i.e., the embedding for “bank” would be different if the context was “financial bank” or “river bank”. Figure by the author.
While this demon encapsulates many of the hopes and fears around AI, it relies on a somewhat naive determinism that was called into question by twentieth-century quantum physics. According to the so-called “uncertainty principle,” one cannot simultaneously know both the exact position and momentum of a particle. The type of perfect knowledge that powers Laplace’s demon is, in fact, impossible. The universe is probabilistic, not deterministic. Consequently, the real question is how to make order amidst these probabilities. Only thus could the powers of Laplace’s demon be approximated.
For LLMs, this process of mapping the probabilities of human language started with the internet. The data repository that was necessary to train modern-day AI was created through thirty years of incessant “content production.” Only today can we begin to grasp the true significance of the concentration of online activity into various “platforms” that began in the late 2000s and came to dominate the internet in the 2010s. The anatomy of AI is the key to the anatomy of the present-day internet, much like Marx thought that “human anatomy contains a key to the anatomy of the ape.”8
While the demon of AI relies on probabilities, it is no less terrifying than Laplace’s creature. Today, AI is already generating a new set of significant power imbalances: AI-assisted drones have, for the past year and more, been dropping bombs in the Middle East and monitoring protesters around the world.9 LLM bots are scouring the web for inappropriate and “harmful” speech, in whatever way this is defined at a given moment in time.10 Neural networks monitor workers and measure their efficiency (is the worker taking too many breaks? are they going to the bathroom too often?).11 “Citizens will be on their best behavior because we are constantly recording and reporting everything that’s going on,” said Oracle cofounder Larry Ellison during the company’s 2024 financial analyst meeting. Not long after, he was announced as a backer of the $500 billion AI initiative of the second Trump administration.
While these glimpses into the future are concerning, they do not announce the arrival of an all-powerful being. We will not be ruled by Laplace’s demon but by some other creature. How could its powers be undermined?
III.
The historiography of AI often distinguishes between two distinct phases: symbolic AI and the “connectionism” of current models, where the former relies on a vast accumulation of hard-coded rules and the latter on statistical learning. The prehistory of connectionism is usually found in the cybernetics of the 1940s and ’50s.12 In this interpretation, the first wave of connectionism coincided with the rise of cybernetics, while the second wave began in the late 1980s when increases in computational capacity and a slew of innovations in neural networks and efficient gradient-based optimization created the conditions for a renaissance in connectionist AI.
From the vantage point of contemporary AI research, the key innovations in cybernetics were threefold. First, Claude Shannon’s information theory showed that any single “event” can be represented as a set of “yes” or “no” choices, i.e., “bits,” ones and zeros.13 Second, and somewhat along the same lines, Walter Pitts and Warren McCulloch’s simple model of an “artificial neuron”—extended by Frank Rosenblatt to the famed “perceptron”—showed that even complex mathematical functions can be reduced to a set of simple functions.14 Third, theories of feedback and teleology proposed by Norbert Wiener and others suggested that the world should be grasped not through the essence of things themselves, but superficially through the observed inputs and outputs of various systems.15 If a system is given a goal (teleology) that it can strive towards iteratively by learning from its mistakes (feedback), then the system will gradually develop the information structures (bits, or, in language models, embeddings) and schemas needed to achieve these goals. This is also the paradigm that connectionist AI relies on.
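A minimal sketch of this feedback loop is Rosenblatt’s perceptron rule itself: a threshold “neuron” that is given a goal and corrects its weights from its own errors. The example below, with logical AND as the goal, is invented purely for illustration:

```python
# A toy perceptron illustrating the cybernetic loop of goal (teleology) and
# error-driven correction (feedback). Data and learning rate are invented.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # inputs as bits
y = np.array([0, 0, 0, 1])                      # goal: logical AND

weights = np.zeros(2)
bias = 0.0
lr = 0.1

for epoch in range(10):
    for inputs, target in zip(X, y):
        output = 1 if inputs @ weights + bias > 0 else 0  # simple threshold "neuron"
        error = target - output                           # feedback signal
        weights += lr * error * inputs                     # correct towards the goal
        bias += lr * error

print([1 if x @ weights + bias > 0 else 0 for x in X])  # converges to [0, 0, 0, 1]
```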
After World War II, the conceptual innovations of cybernetics spread like wildfire. Its profound influence has been traced through various literary genres, the social sciences, economics, psychology, biology, and so on.16 However, the impact of cybernetics on artificial intelligence was initially short-lived. Although Shannon’s bit revolutionized computer science, the other basic principles of cybernetics were too demanding for the computers of the time. There simply was no way to actually implement ideas of feedback through learning at scale. For several decades, AI research focused not on feedback and neurons, but on symbolic systems governed by predetermined, precise instructions and logical guidelines.
It was only in the late 1980s that the second wave of connectionist AI began, coinciding with the rise of the internet and the widespread adoption of home computers in the broader consumer market. AI researchers in the late 1980s finally demonstrated what cyberneticists had already hinted at: by combining simple mathematical functions, a neural network could, in theory, estimate any complex mathematical function.17 In other words, by stacking simple functions into multiple “layers,” neural networks could model almost any function. This is why the process of “training” a neural network is called “deep learning”: a sufficiently “deep” and wide neural network would, in theory, be able to express any mathematical function, according to the basic tenets of connectionism.
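To give a rough sense of this claim, the sketch below (with arbitrary layer sizes and training settings of my own choosing) stacks a few simple functions into a small network and lets it approximate a sine curve:

```python
# A tiny illustration of the connectionist idea that stacking simple functions
# can approximate a complicated one: a two-hidden-layer network fit to sin(x).
import torch
import torch.nn as nn

x = torch.linspace(-3, 3, 200).unsqueeze(1)
y = torch.sin(x)

model = nn.Sequential(           # "deep": simple functions stacked in layers
    nn.Linear(1, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for step in range(2000):
    loss = nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(loss.item())  # a small error: the stack of simple functions now traces sin(x)
```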
![](https://images.e-flux-systems.com/e_flux_figure2.png,1600)
Word embeddings are formed by training a language model through various prediction tasks. The figure shows a classic language modelling task, where the model is asked to predict the context words (white frame) from a given target word (blue frame). By repeating this type of task millions of times—along with some important post-training alignment work—a model can be trained to respond to prompts on a service like ChatGPT. Source: Chris McCormick.
Since cybernetics had—through its impact on sciences such as sociology, neuroscience, and economics—already conditioned the Western mind to think of all social phenomena as functions, neural networks could now offer the promise of a world where everything could be modeled. For two decades, this promise was a dream that was shared mainly by faculty in computer science and mathematics departments, but at least since the rollout of ChatGPT, it has become mainstream ideology and paradigmatic “normal science.” Alarmists like Eliezer Yudkowsky warn us that the world could be annihilated by an omnipotent AI, while an endless stream of rapturous CEOs and researchers from a small set of industry leaders herald a better, more controlled world brought forth by AI.
In either case, the demon of AI will be summoned through a machine that can estimate the probabilities of “events”—from words in a sentence to the incidence of crime in an area—with increasing accuracy. But what counts as an “event” is, if not entirely, then at least largely subjective. Indeed, one of the key insights of cybernetics is the distinction between “system” and “world.”18 In the cybernetic framework, each system has its own “world,” consisting of all the possible outcomes and entities that the system can perceive. A famous example is the tick, whose world the biologist Jakob von Uexküll described in works that preceded but greatly influenced cybernetics.19 A tick sits on a branch, waiting for certain odors and patterns of light that distinguish its prey from the environment. Once it is on an animal, the tick can sense the heat of its skin and distinguish hairless patches of skin from those with hair. The “world” of the tick is composed of these signals of light, smell, and heat. Changes in these variables qualify as events for the tick. These are the inputs it recognizes as significant and real.
Turning back to statistical modeling, the “world” of a classical sequential model predicting the weather might consist of the states “sunny,” “cloudy,” and “rainy.” Transitions between these states are the possible “events” that the model can recognize. The world of GPT models, operating on sequences of characters and words, is many times larger, with complex relations drawn between myriad entities. Nevertheless, it is still limited to a set of discrete events.20 One boundary of such a model is drawn by the letters and words in the data it has seen. A “multimodal” model, which combines images and words, also recognizes various combinations of the additive primary RGB colors. Sensors that collect data for remote sensing using drones, aircraft, or satellites can produce data for models that utilize hundreds of frequencies of light, many of them outside the range of human vision. This is their world, contained and restricted by the data that the model can access.
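A toy version of such a weather model makes the point plain: its entire “world” is three states and the transition probabilities between them (the numbers below are invented for illustration):

```python
# A minimal sketch of the "world" of a classical sequential weather model:
# three discrete states and invented transition probabilities between them.
import random

states = ["sunny", "cloudy", "rainy"]
transitions = {            # P(next state | current state)
    "sunny":  {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.4, "rainy": 0.4},
}

weather = "sunny"
forecast = [weather]
for day in range(7):
    probs = transitions[weather]
    weather = random.choices(states, weights=[probs[s] for s in states])[0]
    forecast.append(weather)

print(forecast)  # anything outside these three states simply does not exist for this model
```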
Of course, data is not magically “out there,” but must always be produced. Borrowing jargon from the philosophy and sociology of science, we could say that data requires “translation.”21 Just as an English sentence can only be rewritten in Finnish by doing the actual work of translation, the production of data requires its own kind of translation work to make use of what those sensors gather, for instance. Such translation requires a certain amount of effort: it takes energy to complete. It also always demands a certain work of interpretation and is undertaken from a certain point of view.22 To interpret new semantic structures, we have to draw on the structures that are already at hand, grounded in our previous experiences. Interpretation unfolds in the present, in a given place and at a given time, yet it does so in light of the past and under the shadow it casts. In some sense, translation is therefore always violent: certain things are highlighted, others go unnoticed or are willfully erased.23 Often translation is actively repressive.
The diversity of perspectives implied by the sociology of translation is not entirely foreign to the science and industries of AI. In 2023, OpenAI released the “GPTs” service (with “GPT” pluralized), which allows users to create specialized GPT models based on a centralized and more general GPT model. In machine learning, “fine-tuning” a large model for a specific purpose, such as classifying emails as spam or ham, has long been a prevailing practice. Nonetheless, these perspectives remain reducible to one central view, the “eye of the master,” to borrow Matteo Pasquinelli’s phrasing.24 Harnessing particular models to serve the learning purposes of larger, more general models is fairly straightforward for a company like OpenAI or Google. Yet despite this, the proliferation of AI models in the plural might just provide a vantage point from which to perceive a future outside the centralized, omniscient gaze of the AI demon, with its Laplacean aspirations. So where would we turn from here?
IV.
Some guidance can be found in the movements for peer-to-peer file sharing and “online piracy,” which boomed in the 2000s, during the prehistory of the platform economy that we have come to take for granted today.
Before Spotify’s streaming model, peer-to-peer file sharing seemed like an inevitable future. When the copyright industry successfully undid the first file-sharing service, Napster, in 2001, several new services immediately replaced it. The most important of these, the Pirate Bay, was brought to court, but without any real effect beyond the harm caused to a handful of individuals.25 The founders were handed hefty fines and given short prison sentences, but the service remained in operation. Only Spotify managed to challenge file-sharing in earnest, with carrot rather than stick.26 The streaming service was easier and more consumer-friendly than downloading and running torrent-tracking software, as it offered a simple interface that required little or no technical understanding. Meanwhile, stricter copyright legislation and more aggressive persecution of offenders allowed for tighter control of the dwindling pool of file-sharers. Capitalism worked as it always works: as a global embodiment of cybernetic principles, a giant machine that eventually returns all lines of flight back to its motions, with both stick and carrot, discipline and control.
Amid this crisis, the Swedish blogosphere adjacent to the pirate movement started outlining various principles and operational modalities for a new and emergent internet politics.27 The Pirate Bay, which initially gave the finger to the copyright industry, represented the epitome of an “accelerationism” that proclaimed the unlimited and ever-faster sharing of bits, with no concern for potential repression or societal disruptions. Effectively, the Pirate Bay “channeled” and multiplied the growing desire for entertainment that characterizes late-capitalist societies, accelerating it as much as possible.28 At this moment, amidst the relative marginalization of file-sharing after the emergence of Spotify and other streaming services, with increased repression targeting not only file-sharing but also “online” political movements, new tactics were necessary. This felt particularly urgent in Sweden, where new wiretapping legislation gave the government sweeping powers to intercept and monitor traffic directly from internet cables.
Some commentators at the time—including former members of the Pirate Bay–affiliated Piratbyrån and activists in the hacking collective Telecomix—suggested that the new politics of the internet required moving from a boastful acceleration to a more cautious pace and the “tunnel politics” of encrypted and private connections. As Christopher Kullenberg, one of these commentators, summarized:
As matters stand now, we must think in terms of cipherspace, the net’s tunnels of encrypted information. If the 00’s was the decade when cyberspace imploded and we finally stopped thinking of the internet as a “virtual world,” then the 2010’s might be cipherspace + hackerspace.29
This notion of “tunneling” was borrowed from the practice of encrypting, or “tunneling,” online traffic using Tor, VPNs, I2P, and other methods.30 An encrypted internet like this would be less a cyberspace than a cipherspace. While tunnel politics enabled the continued traffic of copyrighted or otherwise “illicit” data over peer-to-peer networks, its main effect would be to emphasize a multitude of small worlds above and under the limitless bounds of the open internet. If everything is shared openly and directly, the worlds connected by tunnels will collapse. Instead, tunnels must be carefully and selectively dug between individual worlds. For example, the now-defunct What.CD and Waffles.fm private torrent trackers were closed file-sharing sites, worlds with points of entry that were carefully guarded through referrals and memberships that required “seeding” over certain quotas. Instead of uniform acceleration, tunnels move at different speeds: sometimes they are painfully slow, other times blazingly fast.
The key difference between accelerationism and tunneling runs not only along the axes of speed and volume but also along the axis of visibility. When file sharing over peer-to-peer networks began, the vast majority of users openly declared their IP addresses, giving away their “identity” and location. By contrast, the entire point of tunneling is that everyone tries to be anonymous, sharing personal information only selectively. Beyond just online spaces, the idea of tunnel politics highlights encrypted connections between material hubs, “cipherspace + hackerspace.”
In some very limited regards, tunneling is now mainstream. Since the NSA document leaks by Edward Snowden, there has been a new demand for both online privacy and device security.31 Today, secure messaging services like Signal are used not only by activists, journalists, military personnel, and government officials but also by many ordinary people. WhatsApp and Facebook Messenger both have end-to-end encryption built into them and recently even my own mother started, without my influence, using both a VPN and Signal.32
V.
What would it mean to accelerate or tunnel in the domain of AI? To grapple with this question, we also need to consider, in addition to acceleration and tunneling, a third term: centrality. Spotify was perfectly accelerationist, if we take acceleration to mean an increasingly fast transfer of more and more bits of data. The paradigm that shifted with Spotify and, more broadly, the platform economy was not about acceleration, but about centrality. It was centralization that destituted the previous strategy of acceleration. Instead of the peer-to-peer model where all users could host and share files, Spotify centralized all the power in the network to one node, cementing the client-server model of communication as the foundation for an internet that was all of a sudden only about various “apps.”
In this sense, ChatGPT was something of a Spotify moment for AI, although this comparison must be made with many reservations: AI development was never a clandestine and horizontal activity, but always facilitated by scientific institutions with significant gatekeeping and dubious ties to both industry and military. Be that as it may, the explosion of research on neural networks in the 2010s happened in a very open and collaborative spirit. Before the latest GPT models, all the main models—from pre-LLM models like GloVe and Word2Vec to early LLMs like BERT and ELMo—were either open source or, at the very least, had open weights, meaning that their embeddings were freely available to download and fine-tune. ChatGPT took the insights from a fairly open research community and packaged them behind a convenient interface, making interface access free but enclosing the models and source code behind them. Now, we do not know what data the latest GPT models and most of their competitors are trained on, and we cannot download model embeddings and other “weights” for our own use. When we want to fine-tune models, we usually do so on OpenAI’s paid platforms or by accessing their commercial API. If Spotify took our musical commons and packaged them into a paid service, ChatGPT goes much further. It packages our shared collective intelligence and sells it back to us as a convenient service.
Furthermore, by initiating the LLM race, OpenAI has created financial incentives for closing one of the last channels for reversing the gaze between user and platform: the free application programming interfaces (APIs) of platforms like Twitter and Reddit. Sites like Wikipedia, GitHub, Reddit, and Twitter are all repositories of data for LLM training. Even when the internet in the mid-2000s started to close around Facebook and, gradually, other major social media platforms, these sites for a long time maintained open APIs through which data from their platforms could be downloaded (this is also how the Cambridge Analytica scandal came about). However, in 2023 both Twitter and Reddit closed their APIs behind paywalls, bringing the counterrevolution against the open internet to a new culmination. Ironically, OpenAI, which was originally set up to produce open models for the “benefits of all of humanity,”33 has itself led this development, both by closing off its own model development and by provoking other major platforms to guard their own data more closely, now that it has suddenly become a valuable resource for model training.
In this sense, the chatbot LLM—whether accessed through ChatGPT, Claude, Gemini, or DeepSeek—is just an extension of the platform economy. But something has nonetheless shifted. If the paradigm shift that Spotify brought about was about centralization (and convenience), the current paradigm shift is about abstraction, about the emergence of the embedding as a fundamental unit of information. When we look at the history of the internet in this way, this current watershed moment is about the development of the weights and embeddings on which neural networks depend. In them, information is condensed into opaque vector structures—a “machine semiotics”—that to most humans would be as inscrutable as ancient runes, but which can be used by the model to write prose or predict stock market movements.34 In Marx’s account of capitalism, the proliferation of the commodity form involves the abstraction of all concrete use values into the “thin air” of exchange value.35 Embeddings make this especially literal: they are the highest level of abstraction of the semantic structures captured by an LLM, a compressed and partial truth of the discursive field of texts, images, and other media that was used to train the model.36 The idea that “the medium is the message” has never been more true. In this regard, the embedding establishes a new “logic of sense” that is inscrutable for the human reader in its operation, legible—for laypeople and technical experts alike—mainly through probing the outputs that are generated when a model is prompted.
Once a certain threshold of abstraction has been passed, political strategy also has to adapt to that level of reality. Consequently, a comprehensive tunnel politics would have to grapple with the compression of discursive reality into the embedding. The acceleration started by Napster was also a result of a new form of abstraction, the compression of music into the MP3 file. Similarly, the tunnel politics of closed torrent trackers, VPN services, and apps like Signal has been all about bypassing and destituting the surveillance apparatus brought forth by the centralization of online life in the platform economy. However, in theory, what was centralized after Spotify was not really the power to abstract per se. Rather, the centralization was a consequence of the so-called “network effect” in social media (a social network becomes more useful the more users it has) and what we might dub the “convenience effect” (people tend to prefer convenience over principle). What is happening now is different: it is the very power to abstract that needs to be contested.
![](https://images.e-flux-systems.com/e_flux_figure4_COPY.jpg,1600)
Material from Tinder at a 2017 machine learning conference in San Francisco. The figure shows how people on the app can be modelled similarly to words in a sentence, forming embeddings of users based on swipe behaviour. According to the presenters, the embeddings “represent possible characteristics of the swipee implicitly,” including their interests and chosen career path.
In this sense, the lines of what would count as tunneling and acceleration in AI are somewhat blurry. Of course, the current LLM paradigm is acceleration. This type of AI only exists at scale; it cannot exist or improve without more and more data. But it is far from clear that the current industry giants are truly accelerating AI. On the contrary, the enclosure of embeddings as intellectual property is—in the long run—likely to slow down the development of AI. Confining the development of models to an oligopoly of companies will, most likely, reduce the amount of effort put into the models and undermine possible innovation on other models, given that insights gained within OpenAI and Google will be unavailable to the wider scientific community as well as various hobby enthusiasts, small start-ups, etc. This type of competition is accelerationist only to the extent that it forces challengers to develop ad hoc modeling strategies—as happened with the models developed by the Chinese company DeepSeek, which in early 2025 suddenly leapfrogged ahead of closed-source models on many important AI leaderboards.37 At the same time, various “safety” restrictions placed on models—while often well-intended—create a further layer of opacity around model design, which is unlikely to facilitate true experimentation with the centralized models.38 In this sense, the real accelerators in AI might be found among developers of open-source and open-weight models. In addition to DeepSeek, a prime mover in this space has been the social media giant Meta, the latter a latecomer to the AI game and the former a geopolitical underdog due to American export restrictions on GPU chips. Meta’s open-weight Llama model, its derivative models of various sizes, and truly open-source models like DeepSeek, Qwen, and (to a lesser extent) Mixtral already rival the flagship models of OpenAI and Google on a number of leaderboards.39 Even more important have been start-ups that have focused less on models and more on ecosystems for sharing models. In this space, the primary actor is Hugging Face, which has developed an important platform for sharing not only models but also training data and AI apps.40 Other noteworthy initiatives include open-source tools for running LLMs locally (e.g., Ollama), and at a more fundamental level, open-source programming frameworks for fitting AI models (e.g., PyTorch).
![](https://images.e-flux-systems.com/tsmc_semiconductor_chip_inspection_1_copy.jpg,1600)
A silicon wafer being inspected at a TSMC semiconductor fabrication plant. Image: TSMC.
This accelerationism of open-weight and open-source models somewhat inevitably functions as a material practice to develop AI in and through a multiplicity of worlds. Within this paradigm, many models are being developed and deployed, and anyone can refine and fine-tune them, but no single model acts as the arbiter of some final “truth.” Open development also involves producing models of different sizes, more and more of which run locally on a laptop or smartphone, “on the ground” rather than out in the cloud. This is where, in the domain of AI, accelerationism converges with tunnel politics: instead of one model to rule them all, it provides a million models ruled, if not by all, at least by many. Developing different strategies for model training, models of various sizes and for different purposes, and curating datasets of various sizes and scopes—all of this accelerates abstraction, while also creating the conditions for tunneling AI. The most obvious example is the local deployment of smaller AI models: when models can be used on a local device without a network connection to a model running on a big server, the Laplacean aspirations of the AI demon are undercut. At the same time, the world (of data) is fragmented. Data can be collected and interpreted entirely locally. Is this brand of AI accelerationism, then, also a movement towards fragmentation? In this dawning world, are we on the brink of losing ourselves in completely local abstractions, each in our own burrow?41
VI.
Probably, and hopefully, not—at least not without reservations. For their part, Google and OpenAI are growing a system of local models (e.g., the GPTs), with the primary aim of expanding the power, influence, and capacity of their centralized models. When you use a specialized GPT on the OpenAI server, you are still contributing to the accumulation of potential training data for that company. There are of course instances where this sort of continuous relationship between local users and a central node seems necessary to guarantee a particular type of service. In these cases, the central server is key for aggregating and disseminating local knowledge. For example, if Google Maps predicted traffic entirely locally, we would have no data on congestion and similar phenomena that can only be modeled and monitored through data sharing. Most urban drivers, now used to real-time congestion updates, would never opt for such an app. If we look beyond AI as a chatbot and instead consider its myriad backend applications, open-source and open-weight models are not enough. AI is still likely to accelerate not just abstraction, but also centralization. The burrows offered by local LLMs will not save us from AI as panopticon.
So how do we resist the demonology of those who wish to bring about various Laplacean creatures, if neither fully rejecting nor fragmenting the technology is an option? In some ways, borrowing terms from Marxist political economy, we might say that an AI running on a server represents the total tyranny of the center over the periphery, while a fully individualized AI represents the total detachment of the periphery from the center.42 The challenge, then, is not only to resist the dominance of the center but to overcome this antinomy entirely. Currently, the center is rapidly consolidating its dominance, despite challenges to incumbents in frontier LLM modeling. Since state-of-the-art embeddings are generated from volumes of data that are inconceivably large, and require a similarly stupefying number of graphics processing units (GPUs) to train, our intellectual commons are being enclosed at breakneck speed.43 Meanwhile, other developments are further speeding up this process. Elon Musk’s Starlink will further centralize the infrastructure of the internet and, by extension, client-server-based AI. Starlink points towards a future where only those who can reach the stars can access the cloud.
![](https://images.e-flux-systems.com/android_clusters_1_copy.png,1600)
The figure shows embeddings for every sentence in the script for the 1982 film Blade Runner as well as the Philip K. Dick novel on which it is based, Do Androids Dream of Electric Sheep? Embeddings from the book and film are shaped like circles and diamonds, respectively. The embeddings were produced using the OpenAI API. The labels and colors demarcate embeddings that are clustered, i.e., more similar to each other. Different sentences spoken by the same character in the book and the movie will be in the same cluster. For instance, in the film Rick Deckard says “I knew the lingo, every good cop did,” while in the book he laments that “The Soviet police can’t do any more than we can.” These sentences, neither of which mention Rick Deckard by name, are both in the purple “deckard” cluster. Why? Because the LLM has an internal representation of these sentences as something Rick Deckard says. Embeddings work at this level of abstraction. Figure by author.
In the early 2000s, the French collective Tiqqun suggested that cybernetic capitalism needed to be countered through “zones of offensive opacity.”44 Rather than complete detachment, such a zone would allow information to flow in, while keeping the zone itself hidden from the eyes of the master. Like extreme shades of black—e.g., Vantablack, Singularity Black, Black 3.0—this offensive opacity absorbs a lot of light but reflects very little. Guided by this idea, we might ask ourselves how we can take from AI without giving much of ourselves to it. How can we learn from LLMs without surrendering our own multiple worlds to just one or a few tech giants?
In the realm of digital technologies, the domain of encryption is where the idea of opacity has so far been most readily embraced, in stark contrast to the obsession with transparency in both AI ethics and fields like “critical algorithm studies.” There are two particularly interesting paths to explore here. The first concerns decentralized training of AI models, so-called “federated learning.” Through this approach, local models can share parameters with each other without sharing training data. The second concerns new encryption models to completely or partially protect user data from the server under the client-server model. These encryption frameworks would enable access to centralized services without revealing details (differential privacy) or anything at all (homomorphic encryption) about the client.45 These tools thus allow us to request information from a server without revealing to the server what we are asking for, or even what it should return to us. Let us consider these two approaches in unison through an example: I might use a mapping app locally with a federated congestion model. My model is small and fits on my device. Instead of providing a centralized server with exact updates about unusual changes in the route and speed of my vehicle (potentially caused by congestion), I provide differentially encrypted updates on my congestion model to a limited pool of trusted users. I share the parameters of my model (which would probably include some embeddings) rather than the exact data. Moreover, using differential privacy I do not even share the exact parameters, but add some “noise” to them in order to make it harder to reverse engineer the training data, i.e., my driving routes. Alternatively, I might use a centralized server with a homomorphic encryption model. The model never receives my data as such, just an encrypted version of it. Moreover, when it serves me congestion advice, it does not know which area I am asking advice for, nor that I am asking for congestion advice at all.
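A highly simplified sketch of these two ideas in combination might look as follows: each peer updates a small model on data that never leaves its device, adds noise to the resulting parameters before sharing them (a crude, Laplace-mechanism-style stand-in for differential privacy; real systems add clipping, privacy accounting, and secure aggregation), and the noised updates are then averaged. Every function name and number here is hypothetical:

```python
# A toy sketch of federated sharing of noised model parameters. This is an
# illustration of the idea, not a deployable privacy mechanism.
import numpy as np

def local_update(model_params: np.ndarray, local_data: np.ndarray) -> np.ndarray:
    """Pretend training step on data that never leaves the device."""
    gradient = local_data.mean(axis=0) - model_params   # stand-in for a real gradient
    return model_params + 0.1 * gradient

def privatize(params: np.ndarray, scale: float = 0.05) -> np.ndarray:
    """Add noise so the shared parameters do not expose the exact local data."""
    return params + np.random.laplace(loc=0.0, scale=scale, size=params.shape)

# Each peer trains locally on its own (here: synthetic) driving data ...
peers = [np.random.rand(20, 4) for _ in range(5)]
shared = []
for data in peers:
    params = local_update(np.zeros(4), data)
    shared.append(privatize(params))        # only noised parameters leave the device

# ... and a coordinator (or the peers themselves) averages the noised updates.
federated_params = np.mean(shared, axis=0)
print(federated_params)
```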
While seeking political solutions through a technological “fix” is always somewhat questionable, these new technologies nonetheless seem to open certain new political horizons. They invite us to develop an “art of distances” to harness artificial intelligence and the data structures it creates, without letting any centralized gaze make complete sense of us in the process. Zones of offensive opacity are not individual bubbles but shared secrets and collective privacies. They may be based on technical encryption (i.e., cryptography) or on shared “truths” that do not translate into the language of centralized power; this might be something innocuous like the inside joke of a friend group, or something broader, like a local dialect and the modes of expression it affords.46 Zones of offensive opacity use the center but make themselves only selectively accessible to it, sometimes opting to restrict their sharing to other peripheries and not relying on the center at all, as in the hypothetical example of federated congestion modeling above. Certain subcultures, criminals, ethnic communities living in the shadow of mainstream culture, hackers, and similar groups understand opacity better than others, even if they do not always perceive it as offensive. Yet the goal cannot be opacity as a marginal phenomenon but as a general norm. Opacity is only truly “offensive” if it belongs to everyone. Offensive opacity moves beyond individual privacy, towards a more collective infrastructure of burrowing and tunneling.
At this point, it is important to note that opacity is already operational in AI, something I have hinted at but not spelled out. This is the opacity from above, largely a result of combining powers of abstraction with powers of centralization. In this sense, opacity can only be weaponized for popular endeavors if it is coupled with decentralization. While open-source and open-weight models are not sufficient to break the current centralization of the new powers of abstraction afforded by the embedding, they are an essential part of it. In the LLM era, there will be no true and widespread opacity from below without them.
So, what qualms should we have regarding AI? It is not that it quantifies the world as embeddings (which it does), or that it understands the world as a set of functions (which it also does), or even that it looks for answers in statistical structures and feedback loops (yes, it does this too). Instead, the danger of AI is that it is a movement, through these aforementioned means, towards explaining everything in terms of one or very few worlds, and towards capturing all means of translation in one or a few models. AI is plagued by a tendency to translate all languages into a single language, which will inevitably be the language of power. It becomes a black box that rules over us without giving us anything but nominal control over our own lives, entrenching a Laplacean hubris in material infrastructures. As more and more of the content on the internet is expressed in this language, the world will literally become a smaller place.47
To challenge this development, I do not think we should balk at forming a fairly close relationship with AI, as long as it is in the spirit of mastering it without surrendering ourselves to it. This requires a tunneling practice. If centralized AI makes the world a smaller place, tunneling AI will make it larger and more fragmented. In terms of a project for an opacity from below, tunnels are the structures that afford us the condition of opacity. They increase “the surface of a body while decreasing its volume,” providing space for new territories and modes of being.48 They have limited access points but often lead to a sprawling and decentralized network. To maintain, build, expand, and protect tunnels, one can operate on many planes—from building and adapting new tools that destitute the centralized AI platforms, to nudging national and philanthropic investment programs towards directing funds into the techniques of “private” and decentralized AI outlined above.
By themselves, tunnels are neither “good” nor “bad.” However, if a tunnel network is wide and dense, it is hard for any central actor to grasp and control. It is in this sense that tunnels afford us a political metaphor and some clear material tools that should be prioritized and developed at this pivotal moment. To accomplish this type of project, we need a new kind of technological literacy, along with new kinds of experiments around data, encryption, and modeling. To put it as concretely as possible: We need small models that are both independently trained and “distilled” from larger models. We need to extend state-of-the-art encryption techniques towards more private AI. We need to harness federated learning for bypassing client-server AI solutions. And, last but not least, we need all of these approaches to be combined with good product design and user-friendly interfaces.
On a more principled level, the guiding questions should be: How can opacity belong to everyone?49 How can AI serve many worlds, instead of one central world? Tunnels offer one possible metaphor for establishing opacity as a new universal right and capacity. Through this universal right, the world could, paradoxically, again disappear into fragments that no one claims to know from a single universal perspective.
For a copy of McKeown’s lecture slides, see →.
For details on this example, see Jeffrey Pennington, Richard Socher, and Christopher Manning, “Glove: Global Vectors for Word Representation,” Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014).
Although even the French theory of Foucault and Derrida was deeply influenced by cybernetics. See Bernard Dionysius Geoghegan, Code: From Information Theory to French Theory (Duke University Press, 2023).
Jimena Canales, Bedeviled: A Shadow History of Demons in Science (Princeton University Press, 2020).
Here I’m reading Laplace through the magnificent history of demonology in science by Canales in Bedeviled.
Donna Haraway, “Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective,” Feminist Studies 14, no. 3 (1988): 575.
Ulrich Beck, Risk Society: Towards a New Modernity (Sage, 1992).
Karl Marx, Grundrisse, 1858 →.
Harry Davies, Bethan McKernan, and Dan Sabbagh, “‘The Gospel’: How Israel Uses AI to Select Bombing Targets in Gaza,” The Guardian, December 1, 2023 →; Kim Lyons, “Use of Clearview AI Facial Recognition Tech Spiked as Law Enforcement Seeks to Identify Capitol Mob,” The Verge, January 10, 2021 →.
Allie Funk, Adrian Shahbaz, and Kian Vesteinsson, “AI Chatbots Are Learning to Spout Authoritarian Propaganda,” Wired, October 4, 2023 →.
Leonie Cater and Melissa Heikkilä, “Your Boss Is Watching: How AI-Powered Surveillance Rules the Workplace,” Politico, May 27, 2021 →.
Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 4th ed. (Pearson, 2020) →.
The two articles in which Shannon introduced this idea have been compiled as a book with an accessible introduction by Warren Weaver: Claude E. Shannon and Warren Weaver, The Mathematical Theory of Communication (University of Illinois Press, 1963).
“For any logical expression satisfying certain conditions, one can find a net behaving in the fashion it describes.” Warren McCulloch and Walter Pitts, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” Bulletin of Mathematical Biophysics 5, no. 4 (1943); Frank Rosenblatt, “The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain,” Psychological Review 65, no. 6 (1958): 386.
Norbert Wiener, Cybernetics: Or Control and Communication in the Animal and the Machine (MIT Press, 1948); Arturo Rosenblueth, Norbert Wiener, and Julian Bigelow, “Behavior, Purpose and Teleology,” Philosophy of Science 10, no. 1 (1943).
For a fantastic introduction to cybernetics, see Ronald Kline, The Cybernetics Moment (Johns Hopkins University Press, 2015).
“Standard multilayer feedforward networks with as few as one hidden layer using arbitrary squashing functions are capable of approximating any Borel measurable function from one finite dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available.” Kurt Hornik, Maxwell Stinchcombe, and Halbert White, “Multilayer Feedforward Networks Are Universal Approximators,” Neural Networks 2, no. 5 (1989).
Niklas Luhmann, Social Systems, trans. John Bednarz, Jr. with Dirk Baecker (Stanford University Press, 1996).
Jakob von Uexküll, A Foray into the Worlds of Animals and Humans, trans. Joseph D. O’Neil (University of Minnesota Press, 2010).
Emily M. Bender and Alexander Koller, “Climbing Towards NLU: On Meaning, Form, and Understanding in the Age of Data,” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020).
Michel Callon, “Some Elements of a Sociology of Translation: Domestication of the Scallops and the Fishermen of St. Brieuc Bay,” Sociological Review 32, no. 1 (1984).
Michael J. Reddy, “The Conduit Metaphor,” in Metaphor and Thought, 2nd ed., ed. Andrew Ortony (Cambridge University Press, 1993).
Anna Lowenhaupt Tsing, The Mushroom at the End of the World: On the Possibility of Life in Capitalist Ruins (Princeton University Press, 2015).
Matteo Pasquinelli, The Eye of the Master: A Social History of Artificial Intelligence (Verso, 2023).
The nature of the Pirate Bay changed a lot as a result of the trial, and many make a sharp distinction between the site pre- and post-trial.
Maria Eriksson et al., Spotify Teardown: Inside the Black Box of Streaming Music (MIT Press, 2019).
Most of the blogs that drove this debate have since ceased to exist, but the relevant posts can be found through Archive.org. These include historian and Piratbyrån spokesperson Rasmus Fleischer, “Pirate Politics: From Accelerationism to Escalationism?,” Copyriot (blog), January 13, 2010 →; and philosopher of science and hacktivist Christopher Kullenberg, “Tunneled Cipherspace / Burrow Internet,” Intensifier (blog), January 21, 2010 →.
This history has been recounted by Rasmus Fleischer, “Vastavallankumous,” in Verkko suljettu: Internet ja avoimuuden rajat, ed. Mikael Brunila and Kimmo Kallio (Into Kustannus, 2014).
Christopher Kullenberg, “FRAktivism – Några möjliga vägar,” Intensifier (blog), January 9, 2010 →. Translation by the author.
For one history of the relationship between accelerationism and tunnel politics, see Mikael Brunila, “Runsauden räjähdys: Internet hakutalouden jälkeen,” in Verkko suljettu.
Gabriella Coleman, “How Has the Fight for Anonymity and Privacy Advanced Since Snowden’s Whistle-Blowing?,” Media, Culture & Society 41, no. 4 (2019).
However, one would be well-advised not to trust any of the services provided by Meta. See for instance Jim Salter, “WhatsApp ‘End-to-End Encrypted’ Messages Aren’t That Private After All,” Ars Technica, September 8, 2021 →.
See →.
Using terminology from Gilles Deleuze, we might say that language becomes content for the expression of embeddings. Following linguist Zellig Harris, doctoral advisor to Noam Chomsky, we might further say that embeddings thus take on the appearance of a meta-language, which assigns meta-meanings to human language and other structures of meaning. Gilles Deleuze, Foucault (Continuum, 1999); Zellig Harris, Language and Information (Columbia University Press, 1988). See also Mikael Brunila and Jack LaViolette, “What Company Do Words Keep? Revisiting the Distributional Semantics of J. R. Firth & Zellig Harris,” Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2022) →.
Karl Marx, Capital, vol. 1 (1867) →.
“Language modelling is compression,” as researchers at Google and other top AI institutes recently wrote. See Grégoire Delétang et al., “Language Modeling Is Compression,” arXiv, September 19, 2023 →.
DeepSeek-AI et al., “DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning,” arXiv, January 22, 2025 →.
Lingjiao Chen, Matei Zaharia, and James Zou, “How Is ChatGPT’s Behavior Changing Over Time?,” arXiv, July 18, 2023 →.
See →.
Eli Pariser, The Filter Bubble: How the New Personalized Web Is Changing What We Read and How We Think (Penguin, 2011).
Neil Smith, Uneven Development: Nature, Capital, and the Production of Space (University of Georgia Press, 2008).
While DeepSeek trained its model with a much smaller number of GPUs, it still had access to a very large cluster of them. The GPU wars are by no means over.
Tiqqun, The Cybernetic Hypothesis (Semiotext(e), 2020).
For one survey of local differential privacy, see Mengmeng Yang et al., “Local Differential Privacy and Its Applications: A Comprehensive Survey,” Computer Standards & Interfaces, no. 89 (April 2024). For a particularly compelling example of homomorphic encryption, see Samir Jordan Menon and David J. Wu, “Spiral: Fast, High-Rate Single-Server PIR via FHE Composition,” Cryptology ePrint Archive (2022) →.
For a comment on this from the tunnel discussions in the Swedish blogosphere, see Magnus Eriksson, “Pop Culture and Tunnels,” Blay (blog), January 19, 2010 →.
Models that are trained on outputs from other models tend to generate content around an increasingly narrow statistical norm. Ilia Shumailov et al., “The Curse of Recursion: Training on Generated Data Makes Models Forget,” arXiv, May 27, 2023 →.
Fleischer, “Pirate Politics.”
On the “right to opacity” see Édouard Glissant, Poetics of Relation, trans. Betsy Wing (University of Michigan Press, 2010), 189.