at the factories of artificial intelligence

We have every right to sleep
or maybe not?
think about it if sleep belonged to the state
things would be difficult
would we buy sleep with a voucher? or freely? and how much?
and if it had been conceded to private initiative? what
is its price today?
would we die from insomnia without money?
those suffering from insomnia a bit luckier
— isn’t that so?
Think about it if sleep didn’t belong to us.

N. Karouzos

You decide that it’s finally time to get a firsthand taste of the “impressive” capabilities of artificial intelligence and so-called large language models. So you sit down at your computer, go to the ChatGPT1 page, create an account, and within five minutes you’re facing the “speaking” machine. You overcome your initial awkwardness and enter the discussion. You start with some simple questions (“What’s your name? What exactly are you? How many siblings do you have?”) and gradually move on to more complex and controversial topics (“Was Céline a fascist? Do machines produce value? Is literary neoformalism a reactionary trend?”). At no point does the machine give the impression of struggling to express itself. Its answers may seem very rounded and “trained,” but at the same time they are uncannily similar to a piece of normal conversation.

Before you hit X to close the browser and go to sleep peacefully, with a new discussion topic for your colleagues at work tomorrow morning, the idea crosses your mind to test the engine on topics related to your job. “What is the etymology of the word χάδι?”2, “How can I design a rectifier from alternating to direct current?”, “Write me a script with hopeless loves, substance abuse, and cursed poets who have memorized Rimbaud.”, “Create a program in Python language that calculates the roots of a second-degree polynomial.” Big mistake. The engine’s answers appear not only correct but also analytical and thorough. Your sleep won’t be so peaceful after all, and tomorrow’s discussion with your colleagues will revolve around estimates of when you’ll lose your jobs and be replaced by speaking machines.

You will read more or less the same concerns about super-intelligent machines that will eliminate one form of work after another in your favorite newspaper too (God forbid, don’t even open a book); you’ll see them recycled on social networks and hear them repeated on radio, television and podcasts. After the era of “immaterial” capitalism, it seems that the era of fully automated capitalism has also arrived. The basic social and productive functions will pass into the hands of machines that operate with minimal supervision, as long as they are fed, like obedient pets, with sufficient electrical energy. Just as the rhetoric about “immaterial” capitalism contained heavy doses of ideology some years ago, so too does this rhetoric about “automated” capitalism. Behind the curtain of the magician who conjures intelligent answers onto your screen from thin air lies a lot of matter, a lot of energy and a lot of labor. Dirty matter, environmentally harmful energy and poorly paid human labor at every stage of “producing” an intelligent answer from a speaking machine. The fact that all this matter, energy and labor are not arranged at a specific point in space and time, in the form of a production line in the usual sense of the term, does not mean that they have been dematerialized. They have simply been (deliberately) displaced beyond the usual scope of your gaze.

Together with artificial intelligence, another term that has become popular in recent years is that of cloud computing (or even fog computing), which refers to the ability to run many demanding computational tasks not on the desktop computer that everyone has, but “out there,” on remote servers. In this way, the illusion becomes complete. The ascension of artificial intelligence into the skies and computational clouds comes to latch onto the rhetoric of full automation in order to construct the myth of the light, airy, and inspired Computational Spirit. However, every time you press enter to send your question to a speaking machine, it doesn’t go to any cloud. On the contrary, it must travel through kilometers of cables and optical fibers, pass through hundreds of switches, modulators/demodulators, and decoders before finally arriving at colossal data centers (in which, before entering, you might even need ear protection to avoid injury from the continuous noise they produce) and to some of their processors, which will compute the answer. For these data centers to operate, they in turn require massive dams, many wind turbines, and entire fleets of gas-powered ships to produce the required amounts of electrical energy. Incidentally, it’s not a bad idea to have a river or two nearby to cool the micro-machines of processors, graphics cards, and hard drives. Naturally, the software running on these machines doesn’t spontaneously sprout within them. Hundreds of engineers are required to write and maintain it. And thousands of micro-contractors in some African country who spend endless hours in front of a screen cleaning and categorizing the data with which machine learning algorithms must be trained. This is a process that doesn’t occur just once, but must be repeated periodically so that the algorithm remains at a respectable level.

History has a way of mocking people. Once, in the late 18th century, the Mechanical Turk of Wolfgang von Kempelen (which had impressed Benjamin so much) toured Europe, defeating even strong chess players. It was a mechanical automaton, in the form of a Turk, which its inventor claimed could play chess autonomously. Many tried to compete against it and left defeated, bitterly confirming its high capabilities. Until it was finally revealed that the Turk was constructed in such a way that a small (but intelligent) person could fit inside and move its hands and the pieces on the chessboard. Artificial intelligence today seems indifferent to chess (which it has, nevertheless, managed to automate). However, this doesn’t make it any less similar to the Mechanical Turk. Its ambitions remain the same as those of some of the mechanical automatons of the 18th and 19th centuries: the mechanization of human thought. And if this is not (directly) feasible, it doesn’t hesitate to set up electronic Turks behind which entire armies of workers are hidden.

The most material part of the electronic Turk consists naturally of… electronic components: the so-called VLSI (very large scale integration) circuits, named precisely because billions of microscopic switches – transistors – can fit within a few square millimeters of silicon. Processors, network cards, graphics cards (now extensively used by artificial intelligence applications), all these components are such electronic circuits. A single processor, along with network and graphics cards and a hard drive, is more than sufficient to meet the needs of a home user, possibly excepting some special cases with specific requirements (e.g., users who need to perform systematic video processing). Many previous-generation artificial intelligence models also had the capability to run on such modest computers. Latest-generation models, however, such as the famous ChatGPT, require entire “farms” of processors (along with their peripheral memories) in order to assimilate and process enormous volumes of data3. Without this data mass, their performance is far less impressive. Beyond the data, even these models themselves consist of billions of parameters, which prevents any “home” use (although in some cases compressed versions with fewer parameters are provided). The only way for these models to operate is to distribute them across multiple processors so that many of their individual parts can run in parallel. Data centers are the infrastructures that house multiple processors, enabling them to work on large-scale problems.
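
To get a feel for why billions of parameters rule out “home” use, here is a rough back-of-the-envelope sketch in Python. The parameter counts and the 16 GB of memory on a home graphics card are illustrative assumptions, not figures for any specific model; the only fixed fact is that a 16-bit floating-point parameter occupies 2 bytes. The estimate covers only the storage of the parameters themselves and ignores everything else a model needs while running, so it is a lower bound.

```python
import math

BYTES_PER_PARAM_FP16 = 2  # a 16-bit floating-point number occupies 2 bytes


def weights_size_gb(num_params, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Memory needed just to hold the parameters, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def devices_needed(num_params, device_memory_gb):
    """Minimum number of devices across which the parameters must be split."""
    return math.ceil(weights_size_gb(num_params) / device_memory_gb)


# Hypothetical model sizes and a hypothetical home graphics card (assumptions).
HYPOTHETICAL_MODELS = {
    "compressed model": 7_000_000_000,
    "large model": 175_000_000_000,
}
HOME_GPU_MEMORY_GB = 16.0

for name, num_params in HYPOTHETICAL_MODELS.items():
    size = weights_size_gb(num_params)
    count = devices_needed(num_params, HOME_GPU_MEMORY_GB)
    print(f"{name}: {size:.0f} GB of parameters, at least {count} device(s)")
```

Under these assumptions the smaller model just fits on a single card, while the larger one has to be split across more than twenty, which is the kind of distribution that data centers exist to provide.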

Such data centers are not small or lightweight at all. Some of them may host millions of processors and occupy areas of thousands of square meters. In other words, they constitute a form of… factory. It is estimated that approximately 80% of the expenses of artificial intelligence companies are directed toward building and maintaining (or renting) the necessary computing infrastructures for training and operating their models. It is evident that very few companies have the capital to build or purchase their own data centers. For most, the solution is to rent specific computing resources with specific specifications and for specific periods. The major players in the field of artificial intelligence are those who not only have the most advanced models and the richest data but also the massive infrastructures to support these models and data.

It is not, of course, only the size of these infrastructures that makes them factories. Their energy requirements are also of industrial scale. Based on calculations by the International Energy Agency, it is estimated that data centers consumed 2-3 percent of the total energy produced globally in 2022. A country like Ireland, where many American companies from Silicon Valley maintain subsidiaries, would need to channel 27% of its electricity production towards AI factories by 2028 if the current trend of building and expanding data centers continues. If a company decided to build an AI Google, where users would pose their queries to a language model in the same way and with the same frequency as they now do to a conventional search engine, then the total electricity production of Ireland would barely suffice to power such a next-generation machine. A large share (around 40%) of the energy consumed by data centers does not go towards the actual operation of their processors, but towards cooling them with air conditioning and water. A large data center may require over 10 million liters of water daily, almost as much as an (American) city of 50,000 inhabitants.

The above numbers refer only to the operation of data centers, without taking into account at all the requirements of their construction, from the mining of the necessary rare earths to the work of data annotators who feed the artificial intelligence models with training data. Even so, however, they indicate what the mythology around artificial intelligence tends to conceal. The fact that data centers do not have smokestacks (and annotators do not walk around with soot on their faces and grease under their fingernails) does not dematerialize them or absolve them of their factory-like burden. Every time you press enter to send a question to ChatGPT, you set in motion an entire just-in-time assembly chain of answers, even if it is not visible to you.

A view widely held in artificial intelligence circles predicts that progress in the related technologies will inevitably reduce the energy cost of large (language or other) models4, somewhat as processors have gradually become more efficient. However, the fact that modern processors consume less power per transistor does not mean they consume less overall. A 2008 Intel Core i7 processor draws around 80 watts and contains approximately 731,000,000 transistors, meaning it requires about 110 nanowatts per transistor. The Intel 4004 from 1971 contained only 2,250 transistors and drew half a watt, or roughly 200 microwatts per transistor, which is 200,000 nanowatts per transistor. A modern processor can indeed be around 2,000 times more efficient (per transistor) than a 1970s predecessor, yet it still draws 160 times more power overall. Therefore, increasing the efficiency of a system does not necessarily imply a reduction in its total energy demands. As demand for the services it provides increases, so will its energy requirements. Just as building new roads and adding more lanes to existing ones simply brings more cars onto the road unless public transportation alternatives are substantially promoted, so improving data center efficiency will merely incentivize the construction of even more of them. In any case, there does not seem to be any realistic prospect of a future in which artificial intelligence models no longer reside in noisy factories but float in the clouds like ethereal melodies.
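
The per-transistor comparison can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the input figures quoted in the paragraph (0.5 watts and 2,250 transistors for the Intel 4004, 80 watts and 731 million transistors for the Core i7); the function name is ours, and the results are orders of magnitude rather than precise measurements.

```python
def nanowatts_per_transistor(power_watts, transistors):
    """Average power drawn per transistor, in nanowatts (1 W = 10^9 nW)."""
    return power_watts / transistors * 1e9


# Input figures as quoted in the text.
I4004_WATTS, I4004_TRANSISTORS = 0.5, 2_250
I7_WATTS, I7_TRANSISTORS = 80, 731_000_000

i4004 = nanowatts_per_transistor(I4004_WATTS, I4004_TRANSISTORS)
i7 = nanowatts_per_transistor(I7_WATTS, I7_TRANSISTORS)

efficiency_gain = i4004 / i7                # per-transistor improvement
total_power_ratio = I7_WATTS / I4004_WATTS  # growth in total power draw

print(f"Intel 4004: {i4004:,.0f} nW per transistor")
print(f"Core i7:    {i7:,.0f} nW per transistor")
print(f"Per-transistor efficiency gain: about {efficiency_gain:,.0f}x")
print(f"Total power draw: {total_power_ratio:,.0f}x higher")
```

The two ratios point in opposite directions: each transistor needs thousands of times less power, while the chip as a whole draws far more, because the number of transistors grew faster than their efficiency.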


The other component that AI factories need to operate, apart from matter and energy, is of course human labor. There is certainly labor channeled into the construction and maintenance of data centers, but this constitutes only a small percentage of the total labor required. Contrary to what every opportunistic politician announces when a company decides to open a data center in the protectorate he administers, no flood of new jobs is created. Once constructed, data centers require maintenance (or occasional upgrades), which can be handled by a small team of technicians and engineers.

Clearly more prominent and “laureled” are the groups of engineers and programmers who develop artificial intelligence models at the software level. As we have already mentioned, these are mainly models based on neural networks. Although the basic operation of neural networks is well studied and there are several ready-made “libraries” that handle their implementation, the exact architecture of such a network for a specific purpose still constitutes a fundamental part of an engineer’s “creative” work. Among other things, the architecture of a neural network also determines the volume of data it can accept during its training, a factor absolutely critical to its performance. Therefore, the rise of artificial intelligence in recent years is due not so much to some genuinely groundbreaking idea (neural networks have been known for decades, already since the 1950s), but rather to the “brute force” work of engineers who have discovered ways to construct massive neural networks without them collapsing. These particular engineers constitute something like a caste of brahmins within artificial intelligence circles: they are surrounded by an aura, promoted in the media, paid handsomely, and their hands always remain clean.

Despite the recognition that engineers’ work enjoys, its subject (the development and management of models) is by no means the most demanding in terms of required work within the construction cycle of an artificial intelligence system. Beyond model management, the other equally (if not more) important aspect of an artificial intelligence model concerns the data with which it can and must be trained. Big Tech advertisements may present their electronic Turks as capable of handling almost anything that comes their way. In reality, data proves considerably more stubborn in practice and requires multiple stages of processing before the Turk is allowed to touch it. It is estimated, in fact, that 80% of a model’s total training time is devoted to this “pre-processing” of data. Even data collection itself is often far from simple. Considerable (human) labor is needed to design and install data collection sensors, decide on the form and sampling frequency they will have, and design the database that will host them. Alternatively, there is also the solution of producing synthetic data. In this case, an additional layer of code must intervene (hence, be designed and implemented) to generate the data. Real, “raw” data, as collected from sensors or any other source (e.g., the internet itself), is almost never used as-is for model training. First, it must be cleaned of any “dirt” and noise and brought into a form suitable for a model. It may need to be enriched with additional information or combined with data from other sources (e.g., RADAR data with satellite imagery data) for the model to achieve decent performance. One of the most fundamental stages, however, involves the so-called “annotation” (data annotation/labeling) of data, which is mainly undertaken by people in places far removed from where these insatiable models are initially developed. By annotation, we essentially mean the categorization of data based on certain machine efficiency criteria. 
If, for example, the goal is autonomous driving, then images from the vehicle’s cameras must be annotated according to whether they contain a STOP sign, whether they contain pedestrian crossing lines, etc. This way, a model is guided during its training toward the desired goal.
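
What an annotated dataset of this kind might look like can be sketched in Python. Everything here is hypothetical: the file names, the label vocabulary (`stop_sign`, `pedestrian_crossing`) and the `AnnotatedImage` structure are illustrative assumptions, not the format of any real annotation platform. The point is only that each label attached to an image is a decision made by a person.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AnnotatedImage:
    """One camera frame plus the labels a human annotator attached to it."""
    filename: str                             # hypothetical file name
    labels: set = field(default_factory=set)  # e.g. {"stop_sign"}


def label_counts(dataset):
    """Count how many images carry each label."""
    counts = Counter()
    for image in dataset:
        counts.update(image.labels)
    return counts


def split_by_label(dataset, label):
    """Separate positive examples (label present) from negative ones."""
    positives = [img for img in dataset if label in img.labels]
    negatives = [img for img in dataset if label not in img.labels]
    return positives, negatives


# A tiny hand-labeled dataset (invented for illustration).
dataset = [
    AnnotatedImage("frame_0001.jpg", {"stop_sign"}),
    AnnotatedImage("frame_0002.jpg", {"pedestrian_crossing"}),
    AnnotatedImage("frame_0003.jpg", {"stop_sign", "pedestrian_crossing"}),
    AnnotatedImage("frame_0004.jpg"),  # nothing relevant in view
]

positives, negatives = split_by_label(dataset, "stop_sign")
print(label_counts(dataset))
print(len(positives), "positive and", len(negatives), "negative examples for stop_sign")
```

A real training set would contain such records by the million, which is where the labor discussed in this section comes in.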

A small problem that arises in this whole process is that the volume of annotated data required to train a neural network adequately is not small at all. The data requirements of neural networks were no secret. Only gradually, however, was the scale of these requirements understood. The story of ImageNet, a dataset of images widely used among computer vision researchers, is well known. Fei-Fei Li, a researcher at Princeton University, is considered the mother of ImageNet. Her “ingenious” idea did not concern any algorithm or new architecture, but rather the suspicion that image recognition neural networks could be dramatically improved if they were simply fed more (annotated) data. For the scale Fei-Fei Li had in mind, however, the cost of annotation would have been prohibitive had she followed the beaten path. Instead, she turned to Amazon’s Mechanical Turk (irony?), a platform (cheap for employers) for managing and distributing micro-tasks of this kind that can be done remotely. ImageNet is now considered something of a benchmark in the field of computer vision. OpenAI had similar adventures with its own famous ChatGPT. Theoretically, systems like ChatGPT do not necessarily require human annotation, since they are trained “simply” to predict the continuation of a text. And there is no shortage of texts on the internet. In practice, however, annotation turned out to be necessary even there. Without it, the free-range ChatGPT tended to reproduce whatever it had “consumed” in its wanderings on the internet. Among other things, even sexist or fascist stereotypes. Or simply opinions and perceptions quite outside the mainstream. The solution was again human intervention: annotation, so that texts with “extreme” content could be appropriately categorized and “punished” during the system’s training. OpenAI did not enlist Amazon this time. It hired specialized companies based in Kenya, to which it sent ChatGPT’s training data, asking for it to be returned with the appropriate annotation. The African workers who labored to protect our eyes from “extreme” and “inappropriate” responses were paid less than $2 per hour.
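
The idea of training a system “simply” to predict the continuation of a text can be illustrated with a deliberately tiny Python sketch. This is a toy bigram counter, not a neural network and certainly not how ChatGPT is implemented; the corpus and the function names are invented for illustration. It makes one point concrete: such a predictor can only reproduce continuations it has already “consumed.”

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """For each word, count which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, word):
    """Return the most frequent continuation seen in training, or None."""
    candidates = model.get(word.lower())
    if not candidates:
        return None  # the word never appeared with a continuation
    return candidates.most_common(1)[0][0]


# Invented toy corpus standing in for "texts on the internet".
corpus = "the machine answers the question and the machine repeats the answer"
model = train_bigram(corpus)

print(predict_next(model, "the"))    # most frequent word after "the"
print(predict_next(model, "cloud"))  # never seen in the corpus
```

Whatever the corpus contains, stereotypes included, ends up in the counts, which is why filtering what comes out required human annotation.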

It is difficult to find official data regarding the number of “labeling workers” worldwide. Some unofficial estimates suggest millions. For this type of job, however, neither official estimates nor a vivid imagination is required to understand how repetitive and tedious it can be, especially considering that labeling workers are simply given data and instructions without being told the ultimate purpose. A century ago, workers in Ford’s automobile factories at least knew that the final result of their work would be a car. For labeling workers, even that is a luxury. What may not be immediately obvious is that these workers are often called upon to deal with texts or scenes of truly extreme content (e.g., murders or rapes), with whatever consequences this may have for their mental well-being. Even with normal content, attention must be extremely focused. A labeling worker must “think” like a machine so that eventually the machine gives the impression that it “thinks” like a human. If an image contains a STOP sign, but it’s a drawing made by a child, how should it be categorized? Should it be placed among the “negative examples,” since the machine does not inherently have the ability to understand that such a STOP sign, outside the road context, carries no meaning in terms of correct driving behavior? And if the image contains a STOP sign but it appears as a reflection in a mirror? And if the STOP sign appears at a fork but refers to a parallel road? For the machine, all these are examples it cannot distinguish based solely on visual recognition of the STOP sign. Semantic labeling work must be done by human hands and eyes that are paid crumbs and whose work is closely monitored by surveillance systems.

As in the case of the energy needs of AI models, where experts “predict” that these will decrease impressively in the coming years, so in the case of the labor behind such models, other experts “predict” that the need for human data labeling will gradually disappear. Models will be able to generate data on their own, with which they will train themselves or other models. Predictions like these resemble wishes more than anything. Neural networks are notorious for how “fragile” they are, sometimes producing completely unrelated answers when fed data significantly outside the range they saw during training. Moreover, attempts have indeed been made to train AI models using data generated by other models. The results were disappointing. Models trained this way showed a stronger tendency to produce repetitive answers. In other words, they had even less ability to “innovate” and simply repeated the same things over and over.


Does all of the above mean that the latest artificial intelligence technologies are something like a modern version of Kamatero’s water? Not exactly. However, they do debunk some myths that show particular persistence. The automation trend does exist; however, this does not mean that it will eliminate all forms of work. At every stage of building, training, and (repeatedly) retraining an artificial intelligence model, there is a human hand involved. In fact, many hands. Thousands of hands and eyes. This need for human labor is not expected to disappear anytime soon.

What changes can a new generation of artificial intelligence truly bring, changes that will eventually find widespread application and move beyond the experimental stage5? The trend towards concentration of the infrastructures on which artificial intelligence systems run is evident. The barrier to entry into artificial intelligence is rising dangerously for everyone except the major corporate players6. If current trends continue, it is not unlikely that artificial intelligence, in this version, will be transformed into something like a service that you can never own but only rent for a fee7.

As for work itself, it may not disappear, as experts “threaten.” This does not mean that it will not be transformed. A useful analogy might be the era of the transition towards Taylorism in the field of manual labor. There are quite a few “intellectual” office jobs that already include repetitive parts. If it is judged that these parts can be performed at a lower cost by a “smart” machine, then it is very likely that they will be assigned to it. It is not out of the question that the office worker (who at some point may have thought highly of himself) will be transformed into an “intellectual” mass worker, running to catch the next image with some STOP sign in it. And when we talk about “office workers,” we do not mean only clerks and cashiers, but even positions of “higher specialization.” Programmers, for example, are on the front line of this trench warfare, given that programming languages have strict syntax (which, being repetitive, helps artificial intelligence models) and that there is abundant open data available on which models can be trained to produce code8.

From this perspective, therefore, the latest generation of artificial intelligence models seems to aim more at appropriating the contents of intellectual labor through their commodification. The issue is not absolute automation, something that will probably remain a distant dream. The target lies elsewhere: in the restructuring of intellectual labor (where Westerners perhaps believe they still have an advantage) so that it fits the broader socioeconomic restructuring according to the bio-informational paradigm. How off-target would it be, then, to speak of an ongoing wave of intellectual enclosures9?

Separatrix

  1. Or, perhaps better, to that of the Chinese and more efficient DeepSeek. ↩︎
  2. From the medieval “echo,” that is, the small sound – lullaby that mothers made to put their children to sleep. ↩︎
  3. We have provided a more detailed description of large language models in a previous article. See “artificial intelligence”: large language models in the age of “intellectual” enclosures, Cyborg, vol. 26. ↩︎
  4. In the terminology of artificial intelligence, this category of models based on giant neural networks (built from transformers) is generally called foundation models. The first ones, such as ChatGPT, were language-based, as they focused on processing and producing (written) language. Now there are variations of them that handle other types of data as well. ↩︎
  5. We are always referring to this particular type of artificial intelligence model. There are also other, less impressive machine learning and artificial intelligence algorithms that are applied to more humble tasks, e.g., planning robot movements in a factory. No manager would entrust this job to a foundation model for now, unless they wanted to play Russian roulette, waiting to see when the production chain falls apart. ↩︎
  6. Here the Chinese DeepSeek opened a crack, as it is provided in open source form and does not require half a mountain of lignite to run. ↩︎
  7. One can imagine what this would mean in countries like Greece, where of course cartels are never, ever created in sectors with strong tendencies towards concentration. ↩︎
  8. Another irony of history. Open source software was supposed to democratize the programmer’s work. Now it can be used to train models that will replace him. ↩︎
  9. Resistance will surely emerge in new forms. A Kenyan content moderator lost access to his account on a relevant platform for reasons unknown to him. He didn’t give up, however. He created multiple accounts under pseudonyms and now uses ChatGPT to carry out the tasks assigned to him more quickly. Instead of tormenting himself over images and texts, he has set one artificial intelligence system to “train” the others. Our respects! ↩︎