Machine Learning as Natural Monopoly


Our modern economy has been built atop industries regulated as natural monopolies. Railroads, for example, helped to transform the country’s agrarian economy to an industrial one, opening distant markets to mass-produced goods.1 Power utilities brought electricity to homes and businesses nationwide.2 And the telephone helped to knit together far-flung communities by way of a single unified communications network.3

Now a new class of natural monopoly—one that again seems critical to our evolving modern economy—may be emerging. Machine-learning-based applications (sometimes termed artificially intelligent systems) are transforming the ways we generate, store, use, and interpret information.4 Such applications are embedded in a vast range of systems—from medicine to law enforcement to internet platforms and beyond.5 In short, machine-learning-based applications are beginning to pervade the economy, transforming operations across sectors, much as the railways, electric grids, and communications platforms once did.6

Natural monopolies are exemplary candidates for regulation.7 This is for two reasons: One, because they are monopolies, consumers are likely to face high prices, poor quality, or both. Two, because these markets tend naturally to monopoly control, competition is unlikely to address these price- and quality-related problems—indeed, attempts to introduce competition may be more harmful than helpful. So how might we address the problems that attend to inefficiently high prices or substandard quality without turning to competitors (who, in other contexts, typically aim to offer better products at better prices)? Regulation. As Judge Posner once explained, regulating natural monopolies helps avoid “wasteful” competition by trading a promise of monopoly status to one provider “in exchange [for] a commitment to provide reasonable service at reasonable rates.”8

Hence, to return to the telecommunications precedent, regulators would grant telephone providers a license to operate in exchange for public oversight.9 Such oversight has historically encompassed rate regulation (to ensure that consumers were not charged inefficiently high prices), service specification (to ensure that the monopoly service satisfied public standards), and build-out requirements (to ensure that the network served the entire community), among other rules.10 Of course, this process was not perfect. It was difficult, for example, for regulators to calculate the real costs of a telephone provider’s network, to predict the investments required for upkeep and expansion, and to assess a reasonable profit. And providers themselves had a stake in inflating their costs, exaggerating their difficulties, and currying favor with local authorities.11 But policy makers agreed that public regulation and oversight of this sort—such defects notwithstanding—were preferable to any alternative (such as unregulated monopolist control of a given market).12

Such regulation has since fallen out of favor in many of these contexts—not necessarily for any defect in this basic theory undergirding natural monopoly regulation generally, but rather because these industries have since been disrupted in ways that undermine their status as natural monopolies.13 For example, once new telecommunications competitors like MCI proved it possible to provide long-distance service wirelessly,14 in competition with AT&T’s wired networks, policy makers no longer saw that market as naturally tending toward monopoly, thereby undermining the case for regulatory control and leading to that market’s deregulation.15 The railways were likewise disrupted by technological advancements in automotives, trucking, and air travel, leading to deregulation across the transportation industries.16

These sectors have experienced waves of deregulation. But policy makers are only now beginning to consider the possibilities for addressing the problems—rooted, perhaps, in monopoly conditions—that attend to machine-learning-based applications. For example, the White House recently announced the creation of the National Artificial Intelligence Initiative Office to “set[] AI technical standards” and to develop a “risk assessment framework” for such applications, among other initiatives.17 But the possibility that machine-learning-based applications—that artificially intelligent systems—are natural monopolies suggests an even more robust range of regulatory possibilities. Just as in the early days of the railways and the telephone networks, society may be better off with regulated applications, ones with legally specified rates and service conditions. In short, the tradition of natural monopoly regulation offers policy makers the opportunity to regulate machine-learning-based applications in line with shared public values regarding privacy, accuracy, and algorithmic bias, all while granting the public the opportunity to participate in those regulatory proceedings.

In this Article, I analyze the claim, occasionally made intuitively by scholars across disciplines, that machine-learning-based applications are natural monopolies.18 Some lessons from the computer science and economics literatures may seem to confirm this contested intuition in some cases, particularly where development and data costs are high and network effects are strong.19 And lessons from the legal literature chart a regulatory path forward.20

The defining characteristic of a natural monopoly is that the benefits of scale increase over the entire relevant market, such that the average cost to serve each consumer approaches some lower limit.21 Stated differently, it is, on average, more efficient to serve a given market with only one provider rather than with many (no matter the number of providers actually in the market at any time).22 In at least some cases, this definition of natural monopolies seems to apply to machine-learning-based applications. This is for at least three reasons.

One, for some applications, the fixed costs of developing a machine-learning-based application overwhelm the marginal costs: It is far more expensive to develop the application than it is to make it available to its consumers. Of course, this is true for most software development: It is comparatively costless to deploy a software program (like Microsoft Windows or Google Sheets) once it is through development.23 But machine-learning-based applications have even higher fixed costs.24 Acquiring the specialized hardware and the volumes of data necessary to “train” the application can push fixed costs ahead of marginal costs by an even wider margin.25 Competition in such a market (among aspiring natural monopolists) may thus yield higher losses (through wasteful duplication) and fewer gains.

Two, the computational process of training a machine-learning-based application is itself another significant fixed cost.26 Machine-learning-based applications can be understood as developed in two stages—training and then prediction or classification.27 In training, such applications must surmount a steep learning curve until optimized—until the costs of further training (in terms of, say, additional data acquisition and computing power) outweigh the benefits of such training (in terms of accuracy).28 Training is computationally costly: Though training complexity (computationally speaking) varies across applications and specific machine-learning implementations, it is generally far more costly to train an application than it is to ask a trained application to respond to a query—to, say, predict a successful treatment regimen, identify a criminal suspect, or respond to an online search query.29 Hence, this vast difference between the fixed costs of training and the marginal costs of prediction suggests that the average computational cost of each prediction is likely to decrease as the application serves more users—in short, machine-learning-based applications appear computationally subadditive, a sufficient condition for natural monopoly. This is particularly true for deep learning applications that are based upon large text- or image-based datasets and that demand intense computational power, including, for example, GPT-3 (a powerful language prediction model).30 A second such application is likely to duplicate the costs and training efforts of the first, while adding little new capacity and little apparent comparative benefit.31 The second provider is largely redundant: It is akin to a series of parallel railroad tracks, or a second, overbuilt telephone network.
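This cost structure can be made concrete with a stylized calculation. Every figure below is an invented illustration, not a measurement of any real system; the point is simply that a large fixed training cost, amortized over many cheap predictions, yields declining average costs, while a second provider merely duplicates the fixed cost:

```python
# Stylized cost model for a machine-learning-based application. Every figure
# here is an invented illustration, not a measurement of any real system.

TRAINING_COST = 1_000_000.0   # fixed cost: data acquisition, hardware, compute
COST_PER_QUERY = 0.001        # marginal cost of answering one prediction query

def average_cost(num_queries: int, num_providers: int = 1) -> float:
    """Average cost per prediction when each provider trains its own model
    and the market's queries are split among them."""
    total = num_providers * TRAINING_COST + num_queries * COST_PER_QUERY
    return total / num_queries

# The average cost per prediction falls as one provider serves more queries...
assert average_cost(10_000) > average_cost(100_000) > average_cost(1_000_000)

# ...while a second provider duplicates the fixed training cost, making the
# same market strictly more expensive to serve: the cost is subadditive.
assert average_cost(1_000_000, num_providers=2) > average_cost(1_000_000)
```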

Three, some machine learning algorithms—deep learning or continual learning systems, for example—continue training even after development, using information drawn from their practical applications to further improve accuracy or performance.32 This process of retraining—a virtuous circle, to adopt the industry’s terminology—can give the first system deployed at scale an insurmountable lead.33 The reason is straightforward: The first system to market gains access to new training data—data from market consumers—before any competitor. That new information makes the first system better still (compared to its competitors)—and so that system may earn even more subscribers. Those new subscribers give that first system even more information with which to further improve—and so on. In such cases, one provider is consistently better—or, at minimum, has access to better and more complete data—than the rest.34 Hence, given these network effects, limiting the market to one provider helps ensure that that system “learns” from the entire corpus of information provided by the market it serves.35
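The feedback loop can be illustrated with a toy simulation. The functional form linking data to accuracy, and every constant, are invented for illustration: two systems compete, each period's new users join whichever system is more accurate, and a small initial data advantage compounds into a persistent one.

```python
# Toy simulation of the "virtuous circle." Accuracy is an increasing, concave
# function of accumulated training data, and each period's new users join the
# more accurate system. All constants and functional forms are invented.

def simulate(head_start: int, periods: int = 20, new_users: int = 100) -> dict:
    """Two competing systems; system A begins with a small data head start."""
    data = {"A": 1_000 + head_start, "B": 1_000}
    for _ in range(periods):
        accuracy = {name: d / (d + 5_000) for name, d in data.items()}
        leader = max(accuracy, key=accuracy.get)
        data[leader] += new_users  # new users bring new training data
    return data

result = simulate(head_start=50)
# A modest initial lead compounds: the leader attracts every new cohort,
# while the laggard's dataset never grows.
assert result["A"] == 3_050 and result["B"] == 1_000
```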

I do not mean to contend that all algorithmic systems—a set that includes a vast range of technology, including other systems (e.g., expert systems) that are sometimes also called artificial intelligence—fall into this category.36 But I do conclude that some machine-learning-based applications—especially where data are expensive, computational demands are high, and network effects are strong—seem to tend toward natural monopoly.

This conclusion carries some specific consequences. As noted above, natural monopolists may resist competition, charge supracompetitive prices, and skimp on service quality. In response, policy makers have long sought to regulate natural monopolies in order to retain the gains associated with consolidation, such as the effects of the virtuous circle, while minimizing these harms of monopoly power.37 Drawing upon this tradition of regulation, I begin to consider a policy approach for machine-learning-based systems, focusing on three regulatory levers—rate regulation, service specification, and induced competition—each designed to address problems arising from the natural monopoly condition.

First, monopolists (including natural monopolists) typically charge inefficiently high prices, giving rise to problems that sound in both distributive and welfare concerns. In response, policy makers have traditionally regulated natural monopolists’ rates to protect consumers from such charges. In the case of machine learning regulation, this can encompass direct regulation of the data collected and used by these systems and their developers. Drawing from the (perhaps tired) metaphor that consumers “pay” for various services with personal information,38 we can understand providers’ privacy-invading data collection practices as an unduly extractive charge, as well as an inefficiently high price that yields welfare losses: Some developers may demand and hoard unfairly vast volumes of user data (using that information to exploit consumers, among other things), while some privacy-sensitive consumers may decline to use a beneficial machine-learning-based advance because of such demands.39 In response, regulators might promulgate a scheme of rate regulation that both advances privacy interests and reflects public preferences by limiting data collection and use to only that which is reasonably necessary for the continued development of the regulated system, or by directly specifying the information that providers may collect and keep.40 These alternatives—drawn from the leading modes of rate regulation, rate-of-return regulation and price-cap regulation—offer two concrete paths, steeped in an existing regulatory tradition, for addressing the varied privacy-related concerns attending to the volumes of personal information amassed by private companies. Such public law approaches, moreover, are better suited to the relational and interconnected nature of data than privacy law’s traditional focus on individual rights and harms.

Second, monopolists may have private incentives to underinvest in their systems in pernicious ways—say, by underinvesting in safety and security protocols or by overlooking the social costs of discrimination.41 Though competition can cause providers to offer ever-better services and to ensure that their products are available to the widest possible market, such competition is unlikely to discipline more highly concentrated markets.42 Hence, regulatory authorities have instead set service-quality standards for such providers to account for the difference between private incentives and the public good.43 In the machine learning context, such regulation could take several forms. For one, regulators might require that systems satisfy some regulatorily prescribed standard for accuracy (notwithstanding computational or market-based measures of optimization), to prevent the harms of a faulty medical application, or to prevent errant arrests based on defective facial recognition systems. Or regulators might use their authority to address bias and algorithmic redlining by requiring that these systems do not discriminate against historically marginalized and disadvantaged populations. In short, regulators might substitute social metrics—metrics that fare better along dimensions of accountability, legitimacy, and transparency, among other values—for computational ones.44

Finally, because competition in natural monopoly markets can be wasteful and inefficient,45 regulators will sometimes instead seek to induce efficient competition—and the competitive outcomes that come with it—so as to avoid prescribing rates and service standards. For example, regulators sometimes require putative monopolists to compete by bidding for a franchise to serve a natural monopoly market (rather than compete within that market), thereby avoiding duplicative and wasteful fixed investments.46 In other contexts, policy makers aim to induce efficient competition by lowering the barriers to market entry, say, by allowing new competitors to share in existing investments,47 or by imposing sharing requirements on market participants. Similar outcomes might result from a shift toward franchising for machine-learning-based applications deployed for public purposes (like law enforcement),48 from mandating federated learning or otherwise requiring data-sharing by putative competitors,49 and from closely regulating the terms of shared computing infrastructure providers to ensure neutrality.

In short, machine-learning-based applications make extractive and inefficient demands of their users, and they generate results that are both inaccurate and systematically biased. I do not, to be sure, claim to be the first to identify the accuracy, bias, and general privacy difficulties that attend to these applications.50 Rather, this Article draws together two parallel debates that have informed these findings—one regarding ethics and transparency in machine-learning contexts, and another regarding power and consolidation in the information technology industry. Computer scientists, for example, have begun to investigate computational (and other) solutions to the problems of accuracy and bias in machine learning.51 And legal and social science scholars such as Safiya Umoja Noble, Frank A. Pasquale, and K. Sabeel Rahman, among others, have argued in favor of greater regulation of our informational infrastructure in view of the growing concentration in these markets, contending that companies such as Google and Facebook resemble modern utility providers.52 Meanwhile, others have responded with cautions against regulation, suggesting that innovation and market competition are better suited to address such problems.53 But if these applications do indeed act as natural monopolists—or even, perhaps, as mere monopolists or oligopolists—then market power may contribute to these problems of accuracy, bias, and privacy. Hence, the legal traditions of natural monopoly regulation, and market regulation more generally, offer a ready, but perhaps overlooked, framework to help address these problems.54 This Article consequently explores the extent to which this legal framework applies to machine-learning-based applications and considers how policy makers may draw from that public regulatory tradition in this new technological space. 
It does so by building on the important prior work that identifies issues with machine-learning-based applications and calls for their greater regulation, recasting those issues as problems of market power and drawing on the tradition of natural monopoly and market regulation to address them.

This Article proceeds in three parts. First, I begin with a brief description of the legal tradition of natural monopoly regulation. As noted above, policy makers have regulated apparent natural monopolies since at least the development of railway networks, both to target the specific and reasonably anticipated harms caused by the exercise of monopoly power (such as high prices and poor service), as well as to implement a range of social policies related to the natural monopoly condition (such as universal service). Second, I consider whether machine-learning-based applications fall into this class of natural monopolies. I find that at least some do, in part because of several features particular to machine-learning-based applications, including the means of training such applications to learn from masses of data. Third, given that at least some machine learning systems may—like their natural monopoly predecessors—be appropriate candidates for regulatory scrutiny, I begin to consider the possibilities for such a regime. Other scholars have noted myriad problems with algorithmic systems, including those related to accuracy, bias, and privacy. I suggest that some such problems may derive from a provider’s monopoly status and can therefore be addressed with tools found within the tradition of natural monopoly regulation.

Natural Monopolies and Regulation

I begin with a brief overview of the history and tradition of natural monopoly regulation. Understanding natural monopoly regulation, however, demands an understanding of natural monopolies, and so I start by briefly elaborating this unique market condition before turning next to a description of paradigmatic regulatory responses.

Understanding Natural Monopolies

Scale characterizes natural monopolies: If the benefits of scale (i.e., decreases in the average cost to serve a customer) accumulate across an entire relevant market, then that market may be best served by a monopolist.55 This may seem odd. The antitrust laws, for example, were originally enacted to counter the harmful effects of powerful monopolists. So why might some markets operate better under monopoly control? Again, the benefits of scale provide the answer: In natural monopoly markets, it is always more efficient (from the standpoint of productive efficiency) to add new users to the incumbent provider; it never makes sense to introduce a competitor.56 And this is so no matter the number of competitors actually in the market—natural monopoly markets sometimes include (typically temporarily) multiple participants, yielding an initially inefficient allocation of market inputs (among other concerns described below).57

Imagine, for example, a communications network with high fixed costs, including wire, pole, and labor costs, among others. Once the network is deployed, however, it is comparatively simple to connect a user to the network: In some instances, it is little more than flipping a switch.58 In this example, it always makes sense for a user to connect to the existing network—and, inversely, it never makes sense for a user to connect to a different network. Why? The second, competing network duplicates those high fixed costs. Where the market is served by a single provider, it might spend only $1,000,000 (as an example) to connect one million users—$1 each. But if another provider puts up a competing network, then the competitors have spent $2,000,000 combined to connect the same million users—$2 each. With two providers instead of one, it costs more to do the same (assuming users of the two networks can even connect with each other). This gives rise to higher consumer prices, as well as increased costs for other companies competing for the same inputs—other companies that need copper wire or wooden poles. In short, from a production standpoint, one provider is better than two (because the market’s cost function is, in a word, subadditive). Hence, even when multiple providers enter a natural monopoly market, one will typically win the competition for the market and consolidate the investments made by (now-former) competitors.
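The arithmetic in this example can be stated compactly:

```python
# The communications-network arithmetic from the text, stated compactly:
# $1,000,000 in fixed costs (wire, poles, labor) connects one million users.

FIXED_COST = 1_000_000  # dollars per network built
USERS = 1_000_000       # users in the relevant market

def cost_per_user(num_networks: int) -> float:
    """Average fixed cost per user when several networks serve one market."""
    return num_networks * FIXED_COST / USERS

assert cost_per_user(1) == 1.0  # one network: $1 per user
assert cost_per_user(2) == 2.0  # a duplicate network doubles it: $2 per user
```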

This is true, however, only where such scale economies extend throughout the relevant market. Compare, for example, some other complex commodity—an elevator, say. It is expensive to build an elevator factory and train elevator installers and inspectors. Once the factory is built and installers trained, however, it might be comparatively cheap to build and deploy an elevator. So is the elevator business, too, a natural monopoly? It seems unlikely. This is because the benefits of scale do not extend across the whole relevant market. It is unlikely that one factory and one cohort of elevator installers can support the market’s demand for elevators. Hence, at some point, the market will demand a second factory and a second cohort of installers and inspectors—and it does not matter whether Otis Elevator Company or ThyssenKrupp makes that additional investment. The average cost per elevator is higher because of the new significant investments, and the fact of higher average costs does not depend on whether those investments are made by the incumbent or a new competitor. Scale economies do not extend across the entire relevant market because the market is greater than a single facility’s capacity.59

But if one provider—one communications network, to return to the earlier example—can support the market, then there is no need for a competitor. Indeed, such competition will typically give way to a monopoly, with higher prices for consumers and other industries in the interim. And monopoly power sometimes yields further benefits: AT&T devoted some of its capital to Bell Labs, which helped drive a vast amount of new, innovative activity. So policy makers must weigh the benefits of consolidation against the negative consequences of monopoly power.

Natural Monopolies and Regulatory Policy

The possibility that some markets tend naturally toward monopoly control has long yielded some specific legal consequences. For example, “[t]he railroad industry represented a true natural monopoly when the Interstate Commerce Commission”—the nation’s first regulatory agency—“emerged to regulate it during the 1880s.”60 Likewise, the Kingsbury Commitment—one of AT&T’s earliest encounters with regulation—was consummated by the Justice Department under its antitrust authorities to address AT&T’s apparent natural monopoly in long-distance telephone service.61 Over time, those regulatory interventions grew to encompass a range of state and federal oversight over an array of communications technologies.

Some aspects of these responses react to the possibility of monopoly control. Monopolists typically charge inefficiently high prices. Monopolists, that is, charge consumers more than they would under competitive conditions, thereby leaving some consumers unable to afford (or unwilling to buy) their socially valuable telephone or railway service.62 Similarly, monopolists can skimp on quality because they face no competitive pressure to improve.63

Other regulatory responses flow from the view that these markets tend naturally toward monopoly control, sometimes entrenching the first provider to a market (without respect to whether the first is the best).64 Communications regulators, for example, helped ensure affordable and reliable service across a wide range of locales.65

In all, regulators have traditionally held some power to influence the rates, service standards, and number of providers in order to balance the unique benefits of consolidation in natural monopoly markets against these potential harms.

Rate Regulation

Rate regulation is perhaps the hallmark of natural monopoly regulation.66 It is a direct response to the natural monopolist’s power over price, namely, its power to charge inefficiently high prices. What makes monopolists’ prices (including natural monopolists’ prices) inefficiently high? Stated simply, it is the disconnect between a monopolist’s incentives and the social good.

Ideally, under a standard rational choice framework, consumers buy goods that they think are worth more than their production costs. If a railway journey costs $1 to complete, but is worth $3 to the average consumer, then those consumers should buy train tickets: Such tickets are worth more than the paper they are printed on (together, of course, with the other costs incurred by the railroad to complete the journey). But this simple equation ($3 > $1) assumes that railroads will charge close to cost for their services. In competitive conditions, this is likely true: Different providers will compete to provide ever-better products and services at ever-lower prices (but they will not charge less than cost, because they are not interested in losing money on the enterprise). Hence, competition drives prices down to cost. So long as consumers value goods more than they cost, they’ll buy them. And that’s good, in general, for the world: Consumers get more out of the goods they buy than it costs to make those goods.67

But what if there is no competition? In that case, the monopolist’s price is not driven by competitive concerns—it faces no pressure to charge ever-lower prices. Instead, the monopolist chooses a price that it thinks will maximize its profits.68 For our railroad monopolist, that might be, say, $5. Some people will still make the journey: To them, the trip is especially important—worth $5, $6, or even $7. And the monopolist maximizes profits from those consumers—fewer buyers, but higher margins. That is one problem: To the extent these higher rates exacerbate wealth inequity by consolidating capital and resources in a single entity, or limit access to a critical facility to only a selected portion of the population, or are otherwise extractive, monopoly pricing can undermine important distributive values.69 Moreover, other consumers might conclude that $5 is too steep and forgo the purchase. That’s another problem. The monopolist’s incentives are to maximize profits by charging some price above cost.70 But that means that some consumers—even consumers who are willing to pay more than cost—are left out. There is unrealized potential when a consumer is willing to pay $3 for a $1 ride—but cannot: $3 is more than cost, but less than the monopolist’s profit-maximizing price.71 And that has further effects on populations that are left unable to travel—to, say, visit family or seek out new employment opportunities.
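A stylized numerical version of this railroad example (the consumer valuations are invented) shows both effects: the monopolist's profit-maximizing price, and the surplus lost when consumers willing to pay more than cost are priced out.

```python
# A stylized version of the railroad example. The journey costs $1 to provide;
# the consumer valuations below are invented for illustration.

COST = 1
valuations = [7, 6, 5, 3, 3, 2]  # dollars, one entry per prospective rider

def profit(price: int) -> int:
    """Monopolist's profit at a given ticket price."""
    riders = sum(1 for v in valuations if v >= price)
    return riders * (price - COST)

def total_surplus(price: int) -> int:
    """Value created, net of cost, by every trip actually taken."""
    return sum(v - COST for v in valuations if v >= price)

# The monopolist's profit-maximizing price serves only the highest valuers...
best = max(set(valuations), key=profit)
assert best == 5 and profit(best) == 12  # three riders, $4 margin each

# ...while pricing at cost would serve everyone who values the trip above $1,
# creating more total surplus. The difference is the deadweight loss.
assert total_surplus(COST) == 20 and total_surplus(best) == 15
```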

Enter rate regulation. Unregulated, a natural monopolist is tempted to charge profit-maximizing prices, leaving some consumers back at the station. So rate regulation attempts to mimic competitive conditions by requiring that the monopolist charges prices, set by regulators, that approach cost (thereby offering a more equal opportunity to travel by rail, or use the telephone, across populations).72 Mimicking competitive conditions is important, and so some statutes allow regulators to set rates only in the absence of competition (revoking such authority whenever a provider’s prices are already disciplined by “effective competition”).73 In short, the monopolist is required to forgo most monopoly profits in exchange for some legally recognized status as the monopoly provider, more easily facilitated access to necessary inputs (e.g., public rights-of-way), and a regulatorily set reasonable profit.

Assessing such reasonable rates, however, is not simple.74 It requires that regulators gain access to information typically exclusive to the monopolist: Regulators need to know how much it costs to operate in the regulated market, they need to estimate necessary upkeep and repairs, they must anticipate changes in input costs (e.g., steel tracks or copper wires), and they must account for a reasonable profit, too.

Historically, regulators have set such rates using a “rate-of-return” method: After accounting for all relevant past and future costs, the regulators add some reasonable rate-of-return (profit) on those investments and set prices accordingly.75 This method, however, is susceptible to manipulation. Since monopolists typically have better information than regulators, they may make unnecessary improvements that inflate actual costs, persuade regulators to accept inflated estimates of those costs, exaggerate their future difficulties, and convince regulators to award higher rates of return.76 If any of these strategies are successful, rates increase—yielding profits for the monopolist at the expense of the public regulator’s view of the social good (since, as explained, the higher rates leave some consumers out).

Such difficulties with rate-of-return rate setting led regulators to experiment with a different approach—price-cap regulation. Under this approach, regulators set an upper limit on rates (based on, e.g., historical data or benchmarks derived from other firms) and sometimes lower that ceiling decrementally.77 In the first year under price-cap regulation, a monopolist may be allowed to charge $5; in the second year, $4; in the third, $3.50; and so on.78 Under this approach, regulated monopolists have comparatively little opportunity to manipulate the baseline rate, and have ever-increasing incentives to improve internal efficiency in order to maximize profits under the price cap. But no matter the method—rate-of-return or price cap—the goal of rate regulation is to prevent the social losses that arise from monopoly pricing.
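The two methods can be sketched as simple formulas. All inputs (costs, rate base, allowed return, cap schedule) are hypothetical: rate-of-return regulation builds the price up from reported costs plus an allowed return on invested capital, while price-cap regulation simply ratchets a ceiling downward on a schedule (here a linear one, for simplicity).

```python
# Sketches of the two rate-setting methods. All inputs (costs, rate base,
# allowed return, cap schedule) are hypothetical illustrations.

def rate_of_return_price(operating_costs: float, rate_base: float,
                         allowed_return: float, units_sold: float) -> float:
    """Revenue requirement = reported costs plus an allowed return on invested
    capital (the rate base), spread over expected sales."""
    revenue_requirement = operating_costs + allowed_return * rate_base
    return revenue_requirement / units_sold

def price_cap(initial_cap: float, annual_cut: float, year: int) -> float:
    """A ceiling the regulator ratchets down each year, here on a simple
    linear schedule, regardless of the firm's reported costs."""
    return max(initial_cap - annual_cut * year, 0.0)

# Rate of return: $800k costs, $2M rate base, 10% allowed return, 1M tickets.
assert abs(rate_of_return_price(800_000, 2_000_000, 0.10, 1_000_000) - 1.0) < 1e-9

# Price cap: start at $5.00 and cut $0.50 per year; any cost savings the firm
# achieves under the cap become profit, sharpening its efficiency incentives.
assert [price_cap(5.0, 0.50, y) for y in (0, 1, 2)] == [5.0, 4.5, 4.0]
```

Note how the rate-of-return price depends entirely on the firm's reported costs and rate base, which is precisely what makes it manipulable, while the cap does not.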

Service Specification

As noted, competition pushes providers in most markets to offer better products at better prices. But natural monopolists face limited, if any, competition. And so, just as rate regulation addresses concerns that prices are not disciplined by competition, service specification aims to address quality-related concerns. Indeed, price-cap regulation itself gives rise to further risks regarding the service quality of natural monopolists: Incentives to improve internal efficiency (and maximize profit under a price cap) might also give rise to incentives to skimp on quality, yielding poor customer experiences or even grave risks to public health and safety.79 Monopolists might, for example, forgo necessary repairs or prudent investments in safety features both to maximize returns under a price-cap methodology and because they face little competitive pressure to do so. Monopoly regulators have therefore focused their attention not only on price but also on quality.

If it is difficult for regulators to predict, counterfactually, what prices could be under conditions of competition, then it might be near impossible for regulators to anticipate the sorts of dynamic service quality improvements that competition could bring. As a result, service regulation has tended to focus on two primary matters, one more technocratic, the other more concerned with matters of democratic governance. First, service regulation has aimed to ensure some baseline standard of service: Where rate regulation sets a price ceiling, service specification sets a quality floor (sometimes set, again, by reference to relevant benchmarks).80 Second, service specification has also tended to focus on issues of public concern—public safety, universal service, and nondiscrimination—that relate to the natural monopoly condition.81 In the context of communications regulation, for example, this has included rules relating to 911 service and emergency backup power, nudging providers into adopting certain technical standards—for better or for worse. It has also encompassed carrier-of-last-resort requirements (i.e., rules requiring carriers to serve customers that might otherwise be left off a communications network), as well as prohibitions against redlining in the construction of network facilities.82 Service rules thus sometimes implement social goals—sometimes as a substitute for market pressures that might normally induce a provider to satisfy such standards, sometimes to ensure that the benefits of market consolidation (e.g., network effects) accrue to the whole population, and sometimes with the effect of entrenching a particular technical pathway.83 And so service (and rate) regulation is often a participatory process, one that includes both the regulated entity as well as interested members of the public.84

Consider, for example, the New York Public Service Commission’s exercise of regulatory power over Verizon in the wake of Superstorm Sandy. Sandy destroyed a significant portion of Verizon’s landline telephone network on Fire Island, New York.85 When Verizon, the monopoly provider of wireline telephone service there, began to rebuild its infrastructure, it sought regulatory approval to replace its copper-based infrastructure with “Voice Link,” a wireless substitute. This wireless solution was, from Verizon’s perspective, cheaper to deploy and maintain. But the New York Public Service Commission worried that Voice Link was a fundamentally unreliable service—one that fell below a reasonable standard for phone service.86 Voice Link was, for example, susceptible to inclement weather and network congestion, resulting in higher volumes of dropped calls (far more so than the previous copper-based service).87 Moreover, these quality problems gave rise to specific public safety concerns: “Verizon itself conceded that congestion could block or slow completion of 911 calls via Voice Link,” and the Voice Link terms of service purported to limit Verizon’s “liability for failing to complete 911 calls.”88 These limits gave rise both to intense public dissatisfaction with Verizon’s plan and to severe regulatory scrutiny. Verizon relented: As the Public Service Commission prepared to deny Verizon’s request, Verizon withdrew its application and agreed to build a wireline fiber-based network on the island.89 In short, the regulator’s authority over communications services providers encompassed some power to dictate the terms of the service provided, including standards related to matters of both private concern (such as call quality) and public interest (such as emergency communications).

Induced Competition

Both rate regulation and service specification aim, in part, to substitute for the benefits—better prices for better products—that typically come by way of market competition, on the assumption that competition is unlikely or undesirable inside natural monopoly markets. But some modes of regulation aim to induce efficient competition in these markets all the same.

The tradition of utility franchising offers one important, longstanding example.90 Rather than attempt to directly mimic the outcomes of intramarket competition, regulators control market entry by establishing an arranged competition: Several firms, each vying to serve some market, submit offers to provide their best service at their best price, and whichever firm offers the best value is endowed with a monopoly franchise—essentially, the license legally required to serve the area.91 Harold Demsetz, who was among the first to argue in favor of franchising over direct regulation, reasoned that competition among bidders could achieve the same ends as rate regulation and service specification, but without the substantial regulatory structure that typically attended to natural monopoly regulation.92

In other contexts, policy makers have lowered barriers to market entry by, for example, allowing new competitors to share in existing investments, thereby reducing duplicative fixed investments and the concomitant consumer price increases.93 In short, the rules of the market are specially structured to sustain competition. Consider the Telecommunications Act of 1996. Before the 1996 Act, policy makers had long regulated local communications markets as natural monopolies by setting rates and imposing service conditions (such as carrier-of-last-resort obligations).94 But the 1996 Act sought to change the dynamics of entry into these local markets by giving putative competitors various rights to lease capacity on existing telephone networks.95 Competitors could thus avoid building out duplicative networks before offering competing service, relying instead on the existing infrastructure (until those new providers found it more desirable to invest in new facilities—perhaps when existing infrastructure could no longer support the total demand for communications services, thus requiring new investments).96

Similarly, antitrust’s essential facilities doctrine requires that natural monopolists make their facilities available to competitors in adjacent markets: Companies that own power lines, for example, must make those lines available to a range of electricity producers (e.g., power plants, wind farms), so that competition among producers may thrive without requiring that each build its own transmission grid.97

Just as both rate regulation and service specification have proved to be imperfect substitutes for market competition in some respects, so too have modes of inducing competition sometimes come up short. Franchising, for example, did not upend all other natural monopoly regulation (as Demsetz originally envisioned).98 Commentators responding to Demsetz, such as Oliver Williamson, reasoned that natural monopolists might behave opportunistically after having won a franchise, charging prices or offering service inconsistent with their promises.99 And public choice concerns likely played a role in entrenching existing regulators, too.100 Hence, even after monopolists were selected competitively, they remained subject to regulatory supervision to ensure that providers adhered to their franchising commitments, and to negotiate the terms of the provider’s ongoing franchise.101 In all, even competitively designated franchisees remained subject to rate and service supervision.

Commentators likewise “agree that the [1996 Act] failed to fulfill its promise of leading to more vibrant and competitive markets,” casting blame on both incumbent providers, who resisted the legal mandates to share their infrastructure, and on the new competitors, who often failed to make the hoped-for investments in new communications facilities (riding instead on the incumbents’ facilities).102

*   *   *

Industries characterized predominantly by scale can sometimes present natural monopolies, where such scale economies extend over the entire relevant market. Such monopolists—including, historically, railways, electric grid operators, and telephone carriers—have traditionally been subject to regulatory supervision over rates and standards. Indeed, such regulation is sometimes carried forward into other monopolistic or oligopolistic markets (even if the natural monopoly condition is not strictly met). Underlying this regulatory scrutiny is a view that something is necessary to discipline a dominant provider’s prices and services—but, in these markets, regular competition will not do it. And so state and federal authorities replaced competition with regulatory control over rates and service—thereby mimicking the effects of competition while also retaining the gains of consolidation and implementing other important social goals—or else restructured the market to sustain competition. These regulatory substitutes were sometimes imperfect. But regulators and policy makers—responsible to elected officials and a voting public—seemed to conclude that the costs of such regulatory shortcomings were less than the costs of monopoly control, and that public accountability for these decisions (and missteps) was better than private obscurity.103 However, once newer technological advances proved these markets capable of sustaining (ever more) competition on their own—once, for example, the railroads faced competition from an array of long-haul truckers (among other newly developed transportation options), or once wireless telephone service proved to be an adequate substitute for wireline service—these industries were slowly deregulated, perhaps, in part, because this error cost analysis changed.104

Machine Learning as Natural Monopoly

Many markets historically regulated as natural monopolies—the railroads and the telephone, among others—have since proved capable of sustaining competition. But that need not imply the theory of natural monopoly regulation set out above is dead. Other natural monopolies may surface, and hence deserve regulatory attention (even assuming that these prior examples are now, in view of technological change, capable of sustaining competition).

Indeed, scholars across disciplines have recently hinted at the possibility of a newly emergent natural monopoly (or, more accurately, series of natural monopolies) in applications of machine learning technologies.105 Oren Bracha and Frank Pasquale, for example, have suggested that the search engine market exhibits the characteristics of a natural monopoly.106 Likewise, Steven Weber and Gabriel Nicholas have identified “a tendency toward natural monopolies in data platform businesses.”107 Of course, neither search engines nor data platforms necessarily implement their services by way of machine learning algorithms: Google’s search results, for example, are determined by a range of factors, only some of which require machine learning.108 But Weber and Nicholas focus their attention on machine learning systems in particular. And both sets of authors identify data feedback loops—one facet of machine learning (sometimes termed continual learning)—as critical to their findings.109 Continual learning, however, is only part of the story, since such positive feedback loops, standing alone, need not imply natural monopoly conditions.110

Hence, I closely analyze the claim that machine learning systems are natural monopolies. I begin with a brief description of machine-learning-based applications, and then turn to a general (if informal) analysis of the natural monopoly claim focusing on three primary features. First, I describe the scale economies that inhere to the development of such systems. Second, I describe the ways in which the costs associated with training certain machine-learning-based applications appear to be, from a computational perspective, subadditive—a sufficient condition for natural monopoly.111 And third, I elaborate on the positive feedback effects of continual learning. Viewed together, I find it likely that, for at least some applications, a natural monopoly exists.

Understanding Machine Learning

Computers do what they are instructed to do. As I elaborate infra, a software developer might, for example, ask a computer to sort an array of numbers—and, in so doing, give the computer explicit directions on how, exactly, to sort that list.112 That’s useful because the computer now has some new functionality—each new set of instructions provides a new tool: a list sorter, a number adder, a game player, and so on.

But in each of these examples, the software’s output is some thing—a sorted list, a numerical sum, or a path to victory over Bowser.113 Machine learning algorithms operate differently: With machine learning, the output is itself “another algorithm that can be directly applied to further data.”114 In this way, machine learning offers a sort of “meta-algorithm,” one that draws on existing data to derive a new prediction model.115

An illustration may help clarify. In The Ethical Algorithm, Michael Kearns and Aaron Roth develop an example related to high school performance and college graduation: Given an array of students’ high school coursework, grades, test scores, and college graduation statuses, we can compute a number of summary statistics: the average test score for graduates and nongraduates, the frequency with which graduates and nongraduates took certain classes, and so on.116 But using that database to manually “derive a model [for] predicting the likelihood of graduation for future applicants” could be “quite difficult and subtle.”117 A machine-learning-based application, however, might absorb that mass of data—information on hundreds or thousands (or more) student profiles—and thereby derive a model for predicting the likelihood of graduation for some new applicant, i.e., a student absent from the original dataset. Once the application has learned which students graduated and which did not, it can attempt to predict whether other high school students are likely to also graduate from college.118
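A toy sketch, in the spirit of the graduation example, can make the point concrete. The data, the single test-score feature, and the threshold-learning rule are all hypothetical illustrations (not Kearns and Roth’s actual method); the point is only that training returns a new algorithm, a predictor applicable to students absent from the original dataset, rather than a fixed answer.

```python
# Hypothetical sketch: "training" returns a prediction model (a function),
# not a static result. The records and the threshold rule are invented.

def train(records):
    """Learn the test-score cutoff that best separates graduates."""
    best_threshold, best_correct = None, -1
    for _, score, _ in records:  # try each observed score as a candidate cutoff
        correct = sum((s >= score) == graduated for _, s, graduated in records)
        if correct > best_correct:
            best_threshold, best_correct = score, correct
    # The output is itself an algorithm, applicable to further data.
    return lambda score: score >= best_threshold

past_students = [  # (name, test score, graduated?)
    ("A", 1200, True), ("B", 900, False), ("C", 1400, True), ("D", 1000, False),
]
predict = train(past_students)
print(predict(1300))  # → True (predicted likely to graduate)
```

The returned `predict` function is the learned model: it can now be queried about any new applicant.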

Such applications are hardly limited to this college graduation example. Machine learning algorithms have been deployed to help solve a wide range of complex problems operating across distinct markets, including medicine, law enforcement, and online search.119

W. Nicholson Price, among others, has detailed how medical professionals use machine-learning-based applications to diagnose ailments and to develop treatment plans for patients.120 Such applications, “derived from large datasets[,] . . . reflect complex underlying biological relationships” and are used to diagnose and treat a range of ailments and conditions—from migraines to autism and beyond.121 Specifically, these algorithms rely on the stores of data contained in medical health records to identify both patterns suggestive of a particular illness and treatments likely to succeed: Some providers claim “to use ‘170 million data points to “transform” data from insurance claims, electronic medical records, medical sensors and other sources into information . . . to predict the best ways to treat individual patients and conditions.’”122 Google’s Project Nightingale, for example, draws from the medical records “of millions of people across 21 states” and may incorporate data from personal health tracking devices such as Fitbit.123 Many such algorithms are “locked”—i.e., once trained on an existing dataset, they do not incorporate new information from their in-market use.124 But while most FDA-approved applications are, so far, locked, developers continue to experiment with continual learning models that learn from the new outcomes generated by the application’s use in medical settings.125

Likewise, facial recognition technology has, among other uses, gained significant traction in law enforcement, using machine-learning-based advances to identify criminal suspects. Using collections of digital photos, these applications learn to recognize the visual identity of a person through identified photographs—photos tagged in Facebook, associated with a driver’s license, or taken while processing an arrest—and use that information to predict the identity of someone in a new image—say, a frame from a security camera recording. Many of these systems do continually learn and adapt, both by incorporating new known images as they become available and by learning from erroneous outcomes.126

And a vast range of internet companies use machine-learning techniques to provide consumers with search results and recommendations.127 Search engines, for example, use machine-learning-based applications, trained on text, to better understand search queries and identify relevant results—by, say, identifying synonyms and antonyms, and by using metrics suggestive of a successful (or not) search result to improve training over time.128 Similarly, companies like Netflix and Spotify use machine-learning-based applications to recommend new content to consumers based on their knowledge of every user’s content preferences (to derive a prediction model), and each user’s individual content consumption habits (to offer tailored predictions).129

In total, machine-learning-based applications—together with the volumes of data now available—are remaking operations across sectors.130 To some, these new tools are a welcome development: College admissions officers, for example, might be interested in tools to help better assess students’ applications. Patients want better health outcomes. Law enforcement officials want to reliably identify criminal suspects. And most consumers would like Netflix to reliably recommend good content. But such software may also seem potentially problematic, implicating concerns regarding recommendation accuracy and bias, the incentives of the application’s developer, and user privacy, among others. Before addressing such concerns, I examine the nature of these machine-learning-based applications in more detail, focusing on the possibility that any one application may be a natural monopolist in its own market, whether medicine, law enforcement, or search, among other possibilities.

Machine Learning as Natural Monopoly

At least three features of machine-learning-based applications are suggestive of natural monopoly conditions (or natural monopoly effects): the costs of application development; the costs of training and optimization; and the potential for network effects. I consider each in turn below.

Application Development

Software markets mimic natural monopolies in at least one important respect: It can cost a lot to develop software, but it is comparatively trivial to distribute that software to consumers.131 Developing software can be an exceptionally complex task, requiring significant resources in terms of expertise and labor (i.e., software developers), as well as computing power.132 But once a program is built, it is easy to burn copies onto discs or, more commonly (and more cheaply), transmit them over the internet. Because software has such relatively high fixed and low marginal costs, it has—roughly consistent with the traditional understanding of natural monopolies—declining average costs.133
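This cost structure can be illustrated with a hedged arithmetic sketch. The fixed and marginal cost figures below are assumed for illustration, not drawn from any actual provider; the sketch shows both the declining average cost and the subadditivity that follows from it.

```python
# Hypothetical figures: a large fixed development cost spread over ever more
# copies yields a continually declining average cost.

FIXED_COST = 10_000_000   # design, build, and train the application (assumed)
MARGINAL_COST = 1         # serve one more copy or query (assumed)

def average_cost(units):
    """Average cost per unit served."""
    return (FIXED_COST + MARGINAL_COST * units) / units

for n in (1_000, 100_000, 10_000_000):
    print(n, average_cost(n))  # average cost falls as output grows

# Subadditivity: one provider serving the whole market costs less than two
# providers that each duplicate the fixed investment.
market = 200_000
one_firm = FIXED_COST + MARGINAL_COST * market
two_firms = 2 * FIXED_COST + MARGINAL_COST * market
print(one_firm < two_firms)  # → True
```

On these assumed numbers, serving the thousandth copy costs about $10,001 on average while serving the ten-millionth costs $2, and a second firm only adds a redundant $10 million of fixed cost.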

Machine learning presents an extreme example of this dynamic, as the fixed costs that attend to developing a machine-learning-based application are significantly greater than those associated with the average software development project (while marginal costs remain comparatively low).

First, for so long as machine learning remains a relatively new technology, we might expect the labor costs for machine learning engineers—their wages—to be higher, given the demand for their specialized expertise.134

But even if this effect is only transitory, other more durable aspects of machine learning development exaggerate the differences between fixed and marginal costs. In addition to standard software development costs, machine-learning-based applications rely on vast volumes of data—data used to train the learning algorithm to, say, recognize someone’s face or identify a potentially successful therapeutic regimen. In some cases, such data may be freely (if controversially) available: IBM, for example, scraped millions of publicly available images from image-sharing website Flickr to train its then-nascent facial recognition software, and Clearview AI employed a similar strategy, drawing criticism from privacy advocates.135 Nevertheless, good data is typically expensive, no matter whether the proprietor chooses to “build it” or “buy it.”136

Moreover, the physical computing resources required to design and train a machine learning algorithm can be expensive, requiring specialized equipment that can cost millions of dollars.137 Some applications—particularly those developed using relatively simple prediction algorithms (on which I elaborate in the next Section)—can be developed on commodity hardware—the sorts of regular computers that consumers might purchase for home use.138 But others, such as GPT-3—a popular natural language processing application—require a massive computing infrastructure that is likely out of reach of new competitors.139

In short, human, data, and hardware resources are all more expensive in machine learning contexts (as compared to the average software development project), yielding an even wider disparity between fixed development costs and marginal distribution costs suggestive of natural monopoly conditions.140

In the general software context, scholars and commentators have generally concluded that these sorts of scale economies are, standing alone, insufficient to give rise to a natural monopoly in need of regulatory intervention (even if natural monopoly effects arise). This is for two reasons. First, though the fixed costs of software development are much higher than the marginal costs of software distribution, those fixed costs are, in absolute terms, not all that high. Compared, say, to the fixed costs of building a power plant and an electric grid, developing a software program may seem trivial.141 And so competition in these software markets, even if properly characterized as natural monopolies, yields less waste than in traditionally regulated markets.142 Moreover, the possibility of a longer, ongoing competition for the market (facilitated by these comparatively lower barriers to entry) yields some important quality gains in the software product.143 Altogether, scholars have concluded that the possibility of quality gains through competition (together with avoiding the downside risk of regulatory error) seem, in this context, to outweigh the efficiency gains of avoiding competition in a natural monopoly (or natural-monopoly-like) space.144

But these justifications operate with less force in the machine learning context. As noted, the fixed data and equipment costs that attend to developing machine-learning-based applications are significantly higher than for traditional software development, thus suggesting both that the losses from wasteful competition are higher and that the possible gains from competition may be lower (due to the higher barriers to market entry).145

In total, the fixed costs of developing a single machine-learning-based application for a single market (e.g., facial recognition or diagnostics) seem to significantly outpace the marginal costs of making that application available to its consumers (e.g., law enforcement or health care providers), possibly giving rise to a natural monopoly in that application’s specific market.146 After all, one provider is likely to be able to serve the market—several law enforcement agencies, or several hospitals—without needing to build additional capacity after designing, developing, and training the application. And so, in such cases, a second provider will add no new capacity to the market but will duplicate these fixed costs—thereby increasing the average cost across that market.147

I am careful not to overstate this claim. As I explain infra, it may be possible to reduce these fixed costs and avoid the natural monopoly condition by, say, relying on shared infrastructure or public datasets.148 Moreover, some of these fixed investments may yield improvements across markets or sectors, i.e., spillovers. Competitive investments in machine learning hardware may, for example, yield positive externalities for other technological development.149 Or investments in the development of a recommendation algorithm based on consumer preferences might find application in different consumer-facing markets (e.g., song, movie, and restaurant recommendations). Indeed, some markets may sustain several competitors through product differentiation: One song recommendation engine might prioritize similar musical genres while another emphasizes lyrical themes, and, given this differentiation, both might co-exist in a single market.150 But, in general, the case for natural monopoly in machine-learning-based application markets is stronger than for traditional software development, particularly for deep-learning-based applications that are based on large text- and image-based datasets. Moreover, even where the natural monopoly condition is not strictly met, the market for particular applications may be highly concentrated, with such concentration giving rise to similar (if less pronounced) market effects.

Training and Optimization

Beyond the sorts of tangible and intangible inputs to machine-learning-based application development described above—human capital, hardware resources, and data—other machine-learning-specific features reinforce the preliminary conclusion in favor of natural monopoly. Specifically, the methods of optimizing machine-learning-based applications suggest that the time and computational resources dedicated to training amount to one more specific and significant fixed cost of development. And once an application is trained, it will often make sense to direct users (and their further inputs) to the existing application—much like, as in other natural monopoly contexts, it always makes sense to connect users to an existing network rather than to stand up a new one.

Computer scientists have traditionally measured algorithmic efficiency using “Big O Notation,” a standard method for comparing different approaches to a computational problem (that discounts differences attributable to, say, hardware performance).151 Consider, for example, the familiar problem of sorting items in a list. There are more than a few ways to do this.152 You might, for example, look for the smallest item, set it aside, and then the next smallest, and so on. Or you might choose one random item in the middle and push all the smaller elements to one side and all the larger elements to the other side, and repeat for each side, and so on, until everything is in place. From a computational perspective, these two approaches are significantly different.153 The first approach—selection sort—is generally considered comparatively inefficient because as the list to sort gets longer, it takes significantly more time to finalize the sorting process: Each new element in the list adds one new round of comparisons and one additional comparison in each round. In formal terms, selection sort takes O(n^2) time, where n represents the number of items in the list: As n gets larger, the algorithm’s time complexity grows quadratically.154 By contrast, the second, divide-and-conquer approach—quicksort—is comparatively indifferent to increases in the size of the unsorted list: As the list gets longer, quicksort does not become unbearably slow. In formal terms, quicksort takes, on average, O(n log n) time—significantly better than selection sort’s quadratic growth.
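The two approaches described above can be sketched directly. These are minimal textbook implementations (not drawn from any source cited here): the first repeatedly sets aside the smallest remaining item; the second partitions around a middle element and recurses.

```python
# Selection sort: O(n^2) comparisons, because each of n rounds scans the
# remaining unsorted portion of the list.
def selection_sort(items):
    items = list(items)
    for i in range(len(items)):
        # find the smallest remaining item and set it aside at position i
        smallest = min(range(i, len(items)), key=items.__getitem__)
        items[i], items[smallest] = items[smallest], items[i]
    return items

# Quicksort: O(n log n) on average, by dividing around a pivot and
# conquering each side independently.
def quicksort(items):
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]  # choose one item in the middle
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)

data = [5, 3, 8, 1, 9, 2]
print(selection_sort(data))  # → [1, 2, 3, 5, 8, 9]
print(quicksort(data))       # → [1, 2, 3, 5, 8, 9]
```

Both produce identical output; the difference Big O captures is how their running times scale as the list grows.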

Such complexity measures suggest one way to detect further fixed and marginal costs (beyond hardware and data, among others) in computational spaces. An algorithm that operates in exponential time—O(2^n)—becomes costly quickly. And such computational costs, moreover, are not mere hypothetical constructs: Rather, they represent real value in terms of hardware demands, electric power, climate effects, time, and opportunity costs, among others.155 By contrast, an algorithm that operates in logarithmic time, e.g., O(log n), suffers its worst increases in complexity when n is small, but as n gets larger the incremental costs in computational complexity are comparatively slight.156

These tools for measuring computational complexity can help assess the fixed and marginal technical costs associated with machine learning. In particular, machine-learning-based application development might be considered in at least two stages—one for designing, building, and training the application (which itself occurs in multiple stages), and a second for predicting future results based on such learnings.157 Each of these stages has a different function and, importantly, a different computational cost structure. Application design and training more closely resembles software development: It is a critical input to the development of the machine-learning-based application. By contrast, prediction or classification more closely resembles distribution: Consumers of machine-learning-based applications send queries to the application and expect to receive some response—a proposed treatment regimen, the identity of a person in a photograph, or a set of search results.

Application training is computationally costly as compared to prediction or classification. In one example, computing the solution set—i.e., training—runs in polynomial time, while prediction runs in linear time.158 I should clarify that my assessment here operates at a high level of generality, and, below, I use Bayesian classifiers as an exemplar. The genre of machine-learning-based applications encompasses a wide range of classifiers and techniques, each of which operates slightly differently and which may be better suited to different sorts of tasks and applications.159 It is beyond my present scope to assess each of these sorts of classifiers individually, and so I paint with a somewhat broad brush, knowing that some of the conclusions suggested here may be overinclusive. Nevertheless, the process I describe offers a good general approximation for the comparative computational costs associated with training machine-learning-based applications and then later deriving predictions from such applications.160 In all, training is significantly more costly than prediction.

How does training work? Consider Bayes’ Theorem—foundational to much modern machine learning research and practice—which, in its simplest terms, estimates the probability of a given outcome in view of some known probabilities and available evidence.161 Training supplies the known prior probabilities. Some candies might be fruit-flavored (Skittles), others might be chocolate (M&Ms). If we have some known information about the frequencies of various colors in Skittles and M&Ms (by, say, opening bags of each sort of candy at random and counting the colors inside), then we might be able to predict the identity of an unknown candy, given that it is brown (99 percent M&M, 1 percent Skittles) or orange (50 percent M&M, 50 percent Skittles). Of course, such predictions cannot be perfect, and so the Bayes error rate offers the lowest possible error for a Bayesian classifier under ideal conditions—the best it could do.162 Hence, when training a machine learning system to predict the identity of a candy (or, more seriously, of a criminal suspect), developers sometimes measure their application’s error against this benchmark.163
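The candy example can be worked through numerically. The color frequencies below are hypothetical estimates (the sort of known probabilities that training supplies), chosen so that the posteriors reproduce the 99-to-1 and 50-50 figures in the text.

```python
# Bayes' Theorem applied to the candy example. All frequencies are assumed
# illustrations, as if estimated by opening bags at random and counting.

p_brand = {"M&M": 0.5, "Skittles": 0.5}  # prior: equal odds of either brand
p_color_given_brand = {
    "M&M":      {"brown": 0.300, "orange": 0.100},
    "Skittles": {"brown": 0.003, "orange": 0.100},
}

def posterior(color):
    """P(brand | color) = P(color | brand) * P(brand) / P(color)."""
    joint = {b: p_brand[b] * p_color_given_brand[b][color] for b in p_brand}
    total = sum(joint.values())  # P(color), by the law of total probability
    return {b: j / total for b, j in joint.items()}

print(posterior("brown"))   # about 99 percent M&M, 1 percent Skittles
print(posterior("orange"))  # an even 50-50 split
```

Given these assumed counts, a brown candy is almost surely an M&M, while an orange candy is a coin flip, exactly as in the text.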

As these machine-learning-based applications are trained (e.g., through more training examples—data), they typically approach the Bayes error rate asymptotically, along a roughly logistic curve. In short, the learning curve is steep: These applications generally begin poorly, improve predictive accuracy over the first several thousand (or more) training examples (at the cost of intense computational effort), and thereafter show some continued, though ever-decreasing, improvement as the volume of training examples increases.164

Hence, the application’s developers must select some point at which to conclude training. Such decisions are often made by trading the costs of additional training—costs in terms of computing resources, additional data acquisition, and time—against the expected benefits of such training—benefits in terms of improved accuracy. Where those costs exceed benefits, training typically terminates.165 What remains is an application that has maximized the trade (from the developer’s view) between performance accuracy and training costs and is ready to respond to queries about a candy’s flavor, a patient’s treatment, a criminal’s identity, or a website’s location, as the case may be.
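The stopping decision described above can be sketched as a simple cost-benefit loop. The diminishing-returns curve, the Bayes error rate, and the per-batch cost (expressed in accuracy-equivalent units) are all assumed for illustration.

```python
import math

# Sketch of training termination: continue only while the expected accuracy
# gain from one more batch of training data exceeds its cost. All figures
# are hypothetical.

BAYES_ERROR = 0.05       # lowest achievable error under ideal conditions
COST_PER_BATCH = 0.002   # data, compute, and time, in accuracy-equivalent units

def accuracy(batches):
    """Steep early gains that asymptotically approach 1 - BAYES_ERROR."""
    return (1 - BAYES_ERROR) * (1 - math.exp(-batches / 10))

batches = 0
while accuracy(batches + 1) - accuracy(batches) >= COST_PER_BATCH:
    batches += 1  # the marginal benefit still justifies the marginal cost

print(batches)  # training stops once marginal gains fall below cost
```

On these assumptions, training halts after a few dozen batches: accuracy has climbed most of the way to its ceiling, and further data is no longer worth its price.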

Importantly, moreover, the computational costs of responding to such individual queries are, as measured against the computational training costs, relatively small.166

Some machine-learning-based applications may thus resemble a natural monopoly. This is especially true for large deep learning applications (such as GPT-3, a natural language processing application, or facial recognition applications), though perhaps less true for simple regression models based on only a few thousand observations (such as COMPAS, an application used to conduct a risk assessment of a criminal defendant, or even the college graduation example described above).167 In such examples, the fixed computational costs of training the application will be comparatively high, growing with each training set and terminating when the costs of additional training outweigh the predicted accuracy benefits of such training. In contrast, the marginal costs of responding to queries after training will be comparatively low. Hence, assuming such a significant difference between the fixed computational costs of training and the marginal costs of prediction
—an assumption that, as I noted above, may vary by the specifics of any given machine learning implementation—the average marginal computational cost of each query to the system seems likely to decrease continually. In short, machine-learning-based applications seem computationally subadditive.168 Stated similarly, a second application, trained on a similar (though, perhaps, not identical) set of random bags of candy and taught to predict brand by color, would duplicate the efforts of the first application, adding no new capacity to the candy-detection market.169 Indeed, research regarding model distillation suggests that machine-learning-based models tend to funnel toward a single set of outcomes.170 In short, the second provider is akin to a redundant set of railroad tracks or telephone wires.
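The subadditivity claim reduces to simple arithmetic: a large, one-time training cost spread over an ever-growing volume of cheap queries. A stylized sketch (all figures are hypothetical):

```python
# Subadditive cost structure: a large fixed training cost plus a small
# per-query cost means the average cost per query falls continually with
# volume, so one trained application serves a market more cheaply than
# two. All figures are hypothetical.

FIXED_TRAINING_COST = 1_000_000.0  # assumed one-time cost of training
COST_PER_QUERY = 0.01              # assumed marginal cost per prediction

def average_cost(queries: int) -> float:
    """Average cost per query: fixed cost amortized over query volume."""
    return FIXED_TRAINING_COST / queries + COST_PER_QUERY

def total_cost(queries: int, providers: int = 1) -> float:
    """Each provider must duplicate the fixed training cost."""
    return providers * FIXED_TRAINING_COST + queries * COST_PER_QUERY

# Average cost per query declines as volume grows ...
print(average_cost(10_000), average_cost(10_000_000))
# ... and one provider serving 10M queries costs less in total than two
# providers splitting the same market (the subadditivity condition).
print(total_cost(10_000_000, providers=1), total_cost(10_000_000, providers=2))
```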

Machine Learning’s Virtuous Circle

As I noted above, developers of machine-learning-based applications typically terminate training when the costs of additional training (in terms of, say, acquiring additional training data) outweigh the benefits of such training (in terms of predictive accuracy). But what if it were nearly costless to obtain additional data, and if insights from that additional information could be cheaply incorporated into an application’s prediction model? Various advanced machine learning techniques—deep and continual learning, for example—aim to accomplish exactly that, allowing the model to digest new information on an ongoing basis. Such applications further resemble natural monopolies.

Machine-learning-based applications that continue to internalize new data, including information drawn from their practical deployments, may gain an insurmountable lead over putative competitors in their initial competition for the market. This is because the first application in the market gains access to more recent and more relevant training data—data from in-market consumers—before any competitor. Integrating those results into its prediction scheme thus gives rise to better results for the next query. And that next query, again, gives the provider even more recent and relevant data that may further improve its application—and so on. Industry experts and technologists have referred to this process of continual learning as the “virtuous cycle” of artificial intelligence.171 In these examples, the application’s learning curve ascends steeply—it continually becomes more accurate—over a significant portion of the market (perhaps even the whole market), thus making it difficult, if not impossible, for putative competitors to ever catch up.172

Scholars have suggested that this cycle helps explain the natural-monopoly-like tendencies exhibited by some machine-learning-based applications.173 Frank Pasquale, for example, has suggested that search engines (which are sometimes implemented, at least partially, with machine learning technology) resemble natural monopolies, explaining that “[t]he ‘rich get richer’” and that the most popular search services gain access to “ever more data to refine and improve.”174 Steven Weber and Gabriel Nicholas have similarly ascribed “a tendency toward natural monopolies in data platform businesses” to this very “positive feedback loop.”175

Strictly speaking, this feedback loop does not give rise to the conditions that typically define a natural monopoly. As explained above, natural monopolies arise where a service (like telephone service) is most efficiently supplied by a single provider. But the cycle described here does not obviously bear on the supply-side costs of machine-learning-based applications.176 Rather, this positive cycle seems to resemble a sort of demand-side network effect: Each use of the deep learning application increases the application’s value for future uses by improving the application’s predictive functions.177 In short, demand begets demand.

Such network effects have, in this context, natural monopoly-like consequences: As with traditional natural monopolies, one provider may be better than two. If two competing providers were to split the market—and thus split access to the new data made available for continued improvement and retraining—then each application would benefit from only half the newly available data. By contrast, a single application in monopoly control of the market (and the resultant data) would have access to the entire corpus of information, thus performing better than either of the two competitors forced to split the new information.178 As with traditional natural monopolies, one provider is more efficient than two (even though the source of this conclusion rests in demand-side, rather than supply-side, effects).179 And that provider thereby wields significant power—“network power”—over the further development of the market.180
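The arithmetic of this split can be sketched directly. Assuming (hypothetically) a diminishing-returns learning curve of the kind common to these applications, a single provider training on the full corpus outperforms either of two rivals training on half of it:

```python
# Demand-side network effect: rivals who split the in-market data each
# train on half the examples, so each ends up less accurate than a single
# provider holding the full corpus. The curve and figures are hypothetical.

BAYES_ERROR = 0.05

def error_rate(n_examples: int) -> float:
    """Hypothetical power-law learning curve decaying toward Bayes error."""
    return BAYES_ERROR + 0.5 * n_examples ** -0.5

NEW_DATA = 1_000_000  # assumed volume of new in-market training data

monopoly_error = error_rate(NEW_DATA)      # one provider, full corpus
duopoly_error = error_rate(NEW_DATA // 2)  # each rival sees only half

assert monopoly_error < duopoly_error
print(f"monopolist: {monopoly_error:.5f}; each duopolist: {duopoly_error:.5f}")
```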

It is possible, of course, that a second comer might leapfrog the incumbent: Perhaps the incumbent’s training data or model design is deeply flawed, so as to permit a newcomer to do better, even given the disadvantage of being a late mover. But even such cases present some of the problems of natural monopoly discussed above: wastefully redundant efforts, higher input prices, and, during the period of the incumbent’s dominance, problematic rates and service. And, assuming the natural monopoly condition holds, one provider—in this case, perhaps the second comer—is likely to dominate in the end.

Hence, given this similar preference for one provider over two, network effects often give rise to conditions that are “analogous in important ways to that of a natural monopoly.”181 In the different but related context of internet software, Mark Lemley has suggested that network effects may, like traditional natural monopolies, cause a market to “converge on a single product . . . over time.”182 Moreover, that result may “be highly durable and self-reinforcing, as consumers . . . focus their attention on the dominant standard to the exclusion of all others.”183

*   *   *

Applications based on machine-learning technologies have been deployed in a wide range of industries and markets, from law enforcement to healthcare to software development itself. Users rely on these applications, themselves relying on existing stores of data, to classify new inputs and predict future outcomes: Given a vast repository of patient records, for example, an application might diagnose an ailment and suggest a course for treatment.

Such applications, moreover, resemble natural monopolies in several significant ways. First, the basic inputs necessary to build a robust machine learning architecture—expertise, data centers equipped with specialized computing equipment, and vast volumes of data—are costly. Second, the computational costs of converting those volumes of data into a predictive model—the costs of training—are similarly high. But the costs of prediction are comparatively low. Hence, this cost structure—computational and otherwise—may suggest that at least some of the markets for machine-learning-based applications are subadditive: It is most efficient to serve such markets with only one application. Third, even more recent advances in machine learning, like continual learning, suggest that these applications benefit from important network effects. In all, the putative developers of a machine-learning-based service face, in any given market, some important entry barriers—infrastructural expenses, computational costs, and network effects.184 Of course, network effects are not, strictly speaking, suggestive of a natural monopoly. But they nevertheless imply a similar conclusion: In some markets for machine-learning-based applications, one provider can be better than many.

I am careful not to overstate the case: Not every machine-learning-based application operates in a natural monopoly market. Where, for example, an application is easily built on shared infrastructure using freely available information and operates in a market that might sustain a wide degree of product differentiation, the natural monopoly case is comparatively weak. Such applications may include some regression-based predictive models that are based on a limited (and cheaply obtained) dataset, such as criminal risk assessment tools, and also, perhaps, consumer-facing recommendation engines deployed by popular content providers through shared computing infrastructures.185 Still, depending on the strength of some of these features—where, say, multiple (but only a few) companies have the computational resources to develop some medical deep learning application, or where only a few dimensions for product differentiation are available—some markets may nevertheless exhibit concentration through oligopoly or monopoly control.186 And where these features compound because applications are based upon a deep learning model whose costly training algorithms turn on several (billion) parameters and demand specialized expertise and equipment, as well as expensive or difficult-to-obtain data—e.g., GPT-3, facial recognition applications, and deep-learning-based medical applications—the natural monopoly condition seems increasingly likely to be satisfied. In short, the natural monopoly case depends on a close assessment of the features and inputs—computing hardware, data (and related network effects), and computational power—described above.187

Machine Learning and Regulatory Policy

It thus seems likely that at least some machine-learning-based applications operate as natural monopolists. Moreover, empirical examples of consolidation in some application markets may offer some evidence suggestive of this natural monopoly condition.188 Consider, for example, the use of facial recognition technology by law enforcement. Amazon, IBM, and Microsoft all recently exited that market.189 In so doing, these companies cited concerns about law enforcement practices and racial bias, including, especially, unregulated bias in the technology itself.190 Such explanations are certainly plausible. But (more cynically) it is also possible that these companies recognized a market tending toward consolidation in which they were too far behind to make up the difference, given that “none [of these three companies] is a major player.”191 Instead, companies like NEC, which has relationships with law enforcement agencies “from Texas to Virginia” as well as international bodies (like London’s Metropolitan Police), and Clearview AI, which has relationships with “more than 600 law enforcement agencies,” might seem to have a distinct advantage in the battle for training data, and for the market.192 I do not, of course, mean to say definitively that these companies withdrew because they recognized a coming loss in the battle for the market. IBM, Microsoft, and Amazon might have been entirely sincere in their desire to exit after evaluating the social consequences of the technology.193 But their behavior is also consistent with the dynamic of a natural monopoly market.

These observations help illuminate the basis—and possible need—for the regulation of machine-learning-based applications as natural monopolies. Here, the companies most willing to voice concern regarding bias-related accuracy problems withdrew from the market—possibly paving the way for another provider to consolidate its power over the market and, even more concerningly, perpetuate bias without competition.194 In short, competition—competition from providers interested in producing and selling a less-biased service—may not effectively discipline these problems of accuracy and prejudice in the law enforcement market for facial recognition services. This pattern seems likely to repeat: Competition fails in natural monopoly markets, and so regulation has often substituted for the price and quality effects that market competition usually supplies.

But what sort of regulation? As described above, monopolists, including natural monopolists, often have incentives to charge inefficiently high prices or to underinvest in product quality. And these concerns persist in other highly concentrated markets, including those characterized as oligopolies. Such providers have therefore historically faced forms of rate regulation, service specification, or specially induced competition designed to avoid these predictable problems while retaining the efficiencies and network effects of consolidation. I begin by briefly describing these problems of machine-learning-based applications—problems that have been well documented elsewhere in the legal, social science, and computer science literatures—before turning to a solution set drawing on the tradition of regulation described above.

Some Problems of Machine-Learning-Based Applications

Some developers of machine-learning-based applications stand accused of a wide range of infractions, from exploiting an application’s users to generating results that create or perpetuate bias to hoarding information about those users. Such problems are sometimes cast as problems of personal privacy, of antidiscrimination law, or even of copyright.195 But these might also be problems of market power: Consolidation has both enabled undesirable privacy-invading and data-hoarding practices and caused providers to take a lax approach to problems of technical bias. It is, of course, naïve to think that mere market competition could resolve problems of, say, racial bias in law enforcement activity (including automated suspect identification), so I do not mean to suggest that either competition or regulation is a panacea for the problems of discrimination and prejudice that can pervade society. But it is nevertheless true that industry consolidation can entrench these harmful practices, yielding technologies that enshrine and perpetuate problems of prejudice. Some discipline—perhaps through regulation—can help address the effects of such concerns.

I start with matters relating to privacy and private surveillance. Many machine-learning-based applications demand a price from their users beyond those imposed in dollar terms—they also extract information.196 To the extent providers condition the use of a machine-learning-based application on access to some consumer information—to the extent, for example, that Google requires that consumers agree, in exchange for using Google to carry out internet searches, to allow Google to use information derived from those searches (e.g., search terms, the websites consumers visit, etc.) for its own purposes—those developers thereby extract some charge from consumers in exchange for using the application’s predictive capacity.197

Where such providers are monopolists (including natural monopolists), they can charge prices—in terms of both data and dollars—that are undisciplined by competition, to two possible effects. One, these high prices can lead to accumulations of capital—in this case, of personal information—that undermine both distributive values and democratic participation.198 As Julie Cohen has explained, people who feel themselves under constant surveillance—who are made to trade personal information to engage in modern society—lack the freedom to engage in the sorts of expressive and reflective conduct that is both “an indispensable structural feature of liberal democratic political systems” and a necessary element for further development.199 In short, we feel the weight of these pervasive and invasive systems—and that burden restrains our ability to carry out our everyday lives. And two, even skeptics of such an account may concede that monopoly prices are likely to give rise to deadweight losses—i.e., an inefficient underutilization of a socially valuable service.200 Consumers may avoid using certain applications out of concern that those services will extract too much information201—even if those same consumers would be more willing to use a service with stronger privacy guarantees (e.g., a facial recognition application might collect only that which is necessary for the effective performance of the application (like updated images of one’s own face), or it might collect updated images as well as periodic location information, voice print information, unrelated usage statistics, and more).202 And such selection effects may, in turn, further skew the results of continual learning systems.203 In sum, there is a robust concern among scholars, consumer advocacy organizations, and policy makers over the scope and extent of these natural monopolists’ extractive data policies, which both impose significant privacy costs on individual citizens and undermine the use and continued refinement of these applications.204

Indeed, scholars and commentators have explained how machine-learning-based applications—applications built upon the stores of data generated by society (or some subset of society)—yield inaccurate and problematic results.205 In some instances, such inaccuracies are the byproduct of a poorly selected collection of training data—training data that fails to adequately represent the application’s users.206 In others, these inaccuracies both mirror and propagate the systemic biases already endemic to society.207 And in still others, these errors result from flaws in the application’s model design.

IBM, for example, reportedly trained a cancer-treatment application with “hypothetical patient data,” yielding “unsafe and incorrect” treatment recommendations.208 But, apparently unconcerned about these results (and facing little competitive pressure from other providers), IBM sought to bring the product to market notwithstanding these important and substantial inaccuracies.209 And this is not an isolated incident: There are several examples of troublesome, if not downright dangerous, artificially intelligent medical applications.210 Some examples even suggest differences by skin color in an application’s efficacy.211

Likewise, facial recognition applications and other “police surveillance technologies” have been shown both to exhibit increasing industry consolidation (due, at least in part, to the technological and market features elaborated above)212 and to be notoriously inaccurate, with the effects of these errors falling disproportionately on women of color. For example, in their widely cited study of several classification systems, Joy Buolamwini and Timnit Gebru found that darker-skinned females are misclassified in over one-third of all test cases—while lighter-skinned males are misclassified in less than one percent of tests.213 And such results have, predictably, real-world effects. Law enforcement officials in Detroit recently suggested that their facial recognition software misidentifies suspects in approximately 96 percent of cases, sometimes leading to false arrests or to tenuous criminal prosecutions.214 Such inaccuracies in facial recognition systems (and other similar technologies) come with significant costs—but costs that are not borne by systems developers or their direct law enforcement purchasers: Citizens may, for example, be falsely accused of crimes, arrested and held in jail, and kept away from their families and jobs, with substantial consequences.215 But such harms are borne only by private citizens—and, given the embedded bias in these systems, borne disproportionately by people of color—and are not internalized by either applications developers or their buyers in law enforcement.216 Developers thus have little incentive to improve the accuracy of their systems: They may succeed in the market—they may earn contracts to serve law enforcement agencies in major metropolitan cities—even when they fail to accurately identify suspects in 96 percent of cases, with such errors falling disproportionately on those cities’ citizens of color.

In short, many of the now well-known problems attending to machine-learning-based applications might be understood as problems of market power—and, in some cases, the predictable effects of the natural monopoly condition.217 Where any given application will tend to consolidate its control over an application market—the market, say, for cancer diagnostic software—the application’s provider is likely to tend towards higher user prices. And where such applications gain a respite from the competitive pressures that typically force quality improvements, application quality (in terms of, say, accuracy) may be likely to suffer—with the effects of such underinvestments likely to fall on historically marginalized groups, who are, among other things, underrepresented in available training data or represented discriminatorily (as a consequence of extant discrimination).

I do not mean to suggest that these applications only present problems. Rather, their invention suggests some existing problem in need of a solution.218 Medical software and facial recognition technology, among others, all help to solve real, pressing problems. Moreover, to the extent these applications satisfy the natural monopoly condition, such problems seem most efficiently solved by a monopolist. But that solution offered by a natural monopolist gives rise to the further, different problems of monopoly power suggested above. We may thus consider—as prior scholars and policy makers have—how to retain the benefits of the machine-learning-based advance (and the benefits of consolidation), while mitigating the predictable negative consequences of its natural monopoly provision.219

Machine Learning and Monopoly Regulation

Given the predictable consequences of monopoly control, we may turn to the traditional regulatory measures that policy makers have long deployed against natural monopolies. As described above, regulators have often required that natural monopolists and providers in concentrated markets offer reasonable service at reasonable rates in order to substitute for the price and quality effects typically supplied by competition.220 In some instances, policy makers have altered a market’s landscape to make competition viable. Similar principles can apply to machine-learning-based applications, too.

As before, the suggestions elaborated below operate, by necessity, at some level of generality: Just as there are various types of machine learning implementations that vary in their technical details, so too are there numerous machine-learning-based applications—medical and diagnostic, facial recognition, search, and so on—each presenting its own specific regulatory challenges. But, setting these differences to one side, we can nevertheless distill some general principles, founded on the natural monopoly tradition, for modes of regulation that address several of the problems attending to machine-learning-based applications.

Informational Rate Setting for Privacy Harms

Given the likelihood that consumers will face monopoly prices, one straightforward possibility is to regulate the prices, in dollar terms, that service providers charge users. Consider, for example, a machine-learning-based diagnostic service that, as a natural monopolist, has captured its market. As with other monopolists, the diagnostic provider is likely to charge a profit-maximizing price—one that, as described above, may give rise to social losses by leaving some consumers (here, healthcare providers and their patients) unable to access the benefits of the new technology.221 Hence, cost-based regulation of the prices charged by such services providers might help ensure comparatively wider adoption of the medical advance—and the concomitant realization of any associated health benefits.222 Such price-oriented rate setting presents a standard, but important, regulatory response to natural monopolies in general, and need not turn on any feature specific to machine-learning-based natural monopolies.223

This approach, moreover, hints at another response that is more specific to the problem of data extraction described above.224 As other scholars have explained, and as I described above, some machine-learning-based applications extract value from their users in terms of data—data that is often repurposed for various ends, including advertising sales, the continued training of that application itself, and the cross-subsidization of new applications, among other purposes—at monopoly levels.225 In short, where applications providers are monopolists, there is cause to be concerned about the nature of that informational charge: It is not disciplined by competition and may be both extractive and inefficiently high.226

Regulators can set a more efficient and fair informational price.227 Of course, the analogy is not perfect: Information is neither truly rivalrous, nor a currency with which applications providers can pay salaries or purchase computing equipment. But, as explained, access to information can be limited or controlled, and this information supplies a direct input into these (and other) applications.228 And so regulators might limit information collection by setting an upper bound on the sorts of information that may be collected (including, perhaps, none at all), or they may allow developers to use only that which is reasonably necessary for the continued improvement of the regulated application. In short, regulators may adapt traditional rate-setting approaches—rate-of-return regulation and price-cap regulation—to the monopolists’ information collection practices, setting out rules for data collection and use.

Under a rate-of-return-styled approach, regulators may set data collection rates and use limitations by reference to the application’s needs. Recall that standard rate-of-return regulation sets prices (in dollar terms) by reference to the real costs of production: Regulators attempt to discern the monopolist’s total costs (including anticipated expenditures) and set prices to cover only those costs and some reasonable margin for profit.229 Likewise, in data contexts, regulators might attempt to discern an application monopolist’s real information needs—what information is useful to, say, an application’s continued learning (as opposed to information that is instead used to cross-subsidize the development of some other, unrelated product)—and limit data collection and use to those terms: They might, for example, allow diagnostics developers to collect relevant medical information, but preclude them from accessing other patient details, including names, addresses, or billing details.230 In short, regulators can require both data minimization (i.e., reductions in the amount of data collected and retained) and data protection practices (including, e.g., retention limits and anonymization, as appropriate),231 thereby helping to relieve the pressures of constant surveillance and to ensure wider adoption of these advances.
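In software terms, such a rule amounts to a regulator-defined whitelist of collectible fields. A minimal sketch (the field names and the permitted set are hypothetical):

```python
# A sketch of data minimization under a rate-of-return-styled rule: the
# regulator enumerates the fields deemed reasonably necessary for the
# application's continued learning, and all other patient details (names,
# addresses, billing information) are stripped before ingestion.
# The field names and the permitted set are hypothetical.

PERMITTED_FIELDS = {"symptoms", "lab_results", "diagnosis_history"}

def minimize(record: dict) -> dict:
    """Retain only the regulator-permitted fields of a patient record."""
    return {k: v for k, v in record.items() if k in PERMITTED_FIELDS}

record = {
    "name": "Jane Doe",
    "address": "123 Main St.",
    "billing_id": "A-991",
    "symptoms": ["cough", "fever"],
    "lab_results": {"wbc": 11.2},
}
assert minimize(record) == {"symptoms": ["cough", "fever"],
                            "lab_results": {"wbc": 11.2}}
```

A rate-of-return-styled regulator would periodically revisit the permitted set as the application’s demonstrated needs change.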

Of course, rate-of-return regulation can sometimes be prone to error. As noted earlier, regulated entities have an important advantage over regulators: In order to set rates correctly, regulators need access to information about the regulated entities’ internal operations.232 Regulated entities in turn have incentives to inflate their requirements, to exaggerate the difficulties associated with existing limits, and to entangle separate products and markets with the regulated one. In short, regulated entities might persuade regulators to set rates too high, at the expense of both welfare and distributional values (and it is important to consider both): Applications that are too costly to use will yield medical benefits that are inefficiently allocated (and will internalize new data from only a selected population), and applications that extract more information than necessary will further burden our individual and collective privacy.

When faced with such problems of rate-of-return regulation, regulators have often switched to a mode of price-cap regulation. We can adapt a similar approach to the machine learning context, too: Rather than setting rates based on applications developers’ own statements of their needs, price-cap regulators would tell providers what consumers may be made to offer (within reason and based, perhaps, on benchmarks), leaving it to those providers to work within those constraints.233 Indeed, some existing privacy statutes can be conceptualized as price-cap rules: The Health Insurance Portability and Accountability Act, for example, defines certain classes of regulated entities, including healthcare providers and insurers, sets caps on the sorts of information those entities may collect, and limits how such information may be used.234

Hence, this mode of rate setting is analogous to calls for federal privacy legislation—federal rules that would directly constrain data collection and use by a vast range of companies, including, presumably, machine-learning-based applications developers. But while much privacy regulation (both in Congress and through the courts) has traditionally focused on individual rights and harms, the tradition of monopoly regulation takes a wider view, considering the broad range of market harms and benefits that accrue from certain pricing (and service) practices.235 A similar approach to informational rate setting can help to balance the privacy costs—both to individuals as well as to the public at large—against the benefits of various machine-learning-based applications.

Moreover, an agency model—one which vests authority over the specifics of a data-rate regime with executive departments—can be tailored more precisely to context, both across sectors and over time. Consumers may, for example, be more willing to trade sensitive personal information for superior medical diagnostic services than for better music content: Some might give up their entire medical history for a better diagnosis but would not trade their favorite color for a better song recommendation. And a regulatory model may be better able to capture such nuances than omnibus legislation can. A regulatory model is also likely to be better able to respond to changes over time: An unexpected pandemic, for example, may shift both the need for, and the willingness to share, information related to a disease, its symptoms, and its spread.236 Regulators may be able to react to such requirements faster, and with greater expertise, than legislators. Though a complete description of the institutional design of data rate setting is beyond the scope of this project, it is possible, if not likely, that the need for context, expertise, and agility all suggest an approach that vests decisional authority with expert agencies (over one that relies upon direct congressional action).237

In addition to the benefits set out above—a more efficient allocation of the advances offered by machine-learning-based applications, and fewer concerns about developer fairness—price-cap regulation seems likely also to fare better along dimensions of transparency, accountability, and legitimacy. Rate setting is a participatory and political process: It invites public comment and debate. And such public participation can help ensure that rates reflect public input, and that, if they do not, regulators may be held to account by elected leaders or the voting public. So too could data rate caps be set participatorily, by reference to shared public values rather than private developers’ demands.238 While I set to one side, for now, the institutional design specifics that might attend to such data rate setting, it suffices for present purposes to note that, under a price-cap regime, decisions about the amount of private control over personal information—the extent of the incursion into private information spaces society is willing to tolerate—could be made by the public’s representatives, with direct public input, in a participatory and democratically accountable process (rather than by a closed set of monopoly or oligopoly providers).

Hence, though defining the scope of a reasonable intrusion into our privacy (in exchange for some technological advance) remains an unresolved and perhaps interminable problem in the abstract, our democratic processes offer one way of addressing the commensurability challenges that come with informational rate setting. Any regulatory intervention comes with some costs and benefits: As developers access more information, they may be able to generate more accurate results, or even new applications altogether—but these informational concessions come at a great cost to our collective privacy.239 These competing concerns present a difficult commensurability problem, requiring us to trade privacy values against the uncertain potential of innovations in health, safety, and convenience—but the democratic processes and political checks that attend to public rate setting help address this quagmire.240 Of course, any rate setting approach carries some risk either that regulators will be too aggressive in limiting providers’ access to resources, or, conversely, that regulators will be captured by developers and cater to their priorities.

Hence, I do not mean to suggest that rate setting is an appropriate response to every application that relies on machine learning—especially since, as noted above, such applications are beginning to pervade the economy. Rather, the appropriate regulatory response depends—as it has in previous settings—on a difficult and prospective, but necessary, error cost analysis: Given the possibility both that regulators may err (by setting prices incorrectly) and that developers may err (by setting prices inefficiently and in conflict with shared public values), which outcome is more likely, and which error is more severe?241 Here, as before, public officials—accountable to a voting public—may decide when the downside risk of regulator error is preferable to the costs imposed by a provider. In cases where the natural monopoly condition seems likely to be satisfied (or where significant concentration is likely), yielding predictable price effects, and where the consequences of error are greater (in, say, medical or law enforcement applications), the case for regulatory intervention is comparatively stronger.

In sum, to the extent machine-learning-based applications developers are indeed natural monopolists, competition fails to discipline their prices, in terms of both dollars and data. Our longstanding response to such power over price has been rate setting: From the railways to the telephones (and beyond), regulators have long set rates for natural monopolists to avoid both undue concentrations of capital and the social losses of monopoly pricing, the risk of regulator error notwithstanding. An analogous approach might yield similar benefits when it comes to machine-learning-based applications and to prices set in terms of information: Whether we choose a rate-of-return or a price-cap approach to data rate setting, regulators can reduce extractive, inefficient, and privacy-invading information collection through more public, participatory, and democratic processes.

Service Standards for Accuracy and Bias Harms

As described above, natural monopoly regulation has traditionally focused not only on price effects, but also on service quality.242 In ordinary markets, competition spurs providers to offer not only better prices, but also better products. But because competition tends to fail in natural monopoly markets, regulators have often issued service specifications that aim both to set a quality floor and to advance social goals related to the natural monopoly condition (e.g., democratic-inclusion values that also help to maximize network effects). Communications and electricity systems regulators, for example, have historically required that monopoly providers meet basic service-level requirements and that they act as “carriers-of-last-resort” to all residents in their service areas, including those who are otherwise too expensive to serve (because, say, they live too far from the providers’ existing facilities). Such rules ensure that these residents are not excluded from the advances offered by the regulated service and that they are connected to, and made a part of, the national conversations that run atop these communications networks.243

Regulators may likewise set service requirements for machine-learning-based applications along a variety of dimensions to reduce the harms that result from algorithmic error and to guard against the risk that such harms will exacerbate existing inequities.244 Consider accuracy. As I explained above, the accuracy of an application’s prediction model is determined, in significant part, by the availability of (comparatively cheap) training data. Hence, machine learning engineers tend to terminate training where the costs of additional training—costs in terms of data, time, and other inputs—outweigh the benefits in terms of accuracy improvements.245 In a competitive market, such improvements can be quite valuable: Each provider has an incentive to offer a product that is better—e.g., more accurate—than its competitors’. In short, market success may depend on predictive accuracy, and so providers have incentives to invest in a well-trained application. Such incentives are decidedly weaker, however, where competition is diminished: As noted above, a less accurate model may do as well in the market as a more accurate one, so providers may forgo some application training to reduce costs.246 Hence, in such cases, a provider may decide to terminate training early—or, stated slightly differently, may terminate training where the private costs exceed the private benefits—ignoring the public benefits of increased accuracy.
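The termination decision described above is, at bottom, a marginal cost-benefit comparison, and it can be illustrated with a brief sketch (in Python, with purely hypothetical costs and accuracy figures): a profit-maximizing provider stops training once the private value of the next accuracy gain falls below the private cost of the data needed to achieve it, even where further, publicly beneficial accuracy remains attainable.

```python
# Hypothetical illustration: a provider stops training when the marginal
# private benefit of one more batch of data falls below its marginal cost.

def private_stopping_point(accuracy_curve, cost_per_batch, value_per_unit):
    """Return the batch index at which a profit-maximizing provider stops.

    accuracy_curve: model accuracy (0-1) after each successive data batch
    cost_per_batch: private cost of acquiring and training one more batch
    value_per_unit: private value of one full unit of accuracy
    """
    for i in range(1, len(accuracy_curve)):
        marginal_gain = accuracy_curve[i] - accuracy_curve[i - 1]
        if marginal_gain * value_per_unit < cost_per_batch:
            return i  # training ends here, whatever the public benefits
    return len(accuracy_curve)

# Diminishing returns: accuracy climbs quickly at first, then plateaus.
curve = [0.70, 0.82, 0.88, 0.91, 0.925, 0.93, 0.932]
print(private_stopping_point(curve, cost_per_batch=2.0, value_per_unit=100.0))
```

With these hypothetical numbers the function returns 4: training halts once the fourth marginal gain no longer pays for itself, leaving the remaining (smaller) accuracy improvements unrealized.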

Regulatory intervention can align private incentives with public priorities.247 In the context of accuracy, for example, regulators may set output standards requiring that, say, facial recognition applications deployed in law enforcement contexts meet certain thresholds—that any indication of a “match” is correct in at least, say, 96 percent of cases (rather than incorrect in 96 percent of cases)—perhaps by passing audits for bias and fairness.248 Indeed, proponents agree that such “appropriate guardrails” should apply to law enforcement use of facial recognition technology in order to prevent the sorts of false arrests and the associated grave consequences described above.249 Moreover, such standards might be set independently of the provider’s private training incentives: Where an unregulated provider might prefer to train applications on hypothetical data or easily obtained public information scraped from the internet, and to terminate training once such cheap data runs out, a regulated provider must continue to source data, and to train and optimize its application, until it satisfies regulatory standards.250 More generally, public standards can force developers into particular technical pathways that help satisfy public goals in addition to private aims.251 And, as before, those public standards can be set through public and participatory processes—ones that can lend greater public confidence and legitimacy to the application and its use (especially for public purposes, as in law enforcement).252
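An output standard of this sort reduces, operationally, to a compliance test. The following sketch (hypothetical Python; the 96 percent floor and the group labels are assumptions for illustration) checks whether a system’s “match” indications are sufficiently precise both overall and within each demographic group represented in the audit data:

```python
# Hypothetical regulatory audit: do "match" indications meet an assumed
# accuracy floor, both overall and for every group in the test data?

def passes_audit(results, floor=0.96):
    """results: (group, predicted_match, actual_match) tuples from test data."""
    def match_precision(rows):
        flagged = [r for r in rows if r[1]]          # cases flagged as a "match"
        if not flagged:
            return 1.0                               # nothing flagged, nothing wrong
        return sum(1 for r in flagged if r[2]) / len(flagged)

    groups = {group for group, _, _ in results}
    return match_precision(results) >= floor and all(
        match_precision([r for r in results if r[0] == g]) >= floor
        for g in groups
    )
```

Note that a system that is accurate in the aggregate can still fail such an audit where its errors are concentrated in a single group, which is precisely the algorithmic redlining concern taken up below.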

Likewise, regulatory intervention may help avoid the problems of preexisting and technical bias that have been documented in many machine-learning-based applications.253 As described above, many of these harms fall disproportionately on people of color, women, and other historically underrepresented and disadvantaged populations.254 In short, these systems are biased in ways that both reflect and perpetuate existing biases in society (and in the data that society produces), sometimes because reliable training data on minority populations can be comparatively expensive or difficult to procure, or because such reliable data cannot exist in a society that produces intrinsically biased outcomes. Regulators, however, can require that developers employ practices tempering the extent to which continually learning applications propagate or magnify biased outcomes; that they obtain sufficient (if costly) training data encompassing historically underrepresented and disadvantaged populations (either by directly regulating data inputs or by testing an application’s output); and that they remain current by training on new and updated datasets. Indeed, where intellectual property concerns might limit or cabin access to such data, policy makers may be able to overcome such limits through the exercise of regulatory authority or even eminent domain.255

Some scholars and commentators have suggested that regulators are ill-suited to such interventions, both because regulators lack the expertise necessary to evaluate machine-learning-based applications, and because it is difficult to apply such output-oriented standards to dynamic, continually learning applications.256 I agree that such service specification will require that some agencies and regulators build new capacities and expertise, and that developing standards for dynamic systems may prove difficult. But neither obstacle is, in my view, insurmountable—nor do I think we should abandon the challenge.257 Some public agencies, for example, already conduct periodic audits of machine-learning-based applications to detect bias.258 Others, like the FDA, already mandate an approval process that includes such testing.259 It is therefore plausible to imagine a wider range of similar requirements that both set a quality floor and implement public antidiscrimination goals (among other policy possibilities). Whether it is prudent to do so depends, again, on a cost-benefit analysis that turns on the comparative likelihood and severity of both regulator error (including dynamic effects on innovation) and developer error (including the predictable effects of natural monopoly power, where it exists).260 It seems overwhelmingly likely that, in at least some instances, the disconnect between public and private incentives has yielded applications that impose significant costs on the public.261 And while I do not discount the possibility that regulators will seek to entrench themselves, we might nevertheless prefer that these decisions about tradeoffs between private value and public good—or between competing public concerns—be made by regulators and policy makers who can be held to account to elected leadership and a voting public.262

In short, regulators have long set service rules for natural monopolists, both to ensure a minimum quality standard and to implement important social goals. Communications providers, for example, are subject to service quality requirements that pertain to call completion and emergency communications—they must, for example, make best efforts to complete every call sent over their network, with special rules applying to emergency 911 calls.263 Moreover, communications providers must serve as carriers-of-last-resort to all citizens in the regions they serve (including remote and difficult-to-reach clients), and are prohibited from redlining their service areas—drawing the boundaries of their networks to exclude low-income or minority populations. Regulators might draw up similar rules for machine-learning-based natural monopolies. For example, regulators might set data security standards or mandatory accuracy benchmarks, thereby issuing rules that supply training incentives that the market fails to produce.264 And they may, as with existing bias audits,265 check for compliance against test data that encompasses a diverse range of populations in order to limit the harms of algorithmic redlining and to advance the goals of equality, nondiscrimination, and democratic inclusion that have long been a part of natural monopoly regulation.266

Toward Competition

As explained above, rate regulation and service specification aim to substitute for market competition, given that the absence of competition in a natural monopoly market yields predictable price and quality effects, and, hence, the effects of regulator error may be less severe than the effects of monopoly control. But, as noted above, some modes of regulation aim to minimize the effects of both regulatory missteps and monopoly power by inducing efficient competition through, e.g., franchise bidding or lowering the costs of market entry (thereby undermining the natural monopoly condition).267 Similar solutions may address these concerns in machine learning contexts, especially where the risks of regulator error seem especially grave, or the natural monopoly case seems borderline.268

I begin with the analogy to franchising. In essence, a franchise is a license to serve some designated population: In many locales, only franchised cable operators may deploy a cable network and offer service to residents.269 And such franchises were historically earned through a process of competitive bidding that enabled local regulators to select the provider that promised the best prices—or, in some cases, the best value, by comparing providers’ promised prices alongside dimensions of service quality.270 In short, franchising aims to reproduce the effects of competition by shifting from costly and inefficient intramarket competition to competition among bidders for the market. Such intramarket competition is likely costly from a privacy perspective, as several competing providers have independent incentives to collect and recollect personal information (as opposed to making such data available to a single, trusted, and regulated provider).271

Moreover, Ken Bamberger and Deirdre Mulligan have detailed the many shortcomings of public officials’ current competitive processes for selecting machine-learning-based applications. Acting as procurers of technologies, public agencies often seek bids from technology providers that focus narrowly on matters such as price.272 Bamberger and Mulligan urge a turn away from a narrow procurement mindset and towards an approach encompassing “expertise, transparency, participation and political oversight, and reasoned decision making.”273 And while traditional franchising processes sit somewhere between pure government procurement and public policymaking, Bamberger and Mulligan are correct to urge decisionmakers to consider more fully the ramifications of their decisions. When public agencies select a provider for, say, a law enforcement application, they may well be helping to entrench a monopolist—and to entrench the biases and values embedded in that monopolist’s service. And so policy makers (which, as Bamberger and Mulligan rightly note, include public procurement officials) must take greater care to consider the range of concerns—from privacy to accuracy and bias—that are implicated in the procurement of machine-learning-based applications.274

Policy makers have also, as explained above, sometimes sought to change the economics of market entry to induce efficient competition. The 1996 Telecommunications Act offers one example: There, Congress sought to induce entry into local telecommunications markets (sometimes considered a natural monopoly) by requiring that network owners share their wires with competitors on reasonable and fair terms. Competitors could thus avoid the significant fixed, upfront investments required to enter the market, choosing instead to build out their networks incrementally (relying on incumbent providers to fill gaps in the interim).275 And some municipalities built public infrastructure that they leased to various competing private providers.

Similar tools might apply to the fixed investments that machine-learning-based applications demand, including both computational hardware and data. As explained above, designing and training machine-learning-based applications often demands specialized (and expensive) computing infrastructure.276 Such infrastructure is already available on shared terms: Amazon Web Services, for example, sells access to its massive computing resources to applications developers (among many others). But as concerns about exclusion and affiliation in such markets grow—as such shared infrastructure may not be available to developers building applications that compete with, say, Amazon’s other businesses277—policy makers may decide to ensure that such “machine-learning-as-a-service” offerings (to the extent they are available from only one or a few providers) are available on fair and reasonable—i.e., neutral—terms. Or, beyond these behavioral remedies, policy makers may choose structural solutions, such as a separations rule.278 Policy makers may also consider investing in a public computing infrastructure, akin to some local public investments made in communications architectures: Such a public option for machine learning may likewise discipline existing infrastructure providers’ terms and rates through competition (and give applications developers another option).279

Other scholars, including Amanda Levendowski, as well as Mark Lemley and Bryan Casey, have argued against intellectual property protection for training data, a change that would reduce the costs of data acquisition.280 And while freeing such reliable data from copyright’s constraints can reduce these fixed costs, copyright is only one barrier. Other data is protected by trade secret.281 Private agreements—e.g., exclusive data sharing agreements between, say, developers and medical providers—can also restrain the development of competing applications, so policy makers might consider an even more expansive approach. One possibility is to limit such exclusive agreements and mandate data sharing through, say, models of federated learning—i.e., “an approach to machine learning where a shared global model is trained across many participating clients that keep their training data locally.”282 Another possibility is to create public datasets, in order to induce greater competition.283 One concern, of course, with public datasets is that they will push developers to some suboptimal but uniform standard. So a federated learning approach among competitors may be more attractive, especially because, as explained above, consolidating such information into a single model can give rise to accuracy-improving network effects.284
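The federated learning model quoted above can be made concrete with a minimal sketch (in Python, with hypothetical clients and data, and a one-parameter “model” standing in for a neural network): each participant trains the shared global model on data that never leaves its hands, and only the resulting parameters are averaged together.

```python
# Minimal sketch of federated averaging: a shared global model is trained
# across clients that keep their data locally, sharing only parameters.
# The "model" is a single slope w in y = w * x; all data is hypothetical.

def local_update(w, local_data, lr=0.1):
    """One pass of local gradient steps; the (x, y) pairs never leave the client."""
    for x, y in local_data:
        error = w * x - y
        w -= lr * error * x
    return w

def federated_round(w, clients):
    """Average the clients' locally trained parameters into a new global model."""
    updates = [local_update(w, data) for data in clients]
    return sum(updates) / len(updates)

# Two providers (say, hospitals) hold data they cannot pool directly,
# though all of it reflects the same underlying relationship, y = 2x:
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
print(round(w, 4))  # → 2.0: the shared model is learned without pooling raw data
```

The consolidation benefit noted above survives this arrangement: the global model reflects every client’s data even though no client’s raw data is ever shared.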

And finally, as explained above, antitrust’s essential facilities doctrine limits the extent to which a provider may leverage a natural monopoly into a monopoly in an adjacent market (such as leveraging control over power transmission lines into a monopoly over power generation, or leveraging control over telecommunications networks into a monopoly over telephone devices).285 Similarly, in the context of machine-learning-based natural monopolies, providers might leverage their monopoly control over one market into another in at least two ways. First, such providers may adjust their predictive model to prefer affiliated services: Google searches, say, for “Xi’an Famous Foods reviews” might yield results from Google’s own restaurant review service, notwithstanding a consumer preference for a competing provider (such as Zagat),286 or a diagnostics system provider might recommend an affiliated treatment, even where a generic or other therapeutic equivalent is available. Second, a monopoly provider might use the data it collects from consumers to cross-subsidize the development of some separate product. Indeed, the Fifth Circuit has already recognized an essential-facilities claim to data used by natural monopolists in this way (though in the somewhat antiquated context of telephone directories).287 In total, natural monopoly providers may be able to use their monopoly power in one market to help develop and distribute products in adjacent markets—so, as regulators seek to induce competition in putatively regulated markets, they must take care to safeguard competition in adjacent markets as well.288

*   *   *

If some machine-learning-based applications might be natural monopolies, then some such applications might behave as natural monopolists. Such applications might, for example, make unreasonable or inefficient demands of their consumers, extracting more information than is reasonably necessary for the application’s continued training. Such developers might also fail to make appropriate investments in an application’s training, yielding inaccurate or prejudiced results. Therefore, we may wish to regulate these markets in order to limit these effects of market power, all while retaining the benefits of the machine-learning-based advance (and of consolidation).289

I emphasize that I do not mean to declare open season on all applications developers, granting regulators a license to target any machine-learning-based system. Rather, as noted above, the nature and extent of any regulatory response depends on a close analysis of the application at issue, the likelihood that the natural monopoly condition is satisfied, and a comparative analysis of the costs and benefits of regulation against the costs and benefits of concentration. Monopolists have incentives to charge too much or to skimp on service. But regulators can make critical errors, too—either by setting prices too high and quality standards too low (thus replicating the effects of monopoly power), or by setting prices too low and quality standards too high (thereby deterring investment and innovation). Therefore, any regulatory intervention must trade the possibility of regulator error against the likelihood of the exercise of monopoly power and must consider the benefits of the regulation alongside the benefits of consolidation.290 In some cases
—medical applications, say, where computational demands are high, data are expensive, and the costs of monopoly too risky—a more stringent approach may make sense. In other contexts, some basic quality standards and periodic audits may suffice. And in still others, the availability of shared computing infrastructure and cheap, reliable data, alongside possibilities for market differentiation, can suggest the possibility for a competitive market. In short, there is space for a pluralistic approach even within this domain of regulation. Moreover, such an evaluation must itself be dynamic: As commentators from Schumpeter to Breyer have remarked, natural monopolies may be transient, undermined by creative destruction, disruption, and innovation.291 But so long as the natural monopoly condition is satisfied, policy makers should seriously consider the possibilities for rules founded on this longstanding regulatory tradition, protecting the public from the harmful distributional and welfare effects of monopoly power—the privacy-invading and data hoarding practices, the misalignment among private incentives and public preferences, and the entrenchment of bias.

Conclusion: Public Regulation and Private Power

Since the advent of the railway system, policy makers have sought to regulate natural monopolies in order to balance the harms of monopoly power—e.g., inefficient pricing, substandard service—against the gains of consolidation and network effects in such markets. Now, a new class of natural monopoly is emerging. Just as prior natural monopolies helped to transform the nation, machine-learning-based applications are remaking the economy in countless substantial ways.292 Moreover, the fixed costs of developing these applications—fixed costs in terms of computing equipment, data, and computational power—together with the virtuous cycle of artificial intelligence suggest that at least some markets may be best served by a single application (as opposed to multiple competitors).

Hence, as a range of commentators across disciplines have begun to document several problems with these applications—problems of accuracy, bias, privacy, and security, among others—we may both understand those problems as ones of monopoly power and draw from the tradition of market regulation to address them. If developers make unreasonable data demands of their consumers, undermining the availability (and accuracy) of the application, then regulators may place limits on the use and collection of personal information. And if applications—freed from the constraints of competition—make unsafe or prejudiced recommendations, then regulators may subject them to accuracy standards.

I do not mean to suggest that making these regulatory determinations will be easy or straightforward: Indeed, it may well be difficult to set data rate caps that reconcile interests in innovation against those in individual privacy, and to set and enforce standards against dynamic and evolving systems. Hence, the case for regulation may be comparatively strong in some contexts, while weak or nonexistent in others. It depends both on the features described above—the fixed equipment and data costs, the computational costs of training, and data network effects—that suggest the existence of a natural monopoly, and on the likelihood and severity of regulator error and developer power—the costs to innovation and investment of regulatory intervention, the costs to privacy of intrusive information demands, and the costs to society of entrenching and propagating bias.

Alongside these concerns, at least one more dimension merits consideration. We must decide not only whether regulators or applications developers are more likely to impose costs on society, but also the extent to which each may be held to account for their inevitable blunders.293 In short, we must decide between public regulation and private power. Here, a regulatory approach offers much in the way of transparency, public participation, and legitimacy—while also helping to ensure that the advances made possible with machine learning yield reasonable service at reasonable rates.

  1. See generally Sarah H. Gordon, Passage to Union: How the Railroads Transformed American Life, 1829–1929 (1996) (describing and examining the economic, legal, and social effects of the expansion of the railways nationwide).

  2. See, e.g., Eric Biber, Sarah E. Light, J.B. Ruhl & James Salzman, Regulating Business Innovation as Policy Disruption: From the Model T to Airbnb, 70 Vand. L. Rev. 1561, 1589–90 (2017).

  3. See, e.g., Tim Wu, The Master Switch: The Rise and Fall of Information Empires 4 (Vintage Books 2011). See generally Jonathan E. Nuechterlein & Philip J. Weiser, Digital Crossroads: Telecommunications Law and Policy in the Internet Age (2d ed. 2013) (describing the development of the nation’s telecommunications industry (and of its telecommunications policy)).

  4. See, e.g., Stuart Russell, Human Compatible: Artificial Intelligence and the Problem of Control 62–102 (2019) (cataloguing near-term and long-term changes based on artificial intelligence and suggesting that, cumulatively, these developments “could change the dynamic of history”).

  5. See, e.g., Jason Furman & Robert Seamans, AI and the Economy, 19 Innovation Pol’y & Econ. 161, 161–62, 177, 181–82 (2019); see also James Bessen, Stephen Michael Impink, Lydia Reichensperger & Robert Seamans, The Business of AI Startups 2–3, 24–25 (BU Sch. of L., L. & Econ. Series, Working Paper No. 18-28, 2018) (describing the impact of artificial intelligence on labor markets across sectors).

  6. Stuart J. Russell & Peter Norvig, Artificial Intelligence: A Modern Approach 28 (3d ed. 2015) (explaining that “many thousands of AI applications are deeply embedded in the infrastructure of every industry” (quoting Ray Kurzweil, The Singularity Is Near: When Humans Transcend Biology 204 (2005))); see also Pedro Domingos, The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World, at xi (2015) (“You may not know it, but machine learning is all around you.”).

  7. E.g., George L. Priest, The Origins of Utility Regulation and the “Theories of Regulation” Debate, 36 J.L. & Econ. 289, 295 (1993) (explaining that natural monopolies “have traditionally provided the strongest public interest justification for regulation”).

  8. Omega Satellite Prods. v. City of Indianapolis, 694 F.2d 119, 126 (7th Cir. 1982); see also Richard A. Posner, Natural Monopoly and Its Regulation, 21 Stan. L. Rev. 548, 548–50 (1969) [hereinafter Posner, Natural Monopoly] (describing and explaining the natural monopoly concept). Judge Posner was ultimately skeptical that this theory for regulation applied to cable television systems, as Omega Satellite (and his other writings) suggest. Omega Satellite, 694 F.2d at 126 (describing the finding of natural monopoly conditions as “sketchy”); Richard A. Posner, The Appropriate Scope of Regulation in the Cable Television Industry, 3 Bell J. Econ. & Mgmt. Sci. 98, 111 (1972) [hereinafter Posner, Scope]; see also Richard A. Posner, Theories of Economic Regulation, 5 Bell J. Econ. & Mgmt. Sci. 335, 337–39 (1974) [hereinafter Posner, Theories] (finding a reformulation of the public interest theory to be unsatisfactory).

  9. See, e.g., Cmty. Commc’ns Co. v. City of Boulder, 455 U.S. 40, 43–46 (1982).

  10. Nuechterlein & Weiser, supra note 3, at 32–40; see also Delos F. Wilcox, Municipal Franchises: A Description of the Terms and Conditions upon Which Private Corporations Enjoy Special Privileges in the Streets of American Cities 252–396 (1910) (describing, in detail, provisions in some of the earliest municipal franchise contracts).

  11. Nuechterlein & Weiser, supra note 3, at 34–35.

  12. An alternative theory, advanced by George Stigler (among many others), is that regulation is itself a commodity that is best understood as demanded by some and supplied by others. See George J. Stigler, The Theory of Economic Regulation, 2 Bell J. Econ. & Mgmt. Sci. 3, 11–16 (1971). Judge Posner compared Stigler’s theory both to the “public interest” theory of regulation elaborated above and to a somewhat more cynical “capture” theory of regulation, and concluded that none adequately explained all utility regulation (which includes, but is not necessarily limited to, natural monopoly regulation). See Posner, Theories, supra note 8, at 343–44. Others have since taken a more pluralistic approach. See Priest, supra note 7, at 293–94.

    I do not mean to weigh in on this longstanding debate over the true origins of utility regulation, though I suspect that no single theory can adequately explain the true origins of utility regulation across all states and municipalities nationwide. It suffices for my purposes to note that certain forms of regulation can be—and have long been—justified by the (apparent) existence of natural monopoly conditions. See Wilcox, supra note 10, at 28–33 (explaining that municipal utility franchising is explained, in part, by “the natural tendency toward monopoly” in such markets).

  13. See, e.g., Biber et al., supra note 2, at 1591–92; Michael E. Levine, Revisionism Revised? Airline Deregulation and the Public Interest, 44 L. & Contemp. Probs. 179, 191–93 (1981); see also infra notes 14–15; cf. Clayton M. Christensen, The Innovator’s Dilemma: When New Technologies Cause Great Firms to Fail 9–19 (1997) (describing such disruption). But see Priest, supra note 7, at 290 (suggesting that it was skepticism for this “public interest” natural monopoly theory of regulation—skepticism driven by the Journal of Law and Economics itself—that “significantly contributed to the movement toward broad deregulation”). One other explanation for the fading popularity of such regimes lies in the problems of regulatory capture and the difficulties with rate setting described above. I address these problems in more detail infra notes 74–78 and accompanying text. And, as noted above, I do not mean to discount the possibility that changes in the regulatory apparatus were motivated by something other than—or something in addition to—these technological developments. Rather, as above, it suffices for my purposes to note that changes in the regulatory regime may be—and have been—justified by technological changes that affect market structure (no matter whether we can pinpoint any single or set of historically accurate rationale(s) for such shifts in the prevailing regulatory paradigm). See supra note 12 and accompanying text.

  14. MCI was initially known, more formally, as Microwave Communications, Inc., but was later renamed MCI Communications Corp. In either event, the company was one of AT&T’s primary competitors in the long-distance telephony market (among other markets). See generally Philip Louis Cantelon, The History of MCI 1968–1988 (1993) (describing a history of MCI from 1968 to 1988).

  15. See W. Kip Viscusi, Joseph E. Harrington, Jr. & John M. Vernon, Economics of Regulation and Antitrust 534–44 (4th ed. 2005) (describing the economic significance of microwave transmission); see also Nuechterlein & Weiser, supra note 3, at 40–51 (describing the developments leading to the enactment of the (largely deregulatory) Telecommunications Act of 1996).

  16. [16]. See, e.g., Alfred E. Kahn, The Economics of Regulation: Principles and Institutions, at xv–xxxvii (1988); see also Stephen Breyer, Regulation and Its Reform 237–39 (1982) (discussing deregulation in the trucking industry).

  17. [17]. Press Release, White House Off. of Sci. & Tech. Pol’y, The White House Launches the National Artificial Intelligence Initiative Office (Jan. 12, 2021), https://trumpwhitehouse.archiv []; see National Artificial Intelligence Initiative, 15 U.S.C. § 9411 (2021); Exec. Order No. 13,859, 84 Fed. Reg. 3967 (Feb. 11, 2019); Christopher S. Yoo & Alicia Lai, Regulation of Algorithmic Tools in the United States, 13 J.L. & Econ. Regul. 7, 9–15 (2020) (providing overview of federal regulatory initiatives for various artificial intelligence applications).

    The Risk Assessment Framework is meant to help “better manage risks to individuals, organizations, and society associated with artificial intelligence . . . [by] improv[ing] the ability to incorporate trustworthiness considerations into the design, development, use, and evaluation of AI products, services, and systems.” AI Risk Management Framework, Nat’l Inst. of Standards & Tech., [].

  18. [18]. See, e.g., Oren Bracha & Frank Pasquale, Federal Search Commission? Access, Fairness, and Accountability in the Law of Search, 93 Cornell L. Rev. 1149, 1180 (2008) (“It is unclear whether search engines fall under the strict definition of a natural monopoly, but they exhibit very similar characteristics.” (footnote omitted)); Steven Weber & Gabriel Nicholas, Data, Rivalry and Government Power: Machine Learning Is Changing Everything, 14 Glob. Asia 23, 26 (2019) (“[T]here’s justification for concern about natural monopolies ... .”); see also Ad J.W. van de Gevel & Charles N. Noussair, The Nexus Between Artificial Intelligence and Economics 97 (2013) (“For the first in a new class of products, a natural monopoly may briefly exist.”); A. Michael Froomkin, Ian Kerr & Joelle Pineau, When AIs Outperform Doctors: Confronting the Challenges of a Tort-Induced Over-Reliance on Machine Learning, 61 Ariz. L. Rev. 33, 88 (2019) (suggesting that machine-learning-based applications face “high fixed costs and low marginal costs[,] resembl[ing] the economic profile of a so-called natural monopoly”); Benjamin L. Mazer, Nathan Paulson & John H. Sinard, Protecting the Pathology Commons in the Digital Era, Archives Pathology & Lab’y Med. 1037, 1038 (2020) (“[O]paque [machine learning] algorithms can lead to a natural monopoly.”); Ad J.W. van de Gevel & Charles N. Noussair, Artificial Intelligence and Economics: From Homo Sapiens to Robo Sapiens 5 (International Conference on Advances in Computing, Electronics and Communication, 2015), [] (suggesting that “AI systems, including autonomous robots, can be expected to generally have th[e] type of cost structure” associated with natural monopolies); Romesh Vaitilingam, How Leading Economists View Antitrust in the Digital Economy, London Sch. of Econ. & Pol. Sci. (Nov. 18, 2020), eview/2020/11/18/how-leading-economists-view-antitrust-in-the-digital-economy [] (“The tech industry is rife with natural monopolies, which are routinely regulated in other sectors.” (internal quotation marks omitted)). But see Herbert Hovenkamp, Antitrust and Platform Monopoly, 130 Yale L.J. 1952, 1971 (2021) (“Few platforms are natural monopolies.”).

    Many commentators have focused on search markets in particular (without regard to whether those search applications rely on machine learning technology).

  19. [19]. See infra Section III.B.

  20. [20]. See id.

  21. [21]. See, e.g., Viscusi et al., supra note 15, at 401–02; William J. Baumol, On the Proper Cost Tests for Natural Monopoly in a Multiproduct Industry, 67 Am. Econ. Rev. 809, 819 (1977) [hereinafter Baumol, Proper Cost].

  22. [22]. See Posner, Natural Monopoly, supra note 8, at 548 (“If the entire demand within a relevant market can be satisfied at lowest cost by one firm rather than by two or more, the market is a natural monopoly, whatever the actual number of firms in it.”). I explain this in greater detail, infra Section II.A; see also William J. Baumol, Quasi-Permanence of Price Reductions: A Policy for Prevention of Predatory Pricing, 89 Yale L.J. 1, 11–14 (1979) [hereinafter Baumol, Quasi-Permanence] (comparing and contrasting the concept of scale with the concept of subadditivity).

  23. [23]. See, e.g., Microsoft Corp. v. AT&T Corp., 550 U.S. 437, 439 (2007). This is especially so when considering the costs of setting up a distribution architecture as among the fixed costs of software development.

  24. [24]. See infra Section III.B.

  25. [25]. I place the term “train” in quotation marks here in view of the controversy over whether it is accurate or appropriate to say that these algorithmic systems are capable of being trained or of learning. See, e.g., Russell & Norvig, supra note 6, at 1041–48 (summarizing this debate). After this instance, I no longer place the term in quotation marks—but, in doing so, I do not mean to diminish the significance of this philosophical, etymological, and, indeed, existential question.

  26. [26]. See infra Section III.B.2.

  27. [27]. See, e.g., David Lehr & Paul Ohm, Playing with the Data: What Legal Scholars Should Learn About Machine Learning, 51 U.C. Davis L. Rev. 653, 655 (2017) (suggesting “that legal scholars should think of machine learning as consisting of two distinct workflows,” first, the design, development and training of an application, and second, an application “deployed and making decisions in the real world”).

  28. [28]. See, e.g., Ian Goodfellow, Yoshua Bengio & Aaron Courville, Deep Learning 114–17 (2016) (explaining that as “the training set increases” the “optimal capacity” of the machine learning system also increases until it “plateaus” in view of having “reach[ed] sufficient complexity to solve the task”).

  29. [29]. See infra Section III.B.2.

  30. [30]. See, e.g., Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever & Dario Amodei, Cornell Univ., Language Models are Few-Shot Learners 5 (2020), [https://].

  31. [31]. See infra Section III.B. As I explain there, my claim turns on some tricky questions of market definition. And, as I readily concede, some applications may operate in markets that can sustain competition because, say, different applications may serve different segments of the same market. See infra notes 185–86 and accompanying text (noting possibilities for product differentiation). In short, I am careful not to suggest that all machine-learning-based applications necessarily operate in a natural monopoly market. Rather, my claim is, as I say above, narrower: I find it likely that at least some machine-learning-based applications, when serving a single market, are natural monopolists. See also infra notes 35–36 and accompanying text (“I do not mean to contend that all algorithmic systems ... fall into this category. But I do conclude that some machine-learning-based applications [in some scenarios] ... seem to tend toward natural monopoly.” (footnote omitted)).

  32. [32]. Importantly, not all machine learning systems learn continually. See, e.g., Arti K. Rai, Isha Sharma & Christina Silcox, Accountability, Secrecy, and Innovation in AI-Enabled Clinical Decision Software, 7 J.L. & Biosciences, Jan.–June 2020, at 1, 3, 5 (explaining that machine-learning-based applications “could either be locked, or [] could be allowed to learn in real-time from new data to which it was exposed,” but that the “FDA has thus far cleared only locked models” for use in clinical settings).

  33. [33]. See Andrew Ng, Landing AI, AI Transformation Playbook: How to Lead Your Company into the AI Era 4 (2019), AI_Transformation_Playbook_11-19.pdf []; see also Jaehun Lee, Taewon Suh, Daniel Roy & Melissa Baucus, Emerging Technology and Business Model Innovation: The Case of Artificial Intelligence, 5 J. Open Innovation: Tech., Mkt. & Complexity 44, 49–51 (2019) (similarly describing the “virtuous cycle of AI”).

  34. [34]. See, e.g., Frank Pasquale, Rankings, Reductionism, and Responsibility, 54 Clev. St. L. Rev. 115, 129–30 (2006) (“Growing numbers of searches on a given service give that service ever more data to refine and improve its index. The ‘rich get richer,’ making the search and rankings field a very difficult one to enter.” (footnote omitted)).

  35. [35]. See supra note 25 and accompanying text (regarding my use of the term “train”).

  36. [36]. See, e.g., Bruce G. Buchanan, Can Machine Learning Offer Anything to Expert Systems?, 4 Mach. Learning 251, 251 (1989) (explaining that, historically, most “expert systems do not employ a learning component to construct parts of their knowledge bases from libraries of previously solved cases”); cf. Digit. Competition Expert Panel, Unlocking Digital Competition: Report of the Digital Competition Expert Panel 54–55 (2019), https://assets [] (analogizing digital markets generally to natural monopolies, but concluding that the analogy may not hold across the entire sector).

  37. [37]. See, e.g., Cantor v. Detroit Edison Co., 428 U.S. 579, 595–96 (1976); Otter Tail Power Co. v. United States, 410 U.S. 366, 386–89 (1973) (Stewart, J., concurring in part and dissenting in part); see also Priest, supra note 7, at 301 (explaining that municipalities regulated utilities, through franchise contracts that set rates and service quality, as early as “the early decades of the nineteenth century”).

  38. [38]. See Ann Brody Guy, Reinventing Cybersecurity, Berkeley Eng’g (Apr. 14, 2020), https:// [] (noting the “conventional wisdom” that where “a service is free, you are the product”); see also Tim Wu, The Attention Merchants: The Epic Scramble to Get Inside Our Heads 376–81 (2016) (explaining how human behavior is monetized by companies providing free services online); Chris Jay Hoofnagle & Jan Whittington, Free: Accounting for the Costs of the Internet’s Most Popular Price, 61 UCLA L. Rev. 606, 608–12 (2014) (describing how online businesses offer services marketed as “free” but use such programs to obtain consumers’ data in return); Shira Ovide, Google Ends Its Free Digital Photo Storage, N.Y. Times (Nov. 12, 2020), https:// [] (explaining that Google’s service “has never really been free” because “Google uses our photos to train its software systems”).

  39. [39]. See, e.g., Maurice E. Stucke, Should We Be Concerned About Data-opolies?, 2 Geo. L. Tech. Rev. 275, 285–86, 302 (2018) (explaining that a monopolist in a data-intensive industry “has the incentive to reduce its privacy protection below competitive levels and collect personal data above competitive levels” giving rise to privacy harms and deadweight losses); Ryan Calo, Digital Market Manipulation, 82 Geo. Wash. L. Rev. 995, 1018 (2014); Shira Ovide, Just Collect Less Data, Period, N.Y. Times (July 16, 2020), [] (“For the companies, there’s no downside to limitless data collection, and there’s little to prevent them from doing so in the United States.”); see also Jennifer Valentino-DeVries, Coronavirus Apps Show Promise but Prove a Tough Sell, N.Y. Times (Dec. 7, 2020), [] (explaining that privacy concerns have depressed interest in coronavirus-related contact-tracing applications); Cynthia Cole, Brooke Chatterton & Natalie Sanders, The Safety of Privacy: Increased Privacy Concerns May Prevent Effective Adoption of Contact Tracing Apps, Law.Com (Aug. 18, 2020, 7:00 AM), 2020/08/18/the-safety-of-privacy-increased-privacy-concerns-may-prevent-effective-adoption-of-contact-tracing-apps [] (“Privacy concerns, which lead to low adoption rates, are a barrier to the success of a contact tracing app.”).

  40. [40]. See, e.g., infra note 234 and accompanying text (describing HIPAA as an example of such a provision); cf. Cal. Civ. Code § 1798.100 (West 2020) (explaining that personal information may be used for business purposes only to the extent such use is “reasonably necessary and proportionate”).

  41. [41]. See, e.g., Tejas N. Narechania, The Secret Life of a Text Message, 120 Colum. L. Rev. F. 198, 205 (2020) (suggesting that telephone providers “may have private incentives ... to underinvest in safety-related infrastructure”). In some cases, providers might even discriminate intentionally. Cf. Elena Botella, TikTok Admits It Suppressed Videos by Disabled, Queer, and Fat Creators, Slate (Dec. 4, 2019, 5:07 PM), [] (“TikTok, a social network video app with more than 1 billion downloads globally, admitted Tuesday to a set of policies that had suppressed the reach of content created by users assumed to be ‘vulnerable to cyberbullying.’”).

  42. [42]. See Restoring Internet Freedom, 35 FCC Rcd. 12328, para. 48, 12351 n.170 (order on remand) (Oct. 27, 2020) (suggesting that competition among providers will improve safety-related services and applications).

  43. [43]. See infra notes 80–89 and accompanying text.

  44. [44]. See Munn v. Illinois, 94 U.S. 113, 124–25 (1876) (affirming regulators’ power to regulate private providers “for the public good”); K. Sabeel Rahman, The New Utilities: Private Power, Social Infrastructure, and the Revival of the Public Utility Concept, 39 Cardozo L. Rev. 1621, 1634–38 (2017); see also Solon Barocas, Moritz Hardt & Arvind Narayanan, Fairness and Machine Learning: Limitations and Opportunities 13 (2021), fairmlbook.pdf [] (introducing “formal methods for characterizing ... problems [relating to objectionable results of machine learning systems] and assess[ing] various computational methods for addressing them”); cf. Seth Katsuya Endo, Technology Opacity & Procedural Injustice, 59 B.C. L. Rev. 821, 862–63 (2018) (highlighting the importance of participation values (as against accuracy and efficiency concerns) when considering whether to use predictive coding in litigation discovery contexts).

  45. [45]. Posner, Natural Monopoly, supra note 8, at 548 (suggesting that sustained competition in natural monopoly markets “produces inefficient results”).

  46. [46]. See, e.g., Cmty. Commc’ns Co. v. City of Boulder, 660 F.2d 1370, 1373 (10th Cir. 1981) (“[T]he City ... concluded that cable systems are natural monopolies. Consequently, the City became concerned that CCC, because of its headstart, would always be the only cable operator in Boulder if allowed to expand ... . The City decided to place a moratorium on CCC’s expansion in order to provide other companies the opportunity to make bids to service the remaining parts of Boulder ... .”).

  47. [47]. See, e.g., 47 U.S.C. § 251(c)(3) (2018); see also Nuechterlein & Weiser, supra note 3, at 58 (describing the purpose of § 251(c)(3)).

  48. [48]. See Deirdre K. Mulligan & Kenneth A. Bamberger, Procurement as Policy: Administrative Process for Machine Learning, 34 Berkeley Tech. L.J. 773, 785–98 (2019).

  49. [49]. Charles I. Jones & Christopher Tonetti, Nonrivalry and the Economics of Data, 110 Am. Econ. Rev. 2819, 2820 (2020) (explaining that firms have incentives to “hoard data in ways that are socially inefficient”). In response, both Mark A. Lemley & Bryan Casey, Fair Learning, 99 Tex. L. Rev. 743 (2021), and Amanda Levendowski, How Copyright Law Can Fix Artificial Intelligence’s Implicit Bias Problem, 93 Wash. L. Rev. 579 (2018), generally contend that copyright law should be interpreted so as to limit infringement liability for training data, thereby reducing the training costs associated with application development. But, as I elaborate below, copyright is only one limit among several on the availability and use of training data. See infra Section IV.B.3; cf. Ruckelshaus v. Monsanto Co., 467 U.S. 986, 1018–19 (1984) (finding that, even though data used for product development and regulatory approval may be protected as a trade secret and under the Constitution’s Takings Clause, statutory schemes for arbitrating a price for the use of that data by competitors satisfy constitutional scrutiny). Federated learning may offer one way out of this quandary. See TensorFlow Federated: Machine Learning on Decentralized Data, TensorFlow, [] (defining federated learning as “an approach to machine learning where a shared global model is trained across many participating clients that keep their training data locally”).

  50. [50]. On bias and accuracy, see, for example, Batya Friedman & Helen Nissenbaum, Bias in Computer Systems, 14 ACM Transactions on Info. Sys. 330, 332–37 (1996) (describing preexisting, technical, and emergent bias); see also Joy Buolamwini & Timnit Gebru, Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, 81 Proc. Mach. Learning Rsch. 1, 6–11 (2018) (evaluating three commercial gender classification systems and finding that darker-skinned females are misclassified in up to 34.7 percent of test cases, while lighter-skinned males are misclassified in only 0.8 percent of test cases). On privacy and autonomy, see, for example, Julie E. Cohen, What Privacy Is for, 126 Harv. L. Rev. 1904, 1906, 1912–18 (2013); see also M. Ryan Calo, The Boundaries of Privacy Harm, 86 Ind. L.J. 1131, 1151–52 (2011) (describing and classifying the privacy harms that attend to informational systems, including Google’s Gmail (email) service).

  51. [51]. See Barocas et al., supra note 44, at 13 (“[I]ntroduc[ing] formal methods for characterizing these problems [relating to objectionable results of machine learning systems] and assess[ing] various computational methods for addressing them.”); Michael Kearns & Aaron Roth, The Ethical Algorithm: The Science of Socially Aware Algorithm Design 16 (2019) (“[O]ur book ... dives headfirst into the emerging science of designing social constraints directly into algorithms ... .”); see also Russell, supra note 4, at 171–83 (discussing “beneficial machines”); Benjamin Kuipers, Perspectives on Ethics of AI: Computer Science, in The Oxford Handbook of Ethics of AI 421, 433–35 (Markus D. Dubber, Frank Pasquale & Sunit Das eds., 2020) (discussing risks of bias); cf. Lawrence Lessig, Code: Version 2.0 1–9 (2006) (describing how the design of technical systems constrains and guides behavior); Joel R. Reidenberg, Lex Informatica: The Formulation of Information Policy Rules Through Technology, 76 Tex. L. Rev. 553, 556–65 (1998) (similarly explaining how technical systems can implement policy choices and values).

  52. [52]. See Safiya Umoja Noble, Algorithms of Oppression: How Search Engines Reinforce Racism 156–60 (2018); Rahman, supra note 44, at 1640–41; Frank A. Pasquale, Internet Nondiscrimination Principles Revisited 8–10 (Brook. L. Sch. Legal Stud., Working Paper No. 655, 2020); see also Bracha & Pasquale, supra note 18, at 1152, 1201–09 (outlining “some possible directions for effective regulation”).

  53. [53]. E.g., Andrea O’Sullivan & Adam Thierer, Counterpoint: Regulators Should Allow the Greatest Space for AI Innovation, 61 Commc’ns ACM 33, 33 (2018) (“[A]rtificial intelligence technologies should largely be governed by a policy regime of permissionless innovation so that humanity can best extract all of the opportunities and benefits they promise.”); see also Alicia Solow-Niederman, Administering Artificial Intelligence, 93 S. Cal. L. Rev. 633, 653 (2020) (counseling “against ‘command-and-control’” regulation); cf. Julie E. Cohen, Between Truth and Power: The Legal Constructions of Informational Capitalism 90–92 (2019) (summarizing such arguments).

  54. [54]. Cf. Rahman, supra note 44, at 1634–35 (considering regulatory possibilities through the “more expansive” lens of public utility regulation); id. at 1674–78 (considering regulatory possibilities for Uber, Airbnb, and like competitors after concluding that such “businesses may not be conventional natural monopolies” but are nevertheless likely to be entrenched and therefore require a regulatory response); John M. Newman, Antitrust in Zero-Price Markets: Foundations, 164 U. Pa. L. Rev. 149, 190–94, 198–200 (2015).

  55. [55]. See Baumol, Quasi-Permanence, supra note 22, at 11–12 (explaining “natural monopoly always has been associated loosely with the phenomenon of economies of scale”).

  56. [56]. See id. at 11 (explaining that a market is “a natural monopoly if production by a single firm is the cheapest way to produce the combination of outputs supplied by the industry” and that, consequently, “if several firms can produce its output at least as cheaply as one, then” there is no natural monopoly); see also Robert Baldwin, Martin Cave & Martin Lodge, Understanding Regulation: Theory, Strategy, and Practice 16–17, 444–46 (2d ed. 2012) (describing how natural monopolies occur and their effects); Baumol, Proper Cost, supra note 21, at 815–16 (discussing implications for evidence on natural monopolies).

    Some readers might contend that a natural monopoly exists only where single-firm control of an entire market can justify that firm’s investment of the fixed costs required to serve the market. As I explain below, however, that’s not quite right. As the following examples suggest, it may be profitable for a second firm to enter a market (either wholly or in part), either in the hopes of eventually winning the competition for the market, or perhaps to serve some lucrative market subset (e.g., cream-skimming). But such duplicative entry may nevertheless be inefficient (again, from the standpoint of productive efficiency).

  57. [57]. See Posner, Natural Monopoly, supra note 8, at 548; cf. Hovenkamp, supra note 18, at 1971 (“While we sometimes use the term ‘natural monopoly’ to describe a firm or a market, natural-monopoly status actually applies to particular inputs or technologies.”).

  58. [58]. See U.S. Telecom Ass’n v. FCC, 359 F.3d 554, 569 (D.C. Cir. 2004) (explaining that incumbent telecommunications providers typically need only make a “software change” in order to connect a new customer).

  59. [59]. One competitor might, of course, be more efficient than the other—Otis, say, might be able to build a better factory more cheaply than ThyssenKrupp. But such differences are beside my point. It suffices to note that the market can support multiple providers, that those providers will rationally compete with one another to provide the best elevators most cheaply, and that such market competition is more likely, we assume under this framework, to identify the better competitors than, say, a central planner deciding who gets to build the second elevator factory.

    Similarly, in intellectual property markets, the demand for books (or movies, or music) is greater than the demand for any one work of authorship. So while the ratio of fixed costs to average marginal costs for goods traditionally protected by intellectual property is similar to that seen in natural monopoly markets—indeed, that ratio is often cited as a primary justification for intellectual property protection—many such goods are not natural monopolies because of these broader market dynamics.

  60. [60]. Thomas K. McCraw, Prophets of Regulation 306 (1984).

  61. [61]. See Viscusi et al., supra note 15, at 534–35. The Kingsbury Commitment is widely considered to be one of the first instances of federal telecommunications regulation. See Nuechterlein & Weiser, supra note 3, at 5, 264–65.

  62. [62]. Stated similarly, monopolists often find it profitable to charge prices above average marginal cost. As a result, some consumers—even consumers who are willing to pay more than it costs to produce a good—will be unable to pay the market price for the service. See infra note 71 and accompanying text (describing deadweight loss).

  63. [63]. Such quality effects are analogous to the price effects described above. One way to restate this issue, for example, might be to say that the monopolist’s price is too high for such a low-quality good.

  64. [64]. Cf. Jean Tirole, Market Failures and Public Policy, 105 Am. Econ. Rev. 1665, 1669–70 (2015) (describing the role that luck can play in the acquisition of market power, including in natural monopoly and utility contexts).

  65. [65]. See Nuechterlein & Weiser, supra note 3, at 307–24.

  66. [66]. Verizon Commc’ns v. FCC, 535 U.S. 467, 477 (2002) (“At the dawn of modern utility regulation, in order to offset monopoly power and ensure affordable, stable public access to a utility’s goods or services, legislatures enacted rate schedules to fix the prices a utility could charge.”).

  67. [67]. My explanation of the basic economics here is, by necessity, a simplification, ignoring, e.g., the externalities of railway pollution. I put such complications to one side for now in order to explain one view of the basic premises underlying natural monopoly regulation. Regulators have sometimes directly addressed externality-related concerns in natural monopoly markets (among others), and I discuss such possibilities in the machine learning context infra notes 235, 238–39, 246–47 and accompanying text.

  68. [68]. Cf. David Singh Grewal & Jedediah Britton-Purdy, Liberalism, Property, and the Means of Production, LPE Project (Jan. 25, 2021), [] (describing the view that, in markets characterized by market power, “there are many possible [price] equilibria, none of them determinate”).

  69. [69]. See, e.g., Tim Wu, The Curse of Bigness: Antitrust in the New Gilded Age 33–44 (2018); see also Grewal & Britton-Purdy, supra note 68 (describing “the problem that scale economies presented to a conception of freedom and equality”). This may be especially true in classical natural monopoly markets. In competitive markets, “[t]he opportunity to charge monopoly prices”—i.e., the desire to dominate the market—supplies a powerful incentive for firms to deploy the “business acumen” that gives rise to “innovation and economic growth” that may help offset these distributive concerns (assuming these concerns are commensurable at all). Compare Verizon Commc’ns Inc. v. L. Offs. of Curtis V. Trinko LLP, 540 U.S. 398, 407 (2004) (distinguishing monopolies that result from such “acumen” from those that result from anticompetitive practices), with David Singh Grewal & Jedediah Purdy, Inequality Rediscovered, 18 Theoretical Inquiries L. 61, 78 (2017) (contesting the view that “attempt[s] to use state power to shape economic outcomes directly is an error,” noting that such “arguments ... ten[d] to conceal the distributive choices inherent in market-making policy”). But because natural monopoly markets will typically tend to a single winner, natural monopolists have a somewhat diminished incentive to bring their acumen to bear on the market, and may thereby have less to offer by way of offsetting innovation and growth. See Posner, Natural Monopoly, supra note 8, at 548 (suggesting that, in natural monopoly markets, “firms will quickly shake down to one through mergers or failures”).

  70. [70]. See, e.g., Baldwin et al., supra note 56, at 447–48.

  71. [71]. This is because of the loss of the difference between cost and value (here, $1 and $3, respectively), a difference known more formally as surplus. In this example, our consumer would realize $2 in surplus by paying $1 for a service that is worth $3 to her. The loss of that surplus is known as deadweight loss. See, e.g., Viscusi et al., supra note 15, at 82 (defining deadweight loss).

  72. [72]. See, e.g., id. at 421–22 (explaining that regulators aim “to ensure that the monopolist’s revenues equal its costs,” though regulators sometimes prioritize social goals when establishing a rate structure); see also 47 U.S.C. §§ 201–02 (describing the Federal Communications Commission’s authority to set reasonable rates); Mary A. Wallace, Interstate Commerce Commission, 56 Geo. Wash. L. Rev. 937, 938 & n.7 (1988) (explaining that the Interstate Commerce Commission was designed to address “abuses of monopoly power” including “rate abuse”).

  73. [73]. See 47 U.S.C. § 543(a)(2) (2018) (revoking federal and state rate setting authority wherever “a cable system is subject to effective competition”).

  74. [74]. See infra text accompanying note 76.

  75. [75]. See Christopher Decker, Modern Economic Regulation: An Introduction to Theory and Practice 104 (2015) (describing rate-of-return regulation); see also Nuechterlein & Weiser, supra note 3, at 33–34 (describing rate-of-return regulation).

  76. [76]. See Nuechterlein & Weiser, supra note 3, at 33–34 (describing problems attending to rate-of-return regulation).

  77. [77]. See Decker, supra note 75, at 114–29 (describing price-cap regulation).

  78. [78]. See Implementation of Sections of the Cable Television Consumer Protection and Competition Act of 1992: Rate Regulation and Adoption of a Uniform Accounting System for Provision of Regulated Cable Service, 9 FCC Rcd. 4527, 4531 (1994) (report and order) (explaining that “[t]he Commission adopted a benchmark and price cap approach to serve as the primary regulatory mechanism for setting initial regulated rates and for governing rates” for cable companies).

  79. [79]. See, e.g., Nuechterlein & Weiser, supra note 3, at 34–35 (describing incentives to skimp under price-cap regulation).

  80. [80]. Cf. Connect America Fund, 26 FCC Rcd. 17663, para. 85 (2011) (report and order) (requiring that subsidized monopoly providers of broadband service meet standards of “reasonable comparability” with competitively offered service).

  81. [81]. See, e.g., Wilcox, supra note 10, at 304, 309 (showing that, even in the earliest days of telephone regulation, municipalities imposed conditions to protect public safety operations).

  82. [82]. See, e.g., 47 U.S.C. § 541(a)(3) (2018) (prohibiting income-based redlining); Hinton Tel. Co., 30 FCC Rcd. 2308 (2015) (forfeiture order) (fining telephone carrier $100,000 for failing to properly route 911 calls); Connect America Fund, 26 FCC Rcd. 17663, para. 15 (2011) (report and order) (explaining that carrier-of-last-resort requirements help ensure that subsidized monopoly providers “exten[d] voice and broadband service, both fixed and mobile, where it is lacking, to better meet the needs of their consumers.”).

  83. [83]. See Rahman, supra note 44, at 1637 (describing the moral and social dimensions of progressive-era utility regulation).

  84. [84]. See, e.g., Major Rate Case Process Overview, N.Y. State Dep’t of Pub. Serv. (Sept. 23, 2011, 3:55 PM), OpenDocument [] (“Rate cases are a primary instrument of government regulation of these industries. Interested persons may intervene and become parties in a utility company’s rate case. Typical intervenors include: industrial, commercial and other large-scale users of electricity; public interest groups; representatives of residential, low-income and elderly customers; local municipal officials; and, dedicated advocacy groups ... . Rate cases proceed in an entirely public and open process.”); see also Breyer, supra note 16, at 341, 351–53 (discussing the importance (and shortcomings) of participation and representation to the legitimacy of regulatory action).

  85. [85]. The Phone Network Transition: Lessons from Fire Island, Pub. Knowledge (Mar. 7, 2014), [].

  86. [86]. See Letter from State of N.Y. Dep’t of Pub. Serv. to Marlene H. Dortch, Fed. Commc’ns Comm’n 2 (July 29, 2013), [].

  87. [87]. Id.

  88. [88]. Tejas N. Narechania & Erik Stallman, Internet Federalism, 34 Harv. J.L. & Tech. 547, 580 (2021).

  89. [89]. Patrick McGeehan, Verizon Backing Off Plans for Wireless Home Phones, N.Y. Times (Sept. 12, 2013), [].

  90. [90]. See Wilcox, supra note 10, at 28–33.

  91. [91]. Paul L. Joskow, Regulation of Natural Monopolies, in 2 Handbook of Law and Economics 1227, 1244–47 (A. Mitchell Polinsky & Steven Shavell eds., 2007); Robin A. Prager, Franchise Bidding for Natural Monopoly: The Case of Cable Television in Massachusetts, 1 J. Regul. Econ. 115, 126–27 (1989). But see Oliver E. Williamson, Franchise Bidding for Natural Monopolies—in General and with Respect to CATV, 7 Bell J. Econ. 73, 76–78 (1976) (describing the difficulties that attend to evaluating ex ante what constitutes the best value).

    Notably, a state’s decision to award an exclusive franchise to a natural monopolist (or a locality’s similar decision, made under duly delegated authority) has traditionally posed no antitrust law concerns. See Cmty. Commc’ns Co. v. City of Boulder, 455 U.S. 40, 54–56 (1982) (citing City of Lafayette v. La. Power & Light Co., 435 U.S. 389, 413 (1978)); see also Town of Hallie v. City of Eau Claire, 471 U.S. 34, 43–47 (1985) (finding that a delegation of sewage services by the state and cities was not an antitrust concern).

  92. [92]. See Harold Demsetz, Why Regulate Utilities?, 11 J.L. & Econ. 55, 63–65 (1968); see also Viscusi et al., supra note 15, at 469 (suggesting that “franchise bidding [could] result[] in average cost pricing and the most efficient firm operating” with the added benefit “that [bidding] imposes no informational requirements on a government agency”).

  93. [93]. See supra notes 58–59 and accompanying text (explaining that duplicative fixed investments lead to higher consumer and input prices).

  94. [94]. Nuechterlein & Weiser, supra note 3, at 40–44.

  95. [95]. 47 U.S.C. §§ 251–252 (2018); see also Nuechterlein & Weiser, supra note 3, at 52 (briefly summarizing Congress’s goals for these new provisions).

  96. [96]. See Nuechterlein & Weiser, supra note 3, at 58–59; see also supra notes 58–59 and accompanying text (explaining that natural monopolies do not exist—that competition is sustainable—where costs do not fall over the relevant market, but rather new fixed investments are required as demand grows).

  97. [97]. Otter Tail Power Co. v. United States, 410 U.S. 366, 378 (1973) (“When a community serviced by Otter Tail decides not to renew Otter Tail’s retail franchise when it expires, it may generate, transmit, and distribute its own electric power. We recently described the difficulties and problems of those isolated electric power systems. Interconnection with other utilities is frequently the only solution ... . There were no engineering factors that prevented Otter Tail from ... wheeling the power ... . Otter Tail’s refusals to ... wheel were solely to prevent municipal power systems from eroding its monopolistic position.” (citations omitted)).

  98. [98]. See Priest, supra note 7, at 304–08 (explaining that Demsetz’s view of franchising presupposed the existence of the very regulatory apparatus he sought to undermine).

  99. [99]. See Williamson, supra note 91, at 76–79; see also Robin A. Prager, Firm Behavior in Franchise Monopoly Markets, 21 RAND J. Econ. 211, 223–24 (1990) (finding evidence of opportunistic behavior that varies in severity across sectors).

  100. [100]. See Priest, supra note 7, at 293; cf. Nuechterlein & Weiser, supra note 3, at 11–12 (discussing public choice theory).

  101. [101]. See Victor P. Goldberg, Regulation and Administered Contracts, 7 Bell J. Econ. 426, 444–46 (1976); see, e.g., Denver Area Educ. Telecomm. Consortium, Inc. v. FCC, 518 U.S. 727, 772 (1996) (Stevens, J., concurring) (“[P]ublic, educational, and governmental access channels ... owe their existence to contracts forged between cable operators and local cable franchising authorities.”).

  102. [102]. Stephen Labaton, Phone Start-Ups Win the Latest Round in Court, N.Y. Times (May 14, 2002), [].

  103. [103]. See Breyer, supra note 16, at 357–60.

  104. [104]. See Wallace, supra note 72, at 938 (railroads); Nuechterlein & Weiser, supra note 3, at 137–41 (telephones); see also Breyer, supra note 16, at 315–21 (airlines); Kahn, supra note 16, at xv–xvii (electricity and gas); Biber et al., supra note 2, at 1591–92 (power utilities); Levine, supra note 13, at 191–93 (airlines).

    As I noted above, I readily acknowledge this is only one among several plausible explanations for the deregulatory movement of the late twentieth century. It does not undermine my point here that some other theory also helps explain changes in the regulatory regime. See supra note 12 and accompanying text.

  105. [105]. In addition to the sources described infra notes 106–07, see the sources cited supra note 18 and accompanying text.

  106. [106]. Bracha & Pasquale, supra note 18, at 1180–81 (“It is unclear whether search engines fall under the strict definition of a natural monopoly, but they exhibit very similar characteristics.” (footnote omitted)).

  107. [107]. Weber & Nicholas, supra note 18, at 25.

  108. [108]. See Jack Clark, Google Turning Its Lucrative Web Search Over to AI Machines, Bloomberg (Oct. 26, 2015), []; see also Lawrence Page, Sergey Brin, Rajeev Motwani & Terry Winograd, The PageRank Citation Ranking: Bringing Order to the Web 4–10 (1998), [] (describing Google’s original ranking methodology, which did not employ machine learning, but rather weighted link analysis).

  109. [109]. See Weber & Nicholas, supra note 18, at 25–26 (describing the “positive feedback loop that creates a tendency toward natural monopolies in data platform businesses”); Bracha & Pasquale, supra note 18, at 1178–80 (citing, among others, Pasquale, supra note 34, at 130, which describes a similar feedback effect in searches).

  110. [110]. See infra notes 176–77 and accompanying text (describing the difference); see also, e.g., Mark A. Lemley & David McGowan, Legal Implications of Network Economic Effects, 86 Calif. L. Rev. 479, 595–96 (1998) (drawing distinction between supply-side natural monopoly effects and demand-side network effects); David McGowan, Networks and Intention in Antitrust and Intellectual Property, 24 J. Corp. L. 485, 488 (1999) (explaining how network theory does not necessarily imply a monopoly); David L. Aldridge Co. v. Microsoft Corp., 995 F. Supp. 728, 754 (S.D. Tex. 1998) (describing how natural monopolies can occur in industries that have a large initial fixed cost but also have declining marginal production costs); Carl Shapiro & Hal R. Varian, Information Rules: A Strategic Guide to the Network Economy 13–14 (1998) (discussing the relationship between positive feedback and network effects).

  111. [111]. See, e.g., Baumol, Quasi-Permanence, supra note 22, at 12–18.

  112. [112]. See infra notes 152–54 and accompanying text.

  113. [113]. Bowser is a primary adversary in the video game franchise Super Mario Bros. See, e.g., Super Mario Bros. 3 (Nintendo 1990).

  114. [114]. Kearns & Roth, supra note 51, at 6.

  115. [115]. Id.

  116. [116]. See id.

  117. [117]. Id.

  118. [118]. Cf. Tom Simonite, Meet the Secret Algorithm That’s Keeping Students Out of College, Wired (July 10, 2020, 7:00 AM), [] (describing the International Baccalaureate program’s decision to cancel testing during the COVID-19 crisis and replace actual test scores with algorithmically-predicted results, to much controversy).

  119. [119]. See, e.g., supra note 5 and accompanying text; Emmanuel Gbenga Dada, Joseph Stephen Bassi, Haruna Chiroma, Shafi’i Muhammad Abdulhamid, Adebayo Olusola Adetunmbi & Opeyemi Emmanuel Ajibuwa, Machine Learning for Email Spam Filtering: Review, Approaches and Open Research Problems, Heliyon, June 2019, at 1, 2 (“To effectively handle the threat posed by email spams, leading email providers such as Gmail, Yahoo mail and Outlook have employed the combination of different machine learning (ML) techniques such as Neural Networks in its spam filters.”).

  120. [120]. See, e.g., W. Nicholson Price II, Regulating Black-Box Medicine, 116 Mich. L. Rev. 421, 423 (2017); see also Rai et al., supra note 32, at 2 (examining “AI-enabled clinical decision software”).

  121. [121]. Price II, supra note 120, at 430.

  122. [122]. Id. at 427 (quoting Chris Rauber, Lumiata Nabs $6 Million for Personalized Medical Care Software, S.F. Bus. Times (Sept. 11, 2014, 8:38 AM), blog/2014/09/lumiata-6-million-funding-personalized-health-data.html []).

  123. [123]. Rob Copeland, Google’s ‘Project Nightingale’ Gathers Personal Health Data on Millions of Americans, Wall St. J. (Nov. 11, 2019, 4:27 PM), [https://].

  124. [124]. Rai et al., supra note 32, at 5.

  125. [125]. Price II, supra note 120, at 427–28; Rai et al., supra note 32, at 8.

  126. [126]. Apple’s Face ID technology, for example:

    [I]mprove[s] match performance and keep[s] pace with the natural changes of a face and look, [by] augment[ing] its stored mathematical representation over time. Upon a successful match, Face ID may use the newly calculated mathematical representation—if its quality is sufficient—for a finite number of additional matches before that data is discarded. Conversely, if Face ID fails to recognize a face but the match quality is higher than a certain threshold and a user immediately follows the failure by entering their passcode, Face ID takes another capture and augments its enrolled Face ID data with the newly calculated mathematical representation. This new Face ID data is discarded if the user stops matching against it or after a finite number of matches. These augmentation processes allow Face ID to keep up with dramatic changes in a user’s facial hair or makeup use while minimizing false acceptance.

    Apple, Apple Platform Security 24 (2021), platform-security-guide.pdf [].

  127. [127]. See, e.g., Clark, supra note 108.

  128. [128]. For example, the verb “execute” means one thing in computer science contexts, and quite another thing in legal contexts. So a search application may begin by thinking that “execute” means “kill” or “terminate”—but learn, over time, that when associated with other computer science terms (e.g., program, computer), it should use different synonyms, like “run” or “start.” See, e.g., Kyle Wiggers, Search Engines Are Leveraging AI to Improve Their Language Understanding, VentureBeat (May 20, 2020, 8:40 AM), search-engines-google-bing-are-leveraging-ai-to-improve-their-language-understanding [https://].

  129. [129]. See, e.g., Machine Learning, Spotify, [] (“Machine learning touches every aspect of Spotify’s business. It is used to help listeners discover content via recommendations and search ... .”).

  130. [130]. Russell, supra note 4, at 62–101 (cataloguing near-term and long-term changes based on artificial intelligence and suggesting that, cumulatively, “[t]hese developments ... could change the dynamic of history”); Russell & Norvig, supra note 6, at 28 (explaining that “many thousands of AI applications are deeply embedded in the infrastructure of every industry” (quoting Kurzweil, supra note 6, at 204)).

  131. [131]. See, e.g., Microsoft Corp. v. AT&T Corp., 550 U.S. 437, 439 (2007) (“[C]opying software ... is indeed easy and inexpensive ... .”).

  132. [132]. See, e.g., Bryan H. Choi, Software as a Profession, 33 Harv. J.L. & Tech. 557, 572 (2020) (describing “the exponential growth of software complexity”).

  133. [133]. See David McGowan, Regulating Competition in the Information Age: Computer Software as an Essential Facility Under the Sherman Act, 18 Hastings Commc’ns & Ent. L.J. 771, 813, 841 (1996); see also supra Part II (explaining the importance of scale to the definition of natural monopoly). But see Mark A. Lemley, Antitrust and the Internet Standardization Problem, 28 Conn. L. Rev. 1041, 1056 (1996) (“The Internet software industry is not a natural monopoly”). I explore this apparent disconnect further infra notes 176–82 and accompanying text.

  134. [134]. E.g., Stacy Stanford, Artificial Intelligence (AI): Salaries Heading Skyward, Medium: Towards AI (May 20, 2020), []; Louis Columbus, Machine Learning Engineer Is The Best Job In The U.S. According To Indeed, Forbes (Mar. 17, 2019, 10:35 AM), [].

    Though I describe machine learning as “new” and “novel” above, it is worth noting that the premises that underlie modern machine learning technology have been long in development. See Russell & Norvig, supra note 6, at 16–28 (describing the history of artificial intelligence, beginning in 1943). Despite this long history, only recent advances in computer processing and the availability of large data sets have made possible the practical and increasingly ubiquitous applications described here. See id. at 27–28; see also Amy Kapczynski, The Law of Informational Capitalism, 129 Yale L.J. 1460, 1468 (2020) (describing the recent developments that have enabled machine-learning-based advances).

  135. [135]. See, e.g., Olivia Solon, Facial Recognition’s ‘Dirty Little Secret’: Millions of Online Photos Scraped Without Consent, NBC News (Mar. 17, 2019, 10:25 AM), net/facial-recognition-s-dirty-little-secret-millions-online-photos-scraped-n981921 []; Kashmir Hill, The Secretive Company That Might End Privacy as We Know It, N.Y. Times (Nov. 2, 2021), [].

    These examples underscore the importance of other features of machine learning development, in addition to data. While obtaining training data is important, obtaining good training data can be more difficult and costly. And even if one regards publicly available digital images as good training data, the computing infrastructure required to train a deep learning facial recognition system is extensive and costly. Cf. infra notes 137 & 157 (discussing, respectively, the infrastructural and computational costs associated with training).

  136. [136]. Levendowski, supra note 49, at 606–10 (explaining that acquiring data “to use as training data for AI systems can be exhaustingly resource intensive,” no matter whether it is built (by, say, a social media platform) or purchased (from specialized data keepers, such as hospitals as to patient case histories)); see C. Scott Hemphill, Disruptive Incumbents: Platform Competition in an Age of Machine Learning, 119 Colum. L. Rev. 1973, 1978–79 (2019).

    It matters, of course, that data is different from other inputs in that data is not strictly rivalrous. When one company uses certain information to train its machine-learning-based applications, it does not diminish that data’s availability for future competitors. But, as other scholars have noted, intellectual property rights over these data can mimic the effects of rivalry, limiting their use to particular rightsholders. See, e.g., Lemley & Casey, supra note 49, at 756–70; Levendowski, supra note 49, at 590–610. And intellectual property rights are not the only concerns at issue. I explore this further infra Section IV.B.3.

  137. [137]. See Edd Gent, Powerful AI Can Now Be Trained on a Single Computer, IEEE Spectrum (July 17, 2020), werful-ai-can-now-be-trained-on-a-single-computer [] (explaining that some training approaches “have relied on servers packed with hundreds of CPUs and GPUs” while others use “[s]pecialized hardware ... with a price tag running into the millions”); see also Hemphill, supra note 136, at 1978 (discussing how Google and other companies have invested in custom hardware to support machine learning) (citing Norman P. Jouppi, Cliff Young, Nishant Patil & David Patterson, Motivation for and Evaluation of the First Tensor Processing Unit, IEEE Micro, May/June 2018, at 10, 14 tbl.2, 16 tbl.4).

  138. [138]. See Aleksei Petrenko, Zhehui Huang, Tushar Kumar, Gaurav Sukhatme & Vladlen Koltun, Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning, Proceedings of the 37th International Conference on Machine Learning (2020); see also Gent, supra note 137 (explaining that the Petrenko et al. paper, supra, “describe[s] how [the authors] were able to use a single high-end workstation to train AI ... using a fraction of the normal computing power”).

  139. [139]. Brown et al., supra note 30, at 39 (describing the computational costs of training GPT-3); see also Kyle Wiggers, OpenAI’s Massive GPT-3 Model Is Impressive, But Size Isn’t Everything, VentureBeat (June 1, 2020, 1:05 PM), [] (estimating that GPT-3 required “over 350GB of memory and $12 million in compute credits,” which was feasible because it was developed by “a well-capitalized company that teamed up with Microsoft to develop an AI supercomputer,” but is “potentially beyond the reach of AI startups ... which in some cases lack the capital required”).

  140. [140]. See Hemphill, supra note 136, at 1977 (“Advances in machine learning ... ha[ve] a high fixed cost and low marginal cost ... .”); cf. Ducci, supra note 18, at 57 (“It is therefore plausible that cost subadditivity in delivering a general search service applies over the relevant demand output region and average cost may be decreasing over the same range, which makes the presence of a single dominant search engine the most cost-efficient way to serve the market. The cost structure is indeed similar to traditional natural monopolies.”).

  141. [141]. Cf. Transcript of Oral Argument at 5, Alice Corp. Pty. Ltd. v. CLS Bank Int’l, 573 U.S. 208 (2014) (No. 13-298), 2014 WL 1279672 (statement of Kennedy, J.) (suggesting that it is trivial for an inexperienced software developer to write a program covered by the challenged (and ultimately invalid) patent).

  142. [142]. See Lemley, supra note 133, at 1059.

  143. [143]. See id. at 1058–59; cf. Demsetz, supra note 92, at 57 n.7 (“‘[C]ompetition for the [market]’ [can mimic effects of] ‘competition within the [market] ... .’” (emphasis removed) (quoting Edwin Chadwick, Results of Different Principles of Legislation and Administration in Europe; of Competition for the Field, as compared with the Competition within the Field of Service, 22 J. Royal Stat. Soc’y. 381 (1859))).

  144. [144]. See supra note 76 (describing some information asymmetries between providers and regulators that can give rise to regulatory error).

  145. [145]. Hemphill, supra note 136, at 1975–81 (discussing “[m]achine [l]earning as a [b]arrier to [e]ntry”).

  146. [146]. Hemphill, supra note 136, at 1977 (explaining that improving and deploying machine-learning-based software systems “has a high fixed cost and low marginal cost, a combination that tends to favor large firms that can spread the fixed cost over a large number of units”).

  147. [147]. See supra text accompanying notes 58–59.

  148. [148]. See infra notes 277–84 and accompanying text.

  149. [149]. Cf. Colin Barker, How the GPU Became the Heart of AI and Machine Learning, ZDNet (Aug. 13, 2018), [] (describing how innovations in graphics processing have had spillover effects on machine learning).

  150. [150]. Cf. Viscusi et al., supra note 15, at 87; Hovenkamp, supra note 18, at 1996 (“Even if costs decline continuously as output increases or network effects are large, a digital platform is still not necessarily a natural monopoly. Another pervasive reason for [interplatform] competition is product differentiation.”).

  151. [151]. See, e.g., Russell & Norvig, supra note 6, at 1053–55 (describing and explaining “O() notation”).

  152. [152]. See Kearns & Roth, supra note 51, at 5 (listing sorting algorithms).

  153. [153]. Id. at 4–5 (noting “choices and trade-offs” in terms of memory and time).

  154. [154]. One might be tempted to say, colloquially, that the complexity of this sorting algorithm grows “exponentially,” rather than “polynomially.” But it is more precise in this context to say that training runs in polynomial time (O(n)), since exponential runtime algorithms, e.g., O(2^n), represent an even more costly class.
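    To make the distinction concrete, consider a minimal sketch (illustrative only; the operation counts are hypothetical stand-ins for real algorithms) of how polynomial and exponential operation counts diverge as input size grows:

```python
# Illustrative Big O comparison: a polynomial-time routine, O(n^2), versus an
# exponential-time routine, O(2^n). Both are cheap for tiny inputs, but the
# exponential class becomes vastly more costly as n grows.

def quadratic_ops(n: int) -> int:
    """Hypothetical operation count for an O(n^2) algorithm (e.g., a naive sort)."""
    return n * n

def exponential_ops(n: int) -> int:
    """Hypothetical operation count for an O(2^n) algorithm (e.g., exhaustive subset search)."""
    return 2 ** n

for n in (5, 10, 20, 40):
    print(n, quadratic_ops(n), exponential_ops(n))
```

At n = 40, the quadratic routine needs 1,600 operations while the exponential one needs over a trillion, which is why exponential-runtime algorithms represent a more costly class than polynomial ones.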

  155. [155]. See generally Emma Strubell, Ananya Ganesh & Andrew McCallum, Energy and Policy Considerations for Deep Learning in NLP (June 5, 2019) (unpublished manuscript), https:// [] (describing costs in terms of power and shared computing resource prices); Kate Saenko, Feed Me, Seymour!—Why AI Is so Power-Hungry, Ars Technica (Dec. 29, 2020, 6:38 AM), why-ai-is-so-power-hungry/?comments=1 [] (citing the previous source and explaining that the power consumption demands of training and optimizing one machine-learning-based language model are equivalent to the cost of flying “315 passengers, or an entire 747 jet” on a “round trip between New York and San Francisco”).

  156. [156]. Such algorithms operating in O(log n) time mimic, roughly, the effects of having large initial costs (high complexity costs for the first items in the list) and lower subsequent costs (comparatively lower complexity costs for the next items in the list).

    These analogies are imperfect. Such computational algorithms (like the sorting example) typically operate at a single moment in time, with a complete set of inputs (say, the whole unsorted list). But a real market does not, of course, work in the same way. Nevertheless, as I describe in the following paragraphs, such distinctions are less salient (though still important) in the context of machine learning because of such applications’ multistage nature.
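    The cost analogy above can be sketched numerically (a purely illustrative toy; the logarithmic cost function is a hypothetical, not drawn from any cited source): when total cost grows in O(log n), the marginal cost of each additional item declines.

```python
import math

# Hypothetical total cost growing logarithmically in the number of items
# served; the incremental (marginal) cost of each additional item declines,
# roughly mimicking a natural monopoly's cost structure.

def total_cost(n: int) -> float:
    return math.log2(n + 1)  # +1 so the first item carries a positive cost

def marginal_cost(n: int) -> float:
    """Incremental cost of serving the n-th item (n >= 1)."""
    return total_cost(n) - total_cost(n - 1)

for n in (1, 10, 100, 1000):
    print(n, round(marginal_cost(n), 4))
```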

  157. [157]. See, e.g., Lehr & Ohm, supra note 27, at 655 (“contend[ing] that legal scholars should think of machine learning as consisting of two distinct workflows”: first, the design, development, and training of an application, and second, an application “deployed and making decisions in the real world”); see also Antonio Rafael Sabino Parmezan, Vinicius M.A. Souza & Gustavo E.A.P.A. Batista, Evaluation of Statistical and Machine Learning Models for Time Series Prediction: Identifying the State-of-the-Art and the Best Conditions for the Use of Each Model, 484 Info. Scis. 302, 307–08 fig.4 (2019).

  158. [158]. See Mehryar Mohri, Afshin Rostamizadeh & Ameet Talwalkar, Foundations of Machine Learning 230 (2d ed., 2018); cf. Frank Hutter, Lin Xu, Holger H. Hoos & Kevin Leyton-Brown, Algorithm Runtime Prediction: Methods & Evaluation, 206 A.I. 79, 89 (2014).

    I acknowledge that some researchers in the computer sciences disagree over the value of measuring algorithmic complexity in Big O terms for machine learning, preferring instead to rely on, say, real-time measurements or other constructs. See, e.g., Daniel Justus, John Brennan, Stephen Bonner & Andrew Stephen McGough, Predicting the Computational Cost of Deep Learning Models 1 (Nov. 28, 2018) (unpublished manuscript), []. My point, however, does not turn on any one form of measurement. No matter whether one chooses Big O, real-time measurements on like hardware, or some other method, the basic point remains—training is significantly more costly than prediction or classification. See, e.g., Florian Scheidegger, Roxana Istrate, Giovanni Mariani, Luca Benini, Costas Bekas & Cristiano Malossi, Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy, 37 Visual Comput. 1593, 1597 fig.3 (2021); Abdiansah Abdiansah & Retantyo Wardoyo, Time Complexity Analysis of Support Vector Machines (SVM) in LibSVM, 128 Int’l J. Comput. Applications 28, 33 (2015).

  159. [159]. See generally Lehr & Ohm, supra note 27 (describing the machine learning process in order to elucidate the harms as well as the benefits of machine learning and to propose policy solutions); Parmezan et al., supra note 157 (describing different models); Mohri et al., supra note 158 (describing many computational methods in detail).

  160. [160]. See, e.g., Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang & Yanqi Zhou, Baidu Rsch., Deep Learning Scaling Is Predictable, Empirically 13 (2017), 1712.00409.pdf [] (demonstrating, across application types, that “model accuracy improvements from growing data set size and scaling computation are empirically predictable”).

  161. [161]. See Thomas Bayes & Richard Price, An Essay Towards Solving a Problem in the Doctrine of Chances, 53 Phil. Transactions (1683-1775) 370, 370–418 (1763); see also, e.g., David Haussler, Michael Kearns & Robert E. Schapire, Bounds on the Sample Complexity of Bayesian Learning Using Information Theory and the VC Dimension, 14 Mach. Learning 83, 106 (1994) (“Perhaps the most important general conclusion to be drawn from the work presented here is that the various theories of learning curves based on diverse ideas from information theory [and] statistical physics ... are all in fact closely related, and can be naturally and beneficially placed in a common Bayesian framework.”); Daniel Berrar, Bayes’ Theorem and Naive Bayes Classifier, 1 Encyc. Bioinformatics & Computational Biology 403, 403 (2018) (“Bayes’ theorem is of fundamental importance for inferential statistics and many advanced machine learning models.”).

  162. [162]. E.g., Trevor Hastie, Robert Tibshirani & Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction 21 (2d ed. 2009).

  163. [163]. See Goodfellow et al., supra note 28, at 113.

  164. [164]. See, e.g., Hastie et al., supra note 162, at 243 (2009); Goodfellow et al., supra note 28, at 114; see also Samuel L. Smith & Quoc V. Le, A Bayesian Perspective on Generalization and Stochastic Gradient Descent 5–8 (2018), []; Tarun Yellamraju, Jonas Hepp & Mireille Boutin, Benchmarks for Image Classification and Other High-Dimensional Pattern Recognition Problems 2 (2018), []; cf. Ivor W. Tsang, James T. Kwok & Pak-Ming Cheung, Core Vector Machines: Fast SVM Training on Very Large Data Sets, 6 J. Mach. Learning Rsch. 363, 380 (2005); Gary M. Weiss & Ye Tian, Maximizing Classifier Utility When There Are Data Acquisition and Modeling Costs, Data Mining & Knowledge Discovery (2007); Christopher Meek, Bo Thiesson & David Heckerman, The Learning-Curve Sampling Method Applied to Model-Based Clustering, 2 J. Mach. Learning Rsch. 397, 407 (2002).

    It does not matter, for my purposes, precisely how computationally complex this training process is. It suffices to note that training can be very costly in computational terms, and, as I explained, that these training costs typically far outpace prediction costs. It is true that the nature of these costs can vary substantially, from exponential to polynomial (as measured by the volume of the training data), depending on the sort of learning algorithm used, and so the conditions I describe here may not hold where the ratio between training costs and prediction costs is comparatively low. For more on the computational complexity of machine learning algorithms, see generally Michael Justin Kearns, The Computational Complexity of Machine Learning (1990) (exploring the computational complexity of machine learning). For a short summary of the complexity of various learning algorithms, see Mohri et al., supra note 158, at 280, as well as RUser4512, Computational Complexity of Machine Learning Algorithms, Kernel Trip (Apr. 16, 2018), exity-learning-algorithms []; see also Hutter et al., supra note 158, at 89.

  165. [165]. See, e.g., Weiss & Tian, supra note 164, at 4; Meek et al., supra note 164, at 397. Indeed, developers must make these decisions about tradeoffs along a number of dimensions, including accuracy, bias, and robustness. See, e.g., Irene Y. Chen, Fredrik D. Johansson & David Sontag, Why Is My Classifier Discriminatory? 2 (2018), [https://].

  166. [166]. More precisely, for Bayesian classifiers, the cost of the prediction function is constant in the size of the training data, scaling only with the number of parameters to which the application is sensitive. In our candy example, for instance, a prediction algorithm might try to discern a candy’s identity based on color alone, or color and height, or color, height, and weight. Each added parameter increases the cost of the prediction function (as well as the training function). See Parmezan et al., supra note 157, at 332 (“[C]omputational complexity is empirically related to the number of parameters of a model.”). But, importantly for my present purposes, such changes in cost do not seem to regularly vary relative to the high fixed costs of training. No matter whether it takes $1 to connect a new user to a telephone network or $10, it is likely that the average marginal cost of connecting users to the network will decrease over the whole market, because of the wide disparity between fixed and marginal costs. Indeed (to draw out the example further), a $10 connection cost (as compared to $1) is indicative of even higher fixed costs. This is because a complex prediction—one that turns on a number of parameters—also requires complex training along that same number of parameters. See Mohri et al., supra note 158, at 280; see also id. at 230.
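    To illustrate, consider a toy Gaussian naive Bayes classifier for the candy example (a hedged sketch; the features, values, and labels are invented for illustration, and real systems are far more elaborate): training scans every observation for every feature, while each prediction touches only per-feature summary statistics, one term per parameter.

```python
import math
from collections import defaultdict

def train(rows, labels):
    """Fit per-class feature means/variances: cost scales with observations x features."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, group in grouped.items():
        n = len(group)
        means = [sum(col) / n for col in zip(*group)]
        variances = [max(sum((x - m) ** 2 for x in col) / n, 1e-9)
                     for col, m in zip(zip(*group), means)]
        model[label] = (n / len(rows), means, variances)
    return model

def predict(model, row):
    """Score each class: cost scales with the number of parameters (features) alone."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):  # one log-likelihood term per parameter
            score += -((x - m) ** 2) / (2 * v) - 0.5 * math.log(2 * math.pi * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical candy data: (color code, height, weight) -> candy type.
rows = [(1, 2.0, 5.0), (1, 2.1, 5.2), (3, 6.0, 12.0), (3, 5.8, 11.5)]
labels = ["gumdrop", "gumdrop", "candy bar", "candy bar"]
model = train(rows, labels)
print(predict(model, (1, 2.05, 5.1)))
```

Adding a parameter (say, sugar content) raises the cost of both loops, but the training loop still dominates because it also scales with the number of observations.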

  167. [167]. Compare supra note 139 (highlighting the costs of training GPT-3), with William Dieterich, Eugenie Jackson & Christina Mendoza, Risk Assessment Factsheet: Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) Pretrial Release Risk Scale II (PRRS-II), uploads/2019/06/COMPAS-PRRS-II-Factsheet-Final-6.20.pdf [] (describing COMPAS as a regression model based on fewer than 3,000 observations).

  168. [168]. Baumol, Quasi-Permanence, supra note 22, at 12–13.

  169. [169]. But, in a market that can sustain some forms of product differentiation, a second application trained to optimize along a different dimension—candy size, say—could add new (different) capacity to the market. See infra notes 185–86 and accompanying text.

  170. [170]. See, e.g., Derrick Mwiti, Research Guide: Model Distillation Techniques for Deep Learning, Heartbeat (Nov. 20, 2019), []. My thanks to Aileen Nielsen for highlighting this research.

  171. [171]. See Ng, supra note 33, at 4; see also Lee et al., supra note 33, at 51–52 (similarly describing the “virtuous cycle of AI”). From a putative competitor’s perspective, this cycle may seem more vicious than virtuous.

  172. [172]. See, e.g., Sumit Gupta, Deep Learning Performance Breakthrough, IBM (Jan. 16, 2018), https:// []; see also Andrew Ng, PowerPoint Presentation at Extract Conf., What Data Scientists Should Know About Deep Learning, at slide 30 (Nov. 24, 2015), available at rew-ng-chief-scientist-at-baidu [].

  173. [173]. In addition to the sources described infra notes 174–75, see Dipayan Ghosh & Ben Scott, Digital Deceit II: A Policy Agenda to Fight Disinformation on the Internet, Shorenstein Ctr. (Oct. 2, 2018), [] (noting “curation and targeting algorithms feed on one another to grow ever smarter over time—particularly with the forward integration of advanced algorithmic technologies, including AI” and that “[t]he most successful of the[se] companies are natural monopolies in this space; the more data they collect, the more effective their services”).

  174. [174]. Pasquale, supra note 34, at 130; see also Bracha & Pasquale, supra note 18, at 1180–81 (“It is unclear whether search engines fall under the strict definition of a natural monopoly, but they exhibit very similar characteristics.”); Clark, supra note 108.

  175. [175]. Weber & Nicholas, supra note 18, at 25–26.

  176. [176]. See Lemley & McGowan, supra note 110, at 495 (“Network effects are demand-side effects—they result from the value that consumers place on owning what other consumers already own. By contrast, economies of scale are supply-side effects—they are a function of the cost of making the goods ... regardless of positive utility payoffs among consumers.”); see also Daniel McIntosh, We Need to Talk About Data: How Digital Monopolies Arise and Why They Have Power and Influence, 23 J. Tech. L. & Pol’y 185, 196 (2019) (drawing similar distinction). By contrast, the two preceding subsections do focus on supply-side costs.

  177. [177]. Indeed, these network effects can accrue even to a single user. Consider a police department that uses facial recognition technology to identify criminal suspects. Even if the police department is the only user of the deep learning system, accuracy might improve between the first, second, and third (and so on) uses by the single department, provided that each query contributes some new predictive information.

  178. [178]. See Shapiro & Varian, supra note 110, at 173–226; Ducci, supra note 18, at 5, 57–59 (“[T]he value of data and the economies of scale and scope that arise from larger datasets are critical for the purpose of improving search algorithm predictions . . . where more data increase the efficiency of search results in terms of quality-adjusted costs.”).

    This is especially true in instances of deep or continual learning, where significant accuracy improvements accrue across the entire set of training data (including data used for ongoing training). See supra note 172 and accompanying text. In more traditional machine learning examples where, say, new training data is incorporated at regular intervals through retraining and redeployment, this is also true—but the effect may be far less pronounced: In such scenarios, accuracy seems bounded by a minimum error rate—that is, the application learns quickly at first, but its improvement then slowly levels off. Hence, if the total available training data is large enough that even a fraction of it would allow an application to ascend the majority of the learning curve, then the market (and its data) may be split among multiple applications with comparatively slight accuracy effects (namely, the gradual improvements that would accrue to one provider after having ascended the steep part of the curve).

  179. [179]. See Shapiro & Varian, supra note 110, at 173–226; Lemley & McGowan, supra note 110, at 549 (explaining, from the example of the telephone networks, that though “network effects did not dictate a telephone network run by a single firm as a regulated natural monopoly” such network effects did “dictate[] that one network was the efficient outcome” (emphasis omitted)); see also Thomas Hanna, Mathew Lawrence & Nils Peters, A Common Platform 6 (2020), https:// [] (“Network effects ... are one of the reasons why some analysts consider the dominant platforms to essentially be natural monopolies.”).

  180. [180]. See generally David Singh Grewal, Network Power (2009) (describing the network effects of standard setting (to draw from the same example as in Lemley, supra note 133) as a form of power).

  181. [181]. Lemley, supra note 133, at 1056.

  182. [182]. Id.

  183. [183]. Id.

  184. [184]. See Ghosh & Scott, supra note 173 (“The size of its physical infrastructure, the sophistication of its data processing algorithms (including AI and machine learning), and the quantity of data served on its infrastructure and feeding its algorithms (constantly making them smarter) constitutes an unassailable market advantage that leads inexorably to natural monopoly.”).

  185. [185]. See, e.g., With Deep Learning, Disney Sorts Through a Universe of Content, Amazon Web Services, [] (explaining that Disney uses Amazon Web Services to enable deep learning algorithms that categorize its content repository).

  186. [186]. See Viscusi et al., supra note 15, at 87.

  187. [187]. See Hovenkamp, supra note 18, at 1971 (explaining that “natural-monopoly status actually applies to particular inputs or technologies”).

  188. [188]. See, e.g., Posner, Natural Monopoly, supra note 8, at 548 (suggesting that, in natural monopoly markets, “firms will quickly shake down to one”).

  189. [189]. See, e.g., Matt O’Brien, Microsoft Joins Amazon, IBM in Banning Police from Using Its Facial Recognition Tech, Chi. Trib. (June 11, 2020, 2:13 PM), business/ct-biz-amazon-facial-recognition-ai-20200611-at2lh7ub5bavzeukdrcsxk4ada-story.html [] (“Microsoft has become the third big tech company this week to say it won’t sell its facial recognition software to police, following similar moves by Amazon and IBM.”); see also Jeffrey Dastin, Amazon Extends Moratorium on Police Use of Facial Recognition Software, Reuters (May 18, 2021, 1:12 PM), [].

  190. [190]. See, e.g., O’Brien, supra note 189 (explaining that concerns related to law enforcement bias precipitated these companies’ withdrawal from the market).

  191. [191]. Id. (“But while all three companies are known for their work in developing artificial intelligence, including face recognition software, none is a major player in selling such technology to police ... . Several other companies that are less well known dominate the market for government facial recognition contracts in the U.S., including Tokyo-based NEC and the European companies Idemia and Gemalto.”); see also, e.g., Jared Council, Facial Recognition Companies Commit to Police Market After Amazon, Microsoft Exit, Wall St. J. (June 12, 2020, 5:28 PM), [] (quoting Jameson Spivack, a policy associate at the Center on Privacy and Technology at Georgetown Law, as stating that “[m]ost of the major companies that have the most contracts with law enforcement for facial recognition are smaller, specialized companies that most people have not heard of,” like NEC or Clearview AI); Mordor Intel., Global Facial Recognition Market (2020–2025) 4 (2019) (market analysis report similarly focusing on NEC, Idemia, and Gemalto (among other companies) and excluding, notably, Amazon, IBM, and Microsoft).

  192. [192]. See Julia Horowitz, Tech Companies Are Still Helping Police Scan Your Face, CNN Business (July 3, 2020, 8:36 AM), index.html [] (describing NEC’s domestic and international clients); Hill, supra note 135 (describing Clearview AI’s reach).

  193. [193]. But see O’Brien, supra note 189 (“MIT researcher Joy Buolamwini found racial and gender disparities in facial recognition software. Those findings ... irked Amazon, which last year publicly attacked her research methods.”).

  194. [194]. See Council, supra note 191 (finding that other providers had no intention of exiting the market over such concerns, explaining that their services “help identify suspects in criminal cases, find missing children and assist in other investigations”).

  195. [195]. See, e.g., Calo, supra note 50, at 1153–55 (personal privacy); Sonia K. Katyal, Private Accountability in the Age of Artificial Intelligence, 66 UCLA L. Rev. 54, 59–62 (2019) (civil rights); Pauline T. Kim, Data-Driven Discrimination at Work, 58 Wm. & Mary L. Rev. 857, 883–92 (2017) (anti-discrimination); Lemley & Casey, supra note 49, at 744–50 (copyright); Levendowski, supra note 49, at 630 (copyright).

  196. [196]. See, e.g., Hoofnagle & Whittington, supra note 38, at 626–28; Newman, supra note 54, at 203–06; Wu, supra note 38, at 296–99.

  197. [197]. Cf. Ovide, supra note 38 (explaining that Google’s photo storage service “has never really been free [because] Google uses our photos to train its software systems”); see also Daisuke Wakabayashi, A Former Google Executive Takes Aim at His Old Company with a Start-Up, N.Y. Times (June 21, 2020), [] (describing the process Google uses to target advertising based on user data).

  198. [198]. See, e.g., Kapczynski, supra note 134, at 1489; see also Shoshana Zuboff, The Age of Surveillance Capitalism 13 (2019) (describing the “new logic of [data] accumulation”).

  199. [199]. Cohen, supra note 50, at 1905, 1912–18; see Julie E. Cohen, Examined Lives: Informational Privacy and the Subject as Object, 52 Stan. L. Rev. 1373, 1426 (2000).

  200. [200]. See Jones & Tonetti, supra note 49, at 2820 (describing firms’ incentives to “hoard data in ways that are socially inefficient”); Stucke, supra note 39, at 286 (explaining that a monopolist in a data-intensive industry “has the incentive to reduce its privacy protection below competitive levels and collect personal data above competitive levels” giving rise to deadweight losses).

  201. [201]. Though, of course, this forces us to confront the question: How much is too much? The answer may lie in the cognitive and expressive burdens of surveillance described supra text accompanying note 199.

  202. [202]. Cf. Olivia Solon, Insecure Wheels: Police Turn to Car Data to Destroy Suspects’ Alibis, NBC News (Dec. 28, 2020, 8:18 AM), [] (interviewing one person who has declined to use connected vehicle services out of data extraction concerns); Valentino-DeVries, supra note 39 (explaining that privacy concerns have depressed interest in coronavirus-related contact-tracing applications); Cole et al., supra note 39 (“Privacy concerns, which lead to low adoption rates, are a barrier to the success of a contact tracing app.”).

  203. [203]. Cf. Will Douglas Heaven, The Way We Train AI is Fundamentally Flawed, MIT Tech. Rev. (Nov. 18, 2020), hine-learning-broken-real-world-heath-nlp-computer-vision [] (describing this phenomenon as data shift).

  204. [204]. See, e.g., Calo, supra note 50, at 1147–52; cf. Eleanor Fox & Harry First, We Need Rules to Rein in Big Tech, CPI Antitrust Chron. (Oct. 27, 2020), https://www.competitionpolicyinter [] (“The platforms vacuum up huge amounts of data from users of the platforms, and use the data not only for efficiencies but also for exploitations and exclusions.”).

  205. [205]. See, e.g., Algorithms Behaving Badly: 2020 Edition, Markup (Dec. 15, 2020, 8:00 AM), https:// [] (cataloging some of the most troublesome algorithmic deficiencies of 2020).

  206. [206]. See Heaven, supra note 203 (describing underspecification).

  207. [207]. See generally Friedman & Nissenbaum, supra note 50 (describing preexisting, technical, and emergent bias in computer systems).

  208. [208]. Solow-Niederman, supra note 53, at 641 & n.34 (quoting Casey Ross & Ike Swetlitz, IBM’s Watson Supercomputer Recommended ‘Unsafe and Incorrect’ Cancer Treatments, Internal Documents Show, STAT (July 25, 2018), []).

  209. [209]. Solow-Niederman, supra note 53, at 641.

  210. [210]. See, e.g., Casey Ross, Epic’s AI Algorithms, Shielded from Scrutiny by a Corporate Firewall, Are Delivering Inaccurate Information on Seriously Ill Patients, STAT (July 26, 2021), https://www.stat [] (noting “particular[] concern[] about Epic’s algorithm for predicting sepsis” and finding that it both “routinely fails to identify the condition in advance, and triggers frequent false alarms”); see also Heaven, supra note 203 (describing similar examples regarding eye disease, skin cancer, and kidney failure).

  211. [211]. See Todd Feathers, Google’s New Dermatology App Wasn’t Designed for People With Darker Skin, Vice: Motherboard (May 20, 2021, 8:40 AM), googles-new-dermatology-app-wasnt-designed-for-people-with-darker-skin []. This app is a complex case. Reporting suggests a clear problem of sampling bias, one that may affect dermatology practice more generally. Google’s own tests suggest that the app performs well for Black patients. But those results are based on ethnicity, which is distinct from skin type (i.e., color). Other researchers found Google’s training dataset to underrepresent patients whose skin color falls into the clinical definitions of “skin type V (brown)” and “skin type VI (dark brown or black).” Id.

  212. [212]. See Elizabeth Joh & Thomas Joo, The Harms of Police Surveillance Technologies, Denver L. Rev. F. (forthcoming 2022) (manuscript at 1).

  213. [213]. Buolamwini & Gebru, supra note 50, at 8–9 (evaluating three commercial gender classification systems and finding that darker-skinned females are misclassified in up to 34.7 percent of test cases, while lighter-skinned males are misclassified in only 0.8 percent of test cases); see also U.S. Gov’t Accountability Off., GAO-21-518, Facial Recognition Technology: Federal Law Enforcement Agencies Should Better Assess Privacy and Other Risks 24–26 (2021) [hereinafter GAO Report], [].

    Facial recognition is only the beginning. Speech recognition applications, for example, also perform markedly worse for Black populations than for white ones. See Allison Koenecke et al., Racial Disparities in Automated Speech Recognition, 117 Proc. Nat’l Acad. Scis. U.S. 7684, 7685 (2020). Others have found evidence of gender bias in image search. See Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez & Kai-Wei Chang, Men Also Like Shopping: Reducing Gender Bias Amplification Using Corpus-Level Constraints 1 (2017), []. And in predictive text, too. See Paresh Dave, Fearful of Bias, Google Blocks Gender-Based Pronouns from New AI Tool, Reuters (Nov. 27, 2018, 12:06 AM), []; see also Feathers, supra note 211 (noting concerns about a dermatology application).

  214. [214]. Timothy B. Lee, Detroit Police Chief Cops to 96-Percent Facial Recognition Error Rate, Ars Technica (June 30, 2020, 11:12 AM), [].

  215. [215]. See O’Neil, supra note 213, at 24–27; Kashmir Hill, Another Arrest, and Jail Time, Due to a Bad Facial Recognition Match, N.Y. Times (Jan. 6, 2021), technology/facial-recognition-misidentify-jail.html []; see also Garance Burke, Martha Mendoza, Juliet Linderman & Michael Tarm, How AI-Powered Tech Landed Man in Jail with Scant Evidence, AP News (Aug. 19, 2021), [] (describing the ShotSpotter application, used to identify the source of a gunshot sound, and an individual who was arrested on the strength of the application’s findings, but eventually released (after contracting COVID in jail) for insufficient evidence). These effects result not only from inaccuracies and bias in facial recognition (and other similar) applications, but also from those embedded in applications used to assess whether criminal defendants should be eligible for bail or parole. See Julia Angwin, Jeff Larson, Surya Mattu & Lauren Kirchner, Machine Bias, ProPublica (May 23, 2016), ments-in-criminal-sentencing [].

  216. [216]. Hence, in this example, the market failure I have identified is not necessarily linked to the application providers’ status as natural monopolists. Rather, the example here more closely resembles a principal-agent failure: Law enforcement agencies bargain on behalf of the public, but do not (and are not forced to) internalize all the costs of their decisions. See Mulligan & Bamberger, supra note 48, at 778–79. Hence, the agent’s decision may vary from the principal’s preferences. Cf. Glob. Tel*Link v. FCC, 866 F.3d 397, 404–05 (D.C. Cir. 2017) (describing a distinct but related principal-agent failure). To be sure, law enforcement agencies would certainly prefer more accurate systems to less accurate ones (though because they bear the costs of inaccuracies in different ways—police officers are rarely, if ever, jailed for misidentifying a suspect—they weigh accuracy differently than the general public does). It is only the unavailability of more accurate systems that might be attributed to a failure of competition—while the procurement and use of these systems more closely result from the principal-agent problem. In all events, no matter the mechanism—whether a principal-agent failure or natural monopoly—providers have reduced incentives to invest in training, yielding accuracy concerns. In future work, I plan to address a wider range of market failures, including both of those addressed in this footnote, that may require intervention in data-intensive industries.

  217. [217]. Cf. Lyria Bennett Moses, Regulating in the Face of Sociotechnical Change, in The Oxford Handbook of Law, Regulation, and Technology 573, 577–78 (Roger Brownsword, Eloise Scotford & Karen Yeung eds., 2017) (arguing that lawyers and regulators should focus on how they can best adjust the law and regulation in accordance with continual sociotechnical advancements rather than asking how to best regulate new technology).

  218. [218]. Cf. Plato, Republic 369 (Paul Shorey trans., Harv. Univ. Press 1969), http://www. [https://perm] (“Its real creator, as it appears, will be our needs.”).

  219. [219]. Cf. Proposal for a Regulation of the European Parliament and of the Council Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act) and Amending Certain Union Legislative Acts, COM (2021) 206 final (Apr. 2, 2021) (noting that the “socio-economic benefits of AI can also bring about risks of negative consequences for individuals or the society”).

  220. [220]. See supra Section II.B.1.

  221. [221]. See W. Nicholson Price II, Black-Box Medicine, 28 Harv. J.L. & Tech. 419, 462 (2015) (“Payment is the dominant concern for adoption of a newly available medical technology.”).

  222. [222]. Cf. id. at 462–63 (“[P]ublic policy could drive overall adoption [of machine-learning based medical tools] by encouraging reimbursement by public insurers, especially Medicare and Medicaid.”). Though Medicare is not, strictly speaking, a rate regulator, Medicare does, by setting reimbursement rates, have an important effect on market rates more generally. See, e.g., Eleanor D. Kinney, Making Hard Choices Under the Medicare Prospective Payment System: One Administrative Model for Allocating Medical Resources Under a Government Health Insurance Program, 19 Ind. L. Rev. 1151, 1164 (1986). Of course, this brief example is presented as simplified—pricing in medical markets is complicated, made even more so by the roles played by insurance networks—to highlight the threats of monopoly pricing and the role of responsive rate regulation.

  223. [223]. See Nicholas Bagley, Medicine as a Public Calling, 114 Mich. L. Rev. 57, 62 (2015); cf. Ramsi A. Woodcock, The Contrasting Approaches to Power of the Modern State and the Antitrust Laws: Lessons for Regulation 66 (Oct. 3, 2020) (unpublished manuscript), [] (“In theory, price regulation is a complete solution to the problem of the exercise of power.”).

  224. [224]. See supra notes 196–204 and accompanying text.

  225. [225]. See, e.g., Hoofnagle & Whittington, supra note 38, at 609–10 (explaining that many firms—including, ostensibly, firms that rely on machine learning technologies—“collect valuable information about consumers” and “realize financial gains from the collection and use of [such] information”); Fox & First, supra note 204, at 2–3.

  226. [226]. Several scholars have suggested that even where there are several participants in one market, such competition is unlikely to discipline such data “charges” because consumers are not particularly sensitive to such privacy-related concerns. See, e.g., Daniel J. Solove, Introduction: Privacy Self-Management and the Consent Dilemma, 126 Harv. L. Rev. 1880, 1880 (2013) (“[E]mpirical and social science research demonstrates that there are severe cognitive problems that undermine privacy self-management.”); see also Cohen, supra note 50, at 1919 (“[R]egulators timidly opine that privacy harms result from ‘unexpected’ disclosures of personal information and that more robust guarantees of notice and choice therefore may be needed to ‘build[] consumer trust in the marketplace.’ This simplistic view of the relationship between privacy and innovation is wrong.” (second alteration in original) (footnote omitted)). I do not mean to dispute that point here, and, indeed, in future work I plan to consider the possibilities for regulation even under conditions of competition. For now, it suffices to note that, even if consumers are privacy-sensitive, see, e.g., Janice Y. Tsai, Serge Egelman, Lorrie Cranor & Alessandro Acquisti, The Effect of Online Privacy Information on Purchasing Behavior: An Experimental Study, 22 Info. Sys. Rsch. 254, 255 (2011) (suggesting “that individuals are willing to pay a premium for privacy when privacy information is made prominent and intuitive”); see also Clark D. Asay, Consumer Information Privacy and the Problem(s) of Third-Party Disclosures, 11 Nw. J. Tech. & Intell. Prop. 321, 323 (2013) (noting “that notice and choice can and should play a significant role” in privacy law, even given the troubles that consumers face in fully understanding and internalizing privacy costs), regulation is warranted in machine-learning-based natural monopoly markets. 
Stated differently, the model described here suggests that regulation is necessary because competition is exceptionally unlikely to have this disciplining effect. Hence, like scholars who, drawing from behavioral literatures, conclude that regulation is necessary because consumers are insufficiently sensitive to these privacy risks, I conclude that regulation is necessary because, no matter the extent of consumer sensitivity, competition is unlikely to discipline application developers. In either case, relying on the usual market-based forces seems unlikely to succeed.

  227. [227]. Cf. 47 U.S.C. § 201(b) (2018) (requiring “just and reasonable” rates).

  228. [228]. See Katharina Pistor, Rule by Data: The End of Markets?, 83 L. & Contemp. Probs. 101, 103–04 (2020).

  229. [229]. See supra notes 75–76 and accompanying text (describing rate-of-return regulation).

  230. [230]. See, e.g., Rob Copeland & Sarah E. Needleman, Google’s ‘Project Nightingale’ Triggers Federal Inquiry, Wall St. J. (Nov. 12, 2019, 11:13 PM), [] (explaining that Google’s agreement to obtain medical records for health application development purposes included “personally identifiable information on millions of patients, such as names and dates of birth ... and some billing claims,” among other details); see also Roger Allan Ford & W. Nicholson Price II, Privacy and Accountability in Black-Box Medicine, 23 Mich. Telecomm. & Tech. L. Rev. 1, 32–35 (2016) (proposing some such limits); Christina Farr, Facebook Sent a Doctor on a Secret Mission to Ask Hospitals to Share Patient Data, CNBC (Apr. 6, 2018, 11:46 AM), [].

  231. [231]. Cf. Regulation (EU) 2016/679 of the European Parliament and of the Council of Apr. 27, 2016, General Data Protection Regulation, art. 5, 2016 O.J. (L 119) 1. Of course, some anonymization practices can be reverse-engineered. But some protection is likely better than none.

  232. [232]. See supra note 84 and accompanying text (describing such problems).

  233. [233]. Cf. John M. Newman, Regulating Attention Markets 45–48 (July 22, 2020) (unpublished manuscript), [] (proposing advertisement caps—i.e., caps on charges imposed on consumers in attention markets).

  234. [234]. See, e.g., 45 C.F.R. pts. 160–64 (2021); see also Cal. Civ. Code § 1798.100 (West 2020) (explaining that personal information may be used for business purposes only to the extent such use is “reasonably necessary and proportionate”).

  235. [235]. Cf. Salomé Viljoen, A Relational Theory of Data Governance, 131 Yale L.J. 573, 592 (2021) (describing “privacy law’s individualism” and explaining how that approach is insufficient in light of the “population-level insights” that application developers attempt to discern).

  236. [236]. But see Valentino-DeVries, supra note 39 (explaining that privacy concerns have depressed interest in coronavirus-related contact-tracing applications); Cole et al., supra note 39 (“Privacy concerns, which lead to low adoption rates, are a barrier to the success of a contact tracing app.”).

  237. [237]. Indeed, in a similar but distinct context, I have argued in favor of an agency-based approach. See Tejas N. Narechania, Patent Conflicts, 103 Geo. L.J. 1483, 1529–32, 1534–36 (2015); cf. Andrew Tutt, An FDA for Algorithms, 69 Admin. L. Rev. 83, 90 (2017) (arguing that “a dedicated agency charged with the mission of supervising the development, deployment, and use of algorithms will soon be highly desirable, if not necessary”); Ryan Calo, Ctr. for Tech. Innovation at Brookings, The Case for a Federal Robotics Commission 3 (2014), https:// [] (“[T]entatively conclud[ing] that the United States would benefit from an agency dedicated to the responsible integration of robotics technologies into American society.”). I am happy to align myself with such calls for greater administrative oversight of these systems and applications. Whether, however, such additional oversight should take the form of some new agency or, instead, additional capacity at existing agencies—particularly in view of the wide range of sectors, many of which are already regulated by agencies with relevant domain expertise, affected by machine-learning-based applications—is well beyond my present scope and left to future work.

  238. [238]. See Kapczynski, supra note 134, at 1480 (calling for an account of law that allows the public “to channel outrage into a platform for democratic change”); Paul M. Schwartz, Internet Privacy and the State, 32 Conn. L. Rev. 815, 816–17 (2000) (arguing in part “that the state has a special role in ... developing privacy norms”); see also Elettra Bietti, Locked-In Data Production: User Dignity and Capture in the Platform Economy 1 (Oct. 14, 2019) (unpublished manuscript), [] (contending that “individuals should have a say over how data is collected, used and stored, and not only a right to be compensated for uses of their data determined by others”); Ford & Price II, supra note 230, at 35–36; cf. Rahman, supra note 44 (considering regulatory possibilities through the “more expansive” lens of public utility regulation).

  239. [239]. See, e.g., W. Nicholson Price II, Problematic Interactions Between AI and Health Privacy, 2021 Utah L. Rev. 925, 935 (“AI weakens protections for health privacy, and health privacy weakens the AI used in health.”).

  240. [240]. See Paul M. Schwartz, Privacy and Democracy in Cyberspace, 52 Vand. L. Rev. 1609, 1648 (1999) (advocating for democratic deliberation and legislative enactment to protect privacy in cyberspace); see also Major Rate Case Process Overview, supra note 84 (“Rate cases are a primary instrument of government regulation of these industries. Interested persons may intervene and become parties in a utility company’s rate case. Typical intervenors include: industrial, commercial and other large-scale users of electricity; public interest groups; representatives of residential, low-income and elderly customers; local municipal officials; and, dedicated advocacy groups . . . . Rate cases proceed in an entirely public and open process.”); Anton Korinek, Integrating Ethical Values and Economic Value, in The Oxford Handbook of Ethics of AI 475, 491 (2020) (“[W]e need a large and concerted public effort ... to ensure we develop AI in a direction that is both economically beneficial and ethically desirable.”); Moses, supra note 217, at 581–83; Pistor, supra note 228, at 118; Ben Leonard, Rep. Buddy Carter Signals Support for Federal Privacy Legislation, Politico (Jan. 26, 2021, 1:58 PM), 26/buddy-carter-supports-federal-privacy-legislation-462679 [] (paraphrasing Representative Carter as saying that “privacy legislation ... could ease concerns and build public confidence in AI”); cf. Fed. Trade Comm’n, Comment Letter on Proposed Rule for Developing the Administration’s Approach to Consumer Privacy 11–12 (2018), oping-administrations-approach-consumer-privacy/p195400_ftc_comment_to_ntia_112018.pdf [] (explaining that the Commission already makes decisions about trade-offs among, say, innovation, privacy, and security).

  241. [241]. See William P. Rogerson & Howard Shelanski, Antitrust Enforcement, Regulation, and Digital Platforms, 168 U. Pa. L. Rev. 1911, 1924 (2020).

  242. [242]. See supra Section II.B.2 (describing service specification).

  243. [243]. See, e.g., Nuechterlein & Weiser, supra note 3, at 297–98. Contra Restoring Internet Freedom, 33 FCC Rcd. 311, para. 103 (2018) (declaratory ruling, report and order) (explaining that the Commission selected a statutory classification for broadband, even if it excludes low-income consumers from obtaining subsidized broadband internet access, because the benefits of “innovation” outweigh the costs of more limited broadband internet access for such consumers).

  244. [244]. For more on the need for, and the possibility of, data security regulation in response to market failure, see generally Jeffrey L. Vagle, Cybersecurity and Moral Hazard, 23 Stan. Tech. L. Rev. 71 (2020) (discussing issues with data security regulation in response to market failure and addressing the moral hazard).

  245. [245]. See Weiss & Tian, supra note 164, at 31; Meek et al., supra note 164, at 399–401; see also Solow-Niederman, supra note 53, at 682 (“Choices about the design of algorithmic systems, often made for business or technical reasons, will effectively regulate the ways that this technology interacts with human values.”).

  246. [246]. See supra notes 164, 205–16 and accompanying text.

  247. [247]. Cf. Yafit Lev-Aretz & Katherine J. Strandburg, Regulation and Innovation: Approaching Market Failure from Both Sides, 38 Yale J. on Regul. 1, 26 (2020) (“Regulation’s traditional goal is to bring market demand into better alignment with individual and social preferences and values ... .”).

  248. [248]. See Lee, supra note 214. Accuracy in machine learning’s computational context has a specific meaning, referring to the ratio of true positives and true negatives as compared to all outcomes (including false positives and false negatives). Here, however, I do not mean to imply that regulators should necessarily set a precise standard for this sort of accuracy (which would treat false positives and false negatives equally). Rather, I use accuracy in a more colloquial sense, to underscore the need for some collective determination of the range and rate of acceptable errors (including the need to protect against errors that are biased against certain populations).
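The computational definition of accuracy noted here—true positives plus true negatives over all outcomes—and its limitation (treating false positives and false negatives equally) can be illustrated with a toy confusion matrix. All counts and the two subgroups are invented for illustration:

```python
# Toy illustration: overall accuracy = (TP + TN) / all outcomes.
# Because the measure weighs false positives and false negatives
# identically, it can mask error rates that differ sharply across
# populations. Every number below is invented.

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def false_negative_rate(tp, tn, fp, fn):
    return fn / (fn + tp)

# Invented per-group confusion counts: (tp, tn, fp, fn)
group_a = (95, 880, 10, 15)   # few misclassifications for group A
group_b = (60, 880, 10, 50)   # far more false negatives for group B

totals = [sum(pair) for pair in zip(group_a, group_b)]
overall = accuracy(*totals)

print(f"overall accuracy: {overall:.3f}")   # high despite the disparity
print(f"FNR, group A:     {false_negative_rate(*group_a):.3f}")
print(f"FNR, group B:     {false_negative_rate(*group_b):.3f}")
```

On these invented counts, overall accuracy looks excellent even though group B's false-negative rate is several times group A's—the kind of population-specific error a colloquial accuracy standard would need to reach.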

  249. [249]. See Shira Ovide, A Case for Facial Recognition, N.Y. Times (Nov. 11, 2020), https:// [] (“James Tate, a member of Detroit’s City Council ... believe[s] th[at] facial recognition software—with appropriate guardrails, including multiple steps for approval[—]was an imperfect but potentially effective tool ... for law enforcement in Detroit.”); see also supra notes 213–16 and accompanying text.

  250. See, e.g., Price II, supra note 120, at 459 (“FDA cannot and should not abandon its command-and-control role in directly regulating premarket access for at least some forms of algorithmic medicine.”).

  251. See Lehr & Ohm, supra note 27, at 669–70 (describing various choices that developers may make, including choices about data inputs, data cleaning, acceptable errors, and so on). See generally Chen et al., supra note 165 (contending that discriminatory applications should be remedied through additional data collection, rather than model constraints).

  252. See, e.g., Major Rate Case Process Overview, supra note 84 (“Rate cases are a primary instrument of government regulation of these industries. Interested persons may intervene and become parties in a utility company’s rate case. Typical intervenors include: industrial, commercial and other large-scale users of electricity; public interest groups; representatives of residential, low-income and elderly customers; local municipal officials; and, dedicated advocacy groups ... . Rate cases proceed in an entirely public and open process.”); see also Breyer, supra note 16, at 351–52 (discussing how public interest participation in agency decisions protects the interests of various populations); cf. Frank Pasquale, The Black Box Society: The Secret Algorithms That Control Money and Information 216–18 (2015) (describing a need to reclaim institutions of democratic governance).

  253. See generally Friedman & Nissenbaum, supra note 50 (describing a taxonomy of bias encompassing preexisting bias, technical bias, and emergent bias). The forms of regulation described here are most likely to address forms of technical bias and to mitigate the effects of preexisting bias, but are more limited in their applicability to emergent bias.

  254. See supra notes 205–16 and accompanying text.

  255. Cf. Narechania, supra note 237, at 1529–32; Ruckelshaus v. Monsanto Co., 467 U.S. 986, 987–88 (1984).

  256. See, e.g., Solow-Niederman, supra note 53, at 638–39.

  257. See Virginia Dignum, Responsibility and Artificial Intelligence, in The Oxford Handbook of Ethics of AI 215, 224–25 (Markus D. Dubber, Frank Pasquale & Sunit Das eds., 2020); Lehr & Ohm, supra note 27, at 716–17 (“Over time, we imagine the boundary between [the] ‘art’ and [the] ‘science’ [of machine learning] will continue to shift, with more processes becoming understandable science and fewer remaining inscrutable art.”); see also Ryan Calo, Artificial Intelligence Policy: A Primer and Roadmap, 51 U.C. Davis L. Rev. 399, 420–22 (2017) (proposing solutions to the expertise gap).

  258. See Cathryn Virginia, Visionaries: Inioluwa Deborah Raji, MIT Tech. Rev. (Sept. 23, 2020), [] (describing Raji’s work and explaining that it has spurred “the US National Institute of Standards and Technology [to] update[] its annual audit of face recognition algorithms to include a test for racial bias”); see also GAO Report, supra note 213, at 7–8.

  259. Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices, FDA (Sept. 22, 2021), []; see Price II, supra note 120, at 423–24 (describing the FDA’s approval processes); see also Rai et al., supra note 32, at 5 (describing and critiquing the FDA’s approval process as both “not always ... asking for highly rigorous performance information with respect to the ML-CD software it reviews” and failing to publicly disclose “information about either the machine learning model used or the training data”).

  260. See supra note 251 and accompanying text (suggesting a similar analysis in the context of ratesetting); see also Rogerson & Shelanski, supra note 241, at 1924; id. at 1920–21 (noting benefits of expertise in making such a calculation); Price II, supra note 120, at 451 (suggesting that “a stricter regulatory approach ... could be overregulatory”). But cf. Lev-Aretz & Strandburg, supra note 247, at 17–22 (suggesting a more nuanced approach to the view that regulation stifles innovation); Kapczynski, supra note 134, at 1492–93 (same).

  261. See, e.g., the examples described supra notes 208–16 and accompanying text.

  262. See, e.g., Microsoft Recommends Government Lead AI Frameworks, Wash. Internet Daily (Sept. 25, 2020), [] (explaining that a government-led effort can help ensure that embedded values are “accountable and transparent”).

  263. Narechania, supra note 41, at 205 (describing such requirements).

  264. Some applications providers may counter that such regulations amount to a speech regulation that is impermissible under the First Amendment. Providers, that is, might argue that they have a First Amendment right to “speak” through inaccurate or incomplete prediction models. See Kapczynski, supra note 134, at 1510–11; cf. Kashmir Hill, Facial Recognition Start-Up Mounts a First Amendment Defense, N.Y. Times (Mar. 18, 2021), 08/11/technology/clearview-floyd-abrams.html []. But here, too, the natural monopoly condition provides a ready solution: Courts have long held that governments have more power to regulate providers, even in ways adjacent to the First Amendment, where they are natural monopolists or have market power. See, e.g., Cmty. Commc’ns Co. v. City of Boulder, 455 U.S. 40, 48–51 (1982). My future work will more fully examine this apparent market power exception to the First Amendment. For now, it suffices to note that this exception empowers governments to regulate the speech, such as it is, of machine-learning-based natural monopolists. See, e.g., Tim Wu, Machine Speech, 161 U. Pa. L. Rev. 1495, 1496–97 (2013); Stuart Minor Benjamin, Algorithms and Speech, 161 U. Pa. L. Rev. 1445, 1446–47 (2013).

  265. See Reva Schwartz, Leann Down, Adam Jonas & Elham Tabassi, Nat’l Inst. of Standards & Tech., A Proposal for Identifying and Managing Bias in Artificial Intelligence 5 (2021) (“NIST has experience in creating standards and databases, and has been evaluating the algorithms used in biometric technologies since the 1960s”), https://nvl []; see also Bran Knowles & John T. Richards, The Sanction of Authority: Promoting Public Trust in AI, in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency 262, 263 (2021), [https://] (“show[ing] that public trust in AI will arise only through development of a robust regulatory ecosystem that provides some guarantee that the public is protected from harmful consequences of AI.”).

  266. See Rahman, supra note 44, at 1637.

  267. See supra Part II.

  268. See Hemphill, supra note 136, at 1981 (explaining that “[f]ostering competition against the leading platforms is socially desirable [because] competition encourages lower prices and higher quality on both sides of the platform, including lower prices to advertisers and greater privacy protection for users,” among other reasons).

  269. Narechania & Stallman, supra note 88, at 584–86.

  270. But cf. Viscusi et al., supra note 15, at 421 (suggesting that a franchise bidding process that accounts for dimensions beyond price is a bad idea).

  271. See Daniel J. Solove, Understanding Privacy 133 (2008).

  272. Mulligan & Bamberger, supra note 48, at 779–80.

  273. Id. at 780.

  274. See id.; Cary Coglianese & Erik Lampmann, Contracting for Algorithmic Accountability, 6 Admin. L. Rev. Accord 175, 197 (2021) (suggesting that governments pay “greater attention to how they design and structure their contracts for services to develop and operate AI tools” and noting that such a procurement contracting approach can help to balance the costs and benefits of such technologies).

  275. Of course, policy makers might be rightfully concerned about the incumbents’ willingness to share existing infrastructure, and the new competitors’ appetite for making investments in new systems. See supra note 102 and accompanying text (describing how a similar scheme failed to achieve hoped-for results in the context of the 1996 Telecommunications Act).

  276. See supra Section III.B.1.

  277. Robin Harris, It’s Time to Spin AWS Out of Amazon: They Regulate Utilities, Don’t They?, ZDNet (May 26, 2020), [] (“Amazon’s Prime video runs on AWS, and so does Netflix, Disney+, and Hulu” and so “AWS has deep insight into its competitors’ businesses.”); see also Dana Mattioli & Joe Flint, How Amazon Strong-Arms Partners Using Its Power Across Multiple Businesses, Wall St. J. (Apr. 14, 2021, 10:34 AM), -multiple-businesses-11618410439 [].

  278. For some contrasting approaches to addressing this problem, compare, e.g., Lina M. Khan, The Separation of Platforms and Commerce, 119 Colum. L. Rev. 973, 1065–90 (2019) (proposing and considering a separations rule), with Woodcock, supra note 223, at 66 (proposing price regulation); Rogerson & Shelanski, supra note 241, at 1930–33 (proposing a nondiscrimination regime akin to network neutrality rules); and id. at 1934–36 (considering, cautiously, a separations rule or line-of-business restrictions). See Tim Wu, Network Neutrality, Broadband Discrimination, 2 J. Telecomms. & High Tech. L. 141 (2003) (proposing a non-discrimination regime).

  279. Pasquale, supra note 252, at 208–10. Cf. Yotam Harchol, Dirk Bergemann, Nick Feamster, Eric Friedman, Arvind Krishnamurthy, Aurojit Panda, Sylvia Ratnasamy & Michael Schapira, A Public Option for the Core, in SIGCOMM ’20 377, 378 (2020).

  280. Levendowski, supra note 49, at 622–30; Lemley & Casey, supra note 49, at 748; see also Kapczynski, supra note 134, at 1509.

  281. See, e.g., Sonia Katyal, The Paradox of Source Code Secrecy, 104 Cornell L. Rev. 1183, 1186–87 & n.13 (2019); Brenda M. Simon & Ted Sichelman, Data-Generating Patents, 111 Nw. U. L. Rev. 377, 378 (2017) (describing two such examples).

  282. See TensorFlow Federated, supra note 49.

  283. See, e.g., Peter K. Yu, Beyond Transparency and Accountability: Three Additional Features Algorithm Designers Should Build Into Intelligent Platforms, 13 Ne. U. L. Rev. 263, 290–95 (2020) (describing data interoperability as critical to sustaining competition); Solow-Niederman, supra note 53, at 689; Hovenkamp, supra note 18, at 2032–38. Public regulators, moreover, may be able to overcome the intellectual property concerns that would otherwise attend to forced data sharing. See, e.g., Kapczynski, supra note 134, at 1509 (describing some such possible intellectual property limits); Narechania, supra note 237, at 1483 (explaining how regulators have overcome similar objections in other policy contexts). Indeed, even where these datasets are protected by trade secret, see supra note 281 and accompanying text, regulators may require that incumbents negotiate rates for sharing such protected information with putative competitors. See generally Ruckelshaus v. Monsanto Co., 467 U.S. 986 (1984) (finding that, even though data used for product development and regulatory approval may be protected as trade secret and under the Constitution’s Takings Clause, statutory schemes for arbitrating a price for the use of that data by competitors satisfy constitutional scrutiny).

  284. See, e.g., Mary D. Fan, The Public’s Right to Benefit from Privately Held Consumer Big Data, 96 N.Y.U. L. Rev. 1438, 1477–91 (2021); see also Rogerson & Shelanski, supra note 241, at 1927–29, 1933–34 (similarly arguing in favor of interoperability and portability across data platforms).

  285. Even skeptics of monopoly leveraging theories might be persuaded by the possibility for leveraging in these contexts, especially if these natural monopolists are made subject to (informational) rate regulation. See Nuechterlein & Weiser, supra note 3, at 14–17 (explaining Baxter’s Law).

  286. See Michael Luca, Tim Wu, Sebastian Couvidat & Daniel Frank, Does Google Content Degrade Google Search? Experimental Evidence 13 (Harvard Business School, Working Paper No. 16-035, 2015); Pasquale, supra note 52, at 22.

  287. See Great W. Directories, Inc. v. Sw. Bell Tel. Co., 63 F.3d 1378, 1384, 1386 (5th Cir. 1995) (concluding that current customer information controlled by a telephone provider was an “essential facility,” and that the phone company could thus not attempt to monopolize the separate “directory market”).

  288. See generally Nikolas Guggenberger, Essential Platforms, 24 Stan. Tech. L. Rev. 237 (2021) (stating that an essential facilities framework can help constrain “Big Tech”); see also Khan, supra note 278, at 1066–74.

  289. See Symposium, Artificial Intelligence in Federal Agencies: Government by Algorithm: Symposium on Artificial Intelligence in Federal Agencies, Admin. Conf. U.S. 7 (2020) (statement of Hillary Brill, Geo. L. Inst. for Tech. Law and Pol’y) (noting such tradeoffs); Ovide, supra note 249.

  290. Cf. Bagley, supra note 223, at 62 (“Public utility regulation is every bit as much a part of that tradition as laissez-faire. And if the market-oriented approaches that are ascendant today prove unsatisfactory, public utility regulation is an option worth exploring ... . [T]he debate between market-oriented and regulatory approaches should unapologetically examine the virtues and vices of both.”).

  291. See, e.g., Joseph A. Schumpeter, Capitalism, Socialism and Democracy 89 (3d ed. 1950); Breyer, supra note 16, at 286–87.

  292. Russell & Norvig, supra note 6, at 28 (explaining that “many thousands of AI applications are embedded in the infrastructure of every industry” (quoting Kurzweil, supra note 6, at 204)).

  293. See, e.g., Breyer, supra note 16, at 346–54.


Robert and Nanci Corson Assistant Professor of Law, University of California, Berkeley, School of Law.

For helpful comments and suggestions, I owe many thanks to Michael Abramowicz, Abhay Aneja, Abbye Atkinson, Ken Bamberger, Pam Bookman, Maureen Carroll, Andrew Chin, Zach Clopton, Erin C. Delaney, Charles Duan, Robin Effron, Seth Endo, Mary Fan, Amit Gandhi, Rebecca Goldstein, Mark Gergen, David Grewal, Scott Hemphill, Chris Hoofnagle, Bert Huang, Ben Johnson, Sonia Katyal, Aniket Kesari, F. Scott Kieff, Mark Lemley, Joy Milligan, Aileen Nielson, David Noll, Michael Murray, Manisha Padi, Claudia Polsky, Asad Rahim, Steve Ross, David Schwartz, Paul Schwartz, David Simon, Aaron Simowitz, Erik Stallman, Maurice Stucke, Daniel Walters, Steve Weber, Rebecca Wexler, Tim Wu, Christopher Yoo, Adam Zimmerman, and audiences at the Center for Long Term Cybersecurity at the University of California, Berkeley, School of Information, the George Washington University Law School, the University of California, Berkeley, School of Law, the University of North Carolina School of Law, and the University of Pennsylvania Carey Law School. For outstanding research assistance, I thank the stellar law librarians at Berkeley Law, Tian Kisch, Delia Scoville, and Kaavya Shah. I also thank Adam Garcia and the editors of the Iowa Law Review for their helpful suggestions and careful edits.