Resonance Calendar for 2023-03-06

What got my attention last week?

Data Industry & Economics

Ben Lorica and Kenn So have followed up their first annual list of data and AI ‘pegacorns’ (startups that have at least $100M in annual revenue) with a second edition. Their analysis, in the closing thoughts, is worth considering carefully:

New pegacorn companies continue to emerge, many of which operate in the application layer. This trend is not surprising, as the application layer offers a wider range of personas to target, including function, industry, and geography, compared to the infrastructure layer. The pain point at the infrastructure level tends to be similar across companies, leading to a concentration of larger vendors.

The proliferation of foundation models also opens up opportunities

… 

The emergence of new pegacorn companies brings with it both risks and opportunities. One key risk is the increasing commoditization of products as a result of the availability of pre-trained models and open-source tools. This makes it easy to create similar products with simple user interfaces. To overcome this, companies must differentiate themselves by offering a deeper product suite tailored to specific personas or even specific users. On the other hand, the proliferation of foundation models and decentralized custom models open up opportunities for tooling companies to provide solutions for building, fine-tuning, and optimizing models, as seen in projects like LangChain and GPT Index.

Insights from New Data and AI Pegacorns - Gradient Flow

Policy

A theme that received a great deal of my attention last week was the state of data brokerage, including the most important actors, the regulatory governance that is in place, its effectiveness and what’s missing. I had noted Data Cartels: The Companies That Control and Monopolize Our Information when Stanford University Press published it in November 2022, but it was a piece in Wired by Dell Cameron that motivated me to spend a good part of my weekend with the book by Sarah Lamdan, of CUNY School of Law. It is extensively footnoted, with a thoroughness and references that reflect her background in Library & Information Management.

This book not only resonates with me; it has been setting off all sorts of alarm bells. It is definitely worth your time and attention if you are engaged, as I am, in various aspects of data engineering, data science and the industry of data and information.

An abstract from the first chapter: The Data Cartels: An Overview

Data analytics companies are a relatively new type of information firm, the result of mass consolidation across information markets and a proliferation of data analysis technologies. We think of companies like RELX (Reed Elsevier LexisNexis) and Thomson Reuters as publishers, but they’ve made a transition away from being traditional content providers. Instead, they’re crunching their warehouses of digital content through AI software, machine-learning technology, complex algorithms, and other types of data analytics systems to form new information products to sell. They’ve taken over multiple information markets, and they use their informational power to build “risk” and “insight” products that provide predictive and prescriptive information to law enforcement, lawyers, academic institutions, investors, and other entities that make big decisions about our lives. In an era of informational capitalism, they’re engaging in anticompetitive, cartel-like behavior to maintain and expand their control of data and information markets.

Rich Miller @rhm2k