It's Recommendations From Now On

In “The End of Social Media and the Rise of Recommendation Media,” Michael Mignano describes the transformation of many so-called social media platforms into recommendation media platforms (Mignano 2022). He also argues that this is for the better: it will give users a better experience. Many responses to his essay point out that Mignano, like a lot of tech wonks, misunderstands what ordinary people saw in social media: they were interested in the social dimension, with the media piece being interesting because it made it easier to share a variety of forms of information.

There are a couple of things to add to this discussion, I think, the first of which is that one wonders what social media might have looked like if it weren’t based on the usual American media model of being funded by advertising. What if Facebook had been a subscription service like Netflix? The need to generate revenue by selling users to advertisers meant that businesses had to make “sticky” content. To compete in a media landscape that can only be described as over-saturated, almost all those businesses found that fear and anger worked … or at least provided easy on-ramps, after which they could use a myriad of technologies developed by decades of market-focused psychological research to create addictive experiences.

All of this is an effort to turn social users into media consumers, which means that people like Mignano are really just in the same old business. It’s just got a lot more levers to pull, uses a lot more data, and has as much interest in making us better people or a better nation of people as bad old industries like big oil and big tobacco.

TikTok and ChatGPT in the News

The lead article in this news roundup isn’t about ChatGPT at all, but rather about the current trend among state governments to ban TikTok on state-issued devices and among public universities, usually in the same states, to ban TikTok on their wifi networks. The ostensible, and perhaps actual, reason for doing so is the data that TikTok can, or does, collect on its users, with the additional factor of TikTok’s unclear relationship with the CCP-run Chinese government.

To be clear, governments, and their publics, should be concerned about data collection by social media platforms, as well as all other businesses and organizations, including themselves (see note 1 below). Given the amount of data already available, what more the Chinese government, or any other entity, needs to know about each individual American citizen is really a matter of finer strokes of the brush.

Here’s a partial account of the data already out there:

YEAR  PLATFORM                    ACCOUNTS                      DETAILS
2018  Instagram, TikTok, YouTube  235 million                   profile name, real name, profile photo, likes, age, gender, +
2018  Facebook                    30 million                    everything
2018  Facebook                    419 million                   IDs, names, phone numbers
2019  Facebook                    540 million                   IDs, comments, likes, “reaction data”
2019  Facebook                    533 million in 106 countries  IDs, phone numbers, “other info”
2021  LinkedIn                    500 million                   full names, email, phone numbers, workplace information, +
2021  Clubhouse                   1.3 million                   User ID, Name, Photo URL, Username, Twitter handle, Instagram handle, + *
2021  Parler                      60TB user data                all network activity
2021  Gab                         60TB data                     all posts (public and private)

Given this data, an entity with the will and the means can now generate custom material that addresses a user with the correct form and content to get inside their information bubble. The means amount to sufficient computational power and data storage, each of which still gets cheaper every year, so this is now not only imaginable but feasible.

When you add in the ability to run A/B testing to see what works and what does not, how well it works, and to whom the user passes on the package, functionality which already exists on almost all social media platforms, you have the ability to deliver with remarkable precision exactly the package you want delivered.

This is something I explored with the Army over the last two years. With the rise of ChatGPT and other generative AIs, the idea that we are facing a new landscape, even now, has begun to creep into public discourse, as glimpsed in a recent report for Yahoo Finance, which notes “90% of online content could be generated by AI by 2025.”

Elsewhere, the NYT has coverage of how concerns over ChatGPT, and how they might be addressed, are working their way through universities.

For the record, I think Kevin Roose, also writing for the NYT, has the right approach: it makes me feel a little sorry for younger people that so much of the world as they will encounter it will be generated for them, but not necessarily of their choosing.

  1. The mantra, which should be a policy (or even a law?), for any organization should be: do not collect any data that you are not prepared either to spend inordinate time and sums of money protecting or to lose.

A General Index of Science

A little over a year ago Cory Doctorow echoed out to a larger audience a report by Nature on Carl Malamud’s development of “a full-text-searchable index of 100,000,000 scientific articles.” The catalog contains 355 billion words, and returns five-word snippets and citations in response to queries. It’s publicly available for all to mine and search.

The index itself is at The Internet Archive.
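
To make the idea of five-word snippets a little more concrete, here is a minimal Python sketch of how overlapping five-word sequences (n-grams) can be pulled out of a text. It is only an illustration: the function name `five_word_snippets`, the sample sentence, and the naive whitespace tokenization are all assumptions made for this example, not a description of how the General Index was actually built.

```python
def five_word_snippets(text, n=5):
    """Return every run of n consecutive words in text (hypothetical helper)."""
    # Naive whitespace tokenization: an assumption for this sketch only.
    words = text.split()
    # Slide a window of n words across the text, one word at a time.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Sample sentence made up for illustration.
sample = "the index returns five-word snippets and citations in response"
snippets = five_word_snippets(sample)
for snippet in snippets:
    print(snippet)
```

A text shorter than five words simply yields no snippets. An index built along these lines would also need to record which article each snippet came from, which this sketch leaves out.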

CSS Colors by Name

I prefer to keep things simple, so when I am working in/on CSS I tend to use the more limited palette of named colors precisely because they are named and not a hexadecimal sequence.

[Image: CSS Colors by Name]

Army Dayz

At some point I knew I needed to account for the two years I spent working for / in / with the Army. Army Dayz offers a chronology with some reflection. I also have notes on topical / intellectual matters that are, I hope, worth thinking about.

Scholarly XML for VS Code

Scholarly XML is an extension for Visual Studio Code with a validator and autocomplete for features typically needed by academic encoding projects. It checks if XML is well-formed, validates a file when you open or modify it, makes schema-aware suggestions for elements, attributes, and attribute values, shows documentation from the schema for elements, attributes, and attribute values when available, and wraps selected text with tags using Ctrl+e. Most importantly, it does not require Java!

Midjourney AI Image Creation

I asked Midjourney to create an image of “people and books in a network stretching as far as the eye can see”:

[Image: Midjourney output for the prompt above]

Story Circles

For those interested in the various abstractions about the “shape of stories” post-Freytag’s triangle (or pyramid), I sat down one day to try to graph three of the more popular circles currently, er, circulating.

[Image: Composite Story Circles]

Open Source / Public Domain Materials

If you’re in need of free-to-use, and possibly free-to-adapt – what the legal types call derive – images and possibly audio, there are a number of places you should definitely bookmark:

  • Library of Congress
  • Smithsonian Open Access encourages downloading, sharing, and reuse of the 4.4 million 2D and 3D digital items from its collections, with the promise of more to come. This includes images and data from across the Smithsonian’s 19 museums, nine research centers, libraries, archives, and the National Zoo. They note there’s no need to ask for permission.
  • Openverse
  • Yale Center for British Art: http://britishart.yale.edu/collections/search.
  • The Lewis Walpole Library: http://images.library.yale.edu/walpoleweb/ usually allows free reproduction inside scholarly books and journals.
  • Rijksmuseum (change to English): https://www.rijksmuseum.nl/en/rijksstudio offers public domain, free to use.
  • Wellcome Library: http://wellcomeimages.org/. Public domain, free to use: amazing range of subject matter beyond medicine and science.
  • The Folger Shakespeare Library: http://luna.folger.edu/luna/servlet/FOLGERCM1~6~6. Pretty good policy about reusing material inside scholarly books and journals.
  • At LACMA, look for images marked “Public Domain High Resolution Image Available” – many from the 18th century: http://collections.lacma.org/
  • The Metropolitan Museum of Art: http://www.metmuseum.org/research/image-resources#scholarly and via Images for Academic Publishing at ArtStor: http://www.artstor.org/content/collaborations
  • NYPL has some lovely digitized pieces from the 18th century, believe it or not, and all public domain: http://digitalcollections.nypl.org/.
  • Wikimedia Commons - includes notes about public domain images to identify them for use. For example: https://commons.wikimedia.org/wiki/File:Jean_Sim%C3%A9on_Chardin_The_Monkey_Antiquarian.jpg.
  • Digital Public Library of America http://dp.la/.
  • British Library on Flickr https://www.flickr.com/photos/britishlibrary/ Public domain images that they allow people to use are on their Flickr account.
  • Fisher Library in Toronto only charges for reproducing the images in digital format: very reasonable rates.
  • The PIMS in Toronto has an amazing collection: http://www.pims.ca/the-institute/directory-e-mail-and-telephone-contacts.
  • And what is FADIS?

Install Briss with Homebrew

If, like me, you found yourself in need of the PDF-cropping abilities of Briss, but have faced the wall that is Java on Apple Silicon, fear not. Homebrew has your back. For those not familiar with Homebrew, it is a package manager, much like the venerable MacPorts – which I used for years to manage my installation of Python before switching to Miniconda. All three are package managers that make it easy to install a variety of shell programs, as well as libraries for Python and other scripting languages, on your computer. All three work on a Mac. Homebrew also works on Linux, and Miniconda, like its larger sibling Anaconda, works on all three current OS platforms: macOS, Linux, and Windows.