A friend of mine pointed me to reporting by Futurity on the discovery of Ajami, a form of Arabic script modified to capture local spoken languages in Africa. It has apparently been in use for quite some time, and undermines claims by colonial operatives that natives were illiterate. Far from it, as it turns out! Across West Africa people had taken the Arabic script that came with the spread of Islam and used it to capture the sounds of the languages they spoke, thus making it possible to record a variety of information. Even better: it appears that the effort to document Ajami has been institutionalized. (My training in linguistics is largely self-taught, so I don’t know what the technical term is for this. A kind of creole perhaps? At the very least a hybrid written language?)
I’ve seen a number of more or less generic researcher positions that I think humanities majors with some computational abilities could not only manage but in which they could make a real difference. The convergence of quantitative and qualitative approaches, not only in the research that is often the focus of these jobs but also in the ability to communicate that research and to adapt to changing circumstances, is something we could really be developing in university programs.
Here’s one such position with the details of the particular organization removed, but I have seen so many positions like this that this almost feels like a template that’s been adapted:
The Researcher position supports the execution of research initiatives in support of organizational objectives and to the progression of the HR profession by producing content for publication online and in print. This position is focused on researching all things work, worker and workplace and may include topics such as human resources, workplace economic impacts, diversity, equity and inclusion, racial injustice, and worker benefits.
Research & Data Analysis
Conduct primary and secondary research on work, worker, and workplace topics
Design and execute concurrent research studies from start to finish, including defining objectives, developing research plans, designing, and programming surveys, analyzing results, summarizing findings, and creating final deliverables
Adapt research practices to meet business needs
Produce well-researched content for publication online and in print in compelling and creative ways to foster and elevate customer understanding and help achieve product and/or business impact.
Develop related content for multiple platforms, such as websites, email marketing, product descriptions, videos, and blogs.
Utilize industry best practices and familiarity with the organization’s mission to inspire ideas and content.
Project Management, Collaboration, & Communication
Organize schedules to complete drafts of content or finished projects, collaborating with team members within deadlines and ensure timely delivery of materials.
Act as a brand ambassador, ensuring that all projects fit the client’s style and voice
Communicate and collaborate internally and externally to support the research functions to bring high quality proposals, reports, deliverables, presentations, etc. to fruition and help bridge alignment within and across diverse teams to meet business objectives.
2 years of professional experience in research, project management, or data analysis
Proven time management skills, including prioritizing, scheduling, and adapting as necessary
Proficiency with computers, especially writing programs, such as Google Docs and Microsoft Word, Excel, Outlook, and PowerPoint
In “The End of Social Media and the Rise of Recommendation Media,” Michael Mignano describes the transformation of many so-called social media platforms into recommendation media platforms (Mignano 2022). He also argues that this is for the better: it will give users a better experience. Many responses to his essay point out that Mignano, like a lot of tech wonks, misunderstands what ordinary people saw in social media: they were interested in the social dimension, with the media piece being interesting because it made it easier to share a variety of forms of information.
There are a couple of things to add to this discussion, I think, the first of which is that one wonders what social media might have looked like if it weren’t based on the usual American media model of being funded by advertising. What if Facebook had been a subscription service like Netflix? The need to generate revenue by being able to sell users to advertisers meant that businesses had to make “sticky” content. To compete in a media landscape that can only be described as over-saturated, almost all those businesses found that fear and anger worked … or at least provided easy on-ramps, after which they could use a myriad of technologies developed by decades of market-focused psychological research to create addictive experiences.
All of this is in an effort to turn social users into media consumers, which means that people like Mignano are really just in the same old business. It has a lot more levers to pull and uses a lot more data, but it has as much interest in making us better people or a better nation of people as bad old industries like big oil and big tobacco.
The lead article in this news roundup isn’t about ChatGPT at all, but rather about the current trend among state governments to ban TikTok on state-issued devices and for public universities, usually in the same states, to ban TikTok on their wifi networks. The ostensible, and perhaps actual, reason for doing so is the data that TikTok can, or does, collect on its users, with the additional factor of TikTok’s unclear relationship with the CCP-run Chinese government.
To be clear, governments, and their publics, should be concerned about data collection by social media platforms, as well as all other businesses and organizations, including themselves.1 Given the amount of data currently already available, what more the Chinese government, or any other entity, needs to know about each individual American citizen is really a matter of finer strokes of the brush.
Here’s a partial account of the data already out there:
Instagram, TikTok, YouTube
profile name, real name, profile photo, likes, age, gender, +
IDs, names, phone numbers
IDs, comments, likes, “reaction data”
533 million in 106 countries
IDs, phone numbers, “other info”
full names, email, phone numbers, workplace information, +
Given this data, an entity with the will and the means (and the means amount to sufficient computational power and data storage, each of which still gets cheaper every year) can generate custom material that addresses a user with the correct form and content to get inside their information bubble. That is now not only imaginable but feasible.
When you add in the ability to run A/B testing to see what works, and how well it does (and to whom the user passes on the package), and what does not work, functionality which already exists on almost all social media platforms, you have the ability to deliver with remarkable precision exactly the package you want delivered.
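For readers who have not run an A/B test, here is a minimal sketch of the arithmetic behind "seeing what works." It assumes the simplest possible setup: two versions of a message, a count of how many people saw each, and a count of how many responded. The function name and the counts are hypothetical, and the two-proportion z-test is only one common way to make the comparison; it is not a description of what any particular platform does.

```python
from math import erf, sqrt


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference in response rates."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the assumption that both versions perform the same
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF computed from the error function
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value


# Hypothetical counts: version A got 120 responses from 1,000 views,
# version B got 150 responses from 1,000 views.
z, p = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the two versions is unlikely to be chance, which is all the tester needs to know before sending out the better-performing version more widely.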
This is something I explored with the Army over the last two years, but with the rise of ChatGPT, and other generative AIs, it has begun to creep into public discourse that we are facing a new landscape, even now, as glimpsed in a recent report for Yahoo Finance, which notes “90% of online content could be generated by AI by 2025.”
For the record, I think Kevin Roose, also writing for the NYT, has the right approach: it makes me feel a little sorry for younger people that so much of the world as they will encounter it will be generated for them, but not necessarily of their choosing.
1. The mantra, which should be a policy (or even a law?), for any organization should be: do not collect any data that you are not prepared either to spend inordinate time and sums of money protecting or to lose.
A little over a year ago Cory Doctorow echoed out to a larger audience a report by Nature on Carl Malamud’s development of “a full-text-searchable index of 100,000,000 scientific articles.” The catalog contains 355 billion words, and returns five-word snippets and citations in response to queries. It’s publicly available for all to mine and search.
At some point I knew I needed to account for the two years I spent working for / in / with the Army. Army Dayz offers a chronology with some reflection. I also have notes on topical / intellectual matters that are, I hope, worth thinking about.
Scholarly XML is an extension for Visual Studio Code with a validator and autocomplete for features typically needed by academic encoding projects. It checks if XML is well-formed, validates a file when you open or modify it, makes schema-aware suggestions for elements, attributes, and attribute values, shows documentation from schema for elements, attributes, and attribute values when available, and wraps selected text with tags using Ctrl+e. Most importantly, it does not require Java!
For those interested in the various abstractions about the “shape of stories” post-Freytag’s triangle (or pyramid), I sat down one day to try to graph three of the more popular circles currently, er, circulating.
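If "well-formed" is unfamiliar, the following sketch shows what that first check amounts to, using only Python's standard library. It is an illustration of the concept and makes no claim about how the extension itself is implemented. The function name and the sample markup are hypothetical, and the sketch covers well-formedness only, not validation against a schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The parser reports the line and column where parsing failed
        return False, str(err)
    return True, None


# Hypothetical snippets: the second one never closes its <hi> element.
good = "<p>Some <hi>emphasized</hi> text.</p>"
bad = "<p>Some <hi>emphasized text.</p>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

A file can be well-formed and still be invalid against a project's schema, which is why the extension's second step, validation, matters for encoding projects.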