TikTok and ChatGPT in the News
The lead article in this news roundup isn’t about ChatGPT at all, but rather about the current trend among state governments to ban TikTok on state-issued devices, and among public universities, usually in the same states, to ban TikTok on their wifi networks. The ostensible, and perhaps actual, reason is the data that TikTok can, or does, collect on its users, compounded by TikTok’s unclear relationship with the CCP-run Chinese government.
To be clear, governments, and their publics, should be concerned about data collection by social media platforms, as well as by all other businesses and organizations, including governments themselves.1 Given the amount of data already available, whatever more the Chinese government, or any other entity, needs to know about each individual American citizen is really a matter of finer strokes of the brush.
Here’s a partial account of the data already out there:
| Year | Platform | Records | Data exposed |
| --- | --- | --- | --- |
| 2018 | Instagram, TikTok, YouTube | 235 million | profile name, real name, profile photo, likes, age, gender, + |
| 2018 | | 419 million | IDs, names, phone numbers |
| 2019 | | 540 million | IDs, comments, likes, “reaction data” |
| 2019 | | 533 million in 106 countries | IDs, phone numbers, “other info” |
| 2021 | | 500 million | full names, email, phone numbers, workplace information, + |
| 2021 | Clubhouse | 1.3 million | User ID, Name, Photo URL, Username, Twitter handle, Instagram handle, + * |
| 2021 | Parler | 60TB user data | all network activity |
| 2021 | Gab | 60TB data | all posts (public and private) |
Given this data, and an entity with the will and means to use it (the means amount to sufficient computational power and data storage, each of which still gets cheaper every year), generating custom material that addresses a user with the right form and content to get inside their information bubble is now not only imaginable but entirely feasible.
Add in the ability to run A/B tests to see what works, how well it works (and to whom the user passes the package on), and what does not work, functionality that already exists on almost all social media platforms, and you can deliver with remarkable precision exactly the package you want delivered.
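The mechanics of such A/B testing are simple enough to sketch. A minimal version, assuming invented click counts for two message variants, is a two-proportion z-test: show each variant to a sample of users, count engagements, and check whether the difference is larger than chance would explain.

```python
import math

def ab_test(clicks_a, n_a, clicks_b, n_b):
    """Two-proportion z-test: did variant B outperform variant A?

    Returns the two engagement rates and the z-score; |z| > 1.96
    is conventionally 'significant' at the 95% level.
    """
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    # Pooled rate under the null hypothesis that A and B perform alike
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return p_a, p_b, z

# Hypothetical numbers: variant B's wording lifts engagement
p_a, p_b, z = ab_test(120, 2000, 165, 2000)
print(f"A: {p_a:.1%}, B: {p_b:.1%}, z = {z:.2f}")  # z ≈ 2.77, so keep B
```

At platform scale this loop runs continuously and automatically, which is what makes the precision described above achievable rather than hypothetical.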
This is something I explored with the Army over the last two years, but with the rise of ChatGPT and other generative AIs, the sense that we are facing a new landscape has begun to creep into public discourse, as glimpsed in a recent report for Yahoo Finance, which notes that “90% of online content could be generated by AI by 2025.”
Elsewhere, the NYT covers how concerns over ChatGPT, and how they might be addressed, are working their way through universities.
For the record, I think Kevin Roose, also writing for the NYT, has the right approach. It makes me feel a little sorry for younger people that so much of the world as they will encounter it will be generated for them, but not necessarily of their choosing.
The mantra, which should be a policy (or even a law?), for any organization: do not collect any data you are not prepared either to spend inordinate time and money protecting, or to lose. ↩