Countering AI disinformation and deep fakes with digital signatures

According to The Economist, disinformation campaigns (often state-sponsored) use “AI to rewrite real news stories”:

In early March [2024] a network of websites, dubbed CopyCop, began publishing stories in English and French on a range of contentious issues. They accused Israel of war crimes, amplified divisive political debates in America over slavery reparations and immigration and spread nonsensical stories about Polish mercenaries in Ukraine… the stories had been taken from legitimate news outlets and modified using large language models.

Deep fakes of still images and now video clips are similarly based on legitimate original photos and video. Detecting such fakery can be challenging.

Disinformation comes from publishers (social media posters, newspapers, bloggers, commenters, journalists, photographers, etc.) who invent or misquote factual claims or evidence. Ultimately, we trust publishers based on their reputation – for most of us an article published by chicagotribune.com is given more credence than one published by infowars.com.

An obvious partial solution (that I haven’t seen discussed) is for publishers to digitally sign their output, identifying themselves as the party whose reputation backs the claims, and perhaps including a permanent URL where the original version could be accessed for verification.
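As a rough illustration (not a proposed standard), here is what "signing their output" could look like in PHP using the built-in sodium extension's Ed25519 functions; key management and distribution are glossed over:

<?php
// Generate a keypair once; the publisher keeps the secret key and publishes the public key.
$keypair   = sodium_crypto_sign_keypair();
$secretKey = sodium_crypto_sign_secretkey($keypair);
$publicKey = sodium_crypto_sign_publickey($keypair);

$article   = "Full text of the news article...";
$signature = sodium_crypto_sign_detached($article, $secretKey);   // 64-byte Ed25519 signature

// Attach the base64 signature to the published article; the public key could live
// at a well-known URL on the publisher's domain (a hypothetical convention).
echo base64_encode($signature), "\n";
?>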

Publishers who wish to remain anonymous could sign with a nym (a pseudonym: a unique identifier under the control of an author – for example, an email address or domain name not publicly connected with an individual); this would enable anonymous sources and casual social media posters to maintain reputations.


Web browsers (or extensions) could automatically confirm or flag fakery of the claimed publisher identity, and automatically sign social media posts, comments, and blog posts. All that’s needed is a consensus standard on how to encode such digital signatures – the sort of thing that W3C and similar organizations produce routinely.
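For example (just a sketch – the real work is agreeing on where the signature and public key live), the check a browser or extension would run might amount to:

<?php
// Verify that an article really was signed by the claimed publisher.
// $publicKey would be fetched once from the publisher's domain, e.g. from a
// hypothetical well-known URL like https://example-publisher.com/.well-known/signing-key
function looks_authentic(string $article, string $signatureB64, string $publicKey): bool
{
    $signature = base64_decode($signatureB64, true);
    if ($signature === false) {
        return false;   // malformed signature
    }
    return sodium_crypto_sign_verify_detached($signature, $article, $publicKey);
}
?>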

Third-party rating services could assign trust scores to publishers. Again, a simple consensus standard could allow a web browser to automatically retrieve ratings from such services. (People with differing views will likely trust different rating services.) Rating services will want to keep in mind that individual posters may sometimes build a reputation only to later “spend” it on a grand deception; commercial publishers whose income depends on their reputation may be more trustworthy.
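A purely hypothetical sketch of such a lookup – the service, URL pattern, and response format are all invented for illustration – might be:

<?php
// Ask a rating service the user trusts for a score on a publisher,
// identified by the fingerprint of the publisher's public key.
function trust_score(string $publisherPublicKey, string $ratingService): ?float
{
    $fingerprint = hash('sha256', $publisherPublicKey);
    $response = @file_get_contents("$ratingService/score/$fingerprint");
    if ($response === false) {
        return null;    // no rating available
    }
    $data = json_decode($response, true);
    return isset($data['score']) ? (float) $data['score'] : null;
}
?>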

Posts missing such signatures, or signed by publishers with poor trust scores, could be automatically flagged as unreliable or propaganda.

Signatures could be conveyed in a custom HTML wrapper that is invisible to readers and harmlessly ignored by browsers unable to parse it – there’s no need to sprinkle “BEGIN PGP SIGNED MESSAGE” at the start of every article.
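One possible (entirely made-up) encoding: carry the signature in attributes on a wrapper element, which browsers that don’t understand them render exactly as before:

<?php
// Emit the article wrapped in an element whose data-* attributes carry the
// signature metadata; readers see only the article itself.
$article      = "<p>Full text of the news article...</p>";
$publisher    = "chicagotribune.com";
$signatureB64 = "...";   // base64 signature, produced as in the signing sketch above

echo "<article data-sig-alg=\"ed25519\" data-sig-publisher=\"$publisher\" data-sig=\"$signatureB64\">\n";
echo $article, "\n";
echo "</article>\n";
?>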

Signatures can be layered – a photo could be signed by the camera that captured the original (manufacturer, serial number), the photographer (name, nym, unique email address), and the publisher, all at once; the same applies to text news articles.

When a new article is created by mixing/editing previously published material from multiple sources, the new article’s publisher could sign it (taking responsibility for the content as a whole) while wrapping all the pre-existing signatures. A browser could, if the user wants and the sources remain available, generate a revision history showing the original sources and editorial changes (rewording, mixing, cropping, etc.). Trust scores could be automatically generated by AI review of changes from the sources.
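A sketch of how layering could work – each party signs the content plus whatever signatures are already attached, so a browser can unwind the chain from publisher back to the original capture (key handling simplified; the names and identifiers are made up):

<?php
// Each call wraps the existing package in a new signed layer.
function add_layer(array $package, string $signer, string $secretKey): array
{
    $payload = json_encode($package);
    return [
        'signer'    => $signer,
        'signature' => base64_encode(sodium_crypto_sign_detached($payload, $secretKey)),
        'wrapped'   => $package,
    ];
}

// Stand-in keys and content for illustration only.
$cameraKey       = sodium_crypto_sign_secretkey(sodium_crypto_sign_keypair());
$photographerKey = sodium_crypto_sign_secretkey(sodium_crypto_sign_keypair());
$publisherKey    = sodium_crypto_sign_secretkey(sodium_crypto_sign_keypair());
$photoBytes      = "...raw image bytes...";

$package = ['content' => base64_encode($photoBytes)];
$package = add_layer($package, 'camera:ACME-1234',              $cameraKey);
$package = add_layer($package, 'photographer:jane@nym.example', $photographerKey);
$package = add_layer($package, 'publisher:chicagotribune.com',  $publisherKey);
?>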

Video could be signed on a per-frame basis as well as a whole-clip or partial-clip basis. Per-frame signatures could include consecutive frame numbers (or timestamps), enabling trivial detection of selective editing to produce out-of-context false impressions.
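For instance (again just a sketch), the signed payload for each frame could include its frame number, so dropping or reordering frames breaks the sequence even though every individual signature still verifies:

<?php
$secretKey = sodium_crypto_sign_secretkey(sodium_crypto_sign_keypair());
$frames = ["frame 0 bytes", "frame 1 bytes", "frame 2 bytes"];   // stand-in for decoded video frames

$signedFrames = [];
foreach ($frames as $frameNumber => $frameBytes) {
    // Sign the frame number together with a hash of the frame contents.
    $payload = $frameNumber . ':' . hash('sha256', $frameBytes, true);
    $signedFrames[] = [
        'frame'     => $frameNumber,
        'signature' => base64_encode(sodium_crypto_sign_detached($payload, $secretKey)),
    ];
}
?>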

If there’s a desire for immutability or verifiable timestamps, articles (or signed article hashes) could be stored on a public blockchain.

Somebody…please pursue this?

SSI to PHP Conversion Made Easy

I’m learning PHP in order to simplify one of the websites I maintain.

It turns out that SSI (server-side includes) and PHP are incompatible – you can’t use both at once, so I have to convert all my SSI includes into PHP includes.

PROBLEM STATEMENT

I’ve been using the “virtual” SSI mode, as in:

<!--#include virtual="/INCLUDE/_HEADER.html" -->

This is nice because it’s relative to the site root (instead of relative to the calling document location), so you can cut-and-paste include directives without worrying about what folder the calling document is in.

Unfortunately, PHP’s include() and require() have no root-relative mode like SSI’s virtual – relative paths are resolved against the calling script (or the include_path). So to get something location-independent you have to do something like:

<?php include("/home/dave/public_html/nerdfever/INCLUDE/_HEADER.html"); ?>

I guess that would be OK except that I run test sites on another domain, and keep the files in another folder (think “testsite” instead of “nerdfever” in the path above).  That means all my include() calls need to change when switching from the test site to the live site – which defeats the purpose of testing.

SOLUTION

I found a trick that solves the problem.  There’s a global variable in PHP called $DOCUMENT_ROOT – so you just do this:

<?php include("$DOCUMENT_ROOT/INCLUDE/_HEADER.html"); ?>
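One caveat: the bare $DOCUMENT_ROOT variable is only populated when the host has the old register_globals setting turned on (it’s off by default in newer PHP). If it comes up empty on your host, the same value is always available through the $_SERVER superglobal:

<?php include($_SERVER['DOCUMENT_ROOT'] . "/INCLUDE/_HEADER.html"); ?>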

Strangely, I couldn’t find anyone else posting this solution on the web, so I thought I’d post it here.

There are other solutions, but they’re either more complicated or they require support of the PHP virtual() function, which isn’t allowed on most shared hosts.

Have fun.