Countering AI disinformation and deep fakes with digital signatures

According to The Economist, disinformation campaigns (often state-sponsored) use “AI to rewrite real news stories”:

In early March [2024] a network of websites, dubbed CopyCop, began publishing stories in English and French on a range of contentious issues. They accused Israel of war crimes, amplified divisive political debates in America over slavery reparations and immigration and spread nonsensical stories about Polish mercenaries in Ukraine… the stories had been taken from legitimate news outlets and modified using large language models.

Deep fakes of still images and now video clips are similarly based on legitimate original photos and video. Detecting such fakery can be challenging.

Disinformation comes from publishers (social media posters, newspapers, bloggers, commenters, journalists, photographers, etc.) who invent or misquote factual claims or evidence. Ultimately, we trust publishers based on their reputation – for most of us an article published by chicagotribune.com is given more credence than one published by infowars.com.

An obvious partial solution (that I haven’t seen discussed) is for publishers to digitally sign their output, identifying themselves as the party whose reputation backs the claims, and perhaps including a permanent URL where the original version could be accessed for verification.

Publishers who wish to remain anonymous could sign with a nym (pseudonym; a unique identifier under control of an author – for example an email address or unique domain name not publicly connected with an individual); this would enable anonymous sources and casual social media posters to maintain reputations.

Web browsers (or extensions) could automatically confirm or flag fakery of the claimed publisher identity, and automatically sign social media posts, comments, and blog posts. All that’s needed is a consensus standard on how to encode such digital signatures – the sort of thing that W3C and similar organizations produce routinely.

Third party rating services could assign trust scores to publishers. Again, a simple consensus standard could allow a web browser to automatically retrieve ratings from such services. (People with differing views will likely trust different rating services). Rating services will want to keep in mind that individual posters may sometimes build a reputation only to later “spend” it on a grand deception; commercial publishers whose income depends on their reputation may be more trustworthy.

Posts missing such signatures, or signed by publishers with poor trust scores, could be automatically flagged as unreliable or propaganda.
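As a rough sketch of the flagging step, assuming the signature check itself has already been done elsewhere: the `Post` class, the labels, the 0.5 threshold, and the example scores below are all made up for illustration, not part of any existing standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Post:
    publisher: Optional[str]  # claimed publisher identity, None if unsigned
    signature_valid: bool     # outcome of a signature check done elsewhere

def flag_post(post, trust_scores, threshold=0.5):
    """Return a label a browser or extension might show next to a post."""
    if post.publisher is None:
        return "unsigned"
    if not post.signature_valid:
        return "forged signature"
    score = trust_scores.get(post.publisher)
    if score is None:
        return "unrated publisher"
    return "trusted" if score >= threshold else "low trust"

# Hypothetical scores from a rating service the reader has chosen to trust.
scores = {"chicagotribune.com": 0.9, "infowars.com": 0.1}
```

A reader who trusts a different rating service would simply pass a different `trust_scores` table.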

Signatures could be conveyed in a custom HTML wrapper that needn’t be visible to readers with web browsers unable to parse them – there’s no need to sprinkle “BEGIN PGP SIGNED MESSAGE” at the start of every article; these can be invisible to users.

Signatures can be layered – a photo could be signed by the camera capturing the original (manufacturer, serial number), the photographer (name, nym, unique email address), and the publisher, all at the same time. The same applies to text news articles.

When a new article is created by mixing/editing previously published material from multiple sources, the new article’s publisher could sign it (taking responsibility for the content as a whole) while wrapping all the pre-existing signatures. A browser could, if a user wanted and the sources remain available, generate a revision history showing the original sources and editorial changes (rewording, mixing, cropping, etc.). Trust scores could be automatically generated by AI review of changes from the sources.
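The wrapping idea can be sketched as nested envelopes, where each new signer signs the content hash together with every envelope it wraps. This is only an illustration under loud assumptions: the envelope layout and signer names are invented, and HMAC with shared demo keys stands in for a real public-key signature (such as Ed25519), which the Python standard library does not provide.

```python
import hashlib
import hmac
import json

# Demo keys only. A real scheme would use public-key signatures so that
# anyone can verify a layer without holding the signer's secret.
KEYS = {
    "camera:SN123": b"k1",
    "photographer@example.com": b"k2",
    "publisher.example": b"k3",
}

def _tag(signer, body):
    # Canonical JSON of the body, authenticated with the signer's key.
    data = json.dumps(body, sort_keys=True).encode("utf-8")
    return hmac.new(KEYS[signer], data, hashlib.sha256).hexdigest()

def sign(signer, content, wrapped=()):
    """Sign content plus any earlier envelopes this layer wraps."""
    body = {
        "signer": signer,
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "wrapped": list(wrapped),
    }
    return {**body, "sig": _tag(signer, body)}

def verify(envelope):
    """Check this layer and, recursively, every layer it wraps."""
    body = {k: v for k, v in envelope.items() if k != "sig"}
    if not hmac.compare_digest(_tag(body["signer"], body), envelope["sig"]):
        return False
    return all(verify(inner) for inner in body["wrapped"])

def signers(envelope):
    """List signers from the outermost layer inward."""
    names = [envelope["signer"]]
    for inner in envelope["wrapped"]:
        names.extend(signers(inner))
    return names

photo = b"raw sensor data"
camera_layer = sign("camera:SN123", photo)
photographer_layer = sign("photographer@example.com", photo, [camera_layer])
article = sign("publisher.example", b"cropped photo", [photographer_layer])
```

Because each layer's signature covers the envelopes inside it, altering any inner layer invalidates the outer ones as well.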

Video could be signed on a per-frame basis as well as a whole-clip or partial-clip basis. Per frame signatures could include consecutive frame numbers (or timestamps), enabling trivial detection of selective editing to produce out-of-context false impressions.
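A minimal sketch of the per-frame check, again with HMAC and a demo key standing in for a real camera signature; the frame layout and function names are assumptions for illustration. Because the frame number is part of the signed message, dropping frames leaves a visible gap that can't be papered over by renumbering.

```python
import hashlib
import hmac

KEY = b"camera-demo-key"  # stand-in for the camera's real signing key

def _frame_tag(number, pixels):
    # The frame number is signed along with the pixel hash.
    message = number.to_bytes(8, "big") + hashlib.sha256(pixels).digest()
    return hmac.new(KEY, message, hashlib.sha256).hexdigest()

def sign_frame(number, pixels):
    return {"n": number, "pixels": pixels, "sig": _frame_tag(number, pixels)}

def check_clip(frames):
    """Return a list of problems: bad signatures and gaps in numbering."""
    problems = []
    previous = None
    for frame in frames:
        expected = _frame_tag(frame["n"], frame["pixels"])
        if not hmac.compare_digest(expected, frame["sig"]):
            problems.append(f"frame {frame['n']}: bad signature")
        if previous is not None and frame["n"] != previous + 1:
            problems.append(f"gap: frame {previous} is followed by frame {frame['n']}")
        previous = frame["n"]
    return problems

clip = [sign_frame(i, bytes([i]) * 4) for i in range(5)]
edited = clip[:2] + clip[4:]  # frames 2 and 3 cut out
```

Here `check_clip(clip)` finds nothing, while `check_clip(edited)` reports the jump from frame 1 to frame 4.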

If there’s a desire for immutability or verifiable timestamps, articles (or signed article hashes) could be stored on a public blockchain.

Somebody…please pursue this?

Von Neumann’s First Draft of a Report on the EDVAC

I just read John von Neumann’s First Draft of a Report on the EDVAC, a tremendously influential 1945 document about computer architecture. A paper copy is available on Amazon, and the identical document as a PDF here.

Written after his experience with ENIAC (where he, famously, configured the plugboards to implement stored instructions in memory), the document describes the architecture of a (then) next-generation machine. Supposedly this early draft (the only draft he ever produced) was circulated widely and led to many implementations of similar machines in the late 1940s.

It’s a fascinating historical document. One thing that jumped out at me (no pun intended) is that nowhere in the document does he mention the idea of conditional branches – without those it’s extremely difficult or impossible to make the machine Turing-complete.

Also, he seems to have conceived the machine purely as a programmable calculator. The concept of what we’d call “data processing” is completely absent.

Von Neumann is considered one of the smartest people to have ever lived – Hans Bethe said “I have sometimes wondered whether a brain like von Neumann’s does not indicate a species superior to that of man”, and Edward Teller (a very competitive guy) admitted that he “never could keep up with John von Neumann.”

Teller also said “von Neumann would carry on a conversation with my 3-year-old son, and the two of them would talk as equals, and I sometimes wondered if he used the same principle when he talked to the rest of us.”

I think this just goes to show how difficult it is to predict even the future of a narrow technology.

Make Windows open URLs in Linux .desktop files

Here’s a Python script that makes Windows open a URL (in your default browser) when you execute (double click) a Linux .desktop file that contains a URL.

If you use both Windows and Linux, and save links to websites from a browser in Linux, the link will become a .desktop file, which can be opened by Linux. .desktop files are a lot like Windows “shortcut” files, but are (of course!) incompatible with them.

This script lets Windows open the website link saved in a .desktop file.

It relies on Python already being installed in your Windows system.

You can use it at the Windows command line like this:

python launch.desktop.py <.desktop file>

To use it at the Windows GUI (to be able to double-click on a .desktop file to launch it), put the following line in a batch file in your executable path somewhere (one place that would work would be C:\WINDOWS\System32) as “launch.desktop.bat”:

python "%~dp0launch.desktop.py" %1

Then put the Python script in the same folder where you put the batch file, as “launch.desktop.py”:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
    Launches a Linux .desktop file that contains a URL, on Windows.

    Tested on Win10, Python 3.11.
    
    Note: Doing this in a Windows batch file is a problem because findstr doesn't want to open files with special characters in them
    (which Linux likes to put there). There are ways around that (make temp files), but Python doesn't mind the characters.

    """

__author__    = 'NerdFever.com'
__copyright__ = 'Copyright 2023 NerdFever.com'
__version__   = ''
__email__     = 'dave@nerdfever.com'
__status__    = 'Development'
__license__   = """Copyright 2023 NerdFever.com

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License."""

import os, sys, subprocess

def main():
    if len(sys.argv) != 2:
        print(os.path.basename(__file__))
        print("Launches a Linux .desktop file that contains a URL, on Windows.")
        print("Usage: python", os.path.basename(__file__), "<filepath>")
        return

    filepath = sys.argv[1]

    if not os.path.exists(filepath):
        print(f"Error: {filepath} does not exist!")
        return
    
    # open file, scan for "URL=" or "URL[anything]=", get URL from remainder of line
    url = None
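    # .desktop files are UTF-8; don't rely on the Windows default code page
    with open(filepath, "r", encoding="utf-8", errors="replace") as f: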
        for line in f:
            if line.startswith("URL="):
                url = line[4:].strip()
                break
            elif line.startswith("URL["):
                url = line[line.find("]")+1:].strip()
                break

    if url is None:
        print(f"Error: {filepath} does not contain a URL!")
        return
    
    # open the URL with the Windows default handler (the default browser);
    # os.startfile avoids cmd.exe, which would split a URL at any "&"
    os.startfile(url)

if __name__ == "__main__":
    main()

The first time you double-click on a .desktop file, Windows will say it doesn’t know how to open it, and offer to let you choose a program on your PC for that. Choose “launch.desktop.bat” and check the box “Always use this app to open .desktop files”. Done.
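For reference, a link-type .desktop file is an INI-style text file with a [Desktop Entry] section, so the URL lookup can also be sketched with Python's configparser instead of scanning lines by hand. This is an alternative illustration, not the script above; the sample file contents and the function name are made up.

```python
import configparser

# A made-up example of what a saved link might look like.
SAMPLE = """\
[Desktop Entry]
Type=Link
Name=Example
URL=https://example.com/?a=1&b=2
"""

def url_from_desktop_text(text):
    """Return the URL from .desktop file contents, or None if there is none."""
    # interpolation=None: URLs may contain '%'. strict=False: tolerate duplicates.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case, so "URL" stays "URL"
    try:
        parser.read_string(text)
    except configparser.Error:
        return None
    if not parser.has_section("Desktop Entry"):
        return None
    for key, value in parser["Desktop Entry"].items():
        # Accept plain "URL" as well as bracketed variants like "URL[...]".
        if key == "URL" or key.startswith("URL["):
            return value.strip()
    return None
```

Parsing the contents rather than a path keeps the function easy to try out on any platform.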

Using AI to moderate online discussions

This brief post is here solely as prior art to make it more difficult for someone to patent these ideas. Probably I’m too late (the idea is so obvious the patent office probably has a dozen applications already), but I’m trying.

GOALS

One of my pet projects for years has been finding a way to promote civil discussion online. As everyone knows, most online discussion takes place in virtual cesspits – Facebook, Twitter, the comments sections of most news articles, etc. Social media and the ideological bubbles it promotes have been blamed for political polarization and ennui of young people around the world. I won’t elaborate on this – others have done that better than I can.

The problem goes back at least to the days of Usenet – even then I was interested in crowdsourced voting systems where “good” posts would get upvoted and “bad” ones downvoted in various ways, together with collaborative filtering on a per-reader basis to show readers the posts they’ll value most. I suppose many versions of this must have been tried by now; certainly sites like Stack Exchange and Reddit have made real efforts. The problem persists, so these solutions are at best incomplete. And of course some sites have excellent quality comments (I’m thinking of https://astralcodexten.substack.com/ and https://www.overcomingbias.com/), but these either have extremely narrow audiences or the hosts spend vast effort on manual moderation.

My goal (you may not share it) is to enable online discussion that’s civil and rational. Discussion that consists of facts and reasoned arguments, not epithets and insults. Discussion that respects the Principle of Charity. Discussion where people try to seek truth and attempt to persuade rather than bludgeon those who disagree. Discussion where facts matter. I think such discussions are more fun for the participants (they are for me), more informative to readers, and lead to enlightenment and discovery.

SHORT VERSION

Here’s the short version: When a commenter (let’s say on a news article, editorial, or blog post) drafts a post, the post content is reviewed by an AI (an LLM such as a GPT, as are currently all the rage) for conformity with “community values”. These values are set by the host of the discussion – the publication, website, etc. The host describes the values to the AI, in plain English, in a prompt to the AI. My model is that the “community values” reflect the kind of conversations the host wants to see on their platform – polite, respectful, rational, fact-driven, etc. Or not, as the case may be. My model doesn’t involve “values” that shut down rational discussion or genuine disagreement (“poster must claim Earth is flat”, “poster must support Republican values”…), altho I suppose some people may want to try that.

The commenter drafts a post in the currently-usual way, and clicks the “post” button. At that point the AI reviews the text of the comment (possibly along with the conversation so far, for context) and decides whether the comment meets the community values for the site. If so, the comment is posted.

If not, the AI explains to the poster what was wrong with the comment – it was insulting, it was illogical, it was…whatever. And perhaps offers a restatement or alternative wording. The poster may then modify their comment and try again. Perhaps they can also argue with the AI to try to convince it to change its opinion.
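The flow above can be sketched in a few lines. Everything here is hypothetical scaffolding: `review` stands for whatever call sends the host's prompt, the thread, and the draft to an LLM, and `toy_reviewer` is a trivial placeholder so the sketch runs without one.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    approved: bool
    explanation: str = ""  # shown to the poster when the draft is rejected

# The host's plain-English description of the conversation it wants.
COMMUNITY_VALUES = "Be polite and respectful. Argue from facts and reasons, not insults."

def submit_comment(text, thread, review, publish):
    """Run a draft past the reviewer; publish it only if approved."""
    verdict = review(COMMUNITY_VALUES, thread, text)
    if verdict.approved:
        publish(text)
    return verdict  # on rejection the poster sees verdict.explanation and may retry

def toy_reviewer(values, thread, text):
    """Placeholder for the LLM call; it only flags one hard-coded insult."""
    if "idiot" in text.lower():
        return Verdict(False, "The comment insults another poster; try addressing the argument instead.")
    return Verdict(True)

posted = []
first = submit_comment("Only an idiot would believe that.", [], toy_reviewer, posted.append)
second = submit_comment("I disagree; the 2019 data points the other way.", [], toy_reviewer, posted.append)
```

After these two calls only the second comment is in `posted`; the first comes back with an explanation the poster can act on.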

IMPORTANT ELABORATIONS

The above is the shortest and simplest version of the concept.

One reasonable objection is that this is, effectively, a censorship mechanism. As described, it is, but limited to a single host site. I don’t have a problem with that, since the Internet is full of discussions and people are free to leave sites they find too constraining.

Still, there are many ways to modify the system to remove or loosen the censorship aspect, and perhaps those will work better. Below are a couple I’ve thought of.

OVERRIDE SYSTEMS

If the AI says a post doesn’t meet local standards, the poster can override the AI and post the comment anyway.

Such overrides would be allowed only if the poster has sufficient “override points”, which are consumed each time a poster overrides the AI (perhaps a fixed number per post, or perhaps variable based on how far out of spec the AI deems the post); once they’re out of points they can’t override anymore.

Override points might be acquired:

  • so many per unit time (each user gets some fixed allocation weekly), or
  • by posting things approved of by the AI or by readers, or
  • by seniority on the site, or
  • by reputation (earned somehow), or
  • by gift of the host (presumably to trusted people), or
  • by buying them with money, or
  • some combination of these.

Re buying them with money, a poster could effectively bet the AI about the outcome of human moderator review. Comments posted this way go online and also to a human moderator, who independently decides if the AI was right. If so, the site keeps the money. If the moderator sides with the poster, the points (or money) are returned.

The expenditure of override points is also valuable feedback to the site host who drafts the “community values” prompt – the host can see which posts required how many override points (and why, according to the AI), and decide whether to modify the prompt.
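One way the bookkeeping might look, as a sketch only: the class name, the weekly allowance of 3, and the 1-to-5 point cost scale are arbitrary choices for illustration, and only the "fixed allocation weekly" way of earning points is modeled.

```python
class OverrideLedger:
    """Tracks override points per poster and logs each override for the host."""

    def __init__(self, weekly_allowance=3):
        self.weekly_allowance = weekly_allowance
        self.balances = {}
        self.log = []  # (user, points spent, AI's reason) for the host to review

    def grant_weekly(self, user):
        self.balances[user] = self.balances.get(user, 0) + self.weekly_allowance

    @staticmethod
    def cost(severity):
        # severity is the AI's 0.0 to 1.0 estimate of how far out of spec
        # the post is; the cost runs from 1 point up to 5.
        return 1 + int(severity * 4)

    def override(self, user, severity, reason):
        """Spend points to post anyway; return False if the poster can't afford it."""
        price = self.cost(severity)
        if self.balances.get(user, 0) < price:
            return False
        self.balances[user] -= price
        self.log.append((user, price, reason))
        return True

ledger = OverrideLedger()
ledger.grant_weekly("ann")
```

The `log` list is the feedback channel mentioned above: the host can scan it to see which posts cost how many points, and why.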

READER-SIDE MODERATION

Another idea (credit here to Richard E.) is that all comments are posted, just with different ratings, and readers see whatever they’ve asked to see based on the ratings (and perhaps other criteria).

The AI rates the comment on multiple independent scales – for example, politeness, logic, rationality, fact content, charity, etc., each scale defined in an AI prompt by the host. The host offers a default set of thresholds or preferences for what readers see but readers are free to change those as they see fit.

(Letting readers define their own scales is possible but computationally expensive – each comment would need to be rated by the AI for each reader, rather than just once when posted).
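Reader-side filtering is cheap once each comment has been rated once at posting time. A minimal sketch, with invented scale names, ratings, and thresholds:

```python
# Host-supplied defaults; each reader may override any of them.
HOST_DEFAULTS = {"politeness": 0.5, "logic": 0.5}

def visible_comments(comments, reader_thresholds=None):
    """Keep comments whose ratings meet every threshold the reader uses."""
    thresholds = {**HOST_DEFAULTS, **(reader_thresholds or {})}
    return [
        c for c in comments
        if all(c["ratings"].get(scale, 0.0) >= minimum
               for scale, minimum in thresholds.items())
    ]

# Ratings as the AI might have assigned them when each comment was posted.
comments = [
    {"id": 1, "ratings": {"politeness": 0.9, "logic": 0.8}},
    {"id": 2, "ratings": {"politeness": 0.2, "logic": 0.9}},
    {"id": 3, "ratings": {"politeness": 0.7, "logic": 0.4}},
]
```

With the host defaults only comment 1 is shown; a reader who sets `{"politeness": 0.0}` also sees the rude but logical comment 2.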

In this model there could also be a points system that allows posters to modify their ratings, if they want to promote something the AI (or readers) would prefer not to see.

Advice on archival backups

Want a real archival backup that your great-great-grandchildren will be able to read?

Most media (CD-Rs, tape, disk, I think also flash) decay after 10 to 20 years. Having lots of redundant copies helps a little, but only a little.

If you want things to last a LONG time (say, 100+ years), I think the best options today are:

1 – M-DISC Blu-ray discs. They are designed to last 100 years. Of course this hasn’t been tested – the number is based on projections and knowledge of the decay mechanisms. From what I’ve read the chance of them being readable after 200 or 300 years is pretty good, if they’re stored in a dark cool place (say 5 to 10 C). Probably purging the oxygen from the container (flush it out with nitrogen) is a good idea too.

And, of course, multiple redundant copies.

2 – Multiple external HDDs, but keep them running the whole time (bearings often seize up after a few years if they’re not run). Actively migrate the data to new hardware every 10 years or so. (This requires money and effort of course.)

My understanding is that ZFS (Zettabyte File System) is the way you’d want to store data across multiple HDDs – you can adjust the redundancy level as you like and ZFS will use all available space to create more redundancy if you want (as you were originally thinking).

3 – If you only have a little bit of data to store (< 1 MByte) – punched paper tape or punched cards. Store them in sealed containers purged of oxygen, in a dark cool place. If you’re really serious, use punched cards made out of solid gold (gold is inert), and put the sealed container in a Nazi submarine and sink it to the bottom of an ocean.

Are you concerned that 100+ years from now nobody will have the hardware to read the media? Don’t be. If civilization doesn’t fall, it won’t be a problem (if it does fall, yes a problem).

I don’t think there’s any storage media used 100 years ago that we can’t today build a new reader for – easily and cheaply. And of course there will always be historians and museums who keep readers functional for common media.

Re uncommon media, a few years ago there was a project to recover data from some 50 year old tapes with data from NASA spacecraft. There were no working tape drives that could read it, so they built a new one from scratch. Anything we can build today at all, the technology of the future (assuming civilization doesn’t collapse) will find easy to duplicate.

Use Beyond Compare to launch Word’s legal blackline compare (on Windows)

[Minor update 2020-09-06]

I use Beyond Compare a lot – every day. It’s the best “diff” utility I’ve ever found.

But I also need to compare Word documents a lot – also every day. And Beyond Compare isn’t very good at that.

Microsoft Word has its own “legal blackline” (sometimes called “redline”; I don’t know why) compare which works well, but is very tedious to start each time. To use it (in Office 365), you need to:

  1. Open a document
  2. Go to Review>Compare>Compare two documents
  3. Find the original document and select it
  4. Find the revised document and select it (yes, even tho you already have it open)
  5. Click OK

If, as is often the case with me, the two documents are in different folders, this is a lot of work.

With Beyond Compare, on the other hand, you can just select two documents in File Explorer, and right-click on “Compare”. Done.

Here’s a way to get Beyond Compare (BC) to launch Word’s legal blackline, the same easy way. Step-by-step:

1 – Download script “Diff-Word.ps1” into the BC4 folder (usually “C:\Program Files\Beyond Compare 4”)

(That file is modified from what I found at https://github.com/ForNeVeR/ExtDiff – many thanks to the author of that!!)

2 – Open Beyond Compare and do Tools>File Formats, go to the bottom of the window that pops up and click ‘+’, then choose “External Format”.

3 – In the Mask box paste in “*.doc;*.docm;*.docx;*.dot;*.dotm;*.dotx” (without the double quotes).

4 – In the Quick Comparison paste in (again without the double quotes): “powershell -ExecutionPolicy ByPass -File Diff-Word.ps1 %1 %2”

5 – In the Compare View box paste in (again without the double quotes): “powershell -ExecutionPolicy ByPass -File Diff-Word.ps1 %2 %1” (note reverse order of parameters at the end – this is necessary)

6 – In the Description box paste in (if you care): “Make Word open its own blackline compare.”

7 – Click Save, click Close, exit Beyond Compare.

Now to use it, the order in which you do things matters (because of the way Word’s compare works – it marks revisions against an “original”; if you do things in the wrong order you’ll be marking the original against the revised version, which isn’t the same thing).

So to use it:

1 – Right click on the ORIGINAL file and choose “Select left file for compare”.
2 – Right click on the REVISED file and choose “Compare to”.

That’s it. This will open 2 Word windows, one with the blackline change marks (revised marked against original), and the other with the revised document (no changes – ready to edit further).

If you don’t want that 2nd window (just want the changes), put a ‘#’ (comment) in front of the line that starts “$new =” in Diff-Word.ps1.

Force remove an entire Windows folder tree at the command line

Supposedly

del /f/s/q [target]
will delete an entire folder in Windows.

But often it doesn’t – excessively long file names, excessively long paths, and other things, break it. Sometimes it can be quite difficult to fully clean out a folder.

There are many solutions (cygwin‘s rm is pretty powerful) but here’s a simple batch file that harnesses the power of Robocopy (which comes pre-installed with Windows) to do the job:

@rem thanks to https://stackoverflow.com/questions/97875/rm-rf-equivalent-for-windows
@echo TEP - Terminate with Extreme Prejudice (die, die, die)
@echo off
setlocal
SET /P AREYOUSURE=Are you ABSOLUTELY SURE you want to irreversibly delete folder '%1' [y,N]?
IF /I "%AREYOUSURE%" NEQ "Y" GOTO abort
set emptyFolder=%TEMP%\tep_%RANDOM%%RANDOM%%RANDOM%
mkdir %emptyFolder%
@REM robocopy will mirror an EMPTY FOLDER into the target
robocopy /mir %emptyFolder% %1
rmdir %emptyFolder%
rmdir %1
goto exit
:abort
echo Nothing done.
:exit
endlocal

The only thing I’ve found that this won’t delete is open files.

How to use Google Earth with a SpaceNavigator and a joystick – at the same time

I love Google Earth.

And the best way to run Google Earth is with a 3Dconnexion SpaceNavigator. This is a 6-degree-of-freedom controller that lets you fly around in Google Earth effortlessly. (Supposedly it’s good for 3D CAD work too; I haven’t tried that.) Yes, it’s a little pricey but it’s worth every penny.

(Tip: It works better if you glue it to your desk with some double-sided sticky tape. It’s weighted to prevent it from flying off the desk when you pull up, but the tape helps.)

To use Google Earth with the SpaceNavigator (once you’ve got that installed), in Google Earth just do Tools>Options…>Navigation>Non-mouse Controller>Enable Controller.

Unfortunately, if you also have a joystick – any joystick – attached to your Windows box, Google Earth will take input from both at once – which makes control impossible from the SpaceNavigator.

I used to deal with it by unplugging the joystick USB, or by disabling the joystick in Device Manager, but I found a better way.

Start up Google Earth. Get your joystick and adjust it carefully (including the throttle) so that there’s no motion at all in Google Earth. Then turn it off. (Or just leave it alone if your joystick doesn’t have a power switch.)

That’s it. Now Google Earth won’t see any input from the joystick, and the SpaceNavigator will work fine.

A platform for crowdsourcing rewards for good deeds

Here’s something the world needs – build it and get rich! (I’m too lazy.)

I really want somebody to finish porting OpenCV to Python 3. It’s an open-source project that isn’t getting enough effort to finish it.

I’m willing to offer money for it.

Not a huge amount – a few hundred dollars.

Somebody needs to build an online platform that will let me make an offer like that – finish the port, get my money.

Surely there are other people who share this goal – probably many of them are also willing to kick in something to make it happen.

The platform should allow me to set a goal with clearly-defined criteria for success, and then aggregate the rewards offered by everyone who signs on to the goal. Developers looking to make some money could pick a goal, accomplish it, and collect the reward.
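The core bookkeeping for such a platform is small. As a sketch under assumptions (the class name, the 5% fee, and the pledge amounts are all invented for illustration):

```python
from decimal import Decimal

class Bounty:
    """A goal with success criteria and a pot of pledges from many backers."""

    def __init__(self, goal, criteria, fee_rate=Decimal("0.05")):
        self.goal = goal
        self.criteria = criteria  # the clearly-defined test for "done"
        self.fee_rate = fee_rate  # the platform's cut of the pot
        self.pledges = {}

    def pledge(self, backer, amount):
        # Amounts are passed as strings to avoid float rounding surprises.
        self.pledges[backer] = self.pledges.get(backer, Decimal("0")) + Decimal(amount)

    def total(self):
        return sum(self.pledges.values(), Decimal("0"))

    def settle(self):
        """Split the pot into (developer payout, platform fee)."""
        fee = (self.total() * self.fee_rate).quantize(Decimal("0.01"))
        return self.total() - fee, fee

bounty = Bounty("Finish porting OpenCV to Python 3", "Agreed test suite passes on Python 3")
bounty.pledge("me", "300")
bounty.pledge("someone else", "50")
payout, fee = bounty.settle()
```

With these two pledges the pot is 350, of which the platform keeps 17.50 and the developer collects 332.50. The hard parts of a real platform (escrow, and deciding whether the criteria were met) are not modeled here.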

Whoever sets up the platform (analogous to Kickstarter, Indiegogo, etc.) can charge a fee or small percentage of the rewards.

 

While you’re at it, the world also needs ways to reward people for other kinds of good deeds.

For example, florist Debbie Dills heroically tailed Charleston shooter Dylann Roof’s car until the police arrived to arrest him.

When I read a story like that, I should be able to click on the hero’s name and send him or her $1 or $5 as a reward, in appreciation of the heroism. I think millions of people would do that upon reading about a hero in a news story, if it was as easy as clicking on her name and entering the dollar amount.

That should be doable.

So, go do it. Please. You’ll make the world a better place by rewarding good deeds – it’s not only fair, it might make people behave better.

And if you’re the one to do it, it’s only fair that you charge something for setting up and running the system.

 

Hey: VCs often say that good ideas are a dime a dozen. Mine go even cheaper than that. If you use this idea to make money, I’d like 0.5%. Of the equity in your company, or the profits. Or something. If that’s too much (or too little), fine – whatever you think is fair. This is a request for a gift, or reward – it is not a legal obligation. You’re free to use this idea and pay me nothing. If you can live with yourself that way.