Notes on “Among the A.I. Doomsayers” by Andrew Marantz

This came out today, and I’m in it a little bit because I attended a dinner party Katja threw for Marantz and arrived early. Mostly this is just disappointingly bad and it was a mistake to talk to him. He warned in advance “I’m going to have to talk about the people as well as the ideas” but it seems clear that he’s not very captivated by the ideas and a lot more interested in talking about the people and the fights we’re having; the overall tone of “what a bunch of weirdos” pervades the whole article. Maybe I’m just spoiled by talking to Tom Chivers.

Update: smart people disagree. Maybe I got out of the wrong side of bed this morning.

Nothing in the article really needs rebutting as such. I kept going “hm that’s not quite right” as I read it, so I thought I’d jot down some trivial bullet points.

  • Published in the print edition of the March 18, 2024, issue, with the headline “O.K., Doomer.” great.
  • I first met Katja at the CFAR minicamp in 2012, I think I said this at the time.
  • We’re only called “A.I. safetyists, or decelerationists” or “A.I. doomers” by people who hate us. We also want AI to take humanity to the stars, we just need to not all die for that to happen. Thus “AI notkilleveryoneism”. (Update: pushback on this)
  • And no-one calls “e/acc” people “boomers”. I posted to Twitter asking for a name for them, and the best candidate was “vroomers”.
  • “(Some experts believe that A.G.I. is impossible, or decades away; others expect it to arrive this year.)” Who expects it to arrive this year? For that matter, which experts say it’s impossible?
  • “And then, as if to justify the moment of levity” This is a very strange imagining of the conversation. The quote is about, and was quoted in the context of a discussion about, what kind of emotional state it makes sense to occupy when everything you hold dear seems to be at risk. It wouldn’t make any sense to quote it immediately following the Snoop Dogg quote.
  • “The existential threat posed by A.I. had always been among the rationalists’ central issues, but it emerged as the dominant topic around 2015, following a rapid series of advances in machine learning.” I don’t think rationalists became more concerned about AI x-risk in 2015; they were already super concerned. In 2014 Stephen Hawking, Max Tegmark, Frank Wilczek, and Stuart Russell published an article in the popular press about the risk, which was a big step forward in public recognition, and Bostrom’s book “Superintelligence” came out.
  • Seems odd to mention that Hinton and Bengio are on-side without mentioning that LeCun is a constant source of ill-thought-out mockery on this issue!
  • “Some people gave away their savings, assuming that, within a few years, money would be useless or everyone on Earth would be dead.” I advise against this and I don’t know anyone who did this.
  • “Like many rationalists, she sometimes seems to forget that the most well-reasoned argument does not always win in the marketplace of ideas.” This is just one of those false things people like to say, and Katja’s quote doesn’t illustrate it. She states that her targets would have reason to listen, not that they would listen to reason.
  • “Most doomers started out as left-libertarians” – this is a non-standard use of the phrase left-libertarian. I would say that “libertarian” would be more accurate, with the caveat that the Libertarian Party fell to a form of entryism and is now an insane right-wing party a million miles from my libertarian friends.
  • “A guest brought up Scott Alexander” I’m pretty sure I was that guest, and that this happened at the party I attended, not the one described in the article.
  • “The same people cycle between selling AGI utopia and doom” as always Gebru has no idea. The anti-AI-omnicide movement started by Yudkowsky and Bostrom has always argued that AI could lead to a utopian future if we can somehow avoid it killing everyone.
  • “Recently, though, the doomers have seemed to be losing ground.” It’s worth taking a moment to think how extraordinary it is that we have the ground we have – concerns that in 2013 I thought would forever be dismissed as sci-fi are spoken of as worthy of serious treatment at all three major AI companies, and in the Senate.
  • Wild to hear that Upton Sinclair quote used as a reason to take “don’t worry about AI killing us all” more seriously. UPDATE: two people pointed out that I read this exactly backwards, and Sinclair is being applied in the very straightforward way that makes sense.
  • “Scott Alexander wrote a few days after the incident.” The web has this fantastic technology called the hyperlink, and indeed Scott’s essay is a necessary rebuttal to what that paragraph says.
  • “coördinate their food shopping” ah, the New Yorker.

Informal sampling of the Fatal Force Database

See a random example

Police shot dead 1,160 people in 2023. What’s a typical shooting like?

We learn about fatal police shootings through two routes. First, dramatic examples make national news. However, these are clearly very atypical, or they wouldn’t be news. Second, we can compile statistics. The Washington Post’s Fatal Force Database aims to comprehensively record every fatal shooting by a police officer in the United States since the database began in 2015, and includes a variety of information on whether the person shot was armed, their race, and so forth. But when what you really want to know is how often the police should have acted differently, how do you gather statistics on that?

If you visit my Fatal Force Database informal sampler webapp, you’ll see a randomly selected row from the Post’s database. A few details such as name and location are shown on the page; to learn more about the incident, press “Search Google” for a Google search tailored to find news reports about the killing. A second button, “Sample again”, selects a fresh random row from the database.

I’m not sure if there’s an established name for this technique, but I call it “informal sampling”: to get an informal picture of what a population is like, choose a random member of that population, look into that member by whatever means, and then repeat for another random sample. This seems to me to retain many of the advantages of anecdote in getting a gut feel for a population, without being thrown by rare vivid examples in the way that non-statistical methods often are. I have used this technique in other ways too, such as by trying to work out where all my money goes by sampling on where a randomly chosen penny was spent. One of my main motivations in writing this webapp and this blog post was to promote the idea of informal sampling as a technique; I have very often wished for a tool like this in other domains.
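For concreteness, here’s a minimal Python sketch of the technique; the file path and column names are hypothetical, loosely based on the Post’s CSV export, and the webapp itself works differently:

```python
import csv
import random

def informal_sample(csv_file, query_fields, n=1, rng=random):
    """Pick n uniformly random rows from a CSV and build a web-search
    query string for each, so each randomly chosen member of the
    population can be investigated in turn."""
    rows = list(csv.DictReader(csv_file))
    picked = rng.sample(rows, n)
    queries = [" ".join(str(row[k]) for k in query_fields if row.get(k))
               for row in picked]
    return picked, queries

# Hypothetical usage against a CSV export of the database:
# with open("fatal-police-shootings-data.csv") as f:
#     rows, queries = informal_sample(f, ["name", "city", "state"])
```

The same function works for any population you can get into a spreadsheet, which is the point of the technique.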

Although I encourage you to try this out and get your own picture of these things, I must warn you that it is all very sad.

See a random example

What about Alpha Centauri?

If we make Sphere the Sun, and Pluto is in San Francisco, where does Alpha Centauri go?

Alpha, Beta, and Proxima Centauri, as taken by Skatebiker

Alpha Centauri is 4.344 light years away, so at our scale of 1:8.855 million, model Alpha Centauri must be placed 12 times further away than the Moon. Not 12 times further away than our model of the Moon in Las Vegas; 12 times further away than the actual Moon. Even if we decide to build a model of Alpha Centauri A which is 1.2 times the radius of Sphere itself, it’s going to be a little tricky to make it stay in place.

Fortunately this contemplation of astronomical distances suggests its own solution. Our model of the Earth will be nearly 5 feet in diameter, enough to portray the continents and oceans in fairly rich detail. Obviously this model will require an arrow pointing at south Las Vegas reading “YOU ARE HERE”, with another arrow pointing 1/16 in away reading “SUN”. We can then also mark the locations of our models of the bodies from Jupiter to Sedna similarly, in model California and model Oregon. With this done, it is a small extension of our model-within-a-model to put Alpha Centauri 1719 feet away from our Earth model, saying “we didn’t include Alpha Centauri in our model, but here it is in our model-within-a-model”.

We can even extend this to, say, the galactic core, 1956 miles away in our model-model. However the Andromeda Galaxy would need to be most of the way to the Moon at 190,000 miles; for this we would have to resort to a model-model-model.
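The double scaling is easy to get wrong, so here is the arithmetic as a small Python sketch; the light-year and foot conversion factors are standard values, not from the post:

```python
# Apply the 1:8.855 million scale twice: the 5-foot model Earth is
# itself a 1:8.855e6 model, so distances in the model-within-a-model
# shrink by the square of the scale factor.
SCALE = 8.855e6
LY_IN_M = 9.4607e15   # metres per light year
FT_IN_M = 0.3048      # metres per foot

def model_model_feet(light_years):
    """Distance in the model-within-a-model, in feet."""
    return light_years * LY_IN_M / SCALE**2 / FT_IN_M

alpha_centauri = model_model_feet(4.344)   # ≈ 1719 feet
```

Running the same formula on the galactic core and Andromeda reproduces the model-model figures above.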

In conclusion: Space, is big. Really big. You just won’t believe how vastly, hugely, mindbogglingly big it is. I mean, you may think it’s a long way down the road to the chemist, but that’s just peanuts to space.

Thanks to John Cody for inspiring this thought!

Picture credit: Skatebiker, Wikimedia Commons

Make Sphere the Sun

The Sweden Solar System, at a scale of 1:20 million, is the world’s largest scale model of the Solar System, with the Avicii Arena representing the Sun. The Arena was the largest spherical building in the world… until the completion of Sphere in Las Vegas.

It is clear what America has to do.

Sphere is 516 feet in diameter, implying an overall scale of 1:8.855 million, making the model over twice the size of that of Sweden. Even at this scale, most of the planets, dwarf planets and moons are surprisingly small. Earth is the largest non-gas-giant in the Solar System and would be represented by a sphere less than five feet across, about 10 ½ miles from the Sphere; there’s a convenient casino about the right distance and not far from the road towards LA. The Moon would then be a 15 inch sphere 142 feet from the Earth.

The four gas giants present a more serious challenge; even the smallest, Neptune, must be 18 feet across. While it’s preferable that all bodies be represented with spheres or hemispheres, in these four cases we might follow the example that the Sweden Solar System sets with Jupiter and Saturn, and use a circular representation, at least initially. For example we could represent Jupiter as a painted circle, 53 feet in diameter, at the Cima Road truck stop. I’ve put the locations on a road trip from Vegas to SF via LA, since that should maximize visitability and also is most convenient for me. Note that the distances increase rapidly; while Mars can be placed near a race track just outside Vegas, Jupiter is already out of Nevada and into California. Uranus can be placed at a botanic garden in Greater Los Angeles, Neptune at a hotel in SLO, Pluto in the Exploratorium in SF, and Eris in Portland Oregon.

One convenient thing about this scale is that even relatively small astronomical bodies will be visible to the naked eye: for example, Mars’s moon Deimos will be a sphere slightly under 1/16 inch in diameter, around 9 feet from the 30 inch sphere representing Mars itself.
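The underlying arithmetic is simple enough to sketch in Python; the planetary diameters and orbital distances below are standard published values, not taken from the spreadsheet:

```python
SCALE = 8.855e6     # 1 : 8.855 million, fixed by Sphere's 516 ft diameter
M_PER_FT = 0.3048
M_PER_MI = 1609.344

def model_size_and_distance(diameter_km, sun_distance_km):
    """Return (model diameter in feet, model distance from Sphere in miles)."""
    d_ft = diameter_km * 1000 / SCALE / M_PER_FT
    dist_mi = sun_distance_km * 1000 / SCALE / M_PER_MI
    return d_ft, dist_mi

earth = model_size_and_distance(12_742, 149_600_000)       # ≈ (4.7 ft, 10.5 mi)
neptune = model_size_and_distance(49_244, 4_495_000_000)   # ≈ (18 ft, 315 mi)
```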

This spreadsheet details the diameter each model needs to be, and its distance from the Sphere, to fulfil our chosen scale, and in most cases suggests a suitable location at roughly the right distance.

The relative distances in the solar system, and the vast space between its bodies, can be counterintuitive. Now that America has claimed the title of the world’s largest spherical building, a grand project to make this the center of the world’s largest scale model of the solar system is the perfect opportunity to capture the vastness of space.

Followup: What about Alpha Centauri? Photo credit: Sin City Las Vegas

Responses to Cade Metz’s hit piece on Scott Alexander

This depraved rubbish got quite a few responses. Please let me know about any I’ve missed. Inner bullets are subtitles or excerpts. Open to adding good Twitter threads too!

  • Scott Alexander, Statement on New York Times Article
    • I have 1,557 other posts worth of material he could have used, and the sentence he chose to go with was the one that was crossed out and included a plea for people to stop taking it out of context.
    • I don’t want to accuse the New York Times of lying about me, exactly, but if they were truthful, it was in the same way as that famous movie review which describes the Wizard of Oz as: “Transported to a surreal landscape, a young girl kills the first person she meets and then teams up with three strangers to kill again.”
    • I believe they misrepresented me as retaliation for my publicly objecting to their policy of doxxing bloggers in a way that threatens their livelihood and safety. Because they are much more powerful than I am and have a much wider reach, far more people will read their article than will read my response, so probably their plan will work.
  • Cathy Young, Slate Star Codex and the Gray Lady’s Decay
    • The New York Times hit piece on a heterodox blogger is a bad stumble — the latest of many
    • The New York Times has wandered into an even worse culture-war skirmish. This one involves bad (and arguably dishonest) reporting as well as accusations of vindictiveness and violation of privacy. It’s a clash pitting the Times against a prominent blogger critical of the social justice progressivism that has become the mainstream media’s dominant ideology in the past decade. The Times does not look good.
  • Robby Soave, What The New York Times‘ Hit Piece on Slate Star Codex Says About Media Gatekeeping
    • “Silicon Valley’s Safe Space” has misinformed readers.
    • It’s a lazy hit piece that actively misleads readers, giving them the false impression that Siskind is at the center of a stealth plot to infiltrate Silicon Valley and pollute it with noxious far-right ideas.
    • The idea that a clinical psychiatrist’s blog is the embodiment of Silicon Valley’s psyche is very odd
    • To the extent that the Times left readers with the impression that Siskind is primarily a right-wing contrarian—and Silicon Valley’s intellectual man-behind-the-curtain to boot—it is actually the paper of record that has spread misinformation.
  • Matt Yglesias, In defense of interesting writing on controversial topics
    • Some thoughts on the New York Times’ Slate Star Codex profile
    • I think Metz kind of misses what’s interesting about it from the get-go.
    • something about the internet is making people into infantile conformists with no taste or appreciation for the life of the mind, and frankly, I’m sick of it.
  • Tom Chivers, When did we give up on persuasion?
  • Freddie deBoer, Scott Alexander is not in the Gizmodo Media Slack
  • Scott Aaronson, A grand anticlimax: the New York Times on Scott Alexander
  • Sergey Alexashenko, How The Hell Do You Not Quote SSC?
    • “The NYT misses the point by a light-year.”
  • Kenneth R. Pike, Scott Alexander, Philosopher King of the Weird People
  • Noah Smith, Silicon Valley isn’t full of fascists
  • Thread by Steven Pinker

NYT plan to doxx Scott Alexander for no real reason

UPDATE 2020-06-25: Please sign the petition at DontDoxScottAlexander.com!

The New York Times is planning on publishing an article about Scott Alexander, one of the most important thinkers of our time. Unfortunately, they plan to include his legal name. In response, Scott has shut down his blog, a huge loss to the world.

This will do enormous harm to him personally; some people hate Scott and this will encourage them to go after his livelihood and his home. Not all of them are above even SWATing, ie attempted murder by police. If he does lose his job, it will also “leave hundreds of patients in a dangerous situation as we tried to transition their care.”

However, the greatest harm is to the public discourse as a whole. Shutting people down in real life is an increasingly popular response to all forms of disagreement. Pseudonymity plays an essential role in keeping the marketplace of ideas healthy, making it possible for a wider spectrum of ideas to be heard. If the NYT policy is that anyone whose profile becomes prominent enough will be doxxed in the most important newspaper in the world, it has a chilling effect.

All this might be OK if there was some countervailing public interest defence, if there was a connection between his blogging and his real world activity that needed to be exposed. But as I understand it, no-one is asserting this. The defence of this incredibly harmful act is simply “sorry, this is our policy”. It’s not even a consistently applied policy: a profile of the Chapo Trap House hosts published in February rightly omitted host Virgil Texas’s real name, though they must surely have been aware of it.

I urge you to spread the word on this everywhere you have reach, and to politely contact the New York Times through the means Scott outlines in his post to urge them to do the right thing.

UPDATE 2020-06-25: Please sign the petition at DontDoxScottAlexander.com!

Here’s the letter I wrote:

I am a subscriber, and I am dismayed to learn that the Times plans to doxx blogger Scott Alexander. In an age where people so often respond to disagreement by attacking someone in the real world, whether by getting them fired or by SWATing, pseudonymity plays an essential role in the marketplace of ideas, helping to ensure that a wide spectrum of voices can be heard. 

Obviously if there was a public interest defence of publishing this information – if there was a connection between his blogging and his real world activity that needed to be exposed – that would be different, but as I understand it no-one is asserting that. If you plan to do something so tremendously harmful to the public discourse as a whole, please have a reason other than “this is what we do”. You were right not to doxx Chapo Trap House host Virgil Texas; please apply that policy here.

Some more numbers as lambda calculus

On Reddit, u/spriteguard asks:

Do you have any diagrams of smaller numbers for comparison? I’d love to see a whole sequence of these.

I have work to put off, so I couldn’t resist the challenge. Like the previous post, these are Tromp diagrams showing lambda calculus expressions that evaluate to Church integers.

  • 0
  • 1
  • 2
  • 3
  • 5
  • Multiplication: λa b f. a (b f)
  • 10 (represented as 2 × 5)
  • 100 = (2 × 5)^2
  • A googol
  • A googolplex
  • A googolduplex
  • Decker (10↑↑↑2)
  • Graham’s number
  • My own large number, f_{ω^ω + 1}(4) in a version of the fast-growing hierarchy where f_0(n) = n^n
  • A truly huge number, f_{Γ_ω + 1}(3)
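Since Python has first-class lambdas, the Church integers these diagrams denote can be tried out directly; this is a sketch for intuition, not the trylambda implementation:

```python
# Church numerals: n is encoded as a function applying f n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
mul  = lambda a: lambda b: lambda f: a(b(f))   # the λa b f. a (b f) above
exp  = lambda a: lambda b: b(a)                # a^b

def to_int(n):
    """Decode a Church numeral by counting applications of an increment."""
    return n(lambda k: k + 1)(0)

two     = succ(succ(zero))
five    = succ(succ(succ(two)))
ten     = mul(two)(five)    # 10 as 2 × 5
hundred = exp(ten)(two)     # 100 = (2 × 5)^2
```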

Code to generate these diagrams is on Github; I generated these with the command

./trylambda --outdir /tmp/out demofiles/smallernums.olc demofiles/graham.olc demofiles/fgh.olc demofiles/slow.olc

A picture of Graham’s Number

One of the first posts I made on this blog was Lambda calculus and Graham’s number, which set out how to express the insanely large number known as Graham’s Number precisely and concisely using lambda calculus.

A week ago, Reddit user u/KtoProd asked: if I wanted to get a Graham’s Number tattoo, how should I represent it? u/FavoriteColorFlavor linked to my lambda calculus post. But in a cool twist, they suggested that rather than writing these things in the usual way, they use a John Tromp lambda calculus diagram. I got into the discussion and started working with the diagrams a bit, and they really are a great way to work with lambda calculus expressions; it was a pleasure to understand how the diagram relates to what I originally wrote, and manipulate it a bit for clarity.

The bars at the top are lambdas, the joining horizontal lines are applications, and the vertical lines are variables. There are three groups; the rightmost group represents the number 2, and the middle one the number 3; with beta reduction the two lambdas in the leftmost group will consume these rightmost groups and use them to build other small numbers needed here, like 4 (2²) and 64 (4³). The 3 is also used to make the two 3s on either side of the arrows. Tromp’s page about these diagrams has lots of examples.

I’m obviously biased, but this is my favourite of the suggestions in that discussion. If u/KtoProd does get it as a tattoo I hope I can share a picture with you all!

Update: Some more numbers as lambda calculus

Update 2020-02-24: I’ve added the ability to generate these diagrams to my Python lambda calculus toy. After installation, try ./trylambda demofiles/draw.olc.

More thoughts on TCRs

A few more notes regarding target collision resistant functions, following up from my $1000 competition announcement.

Second preimage resistance

There is a simple way to construct a secure TCR compression function given a second-preimage-resistant compression function—just generate a key which is the length of the input, and XOR the key with the input. So if we can build a fast second-preimage-resistant function, we can build a fast secure TCR.
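As a sketch of this construction in Python, with BLAKE2b standing in for the second-preimage-resistant compression function (an assumption for illustration; a real instantiation would fix a concrete fixed-input-length function):

```python
import hashlib
import secrets

def tcr_keygen(msg_len):
    """The key is as long as the (fixed-length) input."""
    return secrets.token_bytes(msg_len)

def tcr_hash(key, msg):
    """TCR from a second-preimage-resistant function: hash (key XOR msg).
    Finding m' != m with the same output means finding a second preimage
    of key XOR m, a point the attacker couldn't choose, since the key is
    drawn after m is fixed."""
    assert len(key) == len(msg)
    masked = bytes(k ^ m for k, m in zip(key, msg))
    return hashlib.blake2b(masked, digest_size=16).digest()
```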

The history of hash functions shows that we have been much more successful at achieving second-preimage resistance than collision resistance. From the excellent Lessons From The History Of Attacks On Secure Hash Functions:

The main result is that there is a big gap between the history of collision attacks and pre-image attacks. Almost all older secure hash functions have fallen to collision attacks. Almost none have ever fallen to pre-image attacks.

Secondarily, no new secure hash functions (designed after approximately the year 2000) have so far succumbed to collision attacks, either.

Tweakable target collision resistance

In the definition of target collision resistance, the attacker supplies a single message, but in practice we usually want to hash many messages with the same key, e.g. when constructing a variable-length TCR from a compression function that takes a fixed-length message. This is OK because there’s a straightforward security reduction which shows that if an attacker can find a collision for a single message with probability ε, then they can forge a collision for any of n messages with probability at most nε.

However, when the messages to be signed are large as they are in Android, this linear falloff is kind of a shame. One possible advantage of TCRs is that it can be secure to use much shorter hash outputs, say 128 bits, which will make Merkle trees much smaller, saving disk space and improving performance. But if the hash function consumes, say, 128 bytes at a time (like BLAKE2b) and the system partition is 1GB, it will be broken up into 2²³ messages, leaving us with only 105-bit security at best. I’d like to do a little better than that, and so I’d like to build multiple-message security into the security definition. I propose a new kind of primitive, a tweakable TCR, which takes a tweak as well as a key and a message. The attacker faces the following challenge:

  • Attacker chooses n messages m₁, …, mₙ and n distinct tweaks t₁, …, tₙ
  • Attacker learns random key K
  • Attacker chooses i, m′
  • Attacker wins if m′ ≠ mᵢ but H(K, tᵢ, m′) = H(K, tᵢ, mᵢ)

If each of the 2²³ messages gets a distinct tweak, we can preserve 128-bit security across large partitions. I therefore encourage people to design not just TCRs, but TTCRs!
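To illustrate the interface only (this is not a designed TTCR primitive), here is a sketch where each chunk of a partition is hashed under its index as the tweak, fed in via BLAKE2b’s salt parameter purely for domain separation:

```python
import hashlib
import secrets

def ttcr_hash(key, tweak, msg):
    """Hypothetical tweakable-TCR interface: the keyed, masked input as
    before, domain-separated by the tweak. Illustration only."""
    assert len(key) == len(msg)
    masked = bytes(k ^ m for k, m in zip(key, msg))
    return hashlib.blake2b(masked, digest_size=16,
                           salt=tweak.to_bytes(8, "big")).digest()

def hash_partition(key, chunks):
    """Hash each fixed-size chunk under its index as the tweak, so a
    collision found at one position cannot be replayed at another."""
    return [ttcr_hash(key, i, c) for i, c in enumerate(chunks)]
```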

$1000 TCR hashing competition

In my day job, I do cryptography for Android. I have a problem where I need to make some cryptography faster, and I’m setting up a $1000 competition funded from my own pocket for work towards the solution.

On Android devices, key operating system components are stored in read-only partitions such as the /system partition. To prevent an attacker tampering with these partitions, we hash them using a Merkle tree, and sign the root. We don’t check all the hashes when the device boots; that would take too long. Instead, at boot time we check only the root of the tree, and then we check other sectors against their hashes as we load them using a Linux kernel module called dm-verity.
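As a much-simplified sketch of the hash-tree idea (dm-verity’s real on-disk format and parameters differ):

```python
import hashlib

def merkle_root(sectors):
    """Binary Merkle tree over sector hashes: only the signed root is
    checked at boot; individual sector hashes are verified lazily as
    sectors are read. Any tampered sector changes the root."""
    level = [hashlib.sha256(s).digest() for s in sectors]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])   # duplicate an odd leaf
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```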

This likely works pretty well on phones sold in the US, which will have the ARM CE instructions that accelerate SHA2-256. But a lot of devices sold in poorer countries don’t have these instructions, and on them SHA2-256 can be slow enough to hurt overall system performance. For example, on the 900MHz Cortex-A7 Broadcom BCM2836, hashing takes 28.86 cpb [eBACS 2019-03-31], limiting reading speed to 31.2 MB/s. One partial fix is to switch to a hash function that is faster on such processors. BLAKE2b is nearly twice as fast on that processor, at 15.32 cpb. However, this is still a lot slower than I’m happy with.
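These throughput figures are just clock speed divided by cycles per byte:

```python
def throughput_mb_s(clock_hz, cycles_per_byte):
    """Bytes/second = clock / cpb, reported in MB/s (10^6 bytes)."""
    return clock_hz / cycles_per_byte / 1e6

sha2_256 = throughput_mb_s(900e6, 28.86)   # ≈ 31.2 MB/s
blake2b  = throughput_mb_s(900e6, 15.32)   # ≈ 58.7 MB/s
```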

Where sender and receiver have a shared secret, authentication can be very fast; a universal function like NH can run at around 1.5 cpb on such a processor. But this isn’t an option for verified boot, because it’s hard to keep the key out of the attacker’s hands, and given the key it’s trivial to forge messages.

Inbetween these two notions of security is the idea of a “target collision resistant” function, once known as a “universal one-way hash function”. With a TCR, hashing is randomized with a key chosen at signing time once the message to be signed is known. This makes the attacker’s job much harder since they cannot simply search for a pair of colliding messages. Instead, they must choose the first message to be hashed, and only then do they learn the key that will be used at hashing time, after which they must generate a second message that hashes with the first using this key; this problem is more akin to second preimage finding than collision finding. While collision attacks against hash functions are plentiful, second preimage attacks are far rarer.

In all three games the key is drawn uniformly at random, K \xleftarrow{\$} \mathcal{K}, and the attacker A succeeds if m_1 \neq m_2 and H(K, m_1) = H(K, m_2); the games differ only in when the attacker learns the key:

  • Collision resistance: A \leftarrow K; A \rightarrow m_1; A \rightarrow m_2
  • Target collision resistance: A \rightarrow m_1; A \leftarrow K; A \rightarrow m_2
  • Universal function: A \rightarrow m_1; A \rightarrow m_2; A \leftarrow K

In principle, the vastly harder job facing the attacker of a TCR should mean that secure TCRs much faster than hash functions are possible. However, the main impetus to research on TCRs was a desire to bolster existing hash functions, when attacks on MD5 and SHA-1 were new and we didn’t know which if any of our existing hash functions would be left standing. As a result several ways to construct a TCR from a hash function with good provable properties were proposed, but none of these could be faster than their underlying hash functions. As far as I know, no-one has ever proposed a TCR as a primitive, designed to be faster than existing hash functions, and that’s what I need. I’m probably not the only one who’d find a primitive for much faster broadcast authentication useful, either!

To me this looks like an interesting, overlooked problem in symmetric cryptology, and I’d really like it to get some attention. So I’m offering a $1000 prize from my own pocket, to be awarded at Real World Crypto 2021, for the work that in my arbitrary opinion does the most to move the state of the art forwards or is just the most interesting.

I offered a similar prize at the rump session at FSE 2019, promising to award it at the end of the year, but I neglected to really tell anyone and didn’t get any entrants. Hopefully this will be a more successful launch for the prize, and see some of you at Real World Crypto 2020!

I’ve moved some technical notes into a subsequent blog post.