Animated gradient text is like the poster child of AI right now. Gotta have that animated gradient text sparkle button thingy to be cool.
So here! Steal the recipe.
OpenAI o3 breakthrough high score on ARC-AGI-PUB
François Chollet is the co-founder of the ARC Prize and had advanced access to today's o3 results. His article here is the most insightful coverage I've seen of o3, going beyond just the benchmark results to talk about what this all means for the field in general.
One fascinating detail: it cost $6,677 to run o3 in "high efficiency" mode against ...
OpenAI's new o3 system - trained on the ARC-AGI-1 Public Training set - has scored a breakthrough 75.7% on the Semi-Private Evaluation set at our stated public leaderboard $10k compute limit. A high-compute (172x) o3 configuration scored 87.5%.
This is a surprising and important step-function increase in AI capabilities, showing novel task adaptation ability never seen before in the GPT-family ...
Live blog: the 12th day of OpenAI - "Early evals for OpenAI o3"
It's the final day of OpenAI's 12 Days of OpenAI launch series, and since I built a live blogging system a couple of months ago I've decided to roll it out again to provide live commentary during the half hour event, which kicks off at 10am San Francisco time.
Here's the video on YouTube.
Tags: ai, openai, prompt-injection, generative-ai, llms, o1, inference-scaling, o3
5 CSS Snippets Every Front End Developer Should Know In 2024
Earlier this year, I wrote 5 CSS snippets every front-end developer should know in 2024 on web.dev. I think it holds up!
Check out the 6-snippet 2023 edition
This is the 69th edition of People and Blogs, the series where I ask interesting people to talk about themselves and their blogs. Today we have Zinzy and her blog, zinzy.website
To follow this series subscribe to the newsletter. A new interview will land in your inbox every Friday. Not a fan of newsletters? No problem! You can read the interviews here on the blog or you can subscribe to the RSS...
I had big plans for December: for one thing, I was hoping to get to an actual RC of Datasette 1.0, in preparation for a full release in January. Instead, I've found myself distracted by a constant barrage of new LLM releases.
On December 4th Amazon introduced the Amazon Nova family of multi-modal models - clearly priced to compete with the excellent and inexpensive Gemini 1.5 series from Google...
Building effective agents
My principal complaint about the term "agents" is that while it has many different potential definitions, most of the people who use it seem to assume that everyone else shares and understands the definition that they have chosen to use.
This outstanding piece by Erik Schluntz and Barry Zhang at Anthropic bucks that trend from the start, providing a clear definition tha...
50% of cybersecurity is endlessly explaining that consumer VPNs don’t address any real cybersecurity issues. They are basically only useful for bypassing geofences and making money telling people they need to buy a VPN.
Man-in-the-middle attacks on Public WiFi networks haven't been a realistic threat in a decade. Almost all websites use encryption by default, and anything of value uses HSTS to ...
Those new model releases just keep on flowing. Today it's Google's snappily named gemini-2.0-flash-thinking-exp, their first entrant into the o1-style inference scaling class of models. I posted about a great essay about the significance of these just this morning.
From the Gemini model documentation:
Gemini 2.0 Flash Thinking Mode is an experimental model that's trained to generate the "think...
Everything Is Always Getting Worse (Until It Isn't)
A few months ago, I found myself doomscrolling through X (first mistake) when I found a thread about how “everything is getting worse.” The author had assembled an impressive collection of graphs showing declining trust in institutions, rising polarization, increasing mental health issues among teens, and various other
Is AI progress slowing down?
This piece by Arvind Narayanan, Sayash Kapoor and Benedikt Ströbl is the single most insightful essay about AI and LLMs I've seen in a long time. It's long and worth reading every inch of it - it defies summarization, but I'll try anyway.
The key question they address is the widely discussed issue of whether model scaling has stopped working. Last year it seemed lik...
While working with CSS carousels, I needed a solution that could adapt the position of the scroll buttons to be either inside or outside based on the available space.
I'm using anchoring to pin the buttons wherever I want, and when they're outside, there's potential for them to be off screen or out of bounds.
The solution?
U...
q and qv zsh functions for asking questions of websites and YouTube videos with LLM
Spotted these in David Gasquez's zshrc dotfiles: two shell functions that use my LLM tool to answer questions about a website or YouTube video.
Here's how to ask a question of a website:
q https://simonwillison.net/ 'What has Simon written about recently?'
I got back:
Recently, Simon Willison has written about...
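A rough sketch of the same idea in Python, for illustration only. This is not the code from the dotfiles: it assumes the `llm` command is installed and reads the prompt from standard input with the system prompt passed via `-s`, and the names `SYSTEM_TEMPLATE`, `build_llm_command` and `ask_website` are hypothetical. It also sends the raw page body as-is, where a real helper would probably convert the page to plain text first.

```python
import subprocess
import urllib.request

# Hypothetical system prompt; the wording is an assumption, not taken from the dotfiles.
SYSTEM_TEMPLATE = (
    "Answer the user's question using only the web page content "
    "provided on standard input. Question: {question}"
)


def build_llm_command(question: str) -> list[str]:
    """Return the argv for the `llm` CLI, with the question in the system prompt."""
    return ["llm", "-s", SYSTEM_TEMPLATE.format(question=question)]


def ask_website(url: str, question: str) -> str:
    """Fetch `url` and pipe its body to `llm` on standard input."""
    with urllib.request.urlopen(url) as response:
        page = response.read().decode("utf-8", errors="replace")
    result = subprocess.run(
        build_llm_command(question),
        input=page,           # the page body becomes the prompt
        capture_output=True,
        text=True,
        check=True,           # raise if the llm command fails
    )
    return result.stdout
```

Calling `ask_website("https://simonwillison.net/", "What has Simon written about recently?")` would then mirror the `q` invocation above, provided `llm` is on the PATH and configured with a model.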
I’m in the middle of a design tokens project and I thought I’d share something I’m learning that is probably obvious to everyone else; every design token is a feature.
A token is a magical contract between design and engineering, if we agree to use the same name to abstractly refer to the same value, it will produce a desired outcome. That bridge alone is probably worth the investment, but toke...
Building Python tools with a one-shot prompt using uv run and Claude Projects
I've written a lot about how I've been using Claude to build one-shot HTML+JavaScript applications via Claude Artifacts. I recently started using a similar pattern to create one-shot Python utilities, using a custom Claude Project combined with the dependency management capabilities of uv.
(In LLM jargon a "one-shot" prompt is a prompt that produces the complete desired result on the first atte...
The Future of CSS: Construct <custom-ident> and <dashed-ident> values with ident()
Uniquely name a bunch of elements in CSS in one go! Instead of assigning 100 unique names through 100 declarations, write only 1 and use ident() to construct the names.
Java in the Small
Core Java author Cay Horstmann describes how he now uses Java for small programs, effectively taking the place of a scripting language such as Python.
TIL that hello world in Java can now look like this - saved as hello.java:
void main(String[] args) {
    println("Hello world");
}
And then run (using openjdk 23.0.1 on my Mac, installed at some point by Homebrew) like this:
...
A new free tier for GitHub Copilot in VS Code
It's easy to forget that GitHub Copilot was the first widely deployed feature built on top of generative AI, with its initial preview launching all the way back in June of 2021 and general availability in June 2022, 5 months before the release of ChatGPT.
The idea of using generative AI for autocomplete in a text editor is a really significant innov...
A polite disagreement bot ring is flooding Bluesky — reply guy as a (dis)service
Fascinating new pattern of AI slop engagement farming: people are running bots on Bluesky that automatically reply to "respectfully disagree" with posts, in an attempt to goad the original author into replying to continue an argument.
It's not entirely clear what the intended benefit is here: unlike Twitter there's...
Meet the Conflict Entrepreneurs. They’re Fucking All of Us.
Social media runs on conflict. This isn’t exactly breaking news — but what’s worth thinking about is how this has spawned an entire class of what we might call “professional conflict entrepreneurs” and their cousins, the “trauma grifters.” These
This is part three of my series of posts describing how I made my quiz game o(m)g:image.
Project Announcement
Pt. I: Design Iterations
Pt. II: As Little JS As Possible
Pt. III: The HTML
o(m)g:image is presented like a quiz:
You get one question at a time
When you choose an answer, it shows you if you got it right (and, if you didn’t, what the right answer is)
You go to the next question
T...
OpenAI WebRTC Audio demo
OpenAI announced a bunch of API features today, including a brand new WebRTC API for setting up a two-way audio conversation with their models.
They tweeted this opaque code example:
async function createRealtimeSession(inStream, outEl, token) {
const pc = new RTCPeerConnection();
pc.ontrack = e => outEl.srcObject = e.streams[0];
pc.addTrack(inStream.getTracks()[0])...
You Got Dragged on the Internet. Stop Building a Shrine to It.
“Once the bear’s hug has got you, it is apt to be for keeps.” — Harold MacMillan
A pattern, common to the internet in 2024. Someone has a conflict with a member of Group X. Let’s say they get called
Happy to share that Anthropic fixed a data leakage issue in the iOS app of Claude that I responsibly disclosed. 🙌
👉 Image URL rendering as avenue to leak data in LLM apps often exists in mobile apps as well -- typically via markdown syntax,
🚨 During a prompt injection attack this was exploitable to leak info.
— Johann Rehberger
Tags: anthropic, claude, ai, llms, johann-rehberger, ...
Ah, the CSS top layer, what a great invention. And the Popover API, incredibly helpful for accessibility.
Over at work, while implementing my first ever native popover (which in the meantime already got shipped), I noticed inconsistent behavior between the Chromiums and Firefox. This does not come as a surprise, it’s just a small price to pay for all the new shiny things we are getting these da...
CSS Wishlist time! Sarah Gebauer shared recently and inspired me. Thanks Sarah! I'm also enjoying Johannes Odland's Web Wish series. Keep 'em comin'!
Previously in 2013, CSS Tricks made a Wishlist, and look at how much of it we have now! Fast forward to last year where Chris Coyier rounded up wishlists cuz there were so many.
I'm gonna bucket my 2025 CSS Wishlist into 2...
2024's top three front end frameworks [React, Vue, Angular] were all launched over a decade ago.
Now sure, all three have evolved a lot along the way, and the patterns of 2014 would seem downright antiquated today. But given the JavaScript ecosystem's reputation as a constantly-churning whirlwind of change, it can be nice to know that some things do remain constant.
— 2024 State of JavaSc...
I’m pretty proud that I managed to keep my iPhone 8 going for over five years (with one battery replacement in that time). But recently it’s been increasingly unreliable, switching itself off at random times, and spontaneously draining the battery after doing anything remotely taxing. This combined with the fact that software updates are no longer available for this model led me to the conclusi...
Security ProbLLMs in xAI's Grok: A Deep Dive
Adding xAI to the growing list of AI labs that shipped features vulnerable to data exfiltration prompt injection attacks, but with the unfortunate addendum that they don't seem to be taking the problem seriously:
All issues mentioned in this post were responsibly disclosed to xAI. Over the course of multiple weeks I answered many questions aroun...
Veo 2
Google's text-to-video model, now available via waitlisted preview. I got through the waitlist and tried the same prompt I ran against OpenAI's Sora last week:
A pelican riding a bicycle along a coastal path overlooking a harbor
It generated these four videos:
Here's the larger video.
Via Hacker News
Tags: ai, google, generative-ai, pelican-riding-a-bicy...
The hardest part for me about collaborating with junior programmers, whether it's in open source or at work, is avoiding the premise trap. That's where the fundamental assumptions baked into the first draft of the code aren't questioned until you've already spent far too long improving the implementation. It's the same with AI.
Because AI at the moment is like a superb jun...
Tea sales in the UK are falling. It’s an old person’s drink (BBC):
less than half the nation, 48%, now drink tea at least once a day.
Shocking.
Coffee is where it’s at, of course. It costs a ton so the experience can be good and there’s the convenience and the frequency of it, and all of that builds habit, and how is tea to survive an onslaught like that.
If I were the tea marketing board, t...
Stop Expecting Facebook to Fix What We Can't Fix in Real Life
There's been a lot of hand-wringing about social media moderation of late.
"Platform X isn't doing enough to stop harmful content!"
"Platform Y is censoring too much speech!"
"Platform Z needs better content moderation!"
But there's something deeply
Video from Chrome’s 2024 EOY campaign, highlighting View Transitions. I made sure the code snippets and animations that you see were Technically Correct™ + coded up a POC for this one (link to demo included).
WebDev Arena
New leaderboard from the Chatbot Arena team (formerly known as LMSYS), this time focused on evaluating how good different models are at "web development" - though it turns out to actually be a React, TypeScript and Tailwind benchmark.
Similar to their regular arena this works by asking you to provide a prompt and then handing that prompt to two random models and letting you pick th...
If there’s one thing I’ve learned about myself over the years, it’s that there’s still much I don’t know about myself and the way my brain works. And the more I pay attention to it, the more I discover aspects of myself I wasn’t aware of.
One thing I became aware of recently is that I’m quite incapable of asking others for pretty much anything important or consequential. I’m incapable of asking for f...
Phi-4 Technical Report
Phi-4 is the latest LLM from Microsoft Research. It has 14B parameters and claims to be a big leap forward in the overall Phi series. From Introducing Phi-4: Microsoft’s Newest Small Language Model Specializing in Complex Reasoning:
Phi-4 outperforms comparable and larger models on math related reasoning due to advancements throughout the processes, including the use of ...
How We Became the McWorld - Global Culture is Getting More Boring
In the 1990s, we were promised a digital utopia. The internet would be a uniting, democratizing force, they said, bringing diverse voices and perspectives from every corner of the world into dialogue. Local cultures would flourish as they found their global audiences. Niche interests would thrive in their newfound ability
Motionless moonlit night
Dog panting on the grass
Winter is near
Preferring throwaway code over design docs
Doug Turnbull advocates for a software development process far more realistic than attempting to create a design document up front and then implement accordingly.
As Doug observes, "No plan survives contact with the enemy". His process is to build a prototype in a draft pull request on GitHub, making detailed notes along the way and with the full inten...
Making o(m)g:image, Part II: As Little JS As Possible
This is part two of my series of posts describing how I made my quiz game o(m)g:image.
Project Announcement
Pt. I: Design Iterations
Pt. II: As Little JS As Possible
Pt. III: The HTML
One of my goals when making this project was to use as little JavaScript as possible.
In retrospect, I have to admit that was a pretty ambitious goal. Not because it was hard from a technical point of view, bu...
In search of a faster SQLite
Turso developer Avinash Sajjanshetty (previously) shares notes on the April 2024 paper Serverless Runtime / Database Co-Design With Asynchronous I/O by Turso founder and CTO Pekka Enberg, Jon Crowcroft, Sasu Tarkoma and Ashwin Rao.
The theme of the paper is rearchitecting SQLite for asynchronous I/O, and Avinash describes it as "the foundational paper behind Limbo, ...
Doing some admin this Sunday morning (funny, I typed morning and I heard the bells in the distance ringing because it’s actually noon): I updated my about page, and I’ll keep adding to it. I want to move all the info I have scattered across the various slash pages there, because it makes no sense to have bits and pieces all over my site when I can have it all neatly organised on my about page.
Al...
It's cookie party day with our friends, but both kids are sick (and my spouse isn't 100% either), and we call in sick…
So, I bundled everyone on the couch, and we giggled (wrong family, too many hands, etc) our way through making cookies with AI.
An LLM knows every work of Shakespeare but can’t say which it read first. In this material sense a model hasn’t read at all.
To read is to think. Only at inference is there space for serendipitous inspiration, which is why LLMs have so little of it to show for all they’ve seen.
— Riley Goodside
Tags: riley-goodside, llms, ai, generative-ai
The Mighty Git "One of the curious upsides to discovering you have ADHD as an older man is that suddenly other people have an easier time finding presents for you..."
Tech Stuff
Mantine DataTable When you need an all powerful data table in your UI that can do
3 shell scripts to improve your writing, or "My Ph.D. advisor rewrote himself in bash."
Matt Might in 2010:
The hardest part of advising Ph.D. students is teaching them how to write.
Fortunately, I've seen patterns emerge over the past couple years.
So, I've decided to replace myself with a shell script.
In particular, I've created shell scripts for catching three problems:
abuse of...
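The list above is cut off, so the following is not one of Matt Might's scripts. It is a hypothetical example of the kind of mechanical writing check that a small script can automate: flagging a word that is accidentally repeated, including across a line break. The function name `find_repeated_words` is made up for this sketch, and it is written in Python rather than shell.

```python
import re

# A word, then whitespace (possibly a line break), then the same word again.
REPEATED = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def find_repeated_words(text: str) -> list[str]:
    """Return each word that is immediately repeated in `text`, lowercased."""
    return [match.group(1).lower() for match in REPEATED.finditer(text)]


draft = "Many readers skip over the\nthe second copy of a word."
print(find_repeated_words(draft))
```

Repeats that straddle a line break are the ones that are easiest to miss when proofreading, which is why the pattern allows any whitespace between the two copies.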
Why the Internet Era Might Be History's Least-Documented Period
Last week, I tried to find some photos from my college graduation. Despite being only fifteen years ago, they proved surprisingly elusive - trapped on a defunct Photobucket account, lost to a crashed hard drive, and scattered across social media platforms that no longer exist. This got me thinking about
BBC complains to Apple over misleading shooting headline
This is bad: the Apple Intelligence feature that uses (on device) LLMs to present a condensed, summarized set of notifications misrepresented a BBC headline as "Luigi Mangione shoots himself".
Ken Schwencke caught that same feature incorrectly condensing a New York Times headline about an ICC arrest warrant for Netanyahu as "Netanyahu arr...
OpenAI: Voice mode FAQ
Given how impressed I was by the Gemini 2.0 Flash audio and video streaming demo on Wednesday it's only fair that I highlight that OpenAI shipped their equivalent of that feature to ChatGPT in production on Thursday, for day 6 of their "12 days of OpenAI" series.
I got access in the ChatGPT iPhone app this morning. It's equally impressive: in an advanced voice mode conver...
<model-viewer> Web Component by Google
I learned about this Web Component from Claude when looking for options to render a .glb file on a web page. It's very pleasant to use:
<model-viewer style="width: 100%; height: 200px"
src="https://static.simonwillison.net/static/cors-allow/2024/a-pelican-riding-a-bicycle.glb"
camera-controls="1" auto-rotate="1"
></model-viewer>
Here...
This is the 68th edition of People and Blogs, the series where I ask interesting people to talk about themselves and their blogs. Today we have Chris DeLuca and his blog, chrisdeluca.me
To follow this series subscribe to the newsletter. A new interview will land in your inbox every Friday. Not a fan of newsletters? No problem! You can read the interviews here on the blog or you can subscribe to...
OpenAI's postmortem for API, ChatGPT & Sora Facing Issues
OpenAI had an outage across basically everything for four hours on Wednesday. They've now published a detailed postmortem which includes some fascinating technical details about their "hundreds of Kubernetes clusters globally".
The culprit was a newly deployed telemetry system:
Telemetry services have a very wide footprint, so ...
A year ago, I decided to try Helix.
“The joy of learning Helix (and probably other modal, terminal-based editors)”
reveals more about the motivations and initial impressions. A few weeks after
playing around with it, I adopted it as a daily driver. I love Helix, its
simplicity, the set of features it comes with, the documentation, and the
community behind it. Learning this tool made...
Clio: A system for privacy-preserving insights into real-world AI use
New research from Anthropic, describing a system they built called Clio - for Claude insights and observations - which attempts to provide insights into how Claude is being used by end-users while also preserving user privacy.
There's a lot to digest here. The summary is accompanied by a full paper and a 47 minute YouTube int...
What does a board of directors do?
Extremely useful guide to what life as a board member looks like for both for-profit and non-profit boards by Anil Dash, who has served on both.
Boards can range from a loosely connected group that assembled on occasion to indifferently rubber-stamp what an executive tells them, or they can be deeply and intrusively involved in an organization in a way that u...
The Matthew Effect of Post-Twitter Social Networks
The Matthew Effect was first coined by sociologists Robert K. Merton and Harriet Zuckerman in 1968, who noticed that eminent scientists tended to get disproportionate credit for collaborative research compared to their less-well-known colleagues. The same paper would get more attention if a famous name was on it, even
This is part one of my series of posts describing how I made my quiz game o(m)g:image.
Project Announcement
Pt. I: Design Iterations
Pt. II: As Little JS As Possible
Pt. III: The HTML
I blogged about my recent project omgimg.jim-nielsen.com and I figured I’d write more details about my process behind making it.
When the idea first struck, I jumped into Figma and started working out the idea...
A few days ago, someone shared something I wrote on Hacker News and for reasons unknown it got traction and the post spent quite some time at the top of the front page. If it wasn’t for a few kind people who got in touch with me via email where they mentioned that they found my site thanks to HN I’d have never noticed it. My server logs did notice it though and it looks like this:
That is, inde...
Welcome to the Protestant Reformation of Social Media
In 1517, Martin Luther nailed his 95 theses to the door of All Saints’ Church in Wittenberg, fracturing the unified Catholic hierarchy that had dominated European spiritual and social life for centuries.
In 2022, Elon Musk bought Twitter for $44 billion, and we’re still dealing with the
⚠️ Content warning: Weight loss, feel free to skip if that is not a good topic for you.
A doctor told me to look into intermittent fasting. Not for weight loss, but for ADHD. There’s some new data that suggests a link between ADHD and insulin in the brain. Based on that science, intermittent fasting or a ketogenic diet, which can help improve insulin resistance, might help my brain. I’m a wee...
The lack of server-side rendering in web components has become a sort of folk belief that oft goes unquestioned. I am happy to report that the fears are unfounded.
The world isn’t falling apart—it’s being pried apart, one stolen moment at a time. Attention has become the most coveted commodity, and we’re little more than unwitting marks in a global grift. The apps, platforms, and systems sucking up our time aren’t
I’m not the kind of person that develops a strong attachment to their own work. When I decided to leave Redis, about 1620 days ago (~ 4.44 years), I never looked at the source code, commit messages, or anything related to Redis again. From time to time, when I needed Redis, I just downloaded it and compiled it. I just typed “make” and I was very happy to see that, after many years, building Red...
From here to Harrison Bergeron via AirPods and transparency mode
In Kurt Vonnegut’s 1961 sci-fi short Harrison Bergeron (Wikipedia) it is the year 2081 and "everybody was finally equal."
Nobody was better looking than anybody else. Nobody was stronger or quicker than anybody else.
How? The United States Handicapper General takes care of it.
Hazel had a perfectly average intelligence, which meant she couldn’t think about anything except in short bursts. A...