Manhattan Metric
Software and Science in Equal Measure

ChatGPT Told Me to Go Work for Anthropic

I had just wrapped up my Ph.D. dissertation defense and was waiting outside while my committee conferred. After a few minutes, they called me back in to hear their verdict. The head of our department addressed me directly. “We are going to grant you your Ph.D.,” he began, “on one condition: you have to promise to continue this line of research.”

I eagerly agreed, shook my committee members’ hands, and headed home to celebrate a satisfying conclusion to a rather stressful chapter of my life. It was only some days later, as I returned to my full-time job as a software engineer, that I began to consider his words more carefully. He and my committee were well aware that, facing the prospect of at least three years as a post-doctoral researcher making at most $40,000 per year, I had already opted for a career in software earning more than triple that. I was certainly passionate about my research, but I also wanted to start a family. I realized that what he was saying, in effect, was: “I know you’re stepping away from research now, but don’t stay away forever. There’s something in your work worth investigating further!”

Admittedly, over the past fifteen years I haven’t exactly held up my end of that bargain. While I have, on many nights and weekends, returned to my original research question, I’ve never made it the focus of my career. This past weekend, though, I considered that the time for such a shift might be nigh. Before settling on a change in direction, I consulted with ChatGPT. In the past, I’ve found that ChatGPT can serve as a perfectly capable research assistant, quickly surfacing relevant papers and restating or forcing me to restate conclusions in a way that helps hone my thinking. This past weekend’s chat was no exception.¹

Satisfied that my research remained relevant, and that there were still interesting questions and directions in which I could take it, I decided to go for the gusto. I asked ChatGPT, based on the content of our conversation:

If I wanted to get a job at OpenAI working on this research and pushing further in these directions, who should I talk to?

I was somewhat taken aback by the response:

Your ideas intersect strongly with scaling law research.

One of the foundational researchers here was Jared Kaplan, who helped derive empirical scaling laws showing how model performance improves with size, data, and compute…Kaplan has since moved to Anthropic.

A surprising number of people doing fundamental ML theory came from physics backgrounds. Another example is Sam McCandlish, who studied theoretical physics and co-authored early scaling-law work at OpenAI…Even though McCandlish is now at Anthropic, many people at OpenAI have similar backgrounds.

If you want, I can also tell you which labs (including OpenAI competitors) are most receptive to this kind of thinking.

Reading between the lines, I followed up by asking instead about going to work for Anthropic, to which ChatGPT responded:

Anthropic is actually a much more natural target for the kind of work you’re describing than OpenAI, for a few reasons:

  • They already employ multiple physicists and complex-systems thinkers.
  • Their culture tolerates more fundamental speculation about intelligence.
  • Some of their core research (scaling laws, interpretability) is basically proto-physics of learning systems.

OpenAI tends to focus more on:

  • engineering
  • scaling
  • product deployment.

(You can read my full conversation here.)

Now, if I were to tell you that one of these two organizations is a public benefit corporation controlled by a nonprofit foundation, and the other is a venture-backed Silicon Valley startup, which would you guess is which?

There’s an oft-retold story in Silicon Valley about Xerox’s Palo Alto Research Center, or PARC. It is generally told as a cautionary tale of a company that founded a research lab that went on to develop the GUI, object-oriented programming, Ethernet, and more…and which commercialized none of it. Yes, Xerox was and is an immensely profitable company, the story goes, but just imagine the kind of company it could have become if only it had realized what it had in PARC.

The thing is, Xerox had its technology breakthrough very early on, and it did change the world. It’s hard to fault Xerox management for ignoring PARC’s non-printing-related inventions when printing technology was so useful and so profitable.² Given that OpenAI proved, with GPT, that transformer models could be practically useful and then showed, with GPT-2 and GPT-3, that bigger is better, it’s not surprising that they would focus their attention on being better at what they’re already good at.

A surprising number of the technologies, and people, from the early days of PARC moved on to Apple, and it’s no stretch to say that Apple would not be what it is today without PARC. Since producing the first Apple I computer, Apple has evolved through periods of selling printers, displays, music players, phones, tablets, watches, and even movies. Xerox still sells copiers and printers.

It’s tempting to think that more conservative funding models reduce profit pressures on a company, giving it the freedom to experiment and explore, and that speculative, highly-leveraged funding puts more of an onus on a company to turn a profit, but often the opposite is true. The only recent exceptions are AT&T’s Bell Labs and Google, both of which held near (or actual) monopoly status and managed to use it to foster an environment of speculative invention. However, even in these cases, the inventions to come from each were generally more closely linked to the profit drivers of the company, and neither managed to produce quite the same concentration of paradigm-shifting ideas as PARC. So it would not surprise me if, a few years into the future, we find out that OpenAI has become the Xerox of AI, still selling tokens, while Anthropic has morphed into something more like Apple, building new and fantastical things in diverse areas we can’t even imagine today.

Having worked at Apple (while also finishing my Ph.D.) through the release of the iPhone, Apple TV, and iPad, and having enjoyed the semi-controlled creative chaos of that environment, I suppose it makes sense that I would be more drawn to Anthropic. That ChatGPT would be the one to highlight this for me says something interesting…though I’m not entirely sure what. On one hand, there’s an expectation that companies build self-promotion into their products, and so I guess I naively assumed that ChatGPT would tell me: “Of course OpenAI is the best company to work for!” On the other hand, if Anthropic would be a better match for my interests, then the fact that ChatGPT would call that out points to the strength of the model.

I still haven’t entirely decided what to make of all this. While ChatGPT is not so obsequious as it once was, I’m still not certain I can trust its fawning assessment of my past research. What I do know is that I still remember the promise I made to my committee all those years ago. However this unfolds, whether as a nights-and-weekends curiosity or something more, I intend to keep it. And hey, if you happen to know someone at Anthropic who would be interested in talking about the possibility of pushing on the boundaries of what’s known about the proto-physics of learning systems, drop me a line!

  1. Coincidentally, one of the papers ChatGPT pointed me in the direction of was already open in another of my browser tabs. 

  2. It’s also worth noting that laser printing, also invented at PARC, was successfully commercialized by Xerox.