OpenAI GPT-4 Turbo: First Impressions
- It's interesting to see OpenAI committing to providing infra, not just models: built-in retrieval/RAG, a LangChain-style Assistants API, etc. (a minimal sketch at the end of this post).
- The infra is very general-purpose and thin; it's a replacement for the tech stack of the simple wrapper companies.
- GPT-4 Turbo is so cheap. The cost reductions were expected (I've certainly been counting on them), but they nonetheless enable a lot.
- Curious whether that 128k context is truly usable. In practice it probably just means I get to remove the "max_tokens" checks from my code in most places (the kind of guard I mean is sketched at the end).
- The consumer-focused stuff was interesting: plugins were the wrong abstraction; maybe GPTs are better.
- I find the revenue sharing a bit weird; it's like TikTok. I think TikTok showed everyone that the "creator fund" model is incentive-optimized for the platform, precisely because creators respond irrationally to lottery-like payouts. Definitely not an app-store vibe. Maybe better for OpenAI, but definitely worse for GPT-makers.
- I wonder whether ChatGPT will get the 128k context. I suspect not: a full 128k-token query costs around $2 (rough math at the end of this post), so it doesn't take many queries for ChatGPT to lose money on the monthly subscription. They'll probably keep using RAG and the various compression tricks they already do.
- Final, big takeaway: OpenAI is absolutely playing to be the generational company of this era of Silicon Valley, and seems to be assembling the talent bench and ambitions to match.
- I'm left wondering how much longer it's conventionally worthwhile to write Terraform-style, "LLM-provider-neutral" code (the pattern sketched last below) when OpenAI is starting to break away from the pack so fast…
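
A few sketches to ground the notes above. First, the "infra, not just models" point: the shape below follows the DevDay beta of the openai Python SDK (v1.x). Treat the exact parameter names, and the `"retrieval"` tool in particular, as assumptions that may shift while the API is in beta.

```python
# Minimal sketch of the Assistants API (DevDay beta shape; names
# here are assumptions and may change while the API is in beta).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# An assistant with built-in retrieval: OpenAI hosts the chunking
# and vector store that a LangChain-style stack used to provide.
assistant = client.beta.assistants.create(
    name="docs-helper",
    instructions="Answer questions using the uploaded docs.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
)

# Conversation state also lives server-side, as a thread.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="What does the doc say?"
)
run = client.beta.threads.runs.create(
    thread_id=thread.id, assistant_id=assistant.id
)
```

Notice how much of a wrapper company's stack (chunking, embeddings, vector DB, conversation storage) collapses into those few calls.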
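Second, the context-length guards I expect to delete. A minimal sketch using tiktoken; the 8k limit is the old GPT-4 window that forced these checks in the first place.

```python
# The kind of context-length guard a 128k window makes mostly
# unnecessary. A sketch; the limit constant is the old GPT-4 window.
import tiktoken

MAX_CONTEXT_TOKENS = 8_192  # pre-Turbo GPT-4 context limit

def fits_in_context(prompt: str, model: str = "gpt-4") -> bool:
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(prompt)) <= MAX_CONTEXT_TOKENS

# At 128k, nearly every realistic prompt passes, so checks like
# this can simply be removed in most places.
```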
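Third, the back-of-envelope behind the "$2 per full query" claim, assuming the DevDay-announced GPT-4 Turbo prices of $0.01 per 1K input tokens and $0.03 per 1K output tokens (assumptions; check current pricing), plus an arbitrary reply length.

```python
# Rough cost of one maximally full-context GPT-4 Turbo query.
# Prices and output length are assumptions, not quoted from OpenAI.
input_tokens = 128_000   # a completely full context window
output_tokens = 2_000    # a generous reply length (assumption)

cost = input_tokens / 1_000 * 0.01 + output_tokens / 1_000 * 0.03
print(f"~${cost:.2f} per query")  # ~$1.34, i.e. on the order of $2
```

At a ~$20/month subscription, a dozen or so such queries would wipe out the margin, which is why I'd bet on RAG/compression rather than raw 128k context in ChatGPT.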
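Finally, the "Terraform-style" provider-neutral pattern I'm questioning: one thin interface with per-provider adapters behind it. The class and method names here are illustrative, not any real library.

```python
# Sketch of a provider-neutral LLM layer. ChatProvider and
# OpenAIChat are hypothetical names for illustration only.
from typing import Protocol


class ChatProvider(Protocol):
    def complete(self, prompt: str) -> str: ...


class OpenAIChat:
    """Adapter over the OpenAI SDK (openai>=1.0 call shape)."""

    def __init__(self, model: str = "gpt-4-1106-preview") -> None:
        from openai import OpenAI
        self.client = OpenAI()
        self.model = model

    def complete(self, prompt: str) -> str:
        resp = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

# The question above: is maintaining ChatProvider worth it once
# only one provider offers assistants, retrieval, 128k context?
```

The neutrality tax is small per call, but it forbids using exactly the provider-specific infra (assistants, retrieval, threads) that this announcement makes compelling.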