• 0 Posts
  • 18 Comments
Joined 1 year ago
Cake day: July 8th, 2023



  • Did you read the article, or the actual research paper? They present a mathematical proof that any hypothetical method of training an AI that produces an algorithm performing better than random chance could also be used to solve a known intractable problem, something no known method can do efficiently. This means that any algorithm we can produce by training an AI would run in exponential time or worse.

    The paper’s authors point out that this has severe implications for current AI, too: since the learning problem underlying the AI-by-learning approach that all LLMs rely on is fundamentally NP-hard and can’t be solved in polynomial time, “the sample-and-time requirements grow non-polynomially (e.g. exponentially or worse) in n.” They present a thought experiment of an AI that handles a 15-minute conversation, assuming 60 words are spoken per minute (keep in mind the average is roughly 160). The input size n for that conversation works out to 60 × 15 = 900 words. The authors then conclude:

    “Now the AI needs to learn to respond appropriately to conversations of this size (and not just to short prompts). Since resource requirements for AI-by-Learning grow exponentially or worse, let us take a simple exponential function O(2^n) as our proxy of the order of magnitude of resources needed as a function of n. 2^900 ∼ 10^270 is already unimaginably larger than the number of atoms in the universe (∼10^81). Imagine us sampling this super-astronomical space of possible situations using so-called ‘Big Data’. Even if we grant that billions of trillions (10^21) of relevant data samples could be generated (or scraped) and stored, then this is still but a miniscule proportion of the order of magnitude of samples needed to solve the learning problem for even moderate size n.”
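    To put those numbers side by side, here’s a quick back-of-the-envelope check in Python of the arithmetic quoted above (nothing in it beyond the figures in the excerpt):

    ```python
    import math

    # Figures from the thought experiment quoted above.
    n = 60 * 15                       # 15-minute conversation at 60 words/minute -> n = 900
    log10_space = n * math.log10(2)   # order of magnitude of the 2^n resource proxy
    log10_samples = 21                # "billions of trillions" (10^21) of data samples

    print(f"n = {n}")                                            # n = 900
    print(f"2^{n} is on the order of 10^{int(log10_space)}")     # ~10^270
    print(f"10^{log10_samples} samples miss by ~{int(log10_space) - log10_samples} orders of magnitude")
    ```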

    That’s why LLMs are a dead end.


  • When IT folks say devs don’t know about hardware, they’re usually talking about the forest-level overview in my experience. Stuff like how the software being developed integrates into an existing environment and how to optimize code to fit within the bounds of reality–it may be practical to dump a database directly into memory when it’s a 500 MB testing dataset on your local workstation, but it’s insane to do that with a 500+ GB database in a production environment. Similarly, a program may run fine when it’s using an NVMe SSD, but lots of environments even today still depend on arrays of traditional electromechanical hard drives because they offer the most capacity per dollar and aren’t as prone to suddenly tombstoning the way flash media is when it dies. Suddenly, once the program is in production, it turns out that same program is making a bunch of random I/O calls that could be optimized into more sequential requests or batched together into a single transaction, and now it runs like dogshit and drags down every other VM, container, or service sharing that array with it. And that’s not even accounting for the real dumb shit I’ve read about, like “dev hard-coded their local IP address and it breaks in production because of NAT” or “program crashes because it doesn’t account for network latency.”
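    To make the I/O point concrete, here’s a minimal sketch in Python with sqlite3 (a stand-in for whatever database is actually involved, not any particular production setup) contrasting a commit per row with batching the same writes into one transaction:

    ```python
    import sqlite3

    rows = [(i, f"value-{i}") for i in range(10_000)]

    conn = sqlite3.connect("example.db")
    conn.execute("CREATE TABLE IF NOT EXISTS kv (id INTEGER PRIMARY KEY, val TEXT)")

    # Anti-pattern: one statement and one commit per row. Every commit forces a
    # flush, so a spinning-disk array eats thousands of tiny random writes.
    for row in rows:
        conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", row)
        conn.commit()

    # Better: batch the same work into a single transaction. One commit, one
    # (mostly sequential) flush, far less contention for whatever shares the array.
    with conn:
        conn.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)", rows)

    conn.close()
    ```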

    Game dev is unique because you’re either explicitly targeting a single known platform (for consoles) or targeting an extremely wide range of performance specs (for PC), and hitting an acceptable level of performance pre-release is (somewhat) mandatory, so this kind of mindfulness is drilled into devs much more heavily than it is in business software dev, especially in-house dev. Business development is almost entirely focused on “does it run without failing catastrophically,” and almost everything else–performance, security, cleanliness, resource optimization–gets bare lip service at best.




  • Basically, X11/Xorg doesn’t isolate programs from one another. This is horrible for security since malicious software can read every window, as well as all the input from mice and keyboards, just by querying the X server, but it’s also handy for screen-reading software, streaming, etc. Meanwhile, Wayland isolates each program in its own sandbox, which prevents, say, a malicious browser tab from reading all of your keyboard inputs and logging your root password, but also breaks those things we like to use. To make matters worse, it looks like everyone’s answer for this and similar dilemmas wasn’t “let’s fix Wayland” but “let’s develop an extension to fix Wayland,” and we wound up with that one fucking xkcd standards comic that I won’t bother linking because everyone has seen it a zillion times.
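    As a small illustration of that first point, here’s a sketch using the third-party python-xlib package (assuming it’s installed): any ordinary, unprivileged X11 client can walk the whole window tree and read other applications’ window titles just by asking the X server, and input snooping works much the same way through extensions like RECORD.

    ```python
    from Xlib import display

    d = display.Display()        # connect to the X server named by $DISPLAY

    def walk(window, depth=0):
        """Recursively print the title of every window on the display."""
        try:
            name = window.get_wm_name()
        except Exception:
            name = None          # some windows have no readable title
        if name:
            print("  " * depth + str(name))
        for child in window.query_tree().children:
            walk(child, depth + 1)

    walk(d.screen().root)        # no special permissions needed for any of this
    ```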

    ETA: Basically, my (layman’s) understanding is that fixing this and making screen readers work in Wayland is hard because the core Wayland developers seem to have little appetite for fixing it themselves. Meanwhile, there are 3-4 Wayland compositor implementations that do things differently, so fixing it via extensions means either writing multiple backends in your program to do the same damn thing (aka a giant pain in the ass) or getting everyone to agree on the same standard implementation (good fucking luck).


  • The problem is that there’s no incentive for employees to stay beyond a few years. Why spend months or years training someone if they leave after the second year?

    But then you have to ask why employees aren’t loyal any longer, and the answer is that pensions and benefits have eroded and pay doesn’t keep up the longer you stay at one company. Why stay at a company for 20, 30, or 40 years when you can come out way ahead financially by hopping jobs every 2-4 years?


  • It makes sense to judge how closely LLMs mimic human learning when people use that comparison as a defense of AI companies scraping copyrighted content, claiming that banning AI scraping is as nonsensical as banning human learning.

    But when it’s pointed out that LLMs don’t learn very similarly to humans, and require scraping far more material than a human does, suddenly AIs shouldn’t be judged by human standards? I don’t know if it’s intentional on your part, but that’s a pretty classic example of a motte-and-bailey fallacy. You can’t have it both ways.


  • Who even knows? For whatever reason the board decided to keep quiet, didn’t elaborate on its reasoning, let Altman and his allies control the narrative, and rolled over when the employees inevitably revolted. All we have is speculation and unnamed “sources close to the matter,” which you may or may not find credible.

    Even if the actual reasoning was entirely justified–and knowing what a techbro Altman is (especially with his insanely creepy project to combine cryptocurrency with retina scans), I absolutely believe the speculation that the board felt Altman wasn’t trustworthy–they never bothered to share that reasoning with anyone, and clearly thought they could just weather the firestorm, right up until they realized it was too late and they’d already shot themselves in the foot.






  • First, it’s important to find an instance that caters to your interests, especially if you have more niche hobbies. Once you’re set up, search for and follow hashtags related to your personal interests, and use those to find accounts you like. Use hashtags in your own posts so that people can discover you more easily, and browse the users that follow you to see if they’d be worth following back to expand your network. Keep an eye on the local timeline (all public posts from accounts on your instance) and the federated timeline (public posts from every instance yours federates with) for interesting posts. Eventually, as you build up a follow list (and especially as you follow highly active accounts), the accounts you follow will start introducing you to new ones by boosting posts.

    It’s more work since you’re building the network yourself instead of having it spoon-fed to you by an algorithm, but it’s overall much more rewarding, and lets you tailor your experience to your own personal preferences.
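    If you ever want to poke at those timelines outside the web UI, Mastodon also exposes them through its REST API; here’s a rough sketch using Python’s requests library (the instance URL and hashtag are placeholders, and some instances require an access token for these endpoints):

    ```python
    import requests

    INSTANCE = "https://mastodon.social"   # placeholder: use your own instance
    HASHTAG = "linux"                      # placeholder: any hashtag you follow

    def show(title, statuses):
        print(f"\n== {title} ==")
        for status in statuses[:5]:
            print(f"{status['account']['acct']}: {status['url']}")

    # Local timeline: public posts from accounts on this instance only.
    local = requests.get(f"{INSTANCE}/api/v1/timelines/public",
                         params={"local": "true", "limit": 20}, timeout=10).json()
    show("Local timeline", local)

    # Federated timeline: public posts from every instance this one federates with.
    federated = requests.get(f"{INSTANCE}/api/v1/timelines/public",
                             params={"limit": 20}, timeout=10).json()
    show("Federated timeline", federated)

    # Hashtag timeline: public posts using a hashtag you care about.
    tagged = requests.get(f"{INSTANCE}/api/v1/timelines/tag/{HASHTAG}",
                          params={"limit": 20}, timeout=10).json()
    show(f"#{HASHTAG} timeline", tagged)
    ```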