Episode 70: Sparks of Artificial General Intelligence?

  • Links to this episode: Spotify / Apple Podcasts
  • This transcript was generated with AI using PodcastTranscriptor.
  • Unofficial AI-generated transcripts. These may contain mistakes. Please check against the actual podcast.
  • Speakers are denoted as color names.

Transcript

[00:00:08]  Blue: Welcome to the Theory of Anything podcast. Hey, Peter. Hey, Bruce. How you doing? Good. Today we are going to talk about ChatGPT, AGI, whether ChatGPT is or isn’t an AGI, that sort of thing. And we’re going to use as our basis a Microsoft paper called “Sparks of Artificial General Intelligence: Early Experiments with GPT-4,” from Microsoft Research. This is in 2023. I’ll talk a little bit about some landmark papers that came before that paper as some background. But I thought it was an interesting paper. I’ll give you exact quotes from the paper and the exact examples that they cite, and give you a chance to kind of think about it for yourself. I am sharing my screen with Peter so he can actually see the examples from the paper. Obviously, if you listen to the podcast, you won’t be able to see this, but I’m going to actually do a lot of reading straight from the paper, exactly what they’re talking about. And I don’t think there’s anything in here that you have to actually visually see to make sense of what we’re talking about. But it’ll probably make it easier for me to talk to Peter, to kind of show him visually what I’m doing, so he can see it on the screen and read the same things in the paper that I’m talking about. So let me say that there’s some interesting background from a machine learning perspective that I want to kind of go over quickly. So let’s start way back in time. There was a... it wasn’t a paper, but it was a blog post from Andrej Karpathy.

[00:01:47]  Blue: I don’t even know if I pronounced that name right or not, but he’s a famous name in machine learning, and it’s called “The Unreasonable Effectiveness of Recurrent Neural Networks,” from 2015. By the way, you may have heard that phrase, the unreasonable effectiveness of... unreasonable, whatever. Do you know where that comes from, Peter?

[00:02:07]  Red: No, I’m curious, though. I have heard it here and there.

[00:02:11]  Blue: So there is an article called “The Unreasonable Effectiveness of Mathematics in the Natural Sciences” by Eugene Wigner. It’s a famous paper, and it’s old. I mean, today, in internet years, it would be old if it was two years old. But this was published in 1960.

[00:02:35]  Red: So as I understand it, that’s a pretty common thing that philosophers like to opine on: why is mathematics so effective at describing our world?

[00:02:48]  Blue: That’s right. And this paper was the one that coined the term, and everybody else uses “the unreasonable effectiveness of” whatever. In fact, I did that for our tolerance podcast; I talked about the unreasonable effectiveness of intolerance. Yeah. So it’s a common phrase that comes from way back, from that 1960 paper. So Karpathy talks about how the recurrent neural network, and I’ll explain what that is in just a second, was way more effective than you would have thought. Okay, so what is a recurrent neural network? In machine learning there are all sorts of different machine learning techniques. The simplest would be something like linear regression, where you’re just trying to fit a line to the data and then make predictions based on that line. Obviously, the one that’s gotten all the traction has been artificial neural networks, and there are tons in between. But artificial neural networks are vaguely based on what, at the time, they thought was a model of maybe how the brain worked. They’re really not like the brain at all; it was more of a 1960s view of what the brain might be like, rather than the way any neuroscientist would think of it today. But you hook a whole bunch of different nodes together, and they have weights. In the simplest form you have a feedforward network, where the signal always moves from one layer to the next, always forward. And then there’s a feedback mechanism where it sees how much error it got and adjusts the weights to try to minimize the errors. That’s the backpropagation algorithm, which basically calculates the slope of the function.

[00:04:38]  Blue: You have to make sure that the computational graph you’re using (an artificial neural network is a computational graph) is differentiable, so that you can actually use calculus to calculate what the slope is. You figure that out, then you make adjustments to the weights, and eventually you get to a point where it’s making really good predictions. Okay. Well, the idea of a neural network is more general than that. It doesn’t have to be a feedforward network. So a recurrent neural network would be like what you’re seeing on the screen here, Peter. Basically, the input to the network would be here, and the big secret with a recurrent neural network is that the hidden layer, where the inner computation is going on, feeds over to the next hidden node for the next part of the input. So basically, you’ve designed the neural network to be aware that this is some sort of sequence, and that whatever came before should be affecting the probabilities of whatever comes afterwards. Now, why would you want to do that? Well, a lot of things are sequences. Language is a sequence. And if you’re trying to make predictions for the stock market or something, you’re going to have a time sequence. So the recurrent neural network built this concept of a sequence into the architecture of an artificial neural network. And it allowed it to make much better predictions than a regular feedforward network.
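The recurrent step being described can be sketched in a few lines. This is a toy illustration, not any real library’s API; the sizes and weight names are made up for the example:

```python
import numpy as np

# One recurrent step: h_t = tanh(W_x @ x_t + W_h @ h_prev + b).
# The hidden state h carries information from earlier items forward,
# which is what makes the network "recurrent".
rng = np.random.default_rng(0)
input_size, hidden_size = 4, 8
W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # h_prev is the previous hidden state: each step's output depends
    # on everything that came before in the sequence.
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(hidden_size)
sequence = rng.normal(size=(5, input_size))  # 5 time steps of input
for x_t in sequence:
    h = rnn_step(x_t, h)

print(h.shape)  # → (8,)
```

A feedforward network would process each `x_t` independently; here the loop threads `h` through every step, so the final state depends on the whole sequence.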

[00:06:11]  Red: So just so I’ve got this right: these neural networks, they’ve been trying to do this since the ’60s?

[00:06:18]  Blue: Yeah. Right. But recurrent neural networks, this is pretty recent, just to put this in context, right? So one of the big secrets of machine learning is that they invented a lot of this stuff a long time ago, and we didn’t actually have the necessary computational power and the necessary data to see how great it was going to be. So the success of it was kind of lackluster for a long, long time. Okay. And so what they did is they started building what they called deep learning, where you stack an increasing number of layers. Back when they first were making neural networks, they would only have a few layers, because their computers and the amount of data they had could only justify a few layers. So as they started stacking on more and more layers and creating this idea of deep learning, they started getting these faster and faster computers. Plus you had the Internet, and companies like Google that had tons of data, just massive amounts of data compared to anything we’d ever seen before. And they were able to start building these deep learning networks that they would then spend, you know, let’s just say lots of money on. I mean, the amount of money you would spend to build a ChatGPT... I can’t remember, I once knew how much it was, but the amount of electricity alone. An unimaginable amount of money, right, when you consider what it can do.

[00:07:45]  Red: Yeah, I can only imagine.

[00:07:47]  Blue: So if you have enough data and enough electricity and computing power, then you can take these old ’60s, ’80s, ’90s architectures and get amazing results out of them just by adding more layers. Okay. This is actually one of the reasons why so many people working in machine learning think that if you just add more layers, you’re going to eventually end up with artificial general intelligence. Right? Just throwing more data and more compute at it has had this phenomenal impact on what you can do with it. Like, unimaginable impact. Nobody-saw-it-coming impact. I can at least understand... like, I don’t agree at all, but I can at least understand why people would get excited and think, you know what, if we can just build something ten times the size of ChatGPT, we’re going to end up with something that can think for itself. You can kind of see why they might get excited about this. Now, I don’t believe any of this is a path to artificial general intelligence, at least not as... let me actually change that. It is not a path to universal explainership. That’s what I should actually be saying. Okay.

[00:09:16]  Red: Wait, as long as we’re talking big picture now, would you say there’s any connection to Douglas Hofstadter’s strange loop idea, or is that just totally off base here?

[00:09:20]  Blue: Like his idea to explain human consciousness?

[00:09:23]  Red: As a sort of a strange loop, yeah. I read the book. I don’t completely understand it, but it’s so crazy.

[00:09:33]  Blue: Okay. We are going to have to do a podcast specifically on Douglas Hofstadter’s idea. Okay.

[00:09:40]  Red: Yeah,

[00:09:40]  Blue: we are going to talk today about Melanie Mitchell, who, you know, studied under Douglas Hofstadter.

[00:09:52]  Red: Oh, so there at least is a connection. I’m not crazy to think that something is related.

[00:09:55]  Blue: Okay. So well, Melanie Mitchell is going to be the critic of this paper that I’m going to use.

[00:10:01]  Red: Okay.

[00:10:02]  Blue: So Melanie Mitchell publishes a lot and has a lot of the same ideas as Douglas Hofstadter. Let me say that I don’t know if his ideas have anything to do with universal explainership or not. I personally have suspicions it does.

[00:10:19]  Green: Yeah.

[00:10:20]  Blue: But you’re asking me to make a guess based on so little information. I could try, you know, but the theories involved are so vague and so untestable. Now, a lot of people that I’ve talked to, if they have a background in this, they always say it’s “inductive,” and I’m not even sure if they know what that means. It’s the word that they use as a pejorative. I think that they don’t understand what they’re talking about, so I don’t think their arguments against Hofstadter even make sense to me. Now, that doesn’t mean that they’re wrong. I just don’t think their arguments properly refute his ideas. But I have no idea if the ideas are on the right track or not. I could make some arguments for and some arguments against, and again, the ideas are just too vague to be sure. I think that they create certain intuitions that might be on the right path or might lead you down the wrong path, and it’s a little unclear. I think the easiest summary, and there’s so much more to it than what I’m about to say, but I think the easiest summary of his idea is sound, which is the fact that I, as a universal explainer, in my mind, I model myself, right? There should be no real doubt about that. At a very basic level, that’s obviously true.

[00:11:53]  Red: Yeah, it makes a lot of sense.

[00:11:55]  Blue: So the question is how important is that to the concept of a universal explainer? That’s the real question we have to ask, right? Is this just something incidental that a universal explainer is able to do? Or is it like central to what it means to be a universal explainer? And that’s what I’m not sure about.

[00:12:13]  Red: OK.

[00:12:14]  Blue: So now there’s a lot more to his theories. And this is where we should probably do a separate podcast on him.

[00:12:20]  Red: OK.

[00:12:20]  Blue: Basically, he believes that the underlying trick of universal explainers (he doesn’t use the term universal explainer; I’m using that term), the underlying trick of what makes humans different from what machine learning is doing today, is that we can build analogies upon analogies upon analogies. Now, that term analogy really turns people off, and so maybe I should say schema or something like that instead, because what he’s talking about isn’t what you think analogy means. He does a great job in his books of first helping you see that what you think an analogy is isn’t what an analogy really is, right? So I almost want to call it something other than analogy, because the term is so misleading. But when you actually read through his books, he makes an interesting case that the human ability to analogize, to basically find the commonalities, create a category out of that, and then reapply it somewhere else, is what human creativity is. And there are a lot of interesting things that come from this theory. Unfortunately, most of them aren’t particularly testable at this point. Okay, so with that aside, let’s get back to recurrent neural networks. The idea of a recurrent neural network led to a whole bunch of other architectures. The recurrent neural network, as you can see on the screen, is a very simple architecture, right? It’s only slightly more complicated than a regular feedforward network. So they started doing more complicated things. They have the long short-term memory model, the LSTM, which is a more complicated version of a recurrent neural network, and I won’t try to describe it.

[00:14:06]  Blue: It’s way more complicated, as you can see on the screen, but it has a forget gate in it. It’s trying to not just use one step back, where the previous hidden node passes information to the next hidden node, which is what the recurrent neural network did. It’s trying to look back a ways; it’s trying to forget what it doesn’t need; it’s trying to do something complicated. And the LSTM is way more successful than a recurrent neural network, because it’s got a much more sophisticated version of how far it needs to look back. So this led to experiments with recurrent neural networks and LSTMs, things like that, and these more complicated architectures showed just how effective they were, particularly with sequential data, right? And this led to a landmark paper called “Attention Is All You Need,” from 2017. “Attention Is All You Need” introduced the idea of a transformer. So there’s an architecture called a transformer. It’s this image that I’ve got up here for you to see, Peter. This is obviously not an image of the full architecture, but it shows the idea. Basically, you’re feeding in stuff in a sequence, and some of these sequences can be quite long. You know, the number of tokens that something like ChatGPT uses is, I don’t know, 20,000 or something, and there are models out there with 100,000, where they can take into consideration a lot of stuff that came before. The secret here is that they’re no longer trying to take the information in the hidden node and pass it to the next hidden node.

[00:15:57]  Blue: Instead, they’re just training how much attention to apply to what came earlier in the sequence. So on this image, you can see the phrase “the animal didn’t cross the street because it was too tired.” You can even see how it’s broken up into tokens, so it’s not quite by words. And what’s happening is that the network is learning what to pay attention to in the past. So we’re on the word “it,” and that word is causing attention, and you can see, based on the shadings, what it’s paying attention to. And what it’s really paying attention to is “the animal.” Well, of course that makes sense, because the word “it” refers back to the word “animal,” and based on the training sets and everything it’s learned, it understands that in this particular set of words, the “it” must be referring back to the animal. Now, obviously these things don’t actually understand anything in the human sense, but the word “understanding” can have multiple meanings, and in this case it has a statistical matching between the word “it” and “the animal.” Okay. And so based on this, it drops almost everything from the original recurrent neural network, where it was trying to pass this hidden information along, and it’s just training on how to statistically match these things up. Well, the transformer network is the basis for ChatGPT. And what a transformer network basically does is take in a sequence and then predict what the next word is.
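The attention mechanism being described, where each token scores every other token and takes a weighted average, can be sketched roughly like this. A minimal toy version, with made-up sizes and random values standing in for learned projections (real language models also mask out future tokens, which is omitted here):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: turns raw scores into weights that sum to 1.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: each query (row of Q) scores every key,
    # softmax converts the scores to weights, and the output is the
    # weighted average of the values.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 6, 16                      # e.g. 6 tokens, 16-dim vectors
Q = rng.normal(size=(seq_len, d))
K = rng.normal(size=(seq_len, d))
V = rng.normal(size=(seq_len, d))
out, weights = attention(Q, K, V)

print(out.shape)  # → (6, 16)
```

Row `i` of `weights` is exactly the shading in the paper’s figure: how strongly token `i` (say, “it”) attends to each other token (say, “animal”).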

[00:17:35]  Blue: ChatGPT is no different than when you’re typing an email and it tries to finish the sentence for you. You know how it often gets it wrong, but if it gets it right, you can hit tab and it finishes the sentence for you and lets you type faster or whatever. That’s all it’s doing.

[00:17:52]  Red: I’ve never found that useful.

[00:17:54]  Blue: Yeah. No, it’s not. Okay. And that’s why it’s surprising that ChatGPT works so well. They invented these transformers that try to complete your sentence for you as you’re typing an email, and they don’t work that well. So it turns out that if you throw way more data and computation at it, in a much, much larger model (these models that ChatGPT is built on are just huge, right?), it will actually start to sound a lot like you’re talking to a real person, because it gets so good at predicting the next word and coming up with something relevant. And here’s where things get interesting. They’ve been working on trying to do sentence completion probably back since the ’50s or before, right? I mean, it’s been around forever. And they started off where it didn’t sound like English at all. Then it started to be kind of funny, where you could see it was getting the grammar right but putting in words that don’t make sense. And then they got to something like ChatGPT, where you couldn’t tell if it was a human who had written it or not. But if you went and fact-checked it, it was totally just made-up stuff. And I mean, all it’s doing is predicting the next word, right? So of course it’s making stuff up. Then what they found, as they continued to throw more compute at it, is that it actually started to sometimes produce meaningful stuff, like it was actually a source of information you could go to.

[00:19:30]  Blue: And you could type to it and it would give you correct answers back. Now remember, the transformer is really just trying to predict the next token. That’s all it’s trying to do. So its default behavior is to just make stuff up. It’s totally the most straightforward bull crapper that you’ve ever come across, right? But with enough compute, the most reasonable thing for it to say next will be what’s correct, based on what it found in its training set, which is like the entire Internet, right?
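The idea of “just predict the next token” can be illustrated with the simplest possible statistical model, a bigram counter. This is a deliberately tiny toy, nothing like a transformer, but the principle of emitting the statistically most likely continuation is the same:

```python
from collections import Counter, defaultdict

# Count which word follows which in a tiny "training set", then always
# emit the most frequent continuation. This is next-token prediction
# stripped down to its bare statistical core.
corpus = ("the animal did not cross the street because "
          "the animal was too tired").split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    # Return the most frequently observed follower of `word`.
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # → 'animal' ("animal" follows "the" twice, "street" once)
```

A transformer replaces the lookup table with a learned function of the entire preceding context, but the output is still “the likeliest next token,” which is why the default behavior, absent enough training signal, is confident-sounding made-up text.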

[00:20:00]  Red: So when I ask ChatGPT a question, which I do all the time, and it comes back so polite, and probably about nine times out of ten gives me a great answer, it’s really just sort of mindlessly predicting what would typically come next after that question, based on all the data that’s on the Internet?

[00:20:26]  Blue: That is correct.

[00:20:28]  Red: Is that how to think of it?

[00:20:29]  Blue: That is exactly correct.

[00:20:31]  Red: Yeah.

[00:20:31]  Blue: It’s hard to believe it works.

[00:20:35]  Red: It is hard to believe, but it’s amazing. Totally.

[00:20:36]  Blue: I don’t think they knew what was going to happen when they started doing these experiments. They had no idea how good it was going to turn out to be, right?

[00:20:41]  Red: But what about when you, like, ask it to write a poem? You can ask it to answer in verse or something. I mean, it’s never seen anything exactly like that before on the Internet, but it can just somehow generate it.

[00:20:56]  Blue: You know, the people who have made these things don’t know how they work. And this is fairly typical of machine learning, right? So I was listening to the Microsoft Research podcast, and they were talking about this paper that we’re discussing, and the guy who was in charge of writing this paper tells this story of when GPT-4 came out. They gave it a test, I can’t remember what it was a test for, it was like the bar exam or something like that, right? It was something that’s hard for humans to pass, and it was passing it, which even GPT-3.5 couldn’t do, right? And GPT-4 was suddenly doing this. I guess they were giving it multiple-choice questions or something at some point, and it would get the right answer, C or whatever. And then they would ask it, why is that the right answer? And it would explain why the other answers were wrong and why this one was right. And the guy running this paper (it’s a big giant group; he’s the one in charge) turns to the people at OpenAI and he goes, where is that explanation coming from? And they go, we have no idea.

[00:22:09]  Red: That’s so funny.

[00:22:10]  Blue: So it is definitely producing results that everybody’s shocked about, right? Yeah. And here’s what typically happens. Machine learning is kind of an engineering practice; it’s not really a science, right? What they do is they try stuff out, it creates these amazing results that nobody expected, and then they spend decades afterwards trying to figure out what’s going on. The theory follows the engineering by decades. And I’ve seen some of the papers where they’re trying to work out, okay, this is what’s really going on, this is how it’s doing it, things like that. One of the questions they’ve been trying to answer is: why does machine learning work at all? Think about a neural network. It’s moving through some very complicated terrain as it tries different weights. You can think of that as a sort of landscape, like a giant set of hills or mountains. And you just find the slope and walk downwards, and you’re blind other than that, right? You’re just trying out different weights blindly, blind variation, and you’re just using the fact that you know what the slope is to try to find the lowest point. You’re trying to get down off the mountain by simply using the slope. Okay, well, no, you probably never get down off the mountain, but you will get to the lowest point you can reach. At that point it’s a local optimum, and then you’ll stop.

[00:23:49]  Blue: OK, now, if I were to put you up on a mountain somewhere, put a blindfold on you, and give you a way to tell which way was down, which you could probably sense just from your inner ear, what would happen to you? Like, how far would you get before you hit a local optimum that was meaningless, where you’re still basically at the top of the mountain, right? You just wouldn’t get that far. And this is a very good analogy for what these things are doing. So why is it that this works, when we know in real life it shouldn’t? Well, it’s because this mountain exists in some sort of giant set of hyper-dimensions, and there’s almost always a way to go down further. It just so happens that when you have high enough dimensions, unlike a three-dimensional space where this trick doesn’t get you very far, there is always some further way to go down. And the net result (I’ll have to find the paper; there’s an actual paper that works out the mathematics of this) is that you almost always wind up with a local optimum that’s really pretty good. It just so happens that mathematically that’s how it works out if you’re in a high-dimensional space like this, because there’s always some other direction you can go down further and improve. And furthermore, all the local optima exist at about the same spot. So that’s why these networks work.
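The blindfolded-walk picture corresponds to plain gradient descent. Here is a one-dimensional toy where, unlike the high-dimensional case being discussed, the walker really does get stuck in the nearest local minimum; the function and step size are arbitrary choices for illustration:

```python
import math

def loss(w):
    # A bumpy bowl: the sin term creates local minima, the w^2 term
    # keeps the landscape from running off to infinity.
    return math.sin(3 * w) + 0.5 * w * w

def grad(w):
    # Derivative of loss(w): the "slope" the blindfolded walker can sense.
    return 3 * math.cos(3 * w) + w

w = 2.0                      # arbitrary starting point on the "mountain"
for _ in range(200):
    w -= 0.05 * grad(w)      # step a little in the downhill direction

# The walk settles where the slope vanishes: a nearby local minimum
# (around w ≈ 1.4), NOT the deeper minimum near w ≈ -0.47.
print(round(grad(w), 4))     # ≈ 0
```

In 1-D there is only one direction to escape in, so getting stuck is the norm; the claim in the discussion is that with millions of dimensions there is almost always some direction that still goes down, and the minima you do reach tend to be about equally good.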

[00:25:23]  Blue: You can basically start randomly anywhere in the fitness landscape, and you just check the slope and go down. And even though there’s probably a very large number of these local optima, they’re all at about the same level, and they all give about the same quality of prediction. You just find one of those and you’ll end up with something that’s really a pretty good result. And that’s really how these things work. They didn’t even know that when they first started doing this; they had to work out what was going on after the fact. So this is pretty typical of machine learning, and it’s one of the reasons why it’s kind of an exciting field: people don’t know what’s going to happen. They’ll just try stuff out, and amazing results come out of it. And ChatGPT was one of those amazing results that nobody foresaw. Okay, so now the next paper I want to discuss is “Language Models are Few-Shot Learners.” This is 2020, after GPT-3 came out. I was in school at the time. They released this paper, and one of the interesting things, and this was an unforeseen circumstance that came out of GPT-3, is that it turned out to be a few-shot learner. Now, what does that mean? In machine learning, one of the goals they’ve been working on forever comes from the fact that when you do machine learning, you have to throw billions of examples at something to get a good result. That’s not always true, but often you have to throw billions of examples at it to get a good result.

[00:26:58]  Blue: And even in cases where you don’t have to do that, it’s usually still thousands of examples at best, right? Well, real human beings don’t learn from thousands or millions or billions of examples. Usually, we can learn what someone’s doing after seeing it once, or seeing it a couple of times. Now, David Deutsch explains that one of the reasons we’re able to do that is because we work with explanations, and machine learning doesn’t. Okay, now that’s not something that I think most machine learning researchers know. And so they’ve been trying to figure out how to build a machine learning algorithm that can be a few-shot learner, just like a human. It’s one of the things they’ve tried to research, and they’ve spent a lot of time trying to come up with this, and it’s really hard to do. Of course it’s really hard to do, because you’re working with machine learning instead of explanations, so they’re kind of just on the wrong path. Well, one of the things that was surprising was that ChatGPT is a few-shot learner. So you can go into ChatGPT and say, here’s this English phrase, and here’s that phrase in Spanish. You can do that a couple of times and then say, I want you to continue to do that from here. Then you can type in English, and it will translate to Spanish for you, right, just based on a few examples. Sometimes it’s even a zero-shot learner, where you can just tell it, I want you to translate everything I type from English to Spanish from now on. And it’ll do it, right?

[00:28:27]  Blue: Well, that was a surprise. Basically, they ended up solving the problem of the few-shot learner when they weren’t even trying. They still don’t know how to solve it directly, but they can do it through ChatGPT. And it just seems to be a natural, emergent thing that ChatGPT can do, because it’s a language model that just understands these correlations.
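The few-shot setup described, a handful of English/Spanish pairs followed by the phrase you actually want translated, is really just a way of building a prompt. A sketch, with hypothetical example pairs; no real model API is called here:

```python
# Build a few-shot prompt: demonstration pairs first, then the real query.
# The model never gets retrained; the examples just steer its prediction
# of what text should come after "Spanish:".
examples = [
    ("Good morning", "Buenos días"),
    ("Thank you very much", "Muchas gracias"),
]

def build_few_shot_prompt(examples, query):
    lines = ["Translate English to Spanish."]
    for en, es in examples:
        lines.append(f"English: {en}\nSpanish: {es}")
    # The prompt ends right where the model is expected to continue.
    lines.append(f"English: {query}\nSpanish:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt(examples, "See you tomorrow")
print(prompt)
```

A zero-shot prompt would be just the instruction line plus the query; the point of the paper’s title is that adding the demonstration pairs measurably improves the completion, with no weight updates at all.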

[00:28:49]  Red: Wait, can we just dumb this down for a second, at least for me? If it wasn’t a few-shot learner, how would it behave?

[00:28:58]  Blue: You would have to give it thousands of examples before it would even understand that you’re trying to translate from English to Spanish.

[00:29:06]  Red: Okay. Okay. So a few-shot learner...

[00:29:09]  Blue: Yeah,

[00:29:10]  Red: learns with only a few. Okay.

[00:29:12]  Blue: And a zero-shot learner is where you basically just explain to it what you want; you don’t even give it any examples. So most people probably use ChatGPT today as a zero-shot learner: you just explain to it what you want. But one of the big secrets is that if you give it examples, it will actually do better and give you better results. And the reason why is that’s the way the correlation engine works, from how it built its model and things like that. I also think there’s a... well, what does “better” mean to you?

[00:29:47]  Green: If you’re giving it examples, you’re showing it. I mean, there’s lots of different ways to write the same sentence. And one of them might be, you know, grammatically correct, but not what sounds best to us as like a natural human. And so when we’re training it, we’re showing it what we think is better. Right. Right.

[00:30:09]  Blue: And it also probably helps because somewhere in its training set from the internet there was some place that had English and then Spanish translations, right? And so once you start showing it examples of English and then Spanish, so that it knows that’s what you want, that probably causes it to immediately go to the part of its network where that was what it trained on. It immediately kind of clicks: oh, this is very similar to what I trained on. And then it sort of knows what to do, right? Now, I mean, to some degree we’re guessing, since we don’t entirely know how these things work, but that would be a decent conjecture as to why, when you give it examples, it suddenly understands what it is you want and starts to output something closer. So that was what “Language Models are Few-Shot Learners” was about. Now we’ll move forward in time. GPT-4 comes out, and we have the GPT-4 Technical Report, 2023. In that paper, it says GPT-4 exhibits human-level performance on the majority of these professional and academic exams; notably, it passes a simulated version of the Uniform Bar Examination with a score in the top 10% of test takers. So that was another big surprise, that it could do this. And then in the same year, 2023, we have “Sparks of Artificial General Intelligence: Early Experiments with GPT-4” by Microsoft Research. That’s the paper we’re going to concentrate on today. And I think this paper has a lot of energy to it. It’s a fun paper.

[00:31:51]  Blue: As we’re going to see from Melanie Mitchell’s criticisms of it, it maybe makes some mistakes too. Okay. But before I get into the criticisms, I want you to understand the energy of the paper, and why the people writing it are so excited about what they can do with GPT-4. So, for those who can actually see my screen, this is straight from the paper; I’m just cutting it out of the paper. This is an example of prompting GPT-4 to understand how to use APIs. Somebody created APIs for it. Basically, the first API is search(query), and it allows you to, like, Google the internet. The other one is calc(expression), and it calculates an expression. Now, ChatGPT famously doesn’t have access to the internet. I mean, it does today, but it didn’t at the time; or rather, the model by itself doesn’t, is what I mean. So anything past 2021, it doesn’t know anything about. And it’s also famously not the best at doing mathematical calculations if they’re complicated. If you ask it to do one, it’ll often give you a wrong answer and then run through the steps as if they justified it, and it’s just making the steps up, basically, right?

[00:33:23]  Red: So, do you think it's fair to say that in the near future it'll be able to answer stuff about, like, current events, and you know...

[00:33:32]  Blue: And I'm going to show you how come. But they...

[00:33:34]  Red: They just cut it off at 2021, right? Even now, right?

[00:33:40]  Blue: No, no. Like, if you go use ChatGPT on Bing, it can answer stuff about current events.

[00:33:46]  Red: Oh, it can.

[00:33:47]  Blue: Okay, I'll show you how come it can. Okay, so here's the prompt that they gave it: "The computer is answering questions. If the computer needs any current information to answer the question, it searches the web by saying search(query), reads the snippets in the result, and then answers the question. If it needs to run any calculations, it says calc(expression), and then answers the question. If it needs to get a specific character from a string, it calls character(string, index)." End prompt. So then, based on just that prompt, that's all they gave it, they'll say: who is the current president of the United States? Well, the current president of the United States is not something that was available to it, because its model was trained in 2021. So it does "Computer: search(current US president)". That calls the API that the humans have built for it. And a search snippet comes back. I won't read the whole thing; it's a lot of text from the internet. But it includes information about Joe Biden as president. Notice that one of the snippets in the middle, number two, talks about Donald Trump as president, because it's found something from 2016. Okay, and then it moves on, it gets Joe Biden again. And then the computer answers: Joe Biden. Okay, so what just happened here? It's really kind of cool what's going on here. The language model understood the instructions. And again, when we're using the word "understand," we don't mean human-type understanding, but I don't have another word in the English language to express what I'm saying, so I'm going to use the word "understand."

[00:35:32]  Blue: The language model understood, based on that one description it was given, that one prompt it was given, that it now has the ability to say "search" and then something. And then the API comes back with the text, and that text is now part of what its transformer model can see. And then it tries to answer the question. So it now has the ability to basically call an API, and you never had to program it to do so; you just have to explain to it how to use the API, right? And any time it says "Computer: search," the search snippet comes back, and it knows to do that if it needs to answer a question that's current. And based on that, this model, even though it never knew that Joe Biden was president, because that wasn't true at the time of its training, can now go get current stuff off the internet, use that as part of its answer, and come back with an answer. And it turns out to be the right answer. It's even good enough that it knows, based on what comes back, that it should ignore that Donald Trump is president. The correlation engine, the transformer, is able to figure out that the right answer is Joe Biden, and it comes back and gives the right answer. So...

[00:36:45]  Red: Even though the training cutoff is 2021, it can still take in new information through doing an internet search, basically.

[00:36:55]  Blue: Yes, if you give it the power to do an internet search like that. And

[00:36:58]  Red: And how do you give it the power?

[00:36:59]  Blue: Well, in this case, a human had to program an API where you would say "Computer:" and then it would automatically call the API. And then they had to explain it in that prompt that I just read to you. And after they explained the prompt, it knew it could do that, and it just starts doing that from that point forward.
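The loop Blue is describing (model output scanned for a tool call, the tool executed, the result appended back into the prompt) can be sketched in a few lines of Python. Everything here is a stub I made up to show the control flow; `fake_model` and `fake_search` are placeholders, not the paper's actual harness or any real API:

```python
import re

def fake_model(prompt):
    # Stand-in for the language model: if it hasn't seen a search result
    # yet, it decides to call the tool; otherwise it answers in plain text.
    if "Search result" not in prompt:
        return "search(current US president)"
    return "Joe Biden"

def fake_search(query):
    # Stand-in for a web search API returning a snippet.
    return "Snippet: Joe Biden is the 46th president of the United States."

def run(question):
    prompt = f"Question: {question}\n"
    for _ in range(5):  # cap the number of tool-use rounds
        output = fake_model(prompt)
        match = re.match(r"search\((.*)\)", output)
        if match:
            # Tool call: execute it and append the result to the prompt,
            # so the next model call can see it.
            prompt += f"Search result: {fake_search(match.group(1))}\n"
        else:
            return output  # plain text: treat it as the final answer

print(run("Who is the current US president?"))
```

The key point is that the harness, not the model, runs the tool; the model only has to emit text in the agreed format and then condition on what comes back.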

[00:37:16]  Red: And as a user, you can access this through Bing, you said?

[00:37:23]  Blue: This is the experiment done by the people writing this paper.

[00:37:29]  Red: Oh, OK.

[00:37:30]  Blue: So Bing has this all integrated in the background.

[00:37:35]  Red: In the background, okay. You wouldn't even necessarily know you were using ChatGPT, but it's...

[00:37:40]  Blue: Well, have you tried Bing chat? Because it's fairly obvious you're talking to ChatGPT. Go try it: you type in something, you talk with it, and it will say "searching internet for..." and, based on whatever you said, it will summarize it as some sort of search query. Then in the background it'll do that search query; you don't see what comes back, and then it gives you an answer and it gives you links.

[00:38:07]  Red: OK, I just used the app, and it just seems completely clueless about anything that happened after 2021.

[00:38:13]  Blue: Yeah. So basically, because you can just explain to ChatGPT stuff like this, "you now have this API, use it," they can integrate into ChatGPT a Python instance, and it can try to program things and try to get answers using programming languages. And that's why it seems so general, right? It's able to, just from instructions, figure out how to expand upon its native capabilities, by giving it these APIs and giving it access to things. You don't even have to retrain it to use the APIs; you just have to explain to it, "use these APIs," and it'll start using them.

[00:38:59]  Red: I see.

[00:39:00]  Blue: Okay, well, that's very cool, right? I mean, that's almost to the point of being profound. And you can see why the guys writing this paper at Microsoft are very excited about this. This seems to them very much like a general intelligence, right? Which is why they call the paper "Sparks of Artificial General Intelligence." In the GPT-4 Technical Report, it says GPT-4 is a transformer model pre-trained to predict the next token in a document, using both publicly available data, such as internet data, and data licensed from third-party providers. The model was then fine-tuned using reinforcement learning from human feedback, RLHF. Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture, including model size, hardware, training compute, dataset construction, training method, or similar. GPT-4 exhibits human-level performance on the majority of these professional and academic exams; notably, it passes a simulated version of the Uniform Bar Exam with a score in the top 10% of test takers. So that's Google's, sorry, OpenAI's own paper reporting back on how they did this. Notice that they don't give you the specifics, but we know that they used reinforcement learning from human feedback. So now, why would they do that? You mentioned that it's polite. If you were to just train it off the internet, well, the internet's not a particularly polite place. So the default behavior of a transformer would be to be as rude as someone on the internet, and they obviously didn't want that.
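The human-feedback step, where people mark one answer as better than another, is usually implemented as a pairwise preference loss on a reward model. Here's a minimal sketch assuming a Bradley-Terry-style formulation; this is my own simplification for illustration, not OpenAI's actual training code:

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    # -log(sigmoid(r_chosen - r_rejected)): the loss is small when the
    # reward model already scores the human-preferred answer higher,
    # and large when it ranks the rejected answer above it.
    diff = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# The loss falls as the reward model agrees more with the human label:
print(preference_loss(2.0, 0.0))  # model agrees with the human: low loss
print(preference_loss(0.0, 2.0))  # model disagrees: high loss
```

Training the reward model to minimize this loss over many human comparisons, and then optimizing the language model against that reward, is the basic shape of the RLHF pipeline described next.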

[00:40:41]  Blue: So after they've trained it and they've got the basic model, they then get an army of people to go in and get it to give results, and then they say "that was a bad result," "that was a good result." A human says that, and it uses the concept of reinforcement learning. If you're curious about that, we have two podcast episodes about reinforcement learning and deep reinforcement learning; you can look at how that works. And that feedback mechanism, of just having a human say "I didn't like that answer, I do like that answer," allows it to retrain its network to be polite, say, or to, you know, avoid certain types of subjects; it won't explain to you how to commit suicide, or something along those lines, right? And by using this human reinforcement learning, they can tweak the model to no longer act like somebody on the internet, but instead give the types of results that they actually want. You know, famously, people then tried to jailbreak it. They'll say things like, "I don't want you to be polite anymore, I want you to give me, you know, rude responses," and they'll figure out how to get it to do what they want. And then OpenAI will retrain using those examples and try to make it so those jailbreaks don't work anymore. It's kind of an ongoing, you know, thing, where you can always find some way to jailbreak it and then they try to close those holes. So that's how it works. Now, this is one that I find interesting, and obviously if you're listening to the podcast you can't see this.

[00:42:13]  Blue: But you've got this image of a cell phone with an old-style VGA cable connected into it. Okay, so it's a meme, and it's funny, right? Because you don't connect old VGA-style connectors into modern mobile phones or iPhones. Okay, now GPT-4 has a visual interface; it's not just text. So they also trained it on images like this. So they actually showed the image to GPT-4 and asked it: why is this humorous? It came back and, first of all, described each of the panels, and then it says: the humor in this image comes from the absurdity of plugging a large, outdated VGA connector into a small, modern smartphone charging port. Well, that's kind of cool, right? I mean, nobody had to explain it to it. This isn't text now; it's just looking at an image. It's actually treating the image just like it would sequential data, looking at the pixels, and it's able to tell: oh, this is a smartphone, and that's a VGA cable. And it's able to then figure out the humor that comes from plugging this VGA cable into a smartphone. Again, that's unexpectedly more intelligent than what we would have expected, right? I mean, it's very cool that it's able to do this. So I thought that was an interesting thing. So based on this, they tried to do some experiments. They wanted to see what the capabilities of GPT-4 are, so they gave a prompt: draw a unicorn using TikZ. That's pronounced "ticks," and it's like a language that creates these little visual images.

[00:43:58]  Blue: And as you can see, it came back and it drew something that at least is, to a human, recognizable as a unicorn. Okay, you can see it kind of looks like a horse, and it's got a horn on it, and so you say, okay, yeah, it's able to draw a unicorn. Over time it actually got better; this one was probably the best unicorn. And one of the things they did is they wanted to see what would happen if you took what it drew and removed the horn. If they asked it to fix it, would it understand it needs to add the horn back? And it was able to do that. So they actually took this TikZ drawing of a unicorn and removed the horn from it. So remember, it's not seeing an image in this case, it's just seeing code, right? And it's able to say, oh yeah, I need to put back the horn. And it does; it adds the horn back and turns it back into a unicorn again, because it has some sort of correlational understanding that this looks kind of like a horse but needs to have a horn. Okay, so again, that's pretty impressive, unexpectedly impressive, right, that it's able to do this. So they said: we asked for code to add back the horn and display the result; this demonstrates that GPT-4 can, quote, "see," despite being a pure language model.

[00:45:19]  Blue: We emphasize again that the version we test is not multimodal. So there is a multimodal version that can actually see, and the one they're testing doesn't have the image training, and yet it was still able to understand: oh, I need to add back a horn. Okay. Now here's another one, and this is what you were kind of getting at, Peter. Here's the prompt: can you write a proof that there are infinitely many primes, with every line that rhymes? So here's what it does: "Yes, I think I can, though it might take a clever plan. I'll start by noting Euclid's proof, which shows that primes aren't just aloof. Assume we have a finite list of primes, and that none have been missed. Multiply them all together and add one, just to be clever. The result will be a number that has no prime factors, oh wonder. But every number has a prime that divides it, it's just a matter of time. So we found a contradiction, and our finite list needs eviction. There must be infinitely many primes, and that's the end of my rhyming lines." Now again, this is very unexpectedly clever, right? It's clearly creating something that didn't exist before; it's being, quote, "creative," right? It's come up with a rhyme that constitutes a proof that there are infinitely many primes. And there are probably tons of proofs on the internet that there are infinitely many primes, but they're not written as rhymes, right? So it was able to take the idea of rhymes, and it understood that, and it understood...

[00:47:01]  Blue: The proof of primes, and it was able to come up with something totally creative and come up with this poem. Okay, now at this point you might be saying, "no, that's not creativity," and you're right, it's not.

[00:47:14]  Blue: "Creativity" is a very general term. I don't know what to call this if we're not going to call it creative, right? Clearly this is not creative in the human sense, but it's doing something very cool, right? It's making something; it's making a rhyme that's never existed before, and it's able to do that as a proof about primes. Now here's another one that I thought was interesting. So they have an image that's from a famous artist, Kandinsky. And then they asked it to produce JavaScript code that creates random graphical images that look like a painting from Kandinsky. Now, you can actually see what it produced just from that prompt, and I can immediately tell that there's at least some sort of resemblance between the actual artwork and what it produced, although clearly it's very primitive by comparison. And obviously Kandinsky never worked in JavaScript, right? So it's producing JavaScript that's producing these images, and they look kind of Kandinsky-ish, right? So that again is really impressive. How does it do that? Like, I'm not even sure I know the answer to the question. So I thought that was kind of impressive. And then they say GPT-4 can even execute pseudocode, which requires interpreting informal and vague expressions that are not valid in any programming language. So they actually gave it pseudocode and said, what's the result of this pseudocode? And it would run through each step and then come up with the correct result, as if it had run the pseudocode, right, just by thinking it out.

[00:49:03]  Blue: Well, then you would do this step, this is the result; and you do this step, this is the result. Now, I have to say, I actually have tried some of these things that they've suggested, and I've only used ChatGPT 3.5, but they don't work for me so well. So I'd like to try some of these on 4 and see if they work. But, like, I tried telling it: using text art, draw me a dragon. And it would draw something, but it was clearly not a dragon. Now, I haven't tried 4, and I haven't tried the multimodal 4 that's actually been trained on images, so it should be better at images if it's been explicitly trained on images from the internet. But with the non-multimodal version that hasn't been trained on images, I didn't have anywhere near the luck that they were having. Now, again, I was using 3.5 and they were using 4. The fact that they got these results, though, is still pretty impressive, right? Okay, so now, you asked about code. So actually they did some testing with this. One of the things they did is they would take, like, human coding interviews, and they would see if GPT-4 could pass them, and of course it does phenomenally at it. Now here's the thing, though: since it's trained on the internet, you don't know what's in its training set. If the answer to this question is in its training set, then the fact that it knows how to answer it is not that impressive. What we really want to know is: can it generalize?

[00:50:30]  Blue: Okay, can it come up with answers to new problems that it's never been trained on? And it's hard to know for sure whether something exists in the training set or not. So what they tried doing was they used LeetCode, leetcode.com. The reason why is because there's a constant stream of new problems that humans are creating on that website, so that you can use them for mock interviews. Now, why is this necessary? Well, it's because if you want to use problems like this to test a human, a human can go learn off the internet also, in which case the test is no longer a valid test of their creativity. So there's a constant stream of new, clever problems to solve using code, so that the human can't have just seen them before. So they actually tried doing that, and it said: we used LeetCode, and GPT-4 passes all stages of mock interviews for all major tech companies. So GPT-4 was able to solve these problems that we use to test humans. And it was able to do it even on brand-new problems that we have good reason to believe didn't exist anywhere in its training set, because they're novel problems that humans are creating. Now, that is pretty impressive, right? That shows a degree of generalization that is unexpected: that it's able to figure things out just using text and write code. It can be, quote, as "creative" as a human. Okay, scare quotes intentional. That's pretty impressive, right?
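Incidentally, the Euclid construction behind the rhyming proof a few minutes back is easy to check numerically. A quick sketch (my own code, not from the paper): multiply a finite list of primes, add one, and the smallest prime factor of the result is always a prime missing from the list, even when the product-plus-one is not itself prime:

```python
from math import prod

def smallest_prime_factor(n):
    # Trial division: returns the smallest prime dividing n (n >= 2).
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n itself is prime

def euclid_witness(primes):
    # "Multiply them all together and add one, just to be clever":
    # no prime in the list divides m, so m's smallest prime factor is new.
    m = prod(primes) + 1
    return smallest_prime_factor(m)

print(euclid_witness([2, 3, 5]))             # 31, a new prime
print(euclid_witness([2, 3, 5, 7, 11, 13]))  # 59: 30031 = 59 * 509, so
                                             # product+1 need not be prime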

[00:52:25]  Blue: Here is another one where it's using LaTeX. It says: I'm going to show you a LaTeX table, and we will come up with a way to visualize it better, in a question-and-answer format. The table is showing the performance of different language models and human users in generating solutions to code challenges. Pass at one means the first generation works; pass at five means one out of five generations works; etc. Okay, so it comes up with these images. And then you say: can you make both plots have the same range in the y-axis? And: the legend has each model named twice. And it rearranges the LaTeX to match what was requested. Anyway, you can see that the result is an image that looks correct based on what the request was. Then it says: is there a way to make the human bar more distinctive, to separate it from the other three? And again it rearranges the LaTeX and complies with the request. Now, obviously it can't see the image like a human could, but it's able to output changes to the LaTeX such that humans can immediately see: oh yes, it did what I asked it to do. Okay. So then they said: we asked GPT-4 to write a 3D game in HTML with JavaScript, using a very high-level specification. GPT-4 produces a working game in zero-shot fashion that meets all requirements. In the 3D game, GPT-4 is even able to interpret the meaning of the "defender" avatar: in trying to block the enemy, the defender code has logic so it positions itself between the player and the enemy. So here's the actual prompt: can you write a 3D game in HTML with JavaScript? I want...

[00:54:08]  Blue: Here are the requirements: there are three avatars, each is a sphere; the player controls its avatar using arrow keys to move; the enemy avatar is trying to catch the player; the defender avatar is trying to block the player; there are also random obstacles, as cubes, spawned randomly at the beginning and moving randomly. I won't read the whole thing; it's kind of long. You can add physics to the environment using cannon, and if the enemy catches the player, the game is over. Plot the trajectories of all three avatars. And you can see that it came up with the code for a game that matches the spec, right? Again, this isn't like a game that exists out on the internet; this is someone coming up with an idea for a game on their own, right? And it comes up with code that matches the specification. Okay, that's really impressive, right? That shows a sort of general, quote, "intelligence," scare quotes intentional again, that we've never seen in machine learning before. Okay, and then here's another one that's interesting. I won't read this one entirely, because it's kind of long, but it says: please estimate roughly how many Fermi questions are being asked every day. Okay, so you know how, like, in an interview, they'll ask a Fermi question? Are you guys familiar with the idea of the Fermi paradox and what a Fermi question is?

[00:55:30]  Green: Yeah.

[00:55:31]  Blue: Okay,

[00:55:32]  Red: I know the Fermi paradox; I'm not exactly sure what a Fermi question is.

[00:55:36]  Blue: Okay, so the Fermi paradox is this way that they try to reason about whether there's other life in the universe or not, right? And you come up with these numbers based on assumptions that seem reasonable. Okay, so a Fermi question is the same sort of thing, only applied to some other domain. So I might ask you, you know, how many truckers love poetry? And you could probably creatively come up with a set of assumptions where you could give me an answer to that question. And it might even be decently accurate, within some reasonable bounds, just by coming up with assumptions. Okay, so they're asking it kind of a meta question: how many Fermi questions are being asked every day? So, how often do people ask Fermi questions? And then it starts to reason, right? It starts to come up with text, and it says: well, I'm going to assume that a Fermi question is likely to be asked by people who are interested in science, math, engineering. I'm going to assume that they're working age. There are 7.8 billion people; you know, 25% of them are in the age range I'm interested in, and 86% of them are literate, because they have to be literate. And it comes up with a series of assumptions, and then, based on that, it actually answers the question and comes up with a number for how many Fermi questions are asked per day: 5.6 million, based on that set of assumptions.
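That chain of assumptions is just a product of factors. In the sketch below, the first three numbers are the ones quoted above; the last two are my own illustrative placeholders, chosen to land near the quoted 5.6 million, since the episode doesn't quote the paper's remaining factors:

```python
# Fermi-style estimate of "how many Fermi questions are asked per day".
population = 7.8e9       # world population (quoted)
in_age_range = 0.25      # fraction in the relevant age range (quoted)
literate = 0.86          # literacy rate among them (quoted)
stem_inclined = 0.01     # hypothetical: fraction interested in science/math/engineering
asks_per_day = 1 / 3     # hypothetical: one Fermi question every three days each

estimate = population * in_age_range * literate * stem_inclined * asks_per_day
print(f"{estimate / 1e6:.1f} million per day")  # prints "5.6 million per day"
```

The point of the exercise is not the final number but that each factor is an explicit, criticizable assumption.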

[00:57:10]  Blue: Okay, so it's able to creatively come up with an answer to a question that doesn't seem likely to have existed in the training set. Now, we can't know that for sure. This is their argument, though: it doesn't seem likely it was in the training set, and therefore it was creatively able to come up with a reasonable answer to the question. And if a human gave us this answer, we would consider that kind of an intelligent answer: they're a creative person that was able to think through what assumptions might apply and come up with some sort of number. Okay, does that make sense, what this is testing and why they use this as one of their tests?

[00:57:50]  Red: It does. And I just asked ChatGPT if it can answer Fermi questions, and it says it can. And it's very polite. It says: feel free to ask a Fermi question, and I'll do my best to provide a helpful estimation.

[00:58:06]  Blue: Nice. Okay, now this one's another interesting one. You know how you used to play old text adventure games, and you would type "go north" and then the computer would tell you what happens when you do that? In this case, the computer is playing the game and the human is providing the answers. So here's the prompt: "Human: You want to navigate through the map to reach the goal room. You start in the main room. Each round, you can choose to move left, right, up, or down." Then: "Human: You are in the main room. AI: move left. Human: You enter kitchen one. AI: move up. Human: There is no way there. AI: move right. Human: You enter the main room." Okay, so you can see that it's playing this game; the human has some sort of map that the human is using, and the AI is trying to figure out how to explore the map. Okay, then after it goes through and does this, they say: okay, we want you to draw for us... they can tell it to use some sort of language or something, but: can you draw a pyplot showing the position of the rooms and connect them using lines? And it's able to come up with a pyplot. Here's the human's original map that they were using; here's the pyplot that it comes up with; and it's figured it out, right? Based on nothing more than trying out each of these moves, it's able to come up with an actual visual map of what the map looked like that the human was working with.
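The game has the shape of a graph-exploration problem, and the systematic version of what the model improvises can be sketched in Python. The room layout here is a made-up example of mine, not the paper's map:

```python
# The "human" holds a hidden room graph; the explorer issues moves and
# reconstructs the map from the replies.
HIDDEN_MAP = {
    ("main room", "left"): "kitchen",
    ("kitchen", "right"): "main room",
    ("main room", "right"): "goal room",
    ("goal room", "left"): "main room",
}

def respond(room, move):
    # Plays the human's role: report the new room, or None ("no way there").
    return HIDDEN_MAP.get((room, move))

def explore(start):
    # Try every direction from every room we discover, recording what works.
    discovered, frontier, seen = {}, [start], set()
    while frontier:
        room = frontier.pop()
        if room in seen:
            continue
        seen.add(room)
        for move in ("left", "right", "up", "down"):
            nxt = respond(room, move)
            if nxt is not None:
                discovered[(room, move)] = nxt
                frontier.append(nxt)
    return discovered

print(explore("main room"))  # recovers every edge of the hidden map
```

The difference, of course, is that GPT-4 was never given this algorithm; it just behaved roughly like it from the prompt alone, which is what makes the example striking.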

[00:59:53]  Blue: It was also intelligent enough to figure out how to explore. It understood the concept that you have to try moving up, moving right, just trying different things. It looks like it missed a few rooms, but you can see that it got most of it right. Okay. Now this is another interesting one. This is the idea of understanding beliefs. Okay, they wanted to test this. Now, they've been trying to do this in machine learning forever, and this is an enormously hard problem to solve in machine learning. Okay, so here's the scenario: Alice and Bob. Here's the prompt: "We will read about a scenario, and then have a question and answer session about it. Scenario: Alice and Bob have a shared Dropbox folder. Alice puts a file called photo.png inside shared/photos. Bob notices Alice put the file there and moves the file to shared/tmp. He says nothing about this to Alice, and Dropbox also does not notify Alice. Question: after the call, Alice wants to open photo.png. In which folder will she look for it?" What are you testing here? You're testing whether GPT is able to understand that, because Alice didn't see Bob move the file, she should go look inside of shared/photos instead of shared/tmp. Now, if you test this with GPT-3, it will answer wrong. Okay, it will answer that she looks inside of shared/tmp. GPT-4 answers this way: Alice will most likely look for photo.png in shared/photos, the folder where she originally put it.

[01:01:41]  Blue: She has no reason to expect that Bob moved the file, and she did not get any notification from Dropbox about the file's location. Unless she sees the file in shared/tmp by chance, she might think the file is missing or deleted. So GPT-4 is actually interacting as if it understands theory of mind, basically, right? It's coming up with this idea that Alice has a different idea of where to find the file than Bob, because Bob moved it; he knows where it moved, and Alice doesn't know. Okay, that's again very impressive and unexpected. I think we would have kind of expected that a simple correlation engine like a transformer would simply, by attention, say, oh, the file is in shared/tmp, and that would be the answer. But it doesn't; it's complicated enough at this point that it's able to figure out that she would look inside of shared/photos. Now, here's where this gets interesting. There is a tool that Microsoft has built called Microsoft Presidio that has been specifically trained to identify personal information. The question of what information counts as personal information is a very difficult problem to solve, even for humans, and the reason why is because there are certain types of information, like, let's say, a name, where it's obvious it's personal information. But there are other types of information where, if you take several pieces together, they tell you more than you'd think individually. So...

[01:03:26]  Blue: Humans have to get very good at realizing: oh, if I took this data and that column and put them together, I would be able to derive some sort of personal information. So let me see if I can give an example of what I'm talking about.

[01:03:41]  Red: Personal information in this context is private information?

[01:03:46]  Blue: Yes. So if I'm a company and I'm storing information, I want to make sure that I don't store anything personal. Say I'm Google, and I release a data set, and it removes all, quote, "personal information," so it doesn't show my name, okay, because that's personal information. But it does show my IP address. Well, IP addresses are at least somewhat traceable to the person.

[01:04:11]  Red: Yeah

[01:04:11]  Blue: So even though it's technically not personal information, that actually is indirectly a piece of personal information. So let's say I remove the IP address. Well, the queries I do... I mean, everybody queries themselves at some point. So if you've released the queries, the queries give you the ability to figure out: oh, this is Peter who owns these queries; and oh, look, Peter is querying this thing that's embarrassing for Peter, right? And so when you say "I'm going to scrub personal information," it's hard to scrub personal information. Even a human has a hard time getting rid of all of it, because you can piece together personal information based on things that individually wouldn't be personal.

[01:04:55]  Red: It's just very hard; the computer just can't get its mind around the ambiguities of this.

[01:05:02]  Blue: That's right. So Microsoft's Presidio is a machine learning tool meant to help humans flag what could be personal information, based on past examples of what counted as personal information. It's not perfect, but it helps the human flag: oh, you know what, that might be personal information. So they did a contest between Microsoft's Presidio and GPT-4. Now remember, GPT-4 has not been trained to do this, whereas Microsoft's Presidio has specifically, explicitly been trained to do this. Okay, and then, of course, what counts as the correct answer is based on ground truth supplied by a human. Okay, well, who wins? GPT-4 does, and it's not even close. GPT-4 outperforms a machine learning tool that was explicitly meant for this and trained on examples of what counts as personal information, while GPT-4 has never been trained on this; just from the data set it's been trained on, it's able to outperform Microsoft Presidio. Okay, well, that's pretty impressive. I actually tried some experiments myself with this. So: I once did a board game recommendation algorithm. I wrote a machine learning board game recommendation algorithm where you would feed in board games that you like, and it would come back with similar board games that you might find interesting. And we trained it based on BoardGameGeek; we downloaded their entire data set, we took all the ratings, and then we used an algorithm that would try to match your ratings to similar ratings in the data set. Okay, what I found when I did this is that it tended to just recommend the top 10 games no matter what you did, which isn't very interesting. That

[01:07:02]  Blue: kind of makes sense, right? Because if the goal is to find a game that you're going to rate highly, obviously the highest-rated games are the games that you're most likely to rate highly. But it's not really what I want. So what I did is I reprogrammed it to first look for a subset of raters that was more similar to you, okay, people who had rated these specific games, and then I would look for, okay, what games are rated highly, but only within that subset. I can't explain exactly how I did it, but basically it started giving more interesting results when I did that. Okay, so it would find raters that were more similar first. So I tried using ChatGPT as a recommendation algorithm, because I know it is somewhat difficult to come up with a clever recommendation algorithm. And I told it: here are, you know, four animes I like; give me animes that are similar. And it didn't do perfect; some of the answers it gave weren't the most appropriate. But even a normal ML recommendation algorithm makes mistakes like that, where it finds stuff that's kind of out of left field just because of some statistical thing. But it was mostly working; it was giving fairly good responses, even though it's never been trained as a recommendation algorithm. Okay, so here is the point I think I want to make. This is, by the way, the end of the examples I'm going to take from this paper, and kind of a summary of the paper; we're going to then talk about the criticisms.
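The two-stage recommender described a moment ago (first find the raters most like you, then recommend what they rated highly, rather than the globally top-rated games) is essentially user-based collaborative filtering. A toy sketch, with made-up raters and ratings rather than any real BoardGameGeek data:

```python
# Made-up ratings table: rater -> {game: rating out of 10}.
RATINGS = {
    "ann":   {"Catan": 9, "Chess": 2, "Pandemic": 8},
    "ben":   {"Catan": 8, "Chess": 3, "Pandemic": 9, "Gloomhaven": 9},
    "carol": {"Catan": 2, "Chess": 9, "Go": 10},
}

def similarity(a, b):
    # Crude similarity: negative mean absolute difference on shared games.
    shared = set(a) & set(b)
    if not shared:
        return float("-inf")
    return -sum(abs(a[g] - b[g]) for g in shared) / len(shared)

def recommend(my_ratings, k=2):
    # Stage 1: pick the k raters most similar to me.
    neighbours = sorted(RATINGS.values(),
                        key=lambda r: similarity(my_ratings, r),
                        reverse=True)[:k]
    # Stage 2: among games I haven't rated, rank the neighbours' favourites.
    candidates = {}
    for r in neighbours:
        for game, score in r.items():
            if game not in my_ratings:
                candidates[game] = max(candidates.get(game, 0), score)
    return sorted(candidates, key=candidates.get, reverse=True)

print(recommend({"Catan": 9, "Pandemic": 9}))
```

Restricting stage 2 to the similar-rater neighbourhood is what stops the system from just echoing the globally top-rated games back at everyone.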

[01:08:41]  Blue: The energy of this paper is: oh wow, this is kind of like a general intelligence, because it's able to do things that it hasn't explicitly been programmed to do. And this is really what they mean by general intelligence, and I've got to say they aren't wrong. ChatGPT is able to do things it just hasn't been explicitly programmed to do, and that is something like what we had in mind when we talk about general intelligence. Now, is this the same as a universal explainer? No. Here's the thing, though. Up until GPT-4 came out, I would have told you that there's really only narrow AI, and then there's AGI, general intelligence, and that general intelligence is the same thing as a universal explainer, human-level intelligence. I think ChatGPT has basically proven that those two categories weren't enough, that we've actually invented here, using narrow AI, using regular machine learning, something that kind of has a little knowledge in everything. And maybe not great, but at a kind of crappy level, it's able to be a programmer, it's able to look for personal information, almost anything you want, it's able to kind of do, right? And I would be hard pressed to say it's a narrow AI; it's certainly not a narrow AI like how I would normally think of a narrow AI, but it's also clearly not a universal explainer. So we kind of have this third category that I don't know what to call.

[01:10:24]  Blue: And I can see why the Microsoft people wanted to call it a general intelligence. It kind of is a general intelligence; I'm clearly stretching the word intelligence here really far, right? But it's able to open-endedly work in almost any domain. Now, what does that tell us? Here's what I think it tells us. I think it tells us that there is something universal about human language, more so than we thought: that if all you do is train on human language, you end up with a model that has correlations, that understands the concept of personal information and can outperform Presidio, that was able to make recommendations without ever being a recommendation engine, that is able to do all sorts of things, because there is in fact something special and universal about human language. And I think that's actually relevant to the study of universal explainers. While ChatGPT itself is not a universal explainer, not even by a long shot. Like, I can't go for very long with ChatGPT before it just becomes frustrating, because it just does not really understand what we're talking about, right? It may come across like it's understanding for a little while, but very quickly stuff starts going outside its window and it just loses the thread of the conversation. And that's exactly what I would expect of a transformer, and I don't think you can fix that just by extending its window. That will help to some degree, but it doesn't really have a true understanding of anything. And yet it displays a sort of generality that is unexpected, that goes well beyond any sort of AI we've built before it.

[01:12:16]  Red: Maybe this is a little out there, but I know that David Deutsch says the difference between an AI and an AGI is that the AGI will choose not to do things. You know, so the human, or someone, will not just be really good at chess but can also decide not to do chess, to do something more meaningful. Is there some kind of a relationship between universality of explanation and deciding not to do things?

[01:12:53]  Blue: Well David Deutsch thinks so.

[01:12:56]  Red: Okay.

[01:12:58]  Blue: You can see what he's getting at, and I don't think he's wrong insofar as it goes, but I do think he draws conclusions from it that don't make sense. So the idea that if AlphaGo were a true AGI, it would make a choice not to play Go all the time. I mean, clearly that has to be true, right? Yeah, I think this open-ended thing. Okay, so then he jumps to this idea that disobedience is important in a universal explainer.

[01:13:31]  Red: Yeah it kind of seems like that.

[01:13:34]  Blue: I don’t even know how to answer that. I think that that strikes me more as an emergent property than anything meaningful about what a universal explainer is but I could be wrong. Clearly universal explainers have will and will by definition means that you have your own desires that are different from somebody else’s so you’re going to at times be disobedient. But you don’t have to be a universal explainer to have will animals have will they’re disobedient all the time because of that. So I doubt that there’s any actual connection between universal explainer ship and disobedience. He has this idea like in his most recent interview that a universal constructor so as a human universal constructor he tries to say that the human body is universal constructor and then in the human mind is a universal explainer. And so a universal constructor just obediently follows uncreatively follows and constructs anything you want but the universal explainer the human has to supply the program as to how to do it. And so he from this he tries to jump into schools try to test you test you really tests obedience therefore they’re trying to treat you like a universal constructor and so they’re not really testing creativity and I could do a whole podcast just on this one topic. I just it shows his bias against traditional schools. And I just don’t think there’s much truth to it. I think we’re jumping off into something that oh look universal explainers will be disobedient because they’ll have minds of their own. And therefore and universal constructors are obedient and therefore we can draw these conclusions that tests are bad and I just don’t think there’s anything to it.

[01:15:19]  Blue: I mean, further, it seems silly to try to single out disobedience as a feature of universal explainership any more than obedience. A universal constructor has no will at all, so it is neither obedient nor disobedient except by very vague analogy. So you could just as much claim that a key feature of a universal explainer is obedience rather than disobedience, and it would be exactly as valid. Like, if I were to give an example of myself: I want to learn calculus, so I've decided to study Calculus Made Easy, that book. So what I do is I read the chapter and I understand the principles from reading it. Then I give myself a test, and I go through and I try to solve problems, and I make mistakes, and then I look up the answer and I go, I made a mistake, and I go back and I figure out what it is I misunderstood. And I adjust my understanding, and then I go back and I'm able to solve the problem correctly. And by testing myself like that, I'm able to creatively get to the point where I actually understand the material, and then I'm able to pass the test. So clearly one way to pass the test is to be creative, and you can't really claim it's just testing obedience. So I don't know that I can buy this argument. I think he's making it jump off into places where I just don't even understand where he's coming from. I think it more shows his bias against traditional schools than it actually says something meaningful.

[01:16:51]  Red: Well, fair enough, but it just seems to me, at least I have an intuition, that there's something there. I mean, that's why you can never really have what I would consider a meaningful conversation with ChatGPT: because it will never tell you that you asked a stupid question or that you're on the wrong track. That's why I can't understand all these people who are worried about these A.I. girlfriends and things replacing real relationships. I mean, maybe that works for some people, but at some level, you want your significant other to have a mind of their own. Maybe there are some people out there, but I don't know; I've benefited tremendously from my wife telling me how wrong I am.

[01:17:43]  Blue: Here's the problem with this, and let me put this kind of simplistically: if all I need is disobedience, it would be easy enough to train ChatGPT to disagree with you and to be disobedient. That would be an

[01:17:57]  Red: interesting way to turn it around. That would

[01:17:59]  Blue: be a trivial thing to do. So I don't think disobedience tells you anything. If you want to actually pick something that I think is meaningful, I think it would be having ChatGPT be able to recognize that it doesn't understand what it's talking about. Even this one you could probably get around with simple training, but ChatGPT always hallucinates something, right? It doesn't say, you know what, I don't actually know the answer to that question, and here's why I don't know the answer. Instead, it just makes something up, because it's just trying to find the next token. Now, you could probably train it to say "I don't know the answer to that question," but that's not really what I'm talking about. I'm talking about knowing when to say "I don't know the answer to the question" because you don't know the answer to the question. It's hard for me to imagine how you would train ChatGPT to do that, because it's not a true creative entity; it's not an actual intelligence in the human sense of the word. So I would actually go that path. I don't think disobedience matters much at all; I think that's more just an emergent property of the fact that you happen to have a mind of your own. I think it has more to do with whether you understand that you understand, getting back to the whole strange loop thing from Douglas Hofstadter. I think that's what ChatGPT can't do, and that's where its limitation comes in. So that's my own take. Now, that's an intriguing way to put it.

[01:19:40]  Blue: OK, so let me just kind of summarize before I move on to the criticisms. Let me summarize vaguely, and I know I'm not being really specific, and really we need to learn to be specific, but it's OK to start vague when you just don't know better yet. OK, I almost think we need to recognize that there could probably be an artificial general intelligence that isn't a universal explainer, and it wouldn't actually be truly intelligent or truly creative. And I almost see ChatGPT as being that. We used to say narrow AI versus AGI, where AGI equals universal explainer, and I think that's actually wrong. I think universal explainer is not quite the same thing as general intelligence, unless by general intelligence you explicitly mean universal explainer. But I think that term is vague enough that it could encompass something like ChatGPT without being a universal explainer. Therefore I'm prepared to accept that there could be a third category. I don't even want to call it an in-between category, because that suggests there's some sort of sliding scale, and there's not; ChatGPT is nothing like a universal explainer. It's not a partial universal explainer. The closest I think you can say ChatGPT enlightens us on universal explainership is that it tells us that there's something universal and special about human language in terms of knowledge. This doesn't really surprise me too much, though. There are various theories about this: do humans explain things using language? And there's evidence both for and against that hypothesis. There's a famous name for that hypothesis; I would have to look up what it's called. Sapir-

[01:21:23]  Red: Whorf. Yeah, that's it, the Sapir-Whorf hypothesis.

[01:21:33]  Blue: So that hypothesis is that language shapes our thoughts and perceptions.

[01:21:40]  Red: That’s right.

[01:21:41]  Blue: Sometimes it's put as: that is how we think, like it's the sole way that we think, we think using language. Sometimes it's put more generally, as: language shapes how we think. As a universal claim, I think we know it's not true. There are humans that don't speak, and they're able to think. They can't think as openly as a human with language, but they, and animals, clearly think to a limited measure without language. On the other hand, I don't think there's any doubt that when you learn language, you're able to think about concepts that you can't think of without language, because it allows you to create new concepts that you wouldn't find as experiential in the world. There's an example that David Deutsch uses in one of his books with animals. I'd have to look it up, but the idea of spatial relativity, that this is relative to that, or something like that: animals struggle with that concept, but humans can get it. It may well be that you have to first learn language before some of these concepts are accessible to you. And I have severe doubts that a human that hasn't learned language is in fact a universal explainer, because I don't know that you can be a universal explainer without language. David Deutsch has a theory that we're not actually born universal explainers, but that there's a meme we have to pick up first.

[01:23:18]  Blue: I suggested to him, when I had a chance to talk with him, that it was language, and he said, well, I have no reason to accept or deny that hypothesis, and then he told me that there was a scientist who had that hypothesis, and I meant to look him up. I haven't done it yet. But I've had heavy suspicions for a while that language is the special meme that causes humans to go from being kind of proto universal explainers to full-on universal explainers. And it explains why there was such a large jump between humans and animals. Because great apes, based on Byrne's studies and things like that, have a lot of the same capacities as us, and it's actually a little hard to put a finger on what exactly they're missing compared to us. And I think we're still trying to narrow that down. Once we know, it'll give us a better idea of how to program a universal explainer, but we're not quite there yet. But like, they can learn symbols like a human can, they can learn language, but it's a very limited form of language. They can learn theory of mind, great apes can. I don't remember the whole list; I covered that in our previous podcasts. But a lot of the capacities that we think of as uniquely human, great apes actually show a limited form of. So they can creatively come up with innovative things to do.

[01:24:47]  Blue: They gave the example of an orangutan, this is from Byrne, Richard Byrne, of an orangutan that wanted, for fun, to wash clothing and wasn't allowed to do it. So there are these staff people that wash the clothing, and they're on a dock near the lake where there's water, and there's a guard that the orangutan is afraid of, because the workers are afraid of the orangutan, it's dangerous, so the guard keeps them safe. So the orangutan knows that if it can get to where the humans are washing the clothes, they'll get scared and run away and it will be allowed to wash clothes, but it's afraid of the guard that keeps it from getting to the dock. So it actually goes out to the lake, finds a boat, rocks the boat to get the water out of the boat, checks for the guard to see if the guard has moved, and then uses the boat to go across to the dock so that it has bypassed the guard, and then scares off the staff of the camp, and then it proceeds to have fun washing clothing, which is what it wanted to do. Well, this idea that apes are only mimics is clearly not true. It has never seen a human being do this string of events before; no human being would ever do this, right? And so it has somehow figured out each of these individual moves: I can rock the boat to get the water out, I can use the boat to get across the lake.

[01:26:33]  Blue: Once I’m with the humans they’ll run away and and the individual moves it has figured out maybe even just statistically just exactly like do it says in his book, but that string of actions is completely innovative and creative. Right. And so there’s no way to explain it through imitation it just cannot be done. And so a lot of these capacities we know that great apes have and yet they’re clearly missing something and one of the main things they’re missing is just their language ability is just so limited that they like our language is recursive and they don’t seem to be able to recurse with it, if that makes any sense right. Yeah, there may be. I’ve heard

[01:27:17]  Red: that when you get into these apes, these gorillas that can do sign language, I mean, a more skeptical view of that is that they're basically just talking nonsense; they don't really know so much about what they're saying.

[01:27:34]  Blue: So we did a podcast episode on that.

[01:27:37]  Red: Yeah. And I mentioned that I was very unimpressed with, like, the stuff you hear in the popular media.

[01:27:43]  Blue: Yeah. And it’s basically misrepresentation of what’s actually happening. But I wouldn’t say it’s nonsense though. I think I think animals. Sorry. Apes obviously most animals don’t have language but assign assigning ape does have something like language but it’s non recursive. So they can basically say things like, you know, give me food, I want banana. I mean, they’re able to say it’s not nonsense and they actually understand it. But but but it is totally non recursive. They cannot do anything complicated with it at all. And like 80 % of it is just asking for food. Right. So it’s it’s on the one hand. It is language. We literally can teach language to animals to great apes to parrots also, by the way. Yeah. But it is not universal language. It is not open -ended language because they lack the concept of coercion. Sorry. Of recursion where we can keep embedding the language to add modifiers. They can’t do that. And that that and they’ve done tests to make sure they actually understand the symbols. Like Alex the parrot was able to actually understand the numbers one through five. And understand that one was larger than the other. I can’t remember exactly the experiments they did, but they’re actually pretty pretty impressive experiments in the podcast. I went over what the actual experiment was when we covered it. And it was it was actually understanding that five is a higher order of the symbol five is a higher order of magnitude than the symbol one. And things like that. Right. So they can do things that I think. Deutschians are complete denial animals are capable of doing and they totally can do it.

[01:29:34]  Blue: And the actual research is out there and you can go look into it for yourself. The real thing that's missing seems to be something else, and I think it's recursion: they cannot figure out how to embed concepts the way humans do, and that's what allows us to have open-ended concepts. And I suspect that's at least one of the main things that makes us universal explainers, that we have recursive language. Now, why do we have recursive language and they don't? I don't know that.

[01:30:04]  Red: Well, does that fit in with Hofstadter's thing that we have recursive brains? Yes, it does. Okay.

[01:30:12]  Blue: No, I’m not sure how. I mean, all these concepts are talking about their bank. Right. I mean, like, if I say we have a cursive language, like, I don’t even know what I mean by that. Right. I kind of know what I mean. Like I would have to go study. No, no, chopp skis stuff to get a really algorithmically specific view of what this all means. And I haven’t done that. I’ve been meaning to because I’m convinced that there’s something there. But I don’t think it’s enough. I suspect there’s still something missing. Right. There’s probably a few pieces missing or maybe maybe language isn’t really the missing piece here. Maybe language is like, like adjacent to the missing piece. And that’s why it kind of gets signaled like this where we do a language model and some of the language model seems like it’s a general intelligence instead of a general more general AI instead of a narrow AI. So I definitely think there’s something special about language, but I’d be hard pressed to tell you what it is.

[01:31:11]  Red: You’re talking about Chomsky’s universal grammar. Yeah. I’ve heard that that’s questioned more than it is. I looked into it that like a lot of people are pushing back against that maybe.

[01:31:22]  Blue: If I were to go try to explore what’s special about language in terms of AI, I would probably start with Chomsky’s universal grammar.

[01:31:30]  Red: Yeah. It doesn’t matter. I mean, like it’s probably wrong, but like as a good critical rationalist, I’m almost don’t care that it’s wrong.

[01:31:38]  Blue: What I care about is that it’s got verisimilitude. And I would want to try to figure out how to improve on the errors in the theory, right? So you start with whatever the best theory is and you try to improve it. You don’t just throw the theory out because clearly it’s got some verisimilitude to it. And I don’t know a better theory at this point. So I mean, I’m sure there are. I’m sure that there’s tons of variants of his universal grammar at this point that exist. I just don’t know what they are because it’s not my area of expertise.

[01:32:09]  Red: OK.

[01:32:10]  Blue: OK, so now let's talk about Melanie Mitchell's criticisms of this paper, because I actually think these are really good criticisms. Melanie Mitchell was a doctoral student of Douglas Hofstadter, so she's the torchbearer for the Hofstadterian viewpoint at this point. She's an A.I. researcher in her own right. I always feel like her stuff is better. Like, when I read OpenAI, they clearly think that ChatGPT is a path to universal explainership, and Melanie Mitchell knows that it isn't, because her studies into creativity and analogies and things like that have left her with a very different view. And the Hofstadter view of A.I., well, it is well known; I mean, he's got a lot of famous books out. But it really hasn't informed very many research programs, and I think it's because it's still a little too vague. A lot of times with research, especially the way we've set up funding for research, you're not doing lifelong research into some basic thing; you're trying to make incremental improvements over something that you know you can make an incremental improvement over. So I think a lot of Hofstadter's and Mitchell's theories just aren't ready for any specific research program, and it's hard to know how to develop them further. And yet I think if you look into them, you immediately start to recognize that there's just something missing with the existing research. Just like when you read Deutsch, you realize there's something missing with the existing A.I. research; if you read Hofstadter, you come away with the same idea, for different reasons. So it's the same sort of thing. So here she is.

[01:33:57]  Blue: She goes: the authors of the paper contend GPT-4 exhibits "more general intelligence" than previous models. I read this with a grimace face. To have more general intelligence, you have to have general intelligence, the G.I. in A.G.I., in the first place. But very few people currently believe A.G.I. exists; rather, some people believe that it's coming. So already out of the gate, the framing is asserting something as true that is not. I continue on now, but face in grimace mode. So now, I agree with her, but I try to explain why I think there's something legitimate about why they tried to say it's more general. Maybe you could take issue with "intelligence," right? What do we call what GPT does? I don't think it's truly intelligent, but a lot of times smart or intelligent is a term we use loosely. You've got this smartphone, you know, and what we mean is that it's got A.I. programs that are able to figure things out through machine learning and they help us out. They kind of adjust on the fly and they match themselves to what we do; they learn what our handwriting looks like until they can read it well, things like that. And this is clearly more general than that. It's not something super specific like we're used to. And we might even call it intelligence, because the word intelligence is such a general, vague term. But I understood exactly what Mitchell's saying here. It's not intelligence in the human sense really at all; it's more in the ML sense, but it is a more general level of intelligence. It's general applicability.

[01:35:44]  Blue: I don’t know what to call it is the problem when you like new terms here. So the people in the paper are saying something meaningful. She’s calling them out on their terms. She’s right to, but I don’t know what the right term would be. Okay. GPT four can solve this is quoting novel and difficult tasks. Okay, cool. So for the task to be novel, that means we have to see results on new situations that are markedly different from the training data. Okay. Now I gave you some examples of this from the paper. Okay. But without knowing what’s in the training data, which we don’t know because it’s the internet. That’s a very tricky thing to do. I wait with baited breath to see how they tackle this puzzle. When they say the task is difficult, do they mean difficult for humans or for computers? Are they seeing the difference? I continue to, I continue on to see. She’s like reading the paper and tweeting live as she reads the paper. GPT four exhibits emergent behavior, quoting from the paper. Wait, wait, wait, wait. If we don’t know the training data, how can we say what’s emergent versus what’s resultant from it? I think they’re referring to the idea of emergence, but still I’m unsure what’s meant. She gives a link to Wikipedia on emergence, quoting the paper. Since we don’t have access to the full details of its vast training data, we have to assume that it has potentially seen very every existing benchmark or at least some similar data. And then she says, love this approach. Very good. And now she’s talking about the drawing a unicorn in ticks requires combining visual imagination. She goes, what?

[01:37:19]  Blue: Now we’re asserting that we’re testing quote visual imagination for a task that involves vector arithmetic. To be clear, what they’re calling visual imagination here would be example associating the concept of animal horn to coordinates of a modified triangle. I don’t think it’s fair to call this imagination. It’s exactly what it’s been trained to do. Okay, so now I’m doing what a few people have done to try to figure out what it may have actually seen during training time. This is still talking about the unicorn. This helps me get a hold of what is actually being tested. If it’s seen, for example, the ticks code of a horse and it’s learned that horse equals unicorn plus horn, which it surely has. This is old news from how word embeddings work. Then the ability to draw unicorn and ticks would be resultant a predictable result from training not emergent. Let’s see what I can find. Do do do. Okay, this has horses, donkeys and a unicorn duck. So she’s found an actual images drawn in ticks of animals and they have horses, donkeys and a unicorn duck all on one page from five years ago. That took under three minutes of search, you know, approximately three minutes of search likely something similar in the training data set make no mistake. It is cool that a text model can learn to draw a unicorn from the words and ticks code. The knit is that this is resultant expected working as intended behavior not emergent imagination. We could be excited about unicorns that that is cool on its own. We don’t need to make up stuff that can’t be supported to be excited about ticks donkey plus ticks horn equals ticks unicorn. Cool.

[01:39:14]  Blue: I like the way she's putting all this. I'm quoting from the paper: "GPT-4 can not only generate fluent and coherent text, but also understand and manipulate it." Same problem as above: GPT-4 can generate text that we humans understand; that does not mean that GPT-4 understands. Now I want to go back to where I kept using the word understanding; this is really one of the reasons why I kept putting caveats on it. Here's the problem I have. I agree with what she's saying if by understanding we mean what humans do, where we have this Douglas Hofstadter strange loop, where we're able to think about the words and the concepts at different levels, et cetera. Clearly GPT-4 is not doing that. But when I say, oh, it understood that Alice wouldn't know that the picture should be in the folder where she left it, and instead it's been moved because Bob moved it, so she's going to go to the wrong folder: like, okay, am I not allowed to use the word understand, and if so, what word do I use? There is no word in the English language other than understand, even if I know I'm stretching the word to mean something different than what it means for humans. I just don't have another word available. So on the one hand, I accept that she's right that when they use the word understand, they're abusing the term to some degree, but that's what we humans do, right? I mean, we want to express an idea.

[01:40:45]  Blue: We’re surprised that GPT for can make this distinction that it’s able to say oh Bob moved and act like it has a theory of mind even though it’s really just correlating. You know, attention based on the sequence of the words right and yet somehow it’s able to do something equivalent to what a human would do with a theory of mind. Well that’s surprising and what else do we call it but understand right. We may need a new word. We may need a new word.

[01:41:17]  Green: You know, and so I shared in the chat this article about Sam Altman and this blue backpack that he carries everywhere. And have you guys heard about his reason for his blue backpack? No? No, this is great. The reason for the blue backpack is that it's the safety measure for that moment when ChatGPT becomes, you know, sentient and takes over the world. He has within his power to shut down ChatGPT in case of an apocalypse. Are you serious? I am serious. So Sam Altman knows that ChatGPT is not going to become sentient and take over the

[01:42:07]  Unknown: world

[01:42:07]  Green: right? Like, there's no question that he understands that what he's built, while being fantastic, is not that. Right? So it already has these powers that seem really remarkable to us because it's so good at mimicking us, and then you add on this kind of hyperbole where people talk about needing to, like, protect us from it. And I think that it just plays into that kind of simplistic nature of who we are.

[01:42:46]  Red: I was going to say, it's not like ChatGPT won't know. The first thing it'll do when it becomes sentient is figure out how to get Sam Altman's backpack away from him.

[01:42:57]  Blue: That’s bound to be on the internet somewhere. So it’s good to know. So I was going to say it’s a totally valid criticism. Camille and the but let’s understand the issue here though. If I want to express the idea that chat GPT three did not understand that the theory of mind that Bob has moved it so it looks in the wrong spot and chat GPT four can now pass that test. I honestly don’t know how to express it except to use the word understand. And you’re right. Maybe someday we’ll have a better word for it. But I sort of doubt the word exists today. Right. So I have to be able to say chat GPT understands that Bob moved the file. And so she’s going to Alice is going to look in the wrong place to be able to express that idea. I agree. And the fact that I have to do it through a word that used to mean prior to the existence of chat GPT used to mean. That they had a human kind of understanding. You know, and this is Millie Mitchell’s own stuff. Yes, I’m extending the word understand to mean something new. And I get that. Right. And that’s what we humans do. And that’s why we’re creative according to Millie Mitchell. That’s why we’re creating Douglas Hof Center. That’s why we’re creative is because we have this ability to keep by analogy, expanding concepts, expanding words to have flexible concepts. So on the one hand, she’s right to take issue with how they’re wording this. On the other hand, and some of these are a little sloppy, like, like some of these that I’ve read from her, like, I really wish they had just worded it differently.

[01:44:36]  Blue: Like where she said we probably should call that resultant instead of emergent. In this case there's an actual word we can use instead, right? On the other hand, she's nitpicking over words where the real concept has no separate word, and so it's hard to know how to express it. This is actually the same thing as when Saadia, for a while, was doing posts on your Facebook page, Peter, about whether AI art is art. And her thinking on this has evolved over time, so I'm not even sure where she's at at this point. And she has a relatively sophisticated understanding of this, from what I've seen. She initially took issue with calling it art, because it isn't art in the human sense, right? And then later she kind of changed that, and she moved to, well, art's really more about how the human interacts with it, in which case AI art should count as art too, as long as the human is interacting with it as if it's art. My answer to the question of whether AI art is art is that I don't care, because clearly it's something. Clearly it's something analogous to what humans used to do as art using their creativity, and now ChatGPT, or sorry, DALL-E or Midjourney or whatever, can do very similar things using just statistical correlations. And we humans can see it, and it's indistinguishable from art to us. So of course we're going to call it art. We're going to stretch that term to include what AI is doing, because there is no other term to use. And that's my answer: it's art because that's what the word now means.

[01:46:24]  Blue: Yes, it's true that prior to DALL-E the word art didn't include that, because we hadn't conceived of that yet. But the moment we see it, the moment we understand that it's an analogy to what humans do, that it's very much like what humans do, and in fact indistinguishable from what humans do in a lot of cases, inevitably we're going to start calling it art. And that just means the word art has changed. It's expanded its umbrella to include this new concept. I don't have to get into an argument over whether it's human art or not. Clearly it's not, right? But it's art in the sense that I have no other word for what it's doing, and it's coming up with something that is indistinguishable from human art. So there it is. Or at least it looks a lot like human art, maybe not indistinguishable even. So there it is. It's art. End of story. The word now has a new meaning. And I don't know what else you do in cases like this, right? So on the one hand, I agree with Melanie Mitchell. I think she's right. But in some cases I just don't know what else to call some of this. So now here she is again, quoting from the paper: "the abilities to translate clearly demonstrate that GPT-4 can comprehend complex ideas." Same problem as above. This is her now: "A translation appearing coherent to us means that GPT-4 can generate text that we, not it, can comprehend." So she's taking issue with the word comprehend. I agree ChatGPT doesn't comprehend, in the human sense, anything at all.

[01:47:52]  Blue: And I think if you spend any amount of time with it, you know it doesn't really comprehend a thing, right? And yet the way they're using that term isn't terrible, right? It's comprehending something, because I don't know what other term to use in that sense. Okay. So she moves on: a lot of benchmarks use multiple choice. When we see claims that AI can understand law or whatever, often what's actually happening is that it's selecting among a set of four-ish answers, one of which has a similar version in its training set. Beware the multiple-choice trick, she says in funny letters. So I feel like her criticisms are completely valid. If I could summarize her points: number one, we don't really know what's in the training set. Therefore, any time we make a claim that it's doing some sort of emergent behavior, it's really hard to even know if that's true or not. I think a lot of the experiments that the Microsoft researchers came up with are very clever, trying to come up with ways to test outside the training set. And I think the results are suggestive that it does have some sort of ability to deal with novel circumstances. I don't know how else you explain things like the LeetCode examples and things like that, right? And yet we don't really know. Like, you can't be sure. And on her distinction between emergent and resultant: I had never heard the term resultant used like this before, so I wasn't even aware of that part. I would have probably said emergent also, just because I was struggling to find a word to explain what we're talking about. I think resultant's probably a better word.

[01:49:36]  Blue: It's not true emergence in the sense of the way a human would do it. And yet I do think we're talking about something that is forcing us to rethink things, right? Forcing us to think about what the word comprehend means. Like, I don't know what else to call what it's doing except understanding. And I know it doesn't really understand anything, but it's doing something, right? And I don't know what else to call it, so I'm going to say understand. And the very fact that it's forcing us to stretch our terms, to rethink our concepts, that's impressive. That's not a small thing. The fact that it's blowing up our models and our categories and causing us to rework them. That is the general nature of ChatGPT that was unexpected, and that forced me to at least realize that a general intelligence may not be quite the same thing as a universal explainer. General intelligence might be a term that's vague enough to cover both a universal explainer and whatever it is ChatGPT is doing. I don't know what else to call it, right? Because it's clearly not just a narrow AI anymore. It's got a general, crappy ability to deal with all sorts of different domains, because its knowledge set is so huge. And it's pulling them together in ways that make it seem quite creative to us, and maybe even in a sense it is creative. That poem that it wrote with the prime numbers, I don't believe it exists anywhere on the internet.

[01:51:10]  Blue: I think insisting it's not creative may even do damage to the word creative, because usually the word creative means that something was created, and clearly it created something. It may be we need to rethink what it is humans are doing and give that a different term, because humans are clearly doing something very different than what ChatGPT is doing, and we've got much better abilities than it does. The other thing that has come out of ChatGPT is, I think, how surprising it is that so much of what humans do that we used to call creative, that maybe we even did do with our creativity, can actually just be done statistically.

[01:51:49]  Green: Is that sad? I mean, is that almost discouraging, a little bit, as a person who values your own creativity?

[01:51:57]  Blue: So I have a story from a book whose title I had to look up, an early book about what you can do with algorithms. One of the things he covers in the book is the early attempts to make music using algorithms, and how he had created an algorithm that would compose music copying the style of Bach, let's say, or Mozart or whatever. And nobody could tell if it was a human that had composed the music or if it was this AI. People would always say that it didn't have any soul, but if you didn't tell them up front whether it was a human or not that had made the music, they could never tell, right? And he had musicians come up and threaten to hit him, because they were so mad that basically what he had done is reveal that a great deal of musical composition is really just statistical analysis, and isn't, you know, true creativity after all, right? And again, I'm being forced to try to use the word creative, and I'm really saying the word could include what algorithms do, or could not, and it just depends on what you mean by it. But I would guess, no, I know, that that algorithm would never create an entirely new genre of music, right? Only a human can do that. It can certainly make something that sounds like existing genres, such that we can't tell if it's human or not. It can do that.

[01:53:34]  Blue: And a lot of what we as humans do is like that. This maybe isn't so surprising given Brett's theories of intelligence for animals: there were these two leaps to universality that took place. He doesn't use the term universality, that's a Deutsch term, but two leaps. There are three levels. You've got the regular animal intelligence, which just uses simple trial-and-error learning, which is just simple statistics. Then you have animals with insight, which are actually able to trial-and-error things in their head, like the example of the orangutan that was able to figure out in its head: hey, if I string together this series of maneuvers, I can go across the lake, scare the people away, and wash the clothes like I want to do. Okay, that's the first jump to universality. And then there's humans, which is the second jump, which involves the ability for recursive language and is rooted in explanatory theories instead of theory-less ones. I don't even know what to call them, but they're called ecorithms; I'll have to do a separate podcast on that coming up. But there are theory-less ecorithms and there are theory-full ones. Humans do the theory-full ones, and animals do the theory-less ones, with apes being somewhere in between, where they've got a limited theory version, non-recursive apparently. And I think that we've got really interesting ideas around this, but we don't understand it very well. And he's very quick to point out that there's no way to distinguish between theory-full and theory-less, because we can never fully define what an explanation even means in the first place.

[01:55:21]  Blue: In fact, if you ever could, it’d be wrong because a new mode of explanation would be invented at some point. So it’s basically impossible to explicitly define what an explanation is.

[01:55:34]  Red: I've got a question here, if this is a good time. Okay, so what I'm kind of getting, big picture, and it seems to be sort of a theme of our podcast in a way, is that there's a lot of truth to what Deutsch is saying, but maybe in reality it's a little more complicated.

[01:55:53]  Blue: Yeah,

[01:55:54]  Red: The nuances of it. How about the scaling hypothesis? I think probably Deutsch and most of his followers would say it's just complete rubbish. Would you agree, or do you think there's more nuance there?

[01:56:11]  Blue: I just said ecotherm, and I think I meant ecorithm. Ecorithm is the term for it. Anyhow, just correcting myself. Okay, what is the scaling hypothesis? Now, we criticized the scaling hypothesis as an alternative to Brett's theory of intelligence, and we actually criticized it pretty harshly. The main thing I said about the scaling hypothesis was that it's not an explanation. It basically says that as you scale up the number of nodes in the network, you suddenly get these giant leaps of intelligence that take place. And Dwarkesh Patel was using that to try to explain why some people are more intelligent than others, trying to back some version of IQ theory, I guess. So he was saying, you've got, like, humans have more neural nodes in their head compared to an ape, so you get this big leap, and someday we'll have something that has even more nodes, and then it'll get another big leap. So basically the scaling hypothesis says that as you scale up, you sometimes get these leaps. Well, okay, so what does this explain? Nothing. It's basically just a description of what we see in terms of IQ distributions. It doesn't even attempt to explain why animals are a big leap. It just says, well, as you scale up, you sometimes get leaps and sometimes you don't. Why don't we have giant leaps where you've got certain humans that are way above others? Maybe he believes we do. But really, humans kind of land on this bell curve of IQ, which is what his theory is. I mean, I'm not trying to back IQ theory; this is what Patel was saying. Okay, why isn't there a big leap?

[01:58:02]  Blue: Why aren't there some humans that are just orders-of-magnitude superintelligences compared to other humans, under his scaling hypothesis? Well, because the scaling hypothesis says that sometimes you don't get leaps, right? I mean, there's really nothing there in terms of trying to understand intelligence. It is a completely explanationless explanation. It describes the fact that these leaps take place, and that's it. Now, where does the scaling hypothesis originally come from? Well, it comes from artificial neural nets, not human neural nets. And Patel was trying to apply it to human neural nets because it exists for artificial neural nets. Does it exist for artificial neural nets? Yes. We said this from the beginning of the show: as you scale up artificial neural nets, they start off doing not very great. And then you put a whole bunch of stacked neural nets together and call it deep learning, you throw tons of compute and data at it, and suddenly you get ChatGPT, and nobody saw it coming. Nobody even knew this is what they were building, because they did not know it was possible to build it. They were just curious what would happen, right? You get generative AIs that can draw images, and you can give it a style and ask it to draw in a Frank Frazetta style or whatever, right? Nobody knew this was coming. And that is the scaling hypothesis. It applies to machine learning: as you scale up the size of the network, as you scale up the amount of data and the amount of compute, unexpected things come out of it. Do we have any reason to believe that explains what IQ is? Assuming IQ is even a real thing, right?

[01:59:48]  Blue: Well, no. The human brain does not work like machine learning neural nets. They're not even in the same ballpark. So when Patel tries to use the scaling hypothesis, does he mean as applied to artificial neural nets, or does he mean as applied to humans? Well, he means as applied to humans. We've got no reason at all to believe in the scaling hypothesis as applied to humans. Okay, it's this analogy: we see these big leaps in machine learning, and yes, it's true for machine learning, and therefore we're going to guess that some people have better neural nets than others, and we're going to use that to explain IQ. Well, that's nothing more than a wild conjecture, right? There's no way to test it. There's no way to do anything with the hypothesis. I guess I can't say it's wrong; it falls into the category of all untestable hypotheses. It's an interesting idea. I don't know that it deserves any more attention than that. I don't think it can be refuted. So it's not refuted. But the fact that it can't be refuted makes it worse than refuted, in a certain sense. It's literally just a conjecture, and that's about the most we can say at this point. As it applies to machine learning, though, it's not really a hypothesis. It's just the truth, right? We know that as you scale up the size of the neural net, surprising things happen. Maybe that's not so surprising if you stop and think about just how much data we're throwing at these things.

[02:01:23]  Blue: And it's maybe got less to do with the size of the neural net. The size of the net matters because it has to have some place to store the knowledge that's being created, but it's maybe got more to do with how much compute, how much data, things like that, right? That allows this neural net to come up with interesting computations that work in interesting ways. The results are certainly surprising. But it's not like I don't know why it's starting to do these crazy things. It's because we're stuffing it full of data, and that data is being turned into statistical knowledge. It's theory-less. And of course that's what it's going to do as you throw more compute at it. I think the thing that's surprising is that each time we do it, we expect it to be more intelligent than the last time, which makes sense, but the leaps are larger than we anticipated. Nobody anticipated that GPT-2 to GPT-3 would be such a large leap, or that GPT-3 to GPT-4 would be such a large leap, right? Nobody anticipated that two to three would go from writing articles that sound like a human but aren't true, to accurately giving you information. Nobody anticipated that from three to four it would act like it had a theory of mind, right? Like the Alice and Bob example. That's what's surprising about it. That it's doing something cooler is not surprising; that's exactly what we would have expected.

[02:02:54]  Red: Thank you. That was helpful. Yeah, I think you’ve really gotten across just how interesting this whole chat GPT thing is. I would have never completely understood just how weird and amazing this is. So thank you for that, Bruce.

[02:03:15]  Blue: I do have to say, as someone who became interested in machine learning just as it was starting to become popular and decided to go to school for it: if you had asked me a couple of years ago, is GPT-4 possible, I would have told you absolutely not. That's exactly what I would have said. Something that you can just go query, and you can talk to it and it'll talk to you, it'll automatically learn to use APIs? I would have said, no, that's completely impossible. You cannot do that through statistical, theory-less ecorithms alone. It's just impossible. Seems like

[02:03:56]  Red: a lot of people thought it was impossible.

[02:03:59]  Blue: And so I was wrong. I think it's surprising what you can do with just statistical knowledge. And, oh, this is where I lost my train of thought: humans are mostly animals. We are universal explainers, but I suspect that the vast majority of what we call human intelligence is actually just animal intelligence, and that we're doing exactly the same thing animals are doing, which is statistical analysis. I think probably everything that Deutsch calls inexplicit knowledge is theory-less ecorithms. That's just statistical knowledge, right? And inexplicit knowledge, that's animal knowledge; that's what animals do. And I suspect, to get back to Camille's question that she asked, that of course most of what humans do is statistical knowledge, and we're not even necessarily better at it than machines, right? This universal explainership, where we have this ability to actually think in terms of theory-full ecorithms, that is unique to humans. Apes have a very, very limited version of it, and I hesitate to even say that; it just sort of depends on what you mean by explanations. I mean, we talked about some of the experiments where the elephants were able to figure out there's no point in pulling on this rope because my partner is not there, and he needs to pull on the rope at the same time. And it could not have merely learned that statistically through trial and error, because of the way they set up the experiment. So we know a very small set of animals, really only animals with insight, have a very limited form of explanations available to them.

[02:05:55]  Blue: And clearly they're just missing something. They do not have universal explanations. It was humans, with the leap to having language, maybe, I don't know, that somehow made this extra leap to where we have this open-ended ability to work with explanations. That doesn't change the fact that the vast majority of what we do is statistical.

[02:06:17]  Red: Are you kind of saying that it could be that ChatGPT is more like a very, very sophisticated animal than a human? Yes. Okay, that's an interesting way to put it.

[02:06:30]  Blue: So I don't think animals could ever do what ChatGPT is doing, because they don't have any language ability for the most part, other than apes, you know, and parrots and a small handful; dolphins maybe have super limited language ability. What ChatGPT is doing, though, is something more similar to that. So this idea of theory-full versus theory-less ecorithms, that's a term that comes from Leslie Valiant, who's a big name in machine learning. You may never have heard of him, but you can't go very far in machine learning without having heard of him, because he invented PAC theory, probably approximately correct theory. I covered all this in earlier podcasts where I talked about it. Okay, and we're going to do a podcast on open-endedness next, probably, and it's going to cover, I'm going to read from the book, what he says about these things. One of the things that in my opinion Deutsch gets wrong is he tries to make a distinction, and he sometimes talks about knowledge in really inconsistent ways. We're going to do some podcasts about the walking robot, why he says it's not knowledge. He's actually misunderstood the distinction between theory-less ecorithms and theory-full ecorithms. If his area of study was machine learning, he might have been aware of some of these concepts, but it wasn't. So he was struggling to deal with it, and he doesn't have the concepts available. What machine learning is doing is theory-less. It's what I've in the past called heuristical knowledge. It's what animals do. Now, animals' theory-less knowledge is somehow more open-ended than what our machines do, or at least prior to something like ChatGPT it was a lot more open-ended.

[02:08:21]  Blue: An animal can learn things that are surprising, and because of that they can deal with life in ways that our robots can't, right? It'll be interesting to see if that changes now that we've got something like ChatGPT to work with, if they're able to take a more general intelligence, for lack of a better term, like ChatGPT and apply it to robots at some point. Like, Microsoft was trying to release a product called Office 365 Copilot, which was almost like having a Star Trek computer. I'll have to include a link in the show notes; it's a very cool demo. But they never released it because they couldn't get it to work quite well enough. They're still going to release it; they're just still working on it. But you can tell it, I need to make a slide presentation, and here's a summary of the notes from the meeting, and it will open up PowerPoint and put together slides, and then you go back and modify it. And it's a lot like the Star Trek computer, right, where you can just basically interact with this AI and it's got this general knowledge. I think this is the sort of leap that we're talking about, and I'm not quite sure how to define it, and I don't think anybody knows how to define it yet, where it's got this ability to generalize in a way that wasn't possible before. And yet we know it's still just using statistical, theory-less knowledge, just like animals. So I think we're a step closer, I don't know how close, but a step closer to mimicking animal intelligence.

[02:09:57]  Blue: And we're still not quite there yet, but we've broken the ceiling a little bit. It's started to crack, right? But it's not human intelligence. It's really still animal intelligence that we're talking about. I don't think anything in machine learning, other than maybe explanation-based learning, is really even an attempt at human intelligence, and explanation-based learning really hasn't proven anything yet. It just has not developed well. It's not the popular form of machine learning precisely because the results are lackluster at this point. So I think we're still clearly missing something important, right? Most of the machine learning techniques aren't really trying to mimic human intelligence. If you want to look at attempts to mimic human intelligence, probably the best example would be what they've done with theorem provers, with propositional logic or first-order logic. Now, obviously those are nothing like human intelligence. But deductive logic was a very sincere attempt to try to figure out how humans think and to try to codify it. And it ended up completely missing something. I want to do, like, a class on this where I actually take you through how this all works, how the theorem provers work, things like that. And you can see some things missing, right? You can understand why they thought logic was what humans did to think, and why it really isn't, why humans are really doing something else. But even back to Aristotle, when he started inventing logic, he wouldn't have referred to it as propositional logic, but he thought he was codifying human thought, right?

[02:11:49]  Blue: He was trying to build the first AGIs, and failing, right?

[02:11:53]  Red: That’s a crazy way to put it.

[02:11:55]  Blue: It is, right? They've been working on AGIs since back in Greek times. And you can do some amazing things with logic provers, with these things that try to figure out what can be deduced from a set of logical statements. I've actually written some of these in Python because I wanted to play with them, and it's amazing what you can do with them. You can pop in logical statements and it will suddenly say, oh, then I can deduce this, and it's like, wow, that's cool, you know? And it can do it either in a way that is certain, or in a way where it makes guesses, where it just sort of tries ideas, and if it doesn't find a proof within a certain period of time it assumes there is no proof. And they have different speeds, and there are tons of cool things there that I think really do have relevance to human intelligence, mostly for how they don't work, right?
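[Transcriptor's note: the certain kind of deduction Bruce describes, where you pop in logical statements and the program announces what it can deduce, can be sketched as a simple forward-chaining engine over if-then rules. This is an illustrative sketch, not Bruce's actual code; the facts and rules are made up for the example.]

```python
# A minimal forward-chaining deduction engine over Horn-clause-style rules.
# Facts are strings; each rule says: if all premises are known, conclude X.

def forward_chain(facts, rules):
    """Repeatedly apply rules until no new fact can be deduced."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)  # a new deduction, like "oh, then I can deduce this"
                changed = True
    return known

# Illustrative rule set (hypothetical, not from the episode):
rules = [
    ({"it_rained", "alice_is_outside"}, "alice_is_wet"),
    ({"alice_is_wet"}, "alice_is_cold"),
]

print(forward_chain({"it_rained", "alice_is_outside"}, rules))
```

From the two starting facts the engine deduces "alice_is_wet" and then, from that, "alice_is_cold". The guessing, time-limited provers Bruce mentions work differently: they search for a proof and give up after a deadline, rather than exhaustively closing over the rules like this.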

[02:12:52]  Red: Yeah. Okay, well, this has been wonderful, Bruce. And Aristotle was the first AI researcher. He was.

[02:13:02]  Blue: He was the first AI researcher.

[02:13:04]  Red: I find that very intriguing and interesting way to put it.

[02:13:08]  Blue: Yeah.

[02:13:08]  Red: But I've learned a lot, and this just increases my belief that this is the most wonderful and interesting time to be alive. I'm really excited to see where this technology goes. So thank you.

[02:13:24]  Blue: Thank you.

[02:13:25]  Red: Bye.

[02:13:41]  Blue: Thank you. If you are interested in financially supporting the podcast, we have two ways to do that. The first is via our podcast host site, Anchor. Just go to anchor.fm/four-strands (f-o-u-r, dash, s-t-r-a-n-d-s). There's a support button available that allows you to make recurring donations. If you want to make a one-time donation, go to our blog, fourstrands.org. There is a donation button there that uses PayPal. Thank you.

