A chatbot that understands you has been released: in a short time, ChatGPT has revolutionized the way we communicate with a search engine.

Using Reinforcement Learning from Human Feedback (RLHF) and large amounts of data, the developers have produced a chatbot that behaves more like a human and answers the way we would have wanted.

Development followed three steps:

  1. Collected demonstration data and trained a supervised model
  2. Collected comparison data and trained a reward model
  3. Optimized a policy against the reward model using the PPO (Proximal Policy Optimization) reinforcement learning algorithm
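
To make steps 2 and 3 a little more concrete, here is a minimal sketch in plain Python. It is not ChatGPT's actual training code. It assumes the reward model is trained with a pairwise ranking loss on comparison data (a common formulation in RLHF work) and that the policy is updated with the standard PPO clipped objective. The function names `pairwise_reward_loss` and `ppo_clipped_objective`, and the default `epsilon=0.2`, are illustrative choices only.

```python
import math


def pairwise_reward_loss(r_preferred: float, r_rejected: float) -> float:
    """Step 2 (assumed form): loss for one human comparison.

    Computes -log(sigmoid(r_preferred - r_rejected)). The loss is small
    when the reward model scores the human-preferred answer higher.
    """
    diff = r_preferred - r_rejected
    # Numerically stable evaluation of log(1 + exp(-diff)).
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def ppo_clipped_objective(ratio: float, advantage: float,
                          epsilon: float = 0.2) -> float:
    """Step 3 (assumed form): PPO clipped surrogate for one sample.

    ratio     = new_policy_prob / old_policy_prob for the sampled action
    advantage = how much better the action was than expected, derived
                from the reward model's score
    Clipping the ratio keeps a single update from moving the policy
    too far from the one that generated the samples.
    """
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # The reward model agrees with the human ranking: low loss.
    print(pairwise_reward_loss(2.0, 0.0))
    # The reward model disagrees with the human ranking: high loss.
    print(pairwise_reward_loss(0.0, 2.0))
    # A large policy change on a good action is capped by the clip.
    print(ppo_clipped_objective(ratio=1.5, advantage=1.0))
```

In a real system both quantities would be averaged over batches and minimized or maximized with gradient descent on neural networks; the scalar versions here only show the shape of the two objectives.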

An example of how ChatGPT works:


Tell me about when Christopher Columbus came to the US in 2015?


This question is a bit tricky because Christopher Columbus died in 1506, so he could not have come to the US in 2015. But let’s pretend for a moment that he did!

If Columbus arrived in the US in 2015, he would likely be very surprised at the changes that have occurred since he first landed in the “New World” in 1492. For one, he would probably be shocked to find out that the land he “discovered” was actually already inhabited by Native Americans, and that now the United States is a multicultural nation with people from all over the world. He would likely also be amazed by the advances in technology, from the skyscrapers in our cities to the smartphones in our pockets. Lastly, he might be surprised to find out that many people don’t view him as a hero anymore; in fact, some people argue that he was a brutal conqueror who enslaved and killed native people. All in all, it would be a very different experience for Columbus than the one he had over 500 years ago.

Want to know more: