

Microsoft Researchers Are Using ChatGPT to Control Robots, Drones

The company's researchers use ChatGPT to write computer code that can control a robot arm and an aerial drone.


ChatGPT is best known as an AI program capable of writing essays and answering questions, but now Microsoft is using the chatbot to control robots.

On Monday, the company’s researchers published a paper on how ChatGPT can streamline the process of programming software commands to control various robots, such as mechanical arms and drones.

“We still rely heavily on hand-written code to control robots,” the researchers wrote. Microsoft’s approach, on the other hand, taps ChatGPT to write some of the computer code.



ChatGPT can do this because the AI model was trained on huge libraries of human text—including the code for software programs. ChatGPT has already shown it can write and debug programs in various languages based on text-based requests. So Microsoft’s researchers decided to see if they could apply the same capabilities to write code for robotics hardware.

“It turns out that ChatGPT can do a lot by itself, but it still needs some help,” the researchers wrote. To help ChatGPT write the computer code, the researchers first outlined to the AI program the various commands it could use to control a given robot. 


An example of a prompt given to ChatGPT to control a robot. (Credit: Microsoft)


“We write a text prompt for ChatGPT which describes the task goal while also explicitly stating which functions from the high-level library are available. The prompt can also contain information about task constraints, or how ChatGPT should form its answers,” the researchers added. 
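As a rough illustration of the kind of prompt the researchers describe, the Python sketch below assembles a task goal, a list of available high-level functions, and optional constraints into one block of text. The function names (`fly_to`, `get_position`, `take_photo`), the `RobotFunction` type, and the prompt wording are hypothetical and are not taken from Microsoft's paper or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobotFunction:
    """One entry in a hypothetical high-level robot library."""
    signature: str
    description: str


def build_prompt(task, functions, constraints=()):
    """Build a text prompt that states the task and the allowed functions."""
    lines = [
        "You are helping to control a robot.",
        "You may only use the following functions:",
    ]
    for fn in functions:
        lines.append(f"- {fn.signature}: {fn.description}")
    if constraints:
        lines.append("Constraints:")
        lines.extend(f"- {c}" for c in constraints)
    lines.append(f"Task: {task}")
    return "\n".join(lines)


# Hypothetical drone library, invented for this example.
DRONE_FUNCTIONS = [
    RobotFunction("fly_to(x, y, z)", "fly the drone to the given coordinates"),
    RobotFunction("get_position()", "return the drone's current (x, y, z)"),
    RobotFunction("take_photo()", "capture an image from the onboard camera"),
]

prompt = build_prompt(
    task="Inspect the top shelf and take a photo of it.",
    functions=DRONE_FUNCTIONS,
    constraints=["Answer with Python code only.", "Ask if the task is unclear."],
)
print(prompt)
```

The resulting text would be sent to the chatbot before any task request, so the model knows which calls it is allowed to emit.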

The team applied the approach to several demos, one of which used ChatGPT to write computer code to control an aerial drone. Microsoft researchers first fed the AI chatbot a rather long prompt laying out the computer commands it could write to control the drone. After that, the researchers could make requests to instruct ChatGPT to control the robot in various ways. This included asking ChatGPT to use the drone’s camera to identify a drink, such as coconut water or a can of Coca-Cola. 


Have you ever wanted to tell a robot what to do using just words? Our team at #Microsoft is introducing a new paradigm for #robotics based on language, powered by #ChatGPT.

Our recent paper "ChatGPT for Robotics" describes a series of design principles that can be used to guide ChatGPT towards solving robotics tasks. In this video, we present a summary of our ideas, and experimental results from some of the many scenarios that ChatGPT enables in the domain of robotics: such as manipulation, aerial navigation, even full perception-action loops.

Our goal is to empower even non-technical users to harness the full potential of robotics using ChatGPT.

For more details, please see:

  • Blogpost: https://aka.ms/ChatGPT-Robotics

  • Technical paper: https://www.microsoft.com/en-us/resea...

  • Code repository: https://github.com/microsoft/PromptCr...


“ChatGPT asked clarification questions when the user’s instructions were ambiguous, and wrote complex code structures for the drone such as a zig-zag pattern to visually inspect shelves,” the team said.
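The zig-zag inspection pattern mentioned in the quote can be pictured as a list of waypoints that sweep across a shelf, step up one row, and sweep back. The sketch below is only an illustration under assumed conditions (a flat shelf face described by horizontal and vertical limits); it is not the code ChatGPT generated in the demo, and `zigzag_waypoints` is a hypothetical helper.

```python
def zigzag_waypoints(x_start, x_end, z_bottom, z_top, rows):
    """Return (x, z) waypoints that sweep a shelf face row by row.

    Even-numbered rows go from x_start to x_end, odd-numbered rows go
    back, so the drone never has to fly back empty to the starting side.
    """
    if rows < 1:
        raise ValueError("rows must be at least 1")
    waypoints = []
    for i in range(rows):
        # Spread the rows evenly between the bottom and top heights.
        if rows == 1:
            z = z_bottom
        else:
            z = z_bottom + (z_top - z_bottom) * i / (rows - 1)
        if i % 2 == 0:
            first, second = x_start, x_end
        else:
            first, second = x_end, x_start
        waypoints.append((first, z))
        waypoints.append((second, z))
    return waypoints


for point in zigzag_waypoints(0.0, 2.0, 1.0, 2.0, rows=3):
    print(point)
```

Each waypoint would then be passed to whatever movement command the robot's library exposes.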

In one instance, the researchers also told the chatbot: “Take a selfie using a reflective surface.” ChatGPT was able to interpret the request and write computer code for the drone to fly in front of a mirror and take the selfie. Meanwhile, in another demo, the researchers used ChatGPT to write code capable of directing a robot arm to build the Microsoft logo using several wooden blocks.


Although the research shows ChatGPT’s potential in robotics, the approach still has a key limitation: the chatbot can only write the computer code for the robot based on the initial “prompt,” or text-based request, the human gives it. Hence, a human engineer has to thoroughly explain to ChatGPT how the robot's application programming interface works; otherwise, the AI program will struggle to generate usable code.

In their paper, Microsoft researchers supply some guidelines on how to write an effective prompt for ChatGPT when it comes to controlling robots. The team also created an open-source platform on GitHub “where anyone can share examples of prompting strategies for different robotics categories.”

Another limitation is that the robot appears to need a constant connection to ChatGPT. But on the flip side, the integration could unleash an era where robots are smart enough to understand all kinds of human voice commands.

"Have you ever wanted to tell a robot what to do using your own words, like you would to a human? Wouldn’t it be amazing to just tell your home assistant robot: 'Please warm up my lunch,' and have it find the microwave by itself?" the researchers ask.

In the meantime, researchers are warning others to be careful when using ChatGPT to control a robot. "We emphasize that these tools should not be given full control of the robotics pipeline, especially for safety critical applications," they wrote. "Given the propensity of LLMs (large language models) to eventually generate incorrect responses, it is fairly important to ensure solution quality and safety of the code with human supervision before executing it on the robot."
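One way to picture the human supervision the researchers call for is a gate that refuses to run generated code until it passes an automatic check and a person approves it. The sketch below is an assumption-laden illustration, not the researchers' method: `disallowed_calls`, `review_gate`, and the allowed function names are hypothetical, and the check only looks at which functions the code calls (it uses `ast.unparse`, available in Python 3.9 and later).

```python
import ast


def disallowed_calls(code, allowed):
    """Return the sorted names of called functions that are not allowed."""
    tree = ast.parse(code)
    bad = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                name = node.func.id
            else:
                # Attribute or other indirect calls, e.g. os.system(...)
                name = ast.unparse(node.func)
            if name not in allowed:
                bad.add(name)
    return sorted(bad)


def review_gate(code, allowed, approve):
    """Accept code only if the static check passes AND a human approves.

    `approve` is a callback standing in for the human reviewer; it
    receives the code and returns True or False.
    """
    if disallowed_calls(code, allowed):
        return False
    return bool(approve(code))


ALLOWED = {"fly_to", "take_photo"}  # hypothetical robot library
generated = "fly_to(1, 2, 3)\ntake_photo()"
suspicious = "fly_to(1, 2, 3)\nos.system('reboot')"

print(disallowed_calls(generated, ALLOWED))
print(disallowed_calls(suspicious, ALLOWED))
print(review_gate(generated, ALLOWED, approve=lambda code: True))
```

A check like this narrows what a reviewer has to read, but it does not replace the review itself: code that uses only allowed calls can still command an unsafe motion.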
