

4
Smart:
The learning algorithms replace complex programming …
… and allow for quick commissioning of the system.
Simple:
knowledge transfer from one gripper to another
Machine learning
Machine learning methods are comparable to those of human
beings. Their practical implementation within this subdivision of
artificial intelligence is the work of algorithms which develop
performance or motion strategies on the basis of feedback
received on its behaviour. Just like people, machines also need
feedback on their actions – positive as well as negative – in order
to classify them and continue learning.
Trial and error – learning through reinforcement
The LearningGripper makes use of the method of reinforcement
learning. The gripper optimises its capabilities exclusively on
the basis of feedback that it receives concerning its previous
actions. The system is not provided with specific actions which it
has to imitate, as would be the case with supervised learning.
The learning system alternates its actions with the key objective
of maximising feedback over the long term. Consequently, this
increases the probability that a successful action will be
executed and that a less successful action will therefore not be
repeated again.
The reward principle
At first, the LearningGripper attempts to randomly rotate the ball
so that its label is on the top. It receives feedback from a position
sensor inside the ball indicating how far the label is from the palm
of the gripper’s hand – the greater the distance, the more positive
the feedback.
In time, the learning algorithms develop a motion strategy on
the basis of this feedback. The gripper learns which action needs
to be executed for each given status. It knows how to modify its
motion so that it receives as much positive feedback as possible,
and finally executes its task reliably.
The LearningGripper display for trade fairs demonstrates a gripper
which takes less than an hour to learn a mechanical motion
strategy – from its first attempt to the reliable execution of the
required task. A second gripper demonstrates a process it had
learned previously within the desired target scenario: it lifts the
ball and positions it so that the embossed lettering can finally be
seen at the top.