The computer vision
is one of the difficult and important part on machine learning. So, how can we make the computer see and recognize the image? It is advanced question about image processing.
Imagine that the computer see and recognize the image like human being. It is related to the robot vision for the robotics.
This post will be introduced about what the computer vision is and the application using computer vision
.
Introduction
There are related fields with computer vision.
- Image processing
- Pattern recognition
- photogrammetry
Subjects
- camera models, calibration
- Image processing
- Motion and tracking
- stereopsis
prerequisite for robotics
- Mechanical engineering
- Mathematics
- Control theory
- electronic engineering
- Computer science
Basic of robotics
Pose
Basic of robotics is pose (description of position and orientation).
Coordinate system(frame) attatched to eacth object.
Transforms between coordinate systems necessary.
kinematics
It is about science of motion without regard to causing forces, Position, velocity, accerlation
Links and joint types
Velocities and singularities
Trajectory
Control
Coordinate System
Point $P$ in frame $W$ is world(global) frame. \(^WP = \begin{pmatrix} P_x\\ P_y\\ P_z\end{pmatrix}\) Rotation matrix $R$, $W \to B$ \(^WR_B = \begin{pmatrix} ^WX_B\\ ^WY_B\\ ^WZ_B\end{pmatrix}\)
Relative pose
Position and orientation is different depending on the view. For example a point P can be described by coordinate vectors relative to either select ${A}$ or ${B}$.
\[^AP = {^A\xi_B} \cdot {^BP}\]You can think ${^A\xi_B}$ as a motion it move $A$ to $B$.
- world frame, ${O}$
- robot frame, $R$
- camera frame, $C$
- object, $B$
Rotation matrix
Rotation matrix $^VR_B$ describes how point are transformed from frame ${B}$ to Fram ${V}$ when the frame is rotated. \(\begin{pmatrix} \frac{V_y} {V_y}\end{pmatrix} = \begin{pmatrix} cos \theta & -sin \theta\\ -sin \theta & cos \theta\end{pmatrix} \begin{pmatrix} B_x\\ B_y\end{pmatrix}\)
Rotation matrix has some features.
- Orthogonal
- Has orthogonal columns
- Columns are unit vectors
- Inverse = transpose
Frame-to-Frame mapping
pose = position + orientation
${B}={^AR_B,\ ^AP_{BORG}}$
$^AP=\ ^AR_{B}{^BP} +\ ^AP_{BORG}$
Homogeneous transform
$P$ is homogeneous position vector $T$ is $4 \times 4$ matrix called homogeneous transform.
\[\begin{align} &P= \begin{pmatrix} x,y,z,1 \end{pmatrix}^T\\ &\begin{pmatrix} ^AP\\ 1 \end{pmatrix} = \begin{pmatrix} ^AR_B & ^AP_{BORG}\\ 0\ 0\ 0 & 1\\ \end{pmatrix} \begin{pmatrix} ^BP\\ 1\\ \end{pmatrix}\\ &^AP=^AT_B{^BP} \end{align}\] \[^AT_C=^AT_B{^BT_C}\\ ^AT_B^{-1}=^BT_A=\begin{pmatrix} ^AR_B^T & -^AR^T_B{^AP}_{BORG}\\ 0\ 0\ 0 & 1\\ \end{pmatrix}\]The $3 \times 3$ transformation matrix can take care of the translation and rotation, But we may also need scaling.
The $3 \times 3$ transformation matrix matrix can all these (translation, rotation and scaling).
The transform matrix is obtained by matrix multiplication of the part transforms
$translation * rotation * scaling $
Forward Kinematics
The kinematic chains is formed by a sequence of homogeneous transfomr, each describing one link. \(^0T_N =\ ^0T_1\ ^1T_2 \cdots \ ^{N-1}T_{N}\)