Human pose estimation via deep neural networks

Contributors: Zoya Jamadar*, Vivek Verma, Aishwarya Pujari, Sahil Zawar, Abhishek Yeole*

Human pose estimation is the computer vision task of locating human body joints (keypoints such as shoulders, elbows, and knees) in an image or video. It plays a crucial role in applications such as action recognition, human-computer interaction, and gaming.

Traditionally, human pose estimation relied on classical computer vision techniques such as edge detection and hand-crafted feature extraction. These methods struggle with complex backgrounds and occlusions, which leads to inaccurate results.

More recently, deep neural networks (DNNs) have been widely adopted for human pose estimation because they give more accurate and robust results than these classical techniques. DNNs learn from large amounts of annotated data, which allows them to generalize well to new images.

One popular approach to human pose estimation using DNNs is the Convolutional Pose Machines (CPMs) framework. A CPM consists of several stages of convolutional layers, and each stage refines the pose estimate produced by the previous one. The final output of the network is a set of heatmaps, one per body joint, in which each pixel value represents the likelihood that the joint is at that location.
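To make the heatmap output concrete, here is a minimal sketch of how joint locations could be read from per-joint heatmaps by taking the highest-scoring pixel in each one. This is an illustration, not the CPM reference implementation: the function name `decode_heatmaps` and the toy data are hypothetical, and plain Python lists stand in for the arrays a real network would return.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # rows of likelihood values for one joint


def decode_heatmaps(heatmaps: List[Heatmap]) -> List[Tuple[int, int, float]]:
    """Return the (x, y, score) of the peak in each joint's heatmap."""
    joints = []
    for heatmap in heatmaps:
        best = (0, 0, float("-inf"))
        for y, row in enumerate(heatmap):
            for x, value in enumerate(row):
                if value > best[2]:
                    best = (x, y, value)
        joints.append(best)
    return joints


# Two toy 3x4 heatmaps (hypothetical values), one per joint.
toy_heatmaps = [
    [[0.0, 0.1, 0.0, 0.0],
     [0.0, 0.2, 0.9, 0.1],
     [0.0, 0.0, 0.1, 0.0]],
    [[0.7, 0.1, 0.0, 0.0],
     [0.1, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]],
]

print(decode_heatmaps(toy_heatmaps))  # [(2, 1, 0.9), (0, 0, 0.7)]
```

The coordinates are in heatmap pixels. Heatmaps are usually smaller than the input image, so in practice the peaks would still need to be scaled back to image resolution.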

Figure: An example from "Multi-task neural network with physical constraint for real-time multi-person 3D pose estimation from monocular camera"

Another approach is multi-person pose estimation (MPPE). MPPE networks estimate the poses of several people in an image at the same time, which makes them suitable for real-world scenes where more than one person is present.

Figure: From "The Progress of Human Pose Estimation: A Survey and Taxonomy of Models Applied in 2D Human Pose Estimation"

Several deep learning libraries can be used for human pose estimation. Some of the popular ones are:

  1. TensorFlow: TensorFlow is a popular open-source platform for machine learning and deep learning. Pre-trained pose estimation models such as MoveNet are available for it through TensorFlow Hub, and it also provides tools for training custom models.

  2. PyTorch: PyTorch is another open-source deep learning library that is widely used for computer vision tasks. It has a user-friendly API, and its torchvision package includes pre-trained keypoint detection models such as Keypoint R-CNN.

  3. OpenCV: OpenCV is an open-source computer vision library that provides various tools for image and video processing. Its dnn module can load and run pre-trained deep learning models, including pose estimation networks.

  4. Caffe: Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center. It provides pre-trained models for human pose estimation and has a fast implementation for real-time applications.

  5. Darknet: Darknet is an open-source neural network framework written in C and CUDA, best known for the YOLO object detectors. Its fast, lightweight implementation can also be used to run human pose estimation models.

Each of these libraries offers different tools and models for human pose estimation, so the right choice depends on the requirements of the project, such as speed, accuracy, and the target platform.

For a real-time implementation, check out our Colab notebook, where we implement human pose estimation using MoveNet:

https://colab.research.google.com/drive/14pxHPEbq5gojF7y7Gv4LhwM1QoiAMVaa?usp=sharing
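As a rough illustration of what happens after a MoveNet-style model runs, the sketch below converts its output into pixel coordinates. It is not taken from the notebook. It assumes the single-pose MoveNet output convention of 17 keypoints, each given as (y, x, score) normalized to the range 0 to 1, and it assumes the frame was fed to the model without padding. The helper `keypoints_to_pixels`, the `min_score` threshold, and the sample values are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# The 17 COCO keypoints, in the order MoveNet reports them.
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def keypoints_to_pixels(
    keypoints: Sequence[Tuple[float, float, float]],
    width: int,
    height: int,
    min_score: float = 0.3,  # hypothetical threshold, tune per application
) -> Dict[str, Tuple[int, int]]:
    """Map normalized (y, x, score) keypoints to (x, y) pixel positions,
    dropping keypoints whose confidence is below min_score."""
    pixels = {}
    for name, (y, x, score) in zip(KEYPOINT_NAMES, keypoints):
        if score >= min_score:
            pixels[name] = (round(x * width), round(y * height))
    return pixels


# Made-up model output: only the nose is detected with high confidence.
sample = [(0.5, 0.5, 0.1)] * 17
sample[0] = (0.25, 0.5, 0.9)

print(keypoints_to_pixels(sample, width=640, height=480))  # {'nose': (320, 120)}
```

Filtering by score before drawing keeps low-confidence joints, such as occluded limbs, from producing a jittery skeleton in a live video feed.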

In conclusion, deep neural networks have transformed the field of human pose estimation, providing accurate and robust results even in challenging scenarios such as cluttered backgrounds and occlusions. As DNNs continue to advance, we can expect further improvements in performance.