BlazePose: On-device Real-time Body Pose Tracking


We present BlazePose, a lightweight convolutional neural network architecture for human pose estimation that is tailored for real-time inference on mobile devices. During inference, the network produces 33 body keypoints for a single person and runs at over 30 frames per second on a Pixel 2 phone. This makes it particularly suited to real-time use cases like fitness tracking and sign language recognition. Our main contributions include a novel body pose tracking solution and a lightweight body pose estimation neural network that uses both heatmaps and regression to keypoint coordinates.

Human body pose estimation from images or video plays a central role in various applications such as health tracking, sign language recognition, and gestural control. This task is challenging due to a wide variety of poses, numerous degrees of freedom, and occlusions. The common approach is to produce heatmaps for each joint along with refining offsets for each coordinate. While this choice of heatmaps scales to multiple people with minimal overhead, it makes the model for a single person considerably larger than is suitable for real-time inference on mobile phones.
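For reference, the heatmap-plus-offset decoding mentioned above can be summarized with a minimal sketch (not from the paper; the array shapes and the `decode_heatmaps` helper are illustrative assumptions): each joint's coarse location is the heatmap argmax, refined by the predicted sub-pixel offsets.

```python
import numpy as np

def decode_heatmaps(heatmaps, offsets):
    """heatmaps: (J, H, W); offsets: (J, H, W, 2) as (dx, dy) -> keypoints: (J, 2) as (x, y)."""
    num_joints, height, width = heatmaps.shape
    keypoints = np.zeros((num_joints, 2), dtype=float)
    for j in range(num_joints):
        # Coarse location: argmax over this joint's heatmap.
        y, x = np.unravel_index(np.argmax(heatmaps[j]), (height, width))
        # Refinement: add the sub-pixel offset predicted at that cell.
        dx, dy = offsets[j, y, x]
        keypoints[j] = (x + dx, y + dy)
    return keypoints
```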



In this paper, we address this specific use case and demonstrate a significant speedup of the model with little to no quality degradation. In contrast to heatmap-based techniques, regression-based approaches, while less computationally demanding and more scalable, attempt to predict the mean coordinate values, often failing to address the underlying ambiguity. We extend this idea in our work and use an encoder-decoder network architecture to predict heatmaps for all joints, followed by another encoder that regresses directly to the coordinates of all joints. The key insight behind our work is that the heatmap branch can be discarded during inference, making the model lightweight enough to run on a mobile phone.

Our pipeline consists of a lightweight body pose detector followed by a pose tracker network. The tracker predicts keypoint coordinates, the presence of the person on the current frame, and the refined region of interest for the current frame. When the tracker indicates that there is no human present, we re-run the detector network on the next frame.
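The detector/tracker hand-off can be outlined as a short loop. This is an assumed sketch, not the released implementation; `detect_person` and `track_pose` are hypothetical callables standing in for the two networks.

```python
def run_pose_pipeline(frames, detect_person, track_pose, presence_threshold=0.5):
    """Run the detector once, then keep tracking until the person is lost."""
    roi = None            # region of interest from the detector or the previous frame
    results = []
    for frame in frames:
        if roi is None:
            # Detector stage: locate the person and produce an initial ROI.
            roi = detect_person(frame)
            if roi is None:
                results.append(None)      # no person found in this frame
                continue
        # Tracker stage: 33 keypoints, a presence score, and a refined ROI.
        keypoints, presence, refined_roi = track_pose(frame, roi)
        if presence < presence_threshold:
            roi = None                    # person lost: re-run the detector on the next frame
            results.append(None)
        else:
            roi = refined_roi             # reuse the refined ROI on the next frame
            results.append(keypoints)
    return results
```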



The majority of modern object detection solutions rely on the Non-Maximum Suppression (NMS) algorithm for their final post-processing step. This works well for rigid objects with few degrees of freedom. However, the algorithm breaks down for scenarios that include highly articulated poses like those of humans, e.g. people waving or hugging, because multiple ambiguous boxes satisfy the intersection over union (IoU) threshold of the NMS algorithm. To overcome this limitation, we focus on detecting the bounding box of a relatively rigid body part like the human face or torso. We observed that in many cases the strongest signal to the neural network about the position of the torso is the person's face, as it has high-contrast features and fewer variations in appearance. To make such a person detector fast and lightweight, we make the strong, yet for AR applications valid, assumption that the head of the person should always be visible in our single-person use case. This face detector predicts additional person-specific alignment parameters: the middle point between the person's hips, the size of the circle circumscribing the whole person, and the incline (the angle between the lines connecting the two mid-shoulder and mid-hip points).
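Assuming the detector outputs exactly the three alignment parameters named above (hip midpoint, circumscribing-circle radius, and incline angle), a rotated square crop for the tracker can be derived as in the following sketch; the `roi_corners` helper is hypothetical.

```python
import numpy as np

def roi_corners(hip_center, radius, incline):
    """Corners of a square crop of side 2 * radius, centered on the hips and rotated by `incline` (radians)."""
    hip_center = np.asarray(hip_center, dtype=float)
    c, s = np.cos(incline), np.sin(incline)
    rotation = np.array([[c, -s], [s, c]])
    # Axis-aligned square corners around the origin, then rotate and translate.
    offsets = radius * np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])
    return offsets @ rotation.T + hip_center
```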



This allows us to be consistent with the respective datasets and inference networks. In contrast to the majority of existing pose estimation solutions that detect keypoints using heatmaps, our tracking-based solution requires an initial pose alignment. We restrict our dataset to cases where either the whole person is visible or the hip and shoulder keypoints can be confidently annotated. To ensure the model supports heavy occlusions that are not present in the dataset, we use substantial occlusion-simulating augmentation. Our training dataset consists of 60K images with one or a few people in the scene in common poses and 25K images with a single person in the scene performing fitness exercises. All of these images were annotated by humans. We adopt a combined heatmap, offset, and regression approach, as shown in Figure 4. We use the heatmap and offset losses only during training and remove the corresponding output layers from the model before running inference.
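A hedged sketch of such a combined training objective follows; the specific loss functions and weights below are assumptions for illustration, not the exact losses used in the paper. The point is only that the heatmap and offset heads receive supervision during training and are dropped before inference.

```python
import torch.nn.functional as F

def combined_training_loss(pred_heatmaps, pred_offsets, pred_coords,
                           gt_heatmaps, gt_offsets, gt_coords,
                           w_heatmap=1.0, w_offset=1.0, w_coords=1.0):
    """Weighted sum of heatmap, offset, and coordinate-regression terms (training only)."""
    loss_heatmap = F.binary_cross_entropy_with_logits(pred_heatmaps, gt_heatmaps)
    loss_offset = F.smooth_l1_loss(pred_offsets, gt_offsets)
    loss_coords = F.smooth_l1_loss(pred_coords, gt_coords)
    return w_heatmap * loss_heatmap + w_offset * loss_offset + w_coords * loss_coords
```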



Thus, we effectively use the heatmap to supervise the lightweight embedding, which is then used by the regression encoder network. This approach is partially inspired by the Stacked Hourglass approach of Newell et al. We actively utilize skip-connections between all stages of the network to achieve a balance between high- and low-level features. However, the gradients from the regression encoder are not propagated back to the heatmap-trained features (note the gradient-stopping connections in Figure 4). We found that this not only improves the heatmap predictions but also substantially increases the coordinate regression accuracy. A relevant pose prior is a vital part of the proposed solution. We deliberately limit the supported ranges of angle, scale, and translation during augmentation and data preparation for training. This allows us to lower the network capacity, making the network faster while requiring fewer computational and thus energy resources on the host device. Based on either the detection stage or the previous frame's keypoints, we align the person so that the point between the hips is located at the center of the square image passed as the neural network input.
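The gradient-stopping arrangement can be sketched in a few lines of PyTorch. The module decomposition below (a shared backbone, a heatmap decoder, and a coordinate-regression encoder) is an assumption made for illustration, not the paper's code; the relevant details are the detach() that keeps regression gradients from reaching the heatmap-supervised features, and skipping the heatmap branch at inference.

```python
import torch.nn as nn

class TwoHeadPoseNet(nn.Module):
    """Heatmap-supervised embedding plus a coordinate-regression head (illustrative)."""

    def __init__(self, backbone, heatmap_decoder, regression_encoder):
        super().__init__()
        self.backbone = backbone                      # shared feature extractor
        self.heatmap_decoder = heatmap_decoder        # supervised with heatmap/offset losses
        self.regression_encoder = regression_encoder  # regresses keypoint coordinates

    def forward(self, image, training=True):
        features = self.backbone(image)
        # Stop-gradient: the regression loss never updates the heatmap-trained features.
        coords = self.regression_encoder(features.detach())
        if training:
            heatmaps = self.heatmap_decoder(features)
            return coords, heatmaps
        # At inference the heatmap branch is discarded entirely.
        return coords
```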
