VitPose

Overview

Simple Vision Transformer Baselines for Human Pose Estimation

APIs

Model

Bases: BaseModel

build_model(det_model_name, pose_model_name)

Load the pretrained detection and pose models.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `det_model_name` | `str` | detection model name | required |
| `pose_model_name` | `str` | pose model name | required |

Returns:

| Name | Type | Description |
|---|---|---|
| | `_type_` | description |

process_predictions(video_path, box_score_threshold, max_num_frames, kpt_score_threshold, vis_dot_radius, vis_line_thickness)

Run detection and pose estimation on a video.

Parameters:

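| Name | Type | Description | Default |
|---|---|---|---|
| `video_path` | `str` | path to the input video | required |
| `box_score_threshold` | `float` | score threshold for detected bounding boxes | required |
| `max_num_frames` | `int` | maximum number of frames to process | required |
| `kpt_score_threshold` | `float` | score threshold for keypoints | required |
| `vis_dot_radius` | `int` | radius of the dots drawn for keypoints | required |
| `vis_line_thickness` | `int` | thickness of the lines drawn between keypoints | required |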
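
The `box_score_threshold` and `max_num_frames` parameters above are not explained beyond their names. The sketch below shows one plausible reading, not the library's implementation: detections scoring below the threshold are dropped, and processing stops after a fixed number of frames. The box layout `(x1, y1, x2, y2, score)`, the `>=` comparison, and the helper names `keep_confident_boxes` and `limit_frames` are all assumptions made for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator

# Hypothetical detection record: (x1, y1, x2, y2, score).
# The real box layout used by the model is not documented here.
Box = tuple[float, float, float, float, float]


def keep_confident_boxes(boxes: list[Box], box_score_threshold: float) -> list[Box]:
    """Keep boxes whose score (last element) is at least the threshold."""
    return [box for box in boxes if box[4] >= box_score_threshold]


def limit_frames(frames: Iterable, max_num_frames: int) -> Iterator:
    """Yield at most max_num_frames items from a stream of frames."""
    return islice(frames, max_num_frames)


# Stand-in detections for three frames (made-up numbers).
detections_per_frame = [
    [(10, 20, 110, 220, 0.92), (5, 5, 40, 60, 0.30)],
    [(12, 22, 112, 222, 0.88)],
    [(15, 25, 115, 225, 0.10)],
]

# Only the first two frames are visited; low-scoring boxes are dropped.
for boxes in limit_frames(detections_per_frame, max_num_frames=2):
    print(keep_confident_boxes(boxes, box_score_threshold=0.5))
```

Whether the actual model treats a score equal to the threshold as a keep or a drop is not stated in this reference, so check the source before relying on boundary behavior.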

Returns:

| Type | Description |
|---|---|
| `tuple[str, list[list[dict[str, np.ndarray]]]]` | description |
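
The second element of the returned tuple is a nested list of dicts. A natural reading, though the reference does not confirm it, is one list per frame holding one dict per detected person. The sketch below walks such a structure and counts the keypoints that pass a score threshold. The dict key `"keypoints"`, the `(x, y, score)` row layout, and the function name `count_visible_keypoints` are assumptions, and plain nested lists stand in for the `np.ndarray` values so the example runs without NumPy. Row-wise iteration works the same way on a 2-D array.

```python
from typing import Any, Mapping, Sequence


def count_visible_keypoints(
    frames: Sequence[Sequence[Mapping[str, Any]]],
    kpt_score_threshold: float,
    key: str = "keypoints",  # hypothetical key name, not taken from the library
) -> list[list[int]]:
    """For each frame and each person, count keypoints scoring at least the threshold.

    Each keypoint row is assumed to be (x, y, score).
    """
    counts = []
    for people in frames:
        frame_counts = []
        for person in people:
            rows = person[key]
            frame_counts.append(
                sum(1 for row in rows if row[2] >= kpt_score_threshold)
            )
        counts.append(frame_counts)
    return counts


# Stand-in predictions: one person in the first frame, nobody in the second.
predictions = [
    [{"keypoints": [[50.0, 40.0, 0.9], [55.0, 42.0, 0.2]]}],
    [],
]

print(count_visible_keypoints(predictions, kpt_score_threshold=0.3))
```

Inspecting the keys of one returned dict (for example with `predictions[0][0].keys()`) is the quickest way to replace the assumed key name with the real one.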