VitPose
Overview
Simple Vision Transformer Baselines for Human Pose Estimation
APIs
Model
Bases: BaseModel
build_model(det_model_name, pose_model_name)
Load the pretrained detection and pose models.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
det_model_name | str | detection model name | required |
pose_model_name | str | pose model name | required |
Returns:
Name | Type | Description |
---|---|---|
_type_ |
description |
process_predictions(video_path, box_score_threshold, max_num_frames, kpt_score_threshold, vis_dot_radius, vis_line_thickness)
Process a video.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
video_path | str | path to the input video | required |
box_score_threshold | float | score threshold for bounding boxes | required |
max_num_frames | int | maximum number of frames | required |
kpt_score_threshold | float | keypoint score threshold | required |
vis_dot_radius | int | radius of the drawn keypoint dots | required |
vis_line_thickness | int | line thickness | required |
Returns:
Type | Description |
---|---|
tuple[str, list[list[dict[str, ndarray]]]] | tuple[str, list[list[dict[str, np.ndarray]]]]: description |
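The nested return type can be hard to read at a glance, so here is a minimal sketch of walking a structure of the shape `list[list[dict[str, np.ndarray]]]` (frames, then people, then named arrays). The dictionary keys `"bbox"` and `"keypoints"`, the `(K, 3)` keypoint layout of `x, y, score`, and the helper `count_confident_keypoints` are assumptions made for illustration; this page does not document them, and the data below is synthetic rather than real model output.

```python
import numpy as np


def count_confident_keypoints(preds, kpt_score_threshold=0.3):
    """Count, per frame and per person, the keypoints scoring at or above the threshold.

    `preds` is assumed to be a list of frames, each a list of per-person dicts.
    """
    counts = []
    for frame in preds:
        frame_counts = []
        for person in frame:
            # Assumed layout: one row per keypoint, columns are x, y, score.
            kpts = person["keypoints"]
            frame_counts.append(int((kpts[:, 2] >= kpt_score_threshold).sum()))
        counts.append(frame_counts)
    return counts


# Synthetic stand-in for the second element of the returned tuple.
person = {
    "bbox": np.array([10.0, 20.0, 110.0, 220.0, 0.9]),  # assumed: x1, y1, x2, y2, score
    "keypoints": np.array(
        [
            [50.0, 60.0, 0.95],
            [55.0, 80.0, 0.10],
            [60.0, 100.0, 0.40],
        ]
    ),
}
preds = [[person], []]  # frame 0 has one person, frame 1 has none

counts = count_confident_keypoints(preds, kpt_score_threshold=0.3)
print(counts)  # [[2], []]
```

If the real dictionaries use different keys or a different array layout, only the two lines that read `person["keypoints"]` and index column 2 would need to change.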