Grounding DINO
Overview
Grounding DINO is a zero-shot object detection model that combines a Transformer-based DINO detector with grounded pre-training.
Usage
Options
Example of opt.json:
```json
{
  "weights_directory": "weights",
  "weights": "./weights/groundingdino_swint_ogc.pth",
  "weights_url": "https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth",
  "config_path": "config/GroundingDINO_SwinT_OGC.py",
  "TEXT_PROMPT": "scratch",
  "BOX_TRESHOLD": 0.1,
  "TEXT_TRESHOLD": 0.25,
  "area_percent": 1,
  "type": "pytorch"
}
```
With:
- weights: path where the weights are saved
- weights_url: URL to download the weights from
- TEXT_PROMPT: prompt describing what to detect
- BOX_TRESHOLD: box confidence threshold
- TEXT_TRESHOLD: text confidence threshold
- area_percent: area limit, as a percentage
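The documentation does not spell out how area_percent is applied. One plausible reading, sketched below with a hypothetical helper name, is filtering out detections whose box covers more than that percentage of the image:

```python
def filter_by_area_percent(boxes, image_w, image_h, area_percent):
    """Keep boxes whose area is at most `area_percent` percent of the image.

    Hypothetical sketch of how the `area_percent` option could be applied;
    the actual behaviour inside the package may differ.
    Boxes are (x1, y1, x2, y2) in pixels.
    """
    max_area = image_w * image_h * area_percent / 100.0
    kept = []
    for x1, y1, x2, y2 in boxes:
        if (x2 - x1) * (y2 - y1) <= max_area:
            kept.append((x1, y1, x2, y2))
    return kept

# A 10x10 box on a 100x100 image covers exactly 1% and is kept at area_percent=1;
# the 60x60 box covers 36% and is dropped.
print(filter_by_area_percent([(0, 0, 10, 10), (0, 0, 60, 60)], 100, 100, 1))
```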
Initialize model
The model is built through the build_model method. First, create a Model instance:
```python
import json
import os
import sys
from argparse import ArgumentParser
from pathlib import Path

FILE = Path(__file__).resolve()
FILE_DIR = os.path.dirname(__file__)
ROOT = FILE.parents[0]
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))
sys.path.insert(-1, os.path.join(os.path.dirname(__file__), "..", ".."))

from ecos_core.groundingDino.dino import Model

parser = ArgumentParser()
opts = parser.parse_args()

# Load the options from opt.json into the parsed namespace
opt_path = "./opt.json"
with open(opt_path, "r") as f:
    opts.__dict__ = json.load(f)

model_instance = Model(opts)
```
- Use a custom weight:

```python
# Build model with custom weight
model = model_instance.build_model(custom_weight=<PATH_TO_WEIGHT>)
```
- Build the model with default options (the weights will be downloaded and saved into ecos_core):

```python
model = model_instance.build_model()
```
- Build the model with default options (the weights will be downloaded and saved according to the deployed model):

```python
# Override the weight path
self.opt.weights = os.path.join(FILE_DIR, self.opt.weights)
# Build the model
model = model_instance.build_model()
```
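The weight-path override above simply joins the relative weights path from opt.json onto the deployment directory. A minimal stdlib sketch of that step, with a hypothetical helper name and base directory:

```python
import os

def resolve_weights_path(base_dir, weights):
    """Join a relative weights path onto a deployment directory, mirroring
    the `os.path.join(FILE_DIR, self.opt.weights)` override above.
    Absolute paths are returned unchanged (a sketch, not the package's code).
    """
    if os.path.isabs(weights):
        return weights
    return os.path.join(base_dir, weights)

print(resolve_weights_path("/srv/deploy", "weights/groundingdino_swint_ogc.pth"))
```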
APIs
Model
Bases: BaseModel
build_model(custom_weight=None)
Build the GroundingDINO model.

Returns:

Name | Type | Description
---|---|---
 | _type_ | description
get_transform()
Get the image data transform function used for preprocessing.

Returns:

Name | Type | Description
---|---|---
transform | func | Transform function. Takes an image path as input and returns raw_image (image as a numpy mat) and image_transform (PyTorch image tensor). Example: a torchvision.transforms data transform function.
inference(image, caption, box_threshold, text_threshold)
Run Grounding DINO inference given an image, a caption, a box threshold, and a text threshold.

Parameters:

Name | Type | Description | Default
---|---|---|---
image | Tensor | input image | required
caption | str | prompt to detect | required
box_threshold | float | box threshold for filtering | required
text_threshold | float | text threshold for filtering | required
Returns:

Name | Type | Description
---|---|---
boxes | Tensor | tensor containing the boxes, shape (n, 4)
logits | Tensor | tensor containing a confidence score for each box
phrases | list | list of class names
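For intuition, the confidence-threshold filtering that inference performs can be sketched in plain Python. This is a hypothetical helper operating on lists rather than torch tensors, so it only illustrates the logic:

```python
def filter_detections(boxes, logits, phrases, box_threshold):
    """Keep only detections whose confidence exceeds `box_threshold`.

    Plain-Python sketch of the thresholding step; the real model works on
    torch tensors. `boxes` is a list of box tuples, `logits` a list of
    confidences, `phrases` a list of class names, all the same length.
    """
    kept = [
        (box, score, phrase)
        for box, score, phrase in zip(boxes, logits, phrases)
        if score > box_threshold
    ]
    if not kept:
        return [], [], []
    # Unzip back into three parallel lists
    b, s, p = zip(*kept)
    return list(b), list(s), list(p)

boxes, scores, names = filter_detections(
    [(0.1, 0.1, 0.2, 0.2), (0.5, 0.5, 0.3, 0.3)],
    [0.05, 0.42],
    ["scratch", "scratch"],
    box_threshold=0.1,
)
print(names)  # only the 0.42-confidence detection survives
```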
process_image(image_path)
Process an image for the GroundingDINO model.

Parameters:

Name | Type | Description | Default
---|---|---|---
image_path | _type_ | image path | required
Returns:
process_predictions(net_output, raw_image_path, raw_image_mat, image_transform, save_image_path)
Post-process the output of the network.

Parameters:

Name | Type | Description | Default
---|---|---|---
net_output | _type_ | output of the detection net | required
raw_image_path | str | raw image path | required

Returns:

Type | Description
---|---
 | save_image_path, indicating that the annotated image has been written successfully. In the same directory as the annotated image, an annotation text file named "annotated_image.txt" is written; each line uses the YOLO format: class, x, y, w, h, confidence, class_name
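The "annotated_image.txt" lines described above can be read back with a short stdlib helper. The helper name is hypothetical; the field order is taken from the line format documented above:

```python
def parse_annotation_line(line):
    """Parse one line of "annotated_image.txt".

    Assumes the documented line format:
    class, x, y, w, h, confidence, class_name
    """
    parts = [p.strip() for p in line.split(",")]
    cls, x, y, w, h, conf, class_name = parts
    return {
        "class": int(cls),
        "box": (float(x), float(y), float(w), float(h)),
        "confidence": float(conf),
        "class_name": class_name,
    }

print(parse_annotation_line("0, 0.5, 0.5, 0.2, 0.1, 0.42, scratch"))
```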
reload_param(opt_path='./opt.json')
Reload parameters.

Parameters:

Name | Type | Description | Default
---|---|---|---
opt_path | str | Opt path. Defaults to "./opt.json". | './opt.json'
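reload_param presumably re-reads the JSON options into the model's option object, mirroring the loading step shown in the initialization snippet. A self-contained sketch of that pattern, under that assumption:

```python
import json
import os
import tempfile
from argparse import Namespace

def reload_param(opts, opt_path="./opt.json"):
    """Re-read opt.json and overwrite the attributes of an options namespace.

    Sketch of the documented reload pattern; the package's own method may
    differ in details.
    """
    with open(opt_path, "r") as f:
        opts.__dict__.update(json.load(f))
    return opts

# Example: write a minimal opt.json to a temp directory and reload it
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "opt.json")
    with open(path, "w") as f:
        json.dump({"TEXT_PROMPT": "scratch", "BOX_TRESHOLD": 0.1}, f)
    opts = reload_param(Namespace(), path)
print(opts.TEXT_PROMPT)  # scratch
```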