
Grounding DINO

Overview

Grounding DINO is a zero-shot object detection model that combines the Transformer-based DINO detector with grounded pre-training.

Usage

Options

Example of opt.json:

{
    "weights_directory": "weights",
    "weights": "./weights/groundingdino_swint_ogc.pth",
    "weights_url":"https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth",
    "config_path": "config/GroundingDINO_SwinT_OGC.py",
    "TEXT_PROMPT": "scratch",
    "BOX_TRESHOLD": 0.1,
    "TEXT_TRESHOLD": 0.25,
    "area_percent": 1,
    "type": "pytorch"
}

With:

  • weights_directory: directory where the weights are stored
  • weights: path to save the weights file
  • weights_url: URL to download the weights from
  • config_path: path to the model configuration file
  • TEXT_PROMPT: text prompt describing what to detect
  • BOX_TRESHOLD: box confidence threshold
  • TEXT_TRESHOLD: text confidence threshold
  • area_percent: limit on the detection area, as a percentage
  • type: model framework type (e.g. "pytorch")
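
Since opt.json is plain JSON, it is easy to sanity-check before building the model. The sketch below is illustrative only: validate_opts is a hypothetical helper (not part of the package) that checks the key names and threshold ranges shown above.

```python
# Keys the example opt.json above is expected to contain
REQUIRED_KEYS = {
    "weights", "weights_url", "config_path",
    "TEXT_PROMPT", "BOX_TRESHOLD", "TEXT_TRESHOLD",
}

def validate_opts(opts):
    """Hypothetical helper: check the keys and threshold ranges of an opt.json dict."""
    missing = REQUIRED_KEYS - opts.keys()
    if missing:
        raise ValueError(f"opt.json is missing keys: {sorted(missing)}")
    for key in ("BOX_TRESHOLD", "TEXT_TRESHOLD"):
        if not 0.0 <= float(opts[key]) <= 1.0:
            raise ValueError(f"{key} must be in [0, 1], got {opts[key]}")
    return opts

# Validate the example options from above
opts = validate_opts({
    "weights": "./weights/groundingdino_swint_ogc.pth",
    "weights_url": "https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth",
    "config_path": "config/GroundingDINO_SwinT_OGC.py",
    "TEXT_PROMPT": "scratch",
    "BOX_TRESHOLD": 0.1,
    "TEXT_TRESHOLD": 0.25,
})
```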

Initialize model

The model is constructed with the build_model method, which can be used in the following ways:

import json
import os
import sys
from argparse import ArgumentParser
from pathlib import Path

FILE = Path(__file__).resolve()
FILE_DIR = os.path.dirname(__file__)
ROOT = FILE.parents[0]
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))
sys.path.insert(-1, os.path.join(FILE_DIR, "..", ".."))

from ecos_core.groundingDino.dino import Model

# Load the options from opt.json (see the Options section above)
parser = ArgumentParser()
opts = parser.parse_args()
opt_path = "./opt.json"
with open(opt_path, "r") as f:
    opts.__dict__ = json.load(f)
model_instance = Model(opts)
  • Use a custom weights file:
# Build the model with custom weights
model = model_instance.build_model(custom_weight=<PATH_TO_WEIGHT>)
  • Build the model with default options (the weights will be downloaded and saved into ecos_core):
model = model_instance.build_model()
  • Build the model with default options (the weights will be downloaded and saved according to the deployed model):
# Override the weights path so it resolves relative to this file
opts.weights = os.path.join(FILE_DIR, opts.weights)
# Build the model
model = model_instance.build_model()

APIs

Model

Bases: BaseModel

build_model(custom_weight=None)

Build the GroundingDINO model.

Returns:

  _type_: description

get_transform()

Get the image data transform function used for preprocessing.

Returns:

  • transform (func): transform function. It takes an image path as input and outputs:
      - raw_image: image as a numpy mat
      - image_transform: image tensor (PyTorch)
    Example: torchvision.transforms.transforms data transform functions.

inference(image, caption, box_threshold, text_threshold)

Run GroundingDINO inference given an image, a caption, a box_threshold and a text_threshold.

Parameters:

  • image (Tensor): input image. Required.
  • caption (str): prompt to detect. Required.
  • box_threshold (float): box threshold for filtering. Required.
  • text_threshold (float): text threshold for filtering. Required.

Returns:

  • boxes (Tensor): tensor containing boxes, shape (n, 4)
  • logits (Tensor): tensor containing the confidence score for each box
  • phrases (list): list of class names
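
To make these return values concrete, here is a framework-free sketch of the score filtering that box_threshold implies, using plain Python lists in place of tensors. filter_detections is an illustrative helper, not part of the API.

```python
def filter_detections(boxes, logits, phrases, box_threshold):
    """Keep only detections whose confidence meets box_threshold.

    boxes:   list of [cx, cy, w, h] (stand-in for a shape (n, 4) tensor)
    logits:  list of per-box confidence scores
    phrases: list of class names, one per box
    """
    kept = [(b, s, p) for b, s, p in zip(boxes, logits, phrases) if s >= box_threshold]
    if not kept:
        return [], [], []
    boxes_f, logits_f, phrases_f = map(list, zip(*kept))
    return boxes_f, logits_f, phrases_f

boxes = [[0.5, 0.5, 0.2, 0.1], [0.3, 0.4, 0.1, 0.1]]
logits = [0.87, 0.05]
phrases = ["scratch", "scratch"]
b, s, p = filter_detections(boxes, logits, phrases, box_threshold=0.1)
# b == [[0.5, 0.5, 0.2, 0.1]], s == [0.87], p == ["scratch"]
```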

process_image(image_path)

Process an image for the groundingDINO model.

Parameters:

  • image_path (_type_): image path. Required.

Returns:

process_predictions(net_output, raw_image_path, raw_image_mat, image_transform, save_image_path)

Post-process the output of the detection network.

Parameters:

  • net_output (_type_): output of the detection network. Required.
  • raw_image_path (str): raw image path. Required.

Returns:

  • save_image_path: indicates that the annotated image has been written successfully. In the same directory as the annotated image, an annotation text file named "annotated_image.txt" is written; each line uses the YOLO-style format: class, x, y, w, h, confidence, class_name.
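
The annotation text file is easy to read back. Below is a minimal parser sketch, assuming only the comma-separated line layout stated above; parse_annotation_line is a hypothetical helper, not part of the package.

```python
def parse_annotation_line(line):
    """Parse one line of annotated_image.txt:
    class, x, y, w, h, confidence, class_name (YOLO-style, comma-separated)."""
    cls, x, y, w, h, conf, class_name = [part.strip() for part in line.split(",")]
    return {
        "class": int(cls),
        "x": float(x), "y": float(y), "w": float(w), "h": float(h),
        "confidence": float(conf),
        "class_name": class_name,
    }

det = parse_annotation_line("0, 0.5, 0.5, 0.2, 0.1, 0.87, scratch")
# det["class_name"] == "scratch", det["confidence"] == 0.87
```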

reload_param(opt_path='./opt.json')

Reload parameters from the options file.

Parameters:

  • opt_path (str): options file path. Defaults to "./opt.json".
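
As a rough illustration only: assuming reload_param simply re-reads the JSON file into the options namespace, an equivalent sketch might look like the following. This is an assumption about the behavior, not the package's actual implementation.

```python
import json

def reload_param(opt, opt_path="./opt.json"):
    """Sketch (assumed behavior): re-read opt.json and update the
    options object's attributes in place from the JSON contents."""
    with open(opt_path, "r") as f:
        opt.__dict__.update(json.load(f))
    return opt
```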