YOLOv5 Object Detection Code Walkthrough

Created Fri, 27 Jun 2025 16:29:48 +0800 Modified Fri, 27 Jun 2025 20:20:23 +0800

Preface

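An earlier post walked through the complete YOLOv3 inference code; see that post for the details. The YOLOv5 inference code comes from the same team, so little has actually changed: the cfg configuration file has been replaced by a YAML file, and the weights are stored in PyTorch format.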

First, download the source code:

git clone -b v4.0 --depth 1 http://github.com/ultralytics/yolov5.git

This walkthrough uses the v4.0 release. Its directory structure is as follows:

|   detect.py
|   Dockerfile
|   hubconf.py
|   LICENSE
|   README.md
|   requirements.txt
|   test.py
|   train.py
|   tutorial.ipynb
|
+---.github
|   |   dependabot.yml
|   |
|   +---ISSUE_TEMPLATE
|   |       bug-report.md
|   |       feature-request.md
|   |       question.md
|   |
|   \---workflows
|           ci-testing.yml
|           codeql-analysis.yml
|           greetings.yml
|           rebase.yml
|           stale.yml
|
+---data
|   |   coco.yaml
|   |   coco128.yaml
|   |   hyp.finetune.yaml
|   |   hyp.scratch.yaml
|   |   voc.yaml
|   |
|   +---images
|   |       bus.jpg
|   |       zidane.jpg
|   |
|   \---scripts
|           get_coco.sh
|           get_voc.sh
|
+---models
|   |   common.py
|   |   experimental.py
|   |   export.py
|   |   yolo.py
|   |   yolov5l.yaml
|   |   yolov5m.yaml
|   |   yolov5s.yaml
|   |   yolov5x.yaml
|   |   __init__.py
|   |
|   \---hub
|           anchors.yaml
|           yolov3-spp.yaml
|           yolov3-tiny.yaml
|           yolov3.yaml
|           yolov5-fpn.yaml
|           yolov5-p2.yaml
|           yolov5-p6.yaml
|           yolov5-p7.yaml
|           yolov5-panet.yaml
|
+---utils
|   |   activations.py
|   |   autoanchor.py
|   |   datasets.py
|   |   general.py
|   |   google_utils.py
|   |   loss.py
|   |   metrics.py
|   |   plots.py
|   |   torch_utils.py
|   |   __init__.py
|   |
|   \---google_app_engine
|           additional_requirements.txt
|           app.yaml
|           Dockerfile
|
\---weights
        download_weights.sh

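YOLOv5 adds loading through torch.hub (hubconf.py) and ships shell scripts for downloading the COCO and PASCAL VOC datasets. It also includes a tutorial notebook, tutorial.ipynb, that shows developers how to call it.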

Object Detection Training Code

The training code lives in the train.py module. It starts with the command-line arguments:

    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='yolov5s.pt', help='initial weights path')
    parser.add_argument('--cfg', type=str, default='', help='model.yaml path')
    parser.add_argument('--data', type=str, default='data/coco128.yaml', help='data.yaml path')
    parser.add_argument('--hyp', type=str, default='data/hyp.scratch.yaml', help='hyperparameters path')
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs')
    parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
    parser.add_argument('--rect', action='store_true', help='rectangular training')
    parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
    parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
    parser.add_argument('--notest', action='store_true', help='only test final epoch')
    parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
    parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
    parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
    parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
    parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
    parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
    parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
    parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
    parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
    parser.add_argument('--log-imgs', type=int, default=16, help='number of images for W&B logging, max 100')
    parser.add_argument('--log-artifacts', action='store_true', help='log artifacts, i.e. final trained model')
    parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers')
    parser.add_argument('--project', default='runs/train', help='save to project/name')
    parser.add_argument('--name', default='exp', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--quad', action='store_true', help='quad dataloader')
    opt = parser.parse_args()

The main arguments are:

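  • weights: path to the initial model weights file
  • cfg: path to the model configuration file
  • data: the dataset configuration to train on, data/coco128.yaml by default. This dataset is only 6.6 MB, which makes it suitable for a quick training test. The full datasets are downloaded with the shell scripts in the data/scripts directory; COCO is 27 GB and VOC is 2.8 GB
  • hyp: path to the hyperparameter file
  • epochs: number of training epochs, 300 by default
  • batch-size: total batch size across all GPUs, 16 by default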

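The dataset directory is laid out as follows:

  • images: the image files
  • labels: the annotation files

Each image has a matching annotation text file, with contents like this: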

45 0.479492 0.688771 0.955609 0.5955
45 0.736516 0.247188 0.498875 0.476417
50 0.637063 0.732938 0.494125 0.510583
45 0.339438 0.418896 0.678875 0.7815

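Here 45 and 50 are class IDs, and the four values that follow are the box's x, y, w, h (center x, center y, width, height), normalized to the image size. The dataset YAML file needs to specify the following: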

train: ../coco128/images/train2017/  # 128 images
val: ../coco128/images/train2017/  # 128 images

# number of classes
nc: 80

# class names
names: [ 'person', ..., 'hair drier', 'toothbrush' ]

These are the training and validation set directories, the number of classes, and the name of each class.
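
To make the label format concrete, here is a minimal sketch that parses one label line and converts the normalized box to pixel corner coordinates. The helper names parse_label_line and to_pixel_xyxy and the 640x480 image size are assumptions made for illustration; they are not taken from the YOLOv5 source.

```python
from typing import Tuple


def parse_label_line(line: str) -> Tuple[int, float, float, float, float]:
    """Split one label line into (class_id, x_center, y_center, width, height)."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    x, y, w, h = (float(p) for p in parts[1:])
    return class_id, x, y, w, h


def to_pixel_xyxy(x: float, y: float, w: float, h: float,
                  img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Convert a normalized center/size box to pixel (x1, y1, x2, y2) corners."""
    x1 = (x - w / 2) * img_w
    y1 = (y - h / 2) * img_h
    x2 = (x + w / 2) * img_w
    y2 = (y + h / 2) * img_h
    return x1, y1, x2, y2


# Second line of the sample label file; the image size is a made-up example.
cls, x, y, w, h = parse_label_line("45 0.736516 0.247188 0.498875 0.476417")
box = to_pixel_xyxy(x, y, w, h, img_w=640, img_h=480)
print(cls, [round(v, 1) for v in box])
```

Going the other way (pixel corners to normalized center and size) is what you need when converting your own annotations into this format.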

Next come the settings for multi-GPU (DDP) training:

    # Set DDP variables
    opt.world_size = int(os.environ['WORLD_SIZE']) if 'WORLD_SIZE' in os.environ else 1
    opt.global_rank = int(os.environ['RANK']) if 'RANK' in os.environ else -1
    set_logging(opt.global_rank)
    if opt.global_rank in [-1, 0]:
        check_git_status()

    # DDP mode
    opt.total_batch_size = opt.batch_size
    device = select_device(opt.device, batch_size=opt.batch_size)
    if opt.local_rank != -1:
        assert torch.cuda.device_count() > opt.local_rank
        torch.cuda.set_device(opt.local_rank)
        device = torch.device('cuda', opt.local_rank)
        dist.init_process_group(backend='nccl', init_method='env://')  # distributed backend
        assert opt.batch_size % opt.world_size == 0, '--batch-size must be multiple of CUDA device count'
        opt.batch_size = opt.total_batch_size // opt.world_size
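
The batch-size bookkeeping above can be isolated into a small sketch. The function ddp_settings is a hypothetical helper, not part of train.py, and it simplifies the original slightly: train.py only divides the batch size when local_rank is not -1, whereas here the division always runs and is a no-op for a single process.

```python
from typing import Mapping, Tuple


def ddp_settings(env: Mapping[str, str], total_batch_size: int) -> Tuple[int, int, int]:
    """Return (world_size, global_rank, per_process_batch_size).

    Mirrors the defaults used in train.py: WORLD_SIZE falls back to 1
    and RANK falls back to -1 when the variables are not set.
    """
    world_size = int(env.get("WORLD_SIZE", 1))
    global_rank = int(env.get("RANK", -1))
    if total_batch_size % world_size != 0:
        raise ValueError("--batch-size must be a multiple of the process count")
    return world_size, global_rank, total_batch_size // world_size


# Plain dicts stand in for os.environ so the example has no side effects.
single = ddp_settings({}, 16)
multi = ddp_settings({"WORLD_SIZE": "4", "RANK": "2"}, 64)
print(single)  # (1, -1, 16)
print(multi)   # (4, 2, 16)
```

The point to take away is that --batch-size is the total across all processes, so each GPU sees total_batch_size // world_size samples per step.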

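Next is resuming an interrupted run from the most recent checkpoint. Enable it by passing --resume on the command line, optionally followed by the path of a specific checkpoint: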

    # Resume
    if opt.resume:  # resume an interrupted run
        ckpt = opt.resume if isinstance(opt.resume, str) else get_latest_run()  # specified or most recent path
        assert os.path.isfile(ckpt), 'ERROR: --resume checkpoint does not exist'
        apriori = opt.global_rank, opt.local_rank
        with open(Path(ckpt).parent.parent / 'opt.yaml') as f:
            opt = argparse.Namespace(**yaml.load(f, Loader=yaml.FullLoader))  # replace
        opt.cfg, opt.weights, opt.resume, opt.global_rank, opt.local_rank = '', ckpt, True, *apriori  # reinstate
        logger.info('Resuming training from %s' % ckpt)
    else:
        # opt.hyp = opt.hyp or ('hyp.finetune.yaml' if opt.weights else 'hyp.scratch.yaml')
        opt.data, opt.cfg, opt.hyp = check_file(opt.data), check_file(opt.cfg), check_file(opt.hyp)  # check files
        assert len(opt.cfg) or len(opt.weights), 'either --cfg or --weights must be specified'
        opt.img_size.extend([opt.img_size[-1]] * (2 - len(opt.img_size)))  # extend to 2 sizes (train, test)
        opt.name = 'evolve' if opt.evolve else opt.name
        opt.save_dir = increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok | opt.evolve)  # increment run
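
Two details of this block are easy to miss: how --resume can be either a flag or a path, and how a single --img-size value is expanded to a (train, test) pair. The sketch below reproduces just those two arguments with the same argparse settings as train.py. The helpers build_parser and normalize_img_size and the checkpoint path are illustrative assumptions.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Parser with only the two arguments discussed here."""
    parser = argparse.ArgumentParser()
    # nargs='?' with const=True: a bare --resume yields True,
    # --resume PATH yields the path string, and omitting it yields False.
    parser.add_argument('--resume', nargs='?', const=True, default=False)
    parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640])
    return parser


def normalize_img_size(img_size: List[int]) -> List[int]:
    """Repeat the last size until there are two entries (train, test)."""
    sizes = list(img_size)  # copy so the parser default is not mutated
    sizes.extend([sizes[-1]] * (2 - len(sizes)))
    return sizes


parser = build_parser()
print(parser.parse_args([]).resume)            # False
print(parser.parse_args(['--resume']).resume)  # True
# Hypothetical checkpoint path, used only to show the string case.
print(parser.parse_args(['--resume', 'runs/train/exp/weights/last.pt']).resume)
print(normalize_img_size(parser.parse_args(['--img-size', '512']).img_size))  # [512, 512]
```

This is why the resume code checks isinstance(opt.resume, str): a string means the user named a checkpoint, while True means "find the latest run".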

Then the hyperparameters are loaded:

    # Hyperparameters
    with open(opt.hyp) as f:
        hyp = yaml.load(f, Loader=yaml.FullLoader)  # load hyps
        if 'box' not in hyp:
            warn('Compatibility: %s missing "box" which was renamed from "giou" in %s' %
                 (opt.hyp, 'https://github.com/ultralytics/yolov5/pull/1120'))
            hyp['box'] = hyp.pop('giou')
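
The compatibility branch simply renames a legacy key. Here is a minimal sketch of the same idea on a plain dict, without reading a YAML file. The function upgrade_hyp and the sample values are made up for illustration.

```python
import warnings


def upgrade_hyp(hyp: dict) -> dict:
    """Return a copy of hyp with the legacy 'giou' key renamed to 'box'."""
    hyp = dict(hyp)  # work on a copy so the caller's dict is untouched
    if 'box' not in hyp:
        warnings.warn('hyperparameters missing "box"; using the legacy "giou" value')
        # As in train.py, this raises KeyError if neither key is present.
        hyp['box'] = hyp.pop('giou')
    return hyp


# Made-up legacy hyperparameters, standing in for an old hyp file.
legacy = {'lr0': 0.01, 'giou': 0.05}
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # keep the example output quiet
    upgraded = upgrade_hyp(legacy)
print(upgraded)  # {'lr0': 0.01, 'box': 0.05}
```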

Finally, the train function is called to train the model:

    # Train
    logger.info(opt)
    if not opt.evolve:
        tb_writer = None  # init loggers
        if opt.global_rank in [-1, 0]:
            logger.info(f'Start Tensorboard with "tensorboard --logdir {opt.project}", view at http://localhost:6006/')
            tb_writer = SummaryWriter(opt.save_dir)  # Tensorboard
        train(hyp, opt, device, tb_writer, wandb)

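TensorBoard's SummaryWriter collects the training metrics and writes them to opt.save_dir.

The overall logic is fairly clear and not hard to follow. To train on your own data, you only need to annotate the dataset in the label format described above.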
