RDK X5 使用 YOLO11 实现避障检测

摘要

本项目是一个基于ROS2和YOLO11的实时障碍物检测系统，可作为YOLOv5赛道障碍物检测的YOLOv8，YOLO11改进版，可无缝切换至此系统。系统能够实时检测赛道上的锥桶，二维码，停车标志，为赛车提供环境感知能力，在RDK X5平台上可跑满30FPS。

ROS2通信架构与消息设计

话题订阅与发布

系统采用ROS2的发布-订阅模式进行数据通信，实现了高效的分布式处理架构。主要话题包括：

输入话题：

/hbmem_img (类型：hbm_img_msgs::msg::HbmMsg1080P)
- 订阅摄像头实时图像数据
- 使用SensorDataQoS服务质量策略
- 支持1080P分辨率的NV12格式图像

输出话题：

/racing_obstacle_detection (类型：ai_msgs::msg::PerceptionTargets)
- 发布障碍物检测结果
- 包含目标类型、位置、置信度等信息
- 支持多目标同时检测

消息类型定义

系统使用标准化的ROS2消息类型，确保与其他系统的兼容性：

// 输入消息：HbmMsg1080P (hbm_img_msgs)
struct HbmMsg1080P {
    builtin_interfaces::msg::Time time_stamp;   // 时间戳
    std::vector<uint8_t> data;                  // NV12格式图像数据
    // 其他元数据字段
};

// 输出消息：PerceptionTargets (ai_msgs)
struct PerceptionTargets {
    std_msgs::msg::Header header;               // 消息头
    std::vector<Target> targets;                // 检测目标列表
};

struct Target {
    std::string type;                           // 目标类型
    uint32_t track_id;                          // 跟踪ID
    std::vector<Roi> rois;                      // 感兴趣区域
};

struct Roi {
    std::string type;                           // ROI类型 ("rect")
    sensor_msgs::msg::RegionOfInterest rect;    // 矩形区域
    float confidence;                           // 置信度
};

struct RegionOfInterest {
    uint32_t x_offset;                          // X偏移量
    uint32_t y_offset;                          // Y偏移量
    uint32_t height;                            // 高度
    uint32_t width;                             // 宽度
    bool do_rectify;                            // 是否校正
};

消息流处理流程

系统的消息处理流程采用异步回调机制，确保实时性能：

void img_callback(const hbm_img_msgs::msg::HbmMsg1080P::SharedPtr msg) {
    auto input_timestamp = msg->time_stamp;
    ++frame_count_; 

    // 1. 提取图像数据
    auto& frame = msg->data;
    int src_w = 640, src_h = 480;
    int dst_w = 640, dst_h = 640;
 
    // 2. 图像预处理
    std::vector<uint8_t> output_nv12(dst_w * dst_h * 3 / 2);
    int x_shift, y_shift;
    float x_scale, y_scale;
 
    letterbox_nv12(
        frame.data(), src_w, src_h, 
        output_nv12.data(), dst_w, dst_h,
        x_shift, y_shift, x_scale, y_scale
    );
        
    // 3. 执行障碍物检测
    detector_.detect(output_nv12.data());
    detector_.postprocessing(x_shift, y_shift, x_scale, y_scale, src_w, src_h);
 
    // 4. 发布检测结果
    publish_detection_results(input_timestamp);

    // 5. 性能监控
    auto now = std::chrono::steady_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::seconds>(now - last_time_).count();
    if (duration >= 1) {
        RCLCPP_INFO(this->get_logger(), "Frame rate: %.2f fps, last frame_id published: %lu",
              static_cast<float>(frame_count_) / duration, frame_count_); 
        frame_count_ = 0;
        last_time_ = now;
    }
}

系统架构设计

系统采用分层架构设计，从底层到顶层依次为硬件加速层、算法处理层、ROS2通信层和应用接口层。硬件加速层利用地平线RDK平台的BPU专用加速器，为YOLO11模型提供高效的推理能力；算法处理层实现图像预处理、模型推理和后处理等核心算法；ROS2通信层负责图像订阅和检测结果发布；应用接口层提供配置管理和系统监控功能。这种分层设计确保了系统的模块化和可维护性，同时通过硬件加速实现了实时性能要求。

int main(int argc, char * argv[]) {
    RacingObstacleDetection obstacleDetector;
    obstacleDetector.load_config();
    int code = obstacleDetector.load_bin_model();
    std::cout << "[INFO] Racing Obstacle Detection Init completed with code: " << code << std::endl;
    
    rclcpp::init(argc, argv);                       
    auto node = std::make_shared<ImageNV12Subscriber>(obstacleDetector);
    rclcpp::spin(node);                             
    rclcpp::shutdown();
    return 0;
}

图像预处理模块

图像预处理模块是系统的第一个环节，负责将摄像头输入的NV12格式图像转换为YOLO11模型所需的640x640输入格式。该模块实现了Letterbox变换算法，在保持图像原始宽高比的同时，通过缩放和填充将图像调整到目标尺寸。具体实现中，系统分别处理Y平面（亮度）和UV平面（色度），确保色彩信息的正确性。Letterbox变换的核心思想是计算缩放比例，然后在目标图像中心放置缩放后的图像，周围用127（灰度值）进行填充，这样既保持了图像比例，又满足了模型的输入要求。

void letterbox_nv12(const uint8_t* nv12, int src_w, int src_h,
                    uint8_t* out_nv12, int dst_w, int dst_h,
                    int& x_shift, int& y_shift, float& x_scale, float& y_scale) {
    // 1. 拆分Y/UV平面
    const uint8_t* y_in = nv12;
    const uint8_t* uv_in = nv12 + src_w * src_h;
 
    // 2. 构建Y和UV的Mat，分别处理
    cv::Mat y_plane_in(src_h, src_w, CV_8UC1, (void*)y_in);
    cv::Mat uv_plane_in(src_h/2, src_w/2, CV_8UC2, (void*)uv_in);
 
    // 3. 计算缩放比例和填充偏移
    x_scale = std::min(1.0 * dst_h / src_h, 1.0 * dst_w / src_w);
    y_scale = x_scale;
 
    int new_w = int(src_w * x_scale + 0.5);
    int new_h = int(src_h * y_scale + 0.5);
 
    x_shift = (dst_w - new_w) / 2;
    int x_other = dst_w - new_w - x_shift;
    y_shift = (dst_h - new_h) / 2;
    int y_other = dst_h - new_h - y_shift;
 
    // 4. 分别resize、letterbox Y和UV平面
    cv::Mat y_plane_resize, y_plane_out;
    cv::resize(y_plane_in, y_plane_resize, cv::Size(new_w, new_h), 0, 0, cv::INTER_LINEAR);
    cv::copyMakeBorder(y_plane_resize, y_plane_out, y_shift, y_other, x_shift, x_other, cv::BORDER_CONSTANT, cv::Scalar(127));
 
    // UV平面处理（UV分辨率减半）
    cv::Mat uv_resize, uv_out;
    int uv_new_w = new_w / 2;
    int uv_new_h = new_h / 2;
    int uv_x_shift = x_shift / 2;
    int uv_x_other = (dst_w/2) - uv_new_w - uv_x_shift;
    int uv_y_shift = y_shift / 2;
    int uv_y_other = (dst_h/2) - uv_new_h - uv_y_shift;
 
    cv::resize(uv_plane_in, uv_resize, cv::Size(uv_new_w, uv_new_h), 0, 0, cv::INTER_LINEAR);
    cv::copyMakeBorder(uv_resize, uv_out, uv_y_shift, uv_y_other, uv_x_shift, uv_x_other, cv::BORDER_CONSTANT, cv::Scalar(127, 127));
 
    // 5. 合并为NV12格式
    memcpy(out_nv12, y_plane_out.data, dst_w * dst_h);
    memcpy(out_nv12 + dst_w * dst_h, uv_out.data, (dst_w/2) * (dst_h/2) * 2);
}

YOLO11模型推理模块

YOLO11推理模块是整个系统的核心，负责深度学习模型的加载、推理和后处理。该模块支持D-Robotics量化的YOLO11模型，利用地平线RDK平台的BPU硬件加速器实现高效推理。模型加载过程中，系统会验证模型的输入输出规范，确保输入格式为NV12、布局为NCHW、尺寸为640x640x3，输出为6个输出头，分别对应不同尺度的特征图。推理过程中，系统使用BPU专用内存管理函数hbSysAllocCachedMem分配内存，通过hbDNNInfer执行异步推理，并通过hbDNNWaitTaskDone等待推理完成。

int RacingObstacleDetection::load_bin_model() {
    std::cout << "[INFO] Loading Binary Model From: " << model_file << std::endl;
    
    // 1. 初始化DNN模型
    const char *model_file_name = model_file.c_str();
    rdk_check_success(
        hbDNNInitializeFromFiles(&packed_dnn_handle, &model_file_name, 1),
        "hbDNNInitializeFromFiles failed");
    
    // 2. 获取模型名称和句柄
    const char **model_name_list;
    int model_count = 0;
    rdk_check_success(
        hbDNNGetModelNameList(&model_name_list, &model_count, packed_dnn_handle),
        "hbDNNGetModelNameList failed");
    
    const char *model_name = model_name_list[0];
    rdk_check_success(
        hbDNNGetModelHandle(&dnn_handle, packed_dnn_handle, model_name),
        "hbDNNGetModelHandle failed");
    
    // 3. 验证输入张量属性
    rdk_check_success(
        hbDNNGetInputTensorProperties(&input_properties, dnn_handle, 0),
        "hbDNNGetInputTensorProperties failed");
    
    // 检查输入格式
    if (input_properties.tensorType == HB_DNN_IMG_TYPE_NV12) {
        std::cout << "[INFO] Input Tensor Type: HB_DNN_IMG_TYPE_NV12" << std::endl;
    } else {
        std::cout << "[ERROR] Input Tensor Type is not HB_DNN_IMG_TYPE_NV12" << std::endl;
        return -1;
    }
    
    if (input_properties.tensorLayout == HB_DNN_LAYOUT_NCHW) {
        std::cout << "[INFO] Input Tensor Layout: HB_DNN_LAYOUT_NCHW" << std::endl;
    } else {
        std::cout << "[ERROR] Input Tensor Layout is not HB_DNN_LAYOUT_NCHW" << std::endl;
        return -1;
    }
    
    // 4. 验证输出张量属性（6个输出头）
    rdk_check_success(
        hbDNNGetOutputCount(&output_count, dnn_handle),
        "hbDNNGetOutputCount failed");
    
    if (output_count == 6) {
        for (int i = 0; i < 6; i++) {
            hbDNNTensorProperties output_properties;
            rdk_check_success(
                hbDNNGetOutputTensorProperties(&output_properties, dnn_handle, i),
                "hbDNNGetOutputTensorProperties failed");
            std::cout << "[INFO] Output[" << i << "] Valid Shape: (" 
                      << output_properties.validShape.dimensionSize[0] << ", "
                      << output_properties.validShape.dimensionSize[1] << ", "
                      << output_properties.validShape.dimensionSize[2] << ", "
                      << output_properties.validShape.dimensionSize[3] << ")" << std::endl;
        }
    }
    
    return 0;
}

void RacingObstacleDetection::detect(uint8_t* ynv12) {
    // 1. 准备输入张量
    task_handle = nullptr;
    input.properties = input_properties;
    hbSysAllocCachedMem(&input.sysMem[0], int(3 * input_H * input_W / 2));
    memcpy(input.sysMem[0].virAddr, ynv12, int(3 * input_H * input_W / 2));
    hbSysFlushMem(&input.sysMem[0], HB_SYS_MEM_CACHE_CLEAN);
 
    // 2. 准备输出张量
    output = new hbDNNTensor[output_count];
    for (int i = 0; i < 6; i++) {
        hbDNNTensorProperties& output_properties = output[i].properties;
        hbDNNGetOutputTensorProperties(&output_properties, dnn_handle, i);
        int out_aligned_size = output_properties.alignedByteSize;
        hbSysMem& mem = output[i].sysMem[0];
        hbSysAllocCachedMem(&mem, out_aligned_size);
    }
 
    // 3. 执行推理
    hbDNNInferCtrlParam infer_ctrl_param;
    HB_DNN_INITIALIZE_INFER_CTRL_PARAM(&infer_ctrl_param);
    hbDNNInfer(&task_handle, &output, &input, dnn_handle, &infer_ctrl_param);
    hbDNNWaitTaskDone(task_handle, 0);
}

多尺度特征图后处理

后处理模块实现了YOLO11的完整推理流程，包括DFL解码、Sigmoid激活、NMS处理和坐标转换。系统处理三个不同尺度的特征图：8x下采样（80x80）、16x下采样（40x40）和32x下采样（20x20），每个尺度都有对应的分类分数和边界框回归输出。DFL解码是YOLO11的核心创新，它将回归值建模为分布，通过指数函数和加权平均计算实际的边界框坐标。系统首先使用单调性筛选，只对置信度超过阈值的检测框进行DFL计算，避免了无用的计算开销。NMS处理去除重叠的检测框，最后通过坐标转换将模型输出坐标映射回原始图像坐标系。

int RacingObstacleDetection::postprocessing(float x_shift, float y_shift, float x_scale, float y_scale, int src_w, int src_h) {
    // YOLO11-Detect 后处理
    float CONF_THRES_RAW = -log(1 / score_threshold - 1);  // 利用反函数作用阈值
    std::vector<std::vector<cv::Rect2d>> bboxes(class_num);
    std::vector<std::vector<float>> scores(class_num);
    
    // 1. 小目标特征图处理 (8x下采样)
    // output[order[0]]: (1, H//8, W//8, class_num) - 分类分数
    // output[order[1]]: (1, H//8, W//8, 4*reg) - 边界框回归
    
    // 检查反量化类型
    if (output[order[0]].properties.quantiType != NONE) {
        std::cout << "output[order[0]] QuantiType is not NONE, please check!" << std::endl;
        return -1;
    }
    if (output[order[1]].properties.quantiType != SCALE) {
        std::cout << "output[order[1]] QuantiType is not SCALE, please check!" << std::endl;
        return -1;
    }
    
    // 对缓存的BPU内存进行刷新
    hbSysFlushMem(&(output[order[0]].sysMem[0]), HB_SYS_MEM_CACHE_INVALIDATE);
    hbSysFlushMem(&(output[order[1]].sysMem[0]), HB_SYS_MEM_CACHE_INVALIDATE);
    
    // 将BPU推理完的内存地址转换为对应类型的指针
    auto *s_cls_raw = reinterpret_cast<float *>(output[order[0]].sysMem[0].virAddr);
    auto *s_bbox_raw = reinterpret_cast<int32_t *>(output[order[1]].sysMem[0].virAddr);
    auto *s_bbox_scale = reinterpret_cast<float *>(output[order[1]].properties.scale.scaleData);
    
    for (int h = 0; h < (input_H / 8); h++) {
        for (int w = 0; w < (input_W / 8); w++) {
            // 找到分数的最大值索引
            int cls_id = 0;
            for (int i = 1; i < class_num; i++) {
                if (cur_s_cls_raw[i] > cur_s_cls_raw[cls_id]) {
                    cls_id = i;
                }
            }
            
            // 不合格则直接跳过，避免无用的反量化、DFL和dist2bbox计算
            if (cur_s_cls_raw[cls_id] < CONF_THRES_RAW) {
                s_cls_raw += class_num;
                s_bbox_raw += reg * 4;
                continue;
            }
            
            // 计算这个目标的分数
            float score = 1 / (1 + std::exp(-cur_s_cls_raw[cls_id]));
            
            // DFL解码：对bbox_raw信息进行反量化和DFL计算
            float ltrb[4], sum, dfl;
            for (int i = 0; i < 4; i++) {
                ltrb[i] = 0.;
                sum = 0.;
                for (int j = 0; j < reg; j++) {
                    int index_id = reg * i + j;
                    dfl = std::exp(float(cur_s_bbox_raw[index_id]) * s_bbox_scale[index_id]);
                    ltrb[i] += dfl * j;
                    sum += dfl;
                }
                ltrb[i] /= sum;
            }
            
            // 剔除不合格的框
            if (ltrb[2] + ltrb[0] <= 0 || ltrb[3] + ltrb[1] <= 0) {
                s_cls_raw += class_num;
                s_bbox_raw += reg * 4;
                continue;
            }
            
            // dist2bbox转换 (ltrb 2 xyxy)
            float x1 = (w + 0.5 - ltrb[0]) * 8.0;
            float y1 = (h + 0.5 - ltrb[1]) * 8.0;
            float x2 = (w + 0.5 + ltrb[2]) * 8.0;
            float y2 = (h + 0.5 + ltrb[3]) * 8.0;
            
            // 对应类别加入到对应的vector中
            bboxes[cls_id].push_back(cv::Rect2d(x1, y1, x2 - x1, y2 - y1));
            scores[cls_id].push_back(score);
            
            s_cls_raw += class_num;
            s_bbox_raw += reg * 4;
        }
    }

    // 2. 中目标特征图处理 (16x下采样)
    ······

    // 3. 大目标特征图处理 (32x下采样)
    ······
    
    // 4. 对每一个类别进行NMS处理
    std::vector<std::vector<int>> indices(class_num);
    for (int i = 0; i < class_num; i++) {
        cv::dnn::NMSBoxes(bboxes[i], scores[i], score_threshold, nms_threshold, indices[i], 1.f, nms_top_k);
    }
    
    // 5. 坐标转换（逆letterbox变换）并存储检测结果
    detected_objects_.clear();
    for (int cls_id = 0; cls_id < class_num; cls_id++) {
        for (std::vector<int>::iterator it = indices[cls_id].begin(); it != indices[cls_id].end(); ++it) {
            float x1 = (bboxes[cls_id][*it].x - x_shift) / x_scale;
            float y1 = (bboxes[cls_id][*it].y - y_shift) / y_scale;
            float w = bboxes[cls_id][*it].width / x_scale;
            float h = bboxes[cls_id][*it].height / y_scale;
            
            // 边界检查
            x1 = std::max(0.0f, std::min(x1, static_cast<float>(src_w)));
            y1 = std::max(0.0f, std::min(y1, static_cast<float>(src_h)));
            w = std::min(w, static_cast<float>(src_w) - x1);
            h = std::min(h, static_cast<float>(src_h) - y1);
            
            // 存储检测结果
            DetectedObject obj;
            obj.class_name = cls_names_list[cls_id % class_num];
            obj.confidence = scores[cls_id][*it];
            obj.x = static_cast<uint32_t>(x1);
            obj.y = static_cast<uint32_t>(y1);
            obj.width = static_cast<uint32_t>(w);
            obj.height = static_cast<uint32_t>(h);
            
            detected_objects_.push_back(obj);
        }
    }

    // 6. 释放任务
    hbDNNReleaseTask(task_handle);

    // 7. 释放内存
    hbSysFreeMem(&(input.sysMem[0]));
    for (int i = 0; i < 6; i++)
        hbSysFreeMem(&(output[i].sysMem[0]));
    
    return 0;
}

ROS2集成与实时处理

ROS2集成模块负责系统的消息订阅、处理和发布，实现了完整的机器人操作系统集成。该模块订阅/hbmem_img话题的NV12格式图像数据，通过回调函数进行实时处理。系统采用ROS2的SensorDataQoS服务质量策略，确保图像数据的实时性和可靠性。在图像处理完成后，系统将检测结果封装为ai_msgs::msg::PerceptionTargets消息类型，发布到/racing_obstacle_detection话题，供其他控制节点订阅使用。

class ImageNV12Subscriber : public rclcpp::Node {
public:
    ImageNV12Subscriber(RacingObstacleDetection& detector) 
        : Node("image_nv12_subscriber"), 
          frame_count_(0),
          detector_(detector) {
        // 创建发布者，发布检测结果
        publisher_ = this->create_publisher<ai_msgs::msg::PerceptionTargets>(
            "/racing_obstacle_detection", 10);

        // 创建订阅者，订阅图像数据
        sub_ = this->create_subscription<hbm_img_msgs::msg::HbmMsg1080P>(
            "/hbmem_img", 
            rclcpp::SensorDataQoS(),
            std::bind(&ImageNV12Subscriber::img_callback, this, std::placeholders::_1));
        last_time_ = std::chrono::steady_clock::now();
    }
 
private:
    void img_callback(const hbm_img_msgs::msg::HbmMsg1080P::SharedPtr msg) {
        auto input_timestamp = msg->time_stamp;
        ++frame_count_; 

        auto& frame = msg->data;
        int src_w = 640, src_h = 480;
        int dst_w = 640, dst_h = 640;
 
        std::vector<uint8_t> output_nv12(dst_w * dst_h * 3 / 2);
 
        int x_shift, y_shift;
        float x_scale, y_scale;
 
        // 执行letterbox预处理
        letterbox_nv12(
            frame.data(), src_w, src_h, 
            output_nv12.data(), dst_w, dst_h,
            x_shift, y_shift, x_scale, y_scale
        );
        
        // 执行障碍物检测
        detector_.detect(output_nv12.data());
        detector_.postprocessing(x_shift, y_shift, x_scale, y_scale, src_w, src_h);
 
        // 发布检测结果
        publish_detection_results(input_timestamp);

        // 性能监控：计算和显示FPS
        auto now = std::chrono::steady_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::seconds>(now - last_time_).count();
        if (duration >= 1) {
            RCLCPP_INFO(this->get_logger(), "Frame rate: %.2f fps, last frame_id published: %lu",
                  static_cast<float>(frame_count_) / duration, frame_count_); 
            frame_count_ = 0;
            last_time_ = now;
        }
    }

    void publish_detection_results(const builtin_interfaces::msg::Time& timestamp) {
        auto targets_msg = ai_msgs::msg::PerceptionTargets();

        targets_msg.header.stamp = timestamp;
        targets_msg.header.frame_id = std::to_string(frame_count_);

        // 获取检测结果并封装为ROS2消息
        const auto& objects = detector_.get_detected_objects();
        for (const auto& obj : objects) {
            ai_msgs::msg::Target target;
            target.type = obj.class_name;
            target.track_id = 0; 
            
            ai_msgs::msg::Roi roi;
            roi.type = "rect";
            roi.rect.x_offset = obj.x;
            roi.rect.y_offset = obj.y;
            roi.rect.height = obj.height;
            roi.rect.width = obj.width;
            roi.rect.do_rectify = false;
            roi.confidence = obj.confidence;
            
            target.rois.push_back(roi);
            targets_msg.targets.push_back(target);
        }
        
        publisher_->publish(targets_msg);
    }
 
    rclcpp::Subscription<hbm_img_msgs::msg::HbmMsg1080P>::SharedPtr sub_;
    size_t frame_count_; 
    std::chrono::steady_clock::time_point last_time_;
    RacingObstacleDetection& detector_; 
    rclcpp::Publisher<ai_msgs::msg::PerceptionTargets>::SharedPtr publisher_;
};

配置管理与系统参数

配置管理系统采用JSON格式文件进行参数管理，提供了灵活的参数配置能力。系统支持多种配置参数，包括模型文件路径、类别数量、类别名称列表、预处理类型、NMS阈值、置信度阈值等。通过load_config函数，系统在启动时自动加载配置文件，验证参数的有效性，并将配置信息打印到控制台供调试使用。

void RacingObstacleDetection::load_config() {
    std::cout << "================================================" << std::endl;
    std::cout << "[INFO] Loading Configuration From config/yolov8.json" << std::endl;
    std::ifstream config_file("config/yolov8.json");
    if (!config_file.is_open()) {
        std::cerr << "[ERROR] Failed to open config file." << std::endl;
        return;
    }
 
    nlohmann::json config;
    config_file >> config;
 
    // 解析配置参数
    model_file = config["model_file"];
    class_num = config["class_num"];
    dnn_parser = config["dnn_Parser"];
    cls_names_list = config["cls_names_list"].get<std::vector<std::string>>();
    preprocess_type = config["preprocess_type"];
    nms_threshold = config["nms_threshold"];
    score_threshold = config["score_threshold"];
    nms_top_k = config["nms_top_k"];
    reg = config["reg"];
    font_size = config["font_size"];
    font_thickness = config["font_thickness"];
    line_size = config["line_size"];

    config_file.close();
    
    // 打印配置信息供调试使用
    std::cout << "[INFO] Model File: " << model_file << std::endl;
    std::cout << "[INFO] DNN Parser: " << dnn_parser << std::endl;
    std::cout << "[INFO] Class Number: " << class_num << std::endl;
    std::cout << "[INFO] Class Names List: ";
    for (const auto& name : cls_names_list) {
        std::cout << name << " ";
    }
    std::cout << std::endl;
    
    if (preprocess_type == 0) {
        std::cout << "[INFO] Preprocess Type: Resize" << std::endl;
    } else if (preprocess_type == 1) {
        std::cout << "[INFO] Preprocess Type: Letterbox" << std::endl;
    }
    
    std::cout << "[INFO] NMS Threshold: " << nms_threshold << std::endl;
    std::cout << "[INFO] Score Threshold: " << score_threshold << std::endl;
    std::cout << "[INFO] NMS Top K: " << nms_top_k << std::endl;
    std::cout << "[INFO] Regression: " << reg << std::endl;
    std::cout << "[INFO] Font Size: " << font_size << std::endl;
    std::cout << "[INFO] Font Thickness: " << font_thickness << std::endl;
    std::cout << "[INFO] Line Size: " << line_size << std::endl;
    std::cout << "[INFO] Load Configuration Successfully!" << std::endl;
    std::cout << "================================================" << std::endl << std::endl;
}

系统启动与部署流程

系统启动流程通过ROS2的launch文件进行管理，实现了多节点的协调启动。launch文件定义了障碍物检测节点和websocket节点的启动参数。系统启动后，首先初始化RacingObstacleDetection对象，加载配置文件和模型，然后创建ROS2节点并进入消息循环。

def generate_launch_description():
    # 障碍物检测节点配置
    obstacle_detection_node = Node(
        package='racing_obstacle_detection',
        executable='racing_obstacle_detection',
        name='racing_obstacle_detection',
        output='screen'
    )

    # websocket节点配置
    websocket_node = Node(
        package='websocket',
        executable='websocket',
        name='websocket',
        output='screen',
        parameters=[
            {"image_topic": "/image"},
            {"image_type": "mjpeg"},
            {"only_show_image": False},
            {"output_fps": 0},
            {"smart_topic": "/racing_obstacle_detection"},
            {"channel": 0}
        ],
        arguments=['--ros-args', '--log-level', 'info']
    )

    return LaunchDescription([
        obstacle_detection_node,
        websocket_node
    ])

总结

本系统成功实现了基于ROS2和YOLO11的实时障碍物检测功能，相较于YOLOv5，模型效果更好，可以更好的完成地瓜机器人竞赛的任务一、任务三。

个人博客：JasperXzy's Blog
代码仓库：Github

竞赛

#Smart Car Race #RDK X5 #Yolo #Deep Learning #Object Detection #PyTorch

RDK X5 使用 YOLO11 实现避障检测

https://blog.jasperxzy.com/2025/06/29/rdk_x5_yolo11_detect_obstacle/

作者

Zhengyi Xu

发布于

2025年6月29日

许可协议

第20届智能车竞赛地瓜机器人创意组技术报告上一篇

Docker 学习笔记下一篇