Calibrated recognition of handwritten characters in heavy-noise steel production lines

  • Abstract: Slab-number recognition on heavy-plate production lines must cope with high temperature, dust, strong glare, and occlusion. Irregular chalk handwriting introduces a substantial domain shift, so generic optical character recognition (OCR) systems miss or misread many hard samples and generalize poorly across plants, while fine-grained annotation is costly. To address these problems, this paper proposes a two-stage pipeline compatible with engineering constraints. In the training stage, a two-level detector automatically crops slab-side regions and character boxes, and a vertical-projection prior is introduced to improve the separability of faint and connected strokes; a closed loop of self-supervised classification, core-sample selection, and self-training updates then iteratively optimizes a Vision Transformer (ViT) classification head. In the inference stage, the slab-side/character-box detectors and the classifier are reused, sequence constraints are imposed using domain rules and character-box positions, and a sequence-level multimodal post-processing module performs error correction, missing-position completion, and candidate re-ranking to output structured results. This module consumes only structured information produced by the front end (character candidates, confidence scores, spatial order, and process rules) and corrects and re-ranks sequences within the feasible region defined by those constraints; it does not operate on image pixels and does not replace front-end visual recognition. When front-end candidates are missing because of detection misses, it outputs "review required/reject release" rather than generating characters from nothing. Evaluation on subsets built from a real production line, covering occlusion, smearing, glare, low illumination, and tilt, shows that, compared with generic OCR and few-shot strongly supervised baselines, the proposed method achieves significant gains in both character-level and sequence-level metrics and, in particular, markedly fewer misses on the hard subsets. Under an edge-compute budget, the approach balances resolution, throughput, and confidence calibration; it reduces annotation effort by about 70% and clearly lowers manual transcription/verification cost and misloading risk at the stage ahead of the reheating furnace. The paradigm can be extended to other metallurgical identification tasks, such as continuous-casting numbers, coil IDs, and furnace IDs.
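
The sequence-level post-processing described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names `CharBox`, `decode`, `pattern`, `expected_len`, `min_score`, and the `REVIEW_REQUIRED` marker are hypothetical, the format rule is an invented regular expression standing in for real process rules, and the brute-force search is only a stand-in for whatever re-ranking the authors use. The sketch takes per-box character candidates with confidences, orders the boxes spatially, keeps only sequences that satisfy the rule, and refuses to output a string when a box is missing or confidence is too low.

```python
from dataclasses import dataclass
from itertools import product
import math
import re

REVIEW = "REVIEW_REQUIRED"  # hypothetical marker for "review required/reject release"


@dataclass
class CharBox:
    x: float                              # left edge of the character box, used for ordering
    candidates: list[tuple[str, float]]   # (character, confidence) pairs from the classifier


def decode(boxes, pattern, expected_len, min_score=0.5):
    """Pick the most confident rule-compliant string, or ask for review.

    pattern      -- assumed format rule as a regular expression
    expected_len -- assumed fixed number of characters in the identifier
    min_score    -- assumed threshold on the geometric-mean confidence
    """
    # A missing or extra box means the front end failed; do not invent characters.
    if len(boxes) != expected_len:
        return REVIEW

    ordered = sorted(boxes, key=lambda b: b.x)  # spatial order, left to right
    best, best_score = None, -math.inf

    # Enumerate every combination of candidates (fine for short strings, small top-k).
    for combo in product(*(b.candidates for b in ordered)):
        text = "".join(ch for ch, _ in combo)
        if not re.fullmatch(pattern, text):
            continue  # outside the feasible region defined by the rule
        # Mean log-confidence, i.e. the log of the geometric mean.
        score = sum(math.log(max(p, 1e-9)) for _, p in combo) / expected_len
        if score > best_score:
            best, best_score = text, score

    if best is None or math.exp(best_score) < min_score:
        return REVIEW
    return best


boxes = [
    CharBox(30, [("1", 0.9), ("7", 0.1)]),
    CharBox(10, [("8", 0.6), ("B", 0.4)]),
    CharBox(20, [("O", 0.55), ("0", 0.45)]),
]
result = decode(boxes, r"[A-Z]\d\d", expected_len=3)
```

With the invented rule "one letter followed by two digits", the top-1 reading `8O1` is infeasible, so the sketch re-ranks to `B01`. Dropping a box, or raising `min_score`, yields `REVIEW_REQUIRED` instead of a guessed string. Exhaustive enumeration grows exponentially with string length; a beam or dynamic-programming search would be the natural replacement for longer identifiers.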

     
