| Downloads | Citations | Reads |
| --- | --- | --- |
| 126 | 0 | 479 |
Abstract: Large-scale construction projects generate vast numbers of safety hazard inspection records. These records contain knowledge about the associations among multiple types of hazard elements and are an important reference for safety management. However, manually extracting hazard source information and uncovering its internal associations is time-consuming and labor-intensive, which makes timely feedback to on-site management difficult. This paper proposes an intelligent extraction and knowledge mining method for hazard source entities based on the Universal Information Extraction (UIE) framework and an improved Apriori algorithm. First, a hazard source entity recognition model is built on the UIE framework: the entity extraction prompt labels are defined, and the model is fine-tuned on a small number of samples to extract hazard source entities automatically, efficiently and accurately. Then, an improved Apriori procedure that takes the constraints of hazard data types into account is used to mine and visualize multi-factor association rules. A case study shows that the proposed extraction model achieves F1 scores of 0.892 and 0.886 on the validation and test sets, respectively, significantly higher than the base model's 0.253 and 0.307, and that the hazard source entity recognition rate of the overall model improves by 36.66%. In addition, the strong multi-factor association rules mined by the improved Apriori algorithm are visualized with Sankey diagrams and association network graphs and show good interpretability. The method provides an efficient and intelligent means of mining knowledge from the massive safety hazard texts of large-scale projects, and offers data support for formulating targeted safety control measures on construction sites.
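The abstract does not spell out how the data-type constraint enters the Apriori procedure. As a rough illustration only, the following Python sketch assumes one plausible reading: each extracted item is a (type, value) pair, and a candidate itemset is discarded when it contains two items of the same type. The records, type names, thresholds and function names are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical inspection records; every item is a (type, value) pair.
# Types and values are illustrative and do not come from the paper's data.
RECORDS = [
    {("hazard", "unguarded edge"), ("location", "scaffold"), ("work", "formwork")},
    {("hazard", "unguarded edge"), ("location", "scaffold"), ("work", "rebar")},
    {("hazard", "exposed wiring"), ("location", "tunnel"), ("work", "excavation")},
    {("hazard", "unguarded edge"), ("location", "scaffold"), ("work", "formwork")},
]


def one_item_per_type(itemset):
    """Assumed constraint: an itemset may hold at most one item per data type."""
    types = [item_type for item_type, _ in itemset]
    return len(types) == len(set(types))


def support(itemset, records):
    """Fraction of records that contain every item of the itemset."""
    return sum(itemset <= record for record in records) / len(records)


def frequent_itemsets(records, min_support):
    """Level-wise Apriori search with the type constraint applied as a prune."""
    items = sorted({item for record in records for item in record})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), records) >= min_support]
    found = {}
    while level:
        for itemset in level:
            found[itemset] = support(itemset, records)
        size = len(level[0]) + 1
        # Join step: merge pairs of frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        # Prune step: drop candidates that break the type constraint or are infrequent.
        level = [c for c in candidates
                 if one_item_per_type(c) and support(c, records) >= min_support]
    return found


def association_rules(found, min_confidence):
    """Derive rules (antecedent, consequent, support, confidence) from itemsets."""
    rules = []
    for itemset, supp in found.items():
        for n in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), n):
                antecedent = frozenset(antecedent)
                confidence = supp / found[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, supp, confidence))
    return rules


if __name__ == "__main__":
    frequent = frequent_itemsets(RECORDS, min_support=0.5)
    for lhs, rhs, supp, conf in association_rules(frequent, min_confidence=0.9):
        print(sorted(lhs), "->", sorted(rhs), f"support={supp:.2f}", f"confidence={conf:.2f}")
```

Applying the constraint during candidate generation, rather than filtering rules afterwards, keeps same-type combinations out of the search space, so their supports are never counted. Whether the paper's improvement works this way is an assumption of this sketch.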
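Basic information:
DOI: 10.13928/j.cnki.wrahe.2025.S1.016
Chinese Library Classification (CLC) number: TV513; TP391.1
Citation:
[1] LIU Guoping, LI Xin, LIU Donghai, et al. Hazard source extraction and knowledge mining method for large-scale projects based on UIE and improved Apriori[J]. Water Resources and Hydropower Engineering, 2025, 56(S1): 102-110. DOI: 10.13928/j.cnki.wrahe.2025.S1.016.
Funding:
Enterprise Scientific Research Project of China Three Gorges Corporation (202103551)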