Adaptive watermark generation mechanism based on time series prediction for stream processing

Source: Frontiers of Computer Science (计算机科学前沿)
Data stream processing frameworks process stream data based on event time to ensure that requests can be answered in real time. In practice, streaming data usually arrives out of order due to factors such as network delay. Data stream processing frameworks commonly adopt the watermark mechanism to address this disorder. A watermark is a special kind of data, carrying a timestamp, that is inserted into the data stream; it helps the framework decide whether received data is late and should therefore be discarded. Traditional watermark generation strategies are periodic; they cannot dynamically adjust the watermark distribution to balance responsiveness and accuracy. This paper proposes an adaptive watermark generation mechanism based on a time series prediction model to address this limitation. The mechanism dynamically adjusts the frequency and timing of watermark distribution using the disordered data ratio and other lateness properties of the data stream, improving system responsiveness while ensuring acceptable result accuracy. We implement the proposed mechanism on top of Flink and evaluate it with real-world datasets. The experimental results show that our mechanism is superior to existing watermark distribution strategies in terms of both system responsiveness and result accuracy.
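To make the watermark idea concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes that the watermark trails the largest event time seen so far by a predicted lateness, and it uses simple exponential smoothing as a stand-in for the time series prediction model, which the abstract does not specify. The class name `AdaptiveWatermarkGenerator`, the parameter `alpha`, and the per-event update are all hypothetical; the paper's mechanism also adjusts how often watermarks are emitted, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveWatermarkGenerator:
    """Toy generator: watermark = max event time seen - predicted lateness.

    Assumption: exponential smoothing replaces the paper's (unspecified)
    time series prediction model. All names here are illustrative only.
    """

    alpha: float = 0.5                      # smoothing factor (assumed value)
    predicted_lateness: float = 0.0         # current estimate of event lateness
    max_event_time: float = float("-inf")   # largest event timestamp seen so far
    watermark: float = float("-inf")        # current watermark (never decreases)
    late_count: int = 0                     # events that arrived behind the watermark
    total_count: int = 0                    # all events observed

    def on_event(self, event_time: float) -> bool:
        """Observe one event; return True if accepted, False if it is late."""
        self.total_count += 1

        # An event is late if its timestamp is not ahead of the watermark.
        accepted = event_time > self.watermark
        if not accepted:
            self.late_count += 1

        # Observed lateness: how far this event trails the newest event so far.
        lateness = max(0.0, self.max_event_time - event_time)
        self.max_event_time = max(self.max_event_time, event_time)

        # Update the lateness prediction by exponential smoothing.
        self.predicted_lateness = (
            self.alpha * lateness + (1.0 - self.alpha) * self.predicted_lateness
        )

        # Advance the watermark, but never move it backwards.
        candidate = self.max_event_time - self.predicted_lateness
        self.watermark = max(self.watermark, candidate)
        return accepted

    @property
    def late_ratio(self) -> float:
        """Fraction of observed events that were classified as late."""
        return self.late_count / self.total_count if self.total_count else 0.0
```

Feeding the event times 1, 2, 5, 3, 6, 4 into a generator with `alpha=0.5` accepts the first three events, rejects 3 as late, accepts 6, and rejects 4. Because the prediction starts at zero, the watermark follows the newest event closely at first and only begins to lag once disorder is observed; a real system would trade this responsiveness against the share of discarded events, for example by using `late_ratio` as a feedback signal.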