Alarm Event Propagation

I’ve been working on a project that uses the Ignition MQTT modules. During development and commissioning we noticed a bug with alarms that are triggered and cleared quickly, within the same transmission window. If the initial alarm event and the cleared alarm event are sent in the same MQTT transmission, MQTT Engine logs an error like the one below.

I’ve removed any identifying information. I think scripting or logging could handle this more cleanly by informing and warning the user instead of raising an error, which would help when troubleshooting ping-ponging alarm conditions. As it stands, the error led us down a path of looking for conflicting alarms and duplicated UUIDs, which was not the case.

Logging Error

Failed to insert <alarm Name> alarm event (id=<UUID>) into the ALARM_EVENTS H2 table

Stack Trace

org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Unique index or primary key violation: "PUBLIC.PRIMARY_KEY_7 ON PUBLIC.ALARM_EVENTS(ID) VALUES ( /* 207 */ CAST('<UUID>' AS CHAR(36)) )"; SQL statement: INSERT INTO `ALARM_EVENTS` (ID, NAME, SOURCE, PRIORITY, ACTIVE_TIME, ACK_TIME, CLEAR_TIME, GROUPID, EDGENODEID, DEVICEID, STATE, ACKED_BY, NOTES) VALUES ('<UUID>', '<alarmName>', 'prot:MQTT:/src:<gateway>:/group_id:<groupId>:/edge_node_id:<edgeNodeID>:/device_id:<deviceID>:/prov:<provider>:/tag:<tagPath>:/alm:<alarmName>', '<priority>', 

    at org.h2.message.DbException.getJdbcSQLException(DbException.java:520)
    at org.h2.engine.SessionRemote.readException(SessionRemote.java:650)
    at org.h2.engine.SessionRemote.done(SessionRemote.java:619)
    at org.h2.command.CommandRemote.executeUpdate(CommandRemote.java:237)
    at org.h2.jdbc.JdbcStatement.executeInternal(JdbcStatement.java:262)
    at org.h2.jdbc.JdbcStatement.execute(JdbcStatement.java:231)
    at com.cirruslink.mqtt.engine.gateway.alarm.EngineAlarmStoreManager.h2InsertAlarmEvent(EngineAlarmStoreManager.java:451)
    at com.cirruslink.mqtt.engine.gateway.alarm.EngineAlarmManager.addAlarm(EngineAlarmManager.java:255)
    at com.cirruslink.mqtt.engine.gateway.alarm.EngineAlarmManager.handleAlarmEvent(EngineAlarmManager.java:138)
    at com.cirruslink.mqtt.engine.gateway.sparkplug.SparkplugDevice.updateTagValue(SparkplugDevice.java:225)
    at com.cirruslink.mqtt.engine.gateway.sparkplug.SparkplugBPayloadHandler.handleDeviceData(SparkplugBPayloadHandler.java:1932)
    at com.cirruslink.mqtt.engine.gateway.sparkplug.SparkplugPayloadHandler.handlePayload(SparkplugPayloadHandler.java:220)
    at com.cirruslink.mqtt.engine.gateway.sparkplug.SparkplugMessageRunnable.run(SparkplugMessageRunnable.java:72)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
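
To make the "warn instead of error" suggestion concrete, here is a rough sketch of what that handling could look like. It is not MQTT Engine's code: it uses Python's built-in sqlite3 as a stand-in for the H2 store, a trimmed-down hypothetical version of the ALARM_EVENTS table, and made-up function, logger, and state names. It only shows the idea of catching the primary key violation, logging a warning, and updating the existing row.

```python
import logging
import sqlite3

log = logging.getLogger("alarm_store_sketch")  # hypothetical logger name

# Hypothetical, trimmed-down stand-in for the ALARM_EVENTS table.
SCHEMA = """
CREATE TABLE ALARM_EVENTS (
    ID TEXT PRIMARY KEY,
    NAME TEXT,
    STATE TEXT,
    ACTIVE_TIME INTEGER,
    CLEAR_TIME INTEGER
)
"""


def store_alarm_event(conn, event):
    """Insert an alarm event; on a duplicate ID, warn and update the row instead."""
    try:
        conn.execute(
            "INSERT INTO ALARM_EVENTS (ID, NAME, STATE, ACTIVE_TIME, CLEAR_TIME) "
            "VALUES (:id, :name, :state, :active_time, :clear_time)",
            event,
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # Same ID seen again (for example an active and a clear event arriving
        # together): warn rather than fail, then fold the new state into the row.
        log.warning(
            "Alarm event %s (%s) already stored; updating it instead of inserting",
            event["id"], event["name"],
        )
        conn.execute(
            "UPDATE ALARM_EVENTS SET STATE = :state, "
            "CLEAR_TIME = COALESCE(:clear_time, CLEAR_TIME) WHERE ID = :id",
            event,
        )
        return "updated"


# Illustrative events only; the ID, name, and state values are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
active = {"id": "example-uuid", "name": "HighLevel", "state": "ActiveUnacked",
          "active_time": 1000, "clear_time": None}
cleared = dict(active, state="ClearUnacked", clear_time=1002)

first = store_alarm_event(conn, active)
second = store_alarm_event(conn, cleared)
row = conn.execute(
    "SELECT STATE, CLEAR_TIME FROM ALARM_EVENTS WHERE ID = ?", ("example-uuid",)
).fetchone()
```

Whether an update is the right reaction to a repeated ID depends on how the module is meant to treat the second event; the sketch only shows that the condition can be reported as a warning without losing the clear time.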

This should only happen if the alarms have the same ID, regardless of when the events are sent. Are you using the rolling buffer? Or potentially clearing the alarms twice in quick succession? I think the latter could cause Ignition to tell MQTT Transmission that two events actually did occur…

Hey Wes. I do have the rolling buffer set up, but we did have some issues with UNS clashes that may have caused stability problems on the Engine gateway. Overall, I’ve noticed this behavior on tags that toggle an alarm state off/on very quickly. I tried deleting the H2 database on both the Transmission and Engine sides to see whether that would address the issue from a duplicate alarm standpoint, but it made no difference.

My theory, at least from looking at the stack trace, was that the ALARM_EVENTS table expects a single update for each MQTT transmission, not two.
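
If that theory holds (it is unconfirmed, and the earlier reply points at the shared ID rather than the timing), one way to picture a workaround is to merge events that share an ID within a single payload before anything is written. The sketch below is purely illustrative: the event dictionaries, field names, and the `coalesce_alarm_events` helper are assumptions, not part of the Cirrus Link modules.

```python
def coalesce_alarm_events(events):
    """Merge events that share an ID, keeping the first-seen order of IDs.

    Later events override earlier fields, except that a None value never
    overwrites a value that is already present.
    """
    merged = {}
    for event in events:
        current = merged.setdefault(event["id"], {})
        for key, value in event.items():
            if value is not None or key not in current:
                current[key] = value
    return list(merged.values())


# Illustrative payload: an active and a clear event for the same alarm,
# plus one unrelated alarm. All values are placeholders.
payload = [
    {"id": "uuid-a", "state": "ActiveUnacked", "active_time": 1000, "clear_time": None},
    {"id": "uuid-a", "state": "ClearUnacked", "active_time": None, "clear_time": 1002},
    {"id": "uuid-b", "state": "ActiveUnacked", "active_time": 1001, "clear_time": None},
]
result = coalesce_alarm_events(payload)
```

With this input, `result` holds one entry per alarm ID, and the entry for `uuid-a` carries both the original active time and the clear time, so a single row write would cover both events.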

I think I’d like to get some more details on this to see if we can repro. I understand why the insert is failing, but I’m interested in getting info on how we can reproduce the issue and make sure we handle all possible consequences of getting duplicate alarm events. Would you mind opening a support ticket at support@cirrus-link.com and referencing this forum post? We’d probably want logs to start and potentially a gwbk from the edge so we can get config details and repro the issue.