Connection (and publishing) works through MQTT Engine, but fails in MQTT Transmission

Hello,

I’m running Ignition Cloud Edition (8.1.36) in AWS and connect to IoT Core through MQTT Engine without any issue. We don’t use Sparkplug, but custom namespaces to filter where the information comes from.

I need to start publishing some data to IoT Core, and I thought of using MQTT Transmission, but the connection fails constantly, with the IoT Core logs reporting an authentication_failure and the Ignition logs constantly showing a ‘Connection lost’ exception caused by an EOFException.

This does not make a lot of sense, since the certificates and private key are the same ones used by the MQTT Engine module, with just a different client ID. These credentials and client ID have been tested with MQTT Explorer and they work fine.

Also, the credentials’ privileges are good, because publishing via script through MQTT Engine works with no issues. I also tried changing the Data Format Type, without success.

Did anybody face a similar issue?

Does the fact that we use custom namespaces prevent the use of MQTT Transmission which, as far as I can see, doesn’t allow for anything aside from Sparkplug?

And, if forced to use MQTT Engine via scripting, does anybody have experience with the load impact on the gateway under heavy use?

Thank you all in advance for any advice.

I can see from your log that you are connecting and then getting a ‘Connection lost’. So, your certs are mostly good as you note. I suspect the issue has to do with the ‘policy’ attached to the certificate. You likely have not allowed subscription or publish topics that are required by Transmission.

Another possibility is that you have not permitted the ‘RetainPublish’ option on the policy. This is also required by MQTT Transmission (and any Sparkplug Edge Node). See the policy setup at the top of the following document for full details: Connecting to AWS IoT Core - MQTT Modules for Ignition 8.x - Confluence
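As a rough illustration of the policy point above, the following Python sketch builds a policy document that allows connect, publish, retained publish, receive, and subscribe under a single Sparkplug group. The region, account ID, client ID, and group ID are hypothetical placeholders, and scoping everything to `spBv1.0/<group>/*` is an assumption for illustration; the linked document remains the authoritative source for what Transmission actually needs.

```python
import json

# Hypothetical placeholders: replace with your own values.
REGION = "us-east-1"
ACCOUNT_ID = "123456789012"
CLIENT_ID = "my_transmission_client"
GROUP_ID = "my_group"


def arn(kind, name):
    """Build an AWS IoT ARN for a client, topic, or topicfilter resource."""
    return "arn:aws:iot:{}:{}:{}/{}".format(REGION, ACCOUNT_ID, kind, name)


def build_policy():
    """Return a policy document scoped to one Sparkplug group (assumed layout)."""
    # In IoT policies '*' is the wildcard; MQTT '+' and '#' are treated literally.
    sparkplug_scope = "spBv1.0/{}/*".format(GROUP_ID)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow",
             "Action": "iot:Connect",
             "Resource": arn("client", CLIENT_ID)},
            {"Effect": "Allow",
             "Action": ["iot:Publish", "iot:RetainPublish", "iot:Receive"],
             "Resource": arn("topic", sparkplug_scope)},
            {"Effect": "Allow",
             "Action": "iot:Subscribe",
             "Resource": arn("topicfilter", sparkplug_scope)},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(), indent=2))
```

If a primary host ID is configured in Transmission, a subscription to that host's STATE topic would presumably need to be allowed as well; that is not covered by this sketch.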

Thanks for the quick feedback. I have asked my colleague handling IoT Core to verify that this policy is set, and I will come back with feedback as soon as I have it.

So, the RetainPublish action was not allowed as you mentioned. It is now active, but MQTT Transmission still refuses to connect with the same error as before.

In the docs you pointed out, the ARN is set as the full AWS_REGION:ACCOUNT_ID:

While in our case, the policy limits the subscription only to certain subtopics. Could this be the culprit? I can filter namespaces (therefore topics) in MQTT Engine, but there is no similar setting in MQTT Transmission.

If MQTT Transmission is effectively trying to connect to the whole set of topics, I understand how this will be refused by IoT Core.

Thanks again for the support.

To add to this, the only transmitter I am using for this MQTT Transmission connection is set to publish to the correct topic in the Group ID/Edge Node ID, which map to one of the subtopics in which the policy allows my client to sub/pub.

Without knowing what you’ve set your ACLs to, I can’t say what is wrong with them. But this can be a bit complex to set up, as there are subscriptions issued, such as the one for NCMDs, which are required by the Sparkplug specification. If you can post the ACLs you are using, I should be able to tell you which one(s) are problematic.

You should also be able to turn the ‘TahuClient’ logger to TRACE in Ignition. It should show the exact subscriptions being issued so you can see which one(s) are violating your ACLs.
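Once the TRACE log shows which topics and subscriptions the client requests, they can be compared against the policy offline. The sketch below is one possible way to do that in Python: it treats `*` and `?` in a policy resource as wildcards and everything else (including `+` and `#`) as literal text, which is how IoT Core policies are understood to behave. The allowed patterns and requested topics are made-up examples, not values from this thread.

```python
import re


def policy_pattern_to_regex(pattern):
    """Compile a policy resource pattern ('*' and '?' wildcards) to a regex."""
    escaped = re.escape(pattern)
    # Restore the two policy wildcards after escaping everything else.
    return re.compile("^" + escaped.replace(r"\*", ".*").replace(r"\?", ".") + "$")


def find_violations(requested, allowed_patterns):
    """Return the requested topics/filters not covered by any allowed pattern."""
    compiled = [policy_pattern_to_regex(p) for p in allowed_patterns]
    return [t for t in requested if not any(rx.match(t) for rx in compiled)]


if __name__ == "__main__":
    # Hypothetical ACL limited to a custom namespace.
    allowed = ["my_namespace/*"]
    # Hypothetical entries copied from a TRACE log.
    requested = [
        "spBv1.0/my_group/NCMD/my_node/#",
        "my_namespace/line1/status",
    ]
    for topic in find_violations(requested, allowed):
        print("not covered by policy:", topic)
```

Anything this reports would be a candidate for the request that makes the broker drop the connection.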

Thanks again for the feedback. Sadly, my counterpart in AWS is not available today, so I cannot check the ACLs. I set the logging level to TRACE as suggested, but I’m not sure I can make much out of the logs. I am posting them as text so that they can be copied, and because I’m not allowed more than one screenshot.

TransmissionClient	11Mar2024 10:43:26
	[xxx_xxxx_xxx_data/ignition_test][yyy_yyyy_yyyyyyyy_yyyy] Failed to achieve connected state
TransmissionClient	11Mar2024 10:43:26
	Attempting disconnect iot_core_URL:8883 :: yyy_yyyy_yyyyyyyy_yyyy with sendDisconnect=false, publishLwt=true, waitForLwt=false, resetForceTagScan=false
TransmissionClient	11Mar2024 10:43:26
	[xxx_xxxx_xxx_data/ignition_test][yyy_yyyy_yyyyyyyy_yyyy] No longer attempting to connect
TransmissionClient	11Mar2024 10:43:26
	[xxx_xxxx_xxx_data/ignition_test][yyy_yyyy_yyyyyyyy_yyyy] Attempting to disconnect from target server
TransmissionClient	11Mar2024 10:43:26
	[xxx_xxxx_xxx_data/ignition_test][yyy_yyyy_yyyyyyyy_yyyy] is not attempting to connect
ClientsManager	11Mar2024 10:43:25
	return getFieldvalue: 0 of 1
TransmissionClient	11Mar2024 10:43:25
	[xxx_xxxx_xxx_data/ignition_test][yyy_yyyy_yyyyyyyy_yyyy] Attempting to connect
TransmissionClient	11Mar2024 10:43:25
	[xxx_xxxx_xxx_data/ignition_test][yyy_yyyy_yyyyyyyy_yyyy] is attempting to connect
TransmissionClient	11Mar2024 10:43:25
	Total timeout to connect is 50 seconds
TransmissionClient	11Mar2024 10:43:25
	Setting Transmission client's connection retry interval to 1000 milliseconds
TransmissionClient	11Mar2024 10:43:25
	[xxx_xxxx_xxx_data/ignition_test][] Not connected - attempting connect with isStayRunning=true
SparkplugTransmissionClient	11Mar2024 10:43:25
	Creating new SparkplugMqttCallback
SparkplugTransmissionClient	11Mar2024 10:43:25
	Shutting down old MQTT callback
TransmissionClient	11Mar2024 10:43:25
	Trying to connect to target MQTT server 'New Mqtt Server' at index 0 now.
TransmissionClient	11Mar2024 10:43:25
	Setting Transmission Info/Transmitters/Example Transmitter/Edge Nodes/xxx_xxxx_xxx_data/ignition_test/MQTT Client/ to OFFLINE
TransmissionClient	11Mar2024 10:43:25
	Transmission Client 'xxx_xxxx_xxx_data/ignition_test' has gone offline
TransmissionClientManager	11Mar2024 10:43:25
	onPayload - publishing: spBv1.0/xxx_xxxx_xxx_data/NDATA/ignition_test with payload: SparkplugBPayload [timestamp=1710150205365, metrics=[Metric [name=Ramp_to_20, alias=null, timestamp=1710150202917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=2, isNull=false], Metric [name=Ramp_to_10, alias=null, timestamp=1710150202967, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=2, isNull=false], Metric [name=Ramp_to_20, alias=null, timestamp=1710150203917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=3, isNull=false], Metric [name=Ramp_to_10, alias=null, timestamp=1710150203966, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=3, isNull=false], Metric [name=Ramp_to_20, alias=null, timestamp=1710150204917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=4, isNull=false], Metric [name=Ramp_to_10, alias=null, timestamp=1710150204966, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=4, isNull=false]], seq=null, uuid=null, body=null]
TransmissionClient	11Mar2024 10:43:24
	Storing history on topic=spBv1.0/xxx_xxxx_xxx_data/NDATA/ignition_test with payload: SparkplugBPayload [timestamp=1710150202356, metrics=[Metric [name=Ramp_to_20, alias=null, timestamp=1710150199917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=19, isNull=false], Metric [name=Ramp_to_10, alias=null, timestamp=1710150199966, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=9, isNull=false], Metric [name=Ramp_to_20, alias=null, timestamp=1710150200917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=0, isNull=false], Metric [name=Ramp_to_10, alias=null, timestamp=1710150200967, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=0, isNull=false], Metric [name=Ramp_to_20, alias=null, timestamp=1710150201917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=1, isNull=false], Metric [name=Ramp_to_10, alias=null, timestamp=1710150201966, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=1, isNull=false]], seq=null, uuid=null, body=null]
TransmissionClient	11Mar2024 10:43:24
	Setting Transmission Info/Transmitters/Example Transmitter/Edge Nodes/xxx_xxxx_xxx_data/ignition_test/MQTT Client/ to OFFLINE
TransmissionClient	11Mar2024 10:43:24
	Transmission Client 'xxx_xxxx_xxx_data/ignition_test' has gone offline
TransmissionClient	11Mar2024 10:43:24
	[xxx_xxxx_xxx_data/ignition_test][] Attempting to disconnect from target server
TransmissionClient	11Mar2024 10:43:24
	With metric: Metric [name=Ramp_to_10, alias=null, timestamp=1710150201966, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=1, isNull=false]
TransmissionClient	11Mar2024 10:43:24
	With metric: Metric [name=Ramp_to_20, alias=null, timestamp=1710150201917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=1, isNull=false]
TransmissionClient	11Mar2024 10:43:24
	With metric: Metric [name=Ramp_to_10, alias=null, timestamp=1710150200967, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=0, isNull=false]
TransmissionClient	11Mar2024 10:43:24
	With metric: Metric [name=Ramp_to_20, alias=null, timestamp=1710150200917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=0, isNull=false]
TransmissionClient	11Mar2024 10:43:24
	With metric: Metric [name=Ramp_to_10, alias=null, timestamp=1710150199966, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=9, isNull=false]
TransmissionClient	11Mar2024 10:43:24
	With metric: Metric [name=Ramp_to_20, alias=null, timestamp=1710150199917, dataType=Int32, isHistorical=null, isTransient=null, metaData=null, properties=PropertySet [propertyMap={}], value=19, isNull=false]
TransmissionClient	11Mar2024 10:43:24
	Storing publish task for spBv1.0/xxx_xxxx_xxx_data/NDATA/ignition_test
TransmissionClient	11Mar2024 10:43:24
	publishPayload for xxx_xxxx_xxx_data/ignition_test: globalInOrderFlushingActive = false
TransmissionClient	11Mar2024 10:43:24
	publishPayload: isConnectedToPrimaryHost() = false
TransmissionClient	11Mar2024 10:43:24
	publishPayload: isConnected() = false
TransmissionClient	11Mar2024 10:43:24
	publishPayload: birthBuildInProgress = false
TransmissionClient	11Mar2024 10:43:24
	Clearing client yyy_yyyy_yyyyyyyy_yyyy
TransmissionClient	11Mar2024 10:43:24
	Successfully disconnected iot_core_URL:8883 :: yyy_yyyy_yyyyyyyy_yyyy
TransmissionClient	11Mar2024 10:43:24
	Setting Transmission Info/Transmitters/Example Transmitter/Edge Nodes/xxx_xxxx_xxx_data/ignition_test/MQTT Client/ to OFFLINE
TransmissionClient	11Mar2024 10:43:24
	Transmission Client 'xxx_xxxx_xxx_data/ignition_test' has gone offline
ClientsManager	11Mar2024 10:43:23
	return getFieldvalue: 0 of 1

The exception for the disconnect on the Tahu client is still the EOFException present in the first post of this thread.

org.eclipse.paho.client.mqttv3.MqttException: Connection lost
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:197)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException: null
at java.base/java.io.DataInputStream.readByte(Unknown Source)
at org.eclipse.paho.client.mqttv3.internal.wire.MqttInputStream.readMqttWireMessage(MqttInputStream.java:92)
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:137)
... 1 common frames omitted

The only additional thing I can see in the logs is the client trying to set a will on NDEATH when disconnecting.

TahuClient	11Mar2024 10:47:23
	xxx_xxxx_xxx_xxxx: Setting WILL on spBv1.0/xxx_xxxx_xxx_data/NDEATH/ignition_test with retain false

I will try to get hold of the ACLs as soon as possible. Anyway, if you can confirm that even custom namespaces can be used in IoT Core without issue, then I know my problem lies somewhere on the access side and I will focus there with my AWS counterpart.

Once again, thanks for the support, have a great week.

Custom namespaces can be used with IoT Core as well. I should note, AWS IoT Core has a maximum message size of 128 KB. So, you need to stay under this. Other MQTT Servers such as Chariot and MQTT Distributor do not have this limitation (MQTT Broker - MQTT Servers - Cirrus Link Solutions).
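For custom (non-Sparkplug) payloads, the size limit can be checked before publishing. The following Python sketch is only an illustration: it assumes JSON-encoded payloads, reads "128 KB" as 128 * 1024 bytes, and uses a made-up record structure. It splits a list of tag records into batches that each stay under the limit.

```python
import json

# Assumption: "128 KB" is taken to mean 128 * 1024 bytes.
MAX_MESSAGE_BYTES = 128 * 1024


def encoded_size(records):
    """Size in bytes of the records when serialized as compact UTF-8 JSON."""
    return len(json.dumps(records, separators=(",", ":")).encode("utf-8"))


def split_into_batches(records, limit=MAX_MESSAGE_BYTES):
    """Greedily group records into lists whose encoded size stays within limit."""
    batches, current = [], []
    for rec in records:
        if encoded_size([rec]) > limit:
            raise ValueError("a single record exceeds the message size limit")
        # Start a new batch when adding this record would overflow the current one.
        if current and encoded_size(current + [rec]) > limit:
            batches.append(current)
            current = []
        current.append(rec)
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Hypothetical tag records.
    tags = [{"name": "tag%d" % i, "value": i} for i in range(10000)]
    for batch in split_into_batches(tags):
        print(len(batch), "records,", encoded_size(batch), "bytes")
```

The repeated re-encoding makes this quadratic in the batch size, which is acceptable for a sketch but would be worth replacing with a running byte count for large tag sets.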

Also to be clear, Sparkplug clients require the following permissions in AWS:

Sorry for the delay, I was confirming some things with my colleagues.

So, I replicated a Transmission → Distributor → Engine setup locally, and from what I can see, Transmission strictly follows (as intended) the Sparkplug B specification, publishing data under the spBv1.0 root, with a verb between group and node.

I think my problem is that our IoT Core setup does not follow this spec. There is no spBv1.0 root topic, and our policies are set by namespace, so my connection will fail if I try to sub/pub to a topic I don’t have access to.

Can you confirm that, by design, Transmission cannot transmit outside the spBv1.0 topic? I am trying to find docs for this, but the only thing I found is this thread in the Ignition forum, which seems to confirm my fears.

MQTT Transmission does use Sparkplug which has a well defined MQTT topic namespace. There are many reasons for this which are described in this presentation: https://www.youtube.com/watch?v=3syYeTg6RBc
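To make the topic namespace concrete: the topics seen in the logs above follow the form `spBv1.0/<group>/<message type>/<edge node>`, with an optional device ID as a fifth level. The Python sketch below parses such a topic; the helper name and the returned tuple are illustrative only and are not part of any Cirrus Link or Tahu API.

```python
from collections import namedtuple

SparkplugTopic = namedtuple(
    "SparkplugTopic", ["group_id", "message_type", "edge_node_id", "device_id"]
)

# Node-level and device-level message types used in Sparkplug B topics.
MESSAGE_TYPES = {
    "NBIRTH", "NDEATH", "NDATA", "NCMD",
    "DBIRTH", "DDEATH", "DDATA", "DCMD",
}


def parse_sparkplug_topic(topic):
    """Return a SparkplugTopic, or None if the topic is not in the spBv1.0 form."""
    parts = topic.split("/")
    if len(parts) not in (4, 5):
        return None
    if parts[0] != "spBv1.0" or parts[2] not in MESSAGE_TYPES:
        return None
    device_id = parts[4] if len(parts) == 5 else None
    return SparkplugTopic(parts[1], parts[2], parts[3], device_id)


if __name__ == "__main__":
    print(parse_sparkplug_topic("spBv1.0/xxx_xxxx_xxx_data/NDATA/ignition_test"))
    print(parse_sparkplug_topic("my_namespace/line1/status"))
```

A topic in a custom namespace simply fails to parse, which mirrors why a policy written only for custom namespaces does not cover what Transmission publishes.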

Transmission does also support scripting publishes which are described here: MQTT Publishing via MQTT Transmission - MQTT Modules for Ignition 8.x - Confluence

Is there a reason you can’t modify your Policy in IoT Core?

Infrastructure is in the hands of infrastructure engineers in my company, and there are strict policies that we are now discussing, due to these limitations.

Quick publishing to IoT Core via scripting is possible, but I don’t think this is a viable solution for thousands of tags.

For now, I am happy that (I think) we narrowed down the culprit, but since MQTT Engine gives the possibility to use custom namespaces (and deactivate sparkplug), I feel like the same should be possible on the MQTT Transmission side.

In any case, the topic can be considered solved. I thank you again for all your support.