HA Update and MQTT crashes, anyone?

I made the fateful decision to do the updates on HA and all the add-ons that said they had updates available. Something went south and now none of my Sonoff devices will communicate with MQTT. Like at all. I changed NOTHING settings-wise. I can go into TasmoAdmin (plugin) and see the switch states and change them, so that is fine. But when I trigger them in Node Red (based on MQTT triggers), they do nothing. HA sees them on the overview page and I can change the state from there, which I assume is through TasmoAdmin or the Wemo integration, since they are Wemo emulated. The stupid thing about this is that the 2 devices I have on ESP32s still communicate through MQTT and I have not changed their MQTT access passwords.

The MQTT add-on made me change my MQTT password since it supposedly showed up in the PWNED database (which was fine with the previous version of MQTT), and after that nothing worked Sonoff-wise. So, I went in and changed the password in my Sonoffs to the new one I put in the add-on. Still nothing. So, basically everything I have that uses NodeRed flows… is DOA. F’ing frustrating. This is exactly why I won’t do any automation of the house stuff. My wife would be kicking my ass around the block if the stuff in the house was not functioning right now.

I haven’t used HA yet so I am unsure how MQTT is configured. But can you use a Terminal and listen to any MQTT chatter on the device that is your MQTT server?

mosquitto_sub -u ***** -P ***** -t "#"

  1. Check that MQTT server is running.
  2. Check that MQTT server and HA client are configured to use the same IP and port.
  3. Does HA have an error log?

Good luck.

I’m quite sure there were breaking changes to Sonos in the latest release; check the change logs.

@Guru_Of_Nothing, you are not alone. I have also done an HA update and found none of my Sonoff devices listed. The problem I had was that I also updated my Mosquitto broker at the same time, so I wasn’t sure what was really broken. I ended up power cycling both my RPi3 and RPi Zero, as resets didn’t solve the problem. However, my laptop running an MQTT monitor was still showing traffic from the Sonoffs! And HA showed other sensors to be okay.
Now by default I always power cycle the RPi after an HA update, just in case.

I decided to take some time off of the vault stuff and focus my energy on other neglected projects or system issues. My focus as of late has been the SEEMINGLY random MQTT issues. I have fought like mad with my vault board disconnecting from MQTT within a few minutes after it has been power cycled. All of my Sonoff’s try to connect and after a little while… a few minutes to maybe an hour or two… they drop off the MQTT map as well. So I started digging into the logs and there seem to be LOTS of socket errors that bump everything off. Sometimes I can get a few hours of everything working fine but then the wheels fall off the damned thing.

Hours of research point to one specific thing… Home Assistant. There seems to be a random bug of some sort with HA and the MQTT add-ons. It’s like HA gets an update and the MQTT stuff gets screwy. In all honesty, I have not figured out why I even need HA. I have a bunch of Sonoffs, some not deployed yet, and a bunch of ESP32s. I don’t have any smart lighting and can’t ever see a need for it, especially as expensive as it seems to be. I don’t own anything that uses the net to process, like Alexa or Google Home. My internet is trash and isn’t going to get any better any year soon, so any externally integrated devices are worthless here. Never have all of my deployed Sonoffs shown up in HA consistently or dependably, and I have no idea at all if you can even make an ESP32 show up in HA. I have lots I want to do but can do it with Node Red… if I can get freaking MQTT to be stable and operate like it should. I really like TasmoAdmin on HA but could live without it, I suppose. So… I am toying with pulling down the RPi and doing a fresh install with just Node Red and MQTT. I know little to nothing on this but as I have said before… why not learn it! What fun is the learning curve if you can’t get run over by it every now and then? :crazy_face: :upside_down_face:

@chris Yes, there are all sorts of standalone apps and programs that allow you to not only listen to the chatter but to publish into it directly. I set up beacons on my non-Sonoff devices that publish a message that tells me status (basically) every 15 seconds. I go into my program and subscribe to ‘beacon/+’ and every device sends its beacon message out so it looks like “beacon/gate, payload 0”, “beacon/vault, payload 0”. If the beacon message doesn’t get sent, I know that that device has fallen off the planet. The Sonoffs send status messages of all sorts that can be listened for as well.
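
As a rough sketch of the beacon idea above (not the code actually running on any of these devices), a listener could record when each `beacon/<device>` topic was last heard and flag the ones that have gone quiet. The `BeaconWatchdog` class, its method names, and the 45-second timeout (three missed 15-second beacons) are assumptions made up for the example. Timestamps are passed in as milliseconds so it works with any clock source.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: tracks when each "beacon/<device>" topic was last heard.
class BeaconWatchdog {
public:
    explicit BeaconWatchdog(uint32_t timeoutMs) : timeoutMs_(timeoutMs) {}

    // Call for every incoming message. Returns true if the topic was
    // recorded as a beacon, false if it was ignored.
    bool onMessage(const std::string& topic, uint32_t nowMs) {
        const std::string prefix = "beacon/";
        if (topic.compare(0, prefix.size(), prefix) != 0) return false;
        std::string device = topic.substr(prefix.size());
        // "beacon/+" matches exactly one level, so reject empty or nested names.
        if (device.empty() || device.find('/') != std::string::npos) return false;
        lastSeen_[device] = nowMs;
        return true;
    }

    // Devices whose last beacon is older than the timeout, in alphabetical order.
    std::vector<std::string> missing(uint32_t nowMs) const {
        std::vector<std::string> out;
        for (const auto& entry : lastSeen_) {
            // Unsigned subtraction stays correct if a millis()-style counter wraps.
            if (nowMs - entry.second > timeoutMs_) out.push_back(entry.first);
        }
        return out;
    }

private:
    uint32_t timeoutMs_;
    std::map<std::string, uint32_t> lastSeen_;
};
```

With a 45000 ms timeout, a device that last reported at time 0 shows up in `missing()` once the clock passes 45000, while one that reported at 30000 does not until 75000.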

As it is currently set up, Hassio is on my Rpi and my MQTT broker is an add-on to that program so the logs are accessible through that application. What I see is a lot of this:

1562707249: Socket error on client HL_Bridge, disconnecting.
1562707249: Socket error on client DVES_58417F, disconnecting.
1562707255: New connection from 192.168.42.152 on port 1883.
1562707255: New client connected from 192.168.42.152 as DVES_58417F (p2, c1, k10, u'mosquitto').
1562707255: New connection from 192.168.42.151 on port 1883.
1562707255: New client connected from 192.168.42.151 as HL_Bridge (p2, c1, k10, u'mosquitto').
1562707332: Saving in-memory database to /data/mosquitto/mosquitto.db.
1562707671: Client HL_Bridge has exceeded timeout, disconnecting.
1562707672: New connection from 192.168.42.151 on port 1883.
1562707672: New client connected from 192.168.42.151 as HL_Bridge (p2, c1, k10, u'mosquitto').

What’s funny, looking through my logs now, is that there is only one device left connected to it and that is a Sonoff Bridge. All of the other Sonoffs and the ESP32 boards have gone on vacation. They socket-errored themselves to death and won’t reconnect until I restart MQTT.
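
To see which clients are being dropped most often, log lines like the ones above can be tallied per client ID. This is a minimal sketch that assumes the log format matches the excerpt exactly; `between` and `countDisconnects` are names invented for the example, not part of Mosquitto.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns true and fills `middle` when `text` is exactly prefix + middle + suffix.
bool between(const std::string& text, const std::string& prefix,
             const std::string& suffix, std::string& middle) {
    if (text.size() <= prefix.size() + suffix.size()) return false;
    if (text.compare(0, prefix.size(), prefix) != 0) return false;
    if (text.compare(text.size() - suffix.size(), suffix.size(), suffix) != 0) return false;
    middle = text.substr(prefix.size(), text.size() - prefix.size() - suffix.size());
    return true;
}

// Hypothetical helper: counts "Socket error" and "exceeded timeout"
// disconnects per client ID. All other log lines are skipped.
std::map<std::string, int> countDisconnects(const std::vector<std::string>& logLines) {
    std::map<std::string, int> counts;
    for (const std::string& line : logLines) {
        // Strip the leading "<epoch seconds>: " stamp.
        std::string::size_type sep = line.find(": ");
        if (sep == std::string::npos) continue;
        std::string message = line.substr(sep + 2);
        std::string client;
        if (between(message, "Socket error on client ", ", disconnecting.", client) ||
            between(message, "Client ", " has exceeded timeout, disconnecting.", client)) {
            ++counts[client];
        }
    }
    return counts;
}
```

Two details in the excerpt are worth knowing when reading the counts. The `k10` in the "New client connected" lines is, as far as I can tell, the keepalive the client requested in seconds. The "exceeded timeout" message is what a broker logs when it hears nothing from a client for about one and a half times that keepalive.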

One day, I’ll figure out what the fascination with the Rpi is…

Do you have an actual computer you can run Linux on?

If so, I’d suggest setting up with just Mosquitto & play with the MQTT stuff on its own to get a better feel for what’s actually going on.

Then take the next step & add in Node-Red…

Then go from there.

ROFLMAO…!!!
I love your responses! Lol. I have 7 computers running in my network actually. I put a Raspi in the fleet as a testing box that is super low power. Since it requires little to no brain power to operate for something like Mosquitto and node red, I figured it was a good fit. I have 4 Pi’s and 2 of them sit in a plastic box doing squat. One is an Octopi and it isn’t doing anything since I am still struggling with my POS 3d printer.
The reason I don’t have a real computer in the pile running anything Linux is that I have hardly any experience with it and I don’t trust my abilities to install anything and keep it stable. I have tried that angle in the past and ended up getting something working stable, then updated it, trashed the effing thing, and lost all of what I was trying to accomplish. The Pi gave me a low power tool to try something that relied on nothing else and could easily be wiped and reinstalled if I dickered it up. Only one application was affected.

As a prime example, and this may be because of what I am using (Raspbian), I spent a couple hours following through some examples to install base Raspbian and Mosquitto as a headless setup. Followed the steps and ended up with errors on dependencies that were supposedly part of the packages but were nowhere to be found. No idea where to find them or how to install them if I had them. I hate Windows with a passion but it’s all I (kinda) know. I have a monster server that I could run a crap-ton of VMs on and do a Mosquitto VM, but I don’t know enough about Linux shit (or server management) to work with it if I started setting it up. And I will not pay the stupid prices MS wants for Windows Server licensing for a home lab. My frustration is based on lack of knowledge.

Bwahahahahaha!!!

You are about to become a Linux convert.

Pick a random machine. (Pretty much any spec you like…)

Go to https://www.linuxmint.com/edition.php?id=261

Download the ISO…

Either burn a DVD or a thumbdrive…

Install…
(defaults all around work well…)

Come back here for instructions to get mosquitto installed & running.

HINT: My mother (little old lady with computer destruction skills…) has been using Mint 19.1 for a while now. I rarely get support calls now…

Does she do tech support?

:grin:

Give her a few more months…

Right now, she tends to volunteer her son to all her friends…

(They get a little freaked out when they hear my standard billing rate tho… :smirk: )

Hi @Guru_Of_Nothing.
I may have found the reason for your crashes. See this post.

A few weeks ago I posted to you some code that added a random number to the client name, which should prevent the disconnection errors. Here’s my link again.

I use this for both wifi and ethernet connected arduinos - if the server goes down etc (or a device D/Cs) when everything is back up, my devices/code happily reconnects. Haven’t had to touch my devices for over a year.

Hope it helps.

The important bits:

// Within setup(), after the wifi has connected, seed the random number generator from the current timer value
randomSeed(micros());


/**
* MQTT reconnection function.
* Generate a new client name on every attempt to prevent crashing,
* and then re-subscribe.
*
* @return void
*/
void mqttReconnect()
{
	while (!mqtt.connected())
	{
		// Base client name + random hex suffix. Built fresh on each attempt
		// so the suffix does not pile up on mqtt_client across retries.
		String client_id = mqtt_client + String(random(0xffff), HEX);

		if (mqtt.connect(client_id.c_str(), mqtt_user, mqtt_pass))
		{
			// Could subscribe to multiple topics here
			mqtt.subscribe(mqtt_sub);
		}
		else
		{
			// Happily wait till the server is good and ready
			delay(5000);
		}
	}
}

I believe the wifi library will continually disconnect and reconnect, so our MQTT reconnection logic needs to supply a new client name for the server to be happy.

I’ve used variables to hold the values used within this function so that I can copy and paste the function between projects and I only have to change the variables at the top of my project file.

@chris I didn’t initially understand what that chunk of code was about until I started researching the problem and I saw that exact post you mentioned. But that only fixes half my problem… I still have Tasmota-driven devices being killed off by the MQTT broker. What my research told me was that it isn’t necessarily an MQTT protocol problem but whatever the add-ons to HA are doing. A number of people had suggested a standalone MQTT broker. I have seen messages in the log that show a disconnect that I knew about… a forced power cycle, aka, unplugged the device during its normal keep alive period… and the log showed that it was already registered and the log was updated within MQTT without an issue.

A few different posts said basically what you are saying: the device is already registered, so the broker gets angry and won’t renew it, and you work around the problem by giving it a fake ID every time it walks through the door. But I am reading that this is not the way MQTT is supposed to work and that the broker should actively renew that ID upon a reconnect request. Hence the issue with the add-ons. I am adding the subroutine code you provided to all of the devices I program and I am certain that will fix that issue, but I need a good broker that works properly so the Tasmota devices are not excluded. At this point, all of my Tasmota devices are running the most current firmware as suggested by the folks on the Tasmota forums, and the HA broker add-on (both the Mosquitto and the HiveMQ versions) throws socket errors within minutes, dropping all of my devices. That came about with a relatively recent HA update and none of the ones since then have fixed it.
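
On the client ID point: the MQTT 3.1.1 spec says that when a client connects with an ID that is already in use, the server must disconnect the existing client, so the ID only has to be unique per device, not new on every connect. One way to get that is to derive a fixed ID from the hardware. The sketch below assumes a 6-byte MAC address is available and uses its last three bytes, which gives names that look like the `DVES_58417F` entries in the log above. `makeClientId` and the example MAC are made up for illustration, and I am not claiming this is how Tasmota builds its names.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: builds a fixed, per-device client ID such as
// "DVES_58417F" from a prefix and the last three bytes of a 6-byte MAC.
std::string makeClientId(const std::string& prefix, const uint8_t mac[6]) {
    char suffix[7];  // 6 hex digits + terminating NUL
    std::snprintf(suffix, sizeof(suffix), "%02X%02X%02X",
                  static_cast<unsigned>(mac[3]),
                  static_cast<unsigned>(mac[4]),
                  static_cast<unsigned>(mac[5]));
    return prefix + suffix;
}
```

A fixed ID like this also makes the broker log easier to read than random suffixes do, since the same device keeps the same name across reconnects.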

So…Since I am not using the functionality of HA, I am working on getting a standalone Mosquitto broker up and running. And I will know soon after that if that is truly the fix or if it’s still going to be a problem. At least then I know what direction to look for the Tasmota stuff as HA is removed from the equation.

@chris… Question for you about the reconnect code…

Being as it uses a delay() if the MQTT broker isn’t connected, doesn’t that mean that this becomes a blocking issue to all the rest of the code in the device until the broker reconnects? I have a couple devices so far that are not dependent on MQTT for their function and couldn’t care less if they ever connect, but if I understand that correctly, doesn’t that mean the processor is locked up with the delay function every loop until it can get a connection? Not sure if I am interpreting that correctly.

@Guru_Of_Nothing You are very much correct. My code does lock itself into that loop until a connection is made. All of my project’s devices are slaves and rely on MQTT for instructions or reporting.

You could borrow the timer function I built for you in your mailbox project to attempt to reconnect every 5 seconds without locking up the rest of your project’s code.
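
The timer function from the mailbox project isn't shown in this thread, so here is a stand-in sketch of the same idea rather than that code. `RetryTimer` and `due()` are names invented for the example. It only answers "is it time to try again?" and never waits, so the rest of `loop()` keeps running whether or not the broker is up.

```cpp
#include <cstdint>

// Hypothetical helper: says when the next reconnect attempt is due,
// without ever blocking the caller.
class RetryTimer {
public:
    explicit RetryTimer(uint32_t intervalMs) : intervalMs_(intervalMs) {}

    // Returns true at most once per interval. Pass in millis() on Arduino.
    // The very first call returns true so the first attempt is not delayed.
    bool due(uint32_t nowMs) {
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        if (started_ && nowMs - lastAttemptMs_ < intervalMs_) return false;
        started_ = true;
        lastAttemptMs_ = nowMs;
        return true;
    }

private:
    uint32_t intervalMs_;
    uint32_t lastAttemptMs_ = 0;
    bool started_ = false;
};
```

In a sketch this would replace the `while` loop and `delay(5000)`: create `RetryTimer retry(5000);` once, then in `loop()` make a single connect attempt only when `!mqtt.connected() && retry.due(millis())` is true, and fall through to the rest of the code either way.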

I have no experience with Tasmota as I’ve always flashed my own code onto the Sonoffs.

Yep. I figured the timer would work interchangeably. Will update the subroutine code with that and add it into my arduino or esp projects. Makes sense to have one sort of forced reconnect in the loop regardless of reason.

I am going to put Mint on… brace yourself @Tinker… on a Rpi after work and work on getting a Mosquitto broker going. Long term, like within the next few months, I plan to sort out the whole mega-server thing with VMs and migrate a bunch of my boxes into one system.

Yep. Fired up the HP, installed ESXi and set up a Win7 VM as well as a Linux Mint VM. I gotta say… pretty slick all the way around. I went through a couple tutorials on installing Mosquitto and I got it installed. I set static IP’s for each VM and can ping the Mint box (known in the network as Minty_Mosquitto :face_with_raised_eyebrow: ) and started it. But I can’t do anything with it. I am not sure what is happening there. I did a netstat -at to see what ports were opened and mine looks like this:


I assume because the ports are open and I started the thing with

sudo /etc/init.d/mosquitto start

that the broker is running? I can ping the VM but if I add a port number to it, it says that it doesn’t exist. Suggestions?
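
A note on the ping part: ping uses ICMP, which has no concept of ports, so it can only show that the VM is up, not that anything is listening on 1883. A TCP connection attempt is the direct test. Below is a minimal sketch for a Linux box using plain POSIX sockets. `tcpPortOpen` is a made-up name, it only accepts dotted IPv4 addresses, and the address in the usage note is a placeholder rather than one from this thread.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: true if something accepts TCP connections at ip:port.
// `ip` must be a dotted IPv4 address (no hostname lookup is attempted).
bool tcpPortOpen(const char* ip, uint16_t port, int timeoutMs) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) return false;  // not valid IPv4

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return false;

    // Non-blocking mode lets us enforce our own timeout on connect().
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return false;
    }

    bool reachable = false;
    int rc = connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    if (rc == 0) {
        reachable = true;  // connected immediately
    } else if (errno == EINPROGRESS) {
        pollfd pfd{};
        pfd.fd = fd;
        pfd.events = POLLOUT;
        if (poll(&pfd, 1, timeoutMs) == 1) {
            // The attempt has finished; SO_ERROR says whether it succeeded.
            int err = 0;
            socklen_t len = sizeof(err);
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 && err == 0) {
                reachable = true;
            }
        }
    }
    close(fd);
    return reachable;
}
```

Calling something like `tcpPortOpen("192.0.2.10", 1883, 2000)` with the broker's real address returns false both when the broker is stopped and when a firewall drops the traffic, so a false result narrows the problem down but does not say which of the two it is.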

I got it figured out. Appears that I inadvertently shut down MQTT. Oops. So I now have a solo copy of MQTT running on a Mint VM. I take that as a win. Now let’s see how stable it is compared to an HA add-on.
