I run my MQTT on a separate R.Pi, but only because it was there (along with my original Python script based automations) before I started playing with HA… Was planning on migrating it onto the HA Pi so I’d have one spare for experimenting on… Until the 4’s came out, and I got one of those, of course.
So get two… or three… or five… Then, you have your main system, and your development system. You do your experimenting on the development system (could also set it up as a satellite system, I believe), and if you mess it up, just restore it from the main one. Then when you’ve got it how you like, go the other way, replacing the main one from the development one. rsync works well there. I Git my main config, but that doesn’t work so well for passwords and stuff, if you ever might possibly plan to make it public (or if Microsoft one day decides you want to make it public)…
You reboot your HA? Why? What image are you running? Restart the HA image every so often, sure… Takes it offline for a minute or two, which is a pain, so do it while people are out, or asleep, and won’t get annoyed at you (unless all the lights turn themselves on, or something, in which case wait till people are out).
But my HA Pi’s current uptime (actually, both of them, since it was a blackout that last caused them to be restarted) is 263.5 days. I run Hass.io, and the HA itself gets restarted anywhere between when I feel like the latest updates are worth trying out, and several times an hour while I’m tinkering, but the other containers keep on running, so MQTT should too. Linux isn’t Windows. When I used to run Linux as my sole system, I was less than a month short of a full year uptime a couple times, and that’s with everything updated regularly except the kernel (which I’d upgrade, often several times, before the next restart — usually by blackout). Seriously, Linux isn’t Windows.
But even then, a HA restart takes all of a minute or two. And if you have split main/dev systems (I believe with Docker, which is what Hass.io uses, it should actually be possible to run them both on the SAME system, though it could get tricky and/or confusing), then that’s a minute or two of downtime only when you feel confident enough in your new setup to apply it to the main system. And with a dual system like that, you can even do things like bridge the two MQTT servers only one way (so dev gets updated from changes in the house, but the lights don’t go crazy if you screw up your automation), and then you can selectively bridge back the other way, too, so, like, only the lights in that one room are actually being controlled by the dev system. You can do all those things.
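To make the one-way and selective bridging idea a bit more concrete, here’s a rough Python sketch of just the forwarding rules. It isn’t a real bridge (a broker such as Mosquitto has its own bridge configuration with per-topic directions, which is what you’d actually use); it only shows how you might decide which topics cross in which direction. The topic layout (home/room/device/state and .../set), the rule lists and the function names are all made up for illustration.

```python
# Sketch of bridging rules between a "main" and a "dev" MQTT broker.
# Topic names and rule lists are hypothetical, not from any real setup.

def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if topic matches an MQTT filter using + and # wildcards.

    Simplified: ignores the special handling of topics starting with '$'.
    """
    p_parts = pattern.split("/")
    t_parts = topic.split("/")
    for i, p in enumerate(p_parts):
        if p == "#":
            # Multi-level wildcard: matches everything from here down.
            return True
        if i >= len(t_parts):
            return False
        if p != "+" and p != t_parts[i]:
            return False
    return len(p_parts) == len(t_parts)


# State changes from the whole house flow main -> dev, so dev sees reality.
MAIN_TO_DEV = ["home/+/+/state"]
# Only commands for one room flow dev -> main, so dev controls just that room.
# The two lists don't overlap, so a message can't bounce back and forth.
DEV_TO_MAIN = ["home/study/+/set"]


def should_forward(direction: str, topic: str) -> bool:
    """Decide whether a message on topic crosses the bridge in this direction."""
    patterns = MAIN_TO_DEV if direction == "main_to_dev" else DEV_TO_MAIN
    return any(topic_matches(p, topic) for p in patterns)


print(should_forward("main_to_dev", "home/kitchen/light/state"))  # True
print(should_forward("dev_to_main", "home/kitchen/light/set"))    # False
print(should_forward("dev_to_main", "home/study/light/set"))      # True
```

With rules like these, a broken automation on dev can only ever reach the study lights, while dev still hears about everything that happens in the house.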
But, if you’re comfortable putting the logic on the nodes, then that’s just as good. Teach the relay driver board to toggle individual relays, and have the switches send toggle messages, and you’re all sweet. As I said before, I did that, but I don’t use it, it’s just there in case I feel like it at some point. I’d probably use the MQTT as the data store, though, because otherwise you’ll be wanting to fiddle with the controller’s EEPROM and all that, and then there’s the question of what state they should come up in after a blackout. Some devices you might want to default ON, or OFF, others you might want to remember their previous state. Easiest way to do that is with the MQTT retain flag. Publishing every message with retain makes the device resume its previous state when it reconnects, whereas retaining only the initial state you want and sending the everyday messages without retain means the broker keeps handing out that initial state. You just have to be consistent for a given device. That, too, is easy to do if your switch messages come into HA and get sent back out to your relays on a different topic: then the HA logic that binds them can decide whether to retain or not.
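If the retain trick sounds a bit abstract, here’s a toy Python model of how a broker treats retained messages. It’s not a real broker or client, just an in-memory stand-in, and the class name, method names and relays/N/set topics are invented for the example. The behaviour it models is standard MQTT: a retained publish replaces the stored message for that topic, and a non-retained publish leaves the stored one alone.

```python
class TinyBroker:
    """Toy in-memory model of MQTT retained-message handling (not a real broker)."""

    def __init__(self):
        self.retained = {}  # topic -> last retained payload

    def publish(self, topic, payload, retain=False):
        if retain:
            if payload == "":
                # An empty retained payload clears the stored message.
                self.retained.pop(topic, None)
            else:
                self.retained[topic] = payload
        # A non-retained publish never touches the stored retained message.

    def on_subscribe(self, topic):
        """What a relay node would be handed right after (re)subscribing."""
        return self.retained.get(topic)


broker = TinyBroker()

# Relay 1: remember the previous state, so every command is retained.
broker.publish("relays/1/set", "OFF", retain=True)
broker.publish("relays/1/set", "ON", retain=True)

# Relay 2: always come up OFF. Retain the default once,
# then send the day-to-day commands without retain.
broker.publish("relays/2/set", "OFF", retain=True)
broker.publish("relays/2/set", "ON")

# After a blackout the nodes reconnect and resubscribe:
print(broker.on_subscribe("relays/1/set"))  # ON  (previous state)
print(broker.on_subscribe("relays/2/set"))  # OFF (the retained default)
```

This is also why you need to be consistent per device: one stray retained command on relay 2’s topic would overwrite its default.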
But definitely get a second HA instance for experimentation, either way.