Hi folks,
Not sure if anyone else has noticed this bug, but the topic motor_power_active ALWAYS reports ‘true’ on startup, whether the motor power button is on or off.
If you then power the motor on, it stays ‘true’; if you then power the motor off, it finally goes ‘false’.
Our software is quite dependent on this. This was not an issue in V40, only in V43.
Thanks,
Kris
I will look at this problem this week. There has not been a report of this till now. Thank you for reporting this issue. Can you tell me if the OLED display shows proper on/off state once the ROS side host software comes up when switch is off? Does the topic say on at any time when the OLED display shows off? There can be a very long time after bootup when ROS host side software does not know what to report and the topic can only say on or off so we choose on for that case until after the MCB is recognized and initialized.
Hey there! I’m not boarderkris, nor do I work with him, but I can confirm that the OLED behaves the same way as the topic Kris described. It indicates that the motor power is ON even when it’s actually OFF, and it only goes OFF after you turn it on and off again.
Hi Mark,
Unfortunately the OLED is not installed/activated on our device.
Expect you are correct, as there is no comms with the MCB at that time (it would be powered off). As mentioned above V40 works, not V43.
My thoughts on start up:
Default motor state should be reporting OFF.
- Either it’s Powered ON and not communicating yet - which should result in OFF.
- It’s NOT powered ON, therefore definitely not communicating yet - which should also result in OFF.
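A minimal sketch of that proposed default, with hypothetical names (this is not the actual ubiquity_motor code): report OFF until the MCB is communicating and has supplied a real readback.

```python
from typing import Optional


def motor_power_to_report(mcb_communicating: bool,
                          power_readback: Optional[bool]) -> bool:
    """Value to publish on motor_power_active under the proposed rule.

    Hypothetical helper for illustration only. `power_readback` is None
    when no power state has been received from the MCB yet.
    """
    # Both startup cases in the list above collapse to the same answer:
    # without a live, initialized MCB there is nothing to trust, so say OFF.
    if not mcb_communicating or power_readback is None:
        return False
    return power_readback


# Startup with motor power off (MCB unpowered, no comms): OFF
print(motor_power_to_report(False, None))
# Powered on but not yet communicating: still OFF
print(motor_power_to_report(False, True))
# Communicating and reading back ON: ON
print(motor_power_to_report(True, True))
```

The trade-off is that a motor that really is powered will briefly read OFF until communication is established.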
Thanks
Kris
I have looked into this and it is more involved than a simple reading of the power on or off. It is a bug and I agree it needs to be fixed.
The reason it is broken is that before the version 5.0 MCB board we shipped a rev 4.9 PC board. In trying to support some critical fixes that evolved over time, we have had to use involved logic in both the firmware and the host software. Something in that logic broke with v43, and the MCB firmware (until it sees a power change) reports to the host software that power is on, as if it were on a rev 4.9 PC board. v40 did not have all the changes/fixes that are in v43, which is why it may still work (I am taking your word on that and have not tried v40 myself, as that is not moving forward and you have already done that anyway).
We will form a solution to this issue now that it has been identified.
Thank you for reporting it, as that is the 1st step.
I have found root cause and have a fix for you to incorporate. I will merge this fix into our codeline however I cannot promise a new image to fix this at this time.
If you use your own clone of ubiquity_motor then you can simply incorporate this into your own code. If you regularly pull from ubiquity’s version of ubiquity_motor then that too will also work in a day or so. For now I suggest you do as follows.
First I made a new v45 version that never allows itself to think there is no ESTOP switch state readback. This, however, did not fix the problem. I then put v40 on the magni and determined for myself that v40 also has this issue, which disagrees with your observations, so I’m not sure unless your v40 has a different daycode. I am using what is found at ubiquity_motor/firmware/v40_20201209_enc.cyacd, as on the 2-9-2023 image. This led me to dig in and find that this issue is in the host side software. The quickest fix is to change the default board version from 49 to 51; the default of 49 is what leads to this fault (in a fairly involved way).
The fix is to correct the defaults in the ubiquity_motor code and then rebuild.
- cd /home/ubuntu/catkin_ws
- edit src/ubiquity_motor/include/ubiquity_motor/motor_parameters.h
- Modify in 2 places controller_board_version(49) so they read controller_board_version(51)
- catkin_make
After that you can run with v43 (or v40), and at reboot or power-up they will properly report OFF when motor power is off and ON when motor power is on.
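If you would rather script the header edit than do it by hand, here is a rough sketch in Python. The function name and the sample header text are made up for illustration; only the controller_board_version(49) and controller_board_version(51) strings come from the steps above.

```python
import re

# Matches the default initializer named in the steps, tolerating stray spaces.
_OLD_DEFAULT = re.compile(r"controller_board_version\(\s*49\s*\)")


def patch_board_version_default(header_text: str, new_version: int = 51):
    """Return (patched_text, number_of_replacements). Hypothetical helper."""
    return _OLD_DEFAULT.subn(
        f"controller_board_version({new_version})", header_text
    )


# Hypothetical stand-in for the header contents, NOT the real file.
sample = (
    "Params() : controller_board_version(49), other_field(0) {}\n"
    "Params(int arg) : controller_board_version(49), other_field(arg) {}\n"
)

patched, count = patch_board_version_default(sample)
print(count)
print(patched)
```

To apply it to the real file, read src/ubiquity_motor/include/ubiquity_motor/motor_parameters.h into a string, pass it through the function, write the result back, and then run catkin_make. The steps mention two places, so a count other than 2 is worth a second look.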
I have just pushed to the noetic_devel branch of our ubiquity_motor repo the change that makes the default a more current board. You can get the above on the 2-9-2023 image using a git pull. The Rev 4.9 MCB is by today’s standards ‘ancient’. We cannot support all boards forever, and all boards since rev 5.0 are considered current. Additionally, only MCB versions 5.2 and later support the Raspberry Pi 4, and that is what we have tested and shipped all our products with for a few years now. Only a low percentage of boards are the very old rev 4.9, and these would not be, or rather should not be, on noetic images anyway.
Hi Mark,
Does this new version (V45) include the fix for a default reporting of motor off?
We just installed and tested, and it is still defaulting to ON.
Thanks,
Kris
Did you do the host side changes I listed? Those are the real issue I feel.
If you want to try the pre-release v45 I can send that to you.
Send an email to support@ubiquityrobotics.com and in subject put Attn Mark request v45 beta firmware
Again, you really must make the host side changes too even with v45. I don’t think this can be fixed with just v45 without host changes.
Recall this please: if the ubiquity repo is located in catkin_ws/src/ubiquity_motor then ROS does this clever trick and ‘overlays’ the binary version of the ubiquity_motor driver code, launching the freshly made motor driver from your catkin_ws workspace. Do the catkin_make with the changes I mentioned, especially to the parameters file. To be safest, do a full robot reboot to properly source the changes.
Hi Mark,
Can you confirm the host side changes that need to be done?
We have pulled in the latest code - the only difference I can see is the V49 to V51 config change in motor_params - perhaps I’ve missed something else. We then updated the firmware to V45 (cyacd file in the repository).
Not sure if this helps - but just for a test, we ran V40 (which reports motor status correctly on our kinetic versions, with a 5.3 board), and it also reported the incorrect value on noetic (even though it works in kinetic).
Thanks,
Kris
Yes the default parameter value of 51 was the change. Thank you for confirmation that you do see that in your pull. That change was ONLY on the noetic codebase on github.
That is what I did to confirm here that the same image you use with that new code made motor power state ok.
I also pushed v45_20231105_enc.cyacd, but note it is considered Beta code and not fully tested; however, as the old software phrase goes, ‘It should be ok’. Our online (network) code download will NOT be v45; you must load it manually from a file. I tried to be careful AND I did run it myself, but only for basic movement verification, not the full test suite.
I need to clarify 2 statements:
A) You ran v40 on kinetic and it did not show the bug.
B) You ran v40 on noetic BEFORE my changes were used and it had the initial power on when power should be off.
Correct?
I did not see from your post above if this happened:
- Run v43 from current Noetic change that defaults to 51 and this fixes problem
- Run v45 from current Noetic change that defaults to 51 and this fixes problem of initial motor power ON being incorrect when you boot up with motor power off.
I await your reply, Mark
Note: If all bugs were simple there would be no need for seasoned programmers. Lol
Thanks for the quick reply Mark.
To answer your questions:
A) When we run v40 on kinetic we continue to have no issues, i.e. we start the device with the motor off and motor_power_active reports FALSE immediately.
B) We ran v40 on noetic before the latest code change (i.e. the change from 49 to 51), and it also did not work. In our test, we start the device with the motor off and motor_power_active reports TRUE immediately.
Other tests I just ran now (all on noetic with 51 set):
Again my test: Start machine with motor off. If motor_power_active is TRUE, it is a fail. If motor_power_active is FALSE, it is a pass.
: v45 - Fail
: v43 - Fail
: V40 - Fail
Hope this helps,
Kris
We then have some form of difference in our environments so that should be the focus now.
Thank you for a very good summary just posted. Besides what I discuss below there is some possibility (only slight) that this follows a given MCB board. We try to initialize MCB static variables on startup but in the off chance there is a bug there then this issue would follow some but hopefully not all boards. This is a real ‘stretch’ but is in the possible cause list for a firmware defect. I feel this is a host side defect frankly.
For background you should note a significant change in the yaml files that set robot parameters happened from the kinetic to noetic code. Our goal was so that it would be far less likely to wipe out the yaml robot parameters if they moved from the magni_robot repo under magni_robot/magni_bringup/param/base.yaml to a location out of the way of any git upgrade. So the main config file moved to /etc/ubiquity/robot.yaml for noetic releases. Can you then do these two things:
- verify you have a comment-only file in ~/catkin_ws/src/magni_robot/magni_bringup/param/base.yaml
- send your /etc/ubiquity/robot.yaml file in full, WHICH is the master config file now
We had in the past a way to force a given board type so I am seeking to know if that is going on now on your robot.
You can send the file to support@ubiquityrobotics.com and in subject say ATTN Mark here is robot.yaml file
Hi Mark,
Thank you again for the detailed response.
Our code definitely was not using the /etc/ubiquity/robot.yaml file - most of the configuration was coming from the base.yaml file.
With that said, I went ahead and updated this. Our robot.yaml file is now identical to yours in git (minus a few dimension changes).
I have also commented out everything in the base.yaml file.
Unfortunately - still the same thing on boot - reporting a motor that is on, when it is still off.
Note that this change I have been discussing applies for the image using noetic only. Back on Kinetic images you had to use what was suitable for that codebase. I was not suggesting you use the current noetic compatible yaml file on the older Kinetic images.
This does bring up an issue worth clarification.
I am doing all my tests using an image that was meant for the noetic-devel repo. Specifically I start with this image: 2023-02-09-ubiquity-base-focal Any changes I have discussed are to our ubiquity_motor repo in the noetic_devel branch.
When you say you had done some testing on Kinetic, I had assumed you meant running Kinetic on an older image that was designed for use with Kinetic. The Kinetic images were around back when you started doing work with Magni. So for you to test on Kinetic I assumed you meant you swapped back in a different SD card and so on. This is how I ‘go back in time’.
Since your kinetic image works of course it is best to NOT mess with that.
I feel this is how you tested on Kinetic but it is worth saying ‘just in case’.
Can you edit ~/.ros/log/latest/rosout.log and look early in this log (when the bug is happening) and look for a line that has the following text (in your case it may not say 51).
Firmware is version 43. Setting Controller board version to 51
Right after that will be a line containing ‘setting hardware_version to’ with a number. Tell me what that says as well, please.
Then a line will say ‘reading MCB option switch on the I2C bus’ and I want the next lines that will say:
Setting firmware option register to 0x82. and then next
setting MCB option switch register to 0x82
These may offer a clue. Thanks
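If it is easier than reading the log by eye, a small script along these lines could pull out those values. It is only a sketch: the patterns are built from the lines quoted above, the exact format of the hardware_version line is an assumption, and the function and key names are made up for illustration.

```python
import re
from typing import Dict, Iterable

# Patterns built from the quoted log lines. The hardware_version format
# (a plain integer after "to") is an assumption.
PATTERNS = {
    "firmware_version": re.compile(r"Firmware is version (\d+)"),
    "controller_board_version": re.compile(
        r"Setting Controller board version to (\d+)"),
    "hardware_version": re.compile(r"setting hardware_version to\s*(\d+)"),
    "firmware_option_register": re.compile(
        r"Setting firmware option register to (0x[0-9a-fA-F]+)"),
    "mcb_option_switch_register": re.compile(
        r"setting MCB option switch register to (0x[0-9a-fA-F]+)"),
}


def extract_board_clues(lines: Iterable[str]) -> Dict[str, str]:
    """Return the first value found for each pattern (hypothetical helper)."""
    found: Dict[str, str] = {}
    for line in lines:
        for key, pattern in PATTERNS.items():
            if key in found:
                continue  # keep the earliest occurrence only
            match = pattern.search(line)
            if match:
                found[key] = match.group(1)
    return found


# Sample made from the lines quoted in this post, not from a real log.
sample_log = [
    "Firmware is version 43. Setting Controller board version to 51",
    "reading MCB option switch on the I2C bus",
    "Setting firmware option register to 0x82.",
    "setting MCB option switch register to 0x82",
]
clues = extract_board_clues(sample_log)
print(clues)

# To scan the real log instead (needs `import os`):
# with open(os.path.expanduser("~/.ros/log/latest/rosout.log")) as f:
#     print(extract_board_clues(f))
```

Any key missing from the result means that line was not found in the form assumed here, which is itself worth reporting.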
Hi Mark,
Some good news…
Just to follow up on your first comment - we have not changed anything on the Kinetic side (completely different image). Everything being discussed is in the noetic environment, with your ubiquity_motor packages from the noetic branch.
Your second question.
I checked the logs - and sure enough it said Firmware is version 43. Setting Controller board version to 49!!
So the next question was: why is it 49?
For reference: controller_board_version(51) is set in motor_parameters.h.
I then added controller_board_version to the /etc/ubiquity/robot.yaml file:
ubiquity_motor:
  controller_board_version: 51
…and voila! Next boot, everything was good to go. Checked the logs:
Firmware is version 43. Setting Controller board version to 51.
Perhaps you want to add this to the github?
Thanks again for the support,
Kris
Hi Mark,
Well, to add to this, I found the root of the 49 vs 51 issue. I didn’t realize there was a second (overloaded) function in the motor_parameters.h file where controller_board_version was defined. One was set to 51, one was set to 49.
Thanks,
Kris
I was going to experiment to see if the controller_board_version would work still (I added that long ago but was not sure if it was broken or not). So it does seem to work on the noetic image and that is good feedback.
I will also look at the overloaded function. I thought I said 2 places had to change and thought I caught them both but guess I left one at 49 still and that is why you had the issue.
Setting the board version has always been very tricky, and the root cause is that the MCB processor itself cannot read its own board version. We ran out of pins long ago. So the host tries to read the rev from an I2C IO expander I put on the board. The host starts the MCB, then reads the MCB version, then writes it back to the MCB. THEN, besides that, there is this parameter that acts as an override. It’s surprising how involved it is, spanning MCB firmware and host, plus all the insanely complex jumping through hoops it takes to set and get ROS parameters through yaml and so on.
Anyway, glad you have a solution. Next time this comes up I will have people check for that log line right away: if it says 49, then the host assumes that estop power is always on, as those old boards had no way to read whether it was on or off like the 5.x boards can.
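A simplified illustration of that last point, in Python with hypothetical names (the real logic spans MCB firmware and host code and is more involved than this): with a board version below 5.0 the host has no readback to consult, so the reported state is always ON.

```python
# Rev 5.0 is encoded as 50 here, following the 49/51 convention used in
# this thread. The constant and function names are illustrative only.
FIRST_BOARD_WITH_POWER_READBACK = 50


def host_view_of_motor_power(board_version: int, power_readback: bool) -> bool:
    """What the host would report for motor power, in this simplified model."""
    if board_version < FIRST_BOARD_WITH_POWER_READBACK:
        # Old boards (rev 4.9) cannot read the switch, so power is
        # assumed to be on regardless of the actual state.
        return True
    # 5.x boards can read the state, so the readback is used as-is.
    return power_readback


# Motor power switch is actually off in both cases:
print(host_view_of_motor_power(49, False))  # treated as rev 4.9
print(host_view_of_motor_power(51, False))  # treated as rev 5.1
```

This is why a stale default of 49 shows up as motor_power_active being true at boot even with the switch off.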