OSD3358 cannot reboot with frequency scaling enabled in Linux

Hello again,

On custom hardware using an OSD3358-512-BCB, we are having some issues. Our kernel is the 5.10.56-bone-rt48 kernel from Robert Nelson (with a couple auxiliary patches which are demonstrably unrelated), and we are using u-boot 2021.07. This issue is very similar to my previous post (https://forum.digikey.com/t/am3358-based-board-will-not-reboot-no-issue-with-cold-boot), but I have more information and have been unable to fix it this time. Here’s the situation:

First, I found out that we had weird intermittent boot issues (every now and again the board wouldn’t properly reboot). This seemed to be fixed by disabling falcon mode in u-boot, so we were happy with that. Then I realized that u-boot was setting the system to OPP Turbo (I believe that’s the name; whichever it is, it’s the 800MHz option). This appears to be because the efuse in the OSD3358 reports strange information (a known issue), so u-boot determines the system max is 800MHz instead of the 1GHz we know it’s capable of. My initial thought was that u-boot doesn’t really need to run very fast, so we’d be fine leaving it at 800MHz. So, rather than changing u-boot, I enabled CPU frequency scaling in Linux with a default governor of “performance,” which achieved our goal of 1GHz.

This seemed fine, but a while later I tried a warm restart (using the reboot command in Linux) and noticed that the system just hangs. The SPL never prints its version message, but it appears the kernel is shutting down properly (I am not entirely sure how to test this, but according to the code I have read in the kernel for shutdown on the AM3358, it should print an error to the UART console if shutdown fails). The console goes silent after the kernel’s final messages, then nothing happens. It’s clear that the hardware is being reset, as our LEDs return to their POR states (which does not happen under Linux, as the Linux device tree disables them at boot, while they’re on at cold start). It appears that u-boot does not get along with CPU frequency scaling. However, when I manually changed the governor to “userspace” and set a target frequency of 800MHz, it also failed to reboot, suggesting that the problem comes from cpufreq being enabled at all rather than from the 1GHz setting itself.
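For anyone trying to reproduce the “userspace” governor test above, here is a minimal sketch using the standard cpufreq sysfs files. The function names `cpufreq_show` and `cpufreq_pin` are made up for illustration, and the default path assumes a single policy on cpu0, which may not match every kernel configuration.

```shell
# Assumed default location of cpu0's cpufreq policy in sysfs.
CPUFREQ_DIR=${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}

# Print the active governor and the current frequency in kHz.
# (Hypothetical helper; the optional argument overrides the sysfs directory.)
cpufreq_show() {
    dir=${1:-$CPUFREQ_DIR}
    printf 'governor=%s cur_khz=%s\n' \
        "$(cat "$dir/scaling_governor")" \
        "$(cat "$dir/scaling_cur_freq")"
}

# Select the "userspace" governor, then request a fixed frequency in kHz.
# (Hypothetical helper; needs root when run against the real sysfs.)
cpufreq_pin() {
    khz=$1
    dir=${2:-$CPUFREQ_DIR}
    echo userspace > "$dir/scaling_governor" &&
    echo "$khz" > "$dir/scaling_setspeed"
}
```

On the board, something like `cpufreq_pin 800000 && cpufreq_show` followed by `reboot` should correspond to the 800MHz test described above (sysfs frequencies are in kHz).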

I have tried some modifications to u-boot in all the places I can think of, including (but not limited to):

  • Using the internal RTC 32K OSC instead of external (and the corresponding change in Linux, of course)
  • Making am335x_get_efuse_mpu_max_freq() return MPUPLL_M_1000 in all cases
  • Changing board/ti/am335x/board.c to use MPUPLL_M_1000 everywhere that the Pocketbeagle does (get_dpll_mpu_params(), scale_vcores_bone(), and then scale_vcores() for good measure)

The RTC OSC change had no effect, and the second and third changes give inconsistent results: sometimes a complete failure to boot, sometimes getting part of the way through the boot process before hanging, etc. I have tried enabling more logging in u-boot, but it doesn’t reveal anything useful; the bootloader just hangs at a different point during boot each time.

If anyone has any ideas, I’m all ears. We are using the pocketbeagle device tree, though with the addition of tick-timer = &timer2; and changing stdout to uart5. Previously we were using the am335x-evm device tree, and it seemed to be more broken, but I am not entirely sure why yet.

Thanks!

Hi @will_eccles ,

So all OSD3358s are 1.0GHz AM335x parts; they’re just missing the proper efuse settings…

I’ve been dealing with a lot of fun regressions on v5.10.x…

Can you please retest with v4.19.x-ti to see what your board is doing for shutdown/etc.?

systemd is fun to log on shutdown… I think the kernel parameter “systemd.log_level=debug” helps debug it…

Regards,

Hi Robert,

I’ve tried building an image with 4.19-ti using our existing u-boot build (trying to minimize independent variables). I have run into issues trying to get it to boot with an image built using the omap-image-builder, as our image does not use an initrd and has a slightly different structure for boot files. I’ve modified u-boot to (as far as I can tell) be able to boot the 4.19 image, but I am unable to get bootup working properly. However, I am pretty sure that the kernel’s shutdown process is beyond the point where systemd would be able to say much: it appears to reach the point where the kernel should have exited. Our system does not have any userspace utilities related to CPU frequency, etc. The only processes running are busybox init, sshd, and our application (which doesn’t touch CPU frequency, and I have confirmed that I see the same behavior with our application not running).

I believe that in the old days when we used 4.19, we had no trouble rebooting, but we also used a different (and undesirable) version of u-boot. Let me know if there’s anything else I can try.

If it helps, I think it’s probably important to note that when I copy the code the pocketbeagle uses to change the clock speed to 1GHz on the OSD3358, u-boot cannot boot properly (it just randomly hangs). It seems like u-boot has issues with >800MHz for some reason. However, I was under the impression that the ROM would reset everything to 500MHz. Not sure what to make of this.

An initrd hasn’t been a hard requirement for a long time… (v4.1.x+…) Today its only use is for “flashing” the eMMC in single user mode, as we need a few tools in memory, and when using “overlayroot”…

With your currently booting v5.10.x image, you should be able to just “swap” in v4.19.x and retest; don’t change u-boot or anything else…

Regards,

@RobertCNelson Apologies for the delay, my resourcing is currently split with another high priority project, so time for testing this is limited at the moment. I should have test results by the end of the day or tomorrow. I’ll try dropping a 4.19.x in, though it may take some messing with kernel config and whatnot to work properly on our hardware.

Hi @will_eccles, no worries.

BTW, you shouldn’t need to hack up u-boot for OPP settings… As long as your kernel has:

&cpu0_opp_table {
	/*
	* Octavo Systems:
	* The EFUSE_SMA register is not programmed for any of the AM335x wafers
	* we get and we are not programming them during our production test.
	* Therefore, from a DEVICE_ID revision point of view, the silicon looks
	* like it is Revision 2.1.  However, from an EFUSE_SMA point of view for
	* the HW OPP table, the silicon looks like it is Revision 1.0 (ie the
	* EFUSE_SMA register reads as all zeros).
	*/
	oppnitro-1000000000 {
		opp-supported-hw = <0x06 0x0100>;
	};
};

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/am335x-osd335x-common.dtsi#n21

Then the full OSD335x frequencies will be unlocked once in Linux.

Regards,

Hi Robert,

The reason I had attempted to do this in u-boot was to allow us to disable frequency scaling in Linux, which would avoid the issue of being unable to reboot with cpufreq enabled. However, it looks like that caused more issues than it was worth, so I’ve reverted the changes to u-boot and will try 4.19.x with frequency scaling enabled and an equivalent kernel config.

Hi @will_eccles, did you disable ARM_TI_CPUFREQ? That should force cpufreq off.

Regards,

After disabling cpufreq in Linux (using the option you mentioned and also just by disabling the entire feature of the kernel), u-boot can boot at 800MHz and Linux runs fine. However, if I have u-boot set the system to 1GHz, it cannot boot properly, so the only option is to let Linux do the scaling. Is there a functional difference between outright disabling cpufreq and just disabling the TI_CPUFREQ option?

Hi @will_eccles, by disabling TI_CPUFREQ, whatever state u-boot left the OSD335x in before jumping to Linux is the state it will stay in, no matter what… U-boot setting 1GHz should not be unstable…

I really don’t understand why your OSD335x can’t handle 1GHz… Everything is inside the SIP, so there’s really nothing to mess up…
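A quick way to confirm on the running board whether a cpufreq driver is actually registered is to look for the policy directory in sysfs. This is only a sketch: `cpufreq_status` is a hypothetical helper name, and the expectation that the directory disappears on AM335x when ARM_TI_CPUFREQ is disabled is an assumption that should be checked on the actual kernel build.

```shell
# Report whether a cpufreq policy exists for cpu0 and which driver owns it.
# (Hypothetical helper; the optional argument overrides the sysfs directory.)
cpufreq_status() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    if [ ! -d "$dir" ]; then
        # No policy directory: no cpufreq driver bound to this CPU, so the
        # frequency set by the bootloader should remain in effect.
        echo "cpufreq: no policy registered"
        return 1
    fi
    echo "cpufreq: driver=$(cat "$dir/scaling_driver")"
}
```

Running `cpufreq_status` once with the option enabled and once with it disabled shows whether the kernel config change had the intended effect.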

I’d really ping the Octavo developers, specifically Eric Welsh…

Regards,

When setting 1GHz in u-boot, it seems to just randomly halt during the boot process. No errors are printed, and it often gets to different stages of the process before halting, but it never reaches the kernel. It usually hangs after printing board info. Once Linux takes over, it can set the system to 1GHz just fine, but then reboot hangs. I am not really sure what to make of this.

How are you powering the OSD335x: USB or the 5V bus? Have you shared the schematic, specifically the power section?

Regards,

The processor is powered by 5V input. I’ll see about sharing a schematic, but I’ll have to check on what I can and can’t share. At a glance, the schematic looks ideal according to the documentation from Octavo.

I’ve gotten the latest 4.19.x RT kernel from bb-kernel and dropped it in. It successfully sets the system to 1GHz in Linux, but still can’t reboot. This is the last thing I see on the console before it hangs:
[image: console screenshot]

Hi @will_eccles, looking at your screenshot, that’s not systemd… What init system are you using? We had a ton of issues pre-systemd; I haven’t touched a non-systemd system in years, that would have been in the 3.8.x era…

At that point, the “RTC” should be resetting the system…

This:

ti,pmic-shutdown-controller;

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/am335x-osd335x-common.dtsi?h=v5.17-rc6#n63

and this:

system-power-controller;

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/am335x-osd3358-sm-red.dts?h=v5.17-rc6#n431

Enable that…

Regards,

Hi Robert,

I believe those are both enabled. Our device tree is a bit of a mess right now, but I’ve attached it for your reference (I’ve had to obfuscate the names of GPIO lines, etc.). We are using busybox init as I ran into some issues with systemd and its lack of determinism during boot (as well as the long boot times it causes). It’s also worth noting that I have tried with both the internal and external RTC 32K OSC, so I know it’s not that.
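To check this against the device tree the kernel actually booted with, rather than the source .dts, the properties can be looked up under /proc/device-tree, where boolean properties show up as empty files. The sketch below assumes that path is available on the target; `dt_has_prop` is a hypothetical helper name.

```shell
# Return success if any node in the live device tree carries the given
# property name. (Hypothetical helper; the optional second argument
# overrides the tree root.)
dt_has_prop() {
    prop=$1
    root=${2:-/proc/device-tree}
    # The trailing slash makes find descend even when the root is a symlink,
    # which /proc/device-tree usually is.
    find "$root/" -name "$prop" 2>/dev/null | grep -q .
}
```

For the two properties mentioned above, that would be `dt_has_prop ti,pmic-shutdown-controller && dt_has_prop system-power-controller && echo "both present"`.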

am335x-osd3358-sm-red.dts (17.3 KB)

Does your busybox build have acpid enabled?

That was one of the old workarounds…

Regards,

Here are some excerpts from our schematic:


I’ll look into the ACPI thing now. I don’t believe acpid is enabled. Does it require any kernel/bootloader configuration to match?

Enabling acpid doesn’t seem to have had any effect. I am not sure what’s going on here. It’s odd that the pocketbeagle has no issue updating u-boot to force 1GHz, but it doesn’t work on my system with the same package. I’d like to see if I can get u-boot to work properly at 1GHz, removing the need for Linux cpufreq in the first place (and it may also solve this issue anyway). I’ve tried everything I could think of, but I’m not sure where it’s going wrong, and it’s inconsistent as well.