Skip to content

Conversation

@foonerd
Copy link
Contributor

@foonerd foonerd commented Dec 18, 2025

Problem

This PR adds defensive initialization (stabilization delays and retry logic) to both Waveshare DSI panel drivers to address intermittent cold boot failures.

Users report intermittent dark screens on cold boot with Waveshare DSI displays on Raspberry Pi 4. The display works after warm reboot but fails after extended power-off periods. Journal shows:

ws_touchscreen 10-0045: I2C write failed: -5

Error -5 is EIO - the panel controller is not ready when the driver attempts I2C communication immediately after obtaining the handle.

Missing

In panel-waveshare-dsi.c, the probe function fires three I2C writes immediately after setting ts->i2c = i2c with no stabilization delay. The ws_panel_i2c_write function makes a single attempt with no retry logic.

Similarly, panel-waveshare-dsi-v2.c sends MIPI DSI DCS commands without retry logic, making it vulnerable to the same timing issues.

On cold boot, the panel controller requires time to stabilize after power-up. Warm boots succeed because residual state keeps the controller responsive.

Fix

Add minimal defensive initialization following patterns used in production embedded drivers:

  1. Stabilization delay before first communication attempt
  2. Retry logic with configurable attempts and inter-retry delays

Changes are surgical - only the initialization path is modified.

Testing

Validated on Raspberry Pi 4 Model B Rev 1.4 with Waveshare 7.9" DSI display running Volumio 4.081 (kernel 6.12.47). Multiple overnight cold boots successful with patched driver.

Affected Panels

panel-waveshare-dsi.c (I2C-based)

  • 2.8", 3.4", 4.0", 5.0", 6.25", 7.0"C, 7.0"H, 7.9", 8.0", 8.8", 10.1", 11.9", 13.3"

panel-waveshare-dsi-v2.c (MIPI DSI DCS-based)

  • 3.4"C, 4.0"C, 5.0"A, 5.5"A, 7.0"A, 7.0"B, 8.0"A, 8.8"A, 9.0"B, 10.1"A, 10.1"B, 12.3"A

The Waveshare DSI panel driver can fail to initialize on cold boot with
"I2C write failed: -5" (EIO) errors. This occurs because the panel
controller requires time to stabilize after power-up, but the driver
attempts I2C communication immediately after obtaining the handle.

Warm boots succeed because residual controller state keeps it responsive.
Cold boots after extended power-off periods fail intermittently depending
on timing conditions (ambient temperature, power supply characteristics).

Add defensive initialization:
- 100ms stabilization delay before first I2C write in probe
- Retry logic (3 attempts, 50ms between retries) for I2C writes
- Return value propagation from ws_panel_i2c_write

These patterns follow established practice in production embedded panel
drivers and resolve the cold boot initialization race.

Signed-off-by: Andrew Seredyn <andser@gmail.com>
…t reliability

Similar to the I2C-based Waveshare panel driver, the MIPI DSI DCS-based
driver (v2) can fail on cold boot when the panel controller is not ready
to receive commands immediately after reset.

Add defensive initialization:
- 50ms stabilization delay at start of DCS command sequence
- Retry logic (3 attempts, 50ms between retries) for DCS writes

This ensures reliable initialization across varying cold boot timing
conditions.

Signed-off-by: Andrew Seredyn <andser@gmail.com>
@foonerd
Copy link
Contributor Author

foonerd commented Dec 18, 2025

@pelwell
Copy link
Contributor

pelwell commented Dec 18, 2025

Displays are not my field, but the patches look sensible to me. The only thing I saw is that the change for the V1 displays has a dedicated WS_INIT_DELAY_MS value, while the V2 reuses WS_DSI_RETRY_DELAY_MS. I think this deserves a separate constant, even if they end up being the same value. And it's probably worth allowing an extra 50ms for the display to wake up, given that it isn't retried.

@6by9
Copy link
Contributor

6by9 commented Dec 18, 2025

I'm reviewing this already. Various comments on things.

Responsibility for this driver is with @waveshare, so a review from them would be welcome.

ts->i2c = i2c;

/* Allow panel controller to stabilize after power-up */
msleep(WS_INIT_DELAY_MS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you waiting on for this? No commands have been issued to the display at all by the driver at this stage, nor control of any power rails (there is no control). The display will have got power when the Pi was powered on, so adding 100ms here means nothing.

ts->i2c = i2c; does not obtain an I2C handle, it is just storing a handle given to it by the probing framework within the state structure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent was to handle cases where probe occurs before the panel controller IC is ready to accept I2C commands after power-on. The actual fix is the retry logic in ws_panel_i2c_write - the initial delay was belt-and-suspenders.
Options:

Remove the msleep entirely and rely on retry logic alone
Move the stabilization delay into the first iteration of the retry loop

Perhaps the retry logic alone may be sufficient given the 50ms inter-retry delays.

int i, err = 0;

/* Allow panel controller to stabilize after power-up/reset */
msleep(WS_DSI_RETRY_DELAY_MS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've already had a msleep(60) in ws_panel_prepare after setting reset high, or 20ms after having enabled avdd. Extend that delay if you view that to be too short as that is what they are explicitly there to do.

Or if it is one particular panel type that is slow, add an _INIT_DELAY_CMD entry at the start of that table.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - the delay infrastructure already exists in ws_panel_prepare. Adding another msleep here is redundant.
If specific panels need longer stabilization, DELAY_CMD in their init tables is the approach.

#include <video/mipi_display.h>

#define WS_DSI_RETRIES 3 /* Number of retry attempts for DSI operations */
#define WS_DSI_RETRY_DELAY_MS 50 /* Delay between retries in milliseconds */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 retries with 50ms between would be 150ms per command. I make it nearly 200 commands per table. Worst case situation is you've just added a 30second delay.

Why do you believe there needs to be a delay between retries? DSI is an uncontended point to point interface, so it's not as if you need to wait for the bus to clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. DSI is point-to-point with no bus contention - if a command fails, waiting and retrying does not address the underlying cause.
The reported cold boot failures were on I2C-based panels (7.9", 11.9") where retry logic makes sense for transient bus issues. I do not have evidence of the same problem on DSI-based panels.

#define WS_DSI_DRIVER_NAME "ws-ts-dsi"

#define WS_I2C_RETRIES 3 /* Number of retry attempts for I2C operations */
#define WS_I2C_RETRY_DELAY_MS 50 /* Delay between retries in milliseconds */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here as to why delay between retries. Multiple I2C masters on the same bus aren't supported by the Pi, so what do you believe is happening during the 50ms you're delaying for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The observed failure mode is cold boot - driver probes before the panel controller IC completes its internal power-on initialization. First I2C write returns -5 (EIO).
The delay between retries gives the IC additional time since power-on. Not waiting for bus clearance - waiting for the IC to become responsive.
However, if the IC is not ready at probe time, would returning -EPROBE_DEFER be more appropriate than retry loops? Let the kernel reschedule the probe rather than busy-waiting in the driver.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question for @waveshare then.

The microcontroller that is on the other end of the I2C interface will have been powered up with the Pi. Getting to the point where this driver will poke the controller is going to be multiple seconds down the line. What on earth is the microcontroller doing to take that long to be ready to process I2C requests, and how can adding 100ms be enough to make it happy?!

@pelwell
Copy link
Contributor

pelwell commented Dec 18, 2025

It does seem odd that the addition of only a 100ms delay allows the display to be initialised, but a delay is what @waveshare have requested. From the linked thread:

Waveshare’s response confirms the diagnosis - they acknowledge it is a timing issue where the panel controller is not ready when the driver initializes.

Are there, for example, any regulators etc. involved, with the possibility of a race between the regulator being enabled and the panel driver being probed?

@6by9
Copy link
Contributor

6by9 commented Dec 18, 2025

It does seem odd that the addition of only a 100ms delay allows the display to be initialised, but a delay is what @waveshare have requested. From the linked thread:

Waveshare’s response confirms the diagnosis - they acknowledge it is a timing issue where the panel controller is not ready when the driver initializes.

Are there, for example, any regulators etc. involved, with the possibility of a race between the regulator being enabled and the panel driver being probed?

No, there are no regulators involved with this driver, so adding 100ms seems bogus.
It's an I2C connected microcontroller that implements a couple of commands to set up a DSI to DPI bridge (ICN6211) and control the backlight, very much like our V1 panel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants