Code cleanup and null checks #373
Conversation
Pull request overview
This pull request adds defensive null checks and improves code quality in the AsyncEventSource module. The changes focus on preventing potential null pointer dereferences and improving the handling of disconnected clients.
Changes:
- Added null checks for the `_client` pointer in the `_queueMessage` and `_runQueue` methods
- Refactored `avgPacketsWaiting()` to use a ternary operator for cleaner division-by-zero prevention
- Added `connected()` checks in `send()` and `_adjust_inflight_window()` to skip disconnected clients
- Improved `_adjust_inflight_window()` to use the connected client count instead of the total client count
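The overall shape of these changes can be sketched roughly as follows. This is a minimal illustration only: `FakeClient`, `packetsWaiting()`, and the function body are invented stand-ins, not the library's actual implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an AsyncEventSourceClient; member names are
// assumptions for illustration, not the real library API.
struct FakeClient {
  bool isConnected = false;
  size_t queued = 0;
  bool connected() const { return isConnected; }
  size_t packetsWaiting() const { return queued; }
};

// Ternary-based division-by-zero prevention, as the PR overview describes
// for avgPacketsWaiting(): only connected clients contribute, and an empty
// (or fully disconnected) list yields 0 instead of dividing by zero.
size_t avgPacketsWaiting(const std::vector<FakeClient> &clients) {
  size_t aql = 0, nConnected = 0;
  for (const auto &c : clients) {
    if (c.connected()) {  // skip disconnected clients
      aql += c.packetsWaiting();
      ++nConnected;
    }
  }
  return nConnected ? (aql / nConnected) : 0;
}
```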
Re this one and #370: there's a recurring pattern/problem in this codebase regarding the ownership and life cycle of 'SomeClient' type objects when they're also attached to a "SomeServer" type object that wants to operate on a list of them. In the original AsyncServer->AsyncWebRequest->AsyncWebResponse architecture, the Request and Response objects are "self-owned" and managed by their AsyncClient's life cycle. But with AsyncWebSocketClient and AsyncEventClient, we must have a second reference to these objects in a list in their associated Source type, making cleanup of these objects somewhat more complex: they cannot simply be deleted when their AsyncClient terminates, as they might be in use by another task; and that other task has to still interlock with the fact that the AsyncClient has been invalidated. (AsyncTCP handles this gracefully, as it's now interlocked internally, but I haven't personally reviewed the older ESP8266 ESPAsyncTCP; perhaps I should.)
I think we probably want to consider building a common data structure to manage this pattern across the different use cases in this library -- something that allows us to safely do for(client& c: clients) c->write() without getting into trouble. (Also: the major feature I added to the WLED AWS fork is to recycle that pattern for AsyncWebRequest as well, so as to support a request queue to manage heap load.) I spent some time looking at this, but I don't yet have a coherent API suggestion. I think we can do some magic with std::shared_ptr/std::weak_ptr and making a temporary stack copy of the list contents during iteration to minimize a lot of the friction.
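A minimal sketch of that std::shared_ptr/std::weak_ptr snapshot idea. All type and method names here are invented for illustration; this is not a proposed API for the library, just the shape of the pattern.

```cpp
#include <algorithm>
#include <memory>
#include <mutex>
#include <vector>

// Toy client: the server never owns it outright, it only holds weak refs.
struct Client {
  int written = 0;
  void write() { ++written; }
};

class ClientList {
 public:
  void add(const std::shared_ptr<Client> &c) {
    std::lock_guard<std::mutex> lock(_mtx);
    _clients.push_back(c);
  }

  // Broadcast over a temporary stack snapshot: the lock is held only while
  // copying, not while calling write(), and expired entries (clients already
  // destroyed, e.g. by a disconnect on another task) are pruned as we go.
  size_t broadcast() {
    std::vector<std::shared_ptr<Client>> snapshot;
    {
      std::lock_guard<std::mutex> lock(_mtx);
      _clients.erase(std::remove_if(_clients.begin(), _clients.end(),
                                    [&](const std::weak_ptr<Client> &w) {
                                      if (auto s = w.lock()) {
                                        snapshot.push_back(std::move(s));
                                        return false;  // keep live entry
                                      }
                                      return true;  // client already destroyed
                                    }),
                     _clients.end());
    }
    // Safe iteration: the snapshot's shared_ptrs keep the clients alive even
    // if the list is mutated concurrently while we write.
    for (auto &c : snapshot) c->write();
    return snapshot.size();
  }

 private:
  std::mutex _mtx;
  std::vector<std::weak_ptr<Client>> _clients;
};
```

The key property is that a client destroyed mid-broadcast simply fails to lock and is skipped, rather than being dereferenced as a dangling pointer.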
```cpp
    ++hits;
  } else {
    ++miss;
    if (c->connected()) {
```
Do we also need a client-pointer-is-not-null check here? (and all the other c->connected() pattern usages?)
Not in this PR at least, because c is the unique_ptr of an AsyncEventSourceClient object wrapping the _client pointer. c->connected() is checking for a null _client ptr behind it. c is not supposed to be null.
This PR is just some code cleanup and null checks.
I would rather discuss how to correctly protect the iteration over the AsyncEventSourceClient list in #370 which was opened for that goal.
Yes, that's typically the kind of effort I wanted to initiate through #370, which is not ready for review yet: it was just an attempt to prove that the source of the issue @zekageri saw was the modification of the list during its iteration. But there were also some missing NPE checks. Let's focus on this cleanup PR first, and then we will see how to correctly solve this problem.

I have removed the cleanup call entirely and tested the whole day, but so far no crash.

Thanks a lot for being proactive! That's what I was going to ask you... to test this branch. I saw some missing null checks, so I was hoping they would solve the issue you saw. So that's perfect then!

Ah wait! I think I misunderstood... Can you please test this PR branch? It only has some null checks but still has the list erase doing a mutation in the disconnect handler. If this branch fixes your issue, then it was a missing null check. If this branch still has the bug, then it is the list mutation that is problematic. Thanks 🙏
Oh okay, will test it tomorrow

Once there is a final version, I'm happy to retest on 8266 as well. Feel free to ping me here or reach out on discord (same handle)
@bdraco: this PR is final and rebased on top of main, which also includes your previous PR. So it would be nice if you could also test this branch and give your go-ahead for the merge 👍 Note: we can merge this PR even if @zekageri still sees the issue on ESP32 (regarding the list mutation), because IF there is still an issue regarding list mutation on ESP32, we will fix it in another PR.

Great, I'll test it after we finish with the ESPHome patch release

Testing passes on the reproducer. Trying ESPHome now.
Can no longer repro the crash on ESPHome with heavy reloads (tested with this PR + other fixes, since main isn't merged in here):

```diff
diff --git a/esphome/components/web_server_base/__init__.py b/esphome/components/web_server_base/__init__.py
index d5d75b395d..4be653c362 100644
--- a/esphome/components/web_server_base/__init__.py
+++ b/esphome/components/web_server_base/__init__.py
@@ -48,4 +48,5 @@ async def to_code(config):
     if CORE.is_libretiny:
         CORE.add_platformio_option("lib_ignore", ["ESPAsyncTCP", "RPAsyncTCP"])
     # https://github.com/ESP32Async/ESPAsyncWebServer/blob/main/library.json
-    cg.add_library("ESP32Async/ESPAsyncWebServer", "3.7.10")
+    # Testing PR #370 for ESP8266 SSE crash fix
+    cg.add_library("https://github.com/bdraco/ESPAsyncWebServer.git#pr-370", None)
```

Pages of logs on the console though, but that's expected. Might be nice to track dropped messages and only log every so often that x messages were discarded, to reduce the serial blocking, but a nice-to-have for sure.
Tested rtlxxx with libretiny as well. All good.

ESPHome PR memory impact analysis: esphome/esphome#13467 (comment)
Yes, we know. Logging is something we would like to improve, along with a way to collect stats instead of logging like that. FYI @me-no-dev - you came up with this idea last time as well.

Thanks a lot for your testing @bdraco! I will ask the team for approval / review. Do you need a release just after for your upgrade? We can tackle the esp32 issue in a follow-up PR / release.

A release in the next week would be great timing, as it gives time to bake in 2026.2.x-dev for 3-4 weeks and for the ESP8266 users that test dev to report issues.

@me-no-dev @vortigont @willmmiles: if you can, have a look and approve / merge. Thank you!

I don't see the problem anymore. Strange, since I have removed the cleanup call as well. Maybe it was another issue I had with TLS access. Sorry if it was a false report.
I have added some null checks in this area so maybe that helped. |
Adds some null checks to avoid any usage of the _client pointer while a loop is in progress calling send() and a disconnect event arrives at the same time (the disconnect event frees the _client pointer).
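As a rough model of the race described above — all names here are illustrative stand-ins, not the library's real types — the defensive checks look something like:

```cpp
#include <cstddef>
#include <queue>
#include <string>

// Toy TCP client; in the real library this would be the AsyncClient that the
// disconnect event frees.
struct TcpClient {
  bool canSend() const { return true; }
};

class EventClient {
 public:
  explicit EventClient(TcpClient *c) : _client(c) {}

  // Defensive null check at the entry point: a disconnect arriving between
  // send() calls clears _client, so we re-check it before dereferencing.
  bool send(const std::string &msg) {
    if (_client == nullptr) return false;  // already disconnected, drop
    _queue.push(msg);
    _runQueue();
    return true;
  }

  void onDisconnect() { _client = nullptr; }  // disconnect frees the client
  size_t sent() const { return _sent; }

 private:
  void _runQueue() {
    // Same guard inside the queue runner, in case the disconnect landed
    // after send() checked but before the queue was drained.
    while (_client != nullptr && !_queue.empty() && _client->canSend()) {
      _queue.pop();
      ++_sent;
    }
  }
  TcpClient *_client;
  std::queue<std::string> _queue;
  size_t _sent = 0;
};
```

Note this sketch only models the single-task case; the concurrent-iteration problem discussed around #370 (list mutation during iteration) still needs the separate fix described earlier in the thread.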