I suggest explicitly declaring a struct (say, IMXRT_TMR_CH_t) that collects the registers of a TMR channel, instead of implicitly defining this struct inside the declaration of IMXRT_TMR_t.
This would make it easier to define user classes that model a TMR channel.
Not a big deal, I could also do that myself in the class code, but it would be more convenient if this were done in imxrt.h.
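A minimal sketch of what this could look like, assuming the channel register order below. The field list follows my reading of the QTMR register map and should be checked against imxrt.h; TmrChannel is a hypothetical user class, not existing code.

```cpp
#include <cstddef>
#include <cstdint>

// Proposed standalone type for one TMR channel (register order is an assumption).
struct IMXRT_TMR_CH_t {
    volatile uint16_t COMP1;
    volatile uint16_t COMP2;
    volatile uint16_t CAPT;
    volatile uint16_t LOAD;
    volatile uint16_t HOLD;
    volatile uint16_t CNTR;
    volatile uint16_t CTRL;
    volatile uint16_t SCTRL;
    volatile uint16_t CMPLD1;
    volatile uint16_t CMPLD2;
    volatile uint16_t CSCTRL;
    volatile uint16_t FILT;
    volatile uint16_t DMA;
    volatile uint16_t unused1[2];
    volatile uint16_t ENBL;
};

// The module type keeps the same memory layout, it just reuses the named type.
struct IMXRT_TMR_t {
    IMXRT_TMR_CH_t CH[4];
};

static_assert(sizeof(IMXRT_TMR_CH_t) == 32, "one channel spans 16 half-words");
static_assert(offsetof(IMXRT_TMR_CH_t, ENBL) == 0x1E, "ENBL is the last half-word");

// Hypothetical user class: it can now take a channel by reference,
// without knowing which TMR module the channel belongs to.
class TmrChannel {
public:
    explicit TmrChannel(IMXRT_TMR_CH_t &regs) : regs_(regs) {}
    void setCompare(uint16_t value) { regs_.COMP1 = value; }
    uint16_t counter() const { return regs_.CNTR; }
private:
    IMXRT_TMR_CH_t &regs_;
};
```

On hardware one would pass something like IMXRT_TMR1.CH[0] to the constructor. Without the named type, the class has to take an IMXRT_TMR_t plus a channel index, or fall back on decltype.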
Edited here on the web, then copied to a local machine where it built, as tested here: https://forum.pjrc.com/threads/54711-Teensy-4-0-First-Beta-Test?p=209959&viewfull=1#post209959
Proposed locations for adding startup_early_hook() and startup_late_hook(): just before the two existing waiting while() loops.
With the debug option 'PRINT_DEBUG_STUFF' disabled, the three T4B2s here reach the EARLY hook about 1.2 ms, and the LATE hook about 45.3 ms, after systick is started.
The location of the current printf_debug_init() call might also become critical at some point, for adjusting 'something' as you did when enabling the Serial4 port for debug. A hook at that point, before clocks and other things are started, could perhaps be startup_reset_hook().
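A rough sketch of how such hooks can be wired up with weak symbols, so it runs on a host PC with GCC or Clang. Everything except the two hook names is hypothetical: fake_startup() and the logged step names only stand in for the real startup code, and the defaults log their name here only to make the order visible (real defaults would presumably be empty).

```cpp
#include <string>
#include <vector>

// Records the order of startup steps so the sketch can be checked off-target.
static std::vector<std::string> startup_log;

// Weak defaults: user code overrides a hook by defining a normal (non-weak)
// function with the same name in another source file.
extern "C" __attribute__((weak)) void startup_early_hook(void) {
    startup_log.push_back("early hook");
}
extern "C" __attribute__((weak)) void startup_late_hook(void) {
    startup_log.push_back("late hook");
}

// Hypothetical stand-in for the startup sequence: each hook is called
// just before one of the two waiting loops.
void fake_startup() {
    startup_log.push_back("systick started");
    startup_early_hook();
    startup_log.push_back("first wait");
    startup_late_hook();
    startup_log.push_back("second wait");
}
```

A user override would then simply be `extern "C" void startup_early_hook(void) { /* user code */ }` in the sketch or a library.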
Add a quick test at the start of the function to return if there are no events pending.
Do this before checking whether we are in an ISR and the like. This cuts the time spent in yield considerably.
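The idea, as a host-runnable sketch. All names here (pending_events, EVENT_SERIAL, in_isr, yield_sketch) are made up for illustration and are not the identifiers used in the core.

```cpp
#include <cstdint>

// Hypothetical bitmask: one bit per event source that has work pending.
static volatile uint32_t pending_events = 0;
constexpr uint32_t EVENT_SERIAL = 1u << 0;

// Instrumentation for the sketch: counts how often the slow path runs.
static int slow_path_runs = 0;

// Stand-in for the real "are we inside an interrupt?" check.
static bool in_isr() { return false; }

void yield_sketch() {
    // Quick test first: a single load and compare when nothing is pending.
    if (pending_events == 0) return;

    // Only now pay for the more expensive checks.
    if (in_isr()) return;
    static bool running = false;
    if (running) return;  // guard against re-entry
    running = true;

    ++slow_path_runs;     // event dispatch would happen here
    pending_events = 0;   // the sketch treats all events as serviced

    running = false;
}
```

Since most yield calls find nothing to do, they now return after the first comparison.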
Put in (probably temporary) band-aids to allow code that uses:
Serial.readBytes(Buffer, cnt);
to function. It simply loops, calling the underlying code for Serial.read().
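Roughly what such a loop looks like, shown on a hypothetical FakeSerial class so it can run off-target. The sketch stops as soon as read() reports no data; whether the actual band-aid waits for a timeout, as Arduino's Stream::readBytes does, is not stated here.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the serial port, with a simple receive queue.
class FakeSerial {
public:
    // Test helper: pretend these bytes arrived on the wire.
    void inject(const char *s) {
        while (*s) rx_.push_back(static_cast<uint8_t>(*s++));
    }

    // Returns the next byte, or -1 if nothing is available.
    int read() {
        if (rx_.empty()) return -1;
        uint8_t b = rx_.front();
        rx_.pop_front();
        return b;
    }

    // Band-aid readBytes: loop over read() until the buffer is full
    // or no more data is available. Returns the number of bytes stored.
    size_t readBytes(char *buffer, size_t length) {
        size_t count = 0;
        while (count < length) {
            int c = read();
            if (c < 0) break;
            buffer[count++] = static_cast<char>(c);
        }
        return count;
    }

private:
    std::deque<uint8_t> rx_;
};
```

A byte-at-a-time loop is slower than a block copy out of the receive buffer, which is presumably why this is only a temporary fix.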
Comment out some of the main program's debug print statements and add a yield call to the main loop.
yield can now call SerialEvent.
Put in a hack so that if the user has not overridden the SerialEvent function, the weak-linked default turns off further calls to itself.
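A sketch of the self-disabling default, assuming GCC or Clang weak symbols. The flag serial_event_enabled, the counter, serial_available() and yield_with_serial_event() are hypothetical names for illustration.

```cpp
// Hypothetical flag checked by yield before calling serialEvent().
static bool serial_event_enabled = true;

// Instrumentation for the sketch: counts calls to the default handler.
static int default_calls = 0;

// Weak default: it only runs if the user did not define serialEvent().
// The first time it is called, it switches off further calls.
__attribute__((weak)) void serialEvent() {
    ++default_calls;
    serial_event_enabled = false;
}

// Stand-in for Serial.available(); pretend data is always waiting.
static bool serial_available() { return true; }

void yield_with_serial_event() {
    if (serial_event_enabled && serial_available()) {
        serialEvent();
    }
}
```

If the user defines their own serialEvent(), the linker picks that one, the flag is never cleared, and yield keeps calling it. Otherwise the cost is a single wasted call.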