In full turbo, and when approaching it, the 7135s carry less and less of the current as the FET PWM ratio increases, so less heat is generated near the driver/MCU temperature sensor, hence the increased lag.
A quick and dirty way to fix this might be to mount a power resistor as close as possible to the MCU, connected to the LED drive output. This way the parasitic loss in this heater resistor would provide the quick response needed for PID type strategies, compensating for the lower contribution from the 7135s.
If the driver were arranged so the resistor was only powered by the FET, not the 7135s, there would be no parasitic loss in 7135 modes, but that might require a second small FET just to drive it.
Edit: or use a gateable constant-current source, shorted to ground, as the heater, rather than resistor+FET.
Otherwise the classic method is to use feed-forward in addition to PID, but this requires a pretty good model of the overall behaviour and is not for the faint-hearted.
Since we are not trying to thermally control it, simply to turn it down just before it overheats, I reckon some straightforward mapping of the behaviour could work.
Mapping could be attempted by taking a series of step-response measurements with a thermocouple on the head, timing how long each power level takes to go from the start temperature to the critical temperature. Maybe map 8 or 16 different power levels, from full 7135 to full FET. These times could then be used, scaled by the head temperature measured at the MCU, to determine a step-down time for each power setting, with the step-down either smoothed out or left as a visible step to give a visual indication of what’s happening.
Once back down to full 7135, the universal algorithm takes over again.
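To make the mapping idea a bit more concrete, here is a rough sketch in C of how the lookup might work. Everything in it is an assumption for illustration: the table values are placeholders, not measurements, the 60 °C critical temperature and 25 °C mapping start temperature are made up, and the linear scaling by remaining temperature headroom is only a first guess (real heating curves are not linear). The names stepdown_seconds, secs_to_crit and NO_STEPDOWN are hypothetical.

```c
#include <stdint.h>

#define N_LEVELS    8           /* mapped power levels, 0 = full 7135, 7 = full FET */
#define T_CRIT_C    60          /* assumed critical head temperature, deg C */
#define T_CAL_C     25          /* assumed start temperature of the mapping runs */
#define NO_STEPDOWN UINT16_MAX  /* marker: this level never needs a timed step-down */

/* Seconds from T_CAL_C to T_CRIT_C at each level.
 * PLACEHOLDER numbers only; the real ones come from the thermocouple runs. */
static const uint16_t secs_to_crit[N_LEVELS] = {
    NO_STEPDOWN, 600, 420, 300, 210, 150, 100, 60
};

/* Time allowed at 'level' before stepping down, given the head temperature
 * read at the MCU when the level was selected. Assumes (crudely) that the
 * time to critical shrinks linearly with the remaining headroom. */
uint16_t stepdown_seconds(uint8_t level, int16_t start_temp_c)
{
    if (level >= N_LEVELS)
        level = N_LEVELS - 1;               /* clamp to full FET */

    uint16_t base = secs_to_crit[level];
    if (base == NO_STEPDOWN)
        return NO_STEPDOWN;                 /* leave it to the universal algorithm */
    if (start_temp_c >= T_CRIT_C)
        return 0;                           /* already too hot, step down now */

    uint32_t scaled = (uint32_t)base * (uint32_t)(T_CRIT_C - start_temp_c)
                      / (uint32_t)(T_CRIT_C - T_CAL_C);
    if (scaled >= NO_STEPDOWN)
        scaled = NO_STEPDOWN - 1;           /* keep clear of the marker value */
    return (uint16_t)scaled;
}
```

With these placeholder numbers, level 4 started at 42 °C gets 210 × 18 / 35 = 108 seconds. A 16-level map would just be a longer table.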
PS: some more detail on how to use the LED Vf as its own temperature sensor:
http://www.electronicdesign.com/lighting/use-forward-voltage-drop-measure-junction-temperature
PPS: You already have a constant current source on the driver (x1 7135), so if you could arrange a second set of wires to the LED (4-wire probe) feeding an ADC input on the MCU, you could do it, with just the tiniest flicker when taking the measurement. Guessing at a 2 mV/°C slope, you’d be looking to resolve say 200 mV over the range 0-100 °C, on a Vf of say 3 V, which sounds do-able.
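For what it's worth, the arithmetic for that could look something like the sketch below. It assumes the guessed 2 mV/°C slope, that Vf drops as the junction warms (the usual behaviour for LEDs), and that you have one calibration reading of Vf at a known temperature taken at the same measurement current. The function names and the ADC figures (10 bits, 5000 mV full scale) are hypothetical, just to get a feel for the resolution.

```c
#include <stdint.h>

#define SLOPE_MV_PER_C 2   /* guessed magnitude of the Vf slope, mV per deg C */

/* Estimate junction temperature from Vf measured at the fixed sense current.
 * vf_cal_mv / t_cal_c are a single calibration point (assumed available). */
int16_t junction_temp_c(int16_t vf_mv, int16_t vf_cal_mv, int16_t t_cal_c)
{
    /* Vf falls as temperature rises, so a lower reading means a hotter LED. */
    return (int16_t)(t_cal_c + (vf_cal_mv - vf_mv) / SLOPE_MV_PER_C);
}

/* Size of one ADC step in microvolts for a given full-scale and bit depth. */
int32_t adc_lsb_uv(int32_t full_scale_mv, uint8_t bits)
{
    return (full_scale_mv * 1000) / ((int32_t)1 << bits);
}
```

With a calibration point of 3000 mV at 25 °C, a reading of 2900 mV comes out at 75 °C. A 10-bit ADC over 5000 mV gives steps of about 4.9 mV, so roughly 2.4 °C per count at this slope, which is coarse but probably usable for a step-down decision; averaging or a lower reference voltage would improve on it.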