If you converted this to a 4s setup you would get a lot more current to each emitter. I don’t know if that is feasible in this particular light, though.
With 4 18650s in parallel directly connected to 12 XPLs in parallel, you are getting ~29A total, which is only ~2.4A to each emitter. But with 4 18650s in series connected to the emitters wired 4s3p, you would send ~14A total, which is ~4.7A to each emitter. In the all-parallel setup, the higher total current causes a larger voltage drop across the circuit resistances, which leaves less voltage at the LEDs and limits the current each emitter gets.
These numbers were calculated using the method I describe here. It assumes each 18650 cell has 0.03 Ohms of internal resistance, the other circuit resistances (wires, traces, springs, FET, etc.) total 0.025 Ohms, and the LEDs used are XPL V6 1A.
This method graphs the voltage at the LEDs as a function of the current, and the forward voltage of the LEDs as a function of current. The intersection of the two curves is the current (x-axis) that will flow.
All parallel:
Google: 4.15 - x*(.03/4 + .025) and (.19*(x/12) + 2.74)
4s:
Google: (4.15*4) - x*(.03*4 + .025) and 4*(.19*(x/3) + 2.74)
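
If you would rather compute the intersection than read it off a graph, here is a rough Python sketch of the same idea. It assumes the numbers stated above (0.03 Ohms per cell, 0.025 Ohms for everything else), a 4.15V cell voltage, and a straight-line LED model of Vf = 0.19*I + 2.74 per emitter, which is only an approximation of the real XPL curve. The function name and parameters are made up for illustration.

```python
def operating_current(cells_series, cells_parallel, leds_series, leds_parallel,
                      v_cell=4.15, r_cell=0.03, r_other=0.025,
                      vf_intercept=2.74, vf_slope=0.19):
    """Return (total amps, amps per emitter) where the two lines cross.

    Supply line: V = cells_series*v_cell - I*(pack resistance + r_other)
    LED line:    V = leds_series*(vf_slope*(I/leds_parallel) + vf_intercept)
    Both are straight lines (an assumed linear Vf model), so the
    crossing point has a closed-form solution.
    """
    v_open = cells_series * v_cell                        # no-load pack voltage
    r_supply = r_cell * cells_series / cells_parallel + r_other
    v_led0 = leds_series * vf_intercept                   # LED line intercept
    r_led = leds_series * vf_slope / leds_parallel        # LED line slope
    total = (v_open - v_led0) / (r_supply + r_led)
    return total, total / leds_parallel


configs = {
    "4p cells, 12p LEDs": (1, 4, 1, 12),
    "4s cells, 4s3p LEDs": (4, 1, 4, 3),
}
for name, cfg in configs.items():
    total, each = operating_current(*cfg)
    print(f"{name}: {total:.1f} A total, {each:.1f} A per emitter")
```

With those assumptions it gives about 29.2A total (2.4A per emitter) for all parallel and about 14.2A total (4.7A per emitter) for 4s, matching the graphs.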