There can be several reasons for a different multiplier:
1) integration is poor
As Toykeeper pointed out, the integration could be limited because the box is not spherical (I still think a box versus a sphere should not matter that much for integration, but I have never measured it), or because there is no well-positioned baffle blocking light from the source from going directly to the light sensor. As with the bent-pipe device, which has limited integration, everything works acceptably well as long as you measure reflector flashlights with a similar light distribution (beam pattern), always pointed in the same direction, and the calibration was done with such flashlights as well. But a bare LED has an extremely different beam. Getting accurate results for both a directed source like a flashlight and a spread-out source like a bare LED requires much better integration.
To check the integration quality of a measuring sphere/box/pipe, you can use a small zoomie (small to minimise entrance hole effects, see point 2) in spot mode, on a low setting so that the output stays more or less the same (you can check the zoomie for output variation before starting the next step). Then hold it in the same position in the entrance hole and write down the reading while shining it in several directions into the box (straight ahead, 30 degrees in different directions, 45 degrees in different directions). If your values vary by more than a few percent, the integration of the device is limited and different light distributions will affect the reading. If I remember correctly, my 46cm sphere gives variations of 3% maximum, and the reading is still only 10% lower when shining the spot directly onto the baffle.
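The check above boils down to comparing each reading with the straight-ahead one. Here is a minimal sketch in Python; the function name and the readings are made up for illustration and are not measurements from any real sphere.

```python
def integration_spread(readings, reference_key="straight"):
    """Return percent deviation of each reading from the straight-ahead one,
    plus the largest absolute deviation."""
    reference = readings[reference_key]
    if reference <= 0:
        raise ValueError("reference reading must be positive")
    deviations = {
        direction: 100.0 * (value - reference) / reference
        for direction, value in readings.items()
    }
    worst = max(abs(percent) for percent in deviations.values())
    return deviations, worst


# Hypothetical raw meter readings (arbitrary units): same zoomie, same mode,
# same position in the entrance hole, only the aiming direction changes.
example_readings = {
    "straight": 1000.0,
    "30 deg left": 990.0,
    "30 deg right": 1010.0,
    "45 deg up": 975.0,
    "45 deg down": 1020.0,
}

deviations, worst = integration_spread(example_readings)
for direction, percent in deviations.items():
    print(f"{direction:>14}: {percent:+.1f}%")
print(f"worst deviation: {worst:.1f}%")
```

If the worst deviation comes out at more than a few percent, the beam pattern is affecting your readings.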
2) entrance hole effects.
The total reflectivity of your sphere, which determines your multiplier, is also influenced by the reflectivity of the entrance hole. The reflectivity of the entrance hole varies with whatever object is in it. A flashlight with its (large or small) reflector will always cause a higher reflectivity than a bare LED (or an empty hole), so your multiplier will be lower for a flashlight than for a bare LED, but by how much? You can minimise this effect by using a low hole-to-inner-surface ratio (big sphere, small hole), or you can correct your multiplier for ‘hole effects’ every time (3 of my spheres have such a correction option, which uses a built-in constant light source in the sphere). There is a very stubborn misconception on BLF, by the way, that the sphere’s reflectivity must be maximised at all times for the most accurate results (people use inserts that alter the entrance hole size, or white disks around the flashlight). What is really needed is that the reflectivity is kept constant, and you have no way to know (measure) how such an insert influences the reflectivity (and thus the multiplier). A high reflectivity is good and needed, but keeping it constant is far more important, because that is what gives you an unchanging multiplier.
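To make the correction idea concrete: one common way to do it (an assumption here, not necessarily how any particular sphere does it) is to read the constant light source once during calibration, and again with the light under test sitting in the hole but switched off, then scale the multiplier by the ratio of the two. A minimal Python sketch, with a hypothetical function name and made-up numbers:

```python
def corrected_multiplier(base_multiplier, aux_reading_calibration, aux_reading_now):
    """Scale the multiplier for the current entrance hole situation.

    Assumption: lumens = multiplier * raw reading, and the built-in constant
    source emits the same amount of light on both occasions, so any change
    in its reading comes from a change in the sphere's total reflectivity.
    """
    if aux_reading_calibration <= 0 or aux_reading_now <= 0:
        raise ValueError("readings must be positive")
    return base_multiplier * aux_reading_calibration / aux_reading_now


# Made-up numbers: the constant source read 200.0 at calibration, and 210.0
# with a reflector flashlight (switched off) in the entrance hole.
multiplier = corrected_multiplier(0.50, 200.0, 210.0)
print(f"corrected multiplier: {multiplier:.4f}")
```

A higher reading from the constant source means a more reflective hole, so the corrected multiplier goes down, which matches the flashlight versus bare LED difference described above.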
You can get a feeling for your entrance hole effect by making a disc of aluminium foil that fits around your bare LED and is the size of the entrance hole of the measuring device (a flat disc of foil, not cone-shaped like a flashlight reflector). This mimics the reflectivity of a flashlight. Then measure the LED at the same current with and without the aluminium disc in place.
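The two readings from that foil test can be turned into a percentage like this. Again this is only a sketch in Python with a hypothetical function name and made-up readings, assuming lumens = multiplier * raw reading.

```python
def hole_effect_percent(reading_bare, reading_with_foil):
    """Percent change in the raw reading caused by the foil disc.

    The LED output is the same in both measurements, so the change is
    purely the entrance hole effect.
    """
    if reading_bare <= 0:
        raise ValueError("bare reading must be positive")
    return 100.0 * (reading_with_foil - reading_bare) / reading_bare


# Made-up readings for the same LED at the same current.
bare = 800.0
with_foil = 840.0

effect = hole_effect_percent(bare, with_foil)
# Same lumens with a higher reading means a proportionally lower multiplier.
multiplier_ratio = bare / with_foil
print(f"hole effect on reading: {effect:+.1f}%")
print(f"multiplier with foil / multiplier bare: {multiplier_ratio:.3f}")
```

The ratio in the last line gives a rough idea of how far apart your flashlight multiplier and bare LED multiplier can be from hole effects alone.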
3) factory specs may differ from real world results.
I’m not sure how your calibration for bare LEDs was done. I assume you used the factory specs for a number of LEDs and worked out an average multiplier from that? It could be that the way the factory measures the LED (for milliseconds, from what I have read somewhere) is different and gives higher values than real-world measurements with the LED continuously lit. This is not something I know a lot about, by the way; I am just posing an assumption here.
I hope this makes a little sense to you, and I hope you will find out where the multiplier difference comes from. And if not, just use the different multipliers for different light sources and live happily ever after. Being obsessive about accuracy (and never really getting there) like me may not be very healthy anyway.