[UPDATE:v1.7.1,Q8&1chanOTSM]bistro-HD, bistro your way. OTSM, eswitch(devel), Vcc reads, safe_presses, turbo timeout...

Thanks for testing it and for the feedback.

So you've got the exact same software and different hardware and you think it's a software problem?

It's not completely impossible for that to be true actually, but it's far from likely.

Do they all work on older versions? You are using a different cap from me, and for the small LDO boards you're not using a diode, something I've never tested myself and it depends on the LDO quality as you've seen. Is this an LDO build? Does it have a diode?

It's possible that you're right on the edge with your hardware, and some work and some don't. Watchdog clocks have a little variation. Capacitors have a little variation (20% actually for yours), etc. My hardware has TONS of room to spare, but if your hardware is marginal, then this can happen.

Have you ever gone through the exercise of determining how much extra cap time you have? Voltage measurements are one way, but the most reliable is to compile with the OTSM-debug config option. I probably should document that better, not that anyone will likely read it. Then on every click it will flash out the number of sleeps that occurred during the press. For long presses you'll either get the number or 255 (or maybe nothing instead of 255). No blinking at all, or 255, means it lost power. A number (in quarter seconds) means it was still going. So you can see what the longest click possible is.
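To make the blink-out easier to read, here is a minimal sketch of the arithmetic, not code from bistro-HD itself. The names `otsm_power_lost` and `otsm_press_ms` are made up for illustration, the 250 ms per wake figure is just the "quarter seconds" stated above, and treating zero blinks the same as 255 is an assumption taken from the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one counted sleep (watchdog wake) is about a quarter second. */
#define OTSM_MS_PER_WAKE 250u
/* Assumption: 255 is what gets blinked when the count did not survive. */
#define OTSM_COUNT_LOST 255u

/* True when the blinked value means the cap ran out during the press.
 * Zero stands in for "no blinks at all". */
static bool otsm_power_lost(uint8_t blinks)
{
    return blinks == 0u || blinks == OTSM_COUNT_LOST;
}

/* Convert a blinked wake count into milliseconds of press time.
 * Returns 0 when the count indicates that power was lost. */
static uint32_t otsm_press_ms(uint8_t blinks)
{
    if (otsm_power_lost(blinks)) {
        return 0u;
    }
    return (uint32_t)blinks * OTSM_MS_PER_WAKE;
}
```

With these assumptions, a blink-out of 48 corresponds to 12000 ms, i.e. a 12 second press that the cap survived.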

Did you try them with an earlier version? Even if the answer is yes and they worked, it doesn't change the possibility of things being right on the edge, but it gets more interesting. I have tested timing on 3 boards with >1.5 builds and they all work. 2 of those were OTSM and one was e-switch, which never runs out of power. Anyway, let me know if and which earlier version worked.

If you're using different hardware though, you really should go through the exercise of running that test. I haven't invoked that config option in a while, so let me know if it's not working. There's a chance the debug blink-out got moved too early with some changes.

All this said, there is a known bug (that I'm zooming in on slowly, I hope) that causes very occasional instability. It was introduced somewhere recently, I think; I'm guessing between 1.4.1 and 1.5. Although it does present strangely, it's never anything remotely like what you're describing. Still, if 1.3-r2 works for you, great. (I may re-post it soon.)

I'll take note though and keep an eye out for poor timing performance.

They were normal 1S builds,
same hardware as usual.

I flashed v1.3 on them after v1.5 failed and it worked flawlessly.

I did one v1.3 on 2S with a diode after the LDO and it worked fine.

hmm,

Same hardware as usual still doesn't tell me if your hardware is on the edge of working or not though.

How does one board work and two don't? It's the same software. It's very strange. If they all have room to spare, that would be interesting to know.

However, 1.3-r2 is still the most stable build for basic OTSM and I'll repost it when I get a chance. Maybe when I fix the other bug we'll see if it fixes your issue too. I don't see how, but I don't know.

I'm zooming in on the other one.

If you feel like flashing 1.4.1 (not the same as 1.4) and seeing how it goes, that might be helpful too. There was some rearrangement of the timing code from there to 1.5 and 1.6 to make it more readable. But it’s still hard to read too much into any timing test without really knowing for sure what the hardware is able to do. Measuring sleep time capability of the hardware has been the standard benchmark for everyone trying OTSM hardware development from the start. It’s not that I’m saying your hardware is bad either, it’s just a matter of pinning things down and ruling things out. We’ve got a matrix that looks like this:

              Hardware 1    Hardware 2
Software A    works         works
Software B    works         !works

And then I’ve got more hardware in category 1: three boards, all very different builds.
So is that a software problem or a hardware problem? I just don’t know.

Has anyone tried using Bistro with an Eswitch? What is your experience? Is it intuitive or awkward?

Hi light-rider. I'm the only one who has a version that actually works more or less right with e-switch. All except the Q8 version are marked in the manual as highly experimental, and the Q8 was pulled from that category slightly prematurely.

So yes, I've tried it, if that counts. I like it just fine. It's simple. It's bistro. It's long click off, but unlike Narsil there's no lockout on mode change, which I like. I can add shortcuts later, for example shortcuts from on to turbo or memory (with a medium press or long press, or maybe even a double click).

Unfortunately, in some version after 1.3-r2 (not sure which one yet, and I will repost 1.3-r2 shortly) there is a bug that affects both e-switch and OTSM (but is worse on e-switch), and those are also the versions where e-switch is most developed. The bug creates a persistent instability after a significant amount of click switching, likely because of a race (it takes precise timing of the click to trigger it). It locks the light or otherwise makes it misbehave (sometimes clicks still work, but it misbehaves again after clicking), and it requires detaching battery power to reset (twisting the tube).

It's probably a combination of a race and some kind of corruption, and that can be a very hard combination to pin down, especially since it's rare and I can't reproduce it in the simulator, but I'm getting closer, much closer. So once I get that fixed I'll release a new e-switch build.

I have a pruned down test build that for whatever reason manifests the bug more easily than the real build. I've been diffing it against 1.3-r2 and systematically testing partial reverse patches of all interesting differences. I've narrowed it down quite a bit. I don't have that much time though, so I'm chipping at it slowly.

Unfortunately, with a bug like this it's possible I'm narrowing down code that enables or unmasks the bug but isn't the root cause. It's even possible that the root cause existed in 1.3-r2. I've also ordered a cheap hardware debugger of sorts, so hopefully that can help me understand the state of the machine after the bug happens. That's hard work to analyze (memory dumps, comparison with the simulator, etc.), but it could really pinpoint the cause. Slow going.

The hardware is still the TA triple build.

The parts are listed in the TA topic for the driver.
I just changed C2 to this capacitor; the rest are the parts as you listed them.

The FET is a SIR404, but that should make no difference.

If I flash v1.5 and it does not work, then flash v1.3 and it works, why should the hardware be the reason?

I never tested your cap. I'm not saying it's bad. It's unlikely it's bad, but it's not identical specs to mine and specs aren't testing. If I bought or tried a different cap I'd absolutely test it with the OTSM-debug.

It's also unlikely that software can somehow behave differently on identical hardware, don't you think? The only way I can imagine is through RAM decay to different values, but there is no RAM decay on OTSM medium presses. It's still powered.

So we have two very unlikely explanations. One of them must be right but which one? For all I know you got a bad flash on 2 of them. I'm not ruling out software as PART of the issue (there's obviously a hardware aspect to the explanation, and there's obviously a software aspect, which is mainly responsible? again, I don't know). It's just a pretty big mystery at the moment.

Are you serious with that question?

[quote=Flintrock talking in his head]

If you flash 1.5 on one board you made and it works, then flash 1.5 on another board you made and it doesn't work. Why should software be the reason?

[/quote]

We can ask that both ways clearly. Did you see the matrix?

Both of those 1.5's were bit for bit identical, so what was different in those two cases? Not software obviously. Is the software deficient in dealing with hardware variation... possibly, but the difference is clearly, certainly hardware. Is the software significantly worse in 1.5 but variation in hardware meant some hardware still barely worked? Possibly. Or is the software only very slightly different in 1.5 and hardware that was already barely working failed? You don't know. I don't know. You can guess, and might even guess right.

All the OTSM drivers I have built have components from the same reel, so there should be no different MCU revision or other bad hardware change.

My guess is there is something on the edge with newer versions that makes it fail, or work only sometimes.
I downloaded newer versions of your OTSM three times after the May v1.3, and they all had issues and did not work properly.

I can only tell you that there is something going on with non-working medium presses when it comes to building multiple OTSM drivers.

If you want to analyze it, I could send you a driver, for some compensation, that works with 1.3 and fails with 1.5.

I've helped you quite a bit (for free, for boards you're selling), and I will keep an eye out for this as well.

I will certainly not test for free and certainly not pay to test your boards with your hardware selection. I will test my boards for timing on various versions, like everyone else who has made new OTSM hardware or software configurations has done. It takes literally 2 minutes to flash a version and say what blinks. It takes hours to develop and debug the versions. Thanks for helping.

It is certainly true that for now 1.3-r2 is the most tested and proven version for basic TA OTSM builds.

So, I am considering making medium press enter the hidden modes but short press cycle through them. This wouldn't be until after stabilizing present builds.

I don't like making too many UI changes, because the possibilities are endless and I want to stay close to consistent. But medium-pressing back through all the hidden modes can be annoying if you want one quickly. I haven't even thought yet about the details of how to do it. Doing it intelligently probably requires a significant change to the modegroup construction routine, which went through a ton of optimization already. [edit: ok, really it just requires reversing the hidden mode definitions in all the modegroup config files. Doing it that way is an API change, though, from the perspective of imagined modders with their own modegroups, so it should come with a new preproc define to enable it. That's more of an issue the later it's done.]

I very much like this idea of not having to medium press four times before getting to the desired function! Hope it gets worked out.

@Lexel, if a board fails to operate using v1.5, will it work if you erase and reflash with the same v1.5 firmware?

Yeah, it's something I like about Narsil, although I had it in mind before I found it in Narsil. My problem with Narsil, though, is that in strobes, battcheck, etc., it takes a second to realize what mode you're in, and by then Narsil locks you out and you have to start over. A 1.2 second control lockout in custom modes doesn't work unless you've memorized them and your brain isn't engaged in something else.

LightRider, he said he flashed it a few times, if I read right. Bad flashes do happen, but they're rare.

What's clear, if the matrix can be trusted (meaning there wasn't a slight change in supply voltage, temperature, etc.; unlikely, but just making the conditions clear), is that there is a difference in hardware AND there is a difference in software. And so I will DEFINITELY look on my end.

What we don't know is whether the software difference is very small and the hardware difference between his hardware and mine is very big, with small variations within his hardware, or whether all the hardware differences are very small and the software difference is very big.

The only things I know of in software that can degrade the performance a bunch are pin states, but they're set the same. I'll double check that in the simulator. I'll do my part, but I only have one fully functional non-prototype OTSM board and I have to desolder it to test it. I generally test features on my proto board, and just flash whole new versions to the light once in a while to make sure it works. And it does.

It's cake for Lexel to flash out the OTSM-debug values for both builds, because building and flashing boards is what he does, and it would definitely help me out.

Ok, so Lexel didn't say he flashed it multiple times, my mistake.

Anyway, I just ran otsm-debug tests on 1.5 on my light. Desoldered again just to test it.

At 4.2V I could get up to 13 seconds of click time. Should be enough.

At 3.1V I got somewhere between 6 and 7 seconds.

At 2.9V I could get around 4 seconds.

Now what this all proves is that a bit weaker cap should probably be ok. But what it also proves is, 1.5 works fine on good hardware.

So, the issue status is presently changed to WORKSFORME.

Lexel if you want to try to reproduce it and/or provide otsm-debug tests to potentially shed some light on what's going on on your end, that's great. There's nothing more I can say from here since I cannot reproduce your issue.

I also took the chance to use a stopwatch this time with the high voltage tests. 12 seconds read dead on as 48 wakes. Seems fairly accurate.

A theory for lexel's issue.

It occurred to me while I was stuck in muggle mode while stress testing 1.4.1 (so far so good, no freeze bug yet) that there is a way Lexel's issue can happen, and it has nothing to do with which version of the software he flashed.

The first time the light starts, it has no saved data. It tries to read saved data (specifically the last mode), sees there isn't any, and stores all the initialization values to EEPROM. Writes to EEPROM are slow, about 3ms per byte. So if power is interrupted during that first turn-on, some values may not get set, and you may get memory turned off. A reset (menu item 8) will fix it.

There is a change in the software (any version) that can probably prevent this. The value that's checked at startup should be the last one saved during initialization, so that writing it acts as a commit: if it's present, everything before it was written too. It's not that way right now. Simple fix and I'll include it in future releases. Might even release a 1.4.2 with it, since 1.4.1 seems very stable (haven't tested 1.5 as thoroughly yet).
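Roughly what I mean by a commit, as a sketch only and not the actual bistro-HD code. The EEPROM here is a plain array so it can run on a PC, and the layout, `COMMIT_MAGIC`, `init_settings` and `settings_valid` are all made-up names for illustration. The `bytes_before_power_loss` argument just simulates the power being cut partway through the first turn-on.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EEPROM_SIZE  16u
#define ERASED       0xFFu       /* erased AVR EEPROM cells read back as 0xFF */
#define N_SETTINGS   4u          /* hypothetical number of stored settings */
#define COMMIT_ADDR  N_SETTINGS  /* hypothetical marker slot, written last */
#define COMMIT_MAGIC 0xA5u       /* hypothetical marker value */

/* Stand-in for the real EEPROM so the sketch runs anywhere. */
static uint8_t fake_eeprom[EEPROM_SIZE];

static void eeprom_erase_all(void)
{
    memset(fake_eeprom, ERASED, sizeof fake_eeprom);
}

/* Write the defaults first and the commit marker last.
 * Only bytes_before_power_loss byte writes complete, which
 * simulates losing power in the middle of initialization. */
static void init_settings(const uint8_t defaults[N_SETTINGS],
                          unsigned bytes_before_power_loss)
{
    unsigned written = 0;
    for (unsigned i = 0; i < N_SETTINGS; i++) {
        if (written++ >= bytes_before_power_loss) {
            return; /* power lost before this byte */
        }
        fake_eeprom[i] = defaults[i];
    }
    if (written++ >= bytes_before_power_loss) {
        return; /* power lost before the marker */
    }
    fake_eeprom[COMMIT_ADDR] = COMMIT_MAGIC;
}

/* Startup check: trust the stored settings only if the marker,
 * the last byte written, is present. Otherwise re-initialize. */
static bool settings_valid(void)
{
    return fake_eeprom[COMMIT_ADDR] == COMMIT_MAGIC;
}
```

If the power drops after two of the five writes, `settings_valid()` stays false and the next start simply redoes the initialization, instead of trusting a half-written set of values.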

This issue is no different on new builds than old ones, and it's never struck me, but it could be what happened to Lexel.

UPDATE: I did take down 1.5 and 1.6, because neither has achieved its goals yet, because there is a different bug somewhere (not sure when it appeared, maybe even after 1.6), and because 1.4.1 is now well tested. 1.4.1 was targeted specifically as a bug fix release; it was actually created after 1.5, for the purpose of backporting bug fixes from 1.5.