Hello
I thought it would be cool if there was a torch database website that had torches from all different brands, indexed by searchable metadata such as battery type, dimensions, brightness and so on. I started with Nitecore. Their website was pretty easy to scrape and exposes a reasonable amount of metadata for all of their products, even the discontinued ones. Fenix, however, is more problematic: for some annoying reason they display most of the specifications as an image rather than as parseable text. I tried running those spec images through an OCR library, but it only got about 80% of the text, so it wasn’t very reliable. I might look for another OCR library if other avenues don’t pan out.
I’ve scoured a few other websites looking for a good source of data but haven’t found one. I can get the model names from fenixlighting.com easily enough, but getting the rest of the metadata is tricky. Heinnie Haynes has good metadata, but they don’t stock all the Fenix torches, and matching products between the two sites is clunky and unreliable.
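A minimal sketch of how the product matching between two sites could be made a bit less clunky, assuming the problem is mostly inconsistent spelling of model names (case, spacing, hyphens, a leading brand word). It uses Python's standard `difflib` module. The function names `normalise` and `match_models`, the 0.8 cutoff and the sample listings are all made up for illustration and are not taken from either site.

```python
import difflib
import re


def normalise(name: str) -> str:
    """Lower-case, drop the brand word, and keep only letters and digits."""
    name = name.lower().replace("fenix", "")
    return re.sub(r"[^a-z0-9]", "", name)


def match_models(official, shop, cutoff=0.8):
    """Map each official model name to the closest shop listing, or None.

    `cutoff` is an arbitrary similarity threshold between 0 and 1 and would
    need tuning against real data.
    """
    shop_by_key = {normalise(listing): listing for listing in shop}
    matches = {}
    for model in official:
        close = difflib.get_close_matches(
            normalise(model), list(shop_by_key), n=1, cutoff=cutoff
        )
        matches[model] = shop_by_key[close[0]] if close else None
    return matches


# Illustrative sample data only, not scraped from any site.
official = ["PD36R", "TK20R V2.0", "E03R"]
shop = ["Fenix PD36R", "FENIX TK20R v2.0", "Fenix LD-30"]

for model, listing in match_models(official, shop).items():
    print(model, "->", listing)
```

Anything that comes back as `None` would still need matching by hand, and shop titles with extra marketing words in them would probably need a smarter comparison than this.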
If anyone has suggestions for good sources of data for Fenix or even other brands I’d be most grateful. I’ve looked at most of the shop websites I can find but couldn’t find one that would be easy to parse.
My plan is to scrape as much data as possible, to the point where I have reasonable coverage for most of a given company’s products, say 80-90%. Then I’ll build a user-submission platform where people can add extra or missing information or correct things, kind of like a wiki, but search will be the key thing. All fields will be searchable, so you could say “find all 18650 torches with at least 1000 lumens and candela above x” and so on.
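To make the search idea concrete, here is a rough sketch of that example query against an in-memory SQLite table, using Python's standard `sqlite3` module. The table layout, the column names (`battery`, `lumens`, `candela`), the helper names and every row of sample data are assumptions for illustration; the real schema would depend on what metadata can actually be scraped.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory database with one hypothetical `torches` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE torches ("
        "brand TEXT, model TEXT, battery TEXT, lumens INTEGER, candela INTEGER)"
    )
    conn.executemany("INSERT INTO torches VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def search(conn, battery, min_lumens, min_candela):
    """Torches with the given battery, at least min_lumens, and candela above min_candela."""
    cur = conn.execute(
        "SELECT brand, model FROM torches "
        "WHERE battery = ? AND lumens >= ? AND candela > ? "
        "ORDER BY lumens DESC",
        (battery, min_lumens, min_candela),
    )
    return cur.fetchall()


# Made-up brands, models and figures, purely to exercise the query.
SAMPLE = [
    ("BrandA", "Alpha", "18650", 1200, 15000),
    ("BrandA", "Beta", "AA", 300, 2000),
    ("BrandB", "Gamma", "18650", 900, 30000),
    ("BrandB", "Delta", "18650", 1800, 9000),
]

conn = build_db(SAMPLE)
print(search(conn, "18650", 1000, 10000))
```

Keeping numeric fields such as lumens and candela as integers rather than scraped strings like "1000 lm" is what makes the "at least" and "above" comparisons possible, so the units would need stripping at scrape time.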
The best possible data source would be some kind of shop website I can scrape that has lots of brands and has easily parseable metadata for all of them.