• Fenrisulfir@lemmy.ca
    arrow-up
    37
    ·
    9 months ago

    I was working on this with a friend over 10 years ago, but the only grocery store that made a decent effort at organizing its website to be scrapeable was Loblaws. All the others had APIs that cost $100,000.

    • BluesF@lemmy.world
      arrow-up
      2
      arrow-down
      1
      ·
      9 months ago

      Which is one area where ML models might (with the right investment) actually be useful. A model trained to look at web pages and relay information from the content visually, like we do, would be very powerful. The newer ChatGPT models have visual capabilities; I wonder if you could give one a website screen capture and ask it for prices.

      • Joe Cool@lemmy.ml
        arrow-up
        1
        ·
        9 months ago

        Why would you want a model trained on outdated prices? This is not really something LLMs are particularly suited for.
        Maybe to crunch historical data, but not for daily comparisons.

        • BluesF@lemmy.world
          arrow-up
          1
          ·
          9 months ago

          Why would the model be trained on outdated prices? I’m not talking about LLMs, but about a separate model designed to parse visual information - specifically websites - and extract particular elements like prices. My comment about ChatGPT was in reference to the newer models, which can relay visual information; I’m not suggesting that would be the right approach for training a new model.

          The applications would be broader than just prices - this would allow you to scrape any human-readable website without needing to do bespoke development.

    • ToffeeIsForClosers@lemmy.world
      English
      arrow-up
      1
      ·
      9 months ago

      Flipp allows for some of this desired capability now through digital flyer scraping, online feeds, and APIs. Maybe things have gotten better on the API side over time.

      Pretty sure it’s a Canadian app, coincidentally.