2024-10-20
Tried this out: 1) I opened a 110-page PDF and automatically scrolled through each page (0.75s per page) 2) recorded a video of that scroll 3) sent the 1m30s video to Gemini. In 6 seconds Gemini was able to analyse the video and tell me which PDF page the information was on.
Simon Willison's Weblog
Notes on an inexpensive and effective “video scraping” technique in which the user feeds a screen recording into Google AI Studio to extract data with Gemini
Instead of writing a bunch of code, he simply screen-recorded himself scrolling through the emails … X: @defnotbeka: to me, this is a stunning indictment of the state of software ...
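As a rough sanity check on the numbers quoted above (a 110-page PDF scrolled at 0.75s per page, giving a video of about 1m30s), here is a minimal sketch of the arithmetic. The function names are hypothetical and only illustrate the calculation. It assumes a fixed dwell time per page and no pauses, so any gap to the quoted length would be start and stop overhead in the recording.

```python
import math


def recording_seconds(pages: int, seconds_per_page: float) -> float:
    """Scroll time for a recording that dwells a fixed time on each page."""
    return pages * seconds_per_page


def format_duration(seconds: float) -> str:
    """Format a duration as e.g. '1m23s', rounding up to a whole second."""
    minutes, rest = divmod(math.ceil(seconds), 60)
    return f"{minutes}m{rest:02d}s"


# Figures taken from the quoted experiment: 110 pages at 0.75s per page.
total = recording_seconds(110, 0.75)
print(total, format_duration(total))
```

Under these assumptions the scroll itself takes a little under a minute and a half, which is consistent with the quoted video length.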
2024-10-19
Simon Willison's Weblog
Notes on an inexpensive and effective “video scraping” technique in which the user feeds a screen recording into Google AI Studio to extract data with Gemini
The other day I found myself needing to add up some numeric values that were scattered across twelve different emails.