We're diving back into the fascinating realm of LLM-powered hourglass animations. This is part two, building directly on our previous exploration: Generating hourglass animation with Chat GPT, part 1. We'll stick with the same visual design, but this time, we're kicking off with a pre-crafted prompt, honed from our earlier research.
So, let's push the boundaries and see how different LLMs tackle this animation challenge.
Our next stop? Gemini 2.5. The buzz online suggests it's a serious contender, often touted as superior to its peers, which naturally piqued my interest.
My initial foray involved feeding the ChatGPT 4.0-proven prompt directly into the standard Gemini interface (Gemini 2.0). The outcome? Let's just say it fell short, dramatically:
The result bore little resemblance to my sketch, and the sand color was, shall we say, a creative interpretation.
Next, I ventured into "AI Studio", which grants access to the more acclaimed Gemini 2.5 model.
Here's what emerged from the first couple of attempts:
Notably, this was the first model that even attempted to animate the sand flow, albeit in a rudimentary fashion. ChatGPT hadn't offered this "feature" without explicit prompting. While the overall animation remained rough, we're working with what we've got. Let's keep iterating and see where this leads.
At this juncture, I opted for a fresh start, tweaking the prompt slightly. The core concept remained the same, but I emphasized the sand's behavior, stressing that it should settle at the bottom of its designated area.
Here's a snippet of the explanatory guidance I provided:
In an hourglass, there are two sections: the upper section and the bottom section. Sand from the upper section falls into the bottom section. The sand in the upper section forms a downward-pointing triangle that gradually shrinks. In the bottom section, the sand forms an upward-pointing triangle that grows over time.
The triangle of sand in the upper section should not be stuck to the ceiling—this isn’t how gravity works. It should remain within the upper section but be positioned near the bottom of that section.
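To make the geometry in this guidance easier to picture, here's a minimal sketch of how the two sand triangles could be drawn on a canvas. It's my own illustration under assumptions, not output from any of the models; the canvas id, the `drawSand` function, the sand color, and the proportions are all invented.

```ts
// Hypothetical illustration of the geometry described above, not model output.
// Assumes an HTMLCanvasElement with id "hourglass" already exists on the page.
const canvas = document.getElementById("hourglass") as HTMLCanvasElement;
const ctx = canvas.getContext("2d")!;

// progress runs from 0 (all sand in the upper section) to 1 (all sand below).
function drawSand(progress: number): void {
  const w = canvas.width;
  const h = canvas.height;
  const neckY = h / 2;        // vertical position of the hourglass neck
  const maxHeight = h * 0.35; // tallest either sand pile gets

  ctx.clearRect(0, 0, w, h);
  ctx.fillStyle = "#d2a24c";  // sand color (invented)

  // Upper pile: a downward-pointing triangle that shrinks as progress grows.
  // Its flat top sits just above the neck, not stuck to the ceiling.
  const upperHeight = maxHeight * (1 - progress);
  ctx.beginPath();
  ctx.moveTo(w / 2 - upperHeight, neckY - upperHeight); // left end of the flat top
  ctx.lineTo(w / 2 + upperHeight, neckY - upperHeight); // right end of the flat top
  ctx.lineTo(w / 2, neckY);                             // tip pointing down at the neck
  ctx.closePath();
  ctx.fill();

  // Lower pile: an upward-pointing triangle that grows as progress grows,
  // resting on the floor of the lower section.
  const lowerHeight = maxHeight * progress;
  ctx.beginPath();
  ctx.moveTo(w / 2 - lowerHeight, h); // left corner on the floor
  ctx.lineTo(w / 2 + lowerHeight, h); // right corner on the floor
  ctx.lineTo(w / 2, h - lowerHeight); // tip pointing up
  ctx.closePath();
  ctx.fill();
}
```

Driving `progress` from 0 to 1 over the timer's duration gives exactly the shrinking and growing piles the guidance describes.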
This iteration yielded a significantly improved result. The timing, I'd argue, even surpassed ChatGPT's output:
However, the sand stream animation was still absent. Since the earlier attempts had managed it, I aimed to bring that capability back here.
The refined prompt, specifically targeting the falling sand animation, looked like this:
Implement a particle-based animation of sand flowing from the upper section to the lower section. The animation should accurately depict the sand's movement. Specifically:
- Use numerous fine particles for the sand.
- Introduce a degree of natural randomness to the particles' descent.
- The sand accumulation in the lower section should only begin once falling particles make contact.
- Particles should disappear upon impact with the lower section.
The outcome exceeded expectations. Not only did the animation showcase a particle stream, but it also achieved a visually appealing effect, closely aligning with my request. A minor glitch—a brief "flicker" at the end of the filling cycle—didn't detract from the impressive result.
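To give a sense of what a particle-based sand stream like this can look like in code, here's a rough sketch of my own, not the code Gemini actually produced. It reuses the `canvas` and `ctx` variables from the earlier sketch; the spawn rate, jitter, and pile-growth constants are arbitrary.

```ts
// Hypothetical particle-stream sketch, not the code Gemini generated.
// Reuses the canvas and ctx variables from the sketch above.
interface Particle {
  x: number;
  y: number;
  vy: number; // downward velocity
}

const particles: Particle[] = [];
const neck = { x: canvas.width / 2, y: canvas.height / 2 };
let pileHeight = 0; // the lower pile only grows once particles start landing

function stepParticles(): void {
  // Spawn a few fine particles at the neck with slight horizontal randomness.
  for (let i = 0; i < 3; i++) {
    particles.push({
      x: neck.x + (Math.random() - 0.5) * 4,
      y: neck.y,
      vy: 1 + Math.random(),
    });
  }

  const floorY = canvas.height - pileHeight;
  for (let i = particles.length - 1; i >= 0; i--) {
    const p = particles[i];
    p.y += p.vy;
    p.x += Math.random() - 0.5; // small random drift on the way down

    if (p.y >= floorY) {
      // The particle disappears on impact, and only then does the pile grow.
      particles.splice(i, 1);
      pileHeight += 0.02;
    }
  }

  // Draw the surviving particles as fine dots.
  ctx.fillStyle = "#d2a24c";
  for (const p of particles) {
    ctx.fillRect(p.x, p.y, 1.5, 1.5);
  }
}
```

In a real loop you'd call this from requestAnimationFrame right after drawing the triangles (which clears the canvas), tying `progress` and `pileHeight` together so the upper pile shrinks at roughly the rate the lower one grows.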
Now, let's delve into the overall experience of code manipulation with Gemini 2.5. It proved surprisingly smooth and informative. The model efficiently delivered code changes accompanied by clear explanations. Several aspects set Gemini apart from other models I've tested.
Gemini's ability to isolate and provide only the necessary code modifications is a notable feature. While this requires manual integration, it demonstrates the model's contextual awareness.
Furthermore, Gemini employs comments to logically segment the code, enhancing navigation. This proves particularly useful when incorporating new suggestions.
However, this "focused replacement" approach has a caveat: Gemini occasionally reintroduces variables already defined in unchanged code segments, leading to duplication errors. These are easily fixed, but they're a minor annoyance nonetheless.
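To make the annoyance concrete, here's an invented illustration (not Gemini's actual output; `resetSimulation` is a hypothetical function) of the kind of section comments it uses and the redeclaration slip that can come with them:

```ts
// Invented illustration, not Gemini's actual output.

// --- Particle state --- (the kind of section comment Gemini uses to segment code)
let particles: { x: number; y: number }[] = [];

// --- Reset logic --- (a pasted-in suggestion sometimes reopens with its own
// `let particles = [];` here; that duplicate declaration is what triggers the
// error, and deleting it is the whole fix)
function resetSimulation(): void {
  particles = []; // reuse the existing variable instead of redeclaring it
}
```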
Overall, the result was better than I expected. But there's an important issue to keep in mind when working with LLMs this way: good results depend on the context window, which includes the code generated at each step, not just the prompts. The prompts alone don't do the trick. If you eventually got what you wanted, concatenating all of your prompts and passing them to the same model in a fresh chat (a new context window) would not reproduce the result, because a crucial part of the equation would be missing: the code generated at every step along the way.