Nirmalya Ghosh | Applied AI Technologist

TTFT Optimisation: Practical Patterns

How to reduce TTFT in production: practical patterns, implementation strategies, and edge cases to watch for.

Continue reading ...

How Prompt Size Directly Impacts LLM Response Latency

Understanding the mechanics of Time to First Token (TTFT) and why extra prompt tokens can lead to a poor user experience (UX).
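As a rough illustration (not taken from the post), the sketch below measures TTFT with a streaming chat completion. It assumes the OpenAI Python client; the model name and prompts are placeholders.

```python
# Minimal TTFT measurement sketch, assuming the OpenAI Python client (v1.x).
# Model name and prompts are illustrative placeholders.
import time
from openai import OpenAI

client = OpenAI()

def time_to_first_token(prompt, model="gpt-4o-mini"):
    """Return seconds from request start to the first content chunk."""
    start = time.perf_counter()
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # The first chunk carrying content marks the first visible token.
        if chunk.choices and chunk.choices[0].delta.content:
            return time.perf_counter() - start
    return None

short_prompt = "Summarise this sentence: The cat sat on the mat."
long_prompt = "Summarise this document:\n" + ("lorem ipsum " * 2000)

for label, prompt in [("short prompt", short_prompt), ("long prompt", long_prompt)]:
    ttft = time_to_first_token(prompt)
    print(f"{label} TTFT: {ttft:.2f}s" if ttft else f"{label}: no content received")
```

Running the longer prompt typically shows a noticeably higher TTFT, since the entire prompt must be processed before the first output token can be emitted.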

Continue reading ...

A Newsletter Decluttering AI Agent Using ReAct Pattern

Our inboxes contain dozens (if not hundreds) of newsletters we subscribed to during moments of curiosity, but we seldom read most of them. Manually unsubscribing is tedious: open each email, scroll to the bottom, click unsubscribe, confirm … repeat 50+ times.

This post covers a personal project: an AI agent built with the ReAct pattern that analyses the newsletters I have subscribed to and recommends which ones to unsubscribe from, based on my reading behaviour.
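For readers unfamiliar with ReAct, here is a minimal sketch of the loop: the model alternates thought and action steps, tool results are fed back as observations, and the loop stops at a final answer. It assumes the OpenAI Python client; the tool names, model name, and reading-behaviour data are hypothetical stand-ins, not the project's actual implementation.

```python
# Minimal ReAct-style loop sketch with hypothetical newsletter tools.
import json
import re
from openai import OpenAI

client = OpenAI()

# Hypothetical tools; a real agent would query the mailbox instead.
def list_newsletters():
    return ["Morning Tech Brief", "Weekly AI Digest", "Daily Deals"]

def get_read_stats(name):
    # Placeholder reading behaviour for the named newsletter.
    return {"received_last_90_days": 12, "opened_last_90_days": 1}

TOOLS = {"list_newsletters": list_newsletters, "get_read_stats": get_read_stats}

SYSTEM = """You decide which newsletters to unsubscribe from.
Respond with either:
Thought: <reasoning>
Action: <tool_name>[<argument>]
or, when done:
Final Answer: <recommendations>"""

def run_agent(question, max_steps=8):
    history = [{"role": "system", "content": SYSTEM},
               {"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = client.chat.completions.create(
            model="gpt-4o-mini", messages=history).choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action:\s*(\w+)\[(.*?)\]", reply)
        if not match:
            continue
        tool, arg = match.group(1), match.group(2)
        fn = TOOLS.get(tool)
        observation = (fn(arg) if arg else fn()) if fn else f"unknown tool: {tool}"
        history.append({"role": "user",
                        "content": f"Observation: {json.dumps(observation)}"})
    return "No recommendation reached."

print(run_agent("Which newsletters should I unsubscribe from?"))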

Continue reading ...

Trying Out Osmosis-Structure-0.6B

While large language models (LLMs), such as GPT-4 and Claude, are capable of extracting structured information from text, small language models (SLMs) have historically struggled to do so reliably. Previously, the only viable approach was to fine-tune a larger open-weights model using distillation. A week ago, there was an announcement of what appears to be an alternative.

Continue reading ...

Slopsquatting (i.e., package hallucination)

Researchers have identified a cyber threat known as slopsquatting, also referred to as package hallucination, in which malicious actors exploit the tendency of large language models (LLMs) to generate non-existent package names during code generation. These hallucinated package names, when registered by attackers with malware payloads, create a new vector for software supply chain attacks, particularly within AI-assisted development workflows.
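As one simple, illustrative safeguard (not taken from the article), the sketch below checks whether package names suggested by an LLM resolve on PyPI before installation. Names that do not resolve are likely hallucinations; names that do resolve still need vetting, since attackers may already have registered a commonly hallucinated name.

```python
# Sketch: flag LLM-suggested package names that do not resolve on PyPI.
import urllib.request
import urllib.error

def exists_on_pypi(package: str) -> bool:
    """Return True if the package name resolves on the public PyPI JSON API."""
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.URLError:
        # 404 (name not registered) and network errors both land here.
        return False

# Hypothetical list of packages an LLM suggested for a project.
suggested = ["requests", "pandas", "totally-made-up-llm-package-12345"]
for name in suggested:
    if exists_on_pypi(name):
        # Existence is necessary, not sufficient: a registered package may
        # itself be a slopsquatted name, so it still needs review.
        print(f"{name}: exists on PyPI (still review before installing)")
    else:
        print(f"{name}: not found on PyPI, likely hallucinated")
```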

Continue reading ...