<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>_Archiveds on Hemant Sethi</title>
    <link>https://www.sethihemant.com/_archived/</link>
    <description>Recent content in _Archiveds on Hemant Sethi</description>
    <generator>Hugo -- 0.146.0</generator>
    <language>en-us</language>
    <lastBuildDate>Wed, 14 Jan 2026 09:33:53 -0800</lastBuildDate>
    <atom:link href="https://www.sethihemant.com/_archived/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Projects</title>
      <link>https://www.sethihemant.com/_archived/projects/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>https://www.sethihemant.com/_archived/projects/</guid>
      <description>&lt;p&gt;My work spans AI storage optimization, distributed systems, and high-throughput data processing. Below are some of the key areas and projects I have been involved with.&lt;/p&gt;
&lt;h3 id=&#34;multi-protocol-storage-for-ml-workloads&#34;&gt;Multi-Protocol Storage for ML Workloads&lt;/h3&gt;
&lt;p&gt;Building and operating storage systems that serve both NFS (for legacy HPC workflows) and S3 (for cloud-native pipelines) over the same dataset. Key challenges include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Protocol semantic differences (POSIX vs object storage)&lt;/li&gt;
&lt;li&gt;Consistency models for concurrent access patterns&lt;/li&gt;
&lt;li&gt;Performance optimization for sequential reads (training) vs random access (checkpointing/inference)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;tier-0-caching-strategies-for-gpu-clusters&#34;&gt;Tier-0 Caching Strategies for GPU Clusters&lt;/h3&gt;
&lt;p&gt;Researching node-local caching architectures to minimize data transfer over network fabric:&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>My work spans AI storage optimization, distributed systems, and high-throughput data processing. Below are some of the key areas and projects I have been involved with.</p>
<h3 id="multi-protocol-storage-for-ml-workloads">Multi-Protocol Storage for ML Workloads</h3>
<p>Building and operating storage systems that serve both NFS (for legacy HPC workflows) and S3 (for cloud-native pipelines) over the same dataset. Key challenges include:</p>
<ul>
<li>Protocol semantic differences (POSIX vs object storage)</li>
<li>Consistency models for concurrent access patterns</li>
<li>Performance optimization for sequential reads (training) vs random access (checkpointing/inference)</li>
</ul>
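<p>As a concrete illustration of the POSIX-vs-object semantic gap, consider mapping POSIX paths onto a flat S3 key namespace. This is a hypothetical sketch (the function names and behavior are illustrative, not from any specific gateway):</p>

```python
# Sketch: bridging a POSIX path namespace and a flat S3 key namespace.
# Illustrates one semantic gap: POSIX has real directories, S3 only has
# key prefixes, so readdir() must be emulated by prefix scanning.
# All names here are hypothetical, not a real gateway implementation.

def posix_to_s3_key(path: str) -> str:
    """Normalize an absolute POSIX path into an S3 object key."""
    # S3 keys have no leading slash and no "." / ".." components.
    parts = [p for p in path.split("/") if p not in ("", ".")]
    resolved = []
    for p in parts:
        if p == "..":
            if resolved:
                resolved.pop()
        else:
            resolved.append(p)
    return "/".join(resolved)

def list_dir(keys: list[str], dir_path: str) -> set[str]:
    """Emulate POSIX readdir() over a flat list of S3 keys."""
    prefix = posix_to_s3_key(dir_path)
    prefix = prefix + "/" if prefix else ""
    entries = set()
    for key in keys:
        if key.startswith(prefix):
            # Only the first path component below the prefix is a dirent.
            entries.add(key[len(prefix):].split("/", 1)[0])
    return entries
```

<p>Real gateways also have to decide what an "empty directory" or an atomic rename means on the object side; this sketch only shows the namespace translation.</p>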
<h3 id="tier-0-caching-strategies-for-gpu-clusters">Tier-0 Caching Strategies for GPU Clusters</h3>
<p>Researching node-local caching architectures to minimize data transfer over network fabric:</p>
<ul>
<li>Pre-fetching strategies based on training iteration patterns</li>
<li>Cache eviction policies</li>
<li>RDMA integration for sub-microsecond latency data transfers</li>
<li>Benchmarking with PyTorch and TensorFlow data pipelines</li>
</ul>
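<p>A minimal sketch of the node-local caching idea, assuming a sequential shard-access pattern during training. The <code>fetch</code> callable stands in for a read over the network fabric, and the class and parameter names are illustrative, not production code:</p>

```python
from collections import OrderedDict

# Sketch: node-local read cache with simple next-shard prefetch for
# sequential training iteration. LRU eviction via OrderedDict ordering.

class NodeLocalCache:
    def __init__(self, fetch, capacity: int, prefetch_depth: int = 1):
        self.fetch = fetch              # callable: shard id -> bytes (remote read)
        self.capacity = capacity        # max shards held locally
        self.prefetch_depth = prefetch_depth
        self.cache: OrderedDict[int, bytes] = OrderedDict()
        self.remote_reads = 0           # how often we hit the network fabric

    def _load(self, shard: int) -> bytes:
        if shard not in self.cache:
            self.remote_reads += 1
            self.cache[shard] = self.fetch(shard)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)   # evict least recently used
        else:
            self.cache.move_to_end(shard)        # mark as recently used
        return self.cache[shard]

    def read(self, shard: int) -> bytes:
        data = self._load(shard)
        # Sequential-read heuristic: warm the next shards ahead of time.
        for ahead in range(1, self.prefetch_depth + 1):
            self._load(shard + ahead)
        return data
```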
<h3 id="storage-benchmarking-for-transformer-training">Storage Benchmarking for Transformer Training</h3>
<p>Profiling storage I/O patterns for large language model training:</p>
<ul>
<li>Characterizing data loading bottlenecks in transformer architectures</li>
<li><strong>Tools:</strong> fio, elbencho, custom PyTorch profilers</li>
<li>Building load-testing frameworks on top of elbencho to test file and object storage</li>
</ul>
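<p>A toy version of the "custom profiler" idea: wrap any iterable data loader and split wall time into time blocked waiting for the next batch (data loading) versus time spent between batches (the training step). Framework-agnostic and illustrative only:</p>

```python
import time

class LoaderProfiler:
    """Wraps any iterable data loader and attributes wall time to either
    waiting on the loader (I/O) or the training step in between batches.
    Illustrative sketch, not a real profiling library."""

    def __init__(self, loader):
        self.loader = loader
        self.wait_s = 0.0     # time blocked on the loader (data loading)
        self.compute_s = 0.0  # time spent in the training step

    def __iter__(self):
        it = iter(self.loader)
        last_yield = None
        while True:
            t0 = time.perf_counter()
            if last_yield is not None:
                # Time since we handed out the previous batch = training step.
                self.compute_s += t0 - last_yield
            try:
                batch = next(it)
            except StopIteration:
                return
            self.wait_s += time.perf_counter() - t0
            yield batch
            last_yield = time.perf_counter()
```

<p>If <code>wait_s</code> dominates, the training loop is starved by storage I/O and faster loading (or caching) pays off; if <code>compute_s</code> dominates, the storage tier is not the bottleneck.</p>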
<h3 id="high-throughput-streaming-data-ingestion-and-delivery">High-Throughput Streaming Data Ingestion and Delivery</h3>
<p>Scaling a data ingestion pipeline to billions of events and delivering across thousands of partitions (S3 prefixes):</p>
<ul>
<li><strong>Dynamic partitioning redesign</strong>: Akka Streams architecture achieving a 400% throughput improvement</li>
<li>Backpressure handling and flow control in reactive systems</li>
<li>Lease management for distributed coordination</li>
<li>SQS/DynamoDB for workload distribution and state management</li>
</ul>
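<p>The production pipeline described above is built on Akka Streams; as a language-agnostic illustration of backpressure, a bounded queue blocks the producer whenever the consumer falls behind, so ingestion slows to the delivery rate instead of buffering unboundedly:</p>

```python
import threading
import queue

# Minimal backpressure sketch (not the Akka Streams code described above).
# The bounded queue is the backpressure boundary: q.put() blocks when the
# queue is full, throttling the producer to the consumer's pace.

def run_pipeline(events, queue_size=8):
    q = queue.Queue(maxsize=queue_size)
    delivered = []

    def consumer():
        while True:
            item = q.get()
            if item is None:          # sentinel: end of stream
                return
            delivered.append(item)

    t = threading.Thread(target=consumer)
    t.start()
    for e in events:
        q.put(e)                      # blocks when the queue is full
    q.put(None)
    t.join()
    return delivered
```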
<h3 id="performance-engineering">Performance Engineering</h3>
<p>Approaches to performance optimization:</p>
<ul>
<li>Profiling Java services for memory leaks and GC tuning</li>
<li>Native memory analysis, comparing Unix memory allocators (<strong>glibc malloc vs jemalloc</strong>), and heap-dump investigation</li>
<li>Identifying hot paths and optimizing critical sections</li>
<li>Monitoring and observability for production systems</li>
</ul>
<h2 id="research-interests">Research Interests</h2>
<p>Current areas of exploration and experimentation:</p>
<ul>
<li><strong>RDMA-enabled storage</strong>: Low-latency data access for AI workloads</li>
<li><strong>Caching hierarchies</strong>: Multi-tier caching strategies (node-local, rack-level, datacenter)</li>
<li><strong>Storage cost optimization</strong>: Balancing performance and cost for training workloads</li>
<li><strong>Benchmarking methodologies</strong>: Standardized approaches for storage performance evaluation</li>
</ul>
<h2 id="technical-writing--speaking">Technical Writing &amp; Speaking</h2>
<p>As an aspiring technical author, I&rsquo;m interested in writing about:</p>
<ul>
<li>Storage architecture for AI/ML infrastructure</li>
<li>Distributed systems design patterns</li>
<li>Performance optimization techniques</li>
<li>Real-world case studies from production systems</li>
</ul>
<p><em>Interested in collaboration, co-authorship, or speaking opportunities? <a href="mailto:sethi.hemant@gmail.com">Reach out</a>.</em></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
