Aritra Roy Gosthipaty PRO

ariG23498

https://arig23498.github.io/

AI & ML interests

Deep Representation Learning

Recent Activity

updated a dataset about 8 hours ago

model-metadata/custom_code_execution_files

updated a dataset about 8 hours ago

model-metadata/models_executed_urls

updated a dataset about 8 hours ago

model-metadata/model_vram_code

upvoted a collection 2 days ago

EmbeddingGemma

3 items • Updated 3 days ago • 53

upvoted an article 3 days ago

Article

Welcome EmbeddingGemma, Google's new efficient embedding model

By [authors not listed] and 5 others • 4 days ago • 150

upvoted a collection 3 days ago

Built with Distill blog ❤️

Collection of all interactive blogs built on top of Distill template. To create your own check: https://huggingface.co/spaces/lvwerra/distill-blog-tem • 6 items • Updated Mar 14 • 2

upvoted an article 5 days ago

Article

Make your ZeroGPU Spaces go brrr with PyTorch ahead-of-time compilation

By [authors not listed] and 3 others • 6 days ago • 40

upvoted a collection 9 days ago

FastVLM

Efficient Vision Encoding for Vision Language Models • 9 items • Updated 5 days ago • 88

upvoted a collection 11 days ago

Hermes 4 Collection

11 items • Updated 5 days ago • 65

upvoted 2 collections 17 days ago

Command Models

Latest Cohere Labs Command models • 10 items • Updated 12 days ago • 28

DeepSeek-V3.1

3 items • Updated 17 days ago • 222

upvoted a collection 18 days ago

SuryaBench

Benchmark Dataset for Advancing Machine Learning in Heliophysics and Space Weather Prediction • 8 items • Updated 20 days ago • 5

upvoted an article 19 days ago

Article

Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training

By [authors not listed] and 4 others • about 1 month ago • 59

upvoted an article 20 days ago

Article

From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels

By [authors not listed] and 1 other • 21 days ago • 54

upvoted a collection 20 days ago

NVIDIA Nemotron

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 4 items • Updated 4 days ago • 56

upvoted 2 papers 20 days ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 70

DINOv3

Paper • 2508.10104 • Published 25 days ago • 239

upvoted a collection 24 days ago

DINOv3

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated 17 days ago • 277

upvoted a paper 25 days ago

Technical Report: Full-Stack Fine-Tuning for the Q Programming Language

Paper • 2508.06813 • Published 29 days ago • 5

upvoted a collection 25 days ago

qqWen-Series

Based off the Qwen-2.5 Series - model finetuned for the Q programming language. • 11 items • Updated 10 days ago • 10

upvoted an article 25 days ago

Article

🕳️ Attention Sinks in LLMs for endless fluency

By [author not listed] • Oct 9, 2023 • 18

upvoted a paper 25 days ago

Aryabhata: An exam-focused language model for JEE Math

Paper • 2508.08665 • Published 26 days ago • 16

upvoted an article 25 days ago

Article

Optimization story: Bloom inference

By [author not listed] • Oct 12, 2022 • 6