Interactive visualizations of AI architectures.
See how machines learn, one layer at a time.
Gradient-Based Learning Applied to Document Recognition - The CNN "Hello World" of deep learning, deployed in banks for reading handwritten checks.
Deep Residual Learning for Image Recognition - The breakthrough architecture enabling 100+ layer networks through skip connections.
Scalable Diffusion Models with Transformers - Replacing U-Nets with transformers for state-of-the-art image generation.
Bidirectional Encoder Representations from Transformers - Deep bidirectional pre-training that revolutionized NLP.
Attention Is All You Need - The groundbreaking architecture that replaced RNNs with self-attention mechanisms.
Generative Adversarial Networks - The revolutionary framework where two neural networks compete to generate realistic data.
Auto-Encoding Variational Bayes - The probabilistic approach to deep generative modeling that learns structured latent representations.
An Image is Worth 16x16 Words - Applying transformers to image recognition with patch-based sequences.
Learning Transferable Visual Models From Natural Language Supervision - Zero-shot image classification using natural language descriptions.
High-Resolution Image Synthesis with Latent Diffusion Models - Democratizing AI image generation on consumer hardware.
Deep Learning - The seminal Nature review by LeCun, Bengio & Hinton that established deep learning as a revolutionary field in AI - 50,000+ citations.
Open and Efficient Foundation Language Models - The breakthrough open-source LLM that democratized access to powerful language models.
Very Deep Convolutional Networks for Large-Scale Image Recognition - Small 3x3 filters, deep architectures, and the simplicity that outperformed complex designs.
You Only Look Once: Unified, Real-Time Object Detection - The single-pass paradigm shift that made real-time object detection practical.
Going Deeper with Convolutions - Multi-scale feature extraction with inception modules achieving state-of-the-art with fewer parameters.
Outrageously Large Neural Networks - Adaptive computation with sparsely-gated mixture-of-experts layers. The foundation for scaling large language models efficiently.
Dynamic Routing Between Capsules - Hinton's revolutionary architecture using vector outputs and dynamic routing to address CNNs' loss of spatial relationships.
Language Models are Few-Shot Learners - 175 billion parameters enabling powerful few-shot learning without fine-tuning.
GPT-4 Technical Report - A large multimodal model with human-level performance on professional exams, including a bar exam score in the top 10% of test takers.
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications - Depthwise separable convolutions enabling real-time mobile AI.
ImageNet Classification with Deep Convolutional Neural Networks - The paper that ignited the deep learning revolution with 8 layers and 60M parameters.
Densely Connected Convolutional Networks - Feature reuse through dense connections, maximum information flow with fewer parameters.
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks - Compound scaling that uniformly scales depth, width, and resolution.
Improving Language Understanding by Generative Pre-Training - The original GPT paper that launched the series.
Language Models are Unsupervised Multitask Learners - 1.5B parameters and zero-shot task transfer without fine-tuning.
Long Short-Term Memory - The groundbreaking architecture that solved the vanishing gradient problem for sequential data.
Linear-Time Sequence Modeling with Selective State Spaces - A new paradigm challenging Transformers with linear complexity.
More classic papers will be added hourly by automated updates. Stay tuned for detailed explanations with visualizations.
How computers "see" images. Learn about convolution kernels, feature maps, pooling, and how hierarchical patterns emerge.
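The convolution and pooling ideas above can be sketched in a few lines of numpy. This is a minimal illustration, not the site's visualization code: the image, the Sobel-style edge kernel, and the window sizes are all toy assumptions.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the strongest response per window."""
    h, w = fmap.shape
    return fmap[:h - h % size, :w - w % size].reshape(
        h // size, size, w // size, size).max(axis=(1, 3))

# Toy image: bright left half, dark right half -> one vertical edge.
image = np.zeros((6, 6))
image[:, :3] = 1.0
# Sobel-like kernel that responds to vertical edges.
kernel = np.array([[1., 0., -1.],
                   [2., 0., -2.],
                   [1., 0., -1.]])
fmap = conv2d(image, kernel)   # 4x4 feature map, strong response at the edge
pooled = max_pool(fmap)        # 2x2 summary keeping the strongest activations
```

The feature map lights up only where the edge sits, and pooling shrinks it while keeping that signal — the same "detect, then summarize" pattern a CNN stacks layer after layer.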
Learning from connected data. Discover message passing, neighborhood aggregation, and how graphs become embeddings.
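One round of the message passing described above can be sketched with numpy. The 4-node graph, the features, and the identity-style weight matrices are toy assumptions (real GNNs learn the weights):

```python
import numpy as np

# Adjacency matrix of a toy 4-node graph (symmetric, no self-loops).
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
# One 2-dimensional feature vector per node.
features = np.array([[1., 0.],
                     [0., 1.],
                     [1., 1.],
                     [0., 0.]])

# Neighborhood aggregation: each node averages its neighbors' features.
deg = adj.sum(axis=1, keepdims=True)
neighbor_mean = (adj @ features) / deg

# Combine self and neighbor messages through (toy) weight matrices + ReLU.
W_self = np.eye(2)          # learned in a real GNN
W_neigh = 0.5 * np.eye(2)   # learned in a real GNN
embeddings = np.maximum(0., features @ W_self + neighbor_mean @ W_neigh)
```

Stacking this step lets information flow further: after k rounds, each embedding reflects its k-hop neighborhood.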
Where attention meets graphs. Learn how GAT assigns different importance to neighbors, creating adaptive message passing.
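The "different importance to neighbors" idea can be sketched for a single node: score each neighbor with a shared attention vector over the concatenated [self, neighbor] features, apply LeakyReLU, then softmax so the weights sum to 1 (the GAT recipe; the feature values and attention vector here are toy assumptions, learned in practice).

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

h_self = np.array([1.0, 0.0])               # the node we update
neighbors = np.array([[0.0, 1.0],           # its three neighbors' features
                      [1.0, 1.0],
                      [0.5, 0.5]])
a = np.array([0.2, -0.1, 0.4, 0.3])         # toy attention vector (learned in GAT)

# Score each neighbor: LeakyReLU(a . [h_self || h_neighbor]).
scores = np.array([leaky_relu(a @ np.concatenate([h_self, n]))
                   for n in neighbors])
alpha = softmax(scores)                     # attention weights, sum to 1
h_new = alpha @ neighbors                   # attention-weighted aggregation
```

Unlike the uniform averaging of a plain GNN layer, each neighbor now contributes in proportion to its learned relevance — that is the "adaptive message passing" the visualization walks through.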