Safetensors
Safetensors is a new, simple format for storing tensors safely (as opposed to pickle, which can execute arbitrary code when a file is loaded) that is still fast (zero-copy). Safetensors is really fast 🚀.
Installation
With pip:
pip install safetensors
With conda:
conda install -c huggingface safetensors
Usage
Load tensors
from safetensors import safe_open

tensors = {}
# device=0 places tensors on the first GPU; use device="cpu" if no GPU is available
with safe_open("model.safetensors", framework="pt", device=0) as f:
    for k in f.keys():
        tensors[k] = f.get_tensor(k)
Loading only part of the tensors (useful when running on multiple GPUs)
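from safetensors import safe_open

with safe_open("model.safetensors", framework="pt", device=0) as f:
    # get_slice returns a lazy view; data is read only when the view is indexed
    tensor_slice = f.get_slice("embedding")
    vocab_size, hidden_dim = tensor_slice.get_shape()
    # This range covers every column; narrow it to load only part of the tensor
    tensor = tensor_slice[:, :hidden_dim]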
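When sharding across several GPUs, each process typically needs only its own range of rows. The helper below is a minimal sketch of that arithmetic in plain Python; `shard_bounds`, `rank`, and `world_size` are hypothetical names chosen for illustration and are not part of the safetensors API.

```python
def shard_bounds(total_rows, rank, world_size):
    """Return the [start, stop) row range owned by `rank` when `total_rows`
    rows are split as evenly as possible across `world_size` workers.

    Hypothetical helper for illustration; not provided by safetensors.
    """
    if world_size <= 0 or not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    base, extra = divmod(total_rows, world_size)
    # The first `extra` ranks each take one additional row.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Example: 10 rows over 4 workers, one range per rank.
ranges = [shard_bounds(10, r, 4) for r in range(4)]
```

Inside the `safe_open` block above, a process could then call `start, stop = shard_bounds(vocab_size, rank, world_size)` and index `tensor_slice[start:stop, :]` so that only its rows are read from disk.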
Save tensors
import torch
from safetensors.torch import save_file
tensors = {
    "embedding": torch.zeros((2, 2)),
    "attention": torch.zeros((2, 3)),
}
save_file(tensors, "model.safetensors")
Format
Let’s say you have a safetensors file named model.safetensors; then model.safetensors will have the following internal format:
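As a rough sketch of that layout: the file starts with 8 bytes holding an unsigned little-endian 64-bit integer N (the header size), followed by N bytes of UTF-8 JSON mapping each tensor name to its `dtype`, `shape`, and `data_offsets`, followed by the raw byte buffer. The standard-library code below writes and parses such a file by hand. The function names are hypothetical and exist only for illustration; in practice use `save_file` and `safe_open`.

```python
import json
import struct


def write_minimal_safetensors(path, name, shape, raw_bytes, dtype="F32"):
    """Hand-write a one-tensor file (illustration only, no validation)."""
    header = {
        name: {"dtype": dtype, "shape": shape, "data_offsets": [0, len(raw_bytes)]}
    }
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header size
        f.write(header_bytes)                          # JSON header
        f.write(raw_bytes)                             # tensor byte buffer


def read_header(path):
    """Return the JSON header as a dict without touching tensor data."""
    with open(path, "rb") as f:
        (header_size,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_size).decode("utf-8"))


def read_tensor_bytes(path, name):
    """Return the raw bytes of one tensor, seeking directly to its offsets."""
    with open(path, "rb") as f:
        (header_size,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_size).decode("utf-8"))
        begin, end = header[name]["data_offsets"]
        # Offsets are relative to the start of the byte buffer after the header.
        f.seek(8 + header_size + begin)
        return f.read(end - begin)
```

Because the header records every tensor's offsets up front, a reader can seek straight to one tensor without scanning the rest of the file, which is what makes partial loading cheap.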
Featured Projects
Safetensors is widely used at leading AI organizations such as Hugging Face, EleutherAI, and StabilityAI. Here is a non-exhaustive list of projects that use safetensors:
- huggingface/transformers
- AUTOMATIC1111/stable-diffusion-webui
- Llama-cpp
- microsoft/TaskMatrix
- hpcaitech/ColossalAI
- huggingface/pytorch-image-models
- CivitAI
- huggingface/diffusers
- coreylowman/dfdx
- invoke-ai/InvokeAI
- oobabooga/text-generation-webui
- Sanster/lama-cleaner
- PaddlePaddle/PaddleNLP
- AIGC-Audio/AudioGPT
- brycedrennan/imaginAIry
- comfyanonymous/ComfyUI
- LianjiaTech/BELLE
- alvarobartt/safejax
- MaartenGr/BERTopic
- LaurentMazare/tch-rs