TL;DR: Needed shared ReadWriteMany storage between pods in GKE. Spent time setting up GCS Fuse because it seemed like the “right” cloud-native choice. The latency and setup overhead weren’t worth it for our use case. Switched to in-cluster NFS - simpler, faster, and just works.
## The Problem
We had a batch processing pipeline where multiple pods needed to share scratch space. One pod writes intermediate results, another picks them up for the next stage. Classic producer-consumer pattern.
```mermaid
flowchart LR
    subgraph "What We Needed"
        POD1[Pod A<br/>Producer] --> |"writes"| SHARED[(Shared<br/>Storage)]
        SHARED --> |"reads"| POD2[Pod B<br/>Consumer]
        SHARED --> |"reads"| POD3[Pod C<br/>Consumer]
    end
    style SHARED fill:#ffd93d,color:#000
```
The requirements were straightforward:
| Requirement | Why |
|---|---|
| ReadWriteMany | Multiple pods reading and writing simultaneously |
| Low latency | Frequent small file operations |
| Cross-node | Pods could land on any node in the cluster |
| ~1-2GB capacity | Scratch space, not long-term storage |
GKE’s default storage classes are ReadWriteOnce. A ReadWriteOnce volume can be mounted read-write by only a single node at a time, so pods scheduled across different nodes can’t share it. That wasn’t going to work.
## The Options
Two main contenders for shared storage in GKE:
```mermaid
flowchart TB
    subgraph "Option 1: GCS Fuse"
        POD_F1[Pod] --> |"FUSE mount"| CSI[GCS Fuse<br/>CSI Driver]
        CSI --> |"API calls"| GCS[(GCS Bucket)]
        POD_F2[Pod] --> CSI
    end
    subgraph "Option 2: NFS"
        POD_N1[Pod] --> |"NFS mount"| NFS[NFS Server<br/>in-cluster]
        NFS --> PD[(Persistent Disk)]
        POD_N2[Pod] --> NFS
    end
    style GCS fill:#4285f4,color:#fff
    style NFS fill:#4ecdc4,color:#000
```
GCS Fuse mounts a GCS bucket as a filesystem using a CSI driver. It’s “native” GCP, integrates with Workload Identity, and the data is accessible outside the cluster too.
NFS runs an NFS server inside the cluster (or uses GCP Filestore). Traditional, battle-tested, boring.
I went with GCS Fuse first. It seemed like the modern, cloud-native choice.
That was a mistake.
## Why GCS Fuse Didn’t Work
GCS Fuse looks like a filesystem, but it’s really a translation layer that converts file operations into GCS API calls.
```mermaid
flowchart LR
    subgraph "What Happens With GCS Fuse"
        APP[App calls<br/>write] --> FUSE[FUSE<br/>intercepts]
        FUSE --> API[GCS API<br/>PUT object]
        API --> |"~50-100ms"| GCS[(GCS)]
    end
    subgraph "What Happens With NFS"
        APP2[App calls<br/>write] --> NFS[NFS<br/>protocol]
        NFS --> |"~1-5ms"| DISK[(Disk)]
    end
    style API fill:#ff6b6b,color:#fff
    style NFS fill:#96ceb4,color:#000
```
Every `open()`, `read()`, `write()`, and `close()` becomes an API call. For large files written once and read occasionally, that’s fine. For our workload - lots of small files, frequent access - the latency compounded fast.
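You can make the compounding concrete with a throwaway benchmark Job. This is an illustrative sketch, not our actual test harness - the image, volume name, and claim are placeholders; point `claimName` at whichever mount you want to measure:

```yaml
# Illustrative benchmark: time 500 small-file writes against a mount.
apiVersion: batch/v1
kind: Job
metadata:
  name: small-file-bench
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: bench
        image: busybox:1.36
        command:
        - sh
        - -c
        # Write 500 small files, then print elapsed wall-clock seconds.
        - |
          start=$(date +%s)
          i=1
          while [ $i -le 500 ]; do
            echo "payload" > /mnt/test-$i
            i=$((i + 1))
          done
          echo "elapsed: $(( $(date +%s) - start ))s"
        volumeMounts:
        - name: target
          mountPath: /mnt
      volumes:
      - name: target
        persistentVolumeClaim:
          claimName: shared-scratch   # placeholder - swap for the volume under test
```

Run it once against each candidate mount and compare the elapsed times; the per-operation API overhead shows up immediately on the FUSE side.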
Then there was the setup overhead. Even after you configure the PVC, the Kubernetes service account attached to the pod needs proper GCS permissions and the right annotations - otherwise the mount silently fails or the pod gets stuck in ContainerCreating.
The full setup chain:
- Configure Workload Identity for the Kubernetes service account
- Create a GCP service account with Storage Object permissions
- Bind the Kubernetes SA to the GCP SA via IAM policy
- Install and configure the GCS Fuse CSI driver
- Add the `gke-gcsfuse/volumes: "true"` annotation to pods
- Handle partial POSIX compliance edge cases in your application
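Assembled, the GCS Fuse side ends up looking roughly like this - a sketch, with the bucket, service account, and image names as placeholders rather than anything from our setup:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fuse-example
  annotations:
    gke-gcsfuse/volumes: "true"   # tells GKE to inject the gcsfuse sidecar
spec:
  serviceAccountName: my-ksa      # KSA bound to a GCP SA via Workload Identity
  containers:
  - name: app
    image: my-app                 # placeholder image
    volumeMounts:
    - name: gcs
      mountPath: /data
  volumes:
  - name: gcs
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: my-scratch-bucket   # placeholder bucket
```

Every piece of the chain above has to be in place before this pod mounts the bucket; miss one and you get the silent failures described below.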
We spent more time debugging Fuse quirks than building features. Files that worked fine locally would behave strangely in the cluster. Append operations were unreliable. File locking didn’t work as expected.
The final straw was watching a simple directory listing take seconds when there were a few hundred files.
## The Comparison
| | NFS | GCS Fuse |
|---|---|---|
| Latency | Low (in-cluster) | Higher (API calls to GCS) |
| Throughput | Good for small files | Better for large files, slow for small files |
| POSIX compliance | Full | Partial (no hard links, append-only issues) |
| Random read/write | Fast | Slow (not designed for it) |
| Cost | Disk cost only (~$0.40/mo for 10GB) | Storage + API calls (adds up with frequent access) |
| Setup complexity | Medium (NFS deployment) | Medium (Workload Identity, CSI driver, annotations) |
| Cross-node | Yes | Yes |
| Survives restarts | Yes | Yes |
| File locking | Yes | Not reliable |
## The NFS Solution
Switched to running an NFS server in-cluster. The architecture:
```mermaid
flowchart TB
    subgraph "GKE Cluster"
        subgraph "Storage Namespace"
            NFS_POD[NFS Server Pod] --> PVC_BACK[Backing PVC<br/>Persistent Disk]
            NFS_SVC[NFS Service<br/>ClusterIP]
            NFS_POD --- NFS_SVC
        end
        subgraph "Application Namespace"
            POD1[Producer Pod] --> |"NFS mount"| NFS_SVC
            POD2[Consumer Pod A] --> |"NFS mount"| NFS_SVC
            POD3[Consumer Pod B] --> |"NFS mount"| NFS_SVC
        end
        PV[PersistentVolume<br/>type: nfs] --> NFS_SVC
        SHARED_PVC[Shared PVC<br/>ReadWriteMany] --> PV
        POD1 --> SHARED_PVC
        POD2 --> SHARED_PVC
        POD3 --> SHARED_PVC
    end
    style NFS_POD fill:#4ecdc4,color:#000
    style PVC_BACK fill:#45b7d1,color:#000
    style SHARED_PVC fill:#ffd93d,color:#000
```
### The NFS Server
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-server
  namespace: storage
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-server
  template:
    metadata:
      labels:
        app: nfs-server
    spec:
      containers:
      - name: nfs-server
        image: gcr.io/google_containers/volume-nfs:0.8
        ports:
        - name: nfs
          containerPort: 2049
        - name: mountd
          containerPort: 20048
        - name: rpcbind
          containerPort: 111
        securityContext:
          privileged: true
        volumeMounts:
        - name: storage
          mountPath: /exports
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: nfs-backing-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: nfs-server
  namespace: storage
spec:
  ports:
  - name: nfs
    port: 2049
  - name: mountd
    port: 20048
  - name: rpcbind
    port: 111
  selector:
    app: nfs-server
```
### The Backing Storage
The NFS server needs its own storage. A standard GKE PersistentVolumeClaim works:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-backing-pvc
  namespace: storage
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard-rwo
```
### The Shared Volume
Now create a PV and PVC that applications can mount. One caveat worth knowing: the NFS mount is performed by the kubelet on the node, which doesn’t always resolve in-cluster DNS names - if mounts hang, swap the service DNS name for the service’s ClusterIP in the `server` field:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: nfs-server.storage.svc.cluster.local
    path: "/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-scratch
  namespace: applications
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ""
  volumeName: shared-nfs
  resources:
    requests:
      storage: 10Gi
```
### Mounting in Pods
Pods mount it like any other PVC:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: producer
spec:
  containers:
  - name: app
    image: my-app
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    persistentVolumeClaim:
      claimName: shared-scratch
```
The flow is simple:
```mermaid
flowchart LR
    POD[Pod] --> |"mount /scratch"| PVC[PVC<br/>shared-scratch]
    PVC --> PV[PV<br/>shared-nfs]
    PV --> |"NFS protocol"| SVC[nfs-server<br/>Service]
    SVC --> NFS[NFS Server<br/>Pod]
    NFS --> DISK[Persistent<br/>Disk]
    style PVC fill:#ffd93d,color:#000
    style NFS fill:#4ecdc4,color:#000
```
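To verify the chain end to end, a pair of throwaway pods works - one writing through the shared PVC, one reading back (pod names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: smoke-writer
  namespace: applications
spec:
  restartPolicy: Never
  containers:
  - name: writer
    image: busybox:1.36
    command: ["sh", "-c", "echo hello > /scratch/ping"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    persistentVolumeClaim:
      claimName: shared-scratch
---
apiVersion: v1
kind: Pod
metadata:
  name: smoke-reader
  namespace: applications
spec:
  restartPolicy: Never
  containers:
  - name: reader
    image: busybox:1.36
    # prints "hello" once the writer pod has completed
    command: ["sh", "-c", "cat /scratch/ping"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    persistentVolumeClaim:
      claimName: shared-scratch
```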
Works. Fast. No surprises.
## When to Use Each
```mermaid
flowchart TD
    START([Need shared storage?]) --> Q1{Mostly large files?<br/>Write-once, read-many?}
    Q1 --> |Yes| Q2{Need access<br/>outside cluster?}
    Q1 --> |No| NFS_WIN[Use NFS]
    Q2 --> |Yes| FUSE_WIN[Use GCS Fuse]
    Q2 --> |No| Q3{Already deep in<br/>GCP ecosystem?}
    Q3 --> |Yes| FUSE_OK[GCS Fuse is fine]
    Q3 --> |No| NFS_WIN
    style NFS_WIN fill:#4ecdc4,color:#000
    style FUSE_WIN fill:#4285f4,color:#fff
    style FUSE_OK fill:#4285f4,color:#fff
```
**Use NFS if:**
- Frequent small file reads/writes
- Apps expect normal filesystem behavior
- Low latency matters
- File locking needed
**Use GCS Fuse if:**
- Mostly large file transfers
- Data needs to be accessed outside the cluster too
- Write-once, read-many pattern
- You’re deep in the GCP ecosystem and want unified storage
## The Honest Answer
For 1-2GB of shared scratch space between pods: NFS wins. GCS Fuse has too much overhead for typical inter-pod file sharing.
And honestly, if your pods are guaranteed to land on the same node, hostPath is even simpler and faster than both. But that assumption breaks the moment Kubernetes schedules pods across nodes - which it will.
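For completeness, the hostPath variant would look something like this - with a pod affinity rule to force co-scheduling on one node (labels, image, and paths are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: consumer
  labels:
    app: pipeline
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: pipeline          # schedule onto the node running other pipeline pods
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: my-app                  # placeholder image
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    hostPath:
      path: /tmp/pipeline-scratch  # node-local directory shared by co-located pods
      type: DirectoryOrCreate
```

The affinity rule is what makes the hostPath assumption hold - without it, nothing stops the scheduler from splitting the pods across nodes.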
## Takeaways
- **GCS Fuse is not a general-purpose filesystem.** It’s optimized for large sequential reads/writes, not random access patterns. The “filesystem” abstraction is leaky.
- **“Cloud-native” isn’t always better.** Sometimes an NFS server running in your cluster outperforms the managed option. Boring technology has its place.
- **API call overhead is real.** When every file operation becomes a remote API call, latency compounds. For small files, this kills performance.
- **Match the tool to the workload.** GCS Fuse is great for what it’s designed for - just not for scratch space and frequent small file access.
- **Simple usually wins.** NFS has been around for decades. It works. Sometimes that’s all you need.
- **Test with realistic workloads.** Our synthetic tests looked fine. Real workload patterns exposed the problems immediately.
The takeaway isn’t that GCS Fuse is bad - it’s that it’s designed for a different use case. Pick the right tool for your specific access patterns, not the one that looks most modern on paper.